User talk:West.andrew.g/Archive 6


STiki Milestones

Hi Andrew, in my opinion Wikipedia:STiki/milestones is a pretty nice idea for encouraging editors by giving them a token of appreciation, but as far as I can see no one is looking after it. It's time we invited users willing to keep watch over it and award editors for completing their milestones. Comments? -- ÐℬigXЯaɣ 22:39, 6 May 2012 (UTC)

Yes, this is something User:Orphan Wiki pledged above. However, I cannot comment on how active of an editor he is/has been lately. Personally, I don't care who takes care of these things. In the spirit of collaboration, if you'd like to notify users, go ahead (I assume you've found the template we designed)! We also need to design a template/box for the milestone edits (not just the first edit). If you'd like to take a crack at that, feel free! Thanks, West.andrew.g (talk) 23:02, 6 May 2012 (UTC)
I've been giving out the welcome messages, ever since the template was completed... I don't understand why this conversation is even taking place! Orphan Wiki (talk) 23:15, 6 May 2012 (UTC)
Haha, sorry about that! I actually never bothered to check. If you look at the milestones page, you'll notice User:DBigXRay did place a check-mark next to one template he issued. I think I just mentally assumed that everyone would follow that convention. Would you mind doing that, so if you do end up going on a bit of a wiki hiatus that someone else could slide in? Otherwise, thank you and keep up the good work! Thanks, West.andrew.g (talk) 23:22, 6 May 2012 (UTC)
No problem, I went on giving out the messages, but not leaving a check-mark. Apologies about that. I've check-marked all the users I've welcomed with the new template. Orphan Wiki (talk) 23:28, 6 May 2012 (UTC)
  • (edit conflict) Sorry for creating confusion, if any. My comment was based on the header of the page Wikipedia:STiki/milestones, which said that some kind of notation should be made on whether or not a user has been awarded/notified. That notation was lacking, which led me to conclude that the page was lying unnoticed. In my opinion the suggested notation of a green tick seems reasonable; it must be followed by a signature so we can see the time stamp and the person who did it. I am glad to see that Orphan Wiki has welcomed every user for their first edit with the templates. The welcome templates are attractive; nice work on those. But it seems we are lacking on awards for 1000+ edits etc. What about a special STiki anti-vandalism barnstar? Any comments? -- ÐℬigXЯaɣ 23:30, 6 May 2012 (UTC)

The barnstar has also been discussed. I've had little time to sort it yet, but it should be addressed soon. It'll basically be a case of using the Anti-Vandalism barnstar, but using a STiki image. I'll have a go with this now actually. Orphan Wiki (talk) 23:32, 6 May 2012 (UTC)

Cool, glad to hear that we are making progress. Actually, my concerns arose because Andrew seemed not to be interested in these non-technical but rather important things, so I was just making sure they weren't ignored. Keep up the good work, cheers. -- ÐℬigXЯaɣ 23:37, 6 May 2012 (UTC)
Just for clarification, it's not that I am not interested. Rather, people seem eager to help out, so I see this as a good allocation of resources. Thanks, West.andrew.g (talk) 23:49, 6 May 2012 (UTC)
No probs :) Orphan Wiki (talk) 23:38, 6 May 2012 (UTC)
"It'll basically be a case of using the Anti-Vandalism barnstar, but using a STiki image": do you mean the image will be a STiki image instead of the barnstar image? In my opinion it would be good if we could add the STiki logo in the centre of a decent barnstar image; some motivation here: WP:Barnstars -- ÐℬigXЯaɣ 23:40, 6 May 2012 (UTC)
(edit conflict) (edit conflict) (edit conflict) (edit conflict). I give up! Here is what I was going to say, about 5 edits ago: Okay, got all that sorted out (in a very rapid and edit-conflict laden fashion). Yes DBigXray, the design of a template to handle the 1000, 5000, 10000, 250000 case is what I proposed we should create in my first message. It could just be the anti-vandalism barnstar, but add the STiki image, and note STiki and the accomplishment in the default message. I don't have much template kung-fu, so if you'd like to take a crack, go for it! I'm getting out of here before we conflict again: Go Flyers! Thanks, West.andrew.g (talk) 23:41, 6 May 2012 (UTC)
Please check my sandbox for something I've drafted up quickly! What do you think? I did it in about 5 minutes, so any suggestions welcome. Orphan Wiki (talk) 23:43, 6 May 2012 (UTC)
  • @andrew :) @orphan Hmm, quick work; this can be used for small milestones. But I guess these don't actually qualify as barnstars or fit the spirit of barnstars. The receiver may conclude that this is another STiki advertisement instead of an appreciation. What do you say about my suggestion above? -- ÐℬigXЯaɣ 23:47, 6 May 2012 (UTC)
Agreeing with DBig, here. We should start with the anti-vandalism barnstar (even retaining its image) and add just a touch of STiki flavor. We want people to be grateful for the award, and not feel like spam victims. Thanks, West.andrew.g (talk) 23:52, 6 May 2012 (UTC)

OK, so, just to be clear, when we say "start" with the anti-vandalism barnstar (default image et al), what do we mean? Orphan Wiki (talk) 23:54, 6 May 2012 (UTC)

  • It means that for the first award (i.e. for 1000 reverts) we should use the anti-vandalism barnstar on the STiki template you have designed in your sandbox. From the second award onwards, we can create a new STiki+barnstar image by using a suitable combination of them. -- ÐℬigXЯaɣ 23:57, 6 May 2012 (UTC)
But what is a "first" award, and what is a "second" award, and how should they be different? Are we going with a design similar to the one on my sandbox or not? (Feel free to amend it / the message contained within BTW). I'm still not sure... Orphan Wiki (talk) 00:00, 7 May 2012 (UTC)
For all milestones > 1 I'd like to re-use the same barnstar/template for ease of use (with a placeholder for the threshold crossed). Of course, one can always add a personal message (in addition to our standard one), if someone has crossed a truly epic threshold. If someone gives me an hour or two, I can create a first draft. With the fast and furious editing here, though, I wasn't eager to start and run into a conflict or duplicate work. Thanks, West.andrew.g (talk) 00:02, 7 May 2012 (UTC)
No probs, I'm going to sleep now. Tomorrow, after some drafts have been drawn up, things can start moving. :) Orphan Wiki (talk) 00:05, 7 May 2012 (UTC)
Agreed, the text and template will be the same, with just some variations in the barnstar image so that they are not mere repetitions. After all, it's the barnstar image that makes it so special :) and we must not discount that. It's good to take time (and some sleep as well ;) ) and come up with a good image and text. Regards, good night. -- ÐℬigXЯaɣ 00:08, 7 May 2012 (UTC)
just did some changes to the text on User:Orphan Wiki/sandbox, any comments/changes andrew ? -- ÐℬigXЯaɣ 00:26, 7 May 2012 (UTC)
Hah! Everyone says they are calling it a night, and then they restart the editing! Since I thought you guys were done, I clean-slated over at Wikipedia:STiki/milestone_template and now have something I'm pretty happy with. Comments? Suggestions? If everyone is in agreement, I'm cool with immediately clearing this for distribution. Thanks, West.andrew.g (talk) 01:18, 7 May 2012 (UTC)
  • I think the earlier phrase "and ask you to bring any comments or concerns to our attention" can be replaced with a simpler and more personal "keep in touch", but that's my opinion. I have updated it for now, though it can be replaced depending on the comments. The idea of 2 images is perfect. Even I had thought about it in the beginning but trashed it later as I felt that it would be unconventional :) -- ÐℬigXЯaɣ 02:20, 7 May 2012 (UTC)

I'm pretty happy with that template, and the suggestion made by DBigXRay. I'm happy to commence with the distribution straight away, and others should not hesitate to jump in also if I'm tardy. Orphan Wiki (talk) 11:14, 7 May 2012 (UTC)

I have just parameterised Wikipedia:STiki/milestone template. Do people like that? Yaris678 (talk) 11:53, 7 May 2012 (UTC)
I do, but I also cleaned up your work a bit (w.r.t. the signature). Did the same for the "welcome" template, as well. Thanks, West.andrew.g (talk) 13:53, 7 May 2012 (UTC)
I don't want to be picky... but it was you that put in the signature. I suppose I could be accused of not cleaning that up, when I parameterised the other things.  :-) Yaris678 (talk) 15:51, 7 May 2012 (UTC)
Right... The signatures for both Wikipedia:STiki/milestone template and Wikipedia:STiki/welcome template are now put in by default. I used a clever bit of coding I found at Template:Bubble tea. In fact, while I'm thinking about it... Yaris678 (talk) 17:27, 10 May 2012 (UTC)
Cool hack, thanks for the additions. West.andrew.g (talk) 18:10, 12 May 2012 (UTC)

Milestones page archiving

I just thought I'd post a comment here before ploughing on. Before the milestones page becomes seriously long, I'm happy to take on the task of archiving it every month. We can make an April archive now, while we concentrate on the month of May. Something like Wikipedia:STiki/milestones/Archive1 (April 2012) should be adequate? And they can be linked by an archive box or a list at the top of the page. Should I go ahead with this? Orphan Wiki (talk) 11:17, 7 May 2012 (UTC)

I was also thinking about this... but why go to the trouble? Why not just keep ~1 week of reports. Beyond that, just delete the earliest remaining section/date whenever you check-in each day/night to grant awards. There is no reason we'd need to look back in history -- and if we did, the "revision history" would be trivial to browse by date. Moreover, all this data lives in my database, and reports could easily be re-run. Just seems like more work, and unnecessary work at that. Thanks, West.andrew.g (talk) 13:57, 7 May 2012 (UTC)
OK, well, I'll delete the April records now, and keep up regular maintenance from now on, as I distribute the welcome messages and awards. Orphan Wiki (talk) 14:35, 7 May 2012 (UTC)
  • I think we forgot to congratulate the milestone achievers [1] before deleting. -- ÐℬigXЯaɣ 16:30, 7 May 2012 (UTC)
Since over a week has gone by since these people used the tool, perhaps we are outside the window of opportunity? I assume that was Orphan Wiki's thinking. Thanks, West.andrew.g (talk) 16:35, 7 May 2012 (UTC)
Yes, that's true for the first revert, but the 1000 and 5000 milestones won't mind the delay -- ÐℬigXЯaɣ 19:20, 7 May 2012 (UTC)
True. Go back and give them the new barnstar, then. Though I wouldn't bother for the "first-use" cases. Thanks, West.andrew.g (talk) 19:22, 7 May 2012 (UTC)
Done. Orphan Wiki (talk) 09:12, 8 May 2012 (UTC)

A barnstar for you!

Thanks User:Oliverlyc. I've moved the barnstar to the WP:STiki page -- West.andrew.g (talk) 15:50, 13 May 2012 (UTC)

Excessive edits to a subpage

While we're not supposed to worry about performance, the sysadmins have said in the past that bots should log off-wiki rather than making hundreds of edits per day to log on-wiki. This would seem to apply to your use of User:West.andrew.g/Dead links. You should modify your bot to make its logs off-wiki, or at least batch its changes to at most a handful of edits per day. Thanks. Anomie 03:27, 15 May 2012 (UTC)

Hi Anomie. Just for informational purposes that page is transcluded at Wikipedia:STiki/Dead links where it has a nicer introduction. I envisioned the reports might be useful via the "What Links Here" functionality. A couple instances on my talk page (or STiki's) have indicated this working (here is an instance, I don't feel a need to hunt down others). It's never really panned out how I might have hoped, though. It still serves some minor functionality so I'd prefer not to terminate it altogether (though I am willing if consensus exists). I am in the midst of dissertation writing, but will make code adjustments so the edits are handled in a less system-intensive batch format within several days. I will ACK here when it is done. Thanks, West.andrew.g (talk) 05:53, 15 May 2012 (UTC)
Assuming that my implementation is bug-free, batch mode is now enabled and should take effect immediately. Thanks, West.andrew.g (talk) 14:41, 15 May 2012 (UTC)
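
For illustration only, the batch approach discussed above (collecting report lines and writing them in a handful of edits rather than one edit per line) might be sketched in Python as follows. This is not the bot's actual code: the class name `BatchedLogger`, the `save_page` callback, and the `max_pending` threshold are all hypothetical.

```python
from typing import Callable, List


class BatchedLogger:
    """Collects report lines and writes them in one edit per flush.

    `save_page` is a hypothetical callback standing in for whatever
    performs a single on-wiki edit; it is an assumption of this sketch.
    """

    def __init__(self, save_page: Callable[[str], None], max_pending: int = 100):
        self._save_page = save_page
        self._max_pending = max_pending  # flush once this many lines are queued
        self._pending: List[str] = []

    def log(self, line: str) -> None:
        """Queue one report line; flush automatically when the batch is full."""
        self._pending.append(line)
        if len(self._pending) >= self._max_pending:
            self.flush()

    def flush(self) -> None:
        """Write all queued lines in a single edit, then clear the queue."""
        if not self._pending:
            return
        self._save_page("\n".join(self._pending))
        self._pending.clear()
```

With a batch size of 100, a few hundred log lines per day would translate into a few edits per day, plus one final `flush()` at shutdown.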

STiki vandalism %

Can you clarify why many users have 0% AGF stats. How exactly is this calculated; does this mean the editors only use the vandalism and innocent options?Ankh.Morpork 22:32, 15 May 2012 (UTC)

Per the "CHANGELOG" post near the top of this page, the AGF functionality is quite new to the STiki tool. Previously, only vandalism/innocent/pass were available. It's only been available for about a month of STiki's 2+ year history, which explains why this is the case. I can anecdotally report that its use since being enabled hovers in the 10-15% range (of all classifications). Thanks, West.andrew.g (talk) 22:37, 15 May 2012 (UTC)
Thank you. I wished to ensure that my usage patterns were somewhat aligned with other editors and this aspect puzzled me.Ankh.Morpork 22:43, 15 May 2012 (UTC)


The following changes were part of a new release rolled out this evening:

  • The "wiki-diff" link in the metadata panel was repaired so that it now displays the same diff as in the diff-browser. This was an omission in the GUI's switch to exclusively rollback functionality. (T#008)
  • Some (but not all) users reported that the different elements in the "help menu" failed to jump to the appropriate section/anchors in the help-doc. This has been fixed. Also, a related bug was discovered/fixed where external links in the help document were not being handled correctly (T#011)
  • The diff-browser now has copy-paste functionality. Text can be highlighted in the diff-browser and transferred to the clipboard either via CTRL+C or a right-click context menu (T#012).
  • It is now possible to enable HTTPS support via the "Options" menu. If checked, HTTPS will be used for all communication with the MediaWiki API and any links that open in browser and point to "". A restart is required. (T#013).
  • Following from the above, the "Appearance" menu has been renamed to "Options". This could soon become a larger bin for settings. Documentation changed accordingly.
  • Minor (invisible) changes were also made to the way that persistent settings are stored, so that the XML output remains clean and human-editable as STiki evolves.
  • Code has been put in place so that, if consensus is reached, STiki can be restricted to editors with the rollback permission. This functionality is encoded but currently not in use (T#005).
  • If STiki is going to post a user-warning to a "User Talk" page which is in fact a redirect, it now resolves that redirect (T#002).
  • Most are familiar with warning templates {uw-vandalism#} or {uw-spam#}. Less common are those like {uw-test#}, {uw-joke#}, or {uw-advert#} that also identify problematic users. The identification and escalation from these templates (into spam/vandalism ones) is now part of warning logic. Notably, STiki will not jump from a non-spam or non-vandalism level 4 warning to an AIV post (it will post a level 4 spam/vandalism instead), though it will do this for "4im" cases. (this request was not bug-tracked; it was posted by User:Allens).
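
As a rough illustration of the escalation rule in the last item, here is a minimal Python sketch. It is not STiki's code (STiki is written in Java), and it assumes details the changelog does not state: that warnings climb one level at a time, that a user with no prior warning starts at level 1, and that a level 4 spam/vandalism warning leads to an AIV post. The function name `next_action` and the `"AIV"` return value are hypothetical.

```python
import re
from typing import Optional

# Families that STiki itself issues, per the changelog item above.
CORE_FAMILIES = {"vandalism", "spam"}

# Matches names such as "uw-test2", "uw-spam4" or "uw-vandalism4im".
_TEMPLATE_RE = re.compile(r"^uw-([a-z]+)([1-4])(im)?$")


def next_action(last_warning: Optional[str], offense: str) -> str:
    """Return the next template name to post, or "AIV" for an AIV report.

    `last_warning` is the most recent warning template found on the user's
    talk page (None if there is none); `offense` is "vandalism" or "spam".
    """
    if offense not in CORE_FAMILIES:
        raise ValueError(f"unsupported offense type: {offense}")
    if last_warning is None:
        return f"uw-{offense}1"  # assumption: first warning is level 1

    match = _TEMPLATE_RE.match(last_warning)
    if match is None:
        return f"uw-{offense}1"  # assumption: unrecognized templates are ignored

    family, level, is_im = match.group(1), int(match.group(2)), match.group(3)
    if level < 4:
        # Escalate into the spam/vandalism family, whatever the prior family.
        return f"uw-{offense}{level + 1}"
    if family in CORE_FAMILIES or is_im:
        # Level 4 spam/vandalism, or any "4im" case: report to AIV.
        return "AIV"
    # Level 4 of another family (e.g. uw-test4): post a level 4 warning instead.
    return f"uw-{offense}4"
```

For example, under these assumptions a prior `uw-test2` followed by vandalism yields `uw-vandalism3`, while a prior `uw-test4` yields `uw-vandalism4` rather than an AIV report.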

Thanks to everyone for their support and suggestions/bug-reports! The next release may be a while off: I am heavily involved in my thesis dissertation and some travel in the coming month is likely to provide questionable Internet connectivity. In the meantime, enjoy this one! Thanks, West.andrew.g (talk)


Hello, West.andrew.g. You have new messages at Wikipedia:STiki/milestones.
Message added 07:47, 28 May 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

new--ÐℬigXЯaɣ 20:21, 29 May 2012 (UTC)

With reference to this: the talk-page link in a signature gets deactivated when the signature is on the user's own talk page (since it links to the same page). Just thought of reminding, as he does have a link to his talk page. --ÐℬigXЯaɣ 21:36, 29 May 2012 (UTC)

I am finding STiki ... the best of the bunch.

Hi Gareth. Let us know if you have any problems and happy editing! Thanks, West.andrew.g (talk) 15:45, 14 April 2012 (UTC)
  • Hello again, Andrew!

Still being made to feel very much the new boy, which actually I quite like – having recently entered my eighth decade – I wondered if you would like some feedback. STiki is the second vandalism tool I have employed (the first being Twinkle). Then I successfully applied for Rollback, so tried Huggle. Finally, used igloo.

Having returned to your product, I would say that it is the best of the bunch. I particularly like the easy interaction between STiki and the article/the editor/the history and of course, Twinkle.

This brings me to my first question: if I revert using Twinkle on the article page, does this action get recorded in the STiki records?

Secondly, on your Leaderboard page, you, for example, show STiki as your 'favourite queue', whereas I am listed as preferring Cluebot-NG – why the difference, and how is it selected?

Thirdly, I was allowed to use STiki before Rollback authorisation was granted. Why?

Best regards, -- Gareth Griffith-Jones (talk) 09:06, 18 May 2012 (UTC)

Hi Gareth, I'll answer your questions in order
  • First, I assume you find an edit using STiki, and then use a hyperlink to open the edit/diff in your web browser. It is from here that you use Twinkle to undo the edit. When you return to the STiki window, that same edit is still being displayed, so the only way to get the next one is to use a classification button. This classification will be recorded for leaderboard purposes (i.e., if you press "vandalism" you will be credited for it -- and you won't revert the edit you just made using Twinkle).
  • Second, you will notice there is a "queue" menu in the STiki tool where the queue can be selected. These are behind the scenes algorithms that determine what edits are displayed to users first (based on vandalism probability). There are all types of ways to calculate this with varying accuracy. Right now, it seems like the "CBNG" queue has the greatest chance of showing you an instance of vandalism. CBNG is a third-party algorithm. I wrote and developed the "STiki" approach, so that is why I have some preference for it. Because all the approaches are different, they may find vandalism another does not. If you are using the CBNG queue and find your vandalism hit-rate slowing down during a long session, it may be worthwhile to switch queues and see if this helps things.
  • Third, rollback is not required to use STiki. There is some discussion whether we should make it a requirement -- but given the fact we have had very little abuse using the tool, we're inviting everyone to the party. If you are curious why you were able to do "rollbacks" in STiki when you didn't have the "rollback" permission, this is because STiki implements something called "software rollback" for those who don't have the native right. In this case, STiki does a lot of work to determine "what would a rollback do?" and then makes an edit to the same effect. This is far less efficient than native rollback, but gets the job done!
Thanks for your feedback, and happy classifying! West.andrew.g (talk) 14:10, 18 May 2012 (UTC)
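
To make the "what would a rollback do?" step in the third answer concrete, here is a minimal Python sketch. It relies on the usual description of rollback (undo all consecutive edits by the most recent editor, restoring the last version by someone else); STiki's actual "software rollback" code is not shown on this page, so the `Revision` type and the `rollback_target` function are hypothetical.

```python
from typing import List, NamedTuple, Optional


class Revision(NamedTuple):
    rev_id: int
    user: str


def rollback_target(history: List[Revision]) -> Optional[Revision]:
    """Find the revision a rollback would restore.

    `history` is the page history, newest revision first. The target is
    the most recent revision NOT made by the author of the newest one.
    Returns None when no such revision exists (empty history, or every
    revision is by the same user), in which case nothing can be rolled back.
    """
    if not history:
        return None
    offender = history[0].user
    for rev in history[1:]:
        if rev.user != offender:
            return rev
    return None
```

A software rollback would then save an ordinary edit whose content equals that of the target revision. Fetching the history and saving the edit each cost extra requests, which is consistent with the remark above that this is less efficient than native rollback.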
Good afternoon Andrew,
Thank you for your detailed, and very clear answers, to my questions. Cheers! -- Gareth Griffith-Jones (talk) 15:59, 18 May 2012 (UTC)

Catch up.

Good morning Andrew, I am still preferring STiki and enjoying using it too.

  1. Since your reply of the eighteenth, I have generally always checked STiki on the "queue" menu. When will that show up on the leaderboard statistics information heading?
  2. I can see no consistency in the LAST REVERT BOX, regarding "no warning issued ... edit too old". Sometimes it shows up when edit is within 24 hrs.!

Best regards, -- Gareth Griffith-Jones (talk) 10:38, 31 May 2012 (UTC)

1. Your leaderboard preference will change to "STiki" whenever a *plurality* of your classifications are performed using the "STiki (metadata)" queue. It doesn't take any recent preference into account.
2. IP users should be warned if the rollback is within 24 hours of their edit. Can you give an example of when this did not happen? Remember that Wikipedia operates on UTC time, which may be significantly different than your local time. The "metadata panel" reports the time elapsed since the edit. Are these correct for you? This calculation uses the same logic as the warning system. If *those* are wrong, your system clock may be set incorrectly.
It's okay this time, but in the future try to keep STiki-centric discussion on its talk page. Thanks for your continued support, West.andrew.g (talk) 16:43, 31 May 2012 (UTC)
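
The plurality rule in answer 1 can be illustrated with a short Python sketch. It is not the leaderboard's actual code: the function name `favourite_queue` and the tie-breaking behaviour (the queue seen first wins a tie) are assumptions, and the input is simply the queue name recorded for each classification a user has ever made.

```python
from collections import Counter
from typing import Iterable, Optional


def favourite_queue(classification_queues: Iterable[str]) -> Optional[str]:
    """Return the queue used for a plurality of all classifications.

    Every classification counts equally, however old it is, so recent
    preference has no special weight. Returns None if there are no
    classifications at all.
    """
    counts = Counter(classification_queues)
    if not counts:
        return None
    # most_common() orders by count; ties keep first-seen order (an
    # assumption of this sketch, not a documented leaderboard rule).
    return counts.most_common(1)[0][0]
```

This shows why a switch of queue does not show up immediately: someone with 500 earlier classifications on the CBNG queue needs more than 500 on "STiki (metadata)" before the favourite changes.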
Good evening Andrew,
First of all, your last remark is duly noted, and I shall concur with your wishes. It was just that I wanted your involvement and not someone else's. Regarding ...
1. That is exactly as I had anticipated – but I am surprised that it had not changed several days ago – and that is why I raised the question.
2. Curiously, when I chose 24 hrs., I had no idea that it had any relevance. Now I understand. The connection of one whole day with an IP is obvious. Just for the record, as we are on British Summer Time at present, we are UTC+1. It all makes sense now.
Thank you for your time. Kind regards, -- Gareth Griffith-Jones (talk) 19:03, 31 May 2012 (UTC)
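
The interaction between the 24-hour warning window and local time can be sketched in Python as below. This is an illustration under stated assumptions, not STiki's code: the function name `should_warn` is hypothetical, and treating the 24-hour boundary as inclusive is a guess. Using timezone-aware timestamps makes the comparison independent of the user's local clock offset.

```python
from datetime import datetime, timedelta, timezone

WARNING_WINDOW = timedelta(hours=24)


def should_warn(edit_time: datetime, rollback_time: datetime) -> bool:
    """Return True if the rollback falls within 24 hours of the edit.

    Both arguments must be timezone-aware, so that a local time such as
    British Summer Time (UTC+1) is compared correctly against UTC.
    """
    if edit_time.tzinfo is None or rollback_time.tzinfo is None:
        raise ValueError("timestamps must be timezone-aware")
    elapsed = rollback_time - edit_time
    return timedelta(0) <= elapsed <= WARNING_WINDOW


# Example: an edit at 10:00 UTC, rolled back the next day at 10:30 BST,
# which is 09:30 UTC, i.e. 23.5 hours later.
bst = timezone(timedelta(hours=1))
edit = datetime(2012, 5, 30, 10, 0, tzinfo=timezone.utc)
rollback = datetime(2012, 5, 31, 10, 30, tzinfo=bst)
print(should_warn(edit, rollback))
```

Read naively on a local clock, the rollback in the example looks more than 24 hours after the edit, but in UTC terms it is still inside the window.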

Problem with STiki edit summary

Moved to Wikipedia talk:STiki: DℬigXray 19:03, 11 June 2012 (UTC)

Wikipedia Help Survey

Hi there, my name's Peter Coombe and I'm a Wikimedia Community Fellow working on a project to improve Wikipedia's help system. At the moment I'm trying to learn more about how people use and find the current help pages. If you could help by filling out this brief survey about your experiences, I'd be very grateful. It should take less than 10 minutes, and your responses will not be tied to your username in any way.

Thank you for your time,
the wub (talk) 18:22, 14 June 2012 (UTC) (Delivered using Global message delivery)

It's been completed. Thanks, West.andrew.g (talk) 15:35, 30 June 2012 (UTC)


I received the Barnstar, but you deserve one more. This is the most useful tool that I have ever used. The script finds vandalism that was done days ago and was not caught by other programs that I use. --Morning277 (talk) 15:09, 28 June 2012 (UTC)

Agree and seconded. also improves the efficiency :) --DBigXray 15:38, 28 June 2012 (UTC)


Thanks to Morning277 and Faizan Munawar Varya for the barnstars. They have been transferred to the "awards" section of the main STiki page. Thanks, West.andrew.g (talk) 15:39, 30 June 2012 (UTC)

Feature suggestions

Hello! You have done an awesome job of creating and improving STiki. While at it, I was wondering whether there is a way the software can be customised so that a CVU member can revert vandalism edits from editors in a certain country or IP block. Or is this a feature that needs some time to prepare and code? Thank you. SUwanja Talk to Me. Email Me. 17:27, 8 July 2012 (UTC)

Moved to Wikipedia talk:STiki: for a wider discussion --DBigXray 17:35, 8 July 2012 (UTC)


Did you invent STiki? (talk) 18:51, 12 July 2012 (UTC)

As a matter of record: "yes", I did create the STiki tool which did revert the first of your contributions at your IP address. Despite the warnings you received, your actions since seem to indicate ill faith towards the mission of this project. I notice you have been blocked by administrators. Upon the expiry of that block, I encourage you to reconsider your actions and ponder the possibility of becoming a constructive community member. Absent that, our tools will continue to monitor your behaviors. Rather than wasting time on edits that will be deleted, consider improving articles in areas of your interest. Thanks, West.andrew.g (talk) 04:02, 13 July 2012 (UTC)

Stiki for mobile?

Hi, this was just a random passing thought I had. Since STiki is Java based, can it be ported to work on Java phones? --Rsrikanth05 (talk) 11:01, 16 July 2012 (UTC)

Screen size matters a lot for deciding on the diffs. Even if done it will not be much helpful due to the smaller size of the Cellphone screen.--DBigXray 11:07, 16 July 2012 (UTC)
I agree with DBigXray. I doubt a version could be realized that is too friendly for mobile phones. Something may be possible for tablet devices. However, this is not a current priority. With some exuberant users classifying several thousand edits daily and driving revert hit-rates as low as 20% (combined with the expected Summer drop in vandalism), it seems the focus needs to be on finding vandalism (i.e., better classifiers) and the completion of my thesis, rather than inspiring a new user base. If someone were interested in undertaking the task, though, I would certainly support them in it. Thanks, West.andrew.g (talk) 14:40, 16 July 2012 (UTC)


I posted a message on your page a while ago on an old IP address of mine (it changes every day I can't do anything about it so I am not a sockpuppet) and I asked you if you invented STiki, you replied and told me to stick to good faith edits. I have done so and realize that vandalism is pointless and a very bad form of humor. Thanks for the warning a while ago. I used this yesterday (Special:Contributions/ And today it's another good faith IP address. (talk) 12:36, 21 July 2012 (UTC)

Bravo on the change in attitude. However, rather than contending with a dynamic IP address that changes every day, why don't you create an account so you can persistently track your edits over time and build up a positive reputation in the eyes of the community? It's actually more secure than editing using an IP address. Thanks, West.andrew.g (talk) 18:57, 27 July 2012 (UTC)

The Wikipedia Adventure

Hi! I'm contacting you because you have participated or discussed The Wikipedia Adventure learning tutorial/game idea. I think you should know about a current Community Fellowship proposal to create the game with some Wikimedia Foundation support. Your feedback on the proposal would be very much appreciated. I should note that the feedback is for the proposal, not the proposer, and even if the Fellowship goes forward it might be undertaken by presently not-mentioned editors. Thanks again for your consideration. Proposal: Cheers, User:Ocaasi 16:42, 27 July 2012 (UTC)

Replied in the affirmative, per my previous pledge of support. Thanks, West.andrew.g (talk) 18:52, 27 July 2012 (UTC)
Thanks so much for your support! I really hope we get this made. I also have no idea how we missed each other at Wikimania but I'm curious how your experience went and what you're up to these days with all of your research. Maybe a Skype call to catch up? Ocaasi t | c 00:00, 28 July 2012 (UTC)
Sorry for the delay, and yes, I'd love to talk sometime. I'm currently in thesis writing mode under the title of "Securing Wiki Platforms Against Damaging Contributions" so novel research has taken a back seat for several months while I aggregate previous work. I'm glad to see that you're getting experience across the project and I'd even like to pick your brain about research relevant problems. I've analyzed vandalism and detected vandalism. Analyzed spam and detected spam. Analyzed deleted/copyrighted content ... and ... then I come to find your proposal about "Turnitin", which would seem to complete my trifecta. I cannot express how interested I would be in collaborating with you on that and contributing whatever resources I have. If you don't mind, perhaps Skype-text or IRC would be preferable so there is a record? Look forward to hearing from you. Thanks, West.andrew.g (talk) 05:47, 1 August 2012 (UTC)
Well, I'm glad our paths can cross in a productive way! My preference would be for regular Skype with a follow-up email or wiki-page to hard-code some notes from the meeting. But I'd be happy to do Skype-text if that's your preference. Maybe some combination would work. I'm also glad you stumbled across Turnitin, which is a very neat idea that is very close to being presented to the community, but is currently going through review at the Foundation. I hope to know if I can start an RfC soon. It would be very cool if there was a research component to the Turnitin project. Not only would it 'complete your Trifecta', but it would help quantify the benefit of using Turnitin's detection methods, something we need to demonstrate to be effective in order to expand the project. Let's set something up for early next week if you can slip away from your thesis for a half hour. Best, Ocaasi t | c 16:03, 1 August 2012 (UTC)
The method of communication isn't terribly important to me, I just wanted to create a record and have something "cite-able" if I managed to get anything interesting out of you about your OTRS experience or interactions with the Foundation :-). I plan to give your Turnitin proposal a very thorough reading over the weekend. If that fails to materialize in a timely fashion, I may begin to experiment with using some of the "free equivalent" plagiarism detection algorithms using my own computing infrastructure (useful in your end goal, regardless). Though CorenBot has been working in this space for a while, it obviously has its limitations, and it should prove interesting to learn about the prevalence of copyright issues. When might be a good time for you next week? Feel free to contact me privately via email, which should not be hard to find. Thanks, West.andrew.g (talk) 17:25, 3 August 2012 (UTC)

Hey AGW, email sent, with the caveat that OTRS and Foundation stuff may be limited by privacy issues. But we can talk about a lot without getting into those areas. Best, Ocaasi t | c 15:26, 4 August 2012 (UTC)

Talk Page Auto Archiving

I'll do it this evening (Central Time). Will just try to make consistent with archive size/etc from your current ones, okay? Theopolisme TALK 17:39, 4 August 2012 (UTC)

Done. And I have added the {{DNAU}} template to the thread#1 here so that it remains on this page until 00:00 1 January 2200 (UTC) --DBigXray 18:09, 4 August 2012 (UTC)
Many thanks, West.andrew.g (talk) 18:09, 4 August 2012 (UTC)

The Great Revival: CVU Vandalism Studies Project

Hi! We're dropping you this rather unexpected message on your talk page because you signed up (either quite a while ago or rather recently) to be a member of the Vandalism Studies project. Sadly, the project fell into semi-retirement a few years ago, but as part of a new plan to fix up the Counter-Vandalism Unit, we're bringing back the Vandalism Studies project, with a new study planned for Late 2012! But we need your help. Are you still interested in working with us on this project? Then please sign up today! (even if you signed up previously, you'll still need to sign up again - we're redoing our member list in order to not harass those who are no longer active on the Wiki - sorry!) If you have any questions, please leave them on this page. Thanks, and we can't wait to bring the project back to life! -Theopolisme (talk) & Dan653 (talk), Coordinators

Yep, done. Thanks, West.andrew.g (talk) 05:34, 1 August 2012 (UTC)
Could you add those studies? We're all busy people. Dan653 (talk) 02:34, 7 August 2012 (UTC)
Yep, two most relevant have been added. Thanks, West.andrew.g (talk) 21:34, 7 August 2012 (UTC)


Hello, West.andrew.g. You have new messages at I dream of horses's talk page.
Message added 05:45, 15 August 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

Vandalism Studies Update - August 2012

Hello, members of the Vandalism Studies Project! As some of us are quite new with the Vandalism Studies project, it would make sense for us to re-read some of the past studies, as well as studies outside the project. Please do so if you have a chance, just so we can get into the groove of things. We're planning on attempting to salvage the Obama study (or possibly simply convert it to a new Romney study), as well as hopefully begin our third study this November. If you have any ideas for Study 3, please suggest them! If you have any questions please post them on the project talk page. Thanks, and happy editing - we can't wait to begin working on the project! --Dan653 (talk) and Theopolisme :) 11:31, 24 August 2012 (UTC)
If you would like to stop receiving Vandalism Studies newsletters, please remove your name from the member list.

I'm going to post this here (and talkback some people privately), because I really don't feel like running over and derailing the good intentions and recruitment of the WikiProject. However, looking at the proposal for a "Romney study" and some of the properties being considered for analysis, I can't help but scream "this has been done!" or "there are easier ways!". I really encourage folks to look at existing (and expand the listing of) academic research in this area (and not just my own). It seems silly to try to amass one's own corpus over a single article (i.e., Romney). Research has now produced 250,000+ tagged vandalism instances. Proposing simple questions like "are IPs the problem" is not novel. Research has derived upwards of 100 statistical measures of vandalism probability. There do remain: (1) open questions, and (2) research lessons that need to be better leveraged in an online fashion. These and novel developments would seem a more reasonable use of the project's resources. Thanks, West.andrew.g (talk) 14:58, 26 August 2012 (UTC)
Great! Tell us how and what we should do and we'll do it, as you definitely know more about VS than us. Dan653 (talk) 17:57, 26 August 2012 (UTC)
From my (somewhat biased) perspective on the problem there is a need to determine the technical capabilities of project members and allocate resources accordingly. I tend to focus on technical solutions to the problem, though there are also social and support roles. Prior to that, though, I think the project needs a well prioritized mission statement. What is the goal? Do we want to *understand* vandalism in some social context? Do we want to develop new anti-vandal *algorithms*? Do we want tools like STiki that are used by end-users? Can we get Huggle and STiki to cooperate in some kind of anti-vandalism clearing-house? Do we want to cooperate with those projects that are looking at refining warning templates? All these things are desirable. However, to make real progress in any domain we need to start with a narrower scope. Thanks, West.andrew.g (talk) 07:37, 27 August 2012 (UTC)

A sticky situation

Hi Andrew. Stiki is a great tool - in the right hands. Most automated processes at Wikipedia need a clear demonstration of skill and competency before they can be used, and the acquisition of user rights specific to the task is also generally required. I thought I'd just let you know, in case you weren't aware, that concerns have been expressed recently by experienced editors about the social and cognitive aspects of the organisation of the CVU and its academy. I'm sure you'll join with me in helping to ensure that your own suggested standards for Stiki use are rigorously upheld. Time permitting, I'll help you keep an eye on the performance of the users of your excellent programme. Best, --Kudpung กุดผึ้ง (talk) 12:42, 28 August 2012 (UTC)

I could not agree more. Reading the recent archives of WP:STiki you'll see plenty of discussion about structuring the approvals process. Being that STiki only recently implemented entry conditions, I'd say we still tend towards a casual policy. Part of this is "assume good faith" and part of it is that we don't want to unnecessarily limit our user-base. Bad users do reflect poorly on the whole STiki community, though.
I don't closely follow the evolution and minutiae of the CVU. As I mentioned, I wasn't entirely comfortable with the approval I did grant. I am trying to write a Ph.D. thesis. The realities of this limit my on-wiki time and as the tool grows I've begun to "outsource" some of these responsibilities. The folks from CVU have been a helpful resource (though I cannot speak to their internal affairs). I'd be interested in your opinion on what you think would be an appropriate approvals process. West.andrew.g (talk) 13:02, 28 August 2012 (UTC)
I think probably an identical one to that which we use for WP:AWB - with of course your recommended criteria. The criterion for AWB (500 mainspace edits) is a guideline only, and the decision to accord is by admin discretion, and often more experience is required. The AWB request page is also transcluded to WP:PERM, so editors can apply in either place. As you are the developer and hold the 'key' to it, the introduction of a formal sanction would not require a consensus of the community - practically: I'm giving Wikipedia a 3rd party tool, and it can only be used by approved editors. The downside is that as you are not an admin you would be excluded from according the rights through that process. The admins who work in the PERM department usually get it right though - most admins expect several hundred manual vandal reverts before according Rollback, irrespective of what the CVU thinks, and CVU comments on the perm page are generally not taken into consideration. Indeed, the necessity of informal 'clerking' on that page has raised concerns recently too. Proxy requests also beg the question (with the exception of 'Autopatroller') as to why the candidates can't speak for themselves. Other scripts already need special permission, and Huggle, for example, is bundled with Reviewer. Hope this helps. Kudpung กุดผึ้ง (talk) 14:26, 28 August 2012 (UTC)
Acknowledged, we'll take this up further when I return from WikiSym in a couple days. Thanks, West.andrew.g (talk) 14:45, 28 August 2012 (UTC)
OK, thanks for all your hard work on Stiki, and good luck for your PhD - I know only too well what it involves :) Kudpung กุดผึ้ง (talk) 14:58, 28 August 2012 (UTC)

Predicting quality flaws

Hi Andrew,

I don't know if you have been following CLEF '12 or Signpost, but the latest issue of Signpost mentions something I thought might interest you.

In Wikipedia:Wikipedia Signpost/2012-08-27/Recent research#Briefly, the paragraph starting "Predicting quality flaws in Wikipedia articles", reports on a competition to identify issues worthy of tagging an article for, using machine learning. I have made a comment at Wikipedia talk:Wikipedia Signpost/2012-08-27/Recent research#Predicting quality flaws. I would be interested to know what you think.

Yaris678 (talk) 08:58, 3 September 2012 (UTC)

Interesting work. I do wish researchers would graph entire precision-recall curves instead of just presenting a single point on that graph. Regardless, there is a question here of how this might be applied in a live/online fashion. A bot that goes around auto-tagging articles seems a bit much. A STiki interface also seems inappropriate, given that these article issues aren't well summarized by a diff. Big article issues like this cannot be quickly resolved, nor can they be resolved by just anyone at random. Thanks, West.andrew.g (talk) 18:40, 3 September 2012 (UTC)
Yeah a bot would never work. For a STiki-like tool it wouldn't be based on diffs. It would just take a user to an article and the user would look at the whole article and decide which tags to apply and possibly make some other edits before pressing a button and being taken to the next article. It would be similar to new pages patrol but would be for non-new pages.
Yaris678 (talk) 22:42, 3 September 2012 (UTC)
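To illustrate the precision-recall point raised above: a classifier that emits a score per article yields a different precision/recall pair at every decision threshold, so a single reported point hides most of the trade-off. The following is only a minimal sketch; the function name and the toy scores and labels are hypothetical and do not come from the CLEF '12 evaluation.

```python
def precision_recall_curve(scores, labels):
    """Return (threshold, precision, recall) triples, one per distinct score.

    scores: classifier outputs, higher meaning "more likely flawed".
    labels: 1 if the item truly has the flaw, 0 otherwise.
    """
    total_positive = sum(labels)
    if total_positive == 0:
        raise ValueError("at least one positive label is required")
    # Rank items from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = []
    true_positive = 0
    for flagged, (score, label) in enumerate(ranked, start=1):
        true_positive += label
        # Emit a point only once all items tied at this score are counted.
        if flagged == len(ranked) or ranked[flagged][0] != score:
            precision = true_positive / flagged
            recall = true_positive / total_positive
            points.append((score, precision, recall))
    return points


# Toy data (hypothetical): four items, two of which are truly flawed.
curve = precision_recall_curve([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0])
```

Each triple corresponds to flagging everything scored at or above that threshold; plotting precision against recall over all triples gives the full curve.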

Edit summaries

Hi Andrew. Having now reviewed several hundred reverts made by those who use it, I find that their edit summaries are often simply: (Reverted edit(s) by (...) Using STiki) without further detail. It would be good to know if the edits were 'identified as vandalism' as Twinkle does. I've discovered for example, that many reverts were for unsourced, but good faith content. I don't use Stiki so I don't know if the edit summary options are so limited. Kudpung กุดผึ้ง (talk) 01:55, 4 September 2012 (UTC)

I have moved this thread to Wikipedia talk:STiki#Edit summaries, please continue discussion there. This helps to keep all STiki-related discussion together and keeps the STiki community in the loop. Yaris678 (talk) 07:50, 4 September 2012 (UTC)

Link Spamming Wikipedia for Profit

Hello Andrew, I have been doing some reading of your papers (which are very well written) and was wondering if Figure 1 of Link Spamming Wikipedia for Profit has been uploaded to Wikipedia/media. If you are the copyright holder, please upload it! (Thank you for the barnstar btw) -- Cheers, Riley Huntley talk 14:10, 5 September 2012 (UTC)

Thank you for the kind words. To be frank, I am unsure of the details of copyright restrictions on such an image. Whoever published the conference proceedings (the ACM press or its digital equivalent) has the strongest copyright claim. However, there are some exemptions so I can post the paper on my personal webpage and do some other not-for-profit, not for re-publication things with it. There is actually a person at our department/university who spends most of their time researching these things so they can be added to Penn's Digital Commons. So suffice to say, they can be messy. If you (or anyone) has any better perspective on this, let me know, at a personal level I would be happy to upload it. Thanks, West.andrew.g (talk) 14:29, 5 September 2012 (UTC)


Hello, West.andrew.g. You have new messages at I dream of horses's talk page.
Message added 06:16, 14 September 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.
Hello, West.andrew.g. You have new messages at I dream of horses's talk page.
Message added 06:24, 14 September 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

Just Talk

Hi, sir, it is a fascinating honor to meet you sir, Sorry about this new edit summary issue ... (moved to WT:STiki) ...GoShow (...............) 20:09, 21 September 2012 (UTC)

I am moving this to WT:STiki. West.andrew.g (talk) 20:15, 21 September 2012 (UTC)


Hello, West.andrew.g. You have new messages at WT:STiki.
Message added 02:00, 26 September 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

Sriharsh1234 02:00, 26 September 2012 (UTC)


Hello, West.andrew.g. You have new messages at WT:STiki.
Message added 10:44, 27 September 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

Sriharsh1234 10:44, 27 September 2012 (UTC)

Hello, West.andrew.g. You have new messages at Wikipedia_talk:STiki#Constantly_freezing.
You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

--Fox2k11 (talk) 17:58, 28 September 2012 (UTC)

Stiki API

Hey Andrew! It's been a month or so since I talked to you about Stiki and using it to discover desirable newcomers @ WikiSym. I've been waist deep in designing the system to visualize newcomer activity, but I've recently gotten to the point where I am interested in predicting the quality of new users. I've looked over your docs and I'm working with another researcher who is trying to get your scoring system up and running on our own servers. However, I'm hoping there is a way that I can tie into the scoring system that you have running through some (possibly RESTful) HTTP API. Does such an API exist? Thanks! --EpochFail(talk|work) 21:21, 2 October 2012 (UTC)

Hi Aaron. I'd be happy to help the other researcher get things up and running, but unless you plan on doing some large and internal source code changes it is quite silly for us both to be running parallel databases and parsing most of en.wp in a live fashion. There is a very simplistic API into STiki's metadata-derived vandalism scores (on a per edit basis). I have less accessible (but not "private") data sources as well (for example, CBNG scores, which are an alternative and more linguistically-driven approach to edit quality scoring, as well as WikiTrust scores for most edits).
Rather than running your own STiki instance, I'd be happy to consider: (a) determining the queries you want to run and wrapping them in some sort of API, or (b) just giving you a machine account over that database. I'd be happy to conduct further discussion about your needs on-wiki or over email. Thanks, West.andrew.g (talk) 04:18, 3 October 2012 (UTC)
First, this is awesome. Thank you for publishing a web API. I have a few concerns/questions.
  1. I'm working to keep a close to up-to-date record of the main namespace edits made by newly registered users. I estimate that this is on the order of ~125,000 revisions per month that I'd like to get a score for. This works out to a new revision every 20 seconds or so. I suspect that this will not be too much of a load, but I want to let you know the use I'm imagining up front.
  2. In the interest of reducing the overhead of these HTTP requests, I'd like to ask for the scores of a set of revision IDs per request. Does your system support batch requests? I didn't see anything about it in the readme.
  3. Since I'd like to try to stay in sync with your scores (that are relevant to newly registered users), I'll undoubtedly be making some requests that will yield nothing since I'll sometimes beat your scoring system to the punch. How painful is it if I make a lot of requests that do not yield a score?
Thanks again! --EpochFail(talk|work) 14:59, 3 October 2012 (UTC)
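The request-rate estimate in point 1 can be checked with a little arithmetic. This is only a sketch: the 30-day month and the function name are assumptions made for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_between_revisions(revisions_per_month, days_per_month=30):
    """Average gap, in seconds, between revisions at a steady monthly rate."""
    return days_per_month * SECONDS_PER_DAY / revisions_per_month


# 125,000 revisions spread over a 30-day month (2,592,000 seconds).
gap = seconds_between_revisions(125_000)
print(f"one revision every {gap:.1f} seconds")
```

At 125,000 revisions per month this comes to roughly 20.7 seconds per revision, which matches the "every 20 seconds or so" figure.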
One more question. Do you generate scores in chronological order? For example, if I make a request for the score of NS0 rev_id=1000, can I assume that you have the score for NS0 rev_id=999? --EpochFail(talk|work) 15:07, 3 October 2012 (UTC)
Before getting into the details, I'll note an alternative method to get to the scores. On server "" IRC channel "#arm-stiki-scores" spews out scores in near real-time (thus no HTTP overhead and no queries where the edit is yet to be processed). Let me know if this is the more attractive option, or the API works better for you.
Regarding your questions about the HTTP API: (1) The query load you describe should have minimal impact on the server, (2) Batch request functionality does not yet exist, but if you go this route, I could write functionality for that purpose, (3) Requests that do not yield a score are not painful; but I would venture that STiki is probably ~10 seconds behind the recent changes feed at most points during the day, maybe up to a few minutes at one point during the night when the DB tables get partially locked for a backup routine. (4) Input order to STiki is per the recent changes feed (in order?), but due to threading considerations, output is not quite in order (and you could pop into the IRC channel to get a sense of this and latency). Thanks, West.andrew.g (talk) 15:59, 3 October 2012 (UTC)
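Given the answers above (scores typically lag the recent changes feed by about 10 seconds, and output is not strictly in revision order), a client could poll each revision independently and back off while a score is missing. The following is only a sketch of that idea: `fetch_score` is a hypothetical caller-supplied function standing in for whatever HTTP call the API actually requires, and the delays are illustrative defaults, not values from the STiki documentation.

```python
import time


def wait_for_score(rev_id, fetch_score, max_attempts=5,
                   initial_delay=10.0, sleep=time.sleep):
    """Poll for one revision's score, doubling the wait after each miss.

    fetch_score(rev_id) is assumed to return a float, or None when the
    revision has not been scored yet. Returns None if every attempt misses.
    """
    delay = initial_delay
    for attempt in range(max_attempts):
        score = fetch_score(rev_id)
        if score is not None:
            return score
        if attempt < max_attempts - 1:
            sleep(delay)   # give the scoring system time to catch up
            delay *= 2     # exponential backoff keeps the miss load low
    return None
```

Because output order is not guaranteed, a score for rev_id=1000 says nothing about whether rev_id=999 is ready; each revision is polled on its own.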

Name change

Hey there,

I noticed when you were updating the STiki nightly noticeboard for # of user edits that my new name wasn't merged with my previous name. I used to previously be Activism1234, and STiki has me down as having 860 Stiki edits. I am now Jethro B, I performed a name change, and Stiki has me down as 30 edits. But I'm the same person - same account even. Is there any chance Activism1234 edits can be merged with Jethro B on the table, and updated as such?

Thanks. --Jethro B 05:03, 3 October 2012 (UTC)

This can be done. One question: Is this a "permanent" name change, or is this a case of alternative accounts, i.e., will the "Activism1234" account ever again make an edit? Thanks, West.andrew.g (talk) 05:16, 3 October 2012 (UTC)
Permanent change. The account Activism1234 just redirects to Jethro B. --Jethro B 05:34, 3 October 2012 (UTC)
Done -- Changes will be reflected in tomorrow night's automatic update of the leaderboard (I show ~896 edits for the combined effort). Thanks, West.andrew.g (talk) 05:46, 3 October 2012 (UTC)
Thank you, I appreciate it. --Jethro B 17:54, 3 October 2012 (UTC)

Help with Statistics info on Wikitext site.

Hi Andrew.

I came across a discussion thread of yours in the "village pump" section. I'm an editor at a sister site -- Wikitext-Hebrew. We have a page there for statistics and it hasn't been updated in nearly 2 years. Essentially I was wondering if you could run a similar task (a one-time request, FYI) for us like you did for the user in the post above and paste the data here. I don't know if it being in a foreign language poses additional problems for the task, but I'm pretty sure any of the more computer-savvy ones by us are fluent enough in English to "process" the raw data you could generate to fit into the Hebrew pages by us. There's just no one to create the raw data at the early stage of the process.

It's a one time request, just so we can have some sense of what traffic is like by us since as mentioned it's been years since we've had any stat analysis. many thanks, Daniel Mokhtar — Preceding unsigned comment added by (talk) 20:18, 18 October 2012 (UTC)

Greetings. I am afraid I am unable to fulfill your request. While I have been storing English Wikipedia (exclusively) statistics for ~2 years now, I have not touched any other languages or projects. Single-use processing pipelines are not a time investment I can make at this time (trying to complete a PhD). Now, it would not be hard to do this if you have some computationally-inclined people and infrastructure. At the Wikimedia statistical dumps page are the raw files of statistics. Each *hourly* statistics file for all projects is ~60MB zipped (probably 3x that unzipped), just for perspective (though you'd have to fetch these, I am sure your project's portion is a trivial fraction of this data). Extract your project's statistics hour after hour, day after day ... and then at some point you need to aggregate those files. If someone on your team is comfortable with Java and SQL, I'd be happy to send them the code I use to do this, and it would be quite trivially portable to another project. West.andrew.g (talk) 16:21, 19 October 2012 (UTC)
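The extract-then-aggregate process described above can be sketched in a few lines. This is not the Java/SQL code mentioned in the reply; it is an illustration that assumes each line of an hourly file has four space-separated fields (project code, page title, view count, bytes transferred) and that the files are gzip-compressed. The function names and the project code passed in are placeholders.

```python
import gzip
from collections import Counter


def project_views(lines, project):
    """Sum view counts per page title for one project code."""
    counts = Counter()
    for line in lines:
        fields = line.split(" ")
        # Skip malformed lines and lines belonging to other projects.
        if len(fields) != 4 or fields[0] != project:
            continue
        try:
            counts[fields[1]] += int(fields[2])
        except ValueError:
            continue  # non-numeric view count; ignore the line
    return counts


def aggregate_files(paths, project):
    """Fold many hourly gzip files into one per-title total."""
    total = Counter()
    for path in paths:
        with gzip.open(path, "rt", encoding="utf-8", errors="replace") as handle:
            total.update(project_views(handle, project))
    return total
```

Calling `aggregate_files(paths, project).most_common(100)` would then give the 100 most viewed titles over whatever period the listed files cover.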

Fascinating. After posting it a bit on my end it doesn't seem like I have any takers. Question: Is the Java and SQL code you have something you could send to me (despite my lack of programming languages) and I could keep on file by me till hopefully someday we will have someone by us who's suited to utilize the info? Many thanks for all your time. --Daniel — Preceding unsigned comment added by (talk) 16:32, 22 October 2012 (UTC)

Sure, contact me via email and I will send it to you. Mine is easy to find on my academic homepage. Thanks, West.andrew.g (talk) 21:48, 22 October 2012 (UTC)

ok, i was able to send to your email. just erased my email from here.-- Dmokhtar

Done. Sent to your email. Best of luck. Thanks, West.andrew.g (talk) 22:47, 24 October 2012 (UTC)


Hello, West.andrew.g. You have new messages at Hairrr's talk page.
Message added 02:03, 25 October 2012 (UTC). You can remove this notice at any time by removing the {{Talkback}} or {{Tb}} template.

HairTalk 02:03, 25 October 2012 (UTC)

Most viewed articles

Hello! I was directed to your post about the most popular articles in the past three months at Wikipedia:Village_pump_(technical)#Top_10.2C000_articles - great stuff! I was just asking whether I could get such data anywhere at User_talk:Scottywong#Popular_articles_stats. Apparently a monthly list used to be kept at but it was last updated for October 2010 (when Human Penis Size was only ranked 125, compared to number 22 now. The world has gotten more insecure in the past two years, apparently.) Would you happen to know if there is any way to get that monthly chart updating again, or perhaps replaced with something like what you did? Having the page hit #s too would be great. It kinda shocks me that we don't keep better data on page views--god forbid we had ads on pages, we'd probably know the social security number of every visitor to every page!--Milowenthasspoken 01:08, 17 October 2012 (UTC)

I forgot to add, I was also interested in getting a similar list for the de.wikipedia, to figure out which articles popular there do not exist on en.wikipedia. I got interested in this after I created Ulrich Franzen this week, where de.wikipedia had long had an article on this well known American architect. Cheers.--Milowenthasspoken 01:14, 17 October 2012 (UTC)
First off, I don't do the German statistics so I'll be of no help there. Regarding your first question, I'll make you a deal... I'll code up what's necessary to make and update a "popular articles" page once every ~10 days if you'll go around and make sure it gets some exposure throughout the project (nothing spammy, just hunting down statistics pages and things of that sort where it would be appropriate). It'll need to reside in my user-space, since it technically will be "bot" updated. Real life is a touch crazy right now, but maybe over the weekend I can code this? Thanks, West.andrew.g (talk) 20:50, 17 October 2012 (UTC)
You've got your man! I'm happy to spread the information around far and wide, I really think such statistics will be of great use. Furthermore, news sources should be interested in it as well - what's most popular on Wikipedia is a window to the world's hivemind.--Milowenthasspoken 23:27, 17 October 2012 (UTC)
Code done. Top 5k articles updated once every 10 days. Here it is: User:West.andrew.g/Popular_pages -- Publicize away! Next/first update should occur in 6 or 7 days time. Let me know if it doesn't happen. Thanks, West.andrew.g (talk) 06:18, 22 October 2012 (UTC)
This is excellent. I'll spread the word! On the current list, I assume top entries like Winsor McCay (3.8 million views) are accurate, but the high view count is probably the result of bots or DDOS issues--I've seen entries like that on the old pre-2010 lists too.--Milowenthasspoken 13:39, 22 October 2012 (UTC)
"On October 15th, 2012, Google showed an animated Doodle for the 107th anniversary of Winsor McCay's Little Nemo in Slumberland, featuring an interactive, motion picture comic strip". I think that is your explanation. Thanks, West.andrew.g (talk) 13:42, 22 October 2012 (UTC)
Fascinating!--Milowenthasspoken 13:58, 22 October 2012 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Quick thought: I wonder if the page can note when it was last updated? That would help track what 10 day period was sampled to create the list? Thanks.--Milowenthasspoken 14:55, 22 October 2012 (UTC)

Yeah, I'll add that now. Due to transclusion it needs to come right before the table. Thanks, West.andrew.g (talk) 15:04, 22 October 2012 (UTC)
  • Andrew, I think the initial wave of spreading the word has been successful, clearly there is great interest in this data. I see some discussion of this report moving to WP:5000 or something if there is bot approval. I'd like to propose that the Signpost include an article about the report, but perhaps I should wait until the move happens for maximum effect?--Milowenthasspoken 13:40, 26 October 2012 (UTC)
  • I don't see why we don't transclude the thing into the "WP" namespace as this skirts bot rules (all automated editing would still be in my personal user-space). Didn't seem too popular an idea below, though. Thanks, West.andrew.g (talk) 19:31, 26 October 2012 (UTC)
  • I would not object to the transclusion method. MBisanz talk 17:05, 27 October 2012 (UTC)
  • FYI, it seems the redirect via WP:5000 survived its deletion discussion. So with that through, publicize away! Thanks, West.andrew.g (talk) 16:45, 28 October 2012 (UTC)

Top 5000

Hi. Great job, hope you don't mind but I've moved it to Wikipedia:Top 5000 pages as I think it should be shared by the community and linked on a main template somewhere. Count de Blofeld 15:41, 22 October 2012 (UTC)

I would prefer it if this could be done as a transclusion from the previous page location (or a redirect). The page is automatically updated by a bot. This is completely legal as long as the page resides in my user-space. If it does not, then I would have to go through BRFA (bot approvals). I would prefer not to do this. Thanks, West.andrew.g (talk) 15:47, 22 October 2012 (UTC)

Ah I see. I think though it is a greatly needed tool and you'll have no problems getting bot approval. I'll move it back to your user for now and put in a BAG approval. Should be no problems. I've asked Bisanz, see what he says. Count de Blofeld 15:49, 22 October 2012 (UTC)

Cool. Ping me back here on the outcome of that decision (as I will also need to update code to point to the new location). FYI, if the move takes place, you may also want to handle the transcluded User:West.andrew.g/Popular pages header page as well. Thanks, West.andrew.g (talk) 21:51, 22 October 2012 (UTC)
(crossposting) I suggested you move the page and get bot approval for a bot account to update it. It will be an easy process because it's a simple task. Probably 20 minutes of your time at the most. MBisanz talk 21:54, 22 October 2012 (UTC)

Andrew, cool stuff. I had an idea awhile ago for something that used pageview data, but I could never get the pageview database sorted out. I wonder if you'd be interested in creating it. Here's my idea:

There are large maintenance backlogs on Wikipedia, hundreds of thousands of pages are marked for improvement in some way or another. For a new user stumbling upon Category:Articles lacking sources, it can be daunting. With nearly a quarter million articles to fix, where should one start? How can I make sure the effort I put into fixing these articles has maximum benefit to the encyclopedia?

I want to address this by ranking articles in each maintenance backlog category by their page views. Personally, I would be infinitely more motivated to add references to 10 articles if I knew that those ten articles combined get millions of page views per month, rather than if I just picked 10 random articles from a list of 250,000 and fixed them. Not only that, but my efforts would be benefiting more readers. I laid out the concept at User:Scottywong/Backlog prioritization (using Grok's website in a brute force type way, on a relatively small maintenance category), but I never had time to maintain my own pageview database.

Would you have any interest in taking this on? ‑Scottywong| speak _ 20:30, 23 October 2012 (UTC)
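The prioritization idea above reduces to sorting a backlog category's members by page views. A minimal sketch, assuming the category membership and a title-to-views mapping have already been obtained somehow; the function name and the sample data are hypothetical.

```python
def prioritize_backlog(backlog_titles, page_views, limit=10):
    """Return the most-viewed titles in a maintenance backlog.

    backlog_titles: titles in one maintenance category.
    page_views: mapping of title -> views over some sampling period;
                titles missing from the mapping count as zero views.
    """
    ranked = sorted(backlog_titles,
                    key=lambda title: page_views.get(title, 0),
                    reverse=True)
    return ranked[:limit]


# Hypothetical sample: three unsourced articles with differing traffic.
todo = prioritize_backlog(["A", "B", "C"], {"A": 5, "B": 50}, limit=2)
```

With real data the hard part is keeping the view counts current, which is the pageview database problem mentioned above; the ranking itself is trivial once those counts exist.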

I'm afraid I don't have time to take on individual projects involving data use-cases (as I try to finish up a PhD thesis IRL). However, your user page seems to indicate you have some programming knowledge. I am willing to open a simple API and/or query access to developers in support of these (or similarly impact-ful) goals. Thanks, West.andrew.g (talk) 23:03, 24 October 2012 (UTC)

Hi, your Top 500 list (linked from the main contents page) is quite interesting reading. One issue- there are several odd redlinks included (e.g. "Wsearch.php"). Would it be possible to filter out any such redlinks? (talk) 04:53, 25 October 2012 (UTC)

It could be done, but would waste a bunch of bandwidth, and it's not something I plan on doing. Besides, these red-links still impart to a reader how often non-article actions happen (i.e., search functionality or a 404 page landing). I think there is some value in that. Consider also that deleted articles will also produce a red link. Thanks, 05:12, 25 October 2012 (UTC)
Ok, thanks for clarifying. I didn't realize those redlinks were terms that people had actually typed in- some of them are quite unusual. Thanks. (talk) 08:32, 25 October 2012 (UTC)
They aren't "typed in", but they are actions undertaken on the Mediawiki interface. For example "Wsearch.api" is a use of search functionality. Thanks, West.andrew.g (talk) 19:29, 26 October 2012 (UTC)