Wikipedia:Bots/Requests for approval


If you want to run a bot on the English Wikipedia, you must first get it approved. To do so, follow the instructions below to add a request. If you are not familiar with programming, it may be a good idea to ask someone else to run a bot for you rather than running your own.

 Instructions for bot operators



Archives

Old Format
Archive 1, Archive 2, Archive 3, Archive 4
New Format
Categorized Archive all subpages of this page


Current requests for approval

YFdyh-bot

Operator: YFdyh000 (contribs · SUL · count · logs · page moves · block log · rights log · ANI search)

Time filed: 20:43, Wednesday March 7, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic / Supervised / Manual

Programming language(s): pywikipedia interwiki.py

Source code available: Standard pywikipedia

Function overview: Maintain interwiki links

Links to relevant discussions (where appropriate):

Edit period(s): daily

Estimated number of pages affected: Usually at most one edit every 10 seconds (put_throttle = 10); occasional bursts may occur but will not last long.

Exclusion compliant (Y/N): Yes, compliant.

Already has a bot flag (Y/N): Yes, on zh.wikipedia.org

Function details: Starting from zh.wikipedia, maintain interwiki links for en, zh, etc.
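To illustrate what a setting such as put_throttle = 10 implies, here is a minimal sketch of a minimum-interval throttle. It is not pywikipedia's own implementation; the class name `PutThrottle` and its `clock` and `sleep` parameters are hypothetical and exist only so the behaviour can be checked without real waiting.

```python
import time


class PutThrottle:
    """Enforce a minimum delay between consecutive page saves (illustrative only)."""

    def __init__(self, min_interval=10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_put = None  # time of the previous save, if any

    def wait(self):
        """Block until min_interval seconds have passed since the last save.

        Returns the number of seconds actually waited.
        """
        now = self._clock()
        waited = 0.0
        if self._last_put is not None:
            remaining = self.min_interval - (now - self._last_put)
            if remaining > 0:
                self._sleep(remaining)
                waited = remaining
        self._last_put = self._clock()
        return waited
```

A bot loop would call `wait()` immediately before each save, so short bursts of candidate pages are still spread out to one edit per interval.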

Discussion

Is this task going to be automated, supervised, or manual? What parameters are you using for interwiki.py? — madman 03:26, 8 March 2012 (UTC)

Sometimes automatic: interwiki.py -continue -autonomous -cleanup -async; sometimes supervised; sometimes manual: interwiki.py -warnfile:logs/warning-wikipedia-en.log -lang:en -confirm --YFdyh000 (talk) 04:48, 8 March 2012 (UTC)

Which namespaces will you edit? —  HELLKNOWZ  ▎TALK 09:13, 12 March 2012 (UTC)

MahdiBot

Operator: Mahdi.hajiha (talk · contribs) (corrected by - Jarry1250 [Deliberation needed] 16:56, 5 March 2012 (UTC))

Time filed: 07:02, Sunday March 4, 2012 (UTC)

Automatic, Supervised, or Manual: Manual

Programming language(s): Python

Source code available: pywikipedia

Function overview: interwiki from fawiki, add picture on category and translate from fawiki

Links to relevant discussions (where appropriate):

Edit period(s): Continuous

Estimated number of pages affected: Depends on conditions, but I think nearly 1,000 pages.

Exclusion compliant (Y/N):

Already has a bot flag (Y/N): N

Function details:

Discussion

  • Note: This bot has edited its own BRFA page. Bot policy states that the bot account is only for edits on approved tasks or trials approved by BAG; the operator must log into their normal account to make any non-bot edits. AnomieBOT 16:39, 5 March 2012 (UTC)

Firstly, operator is your main user name (Mahdi.hajiha I presume) and not the bot's name. Please don't make edits from your bot account before trial/approval. You also need to post more details. What operating mode (automatic, supervised, manual) will the bot run? Will the bot be exclusion compliant? How many pages and how fast do you expect it will edit? For interwiki: What interwiki.py parameters will you use? What namespace will the bot edit? Make sure you read WP:INTERWIKIBOT. What do "add picture on category" and "translate from fawiki" tasks mean/do? —  HELLKNOWZ  ▎TALK 16:40, 5 March 2012 (UTC)

1. Sorry, I forgot this. Done!
2. Updated.
3. For interwiki, I want to create interwiki links on new articles and fix the interwiki links of old articles on fawiki.
4. Add picture on category: this code was written with my friends (I think this code already exists on enwiki). It takes the en article and checks the infobox image, and if the picture is available to other wikis, it uses it for the article. /Mahdi.hajiha (talk) 14:27, 6 March 2012 (UTC)
This does not answer the questions: Will the bot be exclusion-compliant (this is actually implied for interwiki.py, but you should know whether it is or not), how fast do you expect it will edit, what interwiki.py parameters will you use, what namespace will the bot edit, and what does "translate from fawiki" mean? I'm not sure the rest of the questions have been answered satisfactorily either; while I do somewhat understand your description of the second task, I don't see how it matches the description "add picture on category". — madman 03:30, 8 March 2012 (UTC)

A user has requested the attention of the operator. Once the user has seen this message and replied, please remove this tag. (user notified) —  HELLKNOWZ  ▎TALK 09:14, 12 March 2012 (UTC)

DarafshBot

Operator: درفش کاویانی (talk · contribs)

Time filed: 04:53, Monday February 27, 2012 (UTC)

Automatic, Supervised, or Manual: Supervised

Programming language(s): python

Source code available: No, based on pywikipedia

Function overview: interwiki and translate to fa.wiki

Links to relevant discussions (where appropriate):

Edit period(s): Continuous

Estimated number of pages affected: A few pages per hour, or more; at first, all changes will be checked manually until the bot is very stable.

Exclusion compliant (Y/N): N

Already has a bot flag (Y/N): N

Function details: My bot is not able to edit on English Wikipedia! It was blocked. Mamad TALK

Discussion

SUL for operator http://toolserver.org/~quentinv57/sulinfo/%D8%AF%D8%B1%D9%81%D8%B4%20%DA%A9%D8%A7%D9%88%DB%8C%D8%A7%D9%86%DB%8C

Hi درفش کاویانی ! Have you read and understood WP:BOTPOL, especially WP:INTERWIKIBOT? Josh Parris 02:53, 3 March 2012 (UTC)

Also what interwiki.py parameters will you be using and what namespaces will you edit? Also, why is the bot non-exclusion compliant? And what does "translate to fa.wiki" mean -- is this some other task in addition to interwiki? —  HELLKNOWZ  ▎TALK 08:51, 5 March 2012 (UTC)

A user has requested the attention of the operator. Once the user has seen this message and replied, please remove this tag. (user notified) —  HELLKNOWZ  ▎TALK 09:15, 12 March 2012 (UTC)

MetrikiBot

Operator: MetrikiBot (talk · contribs)

Time filed: 20:15, Wednesday February 29, 2012 (UTC)

Automatic, Supervised, or Manual: Manual

Programming language(s): Java

Source code available: No, not fully written yet

Function overview: I'm trying to write a bot that downloads page history information for data mining for my MS research.

Links to relevant discussions (where appropriate):

Edit period(s): No editing will be done; we are only downloading information and plan to use periodic batch runs.

Estimated number of pages affected: No pages will be edited; we estimate downloading page histories for hundreds of pages.

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N):

Function details: We are not editing any pages. We are interested in high-volume downloads without facing the limit of 500 revisions returned per request.
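The per-request revision limit mentioned above is normally handled by following the API's continuation values rather than by one oversized request. The sketch below is in Python rather than the operator's Java. It assumes the present-day MediaWiki response shape, with a top-level `continue` block carrying `rvcontinue`, which differs from the 2012 format. The `fetch` callable is a hypothetical stand-in for whatever HTTP client is used.

```python
def fetch_all_revisions(title, fetch, batch_size=500):
    """Collect every revision of `title` by following API continuation.

    `fetch` is a caller-supplied function that takes a dict of query
    parameters and returns the decoded JSON response as a dict.
    """
    params = {
        "action": "query",
        "format": "json",
        "prop": "revisions",
        "titles": title,
        "rvprop": "ids|timestamp|user",
        "rvlimit": batch_size,
    }
    revisions = []
    while True:
        response = fetch(dict(params))
        for page in response.get("query", {}).get("pages", {}).values():
            revisions.extend(page.get("revisions", []))
        cont = response.get("continue")
        if not cont:
            return revisions
        params.update(cont)  # carries rvcontinue into the next request
```

Passing the fetcher in keeps rate limiting and batching policy in one place, which matters given the concern about large batched downloads raised in the discussion.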

Discussion

  • Note: This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT 20:24, 29 February 2012 (UTC)

Copied from pre-rename BRFA

Can you use database dump? —  HELLKNOWZ  ▎TALK 21:26, 28 February 2012 (UTC)

No, we don't have terabytes of space available for use. We are looking to download a representative sample of version histories from 100s, not thousands of examples.--Metriki (talk) 21:31, 28 February 2012 (UTC)

  • Note: This bot has edited its own BRFA page. Bot policy states that the bot account is only for edits on approved tasks or trials approved by BAG; the operator must log into their normal account to make any non-bot edits. AnomieBOT 21:33, 28 February 2012 (UTC)
Note 2 : Presumably the bot would be named MetrikiBot or similar, but the user is new and might not have expected the BRFA process to use the name of the bot, rather than the name of the user. Headbomb {talk / contribs / physics / books} 22:17, 28 February 2012 (UTC)
I have it downloaded and it takes about 1.6 TB with all the revision history for the English version with no talk pages. If you don't need all the revision history it drops to about 400-600 GB and gets smaller as you start breaking things off you don't need (templates for example). Also, take a look here. I'm not sure if a bot of this type would be allowed. 71.163.243.232 (talk) 02:42, 29 February 2012 (UTC)
Kumioko, if you're going to retire, retire. Or at the very least don't disrupt BRFAs by making BS claims about what BAG allows or does not allow for bots. Plenty of bots like this were approved in the past, and plenty will be in the future too. Headbomb {talk / contribs / physics / books} 14:04, 29 February 2012 (UTC)
I concur with Headbomb that this bot's purpose is allowed as soon as the requester confirms a new account name. MBisanz talk 17:59, 29 February 2012 (UTC)
Kumioko may have a point: according to wikitech:Robot policy, large batched downloads via the API are not necessarily a good thing to do. Might Special:Export work better for your purpose? Anomie 21:12, 29 February 2012 (UTC)
There's a limit on Special:Export of 1000 revisions; I'd imagine that the vast majority of pages don't butt heads against that, and the odd ones that do could pull the remainder down via API calls. Josh Parris 04:53, 1 March 2012 (UTC)

A user has requested the attention of the operator. Once the user has seen this message and replied, please remove this tag. (user notified) This has gone quiet. Are you still interested in pursuing this BRfA? Josh Parris 23:02, 7 March 2012 (UTC)

Bots in a trial period

28bot 4

Operator: 28bytes (talk · contribs)

Time filed: 21:22, Monday February 27, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python/Pywikipedia

Source code available: N/A

Function overview: (a) Revert the insertion of a ~~~~ signature in an article, and (b) revert edits where two or more edit tests are found.

Links to relevant discussions (where appropriate): Village Pump discussion right before it was archived

Edit period(s): Continuous

Estimated number of pages affected: between 10–40 per day

Exclusion compliant (Y/N): I plan to add exclusion compliance to the code if this task is approved. Y: Exclusion compliance added in build 39, March 11, 2012. Respects {{bots}} template in both article space and on users' talk pages.

Already has a bot flag (Y/N): Y

Function details: (a) Revert edits which insert a ~~~~ signature in an article, and (b) revert edits where two or more edit tests are found. Examples of such edits are included in the related Village Pump discussion. 28bytes (talk) 21:22, 27 February 2012 (UTC)
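Since the bot's source is not published, the following is only a rough sketch of how the two checks might look, not 28bot's actual heuristic. It assumes that a saved signature appears in the wikitext as a user or contributions link followed by a UTC timestamp, and that "edit tests" are the placeholder strings inserted by the classic edit toolbar. The marker list and all function names are hypothetical.

```python
import re

# Assumed markers: placeholder strings inserted by the classic edit toolbar.
EDIT_TEST_MARKERS = (
    "'''Bold text'''",
    "''Italic text''",
    "[[Link title]]",
    "[http://www.example.com link title]",
    "== Headline text ==",
    "[[File:Example.jpg]]",
    "<nowiki>Insert non-formatted text here</nowiki>",
)

# Assumed shape of a saved signature: user/contributions link, then a timestamp.
SIGNATURE_RE = re.compile(
    r"\[\[(?:User:|Special:Contributions/)[^\]]+\]\].{0,80}?"
    r"\d{2}:\d{2}, \d{1,2} [A-Z][a-z]+ \d{4} \(UTC\)"
)


def added_text(old, new):
    """Return the lines present in `new` but not in `old`, joined by newlines."""
    old_lines = set(old.splitlines())
    return "\n".join(line for line in new.splitlines() if line not in old_lines)


def classify_edit(old, new):
    """Return 'signature', 'edit tests', or None for an edit from `old` to `new`."""
    added = added_text(old, new)
    if "~~~~" in added or SIGNATURE_RE.search(added):
        return "signature"
    hits = sum(added.count(marker) for marker in EDIT_TEST_MARKERS)
    if hits >= 2:  # task (b) only fires on two or more edit tests
        return "edit tests"
    return None
```

Looking only at added lines means signatures or placeholders that were already in the article do not trigger a revert of an unrelated edit.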

Discussion

Revert in what way? With what message? Wouldn't it be better to simply have a disallow edit filter? - Jarry1250 [Deliberation needed] 23:02, 27 February 2012 (UTC)

Revert like this, I suppose, or am I misunderstanding the question? I don't do edit filters anymore, but if someone else wants to write one, that's fine by me. For the message I was thinking Template:Uw-articlesig. 28bytes (talk) 23:34, 27 February 2012 (UTC)
For the record: I expect the bot to only edit in article space.
It's been through the appropriate forum. Fears were raised that incompetent good faith edits would be lost rather than corrected. I want the bot to keep a central log for each of both (a) and (b) of all the additions that were made, so humans can subsequently quickly review to see if there's an acceptable rate of reverting incompetent good faith (vs test) edits. Include appropriate audit information.
It will also make reviewing the trial much easier.
Existing trusted op, non-contentious well publicized task, bot coded and tested already, so Approved for trial (200 edits or 5 days), whichever comes first. Advise here of any faults detected. A trial of this size will give the community sufficient information to decide if automatic reverting is harmful. Josh Parris 00:34, 28 February 2012 (UTC)
Thanks Josh. Yes, it's currently coded to only look at (1) article space and (2) a couple of pages in my userspace. Logging the reverts will be no problem, and I'll list any false positives here. 28bytes (talk) 00:51, 28 February 2012 (UTC)
Note to self: publicize results for community reflection. Josh Parris 07:00, 1 March 2012 (UTC)
Can the trial be extended to March 10? The new functionality's only been in place since yesterday and the bot's not found anything to revert yet. FYI, when it does find something to revert, it will log an entry here: "Reverting TYPE 5 edit tests" for IP signatures and "Reverting TYPE 10 edit tests" for multiple edit tests. (The "TYPE 0" reverts are for edit test reverts approved in the first BRFA; that functionality hasn't changed and is still at 0% false positives.) 28bytes (talk) 00:28, 5 March 2012 (UTC)
Approved for extended trial until the above-mentioned date, sure. Ask for another extension if need be. Josh Parris 05:41, 6 March 2012 (UTC)
Thanks. 28bytes (talk) 13:23, 6 March 2012 (UTC)

Trial results

Trial complete. Here are the details of what the bot has reverted so far. For the IP signature edits (part "a" of this BRFA), the bot has made the following reverts that I would call unquestionably good, as the edit reverted was obvious vandalism or edit tests with no redeeming changes: [1] [2] [3] [4] [5] [6] (the bot attempted to revert but was beaten to it by another editor) [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28] [29]

The following I would classify as "grey areas": while the edit was correctly (IMO) reverted, there were attempts at improving the article along with the misplaced signature: [30] [31] [32] [33] [34] [35] [36] [37] [38] [39] (the bot happened to revert the addition of a giant copyvio here) [40] [41]

For the reverts of multiple edit tests made in one edit (part "b" of this BRFA), I would characterize these as clearly good reverts: [42] [43] [44] [45] [46] (attempted revert, beaten by ClueBot) [47] [48] [49] [50] [51] [52] [53] [54] [55]

As with the IP signature reverts, there were a couple of grey areas. I wouldn't call them false positives (as I would have manually undone such edits myself if those articles had been on my watchlist) but they don't appear to be vandalism: [56] [57] [58]

Of all the reverts the bot made during the trial, this is the only one that, IMO, might have been better as a partial revert than a complete revert: [59]. 28bytes (talk) 20:54, 10 March 2012 (UTC)

To summarize: you think all reverts are defensible, and the heuristic to identify dud edits has held up well.
You've listed http://en.wikipedia.org/w/index.php?title=Mako:_Island_of_Secrets&diff=prev&oldid=480749176 as both "unquestionably good" and "may have been better as a partial revert" and "grey area".
You've listed http://en.wikipedia.org/w/index.php?title=Twyla_Tharp&diff=prev&oldid=480693647 as both "unquestionably good" and "grey area".
Sometimes there's as much as a 20min delay between the edit being made and the revert; why?
Bearing in mind Wikipedia:Competency is required, and given there is such a large proportion of "grey area" edits, where there was an incompetent attempt to improve the article, do you still feel the basis of this BRfA is solid?
There are only 57 edits listed above. I think we may need a larger sample to make a call; what about you? Josh Parris 23:52, 10 March 2012 (UTC)
Oops, meant to put [60] and [61] solely in the "grey area" column. From an editorial point of view the addition of "Twyla Tharp is still living today and loves to dance for others" is not a helpful addition to the article (more so that it was inserted in between the categories and iw links), but it's certainly not vandalism. I intended to put all of the good faith edits, no matter how ill-conceived or poorly executed, in the "grey area" column. For the Mako article, the editor added an IP signature at the top of the page (bad), added an Oxford comma (unnecessary but harmless) and changed the capitalization of "Moon" to "moon" (good, since that's how all three sources have it.) I've reinstated the capitalization change manually. Unlike most of the other IP signature insertions, where just removing the signature would leave a variety of other unwelcome junk in the article, in this case a partial revert would have been ideal, although the full revert still leaves the article in better shape than no revert at all, IMO.
The 20 minute delay is an artifact of the way it processes the recent changes list; I've got on my "to do list" to rework that logic to hopefully cut down the delay. Based on the results so far I am satisfied that all of the bot's reverts have improved the encyclopedia, but I am happy to run it for longer and collect more data. 28bytes (talk) 02:18, 11 March 2012 (UTC)
In future logs please include a link to a diff.
It occurs to me that messaging editors may lead to good-faith material being re-inserted, or at least leave you a defensible position after removal; what do you think of that idea?
Do you know (or suspect) why the vandalism edits weren't caught by the anti-vandal bots? Josh Parris 10:52, 11 March 2012 (UTC)
I'll add diff-logging to the next build. Regarding messaging, it's currently configured to post User:28bot/templates/type5 for IP signatures and one of the first four listed at User:28bot/edit-tests-found#Templates for multiple edit tests (depending on whether the user is registered or unregistered, etc.) I have no idea why edits like the ones reverted in [62], [63] and especially [64], for example, weren't caught by the edit filters, ClueBot, RCP, or anything else until my bot came along. For the second one, perhaps the presence of a <ref> tag had something to do with it. 28bytes (talk) 18:40, 11 March 2012 (UTC)
Diff-logging has been added. 28bytes (talk) 02:04, 12 March 2012 (UTC)
Have you considered any metrics regarding the effect of the message you've been posting? Josh Parris 08:44, 12 March 2012 (UTC)

Second trial

I'm asking for a second trial because the edit period of the first only generated 56 changes, of which 14 were type b) - I don't feel this is enough for the community to make a statistically valid call; past approvals for bots have required very low error rates for broad-based bot reverting. I'm comfortable asking for such a large second trial because it is clear the operator is very closely monitoring the bot during its trials, so I expect that no heinous behaviour will occur. I estimate the trial will run somewhere in the area of six weeks. Approved for extended trial (250 edits). (weekly-ish counts - or something - as a stay-alive on the BRfA please) Josh Parris 08:44, 12 March 2012 (UTC)
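One way to make the sample-size concern concrete, offered as an illustration and not as the reasoning BAG actually used: if a trial of n reverts is assumed to contain no clear false positives, the largest error rate still consistent with that result at 95% confidence is the p that solves (1 - p)^n = 0.05. The function name below is hypothetical.

```python
def upper_error_bound(trials, confidence=0.95):
    """Largest error rate still consistent with zero errors in `trials` tries.

    Solves (1 - p) ** trials = 1 - confidence for p.
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    return 1.0 - (1.0 - confidence) ** (1.0 / trials)


# Compare the first trial alone with the first trial plus 250 more edits.
for n in (56, 56 + 250):
    print(f"{n} clean reverts: error rate below {upper_error_bound(n):.1%}")
```

Under these assumptions the bound tightens from roughly one in twenty to roughly one in a hundred once the second trial's 250 edits are added, which is in line with asking for the larger sample.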

CeraBot II

Operator: Ceradon (talk · contribs)

Time filed: 23:02, Thursday March 1, 2012 (UTC)

Automatic, Supervised, or Manual:

Programming language(s): PHP, Perl

Source code available: No

Function overview: Reverts vandalism and test edits.

Links to relevant discussions (where appropriate): https://secure.wikimedia.org/wikipedia/en/wiki/Wikipedia:Bot_owners%27_noticeboard#X.21.27s_bots

Edit period(s): Continuous

Estimated number of pages affected: 35-100/day

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): Y

Function details: idem, it runs off my toolserver account, I'll put a note on the bot's user page. Thanks.

Discussion

  • Will you be using X's code or have you re-coded it yourself? MBisanz talk 23:10, 1 March 2012 (UTC)
    • It might have been obvious by the link to the discussion above, but, I am willing to take over SoxBot's Anti-Vandal Bot task. However, I had to completely overhaul the main bot script and program it again in PHP. Most of the original script X! wrote was for IRC, so, instead of removing the IRC bits of code that X! was using, I reprogrammed the bot myself with help from User:Matthewrbowker and User:Pilif12p, however, I used the base that X! used and built up from there. Thanks. Ceradon talkcontribs 23:14, 1 March 2012 (UTC)
    • Oh and I separated the IRC bits. :) --Ceradon talkcontribs 23:15, 1 March 2012 (UTC)

Approved for trial (350 edits). MBisanz talk 23:21, 1 March 2012 (UTC)

Er, SoxBot was a vandalism bot and all I am doing is taking over SoxBot's antivandalism and testing bot. Not just testing. --Ceradon talkcontribs 21:13, 2 March 2012 (UTC)
Oh, my mistake. Good luck anyway. Rcsprinter (articulate) (Contribs) (Not Rcs) 21:16, 2 March 2012 (UTC)
I have granted the bot the rollback permission per request, as the bot is approved for trial. Please record here possible concerns. Snowolf How can I help? 22:49, 2 March 2012 (UTC)

Some questions:

  • Why is the source code not public? At the very least I think we should be able to see the heuristics that the bot will be using.
  • What advantages will this bot offer, as compared to ClueBot NG?
  • What namespaces will the bot run in?
  • How do you plan on dealing with false positives?
  • Before starting your trial, please create a userpage, and talkpage, with instructions for users who are affected by false positives.
  • What warning templates will the bot use?
    • Will the bot report users to WP:AIV? When will it report users?
    • Will it be able to detect warnings from other users and act appropriately?
    • If so, how recent will a warning have to be for the bot to use it?
  • Please clarify whether the bot will be Automatic, Supervised, or Manual?

That's all I can think of atm. My primary concern is false positives. As stated by Hellknowz, now that we have bots like ClueBot NG that use very sophisticated methods, there is a much lower tolerance for false positives. --Chris 15:55, 8 March 2012 (UTC)

Addbot 25

Operator: Addshore (talk · contribs)

Time filed: 16:17, Saturday March 3, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP

Source code available: If requested

Function overview: Trying to keep WP:BOT/STATUS up to date

Links to relevant discussions (where appropriate): None (Bot Status talk page maybe counts)

Edit period(s): daily

Estimated number of pages affected: 3 pages

  • $page['inactive'] = "Wikipedia:Bots/Status/inactive_bots";
  • $page['active'] = "Wikipedia:Bots/Status/active_bots";
  • $page['counts'] = "Template:Botstats";

Exclusion compliant (Y/N): N

Already has a bot flag (Y/N): Y

Function details:

  • Getting the status pages
  • Creating a list of all bots and their current status from this page
  • Adding any new bots from Category:Active_Wikipedia_bots
  • Adding any new bots from Category:All_Wikipedia_bots
  • Sorting the list
  • Removing duplicates
  • Trimming whitespace
  • Every week the bot will attempt to find the bot owners for bots that have no owner listed from the bot's user page
  • Split the main list up and post back to the two sub pages
  • Update Template:Botstats
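The list-handling steps above (trim, de-duplicate, sort, split) can be sketched as follows. This is in Python rather than the bot's PHP, the source is only available on request, and the function names and the `(name, status)` pair representation are assumptions made for illustration.

```python
def normalise_bot_list(entries):
    """Trim whitespace, drop duplicates and blanks, and sort bot names.

    `entries` is an iterable of (name, status) pairs; when a name appears
    more than once, the first status seen is kept.
    """
    seen = {}
    for name, status in entries:
        name = name.strip()
        if name and name not in seen:
            seen[name] = status.strip().lower()
    return sorted(seen.items(), key=lambda item: item[0].lower())


def split_by_status(bots):
    """Split (name, status) pairs into active and inactive name lists."""
    active = [name for name, status in bots if status == "active"]
    inactive = [name for name, status in bots if status != "active"]
    return active, inactive
```

Each of the two returned lists would then be written back to its own subpage, and their lengths give the counts for the stats template.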

I propose to be able to add extra checks etc. to the bot as long as the only page edited is the status page (as above), e.g. finding task descriptions from user pages

Discussion

  • This all makes sense, but doesn't AnomieBot already take care of some of this stuff? Or was it some other bot? Headbomb {talk / contribs / physics / books} 22:04, 3 March 2012 (UTC)
You are thinking of a different page. I am talking about Wikipedia:Bots/Status; the page AnomieBot edits is Wikipedia:BAG/Status. ·Add§hore· Talk To Me! 22:12, 3 March 2012 (UTC)
Approved for trial (1 run). Righto. Well I can't think of any reason why this should be denied, so let's go for trial. You're approved for one full run right now. Headbomb {talk / contribs / physics / books} 22:24, 3 March 2012 (UTC)
Trial complete. Please see here; this was the only edit that needed to be made during this run. ·Add§hore· Talk To Me! 19:50, 4 March 2012 (UTC)
What about the 21 other edits made to Wikipedia:Bots/Status/inactive bots, or the 17 edits made to Wikipedia:Bots/Status/active_bots, or the 12 edits made to Template:Botstats? Most of which were done before the trial? Headbomb {talk / contribs / physics / books} 20:05, 4 March 2012 (UTC)
All edits up to now should have been performed through my main account as the edits are more 'assisted' than 'automatic'. They can be considered in the Trial if needed and I can go through explaining each edit as extras were added to the bot. Please note the BotStats template was originally a bot page before being CP moved. ·Add§hore· Talk To Me! 23:15, 4 March 2012 (UTC)

Questions:

  • How exactly does the bot detect an unapproved bot?
    • Also, listing them as unapproved bots seems to break the table. That should be fixed.
  • How exactly does the bot detect bots in trial?
  • How does the bot fetch small descriptions such as {{BotS|DavidWSBot|2D|inactive|Notifies sysops of unblock requests.}}?
  • Wikipedia:Bots/Status/active_bots lists BG19bot_2 as an active bot. However, no such bot exists. What gives?

Headbomb {talk / contribs / physics / books} 03:16, 5 March 2012 (UTC)

Some more questions/remarks.

  • Some things could be tweaked a bit.
    • Add the date of the bot's last edit
    • "1 F·B" could be "1 · F · B" (fixed it myself)
    • Fix the display of {{BotS}} when bots are unapproved
  • The 7SeriesBOT links in Wikipedia:Bots/Status/active bots are messed up (fixed it myself)

Headbomb {talk / contribs / physics / books} 01:20, 6 March 2012 (UTC)

Processes:
This is probably the easiest way to answer this.

  • Load the bot lists as they currently stand
  • Get Category:Active_Wikipedia_bots and add these to active bots
  • Get Category:All_Wikipedia_bots and add these bots to list
  • Check the bot page for the bot template and a bot infobox and parse info from there
    • The status that is in the bot template will overwrite any currently in the list
    • The description of task will be parsed from the info box if no description is currently present
    • If no owner is in the bot list it will also try and get this
  • Stick the list back together and post as well as counting for the stats template.
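The precedence rules in the process above (template status always overwrites, description and owner are only filled in when missing) can be sketched like this. It is in Python rather than the bot's PHP, and the dict-based record layout and the function name are assumptions.

```python
def merge_bot_record(current, parsed):
    """Merge data parsed from a bot's user page into its existing list entry.

    Both arguments are dicts with optional 'status', 'description' and
    'owner' keys. A new dict is returned; the inputs are not modified.
    """
    merged = dict(current)
    # The status from the bot template always wins over the listed one.
    if parsed.get("status"):
        merged["status"] = parsed["status"]
    # Description and owner are only filled in when currently missing.
    for field in ("description", "owner"):
        if not merged.get(field) and parsed.get(field):
            merged[field] = parsed[field]
    return merged
```

Keeping hand-written descriptions and owners in place means manual corrections on the status page are not overwritten by each run.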

·Add§hore· Talk To Me! 01:07, 6 March 2012 (UTC)

These tasks will probably change slightly as I further merge the registered bots page with the bot status page and include other inputs for bots but in essence the processes will stay the same. ·Add§hore· Talk To Me! 01:11, 6 March 2012 (UTC)

Changes from trial run:

  • Make the bot check to see if the user accounts in the list actually exist on wiki.
  • Make the bot get the list of account with the bot flag and add these to the list if they are not already there.
  • If the status of the bot does not equal active or inactive, set it to inactive (this is until further changes are made to the status page, such as including extra sections for bots in trial, etc.)

·Add§hore· Talk To Me! 01:13, 6 March 2012 (UTC)

Further responses to "Some more questions/remarks":

  • The date of the bot's last edit is something that I was thinking of adding and can easily be done.
  • The links section could be altered slightly; if anything, this is more a matter of editing the BotS template than of the bot.
  • As said below I plan on changing any status that is not equal to inactive or active to inactive until the status page expands and changes in such a way it can incorporate extra sections such as trial and unapproved.
  • I don't really see how the links are messed up. I have removed the link anyway as it is linked to already; this could be another simple cleanup job for the bot.

·Add§hore· Talk To Me! 01:27, 6 March 2012 (UTC)

Approved for trial (1 run). Alright, well here's the thumbs up for another run. Run it whenever you feel you've spit-shined things enough. Headbomb {talk / contribs / physics / books} 01:47, 6 March 2012 (UTC)
This will probably be in the next few days ·Add§hore· Talk To Me! 01:49, 6 March 2012 (UTC)
This may have to go on hold for a few more days as something has gone and messed itself up a bit, see Wikipedia:Bots/Status ·Add§hore· Talk To Me! 14:13, 9 March 2012 (UTC)

ListManBot

Operator: Kingpin13 (talk · contribs)

Time filed: 16:36, Monday March 5, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): C#

Source code available: On request

Function overview: Maintain MediaWiki:Bad image list

Links to relevant discussions (where appropriate): Original request at BOTR, notification at talk page

Edit period(s): Weekly

Estimated number of pages affected: 1 (excluding report page)

Exclusion compliant (Y/N): No, bot should be blocked if anything goes wrong, especially since it's an admin bot.

Already has a bot flag (Y/N): Y

Function details: Sort the list at MediaWiki:Bad image list alphabetically and remove deleted images. If deleted images have existing talk pages, the bot will note that in its report.

Discussion

Although the original request asked for "the" to be ignored when sorting alphabetically, this will be a configuration option, disabled by default. - Kingpin13 (talk) 16:36, 5 March 2012 (UTC)
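A minimal sketch of the sort-and-prune step with the optional "the" handling described above. It is in Python rather than the bot's C#, and it works on bare file titles, not on the list page's actual line format. The function names and the `existing` set are hypothetical.

```python
def sort_key(title, ignore_the=False):
    """Case-insensitive sort key, optionally skipping a leading 'The '."""
    key = title.casefold()
    if ignore_the and key.startswith("the "):
        key = key[4:]
    return key


def clean_image_list(titles, existing, ignore_the=False):
    """Return (kept, removed): kept is sorted, removed are the deleted files.

    `existing` is the set of file titles that still exist on the wiki.
    """
    kept = sorted(
        (t for t in titles if t in existing),
        key=lambda t: sort_key(t, ignore_the),
    )
    removed = [t for t in titles if t not in existing]
    return kept, removed
```

Returning the removed titles separately gives the report page something to work from, for example when checking which deleted images still have talk pages.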

Approved for trial (1 week). Trusted botop, simple task. —  HELLKNOWZ  ▎TALK 16:43, 5 March 2012 (UTC)

I'm going to change the edit period to daily, if that's okay? Obviously it will still not edit if nothing needs to be done. - Kingpin13 (talk) 13:41, 6 March 2012 (UTC)
Sure. —  HELLKNOWZ  ▎TALK 13:45, 6 March 2012 (UTC)

Legobot 12

Operator: Legoktm (talk · contribs)

Time filed: 04:44, Monday March 5, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic unsupervised

Programming language(s): Python using the rewrite branch of Pywikipedia

Source code available: User:Legobot/userspace.py

Function overview: Moves userpages which were accidentally created in the mainspace to the userspace.

Links to relevant discussions (where appropriate): Botreq

Edit period(s): Hourly (unless a more frequent time period is requested)

Estimated number of pages affected:

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): Y

Function details:

  • Gets the latest 100 new pages in the mainspace
  • Checks if the page title matches "pagecreator/*"
  • Moves it to "User:pagecreator/*"
  • Leaves a message on the creator's talk page.
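The title check in the steps above can be sketched as below. This is an illustration and not the code at User:Legobot/userspace.py; the function name is hypothetical, and it assumes the match is a plain prefix test on the creator's username followed by a slash.

```python
def userfy_target(title, creator):
    """Return the userspace title for a misplaced page, or None.

    A page qualifies only when its title starts with the creator's
    username followed by a slash and a non-empty subpage name.
    """
    prefix = creator + "/"
    if title.startswith(prefix) and len(title) > len(prefix):
        return "User:" + title
    return None
```

For example, a mainspace page "Example/Draft" created by user "Example" would map to "User:Example/Draft", while "Examples/Draft" would be left alone.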

Discussion

It would get moved. If it was a false positive, {{bots|deny=Legobot}} would need to be added, and the bot would ignore it. I really don't foresee that happening, nor do I have a way to accurately detect such an article. LegoKontribsTalkM 04:52, 5 March 2012 (UTC)
K, just checking. The message left on the userpage should mention this possibility + instructions on how to revert/prevent the bot from doing it again. Another question, do you have any idea of how often page moves would occur, typically? Headbomb {talk / contribs / physics / books} 04:58, 5 March 2012 (UTC)
The message already contains that. See User:Legobot/userfy move. I just ran through the last 1000 newly created articles and the bot did not pick up any. At a random guess, I would think one page move a week? I am not a very active new page patroller so I really have no idea how often this happens. LegoKontribsTalkM 05:38, 5 March 2012 (UTC)
Approved for trial (One week or 10 edits). Alright, then let's go for one week (or 10 edits, whichever comes first), and see where that gets us. Some additional remarks. First, linking to WP:MOVE would go a long way to make that message more noob friendly. Second, a quick glance at Special:NewPages shows that from noon to midnight (my time), the average rate was 50 pages/hour. And considering Sundays aren't usually very active, and bots could further crank up the number of new pages (like Ganeshbot did today), fetching only the last 100 pages every hour will definitely miss stuff. Fetching the last 500 seems safer. Or better yet, have some fancy logic that will fetch all new pages created since the last time the bot checked for new pages. Headbomb {talk / contribs / physics / books} 07:10, 5 March 2012 (UTC)
  • I just rewrote the user message, hope that does not impact the trial or anything. On a sidenote, how about coding the bot so it checks whether it has moved the page before? I have no idea if that would be difficult to code, but it would prevent this bot from edit warring if it gets a false positive without needing hidden messages like {{bots|deny=Legobot}}. Yoenit (talk) 22:58, 5 March 2012 (UTC)
    I removed your sig from the bot's message. The bot will have to subst the msg and its own sig. Josh Parris 06:32, 6 March 2012 (UTC)
    doh, that is stupid. Thanks. Yoenit (talk) 07:44, 6 March 2012 (UTC)

TPBot

Operator: TParis (talk · contribs)

Time filed: 01:22, Friday March 2, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP, Perl, Python

Source code available: https://svn.toolserver.org/svnroot/soxred93/

Function overview:

  • Update the Admin Highlight javascript files
  • Update the RFX Graph
  • Update the RFX Report
  • Update the RFX Tally
  • Update CratStats for each crat
  • Date undated maintenance tags
  • Update Adminstats for each admin
  • Clear the Sandbox


Links to relevant discussions (where appropriate): This is a reincarnation of User:X!'s bots running the exact same code.

Edit period(s):

Estimated number of pages affected: 2025 (~2000 admins (not user pages, subpages of Template:Adminstats), ~25 crats, the sandbox, and a few RFA pages.)

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): N

Function details:

  • Admin Highlight: Generates a list of admins into a subpage of SoxBot as a .js page for use by other .js scripts.
  • RFX Graph: Reads RFA and RFB stats and creates a .svg graphic on toolserver. It then posts the code to create the graphic in a subpage of its userpage.
  • RFX Report: Scans for new RFXs and updates an RFX report in a subpage of its userpage.
  • RFX Tally: Updates the tally on open RFAs and RFBs.
  • CratStats: Updates the subpages of Template:Cratstats to reflect current stats.
  • Adminstats: Updates the subpages of Template:Adminstats to reflect current stats.
  • Sandbox: Clears the Sandbox every 15 minutes.

Discussion

  • Requestor note This is a reincarnation of SoxBot and uses all of X!'s code. No changes have been made by me except to update usernames and passwords, user page paths, and toolserver paths.--v/r - TP 01:26, 2 March 2012 (UTC)

Approved for trial (14 days). MBisanz talk 01:43, 2 March 2012 (UTC)

It's come to my attention that RFX Report and Sandbox cleansing have been taken over by other bots, so I'll be dropping those tasks from TPBot.--v/r - TP 03:16, 2 March 2012 (UTC)
I'm not insisting on the RfX tallying though, only whipped it up since there were three RfAs going at the time. :) Amalthea 12:03, 2 March 2012 (UTC)
Oh, and regarding the admin highlighter, that one I actually mentioned back at Wikipedia:BON#X!'s bots, I had redirected the admin highlighter tool to a clone that is slightly more robust and has a maintained list already. Amalthea 13:11, 2 March 2012 (UTC)
If you have X!'s code for it then Anomie would probably not mind if you took on Wikipedia:Bots/Requests for approval/SoxBot 15. Amalthea 13:13, 2 March 2012 (UTC)
I don't mind doing the tally if you don't want it; I do have X's code for it. I'm not overly concerned about taking over all of X's stuff if other folks have/want to. I just want to make sure the need is filled. I think I also have badimages, I can do that one as well if you'd like.--v/r - TP 13:43, 2 March 2012 (UTC)
Alright, I'm done with my tests. I've dropped the tasks I won't be handling because other bots have taken them over. Amalthea, I'll take over the tally and report if you don't mind.--v/r - TP 00:22, 3 March 2012 (UTC)
I misunderstood the BAG process. I've set up a cron job for TPBot now.--v/r - TP 01:44, 3 March 2012 (UTC)
Your choice. Personally I would prefer if you removed the "Last updated" signature from the page output though to avoid making all those unnecessary edits when nothing relevant has changed. Amalthea 16:34, 3 March 2012 (UTC)
I think that was there because most folks don't actually check WP:RFA and that little signature lets them know that the stats are up to date and the bot hasn't fallen asleep on the job. But I could remove it if it's generally considered annoying.--v/r - TP 17:36, 3 March 2012 (UTC)
It may show that the bot is still alive, but that argument could be made for many tasks. I personally consider that "consuming resources unnecessarily", but that may just be me. Amalthea 18:11, 3 March 2012 (UTC)
Yeah, if you could remove it without too much re-coding, that would be ideal. MBisanz talk 03:18, 8 March 2012 (UTC)
It's not any trouble at all, hardly any code involved. I just hate to see it go because I find it useful. I throw around some <small> tags to see if that helps but I understand how y'all find it annoying so I suppose I'll pull it.--v/r - TP 13:59, 8 March 2012 (UTC)
Done--v/r - TP 03:35, 9 March 2012 (UTC)

DschwenBot

Operator: Dschwen (talk · contribs)

Time filed: 22:38, Friday February 24, 2012 (UTC)

Automatic, Supervised, or Manual: automatic (with initial supervision)

Programming language(s): python (pywikipedia framework)

Source code available: will be made available

Function overview: parses TIGER Line US Census shape files and extracts individual US county outlines. The outlines are converted to KML and attached to the respective county articles using the {{Attached KML}} template (note that the actual KML data does not appear in the article text). This is a new geocoding method that was developed during the last few weeks. The outlines will be displayed on the WikiMiniAtlas and can also be viewed using Google/Bing Maps
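
As a rough illustration of the conversion step described in the overview, the sketch below turns one outline into a minimal KML document. It is not the bot's code (which had not been published at the time of filing): the function name `outline_to_kml` is hypothetical, the outline is assumed to be already extracted from the shape file as (longitude, latitude) pairs, and the choice of a Polygon element is an assumption, since the request does not say which KML geometry the bot emits.

```python
from xml.sax.saxutils import escape


def outline_to_kml(name, ring):
    """Build a minimal KML document for one closed outline.

    `ring` is a list of (longitude, latitude) pairs; KML uses lon,lat order.
    """
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]  # a KML LinearRing must end where it starts
    coords = " ".join(f"{lon},{lat}" for lon, lat in ring)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n'
        "<Placemark>\n"
        f"<name>{escape(name)}</name>\n"
        "<Polygon><outerBoundaryIs><LinearRing>\n"
        f"<coordinates>{coords}</coordinates>\n"
        "</LinearRing></outerBoundaryIs></Polygon>\n"
        "</Placemark>\n"
        "</kml>\n"
    )


print(outline_to_kml("Example County", [(0, 0), (1, 0), (1, 1)]))
```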

Links to relevant discussions (where appropriate): Wikipedia_talk:WikiProject_Geographical_coordinates#Autogeneration_of_U.S._highway_KML, Wikipedia_talk:WikiProject_Geographical_coordinates#Outlines

Edit period(s): one time (per US state)

Estimated number of pages affected: 3143 (the total number of US counties)

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): N

Function details: Bot will create a new page Talk:Article/KML (unless it already exists), bot will generate the correct KML outline data and upload it on said page, bot will place {{Attached KML}} in the article text at the bottom before the first category link.
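
The placement rule stated here (template at the bottom of the article text, before the first category link) could look roughly like the sketch below. It is a simplified stand-in rather than the bot's implementation: the function name `add_attached_kml` is hypothetical, and appending at the end when no category is found is an assumption of this sketch.

```python
import re

# Matches the start of a category link such as [[Category:Example]].
CATEGORY_LINK = re.compile(r"\[\[\s*Category\s*:", re.IGNORECASE)
TEMPLATE = "{{Attached KML}}"


def add_attached_kml(wikitext: str) -> str:
    """Insert {{Attached KML}} before the first category link, if not present."""
    if re.search(r"\{\{\s*Attached[ _]KML", wikitext, re.IGNORECASE):
        return wikitext  # already tagged; leave the article untouched
    match = CATEGORY_LINK.search(wikitext)
    if match is None:
        # No category found: append at the end (an assumption of this sketch).
        return wikitext.rstrip("\n") + "\n\n" + TEMPLATE + "\n"
    head = wikitext[: match.start()].rstrip("\n")
    return head + "\n\n" + TEMPLATE + "\n\n" + wikitext[match.start():]


article = "Intro text.\n\n[[Category:A]]\n[[Category:B]]\n"
print(add_attached_kml(article))
```

Running the function twice on the same text leaves it unchanged after the first pass, which matches the stated intent to skip articles that already carry the template.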

Discussion

Example
Talk:Santa Fe County, New Mexico/KML
Santa Fe County, New Mexico
Santa Fe County, New Mexico (pre KML)
diff to add KML
  1. In the example, I don't like the positioning of the KML link; I think it should be in the External Links section.
  2. The KML infobox is terrible, far too editor-centric (but this is not a bot issue in and of itself).
  3. Given the KML is associated with the article, why is it a subpage of the talk rather than the article?

Josh Parris 09:23, 25 February 2012 (UTC)

Thanks for the feedback, Josh. The examples were added fully manually, I mentioned that I intend to put the template before the cats, that would be a technically simple solution. Of course detecting the External links heading is not hard either, but is it really an external link (yes, I guess, if you think about it as links to google maps/bing maps)? The template design is not really my business and not a bot issue. I know there has been some discussion to revise it. There are no subpages in the article space (but yes, that would be the preferable location to me, too) so we had to resort to moving it into talk space. It won't be likely to interfere with article discussions and won't pollute the article namespace. The basic idea here is to keep momentum going for a useful new geocoding device that emerged from an initially heated but now very productive discussion process. All involved people are fully aware that the technical implementation could be better, but we also realize that if we wait for the necessary technical changes in the MediaWiki software (i.e. uploading of xml data on commons) we will have to wait for a looongg time and this thing will just die. This is about putting the idea out there, giving it exposure and demonstrating its usefulness. --Dschwen 14:33, 25 February 2012 (UTC)
If it makes anybody more comfortable, I do have a record developing, maintaining, and running bots on commons: DschwenBot, VICbot, and QICbot. All those bots are custom developments and have a fairly high edit volume, for fairly complex edits. --Dschwen 20:23, 25 February 2012 (UTC)
Ok, I'm pretty much ready for a limited test run on a handful of pages, if that is ok. I programmed the bot to search for an external links section, and if it does not find one, it inserts it either after "Further reading" or "References", whatever is found first. I will skip articles that either already contain an {{Attached KML}} template, or that already have an existing KML data page. Articles are just skipped and stored for manual processing if any of the conditions are not met. --Dschwen 20:51, 25 February 2012 (UTC)
A map of a county isn't really further reading, and it barely passes as a reference - is it referred to in the article? I suggest, in order:
  • External links
  • Further reading
  • References
But I'm one guy. We really need more people eyeballing this.
Have you considered adding it to {{Infobox U.S. County}}? That seems a more natural fit. If you were to go down that route, I'd suggest creating all the talk/kml pages and then modifying {{Infobox U.S. County}} to hard-reference the KML non-optionally. An entry like:
County boundaries on Google or Bing
might fit in nicely.
Please alert Wikipedia talk:WikiProject U.S. counties to this BRfA, but it looks like it's a pretty quiet wikiproject. I don't foresee any objections, but equally I'm not hopeful on others chiming in. Also raise it at Wikipedia talk:WikiProject United States in the hope that will attract more participation. Josh Parris 21:35, 25 February 2012 (UTC)
Ok, will do. And yes, we could add it to infobox county (but I'll leave that to the county guys). Right now the bot is creating a new external links section at the locations I pointed out (it does not just put it into further reading!). --Dschwen 21:40, 25 February 2012 (UTC)
I think you should drop a note on USRoads as well. They were discussing doing something similar with Roads, Highways, Interstates, etc. I agree with Josh too, the WPUS and WPUSCounties projects suddenly got very quiet over the past few weeks. --Kumioko (talk) 04:11, 26 February 2012 (UTC)
Yes, I know, that is where the whole thing originated. The discussion moved to the GeoProject page and some road people are still very involved. --Dschwen 05:32, 26 February 2012 (UTC)

A technical comment. That stuff really should be in template space, not in talk namespace. Have something like {{Attached KML|ARTICLENAME}} transclude Template:Attached KML/ARTICLENAME. Headbomb {talk / contribs / physics / books} 02:46, 28 February 2012 (UTC)

I like that suggestion. I brought it up here. --Dschwen 03:35, 28 February 2012 (UTC)
I like the general concept too, except I worry such an arrangement could make it hard to identify attempts to hijack a page's content by injecting what, on the surface, looks to be legitimate transclusion. My first thought is to create a new namespace, perhaps KML. —EncMstr (talk) 04:00, 28 February 2012 (UTC)
It won't pass, there's no need for a new namespace for something that is no different than a template. Headbomb {talk / contribs / physics / books} 04:01, 28 February 2012 (UTC)
Colour me stupid, but isn't there a need to pass KML to external websites to render, rather than transclude it into the article page? Josh Parris 04:16, 28 February 2012 (UTC)
Quite right. There is no need for transclusion of KML data. The template {{Attached KML}} generates external links. WikiMiniAtlas (click on a coord globe to activate) is a pretty clever applet which retrieves the data from the /KML subpage of the article's talk page. (See Mojave Desert for examples.) The point is that nothing should transclude KML data. So being in template space is not quite correct. —EncMstr (talk) 04:37, 28 February 2012 (UTC)
Well transcluded or not, it's certainly not an article, not a discussion, and the template namespace is the best to handle that sort of thing. Headbomb {talk / contribs / physics / books} 14:22, 28 February 2012 (UTC)
Agree, and two other people on the GeoProject also think this is the way to go. Template subpages can be understood as auxiliary material supporting the main template. Template documentation pages are subpages of templates as well, and they are not meant to be included. It is easy to shoot down such a suggestion because it is not the perfect solution, but it certainly is better than putting it on a subpage of article talk. --Dschwen 15:09, 28 February 2012 (UTC)
This occurred to me over night: KML would be most suitable in 'File:' space. They are a lot like an .SVG file, and possibly future development will automatically handle (display, annotate) a KML in the same way a .PNG, .GIF, or .JPEG is handled now. —EncMstr (talk) 17:05, 28 February 2012 (UTC)
This is not going to happen, at least not in our lifetime. Please read bugzilla:26059. --Dschwen 17:28, 28 February 2012 (UTC)
I don't see where not going to happen follows from the conversation there. A stripped-down KML format missing title and heading tags is evidently supported by MediaWiki. What about the mapping services? —EncMstr (talk) 17:48, 28 February 2012 (UTC)
Either way, I'm perfectly fine with uploading things in the template namespace, at least as a temporary solution, and whenever MW supports KML/KMZ files, the transition to File namespace can be done at that moment. The template should be placed in the External link sections, before any links and above navigational templates. When the template and bots are updated to take this into account, we can move into trial phase. Headbomb {talk / contribs / physics / books} 18:32, 28 February 2012 (UTC)
The bot is ready (placement is already performed as you describe). --Dschwen 18:46, 28 February 2012 (UTC)
What's the rush for a temporary solution? Let's talk this out and get to the final solution and avoid rework. Josh Parris 02:14, 29 February 2012 (UTC)
Unless I'm mistaken, the KML at Talk:Santa Fe County, New Mexico/KML won't trip the <head* filter. I think it should go to ns:File. But under what name? Can any further use for KML files be foreseen other than a single use per article? Josh Parris 02:10, 29 February 2012 (UTC)

Trial

Approved for trial (10 edits). Alright, then let's go with a small trial just to make sure things aren't completely broken. Then we'll go for a larger trial to make sure things work fine. Headbomb {talk / contribs / physics / books} 21:39, 28 February 2012 (UTC)

Thanks! I have modified the (now renamed) {{Attached KML}} template to also accept data at the new location Template:Attached_KML/PAGENAME. The bot will be uploading there and existing KML files can be moved without disruption that way. --Dschwen 22:46, 28 February 2012 (UTC)
Hmm, I need advice: Where should the template be placed in Winn Parish, Louisiana for example? --Dschwen 00:42, 29 February 2012 (UTC)
Well, per my earlier advice I'd suggest Further reading See also, but also note I'm not crazy about the use of a template. Josh Parris 02:12, 29 February 2012 (UTC)
I'm not convinced that it's time for a first trial. I believe there's a number of aspects of this proposal that aren't nailed down, certainly not to my satisfaction. Josh Parris 02:18, 29 February 2012 (UTC)

Ugh, great. Just ran a small test. Looks good to me. The bot is set to only touch articles where it can unambiguously place the Attached_KML template. The generated KML displays fine in the WikiMiniAtlas and on the two proprietary mapping services. I should add the article title into the KML file so it shows up when viewing it in Google Maps. Just please do not suffocate a great idea. --Dschwen 04:41, 29 February 2012 (UTC)

  • To Dschwen - You should upload the markup before putting the template on the article. This way the article cache will be up-to-date after the bot edits the page, and if something goes wrong (Wikipedia in read-only-mode), the article will not have bogus links
  • To Josh - What exactly would need to be nailed down before proceeding, IYO?
Headbomb {talk / contribs / physics / books} 16:44, 29 February 2012 (UTC)

Ok, I reversed the edit order (thanks for the suggestion, it makes sense). Thereby I introduced a bug which associated the wrong coordinate data with some counties (St. Charles Parish, Louisiana was given the data of a county in North Dakota). This is fixed now in the code and I re-processed the broken counties. Sorry, but I guess that put me a little above the 10 edits! I also updated the KML to include a link back to the Wikipedia article. Further suggestions? --Dschwen 18:08, 29 February 2012 (UTC)

Further discussion

I haven't examined the trial. I feel this is a useful task, I have no intention of smothering it; I want it done right the first time.

What I consider the unresolved issues:

  1. Destination of the KML (I'm hoping for some discussion of the File: namespace)
  2. Technique for linking the article to the KML (I'm hoping some interested editors will chime in regarding a separate template vs an additional field in the County infobox)

Josh Parris 21:20, 29 February 2012 (UTC)

I do not see the advantages of File: namespace, in File you cannot edit the data. KML is a simple XML file. Quick edits are possible and diffs make sense. KML was proposed as an allowed filetype over 14 months ago, a first patch was provided over 12 months ago but reverted only shortly after. Since then there is the security issue with IE6. Sure, you can craft a KML that won't trigger the IE6 security check, but I can tell you with certainty that KML upload won't be enabled in that state. It would just be confusing to the users if some files are "randomly" rejected. Bottom line still is that File: does not provide significant (or any) advantages over Template: space. --Dschwen 21:54, 29 February 2012 (UTC)
The clearest advantage is handling by wikicommons. By referring to KML in the File: namespace, the data can live either on any project language wiki, or on commons and it will be conveniently resolved. I am pretty sure that is not true of all other namespaces. KML data is a natural fit for commons.
The point about it not being editable is also true of images. For most modifications, they have to be downloaded, edited with client PC software, and reuploaded to commons. Diffing revisions, if it were allowed, would not be meaningful for jpeg, gif, and png media, though it would be handy with svg, and now kml. Since this technique is breaking new wikiground, it could well drive the needed supporting technology to fruition. In fact, it is a compelling case for mediawiki development to handle KML in a suitable manner. —EncMstr (talk) 23:00, 29 February 2012 (UTC)
Since the KML data is never included, but only a bunch of links is generated, a template on any language wikipedia may refer to the KML data on the english wikipedia and vice versa. Mediawiki development is really nice, but a bit of wishful thinking right now. It is not a reason to block a solution that is good enough right now. When a better solution has been developed some time in the future it will be trivial to whip up a bot that moves all KML data to commons. But we have something that is useful for the reader right now. We want them to get rich geodata to illustrate wikipedia articles now. For the reader it makes no difference where that data is hosted. They are the customers. Pandering to order and structure loving editors is nice and important to keep the project running smoothly, but here we are talking about a fairly structured solution vs. a perfect solution (and the perfectness is highly debatable!). Please by all means push for a solution you think is better, but do not stop work now, because there is an easy and obvious transition path from what we have now, to what we may get sometime. --Dschwen 23:21, 29 February 2012 (UTC)
There are two good points here; Files can't be diff'd, but they can be shared by Commons. I'd lean towards the diff being more important for detecting vandalism. Josh Parris 04:47, 1 March 2012 (UTC)
[65] - it can't be shared by Commons at this time. --Rschen7754 04:51, 1 March 2012 (UTC)
After jumping through some hoops, I got http://commons.wikimedia.org/wiki/File:NYstateroute308.kml set up. But you are correct: there is no way to link to it, not from enwiki, and not even from commons. See item 4 at commons:User:EncMstr.
You are also correct that putting the data anywhere it can be used is a very good idea. Even if it has to (and can be) moved later. Semantically, template space is better than talk subpages, so I rescind my suggestion of File: space. —EncMstr (talk) 06:17, 1 March 2012 (UTC)
It seems resolved: template space is the least worst place to put it. I presume that's been tested? Josh Parris 06:42, 1 March 2012 (UTC)
It's live. See Special:PrefixIndex/Template:Attached KML/ for subpages in use (except that /doc is, of course, the documentation subpage).
Having agreed that Template subpages are currently better than File pages and uploads, should the subpage name include ".kml" at the end? This might help downloaders assign an appropriate file extension (or mime type). It might also make KML pages easier to identify by a simple rule independent of the template (similar to the way .css and .js pages are currently treated specially in certain namespaces by virtue of the pagename). At present, the subpages are all named identically to the mainspace articles. But {{Attached KML}} could easily be modified to append ".kml" to the name of the template subpage used in the external links. We could change this now before there are very many subpages to move.
In other words, should we move each existing "Template:Attached KML/articletitle" to "Template:Attached KML/articletitle.kml"?
Richardguk (talk) 16:15, 1 March 2012 (UTC)
Can you please discuss details like this at the template page? It does not seem appropriate for this venue, and won't get any attention by the right people who are involved with the template. --Dschwen 16:12, 5 March 2012 (UTC)
Thanks for the feedback. I was surprised how little interest there has been in previous threads on the template talkpage, compared with the move to template space discussed here, but I agree that Template Talk is the logical place.
Subpage renaming suggestion moved to Template talk:Attached KML#KML pagenames.
Richardguk (talk) 03:09, 6 March 2012 (UTC)

Infobox or template?

Now the hard one - aesthetics. I prefer a modification to the infobox (which has the upside of the article not needing modifying at all, making a much simpler, quicker bot), but other involved editors may have a preference to make the links in a separate template, placed at some point in the article. Josh Parris 06:42, 1 March 2012 (UTC)

Perhaps I don't understand the question, but shouldn't *both* be implemented? The infobox to do the majority of existing articles, and a template for articles lacking a supported infobox, or for providing alternate data, or layers, or what-have-you which has not been thought of yet. —EncMstr (talk) 07:16, 1 March 2012 (UTC)
Are there any counties that don't have the county infobox? Why?
Alternate data? Layers? Please explain these possible future needs. Josh Parris 10:56, 1 March 2012 (UTC)
Such discussion would be better on, say WT:GEO, or a more specific project sub-page, with pointers posted on affected project, MoS and template pages. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 12:13, 1 March 2012 (UTC)
I agree there is a need for such a discussion before we can proceed with more extensive testing. Specifically, the discussion should deal with these questions:
  • Would the best place for these links to be in an infobox (presumably through a |kml= parameter), or in the external links section, or somewhere else?
  • Assuming the preferred place for these links is the infobox:
    • What happens if a county does not have an infobox?
      • Should the bot place it in the External Links section (or failing that, the See Also section) as a temporary solution?
        • What if those sections don't exist? Can the bot create them?
      • Should the bot wait for editors to place an infobox, so things are done right the first time around instead of requiring cleanup in the future?
        • Can the bot create the infobox if it's missing?
  • Assuming the preferred place for these links is the External Links section:
      • What if the External Links section doesn't exist? Should the bot use the See Also section if it's there? Or can it create an External Links section? What if there is no See Also section? Headbomb {talk / contribs / physics / books} 14:50, 1 March 2012 (UTC)
Headbomb {talk / contribs / physics / books} 14:45, 1 March 2012 (UTC)
Currently the bot places it at the top of the external links section, if it finds neither external links nor further reading but a {{reflist}} template it creates a new external links section right after the reflist (please read closely what I wrote, it is complicated!!!). The reason is the suggested section order in the MOS. I currently have no solution to find the end of a further reading section (it can be a dangling end with navigational templates and categories under the same heading). I could scan the section for a *-list and assume the end of the *-list is the end of the section. --Dschwen 15:41, 1 March 2012 (UTC)
If the box continues to float right, I suggest putting it at the start of these kinds of sections.
I see you're pushing along discussions relevant to this BRfA elsewhere; I can't see anywhere that links inside the infobox, instead of or in addition to the separate template, have been considered. Please make an effort to form a consensus on the template box versus inline infobox links; I don't feel the community has given it due consideration. Josh Parris 00:32, 11 March 2012 (UTC)

MadmanBot 13

Operator: Madman (talk · contribs)

Time filed: 17:21, Tuesday February 28, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic.

Programming language(s): PHP.

Source code available: No.

Function overview: Generates Wikipedia:Inactive administrators reports; delivers messages and e-mails inactive administrators as appropriate.

Links to relevant discussions (where appropriate): Wikipedia:Bureaucrats' noticeboard#Inactive administrators

Edit period(s): Twice monthly.

Estimated number of pages affected: 3 + 2 * (# of inactive administrators)

Exclusion compliant (Y/N): No.

Already has a bot flag (Y/N): Yes.

Function details: At the beginning of the month, the bot will identify all users who meet the inactive administrator criteria (no edits or log actions for at least twelve months) via a SQL query. A report will be generated in the appropriate section of [[Wikipedia:Inactive administrators/yyyy]]. The bot will then deliver and send the boilerplate talk page message and e-mail to all identified users and update the report to indicate it has done so. (So bureaucrats are assured that the users have been notified appropriately, the bot automatically forwards the copy of the e-mail it receives to wikien-bureaucrats@lists.wikimedia.org with all headers intact.) One week before the beginning of the next month, the bot will deliver and send the "imminent" talk page message and e-mail and update the report to indicate it has done so.
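
The scheduling rules and page estimate in this request can be worked through with a short sketch. This is a Python illustration only; the bot itself is written in PHP, its source is not available, and it identifies users via a SQL query. All function names here are hypothetical, and treating "twelve months" as "the same calendar date one year earlier" is an assumption.

```python
from datetime import date, timedelta
from typing import Optional


def one_year_before(day: date) -> date:
    """Same calendar date one year earlier (29 February falls back to the 28th)."""
    try:
        return day.replace(year=day.year - 1)
    except ValueError:
        return day.replace(year=day.year - 1, day=28)


def is_inactive(last_edit: Optional[date], last_log: Optional[date], today: date) -> bool:
    """True if neither an edit nor a log action occurred in the last twelve months."""
    latest = max((d for d in (last_edit, last_log) if d is not None), default=None)
    return latest is None or latest <= one_year_before(today)


def imminent_notice_date(run_date: date) -> date:
    """One week before the first day of the month following run_date."""
    if run_date.month == 12:
        next_month = date(run_date.year + 1, 1, 1)
    else:
        next_month = date(run_date.year, run_date.month + 1, 1)
    return next_month - timedelta(days=7)


def estimated_pages(inactive_admins: int) -> int:
    """Estimate from the request: 3 + 2 * (number of inactive administrators)."""
    return 3 + 2 * inactive_admins


print(is_inactive(date(2011, 2, 1), date(2010, 5, 5), date(2012, 3, 1)))
print(imminent_notice_date(date(2012, 3, 1)))
print(estimated_pages(28))
```

With the 28 inactive administrators mentioned in the discussion below, the estimate formula gives 59 pages for a run.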

Discussion

Approved for trial (1 run). After reading all of that stuff, this seems fairly uncontroversial and well supported. I'm going to approve for trial (1 full run, with all the notices, emails, etc...), which (if possible) should happen on the 1st of the month. I will want confirmation from both the bot op and a bureaucrat that the bot is working as intended before giving full approval. All emails should explicitly mention that the emails/notices were sent by a bot, and the place to report bugs/suggestions/malfunctions should be equally explicit. That's on top of any other support and contact information that bureaucrats wish to include, if any. I leave the content of on-wiki notices to the discretion of the bot-op/bureaucrats.

The bot could then run on the 1st and 15th of every month. The exact dates are rather irrelevant, but it would be nice for people to have fixed dates for runs so they can predict them. Obviously the frequency of runs should be tweaked according to 'crat wishes if they ever change their mind (i.e. think once a month is enough, or that once per week is better). Headbomb {talk / contribs / physics / books} 19:16, 28 February 2012 (UTC)

Thanks! I planned on the first of the month and a week before the end of the month because that seems to be what the bureaucrats currently do (per Wikipedia:Inactive administrators#Criteria, the second notification is when desysopping is "imminent"). But of course that can be tweaked anytime according to their wishes. I'll be ready to trial this on the 1st; I'll work on appropriate changes to the notifications' language at Wikipedia:Bureaucrats' noticeboard. Cheers! — madman 19:48, 28 February 2012 (UTC)
Changes look fine to me. MBisanz talk 17:57, 29 February 2012 (UTC)

Updates: The report ran successfully, though it ran under my non-bot account first, so I reverted and ran it again. All last edit values are correct, but for some reason some last log values are not. I went through all administrators' logs and confirmed they still met the inactive administrator criteria. Notification then ran successfully; all talk page messages were delivered and all e-mails were sent. The bot reported it hadn't sent any e-mails when it updated the report because it was looking for the incorrect result value for success; I updated the report manually with e-mails I'd gotten a copy of, either from MediaWiki or from the forwarding script; I can confirm the seven remaining users definitely do not have e-mail set, as I was watching the results from the API. The forwarding script should have been forwarding the MediaWiki messages to both me and wikien-bureaucrats; however, I only got a copy of one or the other. I suspect either Sendmail on the Toolserver being wonky or my .forward file being incorrect (I suspect the latter; I meant for it to deliver to both my normal e-mail address and the script but I suspect if it was delivered to my normal e-mail address before the script was called it didn't bother with the script.) I'm hoping wikien-bureaucrats got all 28 forwards but if not, I can forward the 17 that they would have not received. This definitely will be fixed by the next round of notifications, and having confirmed that I believe next month this can be run fully automatically. Cheers, — madman 01:30, 1 March 2012 (UTC)

I just checked the logs and my script did forward the e-mail all 28 times (x2) and got a successful result each time. So I might poke a Toolserver root and see what's going on (alternatively my not getting all of them could be Google Mail's fault). — madman 01:39, 1 March 2012 (UTC)
Trust me, we got all 28 emails... :-/ Hersfold (t/a/c) 01:41, 1 March 2012 (UTC)
Oh. Well, that's... good? I'm open to other suggestions for how bureaucrats can confirm the e-mails were sent appropriately; pretty much all discussion on WP:BN regarded e-mailing a copy to wikien-bureaucrats, which is why I did. :/ — madman 01:43, 1 March 2012 (UTC)
It's good, yeah, the flood of emails was just, well... floody, but I'm sure it'll be a one-time thing due to the first run of the bot. The Toolserver seems to be very reliable with the .forward stuff, so as long as you get it I think you can safely assume we got it as well. One possible idea would be to have the bot post to BN something like "I just ran and sent out X emails, please confirm" and then we'll let you know if the count doesn't match up. Hersfold (t/a/c) 01:46, 1 March 2012 (UTC)
Yeah, I think it's because two months were missed; normally it's like eight. It can post to WP:BN as you suggest and I'll also see if I can have it send one e-mail with all of the MediaWiki copies as attachments with the original headers. I think that'd be the best possible solution. I'll try to get that working before the next round of notifications so there'll be no more flooding. :) — madman 01:50, 1 March 2012 (UTC)
That'd be nice to try, although I'm not sure how well Mailman will handle attachments, even if they're all emails. It's rather notorious for sucking at everything. Hersfold (t/a/c) 01:51, 1 March 2012 (UTC)
That's true; it depends on Mailman configuration, specifically whether it's set to scrub non-text attachments. No harm in sending a test e-mail within a few days (clearly marked test of course!) to see. — madman 04:45, 1 March 2012 (UTC)
Approved for trial (1 run). Alright, well I'm not exactly following what went wrong or what exactly went right about those emails, but it seems obvious more trialing is needed before giving this the full thumbs up. I expect the next run to take Hersfold's feedback (as well as that of other 'crats, if they said something) into account, obviously. And if it's not already included, it might also be smart for the bot to create some kind of report about who got notified by email, and who doesn't have an email linked to their account. But that last part is just my two cents. Headbomb {talk / contribs / physics / books} 22:20, 3 March 2012 (UTC)
Basically a summary of all of the above is since there were 28 inactive administrators (we skipped a couple months of this task), 28 e-mails were sent to the mailing list and that's kind of spammy. The report you mention already exists; see Wikipedia:Inactive administrators#April 2012 for the bot's report. Hersfold, can you let me know if other 'crats have commented within the mailing list? Thanks! — madman 22:40, 3 March 2012 (UTC)
So what's going to be different next time? A single email to the mailing list detailing everything? Or just fewer emails? Headbomb {talk / contribs / physics / books} 22:41, 3 March 2012 (UTC)

I'm going to try to send one e-mail with all the individual e-mails as attachments. I think it's important for verification purposes that all of the e-mails have all of their original headers. This is easy to do. However, I'm going to have to send a test e-mail sometime this week to see if Mailman is going to scrub the attachments. — madman 22:51, 3 March 2012 (UTC)
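As a rough illustration of the "one e-mail with the originals as attachments" idea, the sketch below wraps each raw message as a message/rfc822 part so that its original headers travel with it. It uses Python's standard email package purely for illustration; the function name, addresses and subject are hypothetical, and nothing here reflects the bot's actual code. Whether Mailman would scrub such parts is exactly the open question above.

```python
from email import message_from_string
from email.message import EmailMessage


def bundle_as_attachments(raw_messages, sender, recipient, subject):
    """Build one digest e-mail carrying each original message as an attachment.

    raw_messages is a list of complete message sources (headers and body).
    Each one is parsed and attached as message/rfc822, which keeps the
    original headers intact inside the attachment.
    """
    digest = EmailMessage()
    digest["From"] = sender
    digest["To"] = recipient
    digest["Subject"] = subject
    digest.set_content(f"{len(raw_messages)} notification e-mails are attached.")
    for raw in raw_messages:
        original = message_from_string(raw)
        # Passing a Message object makes the attachment a message/rfc822 part.
        digest.add_attachment(original)
    return digest
```

The digest can then be handed to whatever sending mechanism is in use; a single test message to the list would show whether the attachments survive.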

I recognize the other crats and some users feel that it is essential a crat verify that the emails were sent by being able to view the headers, but if Madman shows me on-wiki that he left the notes and tells me he sent the emails, I would trust him because the action is so easily reversed. MBisanz talk 03:26, 8 March 2012 (UTC)

I'm also leaning towards posting the e-mails with all headers on the Toolserver in a public_html directory and then sending one e-mail advising wikien-bureaucrats how and where they can be reviewed. This would actually simplify the code considerably, as the e-mail would barely have to be parsed and wouldn't have to be translated, whereas sending an e-mail with all of them as attachments would actually complicate it quite a bit. — madman 03:35, 8 March 2012 (UTC)


mmbot

Operator: Mmovchin (talk · contribs)

Time filed: 23:42, Monday February 20, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: Standard pywikipedia

Function overview: Adds missing reference sections (<references />)

Links to relevant discussions (where appropriate): Xqbot Mobius_Bot UsbBot Prombot

Edit period(s): Every 10 minutes

Estimated number of pages affected: 2-5 per edit

Exclusion compliant (Y/N): No, because every site where references are given needs a <references />-tag. Yes, implemented.

Already has a bot flag (Y/N): no

Function details: Just adds missing reference sections (<references />). It gets its pages from WP:NOREFLIST. Examples: 1, 2, 3, 4.

Discussion

  • Comment - The requested functionality duplicates that of User:Xqbot, which already does a fine job with only one or two passes per day.  -- WikHead (talk) 00:33, 21 February 2012 (UTC)
    My personal view is that duplicated functionality is a good thing.Josh Parris 04:10, 21 February 2012 (UTC)

It seems that {{reflist}} is the preferred option nowadays; is it possible to use that instead? Josh Parris 04:10, 21 February 2012 (UTC)

@Josh Parris: Yes, of course. This is not a problem, see example 5 and 6.--mmovchin Talk 10:55, 21 February 2012 (UTC)
Also, please don't edit via bot account in mainspace until BAG gives you a trial. —  HELLKNOWZ  ▎TALK 10:59, 21 February 2012 (UTC)
  • What happens if the page already contains a "References" or similarly named section?
  • How does the bot determine where to insert the section if needed?
  • Given references tag is often removed by vandalism (or occasionally error), how long does the bot wait before performing the edit on the article? How long for IPs and editors with very few edits?
  • Exclusion compliant: "No, because every site where references are given needs a <references />-tag." -- that is not a reason to make the bot non-exclusion compliant. Every approved bot task has consensus, therefore everyone could argue that there is no reason the bot wouldn't be disallowed. What if a certain page used a syntax you do not anticipate and cannot easily fix, therefore requiring the page to be blacklisted, for example? —  HELLKNOWZ  ▎TALK 10:58, 21 February 2012 (UTC)

mmovchin, have you read and understood WP:BOTPOL? Josh Parris 11:16, 21 February 2012 (UTC)

Some people might say that with less than 400 edits under your belt, you may not be ready to operate a bot. How would you respond to that? Josh Parris 11:16, 21 February 2012 (UTC)

He's got 3298 on dewiki. —  HELLKNOWZ  ▎TALK 11:17, 21 February 2012 (UTC)
And he's a Huggle developer, and has a computer science background. Scrub that. Josh Parris 11:40, 21 February 2012 (UTC)
HELLKNOWZ:
1) When an article already contains a section named "References", "Footnotes" or "Notes", it checks whether there is already a references tag or any reflist template. If not, it simply adds one (see example 5).
2) References sections are usually placed before further reading / external link sections. For example, on this wiki, the script would place the "References" section in front of the "Further reading" section, if that existed. Otherwise, it would try to put it in front of the "External links" section, or if that fails, the "See also" section, etc.
3) If a references-tag or reflist-template gets removed while there are still references given in the article, it simply adds the reflist again. It checks WP:NOREFLIST every 10 minutes.
4) I have implemented this possibility now.
@Josh: Yes, I have read and understood WP:BOTPOL.
@Josh and HELLKNOWZ: See here. I have 3298 edits on dewiki and 359 edits on enwiki. Or do you want to see some of my programming skills?--mmovchin Talk 11:51, 21 February 2012 (UTC)
I don't have any issues with that. —  HELLKNOWZ  ▎TALK 11:59, 21 February 2012 (UTC)
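The placement order described in answer 2 could be sketched roughly as follows. This is only an illustration under stated assumptions, not mmbot's actual code: the function name is hypothetical, only level-2 headings are matched, and the fallback of appending at the very end ignores categories and interwiki links that a real bot would need to keep last.

```python
import re

# Sections the new block is placed in front of, in the order of preference
# given in the discussion (first match wins).
ANCHOR_SECTIONS = ["Further reading", "External links", "See also"]

REFERENCES_BLOCK = "== References ==\n{{Reflist}}\n\n"


def add_references_section(wikitext):
    """Insert a References section before the first preferred anchor heading."""
    for name in ANCHOR_SECTIONS:
        heading = re.compile(
            r"^==\s*" + re.escape(name) + r"\s*==\s*$",
            re.IGNORECASE | re.MULTILINE,
        )
        match = heading.search(wikitext)
        if match:
            start = match.start()
            return wikitext[:start] + REFERENCES_BLOCK + wikitext[start:]
    # Assumed fallback: no anchor heading found, so append at the end.
    return wikitext.rstrip("\n") + "\n\n" + REFERENCES_BLOCK.rstrip("\n") + "\n"
```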

1) What about any other match, like "References and external links" [66]. Those three are definitely not the only ones possibly. "References and notes" is often used.

1.A) What are the markup syntaxes and templates the bot considers to be references. We have quite a few last I checked.

2) And when it fails to find any of those hardcoded section names?

3) That's not quite what I meant. Example: IP comes and vandalizes the page, removing {{reflist}}. 10 seconds later the bot is running a scheduled run. Bot restores the {{reflist}}, but doesn't account for vandalism. Editors see bot edit and don't verify if previous version had vandalism. This happens a lot, more so with bots that fix markup issues that are caused by vandalism and such. —  HELLKNOWZ  ▎TALK 11:59, 21 February 2012 (UTC)

1) Thanks, I added "References and notes" and "References and external links". The bot is only able to detect such sections if it knows how they could be named.
1A) 'en': [u'Reflist', u'Refs', u'FootnotesSmall', u'Reference', u'Ref-list', u'Reference list', u'References-small', u'Reflink', u'Footnotes', u'FootnotesSmall'] and the references-tag.
2) When it can't find a reference section or any of the tags and templates in 1A, it creates its own section named "References" containing the Reflist template.
3) I could implement that the bot will wait 10 minutes after the last edit of an article.--mmovchin Talk 12:39, 21 February 2012 (UTC)
1) Those are not the only examples though. There are lots of others, like "Works cited", "Bibliography", "Published works", etc. I think you should take care to detect those words mentioned in any sections. I know the chances are low and I'm sorta nitpicking, but we need to account for false positives since this is an automated process. For example, it could be "References, notes".
1A) Does it account for {{Refbegin}}/{{Refend}} or {{Notelist}}?
1B) What happens when there are several sections, like both "References" and "Citations"?
3) At least. —  HELLKNOWZ  ▎TALK 20:54, 21 February 2012 (UTC)
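The waiting rule from point 3 amounts to a simple age check on the latest revision. A minimal sketch, assuming the caller already has the timestamp of the page's last edit (how it is fetched is left out, and the names are hypothetical):

```python
from datetime import datetime, timedelta

# Ten minutes is the minimum suggested in the discussion; a longer period
# for IPs or very new editors could be passed in by the caller.
MIN_QUIET_PERIOD = timedelta(minutes=10)


def is_settled(last_edit_time, now, quiet_period=MIN_QUIET_PERIOD):
    """Return True once the page has gone unedited for at least quiet_period."""
    return now - last_edit_time >= quiet_period
```

A page that fails this check would simply be left for the next scheduled run, which gives vandalism patrollers a chance to revert first.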

It may be valuable to look at the source code for AWB's module AddMissingReflist. Josh Parris 21:21, 21 February 2012 (UTC)

How are things going? Josh Parris 21:53, 25 February 2012 (UTC)

Oh, sorry. This here hasn't appeared on my watchlist.
1) If the bot simply checks keywords, there will be many false insertions. I think we do not have to place references in sections like "Works cited", "Bibliography", "Published works", etc. If an article has references but no reference list, then the bot adds a section with such a list. So we need to add all the possible names such a section could have; otherwise the bot adds its own.
1A) Yes.
1B) Also yes, it does.
2) 1B) "Citations" is not a section to place references there. "References" is preferred. After this there are "Footnotes" and "Notes".--mmovchin Talk 13:30, 26 February 2012 (UTC)
1) But that's the problem. The bot could be inserting "References" section when there already is a section for references but it's named (slightly) differently. So if there is "Referenced material" the bot would insert another "References" section. If the bot had a blacklist of words when to leave the article for manual checking, it would not make that edit, because "reference" would trigger a skip. I'm afraid references are a touchy subject on English Wikipedia (notably, WP:CITEVAR) and bots should exercise care when making assumptions about what other may or may not have done. AWB can make mistakes, automated bots shouldn't.
1B) I asked " What happens when there are several sections, like both "References" and "Citations"?"; you replied "Also yes, it does." Sorry, but I'm not sure what you meant. This is a typical case to avoid editing by bot and leaving it for manual checking.
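The skip-when-unsure approach suggested here could look roughly like the sketch below. The known names come from this discussion; the trigger words and the function name are hypothetical choices for illustration. The rule is deliberately conservative: any heading that merely contains a trigger word (even "Notable people") sends the page to manual checking.

```python
# Headings the bot already knows how to handle (from the discussion above).
KNOWN_SECTIONS = {
    "references",
    "footnotes",
    "notes",
    "references and notes",
    "references and external links",
}

# Hypothetical trigger words: an unknown heading containing one of these
# causes a skip rather than a guess.
TRIGGER_WORDS = ("reference", "citation", "footnote", "note",
                 "bibliograph", "works cited", "published works")


def classify(headings):
    """Return 'skip', 'use_existing' or 'add_section' for a list of headings."""
    normalized = [h.strip().lower() for h in headings]
    known = [h for h in normalized if h in KNOWN_SECTIONS]
    suspicious = [
        h for h in normalized
        if h not in KNOWN_SECTIONS and any(w in h for w in TRIGGER_WORDS)
    ]
    if suspicious or len(known) > 1:
        return "skip"          # ambiguous: leave for manual checking
    if known:
        return "use_existing"  # exactly one recognized section
    return "add_section"       # nothing reference-like found
```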

Trial

Anyway, Approved for trial (50 edits). Let's see how it runs. —  HELLKNOWZ  ▎TALK 13:47, 26 February 2012 (UTC)

Oh, I answered 1A) two times. I've corrected it now.--mmovchin Talk 13:53, 26 February 2012 (UTC)
Okay, so what about "Reference works" and "Reference list" in the same article? My point is that no matter how many section headlines you hardcode, I'll make up a few new ones and the bot would in principle have false positives. :) Which is a good reason to avoid editing articles you are not 99% sure will be correct. —  HELLKNOWZ  ▎TALK 13:56, 26 February 2012 (UTC)
Is it better to analyze whether a section name starts with "Reference"? But let's see how it works. There were some problems with the crontab on the toolserver, but now everything should work correctly. I will stop the cron when 50 edits are made.--mmovchin Talk 15:03, 26 February 2012 (UTC)

Due to lack of time, I won't be able to solve the problem before 02/05/2012. The reason for this is that I am not at home this week. Once I'm at home, I will fix the problem on the toolserver.--mmovchin Talk 22:46, 27 February 2012 (UTC)

I'm going to presume you mean 2012-03-05, next Monday. Noted. Josh Parris 00:13, 28 February 2012 (UTC)


User:ArticlesForCreationBot 5

Operator: Petrb (talk · contribs)

Time filed: 14:34, Monday January 30, 2012 (UTC)

Automatic or Manual: Automatic

Programming language(s): c#

Source code available: yes

Function overview: remove the "under review" parameter if no one changed the page within 24h

Links to relevant discussions (where appropriate): not needed, maintenance task

Edit period(s): daily

Estimated number of pages affected: one or two a day, maybe less

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): Y

Function details: notify user after 36 hours if started a review of submission and didn't finish it, repeatedly every 36 hours N times.

Discussion

How is it possible to retrieve the username of the reviewer? Petrb (talk) 14:51, 30 January 2012 (UTC)
I simplified the task, let me know if it's not possible to do that this way Petrb (talk) 14:54, 30 January 2012 (UTC)
(edit conflict) By searching the history, similar to what the wikiblame tool does: you have to check which contributor added the r to the template (by searching the diffs) and then check whether that user has made another edit within the last 24h. If that's too complicated, simply use 24h without edits. mabdul 14:58, 30 January 2012 (UTC)
Yes I prefer the simple way, I am lazy, you know. Petrb (talk) 15:01, 30 January 2012 (UTC)
Eh, can we have some links so we can judge if this has consensus please? - Jarry1250 [Deliberation needed] 15:15, 30 January 2012 (UTC)
As I described when filling in this BRFA: this is a maintenance task. A review is normally really fast (within a few minutes), and only complicated/good articles need a longer time period. Sometimes a reviewer forgets that he/she has marked a submission for review, and thus the user doesn't get a review; with our backlogs at the moment, it can take very long until somebody notices that the draft was missed. mabdul 15:26, 30 January 2012 (UTC)
Oh, I'm not saying it sounds unreasonable, but it would be good to see even a guideline page that implies "stealing" other people's reviews is acceptable and/or desirable, and in what timespan. - Jarry1250 [Deliberation needed] 15:53, 30 January 2012 (UTC)
I don't believe such a task is needed. If an article has been marked as under review for over 24 hours, it is common practice to consult the reviewer who marked it. On occasion, there are special circumstances where it takes over 24 hours for a reviewer to give an article a full review, or where the reviewer is doing some heavy work on the article over the course of several days. If it is going to take longer than 24 hours, reviewers usually leave comments on the submission. If a bot were tasked with this, it would not be able to evaluate the reason for the delay. Alpha_Quadrant (talk)
Might it be better to simply notify reviewers of older reviews (over 72 hours, perhaps?) - Jarry1250 [Deliberation needed] 23:51, 30 January 2012 (UTC)
That is a good idea. It would save other reviewers the trouble of asking about the submission. Alpha_Quadrant (talk) 22:41, 31 January 2012 (UTC)
But 72h is rather "late"; please inform the user every 24h. mabdul 14:40, 1 February 2012 (UTC)
How about 36 (48?) hours. It gives the reviewers some time, in case they plan on completing the review in a few hours. 24 hours is a fairly short time window. Alpha_Quadrant (talk) 17:38, 1 February 2012 (UTC)
36h (1.5 days) sounds good. Keep in mind: it's only a notice, which can easily be ignored or removed from anybody's page. It should only be a notice that a page shouldn't get abandoned because of a big backlog (if Chzz is taking a break again XD) mabdul 12:28, 2 February 2012 (UTC)
Okay, before a trial, perhaps someone could draw up the text of the intended notification? In addition, perhaps someone could comment on whether there's a backlog or not and, if so, how large a backlog? Thanks. - Jarry1250 [Deliberation needed] 18:26, 2 February 2012 (UTC)
Please make it a template the bot can subst onto the user's page. Petrb (talk) 09:11, 3 February 2012 (UTC)
I proposed a wording at User:ArticlesForCreationBot/Stale and also added already a new edit summary at User:ArticlesForCreationBot/Config.
So to sum up the task again: the bot should check AFC submissions which are marked as under review; if the review is stale, the bot should inform the reviewer after 36h. I think it wouldn't be that bad if the reviewing user gets a notice again if the draft is still marked (every 36h?). I added a parameter {{{3}}} to the proposed talk page message so that it could easily be changed in a (new?) configuration file...
mabdul 13:28, 6 February 2012 (UTC)
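The notification schedule summed up above (first notice after 36h, then again every 36h, up to N times) reduces to a small calculation. A sketch under assumptions: the time the draft was marked as under review is already known, the bot tracks how many notices it has sent, and since N is not fixed in this request the default of 3 below is purely hypothetical.

```python
from datetime import datetime, timedelta

NOTICE_INTERVAL = timedelta(hours=36)


def notices_due(marked_at, now, max_notices):
    """Number of notices that should have gone out by now, capped at max_notices."""
    elapsed = now - marked_at
    if elapsed < timedelta(0):
        return 0
    # Whole 36-hour intervals elapsed since the review was started.
    return min(elapsed // NOTICE_INTERVAL, max_notices)


def should_notify(marked_at, now, already_sent, max_notices=3):
    """True if another notice is due for a review that is still open."""
    return notices_due(marked_at, now, max_notices) > already_sent
```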
I'm not sure about followup notifications, but I've proposed a change in your wording for the message in the meantime. I hope you don't mind. - Jarry1250 [Deliberation needed] 14:56, 6 February 2012 (UTC)

Approved for trial (7 days). MBisanz talk 21:11, 8 February 2012 (UTC)

{{OperatorAssistanceNeeded}} Update? MBisanz talk 00:21, 28 February 2012 (UTC)

Hi, I think we should change 7 days to 10 edits or the like, because it's probably going to take a long time to get some submissions which have been waiting more than 36h in this status. Petrb (talk) 15:23, 2 March 2012 (UTC)
Change to 10 edits is fine. MBisanz talk 03:16, 8 March 2012 (UTC)


H3llBot 10

Operator: H3llkn0wz (talk · contribs)

Time filed: 11:19, Thursday January 5, 2012 (UTC)

Automatic or Manual: Automatic (though if there's very few I'll probably check them all anyway)

Programming language(s): C#

Source code available: No

Function overview: Convert bare text inline problem tags into their respective templates, such as <sup>[citation needed]</sup> → {{Citation needed}}

Links to relevant discussions (where appropriate): very old BOTREQ, should be non-controversial

Edit period(s): when bot is running and comes across the issue

Estimated number of pages affected: no idea, I think very few, may be a dozen or so per 100k pages

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): Y

Function details:

The match syntax is one of:

Additionally it will fix (double) superscripted inline problem tags:

Note that all new templates or previously undated ones will be dated with current month and year.

In the above TEXT, TEMPLATE, and LINK are one of the entries in the full list below:
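As a rough illustration of the conversion named in the function overview, the sketch below handles only the single case quoted there, a (possibly double) superscripted "[citation needed]", and dates the resulting template with the current month and year. The bot itself is written in C# and its source is not available, so this Python pattern and the function name are assumptions, not the bot's real matching rules.

```python
import re
from datetime import date

MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

# One or more <sup> openers, the bracketed text, one or more </sup> closers.
# The opener and closer counts are not checked against each other.
BARE_TAG = re.compile(
    r"(?:<sup>\s*)+\[\s*citation needed\s*\](?:\s*</sup>)+",
    re.IGNORECASE,
)


def convert_bare_tags(wikitext, today):
    """Replace bare superscripted tags with a dated {{Citation needed}}."""
    stamp = MONTHS[today.month - 1] + " " + str(today.year)
    return BARE_TAG.sub("{{Citation needed|date=" + stamp + "}}", wikitext)
```

For example, `convert_bare_tags("Claim.<sup>[citation needed]</sup>", date(2012, 2, 14))` yields `Claim.{{Citation needed|date=February 2012}}`.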

Discussion

Approved for trial. --Chris 17:11, 8 January 2012 (UTC)

Here's a sandbox example with different borked up syntaxes: [67]. —  HELLKNOWZ  ▎TALK 10:17, 9 January 2012 (UTC)
Here are some edits on the pages I had stored: contribs. A couple edits made double tags because there were misformatted tags present; I'll tell it to not add same tags twice somehow. Anywho, the edits don't happen very often so it'll probably be a while until I get more. —  HELLKNOWZ  ▎TALK 19:14, 8 January 2012 (UTC)
Three more [68][69][70]. —  HELLKNOWZ  ▎TALK 14:09, 24 January 2012 (UTC)
Have you fixed the double tag problem evidenced here, here, and here? (The last one was not caught.) If so, I'll approve. — madman 06:27, 4 February 2012 (UTC)
{{OperatorAssistanceNeeded|D}} Ready to approve as soon as you get to madman's questions. MBisanz talk 15:24, 6 February 2012 (UTC)
Sorry, yes, there's lots of different cases, so I'm coding it up slowly. General cases work, but I realized I need to know all tags and redirects just to check for duplicates. Real life's a bit in the way, so hopefully it's OK the BRFA lingers a bit. I don't want to claim it works 100% before I know it works 99.9%. —  HELLKNOWZ  ▎TALK 16:29, 6 February 2012 (UTC)
No problem. I'm going to switch this to Approved for extended trial. so I can remember why it's hanging around (my memory's not the best). Cheers! — madman 19:01, 6 February 2012 (UTC)
Question from entirely unqualified editor: will this lead to the new citation needed tag being dated February 2012? If so, are we happy to accept the dating of new tags with the current date? Grandiose (me, talk, contribs) 16:53, 14 February 2012 (UTC)
I don't think it's a big deal. The alternative is that I leave them undated. There is no easy, reliable way to parse the page's history, though that could be possible. If they are left undated, another bot will date them very shortly afterwards anyway. —  HELLKNOWZ  ▎TALK 17:29, 14 February 2012 (UTC)
Yes, it will just create a bulge in the backlog that will go down over time. MBisanz talk 22:17, 14 February 2012 (UTC)
And a rather tiny bulge, given how rare these cases are. In fact, I'm pretty confident no one will ever notice. :) —  HELLKNOWZ  ▎TALK 22:18, 14 February 2012 (UTC)
As mentioned here, and the corresponding bot request, could this bot expand to include "(citation needed)" (and perhaps some other variations similar to what you've mentioned above, like "(citations needed)", "''(citation needed)''" and "(''citation needed'')", though, compared to the bare form, I don't know how common these are). Mark Hurd (talk) 02:35, 18 February 2012 (UTC)
I think it more or less catches everything from those syntaxes now, including duplicates: [71]. Now, just links and plain text as described above. —  HELLKNOWZ  ▎TALK 12:28, 21 February 2012 (UTC)
Just my 2¢: Leaving the templates un-dated would be silly, because User:Smackbot goes around adding dates to those types of templates anyways. In response to Mark Hurd's comment, there aren't many instances of "(citation needed)" or its variants. Grep says there's only 150.
--Tim1357 talk
Any update Hell? MBisanz talk 03:25, 8 March 2012 (UTC)
Ha-ha, what do you mean? I'm going for the longest open BRFA here! But I do need to simplify my uber-parser, which is getting so inefficient I can't run it anymore without miles of hacky code. —  HELLKNOWZ  ▎TALK 09:34, 8 March 2012 (UTC)


SharedIPArchiveBot 2

Operator: Petrb (talk · contribs)

Time filed: 21:22, Friday November 4, 2011 (UTC)

Automatic or Manual: Automatic

Programming language(s): c#

Source code available: yes http://code.google.com/p/sharp-wikibot/source/browse/branches/bot2/wiki_bot_core/wiki_bot_core/

Function overview: Perform archiving of shared IP talk pages with old messages and update their header templates

Links to relevant discussions (where appropriate): Shared IP talk page archiving proposal on VPR

Edit period(s): daily

Estimated number of pages affected: ~20000

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): N

Function details: As part of a short-term A/B test, this bot will run through half of the pages in Category:Wikipedia user talk pages of shared IP addresses and set up an archiving system for old, outdated talk page messages. The frequency of archiving will be decided per the discussion on VPR, and the bot will continue to systematically archive old messages on these pages for the duration of the test. For these pages, it will also replace the current shared IP header templates with a slightly altered version that more prominently encourages new users to create an account (see the redesigned templates here). The bot will not archive:

  • pages that do not contain one of the 11 specified header shared IP templates
  • pages that have been edited within the time agreed upon by community discussion (i.e., pages where any kind of live discussion is still happening)
  • current block notifications
  • pages that are already archived
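The four exclusions listed above can be read as a single predicate. The sketch below is only an illustration in Python (the bot itself is written in C#): the TalkPage fields are hypothetical stand-ins for whatever the bot derives from the page, and the 14-day idle period is an assumption taken from the two-week figure reported later in this discussion.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumed idle period; the request leaves it to the VPR discussion.
MIN_IDLE = timedelta(days=14)


@dataclass
class TalkPage:
    has_shared_ip_header: bool      # one of the specified header templates is present
    last_edited: datetime           # timestamp of the latest revision
    has_current_block_notice: bool  # a block notification that is still current
    already_archived: bool          # archiving is already set up on the page


def should_archive(page, now, min_idle=MIN_IDLE):
    """Apply the exclusions from the function details, in the order listed."""
    if not page.has_shared_ip_header:
        return False
    if now - page.last_edited < min_idle:
        return False
    if page.has_current_block_notice:
        return False
    if page.already_archived:
        return False
    return True
```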

Discussion

  • This bot will be assisting the Wikiproject User warnings testing task force in an A/B test on shared IP talk pages. The purpose of the test is to see if archiving old messages on shared IP talk pages produces any positive effects. Currently, readers who open a Wikipedia page at a coffee shop or library are likely to see the "You have new messages" banner and be directed to a talk page with dozens of old warnings that were not meant for them. We suspect that vigorously archiving old messages sent to shared IPs might reduce the likelihood of users being hit by irrelevant messages, which might prevent good-faith contributors from being discouraged from editing. We also hypothesize that old warnings might encourage rather than discourage vandalism, as per the Broken windows theory. After we run the test, we will be able to present data on what kinds of edits came from those addresses and how many new users registered accounts from them, which will give us some indication if archiving positively affects the community. Maryana (WMF) (talk) 22:11, 4 November 2011 (UTC)
  • FYI, result of the VPR discussion is here. So, the bot would archive every 2 weeks (unless there's a live block on the talk page), and the test would run for 2 months. Maryana (WMF) (talk) 21:32, 14 November 2011 (UTC)

Ok, there seems to be some consensus for that. Approved for trial. How does this task relate to Wikipedia:Bots/Requests for approval/SharedIPArchiveBot? Do they need to be approved at the same time as part of the A/B testing, or are they independent of each other? As far as testing this, I'd ask that for the first run or two, you reduce the number of edits to about 50/100, just in case something goes wrong; once any niggling problems have been fixed, you can start going with larger numbers of edits. --Chris 03:31, 17 November 2011 (UTC)

It's running now. First: I didn't really understand how many edits I am allowed. What does 50/100 mean? Is that 50 edits? The bot is running as "Petan-Bot" because it's crucial to have a bot flag for this task (it changes talk pages). Petrb (talk) 09:54, 17 November 2011 (UTC)
You're allowed to have as many edits as you need to trial the bot (within common sense, e.g 50,000 edits within one day would not be appropriate). All I meant by 50/100, was before doing a large trial, you should do a small run of between 50 to 100 edits to make sure there aren't any problems with the code (e.g. you don't want to do 500 edits and then suddenly have to revert all of them). It's mainly just common sense, but I thought I would make sure, just in case.--Chris 10:49, 17 November 2011 (UTC)
It's necessary to get a bot flag before it can operate using this account, is it possible? Petrb (talk) 11:01, 17 November 2011 (UTC)
Normally, bots are not flagged until approved. However, this is a "special case". It needs to be flagged, so that it does not cause the users to get a "You Have New Messages" alert, and then be faced with no new message. Therefore, I hope we can make an exception.
Petrb, I'd like to see - and check - a small number of edits, such as 50 to 100 pages - and have the chance to manually review them, before scaling up.
Also, can you please clarify what it'll do - specifically;
  • Does it only act upon pages that have not been edited for <timeframe> (I believe consensus was 2 weeks, is that correct?)
  • Does it only either archive the entire page, and put a shared IP header? Does it not remove any partial pages?
  • Does it skip users that have been blocked within <timeframe>?
Sorry to ask so many questions, but I think some of the ideas given in the bot-request might've shifted a bit, following the discussion; for example, the request here says it'll only skip "current block notifications", and I think the discussion shows that several users were concerned about removing block notices immediately after they expire.  Chzz  ►  13:33, 17 November 2011 (UTC)
Although my main concern remains the apparent removal of block notices immediately upon expiry, instead of later [72], I have some other queries about this bot, too; including a) why we're adding miszabot auto-archiving (which AFAIK has never been discussed), b) on the VPR I thought we were going to add an indication that previous warning/messages had been removed (and possibly how many) but that doesn't seem to have happened. I'm also concerned that the test was performed on several hundred pages (and using another bot account) [73] and I think, at this point, it would be best if we could evaluate the edits and resolve these questions before resuming testing.  Chzz  ►  15:53, 17 November 2011 (UTC)
Both were solved. Concerning b: I don't know about it; no one requested that in the bot task proposal (in my userspace) either. Concerning the other account: it was done because this bot didn't have a bot flag at the time. Petrb (talk) 16:04, 17 November 2011 (UTC)
If you still want me to stop the bot, say so here; I believe that all concerns were resolved. Petrb (talk) 16:58, 17 November 2011 (UTC)
Yes, STOP - it's supposed to be approved for a trial of 50-100 edits, and we wanted to check them. In addition to the run on the first account, SharedIPArchiveBot (talk · contribs) now seems to have made over 5000 edits!  Chzz  ►  02:27, 18 November 2011 (UTC)
Shutdown for the moment --Chris 03:11, 18 November 2011 (UTC)
100 edits? I didn't notice I was restricted to that count; I specifically asked how many we could do. Regarding the edits: the task isn't that simple. It uses an algorithm which archives everything that looks like messages plus specified elements. Since not all elements are defined yet, it may happen that the bot forgets to remove some template etc.; however, once I define it, it will be removed next time (reply to my answer on TP, and probably the question you also wanted to ask). Petrb (talk) 08:45, 18 November 2011 (UTC)

I do not have time to check the edits in detail right now; I hope to write more soon, but it may be a few days. For now, I will make some brief comments. To be honest, I am annoyed and frustrated - because I, and others, have gone to some effort to discuss/reach agreement upon the operation of this bot for a trial, but our comments seem to have been disregarded. I understand Petrb's desire to get the task going ASAP, and saying you're kinda "just the programmer" and do what you're told to do. However, what you've apparently been asked to do does not match the consensus/agreements. I have asked - of various people, on the VPP thread/on this BRFA/on the 1st-task BRFA - several clear, straight-forward questions - such as, I will try to say it more clearly: Will the trial remove block notices within 'x days' of the block ending [74] - and clearly, that's not what has happened. In addition, we've had another communication breakdown, resulting in the BRFA-trial being performed on thousands of pages, instead of a much smaller number. And, trying it out using another bot account that wasn't approved for this task, just because it was flagged, wasn't a good idea; but let's brush over that one.

  • The idea of BRFA, as I understand it, is as follows; <Chris G/BAG, of course feel free to correct me>
A) A user explains what the bot will do.
B) People comment, discuss it, and we refine/address concerns
C) It is approved for trial - typically, for between 50 and 100 edits
D) Users review the edits, check for problems, and discuss solutions
Steps C and D are repeated as necessary
E) It is approved
  • In this case, we've had a misunderstanding - possibly because of the two meanings of the word "trial", in this case;
  • Unfortunately, some of the things discussed in that VPR thread do not seem to have come through here to BRFA. In particular, users wanted the bot to only archive pages that had not been used for a certain time, and not to remove block notices until some time after they had expired. I think consensus was for two weeks, in both cases.
  • Another thing discussed in the trial that has not come through to this BRFA is that the message left on the IP talk pages should indicate that previous warnings had been removed. That helps users looking at the talk-page, to see what has happened. "it could be 90% welcome message, with a small note that the IP has a history, with a link to where the warnings are archived. The note would not even need to mention the history is of warnings. That way we are friendly to new users of the IP, but the information is readily available to those who know what the message means. Monty845 18:26, 28 October 2011 (UTC)"
  • Minor point: looking at the source, it seems like the exclusion compliance code only looks for "nobots" - it could be better; see Template:Bots#C.23
  • I will try to review some of the edits, ASAP.  Chzz  ►  12:20, 18 November 2011 (UTC)
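
For illustration of the exclusion-compliance point above, here is a minimal Python sketch of a check that honours more than a bare "nobots". It is not the bot's actual code (which is not shown in this thread), and it covers only a simplified subset of the rules documented at Template:Bots: the function name `bot_may_edit` and the regex-based parsing are assumptions made for this example, and the optout parameter and nested templates are ignored.

```python
import re

# Matches {{bots}}, {{nobots}}, {{bots|allow=...}}, {{bots|deny=...}} (simplified).
_BOTS_TEMPLATE = re.compile(
    r"\{\{\s*(nobots|bots)\s*(?:\|([^{}]*))?\}\}", re.IGNORECASE
)


def _parse_params(raw):
    """Turn 'allow=A,B|deny=C' into {'allow': {'a', 'b'}, 'deny': {'c'}}."""
    params = {}
    for part in raw.split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            params[key.strip().lower()] = {
                v.strip().lower() for v in value.split(",")
            }
    return params


def bot_may_edit(wikitext, bot_name):
    """Return True if the page text does not exclude the named bot."""
    bot = bot_name.strip().lower()
    match = _BOTS_TEMPLATE.search(wikitext)
    if match is None:
        return True  # no exclusion template on the page
    name = match.group(1).lower()
    params = _parse_params(match.group(2) or "")
    if "allow" in params:
        allowed = params["allow"]
        return "all" in allowed or bot in allowed
    if "deny" in params:
        denied = params["deny"]
        if "none" in denied:
            return True
        return not ("all" in denied or bot in denied)
    # Bare {{bots}} allows everyone; bare {{nobots}} denies everyone.
    return name == "bots"
```

A bot would call this on the current page text immediately before saving, and skip the page when it returns False.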

Per the above, it is best that the trial be put On hold for the moment while further discussion takes place. Also, I'd like to apologise to Petrb for the miscommunication regarding the trial details. --Chris 13:00, 18 November 2011 (UTC)

@Chzz:
Sorry, don't take this as offensive, but you need days, for what? You didn't address any issue which is happening right now. I updated the bot according to your request ASAP when you informed me about it on IRC, so the bot doesn't archive block notices of blocks which expired recently. Concerning the edit count, I discussed this with Steven yesterday and he also understood that we got approval to run an unlimited number of edits for two months. Yes, I ran it over 50 edits, examined them and considered it ready for more. The bot is running in debug mode and editing very slowly, so I am checking what it is doing. I understand that you may have a problem with me trying to handle this somehow faster, but keep in mind that we have been discussing this over and over for several weeks, and it may surprise you, but I also have real work and other stuff to do, which makes me pretty busy. So yes, I do not have time to read all discussions related to this task. I created a page with a summary of the task in my userspace and asked you, Maryana and Steven to update it according to what is being proposed in the discussions, so that I can follow it and create the task according to what is there, instead of reading through huge discussions and finding out what is actually true. I apologize for being so ignorant, but I really can't read everything everywhere. I made a configuration and shutdown page so you could at least shut it down yourself and tell me exactly what needs to be improved, instead of telling everyone that the bot is completely broken so we must stop it and that you will tell us what we should fix after a few days. Your concerns:
  • Bot is editing more than approved:
false
  • users wanted the bot to only archive pages that had not been used for a certain time, and not to remove block notices until some time after they had expired. I think consensus was for two weeks, in both cases:
that's what happens now and happened even before.
  • The message left on the IP talk pages should indicate that previous warnings had been removed. That helps users looking at the talk-page, to see what has happened.:
is this part of consensus? yes / no? if so, I can insert it in 5 minutes
  • Bot is archiving block notices of blocks which expired recently - false (since we discussed this and it didn't even happen for many or maybe for any users since the bot wasn't running for long time until it was fixed)
So may I ask what is actually problem? Thank you Petrb (talk) 13:15, 18 November 2011 (UTC)
The core problem right now is, you do not seem to understand BRFA. Your bot has not been approved for unlimited edits. It was approved for "about 50" edits. You've made over 5000, and that is why I shouted STOP. I, also, have other things to do (as do BAG members) - so I hope you will understand why it might be several days before we can check over some of the edits, and decide if it can be approved.
It's your bot; you are responsible for what it does. Nobody else.
You said, "the bot is running in debug mode and edit very slowly" - it was editing about 10 times per minute, so I've no idea how you could possibly be checking those 5000+ edits.
Chris G/BAG, I'd really appreciate it if you could help me explain things more clearly.  Chzz  ►  19:37, 18 November 2011 (UTC)
I still read in this thread that it was approved for more than 50 edits, Chris told me that 50000 edits in one would be nonsense, and that's not what I did, I don't know where you get the number from. Petrb (talk) 20:26, 18 November 2011 (UTC)
I mean this: You're allowed to have as many edits as you need to trial the bot (within common sense, e.g 50,000 edits within one day would not be appropriate)
I either didn't understand it correctly (if so I apologise for that, but I wasn't the only one who didn't) or it was wrongly interpreted. I started with 50 - 100 edits, in debug mode where I needed to confirm each of them, and after review I started a bigger run. Since no concerns other than those I had already fixed were mentioned, I continued with the run until it was shut down. It's true that there are more bugs, but those are very minor and will be fixed. Petrb (talk) 21:10, 18 November 2011 (UTC)

Hey, uh, why is this bot substing {{archive box}} when it archives shared IP talk pages? That should not be being substed. It is a dynamic template to show all of the archive subpages. And the fact that the bot has done this over 5000 times outside of trial... Logan Talk Contributions 20:02, 18 November 2011 (UTC)

Fixed Petrb (talk) 21:36, 18 November 2011 (UTC)

I don't know if this is the correct place to put this, but anyway...
The Bot somehow overwrote an IP's talkpage so subsequent Warnings are ending up in the empty Hide/Show-Old Warnings section as seen in this edit.
Also, in this edit the bot archived all the posts from July 2008 through February 2011 but left the {{Old IP warnings top| Warnings and IP-Blocks date from July 2008 through January 2010.}} and {{Old IP warnings bottom}} intact. But they're empty now, so there's nothing to hide-show... --Shearonink (talk) 20:06, 18 November 2011 (UTC)

Again: the task isn't that simple. It's using an algorithm which archives everything that looks like messages plus specified elements. Since not all elements are defined now, it may happen that the bot forgets to remove some template etc.; however, once I define it, it will be removed next time (reply to my answer on TP, and probably the question you also wanted to ask). I already noticed that; after the update it would archive even this. Archiving such pages is much more complicated than what miszabot is doing; I am not archiving just talk threads here. Petrb (talk) 20:31, 18 November 2011 (UTC)
Fixed Petrb (talk) 21:36, 18 November 2011 (UTC)

I just received a nudge in the wikipedia-en-help IRC chat channel that one of my warnings was somehow not showing up on the editor's talk page, due to what seems to be an issue with the all-new SharedIPArchiveBot2. Since there was some talk in the help channel about the bot, and since I felt that I might be missing something, having been absent from editing for three months, I decided to have a look around to see what it would be doing. And to be honest - I am rather, if not very, concerned about the bot and the process around it.

  • Right now the bot seems to have made over 5,000 edits even though it is on trial. I assume this is a miscommunication, but seeing that bug reports (including my own) have already been posted, we may now have anywhere between 2 and a few thousand broken talk pages. Ergo, if user 139.137.244.2 were to receive a warning now (as he did), he would never see it due to the faulty trial. I also notice that part of the edits have been made through user:Petan-Bot instead, which makes cleaning this up more difficult - and which should not have been done in the first place for this reason.
  • I am also somewhat concerned about the closure of [75] as "Strong consensus", as this is not what I see. While the majority of the editors seem to support this, the number of editors weighing in on this result is quite marginal, especially if I compare it to the discussion about the (in my eyes) much more trivial "Does Wikipedia need a “share” button?" right above it. Was this discussion even raised on WP:CENT or WP:ANI? I can't seem to find any indication that it was.

Note that I might simply be missing pieces of the proposal here, since it is somewhat hard to track every discussion when they are spread over several pages. If I missed anything please do let me know. However, if this was not raised on either ANI or CENT, I would strongly suggest raising it there regardless just to confirm consensus, as I believe that the vast majority of the editors who would comment on this proposal are not aware of it right now. Excirial (Contact me,Contribs) 22:46, 18 November 2011 (UTC)

Concerning your report, it was already fixed; the problem was not on thousands of pages but only on 5. Petrb (talk) 22:48, 18 November 2011 (UTC)
Concerning Petan-Bot it was used because it was needed to have botflag for this task, and I wasn't sure I can get it without proving that task is working, anyway there are no mistakes done by User:Petan-Bot since I reviewed all edits there, so you don't need to be afraid of cleaning up that. Petrb (talk) 22:52, 18 November 2011 (UTC)
Petrb, one thing you seem to have missed/not heard is, that in the VPP discussion, I believe general agreement was that pages should be either entirely archived, or not archived at all. That's because we were concerned about splitting up threads, or misrepresenting subsequent comments. For that reason, I do not understand why we're worried about complicated algorithms; to me, the agreed task seems much more simple: For pages tagged as "shared IP", we check if there's been activity within <2 weeks>; we also need to check if a block has expired within <2 weeks>. If neither of those are true, then the page can be archived (it'd make sense to use User talk:IP/Archive 1 for that, assuming none currently existed). Then it would replace the header with the 'new style' heading, which would include a small note, along the lines of "Stale warnings were automatically [difflink|removed] from this page." or similar. And that's it. At least, that's how I read the discussion/consensus.  Chzz  ►  23:27, 18 November 2011 (UTC)
Really? What if "header" is in middle of page as on many pages? Or what if it's substituted in the middle so it's not just a header, what if there are some other special elements (categories) which are not to be archived. It's not that simple. Petrb (talk) 23:55, 18 November 2011 (UTC)

Start again?

I'm unhappy with the state of the pages it has acted on, and the archive-pages it has made, for several reasons. I cannot detail those with diffs right now, but we've already discussed some of the concerns above. With that in mind, would it be best if - ASAP - we just undo everything, THEN perhaps we could have a sane conversation (here), get things nice and clear, and come to agreement about what it is doing, and do a test on 50 pages. y'know, like we were supposed to in the first fucking place (expletive struck later, per [76]  Chzz  ►  01:11, 19 November 2011 (UTC))[under discussion]

I don't think that would be too hard for SharedIPArchiveBot (talk · contribs) right now, as I think we could catch almost everything with special:nuke (5469 edits, 2645 of which are 'archive' pages).

It is slightly complicated because of the 363 edits performed under Petan-Bot (talk · contribs). But it's not that hard to reverse those too.

We've got into a bit of a mess, here; I'm not bothered about blame for why we've got here, but, here we are: I'm looking forwards, and sometimes it's necessary to take one step back making progress.

Thoughts? BAG opinion?  Chzz  ►  23:19, 18 November 2011 (UTC)

You should give us at least one valid reason to revert 5000+ edits, since 99% are ok. The 1% will be fixed soon Petrb (talk) 23:25, 18 November 2011 (UTC)
I hope I don't need to remind you that there is very little difference between nuke and revert, so it would also be a big load for the cluster; apart from that, it's completely useless. Especially when you "just don't like it". What if I "just didn't like the way all pages on wikipedia are written", would you nuke them? Petrb (talk) 23:28, 18 November 2011 (UTC)
Please do not claim "I just don't like it" - that is not true, and to claim so is disingenuous; why on Earth do you think I am putting effort into this project, and trying to get it going? Do you actually believe my motivation is some form of deliberate stubbornness, malicious attack upon your character, or...well, what? I am striving - in the face of adversity - to help make this thing work. I resent the accusation that I "just don't like it". I find it startling that, now, you are worried about a "big load for cluster", when you've just completed 5000+ edits with a non-approved bot. Yes, I'm pissed off; because I've tried my best to help make this thing go smoothly, and many of my comments have been disregarded. But my being-pissed-off is beside the point, and not constructive. So,
Despite that, here is a reason: The bot has created 2,645 archive pages [77]. In all cases, they are called "Archive", and not "Archive 1". In some cases, there were existing archives. In other cases, it partially archived the IP talk, which is in contradiction with the consensus shown at VPP. In some cases, it may have removed recent block notices, again in contradiction with consensus. In replacing the headings on the IP talks, it did not note that archives had been created (which was the apparent agreement at VPP). The current trial data will be distorted by the operations performed and now halted. I thus believe it is easier to "wipe the slate clean", put the problems behind us, and try again. This time, following the Wikipedia policy-based system for bot approval; by testing on a small number of pages, then giving users a chance to evaluate them.  Chzz  ►  23:53, 18 November 2011 (UTC)
I don't really see a reason to nuke all those edits. There was a very low error rate and it did successfully archive almost all the pages without mangling threads, deleting content, adding the wrong templates, or archiving inappropriate things. I think we should spot check the edits it already made for basic technical acceptance, not do something ridiculous like delete thousands of edits in order to start over. And Chzz, please don't swear at Petr. Obviously there was miscommunication about the desired trial of the bot, and he's acting in good faith here to try and answer any concerns. Steven Walling (WMF) • talk 23:57, 18 November 2011 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── What if we have the bot self-revert all but 50-100 of its edits? That seems better to me than nuking, though Petr might have concerns about the workload on the Toolserver. Steven Walling (WMF) • talk 01:00, 19 November 2011 (UTC)

I'm not suggesting this measure for any kind of "this is WRONG!!" type reasons; purely, honestly, because it is often easier (with such matters) to just 'undo everything' and start again. We could try to analyse where it's right/wrong, and clean up but experience tells me, that'd be more hassle than a couple of our clever mops clicking a few buttons and BZZT; back to square-one, let's work from there. If I might make the comparison: it's like when a dozen editors have messed about with an article, and added 'something good/mostly bad' - it's just easier, technically, to go back to the last revision before the problem, and add back the good parts, than to try and work on the bad version. But, it's only a humble suggestion; if consensus is not to do that, I will merrily work with where-we-are-at, and try help sort it out. Chzz  ►  01:19, 19 November 2011 (UTC)
I think the reasons for the request are sound and I agree. It just seems illogical to me to delete test edits in order to make more test edits, when all we'd have to do is look at the ones already made. I'll let other people chime in though - to be honest I just want to move forward as well. Steven Walling (WMF) • talk 01:23, 19 November 2011 (UTC)

(edit conflict) The unfortunate reality here is that, the longer we wait, the more of the pages may subsequently be edited; so if we're gonna go for the 'nuclear option' we should do it soonest.  Chzz  ►  01:25, 19 November 2011 (UTC)

Steven Walling (WMF), it really isn't illogical; you (we!) want to get meaningful data from this trial. Right now, we have a corrupt sample; we have various pages edited in various ways, which don't correspond to consensus. It will be very difficult to draw any conclusions from that. If we can step back, test, agree, and then press ahead, we could try to evaluate a 2-month trial; where we are NOW makes that real hard, because pages have been amended in a way that consensus doesn't agree to.  Chzz  ►  02:27, 19 November 2011 (UTC)

Ok, this is exactly what I didn't want to happen, and well it happened. The intention behind the 50-100 edits part of the trial was that we could have a small trial first to make sure there aren't any problems, and if there were any problems we only had to revert 50 edits not thousands of edits "(e.g. you don't want to do 500 edits and then suddenly have to revert all of them)". Sorry, I should have made myself clearer on that point. {{BAGAssistanceNeeded}} Now we need to decide, what to do about these edits. On one hand, every edit is slightly broken, because the archive template was substituted when it shouldn't have been. There is also the possibility of other errors such as this which create future problems for warning users. So, is it worth performing a mass revert? Or do we just accept the damage and do our best to mitigate it --Chris 03:21, 19 November 2011 (UTC)

Chris G, I am not interested in recriminations at this point, but I think the most expeditious answer to the current situation is to rv all the edits, to clear the decks, then try to move fwd. As I said above. Obviously, the sooner that is done, the easier it is.  Chzz  ►  05:31, 19 November 2011 (UTC)
It's been made clear that every edit has the issue that either they should be using {{Archives}} or they should be edits to /Archive 1, not simply /Archive. Other mistakes have also been pointed out, and there are almost certainly more. However, reverting the bot's edits may be more complicated than simply nuking the pages and then using mass rollback. At the moment I'm reluctant to simply say "go ahead" to reversion; however, as Chzz said, the longer we wait the more complicated it gets. Chris, is the BAGAssistanceNeeded template asking for input from other BAG members about how to proceed, with regard to mass reverting/just continuing? Because my thoughts on that at the moment would be that although I'm leaning towards mass reverting, the tricky part is figuring out how to do that without causing more damage, and I'm still open to persuasion that that may not be the best method. - Kingpin13 (talk) 07:08, 19 November 2011 (UTC)
Right, if someone could also take a look at what I am saying, I also have some points:
1 at Chris: There were 5 mistakes like that and all are fixed
2 at Chzz: I told you to update the bot config, which defines whether archives should be numbered or not. I told you that several times and you didn't do it. You knew several weeks ago that it was not going to use numbered archives, yet you waited for it to do 5000 edits, although you could have stopped it yourself, just to yell at me here. However, moving the 2500 pages is less harm than doing an extra 10,000 edits nuking the whole thing and starting over
3 at Kingpin: Is that such a big problem? Even some pages archived by other bots had similar mistakes, and no one reverted them.
I still don't see any major problem across all 5000+ edits, so could someone tell me why it is worth reverting? Reverting those edits would be better, fixing them would be best (IMHO), nuking them is nonsense (IMHO) - @Chzz if you ever tried how nuke works - it's installed on hgwp where you have enough flags to try it out - you would see it's more than a few clicks. Thanks Petrb (talk) 08:26, 19 November 2011 (UTC)

@Chzz

You obviously didn't notice anything about numbered archives although it was clear it's not going to happen.
You didn't notice we should use numbered archives, I also told you that there is a config where it could be changed, you didn't do that either
  • 17. I started first small trial at 20:59 GMT few minutes after that it was interrupted according to you concerns which I fixed - I told you there is shutdown link you should use in case that you find any issues
All your concerns were fixed and no one notified me about numbers of archives
  • 17. few hours after fix I continued with test and I again told you that there is shutdown button
you knew it's running, you knew it did hundreds of edits and that it is going to do thousands (I told you that), however you wait until it did 5000 edits (if you stopped it after 500 edits you could hardly have something you could use against us).

And please keep in mind I am not telling you are anyhow responsible for that it has happened - it's my fault, I just tried to tell you that you could have inform me that there is something wrong you disagree with weeks ago, but you didn't
Now you are complaining with "mistakes" which aren't nearly mistakes and repeat over and over that we did too many edits, which is probably true (it was misunderstanding - I am sorry) but it's the only problem you have, still repeat it and you could prevent it from happening. You are constantly trying to find more and more problems in edits where there are nearly no issues, apart of those I already fixed, so you are still repeating the same thing. What is so big deal with that it wasn't numbered? The pages are never going to have large history, some of them are old 5 years and they have few templates - is it necessary to split it to more archives? And even if it was. It can be fixed. I don't see any reason to revert it. Other than that you just don't like it Petrb (talk) 09:14, 19 November 2011 (UTC)

So if I summarize it: we have only one (big?) problem, namely that the bot "substituted" (it didn't really substitute it) the navigation template. Could someone please explain to me what is wrong with that, such that we need to revert all edits? Thanks Petrb (talk) 09:33, 19 November 2011 (UTC)
And if you tell me what is exactly problem with that it's not being numbered I will be happy to come with some plan to fix that, however according to that histories are really going to be rather short, having one archive makes sense to me. Petrb (talk) 11:26, 19 November 2011 (UTC)
I still think that we should keep it archiving to one big page, instead of several smaller, the reason is that people wanted all warnings together instead of splitting them, the pages are usually small and archive would hardly grow too big, and having it all in one would make it much easier to search certain templates or count number of warning templates. Petrb (talk) 19:12, 19 November 2011 (UTC)

I think it's time to move forward so: Yes, I have done some mistakes there is no doubt about it and I take full responsibility for that, however instead of arguing we should look forward to some solution,

Here is the list of issues I have found reviewing the edits:

  • The bot was about to remove block notice from recently expired notices - fixed before it could do that
  • The bot made a major error when archiving pages, due to missing definitions of some templates: it left in place a template which was hiding new templates. This occurred on 5 pages and is fixed (thanks for report @Logan, Matthew)
  • The bot substituted navigation template, although it's disputable if that is major issue, I can fix it of course - there are two possibilities:

1 - bot will fix all pages where it happened (2600+ edits which are not needed)
2 - bot will fix it next time when it's archived

  • The bot didn't use numbering for archives. This was done on purpose; however, there are some complaints about it, so I am willing to "fix" that:

1 - the bot will move all (2600 moves which are not needed)
2 - the bot will start using it and leave existing archives until they are full; in that case it will move Archive to Archive 1 and start another one. Please let me know which solutions you like best. Petrb (talk) 09:35, 20 November 2011 (UTC)
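
To make the second numbering option concrete, here is a minimal Python sketch of how a target archive title could be chosen. It is only an illustration under stated assumptions, not the bot's code: the function `plan_archive`, the `existing` mapping of page title to size in bytes, and the 100 KB threshold for "full" are all hypothetical.

```python
import re
from typing import Dict, Optional, Tuple

MAX_ARCHIVE_BYTES = 100 * 1024  # assumed threshold for a "full" archive


def plan_archive(
    talk_title: str,
    existing: Dict[str, int],
    max_bytes: int = MAX_ARCHIVE_BYTES,
) -> Tuple[str, Optional[Tuple[str, str]]]:
    """Return (archive title to write to, optional (old, new) page move)."""
    legacy = f"{talk_title}/Archive"
    pattern = re.compile(re.escape(legacy) + r" (\d+)")

    # Collect sizes of already-numbered archives: {number: size_in_bytes}.
    numbered = {}
    for title, size in existing.items():
        m = pattern.fullmatch(title)
        if m:
            numbered[int(m.group(1))] = size

    if numbered:
        latest = max(numbered)
        if numbered[latest] < max_bytes:
            return f"{legacy} {latest}", None
        return f"{legacy} {latest + 1}", None

    if legacy in existing:
        if existing[legacy] < max_bytes:
            # Keep using the unnumbered archive until it is full.
            return legacy, None
        # Full: move "Archive" to "Archive 1" and start "Archive 2".
        return f"{legacy} 2", (legacy, f"{legacy} 1")

    # No archive yet: start with a numbered one.
    return f"{legacy} 1", None
```

With this approach no existing archive is touched until it fills up, so the roughly 2600 moves of option 1 are avoided.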

  • Question - I am under the impression that the archiving will cause a "you have messages" bar, regardless of bot flag. Is it definite that it won't as suggested above? Rich Farmbrough, 21:02, 22 November 2011 (UTC).
Unless things have changed recently, an edit which is marked as both a minor and bot edit does not trigger the message bar - Kingpin13 (talk) 21:13, 22 November 2011 (UTC)

I am coming to this late (just ran into an instance of the bot on a IP talk page). It's my understanding that the bot is archiving IP talk pages

  1. so new users assigned to a dynamic IP do not see the "You have new messages" banner. If that is so, then it seems to me the solution would be to turn off the banner if it stays up more than 2 weeks.
  2. to keep up appearances under the "broken windows theory". Note that the "broken windows" here are not the talk page warnings but the instances of vandalism and link spam that the anti-vandalism editors and bots are continually removing. That's not to say it might not help to keep the talk pages cleaned up but I'm waiting to see the results of the trial.

I think it's likely this bot will increase problems:

  • At least for schools, I would hope that there might be institutional interest in monitoring vandalism by students and removing the warnings will lead most teachers or whoever might be monitoring to think everything is fine. To address this, perhaps the bot could summarize the warning and block statistics (a graph would be great) so the information is not hidden away.
  • Archiving all but the last 14 days is too aggressive. It makes it a lot harder to detect persistent vandals and spammers, will increase the anti-vandalism workload and will increase the number of vandalism-only and spam-only accounts that go unblocked. I would prefer to see the bot leave six weeks of content or provide a summary analysis of what is being archived.

Jojalozzo 17:59, 12 December 2011 (UTC)

Hi Jojalozzo,
I understand your concerns, and we talked about this on the VPR thread. Consensus was for two-week archiving, and many people wanted even more rapid archiving (some even thought we should delete the old warnings altogether – that's actually what happens on German Wikipedia after 24 hours of no activity on a shared IP talk page). Here are some details to keep in mind:
  • Tools like Huggle and Twinkle automatically reset to issuing level 1 warnings if there are no additional warnings issued to an IP talk page after 72 hours. So, if you're using them, you won't see any change at all to your vandalfighting.
  • If an IP is blocked or if warnings keep coming in, the bot won't archive its talk page.
  • We don't actually know if warnings have any deterrent effect whatsoever on persistent vandals. Maybe they just make them more malicious. That's why a test like this is important.
  • All of the old warnings will be available at the archive page (prominently linked to on the talk page), so this won't change your ability as a vandalfighter to go back and check for persistent vandalism.
  • And, lastly, this is only a test, not a permanent change to the system. If this really does lead to an increase in vandalism and spam – which I highly doubt; otherwise, I wouldn't have suggested doing it :) – we'll have learned something very important about the value of warning messages.
I hope this puts you a little more at ease. Let me know if you have any more questions. Maryana (WMF) (talk) 18:36, 12 December 2011 (UTC)
I don't know if warnings have much effect either though I suspect they do work for some editors. I do think that blocks and school monitoring are effective and because blocking policy and school monitoring depends on being able to see the history of warnings, this bot will make that more difficult. That's why I suggested the bot provide a summary of the archived warnings and blocks. Jojalozzo 17:40, 14 December 2011 (UTC)
Oh, we're not touching Category:Shared IP addresses from educational institutions, which has over 60,000 users. We're only testing on Category:Wikipedia user talk pages of shared IP addresses (about 38,000 users), which has some educational institutions, but not very many. It's mostly ISPs and hotspots. So, if the schools turn out to be a problem in our sample, we'll know that they require extra monitoring, and if not, that'll be another test to think about in the future... but first let's try this one! :) Maryana (WMF) (talk) 20:22, 14 December 2011 (UTC)
Ok, but that still leaves
  • issues with ARV/blocking
  • the suggestion that the bot provide summary info about what it archives
  • the suggestion that instead of deleting old warnings, the bot turn off old alerts that "you have new messages".
Jojalozzo 21:06, 14 December 2011 (UTC)
  • issues with ARV/blocking – Not sure what you mean here... there shouldn't be issues with blocking, because the bot won't archive talk pages with active block notices on them. And most shared IPs don't get hardblocked, anyway, so these would be edge cases.
  • the suggestion that the bot provide summary info about what it archives – That's quite tricky, technically speaking. How can a bot know what kinds of messages it's removing? It could perhaps give an approximate number of messages archived, but that information wouldn't be very useful to anybody. All you'd have to do is click on the archive link to see them in full.
  • the suggestion that instead of deleting old warnings, the bot turn off old alerts that "you have new messages" – A good idea but, again, I don't believe it's technically feasible. Maryana (WMF) (talk)

Where are we?

Status Unknown. This task still seems somewhat heated, and judging consensus is made somewhat difficult by the very fragmented nature of the discussion surrounding the task. At this point I think it will be best to leave the edits from the last trial, and start completely afresh. Petrb, for clarity's sake, could you please restate the details of how the bot will be operating, (when/what/where it will archive etc?). Unless there are any major objections, I would like to move to a smaller scale trial and get this task moving again. --Chris 08:45, 26 November 2011 (UTC)

Of course, below is a summary:
  1. Bot will walk through half of all the pages in the list Category:Wikipedia user talk pages of shared IP addresses
  2. It will check if the talk page is empty / archived and which template is used at the top
  3. If no template is present and content can't be archived (vandalised page) it will skip
  4. Template which matches the list provided by Maryana will be replaced with new one
  5. Bot will check when the user's talk page was last edited (whether more recently than 14 days ago) and whether the user is blocked
  6. Bot would check if there isn't a block notice which expired less than 14 days ago, in that case it would only replace main template (leaving notice) and skip
  7. If page is not already being archived, has messages older than 14 days, the talk page hasn't been edited in 14 days, and there are no live block notices on the page, the bot will:
  8. *create Archive N subpage of the user page with {{talk archive navigation}} {if the archive is over a max size (e.g. 100Kb) we should start archive 2,3,4 etc?}
  9. *cut and paste all old messages onto that page and save the page
  10. *leave an archive banner {{archives}} at the top of the talk page and save the page
  11. Bot will recheck all the pages again after 3 days. If the page has not been edited in 14 days and matches other criteria, it will cut and paste all old messages to the archive.
  12. Bot will continue checking pages every 3 days and archive messages on talk pages that have not been edited in 14 days.
  13. Bot will not archive anything while a user is blocked {or if block has expired within 14 days}
Let me know if you need more details Petrb (talk) 08:52, 26 November 2011 (UTC)
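
The archiving conditions in steps 5 to 7 and 13 can be sketched as a single decision function. This is a hedged illustration in Python, not the bot's implementation: the `TalkPageState` fields and the function name `should_archive` are hypothetical, and fetching these values from the wiki is left out entirely.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

QUIET_PERIOD = timedelta(days=14)  # the 14-day window from the summary


@dataclass
class TalkPageState:
    last_edited: datetime                  # time of the most recent edit
    currently_blocked: bool                # is the IP blocked right now?
    last_block_expiry: Optional[datetime]  # expiry of the latest block notice
    has_old_messages: bool                 # any messages older than 14 days?
    being_archived: bool                   # page already handled by an archiver


def should_archive(state: TalkPageState, now: datetime) -> bool:
    """Apply the stated conditions; any failing check means skip the page."""
    if state.currently_blocked:
        return False
    # A block that has not expired yet, or expired within the last 14 days,
    # keeps the page (and its block notice) in place.
    if (
        state.last_block_expiry is not None
        and state.last_block_expiry > now - QUIET_PERIOD
    ):
        return False
    # The page must have been untouched for the full quiet period.
    if now - state.last_edited < QUIET_PERIOD:
        return False
    return state.has_old_messages and not state.being_archived
```

Running the same check on every pass (every 3 days, per steps 11 and 12) means a page that is skipped today is picked up later once it has been quiet for long enough.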
I admit it, I'm not all that tech-savvy and am a little confused about the scope of this next trial. At Category:Wikipedia user talk pages of shared IP addresses that Petrb mentioned, there are subcategories with dynamic IPS of 1229 pages, gov. addresses with 372 pages, .edu addresses with 60,143 and 38,135 pages listed at the bottom as being in the category of "Wikipedia user talk pages of shared IP addresses". Shearonink (talk) 17:09, 26 November 2011 (UTC)
Sorry, it's a little messy because of the inconsistent categorization system for flagging and classifying IPs...
We're only focusing on Category:Wikipedia user talk pages of shared IP addresses, not Category:Shared IP addresses from educational institutions or the other subcategories (though there are some .edu, .gov, and dynamic shared IPs in the former). There's ~38,000 pages in that category, but some of them have templates that fall outside the scope of this test (we're only making changes to talk pages that have one of these 11 templates). And we're halving that 38k because it's an A/B test, so we need a control sample. So, the number of affected talk pages will be a bit under 16,000. If we get good results from this test, we might consider proposing a change to all the shared IP talk pages (including all subcategories), but we won't know if it's worth it until we run the test. Does that make sense? Maryana (WMF) (talk) 16:25, 28 November 2011 (UTC)
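One common way to get a stable test/control split like the one described above is to hash each page title. This is an assumption for illustration only, not a description of how the samples for this test were actually chosen; the function name `assign_group` and the salt value are hypothetical.

```python
import hashlib


def assign_group(page_title: str, salt: str = "shared-ip-test") -> str:
    """Deterministically place a page in 'test' or 'control' (roughly 50/50)."""
    digest = hashlib.sha256(f"{salt}:{page_title}".encode("utf-8")).digest()
    # The parity of the first digest byte decides the group, so the same
    # title always lands in the same group across runs.
    return "test" if digest[0] % 2 == 0 else "control"
```

Because the assignment depends only on the title and the salt, the bot could recompute the group on every run without storing a list of control pages.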
Makes sense but Chris posted that this would be a small-scale trial and "a bit under 16,000" doesn't quite sound like his "smaller-scale trial". I am also troubled by the date-range being 14 days. In my experience IP-vandals will return to the scene of their Wiki-crimes often and the long-term vandals will do so more than 14 days later. Is there any consideration to the time-limit being something other than 14 days, maybe a month or whatever from the last warning? Shearonink (talk) 18:05, 28 November 2011 (UTC)
I think what Chris meant by "small-scale trial" is that the bot should do 50-100 test edits first. As for length of time for archiving, please see the discussion here about why that was the duration we settled on. The short version is that this is meant to be a short-term test, not a systemic change, and since the test will only run for 2 months, archiving every month wouldn't make too much sense :) Maryana (WMF) (talk) 18:20, 29 November 2011 (UTC)
Ok, just to make sure I understand this, the way the trial will be done is in the following order:
  • 1) Bot will do just a short test of 100 IPs.
  • 2) Bot will be stopped.
  • 3) That 100-IP test will be evaluated for any possible issues.
  • 4) Possible issues will be fixed.
  • 5) Full-scale trial will then ensue on the approximately 16000 IP talk pages for a time-period of two months.
  • 6) Bot will be stopped after two months and the complete trial will then be empirically evaluated.
-Shearonink (talk) 19:04, 29 November 2011 (UTC)
Typically we do trials with 100 edits or less; this one seemed to be a bit of a fluke / mis-communication as to the nature and extent of the trial. If there's consensus for a continued trial, let's mainly worry about numbers 1-4, and then we'll figure out any further trials, if any are needed, once we get that far. --slakrtalk / 22:31, 29 November 2011 (UTC)
Yes, that sounds like the clearest plan to me. Thanks to both of you for the comments. Steven Walling (WMF) • talk 01:54, 30 November 2011 (UTC)

────────────────────────────────────────────────────────────────────────────────────────────────────

Again, I propose (and insist) that the bot move the incorrectly named archived talk pages from /Archive to /Archive 1. mabdul 14:09, 30 November 2011 (UTC)
That is no problem but is it needed? Petrb (talk) 15:20, 30 November 2011 (UTC)
Personally, I don't think it's really necessary, however it couldn't hurt to have the bot fix the pages as it goes through and edits them again (note, I think a separate run, with the sole purpose of fixing all the pages would be overkill and unnecessary). Once the trial below is over, and if there is support for it, we can look into the best way to include that in the bot. --Chris 06:44, 1 December 2011 (UTC)

Ok, let's get this moving. Approved for trial (100 edits), per what was agreed to above. To be clear, 100 edits, then stop. Then we will review, and decide our next move. --Chris 06:44, 1 December 2011 (UTC)

BRFA trial part 2

  • This worries me, because the "old warnings" 'show' indicated that there were none. I see you removed it "by hand" [78] - has it been fixed for future edits?
  • Could the edit-summary please include a link to WP:UWTEST
  • The archive box, e.g. {{archive box|[[User talk:109.232.72.10/Archive_1]]}} ([79]) is rather ugly/confusing; it would be better to use a relative path and a space ("/Archive 1") but I think it is easier to just use {{archive box | auto=yes }} instead ([80])  Chzz  ►  13:24, 1 December 2011 (UTC)
  • Here, it has left behind an {{old IP warnings bottom}} (with no corresponding 'top')  Chzz  ►  15:30, 1 December 2011 (UTC)
  1. Yes it was fixed
  2. It's in configuration - User:SharedIPArchiveBot/Config
  3. Fixed
  4. Fixed
Thanks. Trial is running now Petrb (talk) 18:27, 1 December 2011 (UTC)
It has also made other mistakes, such as leaving parts of messages on the page instead of archiving them. I will try to update it to be stricter when checking what to archive. Petrb (talk) 18:32, 1 December 2011 (UTC)

Finished Petrb (talk) 20:36, 1 December 2011 (UTC)

I have noticed that it made a few mistakes at the beginning; all should be fixed now. Petrb (talk) 20:40, 1 December 2011 (UTC)

Let's go for another slightly bigger trial to make sure those errors are fixed. Approved for trial (350 edits). --Chris 17:16, 4 December 2011 (UTC)

Trial complete. Petrb (talk) 21:07, 4 December 2011 (UTC)
Could you please report more recent? thanks :) Petrb (talk) 23:34, 4 December 2011 (UTC)
Yeah, that was my fault since I clicked on the "see also that IP". Sorry. But let us discuss those cases:
mabdul 23:53, 4 December 2011 (UTC)
That's a question of what it should do if more than one template is there. I will make it keep both.
Second bug is fixed. Petrb (talk) 08:41, 6 December 2011 (UTC)
1. And another bug: why were the old IP warnings removed here but nothing else changed? Either archive the old warnings or leave the collapsible box there! mabdul 12:39, 6 December 2011 (UTC)
2. And another move that needs discussion: [81] should this box/template really be archived? Shouldn't those pages get manual investigation and restoring (to the archive) of the old warnings? mabdul 12:42, 6 December 2011 (UTC)
Some cosmetics at User talk:128.174.150.43 (and many other talk pages): remove the unneeded whitespace if possible ;) mabdul 12:46, 6 December 2011 (UTC)
3. Can you explain that page http://toolserver.org/~petrb/logs/ you linked on the user page of the bot? It is neither up to date nor "usable", since it is getting rather long! mabdul 13:19, 6 December 2011 (UTC)
4. Why was an archive box added here? No archive was created! mabdul 13:26, 6 December 2011 (UTC) (oh, and by the way: it was reverted!) mabdul 13:27, 6 December 2011 (UTC)
5. User_talk:125.161.133.239/Archive_1: should the IPtalk template really be archived? Why not simply remove it? mabdul 13:29, 6 December 2011 (UTC)
When archiving this, please also remove the __TOC__ (or notoc, depending on what is on the page) mabdul 13:33, 6 December 2011 (UTC)
  • 1 it was removed because it's configured to remove it, that will be improved
  • 2 all moves that need discussion should be discussed then; however, if you don't want it to remove such stuff, feel free to add it to the bot's ignore or skip list. It's in the configuration and I will be happy to explain how it works: just insert it there, or send me what should be skipped, have a discussion, and meanwhile the bot will skip it
  • 3 logs are not displayed because it wasn't running on toolserver, trial is running on another pc
  • 4 Archive box was created because page was archived but abuse filter prevented creation of page
  • 5 misconfiguration only

Thank you. No idea what's wrong with toc Petrb (talk) 15:28, 6 December 2011 (UTC)

The TOC will be displayed as you can see in the archive in the middle of the page where the collapse bottom was. so that shouldn't be in there! mabdul 16:03, 6 December 2011 (UTC)
Fixed Petrb (talk) 20:03, 6 December 2011 (UTC)

Ok, let's have one more small trial. Approved for trial (400 edits). Then I would like to move towards the 2 month extended trial, however I think it might be a good idea to place a limit on how many edits the bot can do in a day, so that if more bugs are found, they are less damaging and can be fixed easier. --Chris 04:56, 9 December 2011 (UTC)
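A per-day edit limit of the kind suggested above could look something like this. It is a minimal sketch under stated assumptions: the `DailyEditLimiter` class and its method names are hypothetical and are not part of the bot or of any bot framework, and no specific daily limit was agreed in this discussion.

```python
from datetime import date
from typing import Optional


class DailyEditLimiter:
    """Hypothetical helper that refuses edits once a per-day quota is used up."""

    def __init__(self, max_edits_per_day: int) -> None:
        self.max_edits_per_day = max_edits_per_day
        self._day: Optional[date] = None
        self._count = 0

    def try_acquire(self, today: date) -> bool:
        """Return True and count the edit if today's quota is not yet exhausted."""
        if today != self._day:
            # A new day has started, so reset the counter.
            self._day = today
            self._count = 0
        if self._count >= self.max_edits_per_day:
            return False
        self._count += 1
        return True
```

The bot would call `try_acquire` before each save and stop for the day once it returns False, which bounds how many pages a newly found bug can touch before someone notices.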

BRFA trial part 3

So, what will now happen with the last ~3k pages? Will they get moved? Will they get cleaned up? If so, by whom and how? mabdul 21:00, 9 December 2011 (UTC)

Trial complete. Petrb (talk) 00:21, 10 December 2011 (UTC)
I did a fast check on the first 50 edits. I see two problems which really should be discussed. The first is at User_talk:142.35.26.32/Archive_1 - clicking on that template shows that the IP 142.35.26.32/Archive_1 has no edits - of course. So I think we really should change many templates.
The second problem I see is that on User_talk:142.35.51.2/Archive_1 the only notice is that the old comments were deleted. Checking the history at User_talk:142.35.51.2, I see the comment (cur | prev) 03:29, 2 January 2011 BD2412 (talk | contribs) m (53 bytes) (blank ancient IP talk page posts per WT:CSD. using AWB) (undo) - so it seems to me that there was a BRFA(?) for an AWB job. Can we undo the clearing and archive the old edits instead of the template? mabdul 00:36, 10 December 2011 (UTC)
I found a minor thing at User talk:142.31.44.81/Archive 1: It archives __FORCETOC__ which is identical to __TOC__. mabdul 19:25, 10 December 2011 (UTC) (added to the config on my own) mabdul 19:27, 10 December 2011 (UTC)
  • Where was the consensus to replace templates like {{ISP}} with {{ISP test}}, for example, with this bot? Even if there was, shouldn't the regular template just be turned into a randomizer so that all of these "tests" don't need to be replaced later on? And, furthermore, what is the point of these encouraging shared IP templates if there is no way to track their impact on account creation? Logan Talk Contributions 22:59, 10 December 2011 (UTC)
As far as it was explained to me wmf has access to these data. Petrb (talk) 23:03, 10 December 2011 (UTC)
Hi Logan,
Consensus is here on the original VPR thread, and yes, we can track this information. We can't randomize a transcluded template, but we'll return them back to normal after the test. Please let me know if you have any more questions. Maryana (WMF) (talk) 00:43, 11 December 2011 (UTC)
Okay, thanks Maryana. :) Logan Talk Contributions 03:55, 12 December 2011 (UTC)

At the moment I am considering whether this is ready to be approved for the two month trial or not. If anyone still has strong opposition/concerns to the bot, please speak below (likewise for those supporting the task).

Secondly, we need to deal with the broken pages from the previous trial. Personally, I think the best solution would be to have the bot fix those as it is editing the rest of the pages (as opposed to a mass run to fix all the pages, which I would strongly oppose). As I understand it all the genuinely broken pages have now been fixed, and it is only more cosmetic errors (e.g. "Archive" vs "Archive 1"), that are left. --Chris 19:05, 12 December 2011 (UTC)

Undecided: I would really like to see an archival for cleared pages (per the mentioned AWB job) - and thus this would need another trial.
For the "cosmetic" changes and moves I give a strong support. mabdul 20:47, 12 December 2011 (UTC)
  • Okay guys, let's get this thing worked out. We've had 3 tests and a month and a half to talk it over and work out the kinks – if we leave this sitting any longer, everyone's going to forget what the original idea for the test was in the first place :)
If there are dire concerns about the test, please voice them. If there are concerns about details that can be worked out as we run the actual test, then let's run the test and fix them along the way. The point of this and all our tests is to get a quick-and-dirty sample, figure out if there are any positive changes we as a community can easily make, and, if not, scrap the idea and move on. I know there's WP:NO DEADLINE, but I'm going to set an arbitrary one, anyway :) Can we say either yes, let's test or no, no test by the end of this week? Don't mean to be pushy, I'd just rather know sooner than later that this isn't a fruitful alley to pursue! Maryana (WMF) (talk) 20:40, 14 December 2011 (UTC)
    • Welcome to the English Wikipedia bot request system. It's long, and it's painful. But, by golly, it works. I should stress though, (historically) we do not do (quick and) dirty anything. That is to say, personally at least, I would not approve any bots knowingly running with faulty code. Now, I'm not saying that's the case here, but if it is, then I find it quite correct that faulty code should be fixed before a test (even if it at present has only resulted in cosmetic issues, these things are often suggestive of the potential for wider problems). Of course, if there are in fact no problems, then I'd support a trial in line with WP:VPR (I haven't read it). - Jarry1250 [Weasel? Discuss.] 09:12, 16 December 2011 (UTC)
Hehe, you could've stopped at "Welcome to the English Wikipedia" :) What I meant was: if there are known issues, please bring them to Petrb's attention, and if not, then let's test. I know he's itching to get going on this, too, and the longer it sits around, the likelier it is that people just coming into it will have no idea what's going on and will have to ask a lot of questions and get a lot of redundant explanation to get caught up to speed. Of course, I have exactly zero bot approval experience, so I could be totally wrong on this, but it seems like the people who are familiar with the bot's task should have enough info at this point to judge whether or not it does its job correctly. But again, me no bot herder, so maybe that's a bad assumption :)
Anyway, thanks for your help and let me know if there's anything you need from my end! Maryana (WMF) (talk) 17:35, 16 December 2011 (UTC)
      • "But, by golly, it works." [citation needed]. --Chris 10:07, 16 December 2011 (UTC)
        • Heh. No, but seriously, it does work. Bots do get approved through it and do end up doing a lot of good work. And a lot of bugs are found and fixed before thousands of edits are made. - Jarry1250 [Weasel? Discuss.] 16:50, 16 December 2011 (UTC)

Regarding the AWB job, these are all the edits I could find (about 219). I'm rather surprised that this was done, mainly because the discussion cited in the edit summary is from 2006. However, as I understand it, most of these pages will be unaffected by the bot (most of those pages no longer have any shared IP templates), so, at a guess, there'd be fewer than 50 pages that the bot would actually edit, which is not enough to warrant adding it to the task. On a side note, it appears that some of the edits removed shared IP templates; however, any cleanup of that is outside the scope of this task.

Secondly, regarding the cleanup. As I understand it, the two errors that need to be fixed now are the incorrect naming of "Archive" (instead of "Archive 1"), and the incorrect substitution of the archive box template. As I have said previously, I think the best way to deal with those would be to have the bot fix them as it goes through its normal activities; however, if that were to be done, another small trial would still be necessary, to ensure that the fixes don't accidentally break any more pages. --Chris 10:07, 16 December 2011 (UTC)

Thank you for the reply. It was actually implemented already, but if you want I can of course run another small trial to check that it's correct now. The main reason why the bot didn't fix any of the pages is that they were edited in the last run and the bot only edits pages which have not been touched for a certain period of time. Currently it moves all Archive pages and replaces the substituted template. Petrb (talk) 10:27, 16 December 2011 (UTC)
Yes, I think it would be best just to make sure that the fixes work correctly. Approved for trial (100 edits). --Chris 10:33, 16 December 2011 (UTC)
Is that a trial of the regular task or only of edits which fix previous edits? Petrb (talk) 16:34, 16 December 2011 (UTC)
Only edits which fix previous edits (if that's possible) --Chris 17:20, 16 December 2011 (UTC)

Trial complete. Petrb (talk) 10:53, 17 December 2011 (UTC)

Extra new line was fixed too Petrb (talk) 10:59, 17 December 2011 (UTC)


Extended Trial

Ok, those look fine. Let's get this moving again. Approved for extended trial (62 days). --Chris 11:12, 17 December 2011 (UTC)

I was unaware of this bot until my watchlist started to fill with these changes yesterday. I raised a question at Wikipedia_talk:WikiProject_user_warnings/Testing#Proposal:_shared_IP_test as the bot appeared to be exceeding what I would expect for a "test". I believe that the mass creation of archive pages on anon IP user pages is naff and unnecessary. It would be just as easy to delete all notices over a year old, they would still be in the history and we have a template for that.
I recommend a community wide RFC to gain a common understanding of the consensus for this mass change to how we handle anon IP user pages and to ensure a clear description of what this bot is up to and what analysis has been done to demonstrate this is needed.
I request the bot is stopped in the meantime.
If I get no reasonable reply on action to be taken here, I will stop this bot later today. -- (talk) 10:33, 18 December 2011 (UTC)
Fae, the RfC happened on VPR a month ago. I'm sorry you didn't get a chance to comment on it. To be very clear: this is not a "mass change to how we handle anon IP user pages" – it's a two-month A/B test. After the end of the trial, the bot will revert all its edits, and we'll analyze the data to see if there's any benefit to fast archiving. If it does look like it leads to less vandalism and more anons registering accounts, we'll have an RfC to talk about making any permanent changes. If not, we'll have learned something about the value of warnings. In the meantime, this is only a test, so please don't panic :) Maryana (WMF) (talk) 16:46, 18 December 2011 (UTC)
Lots of users prefer their talk pages archived to a page and not to null for a variety of reasons. Most obviously, in a continuous archive process, it is far easier to build up a picture of previous editing activity than by going through history.
As I see it, this seems to be a shot to nothing. No-one seems to have suggested a way in which this does any harm at all, beyond being a waste of the bot writer's time.
In this context, I'm not sure why you'd want to stop everything for weeks, hold long discussions, etc with the hope of being able to modify this task. I mean, feel free to, but it's not clear to me what would be gained from diverting people's time to this issue. (I may well be missing something here, I have to say, since I haven't read *all* the discussions.) - Jarry1250 [Weasel? Discuss.] 14:12, 18 December 2011 (UTC)
Stopped Petrb (talk) 15:18, 18 December 2011 (UTC)
@Fae I would like a reply from you regarding the reason for stopping the bot. There was already an RfC regarding this task and the vast majority of people agreed with the task, so unless you have a good reason to stop it, I will restart the bot in a few hours. We are approved to run the requested 2-month trial and it has already begun; in order to have accurate results from the task, we need to have the bot running for 2 months, otherwise it would be harder to get valid results from this research. Thank you Petrb (talk) 15:29, 18 December 2011 (UTC)
I can now see the archived Village Pump proposal where a number of alternative options were discussed. I remain confused why running this bot on 20,000 pages is considered a "test" (at least, based on the description that half of the 40,000 IP user pages with suitable templates will be handled by the bot). If half of all such pages are changed, then this is not a test as there is no way that we would mass revert all these changes. Was this really understood by the people supporting at the VP discussion? When I look at the "support" comments there were a range of test periods mentioned. None of these seemed to pick up on the suggestion that 20,000 pages would be changed during the test period. -- (talk) 18:03, 18 December 2011 (UTC)
BTW, as pointed out elsewhere, I did not actually stop this bot. I am unsure why this research needs to be done on 20,000 pages. Surely testing these stylistic changes could be done on a much smaller sample in order to make a decision for creating user archive pages and ISP header notices standardized compared to any other approach (such as collapsed sections or deleting old notices over a year old; both suggestions brought up on the VP discussion)? -- (talk) 18:08, 18 December 2011 (UTC)
Any issues? Next massive run will be started in approximately 1 week and 4 days Petrb (talk) 23:31, 22 December 2011 (UTC)

The more I keep seeing this, the more I keep thinking that old talk page warnings on IPs (e.g., >1 year) really don't need to be archived. In the past, we've typically just deleted old talk page warnings because they're simply not relevant: most of the warnings are dropped on dynamic IPs, and those that aren't dynamic IPs are typically schools where stale warnings don't matter. Because of the sheer number of IPs with talk pages, I'm not entirely sure archival is the best approach at this point, as prior warnings are in the page history (which vandal fighters check anyway, since IPs like to blank or alter warnings frequently), and making separate archive pages will just contribute to bloat in dump files. That said, the bot seems to otherwise work from a technical standpoint. --slakrtalk / 20:23, 13 January 2012 (UTC)

I completely agree. On the German Wikipedia, an admin runs an authorized bot that deletes all shared IP talk page messages after 24 hours. That might be a bit extreme for en.wiki, but I definitely think year-old messages aren't helping anybody. However, this issue was a somewhat divisive one during the VPR discussion. I'm hoping that once we get some data back from this test, we'll have more concrete quantitative evidence about the aggregate behavior of shared IP editors, which might help placate people's fears and suggest a future course of action. Anyway, yeah, let's definitely keep brainstorming more optimal solutions post-testing :) Maryana (WMF) (talk) 21:18, 13 January 2012 (UTC)
Slakr, the reason the bot is doing this is that the community requested it, not because we wanted to do that. No matter what your opinion is (actually I agree with it), we cannot change the task because of that. Perhaps you could join the discussion which will probably be started when this trial finishes, so that you can explain this to the rest of the community and hopefully we will be able to tweak the bot to do it right. Petrb (talk) 17:47, 14 January 2012 (UTC)

End of extended trial

Now that we're coming to the end of the extended trial, any data that might be useful in giving this bot final approval? MBisanz talk 15:15, 6 February 2012 (UTC)

I would like some hard data published as the results. Experimenting on 20,000 pages and (presumably) creating 20,000 additional pages as permanent archive pages of old IP warnings will need some credible justification before letting this bot run wild. If the benefits seem weak, this bot should not go ahead. I still believe this was an unnecessary bot test, with a scope not clearly supported by the RFC mentioned above and seems like a solution looking for a problem rather than the reverse. I do not accept that this now has unstoppable momentum on the "but we've created it now" dubious rationale. -- (talk) 15:25, 6 February 2012 (UTC)
First, I agree about publishing the data. That's the whole point of running a proper A/B test instead of just the usual vague "trial". Second, I think you need to show a little more good faith, Fae. Trying new things on Wikipedia is hard enough without an attitude that is unnecessarily skeptical towards new ideas and change. Steven Walling (WMF) • talk 20:54, 6 February 2012 (UTC)
Sorry if all my comments appear like bad faith to you. Let me try rephrasing in a way that you will not find personally offensive. I am trying to ask for an unambiguous need for this bot to add archive pages to IP user home pages and mass reformat the way all such pages are formatted. Such mass changes should, in my opinion, have a clear mandate and the test data should be able to demonstrate the benefits of making this part of the default infrastructure of the way we deal with Anon IP accounts across the whole of Wikipedia. If these things are in place then you have my support. -- (talk) 22:30, 6 February 2012 (UTC)
There was a clear mandate. We spent a month talking about it on the Village Pump for proposals, and the proposal evolved significantly based on what everyone talking about it wanted. Steven Walling (WMF) • talk 19:54, 7 February 2012 (UTC)

{{OperatorAssistanceNeeded|D}} Pinging for data on the trial result. So far, community consensus has been established from the VP discussion linked and discussed above, so a new discussion or indication of change would be needed to alter that. Still waiting on final technical validity of the test to give final approval. MBisanz talk 21:10, 8 February 2012 (UTC)

Hi all, it's a little early for data given that the test hasn't ended yet :) The official end date of the two-month period is February 19th (that's 2 months from when we started the trial). While I certainly wish it were possible to have instantaneous numbers and graphs for you, it's going to take Faulkner some time to gather all the samples and run a rigorous analysis on them, especially given that this is only one of a number of tests that are ending around then. We can push it up to the front of the queue in terms of priority, but I'm pretty sure it will still take a week or two. So, if you'd like to have a date in mind for when to expect results, I'd say probably first or second week of March.
As to what happens on February 19th: the bot will stop archiving talk pages. If people feel really strongly that it should also go back and revert all of its edits to remove the archives it's already created, I'm sure Petrb will be happy to do that. But I don't see a logical reason for that until after we actually look at the results – if we see in the analysis that there is a clear benefit to archiving talk pages, it would be pretty cumbersome to re-revert everything again and put a new archiving system in place. Why not just wait and see what happened and then make the decision? Though we did get BAG's approval, this isn't a traditional BAG test (we already did several of those).
Finally, I really wish Steven and I didn't have to keep stressing this point, but: this is not some sneaky way of forcing a permanent change on the community without its approval. That's not what our testing is about, previous, now, or ever. We've run eight tests so far as part of our WP:UWTEST project and have never kept a test running past when we said it would end, so please don't make those kinds of accusations. Again, on February 19th, the bot will stop archiving, we'll analyze the data, present it to the community, and ask everyone to come to a new consensus if there's conclusive evidence suggesting a need for it. Maryana (WMF) (talk) 22:27, 8 February 2012 (UTC)
Ok, no rush. I do not think Petrb should revert the test edits, as that would be form over substance to the absurd degree. If, as we get closer to the 19th, Faulkner thinks everything is running fine, I would not object to a temporary continuation pending the final data in order to maintain continuity to the other projects you may be working on that don't have time-limited trials. MBisanz talk 06:16, 9 February 2012 (UTC)
  • Comment I don't think this is worth doing. I've been editing from IP addresses for years and have probably been assigned 100's of them. It's quite rare that I get one with any kind of notices on them. I think it's better to leave the talkpages intact, since they sometimes contain info relevant to article development. One thing your bot could do instead is undelete all the IP user and talk pages that one of MZMcBride's unapproved bots deleted a few years ago. A number of those had useful info. 67.117.145.9 (talk) 08:13, 18 February 2012 (UTC)

Has your data analysis finished yet? Josh Parris 01:02, 24 February 2012 (UTC)

No, we're in the process of wrapping up other tests and running analysis on them (which you're welcome to watch in close to real time on Faulkner's journal). As I said above, you should expect results for this test about 2 weeks from now. Thanks for your patience with our tiny 3-person analytics team :) Maryana (WMF) (talk) 23:11, 24 February 2012 (UTC)

Bots that have completed the trial period
