Wikipedia:Bots/Requests for approval

BAG member instructions

If you want to run a bot on the English Wikipedia, you must first get it approved. To do so, follow the instructions below to add a request. If you are not familiar with programming, it may be a good idea to ask someone else to run a bot for you rather than running your own.

 Instructions for bot operators

Current requests for approval

JJMC89 bot 13

Operator: JJMC89 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 05:42, Wednesday, July 26, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: inactive admins.py on GitHub

Function overview: Report and notify Wikipedia:Inactive administrators

Links to relevant discussions (where appropriate): Request by xaosflux

Edit period(s): Daily

Estimated number of pages affected: 0-1 daily + # inactive admins semi-monthly

Namespace(s): User talk, Wikipedia

Exclusion compliant: No

Function details: This is a replacement for MadmanBot 13.

For inactive admins that will be eligible for desysopping on the first of the next month:

  • First run of the month:
    • Report inactive admins in new Month YYYY section at Wikipedia:Inactive administrators/YYYY
    • Notify inactive admins of pending suspension via talk page and email (if enabled)
  • Daily:
    • Remove active admins from Month YYYY section at Wikipedia:Inactive administrators/YYYY
  • -7 days:
    • Notify inactive admins of imminent suspension via talk page and email (if enabled)

Configuration: User:JJMC89 bot/config/InactiveAdmins

The task is not exclusion compliant since only one project page is edited and notices are mandatory.
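
The schedule in the function details can be sketched roughly as follows. This is only an illustration, not the code in inactive admins.py: the step names are hypothetical labels, and "-7 days" is assumed to mean seven days before the first of the next month.

```python
from datetime import date, timedelta


def first_of_next_month(today):
    """Return the first day of the month after `today`."""
    if today.month == 12:
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)


def actions_for(today, already_reported):
    """Pick the steps to run today (step names are hypothetical labels)."""
    actions = []
    if not already_reported:
        # First run of the month: create the section and send the first notices.
        actions += ["report_section", "notify_pending"]
    # Every day: drop admins who have become active again.
    actions.append("remove_active")
    if first_of_next_month(today) - today == timedelta(days=7):
        # Assumed meaning of "-7 days": one week before the first of next month.
        actions.append("notify_imminent")
    return actions
```

For example, `actions_for(date(2017, 7, 25), True)` selects the daily cleanup and the second notice, since 1 August 2017 is seven days later.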

Discussion

@Madman: for commentary. Your prior bot task has been operating unreliably lately. — xaosflux Talk 11:17, 26 July 2017 (UTC)
Emailed Madman as well. — xaosflux Talk 11:22, 26 July 2017 (UTC)

@JJMC89: can you run this on a shadow page first for comparison? (e.g. Wikipedia:Inactive administrators/2017/test) ? — xaosflux Talk 11:19, 26 July 2017 (UTC)

@Xaosflux: I output the inactive list on testwiki without notifications. This is what would be reported for August 2017. (Note: no templates there.) The only difference that I note is the last log dates. MadmanBot appears to be excluding some log types, while I am not currently excluding any. — JJMC89(T·C) 16:01, 26 July 2017 (UTC)
I actually expected another entry on there - are you running a whitelist? — xaosflux Talk 17:55, 26 July 2017 (UTC)
Yes, the config currently has Useight excluded. — JJMC89(T·C) 18:46, 26 July 2017 (UTC)
Possibly should exclude "automatic patrol" log entries - these would be redundant with edits anyway. — xaosflux Talk 17:59, 26 July 2017 (UTC)
I can code log exclusions tonight (UTC-7). — JJMC89(T·C) 18:46, 26 July 2017 (UTC)
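
A minimal sketch of what such a log exclusion could look like, assuming log entries are available as (type, action, timestamp) tuples. The function name, the data shape, and the excluded pair are assumptions for illustration, not the bot's actual code.

```python
from datetime import datetime

# Assumed exclusion: automatic patrol entries, which duplicate edits anyway.
EXCLUDED_LOG_ACTIONS = {("patrol", "autopatrol")}


def last_activity(last_edit, log_entries, excluded=EXCLUDED_LOG_ACTIONS):
    """Return the latest timestamp among the last edit and counted log entries.

    `log_entries` is an iterable of (log_type, log_action, timestamp) tuples;
    entries whose (log_type, log_action) pair is excluded are ignored.
    """
    stamps = [ts for log_type, action, ts in log_entries
              if (log_type, action) not in excluded]
    if last_edit is not None:
        stamps.append(last_edit)
    return max(stamps) if stamps else None
```

With this filter, an admin whose only recent log entry is an automatic patrol is dated by their last edit or last other log action instead.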

Yobot 56

Operator: Magioladitis (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 23:15, Sunday, July 9, 2017 (UTC)

Automatic, Supervised, or Manual: Manual

Programming language(s): WPCleaner

Source code available:

Function overview: Remove duplicate categories

Links to relevant discussions (where appropriate): Wikipedia:Administrators'_noticeboard/Archive290#Request_to_remove_duplicated_categories_from_pages

Edit period(s): Daily

Estimated number of pages affected: 20 pages per day

Namespace(s): Mainspace only

Exclusion compliant (Yes/No):

Function details: Use standard WPCleaner functionality

Discussion

@BU Rob13: This is CHECKWIKI error 17. - Magioladitis (talk) 23:15, 9 July 2017 (UTC)

This is part of Yobot's reaffirmation. Yobot has been running this for many years. This task has clear consensus. -- Magioladitis (talk) 23:17, 9 July 2017 (UTC)

How does it handle category sortkey clashes? Headbomb {t · c · p · b} 00:37, 10 July 2017 (UTC)
Headbomb I'll do this part manually. AWB/WPC leave these unaffected. -- Magioladitis (talk) 09:49, 10 July 2017 (UTC)
@Magioladitis: Your link to the related discussion is not resolving, please link to the current archive location, or use a permalink. — xaosflux Talk 13:37, 16 July 2017 (UTC)
Xaosflux Done. -- Magioladitis (talk) 13:40, 16 July 2017 (UTC)

PrimeBOT 19

Operator: Primefac (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 05:38, Sunday, July 16, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB

Source code available: WP:AWB

Function overview: Remove duplicate categories on articles.

Links to relevant discussions (where appropriate): WP:AN discussion with no major opposition

Edit period(s): one time run (can be extended to "once a month" if necessary, i.e. if there are a large number of pages added each month)

Estimated number of pages affected: 801 (ish)

Namespace(s): Article

Exclusion compliant (Yes/No): Yes

Function details: AWB automatically removes any duplicate categories if:

  • Both category texts are identical (e.g. [[Category:Foo]] and [[Category:Foo]])
  • Only one category has a sort key (e.g. [[Category:Foo|Bar]] and [[Category:Foo]])

This is, from a rough count, all but 20-30 of the pages listed at the CheckWiki dump for this task, the rest of which can be dealt with manually. Essentially, this is running AWB with only genfixes enabled. Edit summary will read Removing duplicate categories (CheckWiki #17) - BRFA.
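
The two removal rules can be illustrated with a short Python sketch. This is not AWB's implementation (AWB is written in C#); it is a simplified approximation that ignores category name normalisation such as first-letter case and underscores, and it assumes the link carrying the sort key is the one to keep.

```python
import re

# One category link, optional sort key, plus the newline that follows it.
CATEGORY_RE = re.compile(r"\[\[Category:([^\]|]+)(?:\|([^\]]*))?\]\]\n?")


def remove_duplicate_categories(text):
    """Drop duplicate category links unless their sort keys conflict."""
    matches = list(CATEGORY_RE.finditer(text))
    by_name = {}
    for m in matches:
        by_name.setdefault(m.group(1).strip(), []).append(m)

    drop = set()
    for group in by_name.values():
        if len(group) < 2:
            continue
        keys = {m.group(2) for m in group if m.group(2) is not None}
        if len(keys) > 1:
            continue  # sort key clash: leave the page for a human
        # Keep the first link that carries the sort key, else the first link.
        keep = next((m for m in group if m.group(2) is not None), group[0])
        drop.update(m.span() for m in group if m is not keep)

    # Rebuild the text without the dropped links.
    out, pos = [], 0
    for m in matches:
        if m.span() in drop:
            out.append(text[pos:m.start()])
            pos = m.end()
    out.append(text[pos:])
    return "".join(out)
```

A page with two different sort keys for the same category comes back unchanged, which matches the "sort conflicts are ignored" behaviour described in the discussion.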

Discussion

Same as Wikipedia:Bots/Requests for approval/Yobot 56, which was filed some days ago. -- Magioladitis (talk) 08:01, 16 July 2017 (UTC)

  • Despite not actually listing it properly, this is a duplicate task to 56. I'm fine withdrawing if it gets approved first. Primefac (talk) 12:40, 16 July 2017 (UTC)
  • What is your strategy for resolving sort conflicts? — xaosflux Talk 13:40, 16 July 2017 (UTC)
    Sort conflicts are ignored by AWB, so there would be no change. I'll be sure to enable the skip options for when the category isn't changed at all (i.e. "skip if only whitespace" etc). Primefac (talk) 14:11, 16 July 2017 (UTC)
  • Assuming no sort conflicts, this is expected to be 100% cosmetic for readers, correct? — xaosflux Talk 13:40, 16 July 2017 (UTC)
    Correct. I know this generally runs afoul of the cosmetic rules, but the AN discussion (sort of) determined that having duplicate categories could potentially mess up hotcat, as well as users changing/modifying/removing/etc categories from a page. Primefac (talk) 14:11, 16 July 2017 (UTC)
  • I picked a random page from the checkwiki list, Berger Blanc Suisse. In looking at the page, the obvious cleanup needed was fairly easy to spot and resolve - but simply removing the duplicated category alone would not make this page any better for the readers. With this being such a small list might it be better to be curated manually? Same question to @Magioladitis: so I don't have to ask twice. It's possible I ended up in some edge case. — xaosflux Talk 14:44, 16 July 2017 (UTC)
    • Xaosflux duplicated reflists are tracked as a separate task. Duplicated content is not that common though. I tend to find them occasionally.
    • On the other issue: As with the ISBN thing, I am OK with Primefac's bot also getting approval. Recall that we used to have 2-3 bots doing this before Bgwhite's disappearance. -- Magioladitis (talk) 14:48, 16 July 2017 (UTC)
      • I'm not really worried about that for this task - when we have multiple bots approved for the same task only real concern is collisions or inconsistency - don't think that is an issue here. — xaosflux Talk 14:54, 16 July 2017 (UTC)
        • I'm starting to wonder if this task is even necessary. I checked the first dozen pages on the dump page and Fram had already been through and fixed all the duplicate cats. That, combined with your example above, makes me think this would be better (from a "fixing everything at once" perspective) as a manual task. The tools "live" version shows only 700 pages. If the number is going down, there are clearly people aware of and actively fixing this issue. However, the spread of duplicate cats doesn't seem to affect any one "type" of article, so clearly bundling it with other similar AWB-worthy tasks doesn't make much sense. Primefac (talk) 15:01, 16 July 2017 (UTC)
        • This fix is another that AWB just kind of "cleans up" without actually fixing the issue that there was an entire article stuffed into the {{multiple issues}} template. I'm starting to think the issues with this task are less about COSMETIC and more about CONTEXT. Primefac (talk) 15:19, 16 July 2017 (UTC)
  • Xaosflux, I am considering withdrawing this request. I spent the last hour or so cutting the list in half (i.e. 350ish edits, plus 90 already done by someone else), and probably 40 of those I had to also fix either duplicate Reference sections, bad code, poor formatting, etc. While AWB can do this task automatically, there are too many cases where a set of eyes making sure there isn't anything else wrong with the page would be better (especially since this is a rather cosmetic change). Primefac (talk) 17:06, 16 July 2017 (UTC)
    That was my initial thought, that these are caused by inexperienced editors who likely left additional errors not suited for automation. — xaosflux Talk 17:21, 16 July 2017 (UTC)
    • Xaosflux Most of them are caused by Cydebot. If we fix Cydebot we are good. -- Magioladitis (talk) 19:17, 16 July 2017 (UTC)
      • I find that hard to believe, given that I've just checked 50 pages where both Cydebot and I edited the page, and in not a single one of them did Cydebot add a category. Most were removing cats per a CFD result. Primefac (talk) 21:10, 16 July 2017 (UTC)
      • @Magioladitis: interesting - but I'm not seeing the data to back up your statement - from Wikipedia:CHECKWIKI/WPC_017_dump I picked 4 random pages: Read with Me, Ronald de Boer, Sophie Totzauer, and Thiago Cunha. I didn't see any errors introduced by Cydebot, and there was only a Cydebot edit to one of them. Do you have any additional information that Cydebot causes most of these errors? — xaosflux Talk 22:45, 16 July 2017 (UTC)
        • @Xaosflux: on 4/19/14 Cydebot created thousands of entries, same on 8/20/14. Example. I used to keep a record of that. On 12/12/14 we had 1000 pages in one day. -- Magioladitis (talk) 03:13, 17 July 2017 (UTC)
          • OK, so that was over 3 years ago - is this a current issue? — xaosflux Talk 04:03, 17 July 2017 (UTC)
          • Xaosflux There are explosions that happen depending on the XfDs. Anyway, if the task should be done manually I am still OK. If the task should not be done I am still OK. Using AWB for the task sometimes gave the impression I was removing a valid category. If anyone does it they should be very cautious with the edit summary. -- Magioladitis (talk) 12:40, 17 July 2017 (UTC)
  • I have no objection to removing straight duplicates - they are easily detectable and removed, so very suited for a bot. Given the numbers that Primefac gave at the discussion (here) I would say that 1/2-3/4 of pages in the list is enough for a bot to be worthwhile. However, with the edge cases I am wary of WP:CONTEXTBOT and would leave the rest to manual editors - they seem to be doing an ok job thus far. TheMagikCow (T) (C) 18:20, 17 July 2017 (UTC)
  • I've just finished with all of the AWB-fixes-it-automatically pages. Magioladitis, how often does the dump page get updated, and how much (or less) reliable is it than the labs page? Primefac (talk) 20:19, 18 July 2017 (UTC)
    • Primefac The dump page is 100% reliable (at the point of its creation) because it is based on the full database dump, while the labs page may miss pages since it checks pages up to a certain number every 15 minutes. Due to the recent community disputes the dump page is not regularly updated anymore since the people working with it became inactive or semi-active. -- Magioladitis (talk) 05:46, 20 July 2017 (UTC)
      • Okay, so it's basically useless after a few weeks, assuming that the pages have actually been edited. Live/labs page has 14 pages on it currently, though I'm sure it'll find more as time passes (I didn't clear out all of the dump page when I went through). Primefac (talk) 15:56, 20 July 2017 (UTC)
  • "Live" page lists 36 pages, though undoubtedly there will be more as time goes on. If this task goes through, the bot op will still have to mark those pages as "done" on the tools page to make the list accurate.
When I went through and cleared out the live list the other day, there were a ton of pages where there were genfixes other than cat changes. I cannot, however, figure out the combination of "skip" conditions that would make it so that if a cat change isn't implemented the page is skipped. Magioladitis may know. If it isn't possible to implement this skip condition, then I think it doesn't make sense to have this as an automated task (since it would literally be a pure "genfixes" bot). Primefac (talk) 14:46, 21 July 2017 (UTC)

Yobot 31

Operator: Magioladitis

Time filed: 19:49, Wednesday, 1 February 2017 (UTC)

Automatic or Manually assisted: Automatic, supervised

Programming language(s): AWB / WPCleaner

Source code available: AWB is open source. I can provide my settings file if asked.

Function overview: Moving HATNOTES to the top per WP:LAYOUT and WP:HNP to help accessibility and navigation

Links to relevant discussions (where appropriate): Various discussions in various places show that this is a wanted task. Wikipedia:Bots/Requests for approval/Yobot 14

Edit period(s): Often

Estimated number of pages affected: 200 pages per month

Exclusion compliant (Y/N): Y

Already has a bot flag (Y/N): Y

Function details: I'll run through articles transcluding DABlinks. I'll use a custom module created for AWB and perform genfixes only if a DABlink and/or HATNOTE has to move to the top. AWB will do the rest.

My intention is to have auto-tagger activated too. If asked, I can deactivate it.

-- Magioladitis (talk) 14:11, 20 April 2010 (UTC)

Discussion

  • Is moving hatnotes to the top done by your own regex or general fixes? I support this if the setting "Skip if genfixes only" is turned on. ~ Rob13Talk 21:41, 1 February 2017 (UTC)

BU Rob13 I provide a specific skip condition in case the main task is not done. Available at User:Yobot/Task 14. -- Magioladitis (talk) 21:44, 1 February 2017 (UTC)

  • Some questions please:
    • Is this a checkwiki thing?
    • Do hatnotes always go on top or are there exceptions?
    • Is this something Yobot has done previously?
    • Are there any other bots which do this job?
    • To describe the function precisely, please could you link to a diff of such a change made manually?
    • Please give more details about this "custom module".
  • Thanks — Martin (MSGJ · talk) 21:48, 1 February 2017 (UTC)

MSGJ

You are welcome. Community health is very important. -- Magioladitis (talk) 21:51, 1 February 2017 (UTC)

Again, this should not require general fixes. The task description does not mention them, so I take it this task does not include them anyway, but for the sake of clarity it is worth pointing out explicitly. — Carl (CBM · talk) 12:12, 2 February 2017 (UTC)

{{BAGAssistanceNeeded}} -- Magioladitis (talk) 01:23, 15 February 2017 (UTC)

So are all hat notes within sections ignored? What if there is a hatnote somewhere in the body of a section, will the bot then move it to the top of the section? Next, how does the existence of {{dablinks}} suggest hatnotes are misplaced? Finally, what is "auto-tagger"? MusikAnimal talk 17:55, 20 February 2017 (UTC)
Yes. No. I am not sure I understand. - Magioladitis (talk) 23:11, 22 March 2017 (UTC)

Auto-tagger is Wikipedia:AutoWikiBrowser/General_fixes#Mainspace_tagger. -- Magioladitis (talk) 19:20, 20 February 2017 (UTC)

{{BAGAssistanceNeeded}}

@Magioladitis: In the function details you wrote I'll run through articles transcluding DABlinks. I'm confused why you would look for transclusions of {{dablinks}} to find misplaced hatnotes. Wouldn't you instead look for {{hatnote}} and similar templates? What does Template:Dablinks have to do with hatnotes?
I can't really comment on the use of auto-tagger, as I'm not as familiar with AWB as others. It seems to me this is OK so long as the bot does not only do this. So it can add tags if and only if it also makes approved changes, in this case correcting the placement of hatnotes. I take it adding tags is not considered "general fixes"? Perhaps BU Rob13 has an opinion on the use of auto-tagger (pinging as he is more experienced with AWB)? MusikAnimal talk 01:54, 10 April 2017 (UTC)
@Musikanimal: If "skip if no replacement" is checked, then there is no risk to running either genfixes or auto-tagger. Auto-tagger is just a small portion of the genfixes that can run independently, I believe, although I could be wrong on that. When you check "skip if no replacement", that means that all edits will be skipped which do not perform one of the replacement rules designed for the main task. Among the tens of thousands or possibly hundreds of thousands of AWB edits I've done or overseen (semi-auto and as a bot operator), I have seen zero cosmetic-only errors caused by bugs in that particular option. If it's enabled, auto-tagger or genfixes would both be extremely safe to include. ~ Rob13Talk 02:17, 10 April 2017 (UTC)

@Musikanimal: Err... I mean any of the hatnote templates (e.g. {{other people}} etc.). Not the dablinks template. Typo!!! -- Magioladitis (talk) 07:43, 10 April 2017 (UTC)

@Magioladitis: Could we list all the relevant templates in the "Function details"? Perhaps you'll be looking for all templates in Template:Hatnote templates? MusikAnimal talk 17:05, 11 April 2017 (UTC)
@Musikanimal: This is the regex.
public static readonly Regex Dablinks = Tools.NestedTemplateRegex(new [] { "about", "about2", "about-distinguish", "about-distinguish2", "ambiguous link", "for", "for2", "for3", "dablink", "distinguish", "distinguish2", "distinguish-otheruses", "distinguish-otheruses2", "further", "further2", "hatnote", "otherpeople", "otherpeople1", "otherpeople2", "otherpeople3", "other hurricanes", "other people", "other people1", "other people2", "other people3", "other persons", "otherpersons", "otherpersons2", "otherpersons3", "otherplaces", "other places", "otherplaces3", "other places3", "otherships", "other ships", "otheruses-number", "other uses", "other uses2", "other uses3", "other uses4", "other uses6", "otheruses", "otheruses2", "otheruses3", "otheruses4", "other uses of", "otheruse", "outline", "2otheruses", "redirect-acronym", "redirect-distinguish", "redirect-distinguish2", "redirect-several", "redirect", "redirect2", "redirect3", "redirect4", "redirect5", "redirect6", "redirect10", "see also", "this", "two other uses", "three other uses", "disambig-acronym", "selfref" }, false);
-- Magioladitis (talk) 17:40, 11 April 2017 (UTC)
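
To make the intended effect concrete, here is a much-simplified Python sketch of moving hatnotes to the top of the lead. It is not the AWB module (which is C# and handles nested templates): it only recognises hatnotes that occupy a whole line, uses a handful of names from the list above, and treats every other lead line the same way, so it does not address the question of which templates produce visible output.

```python
import re

# A few names taken from the regex above; the full list is much longer.
HATNOTE_NAMES = ("about", "for", "distinguish", "hatnote", "other uses", "redirect")

# A whole line holding one hatnote template with no nested braces.
HATNOTE_LINE_RE = re.compile(
    r"^\{\{\s*(?:" + "|".join(re.escape(n) for n in HATNOTE_NAMES)
    + r")\s*(?:\|[^{}]*)?\}\}\s*$",
    re.IGNORECASE,
)


def move_hatnotes_to_top(text):
    """Move whole-line hatnotes in the lead above everything else."""
    lines = text.split("\n")
    # Only the lead is considered; hatnotes inside sections are left alone.
    lead_end = next((i for i, ln in enumerate(lines) if ln.startswith("==")),
                    len(lines))
    lead = lines[:lead_end]
    hatnotes = [ln for ln in lead if HATNOTE_LINE_RE.match(ln)]
    others = [ln for ln in lead if not HATNOTE_LINE_RE.match(ln)]
    return "\n".join(hatnotes + others + lines[lead_end:])
```

If the returned text equals the input, nothing needed to move, which is the case where the page would be skipped.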
Thryduulf I will move the hatnote above those. The hatnote refers to Wikipedia in general. These templates only refer to the article itself. Still, if it is made clear somewhere that all these templates can be placed at the top in any order, I can adjust accordingly. -- Magioladitis (talk) 09:30, 28 April 2017 (UTC)
That's fine, as long as it will not make any changes if the hatnote is below only templates that do not display anything in the content area of the article. i.e. if the hatnote is also below a visible template, or another authorised change would make a difference to how the article looks when rendered, then move the hatnotes above these templates at the same time, but otherwise skip the article. Thryduulf (talk) 10:36, 28 April 2017 (UTC)
Thryduulf Ideally, this is fine and this is a good comment. How do we know if a template produces a visible output or not? -- Magioladitis (talk) 13:58, 28 April 2017 (UTC)
I am not aware of any single collection of all of them, but everything at the top of an article that has a name starting "Use " (including the space) is invisible and everything in Category:Top icon templates or a subcategory renders outside the content area. I don't know that that is all of them, but it's going to be the vast majority at least. If you or someone else can generate a list of templates that are used on the first line of an article space page it should be possible to identify almost everything. Actually, a category for templates that produce no visible output might not be a bad idea as it wouldn't surprise me if other bots would find that useful too - any idea where the best place to propose that would be? 14:40, 28 April 2017 (UTC)
I've just discovered another set of templates that are invisible - those in Category:Varieties of English templates and/or which start "Engvar". Thryduulf (talk) 16:58, 28 April 2017 (UTC)
I've started a discussion about categorising all the invisible templates at Wikipedia talk:WikiProject Templates#categorising article-space templates with no visible output that you may wish to contribute to. Thryduulf (talk) 17:15, 28 April 2017 (UTC)
Thryduulf The Varieties... should be all placed in the talk page and I have a code that works perfectly with that. -- Magioladitis (talk) 07:15, 29 April 2017 (UTC)
What have talk pages got to do with this? Thryduulf (talk) 09:30, 29 April 2017 (UTC)

Thryduulf See {{American English}} for instance. "This template may be included on talk pages". -- Magioladitis (talk) 13:25, 29 April 2017 (UTC)

Have you thought how you will deal with temporary notices? e.g. {{User:RMCD bot/subject notice}} (used for requested moves), {{dated prod}}, {{Temporarily undeleted}}, {{Article for deletion}}, etc. All the deletion ones seem to be in (a subcategory of) Category:Deletion templates, but that also includes many other templates, I haven't found a category related to any requested moves or other temporary messages. I think hatnotes should appear above article maintenance templates that are not temporary (e.g. {{Globalise}}) or which have no definitive timeline (e.g. merging and splitting), so I'm happy for this bot to move those relative to each other even as the only change. However, I'd prefer it not to edit solely to move a hatnote relative to a temporary notice, but I'm willing to listen to other opinions. Thryduulf (talk) 14:13, 8 May 2017 (UTC)

They should appear before temporary notices - someone hearing "this page is a copyvio" for example, will likely leave the page before getting the message that the info they're after is elsewhere. (In fact with the copy-vio template they would never see it.) The volume of edits is expected to be low. All the best: Rich Farmbrough, 18:34, 27 May 2017 (UTC).
Note: Some of these hat-notes are obsolete, I will make a request at WPT:AWB to have them removed from the regex. All the best: Rich Farmbrough, 18:41, 27 May 2017 (UTC).
Logged phab:T166440. All the best: Rich Farmbrough, 18:49, 27 May 2017 (UTC).

A user has requested the attention of a member of the Bot Approvals Group. Once assistance has been rendered, please deactivate this tag. -- Magioladitis (talk) 07:17, 22 July 2017 (UTC)

Bots in a trial period

Yobot 57

Operator: Magioladitis (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 14:58, Wednesday, July 12, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic Supervised

Programming language(s): AWB

Source code available:

Function overview: Fix {{http...}}

Links to relevant discussions (where appropriate): Wikipedia_talk:WikiProject_Check_Wikipedia#About_700_articles_to_clean_up

Edit period(s): Daily

Estimated number of pages affected: 700 in first run

Namespace(s): Mainspace

Exclusion compliant (Yes/No):

Function details: Find & Replace to change {{http...}} to [http...]
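
The plain find-and-replace can be sketched as a single regular expression. This is an illustration in Python rather than the AWB rule itself, and the pattern is an assumption: it only matches a bare URL wrapped in double curly braces, with no pipes or nested braces.

```python
import re

# Matches {{http://...}} or {{https://...}} with no pipes or nested braces.
BRACED_URL_RE = re.compile(r"\{\{\s*(https?://[^\s{}|]+)\s*\}\}")


def braces_to_brackets(text):
    """Rewrite {{http...}} as [http...]."""
    return BRACED_URL_RE.sub(r"[\1]", text)
```

Real templates such as {{cite web|url=...}} are untouched, because the pattern requires the URL to follow the opening braces directly.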

Discussion

Support. This task may need to run supervised. See this correction, where {{cite web}} was the best choice, and this correction, where removing the curly braces was a better choice than replacing them with square brackets. – Jonesey95 (talk) 19:47, 12 July 2017 (UTC)

support, assuming the replacements are context aware per the examples provided above by Jonesey95 (i.e., can detect the difference between a bare URL wrapped in curly braces vs. a missing 'cite web'). Frietjes (talk) 20:48, 13 July 2017 (UTC)

  • So when will it decide to use {{cite web}} over square brackets? It might be preferable to convert to cite web in many cases than reduce it down to a plain link.—CYBERPOWER (Chat) 16:25, 16 July 2017 (UTC)
    • Cyberpower678 I guess this applies to almost(?) all bare links? Maybe, we should convert everything to cite web? We used to have a bot for that purpose in the past. -- Magioladitis (talk) 16:28, 16 July 2017 (UTC)
      • My own bot does that to an extent. I believe that is the preferable path to take. Of course that would mean the bot needs to figure out the access date and all the required parameters. @Jonesey95 and Frietjes: What do you think?—CYBERPOWER (Chat)
        • Cyberpower678 I certainly prefer this approach. The bot should try and access the page and even get metadata from it. Was it your bot that used to do that? -- Magioladitis (talk) 16:33, 16 July 2017 (UTC)
          • IABot does it when it sees that as appropriate. Some bare links do get converted to full cite templates. Others, to preserve context, it just leaves as is.—CYBERPOWER (Chat) 16:35, 16 July 2017 (UTC)
            • @Jonesey95 and Frietjes: Pinging again, as it appears to not have worked last time.—CYBERPOWER (Message) 01:59, 19 July 2017 (UTC)
              • If a bot or script is doing the conversions, it should not add an access-date, since that is the date when the content of the web page was verified to support the statement in question (assuming the link is inside ref tags). If the bot/script is adding things like a title, each one will need to be checked for reasonableness. We don't want a bunch of titles like "Error 404" inserted into WP. I think it would be appropriate to approve this task as a bot-flagged task for a human editor to perform with a script, checking every edit. – Jonesey95 (talk) 04:55, 19 July 2017 (UTC)
                • it sounds like human supervision is needed as suggested above and below. Frietjes (talk) 12:33, 19 July 2017 (UTC)
  • I think there is an issue of WP:CONTEXTBOT here. Not all links as {{https...}} will be appropriate as a bare url - some may be specific cite templates. See the match on Paul McCartney and Victoria Azarenka the first 2 results in the insource search that are both better suited to an appropriate cite template. This is a task that needs doing, but I think a semi-automatic approach would be better. TheMagikCow (T) (C) 12:49, 17 July 2017 (UTC)
  • @Magioladitis: since context is important here, and since there are only 700 pages to run this on, I would advise making this a supervised task. It's not too difficult to review 700 edits.—CYBERPOWER (Chat) 13:03, 19 July 2017 (UTC)
    • @Cyberpower678: done. -- Magioladitis (talk) 15:03, 19 July 2017 (UTC)
      • Approved for trial (50 edits). So let's make sure we don't have to clean up 700 edits; let's see if we can fix any problems that arise in the first 50.—CYBERPOWER (Chat) 15:52, 19 July 2017 (UTC)

Cyberpower678 I am not sure that we set the exact rules of which method of conversion to choose each time. Should I use the bracket conversion or the cite web conversion? -- Magioladitis (talk) 18:52, 19 July 2017 (UTC)

  • I think {{cite web}} would be fine for this subset of the originally-provided problem (the find/replace is left to the reader). Cite web might ultimately be suboptimal, but it will get it into the citation wiki-gnomes queue where they may be converted by those editors. The subset remaining after is this one. --Izno (talk) 20:30, 19 July 2017 (UTC)
    Sorry, I thought that was clear. Essentially if it's all alone with nothing else inside of a reference, I would convert them to cite templates. If they look like they were intended to be cite templates, but just forgot the cite web aspect, fix those. If they appear to be in a form of context, or are outside references, convert them to brackets. Some may be formatted improperly but are intended to be plain URLs with text, so those should be converted to square brackets as well.—CYBERPOWER (Around) 20:35, 19 July 2017 (UTC)
    This subset (there are two false-positives--review before running) can also take a {{cite web}} after the 100+ above. Any others will likely need some close attention. --Izno (talk) 20:48, 19 July 2017 (UTC)
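
The rule of thumb given above (a braced URL alone inside a reference becomes a cite template, anything else becomes a bracketed link) could be turned into a suggestion list for the supervising editor. The sketch below only classifies and changes nothing, since titles and other parameters still need a human check. The function name and the labels are hypothetical.

```python
import re

BRACED_URL = r"\{\{\s*(https?://[^\s{}|]+)\s*\}\}"
BRACED_URL_RE = re.compile(BRACED_URL)
# A <ref> whose entire content is one braced URL.
LONE_REF_RE = re.compile(
    r"(<ref[^>/]*>)\s*" + BRACED_URL + r"\s*(</ref>)", re.IGNORECASE
)


def classify_braced_urls(text):
    """Return (url, suggestion) pairs for a human reviewer."""
    # Positions of URLs that are the only content of a reference.
    lone = {m.span(2) for m in LONE_REF_RE.finditer(text)}
    return [
        (m.group(1), "cite web" if m.span(1) in lone else "brackets")
        for m in BRACED_URL_RE.finditer(text)
    ]
```

A URL that shares its reference with other text, or sits outside any reference, is suggested for brackets.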

Wiki Feed Bot

Operator: Fako85 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 18:57, Wednesday, January 11, 2017 (UTC)

Automatic, Supervised, or Manual: Supervised

Programming language(s): Python

Source code available: https://github.com/fako/datascope

Function overview: Get information from the API in batches. Edit a page in the user namespace when a user transcludes User:Wiki_Feed_Bot/feed. Notify users on their talk pages with an automated message (in the future)

Links to relevant discussions (where appropriate):

Edit period(s): It will edit a page that transcludes User:Wiki_Feed_Bot/feed on a daily basis and when a user clicks the "force update" link that is added to the page through the transclusion.

Estimated number of pages affected: Depends on how popular the tool will become. Each user will typically have one feed. Perhaps some will create more than one.

Exclusion compliant (Yes/No): Yes, it only edits user pages where editors place the transclusion tag. It does not check for the bots template, but it will check that the page is in the user namespace and that the transclusion was placed by the user who owns the page.

Already has a bot flag (Yes/No):

Function details:

Demo

You can see the tool in action on its demo page.

Bot read rights

The Wiki Feed system preprocesses information once a day. It fetches all recent changes from yesterday, groups them in pages and then starts getting meta information about these pages. It gets this information from the API and other services like Wikidata and in the future the Pageview API.

To be able to do this as efficiently as possible, the Wiki Feed Bot would like bot read rights to fetch 5,000 items in one go. The bot reads information for about 40,000 pages each day.
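
To put the numbers in perspective: at 5,000 items per request, about 40,000 pages take 8 requests. A small batching helper illustrates the arithmetic; the helper is a generic sketch, not code from the datascope repository, and the smaller batch size of 500 used for comparison is a hypothetical value.

```python
from itertools import islice


def batches(items, size=5000):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


page_ids = range(40000)
requests_with_bot_limit = len(list(batches(page_ids, size=5000)))
requests_with_small_limit = len(list(batches(page_ids, size=500)))
```

Here `requests_with_bot_limit` is 8 and `requests_with_small_limit` is 80, a tenfold difference in the number of round trips.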

Currently Wiki Feed does not use the RCStream. We're considering it, but we need some time to implement this as it requires a fair amount of changes to the system.

Edit of pages in users namespace

To use Wiki Feed, people need to paste some wikitext onto a page in their own user space. This wikitext has the following markup.

{{User:Wiki_Feed_Bot/feed|recent_changes|<module_name>=<ranking_weight>}}

When users add this to a page they own, Wiki Feed will create a feed on that page. The feed will show pages that have been recently changed. The user can decide how these pages should get ranked by specifying which "modules" should be used for the feed. If the transclusion specifies revision_count=1 and category_count=2 as modules, then recently edited pages with many categories and many revisions will come out on top, with the number of categories counting twice as much as the number of revisions.
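
One way to read the module weights is as a plain weighted sum per page. The sketch below assumes exactly that, and assumes each page is a dictionary of module values. The real Wiki Feed ranking may normalise or combine values differently.

```python
def rank_pages(pages, weights):
    """Sort pages by the weighted sum of their module values, highest first."""
    def score(page):
        # Modules missing from a page contribute nothing to its score.
        return sum(weight * page.get(module, 0)
                   for module, weight in weights.items())
    return sorted(pages, key=score, reverse=True)


pages = [
    {"title": "A", "revision_count": 10, "category_count": 1},
    {"title": "B", "revision_count": 2, "category_count": 6},
]
ranked = rank_pages(pages, {"revision_count": 1, "category_count": 2})
```

With these weights page B scores 2 + 2×6 = 14 and page A scores 10 + 2×1 = 12, so B is ranked first even though A has more revisions.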

Transcluding User:Wiki_Feed_Bot/feed with the syntax above will also add a link to the page that says: "force refresh". When clicking this link the feed gets placed immediately instead of once a day. The tool makes the user wait until it is done. Once the feed has been calculated the results are added to the page where the link originated from and the user gets redirected back to their user page.

Discussion

  •  Note: This request specifies the bot account as the operator. A bot may not operate itself; please update the "Operator" field to indicate the account of the human running this bot. AnomieBOT 19:09, 11 January 2017 (UTC)
  •  Note: This bot appears to have edited since this BRFA was filed. Bots may not edit outside their own or their operator's userspace unless approved or approved for trial. AnomieBOT 19:09, 11 January 2017 (UTC)
  • I just looked at the code, and when I click on the "force refresh" link on the sample feed this code runs. It looks like you're using the requests library without a User-Agent header or maxlag parameter. Are there plans to add both before this bot hits production? Enterprisey (talk!) 19:13, 11 January 2017 (UTC)
  • I've updated the operator Wiki Feed Bot (talk) 20:05, 11 January 2017 (UTC)
  • As far as I know this bot has not been editing outside of its or my userspace. There is no mechanism in place to enforce this though, so it could be misused, but I was not expecting that to happen. I'll look into where this edit was made and report on this discussion thread. Wiki Feed Bot (talk) 20:05, 11 January 2017 (UTC)
  • I'm making maxlag a high priority; I only discovered its existence through this approval process. I'll make a ticket for the user agent. Both will be in place before we start announcing Wiki Feed to the public. See: [1] & [2] Wiki Feed Bot (talk) 20:05, 11 January 2017 (UTC)
  • Usersearch does not reveal any edits on the enwiki for this user. Don't know what AnomieBOT found (maybe these approval pages?) and whether things are already reverted. Wiki Feed Bot (talk) 20:09, 11 January 2017 (UTC)
  • Wiki Feed Bot, Special:Contributions/Wiki Feed Bot is what AnomieBOT is looking at. You should stop using your bot account to contribute to this BRFA, since (see WP:BOTACC) you should be using your regular account (Fako85, I assume) for responding to these. Enterprisey (talk!) 20:16, 11 January 2017 (UTC)
  • Ok will do, but I can't edit Wiki Feed Bot's user page with my own user, because I'm not editing enough. It would be great if that's possible, but otherwise I'll keep switching accounts. Fako85 (talk) 20:20, 11 January 2017 (UTC)
    Fako85, you can get that permission manually by becoming confirmed; see WP:RFP/C for instructions on how to do that. Enterprisey (talk!) 20:34, 11 January 2017 (UTC)
  • I'm highly skeptical on whether we should grant a bot flag to a bot run by an editor who isn't yet even autoconfirmed. Given the amount of damage that can be done with a bot account, bot operators are typically editors who have been around for at least a little while and built up trust with the community. ~ Rob13Talk 21:03, 11 January 2017 (UTC)
    I'm here at the dev summit on my own accord flying in from Europe. Surely that counts for something. Sitting at table #10 if you want to say hi. Fako85 (talk) 21:21, 11 January 2017 (UTC)
    Also my partner in this is Ed Saperia who organized Wikimania 2014 Fako85 (talk) 21:24, 11 January 2017 (UTC)
  • So I'm actually at the Wikimedia Developer Summit and was able to chat with Fako85 about this. The idea is pretty neat – you can do cool things like get the most edited articles in a certain category over the past day, or get a list of recent articles documenting natural disasters, sorted by number of deaths. There is a web interface for the news feed, but Fako and Ed were hoping to bring it to the wiki as a subscription service. I personally think this could be useful, e.g. WikiProject Women could have a dedicated page that lists the most recent articles on Women, or the most recently edited by number of pageviews, etc.
    For now I'd like to put this BRFA on hold until the tool is more developed and we are able to discuss the idea further with the community. Given it would be subscription-only, I don't think it's particularly controversial, but the community may have input on how it should function. We should also respect community norms that we generally don't grant advanced rights to new-ish users. In that regard I can at least offer my word that the project Fako and Ed are working on is legitimate, and I do not think they are going to use the bot account to intentionally disrupt the wiki MusikAnimal talk 22:15, 11 January 2017 (UTC)
    So I've been thinking about this more, and even being a subscription service, I think we should account for any potential misuse. My understanding, and correct me if I'm wrong Fako85, you subscribe by adding a configured link (that points to Tool Labs) to a wiki page, then click on the link. That will trigger the bot to update the page with the requested results. For this reason there are a few safeguards we should put in place:
    • For the userspace, the bot should only edit the page if the link was added by that user. This prevents a vandal from adding the link to someone's user page and making the bot add some unwanted content.
    • For now, the bot should only edit the userspace. If people show interest, we could extend this to the Wikipedia namespace (e.g. WikiProjects), and perhaps the template namespace. At the very least, the mainspace is a strict no-no.
    • If and when we do extend this to WikiProjects (and all of the Wikipedia or Template namespace), we'll want some sort of approval process. Again, a vandal could make the bot add unwanted, potentially offensive content unrelated to the WikiProject.
    I'm not sure what the best approach is for the last point – having an approval process, but first we should consult a few major WikiProjects and see if they are interested. I'm going talk to Fako more about this while we're here at the dev summit, and there also happens to be some WikiProject experts here as well who I'm sure will have something to say. I will ask any in-person participants to comment here as needed (rather than me speaking for them) MusikAnimal talk 22:43, 11 January 2017 (UTC)
  • Talking to MusikAnimal about this we came up with a better way to include feeds on pages. In short: people will need to add a template to their user pages and we'll check if this template has indeed been added by the user to prevent misuse. The process is more precisely described in this ticket: [3] Wiki Feed Bot (talk) 00:28, 12 January 2017 (UTC)
  • To summarize the discussion till now. We'll be looking for people in the community that want to use this. So far responses have been enthusiastic. We need to implement these tickets before going live:
    Thanks everybody for the feedback. It has been very helpful Wiki Feed Bot (talk) 00:28, 12 January 2017 (UTC)
    Reminder to use your personal account when editing as a human! :) MusikAnimal talk 00:35, 12 January 2017 (UTC)
  • Regarding the use of loading images to these pages - what kind of check are you doing to ensure that fair-use images are not used? — xaosflux Talk 02:58, 12 January 2017 (UTC)
    It gets all the info through the API. No external images are being used. I saw a recent change in the API where some (pageprop) images are postfixed with _free and some aren't. Is that related to this topic? Currently the system uses the _free images and ignores the others. If possible I would like to show an image whenever one is available of course (even if it's not "free"), but I don't understand the policies completely. Fako85 (talk) 21:56, 12 January 2017 (UTC)
    The policies are that enwiki-hosted images may be "fair use", and as such they can not normally be placed on pages such as user pages, project pages, etc. commons: does not have fair-use, so it is always safe to use a file from commons, but for an image from enwiki you would need to examine the licensing restrictions before including it on userpages. — xaosflux Talk 02:44, 13 January 2017 (UTC)
    Good point Xaosflux. I didn't know about these policy requirements. However recently they seem to have changed the behavior of the API (Nov 30, 2016) as described in this ticket. I'll make sure that I use the free images and stay clear from fair-use ones, which may mean that some pages will not show images in the feed. Fako85 (talk) 22:00, 13 January 2017 (UTC)
  • While Fako85 works on implementing the above, I'd like to ping Harej who helps with WikiProject X, to get his input on whether this bot would be helpful for WikiProjects MusikAnimal talk 02:35, 16 January 2017 (UTC)
    @Fako85: Any updates on the above issues? MusikAnimal talk 16:55, 31 January 2017 (UTC)
    It might be helpful, MusikAnimal? Depends on what filtering criteria you could use for generating lists of articles. Harej (talk) 10:50, 23 February 2017 (UTC)
  • As of now the tool has a user agent that mentions WikiFeedBot. It also makes requests with maxlag=5 and respects the Retry-After header. The remaining issue, adding a template that people can use, is still open and I hope to finish it sometime in February. Fako85 (talk) 19:54, 1 February 2017 (UTC)
    OK sounds good. I will leave this open for now and check back with you at a later time MusikAnimal talk 20:09, 4 February 2017 (UTC)
    @Fako85: Any updates on the planned changes? MusikAnimal talk 21:23, 13 March 2017 (UTC)
    @MusikAnimal: there is progress, but none that I can show. I expect to finish it by the end of next weekend. Keep you posted and thanks for your patience Fako85 (talk) 17:55, 15 March 2017 (UTC)
  • To test what happens when you include a wiki feed tag on a non-user page I'm going to include one here. It will do a fake run, so no edits will appear, but it will include some text from the feed page. Fako85 (talk) 09:47, 19 March 2017 (UTC)
  • I have it working locally now, but the tools environment is giving me some problems. Won't be able to finish this weekend. Hopefully I can make some time in the weekend to come. Keep you posted Fako85 (talk) 18:02, 19 March 2017 (UTC)
  • It's done. Sorry for the delay. What is the next step @MusikAnimal:? Fako85 (talk) 12:26, 30 March 2017 (UTC)
    @Fako85: Sorry for *my* delay! I've been at WMCON but am back home now. So did we resolve this issue, whereby users add a template to a user page to have the bot update it? One thing with BRFAs is to keep the "Function details" updated. It looks like maybe the functionality described in #Edit of pages in users namespace is out of date. Let's update the function details to outline exactly how the bot will work then we'll go from there :) MusikAnimal talk 21:47, 5 April 2017 (UTC)
    @MusikAnimal: it's done and squashed some bugs underway. I wonder what you think about the proposal now Fako85 (talk) 10:11, 20 April 2017 (UTC)
    @Fako85: The function details look great! The only thing is I question the need to notify users when the feed is ready. If they want an immediate update, they could use the "force refresh" link, and continue their on-wiki work in a different tab in their browser. The intention of the bot is otherwise to get regular daily updates, so I don't think many would be upset if they didn't get an immediate notification. Rather, they'll just watchlist the page or remember to check back tomorrow. How does that sound?
    Lastly, we need some documentation on the available modules. I see mention of revision_count and category_count at User:Wiki Feed Bot, is there anything else? MusikAnimal talk 02:00, 27 April 2017 (UTC)
    @MusikAnimal: Thanks! The notification takes place when you press "force refresh". It takes about 30s to update the page. Currently you get redirected to a wait page. The idea was to immediately return somewhere instead of going to a wait page and notify when the page is done. However I think we can still improve on the performance quite a bit. Then perhaps the wait will be less long. This optimization recently occurred to me and I don't mind dropping the talk page requirement for now and add it if we really need it. So I removed it. The documentation is a good point. We also need many more modules. The next step is that people can write Javascript functions on pages which will get used as modules (in a sandboxed environment). Until that time we'll document the modules with comments in the methods and people can make a PR if they want to add anything. Information about this process can be next to the "force refresh" link. I think Ed Saperia should have a say in how we involve the community as he'll be taking the lead there more than me. However Britain is in the middle of an election as you probably know and he is very busy with campaigning. So we can pick this up earliest in June. Do you already have an idea what kind of module you would like to have? We can write one or two for testing purposes ;) Fako85 (talk) 08:31, 27 April 2017 (UTC)
    @Fako85: At Dev Summit you did mention using pageviews, which would be cool :) But frankly I don't have many opinions on what modules to include. My position here is more to help you get this out the door as a bot approver. In order to approve the bot, I don't think we need to test every single module you think you'll ever add, but it may be good to cover a lot of ground and check the numbers for accuracy. The custom JavaScript modules also sound interesting, and it may be good to get that tested as part of this BRFA, if you intend on adding that functionality anytime soon MusikAnimal talk 15:52, 28 April 2017 (UTC)
    @MusikAnimal: pageviews are possible, but relatively expensive. Because you can't get batches from the API yet it takes as many API calls as pages in the set. I'm looking for more efficient ways, but maybe I should enable it before the improvements to see if it is useful in the first place. The dynamic modules will take a while to do it right I think. Perhaps it will be done after the summer. Fako85 (talk) 20:22, 8 May 2017 (UTC)
    @Fako85: is this going to be on hold for a while? — xaosflux Talk 23:36, 8 June 2017 (UTC)
    @Xaosflux: I hope not. I'd prefer to develop this project agile and not get stuck with the approval, because new features may get introduced in the coming months. If we get approval we can start asking developers and editors to participate. If we do not have approval we'd infringe the rules afaiu. @MusikAnimal: would you like to see more before approval? Fako85 (talk) 19:51, 12 June 2017 (UTC)
    @Fako85: We need to see the bot actually run (after a trial is approved, just to be clear) before being able to approve it. When you're ready to run a small trial, please let us know and we can approve one. Right now, I don't think it's very clear where we are on the development of this bot. ~ Rob13Talk 15:38, 13 June 2017 (UTC)
    @BU Rob13: thanks for clarifying that. I'm new to all these procedures. The bot is ready for a trial period. @MusikAnimal: had a few suggestions, but they have been implemented. However I'm leaving for a holiday with no internet tomorrow. So I think it is best if I'll ping people here when I'm back in July to start the trial. Fako85 (talk) 16:32, 15 June 2017 (UTC)
    Sounds good! I don't think there's any issue leaving this open, so long as we get to a trial at some point. Enjoy your holiday! Just give us a ping when you return. Looking forward to it MusikAnimal talk 16:04, 16 June 2017 (UTC)
  • I'm back from my holiday and I found 4 people outside of BAG interested to test. I'll need to write some modules for them, which I'll try to finish this weekend. After that these users would like to participate in the trial. Anything else that I need to do to start the trial? Fako85 (talk) 09:33, 12 July 2017 (UTC)
    @Fako85: will you be able to trial without (apihighlimits) ? — xaosflux Talk 17:39, 15 July 2017 (UTC)
    @Xaosflux: It would definitely be possible. Not sure if it is desirable. Don't we want this to be part of the test?
  • I've found 3 users that are willing to be part of the test. One looks at possible bias in Wikipedia articles. The other two will watch "breaking news". I've created some modules for them, but I'll have to debug the breaking news one. I'll try to take a look at that on Wednesday.
{{BotTrial}} OK to trial, if this can't function without highapi's please let me know - it will mean having to flag the account as a bot early. — xaosflux Talk 22:21, 17 July 2017 (UTC)
Trial stopped, bot account is blocked pending operator response here. Any admin may unblock without consultation if the issue is resolved. — xaosflux Talk 01:50, 18 July 2017 (UTC)
Despite the response above regarding the use of non-free images, this account is still being used to place clearly marked non-free images outside of articles, in violation of fair-use practices. See page history for reported examples. — xaosflux Talk 01:54, 18 July 2017 (UTC)
The response above is about a similar, but slightly different issue. The problem is that the image is initially marked as free. It is marked as free when the bot makes its edits. Then something happens in the real world and the commons image license gets changed. It is these changes that the bot is not picking up on. I've started a conversation with the editor that runs into problems with this case. I'm hoping to learn how things would work out for her. Fako85 (talk) 06:13, 18 July 2017 (UTC)
Out of pure curiosity. How does the bot block mechanism work? Does it disallow edits from those users? For good measure I've stopped the cronjob for the time being. Fako85 (talk) 06:14, 18 July 2017 (UTC)
It is the same as an editor block, disallows edits - can be removed by any admin. — xaosflux Talk 10:55, 18 July 2017 (UTC)
See also related discussion at Wikipedia_talk:Non-free_content#User:Wiki_Feed_Bot. — xaosflux Talk 10:55, 18 July 2017 (UTC)
To summarize the outcomes of discussions outside this page. The page_image_free property from the API is unreliable. Xaosflux and me decided that it would be better to check all images. Any images from commons are ok to use. Any images from enwiki that specify Category:All free media are also ok. If an editor accidentally places an image in this category the Wiki Feed Bot may use that image. When this mistake is corrected the image may remain visible in feeds for at most 24 hours. After that Wiki Feed Bot will remove or replace the image. We'll have to explain this policy clearly somewhere. I'm close to finishing these changes. Fako85 (talk) 12:42, 22 July 2017 (UTC)
  • @Fako85: the bot has been unblocked, and trials may proceed. — xaosflux Talk 21:53, 22 July 2017 (UTC)
  • Approved for trial (150 edits or 14 days, userspace only). xaosflux Talk 21:53, 22 July 2017 (UTC)
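The image rule summarized in the discussion above (any Commons image is acceptable; an enwiki image only if it is in Category:All free media) could be expressed roughly as follows. This is a sketch, not the bot's actual code: the function name `image_allowed` and the repository labels `"commons"` and `"enwiki"` are hypothetical, and the category list is assumed to have been fetched from the API beforehand.

```python
from typing import Iterable

FREE_CATEGORY = "Category:All free media"


def image_allowed(repository: str, categories: Iterable[str]) -> bool:
    """Decide whether an image may be shown in a feed outside article space."""
    if repository == "commons":
        # Commons does not host fair-use files, so these are always acceptable.
        return True
    if repository == "enwiki":
        # Locally hosted files are acceptable only when marked as free media.
        return FREE_CATEGORY in set(categories)
    # Anything from an unknown source is rejected.
    return False


print(image_allowed("commons", []))
print(image_allowed("enwiki", ["Category:All non-free media"]))
```

Re-running a check like this on every daily update is what would bound the exposure of a miscategorized image to the 24 hours mentioned in the discussion.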

CitationCleanerBot 2

Operator: Headbomb (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 13:43, Saturday, March 25, 2017 (UTC)

Automatic, Supervised, or Manual: Semi-automated during development, Automatic after

Programming language(s): AWB

Source code available: Upon request. Regex-based

Function overview: Convert bare identifiers to templated instances, applying AWB genfixes along the way (but skip if only cosmetic/minor genfixes are made). This will also have the benefits of standardizing appearance, as well as providing error-flagging and error-tracking. A list of common identifiers is available here, but others exist as well.

Links to relevant discussions (where appropriate): RFC, Wikipedia:Bots/Requests_for_approval/PrimeBOT_13. While the issue of unlinked/raw identifiers wasn't directly addressed, I know of no argument that a bare doi:10.1234/whatever is better than a linked doi:10.1234/whatever. If ISBNs/PMIDs/RFCs are to be linked (current behaviour) and templated (future behaviour), surely all the other ones should be linked as well.

I have notified the VP about this bot task, as well as other similar ones.

Edit period(s): Every month, after dumps

Estimated number of pages affected: ~5,400 for bare DOIs, probably comparable for the other similar identifiers (e.g. {{PMC}}), and much less for the 'uncommon' identifiers like {{MR}} or {{JFM}}. This will duplicate Wikipedia:Bots/Requests_for_approval/PrimeBOT_13 to a great extent. However, I will initially focus on non-magic words, while I believe PrimeBot_13 will focus on magic word conversions.

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Because of the great number of identifiers out there, I'll be focusing on "uncommon" identifiers first (more or less defined as <100 instances of the bare identifier). I plan on semi-automatically running the bot while developing the regex, and only automating a run when the error rate for that identifier is 0, or only due to unavoidable GIGO. For less popular identifiers, semi-automatic trialing might very well cover all instances. If no errors were found during the manual trial, I'll consider that code to be ready for automation in future runs.

However, for the 'major' identifiers (doi, pmc, issn, bibcode, etc), I'd do, assuming BAG is fine with this, an automated trial (1 per 'major' identifier) because doing it all semi-automatically would just take way too much time. So more or less, I'm asking for

  • Indefinite trial (semi-automated mode) to cover 'less popular identifiers'
  • Normal trial (automated mode) to cover 'popular identifiers'
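As an illustration of the conversion described in the function overview, the sketch below rewrites a bare ISSN into a templated one. It is not the operator's regex (the bot runs on AWB and its rules are available only on request); the pattern, the function name `template_bare_issn`, and the `{{ISSN|...}}` call syntax used in the replacement are assumptions made for this example.

```python
import re

# A bare ISSN: the word "ISSN", a colon or whitespace, then NNNN-NNNC,
# where the final check character may be a digit or X.
BARE_ISSN = re.compile(r"\bISSN[:\s]\s*(\d{4}-\d{3}[\dXx])\b")


def template_bare_issn(wikitext: str) -> str:
    """Replace bare ISSN mentions with a templated form."""
    # An existing {{ISSN|...}} is not matched, because "|" is neither
    # a colon nor whitespace.
    return BARE_ISSN.sub(r"{{ISSN|\1}}", wikitext)


print(template_bare_issn("Published in Nature, ISSN 0028-0836."))
print(template_bare_issn("Already done: {{ISSN|0028-0836}}"))
```

A real run would also need to skip nowiki tags, comments, and external links, as raised in the discussion; that is left out here to keep the example short.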

Discussion

@Primefac and Anomie:. Headbomb {talk / contribs / physics / books} 13:56, 25 March 2017 (UTC)

For cases where a particular ISBN/PMC/etc. should not be linked, for whatever reason, will this bot respect "nowiki" around the ISBN/PMC/etc. link? — Carl (CBM · talk) 18:39, 26 March 2017 (UTC)

Unless the "Ignore external/interwiki links, images, nowiki, math, and <!-- -->" option is malfunctioning, I don't see why it wouldn't respect nowiki tags. Headbomb {talk / contribs / physics / books} 18:58, 26 March 2017 (UTC)

This proposal looks like a good and useful idea. Thanks for taking the time to work on it! − Pintoch (talk) 11:14, 11 April 2017 (UTC)

{{BAGAssistanceNeeded}}

I'd rather this task be explicit as to its scope; "identifiers" is too vague. Can you specify exactly which identifiers this will cover? Additional identifiers can always be addressed under a new task as needed. — xaosflux Talk 01:38, 20 April 2017 (UTC)

Pretty much those User:Headbomb/Sandbox. Focusing on the CS1/2 supported ones initially, then moving on to less common identifiers, if they are actually used in a "bare" format, like INIST:21945937 vs INIST:21945937. Headbomb {t · c · p · b} 02:47, 20 April 2017 (UTC)
@Headbomb: Based on past issues with overly-broad bot tasks, I try to think about degrees of freedom when I look at a bot task. The more degrees of freedom we have, the harder it is to actually catch every issue. You're asking for a lot of degrees of freedom. We've got code that's never been run on-wiki before, edits being made on multiple different types of citation templates for each identifier, a mostly silent consensus, different types of trials being requested, and an unknown/unspecified number of identifiers being processed. It's probably not a great idea to try to accomplish all that in one approval. Would you be willing to restrict the scope of this approval to a relatively small number of identifiers so we can focus on testing the code and ensuring the community has no issues with this task? In looking at your list, I think a manageable list of identifiers would be as follows: doi, ISBN, ISSN, JSTOR, LCCN, OCLC, PMID. These are likely the identifiers with the most instances; I may have missed a couple other high-use ones that I'm less familiar with. We could handle the rest (including less-used identifiers) in a later approval or approvals. Your thoughts? ~ Rob13Talk 04:09, 3 June 2017 (UTC)

I'm asking for lots of freedom yes, but in a modular and controlled fashion. I'm fine with restricting myself to the popular identifiers at first, but it will make development a bit more annoying/complicated, since the lesser-used identifiers are the hardest to test on a wider scale. If BAG is comfortable with a possibly slightly higher false positive rate post-approval (a very marginal increase, basically until someone finds a false positive, if there are some), I'm fine with multiple BRFAs. Only thing I would ask to that initial list is I'd rather have arxiv, bibcode, citeseerX, doi, hdl, ISBN, ISSN, JSTOR, PMID, and PMCID. OCLC/LCCN could be more used than arxiv/bibcode/citeseerx/hdl/PMCID, but they usually are on different type of articles which will make troubleshooting a bit trickier. Headbomb {t · c · p · b} 19:18, 6 June 2017 (UTC)

Approved for trial (250 edits). The list you provided is fine. As soon as we get those sorted and approved, I'm happy to quickly handle future BRFAs, so it shouldn't be too time-consuming of a process for you. Roughly 25 edits per identifier you listed above. Please update your task details to reflect the restricted list of identifiers before running the trial. ~ Rob13Talk 19:51, 6 June 2017 (UTC)
{{OperatorAssistanceNeeded}} Any update on this trial? — xaosflux Talk 00:41, 19 June 2017 (UTC)
Still working on the code. I can't nail the DOI part, because I haven't yet found a reliable way to detect the end of a doi string, and I've been focusing on that rather fruitlessly since it's the hard part of the bot. I've asked for help with that at the VP. The other identifiers are pretty easy to do, so I'll be working on those shortly. Worst case, I'll exclude DOIs from bot runs and do them semi-automatically. Headbomb {t · c · p · b} 15:49, 19 June 2017 (UTC)
  • [4] 24 edits from the ISSN trial. No issues to report. Headbomb {t · c · p · b} 18:30, 19 June 2017 (UTC)
  • [5] 25 edits from the DOI trial.
    • [6] missed [7]. While I'm planning on taking care of those, down the line, right now my brain is a bit fried from all the other corner cases I've dodged. Headbomb {t · c · p · b} 21:39, 19 June 2017 (UTC)
  • [8] 25 edits for the JSTOR trial.
    I do not think that instance of GIGO is a problem; replacing an incorrect mention of JSTOR with a broken template makes it easier to detect the issue. Jo-Jo Eumerus (talk, contributions) 15:35, 22 June 2017 (UTC)
  • [12] 25 edits from the OCLC trial
    • [13], [14] didn't touch an OCLC (filtering issues)
    • [15] could be better, in the sense that it could make use of |oclc=, but that's what CitationCleanerBot 1 would do
    • [16] touched a DOI, because the OCLC was in an external link which the bot is set to avoid. I plan on doing those manually.
    • [17] shouldn't be done, I've yet to find a good solution for this however. (Follow up: This is now fixed most of the time. Corner cases will remain.) Headbomb {t · c · p · b} 12:43, 24 July 2017 (UTC)

HostBot 8

Operator: (bot) Jtmorgan (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Operator (training modules): Ragesoss (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 18:25, Tuesday, May 2, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic (after supervised trial)

Programming language(s): Python

Source code available: https://github.com/jtmorgan/hostbot

Function overview: Posts a welcome message on new users' talk pages that includes links to introductory training modules on Programs & Events Dashboard, like this. Here's the template: {{Welcome training modules}}.

Links to relevant discussions (where appropriate): Wikipedia:Village_pump_(proposals)#Experiment_to_see_if_training_modules_are_helpful_for_new_users (permanent link). See also Wikipedia:Bots/Requests for approval/RagesossBot 3.

Edit period(s): Continuous

Estimated number of pages affected: 5000


Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details:

The main purpose of this task is to run a controlled experiment inviting new users to use the Programs & Events Dashboard training modules to learn the basics of Wikipedia. It finds recently registered accounts that have made between one and five edits, and sorts half of them into an experimental group and half into a control group. For the experimental group, it posts a welcome message (similar to what HostBot has done before with Teahouse invitations) that invites users to try the training modules, which are forked from the Wiki Ed classroom program training modules that we've been using and refining over the last few years and get very positive feedback from student editors.
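The selection and split described above can be sketched as follows. This is a minimal illustration, not HostBot's code: the function names, the fixed seed, and the sample accounts are hypothetical, and the request does not say how the real assignment is randomized.

```python
import random
from typing import List, Sequence, Tuple


def is_eligible(edit_count: int) -> bool:
    """Between one and five edits, per the function details."""
    return 1 <= edit_count <= 5


def split_groups(usernames: Sequence[str], seed: int = 0) -> Tuple[List[str], List[str]]:
    """Shuffle the users and cut the list in half: (experimental, control)."""
    rng = random.Random(seed)  # seeded only so the example is reproducible
    shuffled = list(usernames)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical accounts with their edit counts.
accounts = {"Alice": 1, "Bob": 7, "Carol": 3, "Dave": 5, "Eve": 0, "Frank": 2}
candidates = [name for name, edits in accounts.items() if is_eligible(edits)]
experimental, control = split_groups(candidates)
print(len(experimental), len(control))
```

Only the experimental group would receive the welcome message; the control group is recorded but left alone so that retention can be compared afterwards.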

Jonathan Morgan and I would like to get a sample of about 5000 invited users so that we can see if it makes a difference in terms of new users being more likely to stick around and keep editing. The last major experiment along these lines that I know of was The Wikipedia Adventure; in contrast to TWA, these trainings are more practical and wide-ranging, and have also been refined over time to try to head off the most common errors and confusing aspects of Wikipedia that new users run into.--ragesoss (talk) 18:25, 2 May 2017 (UTC)

The code has been tested on test.wikipedia.org and you can see a sample of the output here. J-Mo 18:32, 2 May 2017 (UTC)

Discussion

  • The discussion, archived here, has not been formally closed, but it seems that there are reasonable grounds for the trial. I certainly support such a trial, as editor retention is a key issue that needs addressing. With new users, have you considered the best edit range? You said between 1 and 5, but I feel this is too low, and risks inviting vandalism-only accounts into both groups. Perhaps a range of 7+ would filter these out? TheMagikCow (T) (C) 07:05, 5 May 2017 (UTC)
  • Thanks TheMagikCow. Regarding the edit range: we settled on 2 edits for several reasons. First, most people leave Wikipedia after their first couple edits. There are undoubtedly many reasons for this, but one that we're fairly sure of is that they find the editing process daunting (both the UI/tech and the policies). The training modules are designed to address these issues directly, so we want to put them in front of people who are experiencing them as quickly as possible. Second, we want to gather a large sample so that we can run stats to determine impact, and if we limit the invitees to people who have 7-10 edits, our maximum daily sample drops by about 90%. Third, HostBot currently sends Teahouse invites to most eligible newbies who reach the 5-edit threshold on any given day. We want to test the impact of the Training Modules independent of the impact of the Teahouse, which we already know has a positive effect on new editor retention, so we don't want to invite people who already have a Teahouse invite. Finally, regarding your concerns about inviting vandals, I plan to use the same approach to weeding out vandals in this study that I use for Teahouse invites: if someone has a level-4 user warning on their talkpage, if they have been blocked or banned, or if they have been accused of sockpuppetry, they won't receive an invite. We will also exclude people who meet these criteria from the control group. This filtering strategy has worked well for the Teahouse; there have been very few issues with disruptive editors there in the ~5 years that I've sent out HostBot invites. Does this rationale for using a 2-edit threshold make sense to you, and does my description of the vandal-filtering process address your concerns? Cheers, J-Mo 22:41, 5 May 2017 (UTC)
  • Thanks for such a detailed reply Jtmorgan! That makes perfect sense now that you point out the issue of new editors making few edits and then quitting, and it sounds like 2 is the perfect number. The vandal exclusion is also a very neat idea. I am fully supportive of this task; anything to help with the issue is much welcomed. TheMagikCow (T) (C) 10:56, 6 May 2017 (UTC)
  • Approximately how many edits per day do you think this would result in? SQLQuery me! 19:17, 18 May 2017 (UTC)
  • SQL around 200 talkpage invites per day, roughly doubling the current volume of edits by HostBot. J-Mo 07:52, 22 May 2017 (UTC)
Approved for trial (1400 edits or 7 days). whichever comes first. SQLQuery me! 02:45, 23 May 2017 (UTC)
Thank you, SQL. Just a quick heads-up that I'm going to be largely AFK until June 18, and I obviously don't want to leave the bot unattended during the trial, so I anticipate that I will start this trial on or around Monday, June 19. Cheers, Jmorgan (WMF) (talk) 16:29, 24 May 2017 (UTC)
I started the trial today. J-Mo 20:25, 30 June 2017 (UTC)
SQL I stopped the bot today. 1385 invites were sent. If everything looks good to you, I'd like to run this trial for another 3-4 weeks to gather a large enough sample for retention analysis. Let me know what you think, J-Mo 21:18, 6 July 2017 (UTC)
Would it be possible to slow the bot down a little bit? It seems like running at 80+ edits/min is a little high. Other than that I really don't have any concerns - I'll give it a couple days for others to look over your trial edits as well. SQLQuery me! 03:06, 7 July 2017 (UTC)
SQL Certainly. I'll add a 5-second sleep between invites. And I'll wait for your signal of 'all clear' before starting up again. Thanks! J-Mo 18:13, 7 July 2017 (UTC)
SQL This is just to say that I've implemented the 5-second sleep, which should reduce invite volume to no more than 15-20/min. Cheers, J-Mo 23:41, 13 July 2017 (UTC)
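A fixed pause between invites is the simplest kind of throttle. The sketch below is an assumption about how such a loop might look, not HostBot's actual code; `send_throttled` and its parameters are hypothetical names. A 5-second pause alone caps the rate at 12 invites per minute (60 / 5), which stays under the 15-20/min ceiling mentioned above.

```python
import time
from typing import Callable, Iterable


def send_throttled(
    users: Iterable[str],
    send: Callable[[str], None],
    delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send(user) for each user, pausing `delay` seconds between calls."""
    sent = 0
    for user in users:
        if sent:
            # Pause before every invite except the first one.
            sleep(delay)
        send(user)
        sent += 1
    return sent
```

Passing `sleep` as a parameter lets the loop be exercised in a test without actually waiting.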
SQL ready for the next wave?--Sage (Wiki Ed) (talk) 19:15, 26 July 2017 (UTC)

Bots that have completed the trial period

Yobot 55

Operator: Magioladitis (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 06:24, 27 June 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB / WPCleaner

Source code available: -

Function overview: Remove/Fix invisible unicode characters from pages

Links to relevant discussions (where appropriate):

Edit period(s): Daily

Estimated number of pages affected: 300 per day

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: Regex: \u200E|\uFEFF|\u200B|\u2028|\u202A|\u202C|\u202D|\u202E|\u00AD can be removed. Regex: \u2004|\u2005|\u2006|\u2007|\u2008 can be replaced by space.
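The two regexes above translate directly into a small cleanup function. The following is a minimal Python sketch of the stated replacements, not the AWB or WPCleaner implementation; the function name `clean_invisible` is an assumption.

```python
import re

# Characters the function details list for removal.
REMOVE_RE = re.compile(
    "[\u200E\uFEFF\u200B\u2028\u202A\u202C\u202D\u202E\u00AD]"
)
# Characters the function details list for replacement by a plain space.
SPACE_RE = re.compile("[\u2004\u2005\u2006\u2007\u2008]")


def clean_invisible(text: str) -> str:
    """Remove the listed invisible characters and normalise the listed spaces."""
    text = REMOVE_RE.sub("", text)
    return SPACE_RE.sub(" ", text)
```

Comparing `clean_invisible(text)` with `text` also gives a natural skip condition: if the two are equal, the page needs no edit for this task.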


Discussion

{{BotOnHold}} pending Wikipedia:Administrators'_noticeboard#Request_to_remove_invisible_characters_from_pages. — xaosflux Talk 11:05, 27 June 2017 (UTC)

Xaosflux the discussion seems to conclude that this is a useful bot task and it is preferred if a bot does it instead of a normal account. -- Magioladitis (talk) 16:08, 5 July 2017 (UTC)

If you think that discussion is ready for closing, please list it at WP:ANRFC; the notes of the uninvolved closing party will be reviewed. — xaosflux Talk 16:19, 5 July 2017 (UTC)
Xaosflux I think we need an admin to close this BRFA unless a BAG member can close it. -- Magioladitis (talk) 11:24, 7 July 2017 (UTC)
I think it will close as support for a bot task, assuming so I'll be good with getting this right to a trial, seems like an overall useful task. — xaosflux Talk 12:02, 7 July 2017 (UTC)
Xaosflux The discussion closed. Rationale is "consensus that these changes be performed, and that they must be via a bot account". -- Magioladitis (talk) 15:40, 19 July 2017 (UTC)
Approved for trial (50 edits or 7 days). OK to trial. Please include an edit summary that clearly describes the task, with a link for more information either to the BRFA or somewhere on the bot userpage(s). Edits unrelated to this task should not be bundled in. Please provide a link to a range of diffs and any issues you find here after the trial. — xaosflux Talk 16:45, 19 July 2017 (UTC)

Xaosflux WPCleaner autogenerates edit summaries and does not allow custom edit summaries. WPCleaner also has no way to count edits and stop after a number has been reached. AWB can't do the task in bot mode without general fixes activated due to a bug (removal of non-visible characters is considered a null edit under AWB's understanding). -- Magioladitis (talk) 17:08, 19 July 2017 (UTC)

Trial complete.

  • diffs
  • I had to do the saving manually, i.e. by pressing Ctrl+S on every single page
  • No general fixes will also result in not adding npbs in the various places it may be needed.
  • After running AWB I will still have to run WPCleaner to remove the pages from the CHECKWIKI list. WPCleaner does that while AWB does not.

-- Magioladitis (talk) 17:22, 19 July 2017 (UTC)

The trial edits look good and the edit summaries are good.
Limitations of your chosen software that require additional development are outside the scope of a BRFA - please address with the development teams.
Re "npbs" do you mean &nbsp;? Regarding "the various places" - is this related to the replacements that this task should be making?
If you need to run something to update project work tracking pages in low volume, that is OK.
xaosflux Talk 04:26, 23 July 2017 (UTC)
OK. But I may use WPCleaner in some cases instead, which auto-generates its own edit summaries.
OK. Am I allowed to perform general fixes as long as the main task is performed or not?
Yes. If we allow general fixes I can do some extra invisible character replacement. No big deal.
OK. -- Magioladitis (talk) 06:27, 23 July 2017 (UTC)
What is the WPCleaner summary going to look like? — xaosflux Talk 15:05, 23 July 2017 (UTC)
@Xaosflux: like that. -- Magioladitis (talk) 15:25, 23 July 2017 (UTC)
Those aren't as useful for other editors - and without task link don't explain why what they may otherwise determine to be a cosmetic edit are exempt due to the community discussion above - so I'd say no to that type of edit. — xaosflux Talk 15:50, 23 July 2017 (UTC)
As far as "and all other genfixes" - personally I don't think it's the best idea - but that's not as an approver - would like to get a few other comments from others regarding that here. — xaosflux Talk 15:50, 23 July 2017 (UTC)
Input requested (Presenting both pro- and con- summaries of my thoughts) here: Wikipedia:Bots/Noticeboard#Including_.22general_fixes.22_on_a_current_BRFA. — xaosflux Talk 15:54, 23 July 2017 (UTC)
Given the main task is in itself cosmetic, but supported, I see no reason why GenFixes shouldn't be allowed. Magio has demonstrated that he can keep to the restrictions of only applying genfixes if the main task has made a change to the page contents. The edit summary must be clear, and if asked, Magioladitis must be able to explain where in a given edit the main bot task is being carried out. If he can't then the edit violates COSMETICBOT, and approval can be swiftly modified to disallow general fixes for this task.—CYBERPOWER (Chat) 17:01, 23 July 2017 (UTC)
Since the main task is not hardcoded in the general fixes, it's easy to have appropriate skip conditions. -- Magioladitis (talk) 17:14, 23 July 2017 (UTC)
Note that I am commenting as a community member, not as a BAG member, in accordance with my recusals from both CHECKWIKI and Magioladitis tasks. I support the use of general fixes in this task provided "Skip if genfixes only" is checked and the edit summary indicates general fixes may be applied. I see no reason to deny the use of general fixes running alongside a main task. This is how cosmetic fixes should be implemented; alongside major fixes. (I still don't think this fix is major or desirable, but as always, I yield to consensus.) ~ Rob13Talk 00:18, 24 July 2017 (UTC)
Note that my comments above were as a community member and not a BAG member as well. I will not be involving myself in this in a BAG capacity unless asked to.—CYBERPOWER (Message) 01:33, 24 July 2017 (UTC)
I'll pile on with the non-BAG commenting: In my personal opinion, since whitespace- and non-printable-character-changing diffs are often hard to see in the diff window, I suspect adding cosmetic general fixes into the mix will be a major drama magnet for an editor who has recently himself been involved in a lot of wikidrama. But if Magioladitis wants to take that risk, I won't oppose it. Anomie 13:43, 24 July 2017 (UTC)
Anomie In the drama top 10 this is quite low because of the good edit summary and the zero mistakes. Moving punctuation is the first place of drama. -- Magioladitis (talk) 13:52, 24 July 2017 (UTC)

Yobot 52

Operator: Magioladitis (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 09:44, Friday, February 3, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB / WPCleaner

Source code available:

Function overview: Add reflist to pages missing it while they have ref tags in them + adding References section when possible

Links to relevant discussions (where appropriate): Wikipedia:Bots/Requests for approval/Xqbot 3

Edit period(s): Daily

Estimated number of pages affected: 100 pages per day

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: This bot was doing this task for years. The bot will look for ref tags. After that it looks for one of the following sections to place the reflist template: References, Footnotes, Notes. If no section is given, it places a References section before one of the following: Further reading, External links, See also, Notes. Yobot will use AddMissingReflist function of AWB or the built-in code of WPCleaner.
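The placement rule in the function details can be sketched as follows. This is a rough Python illustration of the described behaviour, not AWB's AddMissingReflist or WPCleaner's code; the function names, the level-2 heading pattern, the priority order within each list, and the check for an existing reference list are all assumptions.

```python
import re

REF_TAG_RE = re.compile(r"<ref[\s>]", re.IGNORECASE)
# Assumed check for an existing reference list.
HAS_LIST_RE = re.compile(r"<\s*references|\{\{\s*reflist", re.IGNORECASE)

EXISTING_SECTIONS = ("References", "Footnotes", "Notes")
INSERT_BEFORE = ("Further reading", "External links", "See also", "Notes")


def heading_re(title: str) -> "re.Pattern[str]":
    """Match a level-2 heading such as ==References== on its own line."""
    return re.compile(
        r"^==[ \t]*" + re.escape(title) + r"[ \t]*==[ \t]*$",
        re.IGNORECASE | re.MULTILINE,
    )


def add_missing_reflist(text: str) -> str:
    """Add {{reflist}} to a page that has ref tags but no reference list."""
    if not REF_TAG_RE.search(text) or HAS_LIST_RE.search(text):
        return text  # nothing to do, or a list already exists
    # Case 1: a suitable section exists; put the template right under it.
    for title in EXISTING_SECTIONS:
        match = heading_re(title).search(text)
        if match:
            return text[: match.end()] + "\n{{reflist}}" + text[match.end():]
    # Case 2: no such section; create one before a later standard section.
    for title in INSERT_BEFORE:
        match = heading_re(title).search(text)
        if match:
            new_section = "==References==\n{{reflist}}\n\n"
            return text[: match.start()] + new_section + text[match.start():]
    return text  # no safe anchor found; leave the page alone
```

Pages with no recognisable anchor are returned unchanged, which keeps the sketch on the conservative side of the placement question raised in the discussion.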

Discussion

Per Wikipedia:Bots/Requests for approval/Yobot 40, it seems adding a References section may be also useful. The problem is where to exactly place it. We can resolve this by testing of course. -- Magioladitis (talk) 09:44, 3 February 2017 (UTC)

  • Please re-review the required fields on the approval request. I believe you forgot to fill out the bottom half of it. ~ Rob13Talk 11:11, 3 February 2017 (UTC)
    • Done. I assumed that since they are the same with the previous BRFA the BAG can understand. -- Magioladitis (talk) 12:50, 3 February 2017 (UTC)
      • The top section of a BRFA is not just for BAG to understand, it's also to record the parameters of the approval. In this case, the "Function details" should describe how you will determine where to insert the references section. Experimentation during the approval process is not the time to begin determining this. I recommend you download the wikitext of recent good and featured articles, remove the "References" sections, and then do your testing offline until you have an algorithm that readds the "References" sections in the same locations as they were in the original articles. Anomie 13:30, 3 February 2017 (UTC)

"The problem is where to exactly place it." Sounds like a standard case of WP:CONTEXTBOT. You need to specify exactly where the tag and section will go and--if it's not supported by a layout policy/guideline--that there is consensus to place it that way. Sounds like tagging the articles and/or producing a list for human editors to review is more appropriate. —  HELLKNOWZ  ▎TALK 22:04, 3 February 2017 (UTC)

Just spitballing: If external links heading present, place above that. If external links heading not present but categories present, check for templates immediately above categories. If there are a series of templates immediately above categories, place above that (intended to handle navboxes). If there aren't a series of templates immediately above categories, place above the categories. That seems like it would handle this with a relatively high degree of accuracy if you can write regex to do it. We'd need to see a trial to determine what the error rate is, but if regex can be written to use that algorithm, I think a trial can be approved. ~ Rob13Talk 23:03, 3 February 2017 (UTC)
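The heuristic spitballed above can be turned into a small function that returns an insertion offset. This is only a sketch under simplifying assumptions: it treats a "template" as a single line of the form `{{...}}` (multi-line navboxes are not handled), and the name `references_insert_index` is hypothetical.

```python
import re
from typing import Optional

EXT_LINKS_RE = re.compile(
    r"^==[ \t]*External links[ \t]*==[ \t]*$", re.IGNORECASE | re.MULTILINE
)
CATEGORY_RE = re.compile(r"^\[\[Category:", re.IGNORECASE | re.MULTILINE)
TEMPLATE_LINE_RE = re.compile(r"^\{\{.*\}\}$")


def references_insert_index(text: str) -> Optional[int]:
    """Return the offset where a References section could be inserted."""
    match = EXT_LINKS_RE.search(text)
    if match:
        # Rule 1: place it directly above the External links heading.
        return match.start()
    match = CATEGORY_RE.search(text)
    if match is None:
        return None  # no anchor; leave the page for a human
    # Rules 2 and 3: start at the first category, then walk upwards over
    # blank lines and single-line templates (assumed to be navboxes).
    index = match.start()
    for line in reversed(text[: match.start()].splitlines(keepends=True)):
        stripped = line.strip()
        if stripped and not TEMPLATE_LINE_RE.match(stripped):
            break
        index -= len(line)
    return index
```

When the walk stops, the offset sits just after the last line of body content, so the new section lands above any navboxes and above the categories.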

Hellknowz handles most of the cases. I won't do cases AWB does not do. BU Rob13 this is what AWB does. I will do this. -- Magioladitis (talk) 23:15, 3 February 2017 (UTC)

How will the bot deal with pages that use Harvard refs, but where someone accidentally placed a reference with <ref> tags? In these cases, the correct fix is to change the new reference to the existing style, rather than to add a reflist template to the existing page. It appears to me that manual review is needed to tell if there are existing references in another style. — Carl (CBM · talk) 03:58, 9 February 2017 (UTC)
Are you volunteering? All the best: Rich Farmbrough, 23:32, 9 February 2017 (UTC).
A BRFA determines whether there is consensus for a task to be performed and whether the implementation being used is sufficient to handle the task in a proper and error-free manner. It's not an exercise in allocating volunteer time. ~ Rob13Talk 03:12, 10 February 2017 (UTC)

The task was performed during the period 2010-2017 with no problems. Some bots still perform it. So the question is void. -- Magioladitis (talk) 12:38, 10 February 2017 (UTC)

Here is an example of an incorrect edit from 2016: [18]. That article uses a different citation style, which required a different fix. These are hard for a bot to avoid entirely, but in general the bot should make some sort of attempt to detect this condition. For example, the presence of any of the harvtxt family of templates, or matching the regex /\(\w+ \d\d\d\d\)/, could be signs that the page should be examined manually. — Carl (CBM · talk) 13:06, 10 February 2017 (UTC)
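The two signals suggested above are cheap to check. In the sketch below, the regex for parenthetical citations is the one quoted in the comment; matching any template whose name starts with "harv" is an assumption standing in for "the harvtxt family", and `needs_manual_review` is a hypothetical name.

```python
import re

# Assumption: templates in the harvtxt family start with "harv".
HARV_TEMPLATE_RE = re.compile(r"\{\{\s*harv", re.IGNORECASE)
# Regex quoted in the comment above, e.g. "(Smith 2004)".
PAREN_CITE_RE = re.compile(r"\(\w+ \d\d\d\d\)")


def needs_manual_review(text: str) -> bool:
    """Return True if the page shows signs of a non-footnote citation style."""
    return bool(HARV_TEMPLATE_RE.search(text) or PAREN_CITE_RE.search(text))
```

The parenthetical pattern will also fire on unrelated text such as "(June 2010)", so it is better suited to routing pages to a human than to making an automatic fix.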

CBM The bot did not introduce the error. It made the error easier for normal editors who watch the page to find and fix. I could try an approach, but it is not my top priority since it looks a bit rare to me. Thanks for the report! -- Magioladitis (talk) 15:31, 10 February 2017 (UTC)

I am sorry that this bot request is not your top priority; perhaps you have too many of them open? I don't support the bot request if the bot task will blindly add the reflist tag without any effort to detect pages where it should not be added. That only adds a second error, rather than attempting to fix the original one. — Carl (CBM · talk) 15:37, 10 February 2017 (UTC)
This sort of wilful misinterpretation is not helpful. All the best: Rich Farmbrough, 20:54, 10 February 2017 (UTC).
I need to see some example edits in semi-auto to get a feel of what this task would do. The problem originally was where to place a references section. I suggested pseudo-code to handle that, and then was told it already does that. That seems contradictory, so I'm left unsure what exactly this bot will be doing and how it will do it. ~ Rob13Talk 15:41, 10 February 2017 (UTC)

CBM doing the task is top priority. Fixing a minor issue is not. -- Magioladitis (talk) 15:48, 10 February 2017 (UTC)

You are putting the cart before the horse. The point of a BRFA is to make sure that issues are addressed, not to rubber stamp the request. There is no reason overall why this needs to be done by Yobot. If a bot operator does not constructively address issues that are raised, the request should probably not be approved. — Carl (CBM · talk) 15:54, 10 February 2017 (UTC)

CBM the thing is that you are making a different request here. I discuss pages that for some reason have ref tags, and you ask me to do something for pages that use a mixed system. In the meantime, MediaWiki auto-generates a reflist tag. As I said, I'll look into it, but do you see the problem here? -- Magioladitis (talk) 15:56, 10 February 2017 (UTC)

The page did not use a mixed system; it used Harvard refs. Then someone erroneously added a ref in a different format. That error should be fixed by changing the new ref to match the previous formatting, not by adding a reflist tag (which is a second error, because the established style did not use footnotes). The bot code should include some logic to try to detect this situation (e.g by looking for the templates in the harvtxt family), so that it does not mistakenly add a reflist tag where none should be added. This is not a different request - it is an aspect of the bot task that is being proposed. — Carl (CBM · talk) 16:02, 10 February 2017 (UTC)
This is sort of what I implied above. The bot should be able to skip pages where it finds conflicting references or any syntax that is likely to convolute things further. I'm still waiting on full function details though. I don't think we should require the bot to be able to deal with different styles, though that's a supreme task. The bot operator should acknowledge these related tasks and how their edits may or may not affect the article's ref style and what they intend to do about them and how they avoid problematic cases or further messing up the article. —  HELLKNOWZ  ▎TALK 16:16, 10 February 2017 (UTC)


Hellknowz The bot won't affect the existing ref style on any page. I added details and a link to the other bot that was doing the same job using python. -- Magioladitis (talk) 20:15, 10 February 2017 (UTC)

I thought I had to do some extra programming but AWB is sooooo good after all: This was already in the code (undocumented feature!) -- Magioladitis (talk) 20:42, 10 February 2017 (UTC)

"The bot won't affect the existing ref style on any page." - that is exactly what happened in the link I gave. The existing style did not use footnotes, but one was added erroneously. Adding a {{reflist}} tag only further changes the existing style. Are you saying that the bot will now do something different? Hellknowz wrote "The bot should be able to skip pages where it finds conflicting references or any syntax that is likely to convolute things further", and I also think that would be ideal. — Carl (CBM · talk) 01:04, 11 February 2017 (UTC)

{{BAGAssistanceNeeded}} I once more underline the fact that this is a take over of a bot that was already doing it. -- Magioladitis (talk) 02:06, 23 February 2017 (UTC)

  • Note - Magioladitis has been doing this process in a very automated way using their account. I will be filing with ArbCom about it. Primefac (talk) 12:31, 29 March 2017 (UTC)
Primefac The Arbcom has closed a week ago. -- Magioladitis (talk) 12:43, 29 March 2017 (UTC)
Yup, and points 5 and 6 pretty clearly state that you shouldn't be doing semi-automated processes that Yobot would be doing if the Yobot task is not approved and/or halted. Primefac (talk) 12:47, 29 March 2017 (UTC)
Primefac Yobot is not doing this task. The points refer to tasks that will be approved after the ArbCom. -- Magioladitis (talk) 12:49, 29 March 2017 (UTC)
Clearly not. However, if Yobot isn't approved to do a bot task, should you really be doing the same automated task on your main account? That's one of the main reasons why you were brought before ArbCom in the first place. Primefac (talk) 12:52, 29 March 2017 (UTC)
Primefac Please contact the ArbCom asap. -- Magioladitis (talk) 12:53, 29 March 2017 (UTC)
You're an odd cat, Magioladitis. Primefac (talk) 13:12, 29 March 2017 (UTC)

Primefac True :) But this process after 2+ months is exhausting. I try to help the best way I can. Every action I try gets complaints lately. If I am not welcome somewhere I can just leave. Wikipedia is a thing I do every day from the time I wake up till the time I go to sleep, but if the rules have changed I will be OK with it. This time I think I followed the rules I was given; if not, so be it. Mea culpa. -- Magioladitis (talk) 13:22, 29 March 2017 (UTC)

Approved for trial (25 edits). Now that the brouhaha has died down, let's see some trials. The above objections are noted, but are nothing that should prevent a bot from running. Specifically, if Harvard citations are the dominant style, and a <references/> is missing because one citation does not follow Harvard style, then the bot puts the error in everyone's face, which will lead to someone fixing it, and the article falling back into compliance. Editing conditions are:

  • The reference list must actually be missing. There must be an active check that no reference list exists via <references/> or {{Reflist}} (and variants, T162492 is relevant here). If it finds anything (I suggest the "(<\s*reference|\{\{\s*reflist)" regex), the bot must skip the page, regardless if other fixes could be done.
  • AWB genfixes may not be done on their own as part of this task.
  • AWB genfixes may be enabled, but only if this is explicitly mentioned in the edit summary (T161460 is relevant here, but a 'manual' edit summary that does the same thing is acceptable)
  • If AWB genfixes are enabled, the skip if cosmetic-only and minor genfixes-only options must be enabled.
  • Edit summary must be clear about what the task is, and where to report issues.
  • Verify that <references/>, {{Reflist}}, and other variants trigger the skip condition. This may be done in a sandbox. This specific item must be done before the trial edits are performed. Headbomb {t · c · p · b} 20:34, 17 April 2017 (UTC)
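The verification asked for in the last condition can be done offline before any sandbox run. The sketch below checks the regex suggested in the first condition against a handful of spellings; the list of variants is illustrative rather than exhaustive, and compiling with `re.IGNORECASE` is an assumption, since the regex as written is case-sensitive and would not match the capitalised {{Reflist}}.

```python
import re

# Regex suggested in the editing conditions, compiled case-insensitively
# (the flag is an assumption, not part of the suggestion).
SKIP_RE = re.compile(r"(<\s*reference|\{\{\s*reflist)", re.IGNORECASE)

# Illustrative spellings that should all trigger the skip condition.
VARIANTS = [
    "<references/>",
    "<references />",
    "< references>",
    '<references group="note"/>',
    "{{reflist}}",
    "{{Reflist|2}}",
    "{{ reflist | 30em }}",
]


def must_skip(text: str) -> bool:
    """Return True if the page already contains a reference list."""
    return SKIP_RE.search(text) is not None


# Any entry left in this list is a variant the regex fails to catch.
unmatched = [variant for variant in VARIANTS if not must_skip(variant)]
```

An empty `unmatched` list means every listed spelling is caught; template redirects with other names would need their own entries in both the regex and the list.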
@Headbomb: If we're intending the likely errors identified above to trigger editor attention, this task should be done (a) without a bot flag, and (b) with minor edit unchecked. The error isn't "put in everyone's face" if it doesn't appear on watchlists. ~ Rob13Talk 20:38, 17 April 2017 (UTC)
The error is not very likely to occur in the first place, and would already have shown up in watchlists (likely not marked as minor either). However, I'm open to have this be considered a major edit, given it's adding a section that didn't exist. But the bot flag should be used, since this doesn't need to show up in recent changes, nor should people be denied the option to hide this stuff from their watchlist. Headbomb {t · c · p · b} 20:45, 17 April 2017 (UTC)
I could get behind that as a compromise. ~ Rob13Talk 03:44, 18 April 2017 (UTC)

I'll start this after my wikibreak. -- Magioladitis (talk) 00:06, 1 June 2017 (UTC)

Trial complete. -- Magioladitis (talk) 23:21, 7 July 2017 (UTC)

[19], diffs. -- Magioladitis (talk) 23:22, 7 July 2017 (UTC)

I'm commenting not as a BAG member but as a regular editor, per my recusal from Yobot tasks. I wanted to note that the edit summary features a redlink; I believe you intended it to link to here. Simple fix. Second, I wanted to reiterate that I think this should run without the minor edit flag both due to potential errors and because a whole section is being added, as per above. ~ Rob13Talk 00:31, 8 July 2017 (UTC)
I fixed the red link. -- Magioladitis (talk) 06:40, 8 July 2017 (UTC)

A user has requested the attention of a member of the Bot Approvals Group. Once assistance has been rendered, please deactivate this tag. --- Magioladitis (talk) 09:29, 15 July 2017 (UTC)

Yobot 34

Operator: Magioladitis (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:12, Thursday, February 2, 2017 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): AWB / WPCleaner

Source code available:

Function overview: Move reference after punctuation per WP:REFPUNCT

Links to relevant discussions (where appropriate): WP:REFPUNCT

Edit period(s): Daily

Estimated number of pages affected: 1000 pages per day


Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): Yes

Function details: foo<ref>bar</ref>. will change to foo.<ref>bar</ref>.

    • Supported punctuation: ,.;:?
    • Supported templates: "Efn", "Efn-ua", "Efn-lr", "Sfn", "Shortened footnote", "Shortened footnote template", "Sfnb", "Sfnp", "Sfnm", "SfnRef"
    • Supported template lists: "Rp", "Better source"
      • Additionally it will clean duplicate punctuation before a ref, except for !!, which could be part of a wikitable
      • If there have been changes, FixReferenceTags needs to be called in case the moved punctuation didn't have whitespace after it
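The core move described in the function details can be sketched with one regex substitution. This is a simplified Python illustration, not AWB's implementation: it handles only `<ref>` tags (none of the listed templates), does not clean duplicate punctuation, and the name `move_punctuation` is an assumption.

```python
import re

# One <ref>...</ref> pair (not crossing into another ref) or a
# self-closing <ref ... /> tag.
REF = (
    r"<ref\b[^>]*(?<!/)>(?:(?!</?ref\b).)*</ref>"
    r"|<ref\b[^>]*/>"
)
# A run of adjacent refs followed by one supported punctuation mark.
MOVE_RE = re.compile(
    r"((?:" + REF + r")+)([,.;:?])", re.DOTALL | re.IGNORECASE
)


def move_punctuation(text: str) -> str:
    """Move punctuation that trails a run of refs to just before the run."""
    return MOVE_RE.sub(lambda m: m.group(2) + m.group(1), text)
```

"!" is deliberately absent from the character class, matching the supported punctuation list. Input such as `Hi<ref>x</ref>.com` would still be rewritten by this sketch, so a real run would need additional guards.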

Discussion

Similar task is done by BG19bot and Menobot. -- Magioladitis (talk) 00:12, 2 February 2017 (UTC)

  • Support a trial to ensure no cosmetic-only edits are made. ~ Rob13Talk 00:29, 2 February 2017 (UTC)
BU Rob13 What is the definition of cosmetic-only edits? -- Magioladitis (talk) 10:24, 2 February 2017 (UTC)
We've been over this so many times and I've explained the definition repeatedly. I'm not continuing this at a new venue. At this point, you either understand the concerns of the community or you do not. ~ Rob13Talk 10:40, 2 February 2017 (UTC)
The concerns are different from the definition of "cosmetic edits". I try to explain that it would be better if you avoid this term till the definition is cleared. -- Magioladitis (talk) 11:50, 2 February 2017 (UTC)
The policy as written has minor potential to confuse those who are not familiar with technical areas. You are familiar with technical areas. Everyone else at BRFA is as well, including the BAG members who will review this request. There's no lack of clarity among those with experience in the area, so I will continue using the jargon because it gets the point across to those around these parts who care to listen. ~ Rob13Talk 12:04, 2 February 2017 (UTC)

The task description does not mention any general fixes being run, and provided they are not enabled this appears to be a fine task for a bot to perform. — Carl (CBM · talk) 12:13, 2 February 2017 (UTC)

Please list all the "punctuation" marks that the bot detects and if it applies special rules to any. What if there is more than one Hi<ref>... or mixed Hi?<ref>! or other characters (Hi<ref>). or lack of spaces Hi<ref>.com or multiple refs Hi.<ref><ref> / Hi<ref>.<ref>. Does it only change <ref>Stuff</ref> or </ref> or any of the other variants? —  HELLKNOWZ  ▎TALK 22:35, 3 February 2017 (UTC)

Hellknowz The punctuation marks supported are: ,.;:?. ! is not supported because it is used in wikitables. -- Magioladitis (talk) 19:49, 12 February 2017 (UTC)

Hellknowz the details are handled by AWB's source code. -- Magioladitis (talk) 19:51, 12 February 2017 (UTC)

It seems to me that the source code of a bot should be determined by the function details, and not vice-versa. Otherwise, how would we be able to tell if there is a bug that makes the AWB source code no longer do the right thing for this bot request? It would be bad idea to have open-ended bot approvals that can change later based on source code changes, which might not even be made by the bot operator. — Carl (CBM · talk) 00:58, 13 February 2017 (UTC)
As I've already pointed out to the botop elsewhere, we don't approve source code or tools, we approve tasks, which are listed in function details. Anything not listed is assumed to not be part of the task. —  HELLKNOWZ  ▎TALK 10:51, 13 February 2017 (UTC)

Hellknowz I also added the templates supported. -- Magioladitis (talk) 11:03, 13 February 2017 (UTC)

Hellknowz I added more details. -- Magioladitis (talk) 11:08, 13 February 2017 (UTC)

{{BAGAssistanceNeeded}} -- Magioladitis (talk) 02:09, 23 February 2017 (UTC)

  • Comment looks like this can probably be declined, he's just doing it on his main account. Primefac (talk) 14:42, 13 April 2017 (UTC)

Primefac I would prefer if this were done automatically, and probably later passed to the Wmflabs server so it is done on a daily basis. -- Magioladitis (talk) 15:35, 13 April 2017 (UTC)

Oh, I can totally understand that. I was just providing an update to the situation. Primefac (talk) 15:38, 13 April 2017 (UTC)
Primefac I'll continue doing all the tasks done by Yobot from my personal account respecting the restrictions of course. Still, I was asked to re-submit all the tasks (100+) for BRFA. I am doing this but the procedure is slow. -- Magioladitis (talk) 15:43, 13 April 2017 (UTC)
I have to spend most of my time online right now to keep up with the work. Reserving free knowledge for the entire planet needs personal sacrifices; still, I think some things can be done by bots to help me and others do other useful stuff such as improving code, organising wiki seminars, etc.
Making sure punctuation comes before the references hardly seems like a world-ending task that must be done immediately or Wikipedia would implode. But hey, you waste your time however you feel best. I'm not one to judge. Primefac (talk) 15:46, 13 April 2017 (UTC)

Primefac there are approx. 1000 new pages per day with this error. The current backlog is 8,000 pages. Today I fixed about 2,000 pages but it took me more than 10 hours(!) of almost continuous editing in front of my laptop screen. This task should be done by bot and not depend on the fact that today I did not leave home. -- Magioladitis (talk) 20:03, 13 April 2017 (UTC)

Approved for trial (2000 edits or 2 days). SQLQuery me! 02:44, 9 May 2017 (UTC)

{{OperatorAssistanceNeeded}} Was this trial completed, do you have a trial report? — xaosflux Talk 03:11, 20 June 2017 (UTC)

Xaosflux I just started it. Please take note I am on wikibreak and that the task took 3 months only to get to the trial phase. -- Magioladitis (talk) 08:16, 20 June 2017 (UTC)

Trial complete. -- Magioladitis (talk) 09:22, 20 June 2017 (UTC)

@SQL: can you review? — xaosflux Talk 11:58, 27 June 2017 (UTC)

SportsStatsBot

Operator: DatGuy (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search) and co-botop Kees08 (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 00:12, Saturday, March 18, 2017 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: User:DatBot/footycode

Function overview: Bot automatically updates football (soccer) league tables

Links to relevant discussions (where appropriate): Special:Permalink/770569052#Bot to update tables

Edit period(s): Checks every 30 mins

Estimated number of pages affected: Minimum 2 templates, excluding transclusions. Minimum 53 transclusions.

Exclusion compliant (Yes/No): No

Already has a bot flag (Yes/No): Yes

Function details: A bot that automatically updates tables. The bot would take input from the sites that are customised at User:SportsStatsBot/footyconfig, and edits the template directly. The only manual thing would be to edit the bot's settings for relegations and promotions I believe. It would also be possible to turn off one of the leagues that the bot would manage, if there would be some weird event that the source updated incorrectly.

Discussion

  • This seems like a very worthwhile task for a bot - I can't see any policy reason for objection. Would the bot automatically shut off 30 minutes after the last game in the season has been played to avoid any errors in the edits regarding promotion/relegation? TheMagikCow (T) (C) 09:43, 18 March 2017 (UTC)
  • I'm not sure whether the page would have SEASON ENDED that I could look for in the html, but I believe it shan't edit if there have been no matches. Dat GuyTalkContribs 11:58, 18 March 2017 (UTC)
  • I'm not sure about looking in the html, but perhaps if the bot has not edited a season in - say 4 weeks - it would stop editing that template until it has been restarted with the updated information? I don't think it's a huge problem, though. TheMagikCow (T) (C) 19:59, 18 March 2017 (UTC)
  • Note to any BAG member reading this: I plan on expanding this to include infoboxes. Should I make a new bot account? Dat GuyTalkContribs 11:58, 18 March 2017 (UTC)
  • {{BAGAssistanceNeeded}} Comments? Dat GuyTalkContribs 18:30, 24 March 2017 (UTC)
    Seems pretty nifty! What templates will the bot be editing, and how many transclusions do they have? If we're talking just a handful of pages then it's no biggie, but if there are hundreds, thousands, then that changes everything. The config page will have to at least be semi-protected, and if the bot is going to affect a lot of pages we may have to consider moving it to .js so only the botop and admins can edit it. Another important thing for this task is to properly handle edit conflicts, otherwise you may overwrite someone else's changes. Your code doesn't seem to do this, but then again as you know I don't speak Python very well :) I would also recommend that this task be exclusion compliant, especially if we're going to be editing in the mainspace.
    Just so you know, I'm off to WMCON tomorrow and won't be back till 3 April, so pre-apologies if I'm not very responsive. No need to wait for me, other BAGgers feel free to take over MusikAnimal talk 04:59, 25 March 2017 (UTC)
    It'll edit the templates themselves. For example, it will edit Template:2016-17 Premier League table and not the pages it is transcluded on. It isn't currently added, but I could make it not edit if there's a conflict. Hope you have a great time in Berlin. Dat GuyTalkContribs 14:26, 25 March 2017 (UTC)
    That template is transcluded 27 times, so editing it will affect 27 pages. This is very important information. Please update the "Estimated number of pages affected" (keyword affected), and make note of all templates the bot will edit. Thank you! MusikAnimal talk 10:02, 26 March 2017 (UTC)
    Well, 53 pages would be a minimum since the bot is currently set up to edit the Bundesliga and Premier League templates. I've updated the "estimated number of pages affected." Is this ready for trial? Dat GuyTalkContribs 10:08, 26 March 2017 (UTC)
    I feel that semi-protection on the config page is advisable, to prevent the potential of vandalism. TheMagikCow (T) (C) 17:37, 26 March 2017 (UTC)
    I see 27 transclusions for {{2016–17 Premier League table}} and 26 for {{2016–17 Bundesliga table}}. Not sure if the pages that transclude them overlap, but if not we're up to 53 pages. It is really nice that you can simply add more templates and regex to the config, but the problem I see with that is that you'd need to somehow do testing first. I consider myself quite fluent with regex but I still wouldn't change it without doing a dry run with the bot. Obviously the room for error is much greater when you are not the bot operator, or very good with regex, so maybe a config isn't the best idea? What do you think? MusikAnimal talk 09:47, 29 March 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I could make an option called 'dryrun' which outputs the result of the dry run to a subpage such as User:SportsStatsBot/dry/[leaguename]x (x = Number of dryrun). Dat GuyTalkContribs 14:31, 29 March 2017 (UTC)

A dry config page and a dry output page sound like really useful testing tools that anyone could employ before passing the config to the live page. If you are willing to code those. —  HELLKNOWZ  ▎TALK 15:53, 29 March 2017 (UTC)
@Hellknowz: It's been implemented. Dat GuyTalkContribs 16:17, 30 March 2017 (UTC)
I like it a lot! Just need to make sure it is well documented how to do a dry run, and encourage it before instructing the bot to update the actual template. On to the next question – what's up with User:SportsStatsBot/nbaconfig? Are we planning on doing NBA as well? MusikAnimal talk 21:40, 5 April 2017 (UTC)
Planning is the key word. Currently, there's enough for the statistics, but I've had difficulty working out how to transfer it onto the template and how to determine whether a team has clinched a playoff spot. Dat GuyTalkContribs 13:55, 6 April 2017 (UTC)

Approved for trial (dry run only, one per template). It seems for this bot the dry run functionality is important, so let's do a trial of that first. The other major component missing here is documentation – User:SportsStatsBot currently only states that the account is a bot, nothing more. It would be good to explain what the bot does, and for highly configurable bots like this you should also explain all the available config options, and also how to do a dry run, etc. MusikAnimal talk 01:36, 10 April 2017 (UTC)

Conclusions from mini-run for a week or so (please don't consider that a full trial):
  • I let it run while on vacation, not a very good idea.
  • Bot took content from the template, and put it in the dryrun page. This made the trial effectively useless.
  • I'll change the code so that if the dry-run page has not been created yet, it takes its content from the template page. If the page already exists, it'll update itself, excluding the template.
  • Documentation has started
Thanks, Dat GuyTalkContribs 09:42, 23 April 2017 (UTC)
  • I suppose Trial complete. I tried to debug stuff on the way, and that's why it took such a long time. Everything works well aside from the function of determining when it was updated. The diff system is by bytes, and the BBC page has Last updated 13 hours ago at the bottom. Every time there is a new digit, the bot takes it as a change. If anyone knows how to fix it, it will be appreciated. Dat GuyTalkContribs 10:09, 7 May 2017 (UTC)
    How about parsing the integer out of that text? So with regex grouping you could do Last updated (\d+), which will return the number as a string, which you can parse into an integer. If it is different than what you have stored, then the bot makes the updates. MusikAnimal talk 23:55, 13 May 2017 (UTC)
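A minimal sketch of the suggestion above, in Python. Only the `Last updated (\d+)` pattern comes from the discussion; the function names and the idea of passing the stored value as an argument are assumptions made for illustration.

```python
import re

# Pattern from the discussion; the capture group holds the number as a string.
LAST_UPDATED = re.compile(r"Last updated (\d+)")


def parse_last_updated(text):
    """Return the integer after 'Last updated', or None if it is absent."""
    match = LAST_UPDATED.search(text)
    return int(match.group(1)) if match else None


def has_changed(text, stored_value):
    """Return True if the parsed number differs from the stored one."""
    return parse_last_updated(text) != stored_value
```

This needs the page body rather than only its headers, so it would not work with a check based on `Content-Length` alone.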
    {{OperatorAssistanceNeeded|D}} SQLQuery me! 04:02, 22 May 2017 (UTC)
    Impossible since I use response.info()["Content-Length"]. It doesn't get the content of the page directly, but gets specific attributes about it. Also, I tried removing Last updated[^a]*ago with a regex in a file, but the length is still different for some reason. I thought about using a module named difflib, but I've never used it before. Dat GuyTalkContribs 17:47, 22 May 2017 (UTC)
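One possible way to combine the two ideas above is sketched below, assuming the bot downloads the full page body instead of reading only `Content-Length`. The function names are hypothetical. The regex is the one quoted in the comment; note that it will not match when the unit contains the letter "a" (for example "1 day ago"). `difflib.ndiff` is used only to show which lines still differ once the timestamp text is stripped, which may help explain why the lengths do not match.

```python
import difflib
import hashlib
import re

# Pattern quoted in the discussion for the volatile timestamp text.
VOLATILE = re.compile(r"Last updated[^a]*ago")


def fingerprint(html):
    """Hash the page body with the 'Last updated ... ago' text removed."""
    stable = VOLATILE.sub("", html)
    return hashlib.sha256(stable.encode("utf-8")).hexdigest()


def remaining_differences(old_html, new_html):
    """List the lines that still differ after stripping the timestamp."""
    old = VOLATILE.sub("", old_html).splitlines()
    new = VOLATILE.sub("", new_html).splitlines()
    # ndiff prefixes removed lines with '- ' and added lines with '+ '.
    return [line for line in difflib.ndiff(old, new)
            if line.startswith(("+ ", "- "))]
```

Storing the fingerprint between runs and comparing it on the next fetch would replace the byte-length comparison; the diff helper is meant for debugging rather than for the live run.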
    I've changed a bit of the code. I believe it should work now. See [20]. Dat GuyTalkContribs 15:56, 14 June 2017 (UTC)
    Have you tried looking at the other headers? There is an ETag that I think will be updated when the content changes. You could keep track of that instead. Also, where is the dry run page? MusikAnimal talk 16:11, 16 June 2017 (UTC)
    ETag fails. Trying to get hold of Kees08 on IRC. If we can't find a way, I'll have to find another site since it seems like I've exhausted all the options. Dat GuyTalkContribs 10:38, 22 June 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── I've made a pretty simple fix at [21] which should work for all normal template runs and most dry runs. Think it is time for maybe a live run. Dat GuyTalkContribs 21:32, 24 June 2017 (UTC)

I will keep fixing it up, but is there any content that is missing from the bot documentation page or the user page that I can add? Kees08 (Talk) 01:17, 7 July 2017 (UTC)


Approved requests

Bots that have been approved for operations after a successful BRFA will be listed here for informational purposes. No other approval action is required for these bots. Recently approved requests can be found here (edit), while old requests can be found in the archives.


Denied requests

Bots that have been denied for operations will be listed here for informational purposes for at least 7 days before being archived. No other action is required for these bots. Older requests can be found in the Archive.

Expired/withdrawn requests

These requests have either expired, as information required by the operator was not provided, or been withdrawn. These tasks are not authorized to run, but such lack of authorization does not necessarily follow from a finding as to merit. A bot that, having been approved for testing, was not tested by an editor, or one for which the results of testing were not posted, for example, would appear here. Bot requests should not be placed here if there is an active discussion ongoing above. Operators whose requests have expired may reactivate their requests at any time. The following list shows recent requests (if any) that have expired, listed here for informational purposes for at least 7 days before being archived. Older requests can be found in the respective archives: Expired, Withdrawn.