Wikipedia:Bots/Requests for approval/Yobot 57
- The following discussion is an archived debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA. The result of the discussion was Approved.
Operator: Magioladitis (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)
Time filed: 14:58, Wednesday, July 12, 2017 (UTC)
Automatic, Supervised, or Manual: Supervised (originally filed as Automatic)
Programming language(s): AWB
Source code available:
Function overview: Fix {{http...}}
Links to relevant discussions (where appropriate): Wikipedia_talk:WikiProject_Check_Wikipedia#About_700_articles_to_clean_up
Edit period(s): Daily
Estimated number of pages affected: 700 in first run
Namespace(s): Mainspace
Exclusion compliant (Yes/No):
Function details: Find & Replace to change {{http...}} to [http...]
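A minimal sketch of the find-and-replace described above, written in Python rather than as an AWB rule. The pattern and the function name `braces_to_brackets` are illustrative assumptions, not the operator's actual settings. It only touches a URL that is the sole content of the braces, so anything with parameters or extra text is left for a human to look at.

```python
import re

# Matches a URL that is the only thing inside double curly braces,
# e.g. {{http://example.com}}. Content with "|" or trailing text is not
# matched, since it may be a malformed template that needs review.
BRACED_URL = re.compile(r"\{\{\s*(https?://[^\s{}|]+)\s*\}\}")


def braces_to_brackets(wikitext: str) -> str:
    """Replace {{http...}} with [http...], leaving real templates alone."""
    return BRACED_URL.sub(r"[\1]", wikitext)
```

Text such as `{{cite web|url=http://example.com}}` does not start with a URL right after the opening braces, so it is not changed.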
Discussion
Support. This task may need to run supervised. See this correction, where {{cite web}} was the best choice, and this correction, where removing the curly braces was a better choice than replacing them with square brackets. – Jonesey95 (talk) 19:47, 12 July 2017 (UTC)
support, assuming the replacements are context aware per the examples provided above by Jonesey95 (i.e., can detect the difference between a bare URL wrapped in curly braces vs. a missing 'cite web'). Frietjes (talk) 20:48, 13 July 2017 (UTC)
- So when will it decide to use {{cite web}} over square brackets? It might be preferable to convert to cite web in many cases rather than reduce it down to a plain link.—CYBERPOWER (Chat) 16:25, 16 July 2017 (UTC)
- Cyberpower678 I guess this applies to almost(?) all bare links? Maybe, we should convert everything to cite web? We used to have a bot for that purpose in the past. -- Magioladitis (talk) 16:28, 16 July 2017 (UTC)
- My own bot does that to an extent. I believe that is the preferable path to take. Of course that would mean the bot needs to figure out the access date and all the required parameters. @Jonesey95 and Frietjes: What do you think?—CYBERPOWER (Chat)
- Cyberpower678 I certainly prefer this approach. The bot should try and access the page and even get metadata from it. Was it your bot that used to do that? -- Magioladitis (talk) 16:33, 16 July 2017 (UTC)
- IABot does it when it sees that as appropriate. Some bare links do get converted to full cite templates. Others, to preserve context, it just leaves as is.—CYBERPOWER (Chat) 16:35, 16 July 2017 (UTC)
- @Jonesey95 and Frietjes: Pinging again, as it appears to not have worked last time.—CYBERPOWER (Message) 01:59, 19 July 2017 (UTC)
- If a bot or script is doing the conversions, it should not add an access-date, since that is the date when the content of the web page was verified to support the statement in question (assuming the link is inside ref tags). If the bot/script is adding things like a title, each one will need to be checked for reasonableness. We don't want a bunch of titles like "Error 404" inserted into WP. I think it would be appropriate to approve this task as a bot-flagged task for a human editor to perform with a script, checking every edit. – Jonesey95 (talk) 04:55, 19 July 2017 (UTC)
- it sounds like human supervision is needed as suggested above and below. Frietjes (talk) 12:33, 19 July 2017 (UTC)
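The two constraints raised above (no access-date, and titles checked for reasonableness) can be sketched roughly as follows. This is only an illustration: `build_cite_web` and the `SUSPECT_TITLE` word list are hypothetical, and the check is a first filter for the operator, not a substitute for reviewing each edit.

```python
import re

# Hypothetical list of phrases that suggest a fetched title is an error
# page rather than the real page title.
SUSPECT_TITLE = re.compile(
    r"\b(404|403|500|not found|error|access denied|forbidden)\b",
    re.IGNORECASE,
)


def build_cite_web(url, title=None):
    """Return (citation, needs_review) for a bare URL.

    No access-date is added: that date records when a human verified
    the page against the article text, which a script cannot know.
    """
    needs_review = (
        title is None
        or not title.strip()
        or "|" in title  # a pipe would break the template parameters
        or bool(SUSPECT_TITLE.search(title))
    )
    parts = ["cite web", f"url={url}"]
    if not needs_review:
        parts.append(f"title={title.strip()}")
    return "{{" + " |".join(parts) + "}}", needs_review
```

A suspicious or missing title is dropped and the result is flagged, so the operator can fill in the title by hand.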
- I think there is an issue of WP:CONTEXTBOT here. Not all links as {{https...}} will be appropriate as a bare url - some may be specific cite templates. See the matches on Paul McCartney and Victoria Azarenka, the first 2 results in the insource search, which are both better suited to an appropriate cite template. This is a task that needs doing, but I think a semi-automatic approach would be better. TheMagikCow (T) (C) 12:49, 17 July 2017 (UTC)
- @Magioladitis: since context is important here, and since there are only 700 pages to run this on, I would advise making this a supervised task. It's not too difficult to review 700 edits.—CYBERPOWER (Chat) 13:03, 19 July 2017 (UTC)
- @Cyberpower678: done. -- Magioladitis (talk) 15:03, 19 July 2017 (UTC)
- Approved for trial (50 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete. So let's make sure we don't have to clean up 700 edits; let's see if we can fix any problems that arise in the first 50.—CYBERPOWER (Chat) 15:52, 19 July 2017 (UTC)
Cyberpower678 I am not sure that we set the exact rules of which method of conversion to choose each time. Should I use the bracket conversion or the cite web conversion? -- Magioladitis (talk) 18:52, 19 July 2017 (UTC)
- I think {{cite web}} would be fine for this subset of the originally-provided problem (the find/replace is left to the reader). Cite web might ultimately be suboptimal, but it will get it into the citation wiki-gnomes queue where they may be converted by those editors. The subset remaining after is this one. --Izno (talk) 20:30, 19 July 2017 (UTC)
- Sorry, I thought that was clear. Essentially if it's all alone with nothing else inside of a reference, I would convert them to cite templates. If they look like they were intended to be cite templates, but just forgot the cite web aspect, fix those. If they appear to be in a form of context, or are outside references, convert them to brackets. Some may be formatted improperly but are intended to be plain URLs with text, so those should be converted to square brackets as well.—CYBERPOWER (Around) 20:35, 19 July 2017 (UTC)
- This subset (there are two false-positives--review before running) can also take a {{cite web}} after the 100+ above. Any others will likely need some close attention. --Izno (talk) 20:48, 19 July 2017 (UTC)
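The conversion rules laid out above could be sketched as a two-pass replacement, shown here in Python as an illustration only. The patterns and the name `convert` are assumptions, not the rules actually loaded into AWB. The sketch covers two of the cases: a braced URL alone inside a reference becomes {{cite web}}, and any other braced URL becomes a bracketed link. Braces that look like a cite template with its name missing are not handled and stay as they are for the operator.

```python
import re

# A braced URL that is the only content of a <ref>...</ref> pair.
LONE_IN_REF = re.compile(
    r"(<ref[^>/]*>)\s*\{\{\s*(https?://[^\s{}|]+)\s*\}\}\s*(</ref>)",
    re.IGNORECASE,
)
# Any other braced URL with nothing else inside the braces.
BRACED_URL = re.compile(r"\{\{\s*(https?://[^\s{}|]+)\s*\}\}")


def _to_cite_web(match):
    # No access-date is added; only the URL is known at this point.
    return match.group(1) + "{{cite web |url=" + match.group(2) + "}}" + match.group(3)


def convert(wikitext: str) -> str:
    # Pass 1: URL alone inside a reference -> {{cite web}}.
    text = LONE_IN_REF.sub(_to_cite_web, wikitext)
    # Pass 2: URL in running text or outside references -> [url].
    return BRACED_URL.sub(r"[\1]", text)
```

Running the reference pass first matters: once a lone URL has been rewritten as {{cite web |url=...}}, the second pattern no longer matches it, because the braces are no longer followed directly by a URL.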
- {{OperatorAssistanceNeeded|D}} Where are we at here?—CYBERPOWER (Chat) 13:34, 5 August 2017 (UTC)
[1] I had to remove the URL since Wikipedia can't be a reference to itself. -- Magioladitis (talk) 08:22, 6 August 2017 (UTC)
Cyberpower678 20 diffs. I had to stop. Many, many different variations. Many cases of unbalanced brackets, cases of completely inappropriate bracketing anyway, etc. I even discovered an AWB bug. I recommend we do this by normal accounts. -- Magioladitis (talk) 08:33, 6 August 2017 (UTC)
- Your bot doesn't seem to have broken anything, AFAICT. It seems to be handling them well. I do have some questions however. Why is your bot in some cases removing the braces entirely, and in some cases replacing them with brackets? I saw two different references containing only a URL, and in one edit it was made a plain URL, while in another, it was made into a bracketed URL.—CYBERPOWER (Chat) 15:19, 6 August 2017 (UTC)
- Cyberpower678 It's because I was supervising every edit. I had to manually act when I noticed that the nearby urls were plain URLs. If you OK that I continue like that, I am OK too :) This will help me report bugs and improve AWB logic too.-- Magioladitis (talk) 15:46, 6 August 2017 (UTC)
- (Commenting as a community member, not BAG member, per my recusal) I would be very supportive of running this task from the Yobot account manually to both solve the issue of false positives and prevent watchlist spam. This is an obviously beneficial fix. ~ Rob13Talk 21:10, 6 August 2017 (UTC)
- Alright, I still think we should continue to run the task from the bot account, in a supervised state. Let's do another trial. Approved for extended trial (100 edits). Please provide a link to the relevant contributions and/or diffs when the trial is complete.—CYBERPOWER (Chat) 18:08, 7 August 2017 (UTC)
Cyberpower678 I did 100 edits. 100 diffs. I need someone to check before proceeding because most of the edits are not done automatically. I think I got a couple of ideas of how to improve AWB via this procedure. As soon as someone checks I'll continue. -- Magioladitis (talk) 16:06, 12 August 2017 (UTC)
- I found a purely cosmetic edit here.
- This edit leaves a big fat red error message.
- Here's another almost cosmetic edit.
- So for 1 and 3, why are the URLs being deleted? For number 2, why is it replacing URLs with a template where no URL for it exists?—CYBERPOWER (Around) 20:36, 12 August 2017 (UTC)
Cyberpower678 I fixed the official website issue. I usually go and add the Wikidata item manually before the edit. This time I forgot. For the others: I started by deleting to remove from the list but then I realised I could fix them by removing the commented out text. If you check the rest it should be OK. The almost cosmetic is not. I fixed 3 urls with double curly brackets. -- Magioladitis (talk) 20:44, 12 August 2017 (UTC)
Cyberpower678 I did 100 edits. -- Magioladitis (talk) 20:56, 12 August 2017 (UTC)
Trial complete. Magioladitis (talk) 20:56, 12 August 2017 (UTC)
- Approved. Edit looks good, and further issues are unlikely to surface in another trial. With the task being supervised, I'm approving this task.—CYBERPOWER (Around) 21:37, 12 August 2017 (UTC)
- The above discussion is preserved as an archive of the debate. Please do not modify it. To request review of this BRFA, please start a new section at WT:BRFA.