Wikipedia:Bots/Noticeboard

Bots noticeboard

This is a message board for coordinating and discussing bot-related issues on Wikipedia (also including other programs interacting with the MediaWiki software). Although this page is frequented mainly by bot owners, any user is welcome to leave a message or join the discussion here.

If you want to report an issue or bug with a specific bot, follow the steps outlined in WP:BOTISSUE first. This is not the place for requests for bot approvals or for requesting that tasks be done by a bot. General questions about the MediaWiki software (such as the use of templates, etc.) should be asked at Wikipedia:Village pump (technical).


Citation bot

Citation bot (BRFA · contribs · actions log · block log · flag log · user rights)


Citation Bot is in an interesting situation. It is a widely-used and useful bot, but it has one of the longest block logs for any recently-operating bot on Wikipedia. The current listed operator is Smith609, with Kaldari and AManWithNoPlan as additional maintainers. The bot is currently blocked by RexxS for Disruptive editing still removing links after request to stop. AManWithNoPlan opened an unblock request, which was later closed by Boing! said Zebedee, citing concerns that the bot was operating outside of its approval. There are several editors in the ensuing discussion that also feel the bot was operating outside of approval. Complicating the situation is the fact that Citation bot has had 9 BRFAs, three of which were under DOI bot. The last BRFA was in 2011, but the behavior of Citation bot has changed since then. Because of the significant confusion surrounding Citation bot's approved tasks, I am requesting that the BAG invalidate all previous approvals for Citation bot and require that a new BRFA be filed per WP:BOTAPPEAL. While this is a fairly drastic measure, I think a clear enumeration of the bot's tasks and their approval is the only way to prevent further problems. I also believe that Citation bot should continue operating and sincerely hope that it will. --AntiCompositeNumber (talk) 19:41, 27 June 2020 (UTC)

Commenting in my personal capacity: that one editor (RexxS) with an axe to grind refuses to recognize that Wikipedia:Bots/Requests for approval/DOI bot 2 (task 1 here, generalized from DOIs to other identifiers per consensus) is both valid and reflects consensus is not grounds to vacate its previous BRFAs. As for its tasks and details, they are already detailed here. Headbomb {t · c · p · b} 20:09, 27 June 2020 (UTC)
"after request to stop", I think less than 7 hours barely counts as "after", but that's just my opinion. I appreciate not being called an "operator": thank you. AManWithNoPlan (talk) 21:05, 27 June 2020 (UTC)
I think that the block log needs to be carefully examined. The bot used to have no tests. Now it has a huge test suite with code coverage at almost 100%. We even have tests that pull in the CS1/CS2 parameter lists and verify they have not changed. So, many of the blocks are for long-gone bugs. The bot is valuable enough that people endured some horrible GIGO bugs - now Headbomb complains that the bot failed on a page and I fix the page and not the bot. Another block was for someone weaponizing the bot - that was crazy - so now we verify users before letting them run it. So, the only blocks that count are the "i hate what you are doing" blocks. There was a block for adding citeseerx IDs to articles (which is ironic since part of the current block is annoyance at the bot removing copyright violating URLs). AManWithNoPlan (talk) 21:23, 27 June 2020 (UTC)
"I appreciate not being called an "operator": thank you" - so how would you describe yourself then? A bot runner? a bot maintainer?
"now we verify users before letting them run it" - who are the other members of "we"? Do you still maintain that editors who run the bot have no responsibility for the actions of the bot? Who does then? --RexxS (talk) 21:42, 27 June 2020 (UTC)
Good question. I accidentally personified the bot. "we verify" should be "the bot verifies". WMF and I worked a lot on that task, so it is easy for me to feel part of it. I fix bugs in the bot and work hard to find bugs. So, I would say I am a software engineer and on very rare occasions a bot runner. "operator" is a highly technical term, so that is Smith and only Smith. AManWithNoPlan (talk) 00:36, 28 June 2020 (UTC)
That's a fair answer. I must say though that I'm a little bit concerned about letting the bot decide who can run it. Is that in the list of its functions? As someone who has been writing computer programs for over 50 years now, and a former systems analyst, I'm fairly familiar with highly technical terms, but I don't think I've ever seen "operator" used in that way anywhere else. Normally I would consider that a person who assembles a list of tasks and then sets a system running to perform those tasks as the "operator" of that system, but I accept your idiosyncratic use of the word, and I'll do my best to refer to you as the bot runner in future. --RexxS (talk) 01:35, 28 June 2020 (UTC)
@Headbomb: I don't think that trying to peddle a lie about "one editor with an axe to grind" is going to excuse your behaviour. There are numerous editors who have told you the bot is operating outside its approval by removing links from the citation title, such as SandyGeorgia, Nemo bis, Levivich, Nick Thorne. The real problem, however, is that you have unilaterally decided that it's okay to remove links from citation titles whenever some identifier has a link; whereas there is a consensus expressed at Wikipedia:Village pump (proposals)/Archive 167 #Auto-linking titles in citations of works with free-to-read DOIs by over a dozen editors in favour of having links in the citation title regardless of what other identifiers may be linked. The consensus is strongest for retaining citation title links where they point to free-to-read full text, but there is still considerable support for having a link from the title even when the full text is not available without subscription.
You have decided that your views on those issues will take precedence and you have initiated bot runs to enforce your decision en masse, creating a fait accompli. This is against a background of editors such as MCB taking a complaint to Wikipedia:Administrators' noticeboard/Archive143 #DOI bot blocked for policy reconsideration "because it is implementing a major policy change in the way Wikipedia makes web references, without large-scale community consensus and buy-in." At that point, Fullstop stated "While I don't mind a bot adding document identifiers, this is the only function for which DOI bot was approved, and the bot is going beyond that mandate. By not restricting itself to only adding document identifiers (and doing so without barfing), it has revoked its approval."
You have also heard SandyGeorgia and HJ Mitchell tell you that the people running the bot are not responsive to concerns expressed by ordinary editors, especially those without considerable technical background.
You have assumed that you can simply generalise from DOIs to other identifiers at will, and claiming that the documentation at Template:Cite journal represents community consensus merely demonstrates how out-of-touch with ordinary editors you are.
  • There is incontrovertible consensus that the bot should not remove links to free-to-read full text from the citation title. There must be a solid guarantee that the bot will not do that.
  • There is less strong consensus that the bot should not remove links from the citation title to sources even if they are not free-to-read full text. Editors should be able to make a decision on that when editing an article without fear that a bot will overrule their decision. The bot must not substitute its judgement for that of article editors.
At present, I believe the bot's functionality will breach both of those considerations. If the operators want to have it operate with that functionality, they should submit their request to community scrutiny and abide by the decision.
I oppose granting authorisation to the bot for these User:Citation bot #Function proposed functions:
1 - This would unlink citation titles as described above.
3 - Removing so-called "redundant" parameters gives far too much leeway for abuse. A lack of definition of what is intended by "redundant" parameters gives rise to the present problems.
8 - standardising to the dominant cite format breaches WP:CITEVAR, and allows gaming.
--RexxS (talk) 21:36, 27 June 2020 (UTC)
@Headbomb: I noticed that you cited Wikipedia:Bots/Requests for approval/DOI bot 2 for approval to remove free URLs. However, there Smith609 said that "the one URL manipulation deemed okay" was replacing "'url=http://dx.doi.org/#' with 'doi=#'". Is there any subsequent consensus allowing any other "URL manipulation[s]", or is that still the only one "deemed okay"? If there is a consensus to allow another "URL manipulation", where is that discussion? Best, --Mdaniels5757 (talk) 21:44, 27 June 2020 (UTC)
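For illustration only, the single URL manipulation quoted above (replacing "url=http://dx.doi.org/#" with "doi=#") might be sketched roughly as follows. This is a hypothetical Python sketch, not Citation bot's actual code: the function name `url_to_doi`, the representation of template parameters as a dict, the acceptance of `https` as well as `http`, and the choice to leave conflicting identifiers alone are all assumptions made for the example.

```python
import re

# Hypothetical pattern: a |url= that is nothing more than a dx.doi.org resolver link.
# Accepting https as well as http is an assumption of this sketch.
_DX_DOI_URL = re.compile(r"^https?://dx\.doi\.org/(10\.\S+)$", re.IGNORECASE)


def url_to_doi(params):
    """Return a copy of params with a dx.doi.org |url= moved into |doi=.

    params is a dict of citation template parameters (assumed representation).
    Any other kind of URL is left exactly as it was.
    """
    result = dict(params)
    match = _DX_DOI_URL.match(result.get("url", "").strip())
    if match is None:
        return result  # not a DOI resolver URL: do not touch the citation

    doi = match.group(1)
    existing = result.get("doi") or ""
    if existing and existing.lower() != doi.lower():
        return result  # URL and |doi= disagree: leave it for a human editor

    result["doi"] = existing or doi
    del result["url"]
    return result
```

Under these assumptions, a citation whose |url= points anywhere other than dx.doi.org passes through unchanged, which is the narrow reading of "the one URL manipulation deemed okay".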
"The only url manipulation" is in the context of DOI bot mangling non-identifier URL based citations like this back in May 2008. It's not a general proscription against doing other identifier based cleanup in line with template documentation. Headbomb {t · c · p · b} 21:56, 27 June 2020 (UTC)
RexxS, there must have been some misunderstanding. I never stated that the bot operated outside its remit, only that I personally don't like the removal of pdfs.semanticscholar.org links to move them to an identifier. For that, I "blame" consensus at Help talk:Citation Style 1, which was gained to appease a single admin. For me this case is rather an example of the bot developers being too responsive to the requests of minorities of users, with the result that sometimes the overall operation of the bot appears less coherent. Nemo 06:51, 28 June 2020 (UTC)
@Headbomb: OK. Where is the discussion showing approval to do "other identifier based cleanup in line with template documentation"? Mdaniels5757 (talk) 18:10, 1 July 2020 (UTC)
This sort of ping should not be done, it is selectively canvassing certain users to a noticeboard. (t · c) buidhe 13:58, 4 July 2020 (UTC)
@Buidhe: When another editor makes a false claim about me: "one editor with an axe to grind", I am perfectly entitled to refute that lie by notifying the numerous other editors who have raised complaints. This notice-board is unlikely to be on most editors' watchlists and you should not be afraid of a broader spread of editors becoming aware of what is going on here. It is precisely because decisions are made here in a walled-garden environment that the bot's functionality has been allowed to creep beyond its remit over time. It's time to return it to doing what it has consensus and approval for. --RexxS (talk) 17:20, 4 July 2020 (UTC)
The correct way to do it is notify all editors who participated in a discussion, not just those who have expressed viewpoints sympathetic to yours. (t · c) buidhe 17:23, 4 July 2020 (UTC)
Nonsense. It's nothing to do with my viewpoint. It's simply a matter of demonstrating that many other editors have raised issues with this bot, and you know that. --RexxS (talk) 17:47, 4 July 2020 (UTC)
Other editors raised issues with S2CID conversions while mostly unaware that S2CID links were often problematic to begin with. S2CID (when redundant, but nonetheless free and with no copyright issues) conversions were halted while CS1/2 templates are updated to support |. Not url-redundancy elimination in general. Headbomb {t · c · p · b} 17:52, 4 July 2020 (UTC)
Let the other editors speak for themselves. You're well out of order projecting your perspective onto theirs, when you've demonstrated a profound inability to comprehend the nature of complaints raised. The problems that you brought on by extending unilaterally your approval to include Semantic Scholar links was just the trigger that highlighted the bot's removal of a huge number of links from the citation title, where there is obvious consensus for them to be retained. The useful part was the identification of copyvios, but that represents only a small fraction of the total of title links removed by this bot against consensus and without approval. --RexxS (talk) 19:07, 4 July 2020 (UTC)
You're the one abusing your position here. And the S2CID issue has been resolved for weeks now. Headbomb {t · c · p · b} 20:39, 4 July 2020 (UTC)
The only person abusing their position here is you. You've granted yourself permission to run the bot in order to remove links from citation titles with no reference to the community's express wish that it should not do so. --RexxS (talk) 20:49, 4 July 2020 (UTC)
I haven't granted myself shit. Headbomb {t · c · p · b} 20:53, 4 July 2020 (UTC)
Yeah, you have. You've steadily extended the remit of this bot from a single case of replacing the url parameter with a doi one to the situation where you think you can use it to remove the title link when any of an unspecified number of identifiers exist. There's no approval for that, and there's no consensus for that. It relies purely on your implicit endorsement as a BAG member. --RexxS (talk) 21:09, 4 July 2020 (UTC)
I haven't done anything of the sort. Headbomb {t · c · p · b} 14:34, 5 July 2020 (UTC)
  • I don’t frequent this board, but was pinged, and AntiCompositeNumber’s proposal seems to be the way to address the problem spots I have encountered ... I hope this can be addressed without personalization, and that the bot can continue running once the wrinkles are resolved. SandyGeorgia (Talk) 00:17, 28 June 2020 (UTC)
  • There seems to be broad consensus on the objective of linking open access copies from {{cite journal}} and making them as easy as possible to retrieve. However, actually doing so is hard work: here lies the value of Citation bot, which has helped countless users and edits in performing an otherwise very tedious task, even though it's impossible not to make a mistake here and there. I think there are two problems with the proposal to have a new BRFA now. 1) The current incident arises from a time mismatch: the citation bot was immediately adapted to new decisions on the citation templates, while the citation templates themselves are updated only at intervals of several months. The change which would fix this entire problem has been sitting at Module:Citation/CS1/sandbox for over a month despite being uncontroversial. Doing a BRFA now would be unproductive: if anything we need to wait for the templates to have stabilised and for the recent RfC to be fully implemented. 2) There is some confusion about the bot supposedly changing. My understanding (even though I was not following it in 2011) is that the bot has always been doing the same thing, but the templates have changed. Unless we want to establish a standard that a bot operating on templates needs a new BRFA every time those templates are changed by consensus, it's hard to understand what benefit there would be in a new BRFA. If the new BRFA were to decide that the bot can keep doing what it has been doing, and adapt to changing templates and consensus about their usage, we'd be back to the current situation. But if it doesn't, we'd have contradictory decisions: a consensus to do something but a decision to make it impossible to actually do it. Therefore, the best solution is generally to get a new, "better" or more precise consensus on what needs to be done about specific template parameters, in the appropriate venues. Nemo 06:51, 28 June 2020 (UTC)
  • As I said on Citation bot's page, I think that Citation bot needs to be back up and running soon. Tying it up in bureaucratic morass isn't doing us any good. I suggested that an old version of the bot be restored, if that is possible, one that doesn't implement the disputed functionality. Then the bot can be actually useful, while we iron out the kinks and authorization behind the scenes. @AManWithNoPlan: is that feasible, and would you be amenable to that? CaptainEek Edits Ho Cap'n! 19:22, 30 June 2020 (UTC)
    • The dispute is whether the bot should remove S2 URLs when adding S2CID parameters. I modified the bot so that these would only be removed if there is a PMC present - which makes sure that the title stays linked - or if linking to the URL violates the Wikipedia copyvio policy - the S2CID link would still exist, but the primacy of the bad link would be removed. I am surprised that part two got push back. Part one is guaranteed to be a free link, unlike S2 URLs. It is not hard to turn off the code for removing S2 URLs. — Preceding unsigned comment added by AManWithNoPlan (talk · contribs) 19:31, 30 June 2020 (UTC)
      • No, the dispute is that the bot removes links from citation titles. Period. It has no approval or consensus to do that (unless the link points to a copyright violation, which no-one is arguing about). --RexxS (talk) 23:31, 30 June 2020 (UTC)
        • It does, specifically in Wikipedia:Bots/Requests for approval/DOI bot 2 since 2008. As do other bots such as Wikipedia:Bots/Requests for approval/CitationCleanerBot. This is in line with template documentation, e.g. Template:Cite journal#Identifiers. Headbomb {t · c · p · b} 16:16, 1 July 2020 (UTC)
          DOI bot 2 only covers DOI. CitationCleanerBot 1 only covers JSTOR and was 9 years ago. The template instructions are not global consensus, and they still don't say to remove the URL, in fact, the template instructions say "The |url= parameter or title link can then be used for its prime purpose of providing a convenience link to an open access copy (as in, at least accessible to everyone for free) which would not otherwise be obviously accessible". Levivich[dubious – discuss] 17:46, 1 July 2020 (UTC)
          • There is nothing special for DOI vs JSTOR vs PMID vs PMCID vs OSTI. CitationCleanerBot has explicit approval for all of these. CitationBot has implicit approval for all of them too. For the discussion pages, see the various Help:CS1 archives. Headbomb {t · c · p · b} 21:34, 1 July 2020 (UTC)
  • As I also said on CitationBot's page, I agree with CaptainEek and am in favor of whatever can make citationbot usable again soonest. The changes in the unlock request sound fine to me. I think y'all are making mountains out of molehills here. (Since it was mentioned I looked at the block log and I am surprised it was so short for such a widely used bot.) If you guys want to figure out what the consensus is for the minor details of what it does that's fine, as long as you leave the bot running while you do. Iamnotabunny (talk) 15:57, 1 July 2020 (UTC)
  • Sorry, I have no experience with bot approvals, but I have three questions:
    1. Don't we have a rule that says every bot has to have an operator? Who is the operator for Citation Bot? It seems the listed operator is not active, AMWNP is a "maintainer" not an operator, and HB is neither. Which human being is responsible for this bot?
    2. Once we figure out #1, that person needs to answer whether or not they are willing to modify the code to stop the bot from removing the |url= parameter from citations. It's kind of a yes or no.
      • If the answer is yes, then that can be done, and then the bot can be unblocked, and conversations about the |url= parameter can continue (probably necessitating an RFC).
      • If the answer is no, then the bot stays blocked, and we move right to the RFC.
    3. If there is no operator, that seems to be like the first priority. How do we have a bot running with no operator? Levivich[dubious – discuss] 17:32, 1 July 2020 (UTC)
      @Levivich: The human responsible for the bot is (sysop) Smith609, whose last edit was June 10. --Mdaniels5757 (talk) 18:17, 1 July 2020 (UTC)
      AManWithNoPlan is one of the two maintainers designated by Smith609 to act on their behalf. Headbomb {t · c · p · b} 21:36, 1 July 2020 (UTC)
      Above, AManWithNoPlan said I appreciate not being called an "operator": thank you ... I fix bugs in the bot and work hard to find bugs. So, I would say I am a software engineer and on very rare occasions a bot runner. "operator" is a highly technical term, so that is Smith and only Smith. I don't want to speak for AMWNP or put words in his mouth, but it does not sound to me like he is acting on behalf of Smith as operator. In my opinion, an active editor needs to take responsibility for the bot's operation before the bot can be unblocked. Any bot. Someone needs to answer question #2; I'm not sure if it's fair to put it on the shoulders of a volunteer who may not want to assume that responsibility (and should not be forced to). Levivich[dubious – discuss] 21:57, 1 July 2020 (UTC)
      Mdaniels5757, yeah but hasn't opined on this anywhere AFAICT, even though the ANI thread was opened on June 7 and the bot was blocked on June 8. So... as of right now we have a bot with an inactive operator? I mean, three weeks isn't that long of a time, especially given what's been going on in the world, but I don't see how we unblock a bot with no active operator, at least until the operator returns or someone else steps up to be the operator even if only temporarily. Levivich[dubious – discuss] 18:28, 1 July 2020 (UTC)
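
For reference, the modified S2 URL handling that AManWithNoPlan describes earlier in this thread (remove the Semantic Scholar URL only if a PMC is present, or if linking to the URL would violate the copyvio policy) could be read as the decision rule below. This is a hedged Python sketch of that description, not the bot's real code: the function name `may_remove_s2_url`, the dict of template parameters, and the caller-supplied `url_is_copyvio` flag are all hypothetical, and how a copyright violation would actually be detected is outside the sketch.

```python
def may_remove_s2_url(params, url_is_copyvio):
    """Return True if the Semantic Scholar |url= may be dropped once |s2cid= is set.

    params: dict of citation template parameters (assumed representation).
    url_is_copyvio: flag supplied by the caller; detection is not modelled here.
    """
    if not params.get("s2cid"):
        return False  # no S2CID in the citation, so nothing stands in for the URL
    if url_is_copyvio:
        return True  # the S2CID link stays, only the problematic title link goes
    # Otherwise remove only when a PMC is present, so a free full-text link remains.
    return bool(params.get("pmc"))
```

Written this way, a citation with an S2CID but no PMC and no copyright problem keeps its URL, which is the case the objections in this thread are about.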

I believe Smith is currently quite busy in the real world and, since you can still run the bot in tool mode, he's taking care of what really matters. As for operator, I am authorized to deal with bot related issues on his behalf as long as it is not controversial, but I do not have password access to the account, which is why I avoid the term operator. People ask me to "shut the bot down" and I cannot do that. AManWithNoPlan (talk) 22:03, 1 July 2020 (UTC)

Replacing URLs with parameters has been an essential part of this bot's operation since as far back as I can remember. There are a lot of different URLs that should be listed as parameters. S2CID is just the latest one. But there are many more. DOI, PMID, PMC, JSTOR, CiteSeerX, HDL, arXiv, etc.  — Chris Capoccia 💬 16:04, 4 July 2020 (UTC)

  • So....CitationBot is just dead in the water until we get Smith609 back in action? Is there no other way we can compromise or get some sort of interim solution? CitationBot has been blocked almost a month now. CaptainEek Edits Ho Cap'n! 18:56, 4 July 2020 (UTC)
actually, i think it's worse than that. i think we're actually waiting on CS1 group to roll out the programming with all the parameters like doi-access=free that links the title. because the people who are blocking the bot have this idea that titles must be linked. all the regular citation bot users don't care about linking the titles and are happy having a well-formatted citation with 5 or 6 parameters.  — Chris Capoccia 💬 19:57, 4 July 2020 (UTC)
"all the regular citation bot users don't care about linking the titles" - and that's the problem in a nutshell. None of the regular bot users are interested in the concerns raised and the consensus established about the linking by the majority of editors who do care about linking citation titles. Much as I'd like to see the bot operating again within its agreed parameters, I can't accept that it should be free to continue to remove title links whenever it adds another parameter. I find it astonishing that stopping it from unlinking the titles hasn't been done as an interim solution while we all wait for the folks at CS1 to make changes to the citation code that would re-establish the links automatically. --RexxS (talk) 20:27, 4 July 2020 (UTC)

"all the regular citation bot users don't care about linking the titles" Nonsense. I've pushed for autolinking freely-available titles for years now. Most of us have. What we're against is pointless duplication of information and the hijacking of |url= to give redundant links covered by identifiers like JSTOR, PMID, PMC, etc... Headbomb {t · c · p · b} 20:45, 4 July 2020 (UTC)

I doubt Chris Capoccia will be happy with you calling his good-faith opinion "nonsense". You're going to have to accept that having a link on a citation title is not "pointless duplication of information" and the link provided on the citation title is not redundant to what is "covered by identifiers like JSTOR, PMID, PMC, etc." If the only way that can be done at present is through using |url=, then you need to allow editors to do that, rather than insisting that you know best and using a bot to enforce your view. --RexxS (talk) 20:58, 4 July 2020 (UTC)
i don't care one way or the other whether titles are linked. i do care about having a properly formatted citation and i do care about having parameters. it's completely stupid all the URLs that could be a parameter or all the people putting URLs that only work from their university because they like URLs instead of actually a parameter that could work for more people. URLs will break. Intention of a parameter is to be more durable, although you have to be real that some DOIs don't actually work. whether CS1 interprets several free parameters and adds a title URL is irrelevant to me.  — Chris Capoccia 💬 22:25, 4 July 2020 (UTC)
i also don't want URLs manually added that are going to some version of the same thing as a parameter. Way too many people putting in URLs of the whole link to PMC or URLs to the full PNAS that is the same as clicking the DOI and then clicking full.  — Chris Capoccia 💬 22:28, 4 July 2020 (UTC)
"i don't care one way or the other whether titles are linked." - but a lot of other editors do care, because that's what the readers see. Having a proper link on the title is not mutually exclusive with having identifiers linked. Most readers simply expect to be able to follow a link from the title and they have no idea what the identifiers are. That's the functionality that's relevant to me. I don't want a bot deciding for me (or for any other content editor) that the title shall not be linked. That's a decision for an editor, not for a bot. --RexxS (talk) 22:47, 4 July 2020 (UTC)
in my mind, this is the same question as whether books should use ISBNs or URLs. ISBNs go to Book Sources and are greatly preferred over URLs. If people can figure out how to click on ISBN, they can figure out how to click on other things. It's not that hard.  — Chris Capoccia 💬 23:31, 4 July 2020 (UTC)

Chris, Headbomb, I read above that you don't think titles should be linked. I do. How about we have an RFC to see if consensus is with you, or with me? Levivich[dubious – discuss] 00:13, 5 July 2020 (UTC)

if CS1 can parse parameters and create a title link, I don't care one way or the other on that. what i don't support is people manually adding URLs that are pretty much the same thing as something that should be a parameter in order to force title linking.  — Chris Capoccia 💬 00:50, 5 July 2020 (UTC)
Yes, I read it the first time :-) How about an RFC to see if consensus is with you -- about having a URL when there is already a parameter -- or not? Levivich[dubious – discuss] 01:26, 5 July 2020 (UTC)
sure whatever. although supposedly there was already something like that that started this situation.  — Chris Capoccia 💬 03:19, 5 July 2020 (UTC)
Wikipedia:Village pump (proposals)/Archive 167#Auto-linking titles in citations of works with free-to-read DOIs? I guess that leaves deciding the venue and neutral statement? Levivich[dubious – discuss] 04:23, 5 July 2020 (UTC)
Title should not be linked via |url= when those |url= are redundant with identifiers because those URL take the place of free versions (which may or may not exist). We already had those RFCs and multiple discussions on Help talk:CS1 and elsewhere, the consensus being documented at, e.g. Template:Cite journal#Identifiers (quoting "When an URL is equivalent to the link produced by the corresponding identifier (such as a DOI), don't add it to any URL parameter but use the appropriate identifier parameter, which is more stable and may allow to specify the access status. The |url= parameter or title link can then be used for its prime purpose of providing a convenience link to an open access copy (as in, at least accessible to everyone for free) which would not otherwise be obviously accessible."). When those identifiers have free full versions (e.g. when |doi-access=free is set), then the title should be linked if no other |url= is provided. And when the next template update rolls around, this will happen automatically. There's no point in having |url=https://worlcat.org/0123456 instead of/along with |oclc=0123456. Book of Stuff OCLC 0123456 makes it clear you will end up at the OCLC website when you click on the OCLC link. You'll have no idea where you'll land when you click on the first link in "Book of Stuff OCLC 0123456". Headbomb {t · c · p · b} 14:43, 5 July 2020 (UTC)
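The title-linking rule as Headbomb states it above (use |url= when one is provided, otherwise link the title automatically when |doi-access=free is set) can be sketched as a small function. This is only an illustration of the rule as described in this comment, not the code in Module:Citation/CS1: the function name `title_link` and the dict of parameters are hypothetical, and only the DOI case is modelled, although the comment suggests other identifiers with free full versions would be treated similarly.

```python
def title_link(params):
    """Pick the target for the citation title link, or None for an unlinked title.

    params is a dict of citation template parameters (assumed representation).
    """
    url = params.get("url", "").strip()
    if url:
        return url  # an explicit |url= always takes precedence

    doi = params.get("doi", "").strip()
    if doi and params.get("doi-access", "").strip() == "free":
        # No |url= given, but the DOI is marked free to read: link the title to it.
        return "https://doi.org/" + doi

    return None  # the identifiers are still shown, but the title is not linked
```

The third branch is the point in dispute in this thread: under this rule a citation with only a non-free identifier ends up with an unlinked title.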
Same response I gave to Chris above: Yes, I read it the first time. I understand your position, as I'm sure by now you understand mine. I'm also sure you'll agree with me that the best venue for an RFC is not here, in this discussion, so there's not much point to discussing the merits of the url parameter any further here.
Have you any thoughts as to what the best venue is, and what the neutral RFC statement should be? Levivich[dubious – discuss] 18:55, 5 July 2020 (UTC)
Levivich, I'm thinking WP:VPT, or maaaybe WP:VPR. CaptainEek Edits Ho Cap'n! 19:16, 5 July 2020 (UTC)

It could also just be part of a BRFA. CaptainEek Edits Ho Cap'n! 19:28, 5 July 2020 (UTC)

@Levivich: the recent RfC on auto-linking was at Wikipedia:Village pump (proposals) and that was well-attended, but VPP may be better if the question is phrased as a policy/guideline. I'm interested in two issues concerning linking of citation titles: (1) linking of titles to a full, free text; and (2) linking of titles to the best online source available. I appreciate that having two questions would risk confusion, so perhaps addressing the first point is necessary before considering the second. --RexxS (talk) 19:37, 5 July 2020 (UTC)
@RexxS, Levivich, Headbomb, and AManWithNoPlan:, How does this sound for a neutral RfC statement: "The recent block of CitationBot has raised concerns over how we link to citations. How should the titles of citations link to sources? How would that change if a distinct identifier (such as a DOI, PMC, PMID, etc.) is also provided in the source?"
I'll be honest, this issue is pretty complicated and I don't think I entirely get it, so feedback is welcome. But someone had to do something, this has just sat for 2 weeks with nothing happening. CaptainEek Edits Ho Cap'n! 19:51, 20 July 2020 (UTC)
@CaptainEek: I'd prefer to keep an RfC to a simple question like "Should the title in a citation be linked to the best online source whenever available?". You could use a Background section to provide secondary information, such as the bot's activities and our prohibition on linking to copyright violations. If that has consensus, as I maintain it has, then the consequences should be easy to work out. --RexxS (talk) 20:04, 20 July 2020 (UTC)
Sorry for being out of the loop but I have my day job, covid-19 keeping us insane, teaching a college class temporarily, and working real hard at adding lots of testing to the Citation Bot. The Bot is now much stronger. AManWithNoPlan (talk) 21:27, 20 July 2020 (UTC)
@CaptainEek: RexxS's wording is awful, because it implies links to non-free identifiers are "the best online source available". What the title should link to is free versions of record when available. That's the consensus, and it will not change following an RFC. What it shouldn't be linked to are paywalled/database links (e.g. PMID 123465) redundant to identifiers which usurp free links to versions of records. But if you want to create another RFC on the issue, your wording is fine. Headbomb {t · c · p · b} 22:18, 20 July 2020 (UTC)
Yet another attack from Headbomb. The question I suggested speaks to a fundamental principle that he is frightened of. There is already a consensus that free versions should be linked from the citation title – although Citation bot failed to respect that, and should not be editing until it does. If no free version of a source exists, I believe that readers still want the best available online source to be linked from the citation title. An RfC will confirm that and that is why Headbomb doesn't want an RfC on the issue. Headbomb is quite wrong to think that he can ignore a consensus established by an RfC and he will do so at his peril. --RexxS (talk) 18:54, 21 July 2020 (UTC)
CaptainEek, thanks for moving the ball forward. Seems to me there are three basic issues:
  1. should the titles in citations be linked, and if so, to what (in other words, what should be in |url=)
  2. should the titles in citations be linked if that link is a duplicate of an identifier (DOI, PMC, PMID, etc.) (in other words, should we have |url= even when it is duplicative of, e.g., |doi=?)
  3. under what circumstances should a title link be removed from a citation, by human or by bot (in other words, when should |url= be blanked) Levivich[dubiousdiscuss] 22:46, 20 July 2020 (UTC)

It should also connect this to Template:Cite journal#identifiers rather than User:Citation bot. Headbomb {t · c · p · b} 23:09, 20 July 2020 (UTC)

Block this bot until somebody fixes it. The bot is removing urls from titles if the cited article has a doi. This is not good. Many users don't bother to click on the doi, as they don't know what it is.
Some urls in titles link to articles that users can access for free, while the doi's link is not free. Therefore, the bot is removing other editors' work and is preventing users from freely accessing information. If nobody corrects this quickly, I suggest that editors remove doi's from citations that have direct links to titles. That will provide a work-around that will stop this bot. Corker1 (talk) 20:35, 29 July 2020 (UTC)
That won't do anything at all; Citation bot will still convert parameter URLs to use their dedicated parameters. If the DOI is freely accessible, then add |. Headbomb {t · c · p · b} 09:47, 30 July 2020 (UTC)
@Headbomb, RexxS, AManWithNoPlan, and Chris Capoccia: I have created an RfC at VPR with Levivich's questions: [1] CaptainEek Edits Ho Cap'n! 00:32, 5 August 2020 (UTC)

Block of EmausBot[edit]

I just came across User talk:Emaus#Bot fix of redirect with possibilities after noticing that EmausBot had been blocked some time ago (indefinitely since 12 May 2020). WT79 and admin Samsara each reverted perfectly valid and what should have been totally uncontroversial technical edits by EmausBot because of their confusion over the purpose and application of the problematic {{R avoided double redirect}}, which seems to encourage the creation of double redirects rather than actual avoidance of them. {{R avoided double redirect}} is a relatively new redirect template that was created just over five years ago. Here's my explanation of its use, citing an example.

Doctor Marvin Monroe (The Simpsons) redirects to List of recurring The Simpsons characters#Dr. Marvin Monroe. That redirect is tagged with {{R avoided double redirect|Marvin Monroe}} because it "should" be a redirect to Marvin Monroe but that too is a redirect to List of recurring The Simpsons characters#Dr. Marvin Monroe. It's not currently tagged with {{R with possibilities}} but it could be... indeed the {{R avoided double redirect}} implies that it should be, because the whole point of {{R avoided double redirect}} is that Marvin Monroe has genuine possibilities, indeed the expectation, that some day there will be a standalone article about the notable fictional character "Marvin Monroe". And that when that day arrives it would be an error demanding prompt attention if Doctor Marvin Monroe (The Simpsons) continued to redirect to List of recurring The Simpsons characters#Dr. Marvin Monroe rather than directly targeting the new standalone article. Never mind that "Marvin Monroe" has history indicating the current consensus that this is not an {{R with possibilities}}! We still need to manage those "avoided double redirects" – even if it means getting into edit wars with previously uncontroversial bots.

Samsara blocked this bot under the rationale that all bots are required to inspect the edit history of a page before they edit it, or maintain a private database of all of their past edits, and to not make the edit a second time if they have previously made the same edit and been reverted by a human – even when the human edit is counter to policy and the bot's edit was correct. I see nothing in Wikipedia:Bot policy#Bot requirements that specifically addresses this, nor any demonstration that the bot has actually done any harm. Emaus hasn't been particularly active here lately, but I'm hoping that if the bot is unblocked it will resume editing. – wbm1058 (talk) 19:06, 14 July 2020 (UTC)

"that all bots are required to inspect the edit history of a page before they edit it" That is completely untrue. There may be a rationale for the bot to be blocked, but this isn't one, especially if the bot is WP:NOBOTS-compliant. Headbomb {t · c · p · b} 19:22, 14 July 2020 (UTC)
I am tempted to unblock here. "I want a double redirect" isn't a valid reason to block a bot --Guerillero | Parlez Moi 19:25, 14 July 2020 (UTC)
@Guerillero: Indeed it is not. WP:DOUBLEREDIRECT is quite clear that these are not acceptable. Headbomb {t · c · p · b} 19:27, 14 July 2020 (UTC)
I have notified Samsara of this discussion (don't know if they receive pings). Primefac (talk) 20:23, 14 July 2020 (UTC)
Sorry, I didn't quite understand the policy and posted a comment on the talk page first; I dropped out of the discussion after not very long, before the bot was blocked. WT79 (speak to me | editing patterns | what I been doing) 20:50, 14 July 2020 (UTC)
@Guerillero: looking at User talk:Emaus #Bot fix of redirect with possibilities, you'll find that Samsara specifically stated "The block, however, is for not respecting revert actions, which should always trigger desisting or review. So when that is addressed, I will unblock". I don't agree with you that "wanting a double redirect" was the reason for the block, and there is confirmation of that in the bot's block log. --RexxS (talk) 21:17, 14 July 2020 (UTC)
He was reverting to restore an improper double redirect. --Guerillero | Parlez Moi 21:20, 14 July 2020 (UTC)
Quite likely, but that's not the point. No bot should edit-war with editors, regardless of which one is right. You can explain to an editor if they make a mistake, but you have no similar recourse when the bot makes a mistake, which is why we don't expect it to be edit-warring. --RexxS (talk) 21:43, 14 July 2020 (UTC)
I don't think there's many bots that have that level of intelligence, nor is it required by policy to my knowledge. If someone wants to keep a page in a state that triggers a maintenance task, the onus probably rests with them to use the nobots template properly to ward off any bots authorized to perform that maintenance. I support unblocking. –xenotalk 23:24, 14 July 2020 (UTC)
So there are a couple of items I'd like to be sure are clear: (1) Is this bot making edits not approved by its BRFA, or is this a new additional concern? (2) What is the response of the operator, who is ultimately responsible for every edit made by the bot. — xaosflux Talk 22:04, 14 July 2020 (UTC)
Perhaps add: (3) If Samsara were to accept that the bot was correct in making the first edit, would they now be willing to unblock? --RexxS (talk) 22:13, 14 July 2020 (UTC)
To answer the first two questions, on the two pages in question (1, 2) the bot was operating within its parameters; Emaus replied in a manner indicating that. As far as the third question goes, that point is moot (see my reply below). Primefac (talk) 23:46, 14 July 2020 (UTC)
The applicable BRFA is Wikipedia:Bots/Requests for approval/EmausBot 2 – which was approved 9½ years ago, so it seems odd that it would just now have problems after all these years, no?
The BRFA asserts that it uses mw:Manual:Pywikibot/redirect.py – is that not bot-compliant? wbm1058 (talk) 22:20, 14 July 2020 (UTC)
@Wbm1058: the software a bot uses is useful to know, but doesn't make it compliant or not compliant - what does is (a) is it following the scope of the approved task and (b) have there been any community standards since enacted that would require it to reduce or cease the task? — xaosflux Talk 22:59, 14 July 2020 (UTC)
As far as "exclusion compliance" that this bot purports to be, if the bot was reverted - and an exclusion was asserted (e.g. {{nobots}}) that this bot is not complying with - that is a blockable malfunction. — xaosflux Talk 23:00, 14 July 2020 (UTC)
Sure, but the relevant edit history clearly shows use of WP:Rollback to revert the bot. Hard to use {{nobots}} kryptonite to thwart a bot when you're pressing that button. wbm1058 (talk) 23:13, 14 July 2020 (UTC)
Pywikibot's default scripts should be nobots compliant by default. Legoktm (talk) 01:21, 15 July 2020 (UTC)
While I am still interested in hearing Samsara's thoughts on the matter, the bot is operating within the scope of the relevant BRFA. There is no requirement that I (or seemingly any other BAG that has commented here) know of that requires a bot to keep track of the edit history of a page, and if a page is supposed to be an exception to whatever rule a bot is fixing, {{nobots}} should be used to indicate that - NOT edit warring with the bot or blocking it. That being said, I have unblocked the bot. Primefac (talk) 23:42, 14 July 2020 (UTC)
Really what should be done is gain consensus that double-redirects are acceptable to begin with at WP:DOUBLEREDIRECT. Because so far, they aren't. Headbomb {t · c · p · b} 00:02, 15 July 2020 (UTC)
Primefac, I support this action with my BAG hat on. —CYBERPOWER (Message) 02:22, 15 July 2020 (UTC)
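
For readers unfamiliar with the exclusion mechanism referred to in this thread, the following is a minimal sketch in Python of how a bot might check a page for {{nobots}} or {{bots}} before editing. It assumes a simplified reading of the template parameters (allow and deny as comma-separated lists of bot names, plus the keywords all and none). The function name allowed_to_edit is hypothetical, and this is not Pywikibot's actual implementation.

```python
import re

# Matches {{nobots}} or {{bots|...}} and captures the name and any parameters.
_TEMPLATE = re.compile(r"\{\{\s*(nobots|bots)\s*(?:\|([^{}]*))?\}\}", re.IGNORECASE)


def _names(value):
    """Split a comma-separated parameter value into lowercase names."""
    return {name.strip().lower() for name in value.split(",") if name.strip()}


def allowed_to_edit(wikitext, bot_name):
    """Return False if an exclusion template on the page bars bot_name.

    Hypothetical helper: a simplified sketch of exclusion compliance.
    """
    bot = bot_name.strip().lower()
    for match in _TEMPLATE.finditer(wikitext):
        template = match.group(1).lower()
        params = {}
        for part in (match.group(2) or "").split("|"):
            key, sep, value = part.partition("=")
            if sep:
                params[key.strip().lower()] = _names(value)
        if template == "nobots" and not params:
            return False  # a bare {{nobots}} bars every bot
        if "allow" in params:
            allowed = params["allow"]
            return "all" in allowed or bot in allowed
        if "deny" in params:
            denied = params["deny"]
            if "none" in denied:
                return True
            return not ("all" in denied or bot in denied)
    return True  # no exclusion template found, so editing is permitted
```

A bot that calls a check like this before saving would skip a page carrying {{bots|deny=EmausBot}} while other bots remain free to edit it. Nothing in the sketch looks at the page history, which is the separate behaviour debated above.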

VPPOL discussion closed: linking by InternetArchiveBot[edit]

I just closed a well-attended discussion at WP:VPPOL, "Stop InternetArchiveBot from linking books", with the conclusion that its continued adding of links to Internet Archive is controversial and does not have consensus support. I assume that therefore approval for this bot task should be revoked. DMacks (talk) 16:15, 15 July 2020 (UTC)

DMacks, and I have to challenge this. The bot was approved based on a proposal to implement the bot. The proposal had unanimous support from the community, and this proposal is linked in the BRFA approving it. By your own words there is no consensus, which does not translate to overturn approval of the bot. —CYBERPOWER (Around) 16:47, 15 July 2020 (UTC)
  • Statement from Cyberpower678: I would ask that BAG not rescind the approval of InternetArchiveBot 3. I am confused how the most recent discussion which had no consensus would overturn the original proposal and bot approval which both had unanimous consensus (proposal; BRFA). There were some concerns raised that a pending lawsuit should change our approach; however, the Internet Archive is functioning no differently today than it was when the bot was approved (their National Emergency Library project was shut down on June 16). Though I began this project as an unpaid volunteer, it's open knowledge that Internet Archive started paying me to work on this project due to its scale and complexity. I am currently paid by Internet Archive to improve Wikipedia, and I have been public about that. Since I started working with the Archive we have rescued over 10 million dead links and added hundreds of thousands of links to books. Our editors and readers benefit from these links with limited access to digitized books, as they would from any library. I accept that the implementation of the book-linking bot task has not been perfect, and I would sincerely like the opportunity to improve it. I suspended the bot's book linking task as of June 14 until we can resolve any outstanding issues, but I politely disagree that the bot's approval should be entirely removed. —CYBERPOWER (Chat) 20:00, 15 July 2020 (UTC)
    • The statement that "the bot should be stopped" seems inaccurate; I presume DMacks was referring to only the book-linking task without considering that the bot also does other tasks. I certainly wouldn't go any further than stopping only that task based on that discussion.
      As for whether even that task should be stopped, I'm not sure. The discussion was created by someone with an admitted WP:COI, and he seemed to WP:BLUDGEON the discussion throughout. It would be interesting to have a better RFC, specifically and neutrally addressing all of the issues people had raised, to evaluate whether consensus actually has changed. But coming so soon after the previous discussion, we might wind up with people just re-arguing that instead. Anomie 21:01, 15 July 2020 (UTC)
      • The statement seems accurate to me, having read the WP:VPPOL discussion. If you are dissatisfied with the close "Although the question here is "should the bot stop", the real idea I see is that it is controversial and that there is no longer consensus that it should run (wider discussion superceding WP:BRFA). Therefore the bot should be stopped but its existing edits can stand and no prejudice against future manual additions or removals of IA links by uninvolved editors.", then you should challenge it at WP:AN. If you want clarification from DMacks, then you should ask them. This isn't a competent location for deciding either of those issues.
        @Cyberpower678: I've followed the links given at User:InternetArchiveBot for the BRFAs for the bot and I can't find the approval for the task of replacing Google links with IA links. Can you point me to where the approval was granted (and perhaps fix the links on the user page)? --RexxS (talk) 21:13, 15 July 2020 (UTC)
        RexxS, we were only replacing Google Books links in rare cases and for specific reasons: 1) if the existing Google link was dead; 2) if the Archive had a complete and free full-page view whereas Google offered only a few-sentences snippet; 3) if a public domain book was available on both websites (using a nonprofit over a for-profit site per WP:AFFILIATE). Any of those behaviors can be changed to meet community approval. —CYBERPOWER (Chat) 21:57, 15 July 2020 (UTC)
        RexxS, we thought those bot behaviors were consistent with the community approval, and with policy. If I assumed too much, I take responsibility for it and will tailor the bot to the community's preference in any/all of those circumstances. —CYBERPOWER (Chat) 22:03, 15 July 2020 (UTC)
        RexxS Please point to anywhere in the discussion where there was any mention of (much less support for) stopping the tasks described at Wikipedia:Bots/Requests for approval/InternetArchiveBot or Wikipedia:Bots/Requests for approval/InternetArchiveBot 2. Anomie 02:23, 16 July 2020 (UTC)
        Bot approval requires support *for* a task, it doesn't require support to *not* do a task after someone comes along afterwards and tacks on something that isn't on the initial approval. Since the specific issue of the linking in the RFC was not in the approval, and Cyberpower admits those behaviours were assumed could be added, and it turns out there is clearly no consensus for them to be performed, the bot does not have approval to perform the specific tasks that were the subject of the RFC. Only in death does duty end (talk) 08:31, 16 July 2020 (UTC)
        the specific tasks that were the subject of the RFC is exactly my point in the post you replied to. I presumed and DMacks confirmed below that "the bot" was something of a metonymy for "the task". I have no idea why RexxS was promoting an obvious error.
        But you also seem to be missing the fact that there was significant support for the task in question when it was originally proposed and specifically approved; unfortunately the latter is not linked from User:InternetArchiveBot, as that page seems to have not been updated for the additional approval. Anomie 13:03, 16 July 2020 (UTC)
        @Anomie: Please point to anywhere in the close where there was any mention of excluding those tasks. I hope you can see how nonsensical your request was. Now that DMacks has clarified the close, it should render those sort of non-questions irrelevant. The only obvious error was in not linking the BRFA for the task in question from the bot's page. I thought that was a requirement? If it isn't, it damn well should be. --RexxS (talk) 15:57, 16 July 2020 (UTC)
        I'm going to stop replying to you now on the topic of the lack of knowledge of the technical details causing poor wording in the close. You're welcome to continue to believe any wrong thing you want, but I'm not going to waste my time with your refusal to get the point.
        As for your other question, WP:BOTREQUIRE does say that the bot's user page should include or link to descriptions of the bot's "task (or tasks)" and various other information that is already included in the BRFAs. It looks like Cyberpower678 has now made that addition for IABot (note the linked page is transcluded into User:InternetArchiveBot). Anomie 21:10, 16 July 2020 (UTC)
        And I'm going to continue to call you out every time you criticise a good-faith close of an RFC by an uninvolved admin, based on your dislike of the outcome. The question was one of whether links should be made to sites that potentially contained copyright-violating material, and I reject your assertion that an admin needs "knowledge of the technical details" to adjudicate on that. You didn't like the outcome, we get it, but you don't have any consensus to do otherwise than to accept it or ask for clarification of it. --RexxS (talk) 22:05, 16 July 2020 (UTC)
  • I am Mark Graham. I manage the Wayback Machine at the Internet Archive, as well as our efforts to help make Wikipedia sites more useful and reliable by fixing broken links and adding links to various resources available from archive.org.
    We love working with the global Wikimedia community to collaborate on shared solutions for reliability and verifiability. For the last 5 years InternetArchiveBot has been linking to archived snapshots of web pages that no longer function or will soon not function. Our project to link to digitized versions of books makes Wikipedia's references more accessible for readers.
    We have no profit motive here. We don't have ads and don't gain revenue from increased traffic. Like Wikipedia, we work to keep our servers up and running, not to make money.
    There was a concern raised that we somehow profit from Better World Books. Better World Books is owned by Better World Libraries, a nonprofit.
    We take the wishes of the Wikipedia community very seriously, and we would very much like to continue helping the community with its inspiring mission. Markjgraham hmb (talk) 21:19, 15 July 2020 (UTC)
  • (edit conflict) I didn't participate in the original discussion, and, having just read through it, I don't see consensus to overturn the previous consensus and BAG approval. Supporters argued to discontinue linking for two main reasons (1) potential copyvio and (2) moral reasons. The first point was pretty soundly refuted by the opposes and even some supporters (like Masem) so I don't see any consensus that BAG approval should be revoked for copyvio reasons. The other main argument is that it has the potential to harm downstream users and so we should avoid using the links on principle. This was a minority position among the supporters, and even if we assign them full weight, there still wouldn't be a consensus to overturn the previous consensus. So despite the close, I don't think this request should be acted on without further discussion. Whether that be here or at WP:AN I don't really care, but I don't think this is a rubber stamp request. Wug·a·po·des 21:36, 15 July 2020 (UTC)
  • I am inclined to agree with Anomie and Wugapodes. @Markjgraham hmb: thank you for clarifying comments here, and for your thoughtful support over the years. Wikipedia is certainly better for the thorough links to archived snapshots that have become uniform through such work - a combination of two of the oases that make the Internet not suck. Cyberpower, I appreciate your openness to feedback -- can you share a few examples of replacing G!Books links? I haven't come across this myself but saw it mentioned by a few different people. – SJ + 01:51, 16 July 2020 (UTC)
    • Replying as I wrote the GB conversion side of the bot. Google diffs: diff 1, diff 2, diff 3, diff 4. There were also some PD conversions such as diff per WP:AFFILIATE which recommends converting for-profit to non-profit. There are no immediate plans to convert more Google, because there are no more to convert. The conversion mostly finished in March. The conversion rate was less than 10%. -- GreenC 03:10, 16 July 2020 (UTC)
  • I've been asked to clarify several issues of my close. First, I'm not a BAG person, so I don't know the technical details of what is a "bot" vs "one bot-task among several that a bot runs". My close is narrowly focused on whatever you call the process(es) that add the IA links (de novo or as conversion from some other archive link) because that's the only issue that was raised in the discussion (not a rejection of the bot-operator or whatever else they may do). Second, the original approval discussion that was mentioned in the VPPOL discussion was not well-attended (and although the closer claimed strong support, there was some dissent). It's BAG's prerogative to decide what's "good enough" when they are the sole site of discussion. But more significantly, the discussion was limited and the closer of the botreq noted that the approval queue was backlogged and the task seemed safe/noncontroversial enough so "why not?". This new discussion was substantially more attended, in a more public place (lots of WP-sitewide folks rather than only those who choose to know about approval of upcoming bot tasks), and demonstrated that the task is definitely not non-controversial. So that's why I closed it as a case of the original WP:CONLIMITED (approved) being replaced by the new no-consensus (==not approved). Another admin pointed me to WP:BOTREQUIRE #4. I did not take the strongest interpretation/"letter of the law" of WP:NOCONSENSUS regarding addition of external links to unwind the previous edits because I did not see that point made strongly in the discussion (it no longer has consensus). DMacks (talk) 04:53, 16 July 2020 (UTC)
    • Thanks for the reply DMacks, that makes sense, especially the reference to BOTREQ#4. Would you be able to be a bit more precise about what you think is and is not appropriate given the discussion? Given your close, it seems that IA bot linking to any IA page should not be done, but if it's more narrow than that, it would be helpful to know what kinds of links are appropriate. Given your explanation, I agree replacing Google Books links has no consensus and should be stopped. Does the lack of consensus extend to linking to books that were otherwise unlinked? Some participants distinguished between links to works under copyright and those not, is there no consensus for both of those? It's possible the answer is yes, but either way it would be helpful to have more clarity on exactly what doesn't have consensus given that discussion. Wug·a·po·des 06:21, 16 July 2020 (UTC)
    • @DMacks: You seem to have mischaracterized the discussion at Wikipedia:Village pump (proposals)/Archive 159#Expanding InternetArchiveBot to handle book references, or considered only Wikipedia:Bots/Requests for approval/InternetArchiveBot 3 without following the link from "Links to relevant discussions". I doubt that WP:Village pump (policy) is substantially more public than WP:Village pump (proposals), or that WP:Village pump (proposals) is attended only by those who choose to know about bot tasks. As for the VP discussions themselves, I count 24 userpage links in the original (15 explicit support !votes, 1 neutral, 0 oppose) versus 38 in the new discussion (harder to count as it wasn't conducted as a !vote, but it seems about 14 support, 12 oppose, 1 neutral, with several of those supports being very weak). Anomie 13:03, 16 July 2020 (UTC)
      • It's not my job as uninvolved closer of the discussion to supervote on the value of the goal or target site, just to pull together what is mentioned in the discussion. The current discussion demonstrates that there is not current consensus support for the bot task. It highlighted a weakness in the original approval process last year even though that was based on a consensus-support discussion for the underlying task at that time. This current discussion newly or more-strongly notes some opposing ideas that were not/not-as-strongly raised previously: not consensus oppose but not consensus support (such as overwhelming rejection of minority opposition ideas). "Consensus support to make edits" seems to be what a bot-task requires. First, there was definite opposition to converting of pre-existing links, so that shouldn't continue. Second, the fact that the new discussion was substantially during the time when IA made the jump into the emergency-COVID-library (NEL) project and then backed down in the face of legal claims made this discussion messy. I wish we would have first had a discussion framed from scratch about suitability of the site and that separately asked about free vs copyrighted works, with ping to WP:ELN, at a time when NEL wasn't happening. I've been squinting at the old and new discussions again for the past hour or two (not just my take on the new discussion, and not treating the current bot-task as a single entity to confirm/reject). Taken together, I think bot-adding links to works that are out-of-copyright still has consensus (not as strong as it originally did) as long as registration is not required to access the full work being linked. DMacks (talk) 10:57, 17 July 2020 (UTC)
  • I agree the close does not reflect the (fact-based) consensus. There's a lot of FUD in that RFC, very little of which is fact-based. IAlinks are not copyright violations, nor do they host copyright violating materials. Headbomb {t · c · p · b} 15:52, 16 July 2020 (UTC)
  • I apologize for coming so late into this discussion; after the closure, I had not dropped by the Pump much and had not seen notice that it had begun. I see some strange claims being made here, like "There was a concern raised that we somehow profit from Better World Books. Better World Books is owned by Better World Libraries, a nonprofit." BWL is IA's long-time partner. That's who they're throwing sales links to. The fact that it is a non-profit does not mean that it does not make a profit off of the sales, it's just a matter of what they do with that profit. The claim that IA doesn't host copyright violating materials is at least a highly controversial one, as they are currently being sued by significant publishing players over claims of just such copyright infringement. As for the closure, most of the !votes cast were done so before there was any discussion of the links between Better World Books and IA; I apologize for not including that in the initial statement of concern, but I was unaware of the linkage at the time I launched the discussion. Once I added that information to the discussion on June 17, there was a strong swing to stopping the linking (bolded !votes after that point: 6 support, 2 oppose.) As for the question of whether links to out-of-copyright books should continue, I will note that my original request was that the bot "be halted and not allowed to run until it is changed to no longer link under-copyright works" (or until the suit cleared them), but then that was before the promotional concern was raised. As for the existing unanimous discussion, there was only one person who even raised the question of US copyright there, @Masem:, and that user continued to show concerns about copyright implications in the newer discussions. There was no concern raised about the applicability of WP:COPYVIO (which does not require that copyright violation be proven before a link is prevented, but only suspected.)
The link to Better World Books was not brought up. And thus, without those factors, the bot change was approved on the basis of "this task is not controversial".
I will also note that the bot seems to be working in ways that deviate from the original proposal, at least if read with Wikipedian eyes. The proposal said it would make links for "referenced books", which might be read as "books that are made reference to", but I would suspect that most of us editors would read as "books used as references". It has not been limiting itself to books within REF tags, but has been including books that are mentioned in the body of the article... and as most of us editors know, external links are discouraged within the body of articles. If there is to be a new RFC done, that might want to be rolled into it... or it might be more of a distraction from the WP:COPYVIO and promo-y concerns. --Nat Gertler (talk)
Let me just add… There is a sea of difference between manually added links, where human judgement and local consensus are involved in the decision, and proactively mass-adding them with a bot. Any RFC about this should take this distinction into account: I absolutely support human editors adding these links (with possibility for discussing the suitability in every individual instance, if there is a copyright concern or a primacy of "best" link issue), and bots and scripts that help humans to do so, but I very much oppose bot addition of them. In addition to the lack of human judgement involved, tasking a bot to do this is not just permitting such links, it is mandating them in all cases on all articles regardless of local consensus. That's quite a different question to be asking the community to support. --Xover (talk) 10:53, 28 July 2020 (UTC)
Hi Xover. The bot has added 432,000 links to books. The scale of benefit is simply not possible through manual additions alone. As for honoring local consensus, it's really easy to just add {{bots|deny=InternetArchiveBot}} to the page. The bot respects that and will never add any links on that page. —CYBERPOWER (Chat) 18:25, 29 July 2020 (UTC)
@Cyberpower678: First off… Am I supposed to be impressed by that number? Human beings have built an encyclopedia with 6,137,772 articles. Even assuming an average of less than one book cited per article, humans have still outperformed your bot by at least one order of magnitude by adding actual citations and not just a simple URL. Just because it is technically possible for a bot to do something does not mean that it should do that thing, nor that the net result is better (more links is not inherently a good thing).
Second… Almost all the actual benefit to Wikipedia can be had by making the bot act as an interactive tool that is easily available to human editors when they work on an article. A Gadget (even a default gadget) can detect the addition of a cite to a book whose ISBN is known and offer to automatically add the link, at the user's discretion. Or, like the link fixer or dab solver, can be made easy to run for the human editor, fully or semi-automatically, but at the human being's control. Or it could even work like the dab link notifier, leaving suggested links on the user's talk page when they have added a cite for which a link exists but was not included in the cite. Or like Anomie's orphaned refs detector, that leaves notifications for humans on the article's talk page when the error can't be automatically fixed (I know you have the code for leaving talk page messages…). In any of these scenarios you can leave it up to the user with a simple checkbox whether to add links to books that are not fully available if that matters to them (some care, some don't). Give me a stupidly easy way to add such links and I'll use the heck out of it, and probably recommend it as best practice the way adding WayBack links already is. Or, hey, you could have the humility to simply list the IA as one lookup service among the others at Special:BookSources, which facility is well used and provides numerous benefits beyond what a single hard-coded URL does. That's how most of our WP:V is done: having direct links to an electronic edition is a convenience, nothing more.
Or build a bibliographic database that lets all our citation-assistant tools easily look up what scans IA has for a given work and add that link (or link directly to the high-res single-page .jp2). That would enable all sorts of other unanticipated benefits and innovations. And, hey, by actually collaborating with the Wikimedia Movement, we could, together, build a common bibliographic database based on Wikidata that could be used not just to look up a scan URL on IA, but as a basis for structured citations here, for advanced integrations on Wikisource, and that can be expanded to include HathiTrust and JSTOR (and even Google Books if they ever return to being useful for anything) and similar. And it would enable crowdsourced addition and correction of that bibliographic data, not by IA and Wikimedia cannibalising and competing for each other's volunteers, but by taking advantage of the work that those volunteers already do today and sharing the benefits.
The only benefits that cannot be achieved through these means are the benefits that IA derives from having 350k+ backlinks from Wikipedia that they control. I in no way begrudge them that benefit (a few quibbles and recent bonehead moves aside, I'm a big fan of IA), but on Wikipedia the benefit to IA is of secondary importance (not no importance, just secondary to our goals). Once one gets one's head out of the one-way linkspam track there are so many ways the IA and the Wikimedia Movement can work together and have overlapping goals. You still listening Mark?
Finally, {{bots|deny=whateverbot}} is in practice non-functional because any attempt to add it to an article is immediately removed by bot (well, AWB usually, but same diff) with the reasoning that either the bot is operating within its BRFA and preventing it with {{bots}} is inappropriate, or the bot is not operating within its BRFA, in which case it must be taken to ANI so the bot can be blocked and adding {{bots}} to an article is inappropriate. There are other problems with pointing at {{bots|deny=whateverbot}} as some sort of magical reason why the actual problem with a bot needn't be fixed (for one, it shifts the burden from the bot operator to every single other editor to demonstrate what is controversial and what isn't), but they're mostly moot so long as that particular Catch 22 is allowed to keep operating. --Xover (talk) 18:36, 3 August 2020 (UTC)
I am listening and we completely agree that scale matters. That's why InternetArchiveBot has rescued more than 11,000,000 dead links on Wikipedia. Links that provide value to readers are the only kind of links we're interested in.
While very experienced or knowledgeable editors like you may work with gadgets or edit interfaces, the majority will not, so manual approaches are less likely to result in benefit at scale. I see your point about multiple paths to a book, and we've hoped that giving readers a direct link to a live version is more useful than a link to an index page with 100 different options.
We are interested in helping to make the Web more useful and readable. To that end, we are very happy to see the entirety of information we've collected integrated into Wikidata; helping to make Wikidata a more comprehensive bibliographic information source is one of our main goals. We are a service-provider to the internet. We preserve webpages. We digitize books we own. We make our collections available to every person for free.
We have been collaborating with the Wikimedia movement for more than 8 years, since we started archiving every externally referenced URL added on all of the Wikipedia language editions, and then fixing 404 errors across 33 language versions of Wikipedia. The Internet Archive was a founding participant in Wikicite, the effort to create that very shared open bibliographic database we both dream of. That’s why we have been collaborating with Wikimedia for many years--with the community, with The Wikipedia Library, with the Community Programs team at the Foundation, with Wikicite, with our presence at events like Wikimania and Wikiconference North America, and with public collaboration between our nonprofits. We are on the side of Wikipedia readers and always will be. Markjgraham hmb (talk) 15:43, 4 August 2020 (UTC)
If someone is AWBing away {{bots}}, you should probably ask them to, y'know, not do that. It might help if we timestamped {{bots}} like other maintenance tags though. --AntiCompositeNumber (talk) 19:11, 3 August 2020 (UTC)
There is also {{cbignore}} for fine tuning per cite. -- GreenC 20:27, 29 July 2020 (UTC)
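For readers unfamiliar with how {{bots}} exclusion is meant to work, here is a minimal Python sketch of an exclusion check, assuming the commonly documented `allow=` and `deny=` parameters and the {{nobots}} shorthand. The function name `bot_may_edit` is hypothetical, the parsing is deliberately simplified (only the first matching template is read, and `optout` and per-cite {{cbignore}} are not handled), and it is not the code any of the bots discussed here actually runs.

```python
import re

# Simplified pattern: matches {{nobots}} or {{bots|...}}, case-insensitively.
# Assumption: the template parameters contain no nested templates.
_BOTS_RE = re.compile(r"\{\{\s*(nobots|bots)\s*(?:\|([^{}]*))?\}\}", re.IGNORECASE)


def bot_may_edit(wikitext: str, bot_name: str) -> bool:
    """Return True if the page's exclusion template permits bot_name to edit."""
    match = _BOTS_RE.search(wikitext)
    if match is None:
        return True  # no exclusion template on the page
    if match.group(1).lower() == "nobots":
        return False  # {{nobots}} excludes every compliant bot

    name = bot_name.strip().lower()
    params = match.group(2) or ""
    for part in params.split("|"):
        key, _, value = part.partition("=")
        key = key.strip().lower()
        names = {v.strip().lower() for v in value.split(",") if v.strip()}
        if key == "deny":
            if "none" in names:
                return True
            if "all" in names or name in names:
                return False
        elif key == "allow":
            if "none" in names:
                return False
            # An allow list excludes every bot that is not named in it.
            return "all" in names or name in names
    return True  # plain {{bots}}, or a deny list that does not name this bot


print(bot_may_edit("Some text {{bots|deny=whateverbot}}", "WhateverBot"))
print(bot_may_edit("Some text {{bots|deny=whateverbot}}", "OtherBot"))
```

A check like this only helps if the bot calls it before every save and the tag is left in place, which is the point of contention in the comments above.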

As an editor, I am concerned to think my work is being messed with by an unthinking bot. I did not read whatever that bot is now linking my citation as, and the bot did not read whatever they are now saying my citation is. When there comes a time that the bot is actually able to read books, study what is in books, write content based on what it actually read, and make citations based on what it actually reads, then maybe. -- Alanscottwalker (talk) 21:52, 29 July 2020 (UTC)

The bot matches the edition cited. For example, in Albert Stubblebine, citation #2 is for Men Who Stare at Goats: there are two editions available at Internet Archive, but neither is the correct edition (Simon and Schuster 2004), so it was not linked. Compare with Spitsbergen, where the correct edition and page are available. -- GreenC 16:36, 30 July 2020 (UTC)

Non-free files aren't getting reduced

See User talk:DatGuy#Non-free files aren't getting reduced --Neveselbert (talk · contribs · email) 00:58, 21 July 2020 (UTC)

@Neveselbert: thanks for the note in case anyone comes looking; as a reminder no editor (or their bot) should ever be expected to make another edit or action -- if one bot is no longer doing this and someone else would like to run one they can apply at BRFA. — xaosflux Talk 01:32, 21 July 2020 (UTC)

FYI re Hasteur

HasteurBot (t · c · del · cross-wiki · SUL · edit counter · pages created (xtools • sigma) · non-automated edits · BLP edits · logs (blocks • rights • moves) · rfar · spi) (assign permissions) (acc · ap · fm · mms · npr · pm · pcr · rb · te)

Hasteur is recently deceased per WP:AN#User:Hasteur. I don't think that means anything needs to be done, but his bots' duties may need to be assumed (one editor is already considering it). --Izno (talk) 23:54, 21 July 2020 (UTC)

I'll admit, my Python experience isn't the best past the occasional project, but now I've looked at the code of the bots (that I know of), I can interpret it and I'm happy to offer to take up these duties and ensure the bots continue running smoothly and operate normally.
The larger issue, however, is discussing with the people at WMCS to transfer ownership of the projects which the bots run on to the new maintainer. I'm unsure how that process works, especially as, sadly, there's no way to contact the original operator in this case. Ed6767 talk! 00:11, 22 July 2020 (UTC)
I'd also be happy to handle it if you want: User:MDanielsBot is all Python and runs on Toolforge. The code is open-source, so it shouldn't be a huge problem if WMCS can't transfer the tools (but they should be able to, per the toolforge wiki). --Mdaniels5757 (talk) 02:58, 22 July 2020 (UTC)
Mdaniels5757, if you have more experience in Pywikibot, it may be worthwhile for you to maintain them due to your existing experience - and I've found his two toolforge instances: https://admin.toolforge.org/tool/hasteurbot and https://admin.toolforge.org/tool/enbbsb. Hasteurbot already seems to have The Earwig and Theopolisme as listed toolforge maintainers. Ed6767 talk! 10:34, 22 July 2020 (UTC)
Also happy to help if I can; MajavahBot is also all Python/Pywikibot on Toolforge. According to wikitech:Help:Toolforge/Abandoned_tool_policy the process on Toolforge side should be pretty straightforward.  Majavah talk · edits 11:00, 22 July 2020 (UTC)
I'll file the request for enbbsb. @The Earwig and Theopolisme: Do either of you wish to take over https://admin.toolforge.org/tool/hasteurbot? If not, would you be willing to add me (mdaniels5757 on Toolforge) as a maintainer to it? Best, --Mdaniels5757 (talk) 15:34, 22 July 2020 (UTC)
@Mdaniels5757: I don't really have time right now to take on maintenance of a new bot, so I'm glad to hear that you're willing to take it over. I've added you as a maintainer. — Earwig talk 15:39, 22 July 2020 (UTC)
Was DRN clerk bot hosted on that Toolforge tool too or somewhere else?  Majavah talk · edits 15:53, 22 July 2020 (UTC)
@Majavah: it was on that toolforge tool.
BRFA filed at 5 and 6. --Mdaniels5757 (talk) 16:18, 22 July 2020 (UTC)
I've de-botted that account and marked it deactivated. — xaosflux Talk 14:14, 22 July 2020 (UTC)
Xaosflux, (more of a general query) would the new maintainer, Majavah, have to submit a new RFBA to get it reactivated? Ed6767 talk! 15:49, 22 July 2020 (UTC)
The bot account is globally locked. We don't usually change the ownership of an account this way.  Majavah talk · edits 15:52, 22 July 2020 (UTC)
@Majavah: a new operator would need their own bot account - this should be fairly trivial for a bot operator. — xaosflux Talk 16:36, 22 July 2020 (UTC)
@Xaosflux: You forgot DRN clerk bot which was also operated by Hasteur.  Majavah talk · edits 15:51, 22 July 2020 (UTC)
 Done xaosflux Talk 16:37, 22 July 2020 (UTC)

BRFA backlog

Resolved

FYI, WP:BRFA has a decent backlog right now (and I'm not just saying that because I'm on it twice). Any BAG member willing to give old requests some attention will receive my appreciation and a barnstar :).
Cc. a few active BAG members (no offense to the ones I'm not pinging): @Primefac, Xaosflux, and Enterprisey.
Best, --Mdaniels5757 (talk) 21:54, 2 August 2020 (UTC)

Jeez, you take a semi-wikibreak for a while and everyone gets impatient... looks better now. Primefac (talk) 22:27, 2 August 2020 (UTC)
@Primefac: Thank you! --Mdaniels5757 (talk) 02:51, 3 August 2020 (UTC)