User talk:Citation bot: Difference between revisions

From Wikipedia, the free encyclopedia
: the bot is currently blocked. [[User:AManWithNoPlan|AManWithNoPlan]] ([[User talk:AManWithNoPlan|talk]]) 16:33, 24 February 2019 (UTC)
::That explains it. Thanks [[User:AManWithNoPlan|AManWithNoPlan]]! --- [[User:FULBERT|FULBERT]] ([[User talk:FULBERT|talk]]) 16:35, 24 February 2019 (UTC)

== Anonymous editing ==

It is good the bot is blocked. Now it can stay blocked until the ability of editors to edit ''wholly anonymously'' is removed. This is [[WP:TEACE]], either by username or by revealing your IP; but citation bot currently has the facility to bypass both of those constraints. [[User:Serial Number 54129|<span style="color:black">'''——'''</span>]][[Special:Contributions/Serial Number 54129|<span style="color:black">''SerialNumber''</span>]][[User talk:Serial Number 54129|<span style="color:#8B0000">54129</span>]] 16:39, 24 February 2019 (UTC)

Revision as of 16:39, 24 February 2019

You may want to increment {{Archive basics}} to |counter= 15 as User talk:Citation bot/Archive 14 is larger than the recommended 150Kb.

Note that the bot's maintainer and assistants (Thing 1 and Thing 2) can go weeks without logging in to Wikipedia. The code is open source and interested parties are invited to assist with the operation and extension of the bot.

Before reporting a bug, please note: Addition of DUPLICATE_xxx= to citation templates by this bot is a feature. When there are two identical parameters in a citation template, the bot renames one to DUPLICATE_xxx. The bot is pointing out the problem with the template. The solution is to choose one of the two parameters and remove the other one, or to convert it to an appropriate parameter. Also, see the discussion links in case the bot did something that you disagree with, to see if it is under discussion.
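
As an illustration of the DUPLICATE_xxx behaviour described above, here is a minimal Python sketch. It is not the bot's actual code: the function name `rename_duplicates` and the choice to rename the later occurrence (rather than the earlier one) are assumptions made for the example.

```python
def rename_duplicates(params):
    """Rename repeated template parameters to DUPLICATE_<name>.

    `params` is a list of (name, value) pairs in template order.
    Assumption: the first occurrence keeps its name and every later
    occurrence is renamed; the notice above does not say which copy
    the bot actually renames.
    """
    seen = set()
    result = []
    for name, value in params:
        if name in seen:
            # Flag the repeat so a human can pick the correct value.
            result.append((f"DUPLICATE_{name}", value))
        else:
            seen.add(name)
            result.append((name, value))
    return result


cleaned = rename_duplicates([("title", "A"), ("year", "2019"), ("title", "B")])
print(cleaned)
```

An editor would then resolve the flagged pair by hand, keeping one value and deleting or converting the other.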

Please click here to report an error.

Or, for a faster response from the maintainers, submit a pull request with an appropriate code fix on GitHub, if you can write the code that needs to be written.

Request: Usage methods tracking

Please track how the bot is activated in edit summaries (toolbar, draft, website, other peoples *.js files). Use whitelist of approved methods, with 'toolbar' being grandfathered. AManWithNoPlan (talk) 19:38, 30 December 2018 (UTC)[reply]

Page Ranges vs specific pages in journals

Hi. This edit changed a citation (Irish University Review) from the page number of the cited content (page 5) to the page range of the journal article (pages 5–21). Please make sure the bot isn't doing the same on other articles. Scolaire (talk) 10:18, 31 January 2019 (UTC)[reply]

The bot's actions are correct. If you want a specific page, you need to use |at=. AManWithNoPlan (talk) 13:52, 31 January 2019 (UTC)
{{notabug}}, but the template documentation could be better: it is generic and not cite journal specific. Also, incorrectly putting in the first page is so incredibly common that expanding to a range is the right thing 99.99% of the time. AManWithNoPlan (talk) 14:50, 31 January 2019 (UTC)
Thanks for fixing it. I agree that it should be made clearer in the documentation. Scolaire (talk) 15:20, 31 January 2019 (UTC)[reply]
The template documentation is not at fault here. There have been ongoing discussions over the past many years about journal citations referring to the whole work of the article rather than the specific page. The bot should probably not be acting on these at all. --Izno (talk) 16:28, 31 January 2019 (UTC)[reply]
I wonder if, when the bot finds a page that is within the page range, it should change it to |at= for the sake of being precise. If the page is the first page, out of range, or blank, then update to the range. AManWithNoPlan (talk) 17:49, 31 January 2019 (UTC)
I'm still confused. If the "pages" parameter is meant for the page range of the article (which is different from its use in Cite book, for instance), why is there a "page" parameter as well as an "at" parameter? Scolaire (talk) 18:43, 31 January 2019 (UTC)[reply]
You are using the template correctly. The reason that you're still confused is because journals for some reason have a Special Case By Convention perhaps not obvious to everyone. You can search the talk archives of Help talk:CS1 to see that it has been discussed, with no obvious final resolution on the point. --Izno (talk) 19:35, 31 January 2019 (UTC)[reply]
I'd be okay with "if it finds the page inside the page range (inclusive), leave alone, otherwise update". --Izno (talk) 19:36, 31 January 2019 (UTC)[reply]

https://github.com/ms609/citation-bot/pull/1301 AManWithNoPlan (talk) 21:22, 7 February 2019 (UTC)[reply]
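
For readers following the thread, the rule Izno proposed ("if it finds the page inside the page range (inclusive), leave alone, otherwise update") can be sketched in Python. This is only an illustration of that rule, not the code in the pull request above; the function names `parse_range` and `choose_pages` are hypothetical, and only plain numeric pages are handled.

```python
import re


def parse_range(pages):
    """Return (first, last) for a value like '5-21' or '5–21', else None."""
    match = re.fullmatch(r"\s*([0-9]+)\s*[-\u2013]\s*([0-9]+)\s*", pages)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def choose_pages(existing, article_range):
    """Keep `existing` if it is a single page inside `article_range`
    (inclusive); otherwise return the full article range."""
    bounds = parse_range(article_range)
    if bounds is None:
        # Unparseable range: do not touch the citation.
        return existing
    first, last = bounds
    existing = existing.strip()
    if re.fullmatch(r"[0-9]+", existing) and first <= int(existing) <= last:
        return existing
    return article_range


print(choose_pages("7", "5\u201321"))
print(choose_pages("30", "5\u201321"))
```

Note that under this inclusive rule a citation to the first page (page 5 in Scolaire's example) is left alone, which differs from the earlier suggestion to expand first-page values to the range.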

Citation bot continues to violate WP:ELNEVER by adding CiteSeerX links of dubious provenance

Status: new bug
Reported by: David Eppstein (talk) 18:58, 8 February 2019 (UTC)
What happens: Citation bot, apparently in automatic mode rather than editor-initiated, is adding CiteSeerX links to articles. In many cases this is useful but in a significant minority of the cases the links violate WP:ELNEVER. Links should only be added when they derive either from official and open publisher copies or from copies placed online by the original author of the work (directly or indirectly e.g. via an institutional repository that the author has contributed to). Many of the links in CiteSeerX instead derive from course reading lists, researchers' collections of related works, or other material that, in their original form, may meet the legal requirements for fair use but DO NOT meet Wikipedia's stricter requirements for links. Citation bot is unable to evaluate the provenance of its CiteSeerX links so it should never add them without human supervision.
What should happen: Citation bot disables the automatic addition of CiteSeerX links. Or it gets blocked the next time I see another bad link of this type.
Relevant diffs/links: Special:Diff/882384271. In this example, two CiteSeerX links were added, one for "Efficient planarity testing" (Hopcroft/Tarjan) and one for "LEDA" (Mehlhorn/Naher). Following the CiteSeerX links shows that the Mehlhorn/Naher link is ok: at least one of its original sources is from a web page under the control of one of the authors, Mehlhorn. The Hopcroft/Tarjan link is not ok: it has two original source links, both of which are personal web pages under the control of David P. Dobkin, who is not an author. Whether those links are online is between Dobkin, the authors, and the publisher, not our concern. But adding this link here is in violation of WP:ELNEVER.
We can't proceed until: Feedback from maintainers


Address the problem at its root source, report the violation to CiteSeerX, if it is, in fact, a violation. Headbomb {t · c · p · b} 19:01, 8 February 2019 (UTC)[reply]
No. CiteSeerX has different constraints than we do on what we can link. WP:ELNEVER is very clear. It is not the responsibility of CiteSeerX to prevent you from adding violating links. And it should not be the responsibility of human editors to hand-check each and every one of these thousands of edits the bot is making. If you are not willing to stop the bot from adding these bad links, I will block it. —David Eppstein (talk) 19:04, 8 February 2019 (UTC)[reply]
CiteSeerX is a big boy site, and they have their big boy pants on. If they are hosting papers while violating copyright, they're the ones exposing themselves to lawsuits, not Wikipedia. We also do not link to the paper directly, we link to CiteSeerX metadata, which is not a copyright violation. Headbomb {t · c · p · b} 19:10, 8 February 2019 (UTC)
CiteSeerX does not appear to be violating copyright law in any way. https://en.wikipedia.org/wiki/User_talk:Citation_bot/Archive_13#Do_not_automatically_add_Citeseerx AManWithNoPlan (talk) 19:11, 8 February 2019 (UTC)[reply]
That earlier discussion was about user-activated instantiations of the bot. When a user does this, they implicitly take responsibility for checking the results and making sure that they are not introducing link policy violations. The discussion here is about fully automatic instantiations, when there is no user but Citation bot itself to blame. —David Eppstein (talk) 19:34, 8 February 2019 (UTC)[reply]
Also of note, if you block the bot, very likely you will get in trouble for violating WP:INVOLVED. I know I'd start proceedings against you if you did take an admin action in such a matter. Headbomb {t · c · p · b} 19:13, 8 February 2019 (UTC)
[edit conflict] That may or may not be true but it is irrelevant. What part of "External links to websites that display copyrighted works are acceptable as long as the website is manifestly run, maintained or owned by the copyright owner; the website has licensed the work from the owner; or it uses the work in a way compliant with fair use." is unclear? Note the complete absence of whether the link host is in violation of law from those conditions. None of those conditions appear to be true for the link in question, unless you interpret "fair use" so broadly as to make any link anywhere ok. And how am I involved? I have not participated in bot development and have merely watched the bot make many dubious changes and reacted to them, the same as I would for any other editor making rapid-fire dubious changes. —David Eppstein (talk) 19:15, 8 February 2019 (UTC)[reply]
You want to change how the bot operates, and have tried repeatedly to do so, and now are using your admin bit as a bludgeon, rather than demonstrating via consensus that there is a problem with the bot's actions, or that linking to CiteSeerX metadata via bot is legally problematic despite the evidence to the contrary. And I'll also point out that there is a very easy way to prevent the bot from repeating mistakes. Headbomb {t · c · p · b} 19:19, 8 February 2019 (UTC)
I have no idea whether it is legally problematic, neither do you, and that is in any case a red herring. The issue is that it is clearly in violation of Wikipedia's external link guidelines. They have different and stricter standards than the law, and it is those standards we must live up to here. —David Eppstein (talk) 19:23, 8 February 2019 (UTC)[reply]
Again, we are not linking to a copy of the paper, we are linking to CiteSeerX metadata (e.g. CiteSeerx10.1.1.54.9556). Headbomb {t · c · p · b} 19:24, 8 February 2019 (UTC)[reply]
You are continuing to Wikilawyer but a sentence only a couple lines down is again clear: "If there is reason to believe that a website has a copy of a work in violation of its copyright, do not link to it." That does not have any exception for Playboy-style "we're only reading it for the articles, not the porn" excuses. —David Eppstein (talk) 19:31, 8 February 2019 (UTC)[reply]
The bot doesn't run in a fully automated way. However, see below. Headbomb {t · c · p · b} 19:42, 8 February 2019 (UTC)[reply]
So your position is now "it's someone else's fault but I can't tell you whose"? That's not good enough. We need these bad link additions to stop. —David Eppstein (talk) 07:02, 9 February 2019 (UTC)[reply]

Meanwhile I wonder what's the point of removing an identifier which at CiteSeerX doesn't even have a PDF. Nemo 18:59, 10 February 2019 (UTC)[reply]

The only point of linking to that identifier is to follow its links to copies of a paper hosted elsewhere which, if pointed to directly, would certainly violate WP:ELNEVER. —David Eppstein (talk) 20:04, 10 February 2019 (UTC)[reply]
That's not my understanding. Exposing metadata and a citation graph is the/a primary aim of the linked CiteSeerX pages, per http://csxstatic.ist.psu.edu/home . Nemo 20:52, 10 February 2019 (UTC)[reply]

This entire discussion is misguided. The Wikipedia standard does NOT apply to references. https://en.wikipedia.org/w/index.php?title=Wikipedia:COPYLINK is the correct standard AManWithNoPlan (talk) 01:36, 20 February 2019 (UTC)[reply]

COPYLINK is also pretty clear... "if you know or reasonably suspect that an external Web site is carrying a work in violation of the creator's copyright, do not link to that copy of the work" ... "Knowingly and intentionally directing others to a site that violates copyright" (and I don't want to see you trotting out the old "they've never been convicted so it's not copyright violation" bullshit). —David Eppstein (talk) 02:38, 20 February 2019 (UTC)[reply]
Not trying to dodge discussion, just wanted to make sure that we were reading the correct documentation: not WP:ELNEVER but WP:COPYLINK. AManWithNoPlan (talk) 03:30, 20 February 2019 (UTC)

'User-activated'

Status: new bug
Reported by: Headbomb {t · c · p · b} 19:23, 8 February 2019 (UTC)
What happens: [1]
What should happen: The username should be reported.
We can't proceed until: Feedback from maintainers


Link to the Issue on GitHub with links to documentation someone needs to read and use https://github.com/ms609/citation-bot/issues/948 AManWithNoPlan (talk) 19:45, 8 February 2019 (UTC)[reply]

Not really sure what I'm looking at there, but activating via the API with the username specified should still be allowed (e.g. https://tools.wmflabs.org/citations/process_page.php?edit=toolbar&slow=1&user=Headbomb&page=FOOBAR). Headbomb {t · c · p · b} 19:49, 8 February 2019 (UTC)
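
To make the example URL above concrete, here is a small Python sketch that builds such an activation link. The endpoint and the parameter names (edit, slow, user, page) are taken from the URL quoted in this thread; the helper name `activation_url` and the rule that a username is mandatory are assumptions reflecting the proposal under discussion, not the tool's confirmed behaviour.

```python
from urllib.parse import urlencode

# Endpoint as quoted in the example URL in this thread.
BASE_URL = "https://tools.wmflabs.org/citations/process_page.php"


def activation_url(page, user, edit="toolbar", slow=True):
    """Build an activation URL that always names the activating user.

    Assumption: refusing an empty username mirrors the proposal that
    edits must be attributable; the live tool also accepted requests
    without one at the time of this discussion.
    """
    if not user:
        raise ValueError("a username is required so the edit can be attributed")
    query = [("edit", edit)]
    if slow:
        query.append(("slow", "1"))
    query.append(("user", user))
    query.append(("page", page))
    return f"{BASE_URL}?{urlencode(query)}"


print(activation_url("FOOBAR", "Headbomb"))
```

Passing the username in a query parameter does not prove who made the request, which is presumably why the maintainers were looking at OAuth instead.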

But activating without one is also allowed. AManWithNoPlan (talk) 19:55, 8 February 2019 (UTC)[reply]
I think it's time we revise that. Edits must be attributable to those who activate the bot. Headbomb {t · c · p · b} 20:01, 8 February 2019 (UTC)[reply]
If the bot is not running in autonomous mode, that means an editor is choosing to activate it on a specific article. If that editor's name can't be given credit for the edit itself (with Citation Bot named in the edit summary), that editor's name should clearly be recorded in the edit summary. This should be a no-brainer. We don't let editors run AWB as User:AWB with a summary of "AWB general edits"; this is very similar. – Jonesey95 (talk) 21:20, 8 February 2019 (UTC)[reply]
A first pass at some code. https://github.com/ms609/citation-bot/pull/1313 probably doesn't work and will need a key from the wiki overlords AManWithNoPlan (talk) 21:31, 8 February 2019 (UTC)[reply]
This should be a no-brainer. True, but writing the code and getting it to work is not a no-brainer. We could really use some help on that. AManWithNoPlan (talk) 21:45, 8 February 2019 (UTC)[reply]
I agree that this should be a high priority. I checked again all the ELNEVER violations that the bot is continuing to add (beyond the one in my report above, see Special:Diff/882439506 (Brent), Special:Diff/882429046 (Szekely), and Special:Diff/882394979 (Fiat and Shamir)), and all are marked as "User-activated". So I would like to be able to track whoever is doing this down in order to get them to stop. However, stopping this from happening takes priority over making sure that the most blameworthy party takes the blame for it, and without being able to identify a responsible user the blame is currently resting on Citation bot for introducing these bad links, and the obvious way to prevent it from happening is to block the bot. —David Eppstein (talk) 06:59, 9 February 2019 (UTC)
An edit filter for 'user-activated' would be much preferable to a block. Headbomb {t · c · p · b} 21:44, 9 February 2019 (UTC)[reply]
Aren't you the same guy that thought letting anonymous users run it on all pages that a page linked to was a good idea? AManWithNoPlan (talk) 23:22, 9 February 2019 (UTC)[reply]
No? I emphatically agreed with the need to restrict that feature. Headbomb {t · c · p · b} 23:48, 9 February 2019 (UTC)[reply]
https://github.com/ms609/citation-bot/pull/1319 work in progress AManWithNoPlan (talk) 23:45, 9 February 2019 (UTC)[reply]
Ok, as long as you're working on this I'll hold off on pushing for a block. I appear to have been mistaken in thinking this was the bot running in automatic mode; there were five more ELNEVER-violating links added on my watchlist this morning, but all of them continue to be marked as user-activated. This needs to stop, but if we can track down the user responsible for these problems then it won't be necessary to block the bot from doing its other useful work. —David Eppstein (talk) 00:21, 10 February 2019 (UTC)[reply]
But the longer this goes on the more my patience is getting stretched thin. I am having to check dozens of edits per day and finding many many violations of ELNEVER. (Another eight violations just from this morning, just from my watchlist — I don't have the patience to check the bot's entire contribution list.) Please, as a show of good faith, disable the CiteSeerX feature until these bad edits can be attributed to a human editor. —David Eppstein (talk) 19:01, 10 February 2019 (UTC)[reply]
The edit filter is that way WP:EDITFILTER. Headbomb {t · c · p · b} 19:24, 10 February 2019 (UTC)[reply]
You suggest I should request that CiteSeerX edits be blocked by edit filter rather than allowing them to continue when a human editor takes responsibility for them and takes the block if repeatedly failing to do so? That seems drastic. In the meantime, I am taking your response as a WONTFIX, and rescinding my statement above that I am holding off pushing for a block. These unattributed ELNEVER violations must stop. —David Eppstein (talk) 19:37, 10 February 2019 (UTC)[reply]
No, I suggest you make an edit filter to filter out the 'user-activated' edits of the bot. I'll also point out I'm neither maintainer nor operator of Citation bot. Headbomb {t · c · p · b} 19:39, 10 February 2019 (UTC)[reply]
You are suggesting that if I close my eyes and stop seeing it making link violations in my watchlist, the problem will go away? As long as "user activated" cannot be attributed to an actual user, it is entirely the responsibility of Citation bot to stop making bad edits. If it can't stop and can't pass the blame to a specific user, it needs to be blocked to prevent ongoing harm to the encyclopedia. —David Eppstein (talk) 19:43, 10 February 2019 (UTC)[reply]
I suggest that you use your brain and implement a solution that prevents the problematic unattributed edits of citation without throwing the baby with the bathwater. The edit filter achieves that. Blocking the bot is a net negative, especially when you've got an alternative. Headbomb {t · c · p · b} 19:46, 10 February 2019 (UTC)[reply]
So block all edits from Citation Bot that add a CiteSeerX link, rather than blocking the bot outright? I don't have the permissions to do that directly but I suppose it's worth a try requesting it. —David Eppstein (talk) 19:50, 10 February 2019 (UTC)
The combination of 'user-activated' + 'citeseerx' would be much better. Or 'user-activated' alone. Headbomb {t · c · p · b} 20:08, 10 February 2019 (UTC)[reply]

Ok, see Wikipedia:Edit filter/Requested#CiteSeerX and Citation bot. —David Eppstein (talk) 20:45, 10 February 2019 (UTC)[reply]
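
The filter condition suggested above ('user-activated' combined with 'citeseerx') could be prototyped as follows. This is a plain Python sketch of the matching logic only, not AbuseFilter syntax; the function name `should_flag` and the assumption that both markers can be found by case-insensitive substring search in the edit summary and the added wikitext are illustrative.

```python
def should_flag(user, summary, added_text):
    """Return True for an unattributed Citation bot edit that adds CiteSeerX.

    Assumptions: the bot account is named 'Citation bot', unattributed
    runs carry 'user-activated' in the edit summary, and a CiteSeerX
    addition shows up as 'citeseerx' in the added wikitext.
    """
    return (
        user == "Citation bot"
        and "user-activated" in summary.lower()
        and "citeseerx" in added_text.lower()
    )


print(should_flag("Citation bot", "Add: citeseerx. User-activated.",
                  "|citeseerx=10.1.1.54.9556"))
```

Dropping the third condition gives the broader 'user-activated' alone variant that was also mentioned.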

Should the documentation for the citation templates be updated to say that this should only be added if the citation is lacking other identifiers? There are plenty of documents that only exist there, since they are unpublished. AManWithNoPlan (talk) 16:11, 15 February 2019 (UTC)
link to oauth code work. https://github.com/ms609/citation-bot/pull/1335 AManWithNoPlan (talk) 21:00, 15 February 2019 (UTC)[reply]
This is still not deployed. Or if it's deployed, the bot is still not attributing the edits to the activator. Headbomb {t · c · p · b} 21:33, 20 February 2019 (UTC)[reply]
Still playing with it, but I cannot debug it myself without some Smith help. AManWithNoPlan (talk) 21:36, 20 February 2019 (UTC)

Capitalization: German Mit

Status: new bug
Reported by: Headbomb {t · c · p · b} 07:45, 16 February 2019 (UTC)
What should happen: [2]
We can't proceed until: Feedback from maintainers


https://github.com/ms609/citation-bot/pull/1374 AManWithNoPlan (talk) 14:36, 24 February 2019 (UTC)[reply]

Request: If there's no URL, remove via

Status: new bug
Reported by: Headbomb {t · c · p · b} 08:36, 16 February 2019 (UTC)
What happens: [3]
What should happen: [4]
We can't proceed until: Feedback from maintainers


WP:SAYWHERE is explicitly not WP:SAYHOW. While |via= may let readers know where a link points to when it's unusual, it's pointless to have when you have no link to go with. Headbomb {t · c · p · b} 08:36, 16 February 2019 (UTC)[reply]

This is a bad change. I don't need a URL to be reproduced by a secondary organization. --Izno (talk) 13:34, 16 February 2019 (UTC)[reply]
Agreed about it being a bad change. |via= is often used in lieu of a URL precisely because EBSCOhost and other repositories don’t have permanent URLs, but it’s still useful to let people know where they got the article. Umimmak (talk) 14:09, 16 February 2019 (UTC)[reply]
And that's exactly what WP:SAYWHERE says not to do. That you read a journal article through an EBSCO database versus PASCAL, ProjectMuse, or GoogleScholar is irrelevant. This is also undue promotion/publicity of paid databases. Headbomb {t · c · p · b} 16:09, 16 February 2019 (UTC)
But we still have parameters like |jstor= because it can be helpful to let the reader know that an online version of the article exists on JSTOR. I don’t see how this use of |via= is different; it lets the reader know the article can be found in a particular database which they might have access to. If you don’t want to use |via= at all, that’s one move you could try to gain consensus for but I don’t see the benefit of removing it only when there isn’t a URL. (Surely the URL tells you where you’re going, making |via= redundant, no?) Umimmak (talk) 21:01, 16 February 2019 (UTC)[reply]
JSTOR is very different: it's an identifier, which was also assigned to publications never digitised before and lacking a DOI. (Although some JSTOR IDs have also become DOIs, while other publications with a JSTOR ID have later been assigned a DOI by a publisher.) Nemo 21:19, 16 February 2019 (UTC)
|jstor= gives you a link to the specific paper on the JSTOR repository. We don't just add |via=JSTOR with no link to JSTOR. There is no reason for the reader to care that you've personally accessed the article via an EBSCOhost vs PASCAL vs ProQuest vs Whatever database, the only thing that matters is what article you read. How you've accessed the material is irrelevant. Headbomb {t · c · p · b} 21:20, 16 February 2019 (UTC)
Yes and if EBSCO had convenient URLs or identifiers those would be used instead of |via=; I still think it’s helpful to say a resource is available online, particularly when it’s a resource Wikipedia editors have access to via WP:LIB. Umimmak (talk) 21:30, 16 February 2019 (UTC)[reply]
Which makes it extra pointless to readers (and even harmful since you're directing them to manually search for sources they cannot access), completely unneeded for WP:V, promotes a specific commercial service, and against WP:SAYWHERE. Headbomb {t · c · p · b} 01:28, 17 February 2019 (UTC)
Amazon links and Google links to books are supposed to be removed unless they link to a free preview of the referenced information. This is because we are not supposed to promote individual content providers. This is relevant to consider. AManWithNoPlan (talk) 15:11, 16 February 2019 (UTC)[reply]
Basically, if via was valid without a url, then |via=Google it you dumbass would basically be the right answer on 99% of {{cite web}}. AManWithNoPlan (talk) 15:25, 16 February 2019 (UTC)[reply]
No? It has legitimate use even with a URL. See the documentation for the parameter. This bot should not touch via whatsoever. --Izno (talk) 15:27, 16 February 2019 (UTC)[reply]

When we lose urls we obviously get rid of it, and in most cases it is probably best to get rid of it when there is no url, but this seems like something that should only be removed by an automatic process if there are obvious other links such as doi/pmc. Having a bot just remove them in general seems dubious. Although I do laugh when via says Google search, and remove it. AManWithNoPlan (talk) 15:58, 16 February 2019 (UTC)

Also, it is amazing how often I find ‘my local library’ as the |via= AManWithNoPlan (talk) 17:19, 16 February 2019 (UTC)[reply]
Unless |via= provides some unique source of information, it is simply promoting a specific database. Unrelated side note: EBSCO so-called urls do suck. ProQuest at least gives every document a single number (actually, they sometimes give it more than one) that you can simply use. AManWithNoPlan (talk) 22:38, 16 February 2019 (UTC)
Headbomb conveniently forgets that the RfC on |via= also closed with consensus against their position (getting to be a bit of a pattern that). However, they are correct that the parameter should not be used to indicate that a paper might be available from a particular database: it's to indicate the database through which it was actually accessed, and some judgement and common sense is required regarding its use (do not blindly add it). There is no good reason to include |via= when the specific source accessed cannot possibly have any differences from a conceptual perfect master: a prime example is a paper journal accessed in a physical copy in your local library (and the same, roughly, goes for an electronic copy on the publisher's own website: both are effectively to be considered a perfect copy of record, and only third-party republishing/access should be indicated in |via=). However, this does in no way require a link or identifier: you may have accessed the article through a random website or database, identified in |via=, but omitted to include the link or identifier. That makes the flaw in the citation the absence of links or identifiers, not the presence of |via= (in fact, in that case |via= may be essential to enable locating the source in question, or determining equivalency between copies of indeterminate provenance). The matter of when |via= serves a purpose and when it is just pointless clutter is not a clear-cut one, which is why it should not be treated mechanistically: it is not something that can be determined by a bot based on a simplistic rule like presence of |url=. However, there are probably a few (a very few) "blacklist" type cases that could constructively be detected, along the lines of |via=my local library or |via=web search.
Things like |via=Google do not qualify: it may be trying to identify a Google Books preview or similar that needs human judgement to determine whether it makes sense or no (that it will likely mostly not make sense does not change that fact). And "human judgement" is not the few self-selected people here: it requires looking individually at each specific instance, and is subject to local consensus processes at each article. You can't make a consensus here that decides what happens over there. Case in point, Headbomb's favourite bugaboo, |via=EBSCOhost, can be argued both ways (include vs. leave out) and thus needs to employ the same consensus processes as all other such issues (SAYWHERE describes what you are not required to do, not what you are prohibited from doing). Removing it with a bot is simply an attempt to circumvent those processes. --Xover (talk) 06:53, 17 February 2019 (UTC)[reply]
There was an RFC on via? When? As for your example "There is no good reason to include |via= when the specific source accessed cannot possibly have any differences from a conceptual perfect master". That's exactly what those databases offer. Perfect reproductions of published version of records, with no material differences, save perhaps for a preamble page unique to the database. Headbomb {t · c · p · b} 07:41, 17 February 2019 (UTC)[reply]
Given you were the one that started the RfC, your question here now is pretty disingenuous. And your further argument is a salient one to make when discussing one specific use in one specific citation. I might even conceivably agree with you in such a discussion (or not; it would depend on the situation). --Xover (talk) 09:17, 17 February 2019 (UTC)
Again, what RFC? Headbomb {t · c · p · b} 09:45, 17 February 2019 (UTC)[reply]
I do remember Headbomb getting some KFC, but I have no memory of an RFC. AManWithNoPlan (talk) 20:31, 17 February 2019 (UTC)

Maybe it is this [5], where via was generally considered worthless and often harmful, but did in some situations have value (in all the discussions it was generally assumed that a url was present). AManWithNoPlan (talk) 21:23, 17 February 2019 (UTC)
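
Xover's narrower suggestion above, detecting only a very few "blacklist" values of |via= instead of removing it whenever |url= is absent, might look like the following Python sketch. The two blacklisted strings are the ones named in that comment; the set name `VIA_BLACKLIST`, the function name `via_is_removable`, and the whitespace and case normalisation are assumptions for illustration.

```python
# Values named in the discussion as plainly uninformative.
VIA_BLACKLIST = {"my local library", "web search"}


def via_is_removable(via_value):
    """Return True only for exact (case- and spacing-insensitive)
    matches against the blacklist; everything else is left for
    human judgement."""
    normalised = " ".join(via_value.split()).casefold()
    return normalised in VIA_BLACKLIST


for value in ["My Local  Library", "Google", "EBSCOhost"]:
    print(value, via_is_removable(value))
```

Values such as Google or EBSCOhost are deliberately not matched, since the thread argues those need case-by-case consensus.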

New bug from R8R

Status: new bug
Reported by: R8R (talk) 18:27, 20 February 2019 (UTC)
What happens: First of all, many |url= instances are changed into |chapter-url=, even though many references refer not just to a chapter, but to an exact page within that chapter. Second, capitalization of the names of the French journals was entirely unnecessary; the French don't do that and both references had |language=fr. Third, |pages=IE-87 did not need to replace a hyphen with an en dash. Such a change would make sense in many cases when people don't know how to or simply don't care about the proper page ranges, but letters should signal this isn't a regular case.
Relevant diffs/links: [6]
We can't proceed until: Feedback from maintainers


“capitalization of the names of the French journals” Wikipedia style guides follow the English formatting rules. AManWithNoPlan (talk) 18:35, 20 February 2019 (UTC)[reply]
Changing pages makes the template text match the displayed text. If that is wrong, then fix the underlying text: please see the bot's main page for a description. AManWithNoPlan (talk) 18:35, 20 February 2019 (UTC)
Not sure what you mean about url changes since chapter is closer to pages than the book. AManWithNoPlan (talk) 18:35, 20 February 2019 (UTC)[reply]
Thank you for your second response, I have read it and added {{hyphen}} in the citation instead. I decided to leave the issue covered by your third response be, maybe I'm making a problem out of nothing. As for the first, I don't quite follow. The French don't capitalize names of their journals; therefore, their proper names are not capitalized. If capitalization of such proper names changes depending on the surrounding language, could you provide a link to a rule that says that? I haven't found anything of this sort in en.wiki citation or CS1 rules.--R8R (talk) 18:51, 20 February 2019 (UTC)[reply]
WP:CAPITALS? --Izno (talk) 18:54, 20 February 2019 (UTC)[reply]
Ah, I see it at MOS:FOREIGNTITLE: Capitalization in foreign-language titles varies, even over time within the same language. Retain the style of the original for modern works. For historical works, follow the dominant usage in modern, English-language, reliable sources.
So the change the bot made is more-or-less incorrect. --Izno (talk) 19:02, 20 February 2019 (UTC)[reply]
The url change seems reasonable as there is no |page-url=. As for capitalization, we don't use French rules. |language= does not refer to the language title of the work(s) or where the work was produced but the content within, so that reason for not changing is straight bogus in the context of the template. As for hyphens and endashes, that's a hard problem. I'm not sure what the best behavior is for that. --Izno (talk) 18:52, 20 February 2019 (UTC)[reply]
Hyphens and dashes are annoying. We just change the data to match display. Is 7-8 pages 7 to 8 or page 8 of section 7? Or 3-7–3-9 is really ugly. AManWithNoPlan (talk) 18:58, 20 February 2019 (UTC)
Re the capitalization I think MOS:FOREIGNTITLE is the most relevant guideline. It says to respect the French capitalization. I think this is also in agreement with WP:COMMONNAME (a policy!). For instance, when we have articles on journals or magazines with French-language titles, they should be capitalized in the French way; e.g. Revue politique et littéraire. —David Eppstein (talk) 19:02, 20 February 2019 (UTC)[reply]
I get this, and I wouldn't have complained if I had "7-8". I, however, had "IE-87": since what stands before the hyphen is not a number, but a string of letters, there is no ambiguity here. Based on what I know from my shallow coding skills, this shouldn't be too hard to check? I get it that few people might have encountered this so far, but the fix presumably shouldn't be too hard to make, either. If you indeed decide to alter the bot to change the foreign title capitalization (thank you Izno and David for finding the appropriate guideline), wouldn't it be a good idea to change this, too?--R8R (talk) 19:17, 20 February 2019 (UTC)
(EC) With respect to the French, we do capitalize journal titles (Annales de la Société Entomologique de France). We also don't. Title casing or sentence casing is a matter of preference. Likewise how to capitalize foreign title in English is also a matter of preference. Some style guides use title casing, some use sentence/original casing. The bot isn't necessarily wrong to capitalize them, but sadly WP:CITEVAR is a thing. Best way to deal with this at the moment is to add a comment in the journal name, but exceptions could also be added at the bot-level for the more common journals. Headbomb {t · c · p · b} 19:18, 20 February 2019 (UTC)[reply]
Surely the best way to handle issues that reasonably fall under CITEVAR is for the bot not to violate CITEVAR by gratuitously changing everything to its own preferred style? —David Eppstein (talk) 21:13, 20 February 2019 (UTC)[reply]
As for page numbers, we are converting the metadata to match what users see. See the template documentation. AManWithNoPlan (talk) 21:17, 20 February 2019 (UTC)
99%+ of journals cited are in English, and the bot brings it in line with MOS. Again, if you don't want the bot to change something from one style to the other, either a) don't use the bot, b) be prepared to tell it to not touch something specific, or c) report the problematic journal here so it can be added to the capitalization exceptions (La Revue scientifique, not being valid in either title/sentence casing variation). The page number thing above is a bug though. Headbomb {t · c · p · b} 21:18, 20 February 2019 (UTC)
If the Bot's changes are correct and conform to Wikipedia's policies and guidelines in 99% of cases, that still means that there are roughly 60,000 articles that it is likely to fuck up. Bots can do a lot more damage than humans so they should be much more circumspect. Human editors should not be expected to have to tag every article they edit with bot exclusions to prevent damage by Bots with badly-estimated views of the level of edits they are competent to make. —David Eppstein (talk) 01:13, 21 February 2019 (UTC)[reply]
Again, exceptions can be handled at the individual level or at the bot level. Or by simply not activating the bot. Headbomb {t · c · p · b} 01:18, 21 February 2019 (UTC)

I stand corrected, the citation templates now detect pages of IE-7 and don’t convert the dash. AManWithNoPlan (talk) 21:35, 20 February 2019 (UTC)[reply]

https://github.com/ms609/citation-bot/pull/1354 AManWithNoPlan (talk) 02:39, 21 February 2019 (UTC)[reply]
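The page-range behaviour discussed in this thread (convert "7-8", leave "IE-87" alone) can be sketched roughly as below. This is a minimal Python illustration only, not the bot's actual code and not the change made in the pull request above. The function name `normalize_pages` and the rule "convert only when both sides of the hyphen are plain numbers" are assumptions made for the example.

```python
import re

# Hypothetical rule: only a value of the form <digits>-<digits> is treated
# as a page range. Anything else (letters, several hyphens, existing en
# dashes) is left exactly as the editor wrote it.
_NUMERIC_RANGE = re.compile(r"^(\d+)-(\d+)$")


def normalize_pages(value: str) -> str:
    """Return `value` with the hyphen replaced by an en dash if it looks
    like a purely numeric range; otherwise return it unchanged."""
    match = _NUMERIC_RANGE.match(value.strip())
    if match is None:
        # e.g. "IE-87": what stands before the hyphen is not a number.
        return value
    return f"{match.group(1)}\u2013{match.group(2)}"
```

Note that this sketch still converts "7-8", so it does not settle the "pages 7 to 8 or page 8 of section 7" question raised above; it only avoids touching identifiers such as "IE-87".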

REQUEST: JSTOR improvements

Status
new bug
Reported by
(tJosve05a (c) 02:16, 21 February 2019 (UTC)[reply]
What should happen

Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=User%3AJosve05a%2Fcite-sandbox&diff=prev&oldid=884350777
We can't proceed until
Feedback from maintainers


Ah yes. We currently limit them to 100000 and up to avoid GIGO AManWithNoPlan (talk) 02:27, 21 February 2019 (UTC)[reply]

What's the GIGO concern with "replace www.jstor.org/stable/18398 with |jstor=18398"? Headbomb {t · c · p · b} 21:09, 21 February 2019 (UTC)[reply]
I think the concern was that a really short number is a copy/paste error. 22:59, 21 February 2019 (UTC)
And yet, we still expand metadata from that URL? If we were to run the bot a second time, couldn't the bot verify the metadata from the template against the JSTOR link, and if they match perform the above action (and if they are wildly different, assume a copy-paste error)? (tJosve05a (c) 23:24, 21 February 2019 (UTC)
Well, I don't really see what's gained by keeping the error hidden behind a url in the cases where someone copy-pasted this by mistake, if that's even a thing to begin with. Headbomb {t · c · p · b} 00:31, 22 February 2019 (UTC)[reply]
true, with the parameter you see the jstor id of one and are suspicious AManWithNoPlan (talk) 01:20, 22 February 2019 (UTC)[reply]
https://github.com/ms609/citation-bot/pull/1367 AManWithNoPlan (talk) 00:37, 23 February 2019 (UTC)[reply]
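For readers following the request, the URL-to-parameter step being discussed might look something like the sketch below. It is written in Python for illustration and is not the bot's implementation. The function name `jstor_id_from_url` and the `min_id` parameter are hypothetical; `min_id` merely stands in for the "100000 and up" limit mentioned above so that the effect of lifting it can be seen.

```python
import re
from typing import Optional

# Matches plain numeric stable URLs such as www.jstor.org/stable/18398,
# with or without a scheme. DOI-style JSTOR identifiers are out of scope.
_JSTOR_STABLE = re.compile(r"^(?:https?://)?www\.jstor\.org/stable/(\d+)/?$")


def jstor_id_from_url(url: str, min_id: int = 0) -> Optional[str]:
    """Return the numeric JSTOR id in `url`, or None if the URL is not a
    plain stable link or the id is below `min_id` (hypothetical threshold)."""
    match = _JSTOR_STABLE.match(url.strip())
    if match is None:
        return None
    identifier = match.group(1)
    if int(identifier) < min_id:
        return None  # treated as a possible copy/paste error
    return identifier
```

With `min_id=100000` the short link from the example is skipped; with the default of 0 it yields the value for |jstor=.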

format = pdf in cite arxiv

Status
new bug
Reported by
Headbomb {t · c · p · b} 01:55, 22 February 2019 (UTC)[reply]
What should happen
[7]
We can't proceed until
Feedback from maintainers


REQUEST: Another JSTOR proxy

https://www-jstor-org.libezp.lib.lsu.edu/stable/10.7249/j.ctt4cgd90.10?Search=yes&resultItemClick=true&searchText=social&searchText=media&searchText=egypt&searchUri=%2Faction%2FdoAdvancedSearch%3FcurrentPath%3D%252Faction%252FdoAdvancedSearch%26amp%3Bf5%3Dall%26amp%3Bq0%3Dsocial%2Bmedia%26amp%3Bc6%3DAND%26amp%3Bf1%3Dall%26amp%3Bc2%3DAND%26amp%3Bf2%3Dall%26amp%3Bc3%3DAND%26amp%3Bgroup%3Dnone%26amp%3Bacc%3Don%26amp%3Bc4%3DAND%26amp%3BsearchType%3DfacetSearch%26amp%3Bf6%3Dall%26amp%3Bsd%3D2010%26amp%3Bpage%3D1%26amp%3Bc1%3DAND%26amp%3Bed%3D2018%26amp%3Bq1%3Degypt%26amp%3Bf0%3Dall%26amp%3Bf4%3Dall%26amp%3Bf3%3Dall%26amp%3Bc5%3DAND&seq=1#metadata_info_tab_contents

should be changed to |jstor=10.7249/j.ctt4cgd90.10 and remove the URL. (tJosve05a (c) 10:01, 22 February 2019 (UTC)[reply]

Another one: https://www.jstor.org.libweb.lib.utsa.edu/stable/3347357 (tJosve05a (c) 12:13, 22 February 2019 (UTC)[reply]
Another one https://www.jstor.org.offcampus.lib.washington.edu/stable/44645167 (tJosve05a (c) 13:01, 22 February 2019 (UTC)[reply]

https://github.com/ms609/citation-bot/pull/1368 AManWithNoPlan (talk) 01:02, 23 February 2019 (UTC)[reply]
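The three proxy URLs reported above share a pattern: the host begins with "www.jstor.org." or "www-jstor-org." followed by the library's own domain, and the identifier follows "/stable/". A rough Python sketch of extracting the identifier under that assumption follows. The function name and the prefix list are hypothetical and cover only the hosts seen in this thread; this is not the code in the pull request.

```python
from typing import Optional
from urllib.parse import urlsplit

# Host prefixes observed in the reported proxy links (assumed, not exhaustive).
_PROXY_HOST_PREFIXES = ("www.jstor.org.", "www-jstor-org.")
_STABLE_PREFIX = "/stable/"


def jstor_id_from_proxy_url(url: str) -> Optional[str]:
    """Return the JSTOR identifier from a library-proxy URL, or None."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host.startswith(_PROXY_HOST_PREFIXES):
        return None
    if not parts.path.startswith(_STABLE_PREFIX):
        return None
    # urlsplit has already separated the query string and fragment,
    # so only the identifier remains after the /stable/ prefix.
    identifier = parts.path[len(_STABLE_PREFIX):].strip("/")
    return identifier or None
```

Because the query string and fragment are dropped by `urlsplit`, the long search parameters in the first example do not need special handling.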

BUG: Bot is changing "page" to "pages" for single page citations (jstor)

Status
new bug
Reported by
MeegsC (talk) 17:01, 22 February 2019 (UTC)[reply]
What happens
it's putting "pages" for single page citations
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=Bonaparte%27s_gull&curid=335162&diff=884544984&oldid=852050121
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/1362 this bug fix once implemented I think will do it AManWithNoPlan (talk) 18:15, 22 February 2019 (UTC)[reply]

REQUEST: More volume/issue cleanups

Status
new bug
Reported by
(tJosve05a (c) 17:47, 22 February 2019 (UTC)[reply]
What happens
|volume=47.4 becomes |volume=47.4|issue=4 (if a JSTOR identifier says the issue is 4)
What should happen
|volume=47.4 and |volume=Vol. 47, No. 4 should be treated the same way as |volume=47(4) is (i.e. converted to |volume=47|issue=4) if metadata from e.g. JSTOR supports the change.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=User%3AJosve05a%2Fcite-sandbox&diff=prev&oldid=884596381
We can't proceed until
Feedback from maintainers


Definitely would want metadata confirmation on that. AManWithNoPlan (talk) 18:04, 22 February 2019 (UTC)[reply]

The examples in the diff do have metadata confirmation on JSTOR, so it should be done for those at least :) (tJosve05a (c) 18:08, 22 February 2019 (UTC)

https://github.com/ms609/citation-bot/pull/1366 AManWithNoPlan (talk) 01:02, 23 February 2019 (UTC)[reply]
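The requested cleanup, including the "metadata confirmation" condition, could be sketched as below. This is an illustrative Python version under stated assumptions, not the bot's code: the function name `split_volume`, the three patterns, and the idea of passing the trusted volume and issue (e.g. from JSTOR) as plain strings are all hypothetical.

```python
import re
from typing import Optional, Tuple

# Combined volume/issue spellings mentioned in the request:
# "47(4)", "47.4" and "Vol. 47, No. 4".
_COMBINED_PATTERNS = (
    re.compile(r"^(\d+)\((\d+)\)$"),
    re.compile(r"^(\d+)\.(\d+)$"),
    re.compile(r"^Vol\.\s*(\d+),\s*No\.\s*(\d+)$", re.IGNORECASE),
)


def split_volume(
    volume: str, metadata_volume: str, metadata_issue: str
) -> Optional[Tuple[str, str]]:
    """Return (volume, issue) if `volume` is a combined form AND both parts
    agree with the trusted metadata; otherwise return None (no change)."""
    text = volume.strip()
    for pattern in _COMBINED_PATTERNS:
        match = pattern.match(text)
        if match is None:
            continue
        if match.group(1) == metadata_volume and match.group(2) == metadata_issue:
            return match.group(1), match.group(2)
    return None
```

Requiring agreement with the metadata keeps a genuine volume number such as "47.4" untouched when the source does not confirm that ".4" is an issue.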

Book review authors

Status
new bug
Reported by
– Arms & Hearts (talk) 19:29, 22 February 2019 (UTC)[reply]
What happens
In a citation of this book review the bot has made a mess of the author parameters, listing the authors of the works reviewed and only the initials of the reviewer, incorrectly capitalised (though the latter is all JSTOR offers too).
What should happen
Ideally the full name of the reviewer would be provided. That's probably not possible though, so the bot should either provide only the reviewer's initials, with the proper capitalisation, or should leave the author fields blank.
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=The_Old_Straight_Track&curid=1390457&diff=884598875&oldid=876311634
We can't proceed until
Feedback from maintainers


https://github.com/ms609/citation-bot/pull/1363 this should help a lot, I think. AManWithNoPlan (talk) 20:15, 22 February 2019 (UTC)[reply]

Notice

Per a request at WP:ANRFC, I have closed Help talk:Citation Style 1#RFC on publisher and location in cite journal, which concerns the actions of Citation bot (you). Since your operator has not edited in almost 2 weeks, I have also requested that the bot be blocked until it is compliant with the result of the RfC. --DannyS712 (talk) 19:52, 22 February 2019 (UTC)[reply]

I've done so. Sandstein 20:25, 22 February 2019 (UTC)[reply]
https://github.com/ms609/citation-bot/pull/1364 code written, just waiting for deployment AManWithNoPlan (talk) 20:26, 22 February 2019 (UTC)[reply]
I find it speaks volumes that you apparently think "the RFC went our way mostly" (It didn't. It pretty much did exactly the opposite.) and that, provided I'm reading the code correctly, you've switched from removing it to recommending to the bot's users to remove them manually (that is, you're trying to use them as unwitting WP:MEATPUPPETs). In my book that's a blatant attempt to WP:GAME the system and still use your bot to circumvent both explicit community consensus and WP:CITEVAR. Is this really going to have to end up at ANI before you drop this particular stick?
@Sandstein: Please take the above into account before entertaining a request to unblock the bot again, and if in doubt perhaps solicit input from either BAG or AN/ANI (Not trying to "Let's you and him fight": I'm obviously involved which is why I'm pointing at presumably independent forums for input.). --Xover (talk) 20:47, 22 February 2019 (UTC)[reply]
People are allowed to have different opinions on the general outcome. I wouldn’t have posted the link to the pull if I was trying to be a muppet. Also, I have changed the pull following the suggestions. Also, I am not the bot operator, so using my unaccepted pull as guidance would be an abuse of the operator's good faith. AManWithNoPlan (talk) 21:11, 22 February 2019 (UTC)
Voicing dissenting opinions and having a different opinion on closures is absolutely acceptable behavior (even if they are on another website in the comment section of pull request). Advising users who use the tool/bot's GUI about the maintainer's opinion on the inclusion of a parameter (i.e. writing on the GUI page "you should probably do this and that") is none of this community's business, and such controlling behavior is a dangerous slope. The RfC was to stop automatic removals, which is being implemented, all other STICKy behavior by users trying to censure opinions by a (volunteer maintainer, not operator) user is way out of line, imo. (tJosve05a (c) 21:19, 22 February 2019 (UTC)[reply]
That’s exactly why I included the link, so people could comment. Thank you for the feedback. The patch is changed to just mention its existence now, or does merely mentioning it carry an implication of ‘and this is stupid, so delete it’? AManWithNoPlan (talk) 21:30, 22 February 2019 (UTC)
(ec, and past my bed time; will respond at length tomorrow) Short version: yes, even the "less opinionated" version carries that connotation and reads like an attempt to circumvent consensus. Also, if the intent was to seek feedback, you need to write "Here's a first cut. Any feedback?" (or whatever). Your above message in no way suggested that you were looking for feedback, and as all my previous attempts to provide such were ignored there was no possible grounds for me to assume any would be welcomed in this case either. --Xover (talk) 22:01, 22 February 2019 (UTC)[reply]
Consensus, as I read it, was only that it should not be automated (which is complied with by this pull). There was no consensus (from what I can tell) on whether they should be removed or not, so "notifying" users in the GUI that it (in the bot's opinion) should probably be removed is not "an attempt to circumvent consensus"; it is clearly abiding by it. What the maintainers leave in their code comments (or pull comments in this case) on another website has no real bearing here, and what matters is that it is being turned off from automatic. I despise (harsh, but true) any and all attempts to censor what the GUI should "propose or not propose" to the end user; as long as it isn't automated, it will be up to the end user to follow policy and consensus in the end. (tJosve05a (c) 22:13, 22 February 2019 (UTC)
@Josve05a: The threading is badly messed up here, so this is actually a reply to your previous message due to intervening edit conflicts etc.
Any editor may voice their opinion anywhere they like, and the bot's operators are free to write all the essays they want (in fact, that approach would have avoided a whole lot of needless conflict and drama). But the bot policy explicitly prohibits using bots for edits that do not have consensus: editors can be as opinionated as they like, but bots should not be.
Oh, and anything on the project is "this community's business": operating a bot is a privilege, not a right. One should feel free to disagree with the community all one wants (I frequently do), but moving something to toolforge does not somehow magically make it exempt from community consensus.
However, your position, as expressed in your second message, is noted. You are, of course, entirely entitled to it, and my disagreement with it, individually, matters very little. You are an editor, not a bot, and your opinions and mine have exactly the same basic ability to effect change. --Xover (talk) 07:57, 23 February 2019 (UTC)[reply]

Given that people have an amazing ability to blindly follow orders, I have removed the informative part of the pull. Honestly, anyone who thinks someone is going to read what the bot says and act upon it gives the Bot way too much credit; although people surprise me all the time. AManWithNoPlan (talk) 22:17, 22 February 2019 (UTC)

Thank you. Still, provided I read the diff right, that removes my concern with CitationBot's behaviour on this point and consequently also my objection to its being unblocked once actually implemented and other relevant issues resolved. FYI ping to Sandstein. In addition I want to thank you (AManWithNoPlan) for being both responsive to community consensus (on the general part), and willing to take on specific feedback, on this issue. That certainly disproves my previous complaints on these counts. --Xover (talk) 07:57, 23 February 2019 (UTC)

Pointless removal of urls, incorrect edit summary (2)

[Wrapping up the discussion someone was in a hurry to bury in the archive.]

My thanks to Boghog for explaining that single URLs are now counted double in the presence of DOIs. Headbomb's "linking twice" explanation would have been more useful if he had mentioned that DOIs count as URLs. (And his arithmetic is still faulty.)

The edit summary is still stupid. I have no problem with removing the accessdate, but removing it on the basis of there being "no specified URL" in same edit where the extant URL is removed is so stupefying that it ought to be suppressed. The "Removing parameters", while strictly true, is so lamely under-informative that I marvel at the possibility someone thought that was a useful message. Surely that could be improved. ♦ J. Johnson (JJ) (talk) 20:50, 22 February 2019 (UTC)[reply]

it will take a lot of coding. I will put that on the back burner to attack once oauth is done. AManWithNoPlan (talk) 21:06, 22 February 2019 (UTC)[reply]
Just out of curiosity: why would it "take a lot of coding"? I would expect there is an array of messages, just edit them. Or: is the challenge in extending the messaging function to specific actions? ♦ J. Johnson (JJ) (talk) 21:27, 22 February 2019 (UTC)[reply]
Last time I looked, the message code was a mess. I looked again, and we have actually cleaned it up so much that the change was easy. I am shocked it is now this easy: https://github.com/ms609/citation-bot/pull/1365 AManWithNoPlan (talk) 21:32, 22 February 2019 (UTC)
Next time, before I say something is hard, I had better double-check the current code base! AManWithNoPlan (talk) 21:34, 22 February 2019 (UTC)
My gratitude for cleaning that up would not be lessened for it being easy. :-) But I am mindful of a comment from Sherlock Holmes: "I thought you might have done something clever." :-( ♦ J. Johnson (JJ) (talk) 22:54, 22 February 2019 (UTC)[reply]
I have done some very clever coding in my time, but this is not one of those times. Although, the clean-up that enabled this might have been done by me. AManWithNoPlan (talk) 23:15, 22 February 2019 (UTC)[reply]

CAPS: PS

Status
new bug
Reported by
(tJosve05a (c) 00:45, 23 February 2019 (UTC)[reply]
What happens
|journal=Ps: Political Science and Politics
What should happen
|journal=PS: Political Science and Politics
We can't proceed until
Feedback from maintainers


When expanding from {{Cite journal |jstor = 420824}} (tJosve05a (c) 00:45, 23 February 2019 (UTC)[reply]

https://github.com/ms609/citation-bot/pull/1374 AManWithNoPlan (talk) 14:35, 24 February 2019 (UTC)[reply]
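The "capitalization exceptions" mentioned earlier on this page, and the PS case reported here, amount to a lookup that overrides whatever the general casing rule produced. A minimal Python sketch of that idea follows. It is an assumption about the general approach, not the bot's code; the table name, the function name, and the single entry are hypothetical and taken only from this report.

```python
# Hypothetical exceptions table keyed by the lower-cased journal name.
# The only entry comes from the report above.
_CAPITALIZATION_EXCEPTIONS = {
    "ps: political science and politics": "PS: Political Science and Politics",
}


def apply_capitalization_exception(journal: str) -> str:
    """Return the listed spelling if `journal` matches an exception
    (case-insensitively); otherwise return `journal` unchanged."""
    return _CAPITALIZATION_EXCEPTIONS.get(journal.strip().lower(), journal)
```

A lookup keyed on the lower-cased name means the exception applies no matter how the general rule cased the title first.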

Needs to be restarted

Status
new bug
Reported by
  —Chris Capoccia TC 13:14, 24 February 2019 (UTC)[reply]
What happens
won't process anything
We can't proceed until
Feedback from maintainers


Account is currently blocked. Also see discussion above. Boghog (talk) 13:16, 24 February 2019 (UTC)[reply]

The toollabs-tool that is operated via script (/use) for manual edit is down as well, which worked until yesterday evening at least. (tJosve05a (c) 13:18, 24 February 2019 (UTC)[reply]
thanks. i had read through the other entries. the discussion in "notice" is really dense with jargon and i would not have guessed that was the reason for why nothing was being processed.  —Chris Capoccia TC 13:23, 24 February 2019 (UTC)[reply]
Gadget API is up and running. Naturally, automatic edits are still blocked. AManWithNoPlan (talk) 14:55, 24 February 2019 (UTC)[reply]

Bot edits not viewable on articles

Status
new bug
Reported by
FULBERT (talk) 16:30, 24 February 2019 (UTC)[reply]
We can't proceed until
Feedback from maintainers


I just tried this for the first time on two pages, the first with Commit edits selected and the second without them. The first time the app stated it made the changes, but nothing appeared changed via View History. The second time I tried this it was with Commit edits turned off, so I reviewed them and submitted via the button on the bottom. Both times nothing appeared edited on the articles themselves via View History. Am I doing something incorrectly or is there an issue with the bot? Thanks. --- FULBERT (talk) 16:30, 24 February 2019 (UTC)[reply]

the bot is currently blocked. AManWithNoPlan (talk) 16:33, 24 February 2019 (UTC)[reply]
That explains it. Thanks AManWithNoPlan! --- FULBERT (talk) 16:35, 24 February 2019 (UTC)[reply]

Anonymous editing

It is good the bot is blocked. Now it can stay blocked until the ability of editors to edit wholly anonymously is removed. This is WP:TEACE, either by username or by revealing your IP; but citation bot currently has the facility to bypass both of those constraints. ——SerialNumber54129 16:39, 24 February 2019 (UTC)[reply]