User talk:Citation bot: Difference between revisions



Note to me for when I have time. [[User:AManWithNoPlan|AManWithNoPlan]] ([[User talk:AManWithNoPlan|talk]]) 16:26, 10 December 2023 (UTC)

== Conversion to MathML conflicts with the Math extension ==

{{bot bug
| status = new bug
| reported by = [[User:Salix alba|Salix alba]] ([[User talk:Salix alba|talk]]): 17:03, 10 December 2023 (UTC)
| what happens = The bot replaced the wikitext:
:<nowiki>''b'' → ''s'' ℓ<sup>+</sup> ℓ<sup>−</sup></nowiki>
with the MathML text
:<nowiki><math><mrow>b<mo stretchy="false">→s<msup><mrow>ℓ</mrow><mrow>+</mrow></msup><msup><mrow>ℓ</mrow><mrow>−</mrow></msup></mrow></math></nowiki>
This conflicts with the [[Help:Displaying a formula|maths extension]] and in turn causes a maths syntax error.
<!-- and/or: --> | what should happen = No MathML text should be generated
| link showing what happens = https://en.wikipedia.org/w/index.php?title=LHCb_experiment&diff=prev&oldid=1188666266
| how to replicate the bug = <!-- If not obvious from the description or the link -->
}}
<!-- Discussion starts below this line -->

Revision as of 17:03, 10 December 2023

You may want to increment {{Archive basics}} to |counter= 38 as User talk:Citation bot/Archive 37 is larger than the recommended 150Kb.

Note that the bot's maintainer and assistants (Thing 1 and Thing 2) can go weeks without logging in to Wikipedia. The code is open source and interested parties are invited to assist with the operation and extension of the bot. Before reporting a bug, please note: Addition of DUPLICATE_xxx= to citation templates by this bot is a feature. When there are two identical parameters in a citation template, the bot renames one to DUPLICATE_xxx=. The bot is pointing out the problem with the template. The solution is to choose one of the two parameters and remove the other one, or to convert it to an appropriate parameter. A 503 error means that the bot is overloaded and you should try again later – wait at least 15 minutes and then complain here.
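
As a rough illustration of the DUPLICATE_xxx= renaming described above, here is a minimal Python sketch. It is not the bot's actual code: the function name `rename_duplicates`, the list-of-pairs representation of a template, and the choice to rename the later occurrence rather than the earlier one are all assumptions made for this example.

```python
def rename_duplicates(params):
    """Rename repeated parameter names so both values stay visible.

    `params` is a list of (name, value) pairs in template order.
    Assumption: the first occurrence keeps its name and later ones
    become DUPLICATE_<name>; the real bot may choose differently.
    """
    seen = set()
    result = []
    for name, value in params:
        if name in seen:
            result.append((f"DUPLICATE_{name}", value))
        else:
            seen.add(name)
            result.append((name, value))
    return result
```

An editor would then pick one of the two values and delete the other, as the notice says.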

Submit a Bug Report

Or, for a faster response from the maintainers, submit a pull request with an appropriate code fix on GitHub, if you can write the needed code.


Expand non-templated refs

Would it be possible to expand a non-templated reference such as <ref>[https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5553785/ Bar]</ref>, as long as the ref has no other content and the existing link text (Bar) is exactly the same as the |title= the bot would produce if it expanded the bare URL? Jonatan Svensson Glad (talk) 17:16, 24 July 2023 (UTC)[reply]

Example here, I had to remove the brackets and the already provided title prior to running the bot. The outcome provided the exact same title as was already present prior to me doing the removal, causing a lot of manual labor in order to get the bot to attempt to expand the citation. Jonatan Svensson Glad (talk) 17:19, 24 July 2023 (UTC)[reply]
How close should the titles have to be? Also, it seems that from my experience, the title is often some mix of the title and journal and authors. AManWithNoPlan (talk) 20:08, 14 August 2023 (UTC)[reply]
Well, a first start could be exact "only-title" match inside square brackets (with only a preceding period/dot inside or outside the brackets being the difference). To later build upon with more possibilities... Jonatan Svensson Glad (talk) 21:01, 14 August 2023 (UTC)[reply]
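
A minimal Python sketch of the "exact only-title match" idea proposed above, for discussion only. The names `BARE_LINK_REF`, `normalise_title` and `expandable_link` are hypothetical, and the sketch reads the "period/dot" exception as a single trailing period inside or outside the brackets, which is an assumption about what was meant.

```python
import re

# A ref whose only content is one bracketed external link,
# optionally followed by a period: <ref>[URL label]</ref>
BARE_LINK_REF = re.compile(
    r"<ref>\s*\[(https?://\S+)\s+([^\]]+?)\]\.?\s*</ref>",
    re.IGNORECASE,
)


def normalise_title(text):
    """Collapse whitespace and drop one trailing period."""
    text = " ".join(text.split())
    return text[:-1] if text.endswith(".") else text


def expandable_link(ref_wikitext, fetched_title):
    """Return the URL if the ref is a lone [URL label] link whose label
    equals the title obtained from metadata; otherwise return None."""
    match = BARE_LINK_REF.fullmatch(ref_wikitext.strip())
    if match is None:
        return None  # extra content in the ref, or not a bracketed link
    url, label = match.groups()
    if normalise_title(label) != normalise_title(fetched_title):
        return None  # the human-written label differs from the metadata
    return url
```

Anything that fails the check is left alone, so a looser comparison could be added later without changing the callers.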

Support for Who's Who

Status: new bug
Reported by: Jonatan Svensson Glad (talk) 01:06, 29 July 2023 (UTC)[reply]
What should happen: Implement support to expand from https://doi.org/10.1093/ww/9780199540884.013.U192476 to {{Who's Who}}
Example: https://en.wikipedia.org/w/index.php?title=Friern_Hospital&diff=prev&oldid=1167644213
We can't proceed until: Feedback from maintainers


Alternatively, deny all edits on 10.1093/ww/... DOIs. Jonatan Svensson Glad (talk) 01:06, 29 July 2023 (UTC)[reply]

Or perhaps the entire 10.1093 prefix of DOIs, since we don't have support for {{cite ODNB}} either (example). Jonatan Svensson Glad (talk) 21:11, 29 July 2023 (UTC)[reply]
Actually we do have {{cite ODNB}} support. AManWithNoPlan (talk) 02:11, 5 November 2023 (UTC)[reply]

Changing every citation of a publisher's webpage to Cite book

I have remained silent on this issue even though it has irritated me for a while now. And now that there is discussion above about the widespread useless cosmetic edits this bot continues to waste everyone's time with, I'll raise it: Why must every citation of a publisher's webpage be changed to Cite book? I can only speak for myself, but every time I cite such book webpages I am not citing the book itself. I am specifically referencing the information published on the webpage. So of course I do not want the citation to be changed to Cite book with a bunch of parameters of the book itself (ISBN, date, etc) added. So I inevitably stop the bot or replace the reference with a third-party source. I realise the defense will be "It doesn't hurt" or that some users are actually citing the book. And I realise this is not the most pressing issue, but why must the bot come to its own conclusion of the editor's intent? I see another user complained of this issue last year. Οἶδα (talk) 22:25, 27 September 2023 (UTC)[reply]

This may be the kind of situation where it's safest to explicitly tell citation bot not to muck with the citation. It's hard to automatically judge whether the human editor actually wanted "cite web" or "cite book". (There are many examples of people using "cite web" to cite resources that should actually be books, journal articles, etc.) –jacobolus (t) 01:38, 28 September 2023 (UTC)[reply]
I understand. But it still feels like another unnecessary task for this bot to insert itself into every article it can possibly find. For example, this edit is completely useless and actually corrupts my intention of the citation. Call me crazy but I don't want or need a bot telling me what I am citing (and actively altering my citations accordingly). Οἶδα (talk) 21:32, 13 October 2023 (UTC)[reply]
When I've quoted publisher blurbs in the past, I usually set |type=publisher's blurb for clarity. In the specific case you've linked just above, another option would be not to cite the publisher's landing page at all, and add the book to a "Selected works" subsection or something. Indeed, the altered citation is sequential to another one, and so seems a bit superfluous. Or, alternatively, use "Citation bot bypass" somewhere in your citation as suggested by jacobolus above.
Given the overall lazy referencing culture of less experienced editors, it's likely that in the majority of cases, people who drop a link to a publisher landing page are probably trying to cite the book itself, so this behaviour of assuming that's the case is net beneficial. Folly Mox (talk) 22:13, 13 October 2023 (UTC)[reply]
I cannot personally maintain that the majority of users citing a publisher's webpage are lazily intending to cite the book itself. My experience suggests otherwise which is why I have taken issue, but I realise my editing purview might be skewed. However, if that is observably true then I will resign to accepting this as a forgivable externality. Οἶδα (talk) 06:35, 14 October 2023 (UTC)[reply]
In fairness to your point, I haven't looked into the data about how frequently this sort of change is appropriate; it could be the case that my own perspective is the skewed one. Folly Mox (talk) 08:32, 14 October 2023 (UTC)[reply]
I couldn't find a list of tasks that the bot has been approved for (other than the very first approval) nor a thorough description of all of its mystical activities. I was surprised to find it would change "Cite web" to "Cite book" (for unclear reasons). The only cure, if the bot is unchanged, seems to be the <!-- Citation bot bypass--> mechanism documented at User:Citation_bot#Stopping_the_bot_from_editing - R. S. Shaw (talk) 04:12, 6 December 2023 (UTC)[reply]

Redaction

Add redaction information. AManWithNoPlan (talk) 01:19, 30 September 2023 (UTC)[reply]

Semantic scholar links continue to mostly consist of spam

Can Citation bot please stop littering every s2cid it can find wherever it can possibly fit? The vast majority of these links contain zero useful information beyond a (redundant) link to the publisher's website (typically paywalled), and putting them on every citation in Wikipedia is more or less spam. It's a distracting waste of space with no redeeming benefits.

The easiest solution here would be to deprecate the s2cid parameter from the citation templates, hide them from the output, and just be done with it.

Next best, probably my personal recommendation, would be that only humans should ever add s2cid links (and ideally the ones which were added by a bot in the past should be removed), or, barring that, that a human should manually review any s2cid that gets added by any bot. At the very very least, the bot should try to check them for meaningful content and skip the vast majority of totally useless ones going forward. –jacobolus (t) 18:13, 20 October 2023 (UTC)[reply]

  • Agree totally; please stop the s2cid spam. Esculenta (talk) 18:48, 20 October 2023 (UTC)[reply]
    Agreed as well! They only got added because of someone who works for Semantic Scholar (Help talk:Citation Style 1/Archive 66#Request to add Semantic Scholar IDs to the citation template). If there is truly a consensus among editors working on a page that it would improve the citation to include an |s2cid= … fine I guess, but a vast majority of the time someone who has never edited a given article runs the prompt and the bot clutters up all the citations with a spammy parameter without any human editors actively wanting it there. Umimmak (talk) 18:59, 20 October 2023 (UTC)[reply]
I would also be supportive of "deprecate the s2cid parameter from the citation templates, hide them from the output, and just be done with it", along with stopping the bot from adding them. Unlike most of the other codes we use, I cannot remember ever seeing a case where these were useful. Stopping the bot is on-topic here but the other stuff should probably be discussed on Help talk:Citation Style 1, which is the centralized discussion point for all the citation and cite templates. —David Eppstein (talk) 20:22, 20 October 2023 (UTC)[reply]
I think this depends on which articles you are reviewing. There are plenty of useful places like S2CID 16831869. Citation bot already avoids adding s2cid where there are no sources.  — Chris Capoccia 💬 19:08, 25 October 2023 (UTC)[reply]
The example you cited is a poor example, because the publisher's page is open access; this citation should use doi-access=free and not include an s2cid. Citation bot already avoids adding s2cid where there are no sources – This is nowhere close to accurate. Citation bot adds tons of completely vacuous s2cids that provide no information beyond a link to the publisher page, more or less analogous to blogspam. –jacobolus (t) 19:14, 25 October 2023 (UTC)[reply]
You're not paying attention to what I wrote. Yes, it adds s2cid where the only link is the publisher's and is the same as the DOI. But it does not add s2cid where there are no sources.   — Chris Capoccia 💬 15:30, 26 October 2023 (UTC)[reply]
What do you think the point is of adding an S2CID containing no meaningful content beyond a link to the publisher's website which was also already included in the citation template? From my perspective, such S2CIDs are spam with zero redeeming value. –jacobolus (t) 15:57, 26 October 2023 (UTC)[reply]

Came across some more S2CID spam today which led me to this conversation. Is there an actual way to have an RfC or something for this? It's fine if humans want to add it, but for something with a DOI already there, having a bot add something that is pretty useless doesn't help. Why? I Ask (talk) 05:15, 12 November 2023 (UTC)[reply]

There was a comment about s2cid being useful when their servers are down, but Portico (https://www.portico.org/why-portico/) might be a good alternative. --SilverMatsu (talk) 03:01, 5 December 2023 (UTC)[reply]

When databases collide

This edit changed a proceedings title from the version given by DBLP ("Proceedings of the 22nd Annual European Symposium on Algorithms (ESA 2014), Wroclaw, Poland, September 8–10, 2014") to a much more concise version from another source ("Algorithms - ESA 2014"), maybe the publisher or maybe MathSciNet (both list it that way). Note that the actual publisher page for the full proceedings lists it as having the more detailed title "Algorithms - ESA 2014: 22th Annual European Symposium, Wrocław, Poland, September 8-10, 2014. Proceedings". The DBLP title is more or less what you get if you put that into a more intelligible order. Curiously, the bot left the DBLP title in place for the other citation it touched, from WG '92. I think that the DBLP version is better and that this level of change (not the correction of any actual error in a citation) constitutes WP:CITEVAR. Please stop. —David Eppstein (talk) 06:46, 30 October 2023 (UTC)[reply]

please link to the new Google books web pages

This edit changed links that consistently lead to the new Google books web pages to ones that do not. 50.47.144.129 (talk) 19:49, 30 October 2023 (UTC)[reply]

Good question. Right now wikipedia prefers https://books.google.com/books?id=fp9wrkMYHvMC but should this be swapped to https://www.google.com/books/edition/_/fp9wrkMYHvMC AManWithNoPlan (talk) 15:41, 7 November 2023 (UTC)[reply]
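
For reference, the swap being asked about could look roughly like the Python sketch below. It only covers the two URL shapes quoted in this thread; the function name `to_new_google_books_url` is hypothetical, and leaving any URL with extra query parameters untouched is a deliberate assumption, since this thread does not say how those should map.

```python
from urllib.parse import parse_qs, urlparse


def to_new_google_books_url(url):
    """Rewrite https://books.google.com/books?id=X to
    https://www.google.com/books/edition/_/X.

    URLs that are not in exactly that form (different host or path,
    or query parameters other than id) are returned unchanged.
    """
    parsed = urlparse(url)
    if parsed.netloc != "books.google.com" or parsed.path != "/books":
        return url
    query = parse_qs(parsed.query)
    if set(query) != {"id"} or len(query["id"]) != 1:
        return url  # assumption: do not guess how other parameters map
    return f"https://www.google.com/books/edition/_/{query['id'][0]}"
```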

Causing template errors

Status: new bug
Reported by: MisterTech (talk) 11:35, 1 November 2023 (UTC)[reply]
What happens: Citation bot is changing journal templates to book templates, leaving the journal parameter intact, which results in a template error.
What should happen: Citation bot should also change the journal parameter to a title parameter.
Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Draft%3AData_spaces&diff=1182601603&oldid=1181737145
We can't proceed until: Feedback from maintainers


I don't know that that's the solution ({{cite journal}} almost always already contains |title=), but despite being a sometimes commenter on this talk page, I actually came here now to report the same error at Special:Diff/1183763093. Maybe Citation bot should check for |periodical= and its aliases before changing the type of citation template wrapper. I've been working on Category:CS1 errors: periodical ignored (25,431), and I'm never going to be able to keep up with Citation bot creating this error. Folly Mox (talk) 13:16, 7 November 2023 (UTC)[reply]
In both cases, it is a bit of a problem with the wrong template and information being entered by a human being. I will see what more can be done. The bot did shrink Category:CS1 errors: periodical ignored (25,431) by tens of thousands a while back, but it seems to have hit a steady-state with the bot both fixing and adding members to this category. AManWithNoPlan (talk) 15:49, 7 November 2023 (UTC)[reply]
Special:Diff/1186701879 is another example, a few minutes ago, of Citation bot creating template errors by changing citation wrapper template without appropriate reparameterisation of values already present. For clarity, the existing citation ({{cite web}} to a publisher landing page for a book) wasn't great, but this behaviour is not desirable: altering the template called without checking whether it contains unsupported parameters.
For changing {{cite web}} to {{cite book}} where |website= is present, I can't think of a case where it would be an error to reparameterise |website= to |via=, unless |via= is already present. Folly Mox (talk) 00:09, 25 November 2023 (UTC)[reply]
I've begun reverting Citation bot whenever it produces this error, which seems to account for between 1% and 2% of its recent edits. I know I take a critical tone here frequently (still traumatised by ReferenceExpander), but Citation bot does a lot of really good work. I do appreciate it and the maintainers.
Also I'm aware that the whole reason this type of edit causes a template error in the first place is the underdiscussed and unnecessary removal of support for the |work= parameter from {{cite book}} without adequate preparation time.
I do plan to start contacting editors who frequently run Citation bot, introduce this error, and then never check the output or help fix it, as required by the guidance at the top of Citation bot's userpage. I know the responsibility does not fall solely on the maintainers. Folly Mox (talk) 18:21, 25 November 2023 (UTC)[reply]

Use of template "ODNB"

Citation bot changed one of the source descriptions in the article James Hamilton (English Army officer) from:

{{Cite web|last=Smith |first=Geoffrey |date=May 2006 |title=Armorer, Sir Nicholas (c.1620–1686) |website=[[Oxford Dictionary of National Biography]] |doi=10.1093/ref:odnb/94686 |url=http://www.oxforddnb.com/index/94686/ |access-date=13 May 2023 |url-access=subscription}}

to:

{{Cite ODNB|last=Smith |first=Geoffrey |date=May 2006 |title=Armorer, Sir Nicholas (c.1620–1686) |doi=10.1093/ref:odnb/94686 |url=http://www.oxforddnb.com/index/94686/ |access-date=13 May 2023 |url-access=subscription}}

I wondered why. I read up on Template:ODNB. It says it is a wrapper around Template:Cite encyclopedia. Well, perhaps I should not have used "Cite web" but "Cite encyclopedia" and Citation bot should probably have corrected me to:

{{Cite encyclopedia|last=Smith |first=Geoffrey |date=May 2006 |title=Armorer, Sir Nicholas (c.1620–1686) |encyclopedia=[[Oxford Dictionary of National Biography]] |edition=online |publisher=[[Oxford University Press]] |doi=10.1093/ref:odnb/94686 |url=http://www.oxforddnb.com/index/94686/ |access-date=13 May 2023 |url-access=subscription}}

However, I do not understand why we should be forced to use a wrapper around Cite encyclopedia rather than the original. I thought the use of the ODNB template was voluntary and not obligatory. With thanks and best regards Johannes Schade (talk) 13:10, 19 November 2023 (UTC)[reply]

Oversimplification of title

Status: new bug
Reported by: David Eppstein (talk) 21:12, 23 November 2023 (UTC)[reply]
What happens: I don't care what the publisher says the main title of a reference should be; the bot should not take more-detailed versions of the correct title and oversimplify them by only keeping the main title as it did in Special:Diff/1186533390. The same bug (in the form of removing subtitles from titles) has been reported here and archived months ago but the same misbehavior persists. If you don't stop it I am going to start routinely excluding this bot from articles I edit.
What should happen: Not that
We can't proceed until: Feedback from maintainers


Does Citation bot have consensus to be making changes to existing, human-added titles based solely on the metadata it scrapes? This doesn't seem like a good outcome most of the time. Folly Mox (talk) 22:21, 23 November 2023 (UTC)[reply]


There the bot is right though. The title is "Graph Drawing". As Springer themselves say, the suggested way to cite this is "Eppstein, D. (2009). Isometric Diamond Subgraphs. In: Tollis, I.G., Patrignani, M. (eds) Graph Drawing. GD 2008. Lecture Notes in Computer Science, vol 5417. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00219-9_37"

"16th International Symposium...." is the expanded subtitle of GD 2008. One could replace it with "Graph Drawing: 16th International Symposium..." instead of "Graph Drawing. GD 2008."

But the word "Proceedings" is nowhere in there, and shouldn't be. Headbomb {t · c · p · b} 00:42, 24 November 2023 (UTC)[reply]

The title is not "Graph Drawing". The title suggested at the top of the publisher web page for the individual doi is "International Symposium on Graph Drawing, GD 2008: Graph Drawing". The title given on the landing page for the book doi is "Graph Drawing 16th International Symposium, GD 2008, Heraklion, Crete, Greece, September 21-24, 2008, Revised Papers". The title printed on the cover of the book [1] is similar but with line breaks replacing more of the punctuation. The title given in DBLP [2] is almost the same, "Graph Drawing, 16th International Symposium, GD 2008, Heraklion, Crete, Greece, September 21-24, 2008. Revised Papers". The title given in zbMATH [3] is again almost the same, "Graph drawing. 16th international symposium, GD 2008, Heraklion, Crete, Greece, September 21–24, 2008. Revised papers". The title given in MathSciNet [4] is "Graph drawing. Revised papers from the 16th International Symposium (GD 2008) held in Heraklion, September 2008".
All of these are vastly preferable to "Graph Drawing" because they actually identify the precise volume that the work in question comes from, which "Graph Drawing" alone does not. Their preferability should be obvious to anyone who puts actual thought into what citations are for rather than thinking of them as mechanical reproductions of flawed databases. It is exactly that unthinking "we must do what our database of publisher titles says even when it is stupid and uninformative" attitude that I am objecting to here and will continue to strongly object to on individual articles where this attitude translates into disimprovements.
As well, the bot dropped the wikilink on the title into the bit bucket, when it would have been preferable to keep it or move it to a title-link parameter. —David Eppstein (talk) 01:43, 24 November 2023 (UTC)[reply]
I keep seeing more of these on my watchlist, and have begun completely blocking citation bot from the affected articles. It won't take much more of this continued damage for me to switch to completely blocking citation bot from all articles that I edit. —David Eppstein (talk) 22:54, 24 November 2023 (UTC)[reply]
I'm not there yet, but I did recently turn off the "hide bot edits" watchlist toggle for the first time in a decade or so because of this script. I don't think Citation bot is a bad tool – despite my accumulating complaints on this talk page – but it's not better than a human: just faster.
I do note that the BRFA that supported Citation bot adding missing parameters (Wikipedia:Bots/Requests for approval/DOI bot 2, 2008) specifically says If the CrossRef database contradicts the information in the article, the bot will stick with the data already in Wikipedia, and assume the error to be with CrossRef. This seems wise, and I'm wondering when the behaviour was changed, and where the consensus for the change arose. Folly Mox (talk) 23:26, 24 November 2023 (UTC)[reply]
If a citation includes a title-link parameter or a wikilink in the title itself, that seems like a pretty good sign that a human took the trouble to get the information right. A bot shouldn't override that. XOR'easter (talk) 18:57, 25 November 2023 (UTC)[reply]
This is a subgenre of the issue: existing parameters should be known before an edit is made. If |title-link= is present, |title= should not be altered outside of punctuation changes. If |periodical= (or one of its aliases) is present, the wrapper template should not be changed to {{cite book}}. If adding |chapter=, and |journal= or |issue= is present, the wrapper template should be changed to {{cite conference}} rather than {{cite book}}. If none of |title= and |chapter= match the existing |title= (delta punctuation), there's a mismatch between the database record and the work intended to be cited. Folly Mox (talk) 20:54, 25 November 2023 (UTC)[reply]
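
To make the four checks in the comment above concrete, here is a minimal Python sketch. It is one possible reading of those rules, not the bot's implementation: every function name is hypothetical, a citation is modelled as a plain dict of parameters, the alias list for |periodical= is an assumption that should be checked against the CS1 documentation, and "delta punctuation" is approximated by stripping ASCII punctuation only.

```python
import string

# Assumed aliases of |periodical=; verify against the CS1 documentation.
PERIODICAL_ALIASES = ("periodical", "journal", "magazine",
                      "newspaper", "website", "work")

_PUNCT = str.maketrans("", "", string.punctuation)


def strip_punct(text):
    """Drop ASCII punctuation and collapse whitespace."""
    return " ".join(text.translate(_PUNCT).split())


def choose_wrapper(current, params, adding_chapter):
    """Pick the wrapper for a citation the database calls a book chapter."""
    if adding_chapter and ("journal" in params or "issue" in params):
        return "cite conference"
    if any(alias in params for alias in PERIODICAL_ALIASES):
        return current  # cite book would flag the periodical as ignored
    return "cite book"


def title_change_allowed(params, new_title):
    """With |title-link= present, allow only punctuation-level changes."""
    if "title-link" not in params:
        return True
    return strip_punct(params.get("title", "")) == strip_punct(new_title)


def record_matches(params, db_title, db_chapter=None):
    """True if the existing |title= equals the database title or chapter,
    ignoring punctuation; False suggests the wrong record was found."""
    existing = strip_punct(params.get("title", ""))
    return any(existing == strip_punct(t) for t in (db_title, db_chapter) if t)
```

The conference rule is tested first because |journal= is also a periodical alias, so the order of the checks decides which rule wins when both apply.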

Ok, after seeing this keep going and going with no effort to fix or address the problem, I am going to start adding {{bots|deny=Citation bot}} to all new articles I create, instead of merely the ones where I see this happening. —David Eppstein (talk) 07:55, 3 December 2023 (UTC)[reply]

If this goes on for too much longer, the next step will be to ask for a full block of the bot. This continued non-response to this problem is unacceptable. —David Eppstein (talk) 01:11, 4 December 2023 (UTC)[reply]
I have added "graph drawing" to the rejection list. AManWithNoPlan (talk) 01:38, 4 December 2023 (UTC)[reply]
This applies to all Springer LNCS proceedings, not just that one. —David Eppstein (talk) 19:15, 5 December 2023 (UTC)[reply]
https://github.com/ms609/citation-bot/commit/6d644b3bbd7fa038c174e8977cb1ad3e09a60ba7 AManWithNoPlan (talk) 19:51, 5 December 2023 (UTC)[reply]

A class of new(?) errors

A user brought to my attention a possibly new type of error by the bot which causes "}}: |website= ignored" and "|journal= ignored" messages. I'm not clear on what's going on, so here are the diffs they found: 1, 2, 3, 4, 5. Abductive (reasoning) 22:19, 25 November 2023 (UTC)[reply]

It is often GIGO. The bot adds corrected parameters and leaves some bad stuff behind. AManWithNoPlan (talk) 22:03, 27 November 2023 (UTC)[reply]
I will look into trying to reduce this. I know that the bot fixes many more of these than it creates. AManWithNoPlan (talk) 14:22, 29 November 2023 (UTC)[reply]
Thank you for your work on this. It's probably difficult when almost all the engagement on this talkpage is complaints. I believe you when you say Citation bot has fixed this class of error more often than it has introduced it, and I appreciate that. Folly Mox (talk) 12:58, 1 December 2023 (UTC)[reply]
This has already improved a lot over where it was last week! Folly Mox (talk) 11:36, 3 December 2023 (UTC)[reply]
This bug seems eradicated for many of the more common cases, but I did find another example today, at Special:Diff/1189067865. My fix looked like this. Folly Mox (talk) 19:51, 9 December 2023 (UTC)[reply]
Ran into a few more GIGO style manifestations of this bug today, but also a bunch of conference proceedings hosted by Springer, all of which caused this same error: Special:Diff/1189110528, Special:Diff/1189117384, and Special:Diff/1189119689 (the last of which alone caused five errors). Folly Mox (talk) 03:37, 10 December 2023 (UTC)[reply]

Add Internet Archive Scholar links

Status: new feature request
Reported by: Nemo 22:07, 28 November 2023 (UTC)[reply]
What happens: Nothing
What should happen: Add links to Internet Archive Scholar archived copies, where available and found by DOI, if Unpaywall and PMC have none.
We can't proceed until: Feedback from maintainers


This should be relatively fast with the API; Google Scholar is doing the same and shows those OA links, which were generally archived due to being public domain or CC-licensed. You can see the docs at https://scholar.archive.org/api/redoc but here's an example:

$ curl -sH "Accept: application/json" "https://scholar.archive.org/search?q=doi:10.1080/14786449908621245" | jq -r '.results[0].fulltext.access_url'
https://archive.org/download/crossref-pre-1909-scholarly-works/10.1080%252F14786449608620921.zip/10.1080%252F14786449908621245.pdf

Optionally the metadata can be used to construct the scholar.archive.org URL, which in this case is https://scholar.archive.org/work/heaairhf5fgkvgie4h54rpc4nm/access/ia_file/crossref-pre-1909-scholarly-works/10.1080%252F14786449608620921.zip/10.1080%252F14786449908621245.pdf and for a wayback URL would be something like https://scholar.archive.org/work/rv4lw3nikrfstp7bvvlxapsylu/access/wayback/https://pubs.rsc.org/en/content/articlepdf/2022/dt/d2dt00998f . (This will reduce confusion by bots which think there's utility in converting web.archive.org links into something else.)

Nemo 22:07, 28 November 2023 (UTC)[reply]
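
For anyone who would rather prototype this outside the shell, the curl example above translates to roughly the following Python sketch. The response shape (`results[0].fulltext.access_url`) is taken only from the jq path in that example; the helper names `first_access_url` and `lookup_doi` are hypothetical, and the sketch has not been checked against the live API or its rate limits.

```python
import json
import urllib.parse
import urllib.request

SEARCH_URL = "https://scholar.archive.org/search"


def first_access_url(payload):
    """Return results[0].fulltext.access_url from a decoded search
    response, or None if any part of that path is missing."""
    results = payload.get("results") or []
    if not results:
        return None
    fulltext = results[0].get("fulltext") or {}
    return fulltext.get("access_url")


def lookup_doi(doi, timeout=10):
    """Query the search endpoint for one DOI and return the first
    full-text URL, or None. Performs a network request."""
    query = urllib.parse.urlencode({"q": f"doi:{doi}"})
    request = urllib.request.Request(
        f"{SEARCH_URL}?{query}",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return first_access_url(json.load(response))
```

Keeping the JSON handling in its own function means it can be exercised with canned responses, without touching the network.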

Seems like a lot of them are just copies of arXiv PDFs. AManWithNoPlan (talk) 17:16, 4 December 2023 (UTC)[reply]
If by "a lot" you mean about 2 million out of 25 million: yes, I'd expect the entire arxiv to be archived by IA scholar. There's no need to link these if there's already an arxiv identifier. (Though it's sad that the arxiv identifier doesn't auto-link.) Nemo 22:31, 4 December 2023 (UTC)[reply]
I am curious which type of url is best. I am always a bit leery of PDF links that do not end in PDF (option 3). I wonder if the first method would ever provide multiple options. AManWithNoPlan (talk) 22:13, 7 December 2023 (UTC)[reply]
Recommend the /download/ link, because it has the .pdf extension, it's more standard than the scholar.archive.org URLs, the URL is shorter and less complex, it's more aligned with where the content is actually located. scholar.archive.org is basically an index, not a repository. The data is hosted at //archive.org (that seems confusing since it's the same site but they are different servers). -- GreenC 01:21, 8 December 2023 (UTC)[reply]
As GreenC says, the archive.org/download/ links are usually preferred. In this case I'd prefer the scholar.archive.org resolver because 1) the edits will look more consistent, using the same domain name whether the PDF is under web.archive.org or archive.org, 2) some of these items might be split and relocated in the future, in which case the scholar.archive.org links will probably still work somewhat but the archive.org/download/ links may break. These are just aesthetic or very rare issues though.
I recommend using scholar.archive.org for the works which are linked to web.archive.org though, because bots and the cite templates themselves often complain about web.archive.org being in the url parameter, so you'd be forced to add all of url, archive-url, url-status=unfit and the entire family of parameters. Nemo 09:19, 8 December 2023 (UTC)[reply]

Adds doi-access=free for broken DOI

Status: new bug
Reported by: Nemo 11:19, 3 December 2023 (UTC)[reply]
What happens: special:diff/1188055766
What should happen: special:diff/1188109405
Replication instructions: Both the DOI and the PubMed full text link are broken and redirect to https://lww.com/pages/default.aspx .
We can't proceed until: Feedback from maintainers


Unfortunately this journal is not preserved so there are no archived copies either. Nemo 11:19, 3 December 2023 (UTC)[reply]

For the cases where the DOI used to provide a gratis copy but no longer does, see #Add Internet Archive Scholar links. Nemo 11:37, 3 December 2023 (UTC)[reply]
That the DOI is broken is a separate issue from its free-to-read status. Once repaired, the DOI will be free. Headbomb {t · c · p · b} 12:12, 3 December 2023 (UTC)[reply]
And how do you know that? Nemo 13:31, 3 December 2023 (UTC)[reply]
It's originally from Medknow. All Medknow journals/DOIs are open access. Headbomb {t · c · p · b} 13:54, 3 December 2023 (UTC)[reply]
Or were. Now that they've been migrated, anything could happen. This journal has a nonfree license so it could vanish unless someone archives it. If all Medknow DOIs are broken right now, I agree it's likely they'll be fixed within a few months by LWW, but in the meanwhile they're not a suitable link target so it makes no sense to add doi-access=free. Nemo 14:33, 3 December 2023 (UTC)[reply]
Actually, not all Medknow DOIs are broken, for example The Journal of Indian Prosthodontic Society has functioning DOIs issued by Springer, like doi:10.1007/s13191-013-0262-x, for 2010–2014. (Didn't check the rest of the archive.) Have you sampled the DOIs under non-Springer prefix to see how many are working? Nemo 14:47, 3 December 2023 (UTC)[reply]
"This journal has a nonfree license" CC BY-NC-SA is a free-to-read license. Headbomb {t · c · p · b} 15:12, 3 December 2023 (UTC)[reply]
"Actually, not all Medknow DOIs are broken" I compliment you on finding one that actually works. AManWithNoPlan (talk) 22:09, 3 December 2023 (UTC)[reply]
This just blew up because of this https://en.wikipedia.org/wiki/Category:CS1_maint:_DOI_inactive_as_of_December_2023 AManWithNoPlan (talk) 18:53, 4 December 2023 (UTC)[reply]
I personally patrol this page and report ALL bad DOIs. Many of them point to the wrong place since the journal has been purchased. Or they are data DOIs that are not part of crossref, so who knows. Or they are MedDontKnow. AManWithNoPlan (talk) 19:19, 4 December 2023 (UTC)[reply]
I have no idea what a "free-to-read license" is. A free license is a well-defined concept. A "free-to-read" source is an English-Wikipedia specific moving concept vaguely defined at Access indicators for url-holding parameters. Mixing the two expressions serves no purpose. Nemo 22:36, 4 December 2023 (UTC)[reply]
I think that the idea of open-source journals that you cannot find is funny, but I do think that keeping the DOIs in the articles is good, since you can sometimes google them and find a copy online. AManWithNoPlan (talk) 19:16, 5 December 2023 (UTC)[reply]
Keeping the DOI is useful, making it auto-link less so. Nemo 09:20, 8 December 2023 (UTC)[reply]
If the bot thinks the DOI works, then it will not add the free. AManWithNoPlan (talk) 14:23, 8 December 2023 (UTC)[reply]
That's an issue for the template to handle, not a reason to not flag things that should be flagged. And the template disables automatic linking via |doi-broken-date=. Headbomb {t · c · p · b} 15:01, 8 December 2023 (UTC)[reply]
Again, we can't know whether the DOI provides a free-to-read copy when we don't even know where the copy is supposed to be. (Yes I know we were discussing this elsewhere, I'm in a hurry now.) But it's good that the autolinking is disabled by the broken-doi parameter; the green lock should be as well. Nemo 15:09, 8 December 2023 (UTC)[reply]
"we can't know whether the DOI provides a free-to-read copy"
Yes we can. Headbomb {t · c · p · b} 15:33, 8 December 2023 (UTC)[reply]

Convert &#x00026; to &

Status
new bug
Reported by
Headbomb {t · c · p · b} 20:45, 5 December 2023 (UTC)[reply]
What should happen
[5]
We can't proceed until
Feedback from maintainers


Get a PMID API key

AManWithNoPlan (talk) 17:38, 6 December 2023 (UTC)[reply]

Specifying name list style for newly-added name entries

There is a pull request that allows specifying name list style for newly-added name entries: https://github.com/ms609/citation-bot/pull/4236

It adds an option alongside the already existing style of first1/last1, first2/last2, etc.

This pull request introduces the following functionality. If a page contains {{Use vanc name-list-style}} template, then the bot will use |vauthors= and |veditors= attributes rather than firstN/lastN and editor-firstN/editor-lastN when adding name entries for a citation template if the names were not specified in this template. This is similar to {{Use dmy dates}} template when the bot uses date format as specified on the page. To reproduce this behaviour, edit a page on Wikipedia, add {{Use vanc name-list-style}} template (or {{Use vanc name-list-style|date=December 2023}}), delete author names (firstN/lastN) and run the bot. It will fill the names as vauthors. Maxim Masiutin (talk) 16:48, 7 December 2023 (UTC)[reply]

Why does {{Use vanc name-list-style}} exist? Was there any discussion that brought it into existence? cs1|2 doesn't know anything about that template but will understand {{CS1 config|name-list-style=vanc}}. Why create a new otherwise non-functional template?
Trappist the monk (talk) 18:26, 7 December 2023 (UTC)[reply]
I agree, we can use {{CS1 config|name-list-style=vanc}}. Should we use {{CS1 config|name-list-style=vanc}}? If yes, I will update the pull request. Anyway, {{CS1 config|name-list-style=vanc}} is not currently supported by Citation bot. Maxim Masiutin (talk) 18:30, 7 December 2023 (UTC)[reply]
However, the templates {{CS1 config|name-list-style=vanc}} and {{Use vanc name-list-style}} are different. {{CS1 config|name-list-style=vanc}} controls how the names are displayed during the render, whereas {{Use vanc name-list-style}} does not affect the rendering but is a hint on whether the templates should use the first/last or vauthors attributes, in analogy to {{Use dmy dates}}, which also does not control the output but hints how the dates should be specified in the source. This answers your question on why {{Use vanc name-list-style}} exists and how it is different from {{CS1 config|name-list-style=vanc}}. Maxim Masiutin (talk) 15:33, 9 December 2023 (UTC)[reply]
You are mistaken. cs1|2 uses {{use dmy dates}} and {{use mdy dates}} to control date formatting when cs1|2 templates are rendered. See Template:Use dmy dates § Auto-formatting citation template dates for example. I see no reason to keep {{Use vanc name-list-style}}.
Trappist the monk (talk) 15:41, 9 December 2023 (UTC)[reply]
@Trappist the monk thank you for letting me know! Why then are there separate templates for use dmy dates if this can be solved by "{{CS1 config}}"? That is the same question you asked me about the name list style.
Anyway, my proposal is not about a particular template but about the functionality of the bot to adhere to the name list style specified for the page. My pull request can be adjusted to any template, and we need a consensus. Maxim Masiutin (talk) 16:16, 9 December 2023 (UTC)[reply]
The {{use xxx dates}} templates came first (January 2009). Development of Module:Citation (the predecessor to Module:Citation/CS1) began August 2012. Auto date formatting was added to Module:Citation/CS1 April 2019. Support for {{CS1 config}} was added August 2023. {{CS1 config}} applies only to cs1|2 templates but the {{use xxx dates}} templates apply to both the article body and to article referencing (regardless of how referencing is implemented).
Trappist the monk (talk) 16:37, 9 December 2023 (UTC)[reply]
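
The behaviour proposed in this thread can be sketched roughly as follows. This is only an illustration in Python, not the bot's actual code (the bot is a separate codebase, see the pull request linked above). The function names page_prefers_vanc and author_params are hypothetical, and which marker template the bot should honour had not been settled in the discussion, so the sketch accepts both candidates.

```python
import re

# Assumption: either marker template switches the page to Vancouver-style
# name lists. Which one (if any) the bot should honour was still undecided.
VANC_MARKERS = re.compile(
    r"\{\{\s*(?:Use vanc name-list-style"
    r"|CS1 config\s*\|\s*name-list-style\s*=\s*vanc)\s*(?:\||\}\})",
    re.IGNORECASE,
)


def page_prefers_vanc(wikitext: str) -> bool:
    """Return True if the page carries one of the (hypothetical) marker templates."""
    return VANC_MARKERS.search(wikitext) is not None


def author_params(authors, use_vanc: bool) -> dict:
    """Build citation parameters for newly added names.

    authors is a list of (last, first) tuples. With use_vanc the names are
    collapsed into a single |vauthors= value (surname plus initials);
    otherwise numbered |lastN= / |firstN= parameters are produced.
    """
    if use_vanc:
        names = []
        for last, first in authors:
            # Initials: first letter of each part of the given name.
            parts = [p for p in re.split(r"[\s\-.]+", first) if p]
            initials = "".join(p[0].upper() for p in parts)
            names.append(f"{last} {initials}".strip())
        return {"vauthors": ", ".join(names)}

    params = {}
    for i, (last, first) in enumerate(authors, start=1):
        params[f"last{i}"] = last
        params[f"first{i}"] = first
    return params
```

For example, a page containing {{Use vanc name-list-style|date=December 2023}} would make author_params([("Wirz", "Marc")], True) produce a single vauthors entry, while a page without a marker would get last1/first1 as before.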

Replace inf by sub tags

Status
new bug
Reported by
Headbomb {t · c · p · b} 15:27, 8 December 2023 (UTC)[reply]
What should happen
[6]
We can't proceed until
Feedback from maintainers


arxiv is not a journal

Status
new bug
Reported by
Trappist the monk (talk) 14:19, 9 December 2023 (UTC)[reply]
What happens
Bot changed an admittedly malformed {{cite journal}} template. In that template: |journal=arXiv, |doi=10.48550/arXiv.2206.12231, and |doi-access=free. The only action that the bot took was to convert |doi=10.48550/arXiv.2206.12231 to |arxiv=2206.12231.
What should happen
Bot should recognize that arXiv is not a journal so {{cite journal}} is the wrong template; should be changed to {{cite arxiv}}. When removing |doi=, the bot should always remove |doi-access=. Remember that {{cite arxiv}} supports a limited subset of the whole cs1|2 parameter set so other parameters in a {{cite journal}} → {{cite arxiv}} conversion may need to be removed. The limited parameter set is defined in Module:Citation/CS1/Whitelist lines 340–346.
Relevant diffs/links
Diff
We can't proceed until
Feedback from maintainers


If you run the bot again, then it does clean up. I will look at having it not take two times. AManWithNoPlan (talk) 15:10, 9 December 2023 (UTC)[reply]
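
A one-pass version of the cleanup requested in this report might look like the sketch below. It is a Python illustration only, not the bot's implementation. The function name convert_arxiv_citation is hypothetical, and ARXIV_ALLOWED is an assumed, illustrative subset of parameters; the authoritative list is the one in Module:Citation/CS1/Whitelist mentioned above.

```python
import re

# DOIs of the form 10.48550/arXiv.<id> are arXiv's own DOIs.
ARXIV_DOI = re.compile(r"^10\.48550/arXiv\.(.+)$", re.IGNORECASE)

# Assumption: illustrative subset only. The real limited parameter set is
# defined in Module:Citation/CS1/Whitelist and is not reproduced here.
ARXIV_ALLOWED = {"arxiv", "class", "title", "date", "year",
                 "last", "first", "author", "vauthors"}


def convert_arxiv_citation(template: str, params: dict):
    """Return (template, params) after arXiv cleanup, done in a single pass."""
    params = dict(params)  # do not mutate the caller's dict

    match = ARXIV_DOI.match(params.get("doi", "").strip())
    if match:
        # Removing |doi= must always remove |doi-access= as well.
        params.pop("doi")
        params.pop("doi-access", None)
        params.setdefault("arxiv", match.group(1))

    is_journal = template.strip().lower() == "cite journal"
    journal_is_arxiv = params.get("journal", "").strip().lower() == "arxiv"
    if is_journal and journal_is_arxiv and "arxiv" in params:
        # arXiv is not a journal: switch template and drop unsupported parameters.
        template = "cite arxiv"
        params.pop("journal")
        params = {
            key: value
            for key, value in params.items()
            # Strip a trailing number so last1, first2, etc. match their base name.
            if re.sub(r"\d+$", "", key) in ARXIV_ALLOWED
        }
    return template, params
```

Applied to the template in this report (|journal=arXiv, |doi=10.48550/arXiv.2206.12231, |doi-access=free), the sketch would switch to {{cite arxiv}}, keep |arxiv=2206.12231, and drop both |doi-access= and |journal= in the same run.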

Bug? The bot should not replace first/last with first1/last1 when there is just one author

According to Help:Citation Style, an author may be cited using separate parameters for the author's surname and given name by using |last= and |first=.

However, the bot replaces |last= and |first= with |last1= and |first1= even when there is just one author, which is contrary to the description of the CS1 Citation Style.

The bot probably should not go back and revert the ones it has already changed, but it should definitely avoid making this change in the future. Also, when no authors were specified and there is a single author to add, the bot should use |last= and |first=.

If you agree with that, I can try to submit a pull request. Maxim Masiutin (talk) 15:38, 9 December 2023 (UTC)[reply]

Could you give an example of where the bot changed last to last1 when there is no second author? AManWithNoPlan (talk) 16:34, 9 December 2023 (UTC)[reply]

unsupported parameters when changing template type to cite document

Status
new bug
Reported by
Folly Mox (talk) 19:11, 9 December 2023 (UTC)[reply]
What happens
In the normal course of removing proxy urls that duplicate stable identifiers, Citation bot removed the url from a source that was not published in a journal or book, and accordingly altered the citation template type to {{cite document}}. This caused unidentified parameter errors for |citeseerx= and |s2cid=.
Relevant diffs/links
Special:Diff/1189060531
We can't proceed until
Feedback from maintainers


This was my fix: changing back to {{cite web}}, adding the url of the source, and an unrelated fix to |publisher=. I'm not sure this is really Citation bot's fault, or if maybe the parameter set supported by {{cite document}} ought to be expanded to allow for more stable identifiers. Pinging Trappist the monk as the template maintainer, to see if they have input. Folly Mox (talk) 19:11, 9 December 2023 (UTC)[reply]

Yeah, converting {{cite web}} to {{cite document}} is not going to work when |url=, |citeseerx=, and |s2cid= have assigned values. |s2cid= is excluded from {{cite document}} because links to readable copies of the source from that identifier are hit-or-miss at best (recall the plethora of complaints about the bot adding |s2cid= that have been voiced on this talk page). |citeseerx= is excluded because we have {{cite citeseerx}}.
Because the original template had |citeseerx=10.1.1.42.3374, an alternate fix might be:
{{cite citeseerx |last=Wirz |first=Marc |title=Characterizing the Grzegorczyk hierarchy by safe recursion |date=November 1999 |citeseerx=10.1.1.42.3374}}
Wirz M (November 1999). "Characterizing the Grzegorczyk hierarchy by safe recursion". CiteSeerX 10.1.1.42.3374.
{{cite document}} is a 'last resort' sort of template when absolutely none of the other cs1|2 templates apply. The bot should avoid using {{cite document}} because, almost always, there is a better choice.
Trappist the monk (talk) 19:47, 9 December 2023 (UTC)[reply]
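
The check described in this thread could be expressed along these lines. This is a hedged Python sketch, not the bot's code. The function names are hypothetical, and the blocking set contains only the three parameters named in the comment above; the complete list of parameters that {{cite document}} rejects is not given in this discussion.

```python
# Assumption: only the parameters named in this thread are listed here.
BLOCKS_CITE_DOCUMENT = ("url", "citeseerx", "s2cid")


def cite_document_blockers(params: dict) -> list:
    """Return the parameters with assigned values that {{cite document}} would reject."""
    return [name for name in BLOCKS_CITE_DOCUMENT if params.get(name, "").strip()]


def suggest_template(current: str, params: dict) -> str:
    """Pick a template, treating {{cite document}} as a last resort."""
    if params.get("citeseerx", "").strip():
        # A dedicated template exists for CiteSeerX identifiers.
        return "cite citeseerx"
    if cite_document_blockers(params):
        # Converting would cause unsupported-parameter errors, so keep the template.
        return current
    return "cite document"
```

A caller would still need to confirm that the remaining parameters are supported by whichever template is returned; the sketch only covers the cases raised in this report.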

10.15347 is free to read

WikiJournals Headbomb {t · c · p · b} 15:36, 10 December 2023 (UTC)[reply]

One cache to rule them all.

Note to me for when I have time. AManWithNoPlan (talk) 16:26, 10 December 2023 (UTC)[reply]

Conversion to MathML conflicts with the Math extension

Status
new bug
Reported by
Salix alba (talk): 17:03, 10 December 2023 (UTC)[reply]
What happens
The bot replaced the wikitext:
''b'' → ''s'' ℓ<sup>+</sup> ℓ<sup>−</sup>

with the MathML text

<math><mrow>b<mo stretchy="false">→s<msup><mrow>ℓ</mrow><mrow>+</mrow></msup><msup><mrow>ℓ</mrow><mrow>−</mrow></msup></mrow></math>

This conflicts with the maths extension and in turn causes a maths syntax error.

What should happen
No MathML text should be generated
Relevant diffs/links
https://en.wikipedia.org/w/index.php?title=LHCb_experiment&diff=prev&oldid=1188666266
We can't proceed until
Feedback from maintainers