User talk:Citation bot: Difference between revisions
:Because Billboard is a magazine, whether it's online or in print is irrelevant.  <span style="font-variant:small-caps; whitespace:nowrap;">[[User:Headbomb|Headbomb]] {[[User talk:Headbomb|t]] · [[Special:Contributions/Headbomb|c]] · [[WP:PHYS|p]] · [[WP:WBOOKS|b]]}</span> 05:05, 9 January 2024 (UTC)
::Agree. Online sources that are books should use cite book. Online sources that are magazine articles should use cite magazine. Online sources that are journal articles should use cite journal. It is simply false that "Any online source should use cite web.". These are not errors and should not be "fixed". —[[User:David Eppstein|David Eppstein]] ([[User talk:David Eppstein|talk]]) 06:32, 9 January 2024 (UTC)
:::I disagree. A web source should use cite web. Take a look sometime at the kinds of errors found on the Jazz Cleanup Listing. I didn't change them BECAUSE they used cite news. I changed them because the cite news usages were creating error messages as found in the Cleanup Listing.[[User:Vmavanti|Vmavanti]] ([[User talk:Vmavanti|talk]]) 15:33, 9 January 2024 (UTC)
:::I think it's useful to distinguish when a reference is for a print magazine, though, since the parameters will be different (presence of page numbers, date of publication, quite often the same article has different titles in print vs online). WP:SAYWHEREYOUGOTIT, if you got it from the website that’s different from a print magazine — especially for older articles which might have digitization errors from OCR or if an online version has (sometimes silently) made emendations.
:::Billboard is also an online database, and references to that as a magazine I think also confuses things.
:::This is different of course from digital facsimiles of magazines and books which are identical in all respects to the paper versions including pagination. [[User:Umimmak|Umimmak]] ([[User talk:Umimmak|talk]]) 06:43, 9 January 2024 (UTC)
:::(To clarify, if it's an actual news article I think {{tl|cite news}} is still better than {{tl|cite web}} — I don't think all online content should be referenced via {{tl|cite web}}.) [[User:Umimmak|Umimmak]] ([[User talk:Umimmak|talk]]) 06:53, 9 January 2024 (UTC)
::::: I'm basing my judgment on 1) common sense (a web source uses cite web); 2) it's an easier template for contributors to use, based on the number of errors I have seen over eight years of editing when it comes to using cite news or cite magazine. Ask a member of the public. I have spoken to many of them over the years. I have seen their successes and their mistakes. Plenty.[[User:Vmavanti|Vmavanti]] ([[User talk:Vmavanti|talk]]) 15:33, 9 January 2024 (UTC)
== Removal of via parameters ==
Revision as of 15:33, 9 January 2024
You may want to increment {{Archive basics}} to |counter= 39 as User talk:Citation bot/Archive 38 is larger than the recommended 150Kb.
This page has archives. Sections older than 90000 days may be automatically archived by ClueBot III when more than 4 sections are present. |
Note that the bot's maintainer and assistants (Thing 1 and Thing 2) can go weeks without logging in to Wikipedia. The code is open source and interested parties are invited to assist with the operation and extension of the bot.
Before reporting a bug, please note: Addition of DUPLICATE_xxx= to citation templates by this bot is a feature. When there are two identical parameters in a citation template, the bot renames one to DUPLICATE_xxx=. The bot is pointing out the problem with the template. The solution is to choose one of the two parameters and remove the other one, or to convert it to an appropriate parameter. A 503 error means that the bot is overloaded and you should try again later – wait at least 15 minutes and then complain here.
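The renaming described above can be illustrated with a minimal Python sketch. This is not the bot's actual code: the function name `rename_duplicates` and the list-of-pairs representation are assumptions made for illustration, and the sketch assumes it is the later occurrence that gets renamed (the note above only says the bot renames "one" of them).

```python
def rename_duplicates(params):
    """Rename repeated parameter names to DUPLICATE_<name>.

    `params` is a list of (name, value) pairs in template order.
    Assumption: the first occurrence keeps its name and later ones are renamed.
    A third occurrence would produce a second DUPLICATE_<name>; that case is
    not handled in this sketch.
    """
    seen = set()
    renamed = []
    for name, value in params:
        key = name.strip()
        if key in seen:
            renamed.append(("DUPLICATE_" + key, value))
        else:
            seen.add(key)
            renamed.append((name, value))
    return renamed
```

A human editor would then pick one of `title` and `DUPLICATE_title` and remove or convert the other, as the note says.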
Please click here to report an error.
Or, for a faster response from the maintainers, submit a pull request with appropriate code fix on GitHub, if you can write the needed code.
Support for Who's Who
- Status: new bug
- Reported by: Jonatan Svensson Glad (talk) 01:06, 29 July 2023 (UTC)
- What should happen: Implement support to expand from https://doi.org/10.1093/ww/9780199540884.013.U192476 to {{Who's Who}}. Example: https://en.wikipedia.org/w/index.php?title=Friern_Hospital&diff=prev&oldid=1167644213
- We can't proceed until: Feedback from maintainers
Alternatively, deny all edits on 10.1093/ww/... doi's. Jonatan Svensson Glad (talk) 01:06, 29 July 2023 (UTC)
- Or perhaps the entire 10.1093-prefix of doi's since we don't have support for {{cite ODNB}} either (example). Jonatan Svensson Glad (talk) 21:11, 29 July 2023 (UTC)
- Actually we do have {{cite ODNB}} support. AManWithNoPlan (talk) 02:11, 5 November 2023 (UTC)
Changing every citation of a publisher's webpage to Cite book
I have remained silent on this issue even though it has irritated me for a while now. And now that there is discussion above about the widespread useless cosmetic edits this bot continues to waste everyone's time with, I'll raise it: Why must every citation of a publisher's webpage be changed to Cite book? I can only speak for myself, but every time I cite such book webpages I am not citing the book itself. I am specifically referencing the information published on the webpage. So of course I do not want the citation to be changed to Cite book with a bunch of parameters of the book itself (ISBN, date, etc) added. So I inevitably stop the bot or replace the reference with a third-party source. I realise the defense will be "It doesn't hurt" or that some users are actually citing the book. And I realise this is not the most pressing issue, but why must the bot come to its own conclusion of the editor's intent? I see another user complained of this issue last year. Οἶδα (talk) 22:25, 27 September 2023 (UTC)
- This may be the kind of situation where it's safest to explicitly tell citation bot not to muck with the citation. It's hard to automatically judge whether the human editor actually wanted "cite web" or "cite book". (There are many examples of people using "cite web" to cite resources that should actually be books, journal articles, etc.) –jacobolus (t) 01:38, 28 September 2023 (UTC)
- I understand. But it still feels like another unnecessary task for this bot to insert itself into every article it can possibly find. For example, this edit is completely useless and actually corrupts my intention of the citation. Call me crazy but I don't want or need a bot telling me what I am citing (and actively altering my citations accordingly). Οἶδα (talk) 21:32, 13 October 2023 (UTC)
- When I've quoted publisher blurbs in the past, I usually set |type=publisher's blurb for clarity. In the specific case you've linked just above, another option would be not to cite the publisher's landing page at all, and add the book to a "Selected works" subsection or something. Indeed, the altered citation is sequential to another one, and so seems a bit superfluous. Or, alternatively, use "Citation bot bypass" somewhere in your citation as suggested by jacobolus above. Given the overall lazy referencing culture of less experienced editors, it's likely that in the majority of cases, people who drop a link to a publisher landing page are probably trying to cite the book itself, so this behaviour of assuming that's the case is net beneficial. Folly Mox (talk) 22:13, 13 October 2023 (UTC)
- I cannot personally maintain that the majority of users citing a publisher's webpage are lazily intending to cite the book itself. My experience suggests otherwise which is why I have taken issue, but I realise my editing purview might be skewed. However, if that is observably true then I will resign myself to accepting this as a forgivable externality. Οἶδα (talk) 06:35, 14 October 2023 (UTC)
- In fairness to your point, I haven't looked into the data about how frequently this sort of change is appropriate; it could be the case that my own perspective is the skewed one. Folly Mox (talk) 08:32, 14 October 2023 (UTC)
- I couldn't find a list of tasks that the bot has been approved for (other than the very first approval) nor a thorough description of all of its mystical activities. I was surprised to find it would change "Cite web" to "Cite book" (for unclear reasons). The only cure, if the bot is unchanged, seems to be the <!-- Citation bot bypass--> mechanism documented at User:Citation_bot#Stopping_the_bot_from_editing - R. S. Shaw (talk) 04:12, 6 December 2023 (UTC)
Redaction
Add redaction information. AManWithNoPlan (talk) 01:19, 30 September 2023 (UTC)
- https://www.crossref.org/blog/news-crossref-and-retraction-watch/ AManWithNoPlan (talk) 02:45, 26 December 2023 (UTC)
Upon finding this and finding 10.1234/OnSantasNaughtyList in the list of bad papers (the list entries mostly have DOIs, but a few only have a PMID):
{{cite....doi=10.1234/OnSantasNaughtyList...}}
the bot would convert that to:
{{cite....doi=10.1234/OnSantasNaughtyList...}}{{retracted|pmc=3127434|pmid=21685368}}
We don't want to end up with this:
{{cite....doi=10.1234/OnSantasNaughtyList...}}{{retracted|pmc=3127434|pmid=21685368}}{{retracted|pmc=3127434|pmid=21685368}}{{retracted|pmc=3127434|pmid=21685368}}{{retracted|pmc=3127434|pmid=21685368}} ... {{retracted|pmc=3127434|pmid=21685368}}
after running the bot a dozen times. So, the bot would need to have some sanity checking on what comes after the template, and err a bit on the side of caution. The bot would first check if the page already included any {{retracted}} templates, and if so, then do some more checking. AManWithNoPlan (talk) 02:45, 26 December 2023 (UTC)
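The "don't add it a dozen times" check can be sketched in Python as follows. This is only an illustration under stated assumptions, not the bot's implementation: the function name `tag_retracted` is hypothetical, the DOI is assumed to be written as `doi=<value>` with no spaces around the equals sign, and no nested template is assumed to appear between the DOI and the closing `}}` of the citation.

```python
import re

# Matches a {{retracted ...}} notice that directly follows a citation,
# allowing optional whitespace in between.
RETRACTED_RE = re.compile(r"\s*\{\{\s*retracted\b", re.IGNORECASE)


def tag_retracted(wikitext, doi, notice):
    """Append `notice` after each citation containing `doi`, at most once."""
    needle = "doi=" + doi
    out = []
    emitted = 0  # everything before this index has already been copied to `out`
    search = 0   # where to look for the next occurrence of the DOI
    while True:
        hit = wikitext.find(needle, search)
        if hit == -1:
            break
        after = hit + len(needle)
        search = after
        # Skip longer DOIs that merely share this prefix.
        if wikitext[after:after + 1] not in ("", "|", "}", " ", "\n"):
            continue
        end = wikitext.find("}}", after)
        if end == -1:
            break
        end += 2
        out.append(wikitext[emitted:end])
        # Err on the side of caution: only add the notice if none follows already.
        if not RETRACTED_RE.match(wikitext, end):
            out.append(notice)
        emitted = end
        search = end
    out.append(wikitext[emitted:])
    return "".join(out)
```

Running the function on its own output leaves the text unchanged, which is the property asked for above.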
Semantic scholar links continue to mostly consist of spam
Can Citation bot please stop littering every s2cid it can find wherever it can possibly fit? The vast majority of these links contain zero useful information beyond a (redundant) link to the publisher's website (typically paywalled), and putting them on every citation in Wikipedia is more or less spam. It's a distracting waste of space with no redeeming benefits.
The easiest solution here would be to deprecate the s2cid parameter from the citation templates, hide them from the output, and just be done with it.
Next best, probably my personal recommendation, would be that only humans should ever add s2cid links (and ideally the ones which were added by a bot in the past should be removed), or, barring that, that a human should manually review any s2cid that gets added by any bot. At the very least, the bot should try to check them for meaningful content and skip the vast majority of totally useless ones going forward. –jacobolus (t) 18:13, 20 October 2023 (UTC)
- Agree totally; please stop the s2cid spam. Esculenta (talk) 18:48, 20 October 2023 (UTC)
- Agreed as well! They only got added because of someone who works for Semantic Scholar (Help talk:Citation Style 1/Archive 66#Request to add Semantic Scholar IDs to the citation template). If there is truly a consensus among editors working on a page that it would improve the citation to include an |s2cid= … fine I guess, but a vast majority of the time someone who has never edited a given article runs the prompt and the bot clutters up all the citations with a spammy parameter without any human editors actively wanting it there. Umimmak (talk) 18:59, 20 October 2023 (UTC)
- Although I don't agree that s2cid is spam, still, the point is not whether it is spam or not, but how to tell the bot not to add this attribute.
- One option could have been via a template. For example, in cs1 config we may add an attribute s2cid=disabled (or any other boolean value that means no or false or zero). Another option is to use "bots" template. For example, on my user page I can specify {{bots|optout=cs1-errors}}. We may add an attribute such as {{bots|optout=s2cid}}
- Whichever option you prefer, we need a consensus. With a consensus, I can ask the citation bot developers to accept this feature via my source code pull request. Maxim Masiutin (talk) 00:29, 4 January 2024 (UTC)
- In my opinion the bot should never add this template parameter, and should remove every existing one that was ever added by a bot. In theory, the parameter would be okay in cases where it adds a new unique access to the full text which was not otherwise available. I have literally never seen this happen in practice. –jacobolus (t) 04:47, 4 January 2024 (UTC)
- I saw that. Maxim Masiutin (talk) 04:53, 4 January 2024 (UTC)
- I would also be supportive of "deprecate the s2cid parameter from the citation templates, hide them from the output, and just be done with it", along with stopping the bot from adding them. Unlike most of the other codes we use, I cannot remember ever seeing a case where these were useful. Stopping the bot is on-topic here but the other stuff should probably be discussed on Help talk:Citation Style 1, which is the centralized discussion point for all the citation and cite templates. —David Eppstein (talk) 20:22, 20 October 2023 (UTC)
- I think this depends on which articles you are reviewing. There are plenty of useful places like S2CID 16831869. Citation bot already avoids adding s2cid where there are no sources. — Chris Capoccia 💬 19:08, 25 October 2023 (UTC)
- The example you cited is a poor example, because the publisher's page is open access; this citation should use doi-access=free and not include an s2cid. "Citation bot already avoids adding s2cid where there are no sources" – This is nowhere close to accurate. Citation bot adds tons of completely vacuous s2cids that provide no information beyond a link to the publisher page, more or less analogous to blogspam. –jacobolus (t) 19:14, 25 October 2023 (UTC)
- You're not paying attention to what I wrote. Yes it adds s2cid where only link is publishers and same as DOI. But it does not add s2cid where there are no sources. — Chris Capoccia 💬 15:30, 26 October 2023 (UTC)
- What do you think the point is of adding an S2CID containing no meaningful content beyond a link to the publisher's website which was also already included in the citation template? From my perspective, such S2CIDs are spam with zero redeeming value. –jacobolus (t) 15:57, 26 October 2023 (UTC)
Came across some more S2CID spam today which led me to this conversation. Is there an actual way to have an RfC or something for this? It's fine if humans want to add it, but for something with a DOI already there, having a bot add something that is pretty useless doesn't help. Why? I Ask (talk) 05:15, 12 November 2023 (UTC)
- There was a comment about s2cid being useful when their servers are down, but Portico (https://www.portico.org/why-portico/) might be a good alternative. --SilverMatsu (talk) 03:01, 5 December 2023 (UTC)
- Portico only helps when "triggered" (e.g. the publisher goes bankrupt). Internet Archive scholar keeps track of the available archives and is more suitable for such a use case: see #Add Internet Archive Scholar links. Nemo 13:24, 4 January 2024 (UTC)
- I personally like Semantic Scholar but I never use s2cid links from English Wikipedia citation templates. It's one of those IDs which are useful sometimes when everything else fails, but should probably be hidden by the citation templates in most cases. I don't know whether it's realistic to get such a change implemented in the citation templates though. Nemo 13:24, 4 January 2024 (UTC)
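The {{bots|optout=s2cid}} idea raised in this thread could look roughly like the Python sketch below. It is a hedged illustration only: `s2cid` as an opt-out value is the proposal made above and is not an existing feature, the function name `opted_out` is hypothetical, and the parser only handles a simple {{bots}} call with no nested templates.

```python
import re

# A simple {{bots|...}} call with no nested templates inside it.
BOTS_RE = re.compile(r"\{\{\s*bots\s*\|([^{}]*)\}\}", re.IGNORECASE)


def opted_out(wikitext, feature):
    """Return True if the page carries {{bots|optout=...}} listing `feature`."""
    wanted = feature.strip().lower()
    for match in BOTS_RE.finditer(wikitext):
        for part in match.group(1).split("|"):
            name, sep, value = part.partition("=")
            if sep and name.strip().lower() == "optout":
                values = [v.strip().lower() for v in value.split(",")]
                if wanted in values:
                    return True
    return False
```

Under this sketch the bot would call `opted_out(page_text, "s2cid")` before adding the parameter and skip the addition when it returns True.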
Causing template errors
- Status: new bug
- Reported by: MisterTech (talk) 11:35, 1 November 2023 (UTC)
- What happens: Citation bot is changing journal templates to book templates, leaving the journal parameter intact which results in a template error.
- What should happen: Citation bot should also change the journal parameter to a title parameter
- Relevant diffs/links: https://en.wikipedia.org/w/index.php?title=Draft%3AData_spaces&diff=1182601603&oldid=1181737145
- We can't proceed until: Feedback from maintainers
- I don't know that that's the solution ({{cite journal}} almost always already contains |title=), but despite being a sometimes commenter on this talk page, I actually came here now to report the same error at Special:Diff/1183763093. Maybe Citation bot should check for |periodical= and its aliases before changing the type of citation template wrapper. I've been working on Category:CS1 errors: periodical ignored (23,200), and I'm never going to be able to keep up with Citation bot creating this error. Folly Mox (talk) 13:16, 7 November 2023 (UTC)
- In both cases, it is a bit of a problem with the wrong template and information being entered by a human being. I will see what more can be done. The bot did shrink Category:CS1 errors: periodical ignored (23,200) by tens of thousands a while back, but it seems to have hit a steady-state with the bot both fixing and adding members to this category. AManWithNoPlan (talk) 15:49, 7 November 2023 (UTC)
- Special:Diff/1186701879 is another example, a few minutes ago, of Citation bot creating template errors by changing citation wrapper template without appropriate reparameterisation of values already present. For clarity, the existing citation ({{cite web}} to a publisher landing page for a book) wasn't great, but this behaviour is not desirable: altering the template called without checking whether it contains unsupported parameters. For changing {{cite web}} to {{cite book}} where |website= is present, I can't think of a case where it would be an error to reparameterise |website= to |via=, unless |via= is already present. Folly Mox (talk) 00:09, 25 November 2023 (UTC)
- I've begun reverting Citation bot whenever it produces this error, which seems to account for between 1% and 2% of its recent edits. I know I take a critical tone here frequently (still traumatised by ReferenceExpander), but Citation bot does a lot of really good work. I do appreciate it and the maintainers. Also I'm aware that the whole reason this type of edit causes a template error in the first place is the underdiscussed and unnecessary removal of support for the |work= parameter from {{cite book}} without adequate preparation time. I do plan to start contacting editors who frequently run Citation bot, introduce this error, and then never check the output or help fix it, as required by the guidance at the top of Citation bot's userpage. I know the responsibility does not fall solely on the maintainers. Folly Mox (talk) 18:21, 25 November 2023 (UTC)
- Book series occupy a space somewhere between books and journals. Similar to journals, book series are published on a regular (or at least semi-regular) basis with volume numbers. To at least some editors, book series look like journals, so I think it should not be surprising that they sometimes use the {{cite journal}} template for book series. And it is not just editors. Automated tools like WP:RefToolbar will generate at least a partial citation to the chapter selecting "cite journal" and specifying a DOI. In contrast RefToolbar "cite book" option does not even support chapters. Citeoid in VisualEditor will also generate a "cite journal" template including the chapter as the title if the DOI is entered. Wikipedia template filling tool will also generate cite journal templates for chapters in book series. I would argue that using {{cite journal}} for chapters in book series is not wrong. Furthermore this usage does not throw any error messages. Converting journals to books without reparameterization is throwing errors. IMHO, Citation Bot should not convert these templates unless it also adjusts the parameters. I think the principle of first, do no harm applies here. Boghog (talk) 12:21, 1 January 2024 (UTC)
- I've run across a number of cases that fall into this genre, most recently this one. Here, the book series is Journal of Neural Transmission. Supplementum at Springer; others I can remember off the top of my head are Methods in Enzymology and Progress in Drug Research, both at Elsevier. It seems to me that if Citation bot is both 1. altering the template type to {{cite book}} while 2. getting valid data for a parameter named journal (as can be seen in the linked diff: it expands the abbreviated form), then the bot should be reparameterising |journal= to |series= if |series= is not set. I've stated before that Citation bot needs to have more awareness of what parameters are present in the citation it's editing when it changes the template type, but it also occurs to me that it's way too aggressive at changing templates to {{cite book}} whenever it finds an isbn. A lot of the errors stem from editors citing webpages with bibliographic information (like library records, publisher landing pages, or book retailers) in order to establish the existence of a book, which is not great practice and has been discussed on this talkpage before. But many other errors come from the fact that conference proceedings and journal issues can also have isbns, and those require different parameters and are created using different templates by other citation tools. In my journey through Special:RandomInCategory/CS1 errors: periodical ignored, my rough estimate is that 50% of these errors (±10%) are introduced by Citation bot. Folly Mox (talk) 11:42, 5 January 2024 (UTC)
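The reparameterisation suggestions in this thread (|website= to |via= unless |via= is present, |journal= to |series= if |series= is not set, and otherwise leaving the template alone) can be put together in a short Python sketch. It is a sketch under assumptions rather than anything the bot does today: the function name `to_cite_book` and the dict representation are hypothetical, and the alias tuple lists the periodical parameters as I understand them from the CS1 documentation.

```python
# Periodical-type parameters that {{cite book}} reports as "ignored".
PERIODICAL_ALIASES = ("journal", "magazine", "newspaper", "periodical", "website", "work")

# Renames proposed in the discussion above.
RENAMES = {"website": "via", "journal": "series"}


def to_cite_book(params):
    """Return parameters adjusted for {{cite book}}, or None to skip the conversion.

    `params` maps parameter names to values for the existing citation.
    "First, do no harm": if a periodical parameter cannot be moved safely,
    the caller should leave the template type unchanged.
    """
    new = dict(params)
    for old, target in RENAMES.items():
        if old in new:
            if target in new:
                return None  # target already used, so do not guess
            new[target] = new.pop(old)
    if any(alias in new for alias in PERIODICAL_ALIASES):
        return None  # e.g. |work= has no agreed mapping in this sketch
    return new
```

A caller would change the wrapper to {{cite book}} only when the function returns a dict, which keeps the conversion from adding new members to Category:CS1 errors: periodical ignored.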
A class of new(?) errors
A user brought to my attention a possibly new type of error by the bot which causes "}}: |website= ignored" and "|journal= ignored" messages. I'm not clear on what's going on, so here are the diffs they found: 1, 2, 3, 4, 5. Abductive (reasoning) 22:19, 25 November 2023 (UTC)
- It is often GIGO. The bot adds corrected parameters and leaves some bad stuff behind. AManWithNoPlan (talk) 22:03, 27 November 2023 (UTC)
- I will look into trying to reduce this. I know that the bot fixed many more of these than it creates. AManWithNoPlan (talk) 14:22, 29 November 2023 (UTC)
- Thank you for your work on this. It's probably difficult when almost all the engagement on this talkpage is complaints. I believe you when you say Citation bot has fixed this class of error more often than it has introduced it, and I appreciate that. Folly Mox (talk) 12:58, 1 December 2023 (UTC)
- This has already improved a lot over where it was last week! Folly Mox (talk) 11:36, 3 December 2023 (UTC)
- This bug seems eradicated for many of the more common cases, but I did find another example today, at Special:Diff/1189067865. My fix looked like this. Folly Mox (talk) 19:51, 9 December 2023 (UTC)
- Ran into a few more GIGO style manifestations of this bug today, but also a bunch of conference proceedings hosted by Springer, all of which caused this same error: Special:Diff/1189110528, Special:Diff/1189117384, and Special:Diff/1189119689 (the last of which alone caused five errors). Folly Mox (talk) 03:37, 10 December 2023 (UTC)
Add Internet Archive Scholar links
- Status: new feature request
- Reported by: Nemo 22:07, 28 November 2023 (UTC)
- What happens: Nothing
- What should happen: Add links to Internet Archive Scholar archived copies, where available and found by DOI, if Unpaywall and PMC have none.
- We can't proceed until: Feedback from maintainers
This should be relatively fast with the API; Google Scholar is doing the same and shows those OA links, which were generally archived due to being public domain or CC-licensed. You can see the docs at https://scholar.archive.org/api/redoc but here's an example:
$ curl -sH "Accept: application/json" "https://scholar.archive.org/search?q=doi:10.1080/14786449908621245" | jq -r '.results[0].fulltext.access_url'
https://archive.org/download/crossref-pre-1909-scholarly-works/10.1080%252F14786449608620921.zip/10.1080%252F14786449908621245.pdf
Optionally the metadata can be used to construct the scholar.archive.org URL, which in this case is https://scholar.archive.org/work/heaairhf5fgkvgie4h54rpc4nm/access/ia_file/crossref-pre-1909-scholarly-works/10.1080%252F14786449608620921.zip/10.1080%252F14786449908621245.pdf and for a wayback URL would be something like https://scholar.archive.org/work/rv4lw3nikrfstp7bvvlxapsylu/access/wayback/https://pubs.rsc.org/en/content/articlepdf/2022/dt/d2dt00998f . (This will reduce confusion by bots which think there's utility in converting web.archive.org links into something else.)
Nemo 22:07, 28 November 2023 (UTC)
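For anyone who prefers a script over curl and jq, the same lookup can be sketched in Python with only the standard library. The function names are hypothetical; the endpoint, the `Accept: application/json` header and the `results[0].fulltext.access_url` path are taken from the example above, and nothing else about the response format is assumed.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

SEARCH_ENDPOINT = "https://scholar.archive.org/search"


def search_url(doi):
    """Build the search URL for a DOI, keeping ':' and '/' readable."""
    return SEARCH_ENDPOINT + "?" + urlencode({"q": "doi:" + doi}, safe=":/")


def first_access_url(payload):
    """Mirror the jq path .results[0].fulltext.access_url; None if absent."""
    results = payload.get("results") or []
    if not results:
        return None
    fulltext = results[0].get("fulltext") or {}
    return fulltext.get("access_url")


def fetch_access_url(doi, timeout=30):
    """Query the search endpoint and return the first full-text URL, if any."""
    request = Request(search_url(doi), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return first_access_url(json.load(response))
```

Returning None when there is no result lets a caller fall back to doing nothing, which matches the "where available" wording of the request.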
- Seems like a lot of them are just copies of arXiv PDFs. AManWithNoPlan (talk) 17:16, 4 December 2023 (UTC)
- If by "a lot" you mean about 2 million out of 25 million: yes, I'd expect the entire arxiv to be archive by IA scholar. There's no need to link these if there's already an arxiv identifier. (Though it's sad that the arxiv identifier doesn't auto-link.) Nemo 22:31, 4 December 2023 (UTC)
- I am curious which type of url is best. I am always a bit leery of PDF links that do not end in PDF (option 3). I wonder if the first method would ever provide multiple options. AManWithNoPlan (talk) 22:13, 7 December 2023 (UTC)
- https://scholar.archive.org/search?q=doi:10.1080/14786449908621245
- https://archive.org/download/crossref-pre-1909-scholarly-works/10.1080%252F14786449608620921.zip/10.1080%252F14786449908621245.pdf
- https://scholar.archive.org/work/rv4lw3nikrfstp7bvvlxapsylu/access/wayback/https://pubs.rsc.org/en/content/articlepdf/2022/dt/d2dt00998f
- Recommend the /download/ link, because it has the .pdf extension, it's more standard than the scholar.archive.org URLs, the URL is shorter and less complex, it's more aligned with where the content is actually located. scholar.archive.org is basically an index, not a repository. The data is hosted at //archive.org (that seems confusing since it's the same site but they are different servers). -- GreenC 01:21, 8 December 2023 (UTC)
- As GreenC says, the archive.org/download/ links are usually preferred. In this case I'd prefer the scholar.archive.org resolver because 1) the edits will look more consistent, using the same domain name whether the PDF is under web.archive.org or archive.org, 2) some of these items might be split and relocated in the future, in which case the scholar.archive.org links will probably still work somewhat but the archive.org/download/ links may break. These are just aesthetic or very rare issues though.
- I recommend using scholar.archive.org for the works which are linked to web.archive.org though, because bots and the cite templates themselves often complain about web.archive.org being in the url parameter, so you'd be forced to add all of url, archive-url, url-status=unfit and the entire family of parameters. Nemo 09:19, 8 December 2023 (UTC)
- For example, doi:10.4103/0973-8398.104830 (which currently citation bot auto-links but ends up on a non-resolvable domain www[.]asiapharmaceutics[.]info) could be linked to https://scholar.archive.org/work/7ss2kx3v75d3jifq2pc4uiucce/access/wayback/http://www.asiapharmaceutics.info:80/index.php/ajp/article/download/52/48 Nemo 08:28, 11 December 2023 (UTC)
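The lookup discussed in this thread could be sketched roughly as follows. This is only an illustration, not Citation bot's code: the function names are hypothetical, the JSON shape is inferred solely from the jq path in the curl example above, and the assumption that Wayback-hosted copies are served from web.archive.org comes from the comments above rather than from the API documentation.

```python
from typing import Optional
from urllib.parse import quote, urlsplit

# Search endpoint taken from the curl example in this thread.
SEARCH_ENDPOINT = "https://scholar.archive.org/search"


def build_search_url(doi: str) -> str:
    """Build the DOI search URL (hypothetical helper)."""
    return f"{SEARCH_ENDPOINT}?q=doi:{quote(doi, safe='/')}"


def first_access_url(payload: dict) -> Optional[str]:
    """Mirror the jq path .results[0].fulltext.access_url.

    Returns None when any level is missing, so a caller can skip
    DOIs for which no archived copy is found.
    """
    results = payload.get("results") or []
    if not results:
        return None
    fulltext = results[0].get("fulltext") or {}
    return fulltext.get("access_url")


def is_wayback(url: str) -> bool:
    """Assumption: Wayback copies live on web.archive.org, the rest on archive.org."""
    return urlsplit(url).hostname == "web.archive.org"
```

A caller would fetch `build_search_url(doi)` with an `Accept: application/json` header, pass the decoded response to `first_access_url`, and then use `is_wayback` to decide between the archive.org/download/ form and the scholar.archive.org resolver form debated above.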
Get a PMID API key
https://www.ncbi.nlm.nih.gov/books/NBK25497/ and set NLM_APIKEY and NLM_EMAIL. AManWithNoPlan (talk) 02:47, 26 December 2023 (UTC)
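As a rough illustration of how the two settings might be consumed, here is a minimal sketch that reads NLM_APIKEY and NLM_EMAIL from the environment and attaches them to an E-utilities request as the `api_key` and `email` query parameters. The helper name and the choice of the ESummary endpoint are assumptions made for the example; how Citation bot actually uses these values is not stated here.

```python
import os
from typing import Mapping
from urllib.parse import urlencode

# NCBI E-utilities ESummary endpoint (chosen for illustration).
EUTILS_ESUMMARY = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def esummary_url(pmid: str, env: Mapping[str, str] = os.environ) -> str:
    """Build an ESummary URL for a PMID (hypothetical helper).

    The key and email are added only when the corresponding
    environment variables are set.
    """
    params = {"db": "pubmed", "id": pmid, "retmode": "json"}
    api_key = env.get("NLM_APIKEY")
    email = env.get("NLM_EMAIL")
    if api_key:
        params["api_key"] = api_key
    if email:
        params["email"] = email
    return f"{EUTILS_ESUMMARY}?{urlencode(params)}"
```

Passing the environment as a parameter keeps the helper easy to exercise without touching real credentials.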
Should not remove via=The Wikipedia Library on cite encyclopedia
- Status
- new bug
- Reported by
- —David Eppstein (talk) 07:05, 14 December 2023 (UTC)
- What happens
- Special:Diff/1189805360
- What should happen
- Not that. The via link on this citation is a necessary part of the citation, as it describes how a copy of the citation was and can be obtained. Without that information, the citation fails to describe how to find the reference.
- We can't proceed until
- Feedback from maintainers
I think I agree with Citation bot on this one. I think the parameter value should be |via=EBSCO Literary Reference Center Plus. It wasn't obvious to me, a Wikipedia Library user, that I was supposed to use the default search bar at the top that is powered by EBSCO. They have pretty poor coverage of my usual topic areas. After forgetting to place my search term, "Baker & Taylor Author Biographies", in quotes for literal string matching, I actually went to Taylor & Francis next on a misguided hunch, before just asking google which publishing platform licensed the reference work, after which I was able to verify that TWL does provide access to it.
If that was my experience, what about the experience of a reader without TWL access who tries to verify that citation? What about our experience when someone sets |via=Inaccessible University Undergraduate Library System? Folly Mox (talk) 07:29, 14 December 2023 (UTC)
- The fact that you found this specific instantiation of WP:SAYWHEREYOUGOTIT difficult to follow might be a reason for making an easier-to-follow recipe for finding the information. It is not an excuse for blanking that information. Also, although that source happens to be in the EBSCO source, I think the default search bar uses a combination of sources. I have found plenty of non-EBSCO material that way. I agree the search is not good in general, but in this case searching the title as a quoted string found it easily. —David Eppstein (talk) 07:33, 14 December 2023 (UTC)
- I agree that had I formatted my initial search properly, I would have found the source without false starts and getting lost. I think what I'm trying to communicate is that |via=(membership in something with an institutional subscription) is never going to be helpful for people outside that membership, and even for the members it's more of a starting point (yes, I should be able to access this content) than a way (via) to access the content. Just some sleepy thoughts. Folly Mox (talk) 08:09, 14 December 2023 (UTC)
- For the same reason you think we should remove all paywalled doi links on journal articles because they are never going to be helpful to someone without a subscription? Maybe just remove non-free-to-read references altogether? No. —David Eppstein (talk) 15:23, 14 December 2023 (UTC)
- I'm sorry I communicated so poorly. That is not at all what I intended, and after having slept I do agree with you that no |via= parameter is a disimprovement in this case over TWL, but EBSCO would be more helpful (since other institutional subscriptions have access to it). Folly Mox (talk) 17:18, 14 December 2023 (UTC)
- The distinction I'm attempting to draw here is between publishing platforms (accessible to many different groups, host the actual content, material of sufficient interest to wealthy outgroup folk can be purchased for an exorbitant sum) and access systems (TWL, SomeUniversity.edu, "I have access to ProQuest because I'm a journalist or whatever"). Access systems are generally entirely closed and invite-only, and typically don't offer a means to help specify which work is cited except by proxy links that only work when logged in to the access system. In this case, if we take "The Wikipedia Library" to mean "The Wikipedia Library search bar", that does sufficiently identify the source, but it still only works for us. Even if we take it at face value, like I did, it gives us a starting point for verification in the way that Citation bot's removal doesn't. EBSCO would let any reader know which publisher they or their institution needs a subscription with (or to hand over money to) in order to verify. Folly Mox (talk) 19:03, 14 December 2023 (UTC)
- I do not know how to make an EBSCO link to EBSCO content obtained through the Wikipedia Library that will remain permanently valid and will allow both Wikipedia Library subscribers and other EBSCO subscribers to access the content. If I did know how to provide such a link, I would have used it instead of just saying that you can find the content through the Wikipedia Library. Maybe you can educate me on how to provide such links instead of continuing to harangue me on how the access method I described was somehow so useless that bot-removal was an improvement. —David Eppstein (talk) 20:03, 14 December 2023 (UTC)
- David Eppstein, I'm legitimately deeply sorry I've made you feel harangued. I've been trying to explain myself, because I was feeling misunderstood entirely (which is likely my fault due to poor wording). I did say above that I have come round to the feeling that Citation bot's edit was a disimprovement on your original |via=. As to creating an EBSCO link, that's also not what I intended to mean. My position is that the most useful value of |via= for this citation is "EBSCO Literary Reference Center Plus" as I said in my original comment. That's all. Sorry again. Folly Mox (talk) 21:51, 14 December 2023 (UTC)
- I, for one, would not have any idea how to access "EBSCO Literary Reference Center Plus" (except maybe after seeing this thread), despite regularly using The Wikipedia Library. —David Eppstein (talk) 22:19, 14 December 2023 (UTC)
I also agree with Citation bot. Inclusion of |via=Wikipedia Library is cruft of very low value. The specific library system through which someone accessed a source (or even, gasp, Sci-hub) does not need documentation. Ifly6 (talk) 15:51, 14 December 2023 (UTC)
- We need some way of identifying how to find the citation. In this case my judgement as an editor was that the title and name of work alone were inadequate, and that the via= provided that identification. This is not the sort of judgement Citation bot should be automatically reversing. Your opinion as another human agreeing with the removal is not relevant to the question of whether this is the sort of edit a bot should be making. —David Eppstein (talk) 17:56, 14 December 2023 (UTC)
- In this particular case, the via= parameter is rather helpful; the citation is bare enough without it that improving it was on my list of things to fix about the article. The only bot edit I could imagine being good here would be to wiki-link all occurrences of The Wikipedia Library in the via= parameter, because it's probably unfamiliar to readers who aren't themselves fairly serious Wikipedia editors. XOR'easter (talk) 18:14, 14 December 2023 (UTC)
From the citation I'm not quite sure what "Baker & Taylor Author Biographies" is. It would help to specify that Baker & Taylor is the publisher and what format the work is in. It seems to be some kind of database, so people would know to search it in the usual places like Worldcat. Given the date, it's most likely based on a previously published book which the publisher has acquired, so the best solution would be to cite the original authors and source. Nemo 20:58, 14 December 2023 (UTC)
- I don't know exactly what it is either. It is what The Wikipedia Library told me the citation was from. The suggested AMA-format citation provided by EBSCO / The Wikipedia Library is:
- Anne Sigismund Huff. Baker & Taylor Author Biographies. January 2000:1. Accessed December 14, 2023. https://search.ebscohost.com/login.aspx?direct=true&db=lkh&AN=49334395&site=eds-live&scope=site
- You will notice the useless login-page url and the total lack of publisher and format information. Given that information, it's not obvious how a human editor could reasonably have been expected to produce anything better. But we are not here to talk about that, we are here to talk about how a bot editor can be prevented from making a not-very-good citation even worse. —David Eppstein (talk) 22:24, 14 December 2023 (UTC)
- Incidentally, by some web searching I found a different way to link EBSCO content: if you use the "permalink" function on the right toolbar you will get a link that demands a Wikipedia Library login rather than an EBSCO login. So I guess it can only be read by other Wikipedia library users? How helpful. —David Eppstein (talk) 22:36, 14 December 2023 (UTC)
I'd recommend replacing the "Via" with "Literary Reference Center Plus", since the source is not really the Wikipedia Library per se. I see using the latter for "via" as something akin to putting "via=My local librarian printed it out for me", which is frankly not very useful to anyone who has a different local librarian. –jacobolus (t) 00:45, 15 December 2023 (UTC)
- The intended meaning of the "via" was that to access this source, assuming you have Wikipedia Library access, you should go to the Wikipedia Library and type the title into the search bar across the top of the screen. The search bar is not labeled "Literary Reference Center Plus". I do not know what "Literary Reference Center Plus" is. Searching the Wikipedia Library page for the string "Literary Reference Center Plus" finds nothing. Putting via="Literary Reference Center Plus" would, for me, be as useless as leaving it blank. Not everyone has the same local librarian but all established Wikipedia editors (you know, the people who might want to verify a reference, for instance to see what it says in the context of an AfD discussion or to use it to expand the article) have the same Wikipedia Library. It would be better to have a link that readers and not just editors could access, but we don't. And again, you're missing the point: it should not be whether someone else might have come up with a better description of how to access the reference, it should be whether it is appropriate for a bot to be blanking this deliberately-included information. —David Eppstein (talk) 01:57, 15 December 2023 (UTC)
- It is indeed unfortunate though that EBSCO and Baker & Taylor are apparently really bad at providing meaningful links or information about their various published documents.
- There is at least a little bit more relevant metadata which might help someone locate this document: Baker & Taylor Author Biographies is OCLC 877175691, and apparently at Literary Reference Center Plus (the name of the EBSCO database providing the document, accessible from a wide variety of public and university libraries, which should definitely be mentioned somewhere in this citation), this particular record is apparently Accession Number 49334395.
- You're probably right that the bot shouldn't blank the via parameter in this kind of case. I wouldn't be surprised to see a human editor blanking it though. –jacobolus (t) 03:26, 15 December 2023 (UTC)
- Finally through some more searching I find that the correct solution (I think?) should be to use {{EBSCOhost}} with the id as a parameter. I say "should be" because it doesn't actually work. The example in the EBSCOhost template documentation leads to a document, but the one in the citation above just sends me to a search page that tells me nothing by that id was found in the "Academic Search Complete" database. To make it work I also have to include the magic incantation dbcode=lkh: "Anne Sigismund Huff". Baker & Taylor Author Biographies. January 2000. EBSCOhost 49334395. Now wouldn't it be nice if a bot could figure all that out instead of just blanking things. —David Eppstein (talk) 06:28, 15 December 2023 (UTC)
Figure stuff out based on google scholar link
- What should happen
- https://en.wikipedia.org/w/index.php?title=Akinola_Alada&diff=next&oldid=1191984133
- We can't proceed until
- Feedback from maintainers
I will have to think about this. AManWithNoPlan (talk) 01:02, 27 December 2023 (UTC)
- Idk, can't people click through to the publisher? Seems like potentially a lot of processing to save a click for extra low effort editors.... And does google scholar have reliably complete citations? Or is it more like Citation bot would have to be programmed to follow the link and parse the target? Folly Mox (talk) 03:01, 27 December 2023 (UTC)
- This request is to parse the google scholar information since it's given, and fill the template accordingly. IDC what happens to the original link. Headbomb {t · c · p · b} 03:13, 27 December 2023 (UTC)
- The bot would have to follow the link and expand based upon that. The other problem is that a lot of the links are intended to be to scholar and not the article itself. AManWithNoPlan (talk) 16:03, 27 December 2023 (UTC)
Grove online
- Status
- new bug
- Reported by
- 86.177.202.175 (talk) 18:50, 27 December 2023 (UTC)
- What happens
- partially broken citation
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Division_viol&diff=1191904104&oldid=1191729983
- We can't proceed until
- Feedback from maintainers
Afaik, Grove Online is routinely cited inline using Template:Cite web, which (unlike Template:Cite Grove) allows for inclusion of actual author information. Afaik, this is correct and therefore does not require automated correction. 86.177.202.175 (talk) 18:50, 27 December 2023 (UTC)
- The various Grove wrapper templates (see Template:Cite Grove § See also) use {{cite encyclopedia}}. I think that you are mistaken in your claim that {{cite Grove}} does not allow for "inclusion of actual author information". —Trappist the monk (talk) 20:45, 27 December 2023 (UTC)
- Uhm, yes... I was mistaken:) Thanks, 86.177.202.175 (talk) 21:44, 27 December 2023 (UTC)
- I've tried with Cite Grove, like this (though to my eyes it looks a bit 'busy'). Fwiw, in the paaast, I think I've seen refs like this changed to Cite web (perhaps because of it being the 'Online' version?) 86.177.202.175 (talk) 22:01, 27 December 2023 (UTC)
Handle Current Topics in Microbiology and Immunology better
- What should happen
- [1]
- We can't proceed until
- Feedback from maintainers
Had to TNT the title/journal for it to properly give the information Headbomb {t · c · p · b} 01:46, 7 January 2024 (UTC)
remove publisher = NLM for cite journal
- What should happen
- [2]
- We can't proceed until
- Feedback from maintainers
The NLM can be a publisher, but it won't be a publisher of any journal. Headbomb {t · c · p · b} 06:18, 7 January 2024 (UTC)
Enabled 1-click activation of Category:CS1 errors: extra text: pages and similar
These four cats haven't been implemented apparently
- Category:CS1 errors: extra text: edition
- Category:CS1 errors: extra text: issue
- Category:CS1 errors: extra text: pages
- Category:CS1 errors: extra text: volume
Third time's the charm? Headbomb {t · c · p · b} 03:07, 9 January 2024 (UTC)
Cite web changed to cite magazine
Any online source should use cite web. On the jazz project Cleanup Listing, I have fixed many errors due to people using cite book and cite magazine instead of cite web. On the Steve Oliver page here, Citationbot changed the Billboard reference from cite web to cite magazine. Why? Nearly always, the citation is from an online source (an online version of Billboard), not the physical copy of the magazine. I'm not a fan of Citationbot's changes.—Vmavanti (talk) 03:41, 9 January 2024 (UTC)
- Because Billboard is a magazine, whether it's online or in print is irrelevant. Headbomb {t · c · p · b} 05:05, 9 January 2024 (UTC)
- Agree. Online sources that are books should use cite book. Online sources that are magazine articles should use cite magazine. Online sources that are journal articles should use cite journal. It is simply false that "Any online source should use cite web.". These are not errors and should not be "fixed". —David Eppstein (talk) 06:32, 9 January 2024 (UTC)
- I disagree. A web source should use cite web. Take a look sometime at the kinds of errors found on the Jazz Cleanup Listing. I didn't change them BECAUSE they used cite news. I changed them because the cite news usages were creating error messages as found in the Cleanup Listing. Vmavanti (talk) 15:33, 9 January 2024 (UTC)
- I think it's useful to distinguish when a reference is for a print magazine, though, since the parameters will be different (presence of page numbers, date of publication, quite often the same article has different titles in print vs online). WP:SAYWHEREYOUGOTIT, if you got it from the website that’s different from a print magazine — especially for older articles which might have digitization errors from OCR or if an online version has (sometimes silently) made emendations.
- Billboard is also an online database, and references to that as a magazine I think also confuses things.
- This is different of course from digital facsimiles of magazines and books which are identical in all respects to the paper versions including pagination. Umimmak (talk) 06:43, 9 January 2024 (UTC)
- (To clarify, if it's an actual news article I think {{cite news}} is still better than {{cite web}} — I don't think all online content should be referenced via {{cite web}}.) Umimmak (talk) 06:53, 9 January 2024 (UTC)
- I'm basing my judgment on 1) common sense (a web source uses cite web); 2) it's an easier template for contributors to use, based on the number of errors I have seen over eight years of editing when it comes to using cite news or cite magazine. Ask a member of the public. I have spoken to many of them over the years. I have seen their successes and their mistakes. Plenty. Vmavanti (talk) 15:33, 9 January 2024 (UTC)
Removal of via parameters
- Status
- new bug
- Reported by
- Jo-Jo Eumerus (talk) 06:54, 9 January 2024 (UTC)
- What happens
- It seems like the bot removes |via=Google Books, despite Wikipedia:Citing sources#Say where you read it advising that the citation say how the source was accessed.
- Relevant diffs/links
- https://en.wikipedia.org/w/index.php?title=Mount_Churchill&diff=1194462097&oldid=1194336920
- We can't proceed until
- Feedback from maintainers