User talk:Citation bot



Note that the bot's maintainer can go weeks without logging in to Wikipedia and can no longer devote extensive time to bot maintenance. If a major bug arises, it may go unnoticed; as such, important matters may warrant an e-mail. Breaking changes to templates maintained by the bot will be more readily addressed if advance notice can be given.

Please click here to report an error.

This bot is only periodically maintained and new feature requests are no longer being considered. The code is open source and interested parties are invited to assist with the operation and extension of the bot; contact User:Smith609.

A barnstar for you!

The Original Barnstar
Thank you, you have been very helpful to me as a new user and contributor. Tonythetiger89 (talk) 16:29, 15 August 2013 (UTC)


A kitten for you!

This kitten is Fixed

Vivian

Kashment (talk) 20:51, 20 July 2014 (UTC)
Martin (Smith609 – Talk) 05:13, 29 July 2014 (UTC)

Odd whitespace characters

Whitespace characters as produced by e.g. reference in Exaptation are dot-encoded by cite jstor

Status
Accepted
Reported by
Martin (Smith609 – Talk) 09:06, 6 September 2013 (UTC)
Type of bug
Improvement
Actual / expected output
...
Link
https://en.wikipedia.org/w/index.php?title=Co-option_%28biology%29&diff=571755892&oldid=567921155
We can't proceed until
A specific edit to the bot's code is requested below.
Requested action from maintainer

Discussion

This really isn't a bot problem: if you include an invisible character AFTER the number, it is not handled by {{cite pmid}} or {{cite jstor}}. AManWithNoPlan (talk) 03:04, 16 July 2014 (UTC)

Although it would be cool if the bot fixed these. AManWithNoPlan (talk) 03:07, 16 July 2014 (UTC)
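
For illustration only, here is a minimal Python sketch of how stray invisible characters could be dropped from a numeric identifier before it is handed to {{cite pmid}} or {{cite jstor}}. The function name clean_identifier is hypothetical, the choice of Unicode categories to discard is an assumption, and none of this is taken from the bot's PHP source.

```python
import unicodedata


def clean_identifier(raw):
    """Return raw without whitespace, control, or invisible format characters."""
    kept = []
    for ch in raw:
        category = unicodedata.category(ch)
        # Assumption: Cf (format, e.g. U+200B, U+200E), Cc (control) and all
        # Z* separators (spaces, no-break space) never belong in an identifier.
        if category in ("Cf", "Cc") or category.startswith("Z"):
            continue
        kept.append(ch)
    return "".join(kept)
```

A call such as clean_identifier("20366220\u200e") would return the bare digits, so a trailing left-to-right mark would no longer end up dot-encoded in the template name.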

Bot should add more than four editors and add displayeditors=29 if there are exactly 4 editors

Status
new bug / feature request (two related features in one request)
Reported by
Jonesey95 (talk) 23:49, 21 September 2013 (UTC)
Type of bug
Improvement
Actual / expected output
Bot limits editors to four first names and four last names.
Bot should retrieve all editors and add "displayeditors=29" parameter if there are exactly four editors.
Replication instructions
Run the citation expander on a citation that has four editors listed but more than four editors in the original work. Here's one example: Template:Cite_doi/10.1007.2F978-0-387-78705-3 (revert the citation to four editors and then run the bot on it).
We can't proceed until
Bot operator's feedback on what is feasible
Requested action from maintainer
Remove four-editor limit from bot code and add "displayeditors=29" to citations with exactly four editors.

Discussion

The bot should add "displayeditors=29" if there are exactly four editors to avoid the Lua error described for exactly 9 authors above. – Jonesey95 (talk) 23:49, 21 September 2013 (UTC)

Citation bot dev532 did not convert first names to initials in cite doi template

Status
Accepted
Reported by
Jonesey95 (talk) 00:37, 5 February 2014 (UTC)
Type of bug
Cosmetic
Actual / expected output
Bot adds full first names.
Bot should add initials only, consistent with documentation and past practice.
Link
https://en.wikipedia.org/w/index.php?title=Template%3ACite_doi%2F10.1038.2F35020000&diff=593973611&oldid=593973575
We can't proceed until
A specific edit to the bot's code is requested below.
Requested action from maintainer
Restore previous, documented behavior of the bot.

Discussion

Complete disagreement. A sizable fraction of the Wiki community believes strongly that first names should not be abbreviated in the citation data, but rather in the rendering template if so desired (e.g., {{Cite web}}, {{Cite journal}}, etc.). Sometimes, a lot of detective work goes into finding out what an author's full first name is (nobody wants the sad story of Laurent Cassegrain to be repeated), and the bot should not increase entropy by abbreviating the citation data. Until this behaviour is removed (and note that it is not part of the bot's documented function), editors will be forced to add bot-denial instructions to the citations needing protection. Urhixidur (talk) 14:11, 8 September 2014 (UTC)
The formatting of author names in the {{cite doi}} template is documented at Template:Cite_doi#Formatting. The bot should fill in the {{cite doi}} template in accordance with the documentation. This bug report does not address the formatting of {{cite journal}} or other similar templates that the bot might work on within articles. – Jonesey95 (talk) 17:52, 8 September 2014 (UTC)
Just don't use cite doi. It's a **BAD** idea, highly vulnerable to typos, linkrot, and trivial vandalism. Make a full cite journal entry and be done with it. LeadSongDog come howl! 18:58, 8 September 2014 (UTC)
In some fields you are only supposed to use initials so that you do not know if the author is male or female or black or white or Jewish, etc. AManWithNoPlan (talk) 14:28, 18 September 2014 (UTC)

Turns Wikilink for author into broken brackets

Status
new bug
Reported by
Bgwhite (talk) 06:43, 25 March 2014 (UTC)
Type of bug
Inconvenience
Actual / expected output
Turns |author=[[Stacy Mintzer Herlihy|Herlihy, Stacy Mintzer]], [[E. Allison Hagood|Hagood, E. Allison]], [[Paul A. Offit|Offit, Paul A.]] --> |author-separator=,|author1 = Herlihy|author2 = Stacy Mintzer]]|author3=Hagood|author4=E. Allison]]|author5=Offit|author6=Paul A.]]
Link
Diff. Look at the bottom. This is the 6th time today I've come across this. I don't remember seeing this before, so it must be in recent changes to the code. Update: Here's another example.
We can't proceed until
Bot operator's feedback on what is feasible
Requested action from maintainer

Discussion

For
|author=[[Stacy Mintzer Herlihy|Herlihy, Stacy Mintzer]], [[E. Allison Hagood|Hagood, E. Allison]], [[Paul A. Offit|Offit, Paul A.]]
the expected output from the bot would be
|authorlink1=Stacy Mintzer Herlihy |last1=Herlihy |first1=Stacy Mintzer |authorlink2=E. Allison Hagood |last2=Hagood |first2=E. Allison |authorlink3=Paul A. Offit |last3=Offit |first3=Paul A.
The Graphene example is not the same problem: the ref had an unbalanced ]] and the bot simply removed that without altering the rest of the ref. --Redrose64 (talk) 08:13, 25 March 2014 (UTC)
Redrose64, Graphene is the same problem as there was no unbalanced bracket in the reference.
Original ref
|author = Wang, X.; Li, X.; Zhang, L.; Yoon, Y.; Weber, P. K.; Wang, H.; Guo, J.; [[Hongjie Dai|Dai, H.]]
After Citation bot:
|author-separator = ; |author1 = Wang |first1 = X. |last2 = Li |first2 = X. |last3 = Zhang |first3 = L. |last4 = Yoon |first4 = Y. |last5 = Weber |first5 = P. K. |last6 = Wang |first6 = H. |last7 = Guo |first7 = J. |last8 = Dai |first8 = H.]] |authorlink8 = Hongjie Dai
Bgwhite (talk) 17:29, 25 March 2014 (UTC)
Ah yes: I had assumed that since your first link (Vaccine controversies) was to the problem edit, the second link (Graphene) would also be to the problem edit. Instead, it seems that it's a link to your fix for the previous edit to that page. --Redrose64 (talk) 17:36, 25 March 2014 (UTC)
Before the bot edit, most of the cite journal templates in the Graphene article used a single author parameter to store the authors. Furthermore the wikilinks were fully functional before the bot edit. The bot is inserting a ridiculous number of new parameters in these templates in an attempt to produce clean metadata that no one will use. It would be better to leave the author parameters in these templates untouched. Boghog (talk) 07:01, 27 March 2014 (UTC)
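
As a rough illustration of the expected output described above for a wikilinked |author= value, the following Python sketch masks each piped wikilink before splitting the list, so that neither the comma inside the link label nor the closing ]] can be torn apart. It is not the bot's code (the bot is written in PHP); split_authors and the placeholder scheme are hypothetical, and the sketch assumes the list is separated by semicolons, or by commas when no semicolon is present. Unpiped links such as [[Name]] and comma-separated lists of unlinked "Last, First" names are not handled.

```python
import re

# Piped wikilink: [[target|label]]
WIKILINK = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]|]+)\]\]")
# A chunk that consists of exactly one stashed wikilink
PLACEHOLDER = re.compile(r"^\x00(\d+)\x00$")


def split_authors(value):
    """Turn a free-form author list into authorlinkN/lastN/firstN parameters."""
    links = []

    def stash(match):
        # Remember (target, label) and leave an opaque token in the text.
        links.append((match.group(1).strip(), match.group(2).strip()))
        return "\x00%d\x00" % (len(links) - 1)

    masked = WIKILINK.sub(stash, value)
    separator = ";" if ";" in masked else ","
    chunks = [c.strip() for c in masked.split(separator) if c.strip()]

    params = {}
    for n, chunk in enumerate(chunks, start=1):
        held = PLACEHOLDER.match(chunk)
        if held:
            target, name = links[int(held.group(1))]
            params["authorlink%d" % n] = target
        else:
            name = chunk
        # "Last, First" becomes lastN / firstN; a bare name stays in lastN.
        last, _, first = name.partition(",")
        params["last%d" % n] = last.strip()
        if first.strip():
            params["first%d" % n] = first.strip()
    return params
```

Run on the Vaccine controversies value this gives authorlink1 to authorlink3 with matching last and first parameters, and on the Graphene value it gives last8=Dai, first8=H. and authorlink8=Hongjie Dai without a stray ]].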

──────────────────────────────────────────────────────────────────────────────────────────────────── Using |author= (singular) to store multiple names doesn't seem like a good idea. Before Citation Bot got to work, Revision 600716072 contained stuff like this:

  • <ref name="Saito92">{{Cite journal |author =Saito, R. ''et al.'' |title = Electronic structure of graphene tubules based on C60|doi=10.1103/PhysRevB.46.1804 |journal = Physical Review B | volume = 46 | page = 1804 |year =1992|bibcode = 1992PhRvB..46.1804S |issue =3 |first2 =Mitsutaka |first3 =G. |first4 =M. }}</ref> - with a name and "et al" in "author" and then a load of first, but not last, names later in the list.
  • <ref name=SiCplusH2>{{Cite journal |author = Riedl C., Coletti C., Iwasaki T., Zakharov A.A., Starke U. |year = 2009 |title = Quasi-Free-Standing Epitaxial Graphene on SiC Obtained by Hydrogen Intercalation |journal = Physical Review Letters |volume = 103 |page = 246804 |doi = 10.1103/PhysRevLett.103.246804 |pmid=20366220 |bibcode=2009PhRvL.103x6804R |issue = 24|arxiv = 0911.1953 }}</ref> - with multiple names in "author" separated by commas.
  • <ref>{{Cite journal |laysummary = http://news.ufl.edu/2009/05/07/graphene/ |author = Wang, X.; Li, X.; Zhang, L.; Yoon, Y.; Weber, P. K.; Wang, H.; Guo, J.; [[Hongjie Dai|Dai, H.]] |journal = Science |volume = 324 |issue = 5928 |year = 2009 |pmid = 19423822 | doi = 10.1126/science.1170335 |title = N-Doping of Graphene Through Electrothermal Reactions with Ammonia |bibcode = 2009Sci...324..768W |pages = 768–71 }}</ref> - with multiple names in "author" separated by semi-colons.
  • <ref>{{cite journal|author=[[Peter Debye|Debije P]], Scherrer P|year = 1916|title=Interferenz an regellos orientierten Teilchen im Röntgenlicht I|journal=Physikalische Zeitschrift|volume=17|page=277}}</ref> - lists of names in "author" with some of the names wikilinked.

I prefer using "last"/"first" for persons and "author" for committees, departments and organisations. Using "authorlink" is more robust and this works with both the "last"/"first" and "author" parameters.

Citation bot made a bit of a mess in the Graphene article.

Why did it do this to the patent?

  • {{citation|patent|US|6667100}} ->
  • {{Cite document|patent|US|6667100|ref = harv|postscript = <!-- Bot inserted parameter. Either remove it; or change its value to "." for the cite to end in a ".", as necessary. -->{{inconsistent citations}}}}

Why does the "last2" parameter get added at the end of the list of names instead of at the beginning? Why is "et al" not cleared from "author"? Why is "author" not changed to "last"/"first" to match the rest?

  • <ref name=K>{{Cite journal | author = Chen, J. H. ''et al.'' |title = Charged Impurity Scattering in Graphene |doi=10.1038/nphys935 |journal = Nature Physics | volume = 4 | pages = 377–381 |year = 2008 |bibcode = 2008NatPh...4..377C | issue=5 | first2 = C. | first3 = S. | first4 = M. S. | first5 = E. D. | first6 = M.|arxiv = 0708.2408 }}</ref> ->
  • <ref name=K>{{Cite journal | author = Chen, J. H. ''et al.'' |title = Charged Impurity Scattering in Graphene |doi=10.1038/nphys935 |journal = Nature Physics | volume = 4 | pages = 377–381 |year = 2008 |bibcode = 2008NatPh...4..377C | issue=5 | first2 = C. |last3 = Adam | first3 = S. |last4 = Fuhrer | first4 = M. S. |last5 = Williams | first5 = E. D. |last6 = Ishigami | first6 = M.|arxiv = 0708.2408 |last2 = Jang }}</ref>

Why did the bot duplicate the name found in "last2"/"first2" into "last3"/"first3"?

  • <ref>{{cite journal |journal=Rev. Mod. Phys. |year=2002 |volume=74 |page=601 |doi=10.1103/RevModPhys.74.601 |bibcode=2002RvMP...74..601O |title=Electronic excitations: Density-functional versus many-body Green's-function approaches |last1=Onida |first1=Giovanni |last2=Rubio |first2=Angel |issue=2}}</ref> ->
  • <ref>{{cite journal |journal=Rev. Mod. Phys. |year=2002 |volume=74 |page=601 |doi=10.1103/RevModPhys.74.601 |bibcode=2002RvMP...74..601O |title=Electronic excitations: Density-functional versus many-body Green's-function approaches |last1=Onida |first1=Giovanni |last2=Rubio |first2=Angel |last3=Rubio |first3=Angel |issue=2}}</ref>

Why are only "last2" to "last6" created and not "first2" to "first6"? Why is the "author" parameter with six names in it left untouched? This causes duplication in display.

  • <ref name=nmscrolling>{{Cite journal |author = S. Braga, V. R. Coluci, S. B. Legoas, R. Giro, D. S. Galvão, R. H. Baughman |year = 2004 |title = Structure and Dynamics of Carbon Nanoscrolls |journal = Nano Letters |volume = 4 |page = 881 |doi=10.1021/nl0497272 |bibcode = 2004NanoL...4..881B |issue = 5 }}</ref> ->
  • <ref name=nmscrolling>{{Cite journal |author = S. Braga, V. R. Coluci, S. B. Legoas, R. Giro, D. S. Galvão, R. H. Baughman |year = 2004 |title = Structure and Dynamics of Carbon Nanoscrolls |journal = Nano Letters |volume = 4 |page = 881 |doi=10.1021/nl0497272 |bibcode = 2004NanoL...4..881B |issue = 5 |last2 = Coluci |last3 = Legoas |last4 = Giro |last5 = Galvão |last6 = Baughman }}</ref>

Why was the working Wiley URL changed to a DOI attribute and then immediately marked as "dead"?

  • <ref>{{cite web|url=http://onlinelibrary.wiley.com/doi/10.1002/adma.200904383/abstract |title=Graphene-On-Silicon Schottky Junction Solar Cells |date= APR,9,2010}}</ref> ->
  • <ref>{{cite web|doi=10.1002/adma.200904383/abstract |title=Graphene-On-Silicon Schottky Junction Solar Cells |date= APR,9,2010|doi_brokendate=2014-03-24 }}</ref>

I have manually fixed those and very many other errors. Although the parameter names are now the same for all references, there is very little consistency in the format of some of the data in the parameters. I have fixed all the dates, but "first names" are a mixture of either first name or initials, the latter found both with or without periods. -- 79.67.241.76 (talk) 00:43, 28 March 2014 (UTC)
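
On the Wiley example above: the value written to |doi= still carries the "/abstract" page suffix from the URL, which would explain why the DOI was then flagged as broken. A hedged Python sketch of extracting the DOI while trimming such suffixes follows. The function name doi_from_url and the suffix list are assumptions made for illustration, and because DOIs may legitimately contain slashes this trimming is only a heuristic.

```python
import re

# Assumption: publisher URLs of the form .../doi/<DOI>[/<page suffix>]
DOI_IN_URL = re.compile(r"/doi/(10\.\d{4,9}/[^?#\s]+)")
# Hypothetical list of landing-page suffixes that are not part of the DOI.
LANDING_SUFFIXES = ("/abstract", "/full", "/pdf", "/epdf")


def doi_from_url(url):
    """Return the DOI embedded in a publisher URL, or None if none is found."""
    found = DOI_IN_URL.search(url)
    if not found:
        return None
    doi = found.group(1)
    for suffix in LANDING_SUFFIXES:
        if doi.endswith(suffix):
            doi = doi[: -len(suffix)]
            break
    return doi
```

For http://onlinelibrary.wiley.com/doi/10.1002/adma.200904383/abstract this yields 10.1002/adma.200904383 rather than the suffixed value shown in the bot's edit.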

Another article today, Delimiter. It messed up three refs. Bgwhite (talk) 06:35, 28 March 2014 (UTC)
"Using |author= (singular) to store multiple names doesn't seem like a good idea." – Why not? Using a single parameter to store multiple authors produces more compact templates that don't overwhelm the surrounding wikitext. The only downside is that it doesn't produce clean author metadata. However, how many consumers of Wikipedia citation metadata are there? I suspect not very many. I agree that it is perhaps more logical to store multiple authors in |authors= (plural). Nevertheless, per consensus and long-established usage, and consistent with the current {{cite journal}} documentation, full author lists can be stored in a single field called either "authors" or "author" without need for additional numbered author parameters. Boghog (talk) 07:07, 28 March 2014 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── With a free form input using "authors", there will be no consistency of display. Before Citation Bot got to work, the Graphene article contained the following names in references:

There were also references that would not display because the reference name had been duplicated, four or five different date formats including dates like "SEP,03,2013" with Chinese characters in them and many other issues. Half the "et al." were in italics and half were not.

By changing to separate parameters for names, all names display in "last, first" order with the same separators throughout. The only variation is whether the first name is stated in full or is initials, and whether there are periods after initials or not. A bot can fix those entries to be consistent. If "et al." is specified it is currently in |authorn+1= where n is the highest numbered "lastn"/"firstn" parameter. The number of authors to display can also be set using the |display-authors= parameter. -- 79.67.241.76 (talk) 11:19, 28 March 2014 (UTC)

The format of the author names could just as easily have been standardized using a single author parameter, which would have avoided all the parameter bloat. There is no "house style" for citations, hence there is no single "right way" to format citations. A single author parameter was the predominant style before your edits. Per WP:CITEVAR, if you want to change this style, you should have obtained consensus for this change on the article talk page before your edits. Boghog (talk) 12:35, 28 March 2014 (UTC)
If you want to list more than one author in a parameter, then use |authors=. If you want to use a single author, use |author=. Martin (Smith609 – Talk) 11:28, 12 July 2014 (UTC)

"Molecular and cellular biology" instead of "Molecular and Cellular Biology"

Capitalization

Status
feature request
Reported by
Saimondo (talk) 16:21, 3 August 2014 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
Actual / expected output
Bot writes for example "Molecular and cellular biology" instead of "Molecular and Cellular Biology" by autofilling with PMID 9858585
Link
https://en.wikipedia.org/w/index.php?title=Template%3ACite_pmid%2F9858585&diff=619550325&oldid=604044373
Replication instructions
autocomplete with PMID 9858585
We can't proceed until
Agreement on the best solution
Requested action from maintainer

Discussion

Data on NCBI seems to be ok: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC83919/ where the Journal is written as "Mol Cell Biol." on the webpage and as "MOLECULAR AND CELLULAR BIOLOGY" in the full text pdf.

What to do in those cases? Include "Molecular and Cellular Biology" in: https://en.wikipedia.org/wiki/User:Citation_bot/capitalisation_exclusions in such cases?

The same with

-"The Journal of biological chemistry" e.g. PMID 9858585

-"The Journal of cell biology" e.g. PMID 9763423

and other cases seen in https://en.wikipedia.org/wiki/Special:RecentChangesLinked/Category:Cite_doi_templates ? Thanks--Saimondo (talk) 16:21, 3 August 2014 (UTC)

Actually PubMed lists the journal as "Molecular and cellular biology" in the webpage meta data. A very minor case of GIGO. AManWithNoPlan (talk) 02:31, 4 August 2014 (UTC)
Perhaps it's worth quoting the University of Chicago Manual of Style (14th ed.) on this matter:
"In regular title capitalization, also known as headline style, the first and last words and all nouns, pronouns, adjectives, verbs, adverbs, and subordinating conjunctions (if, because, as, that, etc.) are capitalized. Articles (a, an, the), coordinating conjunctions (and, but, or, for, nor), and prepositions, regardless of length are lowercased unless they are the first or last word of the title or subtitle. The to in infinitives is also lowercased."
On the other hand, it is common in library cataloging following MARC format to capitalize only the initial word, proper nouns, and, if the title begins with an article, that article and the following noun.
Wikipedia citations should follow citation style, rather than library cataloging style. In this case, the appropriate form would be "Molecular and Cellular Biology". The Wikipedia Manual of Style provides much the same advice on the capitalization of titles. SteveMcCluskey (talk) 18:40, 4 August 2014 (UTC)
I am not very familiar with PHP (the language that Citation Bot is coded in), but it would appear that there is a mb_convert_case function:
 $str = mb_convert_case($str, MB_CASE_TITLE, "UTF-8");
that can transform a string into title case (i.e., capitalize the first and last words of the title and all nouns, pronouns, adjectives, verbs, adverbs, and subordinating conjunctions). This function would probably work well for most journal names. Boghog (talk) 19:15, 4 August 2014 (UTC)
This should be easy to implement, but I anticipate that some time down the line it will upset someone. Before I implement it, could we establish consensus and file a bot approval request if necessary? Thanks. Martin (Smith609 – Talk) 08:49, 25 August 2014 (UTC)
How about you implement it for adding journal titles, but don't implement it for changing existing entries? Eventually, the list of titles that violate the rules will be built up, and then you can make it a fix for existing journal titles. AManWithNoPlan (talk) 01:48, 4 September 2014 (UTC)
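
A note on the suggestion above: PHP's MB_CASE_TITLE mode capitalizes the first letter of every word, so on its own it would give "Molecular And Cellular Biology" rather than headline style. A headline-style conversion needs a list of words to keep in lower case. The Python sketch below shows the idea only. The function name journal_title_case and the short SMALL_WORDS list are assumptions, not the bot's behaviour, and letters after the first are left untouched so that acronyms survive (which also means an ALL-CAPS title is not lowered).

```python
# Assumption: a deliberately short list of articles, coordinating
# conjunctions and common prepositions that stay lowercase mid-title.
SMALL_WORDS = {
    "a", "an", "the", "and", "but", "or", "for", "nor",
    "of", "in", "on", "to", "at", "by", "with", "from",
}


def journal_title_case(title):
    """Capitalize a journal name in headline style."""
    words = title.split()
    result = []
    for i, word in enumerate(words):
        lower = word.lower()
        is_edge = i == 0 or i == len(words) - 1
        if not is_edge and lower in SMALL_WORDS:
            result.append(lower)
        else:
            # Upper-case only the first letter; keep the rest as supplied.
            result.append(word[:1].upper() + word[1:])
    return " ".join(result)
```

With this, "Molecular and cellular biology" becomes "Molecular and Cellular Biology" and "The Journal of biological chemistry" becomes "The Journal of Biological Chemistry". Titles that need different treatment would still have to go on an exclusion list.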

You are of course right, it's not an error, it's the catalog style NCBI is using. I don't have a complete overview of what capitalization format is obtained by the doi or issn vs pmid queries. But if you use the cite-> templates-> cite journal option here in the edit window and use autofill with the doi:10.1128/MCB.00698-14 you get "Molecular and Cellular Biology"; if you use the same publication's PMID 25022755 with autofill you get "Molecular and cellular biology". If capitalization also means harmonization, I think few Wikipedians would be against it.

Furthermore, as far as I understand https://en.wikipedia.org/wiki/Wikipedia:Manual_of_Style#Titles_of_works the capitalization format like above should be ok (I have the impression that most journals use capitalization for their own names on their homepages/pdfs). Should we ask on the Manual of Style talk page to see if there's a consensus for capitalization? In case someone is interested, here is a recent reply to an email I (re-)sent to NCBI some time ago:

"...Standard cataloging requires that the first word in the full journal title begins with an upper case letter and remaining words (except for proper nouns) begin with lower case. Journal title abbreviations begin with all upper-case letters. I checked the XML data for several journals and found that each of the titles is listed in this manner. You can see several examples at the bottom of this document:

Fact Sheet: Construction of the National Library of Medicine Title Abbreviations http://www.nlm.nih.gov/pubs/factsheets/constructitle.html Sincerely, Ellen M. L. ...

-Original Message-

Dear NCBI Team, in the xml data of a specific article https://www.ncbi.nlm.nih.gov/pubmed/9858585?dopt=Abstract&report=xml&format=text the journal name is written "Molecular and cellular biology" and the abbreviation is "Mol Cell Biol.". I think the correct journal name should be "Molecular and Cellular Biology" as written on the journal homepage http://mcb.asm.org/content/19/1/612.long ." Saimondo (talk) 17:29, 10 September 2014 (UTC)

Display authors in citewatch

https://en.wikipedia.org/w/index.php?title=Template%3ACite_pmid%2F14623081&diff=622774160&oldid=622720893 (Full details to follow) Martin (Smith609 – Talk) 11:37, 26 August 2014 (UTC)

Issue & Number

Bot adds Issue to Cite journal if it already has Number present

Status
new bug
Reported by
It Is Me Here t / c 11:31, 3 September 2014 (UTC)
Type of bug
Inconvenience: Humans must occasionally make immediate edits to clean up after the bot
Actual / expected output
If an instance of {{cite journal}} has no |issue=φ, the Bot adds it, even if the {{cite journal}} already has |number=φ, throwing up a red error in read mode.
It should do nothing (bypass {{cite journal}}s with |number=φ).
Link
[1]
We can't proceed until
Agreement on the best solution
Requested action from maintainer

Discussion
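
One possible shape for the requested behaviour, sketched in Python rather than the bot's PHP: treat |issue= and |number= as aliases and add an issue only when neither already holds a value. The function add_issue and the dictionary representation of template parameters are hypothetical.

```python
# |issue= and |number= are treated as aliases, per the report above.
ISSUE_ALIASES = ("issue", "number")


def add_issue(params, issue):
    """Return params with issue added, unless an alias already has a value."""
    if any(params.get(alias, "").strip() for alias in ISSUE_ALIASES):
        return params  # leave the citation untouched
    updated = dict(params)
    updated["issue"] = issue
    return updated
```

With {"number": "3"} as input the citation comes back unchanged, so the duplicate-parameter error described in the report would not be triggered.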


Bot v579 added only last names, not first names

Status
new bug
Reported by
Jonesey95 (talk) 05:15, 4 September 2014 (UTC)
Type of bug
Inconvenience
Actual / expected output
Bot adds only last names to a citation
Bot should add first initials as well
Link
https://en.wikipedia.org/w/index.php?title=Altai-Sayan_region&diff=prev&oldid=624114302
Replication instructions
Run the bot on the previous version of the article linked above
We can't proceed until
Bot operator's feedback on what is feasible
Requested action from maintainer

Discussion

Is this because it already had authors= present? AManWithNoPlan (talk) 04:00, 5 September 2014 (UTC)

Umlaut

Seems the bot doesn't recognise Umlaut (linguistics), or was I just unlucky?[2] FunkMonk (talk) 10:45, 15 September 2014 (UTC)

This looks like a problem in the source database that the bot draws the data from. Sorry that nothing can be done! Martin (Smith609 – Talk) 06:06, 19 September 2014 (UTC)
If the source database used was PubMed, then it appears that the problem is not with the source database (see PMID: 12712314). Boghog (talk) 06:41, 19 September 2014 (UTC)