User:Citation bot: Difference between revisions

From Wikipedia, the free encyclopedia
→‎Stopping the bot from editing: add link to CS1 whitelist
→‎top: + caution that bot contains bugs
<noinclude> </noinclude>
{{mbox|image=[[File:Stop hand caution.svg|60px]]|text=This bot is no longer actively maintained and contains numerous unfixed bugs (see [[User talk:Citation bot|discussion]]). Editors who activate this bot should therefore check the results carefully to confirm that they are as expected.}}
{{bot|Smith609|status=Active}}



Revision as of 03:12, 12 May 2015

User interaction




Activate
Find out how you can use the Citation bot on your own pages here.
Bugs
Please report any bugs, ideas or suggestions here. You can check out the bot's source code from its Subversion repository.
Emergency shutoff

Administrators: Click here to understand how to block this bot with minimal disruption.

Non-administrators can report misbehaving bots to Wikipedia:Administrators' noticeboard/Incidents.

Function summary

This bot was originally designed to add digital object identifiers (DOIs) to references; it now does much more, adding PubMed identifiers (PMIDs) and ISBNs, and fixing common formatting errors.

The bot obtains citation data from a range of sources including CrossRef, AdsAbs, arXiv and PubMed. Because scraping data from web pages is unreliable and resource-intensive, these databases are the main source of data; unfortunately the bot is unable to tell when these databases contain errors or incomplete information. Any such error or omission should be reported directly to the data repository maintainer.

The bot periodically works through every page using citation templates on Wikipedia. If you are interested, http://tools.wmflabs.org/citations/progress-doibot.php?date= has stats on its progress since |date=. Note that the bot only operates automatically when there are no outstanding bugs – automatic (or operator-supervised) edits are marked by [Pu###] here.

Development

A stable version of the bot is always available at http://tools.wmflabs.org/citations. Time commitments preclude regular updates; maintenance is attempted every few months.

Changing citation to cite journal / cite book / etc

A frequent question is why the bot "unifies citation types". Some editors do not realize that the citation template and the cite journal family ("cite xxx") templates generate subtly different output: they differ in the separator between parameters (commas or periods) and in whether punctuation appears at the end of the citation. Because a uniform citation format is encouraged within any given article, the bot tends towards this by switching "odd-citations-out" to the dominant citation family on the page.
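The idea of a "dominant citation family" can be illustrated with a minimal Python sketch. This is not the bot's code (the bot is written in PHP); the regular expression, the function name dominant_family, and the handling of ties are assumptions made for illustration only.

```python
import re
from collections import Counter

# Matches the opening of a citation template and captures its name,
# e.g. "{{citation|", "{{cite journal |", "{{Cite book|".
TEMPLATE_RE = re.compile(r"\{\{\s*(citation|cite\s+\w+)\s*[|}]", re.IGNORECASE)


def dominant_family(wikitext):
    """Return 'citation', 'cite xxx', or None when neither family dominates."""
    counts = Counter()
    for match in TEMPLATE_RE.finditer(wikitext):
        name = match.group(1).lower()
        counts["citation" if name == "citation" else "cite xxx"] += 1
    if counts["citation"] == counts["cite xxx"]:
        # Tie or no templates at all: assumed here to mean "leave the page alone".
        return None
    return "citation" if counts["citation"] > counts["cite xxx"] else "cite xxx"
```

On a page with two cite xxx templates and one citation template, this sketch reports "cite xxx" as the dominant family, so the single citation template would be the odd one out.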

Postscript parameter

So as not to override an intentional editorial decision, the bot retains the original punctuation at the end of the citation using the |postscript= parameter. If the citation originally ended in a period, the bot will specify |postscript=.; if it ended without punctuation, it will specify |postscript=<!--None-->.

This makes it clear to editors that the template in question is inconsistent with others on the page, and allows them to make an informed decision whether to keep the citation inconsistent or to edit it to bring it into line with the other templates on the page.
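The postscript rule can be sketched in Python as follows. The function name postscript_parameter is hypothetical, and the text above only describes two cases (ending in a period, ending without punctuation); how the bot treats any other ending is not stated, so the sketch returns None for those.

```python
import string


def postscript_parameter(rendered_citation):
    """Pick a |postscript= value that preserves the citation's original ending."""
    text = rendered_citation.rstrip()
    if text.endswith("."):
        return "|postscript=."
    if text and text[-1] not in string.punctuation:
        return "|postscript=<!--None-->"
    # Other endings (for example ";") are not covered by the description above.
    return None
```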

Stopping the bot from editing

  • To prevent the Citation bot from editing a page, include the text
    {{bots|deny=Citation bot}}
    anywhere on the page. Please also leave a note here explaining why the action has become necessary, so that it can be resolved!
  • If the bot is erroneously adding a DOI, author, etc. to a citation and you want to stop it from adding the data again, put a comment in place of the appropriate parameter, because the bot will not overwrite existing data. Use something along the lines of
    |doi = <!-- this comment stops Citation bot adding the wrong DOI here-->
    or words to that effect. Again, it may be possible for me to fix the underlying problem if you let me know about it – but there are a few, rare instances (such as false positives and editor preference) where it is impossible to implement an automatic fix.
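Both mechanisms can be illustrated with a short Python sketch. It is not taken from the bot's source: the function names are hypothetical, and the comma-separated form of the deny list is an assumption based on the general {{bots}} convention, not something stated above.

```python
import re

# Captures the value of deny= in a {{bots|deny=...}} template.
DENY_RE = re.compile(r"\{\{\s*bots\s*\|\s*deny\s*=\s*([^}]*)\}\}", re.IGNORECASE)


def page_denies_bot(wikitext, bot_name="Citation bot"):
    """True if the page carries {{bots|deny=...}} naming this bot."""
    for match in DENY_RE.finditer(wikitext):
        # Assumption: several bot names may be listed, separated by commas.
        denied = [name.strip().lower() for name in match.group(1).split(",")]
        if bot_name.lower() in denied:
            return True
    return False


def parameter_is_fillable(value):
    """A parameter is only filled in when it holds no data at all.

    An HTML comment counts as existing data, so it blocks the bot.
    """
    return value.strip() == ""
```

With this logic, a value such as "<!-- this comment stops Citation bot adding the wrong DOI here-->" is treated as existing data, while an empty or whitespace-only value may still be filled in.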

False positives

If the bot is adding seemingly-unrelated data to a citation, it is probably receiving a false positive from the citation databases it consults. Unfortunately, there's no way for the bot to know this, so there are two ways of avoiding it:

  • Change the citation template to one which the bot doesn't modify, such as cite web, cite news, etc;
  • Add a comment to one or more of the parameters; these comments will not be overridden by the bot, and will reduce the chance of the citation databases throwing false positives.

Page numbers with hyphens

The bot replaces hyphens with en dashes in page number ranges. In the rare cases where a hyphen is correct and an en dash is wrong (the hyphen is part of the page number itself), manually enter the hyphen as the HTML code &#8209; instead of typing a plain hyphen.
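A minimal Python sketch of this behaviour, assuming a range is simply a hyphen between two digits (the bot's actual matching rules are not described here, and the name normalize_page_range is hypothetical):

```python
import re

# A plain hyphen, optionally surrounded by spaces, between two digits.
PAGE_RANGE_RE = re.compile(r"(?<=\d)\s*-\s*(?=\d)")


def normalize_page_range(pages):
    """Replace the hyphen in a numeric page range with an en dash (U+2013)."""
    return PAGE_RANGE_RE.sub("\u2013", pages)
```

Because the entity &#8209; contains no literal hyphen character, a value written with it passes through this kind of substitution unchanged, which is why the workaround is effective.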

Valid parameters

The bot reads every parameter listed in Module:Citation/CS1/Whitelist in the format "['parameter_name'] = true" and treats these as valid spellings.
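Extracting names from entries of that form could look like the following Python sketch. The function name and the SAMPLE text are invented for illustration; SAMPLE is not a copy of the real module, and the bot's actual parsing code may differ.

```python
import re

# Matches entries of the form ['parameter_name'] = true
ENTRY_RE = re.compile(r"\[\s*'([^']+)'\s*\]\s*=\s*true\b")

# Invented stand-in for the module source, for demonstration only.
SAMPLE = """
local whitelist = {
    ['author'] = true,
    ['doi'] = true,
    ['coauthor'] = false,
}
"""


def valid_parameters(module_source):
    """Return the set of parameter names marked true in the module text."""
    return set(ENTRY_RE.findall(module_source))
```

Applied to SAMPLE, only the entries marked true are returned; the entry marked false is ignored.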

Capitalisation errors

See User:Citation_bot/capitalisation_exclusions.

Internationalization

There have been a number of requests for the bot to be adapted to foreign-language Wikipedias. When time permits, I will be happy to work towards this. To adapt the bot for a foreign wiki, I first need:

  • A valid bot account on that wiki with the appropriate permission for its edits
  • A translation of each of the template names and parameters used.

If you have both of these available, please let me know and I will set to work on the necessary coding.

Reading the edit summaries

To assist debugging, the bot's edit summaries begin with a code in [square brackets]. This identifies how the bot was initiated (letter), and what revision of the code was used (number). When major development is underway, the publicly accessible interface to the bot may use an older version of the code that has been established to be bug-free.

  • Pu - Initiated from the server. May be operating supervised or unsupervised.
  • Nothing (previously U) - Initiated by a user whose name is usually listed in the edit summary
  • Ax - {{Cite arXiv}} maintenance, activated when blank template detected
  • C - {{cite doi}} family maintenance, activated when blank template detected

If a bug is marked as 'fixed in r50' and you notice the bug in an edit beginning [U40], then there is no need to report the bug again. If you see it in an edit starting [Pu60], however, then please do report that it wasn't fixed as expected.
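The reasoning in that example can be written out as a small Python sketch. The function names are hypothetical, and treating an edit made at exactly the fixing revision as worth reporting is an assumption; the text above only gives the r40 and r60 cases.

```python
import re

# Leading code such as "[Pu60]", "[U40]" or "[60]": letters, then a revision number.
SUMMARY_CODE_RE = re.compile(r"^\[([A-Za-z]*)(\d+)\]")

INITIATORS = {
    "Pu": "initiated from the server",
    "": "initiated by a user",
    "U": "initiated by a user (older form)",
    "Ax": "{{Cite arXiv}} maintenance",
    "C": "{{cite doi}} family maintenance",
}


def parse_summary_code(summary):
    """Return (initiator_code, revision) or None if no code is present."""
    match = SUMMARY_CODE_RE.match(summary)
    if match is None:
        return None
    return match.group(1), int(match.group(2))


def needs_new_report(summary, fixed_in_revision):
    """True if the edit used code at or after the revision said to fix the bug."""
    parsed = parse_summary_code(summary)
    if parsed is None:
        raise ValueError("edit summary does not start with a [code]")
    return parsed[1] >= fixed_in_revision
```

For a bug marked as fixed in r50, a summary starting [U40] gives False (no need to report again), while one starting [Pu60] gives True.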

Function

Automatic or Manually Assisted: Automatic

Programming Language(s): PHP w/ Snoopy & BasicBot

Function Summary: Maintains and expands citations; ensures standards are complied with.

Edit period(s) (e.g. Continuous, daily, one time run): Visits each article every few months; can be used on specific articles whenever requested by a user.

Function Details: Citation bot only amends the parameters of Citation templates.

  1. Replaces "id=identifier" or "url=http://resource.org/identifier=#" with "identifier=#"
  2. Fixes common typos in parameter names (not values), using the closest match if the typo is not in a list of frequent mistakes.
  3. Removes redundant parameters
  4. Searches for missing parameters (including URL), then adds them if available. This is especially convenient when only an identifier is included within the template
  5. Converts an endnote citation to a Wikipedia citation — Example
  6. Where the {{cite doi}} template has been used, creates or expands the accompanying reference.
  7. Automatically expands multi-use template using the {{cite doi}}, {{ref doi}}, {{cite pmid}} and related templates
  8. Adds names to references and combines duplicates
  9. Expands {{cite arXiv}} templates with an eprint parameter, and updates them to use {{cite journal}} where appropriate
  10. Where a mixture of {{citation}} and {{cite xxx}} family templates is used in an article, standardizes to the dominant format
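Item 2 in the list above (typo correction by closest match) can be sketched in Python with the standard difflib module. This is an illustration only: the bot itself is written in PHP, the two dictionaries are invented examples, and the 0.8 similarity cutoff is an assumption.

```python
import difflib

# Invented examples of a "frequent mistakes" list.
FREQUENT_MISTAKES = {"auhtor": "author", "jounral": "journal"}

# Small illustrative subset of valid parameter names.
VALID_PARAMETERS = ["author", "journal", "title", "volume", "pages", "doi"]


def correct_parameter_name(name):
    """Return a corrected parameter name, or the input if no fix is found."""
    if name in VALID_PARAMETERS:
        return name
    if name in FREQUENT_MISTAKES:
        return FREQUENT_MISTAKES[name]
    # Fall back to the closest valid name, if one is similar enough.
    matches = difflib.get_close_matches(name, VALID_PARAMETERS, n=1, cutoff=0.8)
    return matches[0] if matches else name
```

Only the parameter name is passed to this function; parameter values are never altered, matching the "not values" restriction in item 2.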

Bot approval

External links