Help talk:CS1 errors

Ukiyo-e

I am doing a GA review for Ukiyo-e and it has "Check date values in: |date= (help)" error messages for the web citations under Ukiyo-e#Works cited. I have something turned on somewhere to see the error messages that others do not see, and I'm not sure why the error message is appearing.

Here's the wikitext for one of the citations:

{{cite web |last = Fiorillo |first = John |title = FAQ: Care and Repair of Japanese Prints |work = Viewing Japanese Prints |url = http://www.viewingjapaneseprints.net/texts/topictexts/faq/faq_care_and_repair.html |year = 1999–2001 |accessdate = 2013-12-17 |ref = {{SfnRef|Fiorillo|1999–2001}} }}

And this is what I see on the screen:

Fiorillo, John (1999–2001). "FAQ: Care and Repair of Japanese Prints". Viewing Japanese Prints. Retrieved 2013-12-17. Check date values in: |date= (help)

The year range uses an en dash, as described in the linked section: https://en.wikipedia.org/wiki/Help:CS1_errors#bad_date

Is there something that I'm missing? Thanks!--CaroleHenson (talk) 14:14, 13 February 2014 (UTC)

The something that you have turned on that reveals the CS1 error messages is discussed at Help:CS1 errors#Controlling error message display.
The underlying engine that creates CS1 citations is Module:Citation/CS1. That engine does not currently allow year ranges in date-holding parameters. They will, no doubt, be supported in future. I'm not so sure that copyright dates should be used to date a citation – especially on websites where it is very common to see obviously current material with a copyright date in the past. For this citation, I would simply use |accessdate= and leave off |date= altogether.
Trappist the monk (talk) 14:38, 13 February 2014 (UTC)
Ok, thanks, Trappist!--CaroleHenson (talk) 14:43, 13 February 2014 (UTC)
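
A minimal sketch, in Python, of the workaround suggested above: spot a year range in a date-holding parameter and drop it when |accessdate= is already present. The pattern and the helper names (is_year_range, drop_year_range) are assumptions made for illustration; this is not how Module:Citation/CS1 validates dates.

```python
import re

# Assumption: a "year range" is four digits, a hyphen or en dash,
# then two to four digits (e.g. 1999–2001 or 1999-01).
YEAR_RANGE = re.compile(r"^\d{4}\s*[–-]\s*\d{2,4}$")


def is_year_range(value: str) -> bool:
    """Return True if the value looks like a year range."""
    return bool(YEAR_RANGE.match(value.strip()))


def drop_year_range(params: dict) -> dict:
    """Return a copy of the citation parameters without a year-range
    |date= or |year=, but only when |accessdate= is present."""
    fixed = dict(params)
    if fixed.get("accessdate"):
        for name in ("date", "year"):
            if name in fixed and is_year_range(fixed[name]):
                del fixed[name]
    return fixed


citation = {"last": "Fiorillo", "year": "1999–2001", "accessdate": "2013-12-17"}
print(drop_year_range(citation))
```

A citation without |accessdate= is left untouched, since removing its only date would leave the reader nothing to go on.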

In the case of these particular web pages, the author has put different copyright date notices on each article at the site. Should the dates still be avoided? Curly Turkey (gobble) 21:01, 13 February 2014 (UTC)

I think so. Dates in CS1 citations are an aid to the reader who wants to consult the same source that the editor cited. This is difficult when citing ephemeral works like websites; a fact that was there in the copyright 2001 version may not be there in the copyright 2001–2004 version. |accessdate= helps to get around that conundrum by identifying the date on which the cited fact was present on the website. Or, consider adding a link to an archived version of the web pages you wish to cite; archive.org, for example. Set |deadurl=no if the website page is still live.
It isn't clear to me just what the author's intent is when he uses a variety of copyright dates. It would seem that he uses these dates much like a last-updated-on kind of date. There are single year copyrights, year range copyrights (as illustrated above), and the odd year range plus another year copyrights (Tsuruya Kôkei (born 1946) copyright 1999–2001, 2010). What does that last mean? Does it mean that it wasn't in copyright for the period 2002–2009? It's all too confusing.
Trappist the monk (talk) 22:01, 13 February 2014 (UTC)
For what it's worth, I agree. As long as the accessdate is provided in the citation, it made sense to me, too, to remove the year range for copyrighted dates/years. For me, this contrasts against specific dates, such as article dates for online magazines or newspapers.--CaroleHenson (talk) 22:23, 13 February 2014 (UTC)
  • Okay, just wanted clarification. The date parameter apparently serves a different role than I'd thought. Curly Turkey (gobble) 23:00, 13 February 2014 (UTC)

Suggestion about possibly ambiguous date range

YYYY–YY is sometimes found ambiguous by some editors, if it could be confused with YYYY–MM. The notice already mentions that date ranges may be incorrectly marked, but there is some discussion about whether this range should continue to be marked, though other acceptable year ranges are now unmarked, and if so, what the explanation would be. I would just document whatever happens in the code, saying it may differ from MOS, with links to the relevant MOS pages, continuing to say editors should use discretion. So far, no changes really need to be made, as the ranges are covered in the current explanation. —PC-XT+ 02:10, 1 April 2014 (UTC)

The note in the date error message help text about incorrectly marked citations was referring to the 2014-03-30 update. That update having happened, I have removed the note.
Trappist the monk (talk) 03:26, 1 April 2014 (UTC)
Now that the note about ranges has been removed, my suggestion applies. I suggest something like this, to better describe this particular range, and the discussion (correct this as needed): "Dates in the form YYYY–YY should be checked to be sure editors have not incorrectly used YYYY–MM. (See Wikipedia:Manual of Style/Dates and numbers#Months) While the MOS defines YYYY–YY to be correct, consider using YYYY–YYYY instead, to avoid ambiguity, and make it easier for the template to parse." —PC-XT+ 04:23, 1 April 2014 (UTC) 04:25, 1 April 2014 (UTC)
I have added text to the help text that I think says what you suggested.
Trappist the monk (talk) 11:05, 1 April 2014 (UTC)
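
The ambiguity discussed in this thread can be sketched as follows. This is an illustrative Python classifier, not the module's logic; the category labels and the same-century assumption for expanding YY are choices made here for the example.

```python
import re

SHORT_RANGE = re.compile(r"^(\d{4})[–-](\d{2})$")


def classify_short_range(value: str) -> str:
    """Classify a YYYY–YY value.

    Assumption: the two-digit tail is read in the same century as the start year.
    """
    match = SHORT_RANGE.match(value.strip())
    if not match:
        return "not YYYY-YY"
    start, tail = int(match.group(1)), int(match.group(2))
    end = (start // 100) * 100 + tail
    could_be_month = 1 <= tail <= 12
    could_be_range = end > start
    if could_be_month and could_be_range:
        return "ambiguous"        # e.g. 2001–03: 2001–2003 or March 2001?
    if could_be_month:
        return "probably month"   # e.g. 2010–09 cannot be a forward range
    if could_be_range:
        return "year range"       # e.g. 1987–89
    return "invalid"


def expand_short_range(value: str) -> str:
    """Rewrite an unambiguous YYYY–YY range as YYYY–YYYY."""
    if classify_short_range(value) != "year range":
        raise ValueError(f"cannot safely expand {value!r}")
    match = SHORT_RANGE.match(value.strip())
    start, tail = int(match.group(1)), int(match.group(2))
    return f"{start}–{(start // 100) * 100 + tail}"


print(classify_short_range("2001–03"))
print(expand_short_range("1987–89"))
```

Only values classified as "year range" are expanded automatically; the "ambiguous" ones are exactly those an editor would need to check by hand.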

what's wrong with this picture?

I cannot understand what's wrong with this citation:

Munday, Rosemary, ed. (l991). "How Australia Began: Significant Dates in Australian History". The Bulletin Australian Almanac & Book of Facts 1992. Sydney: Australian Consolidated Press. p. 3. ISSN 1038-054X. 

There's no |date=, so how can the value be incorrect? -- Ohc ¡digame! 03:21, 1 April 2014 (UTC)

There is a value for the year parameter, and the first character is an ell rather than a one. The editor who wrote it must have learned to type on a manual typewriter. Jc3s5h (talk) 03:27, 1 April 2014 (UTC)
no wonder! Thanks. -- Ohc ¡digame! 03:33, 1 April 2014 (UTC)
Looks like the first digit in the year parameter is a lower case L. The individual parameters |year=, |month=, and |day= are promoted to |date= before the date validation code is executed. I am aware of this issue.
Trappist the monk (talk) 03:31, 1 April 2014 (UTC)
Could we say Check values in: |date= or |year=? – Jonesey95 (talk) 03:44, 1 April 2014 (UTC)
Or, we could take |date= out, leaving "Check date values (help)", and just list the parameters in the help text. —PC-XT+ 04:04, 1 April 2014 (UTC)
I've added text to the help text that I think explains how an error in |year= is reported as an error in |date=.
Trappist the monk (talk) 11:13, 1 April 2014 (UTC)
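
To illustrate why an error in |year= surfaces as an error in |date=, here is a rough Python sketch of the promote-then-validate order described above. The function names and the deliberately crude validity rule (the date must contain a standalone four-digit year) are assumptions for this example, not the module's code.

```python
import re


def promote_to_date(params: dict) -> str:
    """Fold |day=, |month= and |year= into a single date string
    when |date= itself is not supplied."""
    if params.get("date"):
        return params["date"]
    parts = [params.get(name, "").strip() for name in ("day", "month", "year")]
    return " ".join(part for part in parts if part)


def check_date(params: dict):
    """Return an error message, or None when the date passes.

    Simplified rule (assumption): a valid date contains a run of
    exactly four digits that is not part of a longer number.
    """
    date = promote_to_date(params)
    if not date:
        return None
    if re.search(r"(?<!\d)\d{4}(?!\d)", date):
        return None
    # The validator only ever sees the promoted value, so it reports |date=.
    return "Check date values in: |date="


print(check_date({"year": "l991"}))  # first character is a lower-case L
print(check_date({"year": "1991"}))
```

Because validation runs after promotion, the checker has no record of which original parameter held the bad value, which is why the help text has to explain the mapping.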

Accessdate

maybe accessdate should not raise a flag if doi is filled in? All the best, Rich Farmbrough, 00:15, 13 April 2014 (UTC).

This was discussed and explained at length on this page last year and also before that at VPP (ignore the non-consensus "consensus" summaries and scroll to the bolded "Do you agree that |accessdate= should only be displayed if there is a URL present" section for a quick overview) and maybe in other places.
The reason that this error message is still hidden is that we need a bot to comment out the accessdates in citations where it doesn't belong, like {{cite journal}} templates where doi/pmid/pmc is filled in, or {{cite book}} in most or all cases. Once that is done, we'll have a better idea of how many of those 42,774 articles have actual errors, and we can expose the error messages so that editors know that there is a problem to be fixed.
The basic consensus that I perceive from previous discussions is:
  • |accessdate= in a citation without a |url= should not be displayed
  • |accessdate= is only for web-based citations that might change, hence it should not be used for books or journal articles
  • comment out, do not delete, |accessdate= when |url= is missing, because sometimes people delete or omit the |url= and the |accessdate= gives a clue about the publication date of the web-based source
I believe that this accessdate category will soon be the largest subcategory of the incorrect syntax category (as it was before the date errors and deprecated parameters were tagged; bots are working on those categories as we speak). It's a nice meaty problem for a bot operator, and it shouldn't be that hard to address. I don't have the bot skills to take it on, but other people who read this page probably do. – Jonesey95 (talk) 00:50, 13 April 2014 (UTC)
It would be an interesting and not difficult task. Sadly one I am not at liberty to take up. All the best, Rich Farmbrough, 02:05, 13 April 2014 (UTC).
I did everything beginning with "." or "Ř".... All the best, Rich Farmbrough, 03:19, 13 April 2014 (UTC).

Perhaps I'll have a go at it with these assumptions:

  1. Only applies to CS1 citations that have |accessdate= but do not have |url=
  2. Journal identifiers are: |arxiv=, |bibcode=, |doi=, |issn=, |jfm=, |jstor=, |mr=, |pmc=, |pmid=, |zbl=
  3. for {{citation}} templates with:
    • a journal identifier: delete |accessdate= and its value
    • |journal= without an identifier: comment out |accessdate= value (|accessdate=<!--date-->)
    • an isbn: delete |accessdate= and its value
  4. for {{cite book}}: delete |accessdate= and its value
  5. for {{cite encyclopedia}}: comment out |accessdate= value
  6. for {{cite journal}} with:
    • a journal identifier: delete |accessdate= and its value
    • else comment out |accessdate= value
  7. for {{cite news}}: comment out |accessdate= value
  8. for {{cite web}}: do nothing

And since this is rather a lonely backwater, I'll put a note about this discussion at Help talk:Citation Style 1.

Trappist the monk (talk) 11:01, 13 April 2014 (UTC)
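
The assumptions listed above can be sketched as a small decision function. This Python version only mirrors the list as proposed (later comments in this thread object to parts of it); the function name, the return strings, and the dictionary representation of a citation are assumptions for illustration.

```python
# Identifiers treated as journal identifiers in item 2 of the list above.
JOURNAL_IDS = ("arxiv", "bibcode", "doi", "issn", "jfm", "jstor",
               "mr", "pmc", "pmid", "zbl")


def accessdate_action(template: str, params: dict) -> str:
    """Return 'delete', 'comment out' or 'do nothing' for one citation."""

    def filled(name: str) -> bool:
        return bool(params.get(name, "").strip())

    # Item 1: only citations with |accessdate= and without |url= qualify.
    if not filled("accessdate") or filled("url"):
        return "do nothing"

    name = template.strip().lower()
    has_journal_id = any(filled(identifier) for identifier in JOURNAL_IDS)

    if name == "citation":
        if has_journal_id or filled("isbn"):
            return "delete"
        if filled("journal"):
            return "comment out"
        return "do nothing"
    if name == "cite book":
        return "delete"
    if name in ("cite encyclopedia", "cite news"):
        return "comment out"
    if name == "cite journal":
        return "delete" if has_journal_id else "comment out"
    # {{cite web}} and anything unlisted is left for a human editor.
    return "do nothing"


print(accessdate_action("cite journal", {"accessdate": "2014-04-13", "doi": "10.1000/182"}))
```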

I do not think that we should be commenting out |accessdate= for {{cite news}} as it is basically used the same as {{cite web}}. If there is a URL then you would expect to find an accessdate. Keith D (talk) 14:15, 13 April 2014 (UTC)
Cite news could frequently refer to an online news source, which is subject to change, so the accessdate may be useful. Cite journal has long been a synonym for cite news, and is commonly applied to any periodical (or the web site of any print periodical). Some of these sources are online and subject to change, so again, accessdate could be useful. Indeed, it suggests it could be useful to have both a date parameter for the date that the publisher designates as the publication date, and a last-updated parameter to indicate the last update designated by the publisher, which might be different from the official publication date. Jc3s5h (talk) 14:36, 13 April 2014 (UTC)
Most of the cite templates (not just web and news) have a |url= parameter; if that has been filled in, I don't think that |accessdate= should be removed. --Redrose64 (talk) 14:37, 13 April 2014 (UTC)
Yeah, so I wasn't specifically clear about what I am talking about. For the above list of actions that a bot might take, I am referring to CS1 citations with |accessdate= but without |url=. I've added that restriction to the list above.
The purpose of the list of actions is to guide the development of a relatively mindless bot that can troll through Category:Pages using citations with accessdate and no URL, fixing or hiding the offending |accessdate= parameter in those citations where simple fixes make sense. This is why, for example, {{cite web}} is not fixed; those citations require a human editor to figure out why {{cite web}} doesn't have |url= but does have |accessdate=.
Trappist the monk (talk) 15:34, 13 April 2014 (UTC)
Here is a list of test edits that I've made that show what my prospective bot would do.
Trappist the monk (talk) 16:01, 13 April 2014 (UTC)
I notice that this edit comments out the access date for a citation where there is a working URL specified in |url= but the article before the edit had the url inexplicably commented out. Maybe the test should be modified to leave a citation alone if there are any non-whitespace characters in |url=. Jc3s5h (talk) 16:12, 13 April 2014 (UTC)
As far as the script is concerned, there is no |url= in those citations. The opening half of an HTML remark tag lies between the pipe and url: |<!--url. I've updated the script to protect from editing any citations that have the opening half of an HTML remark tag between the pipe and url. I have reverted the edit and had the script try again. This time, the script did not edit the page.
Trappist the monk (talk) 17:00, 13 April 2014 (UTC)
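
The guard described above (skip any citation where the opening half of an HTML remark tag sits between the pipe and url) might look something like this in Python. The actual script was written for AWB; the regular expression and function name here are assumptions for illustration only.

```python
import re

# Matches "|<!--url", allowing optional whitespace around the comment opener.
COMMENTED_URL = re.compile(r"\|\s*<!--\s*url\b", re.IGNORECASE)


def has_commented_out_url(citation_wikitext: str) -> bool:
    """Return True when a |url= parameter appears to have been commented out,
    in which case the citation should be left for a human editor."""
    return bool(COMMENTED_URL.search(citation_wikitext))


protected = "{{cite web |<!--url=http://example.com/page--> |title=X |accessdate=2014-01-01}}"
plain = "{{cite book |title=X |accessdate=2014-01-01}}"
print(has_commented_out_url(protected), has_commented_out_url(plain))
```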

I think that accessdates should not be deleted, for the reasons listed in the (long) threads I linked to above. The presence of a filled-in ISBN or journal identifier does not necessarily mean that the identifier is valid; we have basic validation for some of them and do not check the rest. Deleting the accessdate from a citation with an invalid identifier removes a clue that might help fix the identifier. I would replace each instance of "delete" with "comment out" in the list above. – Jonesey95 (talk) 04:35, 14 April 2014 (UTC)

I'm not finding that argument very persuasive. For repair of a malformed parameter value, the other information available in a citation, which is actually related to the source, is much more useful than an arbitrary date that identifies some point in time that an anonymous Wikipedia editor consulted that source. The benefit of |accessdate= for ephemeral internet sources is undeniable, but for journals, books, and other fixed sources, meh, not so much.
Trappist the monk (talk) 11:40, 14 April 2014 (UTC)
I object to |accessdate= being deleted on any citations.
  1. I regularly use access date information, even those on material which is supposedly unchanging (e.g. books, PMID, DOI, etc). |accessdate= provides a hint as to when the citation was entered on the page. Generally, this means that the person who added the citation believed that the reference supported the text at that point. While this may, or may not, be accurate, I routinely use access dates to prioritize which references need to be checked to verify that article content is still supported by the citation. Using it in this way is, of course, imperfect. However, there really is just not enough time in existence to do all the checking which really should be performed. This is one piece of information which can be used to help determine where to spend limited time.
  2. I have also found that |accessdate= is useful in attempting to determine to what a reference actually is referring. We all know those references that have inaccurate or corrupted information. Sometimes the citation is copied from page to page with errors/vandalism – I recently fixed one that had the same error on 34 pages across 7 wikis which appeared to be the result of copying a vandalized citation. All information we have, including |accessdate=, is potentially valuable in such situations. |accessdate= can provide a hint as to the time-frame of the actual reference date when that is not included with the citation.
  3. It also can be used to eliminate some possibilities of what the reference is (It can't refer to a reference created after that date). Let's not delete the information just because some people feel it is not useful, to them, on a citation that is correctly formatted and not corrupted by vandalism and copied from page to page.
  4. |accessdate= is also useful as one of the quick sanity checks for a citation: Is |accessdate= before |date=? If so, that citation needs to be checked.
  5. In addition, I have used |accessdate= as an indicator that someone has merely copied a reference from one page to another. This can indicate that the person may not have bothered to read the actual reference, which may imply that a closer examination of whether the reference actually supports the text is appropriate.
I have not yet read the discussions linked above. I will do so in the reasonably near future. Was anyone arguing that having an access date is actually bad? Or just that they did not feel it was useful, for them? It's not like we have a limited amount of space on a page and we need to trim all information which is not critical.
I see no reason to consider that having an |accessdate= without a URL is inherently an error, let alone that the |accessdate= should be deleted for that reason. I can understand the converse being an error: URL without access date. I can understand not requiring an access date for most references without a URL (i.e. ones which refer to physical objects). It is not reasonable to consider it an error to have an access date when the reference does not have a URL where other information in the citation implies that what is being referenced is not primarily on the web. It is reasonable to give a warning that someone should check the citation and verify that a URL is not supposed to be there and disable the warning (perhaps with something similar to |ignore-isbn-error=true). Alternately, just have the module not flag having an |accessdate= without a URL to be an error if there is a valid ISBN, DOI, or other permanent reference.
The existence of the error is not to indicate that having the access date is wrong. It is to indicate that having an access date makes it likely that the person forgot to enter a URL when a URL is what is being referenced. The solution to this type of error is not to get rid of the access date information. The solution is to change the module so that it does not report most of the cases mentioned in the list of tasks above where additional information that the module has (e.g. valid ISBN, DOI, etc.) indicates that a URL should not be required. Additionally, there should be a way to directly inform the module that in the specific instance having an |accessdate= without a URL has been checked by a human and is not an error.
It appears to me that people may be coming at this from the wrong point of view: See error...easiest solution is to remove the access date instead of solving it in another, more appropriate manner.
If you do end up deleting, or commenting out access date information, please keep in mind that you need to check for the existence of any URL. These include (not an exhaustive list):
URL positions:
  • bare URLs in, or next to, the citation (some are placed next to citations; yep, not the way it is supposed to be, but it is done)
  • |url=
  • |chapterurl=
  • |chapter-url=
  • |contributionurl=
  • |contribution-url=
  • |archive-url=
  • |layurl=
  • |website= (Yep, again not supposed to happen, but it does)
  • |deadurl= (Yep, again not supposed to happen, but it does)
  • commented out URLs
  • Any parameter in which the editor has placed a URL. example:
Fortescue, Sir John William (1915), A history of the British army, 4 part 2, Macmillan and company, pp. 889–890  [Note: This example links to a source which should not be changing, but other citations link to changeable content.]
  • Keep in mind that all it takes is a missing | character and the |url= is actually in some other parameter.
In your above list of tasks, #7 is just wrong. {{cite news}} is commonly used to refer to news websites, with and without URLs being included. It would only be reasonable to comment out |accessdate= if you can actually verify that there is enough valid data to indicate that the source really is a physical paper copy of an article.
For the vast majority of tasks listed above the right solution is for the module to not report the missing URL error when enough information is there to indicate a valid physical source, or "permanent" electronic location information.
— Makyen (talk) 14:10, 14 April 2014 (UTC)
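
Two of the checks raised in the comment above can be sketched in Python: looking for a URL anywhere in the citation text rather than only in |url=, and the sanity check that |accessdate= should not precede |date=. The URL pattern is a deliberate simplification, and the function names and ISO date format are assumptions for this example.

```python
import re
from datetime import date

# Simplified pattern: scheme, then anything up to whitespace or wikitext delimiters.
URL_PATTERN = re.compile(r"https?://[^\s|}\]<>]+", re.IGNORECASE)


def find_urls(citation_wikitext: str) -> list:
    """Return every URL-like string in the citation, whichever parameter
    (or comment, or bracketed link) it happens to sit in."""
    return URL_PATTERN.findall(citation_wikitext)


def accessdate_precedes_date(accessdate: str, published: str) -> bool:
    """Return True when the access date is earlier than the publication date.
    Both arguments are assumed to be ISO dates (YYYY-MM-DD)."""
    return date.fromisoformat(accessdate) < date.fromisoformat(published)


text = "{{citation |title=[http://books.example.org/id=1 A history] |accessdate=2014-01-01}}"
print(find_urls(text))
print(accessdate_precedes_date("2013-01-01", "2014-02-02"))
```

Scanning the whole citation catches the misplaced-URL cases listed above, such as a URL inside |title= or a |url= swallowed by a neighbouring parameter because of a missing pipe.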
All of the above argues for commenting out, not deleting, the accessdate, in citations that refer to permanent sources. Remember that without a |url=, the access date is already not displayed, so commenting it out does not change the displayed citation.
Makyen's example citation above is a perfect example of why we need to go through this category with a script or a bot. The example uses {{citation}} and it does not contain any identifiers, so it would not be touched by the bot. After the bot removes all of the noise from the category by commenting out accessdates in cite templates for sources believed to be permanent, citations like the one above will be left. We will then be able to troubleshoot and fix them by hand. Thanks for the example. – Jonesey95 (talk) 16:57, 14 April 2014 (UTC)
Proponents of the RfC that resulted in the hiding of access date error messages promised a bot that would fix citations in Category:Pages using citations with accessdate and no URL. That never happened. A bot to fix these errors was quietly begun and even more quietly abandoned. What there was of it was rather more mindless than my AWB script; it simply deleted |accessdate= when the citation did not include |url=. In my oppose to the RfC, I noted then that I thought that a bot probably couldn't do the job right.
Editor Jonesey95 has twice in this discussion asserted that we need a bot that removes all of the noise from the category. Yet, whatever action a bot takes, whether it's deleting |accessdate= entirely or simply hiding all or part of |accessdate=, the citation will drop off the radar and these broken (according to the current criteria) citations will no longer be easily found.
Since there seemed to be a desire for an automated tool to fix some of these errors, I hacked up a script to do so. But, as I tested my script, I started to wonder if it really is possible to fix these errors in the manner that has been prescribed. I don't think that it is; at least not using the current criteria and definitions which I have also started to question. To that end, I have started a separate discussion about rethinking |accessdate=.
Trappist the monk (talk) 20:04, 14 April 2014 (UTC)
It is desirable for "broken" citations to drop off the radar when they are fixed. However, simply deleting all of the accessdates from broken citations in the category (something that has not been proposed seriously) would not be appropriate, since some of the error messages indicate a problem, often a subtle problem, with the citation syntax. We need to find some middle ground between wholesale removal of the accessdates and editing each article by hand.
The trick is to send 40,000+ articles through a filter that comments out accessdates in citations that truly do not need them, like books with ISBNs or journal articles with DOI values. Once that is done, we'll be able to see the articles with actual problems. This is the same approach that was taken in the "unsupported parameters" category, where straightforward misspellings were replaced with corrected parameter names. Once that was done, the one-off "oddballs" were left for humans to process. – Jonesey95 (talk) 20:53, 14 April 2014 (UTC)
A bot, or multiple bot runs, may be appropriate in the future. However, the correct solution for the specific situations you mention is to change the module to not report an error in those cases. It should be a relatively simple code change to have the module not report an error when either a valid format ISBN or a valid format DOI are present. The solution to these is not to have a bot run through and comment out the |accessdate=.
To me, such simple situations are bugs/RFE issues with the module. They are not fodder for a bot to go change the citations just to remove them from a category in which they were placed due to a lack of discrimination by the module. If we were stuck with a template using wiki template code then that would be the correct solution. We are not. We have a Lua based module which can be programmed to more accurately assign, or not assign, citations to the error category.
Sure, at some point we will need bots/humans to go through the category. We are not there yet.
note: In the list above (2. Journal identifiers), an ISSN does not uniquely identify an article/item (in most cases). ISSNs are automatically provided by some tools when citing web based sources. ISSNs are usually assigned as one number to a particular journal – not even a particular issue of the journal. I am unsure about a couple of the other identifiers. If the ID does not uniquely identify the work to which the citation refers, it should not be cause to not display the error. — Makyen (talk) 22:22, 14 April 2014 (UTC)
Yep, I think it's pretty easy to change Module:Citation/CS1's behavior to create variant subsets of the |accessdate= error. In part that's why I started the discussion about rethinking |accessdate=. Before leaping into the code, we need better clarity on just what |accessdate= is, how it should be used, how it should be displayed, etc.
Trappist the monk (talk) 00:59, 15 April 2014 (UTC)
@Jonesey95: The filtering you describe is not the same as the filtering that fixes misspellings. A word is either misspelled or it isn't, making the citation broken or not broken. Following a bot run, fixed citations are fixed, and the broken ones are still broken. After a bot run where the fix is to hide |accessdate= when there isn't a |url=, broken citations are still broken and fixed citations may be fixed or may be broken, but, in any case, are no longer conveniently listed in a place where editors can find and fix those that are still broken. Fixing in this manner is irreversible – once gone from the category, those pages are gone.
Trappist the monk (talk) 00:59, 15 April 2014 (UTC)
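
As a rough illustration of the alternative proposed in this thread (have the module stay quiet when a validly formatted ISBN or DOI is present), here is a Python sketch. Module:Citation/CS1 is written in Lua and nothing below is taken from it: the ISBN check uses the standard ISBN-10 and ISBN-13 check-digit arithmetic, the DOI shape test is a loose assumption, and should_flag_accessdate is a hypothetical name.

```python
import re


def is_valid_isbn(raw: str) -> bool:
    """Validate an ISBN-10 or ISBN-13 by its check digit."""
    chars = raw.replace("-", "").replace(" ", "").upper()
    if len(chars) == 13 and chars.isdigit():
        # ISBN-13: weights alternate 1, 3; the total must be divisible by 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
        return total % 10 == 0
    if len(chars) == 10 and chars[:9].isdigit() and (chars[9].isdigit() or chars[9] == "X"):
        # ISBN-10: weights run 10 down to 1; X stands for 10; total divisible by 11.
        values = [int(c) for c in chars[:9]] + [10 if chars[9] == "X" else int(chars[9])]
        total = sum(v * (10 - i) for i, v in enumerate(values))
        return total % 11 == 0
    return False


# Loose assumption about DOI shape: "10.", a registrant code, a slash, a suffix.
DOI_SHAPE = re.compile(r"^10\.\d{4,}/\S+$")


def should_flag_accessdate(params: dict) -> bool:
    """Flag |accessdate= without |url= unless a well-formed ISBN or DOI is present."""
    if not params.get("accessdate") or params.get("url"):
        return False
    if is_valid_isbn(params.get("isbn", "")):
        return False
    if DOI_SHAPE.match(params.get("doi", "").strip()):
        return False
    return True


print(should_flag_accessdate({"accessdate": "2014-04-14", "isbn": "0-306-40615-2"}))
print(should_flag_accessdate({"accessdate": "2014-04-14", "title": "X"}))
```

Note that an ISSN is left out on purpose, following the point made above that it identifies a journal rather than a particular article.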

lccn error

I am citing a book where the lccn number printed on the inside cover does not match the formatting expected by the template. It is only seven digits long. How do I suppress the error message? Or should I just leave the lccn number off entirely? Reyk YO! 03:56, 9 May 2014 (UTC)

The code that flags LCCNs as erroneous has a bug that identifies some valid LCCNs as invalid. This bug has been fixed but not deployed to the live code yet. I recommend leaving your LCCN in the citation as it appears in the book, even if it creates a red error message. – Jonesey95 (talk) 04:16, 9 May 2014 (UTC)
(edit conflict) I disagree that it should be left as it is in the book. 7 digits is invalid under the current format and will not link correctly. Are we "correcting" yy-xxxx to yyxxxxxx??? Definitely not doing so currently.
@Reyk: As far as I know the error cannot be suppressed. The issue is that the format of the LCCN which is in the book does not comply with the current formatting requirements of the LoC. The goal for the citation is for there to be a link from the LCCN in the displayed citation to the LCCN Permalink at the LoC. Doing so provides a direct way for the reader to find the cited item. In order to do so the LCCN must comply with the current format.
The format requirements are here. The minimum number of digits that the LoC now uses for LCCNs is 8. From fixing a bunch of these, and your statement that you have 7 digits I would guess that it is in the format yy-xxxxx (maybe yyxxxxx, but less likely). Change it to yy0xxxxx (add a zero). That should get it done. The 1st 2 digits are the last two digits of the year for years prior to 2000. For 2000 and on all year digits are used. Sometimes there is a 1 to 3 character alphabetic prefix, but it sounds like yours does not have one. You should double check that the link actually does go to the correct item. From time to time these are printed wrong in the book (not often, but sometimes). If it is, use the LC search to find it. — Makyen (talk) 04:22, 9 May 2014 (UTC)
@Makyen:- Thanks for that suggestion. Adding the 0 in place of the hyphen works, and I checked the LC Search to verify that this is correct. Reyk YO! 04:32, 9 May 2014 (UTC)
  • I've had the same problem as above. The book I was trying to cite had only 5 digits to the right (published in 1967). Adding a 0 in place of the hyphen resolved the problem. Thanks. Mohamed CJ (talk) 10:01, 30 May 2014 (UTC)
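
The fix used in this thread (replace the hyphen with a zero) is a special case of padding the serial number after the hyphen to six digits. A minimal Python sketch of that normalization follows; the function name is hypothetical, the sample numbers are made up, and the result should still be checked against the LoC catalogue as advised above.

```python
def normalize_lccn(raw: str) -> str:
    """Remove the hyphen from an LCCN and left-pad the serial part to six digits.

    Assumption: the part after the hyphen is the serial number and contains
    only digits, as in the yy-xxxxx form discussed above.
    """
    value = raw.replace(" ", "")
    if "-" not in value:
        return value
    prefix, serial = value.split("-", 1)
    if not serial.isdigit() or len(serial) > 6:
        raise ValueError(f"unexpected serial part in {raw!r}")
    return prefix + serial.zfill(6)


print(normalize_lccn("67-12345"))   # five-digit serial gains one zero
print(normalize_lccn("67-1234"))    # shorter serials gain more
```

For a serial of exactly five digits this gives the same result as swapping the hyphen for a single zero; shorter serials need more than one zero, which the simple swap would miss.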

template parameter conflict

|coauthor= is deprecated, yet I have just come across a problem caused by changing |coauthor= to |author2= with this edit. By substituting these, the two relevant calls in the references section no longer find their Biblio target. I didn't see this, but it becomes apparent if you import User:Ucucha/HarvErrors.js. -- Ohc ¡digame! 03:21, 28 June 2014 (UTC)

Yes, you can either change the {{sfn}} to add more authors or change ref = harv to ref = {{harvid|...}} if you are going to change |coauthor= into |author2=. Like this. – Jonesey95 (talk) 05:59, 28 June 2014 (UTC)
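
Why the links break can be shown with a small Python sketch. It assumes that ref = harv builds its anchor from the listed author surnames (up to four) followed by the year, and that a name in |coauthor= does not count as a listed author; the function name and the sample names are hypothetical.

```python
def harv_anchor(surnames: list, year) -> str:
    """Build a CITEREF-style anchor from up to four surnames and a year.
    Assumed behaviour, for illustration only."""
    return "CITEREF" + "".join(surnames[:4]) + str(year)


# The short footnote in the article body names only the first author.
footnote_target = harv_anchor(["Smith"], 2005)

# Before the edit: the second author sat in |coauthor= and was not part of the anchor.
before = harv_anchor(["Smith"], 2005)

# After the edit: |author2= adds a second surname to the anchor.
after = harv_anchor(["Smith", "Jones"], 2005)

print(footnote_target == before)  # link resolves
print(footnote_target == after)   # link is broken until the footnote or the ref is updated
```

Either remedy mentioned above restores the match: adding the second author to the footnote changes the target, while an explicit {{harvid|...}} pins the anchor to the original single-author form.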