Help talk:Citation Style 1


This is an old revision of this page, as edited by Trappist the monk (talk | contribs) at 00:24, 4 April 2020 (→‎Who is the publisher?). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.


spam black list and archive urls

There is a discussion: Wikipedia:Village pump (technical) § Possible interaction of spam blacklist and citation archival-url. Apparently, the spam blacklist can be triggered by a url embedded in an archive.org snapshot url (and presumably in other archive urls that include the original url). This presents a problem to editors who try to fix cs1|2 template citations. One solution described at the aforementioned discussion is to percent encode the original url in the archive url; this:

https://web.archive.org/web/20091002033137/http://www.example.com/

becomes this:

https://web.archive.org/web/20091002033137/http%3A%2F%2Fwww.example.com%2F

I have hacked on Module:Citation/CS1/sandbox and implemented this solution. Here for |url= and |title=:

{{cite book/new |title=Title |url=http://www.example.com |archive-url=https://web.archive.org/web/20091002033137/http://www.example.com/ |archive-date=2009-10-02 |url-status=unfit}}
Title. Archived from the original on 2009-10-02.{{cite book}}: CS1 maint: unfit URL (link)
<cite class="citation book cs1">[https://web.archive.org/web/20091002033137/http://www.example.com/ ''Title'']. Archived from the original on 2009-10-02.</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=book&rft.btitle=Title&rft_id=http%3A%2F%2Fwww.example.com&rfr_id=info%3Asid%2Fen.wikipedia.org%3AHelp+talk%3ACitation+Style+1" class="Z3988"></span><span class="cs1-maint citation-comment"><code class="cs1-code">{{[[Template:cite book|cite book]]}}</code>: CS1 maint: unfit URL ([[:Category:CS1 maint: unfit URL|link]])</span>

and here for |chapter-url= and |chapter=:

{{cite book/new |chapter=Chapter |chapter-url=http://www.example.com |title=Title |url=http://www.example.com |archive-url=https://web.archive.org/web/20091002033137/http://www.example.com/ |archive-date=2009-10-02 |url-status=unfit}}
"Chapter". Title. Archived from the original on 2009-10-02.{{cite book}}: CS1 maint: unfit URL (link)
<cite class="citation book cs1">[https://web.archive.org/web/20091002033137/http://www.example.com/ "Chapter"]. [http://www.example.com ''Title'']. Archived from the original on 2009-10-02.</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Abook&rft.genre=bookitem&rft.atitle=Chapter&rft.btitle=Title&rft_id=http%3A%2F%2Fwww.example.com&rfr_id=info%3Asid%2Fen.wikipedia.org%3AHelp+talk%3ACitation+Style+1" class="Z3988"></span><span class="cs1-maint citation-comment"><code class="cs1-code">{{[[Template:cite book|cite book]]}}</code>: CS1 maint: unfit URL ([[:Category:CS1 maint: unfit URL|link]])</span>

This code looks for the original url (|url=) in the archive url (|archive-url=). If found, the archive url is split at the beginning of the embedded original url. The embedded original url is then percent encoded and the two parts rejoined to make a new archive url. The same is true when |chapter= and |chapter-url= are set, and |chapter-url-status=unfit (or usurped).
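For readers who want to see the split-and-rejoin step concretely, here is a minimal Python sketch of the idea. It is not the Lua code from Module:Citation/CS1/sandbox; the function name `encode_embedded_url` is made up for illustration, and the choice to encode everything from the start of the embedded url to the end of the archive url is an assumption based on the example urls above.

```python
from urllib.parse import quote


def encode_embedded_url(archive_url: str, original_url: str) -> str:
    """Percent-encode the original url where it is embedded in an archive url.

    Hypothetical helper for illustration; not part of cs1|2.
    """
    # Find where the original url starts inside the archive url. A result of
    # -1 means it is not embedded; 0 means there is no archive prefix to keep.
    split_at = archive_url.find(original_url)
    if split_at <= 0:
        return archive_url

    prefix, embedded = archive_url[:split_at], archive_url[split_at:]
    # safe="" makes quote() encode ':' and '/' too, which it otherwise keeps.
    return prefix + quote(embedded, safe="")


print(encode_embedded_url(
    "https://web.archive.org/web/20091002033137/http://www.example.com/",
    "http://www.example.com",
))
```

With the example urls from the top of this section, the sketch produces the percent-encoded form shown there.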

For now this applies to all 'unfit' and 'usurped' urls. Presuming we keep this, I wonder if we ought not have another keyword for |url-status=; perhaps blacklisted. A separate maintenance category might also be in order.

Keep? Discard? Opinions?

Trappist the monk (talk) 17:00, 3 October 2019 (UTC)[reply]

I think this is as much an acceptable solution as any, at least as long as archive services do not disallow percent-encoding referrals for whatever weird reason. A social rather than technical issue may arise from editors who may wonder why a blacklisted url displays in the first place. 72.43.99.130 (talk) 18:37, 3 October 2019 (UTC)[reply]
... editors who may wonder why a blacklisted url displays in the first place. I think that's not an issue because the title is not linked to the blacklisted url but to a (presumably) good snapshot of the website page before it was blacklisted. I presume here that the editor who chose the archive url did so in good faith and that the archived source does, indeed, support the Wikipedia article's text. I suppose that the argument might be made that a blacklisted url is a blacklisted url whether it's archived or not. Still, to your point, using |url-status=unfit or |url-status=usurped disables the link to the original url in the rendered citation.
Trappist the monk (talk) 19:12, 3 October 2019 (UTC)[reply]

Never mind. I have reverted this change per the linked discussion.

Trappist the monk (talk) 22:30, 3 October 2019 (UTC)[reply]

Regarding this:

I suppose that the argument might be made that a blacklisted url is a blacklisted url whether it's archived or not.

I think that shouldn't be an issue. We should distinguish between these two cases:
  1. The url (or domain) was always malware/spam; it was never suitable for a reference, and still is not.
  2. The url (or domain) started off as a good source, but is malware/spam now.
One strength of having an archive in the first place, is that it can help us deal with case #2, and provide a good copy of an url back before it changed. This may be an argument for different handling of the two cases above, which may imply different values for |url-status.
I am not certain what your expectations were about how editors should employ the values unfit and usurped , given that the CS doc for url has little to say about them. But we could, I suppose, assign (or reassign) the usurped value to case #2: that is, "The url was good once (and the archive may still retain a copy), but it isn't good anymore", which goes along with one set of display possibilities including a displayable |archive-url. That might leave unfit to cover case #1, with a different set of display characteristics (including forbidding |archive-url, if it was always bad). Or, if that's not what you intended unfit to be, then perhaps some new value (forbidden, blacklist, or whatever) to indicate that this was never a usable url and the |archive-url should be suppressed if there is one.
Whatever the case (and even if nothing changes with respect to those two values), the documentation should be updated to clearly explain these two values, and how they should be used. I'm okay with not having it updated now, especially if the usage or meaning of these values is in flux, but once things shake out, there should be a clear and thorough explanation. (If you want help editing some doc for it when the time is right, feel free to issue a request on my Talk page, and I'll be happy to help.) Mathglot (talk) 02:44, 5 October 2019 (UTC)[reply]

Original discussions about parameter values unfit and usurped are at:
Neither of those discussions consider blacklisted urls.
There were subsequent discussions with regard to parameter values:
With regard to your statement:
The url (or domain) was always malware/spam; it was never suitable for a reference, and still is not.
It has been pointed out that percent-encoding the original url in an archive url may be used to mask a site that has always been malicious. That is also true of archive sites that support url shortening – create an archive copy of the malicious site at archive.today, use the shortened url to avoid the blacklist (until one of the bots that lengthens shortened urls arrives to lengthen it). As an aside, when these lengthening bots attempt to save an article that now has a blacklisted url embedded in an archive url, what happens?
I suppose that when archive urls link to malicious archives, the whole archive url can be blacklisted (presumably with sufficient flexibility that such blacklisting catches all archive urls regardless of timestamp). If there is a specific archive timestamp that can be shown to not be malicious, then an editor could possibly petition whoever does this sort of thing to white-list that particular archive. The question then becomes, how do we mark such white-listed archive urls?
For me, I understand unfit and usurped to mean that the url links to:
  • unfit – link farm or advertising or phishing or porn or other generally inappropriate content
  • usurped – new domain owner with legitimate content; original owner with legitimate content unrelated to the originally cited url's content
Yep, there is no bright line separating the two but, as can be seen from the original discussions of these parameter values, we struggled to get even these because the waters, they are muddy.
And I repeat myself yet again: if you can see how the documentation for these templates can be improved, please do so.
Trappist the monk (talk) 14:34, 5 October 2019 (UTC)[reply]
lengthening bots .. what happens? - I believe there is a flag to exempt bot accounts from being blocked on save. I prefer to get blocked to manually fix. My bot also decodes encoded schemes in the path/query portion so the filters are not bypassed. IMO re whitelisting, it is often a matter of judgement/opinion and also double jeopardy since the original blacklisting presumably had a consensus discussion, it opens every blacklisted URL up to a new potential consensus discussion. This is a loophole for users to get past blacklists and overhead to manage. -- GreenC 22:10, 5 October 2019 (UTC)[reply]
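To illustrate why decoding matters for the filters, here is a rough Python sketch. It is not GreenC's bot code and not the real spam blacklist: the `BLACKLIST` pattern is a hypothetical entry, and decoding repeatedly until the string stops changing is an assumption made here so that double-encoded urls are caught as well.

```python
import re
from urllib.parse import unquote

# Hypothetical blacklist entry, for illustration only.
BLACKLIST = re.compile(r"https?://[a-z0-9_.\-]*example\.com", re.IGNORECASE)


def decode_fully(url: str) -> str:
    """Percent-decode repeatedly until the string stops changing."""
    previous = None
    while url != previous:
        previous, url = url, unquote(url)
    return url


def is_blacklisted(url: str) -> bool:
    """Test the decoded url, so an encoded scheme cannot hide a match."""
    return BLACKLIST.search(decode_fully(url)) is not None


encoded = "https://web.archive.org/web/20091002033137/http%3A%2F%2Fwww.example.com%2F"
print(BLACKLIST.search(encoded) is not None)  # pattern applied to the raw url
print(is_blacklisted(encoded))                # pattern applied after decoding
```

The raw url slips past the pattern because the embedded `http://` is written as `http%3A%2F%2F`; after decoding, the same pattern matches.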
usurped – new domain owner with legitimate content; original owner with legitimate content unrelated to the originally cited url's content
I assumed usurped to be closer to hijacked? If there is a new, properly registered owner (publisher) did any usurpation take place? 72.43.99.138 (talk) 15:42, 6 October 2019 (UTC)[reply]
'When I use a word,' Humpty Dumpty said in rather a scornful tone, 'it means just what I choose it to mean — neither more nor less.'
I think that these definitions of usurped, unfit, and possibly other values of |url-status need solid, agreed-upon definitions. Just from the point of view of English usage, never mind specialized wiki vocabulary, usurped is much more like what IP 72 stated. The sense of a new domain owner with legit content is nothing like most native English speakers would imagine, I don't think, when seeing the word usurped.
To me, your definition is a bit more like what would apply to a word like, repurposed, or reassigned, or repositioned or perhaps some word from marketing vocab when one company buys another's superannuated property, if there is such a word. The term usurped does not seem appropriate for the meaning you assume for it. This all needs further airing out, before the spam blacklist wrinkle, which is an edge case of the broader problem, can even be discussed. I have a feeling that there may be a need for at least one, perhaps two more values for |url-status to cover the different meanings that we seem to be alluding to for it, and trying to cram into too few values. Mathglot (talk) 23:12, 9 October 2019 (UTC)[reply]
Just wanted to be clear about one point: I don't think we need new values, just for the sake of new values; there's no need to distinguish every possible thing that could happen with an url. But, when they should be handled differently by the software, then, yes: we do need values for those cases. When the confusion surrounding the current meanings of usurped and unfit is settled, I suspect we will find that we will need at least one more value, in order to assign it to different handling in the software, and I think the spam blacklist case may be one such example. Mathglot (talk) 23:19, 9 October 2019 (UTC)[reply]
If you don't like the definitions that I offered above, write better definitions. I did write above: ...as can be seen from the original discussions of these parameter values, we struggled to get even these... Yeah, we know that these parameter keywords are less than optimal so there is no real need to spend a lot of words telling us what we already know. Suggest better definitions and / or suggest better keywords.
Trappist the monk (talk) 12:43, 10 October 2019 (UTC)[reply]
For domain names that are not trademarked, |url-status=reassigned would be imo a good option to clarify there is a new registrant. Obviously trademarked domains (like say, newyorktimes.com) would not normally lapse, so in these cases |url-status=usurped would be more accurate. 72.43.99.138 (talk) 13:55, 11 October 2019 (UTC)[reply]
I agree with 72.43.99.138. — UnladenSwallow (talk) 17:25, 11 October 2019 (UTC)[reply]

Interaction with Use dmy dates

Picking up a point I made at the talk page of {{Use dmy dates}}, the consequence of the CS1/2 templates using this template to standardize dates in citations is that it violates WP:CITERETAIN, because of the retrospective element. There are articles to which {{Use dmy dates}} was added years ago (one example I found is from 2005). When these have consistent (or almost consistent) YYYY-MM-DD access and archive dates, these get changed to DMY dates, contrary to well established guidelines. It's no defence to say that this behaviour can be over-ridden by the use of |cs1-dates=, since no editor knew until the change was made that this parameter was necessary or even existed.

The CS1/2 templates should stop changing the displayed format of access and archive dates when {{Use dmy dates}} is present without |cs1-dates=. Peter coxhead (talk) 11:01, 13 February 2020 (UTC)[reply]

On the contrary, parameters allowing a particular citation to contradict the date format set for the entire rest of a page should be deleted. ―cobaltcigs 07:06, 8 March 2020 (UTC)[reply]
@Cobaltcigs: see MOS:DATEUNIFY, which is very clear: "Access and archive dates in an article's citations should all use the same format, which may be: the format used for publication dates in the article; the format expected in the citation style adopted in the article (e.g. 20 Sep 2008); or yyyy-mm-dd". I repeat that retrospectively changing the style of citations is a violation of WP:CITERETAIN, and seems to be part of a long (and in this case underhand) campaign by some editors to prevent the allowed use of "yyyy-mm-dd" dates in some contexts. Peter coxhead (talk) 10:30, 8 March 2020 (UTC)[reply]

comma space error

did a search on the archive and found this which fixed the |page= option, but it also happens on the |issue= option. Dave Rave (talk) 00:47, 15 February 2020 (UTC)[reply]

A real-life example of what you mean will help us help you. Example please?
Trappist the monk (talk) 01:12, 15 February 2020 (UTC)[reply]
any page with the option on it would show it to you .... you could sandbox it ?? oh, let me just hold your hand for a bit Dave Rave (talk) 01:16, 15 February 2020 (UTC)[reply]
Not helpful. If there is a problem with a citation template, please provide exact information. Vague statements about archived discussions involving unspecified options in "pages" will delay resolution of the issue if there is any. You could have stated that |issue= does not render properly if it includes a comma as in 5-digit numbers. That would have moved the discussion faster. 199.102.115.203 (talk) 15:03, 15 February 2020 (UTC)[reply]
Dave Rave, the {{cite compare}} template is very useful for demonstrating simple test cases for all of the CS1 templates. You can see examples of its use above. – Jonesey95 (talk) 16:25, 15 February 2020 (UTC)[reply]
what is not useful ? the archive on the pages which has the same problem ? the let me show you page where you can see all the refs with comma space, or is it just too hard ? Dave Rave (talk) 20:51, 15 February 2020 (UTC)[reply]
You (sarcastically) pointed us to a whole article and, before that, to a long talk page discussion that covered multiple topics. You are asking us to scrub through both of those things, trying to figure out what you are talking about. If you actually want help, be clear and concise in your request for help. If you just want to argue and accuse us of being lazy, please choose a different venue. – Jonesey95 (talk) 21:29, 15 February 2020 (UTC)[reply]
that might be the comma space error purported to in the title, oh, heavens, it's like talking to a wall. Must be hard being administrative, like being on a help desk. Dave Rave (talk) 00:06, 16 February 2020 (UTC)[reply]
As a somewhat off-topic sidenote, these extremely high 5-digit issue numbers were added in this edit. While I can imagine issue numbers reaching these heights for daily newspapers, I must admit that I have never seen any such high numbers printed on newspapers myself (typically they are divided into yearly volumes and thus the issue numbers remain much smaller). Therefore (and since the original contributor can't be bothered to answer this question, unfortunately), can someone confirm that these numbers are genuine and real? Were they actually printed as such on these newspapers or are they some calculated or otherwise derived numbers found in some later (online) index? If they are not true issue numbers but other index numbers, should they perhaps be moved into the |id= parameter instead? Thanks. --Matthiaspaul (talk) 09:26, 16 March 2020 (UTC)[reply]
Today's The Times (London) has No. 73,108 after the date on the front page (see here near middle). —  Jts1882 | talk  11:41, 16 March 2020 (UTC)[reply]
I see, thanks for the pointer. Much appreciated.
--Matthiaspaul (talk) 14:27, 16 March 2020 (UTC)[reply]

Access levels and archives

Is there a way to mark that a live URL has a subscription or similar access requirement but the archived version is free to read? This is not uncommon with old news articles. An example: live (subscription required to read complete article) archive (full article available for free.) Thryduulf (talk) 14:11, 17 February 2020 (UTC)[reply]

Doesn't cs1|2 do that already? Setting |url-access=subscription causes the subscription icon to follow |url= in the rendered citation:
with |url-status=live
{{cite news |title=Title |url=http://www.elnuevodia.com/noticias/locales/nota/evolucionaelproyectodecomunidadesespeciales-2294682 |url-access=subscription |archive-url=https://web.archive.org/web/20170605084500/http://www.elnuevodia.com/noticias/locales/nota/evolucionaelproyectodecomunidadesespeciales-2294682 |archive-date=2017-06-05 |url-status=live}}
"Title". Archived from the original on 2017-06-05.
with |url-status=dead
{{cite news |title=Title |url=http://www.elnuevodia.com/noticias/locales/nota/evolucionaelproyectodecomunidadesespeciales-2294682 |url-access=subscription |archive-url=https://web.archive.org/web/20170605084500/http://www.elnuevodia.com/noticias/locales/nota/evolucionaelproyectodecomunidadesespeciales-2294682 |archive-date=2017-06-05 |url-status=dead}}
"Title". Archived from the original on 2017-06-05.
cs1|2 doesn't support |archive-url-status= because an archived copy of the source's teaser-view of a subscription-only page seems rather pointless to me (though someone thought it a good idea to archive and cite the now defunct HighBeam Research teaser pages ... (example)).
Trappist the monk (talk) 15:44, 17 February 2020 (UTC)[reply]
I have actually found that archiving a page is a good way to get around GDPR and paywall restrictions. That said, even if that is the case, I don't think supporting an icon on archives makes much sense. --Izno (talk) 16:57, 17 February 2020 (UTC)[reply]

Related question ( Template talk:Cite web redirects here ). The parameter |subscription= is still on the documentation page, but no longer works. Can someone retire it ? TGCP (talk) 08:09, 9 March 2020 (UTC)[reply]

 Done for {{cite web}}. – Jonesey95 (talk) 17:20, 9 March 2020 (UTC)[reply]

Request to add Semantic Scholar IDs to the citation template

I’m reaching out from Semantic Scholar, a free, non-profit academic search and discovery engine developed by the Allen Institute for AI (AI2) which was first launched in 2015 and now indexes 180 million research papers from all scientific domains. From a content perspective, we have indexing licensing agreements to index scientific content from 550+ publishers, pre-print servers and academic societies and are integrated with multiple data partners including PubMed, Microsoft Academic, Unpaywall and others that provide us with high-quality metadata for our results (all of our content is publicly and freely available and we do not generate any revenue). We’ve been actively working with Citation Bot to add Semantic Scholar as a source for outbound links for licensed content and based on the discussion here, the Wikipedia community recommended that we submit a request to add links to Semantic Scholar IDs as a new identifier type in the Citation Template which can then be used by the Citation Bot.

For additional context, our goal in incorporating links to Semantic Scholar in Wikipedia citations is to provide an additional discovery entry point for Wikipedia users to explore our open literature graph and find additional relevant information for scientific articles that they are unlikely to find elsewhere. For example, in addition to citations/references, figures and tables we provide AI-based features such as citation classifications and high-quality supplemental content like videos, presentation slides, and links to code libraries (you can see an example here).

We are proposing to add our persistent Paper IDs in the following format: semanticscholar=1fa190b60988a4ad272e39e132bcc12b00429464 (with a persistent link in this format: http://api.semanticscholar.org/1fa190b60988a4ad272e39e132bcc12b00429464), but are open to suggestions (if the IDs are too long we can use our persistent corpus ID instead which looks like this: 134350433 - note: these are currently not shown on our website, but will be made available in our API within the next 2 weeks). Once these IDs are made available we plan to work with the Citation Bot to integrate API calls using our DOI resolver to generate corresponding links to Semantic Scholar pages (for example, DOI=10.1038/nrn3241 resolves to semanticscholar=da82f8e6ff009432896730061247fa6653bed1f0). Please let us know what additional information we can provide for this request to be considered by the Wikipedia community or if anyone has any questions or feedback! Sebaskohl (talk) 20:18, 18 February 2020 (UTC)[reply]

"1fa190b60988a4ad272e39e132bcc12b00429464" is indeed too long. ID codes should be designed so humans can make some sort of sense of them and actually use them. A 9 digit code can handle a billion documents. That should be plenty. There is no need for a 40-hex long code of random gibberish to encode 16^40 = 2^160 documents. Headbomb {t · c · p · b} 20:52, 18 February 2020 (UTC)[reply]
Also note that a URL of the type "https://domain/identifier" that resolves to "https://domain/identifier/Long-Description" is probably better than one that resolves to "https://domain/Long-Description/identifier". For a bot, one format to another is very likely trivial, but for humans, it's both easier and more accessible to truncate after the identifier, rather than cut the middle part of the url. Headbomb {t · c · p · b} 21:01, 18 February 2020 (UTC)[reply]
Thank you for the feedback Headbomb! Good news is that we just made a change to our API to support shorter persistent IDs for papers in our index (we will be making the same IDs available on our website shortly to support manual edits). semanticscholar=37220927 now maps to 26efc3b4216a117a01975be4dfffa5267bfff64d and the new IDs will be 9 digits or less in length. Linking to these IDs is also straightforward and URLs can be constructed in the following format: https://api.semanticscholar.org/CorpusID:37220927. Let me know what you think and if this satisfies the recommendations/requirements that you've outlined. If yes, do you know what the next steps are to (1) get approval and (2) if approval is granted add this new identifier to the Citation Template? Appreciate your help! Sebaskohl (talk) 16:49, 19 February 2020 (UTC)[reply]
@Sebaskohl: For the parameter name, what will SS display? CorpusID:37220927, which should be displayed as CorpusID:37220927? Or will you have stronger branding, like 'SemanticID:37220927' or 'SSCID:37220927' (short for Semantic Scholar CorpusID) or whatever. I personally like the latter, but that's just me. Headbomb {t · c · p · b} 17:42, 19 February 2020 (UTC)[reply]
The new IDs will be 9 digits or less in length. Does that mean that these IDs are randomly assigned? Sequentially assigned? What about leading zeros; permitted; not permitted? (modifying https://api.semanticscholar.org/CorpusID:37220927 to https://api.semanticscholar.org/CorpusID:037220927 suggests that leading zeros are permitted) Is there a minimum value? If 1 ≤ id:length() ≤ 9 is 'valid' then the only rationality checks that cs1|2 can do are a max length check and a check to be sure that the ID is only digits.
Trappist the monk (talk) 22:26, 19 February 2020 (UTC)[reply]
Thank you for the feedback! How about using S2CID as the abbreviation for Semantic Scholar CorpusID since S2 is already used on Wikidata? With regards to formatting, I can confirm that IDs are sequentially assigned and start with a minimum value of 1 with a current max value of 211133348 (to be safe a max length of 10 digits should provide sufficient extensibility into the future). Leading zeros are supported with our API, but disallowing them is probably a good idea just in case. Sebaskohl (talk) 00:11, 21 February 2020 (UTC)[reply]
Ultimately up to you what you decide to call/present the identifier. The Wikidata property can easily be named whatever, so if you feel SSCID is clearer/rolls off the tongue more easily than S2CID (this is my position), go for that. If you like S2CID better, because Semantic Scholar = S2, go for that although I can't say I ever encountered that abbreviation personally. Headbomb {t · c · p · b} 00:32, 21 February 2020 (UTC)[reply]
Sounds good! Let's go with S2CID since that aligns with the abbreviation that we already use for Semantic Scholar. Sebaskohl (talk) 00:48, 21 February 2020 (UTC)[reply]
This is fantastic. "(to be safe a max length of 10 digits should provide sufficient extensibility into the future)" <-- save this to look back on it in 10 years ;) – SJ + 14:33, 21 February 2020 (UTC)[reply]
first hack:
{{cite journal/new |vauthors=Kawchuk G, Prasad NG, Chamberlain RF, Klymkiv A, Peter L |title=The effect of a standardized massage application on spinal stiffness in asymptomatic subjects |journal=BMC Complementary and Alternative Medicine |volume=12 |issue=Supp 1 |page=P147 |s2cid=37220927 |doi=10.1186/1472-6882-12-S1-P147 |doi-access=free}}
Kawchuk G, Prasad NG, Chamberlain RF, Klymkiv A, Peter L. "The effect of a standardized massage application on spinal stiffness in asymptomatic subjects". BMC Complementary and Alternative Medicine. 12 (Supp 1): P147. doi:10.1186/1472-6882-12-S1-P147. S2CID 37220927.
I have set the upper limit bounds to 270000000.
I notice that there are s2 pages like this one where the publisher keeps the article behind a paywall but s2 has a link to alternate sources at infekt.ch (is that a pirated copy?) and to pubmed which is not a link to the article. Are either of those appropriate? When I noticed this, I wondered, because it is early days, if the s2cid couldn't somehow encode the access-status (paywalled, free-to-read, open access, ...) of an article so that editors here don't have to bother with adding |s2cid-access=free (not yet implemented). S2 already knows when articles are open access (see the linked page in the example citation above) so adding something to the s2cid ought not be an onerous endeavor. With that info encoded, access icons would come automatically according to the value in |s2cid= (for identifiers, cs1|2 only cares about free-to-read) but other consumers of s2 via s2cid may want more/better granularity.
Trappist the monk (talk) 14:55, 21 February 2020 (UTC)[reply]
Looks great! @Trappist the monk: do you have suggestions for how to best encode open access information in the ID? Should we use a query parameter in the URL to denote when a paper is open access (e.g. https://api.semanticscholar.org/CorpusID:37220927?open-access=Y - this already works) or use a different type of delimiter? With regards to the examples that you've highlighted we will only add open-access=Y in cases when we know that the resource that we link to has been published with an open access license. This means that cases like the example you highlighted where we link out to third party websites with an unclear license like infekt.ch (a Swiss hospital website) will have a value of open-access=N (let us know if that works or if you have alternate suggestions). One additional question: Would it be possible to increase the upper limit bounds beyond 270000000 to give us room to grow (right now we are adding anywhere between 5-10 million new scientific papers and IDs per year)? Sebaskohl (talk) 17:00, 21 February 2020 (UTC)[reply]
cs1|2 templates cannot see the url unless editors or some other tool puts the url in |url= which, as I understand it, en.wiki finds to be undesirable. What I meant to say and upon rereading what I wrote, apparently failed to say, is that the access-status might be encoded into the s2cid as a suffix; perhaps: |s2cid=37220927.oa or some such. cs1|2 can then apply the free-to-read icon according to the suffix. You may have a use for more than one suffix; cs1|2 would only need whatever suffixes you choose that equate to free-to-read.
I chose the 270000000 upper limit so that typos (or stray keystrokes that add an additional digit) might be detected. This value will get bumped up as the need arises. Were it up to me, s2cid would have a check-digit because, as it stands right now, the only checking that can be done is that the identifier is a sequence of digits that evaluates to less than 270000000. This upper-level limit is a wholly inadequate way of verifying the integrity of an identifier but, for some identifiers, s2cid included, it's all we've got.
Trappist the monk (talk) 18:18, 21 February 2020 (UTC)[reply]
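The checks discussed so far (digits only, a minimum of 1, an upper limit of 270000000) could be sketched in Python roughly as follows. This is not the cs1|2 module code; the name `is_valid_s2cid` is hypothetical, rejecting leading zeros follows Sebaskohl's suggestion rather than any confirmed rule, and treating the limit as exclusive is an assumption taken from the "less than 270000000" wording.

```python
# Upper limit mentioned in the discussion; expected to be raised over time.
S2CID_UPPER_LIMIT = 270_000_000


def is_valid_s2cid(value: str) -> bool:
    """Rough sanity check for an s2cid (hypothetical helper, not cs1|2 code)."""
    # Digits only: rejects empty strings, spaces, signs, and suffixes.
    if not (value.isascii() and value.isdigit()):
        return False
    # Assumption: leading zeros are disallowed, which also rejects "0"
    # and so enforces the minimum value of 1.
    if value[0] == "0":
        return False
    # Catches a stray extra digit; limit treated as exclusive (assumption).
    return int(value) < S2CID_UPPER_LIMIT


for candidate in ("37220927", "037220927", "372209270"):
    print(candidate, is_valid_s2cid(candidate))
```

As noted above, a range check like this cannot detect a mistyped digit that still yields a number below the limit; only a check-digit scheme could do that.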
@Trappist the monk: sounds good! We'll add support for the .oa suffix on our end in cases where a link to an open access PDF is available (IDs will be in this format: |s2cid=37220927.oa). If the .oa suffix is missing then that's an indicator that no link to an open access PDF is available. I will let you know when that work is complete and thank you also for the clarification with regards to the upper limit! Sebaskohl (talk) 21:16, 21 February 2020 (UTC)[reply]
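A small Python sketch of how a consumer might separate the proposed suffix from the digits. The function name `split_s2cid` is hypothetical, and treating any suffix other than `.oa` as "not free-to-read" is an assumption; the discussion only defines `.oa`.

```python
from typing import Tuple


def split_s2cid(value: str) -> Tuple[str, bool]:
    """Split an s2cid into (identifier digits, free-to-read flag).

    Hypothetical helper; only the '.oa' suffix from the discussion is recognised.
    """
    identifier, dot, suffix = value.partition(".")
    return identifier, (dot == "." and suffix == "oa")


print(split_s2cid("37220927.oa"))
print(split_s2cid("37220927"))
```

The digits returned here could then be checked on their own, and the flag used to decide whether to show the free-to-read icon.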
With .oa suffix:
{{cite journal/new |vauthors=Kawchuk G, Prasad NG, Chamberlain RF, Klymkiv A, Peter L |title=The effect of a standardized massage application on spinal stiffness in asymptomatic subjects |journal=BMC Complementary and Alternative Medicine |volume=12 |issue=Supp 1 |page=P147 |s2cid=37220927.oa |doi=10.1186/1472-6882-12-S1-P147 |doi-access=free}}
Kawchuk G, Prasad NG, Chamberlain RF, Klymkiv A, Peter L. "The effect of a standardized massage application on spinal stiffness in asymptomatic subjects". BMC Complementary and Alternative Medicine. 12 (Supp 1): P147. doi:10.1186/1472-6882-12-S1-P147. S2CID 37220927.oa. {{cite journal}}: Check |s2cid= value (help)
I notice that adding the suffix to the generic url returns an error message:
https://api.semanticscholar.org/CorpusID:37220927.oa
{"error":"Internal server error"}
Trappist the monk (talk) 00:37, 22 February 2020 (UTC)[reply]
@Trappist the monk: We have implemented the fix for the above internal error. The link appears to be working properly now. —Jgorney (talk) 19:05, 26 February 2020 (UTC)[reply]
Good, thank you. I guess that leaves, at minimum, some way for Wikipedia editors, and automated tools like Citation bot, to discover the value of an s2 page's s2cid.
Trappist the monk (talk) 23:55, 26 February 2020 (UTC)[reply]
@Trappist the monk: Today we implemented the Wikipedia "W" as a paper-sharing option in the upper right corner for editors: https://www.semanticscholar.org/paper/Exercise-of-Human-Agency-Through-Collective-Bandura/b1f74216506e3a35e3c56d5ada91d4a7112616dc
Jgorney (talk) 20:28, 27 February 2020 (UTC)[reply]
Thank you. Those urls could be used by anyone, right? No real need to qualify them as 'wikipedia compatible'. For the purposes of this discussion, all that editors here would need / want is the string of digits that follow CorpusID: from the url; the rest of the url would be discarded. Any way to just get that?
Trappist the monk (talk) 21:27, 27 February 2020 (UTC)[reply]
@Trappist the monk: Sorry I should have been more explicit. When that button is used it copies the api.semanticscholar.org:CorpusID link to the user's clipboard. So it is working as requested.
Jgorney (talk) 20:28, 2 March 2020 (UTC)[reply]
Yeah, I understand what it does, but I don't think that what it does is what it should do. If the point here is to make it dead simple for editors to add an s2cid to a cs1|2 citation template then giving the editor the whole url when all they need is the s2cid number portion of the url isn't getting all the way to the goal because now the editor has to paste the url into something and then remove everything that isn't the s2cid.
We asked for a special form of url so that our editors don't have to deal with a 40-digit hex number which is unintelligible to normal humans. We want to be able to have editors add a parameter |s2cid=<identifier number> to a cs1|2 template. The code that renders the template will concatenate https://api.semanticscholar.org/CorpusID: and <identifier number> to get a working link into s2.
This url to a wholly unrelated website has just about what it is that I think we want: the ID is listed at the top of the image details list; double-click, copy and paste into my citation template and I'm done; no chance that I left on an extraneous character or deleted a character that was part of the s2cid.
The W button still isn't Wikipedia-exclusive so ought not be marked as Wikipedia-exclusive; if you want to keep it, use a generic chain-link icon.
Trappist the monk (talk) 00:11, 4 March 2020 (UTC)[reply]
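The concatenation described above, and the clean-up an editor is currently forced to do by hand when handed the whole url, can be sketched as follows. This is a Python illustration only; the function names are hypothetical and the real rendering code lives in the Lua module suite.

```python
# Prefix quoted in the discussion above.
S2_PREFIX = "https://api.semanticscholar.org/CorpusID:"


def s2cid_url(identifier: str) -> str:
    """Build the link by concatenating the prefix and the bare identifier number."""
    if not (identifier.isascii() and identifier.isdigit()):
        raise ValueError(f"not a bare s2cid number: {identifier!r}")
    return S2_PREFIX + identifier


def s2cid_from_url(url: str) -> str:
    """Recover the bare identifier from a copied url (the step editors do manually)."""
    if not url.startswith(S2_PREFIX):
        raise ValueError(f"not a CorpusID url: {url!r}")
    identifier = url[len(S2_PREFIX):]
    if not (identifier.isascii() and identifier.isdigit()):
        raise ValueError(f"unexpected text after CorpusID: {identifier!r}")
    return identifier


if __name__ == "__main__":
    link = s2cid_url("37220927")
    print(link)
    print(s2cid_from_url(link))
```

If the s2 page showed the number as plain text, only the first function would ever be needed on our side.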
@Trappist the monk: Makes sense and appreciate the feedback on the W button. I've queued up a change to update it to a more generic chain-link icon for the URL. To address the other issue you raised we'll create a new W button that makes it easy for editors to just copy the ID so that it can be added to the citation template. I will post an update when that's done. Sebaskohl (talk) 00:16, 7 March 2020 (UTC)[reply]
I will still argue for a plain text representation of the s2cid on the s2 page. Sure, it can have a tooltip and under-the-bonnet javascript to copy the value portion of the s2cid to the reader's clipboard. When the s2cid is visible as plain text, readers can see and get the s2cid; even those readers who, for whatever reason, don't have js enabled. Hidden behind a fancy button, those js-less readers cannot get the s2cid.
Trappist the monk (talk) 14:49, 7 March 2020 (UTC)[reply]
Just because I was curious, and because you said that s2cid begins at 1, I looked: https://api.semanticscholar.org/CorpusID:1. Why is that page so dramatically different from https://api.semanticscholar.org/CorpusID:2?
And, while on the subject of 2, the open access link there links to https://eprints.soton.ac.uk/377196/1/Lessmann_Benchmarking.pdf; clearly a preprint. Shouldn't s2 identify such documents as preprints instead of giving readers the impression that the non-publisher link links to an open access copy of the article of record?
Trappist the monk (talk) 14:11, 22 February 2020 (UTC)[reply]
@Trappist the monk: Hello, let me first introduce myself. I am Joe Gorney and I work at Semantic Scholar for Sebastian. While he is out on vacation, I will be managing the S2 workflow regarding this string. To address your query: the difference between the two pages is due to the source and amount of metadata we have received to build out the PDP for page #2. We have not received metadata of sufficient quality to flesh out the presentation of content for page #1. Additionally, the question around preprint identification is being discussed internally.
Jgorney (talk) 17:36, 24 February 2020 (UTC)[reply]
It occurred to me that it's easy to zero-pad an id to the left to 12 digits. Were we to do that then we could easily calculate a check-digit using the same algorithm as isbn-13 for which we already have a validation function in Module:Citation/CS1/Identifiers. So I hacked a sandbox to do that: Module:Sandbox/trappist the monk/check digit.
{{#invoke:Sandbox/trappist the monk/check digit|main|37220927}} → 37220927.8
Were we to do this, the need to periodically adjust an upper limit value goes away. The isbn-13 check-digit isn't perfect (there are certain digit transpositions that are undetected) but the occasionally undetected transposition is better than never detecting a transposition or typo except when the transposition or typo causes the s2cid to go out of bounds. The adjacent undetectable transposed digit pairs that I know about are 16, 27, 38, and 49.
Trappist the monk (talk) 16:29, 22 February 2020 (UTC)[reply]
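The zero-pad-and-check-digit idea can be sketched like this. It is a Python illustration under stated assumptions, not the code in Module:Sandbox/trappist the monk/check digit: the function names are hypothetical, and the `.` separator simply mirrors the sandbox output shown above. The arithmetic is the standard isbn-13 scheme (alternating weights 1 and 3 over twelve digits).

```python
def s2cid_check_digit(s2cid: str) -> int:
    """Zero-pad the s2cid on the left to 12 digits and compute an isbn-13 style check digit."""
    if not (s2cid.isascii() and s2cid.isdigit()) or len(s2cid) > 12:
        raise ValueError(f"expected at most 12 digits, got {s2cid!r}")
    padded = s2cid.zfill(12)
    # isbn-13 weighting: odd positions (1st, 3rd, ...) weigh 1, even positions weigh 3.
    total = sum(int(digit) * (1 if index % 2 == 0 else 3)
                for index, digit in enumerate(padded))
    return (10 - total % 10) % 10


def with_check_digit(s2cid: str) -> str:
    """Render the identifier followed by '.' and its check digit."""
    return f"{s2cid}.{s2cid_check_digit(s2cid)}"


if __name__ == "__main__":
    print(with_check_digit("37220927"))
    # Swapping adjacent digits that differ by 5 is not detected:
    print(s2cid_check_digit("16"), s2cid_check_digit("61"))
    # Other adjacent swaps are detected:
    print(s2cid_check_digit("12"), s2cid_check_digit("21"))
```

Because the two weights differ by 2, swapping adjacent digits changes the weighted sum by twice their difference, so a swap goes unnoticed only when the digits differ by 5, which is the family the pairs listed above belong to.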

I'm not sure that's a good idea, if the journal gets acquired and either gains or loses open access status, that would mean the identifier changes as well. That's not good. The best way would simply be to have an open access flag that can be accessed via API. Headbomb {t · c · p · b} 21:36, 21 February 2020 (UTC)[reply]

@Headbomb: definitely agree that this is likely to happen. As part of this change we are also adding an "is_open_access" boolean flag to our API to ensure that the most up-to-date open access indicator is always available so that the IDs can be updated easily by the Citation Bot and Wikipedia editors. For example, if a paper changes from "open access" to "not open access" then the .oa suffix will be removed when the Citation Bot calls our API or a Wikipedia editor is adding a link manually from one of our pages. Note that from a linking perspective links with and without .oa will always link to the same resource on our side (we are adding logic so that those links will not break regardless of what the suffix is). Would that work? Sebaskohl (talk) 21:57, 21 February 2020 (UTC)[reply]
Modules / templates do not have the ability to access an api.
Trappist the monk (talk) 00:37, 22 February 2020 (UTC)[reply]

Where do we stand with |s2cid= parameter support? Are we content to keep support for |s2cid= or, since the Semantic Scholar representatives Sebaskohl and Jgorney appear to have abandoned this discussion, delete support for this parameter from the sandboxen? —Trappist the monk (talk) 14:25, 17 March 2020 (UTC)[reply]

@Trappist the monk: Apologies for the delayed response (we've been working on releasing a COVID-19 dataset that we just made public as part of a call to action from the White House). We would definitely still like to see the |s2cid= added to the citation template and are proceeding to make changes to our page design (scheduled to go out this week) based on your recommendations. This includes surfacing the IDs directly on the page as requested and changing the "W" icon to a "chain link" icon for the button. Please let us know if anything else is needed to add the |s2cid=. Sebaskohl (talk) 17:37, 17 March 2020 (UTC)[reply]
Good, thank you.
Trappist the monk (talk) 18:11, 17 March 2020 (UTC)[reply]
@Trappist the monk: As promised we've made the change to prominently display the |s2cid= on our paper detail pages and have updated the share icon. Please let us know if anything else is needed to add the |s2cid= to the citation template! Sebaskohl (talk) 21:37, 19 March 2020 (UTC)[reply]
I just created {{S2CID}}, e.g. S2CID 37220927. It will automatically strip the .oa if it is appended to the base identifier. Headbomb {t · c · p · b} 23:26, 19 March 2020 (UTC)[reply]
As far as CS1/2 is concerned, it should probably strip the .oa, but automatically set |s2cid-access=free if it's found. Headbomb {t · c · p · b} 00:01, 20 March 2020 (UTC)[reply]
It already does strip the .oa flag. See the examples earlier in this conversation.
Trappist the monk (talk) 00:09, 20 March 2020 (UTC)[reply]
I can't think of anything else that we want for this identifier. Others may have a different opinion. I'd like to finish off some of the remaining open topics on this page and update the module suite in early April.
Trappist the monk (talk) 00:14, 20 March 2020 (UTC)[reply]
Sounds great! We'll keep an eye out for the update to the module suite and please let us know if any additional questions come up in the meantime. Sebaskohl (talk) 20:30, 24 March 2020 (UTC)[reply]

revisiting s2cid free-to-read annotation

Sebaskohl: I chanced upon this citation:

Shrivastava, Rahul; Heinen, Joel (2007). "A microsite analysis of resource use around Kaziranga National Park, India: Implications for conservation and development planning". The Journal of Environment & Development. 16 (2): 207–226. doi:10.1177/1070496507301064.

Following the doi link shows that the publisher has that article behind a paywall. But, following the title link shows that there is an apparently free-to-read copy of the article hosted at s2. Wikipedia should not be linking to copyrighted works where it is not clear that the distributor (s2 in this case) has been properly licensed by the copyright owner (WP:ELNEVER). It isn't clear to me that the s2 copy of this journal article is properly licensed. If it is, then en.wiki is allowed to link to it and the |s2cid= rendering should show the free-to-read access icon.

Right now, the only way to display that icon is with the .oa flag at the end of the s2cid. But, since this article is not open access, that flag is inappropriate. This suggests that if s2 may legitimately host some articles that the publisher has behind a paywall but are not open access, it is necessary to have some sort of other flag to indicate that the article is free-to-read (and appropriately licensed?) Or, we drop the whole notion of the .oa flag altogether and require that editors here add |s2cid-access=free when the linked article is OA or s2 is properly licensed to host the article (this latter requires that s2 make it obvious that the copy of the article that they host is properly licensed).

Trappist the monk (talk) 14:26, 25 March 2020 (UTC)[reply]

And just as an aside, there is this, purportedly Medicine and Surgery of South American Camelids. I used to have a copy of that book; the pdf linked from S2CID 68181921 is not that book.
Trappist the monk (talk) 22:01, 25 March 2020 (UTC)[reply]
@Trappist the monk: Thank you for highlighting these two links! In both cases the S2 Corpus ID did not have the .oa suffix because we only append the .oa suffix in cases we are 100% certain that the open access license is current and up-to-date (we have a regular process running to keep licensing information up-to-date). Unfortunately in case of the first article that was no longer the case and I've removed the PDF. The second case was an unfortunate instance where the PDF was low quality (I've also removed this PDF). From a linking perspective you can be 100% confident that everything with a .oa suffix has a current and up-to-date open access license. I believe in other cases without the .oa suffix we decided to show a "closed access" symbol. Will that be sufficient or do you have alternative suggestions? Sebaskohl (talk) 20:04, 30 March 2020 (UTC)[reply]
Let me try to be sure I understand what you are saying. I think that you are saying that s2 does not / will not host copies of articles that are not properly licensed. Is that right? What about links to articles hosted at other locations? For example, what about article copies that appear to be used at various universities for academic course work but are hosted at the professors' web pages? See the Alternate sources dropdown at S2CID 186830. The publisher at doi:10.1111/1467-8721.00064 has that article behind a paywall. Are those copies that s2 is linking properly licensed? Should you and, through you, we be linking to them?
In writing this response, I note that using that same s2cid in this sandbox citation, I can add .oa to the s2cid and the link works:
{{cite journal/new |last=Bandura |first=A. |date=June 2000 |title=Exercise of Human Agency Through Collective Efficacy |journal=Current Directions in Psychological Science |volume=9 |issue=3 |pages=75–78 |doi=10.1111/1467-8721.00064 |s2cid=186830.oa}}
Bandura, A. (June 2000). "Exercise of Human Agency Through Collective Efficacy". Current Directions in Psychological Science. 9 (3): 75–78. doi:10.1111/1467-8721.00064. S2CID 186830.oa. {{cite journal}}: Check |s2cid= value (help)
Should that s2cid work? An editor can mark a non-OA s2cid with the .oa suffix to give our citation the free-to-read icon. The editor might ask, "Why not? The Alternate sources dropdown shows that copies of the article are readily available and I can link to them. They must be free-to-read, right?" Am I making sense? I'm thinking that if we are to retain this .oa suffix mechanism, s2 should intercept s2cids that have the .oa suffix but for which s2 neither links to known OA hosts nor hosts an OA copy itself. When intercepted, perhaps s2 can put up a banner that says something like "We don't have an open access copy of the article you are requesting, redirecting to ..." You know what I mean, I think. That intercept should be readily identifiable by a bot, perhaps Citation bot, so that the bot can modify the s2cid in the template where it is used. Equally, for OA s2cids without the .oa suffix, s2 should immediately redirect to the OA landing page as if the suffix were present – you do this already I think. As before, a bot should be able to easily identify redirected s2cids so that it can adjust the citation template here.
I don't know how much of this is possible. Pinging AManWithNoPlan for insights on the bot perspective.
Trappist the monk (talk) 23:04, 30 March 2020 (UTC)[reply]
Citation Bot has special code that calls a special API (added at our request) that only returns licensed versions, instead of the “hey I scraped this off the web, so it must be free” versions. BUT!!! Drum roll..... does S2 have any freely available content that is also not available from the publisher? I doubt it, other than non-journal stuff. Which is why we don’t add such links at this time. Also, S2 supports DOI based URLS, so perhaps a S2-is-free=true could enable a DOI based link? But, not all S2 stuff has a DOI. Unlike CiteSeerX, which can probably legally host copyright infringing stuff because of crown immunity (Both state and Federal), S2 has no such immunity. Their immunity does not mean we should link to it though. Those are my random thoughts. AManWithNoPlan (talk) 23:19, 30 March 2020 (UTC)[reply]
Thank you for looping back! The .oa flag is only set in cases where we have clear and up-to-date open access information from either Unpaywall, the publisher or a repository like PubMed Central. In those cases we also link out to open access articles using alternate sources with an open access symbol. From a bot perspective, we added an "is publisher licensed" flag to our API to ensure that Wikipedia bots only link to publisher licensed content on S2 (see example). Our proposal when working with the Citation Bot is to only add/confirm S2 links if the content "is publisher licensed" or confirmed to be open access. Would that work? Sebaskohl (talk) 16:29, 31 March 2020 (UTC)[reply]
I think that you have neatly avoided answering my questions about articles linked from the alternate sources list when s2 have not marked them as open access. My example was the articles linked through the alternate sources dropdown at S2CID 186830 when the publisher's article at doi:10.1111/1467-8721.00064 is behind a paywall. Our editors will see the purportedly free-to-read articles in the alternate sources dropdown as free-to-read (there is the unlocked-lock icon) and will add the .oa flag to the s2cid to get the free-to-read icon to render in the en.wiki-published citation. Because there is no apparent indication that the article copies linked from the alternate sources dropdown are properly licensed, en.wiki should not link them indirectly through s2 just as en.wiki should not link them directly. The free-to-read (.oa) s2cid should only link to an s2 landing page that contains OA material or properly licensed OA links.
I would like to have this issue settled because it is delaying the next module-suite update.
Trappist the monk (talk) 13:10, 2 April 2020 (UTC)[reply]
Thank you for highlighting this also! We are following up on our end to look into options to further improve the alternate source links beyond the metadata that we get from Unpaywall, publishers and other sources (which we use to set the .oa indicator). Unfortunately that won't be an easy change/improvement to make and I'm wondering if the best path forward is to just remove the .oa indicator altogether for now until we have a better solution in place (we don't want to block your release)? Let me know what you think and sorry about the back-and-forth on this. Sebaskohl (talk) 23:16, 3 April 2020 (UTC)[reply]
I'm inclined to agree. So, .oa gone; new |s2cid-access= created; and our favorite example:
{{cite journal/new |vauthors=Kawchuk G, Prasad NG, Chamberlain RF, Klymkiv A, Peter L |title=The effect of a standardized massage application on spinal stiffness in asymptomatic subjects |journal=BMC Complementary and Alternative Medicine |volume=12 |issue=Supp 1 |page=P147 |s2cid=37220927 |s2cid-access=free |doi=10.1186/1472-6882-12-S1-P147 |doi-access=free}}
Kawchuk G, Prasad NG, Chamberlain RF, Klymkiv A, Peter L. "The effect of a standardized massage application on spinal stiffness in asymptomatic subjects". BMC Complementary and Alternative Medicine. 12 (Supp 1): P147. doi:10.1186/1472-6882-12-S1-P147. S2CID 37220927.
Farther up in this thread is a template with .oa showing that it is no longer recognized. Here it is again without the .oa:
Bandura, A. (June 2000). "Exercise of Human Agency Through Collective Efficacy". Current Directions in Psychological Science. 9 (3): 75–78. doi:10.1111/1467-8721.00064. S2CID 186830.
At this point, I don't think that we should resurrect .oa. To do so would only cause confusion – there might have been confusion had we retained it because turning on the free-to-read for |s2cid= would have been different from how it is turned on for other parameters. If this experiment created anything beneficial at your end for Citation bot, that should be retained.
Trappist the monk (talk) 23:48, 3 April 2020 (UTC)[reply]

make ref=harv the default for CS1

This would have many benefits and very few if any drawbacks. No one raised substantial objections in that previous proposal, and now this is a blocker for Wikipedia:Bots/Requests_for_approval/AntiCompositeBot. |ref=harv should be made default in CS1 just as it is in CS2. Headbomb {t · c · p · b} 00:06, 27 February 2020 (UTC)[reply]

As I understand it, the original rationale for not doing ref=harv for CS1 was that it creates invalid html when two references have the same authors and same publication year, and that CS1 is used so often in articles that don't also use the harv templates that this is likely to go undetected. Has that changed? —David Eppstein (talk) 01:03, 27 February 2020 (UTC)[reply]
Those are very much corner cases, and it is also a problem that already exists within CS2. Headbomb {t · c · p · b} 01:10, 27 February 2020 (UTC)[reply]
I probably misunderstand David Eppstein's remark, but isn't the standard practice to add a distinguishing mark to the date in same author/year refs, as in 2020a, 2020b etc. These render properly. 98.0.246.242 (talk) 02:58, 27 February 2020 (UTC)[reply]
Both render properly, but if there's a citation with the same authors and years, you have two citations emitting the same anchor, so there's a collision. However, that's already the current behaviour in CS2, and is really not a big issue, especially compared to the scores of broken anchors caused by CS1 not emitting those anchors to begin with. Headbomb {t · c · p · b} 03:11, 27 February 2020 (UTC)[reply]
It is standard practice *if you're using the harv templates* to disambiguate them so that they link correctly. The issue is that CS1 is mostly used by itself, not in conjunction with the harv templates, so there is less reason to disambiguate them and (because they look ok) editors don't realize that under the hood they are generating bad html. —David Eppstein (talk) 05:03, 27 February 2020 (UTC)[reply]
Most people who use CS2 templates don't use harv templates, and many who use harv templates don't know that they only work out of the box with CS2. Whatever collision this would cause, it would problematically affect an extreme minority of articles, which is far fewer than the problems it would actually solve. The point is CS1 should also emit anchors, and whatever problems it causes do not outweigh the benefits this brings. As I wrote back then "If they used CS1, nothing changes. If they used CS2, then either they don't use {{harv}} or already have |ref=harv enabled. Or they use a mix of CS1 and CS2 that needs to be fixed anyway." Headbomb {t · c · p · b} 05:15, 27 February 2020 (UTC)[reply]
I have to admit that despite my arguing here, I think changing to ref=harv everywhere probably solves more problems than it causes. The html errors caused by duplicate ids are mostly invisible to readers, they are present in any case with CS2, and they are easily fixable (even in the case that you're not using harv links and don't want to add letters to the years) by using a custom ref in those cases. Finding and fixing these will no doubt give the gnomes plenty to gnaw on, making them non-problematic in the long term. —David Eppstein (talk) 21:58, 27 February 2020 (UTC)[reply]
My view, FWIW, is that adding ref=harv to CS1 references will solve more problems than it creates, and that it is the best proposal I have seen in many years of trying to make these errors more visible. It would be great if these broken refs would create an error category, or if they could be tagged by a bot, and if adding ref as a default to CS1 templates allows for bot tagging, let's try it. Javascript-wielding gnomes have been our best answer so far, and I come across short ref errors all the time, so that answer is not very good. – Jonesey95 (talk) 22:51, 27 February 2020 (UTC)[reply]
I don't think that I've changed my opinion about this. But, perceiving that this is almost a fait accompli, and because I have an (old) copy of Obama's article in my user-space, I edited that article to add |ref=harv to all of the cs1 templates that it holds (there are no cs2 templates). You should look: User:Trappist the monk/Barack Obama.
The results suggest to me that there will be a plethora of false positives from User:Ucucha/HarvErrors.js. The example article has 111 harv warnings including everything in §§ References and Further reading; the example article does not have {{harv}}-family or {{sfn}} templates.
Trappist the monk (talk) 00:19, 28 February 2020 (UTC)[reply]
I've trained my eyes to ignore the brown errors (full refs without matching short refs) from Ucucha's script, since they are almost always false positives. In articles with red errors (short refs without matching full refs), the brown messages sometimes help me find the one full reference that is not being cited. Maybe we could get a version of Ucucha's script that shows only the red errors. – Jonesey95 (talk) 00:37, 28 February 2020 (UTC)[reply]
Warnings are not problems. They're simply unused anchors. Errors are problems. That's what this is aiming to fix, as well as making the templates much friendlier to use to begin with. And it also allows for the removal of cluttering/pointless |ref=harv in CS1. Headbomb {t · c · p · b} 01:02, 28 February 2020 (UTC)[reply]
If this change causes the script to change to not show these as errors, or it causes editors to stop using the script because it shows too many false positives, and as a result they stop trying to add ref=none to articles that use CS2 but do not use harv linking, then I will count that outcome as a net positive. —David Eppstein (talk) 01:25, 28 February 2020 (UTC)[reply]
  • Question How possible would it be to do something like {{use dmy dates}} but for harv (like {{use harv cites}})? I imagine that if editors could just transclude a single template that changes the output of all the cite templates, they'd be just as fine with that as they would if it was the default. –MJLTalk 16:52, 1 March 2020 (UTC)[reply]
    Don't really know if it's feasible, but it would be relatively undesirable and mostly pointless clutter. Headbomb {t · c · p · b} 17:12, 1 March 2020 (UTC)[reply]

Interesting that you should ask that question. Over the past couple of days I have been messing about in my sandbox; more about that in a moment. When we first added support for the {{use xxx dates}} templates, I speculated that we could do something similar to unify rendering of the cs1|2 templates. The example I used was the |mode= parameter but |ref= is another that could be added to such a template. The discussion is buried in Help talk:Citation Style 1/Archive 54 § auto date formatting.

In response to comments elsewhere, I've created some code in my sandbox that reads raw cs1|2 citation templates and builds a table of CITEREFs that the {{harv}} and {{sfn}} families of templates (using Module:Footnotes/sandbox) can read to determine if there is a matching target citation in the article. When the {{harv}} or {{sfn}} template finds its CITEREF in the table, no error message:

{{Harvard citation no brackets/sandbox|Red|Blue|Gold|Black and Silver|2020|p=20 |loc=at the bottom}}Red et al. 2020, p. 20, at the bottom
{{cite journal |journal=Journal |title=Title |vauthors=Red A, Blue B, Gold C, ((Black and Silver)), Yellow EF |date=2020 |ref=harv}}
Red A, Blue B, Gold C, Black and Silver, Yellow EF (2020). "Title". Journal. {{cite journal}}: Invalid |ref=harv (help)

But, if the CITEREF isn't in the table:

{{Harvard citation no brackets/sandbox|Yellow|Black |Brown|Red|2019}}Yellow et al. 2019

When there are multiple cs1|2 citations that produce the same CITEREF:

{{sfn/sandbox|Orange |2009|pp=34–45}} – here is the sfn[1] and two same-name / same-date cs1|2
  1. {{cite book |title=Title |editor-last=Orange |editor-first=A |date=2009 |ref=harv}}
    • Orange, A, ed. (2009). Title. {{cite book}}: Invalid |ref=harv (help)
  2. {{citation |title=Title |last=Orange |first=A |journal=Journal |date=2009}}
    • Orange, A (2009), "Title", Journal

And it works with the {{sfnmp}} family:

{{Sfnmp/sandbox|1a1=Green|1a2=White|1a3=Violet|1y=2005|1p=15|2a1=White|2a2=Violet|2a3=Green|2y=2004|2p=50}} – here is the sfnmp[2]

References

  1. ^ Orange 2009, pp. 34–45. sfn error: multiple targets (2×): CITEREFOrange2009 (help)
  2. ^ Green, White & Violet (2005), p. 15; White, Violet & Green (2004), p. 50.

Downsides? Inevitably. This scheme does not work for wrapped templates because those kinds of templates hide a lot of parameters (author parameters, editor parameters, contributor parameters, |ref=, |date=, |year=) under the bonnet so they aren't visible in an article's wiki source. Does not play well with ve because ve does not preview in the same way that the wiki-source editor previews (same reason the auto-date-formatting doesn't work while editing with ve). Benefits? Error messages are visible to editors who don't have User:Ucucha/HarvErrors.js; the experiment detects both errors in the {{sfnmp}} example; the script only finds one at a time; the experiment doesn't shout. Enhancements still to be done are support for error categories and help text. Another possible enhancement might add CITEREFs to the table when |ref=none so that harv templates without a target but that match the citation template where |ref=none could be annotated. Also, the {{harv}} templates support their own |ref= parameter. The content of that parameter overrides the normal CITEREF in the same way the cs1|2 templates with |ref= assigned some other text than harv, none, CITEREF... (as plain text or as created by {{sfnref}}) overrides the automatic CITEREF anchor creation. The table can hold that 'ref' text for comparison to 'ref' text in {{harv}} templates. I don't know how common this custom ref use is; I have seen it used to just hold what looks like notes which misuses the parameter but I guess I would expect negative pushback for this enhancement.

Trappist the monk (talk) 19:38, 1 March 2020 (UTC) 21:05, 1 March 2020 (UTC)[reply]
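The table-lookup scheme described above can be illustrated with a small Python sketch. Everything here is a simplification for illustration and not the code in Module:Footnotes/sandbox: the function names are hypothetical, and the anchor is assumed to be "CITEREF" followed by up to four surnames and the year, without the anchor encoding or date-suffix handling that the real templates apply.

```python
from collections import Counter


def citeref(last_names, year):
    """Simplified anchor: 'CITEREF' + up to four surnames + year (assumption for this sketch)."""
    return "CITEREF" + "".join(last_names[:4]) + str(year)


def check_short_refs(full_citations, short_refs):
    """Report short refs whose anchor has no target or more than one target.

    full_citations and short_refs are lists of (surname_list, year) pairs.
    """
    table = Counter(citeref(names, year) for names, year in full_citations)
    report = {}
    for names, year in short_refs:
        anchor = citeref(names, year)
        count = table[anchor]  # Counter yields 0 for anchors not in the table
        if count == 0:
            report[anchor] = "no target"
        elif count > 1:
            report[anchor] = f"multiple targets ({count}×)"
    return report


if __name__ == "__main__":
    citations = [
        (["Red", "Blue", "Gold", "Black and Silver", "Yellow"], 2020),
        (["Orange"], 2009),  # cite book, editor Orange
        (["Orange"], 2009),  # citation, author Orange
    ]
    shorts = [
        (["Red", "Blue", "Gold", "Black and Silver"], 2020),  # matches, no message
        (["Yellow", "Black", "Brown", "Red"], 2019),          # not in the table
        (["Orange"], 2009),                                   # two citations share the anchor
    ]
    for anchor, problem in check_short_refs(citations, shorts).items():
        print(anchor, problem)
```

The three short refs mirror the examples above: one match, one missing target, and one CITEREFOrange2009 collision.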

In the sandbox:

{{harvnb|Brown|2020}}Brown 2020
{{harvnb|Green|2020}}Green 2020
{{cite book/new |title=Has ref harv |last=Brown |date=2020 |ref=harv}}
Brown (2020). Has ref harv. {{cite book}}: Invalid |ref=harv (help)
{{cite book/new |title=Does not have ref harv |last=Green |date=2020}}
Green (2020). Does not have ref harv.

New maint cat to identify cs1|2 templates with |ref=harv. When that category is cleared, the code supporting |ref=harv should be rewritten.

Trappist the monk (talk) 14:15, 2 April 2020 (UTC)[reply]

broken harv link reporting

Please see the discussion at Module talk:Footnotes § broken harv link reporting where the above broken harv-link reporting scheme is proposed.

Trappist the monk (talk) 17:46, 16 March 2020 (UTC)[reply]

Well, |ref=harv would still need to be the default option for the number of errors to drastically go down. Headbomb {t · c · p · b} 18:29, 16 March 2020 (UTC)[reply]
@Trappist the monk: can we please have this in the April update? Headbomb {t · c · p · b} 18:05, 29 March 2020 (UTC)[reply]

PMC parameter

Resolved

Could someone edit the PMC parameter? PMC IDs are now greater than 7000000, so warnings are popping up where it is not necessary.  Bait30  Talk? 19:47, 27 February 2020 (UTC)[reply]

@Bait30: See Help talk:Citation Style 1#pmc can be larger than 6000000. Headbomb {t · c · p · b} 20:15, 27 February 2020 (UTC)[reply]
idk how I missed that. That's unfortunate because it seems like such an easy fix. Hopefully an admin can come by and fix this soon.  Bait30  Talk? 20:22, 27 February 2020 (UTC)[reply]

The fix is easy, has been done in the sandboxen but is not pressing; there are no lua script errors and at this writing, only 14 of some 4.3-ish million articles that use the module suite are showing the error. Were there lua script errors or thousands of articles showing this particular error, then certainly this would have been fixed by now. Changing one of the modules to fix a minor error dumps all 4.3-ish million articles onto the MediaWiki job queue. So, we defer updates until there are more substantive changes to be made, and then update the module suite. Category:CS1 errors: PMC (0)

Trappist the monk (talk) 20:49, 27 February 2020 (UTC)[reply]

And in the meantime, annoy editors and readers with errors that aren't errors. Both this one and the biorxiv one. Headbomb {t · c · p · b} 21:36, 27 February 2020 (UTC)[reply]
Fixing this out of order may open this page to more such requests for all kinds of minor changes. I believe Trappist's rationale is valid. 98.0.246.242 (talk) 23:14, 28 February 2020 (UTC)[reply]
Anytime where there's a visible BIG ERROR MESSAGE being emitted erroneously, the template should be updated immediately upon having a fix. This isn't just tweaking a bad comma, it's an active call to action to every editor and reader out there. Headbomb {t · c · p · b} 01:05, 29 February 2020 (UTC)[reply]
Umm, cs1|2 deliberately tones down the red error messages. We could have made big error messages but instead we chose to write normal-sized error messages. If only it were true that these error messages were an active call to action to every editor and reader out there; then the several subcategories of Category:CS1 errors holding several thousand articles would have been cleared long ago.
Trappist the monk (talk) 01:24, 29 February 2020 (UTC)[reply]
They are still visible messages being emitted erroneously. As far as calls to actions, some problems are simply easier to tackle than others. Observe the Category:CS1 errors: DOI category for example, which is kept to very close to empty at most times, despite routinely having new items in it. Headbomb {t · c · p · b} 01:32, 29 February 2020 (UTC)[reply]
The easiest way to fix the issue you believe to exist is to get consensus that we should break from our regular deployment schedule for minor issues like this one. You have yet to do so. (I have separately been entertaining writing a little blurb somewhere on this page or nearby so that we have a page to point to about why these modules do not update often, and instructions to deploy, and the pages to notify a week prior.) --Izno (talk) 02:24, 29 February 2020 (UTC)[reply]
The easiest way to fix the issue is to update the damned module with the working fix, rather than wait for months in the hope that Trappist unilaterally decides there's enough issues to push the updates live. No other template or module on Wikipedia has a delayed fix schedule that is subject to the whims of a select few who happen to know Lua and possess the admin bit. There's no issue with putting 4 million articles in an update queue. From WP:PERF:
Nothing in this page is to say that editors should not be mindful of performance, only that it should not limit project development.
Headbomb {t · c · p · b} 02:41, 29 February 2020 (UTC)[reply]
... So, you're not going to get consensus for your position. That seems somewhere in the realm of unproductive for now and the multiple iterations in the future we'll have this discussion, because this isn't the first time. As for PERF, this module has brought the wiki to its knees before, such that WMF engineers have come to say "no, please don't" after the fact.
I'll speak specifically to No other template or module on Wikipedia has a delayed fix schedule: CS1/2 alone is used sometimes hundreds of times per page and moreover is changed the most of any of the most-widely used templates. Almost every other widely used module or template is stable, simple, and more-or-less has been so for the past half-decade that Lua has been around.
As you might note, I also have declined to push this change live. In doing so, know that I'm respecting the active consensus as I've observed it (and which no-one has challenged in any meaningful sense). If you should achieve consensus that changes like this minor ID check increment can be deployed ad hoc, I will respect that consensus instead. --Izno (talk) 03:21, 29 February 2020 (UTC)[reply]
The WMF has never asked anyone to not update or even slow down the updating schedule of the templates, nor has there ever been consensus to stagger updates that would fix things by months because of WP:PERF-reasons. Headbomb {t · c · p · b} 03:32, 29 February 2020 (UTC)[reply]
Which is a strawman. At the end of the day, get consensus. --Izno (talk) 03:44, 29 February 2020 (UTC)[reply]
This practice has been unilaterally started by Trappist the monk, without any discussion, nor any significant support. Show me consensus for it and I'll shut up. Headbomb {t · c · p · b} 03:45, 29 February 2020 (UTC)[reply]
I asked a WMF SRE about it in #wikimedia-tech, and they said that there is a potential for problems and that similar templates have caused issues in the past. Their recommendation was to make any such changes on a weekday during US and Europe working hours when most sysadmins are available.
Making changes to widely-transcluded templates is obviously something we don't want to do recklessly or needlessly, but delaying changes until an undetermined later time is also not a solution. I would suggest that Trappist and others with experience here write up a deployment guide and include a weekly or bi-weekly deployment window, preferably one that doesn't conflict with other related deployments. Having someone check #wikimedia-operations to make sure we're not in the middle of an incident would be a good idea too. There are plenty of other deployments that happen every week that could have a similar or greater effect on stability. --AntiCompositeNumber (talk) 04:41, 29 February 2020 (UTC)[reply]

The Anome has updated the identifier check. – Jonesey95 (talk) 20:54, 6 March 2020 (UTC)[reply]

I've also notified Trappist the monk that I've done this, as I've only updated the live version, not the version in the staging sandbox. -- The Anome (talk) 21:16, 6 March 2020 (UTC)[reply]

Syncing to another wiki failed

A few of us have been trying to update the Cantonese Wikipedia's mirror of Module:Citation/CS1 since 2018 without success. If the most recent version (as of today) is ported, all citation templates on the Cantonese Wikipedia return the following error:

  • Lua error in 模組:Citation/CS1 at line 3799: attempt to index field 'date_names' (a nil value).

The relevant modules and templates on the Cantonese Wikipedia are located at:

Any help will be greatly appreciated. --Deryck C. 12:27, 4 March 2020 (UTC)[reply]

You were getting the lua script error from the 2020 version of 模組:Citation/CS1 because the date_names table does not exist in your c. 2016 version of yue:Module:Citation/CS1/Configuration.
cs1|2 is a suite of eight lua modules and one css page. All of these are interdependent so you can't just update one module in a suite of older modules and expect good results; you will be disappointed. Because cs1|2 is so complex, the best thing to do is to import all of the current cs1|2 module suite into sandbox modules at yue.wiki. Get the sandbox version working (most of this work is going to be getting the contents of ~/Configuration right) and then update the yue.wiki live cs1|2 from your sandbox. Yeah, I know it's a lot of work to update a very old module suite to the current version. Let me know if you have trouble.
Trappist the monk (talk) 13:08, 4 March 2020 (UTC)[reply]
@Trappist the monk:: I'll give it a go tonight. Looks like only Module:Citation/CS1/Date validation and Module:Citation/CS1/Configuration have been localised at yue.wp; the rest are direct clones of en.wp. If I get stuck I'll self-revert and then ask you again for help. Deryck C. 22:37, 14 March 2020 (UTC)[reply]
@Trappist the monk: Porting the new version seems to be successful, though I have a question about date formatting. Where in the code is it that the default date formatting is set? The templates seem to be outputting all dates in dd MMMM yyyy which does not make sense in Cantonese (due to month names not being used in Cantonese). The East Asian languages have always needed customised date formatting strings in the format of "yyyy年m月d號". Deryck C. 00:08, 15 March 2020 (UTC)[reply]
There isn't a 'default' date format. The module suite only validates that the date format written in the raw template is valid according to the date format standards that the wiki decides are acceptable. At en.wiki those formats are specified at MOS:DATE. Show me an explicit example. Where I looked I saw dates rendered as you describe them.
Trappist the monk (talk) 00:32, 15 March 2020 (UTC)[reply]
@Trappist the monk:: If the CS1 modules don't format the dates, then it must be somewhere upstream that has a problem. Sorry about that. Deryck C. 15:49, 15 March 2020 (UTC)[reply]
Still, show me where you are seeing this. Editors can use |df=dmy-all to render dates in dmy format. If there are any articles copied from en.wiki that contain the text {{use dmy dates or redirects of that (listed in ~/Configuration) then all dates on that particular page will be rendered in dmy format. Show me where you are seeing all dmy date format.
Trappist the monk (talk) 15:56, 15 March 2020 (UTC)[reply]
@Trappist the monk:: It was yue:格林尼治 where I saw some "15 三月 2020" style dates. But either my memory was patchy or there was replag; when I checked again this afternoon only one such date was left, and on further investigation it was actually stated as "15 三月 2020" in the Wikitext, so I went to correct it. Now the page has at least 3 different date formats (en-GB, yue, ISO), but at least all of them are correct. Deryck C. 22:48, 15 March 2020 (UTC)[reply]

Suggestion to add support for SBN parameter

Hi, I'd like to suggest to add support for an |sbn= parameter. Right now, editors have to convert SBNs into ISBNs and use the |isbn= parameter instead. However, ISBNs in pre-1968 citations look odd and are historically incorrect. Also, some editors might not know how to convert SBNs into ISBNs (although it is easy) and as a consequence not mention the SBN at all. Since the conversion is just adding a "0-" prefix to the SBN, adding support for an |sbn= parameter would be easy, as the template could internally do the conversion for number validation, to implement the underlying ISBN link and to feed metadata. The only difference would be that in the rendered citation, SBNs would correctly display as

SBN 356-02201-3

instead of incorrectly as

ISBN 0-356-02201-3

--Matthiaspaul (talk) 11:12, 5 March 2020 (UTC)[reply]
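
For illustration only, the conversion described above can be sketched in a few lines of Python. This is not the cs1|2 Lua code; the function names sbn_to_isbn10, is_valid_isbn10 and render_sbn are hypothetical. It assumes the usual ISBN-10 rule that the digits, weighted 10 down to 1, must sum to a multiple of 11.

```python
import re


def sbn_to_isbn10(sbn: str) -> str:
    """Prefix a 9-digit SBN with "0-" to get the equivalent ISBN-10 (hypothetical helper)."""
    return "0-" + sbn


def is_valid_isbn10(isbn: str) -> bool:
    """Return True when the ISBN-10 checksum holds.

    Hyphens and spaces are ignored. The ten characters are weighted
    10, 9, ..., 1 and the weighted sum must be divisible by 11.
    "X" stands for 10 and is only allowed in the last position.
    """
    chars = re.sub(r"[\s-]", "", isbn).upper()
    if not re.fullmatch(r"\d{9}[\dX]", chars):
        return False
    total = sum((10 - i) * (10 if c == "X" else int(c)) for i, c in enumerate(chars))
    return total % 11 == 0


def render_sbn(sbn: str) -> str:
    """Validate via the ISBN-10 form, but display the identifier as an SBN."""
    if not is_valid_isbn10(sbn_to_isbn10(sbn)):
        raise ValueError(f"invalid SBN: {sbn}")
    return f"SBN {sbn}"


print(sbn_to_isbn10("356-02201-3"))  # internal form used for validation and linking
print(render_sbn("356-02201-3"))     # what the reader would see
```

The point of the sketch is that validation, linking and metadata can all use the prefixed form while the rendered citation keeps the historically correct SBN label.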

At first blush, this seemed a simple thing to do. Alas, the first blush was more of a rash. But:
{{cite book/new |title=Title |sbn=356-02201-3}}
Title. SBN 356-02201-3.
Trappist the monk (talk) 16:45, 6 March 2020 (UTC)[reply]
Cool, thanks! :-)
However, I would suggest to internally prefix with "0-" rather than "0" only. If the SBN number already contains hyphens (356-02201-3), "0-" blends in naturally (0-356-02201-3), and if it contains just a number (356022013), the "-" still makes sense to let the embedded SBN stand out a little (0-356022013). Looks slightly better to me than the other way around; however, this is only cosmetic.
I would also suggest to deliberately route the SBN link in the prefix through a redirect (also per WP:NOTBROKEN), either the normal Standard Book Number or even better the Standard Book Number (identifier) one, so that "Standard Book Number" rather than "International Standard Book Number" shows up in the tooltip when hovering over the link. I think, the fact that this will cause the "Redirected from ..." note to be displayed at the top of the "International Standard Book Number" article is useful as well in order to avoid any confusion why the user lands on this article. Going through the (identifier) link would have the additional advantage that "normal" article links to the topic "Standard Book Number" can be easily distinguished from links for specific books with SBNs. This helps to avoid clutter in "What links here" (potentially thousands of incoming links from books intermixed with "normal" article links), enables filtering for one or the other type, and thereby improves reverse lookup of pages using these templates, effectively easing maintenance and research.
IMO, the same (identifier) scheme could/should be applied to ISBN etc. links as well - having the template links grouped under a specific redirect (which isn't used for other purpose, except for by an occasional accident, perhaps) would significantly improve the reverse lookup situation of normal links to the "International Standard Book Number" article.
--Matthiaspaul (talk) 11:31, 7 March 2020 (UTC)[reply]
Unless we rewrite how internally-linked identifiers are handled, the cosmetic hyphenation will not be done. internal_link_id() assembles precomposed link parts stored in ~/Configuration to create the final rendering. The precomposed parts are static and cannot be modified. Because we want the 9-digit version to be displayed and linked using a 10-digit version and because internal_link_id() has no support for such a combination, and because this is merely cosmetic, I'm not inclined to rewrite working code to support this unique case.
I have switched the label link to Standard Book Number (identifier) and rather like the idea of doing the same for the other identifiers. Before I do so, are there any reasons that we shouldn't do that for the other identifiers?
Trappist the monk (talk) 13:20, 7 March 2020 (UTC)[reply]
I can't think of any - in fact, I only see advantages. However, when I implemented this a couple of years ago for some of the Catalog Lookup Links to address some other issues reported by users, this was rigorously changed to direct links by Headbomb. I tried hard but never found his argumentation conclusive (he didn't like the hatnote and said it would be inconsistent with other links (even though some of the CLLs I created were consistently using this scheme right from the start)), but at some point I gave up unconvinced.
In either case, we can easily address the consistency issue if we'd switch to use this scheme for all such identifiers in the citation template framework and the CLL templates (and since Mediawiki's ISBN magic token functionality is no longer around, this could not interfere with yet another link style as well). Once rippled through the pages I think this scheme would make reverse lookup much easier to use again on the affected pages. Probably unavoidable at first, some users might wonder why not linking to the target page directly, but if we explain the reasons for this in the template documentation I think everyone will eventually see the advantages and be happy with it in the end. To play it extra-safe we could even create a new {{R from identifier}} rcat for these kinds of redirects.
--Matthiaspaul (talk) 14:56, 7 March 2020 (UTC)[reply]
Rereading your post, I see that I misunderstood. If it is ok to insert a hyphen between the leading zero and an unhyphenated sbn identifier value, we can set the static portion of the internal link to include 0-. I understood that you want 0- only when the sbn was hyphenated so 0-356-02201-3 or 0356022013; that requires a rewrite.
{{cite book/new |title=Title |sbn=356022013}}
Title. SBN 356022013.
Trappist the monk (talk) 14:28, 7 March 2020 (UTC)[reply]
Of course, the latter would be even better (but probably not worth the effort), but, yeah, I just meant to prefix with "0-" even in the unhyphenated case.
--Matthiaspaul (talk) 14:56, 7 March 2020 (UTC)[reply]

There are a zillion reasons not to create pointless (identifier) links, chief among them is that none of these links require disambiguation, but there is no difference between a 'manual' link to International Standard Book Number and one made through a citation template, much like there is no difference between a 'manual' link to electron and one made via {{Particles}}. If you're interested in finding a list of articles that link to International Standard Book Number without a citation template, then it's a simple matter of making an insource:/\[\[/International Standard Book Number/ search for those, either in the search bar, or with AWB. Headbomb {t · c · p · b} 04:26, 8 March 2020 (UTC)[reply]

That's a strawman and beside the point - I would really appreciate a more problem- and solution-oriented approach.
We are trying to find an elegant (and easily doable!) solution to solve a longstanding problem (which you simply keep on ignoring). By routing through redirects, I think we have a solution which does not even create any disadvantages elsewhere. (An even better solution would be more advanced filter capabilities built into "What links here". Unfortunately, this is not under our control and therefore not doable - but even if it would be done some years in the future, the currently proposed solution would not interfere with it, so there is no reason not to go for it.)
Why should users have to use external tools like AWB (which is a security risk and not even possible to use in many environments) or use arcane advanced search syntaxes even most technical users won't be able to reproduce just to perform simple actions like reverse lookup for normal article development? That's absurd.
The underlying problem with those "manual" links is that they pollute "What links here". This is also a long-standing problem for links from navigation templates, but it is even more prevalent with links from citation templates, adding f.e. hundreds of thousands of links to the International Standard Book Number article which would otherwise have only a few hundred incoming links. However, those remaining links are what most users are interested in in normal article development and when researching the topic. On the other hand, there are users only interested in the template links. So, it is desirable for users to have a choice, get all incoming links, only all normal links, or only all template links.
People even thought of getting rid of the template links in "What links here" by routing them through namespace prefixes. While this avoids the pollution, it creates the problem that it makes reverse lookup of those links impossible (and thereby might cause problems for some bot tasks), breaks the network of links, invalidates statistics, etc.
So, what is needed is some kind of middle ground between direct links and prefixed links, a system where all incoming links still show up in "What links here" for link network integrity, but where they can be grouped into "template links" and "non-template links". Since "What links here" presents links in random order, routing citation template links through redirects is adding some meta info without removing something, so there is no change for users who don't need to distinguish between them, but those who do now have a chance to select/filter the type of links they are interested in.
For this grouping to work, almost any redirect name would do, be it "Citation identifier link to Standard Book Number" or "Standard Book Number (identifier)". However, a parenthetical disambiguator like "(identifier)" looks nicer, can be applied universally (because identifier names never include brackets), and helps to visually separate an identifier's name from the disambiguator in tooltips. You are right that technically the disambiguator is not absolutely necessary, but there is also nothing which would disallow to use one - we even have an rcat for this: {{R from unnecessary disambiguation}}.
Basically the only potential "disadvantage" of routing through redirects is that a small "Redirected from ..." hatnote will be shown on the target page. However, with a carefully chosen disambiguator like "(identifier)", this even looks sensible and almost as if it would have been designed for this very purpose, so I see this even as an advantage rather than a disadvantage. In either case, it is only cosmetics.
There is one difference between links from navigation templates and links from citation templates. In navigation templates we normally try to use direct links so that the selected topic is shown in boldface on the target page in order to help navigation. While this is unlikely to happen in citation templates, where it happens it is even highly undesirable. Routing the identifier links through redirects has the nice side effect of ensuring that this cannot happen at all any more.
So, based on actual use cases there are many solid arguments pro linking through redirects and pro the "(identifier)" name. For consistency, we just need to make sure to switch the citation templates and CLL templates at about the same time.
--Matthiaspaul (talk) 12:21, 8 March 2020 (UTC)[reply]
The problem here is that you assume there is a problem where there is none. Wikipedia is written for readers, and the links in citation templates should be optimized to save readers the hassle. What's an ISBN? It's the International Standard Book Number. This is what pops up when you hover the link, and this is where you are taken when you click on ISBN. There's no sense in presenting readers 'International Standard Book Number (identifier)' when hovering the ISBN links, as if there is a different kind of ISBN out there, or forcing them to go through a pointless redirect before they end up at the article they expect. If you want to know what links to International Standard Book Number without templates, you can search for that in the same way everyone searches for so-called 'direct' links that aren't embedded via templates: insource:/\[\[/International Standard Book Number/. Or without proposing nonsensical schemes like "Citation identifier link to Standard Book Number" or "Navbox link to Standard Book Number" or "Not-a-citation, but-still an identifier link to Standard Book Number" or "See also section link to Standard Book Number". Headbomb {t · c · p · b} 12:31, 8 March 2020 (UTC)[reply]
Yeah, Wikipedia is for readers, and readers want to use "What links here" when they research a topic. And these direct template links make it next to impossible for them to use "What links here" efficiently. This is a real use case and problem, and if we can address it in some elegant way, it would be almost irresponsible to ignore it. Forget about the "insource" thing - I guess less than 0.1% of the contributors know that this syntax even exists and how to use it, let alone normal users.
Regarding tooltips for a link named ISBN in a citation, displaying "International Standard Book Number" or "International Standard Book Number (identifier)" is about the same from a user's perspective. Both look nice and as if designed for this purpose, and since Wikipedians are used to this naming scheme, the "(identifier)" appendage does not interfere with the name at all. I think, in this specific context, an ISBN identifier in a citation, the latter case is even more descriptive and to the point than just "International Standard Book Number". Nobody will assume this would be a different kind of identifier which actually happens to be named "International Standard Book Number (identifier)", that is, including the brackets. In contrast to this, something like "Citation identifier link to Standard Book Number" really looks ugly, that's why I gave it as a counter-example for a redirect name when we want to avoid the "(identifier)" name appendage.
I can't see anything forceful in going through redirects, it's actually good practise to link through the most specific semantically and syntactically matching redirect available. This allows more specific reverse lookup and also simplifies future maintenance, not only in this particular case, but in general (also per WP:NOTBROKEN).
--Matthiaspaul (talk) 14:02, 8 March 2020 (UTC)[reply]
What if we do an experiment? What if we change the label link for the |isbn= rendering from International Standard Book Number to International Standard Book Number (identifier)? If I understand what Editor Matthiaspaul is saying, then we should see a dramatic reduction in the number of listed articles at Special:WhatLinksHere/International Standard Book Number. Do I have this right?
If the experiment is successful, and there is still objection to the redirect's 'identifier' dab appearing when readers hover the mouse pointer over the identifier label then we might do something like this:
[[International Standard Book Number (identifier)|<span title="International Standard Book Number">ISBN</span>]] → ISBN
Trappist the monk (talk) 14:34, 8 March 2020 (UTC)[reply]
Sounds good to me. I support this. --Matthiaspaul (talk) 16:15, 8 March 2020 (UTC)[reply]

There will be exactly the same amount of links to that special page because links to redirects are also listed there. And I also object to having pointless redirects in the first place. The links to ISBN from citations are no less important than links from manual citations or from a mention in prose or in a see also section, or from a navbox. Especially when done unilaterally without a dedicated RFC asking if people want citations to link to stuff like PubMed Identifier (identifier). Headbomb {t · c · p · b} 20:37, 8 March 2020 (UTC)[reply]

Trappist already suggested a way how to suppress the display of "(identifier)" if this would be really necessary (I don't think it is, but it is good to know that we can if we need to). Also, we could use "PubMed ID (identifier)" or "PubMed (identifier)" or even "PMID (identifier)", or choose another scheme like "PubMed Identifier (citation link)", "PubMed Identifier (template link)" or "Citation link: PubMed Identifier". The point is to go through a redirect rather than linking directly. This will improve functionality/usability significantly, whereas the name of the redirect is cosmetics only. Technically, it could even be different for different identifiers (of course, it should not for consistency reasons, so that people can memorize this convention as a general scheme). In my opinion, "(identifier)" is nicely short, unobtrusive and not too specific - and at the same time it is self-explanatory enough for a generic name. And "PubMed Identifier" is one of very few, where a word would be repeated, something that doesn't harm because by using the parenthetical disambiguation scheme, it is immediately obvious to users that "PubMed Identifier" is the name of the identifier, and "(identifier)" is for classification inside WP. But again, there are options to avoid that as well.
--Matthiaspaul (talk) 10:59, 9 March 2020 (UTC)[reply]
Probably will be the same number of articles listed but after a bit of experimentation, I think that how the articles will be listed will be different. Right now, without the redirect, Special:WhatLinksHere/International Standard Book Number begins:
With the International Standard Book Number (identifier) redirect I suspect that those same articles will be listed something like this:
The Hide redirects filter will remove International Standard Book Number (identifier) and all of the articles that use the redirect from the Special:WhatLinksHere display. To prove this, I created User:Trappist the monk/sandbox redirect which redirects to my sandbox. I then created User:Trappist the monk/test which links to my sandbox through the sandbox redirect. You can see this at Special:WhatLinksHere/User:Trappist_the_monk/sandbox. If you click the Hide redirects filter link, the sandbox redirect and the test page are removed from the display.
Trappist the monk (talk) 22:51, 8 March 2020 (UTC)[reply]
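
A toy model of the experiment described above, in Python. It is only a sketch: the page names "Article A", "Article B" and "Article C" and the function what_links_here are made up for illustration, and the real Special:WhatLinksHere is of course backed by the MediaWiki link tables, not by dictionaries.

```python
# Outgoing links per page (hypothetical pages).
links = {
    "Article A": ["International Standard Book Number"],               # direct link in prose
    "Article B": ["International Standard Book Number (identifier)"],  # link from a citation template
    "Article C": ["International Standard Book Number (identifier)"],  # link from a citation template
}

# Redirect title -> target title.
redirects = {
    "International Standard Book Number (identifier)": "International Standard Book Number",
}


def what_links_here(target, links, redirects, hide_redirects=False):
    """List pages linking to target; optionally hide redirects and the pages that link through them."""
    result = sorted(page for page, outs in links.items() if target in outs)
    if hide_redirects:
        return result
    for redirect, redirect_target in sorted(redirects.items()):
        if redirect_target == target:
            # The redirect itself is listed, followed by the pages that link to it.
            result.append(redirect)
            result.extend(sorted(page for page, outs in links.items() if redirect in outs))
    return result


isbn = "International Standard Book Number"
print(what_links_here(isbn, links, redirects))                       # everything, grouped
print(what_links_here(isbn, links, redirects, hide_redirects=True))  # only direct links remain
```

In this model the total number of incoming links does not change; only the grouping does, which is what makes the Hide redirects filter useful.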
Yes, the total number of incoming links will not change, they will just be grouped differently. So far, there was no order, and users (or bots) for whom all incoming links were equally important (like apparently Headbomb above) had to recursively traverse through the whole list to ensure they caught all links. They can continue to treat the list as unsorted as before, and will not miss a single link. However, for most users, some classes of links are much more important than others. The walls of links from citation templates are mostly seen as clutter in normal article development and topic research, however, there are also users who are particularly interested in just these links. Lumping them all together, users had no choice and had to somehow live with the clutter or give up on using "What links here" efficiently. Going through the redirect, users can now suppress the bulk of links they aren't interested in simply by using the "Hide" function. And since it's optional, nothing gets lost for those who don't want or need this. Best of both worlds.
--Matthiaspaul (talk) 10:59, 9 March 2020 (UTC)[reply]
This discussion has been mentioned at Wikipedia talk:Manual of Style/Linking § Names of identifiers in citation templates.
Trappist the monk (talk) 18:35, 11 March 2020 (UTC)[reply]

choosing identifier redirects

Examples

Presuming that we pursue this, what are the identifier-label links? In the list above are the current identifier-label links followed by:

  1. the same links with the (identifier) dab
  2. is the identifier label as rendered by cs1|2 with the (identifier) dab and a <span>...</span> tag that holds the en.wiki article name

Feel free to add other possible redirect constructs

Trappist the monk (talk) 15:28, 10 March 2020 (UTC)[reply]

I added three more, not necessarily because I prefer them, but simply for completeness.
I like both of your suggested schemes, and can find pro arguments for both of them. For example:
Linking through the expanded name of an identifier (plus "(identifier)") ensures that the tooltip will already provide some helpful information for users who don't know what the symbolic name stands for. Otherwise, people will see the expanded form only by clicking the link (or enabling scripts).
Linking through the symbolic name (plus "(identifier)") has the advantage that the names are a complete no-brainer from our perspective. Also, the symbolic names will never contain the word "identifier" themselves (addressing Headbomb's argument above). Also, some identifiers might not have a proper expanded name (or the expanded name might depend on the target language), so just using the symbolic name in the redirect will ensure a higher degree of consistency in the name scheme. Finally, possible future changes in the name or spelling of a redirect's target page can be handled inside the redirect and do not require the citation template's code to be updated.
So, the first scheme appears to be slightly more helpful, your second scheme might be slightly easier to maintain and to keep consistent in the long term future.
I'm happy with both of them.
--Matthiaspaul (talk) 22:21, 10 March 2020 (UTC)[reply]
links to ISBN from […] citations are no less important than links from manual citations or from a mention in prose or in a see also section, or from a navbox
Ah, but heaven forbid that the biography of someone who spent his/her entire life in the United States should ever link to United States. In fact, we have a certain cadre of users who will revert you 'til doomsday for doing that. I don't agree with their logic at all, but I would argue that a link to ISBN in {{cite book}} is of even lower practical value than links to first-world geography topics so often targeted for mass-demotion to plain text.
"What links here" presents links in random order
The order looks highly irregular but is far from random. Pages are ordered by wgArticleId, which is 12 for Anarchism and 34112310 for this talk page. This numbering is (very roughly) alphabetical for our oldest pages (the order in which they were imported from some previous system), and chronological for pages created after that time.
But there are still no other sorting options. True chronological and true alphabetical would be a good start. Maybe even by page size. The most difficult but most useful one would be "relevance to context" as estimated either by some kind of algorithm (something akin to whatever determines Special:Search result order).
But certainly, improved filtering options would help, e.g. the following, which should definitely be separate items:
Exclude links originating from the "body" of a template (e.g. [[International Standard Book Number|ISBN]] in {{cite book}}, or any link in a navbox).
Exclude links originating from the parameter of a template (e.g. |nationality=[[United States|American]], or |publication-place=[[Cambridge, Massachusetts|Cambridge]])
Piping a link to a suffixed redirect and using a tooltip hack to obscure the true target really seems like the worst possible solution. And by that I mean worse than doing nothing. ―cobaltcigs 17:23, 11 March 2020 (UTC)[reply]
Thanks for the explanation about how Special:WhatLinksHere pages are ordered.
Chastising us for the lack of filtering and sorting options available at Special:WhatLinksHere seems to me to be counter productive. It would be better to raise those issues with the developers at MediaWiki or wherever they are since we can do nothing to fix what you perceive to be broken.
Umm, links through redirects, whether piped or not, always obscure the true target:
[[Standard Book Number]] → Standard Book Number; has a tooltip: 'Standard Book Number'; redirects to International Standard Book Number#SBN
[[Standard Book Number|SBN]] → SBN; has a tooltip: 'Standard Book Number'; redirects to International Standard Book Number#SBN
[[Standard Book Number (identifier)|SBN]] → SBN; has a tooltip: 'Standard Book Number (identifier)'; redirects to International Standard Book Number#SBN
Using a [piped] link to a suffixed redirect and using a tooltip hack:
[[Standard Book Number (identifier)|<span title="Standard Book Number">SBN</span>]] → SBN; has a tooltip: 'Standard Book Number'; redirects to International Standard Book Number#SBN
The 'hack', if implemented, will also obscure the true target so, to me, the point you are trying to make is somewhat 'obscured'. If it is critical to reveal the true target, the 'hack' can do that:
[[Standard Book Number (identifier)|<span title="International Standard Book Number#SBN">SBN</span>]] → SBN; has a tooltip: 'International Standard Book Number#SBN'; redirects to International Standard Book Number#SBN
Trappist the monk (talk) 18:32, 11 March 2020 (UTC)[reply]
Those tooltips are very inaccurate.
[[Standard Book Number]] → Tooltip: International Standard Book Number
[[Standard Book Number|SBN]] → Tooltip: International Standard Book Number
[[Standard Book Number (identifier)|SBN]] → Tooltip: International Standard Book Number
If tooltips are implemented (which is a separate issue from having pointless redirects), they should give the full description of the identifier, e.g. LCCN, JFM, Zbl, and so on. Headbomb {t · c · p · b} 18:51, 11 March 2020 (UTC)[reply]
What I mean is a link to a redirect should, in all cases, show the default tooltip of that redirect's title (being the target of that link) rather than pretending not to be a redirect. ―cobaltcigs 00:05, 12 March 2020 (UTC)[reply]
Some comments:
  1. I think I generally support linking through a redirect for these identifiers. In addition to the reasons to do so listed above, it would make searching for pages using each identifier easy to identify by using Special:Search and linksto:"DOI (identifier)" (which is a fast search) as opposed to hastemplate:Module:Citation/CS1 insource:doi insource:/\| *(doi|DOI) *=/ (which is a slower search). I don't find the identified negatives to be sufficiently negative, or negative at all (perhaps not positive).
  2. However, I do not support adding these spans with titles. This would significantly increase the HTML associated with each listed identifier (I am thinking of the performance in this case due to a comment elsewhere about our citation/template-heavy articles) for the minor/negligible gain in utility. Users will figure it out eventually.
--Izno (talk) 21:16, 11 March 2020 (UTC)[reply]

Is there a decision here? Do we attempt to use redirects for identifier label links or do we maintain the status quo? If we choose to use redirects, what form do they take?

Trappist the monk (talk) 18:36, 17 March 2020 (UTC)[reply]

Trying to distill the arguments brought forward so far, it appears that going through a redirect named after an identifier's symbol (in official capitalization, plus "(identifier)") has a slight (long-term) edge over naming them after the expanded form (that is "ISBN (identifier)", "doi (identifier)", "PMID (identifier)", "arXiv (identifier)" etc.).
I would have suggested to use your span trick to show the expanded form (if available) as tooltip, but other editors seem not to be too fond of that method - perhaps we'll leave that unless requested later on.
--Matthiaspaul (talk) 19:11, 17 March 2020 (UTC)[reply]

I have tweaked Module:Citation/CS1/Identifiers/sandbox and Module:Citation/CS1/Configuration/sandbox to create identifier label links in this order:

id_handlers['<ID>'].redirect when use_identifier_redirects is true
id_handlers['<ID>'].q from wikidata when the local wiki has mw.wikibase installed and wikidata has an article name for the local language
id_handlers['<ID>'].link a locally provided article name

I have set the various id_handlers['<ID>'].redirect to be '<ID> (identifier)' where '<ID>' is the same as id_handlers['<ID>'].label; for |arxiv= the label is 'arXiv' and the redirect is 'arXiv (identifier)'.

To be done is to create the redirects. There being no objections, I shall do so.

Trappist the monk (talk) 14:52, 20 March 2020 (UTC)[reply]

As I write this, List of members of the 19th Bundestag is listed at Category:Pages with script errors for this error. That error occurs because the live version of the module always attempts to get identifier article titles for the identifier label wikilink from wikidata. The call to mw.wikibase.getEntity() for the Q value specified in the ~/Configuration id handler for the various ids is expensive. Because the order of evaluation described above is redirects, wikidata, locally defined, choosing to use redirects for identifier label wikilinks can avoid that expense. I have tweaked ~/Identifiers/sandbox so that when a redirect is defined and enabled, the module does not make the call to mw.wikibase.getEntity().

Trappist the monk (talk) 16:35, 21 March 2020 (UTC)[reply]
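The evaluation order described above (redirect, then wikidata, then the locally provided article name) can be sketched roughly as follows. This is only an illustration in Python, not the Lua in the sandbox modules: the function name identifier_label_target, the wikidata_lookup callable, and the dictionary layout are all assumptions made for the example. The point it shows is that when a redirect is defined and enabled, the expensive lookup is never made.

```python
from typing import Callable, Optional


def identifier_label_target(
    handler: dict,
    use_identifier_redirects: bool,
    wikidata_lookup: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Pick the page title that an identifier label should link to.

    `handler` stands in for one id_handlers['<ID>'] entry; the keys
    'redirect', 'q' and 'link' mirror the fields named in the discussion.
    `wikidata_lookup` is a hypothetical stand-in for the expensive call.
    """
    # 1. A defined and enabled redirect wins; no lookup is attempted.
    if use_identifier_redirects and handler.get("redirect"):
        return handler["redirect"]

    # 2. Otherwise ask wikidata, if it is available and a Q value is set.
    if wikidata_lookup is not None and handler.get("q"):
        title = wikidata_lookup(handler["q"])
        if title:
            return title

    # 3. Fall back to the locally provided article name.
    return handler.get("link")
```

With a handler such as {"label": "arXiv", "redirect": "arXiv (identifier)", ...} and redirects enabled, the first branch returns 'arXiv (identifier)' without touching the lookup at all.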

There should be a namespace check on these errors. Or at least a documentation check on the second one. Headbomb {t · c · p · b} 20:21, 6 March 2020 (UTC)[reply]

In general, category errors should probably only be done in Article/Draft/Template spaces. Headbomb {t · c · p · b} 20:27, 6 March 2020 (UTC)[reply]
There is a namespace check on all errors. cs1|2 does not categorize:
User, Talk, User talk, Wikipedia talk, File talk, Template talk, Help talk, Category talk, Portal talk, Book talk, Draft talk, Education Program talk, Module talk, and MediaWiki talk
The Category:Articles with missing Cite arXiv inputs category is added because Help talk:Citation Style 1/Archive 8 has {{cite compare}} with |old=no which no longer works as it once did (the comparison did not include {{citation/core}}). After the change, |old=<anything> causes {{cite compare}} to include the 'old' {{citation/core}} in its rendering.
For Category:Pages with DOIs inactive as of 2019 January, {{cite journal}} transcludes Template:cite journal/doc. Both are in the template namespace so the error categories are expected and desired. To turn-off categorization in the ~/doc page use |no-cat=yes or one of its aliases.
Trappist the monk (talk) 21:13, 6 March 2020 (UTC)[reply]
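For readers following along, the namespace check and the |no-cat=yes override described above amount to something like the sketch below. It is an assumption-laden illustration in Python, not the module's code: the names UNCATEGORIZED_NAMESPACES and should_categorize are invented here, and the namespace is passed in as a plain string.

```python
# Namespaces listed above as ones where cs1|2 does not categorize.
UNCATEGORIZED_NAMESPACES = frozenset({
    "User", "Talk", "User talk", "Wikipedia talk", "File talk",
    "Template talk", "Help talk", "Category talk", "Portal talk",
    "Book talk", "Draft talk", "Education Program talk", "Module talk",
    "MediaWiki talk",
})


def should_categorize(namespace: str, no_cat: bool = False) -> bool:
    """Return True when an error category should be added to the page.

    `namespace` is the page's namespace name ("" for article space);
    `no_cat` models |no-cat=yes on an individual citation.
    """
    if no_cat:
        return False
    return namespace not in UNCATEGORIZED_NAMESPACES
```

Under this model a page in the Template namespace is categorized unless |no-cat=yes is set, and a Wikipedia-namespace page is categorized as well, because only the talk page of that namespace is on the list.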
I dealt with the doc page. However, the arxiv categories still need to be taken care of. This also includes Category:Articles with a publisher parameter in their Cite arxiv templates and Category:Articles with a journal parameter in their Cite arxiv templates. Headbomb {t · c · p · b} 22:03, 6 March 2020 (UTC)[reply]
Category:Pages with DOIs inactive as of 2019 September also contains Wikipedia-namespace pages, which shouldn't be in there. Headbomb {t · c · p · b} 22:24, 6 March 2020 (UTC)[reply]
Category:Articles with a publisher parameter in their Cite arxiv templates and Category:Articles with a journal parameter in their Cite arxiv templates arise from {{cite arXiv/old}} called by {{cite compare}} in Help talk:Citation Style 1/Archive 8. Neither category has been supported by cs1|2 since {{cite arxiv}} was converted to lua.
The roster of uncategorized namespaces was determined at Help talk:Citation Style 1/Archive 3 § Display errors on Talk pages but exclude from Categories?
Trappist the monk (talk) 23:02, 6 March 2020 (UTC)[reply]
The "roster" was proposed when the draft namespace didn't exist. Drafts should be covered. If the issue is caused by cite arxiv/old or cite compare, those should be updated to not emit those categories when in those non-supported namespaces. Headbomb {t · c · p · b} 23:30, 6 March 2020 (UTC)[reply]

The arxiv problem seems fixed now, somehow. Thanks to whoever helped. Still the issue of Wikipedia-namespace errors tracking though. Headbomb {t · c · p · b} 23:28, 15 March 2020 (UTC)[reply]

publisher

It's a bit irritating that you can't manually italicize a newspaper under the publisher parameter. I don't know who thought that a good idea but it's a tad annoying. I am aware that newspaper= works but it's not a good idea showing errors when you try to italicize under the publisher parameter. Sort it out, thanks.♦ Dr. Blofeld 10:21, 7 March 2020 (UTC)[reply]

The name of the newspaper goes in the |newspaper= or |work= parameter. The template does the italicization for you. – Jonesey95 (talk) 15:11, 7 March 2020 (UTC)[reply]
sort it out lol if you only knew.. -- GreenC 16:19, 7 March 2020 (UTC)[reply]
The whole concept of using one of the cite xxx templates is to describe what the text string is, such as publisher or work, and let the template figure out where it should go and how it should look. The problem is when some publications don't follow the traditions that developed during the 19th and 20th century, and decide that they don't need to distinguish between the title of a website and the name of the corporation that publishes it. Or, when a work (not necessarily online) doesn't provide a title at all, and the person citing it has no choice but to write a description that serves in place of a title; CS1 has no way to indicate this. Jc3s5h (talk) 17:32, 7 March 2020 (UTC)[reply]
That is all lovely, and could be addressed once again in a separate thread, but Dr. Blofeld specifically addressed putting the name of a newspaper into |publisher=, which is what my answer was attempting to respond to. – Jonesey95 (talk) 19:42, 7 March 2020 (UTC)[reply]
And my answer is Dr. Blofeld's request should be rejected. Jc3s5h (talk) 20:34, 7 March 2020 (UTC)[reply]

publication-place, place, or location and their proper use (cont.)

Following up on this discussion after seeing this Citation bot posting and this Help:Citation Style 1 edit.

As a result of the original discussion, we created Category:CS1 location test which, at this writing, contains 822 articles. The code that adds articles to that category does not discriminate between |publication-place= and |place= or |location= having same or different values. When values are the same, cs1|2 renders only one.

I have written an awb script to troll that category and remove redundant parameters. After I run this script, we should have some idea about how multiple |publication-place=, |place=, |location= params are being used.

With regard to Editor Jc3s5h's HELP:CS1 edit, the Prefer "publication-place" over the ambiguous "location" recommendation, if that is what it is, is contrary to how these parameters are used. This is documented in the original discussion. That edit should, I think, be reverted.

Trappist the monk (talk) 18:22, 8 March 2020 (UTC)[reply]

I believe a parameter name that will still be correct, even if another editor comes along and adds a place from a dateline, is preferable to a name that may need to be revised if information from a dateline is added. Jc3s5h (talk) 19:14, 8 March 2020 (UTC)[reply]
I, too, have a hard time understanding what consensus or reality of the templates the new suggestion is based on. Nemo 19:16, 8 March 2020 (UTC)[reply]
Example using location = San Francisco
Example using place = San Francisco
Example using publication-place = New York and place = San Francisco
Example using location = New York and place = San Francisco
So it looks like the template just doesn't properly support a story with a dateline in a paper where its place of publication is included in the publication's title. Jc3s5h (talk) 19:53, 8 March 2020 (UTC)[reply]
I might be stating the obvious, but usually the documentation should refer to the current status of templates rather than an aspirational situation. Nemo 20:07, 8 March 2020 (UTC)[reply]
Finished running the awb script against Category:CS1 location test which removed approximately 400 articles from the category. What remains must be evaluated manually.
Trappist the monk (talk) 14:25, 10 March 2020 (UTC)[reply]

Where something was written is completely irrelevant to citations. The parameter should be removed. What citation guides recommend mentioning is the location of the publisher, because historically it was important to know where a publisher was so you could order a book from it. It was also useful to distinguish between different journals and magazines and newspapers of different cities that happened to have the same title. Where someone happened to sit down to write words is irrelevant. Headbomb {t · c · p · b} 20:45, 8 March 2020 (UTC)[reply]

@Headbomb: I was having the same thought, just don't cite the location from the dateline. But then it crossed my mind that online newspapers, The New York Times in particular, tend to change the text of a headline during the course of a day, while the dateline is unchanged (except maybe for time of day) and there are negligible changes in the text of the article. Especially in the case of an article that does not name the authors, the place from the dateline can help a person verifying a citation that they have found the article they were looking for. Jc3s5h (talk) 20:52, 8 March 2020 (UTC)[reply]
All the examples of this somewhat rare case that I have seen either automatically redirect to the proper (edited) article, or land in a placeholder page with a link to the new url. This is not actionable as far as CS1 is concerned, imo. The change to the documentation should be reverted. CS1 is already too complicated. 108.182.15.109 (talk) 13:55, 10 March 2020 (UTC)[reply]

Still goes against all citation guides. Because there are no two distinct articles published in the same newspaper, on the same day, with the same title, but which were somehow written in different locations. Headbomb {t · c · p · b} 21:11, 8 March 2020 (UTC)[reply]

Also |location= should be made the canonical parameter. It's by far the preferred parameter of editors, and is much, much shorter and easier to type. Headbomb {t · c · p · b} 04:36, 9 March 2020 (UTC)[reply]
I don't think the information should be removed. I consider it to the part of the properties belonging to the publication, and while it is not absolutely necessary it may help to put a source into perspective and to further research the background of a publication.
The problem, if there actually is one, is the ambiguity of the |location= and |place= parameters. Perhaps we can find better names, and slowly deprecate the old ones?
Specifically for the "written at" location, what about parameter names such as |write-location=, |author-location=, |authoring-place=, |written-at-place=, |written-at=, |foreword-location=, |dateline-location=, |lockout-location=?
The first few suggestions put the focus on the generic act of writing/authoring, regardless of where that location will be specified in the publication (in the foreword, in the dateline at the top of an article, or in the lockout at the bottom of an article). The last three suggestions focus on where the location shows up in the publication, and thereby may help editors to choose the correct parameter, however, given that, they would have to be supported in parallel although they are, I think, mutually exclusive, so the template would have to make sure that only one of them is actually used in a citation. (Or are there any types of publications which can have more than one out of a location in the foreword, dateline, or lockout?)
Further, are there, perhaps, synonyms for the term dateline linguistically emphasizing more on the location rather than the date? (In German, it is called a "Spitzmarke", if that helps to find a better English term for it.)
Regarding |publication-place=, this is already quite descriptive (but long), but it's not completely without ambiguity as well. What about |publisher-place= instead?
--Matthiaspaul (talk) 09:29, 10 March 2020 (UTC)[reply]
It is not the purpose of a citation to put a source into perspective and to further research the background of a publication. If it is necessary to do that in an article, create an end-note or footnote that has whatever extra information is required. Creating more, rather esoteric, parameters does not seem to me to be an answer to the basic question which is: do we keep the |publication-place= and |location= or |place= functionality (written at...)?
Trappist the monk (talk) 14:25, 10 March 2020 (UTC)[reply]
Where stories are 'written' is to give context to story for the reader. If you see "Syracuse, NY A man lost his life Sunday night following a break in..." this tells you the story is about a man in Syracuse, NY. It has nothing to do with any sort of bibliographic information relevant to citations. Headbomb {t · c · p · b} 18:58, 10 March 2020 (UTC)[reply]
More to the topic at hand, |publication-place= and |place= or |location= should all be aliases of |location=. The only place where this really causes confusion is where conferences are held, which is normally put into the full title of the proceedings, so there might be room for |conference-location=. But |location= is the location of the publisher, and every style guide is quite clear on this. We should not invent conventions out of nowhere because some editors don't know how to follow style guides. Headbomb {t · c · p · b} 19:02, 10 March 2020 (UTC)[reply]

Template causes a page to transclude itself?

If you put something like

  • {{cite journal |last=Foobar |first=Smith |title=Title}}

to produce

  • Foobar, Smith. "Title". {{cite journal}}: Cite journal requires |journal= (help)

on a page, the page transcludes itself on its own page. Just preview this section, go down to "Templates used in this preview" and see that "Help talk:Citation Style 1" is listed in the transclusions.

This is weird and shouldn't happen. Headbomb {t · c · p · b} 23:31, 9 March 2020 (UTC)[reply]

cs1|2 reads the article looking for {{use xxx dates}} templates so that it can auto-format dates. To do that cs1|2 uses the title object's getContent() method which records the page as a transclusion.
Trappist the monk (talk) 00:22, 10 March 2020 (UTC)[reply]
Any way of making use of the {{linkless exists}} hack? I'm guessing a no, since it actually needs the content of the page. Headbomb {t · c · p · b} 00:33, 10 March 2020 (UTC)[reply]
No. Neither that template nor the underlying PROTECTIONEXPIRY magic word return article content.
Trappist the monk (talk) 00:42, 10 March 2020 (UTC)[reply]
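To illustrate why the page content itself is needed for the {{use xxx dates}} lookup described above, here is a rough Python sketch of such a scan. It is not the module's Lua; the pattern and the function name detect_date_format are assumptions for the example, and it only looks for the dmy and mdy variants.

```python
import re
from typing import Optional

# Matches the opening of {{use dmy dates ...}} or {{use mdy dates ...}},
# ignoring case and extra whitespace. Illustrative pattern only.
USE_DATES_RE = re.compile(r"\{\{\s*use\s+(dmy|mdy)\s+dates\b", re.IGNORECASE)


def detect_date_format(wikitext: str) -> Optional[str]:
    """Return 'dmy' or 'mdy' if the page declares a date style, else None."""
    match = USE_DATES_RE.search(wikitext)
    return match.group(1).lower() if match else None
```

Anything that answers this question has to read the wikitext, which is why a check that only reports whether a page exists cannot replace it.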

SSRN limit now exceeded

  • Arbel, Yonathan; Toler, Andrew (2020). "All-Caps". SSRN 3519630.

The limit should be increased to at least 4000000. Headbomb {t · c · p · b} 18:55, 10 March 2020 (UTC)[reply]

It seems pointless to update it once every few months, breaking new citations for weeks while everyone ignores the error. Why have an upper limit at all? -- Tim Starling (talk) 10:03, 24 March 2020 (UTC)[reply]
The purpose for the limit is to catch simple typographic errors. Yeah, it's a weak test but since there isn't a check-digit or some other formatting cue, limiting the value has been our only way of discovering erroneous values. If you have a better solution, please tell us what that solution is. Simply raising the limit to a huge number merely hides invalid identifier numbers that have been used in a citation but have not yet been issued. I will revert your edit at Module:Citation/CS1/Configuration/sandbox.
Trappist the monk (talk) 11:53, 24 March 2020 (UTC)[reply]
I only raised the sandbox limit to 10M, sufficient to catch an error in the number of digits, I didn't raise it to a "huge number". Your change reduces it to 4M, which is only sufficient to catch typos in the first digit if the typo happens to change the digit to a number greater than 3, and so small that it will probably start throwing errors again within a year. The problem we have is that every citation of an SSRN published later than about February is showing a "check SSRN" error. The problem has been known since early March. The main template hasn't been synced since January, and if I test the current sandbox, it fails with a Lua error, so it can't be raised at all. -- Tim Starling (talk) 22:34, 24 March 2020 (UTC)[reply]
While the module should have been updated eons ago, the SSRN corpus does not increase at a rate of 500,000 submissions per year. And it doesn't only catch errors on the first digit, if you have extra digits, it will catch that too. Headbomb {t · c · p · b} 22:47, 24 March 2020 (UTC)[reply]
ID 3500012 was 29 December, and ID 3545710 was 25 March, which is a rate of about 525 papers per day, 192,000 per year. So we can expect it to exceed 4M in August 2022, if it continues to grow at the same rate. A little later than my guesstimate but I still think the cure is worse than the disease. Presumably most "check SSRN" errors to date have been caused by inappropriate limits, not by typos. The trouble is that nobody is updating these limits. It's understandable: I probably have spent about an hour on this already, and I probably still have an hour of work left if I want to do the update. If we change the limit to 10M, then we will next have to deal with this problem in about 33 years, which sounds good to me since that's about when I will next have time for this. -- Tim Starling (talk) 03:54, 25 March 2020 (UTC)[reply]
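The arithmetic in this thread can be reproduced with a few lines of Python. This is only a sketch: it assumes linear growth between the two observations quoted above, and the function names (daily_rate, date_limit_reached, ssrn_in_range) and the lower bound of 1 are invented for the example rather than taken from the module.

```python
from datetime import date, timedelta


def daily_rate(id_a: int, day_a: date, id_b: int, day_b: date) -> float:
    """Average number of new identifiers per day between two observations."""
    return (id_b - id_a) / (day_b - day_a).days


def date_limit_reached(current_id: int, current_day: date,
                       limit: int, rate: float) -> date:
    """Estimate when `limit` is reached, assuming the rate stays constant."""
    days_left = (limit - current_id) / rate
    return current_day + timedelta(days=int(days_left))


def ssrn_in_range(ssrn: str, limit: int) -> bool:
    """Weak validity test: all ASCII digits and not above the limit."""
    return ssrn.isascii() and ssrn.isdigit() and 1 <= int(ssrn) <= limit


# Observations quoted in the discussion.
rate = daily_rate(3500012, date(2019, 12, 29), 3545710, date(2020, 3, 25))
reached_4m = date_limit_reached(3545710, date(2020, 3, 25), 4_000_000, rate)
years_to_10m = (10_000_000 - 3545710) / (rate * 365)
```

With those inputs the rate comes out at about 525 per day, the 4M limit is projected for August 2022, and 10M is a little over 33 years away. The range test also shows the trade-off being argued: an extra digit fails under either limit, while a first-digit typo such as 9519630 fails under 4M but passes under 10M.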

Proposal to enhance edition= parameter to support special numerical symbols

Hi all, I would like to refresh a proposal I made quite a long while ago at Help talk:Citation Style 1/Archive 11#Suggestion for edition= parameter to treat raw numbers:

The |edition= parameter should be enhanced to support a number of special numerical values ("1".."99") which are not conflictive with the parameter's normal use. If one of these tokens is found, the code would replace this by "1st", "2nd", "3rd".."99th" before passing on the value.

This would help to further decouple semantics (which edition?) from presentation (f.e. "3rd ed."). It would not only make it easier to add common edition information, but also improve readability, maintainability and translatability, and it would allow to centrally change the rendering in the future, would this become necessary ("3rd ed.", "third ed.", "third edition" etc.), depending on the output device (f.e., display the abbreviated form "3rd ed." on the small display of a mobile device, but "third edition" on a desktop or printout), or target language (e.g. "third edition", "dritte Ausgabe", etc.).

This would be very similar to the |language= parameter which meanwhile accepts free-flow text like "English", "German", etc. but also a number of special symbols like "en", "de" etc.

In order to avoid conflicts, the recognized special tokens should be restricted to just the numbers "1".."99", but this would already cover the common cases.

--Matthiaspaul (talk) 19:44, 13 March 2020 (UTC)[reply]
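As a rough sketch of what the proposal would mean for English only, the replacement of the tokens "1".."99" could look like the Python below. The function name edition_display is hypothetical, the rule set is the ordinary English one and says nothing about other languages, and any value outside the token range is passed through untouched so that free text keeps working.

```python
def edition_display(value: str) -> str:
    """Turn the bare tokens "1".."99" into English ordinals.

    Anything else (free text, "3rd", "100", "01") is returned unchanged.
    """
    token = value.strip()
    if not (token.isascii() and token.isdigit()):
        return value
    if token.startswith("0"):
        return value
    number = int(token)
    if not 1 <= number <= 99:
        return value

    # 11, 12 and 13 take "th" even though they end in 1, 2 and 3.
    if 11 <= number % 100 <= 13:
        suffix = "th"
    else:
        suffix = {1: "st", 2: "nd", 3: "rd"}.get(number % 10, "th")
    return f"{number}{suffix}"
```

So |edition=3 would be passed on as "3rd", while |edition=Revised or |edition=3rd would be left exactly as entered.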

I don't think that this is going to happen unless and until MediaWiki CLDR and Scribunto has support for ordinal rendering. For English, it is relatively easy to convert cardinals to ordinals. But, the rules that apply to English do not necessarily apply to other languages where the cs1|2 module suite is used. A hint of that complexity may be found on this Unicode CLDR page.
Trappist the monk (talk) 11:11, 21 March 2020 (UTC)[reply]
Thanks for the pointer. But what I "envision" is much simpler than implementing CLDR. It aims only at covering the majority of cases, and for the remainder the |edition= can still be used for free text, so there would be no drawback. Also, my suggestion to cover the range "1".."99" was arbitrary. I think, it could be reduced to "1".."25" or even "1".."15" and still be equally useful. After all, few publications actually go through more editions. In the default implementation, this could be implemented as a simple list of 15 (or 25) language-specific replacement strings, which would need to be adjusted to the local replacements when the citation templates are used in a foreign-language Wikipedia. In locales where the scheme would not be useful at all, the strings could just be left empty (or a dummy routine be implemented), so that no replacement would occur. (At a later stage, more complicated cases could always be covered in a locale-specific routine implementing the local rule-set, but this proposal is not about such a more complicated solution. It is about giving editors a chance to start providing the information symbolically at least in the most common cases as early as possible.)
With the citation templates used for millions of citations, being able to encode edition information symbolically in at least the simple cases (and thereby allowing easier adjustment of the output format and translation), would still be a considerable improvement, even if the more complicated cases would not be covered.
--Matthiaspaul (talk) 10:47, 22 March 2020 (UTC)[reply]

Add support for title-url

This would be very useful to indicate which url is meant when you have title/chapter/contribution/etc... Headbomb {t · c · p · b} 23:54, 15 March 2020 (UTC)[reply]

So |url= would be an alias for |title-url=? Kanguole 00:26, 16 March 2020 (UTC)[reply]
Yes. Or the other way around. However aliases work, as long as they are synonymous. Headbomb {t · c · p · b} 00:34, 16 March 2020 (UTC)[reply]
Or like |author-first/last= are aliases for |first/last= for symmetry with |editor-first/last=? --Matthiaspaul (talk) 09:35, 16 March 2020 (UTC)[reply]
This question was meant for clarification. As I prefer specific and non-ambiguous parameter names (even if they are longer), I would support the addition of this alias.
--Matthiaspaul (talk) 04:14, 20 March 2020 (UTC)[reply]

biorxiv appearance

In the past, the bioRxiv DOI was concatenated from doi:10.1101/123456 to bioRxiv 123456 with the understanding that 123456 was the identifier (and indeed was used in many URLs). However, the recent update to the biorxiv DOIs would mean this changes from doi:10.1101/2020.02.07.937862 to bioRxiv 2020.02.07.937862 which really isn't clear, further obfuscates what is actually used by people, and loses the 'identity' of the biorxiv string as an identifier/pseudoidentifier. So I propose we show what should be clear to everyone

bioRxiv:10.1101/123456
bioRxiv:10.1101/2020.02.07.937862

A simple AWB run (or CitationCleanerBot (talk · contribs) run) should be more than sufficient to make the updates from |biorxiv=123456 to |biorxiv=10.1101/123456 in a timely manner. Headbomb {t · c · p · b} 18:40, 17 March 2020 (UTC)[reply]
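The kind of replacement such a run would make can be sketched with a regular expression. This Python is only an illustration of the proposed |biorxiv=123456 to |biorxiv=10.1101/123456 change; the pattern and the name prefix_biorxiv are assumptions, and it deliberately skips values that already carry the 10.1101/ prefix.

```python
import re

# "|biorxiv=" followed by a bare identifier that does not already
# start with the 10.1101/ prefix. Illustrative pattern only.
BIORXIV_PARAM_RE = re.compile(
    r"(\|\s*biorxiv\s*=\s*)(?!10\.1101/)(\d[\d.]*)"
)


def prefix_biorxiv(wikitext: str) -> str:
    """Add the 10.1101/ prefix to bare |biorxiv= values."""
    return BIORXIV_PARAM_RE.sub(r"\g<1>10.1101/\g<2>", wikitext)
```

Running it twice over the same text changes nothing the second time, which matters for a bot that may revisit pages.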

Auto-hyphenation of ISBNs in citations

Cobaltcigs has implemented an auto-hyphenation function for ISBNs based on the official ruleset for hyphenation (see {{Format ISBN}}). I think, this functionality should be incorporated into the citation template framework, so that the displayed ISBNs in citations are always properly formatted no matter if they are formatted with or without hyphens in the |isbn= parameter (and if the hyphens are inserted in the correct locations there). This can be adapted to SBNs and the new |sbn= parameter as well.

Optionally, there could be a maintenance message in edit preview showing the correct hyphenation if a given ISBN is using no or a different hyphenation in the source code, so that editors could adjust the parameter input accordingly (for cosmetic reasons only, hence this should not be an error message, and only be visible in preview).

--Matthiaspaul (talk) 20:14, 17 March 2020 (UTC)[reply]

It might be better to fix the hyphenation statically (e.g. with bots), rather than add to the processing done each time a citation is rendered. Kanguole 11:27, 19 March 2020 (UTC)[reply]
I highly doubt you'll find appetite in the community for a million edits that do little but add hyphens to ISBNs. This is something best handled by templates. Headbomb {t · c · p · b} 15:12, 19 March 2020 (UTC)[reply]
So add it as a minor task to CitationBot or something. The above template does a sequential search of a table of 1265 strings. A binary search would be faster, but it would still be doing a significant amount of processing for every ISBN in every article display, and always producing the same hyphenations, i.e. pointless server load. Kanguole 15:47, 19 March 2020 (UTC)[reply]
See WP:PERF. Headbomb {t · c · p · b} 16:56, 19 March 2020 (UTC)[reply]
I think you both have good points. The current search implementation definitely could be improved algorithmically. Possibly, this could be interwoven with the checksum validation in order to reduce the number of string scans, but this would need further investigation. I guess, eventually we'd want properly formatted ISBNs also in the article source, so either a human or a bot would have to edit the article anyway.
Perhaps, for a start, the properly formatted ISBNs should be displayed only in preview to give editors a chance to copy&paste them back into a citation's source code. And in normal view, the template would pass on whatever it finds given in the source code for performance reasons.
Or we could introduce a new parameter |auto-hyphenation=no which would be set by bots once they have edited an article. If set to no, this would bypass the hyphenation code, assuming that the bot stored the properly hyphenated string in the citation's source code. Alternatively, properly hyphenated ISBNs could be framed using the ((isbn)) syntax. (If I remember correctly there also was some "invalid-isbn"-kind parameter for "valid" ISBNs with checksum errors; perhaps this could be combined in order to not introduce yet another parameter - however, right now I can't seem to find this parameter in the documentation. The parameter is |ignore-isbn-error=.) Or we could introduce a new |entry-isbn= parameter for not (yet) properly hyphenated ISBNs, invoking the template's auto-hyphenation. Bots would change this to |isbn= while rewriting the properly hyphenated string.
--Matthiaspaul (talk) 18:19, 19 March 2020 (UTC)[reply]
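On the sequential versus binary search point raised above, a range lookup by bisection can be sketched as below. This Python is not how {{Format ISBN}} works; it is an illustration with a deliberately tiny, simplified table that covers only the 978-0 prefix. The table values and the names RANGES_978_0 and hyphenate_978_0 are assumptions for the example, and a real implementation would need the complete official ruleset.

```python
from bisect import bisect_right

# Simplified, illustrative ranges for the 978-0 prefix only.
# Each entry: (first 7-digit value of the range, registrant length).
# The official ruleset is finer-grained and covers every prefix.
RANGES_978_0 = [
    (0, 2),        # 00 to 19
    (2000000, 3),  # 200 to 699
    (7000000, 4),  # 7000 to 8499
    (8500000, 5),  # 85000 to 89999
    (9000000, 6),  # 900000 to 949999
    (9500000, 7),  # 9500000 to 9999999
]
RANGE_STARTS = [start for start, _ in RANGES_978_0]


def hyphenate_978_0(isbn13: str) -> str:
    """Hyphenate a 13-digit ISBN with prefix 978-0; return others unchanged."""
    digits = isbn13.replace("-", "")
    if len(digits) != 13 or not (digits.isascii() and digits.isdigit()):
        return isbn13
    if not digits.startswith("9780"):
        return isbn13

    body = digits[4:12]  # registrant + publication element (8 digits)
    key = int(body[:7])
    # Binary search: index of the last range whose start is <= key.
    index = bisect_right(RANGE_STARTS, key) - 1
    registrant_length = RANGES_978_0[index][1]

    return "-".join([
        "978", "0",
        body[:registrant_length],
        body[registrant_length:],
        digits[12],  # check digit, not validated here
    ])
```

For example, hyphenate_978_0("9780306406157") gives "978-0-306-40615-7". The lookup cost grows with the logarithm of the table size rather than linearly, though it does not address the objection that the same result is recomputed on every page render.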

So I suppose I could make the linking default to no and also take out the error messages (as {{cite book}} does these things internally). This way the (or a) bot would only need to change | isbn = FOO to | isbn = {{subst:format ISBN|FOO}}, rather than maintaining its own list of rules. With those changes made, the result for an ISBN already formatted correctly (or one that has no correct formatting because it's numerically invalid) would just be a null edit. ―cobaltcigs 15:04, 22 March 2020 (UTC)[reply]

But oh, shit: I just remembered subst'ing will fail inside the <ref>...etc...</ref> tag. This means the bot would also have to change any ref tags containing a subst: to use {{subst:#tag:ref|...etc...}} themselves. Something about pre-save transform order of operations. ―cobaltcigs 15:15, 22 March 2020 (UTC)[reply]

Discussion at Village Pump

Wikipedia:Village_pump_(technical)#=url_and_=archiveurl_do_not_match -- GreenC 14:18, 18 March 2020 (UTC)[reply]

unhide missing periodical error message

We hid the missing periodical error message as a result of this discussion. Any reason to keep it hidden?

Trappist the monk (talk) 14:46, 18 March 2020 (UTC)[reply]

That discussion is too long to comprehend. Please explain why you think it is now appropriate to make this change. Please pay particular attention to whether or not {{cite web}} is to be treated as a near-synonym of {{cite journal}} and whether it is invalid to use "cite web" for a URL that is not part of a larger publication. Jc3s5h (talk)
Likewise for anything that isn't a cite journal/magazine, e.g. cite document, which redirects to cite journal, but which shouldn't require a |journal=. And cite web which shouldn't require |website=. Headbomb {t · c · p · b} 15:34, 18 March 2020 (UTC)[reply]
It looks like there are currently about 52,000 pages in Category:CS1 errors: missing periodical. I haven't worked on fixing too many of these errors. Most of the errors I have come across have either needed |journal= or a different template entirely. I have added an explanation of that latter situation to the help text on the category page. In general, I support unhiding of the error message for {{cite journal}} and {{cite magazine}}. Per the outrage in the discussion, {{cite web}} should remain unaffected. I would like to see feedback from editors who have been working on resolving these errors. – Jonesey95 (talk) 15:40, 18 March 2020 (UTC)[reply]
Neither {{cite web}} nor {{cite news}} have contributed to Category:CS1 errors: missing periodical since this edit.
That {{cite document}} renders as a journal is not the fault of the cs1|2 module suite. The module suite knows only that it has been called by {{cite journal}} so it renders what it gets as a journal citation.
I attempt to fix these errors as I encounter them, but with 52k articles, one editor doing it manually might get through that category in the current life. I think that I ran Monkbot/task 14 over that category on the off-chance that it might fix some of these errors (where the journal or magazine name is in |publisher=). I might attempt that again.
Trappist the monk (talk) 16:15, 18 March 2020 (UTC)[reply]
52,000 is not that bad. There are many editors who will fix errors one or two at a time in the articles that they care about, as long as the error messages are visible. If we continue to hide the error messages, though, the count will grow, and only the few editors who frequent this page will work on them. – Jonesey95 (talk) 16:27, 18 March 2020 (UTC)[reply]
"That {{cite document}} renders as a journal is not the fault of the cs1|2 module suite." It is, by virtue of invoking {{cite journal}} rather than its own thing. Headbomb {t · c · p · b} 19:09, 18 March 2020 (UTC)[reply]
Hard to imagine how the module suite could be blamed. Editors decided long ago that {{cite document}} should redirect to {{cite journal}}. {{cite document}} was created as a redirect and has remained as a redirect ever since. On the redirect's date of creation, {{cite journal}} did not use Module:Citation/CS1 – that would not happen for another three years (23 March 2013).
I found one previous discussion that contemplates creation of a {{cite paper}} template; discussion fizzled and died:
Trappist the monk (talk) 20:10, 18 March 2020 (UTC)[reply]
It's not about putting blame on someone or something, but based on its name {{cite document}} simply has nothing to do with periodicals, regardless of how it is coded internally.
I would support enabling the message for {{cite journal}} and {{cite magazine}} specifically, because for them the specification of a journal or magazine is natural - and citations lacking this information can be considered incomplete. However, templates like {{cite web}}, {{cite document}}, or other templates using {{cite journal}} internally should not be affected, as most users (including myself) do not consider them to be periodicals.
--Matthiaspaul (talk) 20:17, 19 March 2020 (UTC)[reply]
{{cite web}} does not use {{cite journal}}. Both are independent cs1 templates. {{cite document}} and {{cite paper}} are redirects to {{cite journal}}, so Module:Citation/CS1 sees them as journal cites because it cannot know how {{cite journal}} was called.
Trappist the monk (talk) 12:58, 20 March 2020 (UTC)[reply]
I have run Monkbot/task 14 over Category:CS1 errors: missing periodical. The bot made 9521 edits and reduced the category size by maybe 1000 articles. I think it is time to show the missing periodical error messages.
Trappist the monk (talk) 12:58, 20 March 2020 (UTC)[reply]
Thanks for that bot run, and I agree that the messages should be displayed. I fixed a dozen or so, and they weren't difficult. There are a lot of cite journal citations with DOI values and titles but no journal name; those should be easy to fix. Once we get rid of the low-hanging fruit, we may find after all that there is a need for a slightly different {{cite document}} or something like it for citing standalone non-book publications, perhaps just a wrapper that calls {{citation}} with |mode=cs1 (I just made that up without any testing, so it is probably not right). That should be a separate discussion from this one. – Jonesey95 (talk) 15:08, 20 March 2020 (UTC)[reply]
This should not be enabled until there is a way to ensure that only a very specific whitelist of templates emit the error. Namely only cite journal and cite magazine (and journal/magazine-related redirects to those templates), and not cite document and other unrelated redirects. Headbomb {t · c · p · b} 15:14, 20 March 2020 (UTC)[reply]
{{cite document}} is a redirect to {{cite journal}} and has been since its creation ten years ago. I don't know of such a thing in Wikipedia as an "unrelated redirect"; do you have a link explaining what that is? Anything that happens to {{cite journal}} will happen when it is called via {{cite document}}. If you want {{cite document}} to be a different template with different rules, please start a new discussion about that new topic, as I suggested above. – Jonesey95 (talk) 17:01, 20 March 2020 (UTC)[reply]
A document is not a journal. Therefore the two are unrelated. This should be obvious. I can create a redirect from {{cite database}} to {{cite journal}}, but that won't magically make a database a journal, nor a citation to a database a citation to a journal. Headbomb {t · c · p · b} 17:09, 20 March 2020 (UTC)[reply]
Please create a new discussion thread and explain to us how {{cite document}} should work as a standalone CS1 template. Please provide real examples from real articles that support your proposal to create a new CS1 template. I think that I will agree with most of what you have to say, because as I look at the articles in this error category, I see some citations that do not appear to fit any of our existing templates except for the catch-all {{citation}}. Talking about this issue in this existing thread, which is about specific error messages, will make it less likely that you (and I, and other editors who will share our concern after the error messages appear in public) get what we want. – Jonesey95 (talk) 17:17, 20 March 2020 (UTC)[reply]
"Explain to us how {{cite document}} should work as a standalone CS1 template." Simple, exactly like it currently does, not emitting an error if |journal= is missing. There's no need for a separate thread for this. Headbomb {t · c · p · b} 17:20, 20 March 2020 (UTC)[reply]
My preference would be for it to work exactly like {{citation}} does (agnostic as to which kind of citation it is citing, book, journal, web, whatever) except cs1 not cs2 by default. —David Eppstein (talk) 18:50, 20 March 2020 (UTC)[reply]
That would also be an option. Headbomb {t · c · p · b} 20:18, 22 March 2020 (UTC)[reply]

Cite ebook

redirects here, but I can't see where the info is. I guess the important thing is what replaces page numbers, and whether it works with harv referencing. ——SN54129 14:23, 19 March 2020 (UTC)[reply]

{{cite ebook}} was created as vandalism in 2010; now just a redirect to {{cite book}}.
Trappist the monk (talk) 14:48, 19 March 2020 (UTC)[reply]
As far as differences in citation style go, the general guidance is the same as it always is: include as much information as you can. Some ebooks are clearly and reliably paginated, while others are infinite scrolling nonsense. If there are page numbers in your edition, cite the page numbers. If there aren't any or they're not a reliable indicator, use |chapter= or |at= in citations and |loc= in harv footnotes to get as close as you can. --AntiCompositeNumber (talk) 14:52, 19 March 2020 (UTC)[reply]
Even in print books, pagination can vary between editions, or even printings. Citing a chapter as well as a location or page number in an ebook, if you are willing to provide both, will help WP readers verify claims in articles. – Jonesey95 (talk) 15:18, 19 March 2020 (UTC)[reply]

Add support for series-editor parameters

Some publications distinguish between authors, editors, and series editors. In order not to have to lump together the two types of editors, I would welcome it if we had a |series-editor*= range of parameters in addition to the existing |editor*= range of parameter variants. They would be treated almost identically to normal editors, but listed after the authors and editors. Where normal editors are indicated by "(ed.)"/"(eds.)", series editors would be indicated by "(series ed.)"/"(series eds.)". Since AFAIK there is at present no separate class for series editors in metadata, they should be classified as editors there. --Matthiaspaul (talk) 04:28, 20 March 2020 (UTC)[reply]

|others= is also available in the meantime. If |series-editor*= existed, would it require |series= to exist? – Jonesey95 (talk) 04:38, 20 March 2020 (UTC)[reply]
Good point! In the examples that come to my mind right now, the name of the series was given as well, so it could (and should) be specified as well. (Not the other way around, I am aware of examples where a series name was given, but no series editors.)
Does someone know of an example where series editors are specified without also mentioning the name of the series? Depending on if such examples exist, a missing |series= parameter should throw a warning in edit preview or an error in the article, IMO.
--Matthiaspaul (talk) 06:20, 20 March 2020 (UTC)[reply]
What style guide even recommends that series editors are mentioned in a citation? I'd be against adding such pointless information. Headbomb {t · c · p · b} 08:19, 20 March 2020 (UTC)[reply]
Citations are meant to provide the minimum information needed to locate the original, not to give credit to everyone involved in its production. The citation templates are complicated enough already. Peter coxhead (talk) 09:23, 20 March 2020 (UTC)[reply]

Possible additional parameter check for list of authors/editors etc.

I don't know if this is a frequent error, but over the years I occasionally ran into citations doubling some of the authors or editors in longer lists. This was probably down to copy&paste errors during citation composition.

This condition could be detected if the template would check the list of (recombined first+last) author names (and likewise the list of editors) for duplicates and display a warning in edit preview.

I'm not sure if such a test would be too expensive to be performed outside edit preview as well, but if not, we would probably need some method to override the test for the (rare) case of multiple people of the same name contributing to a publication.

--Matthiaspaul (talk) 04:44, 20 March 2020 (UTC)[reply]
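
The duplicate check proposed above could be sketched roughly as follows. This is only an illustration in Python, not code from Module:Citation/CS1; the function name `find_duplicate_names` and the list-of-pairs input are hypothetical stand-ins for the |lastN= / |firstN= parameter pairs.

```python
from collections import Counter


def find_duplicate_names(names):
    """Return the normalized names that occur more than once.

    `names` is a list of (last, first) tuples, a hypothetical stand-in
    for the |lastN= / |firstN= pairs of one citation.
    """
    def normalize(last, first):
        # Recombine first + last, then fold case and collapse whitespace
        # so that trivially different spellings compare equal.
        return " ".join(f"{first} {last}".lower().split())

    counts = Counter(normalize(last, first) for last, first in names)
    # Counter keeps first-seen order, so duplicates come out in list order.
    return [name for name, n in counts.items() if n > 1]


authors = [("Hart", "Diana"), ("Tillman", "Barrett"), ("Hart", "Diana")]
duplicates = find_duplicate_names(authors)
```

A single pass like this is linear in the number of names, so cost is unlikely to be the deciding factor. The override for genuinely identical names would still need its own parameter.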

cite encyclopedia without |title=, part 2

This has already been discussed.

However: there is e.g. {{Zagrebački leksikon}}, an encyclopedia citation template. It is meant to be used both to cite individual articles, and to cite the entire encyclopedia in the Bibliography section and have shortened footnotes for the articles, like in Timeline of Zagreb. Both of these uses are quite legitimate. Since |title= param is now mandatory, I don't see a way of making the suggested "solution" work without applying some rather ugly hacks to the {{Zagrebački leksikon}} template. These hacks would also be unnecessary because in reality |title= is no more "mandatory" in {{cite encyclopedia}} than |page= is mandatory in {{cite book}}.

My suggestion would therefore be to make the |title= optional. GregorB (talk) 14:05, 22 March 2020 (UTC)[reply]

The real solution would be to provide support for |title=none in more than just {{cite journal}} Headbomb {t · c · p · b} 14:57, 22 March 2020 (UTC)[reply]
"applying some rather ugly hacks to the {{Zagrebački leksikon}} template": change this:
|encyclopedia=Zagrebački leksikon
|title={{{title|}}}
to this:
|encyclopedia={{#if:{{{title|}}}|Zagrebački leksikon}}
|title={{#if:{{{title|}}}|{{{title|}}}|Zagrebački leksikon}}
Not really so ugly and actually quite common practice in wrapper templates.
Alternately, you can do this:
|title=Zagrebački leksikon
|entry={{{title|}}}
Trappist the monk (talk) 15:29, 22 March 2020 (UTC)[reply]
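
For readers less used to nested {{#if:}} calls, the first variant above behaves roughly like the Python sketch below. It is only an illustration of the parameter swap; `wrapper_params` is a hypothetical name, and the returned dictionary stands in for what the wrapper hands to {{cite encyclopedia}}.

```python
ENCYCLOPEDIA = "Zagrebački leksikon"


def wrapper_params(title=""):
    """Mimic the two #if expressions of the wrapper (illustrative only)."""
    # #if treats an empty or whitespace-only value as false.
    title = title.strip()
    if title:
        # Article cited: the entry goes in |title=, the work in |encyclopedia=.
        return {"encyclopedia": ENCYCLOPEDIA, "title": title}
    # Whole work cited: the work name moves into |title= so it is never empty.
    return {"encyclopedia": "", "title": ENCYCLOPEDIA}
```

So `wrapper_params("Gornji grad")` keeps both fields, while `wrapper_params()` moves the work name into the title slot. The entry name "Gornji grad" is just an invented example.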
The |title=none solution is not bad, as a way of specifying that the title was intentionally omitted.
The above is precisely the "ugly hack" I was referring to. Its ugliness is not in its complexity, it's in the semantic gymnastics with the params, and the fact it's quite unnecessary, which is the thrust of my argument here: it makes no sense to work around something being mandatory when that something should not be mandatory in the first place. GregorB (talk) 23:20, 22 March 2020 (UTC)[reply]

Position of date

Consider:

Why does the position of the date in the last example change according to whether or not |author-mask1 is present? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 14:08, 22 March 2020 (UTC)[reply]

This has been discussed before. I can't find the discussion. I think there was more-or-less an agreement to use a similar date position regardless of whether the lowest-level work has an author or not. (The lowest-level work could be an article, dictionary entry, television episode, chapter, book, etc. depending on the nature of what is being cited.)
However the templates were never edited to reflect the results of the discussion. Jc3s5h (talk) 14:47, 22 March 2020 (UTC)[reply]
Date position is toward the head of the citation when there is a displayed name (author, editor). In your last example, you set |author-mask1=0 so there is no name to display. As far as the rendering is concerned, there is no author. At some time in the distant past, editors developing the original cs1|2 templates determined that publication dates were at the front when an author/editor name is displayed and towards the end when no names are displayed – presumably so that date isn't the first element displayed in a rendered citation.
The discussion that Editor Jc3s5h mentioned is Help talk:Citation Style 1/Archive 3 § RFC: Consistent date location.
Trappist the monk (talk) 15:05, 22 March 2020 (UTC)[reply]
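
The rule described above can be put as a small sketch. This is an assumption-laden simplification in Python, not the module's actual rendering code: `element_order` is a hypothetical name, and only four elements are modelled.

```python
def element_order(displayed_names):
    """Return a simplified order of rendered citation elements.

    `displayed_names` is the list of author/editor names that will
    actually be shown; it is empty when no names are given or when
    they are suppressed (hypothetical input shape).
    """
    if displayed_names:
        # A name leads the citation, so the date follows it directly.
        return ["names", "date", "title", "work"]
    # No name to display: the date moves back so it is not the first element.
    return ["title", "work", "date"]
```

This matches the renderings seen elsewhere on this page, such as a named journal cite with the date right after the authors and an unnamed news cite with the date after the newspaper.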
I believe the original impetus for the placement of dates was consistency with parenthetical referencing, which uses Author-Date or Author-Title. So that when following a short reference, readers would immediately recognize the full reference as its expansion. In contrast, in short-refing uncredited material in larger works, the norm has been to include the enclosing source in the place of the missing author, as in JournalTitle-Date (generally, Source-Date). However in the relevant full citation the leading element is Title, which does not appear in the short. So it was thought a good practice that somewhere in the full citation Source and Date appear close together. 98.0.246.242 (talk) 02:27, 24 March 2020 (UTC)[reply]
The issue described by 98.0.246.242 could be resolved by following the advice of printed style guides: when there is no author, use a shortened title in the footnote. Although a machine will not recognize the long and short titles as referring to the same article/work/etc., a human will. For example,
<code>
Since most of us are homebound for the foreseeable future and unable to follow our routines, perhaps it is a good time to make a little time in your evenings to step out and enjoy the night sky and the sights it has to offer.{{sfn|"Sky 2020 March 24 - 31" | 2020 }}

References

*{{Cite web| title = The Sky This Week, 2020 March 24 - 31| accessdate = 2020-03-25| url = https://www.usno.navy.mil/USNO/tours-events/sky-this-week/the-sky-this-week-2020-march-24-31 | work = Naval Oceanography Portal | date = March 2020 | ref = {{harvid | "Sky 2020 March 24 - 31"| 2020}}  | publisher = US Naval Observatory }}
</code>
Jc3s5h (talk) 12:31, 25 March 2020 (UTC)[reply]

Problem in cite journal

I had trouble adding article-url and article-url-access to a cite in the Cuban Missile Crisis article as follows:

Tillman, Barrett; Nichols, John B., III (1986). "Fighting Unwinnable Wars". Proceedings. Supplement (April). United States Naval Institute: 78–86. {{cite journal}}: |article-url= ignored (help)CS1 maint: multiple names: authors list (link)

I left it with url=https://www.usni.org/magazines/proceedings/1986/april-supplement/fighting-unwinnable-wars instead. Wtmitchell (talk) (earlier Boracay Bill) 08:39, 25 March 2020 (UTC)[reply]

|article-url= is an alias of |chapter-url=. As such it is for use with cs1|2 templates that accept |chapter=. What you want, I think, is:
{{cite journal |last1=Tillman |first1=Barrett |last2=Nichols |first2=John B. III |date=April 1986 |url=https://www.usni.org/magazines/proceedings/1986/april-supplement/fighting-unwinnable-wars |url-access=subscription |title=Fighting Unwinnable Wars |journal=Proceedings |issue=Supplement |pages=78–86}}
Tillman, Barrett; Nichols, John B. III (April 1986). "Fighting Unwinnable Wars". Proceedings. 112 (4: Supplement): 78–86.
Trappist the monk (talk) 11:47, 25 March 2020 (UTC)[reply]

How to create a footnote with both date and year as a ref=harv citation

How can I create a footnote with both date and year as a ref=harv citation? I am mainly trying to do this for a bunch of newspaper articles from the same month and don't want to keep using 1921a, 1921b, 1921c, and so on. I want the footnote to be NEWSPAPER 19 Apr 1921, NEWSPAPER 9 Apr 1921 and so on in Harvard citation format. I've asked this on another desk before but forgot where it is located. KAVEBEAR (talk) 21:49, 28 March 2020 (UTC)[reply]

I would have thought this would work, but it doesn't generate an ID. Surprisingly it also doesn't display an error for the date (at least the COINS isn't corrupted).
  • {{cite news |title=Title |date=15 January 2001b |ref=harv}}
  • "Title". 15 January 2001b. {{cite news}}: Invalid |ref=harv (help)
  • <cite class="citation news">"Title". 15 January 2001b.</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Title&rft.date=2001-01-15&rfr_id=info%3Asid%2Fen.wikipedia.org%3ASpecial%3AExpandTemplates" class="Z3988"></span><templatestyles src="Module:Citation/CS1/styles.css"></templatestyles>
--Izno (talk) 22:00, 28 March 2020 (UTC)[reply]
Anchor ID is not created because there are no contributor, no author, and no editor names. cs1|2 requires names; just a date is pretty meaningless. You can do:
{{harvnb|15 January 2001b}}15 January 2001b harvnb error: multiple targets (2×): CITEREF15_January_2001b (help)
{{cite news |title=Title |date=15 January 2001b |ref={{sfnref|15 January 2001b}}}}
"Title". 15 January 2001b.
<cite id="CITEREF15_January_2001b" class="citation news cs1">"Title". 15 January 2001b.</cite><span title="ctx_ver=Z39.88-2004&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=Title&rft.date=2001-01-15&rfr_id=info%3Asid%2Fen.wikipedia.org%3AHelp+talk%3ACitation+Style+1" class="Z3988"></span>
Trappist the monk (talk) 22:19, 28 March 2020 (UTC)[reply]
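
To show how the id in the HTML above relates to the {{sfnref}} value, here is a rough Python sketch. It is a simplification and not the templates' actual code; `citeref_anchor` is a hypothetical name, and wiki markup such as italics in the parts is not handled.

```python
def citeref_anchor(*parts):
    """Build a CITEREF-style anchor id from the given parts.

    The parts are concatenated after the "CITEREF" prefix and spaces
    become underscores, which reproduces the id seen in the rendered
    <cite> tag above (simplified; other characters are left untouched).
    """
    return "CITEREF" + "".join(parts).replace(" ", "_")


anchor = citeref_anchor("15 January 2001b")
```

The short footnote and the full citation link up only when both sides produce the same string, which is why the {{harvnb}} and {{sfnref}} arguments above must match exactly.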
Ah, I don't use this ever. Would it not be sensible when |ref=harv to emit an error when at least one of the required pieces is not present? --Izno (talk) 22:59, 28 March 2020 (UTC)[reply]
Probably not, mostly because many just put |ref=harv by reflex, without intending to use it for generating anchors. It would make a lot of sense as a preview message though. Headbomb {t · c · p · b} 23:40, 28 March 2020 (UTC)[reply]
Perhaps this:
{{harvnb|''NEWSPAPER'' 9 Apr 1921}}NEWSPAPER 9 Apr 1921
{{harvnb|''NEWSPAPER'' 19 Apr 1921}}NEWSPAPER 19 Apr 1921
with:
{{cite news |title=World collapses |date=9 April 1921 |newspaper=NEWSPAPER |ref={{sfnref|''NEWSPAPER'' 9 Apr 1921}}}}
"World collapses". NEWSPAPER. 9 April 1921.
{{cite news |title=Never mind |date=19 April 1921 |newspaper=NEWSPAPER |ref={{sfnref|''NEWSPAPER'' 19 Apr 1921}}}}
"Never mind". NEWSPAPER. 19 April 1921.
Trappist the monk (talk) 22:19, 28 March 2020 (UTC)[reply]
I wouldn't. Short references should be that. The subnotation datea is confusing. One could assume that there is another citation from the same source with the same date. Using the serial subnotation with the proper timeframe reference (monthyeara, monthyearb etc. or yeara, yearb for example) lets the reader know that there are several similar citations in the referenced period. 98.0.246.242 (talk) 23:43, 28 March 2020 (UTC)[reply]
Trappist's way is what I have done in the past when the auto-generated CITEREF (aka harvid) is not appropriate or not working. – Jonesey95 (talk) 00:33, 29 March 2020 (UTC)[reply]
@Jonesey95 and Trappist the monk: Are there any existing articles that use something like this? Jonesey95, you mentioned using it in the past. In what article? KAVEBEAR (talk) 16:44, 29 March 2020 (UTC)[reply]
There is nothing wrong with using {{sfnref}} or {{harvid}}. But the norm in short references has been to use Author-Date (meaning year or rarely month year) or Author-Title. This has to do with the way the full reference is presented, and also indexing comes into play. Most databases main-index by author, subindexing by date and/or title. And they additionally index by source, again subindexing by title and/or date. Granted that this is not universal and it is becoming less so with new techniques such as on-demand indexing etc. But it is fair to assume that one will find a work faster by searching these fields. It is (or used to be until yesterday) harder to find a work by date/title combination. Short references reflect all that. Emphasis on short. 98.0.246.242 (talk) 23:36, 29 March 2020 (UTC)[reply]
I make a lot of little edits, so I'd really have to dig to find one that matches this description, but here is one where I used {{harvid}} to differentiate two sources with the same author and year. It's not perfect, but I was only trying to solve the immediate no-link problem, not try to perfect the article. – Jonesey95 (talk) 17:27, 29 March 2020 (UTC)[reply]
Benjamin Hope uses this custom format quite a bit. – Jonesey95 (talk) 17:10, 30 March 2020 (UTC)[reply]

I would try to stay close to what printed style guides like Chicago Manual of Style do so as not to surprise the reader or other editors. Let's start with what the reader will see in the bibliography. KAVEBEAR didn't mention an author for either story, but newspaper stories almost always have a headline. So the first element in the citation, as rendered, will be the article title, and that is what will determine placement in the bibliography. In the case of a duplicate title, the tie will be broken by the next element, the name of the newspaper. But this is the same. So the tie will be broken by the next element, the volume of the newspaper. This logic is abominable. So I would do one of two things:

  1. Let the article rot until a consistent order of citation elements is implemented as requested at {{#Position of date}}
  2. Rip out all the citation templates and use some other citation style that is more appropriate for the article.

Jc3s5h (talk) 13:36, 29 March 2020 (UTC)[reply]

Can you sort by titles instead? “Obituary...” 1918 / “Obituary of John Johnson” 1918 or “Funeral...” 1918 / “Funeral of John Johnson” 1918. KAVEBEAR (talk) 16:40, 29 March 2020 (UTC)[reply]
As you probably know, bibliographies are sorted manually. I would sort by the first rendered element. For most publications, this is/are the author(s). If there is no author, then the first element is the title of the lowest-level work, which would be the name of the article for a newspaper or journal. But what if the same newspaper re-uses the same headline several days in a row, and doesn't give an author? The next different element in the citation would be the issue number, which is often omitted from a newspaper story. That makes it a mess. It would be much better to render the date immediately after the article title, and use it to break ties. Jc3s5h (talk) 18:12, 29 March 2020 (UTC)[reply]

tag "website"

Hi, everybody, I find it very confusing that the tag is named "website", but you cannot give a website URL there. If I quote with cite web, I have a tag "url=" where I enter the URL of the article, and then there is "website", where I would normally enter the main website. For instance: the URL is "www.blablabla.com/specialtextonthisauthor.html"; I would enter this in the "url" place and under "website" would say "www.blablabla.com", but this doesn't work. Shouldn't one then name the tag "title of the website" instead of "website"? Kind regards and stay safe, --Gyanda (talk) 20:06, 29 March 2020 (UTC)[reply]

It is for the name or title of the overall site, not the address and not the title of the individual web page. If you find it confusing, you can always use the synonym |work= for the name. —David Eppstein (talk) 21:21, 29 March 2020 (UTC)[reply]
It took me quite a while to understand that, David. Thanks for the explanation. I will try "work". Kind regards, --Gyanda (talk) 23:16, 29 March 2020 (UTC)[reply]

bad_paramlink error message is wrong when it occurs to author parameter

In short, this markup produces a weird error message.

  • Markup: {{Cite news|author=[[BBC News]]|author-link=BBC News|title=Test}}
  • Error message: Check |author-last1= value

This most likely comes from line 1406.--ネイ (talk) 08:20, 30 March 2020 (UTC)[reply]

Just to add - I mean "author-last1" is wrong. The input markup itself is certainly incorrect.--ネイ (talk) 08:22, 30 March 2020 (UTC)[reply]

Fixed, I think, in the sandbox:

Cite news comparison
Wikitext {{cite news|author-link=BBC News|author=[[BBC News]]|title=Test}}
Live BBC News. "Test". {{cite news}}: Check |author= value (help)
Sandbox BBC News. "Test". {{cite news}}: Check |author= value (help)
Cite news comparison
Wikitext {{cite news|author-link=[[BBC News]]|author=[[BBC News]]|title=Test}}
Live BBC News. "Test". {{cite news}}: Check |author-link= value (help)
Sandbox BBC News. "Test". {{cite news}}: Check |author-link= value (help)

Trappist the monk (talk) 10:58, 30 March 2020 (UTC)[reply]

Tracking for bad ref

Do we have tracking for problems like this one? In most cases, this would appear as a missing template, so could possibly string match the first part of the entry to find them. For example, {{#invoke:string|find|{{Fenn|Hart|2001}}|^%[%[:Template:|plain=false}} returns 1, but {{#invoke:string|find|{{harvid|Fenn|Hart|2001}}|^%[%[:Template:|plain=false}} returns 0. Thanks! Plastikspork ―Œ(talk) 16:46, 31 March 2020 (UTC)[reply]

No. In this citation:
{{Cite book|ref={{Fenn|Hart|2001}}|last1=Hart|first1=Diana|title=Under the Mat: Inside pro wrestlings greatest family |publisher=[[Fenn]]}}
Hart, Diana. Under the Mat: Inside pro wrestlings greatest family. Fenn.
cs1|2 gets [[:Template:Fenn]] from |ref={{Fenn|Hart|2001}}. Because that value is not harv, cs1|2 uses it as is. Before the value becomes the id attribute of the citation's wrapping <cite>...</cite> tag, it is anchor encoded which changes it to Template:Fenn.
cs1|2 presumes that whatever is in |ref= is a correct value.
Trappist the monk (talk) 18:32, 31 March 2020 (UTC)[reply]
Granted that an editor who uses |ref= is likely not a beginner and should be careful not to nest or enter template notation in this field unless they know exactly what they are doing... so is it prudent (and easily doable) to limit {{{ref}}} entries to text, {{harvid}} and {{sfnref}}? Or is this an undue restriction? 98.0.246.242 (talk) 23:56, 31 March 2020 (UTC)[reply]
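
The string match suggested at the top of this thread could look something like the Python sketch below. It is only an illustration of the idea, not a proposal for the module itself; `looks_like_missing_template` is a hypothetical name, and the input is assumed to be the already-expanded |ref= value.

```python
import re

# An unexpanded (missing) template shows up as a link to the Template namespace.
MISSING_TEMPLATE = re.compile(r"^\[\[:Template:")


def looks_like_missing_template(expanded_ref):
    """Return True when an expanded |ref= value starts with [[:Template:."""
    return MISSING_TEMPLATE.match(expanded_ref) is not None
```

A check like this only catches the missing-template case. It says nothing about other values in |ref= that are syntactically fine but point nowhere.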

Error in date of cite newgroup example

Wait! I fixed it myself :) See: https://en.wikipedia.org/w/index.php?title=Template%3ACite_newsgroup%2Fdoc&type=revision&diff=948490292&oldid=945888694

Removed spurious comma from:

| access-date = </nowiki>{{CURRENTDAY}} {{CURRENTMONTHNAME}}, {{CURRENTYEAR}}<nowiki>

to:

| access-date = </nowiki>{{CURRENTDAY}} {{CURRENTMONTHNAME}} {{CURRENTYEAR}}<nowiki>


Here's what I was going to ask :)

In Cite newsgroup#vertical format the parameter access-date has an error. I believe the cite newsgroup documentation uses an automatically generated date. Unfortunately it contains a spurious comma (','). Perhaps this is left over from a previous change from MMMMM DD, YYYY format to DD MMMMM YYYY format.

Template:Cite newsgroup The vertical format example

| access-date = 1 April, 2020

instead of:

| access-date = 1 April 2020

The horizontal format example above it uses the alternate, but correct, date |access-date=April 1, 2020.

Thanks! Lent (talk) 09:31, 1 April 2020 (UTC)[reply]

Cite book/doc: "full" parameter lists

I just added two parameters to the "full" parameter lists of the template. I have no clue what other parameters may be missing from these full parameter lists in the documentation. Could someone check and, if needed, update the lists accordingly? Don't know whether a comparable check & possible update for other cite templates wouldn't be welcome too. Tx. --Francis Schonken (talk) 11:26, 1 April 2020 (UTC)[reply]

@Francis Schonken: The full list of all parameters (some of which may duplicate others in some way) is at Module:Citation/CS1/Whitelist and some of their correlation is at Module:Citation/CS1/Configuration in the line starting with local aliases. Most of them are applicable to all templates. --Izno (talk) 16:49, 1 April 2020 (UTC)[reply]

Where can I find the range of value |pmc

Hello, I have a question. Where can I find the range of value |pmc. I want to know because my Vietnamese wiki has a little issue about this cs1. I want to enlarge the range of value: from 1 - 6000000 to 1 - 8000000. I'm looking forward to hearing from you. Đư'c (talk) 16:12, 1 April 2020 (UTC)[reply]

vi:Mô_đun:Citation/CS1/Identifiers
Trappist the monk (talk) 16:27, 1 April 2020 (UTC)[reply]
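
For context, the kind of range check being asked about can be sketched as below. This is a generic Python illustration, not the code in the Identifiers module; `pmc_in_range` is a hypothetical name, and the two limits are simply the figures quoted in the question.

```python
def pmc_in_range(value, upper_limit):
    """Return True when `value` is a plain number from 1 to `upper_limit`."""
    # Accept ASCII digits only; anything else is not a usable identifier here.
    if not (value.isascii() and value.isdigit()):
        return False
    return 1 <= int(value) <= upper_limit


old_limit = 6000000  # limit quoted in the question
new_limit = 8000000  # limit the questioner would like to use
```

With these figures, an identifier such as 7000000 fails under the old limit and passes under the new one, so raising one number is the whole change.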

HDL parameter escaping

The citation templates encode the ? characters in Handles which breaks them. It is better than the HDL template that simply displays nothing.

HDL template:

HDL parameter: "None". hdl:2027/uc1.l0072691827. {{cite journal}}: Cite journal requires |journal= (help)

URL parameter: "None". {{cite journal}}: Cite journal requires |journal= (help)

AManWithNoPlan (talk) 17:06, 3 April 2020 (UTC)[reply]

I don't think that cs1|2 should display proxy server query parameters that are part of the hdl. I think that we should probably intercept the query string, encode the hdl url and then reattach the query string and apply that url to the handle portion of the hdl. There are several query string parameters that we will need to intercept. See hdl doc. I'll think about how to do this.
Trappist the monk (talk) 18:01, 3 April 2020 (UTC)[reply]
You are correct, not only does the ? get escaped and break the link, but the display is ugly. AManWithNoPlan (talk) 19:32, 3 April 2020 (UTC)[reply]
Cite book comparison
Wikitext {{cite book|date=193–1994|hdl=2027/uc1.l0072691827?urlappend=%3Bseq=673|page=641|section=Capital Building and Grounds|title=Official Congressional Directory 103d Congress}}
Live "Capital Building and Grounds". Official Congressional Directory 103d Congress. 193–1994. p. 641. hdl:2027/uc1.l0072691827.
Sandbox "Capital Building and Grounds". Official Congressional Directory 103d Congress. 193–1994. p. 641. hdl:2027/uc1.l0072691827.

Trappist the monk (talk) 22:38, 3 April 2020 (UTC)[reply]
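
The approach outlined in this thread (split off the query string, encode only the handle, reattach the query) might be sketched as follows. This is a Python illustration under stated assumptions, not the sandbox code: `split_hdl` and `hdl_link` are hypothetical names, and the resolver prefix `https://hdl.handle.net/` is an assumption about where the link points.

```python
from urllib.parse import quote

RESOLVER = "https://hdl.handle.net/"  # assumed resolver prefix


def split_hdl(hdl):
    """Split an |hdl= value into (handle, query) at the first '?'."""
    handle, _, query = hdl.partition("?")
    return handle, query


def hdl_link(hdl):
    """Return (url, label): query kept in the url, hidden from the label."""
    handle, query = split_hdl(hdl)
    # Percent-encode the handle only; the query string is passed through as-is.
    url = RESOLVER + quote(handle, safe="/")
    if query:
        url += "?" + query
    return url, "hdl:" + handle


url, label = hdl_link("2027/uc1.l0072691827?urlappend=%3Bseq=673")
```

Because the query is never re-encoded, the `?` survives in the link, and the displayed text stays the bare handle as in the comparison above.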

Who is the publisher?

The article I came across with Bare URLS for citations was Charles Henry Elliott-Smith. Lots of sources came from https://www.thegazette.co.uk/London/issue/x/supplement/x/data.pdf, and I used {{cite news}} for them since automatic citations wouldn't work. Some of them never stated the publisher, another stated it was published by HIS MAJESTY'S STATIONERY OFFICE, and I think the publisher is The London Gazette. So who is the publisher? {{replyto}} Can I Log In's (talk) page 00:14, 4 April 2020 (UTC)[reply]

The London Gazette is published by The Stationery Office (it says in the linked articles). Or Her Majesty's Stationery Office, before 1996. —David Eppstein (talk) 00:17, 4 April 2020 (UTC)[reply]
There is a template for that: {{London Gazette}}:
{{London Gazette |issue=36276 |date=7 December 1942 |page=5340 |supp=y}}
"No. 36276". The London Gazette (Supplement). 7 December 1942. p. 5340.
Because it's a journal of record, no publisher is necessary.
Trappist the monk (talk) 00:24, 4 April 2020 (UTC)[reply]