Help talk:Citation Style 1: Difference between revisions


Revision as of 01:21, 24 November 2020

To help centralise discussions and keep related topics together, the talk pages for all Citation Style 1 templates and modules redirect here. A list of those talk pages and their historical archives can be found here.

This is the talk page for discussing improvements to the Help:Citation Style 1 and the CS1 templates page.


Shortcut

WT:CS1

Archives: Index, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95: 25 days

Wikipedia Help B‑class High‑importance

	This page is within the scope of the Wikipedia Help Project, a collaborative effort to improve Wikipedia's help documentation for readers and contributors. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks. To browse help related resources see the Help Menu or Help Directory. Or ask for help on your talk page and a volunteer will visit you there.
B	This page does not require a rating on the project's quality scale.
High	This page has been rated as High-importance on the project's importance scale.

Academic Journals

This page is within the scope of WikiProject Academic Journals, a collaborative effort to improve the coverage of Academic Journals on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

Magazines

	This page is within the scope of WikiProject Magazines, a collaborative effort to improve the coverage of magazines on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.
See WikiProject Magazines' writing guide for tips on how to improve this article.

Some of the templates discussed here were considered for merging or deletion at Wikipedia:Templates for discussion. Please review the prior discussions if you are considering re-nomination:

"Withdrawn" proposal to merge Template:Cite press release with Template:Cite news on March 2, 2018, see discussion.

Text has been copied to or from this page; see the list below. The source pages now serve to provide attribution for the content in the destination pages and must not be deleted as long as the copies exist. For attribution and to access older versions of the copied text, please see the history links below.

Copied pages:

Copied Template:Cite_book (oldid, history) → incubator:Template:Wp/nod/cite_book (diff)
Copied Template:Cite_journal (oldid, history) → incubator:Template:Wp/nod/cite_journal (diff)
Copied Template:Cite_web (oldid, history) → incubator:Template:Wp/nod/cite_web (diff)
Copied Module:Citation/CS1 (oldid, history) → incubator:Module:Wp/nod/Citation/CS1 (diff)
Copied Module:Citation/CS1/Configuration (oldid, history) → incubator:Module:Wp/nod/Citation/CS1/Configuration (diff)
Copied Module:Citation/CS1/Suggestions (oldid, history) → incubator:Module:Wp/nod/Citation/CS1/Suggestions (diff)
Copied Module:Citation/CS1/Utilities (oldid, history) → incubator:Module:Wp/nod/Citation/CS1/Utilities (diff)
Copied Module:Citation/CS1/Whitelist (oldid, history) → incubator:Module:Wp/nod/Citation/CS1/Whitelist (diff)
Copied Module:Citation/CS1/styles.css (oldid, history) → incubator:Module:Wp/nod/Citation/CS1/styles.css (diff)


Citation templates

... in conception

... and in reality

Add wayback-timestamp parameter

When an archive is added to a reference, the vast majority of the time it is just a Wayback Machine archive of the exact same URL. This bloats the source code of pages massively. It would be much simpler if a wayback-timestamp parameter was added, which would be set to the timestamp found in the archive's URL. This was mentioned seven years ago here but the discussion had no conclusion. Example: |wayback-timestamp=20200721125421 in {{cite web|url=https://example.com/page|title=Example page|website=Example.com|date=2020-08-04|wayback-timestamp=20200721125421|archive-date=2020-07-21}} as opposed to the bloated {{cite web|url=https://example.com/page|title=Example page|website=Example.com|date=2020-08-04|archive-url=https://web.archive.org/web/20200721125421/https://example.com/page|archive-date=2020-07-21}}. Implementation: if waybackTimestamp then archiveUrl = 'https://web.archive.org/web/' .. waybackTimestamp .. '/' .. url end. Nixinova T C 05:15, 4 August 2020 (UTC)[reply]
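For illustration only, the assembly step proposed above can be sketched in Python (the real module suite is Lua). The function name `wayback_archive_url` and the strict 14-digit check are assumptions made for this sketch; nothing like it exists in cs1|2.

```python
def wayback_archive_url(url: str, wayback_timestamp: str) -> str:
    """Assemble a Wayback Machine archive URL from a source URL and a timestamp.

    Hypothetical helper: the name and the validation rule are assumptions
    for this sketch, not part of any existing cs1|2 module.
    """
    # Wayback timestamps in archive links are 14 digits: YYYYMMDDhhmmss.
    is_valid = (
        len(wayback_timestamp) == 14
        and wayback_timestamp.isascii()
        and wayback_timestamp.isdigit()
    )
    if not is_valid:
        raise ValueError("wayback timestamp must be exactly 14 ASCII digits")
    return f"https://web.archive.org/web/{wayback_timestamp}/{url}"
```

Called with the example values from the proposal, this returns the same archive link that the longer |archive-url= form spells out in full.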

You're not wrong, but the thing is, server space keeps getting cheaper and cheaper, and programmer (paid, or volunteer time) keeps getting more expensive and scarcer. If you had to prioritize this against stuff that's either broken and needs fixing, or enhancements that would provide desired new functionality, well, you see the problem... Mathglot (talk) 04:00, 6 August 2020 (UTC)[reply]

I'm not saying to replace all archive-url's with this, just add it as an additional option. Nixinova T C 07:47, 27 August 2020 (UTC)[reply]

Bumping, I still think this is a good idea. Wayback is what most people use for archives, and this would save many kilobytes per page. Other archiving services could be used with archive-url without touching this syntax, but this would be very useful at minimising the size of references in a page's source. Nixinova T C 03:23, 2 October 2020 (UTC)[reply]

@Nixinova: Have you tried |wayb? For example, {{cite web |url=https://example.com/page |wayb=20200721125421}}? Note that when using |wayb you don't need to use |archive-date because it extracts the date from |wayb. I think that is an undocumented feature, I discovered it reading some page source to understand the inner workings. Joaopaulo1511 (talk) 08:05, 15 October 2020 (UTC)[reply]

@Nixinova: Sorry, |wayb works on Portuguese Wikipedia, but not on English Wikipedia. Check pt:ReactOS (page source) to see what I am talking about. The |wayb argument is documented here pt:Predefinição:Citar_web#URL and on other Portuguese citation templates. @Mathglot: The way I see it, one day of a coder's work can help editors save many months by not having to repeat wiki code over and over, and also help read (and edit) faster by uncluttering the sources' pages. And the code is already there, on the Portuguese Wikipedia, just waiting to be copied. 😅 Joaopaulo1511 (talk) 08:55, 15 October 2020 (UTC)[reply]

I like the idea in general, but not the proposed user-interface. I am not too fond of the idea of adding a specialized parameter |wayback-timestamp= or |wayb= just for archive.org. Also, these parameter names would not fit well into our parameter naming scheme. An alternative proposal, which works without introducing a new parameter, is discussed here: Help_talk:Citation_Style_1/Archive_72#Smart_substitution_token_to_reduce_redundancy_among_input_parameters

It is slightly longer (which shouldn't matter, as in both cases the full archive link must be available for truncation before adding it to a citation - basically no one types in archive links or timestamps without utilizing copy & paste), but it is more flexible (also possible for some other archivers) and it would be embedded into a more general concept potentially reducing the necessary amount of typing also for a number of other citation template parameters. Of course, both could be implemented in parallel, but for reasons of consistency across citation templates (similar to our ((accept-this-as-it-is)) syntax) I would prefer the broader concept of a smart substitution token.

--Matthiaspaul (talk) 11:09, 15 October 2020 (UTC)[reply]

(edit conflict)

Previous discussions:

Those seem to focus on all archive sources, whereas |wayb= is specific to Internet Archive. Because we have InternetArchiveBot I would guess that the vast majority of |archive-url= parameters hold wayback urls. If that is the case then perhaps there is some sense in supporting |wayb= or similar. But, for me, it is easier to copy/paste an entire archive url than it is to highlight 14 digits in the middle of the archive url and then copy/paste that. So that suggests, if the goal is to make life easier for editors, when |archive-url= holds a properly formed Internet Archive url, cs1|2 can extract the date from the 14-digit timestamp and return a YYYY-MM-DD archive date to be formatted according to |df= or {{use xxx dates}}.
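The date extraction suggested above could look roughly like the following Python sketch. The regular expression and the function name `archive_date_from_url` are assumptions for illustration; the actual validation of a "properly formed" Internet Archive URL in cs1|2 may differ, and formatting per |df= is left out.

```python
import re
from datetime import date
from typing import Optional

# Assumed shape of a Wayback Machine link: a 14-digit timestamp, optionally
# followed by a short modifier such as "id_", then a slash and the original URL.
_WAYBACK_RE = re.compile(r"^https?://web\.archive\.org/web/([0-9]{14})[a-z_]*/")


def archive_date_from_url(archive_url: str) -> Optional[date]:
    """Return the archive date embedded in a Wayback URL, or None if absent."""
    match = _WAYBACK_RE.match(archive_url)
    if match is None:
        return None
    stamp = match.group(1)
    try:
        # Only the YYYYMMDD part is needed for an archive date.
        return date(int(stamp[0:4]), int(stamp[4:6]), int(stamp[6:8]))
    except ValueError:
        # Digits that do not form a real calendar date.
        return None
```

With this approach editors keep pasting the whole archive link, and only |archive-date= becomes redundant for archivers whose links carry a timestamp.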

I'm not all that comfortable with automatically assembling an archive url from |url= and an editor-supplied timestamp. Any change that 'fixes' the url will likely break the assembled archive-url.

—Trappist the monk (talk) 11:14, 15 October 2020 (UTC)[reply]

Moreover, this would make the bot's work more complex. Solid "we should not do this". --Izno (talk) 13:07, 15 October 2020 (UTC)[reply]

Assembling archived links from a prefix, a timestamp and a URL is hardly "complex", it's trivial to code. However, there is, as Trappist correctly wrote, a risk of breaking the archived link when the URL gets modified later on. So, this whole idea depends on such timestamps being adjusted or removed whenever |url= is touched, or on them being replaced by the expanded link in |archive-url= again. However, failing to update |archive-url= when modifying |url= is almost always an error, even without this proposal. What we'd lose is the "known good state" of an already existing |archive-url= when the |url= undergoes only minor tweaking (like removing unnecessary URL parameters). As bots not updated to take |wayback-timestamp= (or similar) into account would likely just add |archive-url=, the failure mode is on the safe side if the template gives |archive-url= priority over |wayback-timestamp=. In the case of the placeholder idea, an |archive-url= containing a * would not match and would likely be overwritten by the bot when it changes |url=. It's not 100% bullet-proof over the transitional phase, but little actual damage can be done, so this aspect alone should not invalidate the idea, IMO.
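The priority rule described above can be sketched in a few lines of Python. This is a hypothetical illustration: `resolve_archive_url` and the plain-dict parameter model are assumptions, and |wayback-timestamp= is only a proposed parameter.

```python
from typing import Dict, Optional


def resolve_archive_url(params: Dict[str, str]) -> Optional[str]:
    """Pick the archive link for a citation, preferring an explicit |archive-url=.

    Hypothetical sketch: 'wayback-timestamp' is a proposed parameter, not an
    existing one, and the dict stands in for the template's arguments.
    """
    explicit = params.get("archive-url")
    if explicit:
        # An explicit link always wins, so a bot that only knows |archive-url=
        # cannot be overridden by a stale timestamp.
        return explicit
    timestamp = params.get("wayback-timestamp")
    url = params.get("url")
    if timestamp and url:
        # Assembled link: only valid while |url= still matches what was archived.
        return f"https://web.archive.org/web/{timestamp}/{url}"
    return None
```

If a bot later rewrites |url= and adds its own |archive-url=, the explicit link is used and the leftover timestamp is simply ignored, which is the "safe side" failure mode mentioned above.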

In general, we should not have "mercy" with bots. They are to make life easier for humans, not the other way around. Programs exist to code once, solve often. For as long as the work required to code a program is smaller than the accumulated amount of work that would be required to repeatedly solve a problem manually, the difficulties to code and maintain a bot are worth it.

--Matthiaspaul (talk) 09:58, 16 October 2020 (UTC)[reply]

I think, the goal of these proposals, as far as archive links are concerned, is to reduce clutter in citation source code (URLs tend to be long and ugly), less so to save storage space (because it doesn't matter much) or reduce the amount of typing (as the parameter value would be crafted from a pasted archive link rather than typed in manually).

In the case of |archive-date=, the goal is actually to reduce typing and maintenance time. Although this is only addressing a minor aspect of both proposals, making |archive-date= optional for |archive-url= links from archivers known to include timestamps would be something I would support as well. Wikipedia:List of web archives on Wikipedia lists a number of archivers producing links with embedded timestamps.

--Matthiaspaul (talk) 09:58, 16 October 2020 (UTC)[reply]

last-author-amp=

This documentation edit reminds me that |last-author-amp= should be deprecated in favor of a new parameter with a better name. We do not have |last-contributor-amp=, |last-editor-amp=, |last-interviewer-amp=, or |last-translator-amp= parameters. When |last-author-amp=yes, any of the other name lists that have two or more names will use the ampersand separator between the last two names in the list.

What is the new parameter name? |last-name-amp= is problematic for obvious reasons. |last-sep-amp=? Or, something different, perhaps: |namelist-last-sep=<keyword> where <keyword> is & or amp or and; possibly other keywords? Still needs the new parameter name and keyword definitions.

—Trappist the monk (talk) 19:53, 19 August 2020 (UTC)[reply]

How about |author-ampersand=, |editor-ampersand=, etc.? Spelling out "ampersand" is a bit awkward, but its meaning is clearer than "amp". The |xxx-ampersand= model is easily extensible to other parameters, such as those listed above. The documentation could make it clear that the parameter, when set to "yes" or "y", renders an ampersand between the final two author/editor/translator names. – Jonesey95 (talk) 21:16, 19 August 2020 (UTC)[reply]

|last-author-amp= applies to all name lists even when there are no names in the author name list:

{{cite book |title=Title |translator=Translator |translator2=Translator2 |last-author-amp=yes}}

Title. Translated by Translator; Translator2. {{cite book}}: |translator= has generic name (help); Unknown parameter |last-author-amp= ignored (|name-list-style= suggested) (help)CS1 maint: numeric names: translators list (link)

This mechanism makes sense to me because the name lists in a citation should all render with the same style. A single parameter name not closely tied to a particular name list seems to me better than renaming |last-author-amp= and creating four aliases of that – I can imagine editors adding an (unnecessary) alias parameter for each name list in the citation...

—Trappist the monk (talk) 21:37, 19 August 2020 (UTC)[reply]

I've been wondering of late whether this parameter is strongly needed at all. But that aside, I'd go for |namelist-last-sep=<keyword> or similar. --Izno (talk) 21:45, 19 August 2020 (UTC)[reply]

I confess to wondering the same, but it exists and were we to take it away, no doubt, no doubt, torches, pitchforks, ...

—Trappist the monk (talk) 21:50, 19 August 2020 (UTC)[reply]

My mistake. I would support something like |name-list-ampersand= then. And I would not be excited about an open-ended var option for the separator. The last thing we need around here is more citation variation, let alone within CS1 templates. – Jonesey95 (talk) 21:56, 19 August 2020 (UTC)[reply]

There was this discussion: Help talk:Citation Style 1/Archive 44 § Is there any interest... I thought I remembered more than that one but it appears that my memory is faulty.

—Trappist the monk (talk) 22:18, 19 August 2020 (UTC)[reply]

(edit-conflict) If we switch to use a different parameter, I think, it should be one not only allowing the feature to be enabled or disabled, but to actually specify the separator as well. That would be your proposed |namelist-last-sep=, although, I think, that name is too complicated (and contains an abbreviation not all people will understand). The {{catalog lookup link}} template uses |list-leadout= for this. Given that it would apply to all name lists, |leadout-separator= or just |leadout=/|lead-out= could work as well (but could be easily confused with the |postscript= parameter).

Is there a chance that we'd need to specify alternative leadouts also for other lists in the future? Then, the parameter name should be chosen in a way already taking such extensions into account, namewise. However, the only other lists at present are identifier lists and pages — I don't see any possible need to divert from the default separation schemes there, hence, no issue.

However, there are other options as well:

If, for example, we would want to get rid of a parameter, the functionality could be merged into one of the existing parameters

|name-list-format= (either through a new token such as "amp", or by just taking all string values except for "vanc" as the actual leadout string — however, in the latter case, the parameter name should be changed to become more meaningful again)

or

|display-<names>= (either using negative values -1, -2, etc. to use & instead of the default leadout, or any string values other than "etal" to define the leadout string — in the latter case, the feature could not be used in combination with actually display-truncated lists, and in both cases, the parameter name may need to be changed as well).

If the feature is only rarely used, it could even be emulated manually using |<name>-maskn=, but this would give more options than necessary including some undermining the feature, so it would only be an option for occasional use.

--Matthiaspaul (talk) 23:01, 19 August 2020 (UTC)[reply]

I don't think that I like |list-leadout= because leadout seems rather more jargon-ish than most cs1|2 parameters. I don't particularly care for |namelist-last-sep= for the same reason.

The language list uses <space>and<space> (two languages) and ,<space>and<space> (three+ languages). I see no reason to change that.

I do rather like |name-list-format=amp and |name-list-format=and because that parameter applies to all name lists. amp and and will not conflict with vanc because Vancouver style only supports comma separators between names.

I don't think that name-list separators have anything to do with the purpose |display-<name-list>= serves (and negative numbers are just too cryptic). As it works now, |display-<name-list>= causes cs1|2 to ignore |last-author-amp=. I think that this is probably the correct action to take when both parameters are present.

—Trappist the monk (talk) 00:12, 20 August 2020 (UTC)[reply]

I forgot about the language list, but, like you, I don't see any need for a change there.

I mentioned |display-<names>= only for completeness and because it also deals in some way with the last name in a list, but I completely agree with you that semantically it has a very different purpose. (Talking about it, this reminds me that these parameters should better be named |authors-display=/|editors-display= than |display-authors=/|display-editors= to follow the naming scheme of most of the other modern parameters to further differentiate on the left rather than the right side.)

I, too, find |name-list-format=amp[ersand]/and/vanc a good name for the purpose (and much better than |last-author-amp=yes), and also like the idea of limiting the choices to a few hardwired tokens instead of allowing this parameter to accept free text.

--Matthiaspaul (talk) 10:44, 20 August 2020 (UTC)[reply]

We really should rename |name-list-format= to something shorter, like |nf= (which is short for name format) in parallel to |df= (which is short for date format). Headbomb {t · c · p · b} 19:58, 23 August 2020 (UTC)[reply]

In the sandbox I have extended |name-list-format= to allow the additional keywords amp and and:

{{cite book/new |title=Title |author=Black |author2=Brown |name-list-format=amp}} → Black; Brown. Title. {{cite book}}: Unknown parameter |name-list-format= ignored (|name-list-style= suggested) (help)
{{cite book/new |title=Title |author=Black |author2=Brown |name-list-format=and}} → Black; Brown. Title. {{cite book}}: Unknown parameter |name-list-format= ignored (|name-list-style= suggested) (help)
{{cite book/new |title=Title |author=Black |author2=Brown |author3=Red |name-list-format=amp}} → Black; Brown; Red. Title. {{cite book}}: Unknown parameter |name-list-format= ignored (|name-list-style= suggested) (help)
{{cite book/new |title=Title |author=Black |author2=Brown |author3=Red |name-list-format=and}} → Black; Brown; Red. Title. {{cite book}}: Unknown parameter |name-list-format= ignored (|name-list-style= suggested) (help)

|last-author-amp= still works:

{{cite book/new |title=Title |author=Black |author2=Brown |last-author-amp=yes}} → Black; Brown. Title. {{cite book}}: Unknown parameter |last-author-amp= ignored (|name-list-style= suggested) (help)

I wonder about the punctuation for and. It looks odd to me without the name separator in the three-name list:

Black; Brown and Red

or

Black; Brown; and Red

Which is better? more correct?

—Trappist the monk (talk) 10:59, 4 September 2020 (UTC)[reply]

MOS has a preference for the Oxford/serial comma, which I think reasonably extends to our use of the semicolon. --Izno (talk) 14:35, 4 September 2020 (UTC)[reply]

The following links indicate that a serial semicolon analogous to the serial comma exists, although it can't be exactly common (I cannot remember ever having seen this in the wild and it looks quite odd to me):

Given that our specific use case here is a list of names and the fact that corporate names may include the conjunction "and" as well, I nevertheless tend to prefer the second form to avoid ambiguities. This would also be consistent with the way the language lists works at present.

Or go yet a bit further by generalizing the parameter |name-list-format= into |list-format= (also shorter per Headbomb), adding another token like "serial", and (despite what we both wrote above) apply the setting to both, name and language lists ~~with "serial" being the default (also in the "vanc" case)~~?

--Matthiaspaul (talk) 15:54, 4 September 2020 (UTC)[reply]

Tweaked to use ; and for name-lists of three or more but your point about corporate names would also suggest the same tweak for two-name lists and also for name-lists that use the ampersand.
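A rough Python sketch of the separator choices under discussion, for illustration only. The function `format_name_list` is hypothetical; the rendering of two-name lists and of ampersand lists shown here is an assumption, since the thread leaves open whether they should also get the serial semicolon.

```python
from typing import List, Optional


def format_name_list(names: List[str], style: Optional[str] = None) -> str:
    """Join names the way |name-list-format=amp/and might render them.

    Hypothetical sketch; 'vanc' and name masks are deliberately not handled.
    """
    if style not in (None, "amp", "and"):
        raise ValueError(f"unsupported style: {style!r}")
    if style is None or len(names) < 2:
        # Default: plain semicolon-separated list.
        return "; ".join(names)
    conjunction = "&" if style == "amp" else "and"
    head, last = names[:-1], names[-1]
    if len(names) == 2:
        # Assumption: two-name lists take the conjunction without a semicolon.
        return f"{head[0]} {conjunction} {last}"
    if style == "and":
        # Serial semicolon before 'and' for three or more names, as tweaked above.
        return "; ".join(head) + f"; and {last}"
    # Assumption: the ampersand form keeps no semicolon before the last name.
    return "; ".join(head) + f" & {last}"
```

For the three-name example above, the `and` style yields the second of the two forms (with the semicolon before "and").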

As part of this change, in ~/Configuration for i18n I created sep_nl_and and sep_nl_end in presentation {} and have renamed:

parameter-separator → sep_list

parameter-final-separator → sep_list_end

parameter-pair-separator → sep_list_pair

These were in messages{} but I have moved them to presentation {} where they more properly belong. This change applies to the |language= list and error-message lists. I had hoped that I could use a common function to handle the writing of name lists and language lists but |<name-list>-mask=<text> heaves a spanner into the works because the rendered value from text-masked names uses a space character as a separator. I may still write that function so that at least the language-name and error-message lists can share common code.

Also as part of this change, and unrelated to it, I added require('Module:No globals') which I'm pretty sure used to exist in one of the modules though I can't now find where that was ... This addition brought to light a handful of items that oughtn't to have had global scope so I have marked those items local.

This parameter is for name lists so its name should reflect that; vanc has no meaning for language or error-message lists.

—Trappist the monk (talk) 22:14, 5 September 2020 (UTC)[reply]

In Module:Citation/CS1/Utilities/sandbox I have created list_make() as the common function that makes a comma-separated list (other separators possible) with selected coordinating conjunction. This function is now used to render certain error messages and to render the languages list:

{{cite book/new |title=Title |chapter=Chapter |section=Section}}

"Chapter". Title. {{cite book}}: More than one of |section= and |chapter= specified (help)

{{cite book/new |title=Title |page=1 |pages=23–24 |at=¶6}}

Title. p. 1. {{cite book}}: More than one of |pages=, |at=, and |page= specified (help)

and the language list:

{{cite book/new |title=Title |language=ale}} → Title (in Aleut).

{{cite book/new |title=Title |language=cop, la}} → Title (in Coptic and Latin).

{{cite book/new |title=Title |language=nv, chy, zun}} → Title (in Navajo, Cheyenne, and Zuni).
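The behaviour of a common list-making function like the one described can be approximated in Python as follows. This is not the Lua list_make() from the sandbox; the name `make_list` and its signature are assumptions, with the three cases loosely mirroring sep_list_pair, sep_list and sep_list_end.

```python
from typing import List


def make_list(items: List[str], sep: str = ", ", conjunction: str = "and") -> str:
    """Render a list with a separator and a coordinating conjunction.

    Illustrative stand-in for a shared list function; not the sandbox code.
    """
    if not items:
        return ""
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        # Pair separator: conjunction only, no comma.
        return f"{items[0]} {conjunction} {items[1]}"
    # Three or more: separator between items, serial separator before the last.
    return sep.join(items[:-1]) + f"{sep}{conjunction} {items[-1]}"
```

Passing the language names from the examples above reproduces the renderings shown: a single name unchanged, two names joined by "and", and three names with a serial comma.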

This one illustrated here because the error message may be assembled in two modules:

{{cite book/new |title=Title |year=2002 |date=2001 Dec 2}} – assembled in Module:Citation/CS1/Date validation/sandbox and Module:Citation/CS1/sandbox

Title. 2001 Dec 2. {{cite book}}: Check date values in: |date= and |year= / |date= mismatch (help)

{{cite book/new |title=Title |date=2001 Dec 2 |url=//example.com |access-date=2001}} – assembled in ~/Date validation/sandbox

Title. 2001 Dec 2. Retrieved 2001. {{cite book}}: Check date values in: |access-date= and |date= (help)

Excepting the coordinating conjunction, date error messaging renders differently from the live messaging for the same errors (separator font):

Title. 2001 Dec 2. Retrieved 2001. {{cite book}}: Check date values in: |year=, |access-date=, |date=, and |year= / |date= mismatch (help)CS1 maint: year (link)

Title. 2001 Dec 2. Retrieved 2001. {{cite book}}: Check date values in: |year=, |access-date=, |date=, and |year= / |date= mismatch (help)

—Trappist the monk (talk) 17:33, 11 September 2020 (UTC)[reply]

Just for reference sake, deprecation will cause a change to about 36k pages. --Izno (talk) 17:04, 4 September 2020 (UTC)[reply]

Yep, know about that. I have a bot task pretty much ready to go. In testing that task I learned that it is almost never the case that all cs1|2 templates in an article that could make use of |last-author-amp= (those cs1|2 templates that have two or more names in a name-list) actually have |last-author-amp=. These came from the top of my article list from my testing a week or more ago:

Belarus – 1 use in 18 eligible templates

India – 2 uses in 88

Barack Obama – 1 use in 82

Australia – 1 use in 27

Ronald Reagan – 6 uses in 33

It will, I think be the rare case that every eligible template in an article uses |last-author-amp=.

Alas, BRFAs require test runs so until the deprecation goes live (which includes the new keywords for |name-list-format=), there isn't much progress to be made.

—Trappist the monk (talk) 17:50, 4 September 2020 (UTC)[reply]

Given that so many pages need to be touched (but can be fixed up by a bot), I actually think we should change the parameter name |name-list-format= into |list-format= (regardless of if we add the "serial" token or not), so that we don't have to change them all again at a later stage.

Meanwhile I actually think we should add the "serial" token as well to allow citations to blend in perfectly with a pre-existing list style in articles.

--Matthiaspaul (talk) 15:24, 5 September 2020 (UTC)[reply]

Going through the parameter list, the term "format" is currently used for three different things:

* In the |name-list-format= parameter above

* (Indirectly in the |df= ("dat~~a~~e format") parameter)

Therefore, in our attempt to improve the consistency of parameter names, I think, we should change the |name-list-format= to something not containing the term "format" any more. Existing usage of |name-list-format=vanc amounts to some 6.5k citations, but if we have to run a bot on 36k entries anyway, before we hammer it into stone forever, another 6.5k edits doesn't really matter, if we thereby reach a higher level of consistency.

--Matthiaspaul (talk) 10:28, 9 September 2020 (UTC)[reply]

|series-separator= was apparently invented for an early lua version of {{cite episode}}. I can't find where it was actually used in the wikitext version of that template. When I migrated {{cite episode}} to the module suite, |series-separator= was not included. And then came the great separator purge with the invention of |mode=. I'm astonished that |series-separator= survived the purge (an indication of too damn many parameters?). I will remove it and its meta-parameter.

When we invented |mode=, my preferred name for that parameter was |style=. That was rejected, in part, because it would be the same as the html style= attribute.

And |df= is date format, not data.

—Trappist the monk (talk) 11:37, 9 September 2020 (UTC)[reply]

I was wondering what that is - this explains why I didn't find anything regarding |series-separator=... ;-)

type is in use as well already.

|separator-mode=? |name-list-mode=? |list-mode=?

--Matthiaspaul (talk) 12:28, 9 September 2020 (UTC)[reply]

But if it can't be |name-list-type= because |type= then it can't be |name-list-mode= because |mode=, right?

—Trappist the monk (talk) 12:35, 9 September 2020 (UTC)[reply]

Almost. ;-) It would have to be |cite-mode= then (another 6.9k hits)... (the old problem of too unspecific parameter names biting again) ;->

In the case of mode the two settings are at least both switching between different ways how citations are rendered, whereas in the case of type, the pre-existing usage of the parameter is to specify the media type ("Video") or formal document type ("Essay", "Report"), something not even remotely related to a list style in the citation itself.

I still like style; while it is true that we should try to maintain consistent parameter names across Wikipedia, I think it is even more important to at least reach a logical and consistent parameter naming scheme among the citation templates. So, if we don't find something linguistically and semantically more pleasing, I would still opt for something ending in -style - and if a temporarily confused editor were to accidentally throw HTML at it, this wouldn't cause harm but just return an error message.

BTW. The old thread was Help_talk:Citation_Style_1/Archive_7#Display_parameters:_do_we_need_them?

Any other suggestions?

--Matthiaspaul (talk) 20:40, 9 September 2020 (UTC)[reply]

Two-and-a-half weeks have passed without an answer. As we need to find a good new name for the parameter before the pending update of the template (because otherwise, the bot task would hammer the -format name into stone forever), I have continued to look for alternatives. Some remarks:

|name-list-format= is inadequate for our purpose, because semantically, -format implicitly deals with input data. Also, as detailed above, we have an otherwise consistent established use for this already, so we really should use something different here.
|name-list-mode= could be a good choice, but then we should move the existing |mode= to |cite-mode= or similar (and leave |mode= as an alias for it). Semantically, -mode affects some internal configuration of the template and possibly the output, so while it would fit into a future parameter class |-mode= for all kinds of mode settings, it is not a perfect match.
|name-list-style= is linguistically very pleasing and semantically a well-suited name, as -style implies that this parameter somehow deals with output data. The HTML argument against |style= does not really apply, as our parameter would be named |name-list-style= rather than just |style=.
|name-list-appearance= is, like |name-list-style=, linguistically and semantically well-suited, but quite long.
|name-list-display= might be a good choice as well, in particular if we also switch the semantically misleading |display-names= parameters to the |name-display= form, which are semantically better suited and in compliance with our parameter naming conventions to list the input "type" last and disambiguate on the left side. Switching these names (and keeping the older ones as aliases for now) would considerably improve the consistency in documentation and make it easier to remember the parameter names. |name-list-display=vanc/and/amp would fit in the group of |author-display=0/n/etal/|editor-display=0/n/etal, etc. parameters if we define the -display as a parameter class to change the appearance of a citation and not change the template's internal configuration.

Other synonyms I came up with were linguistically or semantically worse.

My order of preference is (in descending order): |name-list-display=, |name-list-style=, |name-list-mode=

Which one should we choose?

--Matthiaspaul (talk) 21:27, 27 September 2020 (UTC)[reply]

I'm perplexed. Here you complain that the bot task would hammer the -format name into stone forever, yet elsewhere on this page you appear to anticipate that |title=none will be redefined in future. If the [hammered] ... into stone argument applies to the one it must also apply to the other.

semantically, -format implicitly deals with input data. Really? Where do you get that notion?

If we must choose another name (I'm not yet convinced that we must), I would choose |name-list-style= because this |<noun>-<verb>= parameter in combination with its assigned value, instructs cs1|2 how to style the name lists.

—Trappist the monk (talk) 14:54, 29 September 2020 (UTC)[reply]

Trappist, thanks for taking the time to think about it and your answer. Having thought about the various parameter classes and their possible future extensions for another two days, I have also come to the conclusion that |name-list-style= is the best name, and that the argument regarding a possible clash with the HTML style= attribute can be ignored here.

Regarding format being associated with input data, I had hoped that my "implicitly" would make it clear that this was meant in the context of our usage in citation templates; all the other parameters using format describe input data, |name-list-format= is the sole exception. In general, format can be associated with output data as well, of course, but at least not with internal states such as mode. While the name is "bearable" and we are used to just using what is given, if, in our attempt to improve the user interface for normal users, we seek the most suitable parameter name fitting into our naming scheme, such nuances or subtleties are important to be aware of. Does this make things clearer? It is also possible that not all people have the same associations... ;-)

There is no reason to be perplexed: If we keep the |name-list-format= name, your bot task will hammer it into 36k articles since we merge |last-author-amp= into this parameter. The number would be much too high to carry out this change manually (and also non-negligible for a bot), but fortunately we have your bot task. Now, if we use |name-list-style= instead, your script will have to edit another 6.5k articles (not much of an addition for the bot, therefore acceptable), but in the end we'd have a parameter name which does not clash with other semantically considerably different uses of parameters of the -format class (as discussed above), and if we had other settings only affecting the output we could use the -style class for them as well. If we skip this chance to rename the parameter, and later decide that |name-list-format= needs to be changed, we would have to run a bot just for this task on 42.5k articles (which might be too much to be acceptable). So, doing it now, we can "save" 36k edits. That's why I think we should not skip the chance. (Even if we want to freeze the code now for the update and cannot come to a decision before it, I think we should include it in the update, because if we decide against it, we could still silently remove it again in the next update, whereas if we don't include it and then decide to use it, we would have to delay the deprecation of the |last-author-amp= parameter for another quarter.)

(Regarding redefining |title=none, that's a completely different case (best discussed in the other thread), but IIRC it only affects some 1k cites, so it is even possible to achieve manually.)

--Matthiaspaul (talk) 18:04, 29 September 2020 (UTC)[reply]

One comment regarding any change: remember that {{harv}} et al. use an ampersand. In articles that repeat references to the same book, I put the full citation on first reference and then use {{harvp}} for subsequent references, akin to how The Chicago Manual of Style shortens subsequent footnotes to a previously used source. If |last-author-amp= weren't available, I'd run into an inconsistency where full citations and shortened citations in the same reference list won't do similar things. (See footnotes 40 [full] and 51 [shortened] or footnotes 50 [full] and 55–57 [shortened] in Michigan State Trunkline Highway System for an example in just one article. Every eligible footnote should be using |last-author-amp= as well.) Imzadi 1979 → 00:46, 6 September 2020 (UTC)[reply]

I don't understand the point you are attempting to make here. It appears that you think that the |last-author-amp= functionality is going to go away because that parameter will be deprecated. Not true. |last-author-amp=yes shall be replaced with |name-list-format=amp. Writing your example citations using the sandbox:

{{cite web/new |url = http://www.michiganhighways.org/history.html |title = The History of Roads in Michigan |last1 = Pohl |first1 = Dorothy G. |last2 = Brown |first2 = Norman E. |name-list-format = amp |publisher = Association of Southern Michigan Road Commissions |date = December 2, 1997 |access-date = September 11, 2008 |page = 1 }}

Pohl, Dorothy G.; Brown, Norman E. (December 2, 1997). "The History of Roads in Michigan". Association of Southern Michigan Road Commissions. p. 1. Retrieved September 11, 2008. {{cite web}}: Unknown parameter |name-list-format= ignored (|name-list-style= suggested) (help)

{{harvp|Pohl|Brown|1997|p=3 }}

Pohl & Brown (1997), p. 3

How does that not give you what you want? Or are you silently complaining about the possible inclusion of a name separator with the ampersand: ; & so the {{cite web}} would render like this:

Pohl, Dorothy G.; & Brown, Norman E. (December 2, 1997). "The History of Roads in Michigan". Association of Southern Michigan Road Commissions. p. 1. Retrieved September 11, 2008.

My prospective bot task reports that all eligible cs1|2 templates in Michigan State Trunkline Highway System are using |last-author-amp=yes. Seems peculiar to me that the long-form cite is for page 1 but the short-form cite is for page 3.

—Trappist the monk (talk) 01:27, 6 September 2020 (UTC)[reply]

Monkbot task 17; BRFA

—Trappist the monk (talk) 16:07, 4 October 2020 (UTC)[reply]

Yes. I, too, am wondering about the circumstances in which a previous editor, for a citation containing three authors, invoked the name-list-style=amp parameter, producing this: Last1, First1; Last2, First2 & Last3, First3. It just looks weird to me. — Christopher, Sheridan, OR (talk) 06:28, 2 November 2020 (UTC)[reply]

Guidance about indexing by first name?

Is there any guidance about how to handle instances where authors should be indexed by first rather than last name? E.g. Chinese names where family name comes first, or Thai names where given name (which comes first) is the polite term of address? For example, should I call a Thai given name "last=" so the correct name comes first, as you would see in an index? Calliopejen1 (talk) 17:00, 17 September 2020 (UTC)[reply]

If you are uncomfortable using first/last in such cases, you may use |given= and |surname=. --Izno (talk) 17:50, 17 September 2020 (UTC)[reply]

What do you mean by indexed?

Whatever name you give |last= or |surname= will appear first in the rendered citation. |first= or |given= always follows and is separated from |last= or |surname= with a comma and a space character. The only way to get cs1|2 to render a person's names in a particular order with particular punctuation is to do it manually with |author=. The same applies to the other name lists (contributor names, editor names, interviewer names, translator names). But none of this has anything to do with indexing.

What do you mean by indexed?

—Trappist the monk (talk) 18:00, 17 September 2020 (UTC)[reply]

I assume that an author name in a citation should be rendered in the way it would be listed in an index, which is what I'm referring to. There are plenty of external guidelines about this, e.g. Chicago Manual of Style 16.76-16.87. Thai names should appear in an index by first/given name. To respond to Izno, simply using given/surname doesn't work for Thai names because the given name is what they should be referred to by, though it comes first. I suppose I could just do author=, but then I would need to add ref={{harvid|first|year}} because short-form citations (which should use only the given name) wouldn't work properly. Calliopejen1 (talk) 18:10, 17 September 2020 (UTC)[reply]

Before electronic indexing this was important. Indeed, citation element order followed the indexing in printed reference works. The primary index often being published main-author-name with publish-date being a secondary index. Today though such reference works are electronic databases with flexible options regarding indexing and sub-indexing (the present discussion). Which makes the positioning of citation elements more of a presentation issue. There is however an existing guideline: present the author name the way you saw it published. Presumably, that would be the easiest way to find it. The parameter |author= fits the bill. 65.88.88.69 (talk) 18:38, 17 September 2020 (UTC)[reply]

I agree that it is a presentation issue, but I don't think that presentation is unimportant. For example, I wouldn't want us to be using the wrong part of the name in short-form citations because {{harvnb}} links to "last"/"surname" by default. That would be akin to doing a short-form citation with "Melissa" or "Jennifer" (i.e. inappropriate). And highlighting the wrong portion of the name through inversion is also odd, as is alphabetizing a work in the wrong place in a works cited list. I do think that "author" combined with ref= is probably the way to go. I'm not sure if any other cultures have this particular issue that can't be sorted out by doing given/surname. Possibly it's unique to Thai names.... Calliopejen1 (talk) 18:48, 17 September 2020 (UTC)[reply]

...existing guideline: present the author name the way you saw it published. Is there? Where?

—Trappist the monk (talk) 18:46, 17 September 2020 (UTC)[reply]

It is in the same page where it is said that titles should render as published. We are not allowed to be creative with most citation elements if we want verification to be as easy as possible. There are presentation options with dates for example (within the given dating system). But when one is trying to present a date in a foreign system, it is better to do so verbatim. 65.88.88.69 (talk) 19:23, 17 September 2020 (UTC)[reply]

What page is this, out of curiosity? Also interested in the dating issue -- should we be giving Thai solar calendar dates for Thai sources? That seems pretty unhelpful to readers, who may want to know at a glance what year a work was published (i.e. is it an up-to-date source or not?). I checked two Thai works on Worldcat, and one had no date, while another had a Gregorian date. I assume the dates in Thai library catalogs are the usual Thai solar calendar dates though... Calliopejen1 (talk) 19:34, 17 September 2020 (UTC)[reply]

I was referring to the general guidelines re: verification. It was not my intent to be mysterious or snarky, and hopefully it will not be seen so. The question the way I understand it, is how to present foreign terms to an English-speaking audience for purposes of verification. Doesn't this answer itself? The technicalities of implementation (the parameter "author", custom short reference anchors etc) will then present themselves in the discussion. 65.88.88.69 (talk) 20:00, 17 September 2020 (UTC)[reply]

I don't have access to the on-line CMOS but a cursory look-through of this copy of "Indexes" 15th edition (different chapter number but apparently same title) seems to indicate that "Indexes" is about indexes, not about citation style. But, yeah, if the effect you are wanting to achieve is given name followed by surname and linkable from a short-form template, then |author=<given> <surname> and |ref={{sfnref|<given>|<year>}} will do that. You might want to leave a hidden comment so that editors who visit the article after you have finished with it know your intent.

—Trappist the monk (talk) 18:46, 17 September 2020 (UTC)[reply]

I agree it is about indexes. But where we have works cited lists, I assume we want them alphabetized in the same way/order they would appear in an index, no? Isn't that implicit in our inversion of first/last names? Calliopejen1 (talk) 18:49, 17 September 2020 (UTC)[reply]

Yeah, generally, per WP:CITE we sort by surname – that guideline seems to be mute on the topic of non-western name order. But, this is Wikipedia; I have seen (western) given-name-first reference lists sorted by surname. Why would anyone do that? I don't know, but, as long as it is consistent in the article, WP:CITEVAR protects that style.

The topic of non-western-name-order comes up here periodically. We just haven't determined how-best to deal with it. It is complicated because transliterations of Chinese and Japanese names are apparently not reversible – it is possible to transliterate a name to Latin script but not possible to transliterate back to the original – so 'properly' supporting these kinds of names is more than just rendering the transliterated names without the inversion indicator (comma).

—Trappist the monk (talk) 19:15, 17 September 2020 (UTC)[reply]

Bump PMC to 8000000

PMC 7528258 is valid, but gets reported as an error. Headbomb {t · c · p · b} 18:42, 2 October 2020 (UTC)[reply]

bumped.

{{cite book/new |title=Title |pmc=7528258}} → Title. PMC 7528258.

—Trappist the monk (talk) 19:46, 2 October 2020 (UTC)[reply]

Is the rate by which this increases predictable with reasonable certainty so that we could automatically increase the upper limit depending on the current date somehow? If so, the maintenance rate for these limits could be reduced significantly. Not that this would be much of a problem right now, but it requires monitoring. Let's think a couple of years into the future when we might no longer be around here any more - it's always better if things are set up in a way that does not need any or only very few updates.

--Matthiaspaul (talk) 21:51, 2 October 2020 (UTC)[reply]
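To make the date-dependent idea concrete, here is a minimal sketch of a linear extrapolation. It is only an illustration: `projected_limit`, its parameters, and the growth rate used in the test values are hypothetical and not part of the cs1|2 modules, and the real growth of PMC identifiers has not been measured here.

```python
from datetime import date


def projected_limit(base_id: int, base_date: date, ids_per_day: int,
                    today: date, margin_days: int = 90) -> int:
    """Extrapolate an upper limit linearly from a known id and date.

    base_id / base_date: an identifier known to be valid and the day it was seen.
    ids_per_day: assumed (hypothetical) growth rate of the identifier space.
    margin_days: extra headroom so that the limit stays ahead of new ids.
    """
    # Never extrapolate backwards if the clock is before the base date.
    elapsed = max((today - base_date).days, 0)
    return base_id + ids_per_day * (elapsed + margin_days)
```

Such a scheme would still need someone to check now and then that the assumed rate is roughly right, because nothing in it corrects the rate by itself.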

PMID limit

At Special:Permalink/982911547#PMID error, Nixinova was concerned that PMID 33022132 was outside the range specified at Help:CS1 errors#bad_pmid. This turns out not to be the case, as the limit specified there is 33100000. However, it's awfully close, which led me to investigate it.

#1426 @ 2020-10-10: last id 33038074
#1423 @ 2020-10-07: last id 33026741
- 33038074 - 33026741 = 11333 ids / 3 days = 3778 ids/day
#1334 @ 2020-09-11: last id 32915410
- 33038074 - 32915410 = 122664 ids / 29 days = 4230 ids/day
#1100 @ 2020-03-01: last id 32113198
- 33038074 - 32113198 = 924876 ids / 223 days = 4147 ids/day

The PMIDs appear to be assigned sequentially and are documented to "not be re-used". Based on the highest numbers found in several daily files here, the rate is roughly 4000 per day. The latest PMID as of the 2020-10-10 file is 33038074, which means it will hit 33100000 in less than 16 days. Was there a reason for the (strangely specific) 33100000 limit, should it be increased (soon), and to what? —[AlanM1 (talk)]— 15:25, 11 October 2020 (UTC)[reply]
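The arithmetic above can be reproduced with a short script. This is only a sketch using the file dates and last ids quoted in this comment; the variable and function names are made up for the illustration.

```python
from datetime import date

# (file date, highest PMID in that file), as quoted in the comment above
samples = [
    (date(2020, 3, 1), 32113198),
    (date(2020, 9, 11), 32915410),
    (date(2020, 10, 7), 33026741),
]
latest_date, latest_id = date(2020, 10, 10), 33038074
limit = 33100000  # limit documented at Help:CS1 errors at the time


def ids_per_day(old_date, old_id, new_date, new_id):
    """Average number of new ids per day between two daily files."""
    return (new_id - old_id) / (new_date - old_date).days


# Rate measured from each older file up to the latest one.
rates = [ids_per_day(d, i, latest_date, latest_id) for d, i in samples]

# Days until the limit is reached at the rounded rate of 4000 ids/day.
days_left = (limit - latest_id) / 4000
```

The three rates round to 4147, 4230 and 3778 ids per day, and `days_left` comes out below 16.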

I see Trappist the monk has been maintaining Module:Citation/CS1/Configuration. —[AlanM1 (talk)]— 15:30, 11 October 2020 (UTC)[reply]

I picked 33100000 just to clear the error. The limit exists to catch simple typos: too many digits, most significant digits out of bounds. Alas, we can't catch too-few-digits or typos that produce in-bounds results... cs1|2 can't do much more to protect editors from these kinds of mistakes. The limit should be sufficiently tight that we catch typos but not so tight that we overrun the limit every few days.

We might set the limit at 33500000 which, at 4k/day, will last us 100+ days. Elsewhere on this talk page it is suggested that we automatically increment the limits for the various identifiers. I don't particularly like that as a solution because there is no way to automatically close the loop to reduce or increase the limit-deltas as conditions warrant.

—Trappist the monk (talk) 16:24, 11 October 2020 (UTC)[reply]
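The kind of check described above can be sketched as follows. This is not the module's actual code (which is Lua); `pmid_looks_valid` is a hypothetical helper that only shows which typos an upper limit can and cannot catch.

```python
def pmid_looks_valid(value: str, limit: int = 33500000) -> bool:
    """Return True if value is a plain positive integer no larger than limit."""
    # Reject anything that is not made of ASCII digits only.
    if not (value.isascii() and value.isdigit()):
        return False
    return 1 <= int(value) <= limit
```

With a limit of 33500000, "330221320" (one digit too many) and "93022132" (leading digit out of bounds) are rejected, while "3302213" (one digit too few) still passes, which is the weakness mentioned above.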

Not without some arbitrary number like we have today, of course. --Izno (talk) 18:04, 11 October 2020 (UTC)[reply]

If someone has a general purpose bot, perhaps a job could be added to it, to be run monthly. It could retrieve the latest XML file from the FTP link above, find the highest PMID value, add 120,000 (30 days' use), round up to the next 100,000, and update the id_handlers['PMID'].id_limit value in the config file. Or someone could do it manually. While I do have a couple of things I do monthly manually, I don't have a foolproof system in place to ensure things get done and it would seem like this is too important for my casual approach.

Are there other values here that can/should be updated, too? —[AlanM1 (talk)]— 06:05, 12 October 2020 (UTC)[reply]
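The monthly calculation suggested above could look roughly like this. It is a sketch only: `next_limit` is a hypothetical name, and fetching the XML file and editing the configuration module are left out.

```python
def next_limit(highest_id: int, headroom: int = 120_000,
               step: int = 100_000) -> int:
    """Add headroom to the highest id seen, then round up to a multiple of step."""
    target = highest_id + headroom
    # Ceiling division without floats: -(-a // b) rounds a / b upwards.
    return -(-target // step) * step
```

For the highest id in the 2020-10-10 file, `next_limit(33038074)` gives 33200000.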

We could also define a bot task to scan for the highest identifier value used in an article while performing other tasks (or have a bot continuously loop over all articles), check this value against a value recorded in a new "/Limits" sub-page of the citation template, and update that value if the found value is higher. This sub-page would have to be unprotected to be easily accessible by bots and editors. The citation template could read this value and compare it against the value specified in its "/Configuration" module (which is protected), take the higher value, add some safety margin to it, and treat the result as the allowed upper limit in citations (with or without some extrapolation facility). Many variants of this are possible.

Using this approach would make it possible to more frequently update the limits while still ensuring that at least all values below the value specified in "/Configuration" are treated as valid. The limits in "/Configuration" would be updated whenever the template gets updated. By specifying a much too high value in "/Limits" vandals could temporarily disable the upper limit check but they could not cause the template to use much too low values in an attempt to invalidate (older) values in citations.

--Matthiaspaul (talk) 21:58, 12 October 2020 (UTC)[reply]

At least in theory Wikidata could also be used to retrieve some useful information instead or in addition to something like "/Limits": PMID (P698) has a property "number of records" P4876.

30060294 @ 2019-08-01
30178674 @ 2019-11-19

However, the info there is outdated.

The "number of records" is also defined for DOIs (P356) and JSTORs (P888); similarly outdated.

--Matthiaspaul (talk) 19:24, 14 October 2020 (UTC) (updated 13:25, 15 November 2020 (UTC))[reply]

Just to illustrate this a bit more, the unprotected "/Limits" subpage to be regularly kept up to date by bots or editors could be in a simple CSV format like:

pmc-limit=8000000,pmid-limit=33200000,ssrn-limit=4000000,s2cid-limit=230000000,oclc-limit=9999999999,osti-limit=23000000,rfc-limit=9000

The template would attempt to read this file and if present, check the identifier against either the internally defined limit or the limit defined in this file, depending on which one is larger.

Whenever the template would be scheduled to be updated, the internally defined limits would be updated to those from the "/Limits" file plus some margin.

Depending on the amount of overhead allowed the format of the "/Limits" file could also be Lua source code instead of CSV.

--Matthiaspaul (talk) 23:54, 5 November 2020 (UTC) (updated 10:54, 10 November 2020 (UTC), 23:39, 13 November 2020 (UTC))[reply]
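As a rough illustration of the "/Limits" idea, the following sketch parses a line in the CSV-like format shown above and picks the larger of the internal and the sub-page limit. The function names and the `configured` dictionary are hypothetical; a real implementation would live in the Lua module.

```python
def parse_limits(text: str) -> dict:
    """Parse 'name=value,name=value,...' into a dict of integer limits.

    Malformed or non-numeric fields are skipped rather than trusted,
    since the sub-page would be unprotected.
    """
    limits = {}
    for field in text.strip().split(","):
        key, sep, value = field.partition("=")
        key, value = key.strip(), value.strip()
        if sep and key and value.isascii() and value.isdigit():
            limits[key] = int(value)
    return limits


def effective_limit(name: str, configured: dict, page_limits: dict) -> int:
    """Use whichever is larger: the protected internal limit or the sub-page one."""
    return max(configured[name], page_limits.get(name, 0))
```

Because only the larger value is used, a too-low entry on the sub-page cannot invalidate identifiers that the protected configuration already accepts.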

That was here: Help_talk:Citation_Style_1#Bump_PMC_to_8000000.

This "auto-increment" would still require monitoring/updates/adjustments of the limits and factors, but less frequently.

--Matthiaspaul (talk) 21:58, 12 October 2020 (UTC)[reply]

Sounds more complicated and error-prone than using the latest XML change file at PubMed for the max value and adding enough headroom to get past the next anticipated run. I don't think it should try to be exact, since new IDs are constantly being assigned and the latest articles may not be cited for some time. The new increment could even be re-calculated on each run based on the current and previous months' max values and file dates, plus a fudge factor based on some stats I can get from the variance in the current history file set. —[AlanM1 (talk)]— 00:16, 13 October 2020 (UTC)[reply]

Yes, for as long as such an XML file exists as an external resource, but this does not seem to be the case for all identifiers which need to be bumped up frequently. --Matthiaspaul (talk) 00:58, 13 October 2020 (UTC)[reply]

Another approach would be to allow users to temporarily enter "too high" values using the accept-this-as-written markup, this would put them into special maintenance categories similar to invalid ISBNs, etc. (This could be implemented with minimal overhead.)

If bots would run into this markup in the |pmc=, |pmid=, |ssrn= or |s2cid= parameters, they would retrieve the currently configured limit for an identifier through

{{#invoke:Cs1 documentation support|id_limits_get|<identifier>}}

like

Current PMC limit: 11300000
Current PMID limit: 39400000
Current SSRN limit: 4900000
Current S2CID limit: 270000000
Current OCLC limit: 10310000000 (will show after the next template update)
Current OSTI limit: 23010000 (will show after the next template update)
Current RFC limit: 9300 (will show after the next template update)

and compare it against the number specified in the citation. If the limit is larger, they would remove the markup, otherwise leave it as it is. This would have the advantage that the "fix" is trivially easy for editors, and that the templates would not have to read a "/Limits" file. However, bots would have to edit the citations.

Still, the bots should record the highest found numbers in some prominent place (for example in a "/Limits" file), so that the internally defined limits can be easily updated accordingly when a template update is scheduled. Otherwise, someone would have to manually go through the maintenance category to determine the new limits.

--Matthiaspaul (talk) 23:54, 5 November 2020 (UTC) (updated 10:54, 10 November 2020 (UTC), 23:39, 13 November 2020 (UTC))[reply]
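The bot side of this approach can be sketched as below. Assumptions: the accept-this-as-written markup is taken to be double parentheses around the value, an identifier equal to the limit is treated as within it, and `bot_cleanup` is a hypothetical helper, not an existing bot function.

```python
import re

# Assumed form of the accept-this-as-written markup: ((digits))
ACCEPT = re.compile(r"^\(\((\d+)\)\)$")


def bot_cleanup(value: str, limit: int):
    """Return (new parameter value, identifier found in markup or None).

    The markup is removed only when the configured limit already covers
    the identifier; otherwise the value is left exactly as written.
    """
    match = ACCEPT.match(value.strip())
    if not match:
        return value, None          # no markup: nothing for the bot to do
    number = int(match.group(1))
    if number <= limit:
        return match.group(1), number   # limit has caught up: drop the markup
    return value, number                # still too high: keep the markup
```

The second element of the result is what a bot could record as the highest number found, so that the internal limits can be raised at the next template update.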

Error category names standardization

Could the error categories in the next version sync be standardized? Out of the 55 categories in Category:CS1 errors, 44 start with "CS1 errors". These are the ones that use a different style:

--Gonnym (talk) 15:11, 22 October 2020 (UTC)[reply]

See Help talk:Citation Style 1/Archive 71 § error category names standardization and the top of Module:Citation/CS1/Configuration/sandbox

—Trappist the monk (talk) 15:17, 22 October 2020 (UTC)[reply]

In case that response is unclear, the sandbox version of the module has been updated to standardize the above category names (follow the Archive 71 link to see the new names). They will be updated the next time the sandboxes are copied to the live module (typically every couple of months). – Jonesey95 (talk) 15:59, 22 October 2020 (UTC)[reply]

Thanks both for the link (and Jonesey for saving me time reading that). --Gonnym (talk) 16:22, 22 October 2020 (UTC)[reply]

Since it's already archived, I'll comment here. I didn't see these 3 mentioned at the discussion. All sub-categories of Category:CS1 properties which use a colon: Category:CS1: long volume value, Category:CS1: Julian–Gregorian uncertainty and Category:CS1: abbreviated year range. --Gonnym (talk) 18:06, 23 October 2020 (UTC)[reply]

I think that it should be the other way 'round: all Category:CS1 properties cats should have a colon after the 'CS1' prefix just as all error and maintenance categories with the 'CS1 errors' and 'CS1 maint' prefixes have a colon. I don't know if it is really necessary but, we could go further and use 'CS1 prop' prefixes.

—Trappist the monk (talk) 18:51, 23 October 2020 (UTC)[reply]

And Category:CS1 has been listed for renaming to Category:Citation Style 1; see Wikipedia:Categories for discussion/Speedy § Current requests.

—Trappist the monk (talk) 19:27, 23 October 2020 (UTC)[reply]

Not particularly important, but if we are going to rename / streamline the CS1 category names anyway, perhaps we should also change

"maint" -> "maintenance"

in the category names. The rationale would be to avoid unnecessary abbreviations. Space is not an issue here. "Maint" is non-standard developer jargon, therefore pretty obvious for us. But I'm not sure if uninvolved readers (our target audience) will guess its meaning equally easily. --Matthiaspaul (talk) 14:22, 26 October 2020 (UTC)[reply]

It is most definitely an issue for people who use Timeless where the categories end up in the sidebar through no fault of their own ;). Uninvolved readers can't see the category on each page anyway since it is hidden (like CS1 errors for that matter), much less the maintenance message itself, so the only other place they might stumble upon the category name is the category page itself (which provides sufficient context) or the context of discussions about the categories, like this one (which also provides sufficient context). --Izno (talk) 15:25, 26 October 2020 (UTC)[reply]

I don't understand this argument. So what if categories are in a side bar? Here is an article using timeless skin that has three hidden categories. All are visible to me (I presume because I have enabled hidden category display in my preferences). All of those category names are readable. The maintenance messaging must be turned on by interested editors but our choice of category names has no bearing on that. So what is it that you are really complaining about?

—Trappist the monk (talk) 13:42, 28 October 2020 (UTC)[reply]

I did not assert that I could not see them in Timeless. I did not assert clearly anything by what I did say, in fact... To make it clear now, I do not want longer category names because they will not wrap cleanly and/or will make an already often-long sidebar on the right much longer for no obvious gain. I honestly don't want longer message names either (as CS1 maintenance: is longer than CS1 maint:), which has a similar, though of lesser nature, concern associated with how long the word is.

I did argue that how long or what is in the category name is immaterial to the casual reader who cannot see the categories in any location whatsoever (c.f. But I'm not sure if uninvolved readers (our target audience) will guess its meaning equally easy. by Matthias). Someone who can't see the category listing on a specific page won't care how long or what the names are, which means that only the following groups are of interest: a) casual people who have somehow navigated to the category page are there by happenstance and are provided an explanation in the rest of the page; b) casual people who see a discussion on a page like this one, in which they are provided sufficient context; and c) people who are not casual and have turned on hidden categories will need to learn what is going on, but that also is made obvious by the content of each named category. And then, those who see the structure once can probably figure out what is going on from thereon. (Do I presume too much?) In all cases regardless, someone can click and see what is on the category page and will see "maintenance" in some form or another on each.

I assert that the reason our error messages don't have CS1 in them is that abbreviation is just as much technobabble as the asserted shortening of "maintenance" is....

As for 'properties', I do think those should at the least be consistently at CS1:. I honestly haven't decided whether I like "props" or "properties" more, though you might tell which I lean toward. --Izno (talk) 21:15, 28 October 2020 (UTC)[reply]

Please, no, "props" is really cancer to the eyes.

In general, as Wikipedia is for readers, I wonder if we should support grammatical nonsense such as "maint" at all. If Timeless can't cope with "maintenance" in category names well, it will have problems with longer-than-average words and titles in general, that is, it is an issue that occurs all over the place. If so, it is a problem of the skin, not the contents, and consequently should be addressed at skin level, not by adjusting the contents. Looks very unprofessional to me.

--Matthiaspaul (talk) 10:24, 29 October 2020 (UTC)[reply]

You apparently continue to ignore what I said. Readers. Can't. See. These. Categories.

I happen to agree that the skin is doing something dumb here, but saying we must do X because of Y reasons and then ignoring the other Z reasons that we really don't need to do X isn't cool. "Looks very unprofessional to me." --Izno (talk) 00:50, 1 November 2020 (UTC)[reply]

I wonder how you come to that conclusion. I read your reasoning and value it as any constructive input into the discussion, but it didn't convince me much, in particular because addressing this in our narrow context by using abbreviated category names won't solve the problem anywhere else, so it's clear that the fix for this must be elsewhere. Also, while I originally wrote "uninvolved readers", editors and developers are readers as well. Since you more or less suggested to change the category names to "props" I provided my opinion on this.

Y and Z must be weighted. I think that, in general, skin issues should be solved on skin level. I would support special-casing something to improve the appearance in Timeless (although I don't know what that could be in this particular case). The extent of this support would stop where it would weaken the general appearance in other skins (including the default Vector skin).

Perhaps the solution would be to change the thresholds when categories move from the bottom of a rendered page into the sidebar and/or to blend in the categories only after pressing a special button, I don't know. (I don't use Timeless and given the many issues you reported with this skin in the past it does not appear to be very desirable to use.)

--Matthiaspaul (talk) 23:50, 1 November 2020 (UTC)[reply]

I'll leave the Timeless redesign discussion aside save to say that no, the (quantity of) issues I have reported do not reflect my happiness with the skin.

I'm not sure how much more productive this line of discussion between you and me will be, so I will leave that there also. Perhaps another editor or two will appear to discuss/give input. I will suggest something a little more off-the-wall down the way to see if that fits anyone's tastes. --Izno (talk) 04:15, 2 November 2020 (UTC)[reply]

From there, I think it's just a question whether we shall catch headgoblins. --Izno (talk) 21:19, 28 October 2020 (UTC)[reply]

This discussion has meanwhile been moved to Category_talk:CS1#Opposed speedy move request.

--Matthiaspaul (talk) 21:26, 3 November 2020 (UTC)[reply]

'CS1 maint:' → 'CS1 maintenance:' and properties: 'CS1' (with and without colon) → 'CS1 properties:'. See Module:Citation/CS1/doc/Category list. —Trappist the monk (talk) 15:30, 26 October 2020 (UTC)[reply]

And ... I've been reverted by some mechanism that doesn't notify editors that a revert has taken place ~~and which also reverted an unrelated edit~~ to Module:Citation/CS1/Configuration/sandbox.

—Trappist the monk (talk) ~~13:16, 28 October 2020 (UTC)~~ 13:20, 28 October 2020 (UTC)[reply]

Because of that unrelated edit, MediaWiki indicated there was an intervening edit that could not be reverted with 'undo', I needed to perform a manual revert i.e. opened the previous good version, copy-pasted the uncontested change in from the current version, and saved. (In retrospect, I suppose I could have undone all of the edits rather than the two I would have preferred to revert with 'undo', then reinserted the uncontested change in the one edit pane.) --Izno (talk) 21:15, 28 October 2020 (UTC)[reply]

Or not revert at all. --Matthiaspaul (talk) 10:24, 29 October 2020 (UTC)[reply]

"Or not revert at all." This is a wiki. Get off that dumb horse there. Moreover, it isn't cool to deliberately misinterpret what I said to mean "I had to revert". You know that was not the intention. These modules are used on a couple million pages. Consensus is required for change. A revert makes it obvious you don't have consensus. --Izno (talk) 00:50, 1 November 2020 (UTC)[reply]

Izno, I questioned neither the technically necessary steps of your reversion nor your good intentions in general. Likewise, I can rule out any deliberate misinterpretation on my side. My remark was meant as a friendly reminder that reversion of perfectly good-faith contributions should remain restricted to cases where no other options for improvement exist. An occasional revert hardly harms, but frequent reversions do. Trappist edited the sandbox, not the live template. The sandboxed modules are not used on millions of pages. The sandbox is, by its very definition, open for experimentation by anyone, and while it makes sense not to wait until the next update to clean up edits for which there is no consensus also in the sandbox (at least for as long as we don't have a separate release stage area), there also is no reason to revert changes almost immediately in the middle of the discussion just because you don't agree with them.

--Matthiaspaul (talk) 23:50, 1 November 2020 (UTC)[reply]

By the same token, on Wikipedia, quick reverts save everyone time untangling good edits from bad. This is no less true in a community sandbox (that I personally treat as requiring consensus) than the module which is the live representation of that sandbox. You particularly, in the last module release, conflated many reasonable edits with many edits that I would have personally preferred not to have seen added to the module, but I could not easily revert them and gave you the benefit of the doubt that you would announce those changes proactively (like Trappist and I have done when we make changes to the sandbox).... 'Twas not to be. (I still don't understand a few of the changes that were made, and that disturbs me both from the user-perspective and the rationale perspective.) Instead, I will revert now, skip being worried that unnecessary complexity has been added, and ensure that what is in the sandbox is something that could be deployed tomorrow with the appropriate consensus (if we were interested in doing so).

If you (or anyone else) would like to "experiment" (rather than announce and propose actual changes), then, like the main page sandbox suggests for that area (see edit notice), your own personal copy of the modules might be preferable for experimentation or redesign. (Were it the case we could fork more easily... maybe there is a Javascript writer who could do that for us. :^) --Izno (talk) 04:15, 2 November 2020 (UTC)[reply]

Different suggestion: We should consider removing all CS1 'subtypes' from the category names, meaning that "CS1 errors:", "CS1 maint:", and "CS1(:)" would all become "CS1:", and then only the parent category name would need to match the category of message/property. This would have the side-benefit that maintenance messages promoted to errors, or properties to maintenance/errors, would not need to have their category name changed when promoted. --Izno (talk) 04:15, 2 November 2020 (UTC)[reply]

Meh. Now that the sandbox has been altered to normalize the category names, certainly the prefixes can be removed from the category names in Module:Citation/CS1/Configuration/sandbox error_conditions{}. We can then apply the prefixes in Module:Citation/CS1/sandbox where we actually create the category wikilink. Same side benefit and we retain the prefixes which I think we should do. I suspect that it isn't possible to apply some sort of css trick to category wikilinks so that individual editors could hide the prefixes in the hidden categories list...

We should not forget the non-English wikis of various types that also use these modules and the associated categories; for them, the prefixes may (or maybe not) be important.

—Trappist the monk (talk) 14:10, 2 November 2020 (UTC)[reply]

Indeed, but they are also free to customize as they see fit; they are not beholden to precisely the same naming.

As for prefix removal to elsewhere if such occurs, my understanding is that it is both harder for i18n and more expensive for Lua to process two strings like that in separate places. Where necessary we should perform string manipulation, but I do not think this would be a place were we to go down that path. --Izno (talk) 15:00, 2 November 2020 (UTC)[reply]

Agreed, other wikis are not obliged to follow our lead. For those that do, fully spelled-out names may be important.

Of course. Any work that the module has to do costs time and processor resources. We already concatenate Category: with <category name> which we then hand off to utilities.make_wikilink() where we concatenate [[, the prefixed category name, and ]] to make the final result. Changing utilities.make_wikilink ('Category:' .. v) to utilities.make_wikilink ('Category:' .. cfg.presentation['<category prefix>'] .. v) isn't much extra work.

Maybe better would be to write something like:

utilities.substitute (cfg.messages['cat wikilink'], {cfg.messages['cat err prefix'], v})

where the messages{} table has:

['cat wikilink'] = '[[Category:$1$2]]',

['cat err prefix'] = 'CS1 errors: ',

['cat maint prefix'] = 'CS1 maintenance: ',

['cat prop prefix'] = 'CS1 properties: ',

For i18n, we should probably do something like that anyway so that other-language wikis don't have to edit Module:Citation/CS1 as well as Module:Citation/CS1/Configuration.

—Trappist the monk (talk) 16:32, 2 November 2020 (UTC)[reply]
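Editorial illustration (not part of the original comments): the substitution idea above can be sketched outside the module. The module itself is Lua; the Python below is only a minimal sketch, assuming that utilities.substitute() replaces positional $1, $2 placeholders with the supplied arguments. The MESSAGES table and the helper names substitute and category_wikilink are hypothetical stand-ins, not the module's code.

```python
import re

# Hypothetical stand-in for the messages{} table; the keys mirror the ones
# proposed in the comment above.
MESSAGES = {
    "cat wikilink": "[[Category:$1$2]]",
    "cat err prefix": "CS1 errors: ",
    "cat maint prefix": "CS1 maintenance: ",
    "cat prop prefix": "CS1 properties: ",
}


def substitute(template: str, args: list) -> str:
    """Replace $1, $2, ... in template with the matching entry of args."""
    def repl(match):
        index = int(match.group(1)) - 1
        # Leave the placeholder untouched when no argument was supplied for it.
        return args[index] if 0 <= index < len(args) else match.group(0)
    return re.sub(r"\$(\d+)", repl, template)


def category_wikilink(prefix_key: str, name: str) -> str:
    """Build a category wikilink from a prefix message key and a bare name."""
    return substitute(MESSAGES["cat wikilink"], [MESSAGES[prefix_key], name])


print(category_wikilink("cat prop prefix", "long volume value"))
```

With this arrangement a wiki that wants different prefixes, or none, would only need to change the message strings, which is the i18n point made above.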

I have added ['cat wikilink'] = '[[Category:$1]]' and a matching [':cat wikilink'] = '[[:Category:$1|link]]' to ~/Configuration/sandbox. To use those I have replaced the calls to utilities.make_wikilink() with utilities.substitute (cfg.messages['cat wikilink'], {v}) where we make category names and the similar call where we make the link in maint error messages.

—Trappist the monk (talk) 18:00, 2 November 2020 (UTC)[reply]

Apparently, even experienced editors don't understand that properties categories are not error categories. I just stumbled on this discussion: Help talk:Citation Style 1/Archive 68 § Bogus long volume which suggests to me that were Category:CS1: long volume value renamed to Category:CS1 properties: long volume value then that discussion might not have been necessary or would have been about something else.

—Trappist the monk (talk) 15:21, 6 November 2020 (UTC)[reply]

polluted categories

Since we're talking about categories... According to this search result there are 311 CS1-prefixed categories that use {{polluted category}}. That template is a do-nothing template that is supposed to identify categories that hold items that exist in main-space and items that exist in user-space so that these categories are excluded from Wikipedia:Database reports/Polluted categories which is apparently moribund. Regardless, {{polluted category}} brings no benefit to the cs1-prefixed categories because Module:Citation/CS1 is the only way that anything should be listed in these categories and because Module:Citation/CS1 excludes, among others, the user and user talk namespaces when it creates category wikilinks.

So, without objection, I shall remove {{polluted category}} from all of the CS1-prefixed categories.

—Trappist the monk (talk) 19:50, 26 October 2020 (UTC)[reply]

There having been no objection, done.

—Trappist the monk (talk) 15:27, 28 October 2020 (UTC)[reply]

Lowhanging fruit for no-dash versions

I noticed a few parameters that have 0 uses in the wild and I don't believe we need the synonym in those cases: |seriesnumber=, |event-format=, |event-url=, and |eventurl= in {{cite conference}}, which has |series-number=, |conference-format=, and |conference-url=. These were removed.

I also saw a curiosity. |event= is currently a synonym for |conference= in /Configuration. However, it is apparently used in {{cite speech}} (see particularly Albert Einstein as an example from the search). Any opinions on whether that is an issue? --Izno (talk) 16:48, 22 October 2020 (UTC)[reply]

Apparently |event= is intended to name the location, venue, whatever where the speech was given. For the purposes of citing a speech, I don't think that where the speech was given has any value because someone attending the speech can't cite it and expect that readers can verify what the speaker said. Editors will likely be citing a written copy of the speech from a book, a magazine, a journal; or citing an audio, video, film recording of the speech. We have sufficient templates for those kinds of citations so I guess I have to wonder if there is any real purpose to keeping {{cite speech}}.

—Trappist the monk (talk) 19:21, 22 October 2020 (UTC)[reply]

Agreed. "Speech" can be a |type= in the templates you mention. 64.18.9.203 (talk) 12:06, 23 October 2020 (UTC)[reply]

I generally agree. Dealing with a template however is a question for WP:TFD. --Izno (talk) 22:36, 23 October 2020 (UTC)[reply]

A parameter for open content licenses (CC BY) and automatic filling/parsing via reFill and Autofill and/or a bot

Could you add a parameter to indicate open content licenses of studies? Especially (or only) Wikimedia-compatible ones and mainly CC BY 4.0.

Such tags would have many advantages for readers and editors − for instance, they can indicate that the source may have relevant freely licensed images which could be used by the reader or be uploaded (and possibly added to the article) by an editor.

It could work similar to the |doi-access=free parameter and would complement it. In particular this parameter is not about access to the (full-text of the) reference/paper but about the license of the content (in particular whether or not it's an open/compatible license and if so which).

It would be best if this parameter was set automatically by the Autofill tool (the magnify icon in the RefToolbar) and reFill. It could also be set by a bot similar to User:OAbot or even that same bot. Here's an example of one of the bot's changes. However, the parameter could be added to the template before any of these is implemented.

The visual display should include the CC BY 4.0 (or similar) logo, similar to the icon that is displayed for |doi-access=free, so that it's quickly and clearly visible that the respective study is licensed that way. The respective reference could then look like this:

Kawaguchi, Yuko; et al. (26 August 2020). "DNA Damage and Survival Time Course of Deinococcal Cell Pellets During 3 Years of Exposure to Outer Space". Frontiers in Microbiology. 11. doi:10.3389/fmicb.2020.02050. S2CID 221300151. Text and images are available under a Creative Commons Attribution 4.0 International License.

(I'm currently adding it manually as in the above example to references at 2020 in science, which also helps in my, and possibly at some point others', efforts to upload relevant images from these studies to Commons. The above example is from that page.)

--Prototyperspective (talk) 13:49, 24 October 2020 (UTC)[reply]

This is a very bad idea - references should be chosen because they are the best, most suitable and reliable sources for the article, not whether they are released under a convenient license. This proposal would merely reinforce FUTON bias. Nigel Ish (talk) 14:03, 24 October 2020 (UTC)[reply]

This is not about which references to choose at all.

You could argue against pretty much all the other existing parameters like this; it's irrelevant to this proposal. --Prototyperspective (talk) 14:39, 24 October 2020 (UTC)[reply]

No. The purpose of a citation is to identify the source that editors consulted. Licensing of that source doesn't aid the reader in locating the source. Similar proposals have been rejected here before. You might want to troll through the archives of this talk page for those discussions.

—Trappist the monk (talk) 14:49, 24 October 2020 (UTC)[reply]

This is not about the selection of which reference to use at all. Agree on "The purpose of a citation is to identify the source that editors consulted."

You could argue against pretty much all the other existing parameters (except for the DOI/URL and including the author parameters or the |doi-access= parameter) like this; it's irrelevant to this proposal. --Prototyperspective (talk) 15:08, 24 October 2020 (UTC)[reply]

Nbsp in |author, |last, and equivalents for other contributors

Prior to the last release, the code that looks for multiple names in a single parameter looked for a count of more than 1 of either commas or semicolons. For example, |author=Last, First, Jr. or something like |last=Last; Last2; Last3 (unfortunately not contrived :( ) would have triggered the maintenance message; both still emit a maintenance message today. (I am not sure if a mix of semicolon and comma would have done the same but think one semicolon and one comma would have.)

However, the behavior changed in the last release so that now commas and semicolons are counted separately, and if there are more than 0 semicolons, the module emits the maintenance message.

Due to an error on my part (perhaps the original code also contained the error, I haven't tested), it is now the case that any HTML entity encoding will be identified as needing maintenance. This is most common with the non-breaking space (i.e. &nbsp;), as in the last two cases of test_Mult_names on Module talk:Citation/CS1/testcases/errors. (Perhaps this is why the check was originally at least 1 semicolon, I do not know.) I noticed this because I had been working on the category for authors, which had been hanging around 13k, which is now some 30k pages (and I do not think there were that many semicolons... maybe there were and I have found a hidey hole of cleaning. :)

For a discrete example, a construction like |author=Tolkien, J.&nbsp;R.&nbsp;R. aka |author=Tolkien, J. R. R. (those are non-breaking spaces) emits the message currently.

Tolkien, J. R. R. Title. (JRR would probably have triggered this message before the last release since it has two non-breaking spaces.)

Is it worthwhile supporting HTML entities in |author=/|last=? It will come up in the |author= case most-often as we rarely abbreviate last names (and moreover almost-never have multiple last names to abbreviate), for which a 95% solution can be a conversion to |last= |first= as this check does not occur for |first= (we prefer the use of |last= |first= anyway for best metadata generation). Cases other than &nbsp; can be worked on if they occur, since nbsp is not the only kind of entity that could end up encoded this way in |author= (I am skeptical it would occur in most uses of |last=). By worked on I mean that we can create templates similar to {{ndash}}, or convert to the Unicode representation.

Aside: I don't know if it would be reasonable for the software to be checking |first=; I suspect so given some constructions in the wild I've seen.

Thoughts? --Izno (talk) 20:42, 24 October 2020 (UTC)[reply]
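Editorial illustration (not part of the original comments): the post-release rule described above (more than one comma, or any semicolon) and the resulting false positive on HTML entities can be sketched as follows. This is a Python illustration of the described behaviour only, not the module's Lua code; the function name needs_mult_names_maint is hypothetical.

```python
def needs_mult_names_maint(name: str) -> bool:
    """Flag a name value that looks as if it holds more than one name.

    Mirrors the rule described above: commas and semicolons are counted
    separately; more than one comma or any semicolon triggers the message.
    """
    commas = name.count(",")
    semicolons = name.count(";")
    return commas > 1 or semicolons > 0


print(needs_mult_names_maint("Tolkien, J. R. R."))            # False
print(needs_mult_names_maint("Last, First, Jr."))             # True
print(needs_mult_names_maint("Last; Last2; Last3"))           # True
# The entity's trailing semicolon is counted, hence the false positive:
print(needs_mult_names_maint("Tolkien, J.&nbsp;R.&nbsp;R."))  # True
```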

I would think non-breaking spaces (using any mechanism) may be important in situations where author names are separated by a hyphen? One could argue that some readers could be confused or misunderstand a citation that splits a compound last or first name into a newline. I haven't looked at the code to see how it handles such cases. 98.0.246.242 (talk) 01:31, 25 October 2020 (UTC)[reply]

Not non-breaking spaces, but dashes/hyphens/straight lines in the middle of names, for which we do already have other workarounds. --Izno (talk) 01:38, 25 October 2020 (UTC)[reply]

Right. I mixed up non-breaking with non-wrapping in my previous comment. So now I cannot think of any other use-case for such markup, but who knows. 98.0.246.242 (talk) 02:06, 25 October 2020 (UTC)[reply]

Multiple authors' names should never be separated by any type of hyphen or dash or slash or whatever, if that's what the IP is asking. They should be separated by entering them as separate parameters (or by a comma, but only when using "Vancouver style" in which both periods and spaces are omitted from authors' initials anyway, and therefore moot).

Adding non-breaking spaces to the output for any first-name input which matches (in whole or part) a pattern of multiple consecutive initials (spaced or unspaced) seems like an easy task for regular expressions inside the module. This would be more robust than encouraging or requiring users to include html entities in the input, or even think about doing so.

There are of course certain abbreviations which look like a person's initials but should remain unspaced according to the MOS. It's conceivable that some user might enter something other than a person's name, such as |author=U.S. Treasury but in reality that should be spelled out and moved to |publisher=United States Treasury.

Of course, an even lazier approach might be to just enclose the output for every firstname and every lastname in a span with some class CSS-styled as white-space: nowrap. This would (rather aggressively!) prevent wrapping when name parts contain (a) initials (e.g. |first=J. R. R. or |first=F. Scott), and/or (b) real or implied hyphens (e.g. |first=Mary-Kate or |last=Lloyd Webber), which would otherwise risk being wrapped in the middle of. ―cobaltcigs 18:54, 25 October 2020 (UTC)[reply]
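Editorial illustration (not part of the original comments): the regular-expression idea above, binding consecutive initials with non-breaking spaces on output, might look like the sketch below. It is written in Python rather than the module's Lua, handles only spaced ASCII initials, and the names BETWEEN_INITIALS and bind_initials are hypothetical.

```python
import re

NBSP = "\u00a0"

# A plain space that sits between two single-letter initials, each followed
# by a period. ASCII capitals only; a real implementation would need to
# consider non-ASCII initials and unspaced forms such as "J.R.R.".
BETWEEN_INITIALS = re.compile(r"(?<=\b[A-Z]\.) (?=[A-Z]\.)")


def bind_initials(first: str) -> str:
    """Replace the spaces between consecutive initials with U+00A0."""
    return BETWEEN_INITIALS.sub(NBSP, first)


print(repr(bind_initials("J. R. R.")))   # 'J.\xa0R.\xa0R.'
print(repr(bind_initials("F. Scott")))   # 'F. Scott' (only one initial)
```

Because the substitution happens at render time, editors would not need to type any entity in the parameter value, and the metadata could still be built from the unmodified input.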

Please do not do that last, "aggressive" proposal. There are too many cases like

|author=U.S. Select Foobarian Subcommittee of the International Committee of Bazquuxians for Global Widgetization-Dingusification Standards

(where |publisher= has another long-winded thing that is the parent organization name[s]). — SMcCandlish ☏ ¢ 😼 22:15, 26 October 2020 (UTC)[reply]

Only names divided into a |first= and |last= part would be affected. Between two nowrap spans would be a comma and a regular space, where wrapping would be permitted. Barring bad input, that should only be individual human authors. And not even all of them. ―cobaltcigs 01:29, 1 November 2020 (UTC)[reply]

|author=, used regularly for organizational authors, is a synonym for |last=. Nowrapping the output for that parameter is accordingly a no-go due to regular column size constraints. (I have said a few times now that we should have a |org-author=, but alas, it has not happened.) --Izno (talk) 01:36, 1 November 2020 (UTC)[reply]

In the long run, it might be necessary to decouple |author= from |last= to improve support for foreign names, anyway. But I agree that right now it would not be worth the trouble to change this just to improve some wrapping behaviour. Given that wrapping in the middle of initials looks particularly odd, adding automatic "anti-wrapping" to |first= could already improve the display of names somewhat.

--Matthiaspaul (talk) 02:02, 6 November 2020 (UTC)[reply]

As for the template behavior, it would be nice if it permitted &nbsp; and other entities, and excluded any &...; pattern from its counts of semicolons while trying to detect improper input. Now that I'm migrating back to Windows I'm remembering what a hassle it is to get various special characters inserted, though I think I will buy PopChar for Windows and hope that it works as well as the Mac one (esp. compared to Windows Character Map). Even if we wish people would always use the composed Unicode character, we know that they will not. And &nbsp; is actually desirable, since no one can visually tell the difference between a regular space and a non-breaking one otherwise. — SMcCandlish ☏ ¢ 😼 22:15, 26 October 2020 (UTC)[reply]

Right, we don't want to use the invisible character. We have a separate test for the invisible control characters that will emit an error. Invisible very badddd.

However, as I said earlier, we can encourage the use of {{nbsp}} to meet this user desire. Or encourage the use of normal spaces. (Now that I look, there is the caveat at WP:NBSP regarding the use of the template with links. Maybe that is sufficient reason to support it until there is some consensus about whether non-breaking behavior is actually desirable in the citation templates.)

I do not think there is a general way to allow all HTML entities. We would need to add and check against some published list (perhaps of the most common), which seems like overkill for most, since most others (maybe all-of-interest) have a visible alternative.

Finally, though I disagree with permitting &nbsp;, I've tweaked the module to discount these markers. You can see the output for yourself at test_Mult_names in Module talk:Citation/CS1/testcases/errors, where the two affected test cases are orange as not-matching (because we just compare against the live version rather than the preferred output of course ;). --Izno (talk) 05:45, 2 November 2020 (UTC)[reply]
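Editorial illustration (not part of the original comments): one way to discount entities before counting separators, following the "&...;" suggestion made earlier in this thread. Exactly which markers the sandbox tweak discounts is not stated here, so the pattern below is an assumption; the sketch is Python, not the module's Lua, and ENTITY and count_separators are hypothetical names.

```python
import re

# Named (&nbsp;), decimal (&#160;) and hexadecimal (&#xA0;) character
# references. Assumption: anything of this shape is treated as an entity.
ENTITY = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")


def count_separators(name: str):
    """Return (commas, semicolons) after removing entity-shaped markup."""
    stripped = ENTITY.sub("", name)
    return stripped.count(","), stripped.count(";")


print(count_separators("Tolkien, J.&nbsp;R.&nbsp;R."))  # (1, 0)
print(count_separators("Last; Last2; Last3"))           # (0, 2)
```

A shape-based pattern has a known weakness: a value such as "AT&T; Bell" contains something that looks like an entity, so its semicolon would be discounted as well. Checking against a list of known entities avoids that at the cost of maintaining the list.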

I guess I don't think that we should support the use of &nbsp; in the namelists. We have noted at Template:Citation Style documentation#COinS that html entities should not be used in parameters that contribute to the citation's metadata; we should not allow something on the one hand but disallow it on the other hand. {{nbsp}} is not appropriate for use in cs1|2 templates because it will cause the inclusion of all of this in the name's metadata:

<span class="nowrap">&nbsp;</span>

cs1|2 wraps some or all of the value assigned to |access-date= in <span class="nowrap">...</span> because |access-date= is rendered at the end of the citation. That was an experiment conducted quite long ago. Did anyone notice? We don't similarly wrap |isbn= which, because of the permitted (desired) hyphens and occurring at the end of a book citation rendering, can break oddly. Did anyone notice?

And beyond the first name or maybe three, who reads the namelists in a citation? Yeah, I know, wandering a bit off topic, but wouldn't it be better for us to set a default |display-<names>= value so that all cs1|2 templates show the default number of names (+ et al when there are more names in the template)? Do we really need to display 400 names? or even 29? (that's a popular number; why?) who is going to read or even need all of those names to locate the source?

—Trappist the monk (talk) 13:03, 2 November 2020 (UTC)[reply]
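Editorial illustration (not part of the original comments): the effect of a default display limit on a name list can be sketched as below. The default of six, the "; " separator and the "et al." suffix are assumptions for illustration; the sketch is Python, not the module's Lua, and render_name_list is a hypothetical name.

```python
from typing import List, Optional


def render_name_list(names: List[str], display: Optional[int] = 6) -> str:
    """Join names, truncating to `display` entries and appending 'et al.'.

    display=None means "show every name", which is how an editor would
    override the default for a particular citation.
    """
    if display is None or len(names) <= display:
        return "; ".join(names)
    return "; ".join(names[:display]) + "; et al."


authors = ["Author%d" % i for i in range(1, 9)]
print(render_name_list(authors))                # first six names, then et al.
print(render_name_list(authors, display=None))  # all eight names
```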

Anecdotally I probably notice on occasion, but never something like "oh, I would miss this were it gone".

"who reads the namelists in a citation?" I do not, but that might be something that varies by domain and no-doubt by personality. When I see namelists longer than 6 is when I personally add the et al in my gnoming.

I think some tool at some point was limited to 29, Citoid or refill perhaps. I've noticed a similar pattern (but again, maybe anecdotal).

It is interesting that there is a suggestion not to use nbsp in COinS parameters. I am not sure what opinion I have of that, but our typical implementation has been to categorize and remove metadata problems. Consider that we perform a substitution in page handling of dashes; we could do the same for author lists. --Izno (talk) 15:08, 2 November 2020 (UTC)[reply]

I added that suggestion, and though I don't remember exactly why, I suspect it was to avoid having to add support to translate every html entity into its unicode character form. The page handling of &mdash; and &ndash; was necessary to resolve a technical issue because editors will use a semicolon between individual page numbers when a comma should have been used.

—Trappist the monk (talk) 16:56, 2 November 2020 (UTC)[reply]
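Editorial illustration (not part of the original comments): why dash entities have to be converted before page values are split on semicolons. This is a Python sketch of the general idea, not the module's Lua code; DASH_ENTITIES and split_pages are hypothetical names.

```python
import re

DASH_ENTITIES = {"&mdash;": "\u2014", "&ndash;": "\u2013"}


def split_pages(pages: str) -> list:
    """Convert dash entities to Unicode, then split on ';' or ','."""
    for entity, char in DASH_ENTITIES.items():
        pages = pages.replace(entity, char)
    # Without the conversion above, the ';' that ends each entity would be
    # mistaken for a separator between page numbers.
    return [part.strip() for part in re.split(r"[;,]", pages) if part.strip()]


print(split_pages("12&ndash;15; 20"))  # ['12–15', '20']
```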

While I sometimes study author lists, including longer ones of a few dozens, I have yet to see and study a list of 400 names. Nevertheless, I don't think we should set a default limit for |display-name=. This should remain up to the editorial judgement of the article editors, not us. Setting too low a limit would also make it more difficult to enter longer lists, as one would first have to add |display-name=some-high-value to see the remaining names. Courtesy dictates to try to list all authors of a work and limit the display for practical reasons only where necessary. Depending on the "house standard" for author lists (alphabetical order, chronological increasing/decreasing order, increasing/decreasing importance by amount of contribution or "status", no order, etc.) being followed, the first author is not necessarily the main author. The editors of an article probably have the best insight into a source and context to set the display limit to an appropriate value for a citation, if necessary.

Regarding methods to avoid orphans, yes, I have occasionally recognized this (I'm one of those editors who sometimes inserts &nbsp; between the last two words of a paragraph to improve text flow appearance on some browsers).

As an aside regarding <span class="nowrap">...</span>, does someone know if it is possible to define a kind-of-strength for nowrapping so that the browser tries not to wrap (for as long as possible), but would start wrapping anyway when it could otherwise not avoid a horizontal scrollbar or truncation on narrow windows?

--Matthiaspaul (talk) 02:02, 6 November 2020 (UTC)[reply]

DIEP flap – 380 names. Setting the default to say, six, accomplishes the purpose of helping readers locate the source. In other discussions, editors have stated that identifiers and their links are not reader friendly (I disagree, readers are not stupid) but if that argument maintains, then surely an excessive number of authors will also dissuade readers from seeking the source. If by "Courtesy dictates to try to list all authors" you mean courtesy to the authors, then I disagree. Citations are not here to serve as 'credits'; that is the duty of the publisher. Courtesy plays no part in citations except as a courtesy to the reader to provide consistent, understandable, identification of the sources used by en.wiki editors to create and maintain the encyclopedia's articles.

I don't know of any citation style that requires name-lists in citations to be ordered in any way except the order in which they are named in the source; to sort the names in a list any other way is to do a disservice to readers. Of course, editors of an article probably have the best insight into a source and context to set the display limit to an appropriate value; setting a default limit does not take away from that editorial discretion.

Isn't the insertion of &nbsp; between the last two words of a paragraph discouraged because what you see on your screen doesn't necessarily mean that any other viewers will see the same thing?

I am not sufficiently versed in the minutiae of css so I can't answer the "kind-of-strength for nowrapping" question though there is the html tag <wbr />. It is my understanding that using &nbsp; to prevent line breaks is discouraged in favor of css: <span class="nowrap">$1</span>. That would add 28 characters to every |firstn= that is breakable. We could minimize that to some extent by nowrapping the entire name list and insert <wbr /> between authors and between last and first names:

Black, A; Brown, B; Red, C; Orange, D.

—Trappist the monk (talk) 14:50, 6 November 2020 (UTC)[reply]
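Editorial illustration (not part of the original comments): the whole-list approach above, one nowrap span around the name list with explicit break opportunities, might be generated as sketched below. It assumes the markup meant is <span class="nowrap"> and <wbr />, and the exact position of each <wbr /> (here after the space that follows each comma and semicolon) is also an assumption. Python rather than Lua; nowrap_name_list is a hypothetical name.

```python
import html

OPEN_TAG = '<span class="nowrap">'
CLOSE_TAG = "</span>"

# Wrapping each breakable value on its own costs the full tag pair each time:
# 21 characters for the opening tag plus 7 for the closing tag.
PER_VALUE_OVERHEAD = len(OPEN_TAG) + len(CLOSE_TAG)  # 28


def nowrap_name_list(names) -> str:
    """Wrap a list of (last, first) pairs in one nowrap span.

    A <wbr /> marks each place where a line break is acceptable: between
    last and first name, and between one author and the next.
    """
    parts = [
        "%s, <wbr />%s" % (html.escape(last), html.escape(first))
        for last, first in names
    ]
    return OPEN_TAG + "; <wbr />".join(parts) + CLOSE_TAG


print(nowrap_name_list([("Black", "A"), ("Brown", "B")]))
```

The single span costs 28 characters once per list, plus 7 per <wbr />, instead of 28 per name part.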

Thanks for the link. :-) I don't find this list intimidating, except for that this is a clear case for list-defined references rather than inline references. If there would be many citations with such long lists of contributors in an article, it would probably start to look strange at some point - that's when |display-authors= would come in handy at the editor's discretion. Interestingly enough, in this example citation, the authors are listed in alphabetically ascending order (actually, in this particular example, two such lists), so, when you would cut the list at some arbitrary point, you'd risk missing main contributors. I don't find this desirable at all.

Regarding "Courtesy dictates to try to list all authors", yes, I meant the authors of the cited work. If they are listed in the published work, we should specify them as well (by default). If those authors wouldn't have published their work, we could not cite it, so I see us as a continuation in a line of works built upon each other in an environment where it is common to list the sources (to avoid circular references and help identify false information).

Citations are not only used to locate a source, but also to evaluate the significance of statements ("Is there a widely known and trustable expert on a field among the authors?") or as a starting point for new research ("Let's check the authors' affiliations for related works and other publications."). Also, having a pre-defined default limit makes entering larger lists somewhat more difficult if there are more authors than what is defined as a limit. So, it's better to leave it to the article authors to set an upper limit (if necessary at all).

Regarding "house styles", here I meant publishing standards, not citation standards. Of course, we should list authors in the order given in the source, if such an order is determinable (as is the case most of the time). In many cases, the most important authors for a work are listed among the first, but unfortunately different organizations have different publication conventions to the effect that it is also possible that significant or even the most important contributors happen to be listed in the middle or the end.

Regarding concatenating the last two words of a paragraph with &nbsp;, this is quite common among web designers. It dates back to times long before the introduction of CSS and works even with the oldest browsers. As you wrote, the text of a page will flow differently depending on the width of the window (and other things). However, the &nbsp; will have no effect except for when the browser would otherwise move the last word of a paragraph into a new line, whereas with &nbsp; in place the browser would ensure that there will be at least two words in the last line of a paragraph, thereby preventing the last word from becoming an orphan. This might no longer be necessary with browsers supporting CSS, but also can't harm (unless the two words would be long and force the browser to go into an otherwise unnecessary horizontal scrolling mode, which, of course, would be counter-productive).

--Matthiaspaul (talk) 12:47, 7 November 2020 (UTC)[reply]

Nbsps in MediaWiki

(Slightly offtopic to nbsps in citation templates so split Cobalt's reply from SMC's comment at 22:15, 26 October 2020 (UTC) in #Nbsp_in_|author,_|last,_and_equivalents_for_other_contributors to its own thread.)

Assuming your browser renders the above character U+1F63C CAT FACE WITH WRY SMILE in a colorful way (as mine does), you can determine which font said browser is using, then edit that font to fill the glyph bounds for U+00A0 NO-BREAK SPACE with some light pastel color (instead of transparent nothingness). This will give every nbsp on the page a subtle glow (much like shining a UV flashlight across a motel room), without being too distracting to read. I use a technique similar to this myself. Among other things, you'll become aware of situations where the MediaWiki software replaces regular spaces with nbsp on its own for no apparent reason (e.g. before ! or ?). Perhaps if the rules for doing this could be configured on a per-wiki basis, everyone would be happy. ―cobaltcigs 01:40, 1 November 2020 (UTC)[reply]

P.S. Never pay for software. I use BirdFont (what, no article?) for editing and gucharmap for character lookup by name. Brief research suggests both have been ported to Windows—where they probably work equally well, but I can't confirm that. ―cobaltcigs 01:40, 1 November 2020 (UTC)[reply]

I assume you are referring to the edit window when discussing the novel (distinguishing color) markup for non-breaking spaces. The user (reader) version should be free of any visible formatting notation or artifact - transparent nothingness is best, in this case.

I haven't verified the MediaWiki soft-space replacement before punctuation that you indicated above. If true, it is odd. Normally, space of any kind is considered erroneous practice if it is before most punctuation - it breaks continuity between the punctuation mark and the text the punctuation is supposed to apply to. For similar semantic (and esthetic) reasons, sentences should not wrap until after punctuation marks. Adding a hard space is compounding an error. 98.0.246.242 (talk) 04:13, 1 November 2020 (UTC)[reply]

I know of one place where MediaWiki will insert non breaking spaces (before %), but that should only occur on French Wikipedia. --Izno (talk) 14:26, 1 November 2020 (UTC)[reply]

Not so. See example screenshot (enwiki edit preview, with highlighting hacks enabled). ―cobaltcigs 19:18, 1 November 2020 (UTC)[reply]

And another, lol. ―cobaltcigs 19:26, 1 November 2020 (UTC)[reply]

@Cobaltcigs: I went to verify on phab, apparently having researched this 3 years ago.... phab:T181441#3798402. --Izno (talk) 20:54, 1 November 2020 (UTC)[reply]

The exclamation-mark french-space case is problematic in enwiki, but I am not aware of the percent sign being used for punctuation in any language? Not to say there should be a leading space there. It seems that the situation hasn't been resolved, or if it was, there was a parser update or html update that messed up things again. 98.0.246.242 (talk) 21:28, 1 November 2020 (UTC)[reply]

The task linked directly in my comment discusses the % sign directly. --Izno (talk) 21:49, 1 November 2020 (UTC)[reply]

I went through it before my previous comment. I was just wondering why the behavior persists in enwiki. That is, why is french-spacing applied in situations where French terms are not used. Without examining the related patches (some of them are still in beta, I believe) it seems to me that the parser interprets a space before certain punctuation marks and other characters as attempts at french-spacing and applies the "correct" space format, i.e. a nbsp. However, this seems to be done indiscriminately; it should be done only when French-language terms include such syntax. In English, any such spacing would very likely be wrong. It seems that in the attempt to fix a special case, the general case was botched. 98.0.246.242 (talk) 23:14, 1 November 2020 (UTC)[reply]

In case you're confused about the functionality, &nbsp; is not inserted arbitrarily before these symbols. MediaWiki replaces a plain-ol' space with a non-breaking space, where such a space is present. However, English doesn't use the plain-ol' space there (indeed, as you suggest, such spacing would very likely be wrong), so it is usually not an issue here. One might argue that that code should not be executed in our locale at all, but I don't see the fundamental harm, since we wouldn't want those symbols, were they to be separated by a space, to have the same issues with wrapping that originally caused the behavior to be added to the software so long ago (... or at least, I presume that was why). --Izno (talk) 02:36, 2 November 2020 (UTC)[reply]
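A rough sketch of the behaviour described in this thread: an existing ordinary space directly before certain marks becomes a no-break space, and nothing is inserted where no space was present. The set of marks here (`!`, `?`, `%`) is limited to the ones mentioned in this discussion and is an assumption, not MediaWiki's actual list; the function name is hypothetical.

```python
import re

NBSP = "\u00a0"  # U+00A0 NO-BREAK SPACE

# Assumption: only the marks named in this thread. The real software may
# cover a different set of characters.
_SPACE_BEFORE_MARK = re.compile(r" (?=[!?%])")


def armor_spaces(text: str) -> str:
    """Turn an ordinary space that directly precedes a mark into a no-break space."""
    return _SPACE_BEFORE_MARK.sub(NBSP, text)
```

Text written in the usual English style ("What?") passes through unchanged, which is why the substitution is rarely visible on enwiki.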

Apostrophes are stripped from titles

(moved from VPT:) Apostrophes are stripped from titles in {{Cite wikisource}}:

{{Cite wikisource|Uncle Tom's Cabin}} → Uncle Tom's Cabin – via Wikisource.

{{Cite wikisource/make link|link=Uncle Tom's Cabin|label=Uncle Tom's Cabin}} → Uncle Tom's Cabin

{{citation|title={{Cite wikisource/make link|link=Uncle Tom's Cabin|label=Uncle Tom's Cabin}}}} → Uncle Tom's Cabin

{{cite book|title={{Cite wikisource/make link|link=Uncle Tom's Cabin|label=Uncle Tom's Cabin}}}} → Uncle Tom's Cabin .

Pinging Trappist the monk, who probably knows where this is happening. It appears to be happening in the CS1 module code, which is why I posted it here. I don't know if this needs an adjustment in Cite wikisource or in CS1, or somewhere else. And if Cite wikisource is doing it, other wrappers might be doing it as well. – Jonesey95 (talk) 21:43, 28 October 2020 (UTC)[reply]

Yes, it is. When you are done with {{cite wikisource/sandbox}}, I will continue to troubleshoot – I need that to be a copy of the live template so that I can use it to troubleshoot in the cs1|2 sandboxen.

—Trappist the monk (talk) 21:49, 28 October 2020 (UTC)[reply]

Go for it. – Jonesey95 (talk) 22:37, 28 October 2020 (UTC)[reply]

Fixed, perhaps...

{{cite wikisource/sandbox|Uncle Tom's Cabin}}

Uncle Tom's Cabin – via Wikisource.

Found another lurking bug, cf.:

{{cite wikisource|chapter=Laurent,_Cornelius_Baldran |wslink=Appletons'_Cyclopædia_of_American_Biography|plaintitle=Appletons' Cyclopædia of American Biography}}

"Laurent, Cornelius Baldran" . Appletons' Cyclopædia of American Biography – via Wikisource.

{{cite wikisource/sandbox|chapter=Laurent,_Cornelius_Baldran |wslink=Appletons'_Cyclopædia_of_American_Biography|plaintitle=Appletons' Cyclopædia of American Biography}}

"Laurent, Cornelius Baldran" . Appletons' Cyclopædia of American Biography – via Wikisource.

—Trappist the monk (talk) 00:16, 29 October 2020 (UTC)[reply]

Just out of curiosity, where did the little globe icon disappear to in the Cite wikisource sandbox (in the Uncle Tom's Cabin example)? I am not attached to it, but it's in the live template. – Jonesey95 (talk) 02:29, 29 October 2020 (UTC)[reply]

I think that's supposed to be an iceberg.

—Trappist the monk (talk) 10:59, 29 October 2020 (UTC)[reply]

|nocat=

I have cleared Category:CS1 maint: nocat. I have also removed support for |nocat= and Category:CS1 maint: nocat from the sandbox module suite. After the next update, |nocat= will cause cs1|2 templates to emit the unknown parameter error message:

{{cite book/new |title=Title |nocat=yes}}

Title. {{cite book}}: Unknown parameter |nocat= ignored (|no-tracking= suggested) (help)

—Trappist the monk (talk) 13:36, 29 October 2020 (UTC)[reply]

Thanks, Trappist. I haven't gone through all citations using |no-tracking= yet, but in the past weeks I cleaned up some of the |nocat= parameters as well and among them did not run into any citations for which we would actually need this feature in mainspace any more (for broken DOIs we now have a much better option). Did you?

If all uses in mainspace would have been removed, and categorisation would be disabled outside mainspace, the parameter could be removed completely or reduced to a pure debug option (possibly with reversed logic to optionally enable categorisation outside mainspace).

Questions from Help_talk:Citation_Style_1/Archive_71#no-cat_parameter_cleanup:

Do we actually need this in mainspace? Should we disallow the feature in mainspace?
What should be the default behaviour in other namespaces? Should the behaviour be changed to populate categories only when a special option is given?
If it would be always disabled in mainspace and enabled elsewhere, do we need a parameter to control it at all?
~~Should we change the temporary nocat category into a permanent maint category for the feature as a whole?~~
Find a better parameter name based on the resulting functionality and use case of the feature. If we don't keep a maint category the parameter name needs to be unique to also serve as a good search pattern.

Opinions?

--Matthiaspaul (talk) 23:05, 4 November 2020 (UTC)[reply]

Deprecated archive switch parameter [dead-url] on cite magazine

I believe the {dead-url} switch has been replaced by {url-status}. I intend to change references to {dead-url} on the above page to the new parameter. Are there objections? Cambial foliage❧ 14:39, 29 October 2020 (UTC)[reply]

Support for |dead-url= and its companion |deadurl= was removed quite some time ago. Alas, there are regularly used tools that continue to insert one of these forms into new cs1|2 templates so cleaning up after them is more-or-less a continuing task. Changing to |url-status= in any cs1|2 template, not just {{cite magazine}}, is the correct resolution, so please do.

—Trappist the monk (talk) 14:54, 29 October 2020 (UTC)[reply]

Thanks Trappist the monk. I would be happy to but it turns out I cannot. I believe you can? Cambial foliage❧ 15:07, 29 October 2020 (UTC)[reply]

scrap that I found the right edit button for the doc. Cambial foliage❧ 15:09, 29 October 2020 (UTC)[reply]

Footnotes

Hello. I would like to know how you match the footnote labels between the footnote marker and the reference list. I would like to do it on the Wikipedia of another language. Let me know. Hypuxylun (talk)

Hard to know just exactly what you mean. In general, in article text, citations (either plain text or template) are wrapped in <ref>...</ref> tags. In the reference section, the basic form is a single <references /> tag. MediaWiki's cite.php creates the list of citations where the <references /> tag is placed. Did I answer your question?

—Trappist the monk (talk) 23:36, 30 October 2020 (UTC)[reply]

For example, for the lower alpha footnotes, the footnote marker is ^[a], ^[b], ^[c]... and the reference list is also a, b, c... However, on the Wikipedia of another language, the footnote marker is ^[a], ^[b], ^[c]... but the reference list is still 1, 2, 3... I would like to know how you change the reference list from 1, 2, 3... into a, b, c... You can find the example on the template Sablon:Efn from the Hungarian Wiki. --- Hypuxylun (talk)

I think that the letters are provided by the "list-style-type" attribute in Template:reflist. If you use that code in the hu.WP version of reflist, it might do what you want. – Jonesey95 (talk) 01:00, 31 October 2020 (UTC)[reply]

OK, it was not included in the Hungarian template Sablon:Reflist. I can find a way to fix it and hope that it works. Hypuxylun (talk) 01:20, 2 November 2020

url-status defaulting to 'dead' is problematic

If 'archive-url' has a value and 'url-status' is omitted or has no value, then 'url-status' is silently treated as having the value of 'dead'.

I am finding this problematic, as I see numerous instances where the original URL is still live but an editor has set 'archive-url' (that's fine) and omitted 'url-status=live' (presumably these editors have simply been unaware that it defaults to 'dead').

Before requesting reconsideration of the default behaviour, I am wondering whether it was a deliberate decision and, if so, what the rationale was. I have searched this talk page's voluminous archives but have not yet found any such discussion. Can anyone advise (or link to) what consideration was given to making this the default behaviour? Thanks. Nurg (talk) 23:31, 30 October 2020 (UTC)[reply]
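For reference, the defaulting rule described at the top of this thread can be written out as follows. This is a minimal sketch in Python, not the module's Lua code; the function name is hypothetical, and returning `None` when there is no archive URL is an assumption, since that case is not discussed here.

```python
from typing import Optional


def effective_url_status(archive_url: Optional[str],
                         url_status: Optional[str]) -> Optional[str]:
    """Return the status a citation is treated as having.

    An explicit, non-empty url-status always wins. If it is omitted or
    empty and archive-url has a value, the citation is silently treated
    as 'dead'.
    """
    status = (url_status or "").strip()
    if status:
        return status
    if (archive_url or "").strip():
        return "dead"
    # Assumption: with no archive URL there is no status to apply.
    return None
```

Written this way, the concern raised above is easy to see: an editor who adds an archive URL to a still-working link must also add |url-status=live, or the citation is handled as if the original were dead.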

Perhaps this: Wikipedia:Requests for comment/Dead url parameter for citations.

|deadurl=no is the no-longer-supported form of |url-status=live.

—Trappist the monk (talk) 00:03, 31 October 2020 (UTC)[reply]

Hinting on Citation Bot's duplicate parameters

When a citation contains duplicate parameters, the MediaWiki software will display a yellow warning at the top of the page:

This is only a preview; your changes have not yet been saved!

Warning: xxxx is calling Template:yyyy with more than one value for the "zzzz" parameter. Only the last value provided will be used.

However, this warning will only be shown in edit preview.

When Citation Bot finds duplicate parameters in citations it renames them by adding a "DUPLICATE_" prefix to them. Our citation template then throws a red error message:

Unknown parameter |DUPLICATE_zzzz= ignored

Since our citation templates can optionally issue a parameter suggestion, I added a rule so that the template would display instead:

Unknown parameter |DUPLICATE_zzzz= ignored (|zzzz= suggested)

However, I was reverted by Izno stating that the parameters should be removed or merged. While this is correct in general, for users to select or merge into one of the duplicate parameters and remove the others, they first need to know the name of the underlying parameter in question. While this can be guessed from the DUPLICATE_* name, this is a private convention used by Citation Bot, and I think it is more user-friendly to name that parameter explicitly in the error message and for consistency to use our own established message system for this (hence my addition of that rule). (The "suggested" in our standard "(|zzzz= suggested)" message does not mean that the suggested parameter is necessarily a direct 1:1 replacement (although it often is), only that it is the (most likely) parameter target that needs to be dealt with to fix the issue and that additional changes may still be required in such a parameter transformation/merge.)

Opinions?

--Matthiaspaul (talk) 23:50, 1 November 2020 (UTC)[reply]

Fix Citation Bot so that it doesn't act like a template editor? Such function is way beyond its scope. 98.0.246.242 (talk) 00:11, 2 November 2020 (UTC)[reply]

Such function is exactly in its scope because finding duplicate parameter names is something that MediaWiki prevents all templates and modules from doing.

—Trappist the monk (talk) 00:49, 2 November 2020 (UTC)[reply]

I thought it was supposed to edit incorrect citations, not the code they are based on. That part, the new parameter class |DUPLICATE_(anything)=, and the accompanying terminology should be discussed, and here. If the bot has to do something it would do better to apply the error message of the preview, which follows longstanding practice in wiki (and many other coding environments). As I think you state below, the current bot action makes for a convoluted situation. 98.0.246.242 (talk) 01:17, 2 November 2020 (UTC)[reply]

Were this a serious problem, by which I mean lots of these kinds of error messages in Category:Pages with citations using unsupported parameters attributable to duplicate parameter names, I might be inclined to agree with you.

I don't know for sure, but a quick look into the Citation bot source (line 3271 et seq.) seems to suggest that the bot creates a single |DUPLICATE_zzzz= parameter name for each duplicated parameter name. I don't know if the bot applies this only to valid cs1|2 parameter names. If it doesn't and there are, for example, |blue=yellow and |blue=orange in a cs1|2 template, then the bot will rename one of these, perhaps the first it encounters, perhaps the last, I don't know, to say |DUPLICATE_blue=orange. Then, when cs1|2 sees that, your hint would cause cs1|2 to emit Unknown parameter |DUPLICATE_blue= ignored (|blue= suggested). Not much good to be gained by that.

Certainly, this ought to be mentioned at Help:CS1_errors#Unknown_parameter_|xxxx=_ignored.

—Trappist the monk (talk) 00:49, 2 November 2020 (UTC)[reply]

Citation Bot flags the one that is not used. Often the data is good stuff, just in the wrong place. For example the publisher might be set to Reuters and then some one else adds Fox News and fails to convert Reuters to agency. The bot makes the error apparent. AManWithNoPlan (talk) 01:06, 2 November 2020 (UTC)[reply]

I like the idea of suggesting a way to fix the problem, either at the category page or on the help page (or both; do they use the same text?), but as AManWithNoPlan says, the solution is usually to fix one of the labels, not simply reinstate the duplicated label. If there are two |publisher= or |last2= parameters, the solution is usually to change one of them (to e.g. |work= or |via= in the first case, or e.g. |last3= or |first2= in the second case). – Jonesey95 (talk) 02:09, 2 November 2020 (UTC)[reply]

Yes, the text at the help page is section-transcluded to the category. --Izno (talk) 02:24, 2 November 2020 (UTC)[reply]

Yeah, as I wrote the "(|zzzz= suggested)" should not imply that the solution is to just replace the parameter name (or even to just reinstate the duplicate parameter - that would be counter-productive). It just hints that the zzzz parameter is what (most likely) needs to be dealt with.

Still, it might be possible to further improve the hinting system by allowing the right sides of the rules to contain more than one word (or move those into a separate list of rules). The template's code could then issue this text instead of the preformatted "(|zzzz= suggested)" message. There are other cases, where this could be useful to give a few more hints what to do (for example in the case of |editors=, see Help_talk:Citation_Style_1/Archive_72#support_for_|editors=_withdrawn_(in_the_sandbox)).

['^DUPLICATE_(%w+)$'] = '$1'

could become

['^DUPLICATE_(%w+)$'] = 'merge into <code>|$1=</code>'

to display

"(merge into |zzzz=)"

Or

['ignore-isbn-error'] = 'isbn'

could become

['ignore-isbn-error'] = 'use <code>|isbn=((...))</code>'

to display

"(use |isbn=((...)))"

Or

['editors'] = 'editor'

could become

['editors'] = 'split into <code>|editor''n''=</code>'

to display

"(split into |editorn=)"

--Matthiaspaul (talk) 15:45, 3 November 2020 (UTC)[reply]
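To make the proposal above concrete, here is a small sketch of a rule table whose right-hand sides are free-text hints with a placeholder for the captured parameter name. It is written in Python, not the module's Lua, so the Lua pattern `%w+` is approximated by `[A-Za-z0-9]+` and `$1` by `{0}`; the names `SUGGESTION_RULES` and `hint_for` are made up for the example.

```python
import re
from typing import Optional

# Each rule pairs a pattern for an unknown parameter name with the hint text
# to display; {0} is replaced by the first capture group, if there is one.
SUGGESTION_RULES = [
    (re.compile(r"^DUPLICATE_([A-Za-z0-9]+)$"), "merge into |{0}="),
    (re.compile(r"^ignore-isbn-error$"), "use |isbn=((...))"),
    (re.compile(r"^editors$"), "split into |editorn="),
]


def hint_for(param: str) -> Optional[str]:
    """Return the parenthesised hint for an unknown parameter, or None."""
    for pattern, template in SUGGESTION_RULES:
        match = pattern.match(param)
        if match:
            capture = match.group(1) if match.groups() else ""
            return "(" + template.format(capture) + ")"
    return None
```

With these rules, `hint_for("DUPLICATE_publisher")` gives "(merge into |publisher=)", while a name with no matching rule gives no hint at all.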

I guess, the |DUPLICATE_blue= scenario is very rare as hardly anyone would repeat a non-existing parameter |blue= more than once. It might occur in the case of previously supported parameters, but then the template would typically throw a message suggesting the new parameter name once someone would try to reintroduce |blue=. So, the user would be led to the correct solution at least by iteration (as in the |editors= example at present).

I found about a dozen uses in mainspace and some 150 in total (including some where the |DUPLICATE_zzzz= parameter was empty), probably because they are actively worked on by some editors. AManWithNoPlan probably has a better overview how often this parameter is being added by the bot.

--Matthiaspaul (talk) 15:45, 3 November 2020 (UTC)[reply]

Edition and pages extra text as errors

Per a discussion elsewhere, in the sandbox I have separated Category:CS1 maint: extra text into two separate categories, as well as promoted the two categories to errors from maintenance. The two categories are per parameter: one for |edition= and one for |p/pp/page/pages=.

This change is demonstrated at test_extra_text test on Module talk:Citation/CS1/testcases/errors. I did not implement sensitivity to the exact parameter name in the pages test since that's still a bit beyond me. I have no strong opinion on someone else doing so.

Secondly, I see "volume" text in |work= in the wild often (and equivalents, esp. in the titles of encyclopedias and books). An example might be |title=Title, Volume X: Volume Name, which I would envision as better being |title=Title|volume=X: Volume Name. I would like to entertain an "extra text" test for that pattern and an associated maintenance category, and invite discussion accordingly. --Izno (talk) 03:39, 2 November 2020 (UTC)[reply]

As there are so many possible variants, I don't see a narrower pattern than just searching for the string "Volume" or "Vol." in a title. In most cases it will be preceded by a separator and located near the end of a title, but I can also think of cases where that would not hold true. We'd have to live with the false positives.

Similar to the volume thing, I sometimes see variously formatted "Part" info in the title as well. If the |volume= parameter isn't used, this could be abused to move the part info into there, but what we'd actually need for this is a separate parameter |part= (see also Module_talk:Citation/CS1/Feature_requests#Part/Help_talk:Citation_Style_1/Archive_58#Books_with_volumes_and_parts, there even is a COinS tag for this, &rft.part=, although, as odd as it is, this appears to be defined only for periodicals, not books).

Applying to both volumes and parts, an Arabic or Roman number at the end of a title might also give a clue (but could also be a version number and valid part of the title).

--Matthiaspaul (talk) 14:59, 3 November 2020 (UTC)[reply]

Per Help_talk:Citation_Style_1/Archive_49#Edit_request_for_Template:Cite_book the template now also detects the British abbreviation "edn" in |edition= as extra text:

Extended content

Cite book comparison
Wikitext	`{{cite book\|author=Author\|date=2020\|edition=1st\|title=Title}}`
Live	Author (2020). Title (1st ed.). `{{cite book}}`: `\|author=` has generic name (help)
Sandbox	Author (2020). Title (1st ed.). `{{cite book}}`: `\|author=` has generic name (help)

Cite book comparison
Wikitext	`{{cite book\|author=Author\|date=2020\|edition=1st ed.\|title=Title}}`
Live	Author (2020). Title (1st ed. ed.). `{{cite book}}`: `\|author=` has generic name (help); `\|edition=` has extra text (help)
Sandbox	Author (2020). Title (1st ed. ed.). `{{cite book}}`: `\|author=` has generic name (help); `\|edition=` has extra text (help)

Cite book comparison
Wikitext	`{{cite book\|author=Author\|date=2020\|edition=1st edn\|title=Title}}`
Live	Author (2020). Title (1st edn ed.). `{{cite book}}`: `\|author=` has generic name (help); `\|edition=` has extra text (help)
Sandbox	Author (2020). Title (1st edn ed.). `{{cite book}}`: `\|author=` has generic name (help); `\|edition=` has extra text (help)

--Matthiaspaul (talk) 20:25, 7 November 2020 (UTC)[reply]
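A sketch of a test that behaves like the three |edition= examples above ("1st" passes; "1st ed." and "1st edn" are flagged). The regular expression is an assumption chosen to fit those examples; the module's actual Lua patterns may well differ, and the function name is hypothetical.

```python
import re

# Assumption: flag a trailing "ed", "ed.", "edn", "edn." or "edition", since
# the template appends "ed." to the parameter value itself.
_EDITION_EXTRA_TEXT = re.compile(r"\bed(?:n|ition)?\.?\s*$", re.IGNORECASE)


def edition_has_extra_text(edition: str) -> bool:
    """Report whether an |edition= value already ends in an 'edition' marker."""
    return _EDITION_EXTRA_TEXT.search(edition) is not None
```

The word boundary keeps values such as "Revised" from being flagged merely because they end in the letters "ed".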

The extra text test for |page=/|pages= and |quote-page=/|quote-pages= now also checks for pattern "pg(s)(.)" etc. in addition to "p(p)(.)" etc.:

Extended content

Cite book comparison
Wikitext	`{{cite book\|page=p. 35\|title=Title}}`
Live	Title. p. p. 35. `{{cite book}}`: `\|page=` has extra text (help)
Sandbox	Title. p. p. 35. `{{cite book}}`: `\|page=` has extra text (help)

Cite book comparison
Wikitext	`{{cite book\|page=pp. 35\|title=Title}}`
Live	Title. p. pp. 35. `{{cite book}}`: `\|page=` has extra text (help)
Sandbox	Title. p. pp. 35. `{{cite book}}`: `\|page=` has extra text (help)

Cite book comparison
Wikitext	`{{cite book\|page=pgs 35\|title=Title}}`
Live	Title. p. pgs 35. `{{cite book}}`: `\|page=` has extra text (help)
Sandbox	Title. p. pgs 35. `{{cite book}}`: `\|page=` has extra text (help)

Cite book comparison
Wikitext	`{{cite book\|page=pgs. 35\|title=Title}}`
Live	Title. p. pgs. 35. `{{cite book}}`: `\|page=` has extra text (help)
Sandbox	Title. p. pgs. 35. `{{cite book}}`: `\|page=` has extra text (help)

Cite book comparison
Wikitext	`{{cite book\|page=p123\|title=Title}}`
Live	Title. p. p123. `{{cite book}}`: `\|page=` has extra text (help)
Sandbox	Title. p. p123. `{{cite book}}`: `\|page=` has extra text (help)

Cite book comparison
Wikitext	`{{cite book\|page=P123\|title=Title}}`
Live	Title. p. P123.
Sandbox	Title. p. P123.

--Matthiaspaul (talk) 01:17, 17 November 2020 (UTC)[reply]
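The six |page= examples above can be reproduced with a single case-sensitive pattern. This is a sketch consistent with those examples only; it is not the module's actual Lua pattern set, it does not cover spelled-out forms such as "page 35", and the function name is hypothetical.

```python
import re

# Assumption: a leading "p", "pp", "pg" or "pgs" (lower case only), an
# optional period, then a space or a digit. An upper-case "P" as in "P123"
# is left alone, matching the last example above.
_PAGE_EXTRA_TEXT = re.compile(r"^(?:pp?|pgs?)\.?[ \d]")


def page_has_extra_text(page: str) -> bool:
    """Report whether a |page= value starts with a redundant page prefix."""
    return _PAGE_EXTRA_TEXT.match(page) is not None
```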

Only remotely related to this "extra text detection" topic but I don't want to open a new thread for this minor bit: I changed the "et al." extra text detection code to also detect "et alii" and "et aliae" in addition to "et alia" and the abbreviated variants.

Extended content

Cite book comparison
Wikitext	`{{cite book\|author=Author, et alia\|date=2020\|title=Title}}`
Live	Author; et al. (2020). Title. `{{cite book}}`: `\|author=` has generic name (help); Explicit use of et al. in: `\|author=` (help)
Sandbox	Author; et al. (2020). Title. `{{cite book}}`: `\|author=` has generic name (help); Explicit use of et al. in: `\|author=` (help)

Cite book comparison
Wikitext	`{{cite book\|author=Author, et alii\|date=2020\|title=Title}}`
Live	Author; et al. (2020). Title. `{{cite book}}`: `\|author=` has generic name (help); Explicit use of et al. in: `\|author=` (help)
Sandbox	Author; et al. (2020). Title. `{{cite book}}`: `\|author=` has generic name (help); Explicit use of et al. in: `\|author=` (help)

Cite book comparison
Wikitext	`{{cite book\|author=Author, et aliae\|date=2020\|title=Title}}`
Live	Author; et al. (2020). Title. `{{cite book}}`: `\|author=` has generic name (help); Explicit use of et al. in: `\|author=` (help)
Sandbox	Author; et al. (2020). Title. `{{cite book}}`: `\|author=` has generic name (help); Explicit use of et al. in: `\|author=` (help)

Cite book comparison
Wikitext	`{{cite book\|author1=Author\|author2=et alia\|date=2020\|title=Title}}`
Live	Author; et al. (2020). Title. `{{cite book}}`: `\|author1=` has generic name (help); Explicit use of et al. in: `\|author2=` (help)
Sandbox	Author; et al. (2020). Title. `{{cite book}}`: `\|author1=` has generic name (help); Explicit use of et al. in: `\|author2=` (help)

Cite book comparison
Wikitext	`{{cite book\|author1=Author\|author2=et alii\|date=2020\|title=Title}}`
Live	Author; et al. (2020). Title. `{{cite book}}`: `\|author1=` has generic name (help); Explicit use of et al. in: `\|author2=` (help)
Sandbox	Author; et al. (2020). Title. `{{cite book}}`: `\|author1=` has generic name (help); Explicit use of et al. in: `\|author2=` (help)

Cite book comparison
Wikitext	`{{cite book\|author1=Author\|author2=et aliae\|date=2020\|title=Title}}`
Live	Author; et al. (2020). Title. `{{cite book}}`: `\|author1=` has generic name (help); Explicit use of et al. in: `\|author2=` (help)
Sandbox	Author; et al. (2020). Title. `{{cite book}}`: `\|author1=` has generic name (help); Explicit use of et al. in: `\|author2=` (help)

--Matthiaspaul (talk) 03:26, 17 November 2020 (UTC)[reply]
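A sketch of the kind of check described above: detect a trailing "et al.", "et alia", "et alii" or "et aliae" in a name, strip it, and report that it was found. The pattern is an assumption written to fit the examples in this thread; the module's own Lua patterns are not reproduced here, and `split_et_al` is a made-up name.

```python
import re
from typing import Tuple

# Assumption: the marker sits at the end of the value, optionally preceded by
# a comma or semicolon, in abbreviated or spelled-out form.
_ET_AL = re.compile(r"[;,]?\s*\bet\s+al(?:\.|ia|ii|iae)?\s*$", re.IGNORECASE)


def split_et_al(name: str) -> Tuple[str, bool]:
    """Return the name without a trailing et al. marker, and whether one was found."""
    stripped = _ET_AL.sub("", name)
    return stripped.strip(), stripped != name
```

A value that consists only of the marker, as in |author2=et alia, comes back as an empty name with the flag set, which corresponds to the "Author; et al." rendering shown in the comparisons.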

The sandboxed version now no longer leaves bracket-artifacts when it removes a double-bracketed pattern of et al.:

Cite book comparison
Wikitext	`{{cite book\|author1=Author1\|author2=((et al.))\|date=2020\|title=Title}}`
Live	Author1; et al. (2020). Title. `{{cite book}}`: `\|author1=` has generic name (help); Explicit use of et al. in: `\|author2=` (help)CS1 maint: numeric names: authors list (link)
Sandbox	Author1; et al. (2020). Title. `{{cite book}}`: `\|author1=` has generic name (help); Explicit use of et al. in: `\|author2=` (help)CS1 maint: numeric names: authors list (link)

--Matthiaspaul (talk) 14:12, 21 November 2020 (UTC)[reply]

CS1 maint: others

We presently capture citations that have no authorship information, besides |others=, in Category:CS1 maint: others (with some 20k pages). Due to the prominence of |others= in the documentation of the templates {{cite AV media}} and {{cite AV media notes}}, these templates often have |others= exclusively, which makes it hard to find the other cases where this is an issue.

I am considering separating these out into a separate category (something like Category:CS1 maint: others in cite AV media (notes)) so that someone interested in working through slightly-less painful categories can do so.

Has anyone seen another of the core CS1 template set cause such inclusion in this maintenance category? Does anyone have an issue with that path? --Izno (talk) 05:05, 2 November 2020 (UTC)[reply]

Alternatively, is there something we can do about those templates? Provide still-more named parameters?... --Izno (talk) 05:08, 2 November 2020 (UTC)[reply]

This search can be helpful. We might restore |artist= as a template-specific parameter for {{cite av media notes}}. Instead of keeping it separate, the content of |artist= might be concatenated as a prefix to |title= so this:

{{cite av media notes |title=Dark Side of the Moon |artist=Pink Floyd}}

might render:

Pink Floyd: Dark Side of the Moon (Media notes).

with the metadata as:

&rft.btitle=Pink+Floyd%3A+Dark+Side+of+the+Moon

There are probably better rendering / metadata choices.
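The concatenation and the resulting metadata value suggested above can be checked with a short sketch. It uses Python's standard `urllib.parse.quote_plus` for the encoding; the function names are hypothetical and the "artist: title" format is simply the one proposed in this comment.

```python
from typing import Optional
from urllib.parse import quote_plus


def av_notes_title(title: str, artist: Optional[str] = None) -> str:
    """Prefix the title with the artist, as proposed for |artist=."""
    return f"{artist}: {title}" if artist else title


def coins_btitle(title: str, artist: Optional[str] = None) -> str:
    """Build the &rft.btitle= key/value pair for the combined title."""
    return "&rft.btitle=" + quote_plus(av_notes_title(title, artist))
```

For the Pink Floyd example this yields `&rft.btitle=Pink+Floyd%3A+Dark+Side+of+the+Moon`, the same string as shown above.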

The {{cite av media}}, {{cite av media notes}}, {{cite episode}}, {{cite serial}} templates all deserve reworking. These are the templates that are the primary users of |people=, an alias of |authors= so none of the names listed in that parameter make it into the citation's metadata. All kinds of extraneous text are added to that parameter, mostly roles (director, producer, actor, voice-over, narrator, etc.), none of which belongs in the metadata. Now that cs1|2 supports template-specific parameters, we could introduce specific role parameters for these templates so that the names are annotated in the rendering, and the names without annotation are included in the metadata. In the meantime, |people= can be constrained to these templates only and, once the template-specific parameters are available, deprecated and withdrawn.

To avoid the torches and pitchforks militias from those wikiprojects that use these templates, those projects, whichever they are, should be consulted before we act on this.

—Trappist the monk (talk) 15:37, 2 November 2020 (UTC)[reply]

Sounds good to me in general. --Matthiaspaul (talk) 12:40, 3 November 2020 (UTC)[reply]

It is a good idea to reinstate |artist=. However, this may better be a free-form parameter since artist names may be idiosyncratic, and of course we have cases of compilation works, collaborations etc.

I would think the role parameters should follow industry practice, i.e. render as they do in "credits" sections of artistic works. I suppose distinct roles should be limited to the main creators/contributors. Minor credits could be bundled in |others=. 98.0.246.242 (talk) 22:09, 3 November 2020 (UTC)[reply]

Others

Moved from Template talk:Citation#Others. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 11:12, 10 November 2020 (UTC)[reply]

Has anyone analysed what are the commonest types of role added as |others=? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 10:53, 8 November 2020 (UTC)[reply]

Not that I know of. Such analysis will be difficult because tools like ve have misused (and may still be misusing) |others= for author names and for editor names (without role being specified). That is the problem with free-form parameters; editors and tools can put just about anything there. There are approximately 52k-ish uses of |others= [search results]

—Trappist the monk (talk) 11:47, 8 November 2020 (UTC)[reply]

So should we add more non-free-form parameters, like |illustrator=? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 13:58, 8 November 2020 (UTC)[reply]

Probably better asked at WT:CS1 which is a bit more-watched.

—Trappist the monk (talk) 14:19, 8 November 2020 (UTC)[reply]

The question seems somewhat (tangentially?) relevant to discussion in #CS1 maint: others. --Izno (talk) 19:06, 10 November 2020 (UTC)[reply]

I suggest author of foreword (P2679) is another likely candidate. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 23:06, 12 November 2020 (UTC)[reply]

Perhaps not a good candidate for |others=. cs1|2 book citations support forewords, afterwords, and other contributions to an author's book:

{{cite book |author=Author |title=Title |contributor=Contributor |contribution=Foreword}}

Contributor. Foreword. Title. By Author. {{cite book}}: |author= has generic name (help)

—Trappist the monk (talk) 23:14, 12 November 2020 (UTC)[reply]

While there are use-cases for |contribution= with |contributorn= and it is good that the feature supports |contributor-first= and |contributor-last= as well as n-enumerated variants, I don't like the fact that only one |contribution= is allowed and that it is impossible to specify different types of contributions for different contributors (unless lumping them all together in |contribution=). What also looks odd most of the time is that the contributors are listed in front of the authors as this draws too much attention to them:

{{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |contributor-first1=CF1 |contributor-last1=CL1 |contributor-first2=CF2 |contributor-last2=CL2 |contributor-first3=CF3 |contributor-last3=CL3 |contributor-first4=CF4 |contributor-last4=CL4 |contribution=Illustration/Foreword/Afterword |others=Others}}

CL1, CF1; CL2, CF2; CL3, CF3; CL4, CF4 (2020). "Illustration/Foreword/Afterword". Title. By AL1, AF1. EL1, EF1 (ed.). Translated by TL1, TF1. Others.{{cite book}}: CS1 maint: numeric names: authors list (link)

This is okay if the goal is to cite something from a foreword or afterword and draw particular attention to this specifically, but not if the goal is to cite a source in general and list the various contributors for completeness or because, e.g., the writer of a foreword was specifically "advertised" on the book cover. Right now, we'd have to use |others= for this, but this does not support enumerated and -first/-last parameter variants, and the article editor has to invent his/her own notation to list multiple contributors and their roles as in the following three examples:

{{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |others=CL1, CF1 (Illustration). CL2, CF2; CL3, CF3 (Foreword). CL4, CF4 (Afterword). Others}}

AL1, AF1 (2020). EL1, EF1 (ed.). Title. Translated by TL1, TF1. CL1, CF1 (Illustration). CL2, CF2; CL3, CF3 (Foreword). CL4, CF4 (Afterword). Others.{{cite book}}: CS1 maint: numeric names: authors list (link)

{{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |others=Illustration: CL1, CF1. Foreword: CL2, CF2; CL3, CF3. Afterword: CL4, CF4. Others}}

AL1, AF1 (2020). EL1, EF1 (ed.). Title. Translated by TL1, TF1. Illustration: CL1, CF1. Foreword: CL2, CF2; CL3, CF3. Afterword: CL4, CF4. Others.{{cite book}}: CS1 maint: numeric names: authors list (link)

{{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |others=Illustrated by CL1, CF1. Foreword by CL2, CF2; CL3, CF3. Afterword by CL4, CF4. Others}}

AL1, AF1 (2020). EL1, EF1 (ed.). Title. Translated by TL1, TF1. Illustrated by CL1, CF1. Foreword by CL2, CF2; CL3, CF3. Afterword by CL4, CF4. Others.{{cite book}}: CS1 maint: numeric names: authors list (link)

Before we now introduce individual parameters for all possible roles, what I would like to see is a mix of both, |contributor= and |others=:

Multiple possible contributors with different contributions (with support for -first/-last and enumerated forms), but listed after the list of authors, editors and translators (and before |others=). This could be achieved by adding |contributor-role= (and enumerated forms). If the role would be specified, it would be listed alongside the corresponding contributor. In order to allow multiple contributors contributing to the same type of contribution, the role should occur either before all or after the last contributor of a specific group (as in the example renderings above). The markup for this could be like this:

{{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |contributor-first1=CF1 |contributor-last1=CL1 |contributor-role1=Illustration |contributor-first2=CF2 |contributor-last2=CL2 |contributor-role2=Foreword |contributor-first3=CF3 |contributor-last3=CL3 |contributor-role3=Foreword |contributor-first4=CF4 |contributor-last4=CL4 |contributor-role4=Afterword |others=Others}}

As a further refinement we could make subsequent |contributor-role= parameters optional if they would specify the same role as that of the preceding contributor (|contributor-role3= here):

{{cite book |title=Title |date=2020 |author-first1=AF1 |author-last1=AL1 |editor-first1=EF1 |editor-last1=EL1 |translator-first1=TF1 |translator-last1=TL1 |contributor-first1=CF1 |contributor-last1=CL1 |contributor-role1=Illustration |contributor-first2=CF2 |contributor-last2=CL2 |contributor-role2=Foreword |contributor-first3=CF3 |contributor-last3=CL3 |contributor-first4=CF4 |contributor-last4=CL4 |contributor-role4=Afterword |others=Others}}
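The grouping rule proposed above can be sketched in a few lines of Python. This is only an illustration of the proposal, not code from Module:Citation/CS1: the function name `render_contributors` and the tuple layout are made up for the example, and |contributor-role= itself is a proposed parameter that does not exist. A missing role inherits the role of the preceding contributor, and the role is printed once after the last contributor of each group.

```python
def render_contributors(contributors):
    """Render (last, first, role) tuples grouped by role.

    Hypothetical helper: a role of None inherits the role of the
    preceding contributor, as suggested in the proposal above.
    """
    groups = []          # list of [role, [names]] in input order
    current_role = None
    for last, first, role in contributors:
        role = role or current_role      # inherit the preceding role
        name = f"{last}, {first}"
        if groups and groups[-1][0] == role:
            groups[-1][1].append(name)   # same role: extend the group
        else:
            groups.append([role, [name]])
        current_role = role

    parts = []
    for role, names in groups:
        text = "; ".join(names)
        if role:
            text += f" ({role})"         # role shown once, after the group
        parts.append(text + ".")
    return " ".join(parts)


example = [
    ("CL1", "CF1", "Illustration"),
    ("CL2", "CF2", "Foreword"),
    ("CL3", "CF3", None),                # role omitted, inherits "Foreword"
    ("CL4", "CF4", "Afterword"),
]
print(render_contributors(example))
```

With the four contributors from the markup above, this produces the same string as the first of the three |others= renderings.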

How to distinguish between the two forms? Either by the existence of |contribution=, by the existence of a |contributor-role= parameter, by introducing |others-first/-last/-role= instead of |contributor-first/-last/-role= or some mix of it.

--Matthiaspaul (talk) 20:11, 18 November 2020 (UTC)[reply]

The |contribution= and |contributor= pair are intended to cite the contributor's contribution to the work written by |author= as, for example, Anna Quindlen's introduction to Jane Austen's Pride and Prejudice, here where Quindlen is the writer who is being cited, not Austen, so it is correct that Quindlen is listed ahead of Austen in the citation. So, yes, [this] is okay if the goal is to cite something from a foreword or afterword and draw particular attention to this specifically because that is the defined purpose.

If an editor is not citing the writer of a foreword ... specifically "advertised" on the book cover, there is no need to clutter the citation with that extraneous detail; we don't need to distract or confuse the reader.

We should certainly not introduce individual parameters for all possible roles. If any such parameters are added they should only be added after careful consideration and when it can be shown that the new parameter is needed.

—Trappist the monk (talk) 13:50, 19 November 2020 (UTC)[reply]

I never proposed to introduce individual parameters for all possible roles, quite the opposite: I proposed to have a more general set of parameters that can be customized to suit all possible roles and use cases, so that we don't have to discuss this subject again and again. After all, whenever we added another set of parameters for a specific role, someone came around the corner asking for the next one. There is obviously a need to list some contributors, but the current system does not address all use cases (except through the free-text parameter |others=, which is unsatisfactory for most of the same reasons that we are fading out |editors= and |authors= in the long term).

While there have been several requests in the past to add this and I too have come into situations where it would have been great to handle more than one chapter in a single citation without having to lump them together in one parameter, I don't propose this. However, contributions are a completely different case, because there are often multiple contributions and of different types.

The Pride and Prejudice example you gave is a perfect example for the current use of |contribution= and |contributor=. I described this use case as well in my reply above. But it does not cover the more common use case where the afterword, foreword, illustrations, etc., are not by itself the subject to be cited, but they are nevertheless part of the contributions to a work and thus may be listed in a citation. (This is also why this ([1]) won't have the desired effect.) In this case, the contributions would be clutter when displayed before the main contributors. They should rather be listed following the main contributors like authors, editors and translators - basically they should be at the position where we show |others=. I could have worded my proposal to introduce |other-firstn=/|other-lastn=/|other-linkn=/|other-maskn= plus |other-rolen= (and fade out |others= in the long term). However, if we can combine this with the parameters for contributors we could just use the existing |contributor-firstn=/|contributor-lastn=/|contributor-linkn=/|contributor-maskn= for this as well and just add |contributor-rolen=.

--Matthiaspaul (talk) 20:17, 21 November 2020 (UTC)[reply]

Before we now introduce individual parameters for all possible roles, what I would like to see is a mix of both, |contributor= and |others=: ... reads, to me, like this mix of both is merely a prelude to the [introduction of] individual parameters for all possible roles which is something that we should not do.

I am not convinced that we need anything more than a carefully curated, select few, role-type parameters. We do not need something that will allow editors to name every last person who was even remotely connected to the cited work. We do not need to be film-credit-like and include the craft-services' third journeyman soup stirrer; leave that to the publisher.

I can imagine certain additional roles being added to replace |people= and |credits= which are predominantly used in {{cite AV media}}, {{cite episode}}, and {{cite serial}}. These new role parameters would be constrained to these templates.

But it does not cover the more common use case where the afterword, foreword, illustrations, etc., are not by itself the subject to be cited, but they are nevertheless part of the contributions to a work and thus may be listed in a citation. You're right, it doesn't and it shouldn't. When an afterword, foreword, introduction, preface, etc is not the subject to be cited, such contributions, noteworthy though they may be, are superfluous to the purpose of the citation which is to identify for the reader the subject to be cited. Including mention of afterwords, forewords, introductions, prefaces when they are not the subject to be cited merely obfuscates the subject to be cited within the citation and so does not benefit the reader. cs1|2 is not a repository for all possible bibliographic data associated with a source. If you want that, go write a template series to do that. It may be that in bibliographic lists of an author's works, for example, such a bibliographic information template might be desirable. Citations need only the bibliographic detail that is sufficient to identify the portion of the source that is the subject to be cited.

—Trappist the monk (talk) 18:49, 22 November 2020 (UTC)[reply]

My experience with "others" is that it is usually used incorrectly, for instance for authors after the first one. —David Eppstein (talk) 23:23, 12 November 2020 (UTC)[reply]

Even though the documentation has problems, in this case it correctly leads the horse to the water. 71.247.146.98 (talk) 12:56, 13 November 2020 (UTC)[reply]

Redirection

Tangent Why is that talk page un-redirected? --Izno (talk) 13:19, 10 November 2020 (UTC)[reply]

Don't know. Probably should be don't you think?

—Trappist the monk (talk) 15:05, 10 November 2020 (UTC)[reply]

As far as I understood, {{Citation}} is for CS2, not CS1. If so, redirecting here ("Help talk:Citation Style 1") would probably be wrong. I'm all for merging CS1 and CS2, but for as long as this hasn't happened, CS2 followers probably need a place to hold out as well. However, crosslinking would be appropriate, so that discussions won't be missed (as it apparently happens often).

--Matthiaspaul (talk) 16:29, 10 November 2020 (UTC)[reply]

The CS1 module handles CS2 and questions regarding it are 99% applicable to both. Help talk:CS2 also redirects here. --Izno (talk) 18:44, 10 November 2020 (UTC)[reply]

Almost, Help talk:Citation Style 2. Perhaps, we should redirect Template talk:Citation there?

--Matthiaspaul (talk) 22:08, 10 November 2020 (UTC)[reply]

No. Here is best. Help talk:Citation Style 2 has 29 watchers. Template talk:Citation has 201 watchers. This page has 384 watchers. No doubt, many of those watchers are the same.

—Trappist the monk (talk) 22:16, 10 November 2020 (UTC)[reply]

Merge the pages, rename & redirect. Only after the appropriate discussion. What the module does is irrelevant to how humans discuss and categorize things. If editors want to have separate pages for discussion because it makes sense to them, then that is how it should be. 208.251.187.170 (talk) 12:55, 11 November 2020 (UTC)[reply]

Several co-authors that share the same last name

Example situation: article Nova 9: The Return of Gir Draxon contains a reference in which the authors Hartley, Patricia, Kirk all share the same last name, Lesser. Assuming all three are related in some way (no hidden text was provided to say otherwise), isn't there a way to condense the list of authors to display all three first names but only one last name for the group? I'd like to suggest a clearer mention on the template page, though the use of such a parameter would be limited.

Semi-related, I removed the name-list-style=amp parameter from the citation mentioned above, because displaying both a semicolon and an ampersand to separate the three authors just looks weird. — Christopher, Sheridan, OR (talk) 05:58, 2 November 2020 (UTC)[reply]

You can do so by setting |author-mask_n= for each. I personally deplore the style you are using because it is hell to figure out bad citations when it is employed. (If you provide all of the first/last pairs as expected, it is slightly better but still leaves me at least somewhat salty. :) --Izno (talk) 06:07, 2 November 2020 (UTC)[reply]

Thank you for the suggestion. However, the Template:Cite journal is not clear about how it should be used, and doesn't give appropriate examples for demonstration. I would therefore have to play around with the author-mask_n= parameter to figure out how to use it. That's part of the problem I'd like to report—lack of examples on the template page on how to use it. — Christopher, Sheridan, OR (talk) 06:19, 2 November 2020 (UTC)[reply]

The documentation is quite dense already, so I think examples would be hard to fit. That said, this might be to your liking: Smith, A., B, & C. Title.. I think that is not really that great a style, but perhaps you have something in mind that will make it obvious. --Izno (talk) 06:24, 2 November 2020 (UTC)[reply]

Yes, that is what I'd like to see. Thank you for the assistance; I shall implement it on the article immediately.
— Christopher, Sheridan, OR (talk) 06:32, 2 November 2020 (UTC)[reply]

Fixed evaluation of accept-this-as-is syntax in parameters supporting item lists

Template parameters supporting item lists such as |pages=, |pp=, |issue=, |number= (and now also |quote-pages=) supported the accept-this-as-is syntax to suppress the conversion of hyphens to dashes globally as well as for individual list items. However, a bug prevented the code from properly evaluating item lists, where the first and the last list items were using this syntax. Such combinations were erroneously interpreted as if the global accept-this-as-is markup was used, resulting in invalid list items (fifth and last example). This has been fixed now:

Extended content

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=1-3,5-7\|title=Title}}`
Live	Author. "Title". Journal: 1–3, 5–7. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: 1–3, 5–7. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=1,201-1,234\|title=Title}}`
Live	Author. "Title". Journal: 1, 201–1, 234. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: 1, 201–1, 234. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=((1,201–1,234))\|title=Title}}`
Live	Author. "Title". Journal: 1,201–1,234. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: 1,201–1,234. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=((1-3,5-7))\|title=Title}}`
Live	Author. "Title". Journal: 1-3,5-7. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: 1-3,5-7. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=((1-3)),((5-7))\|title=Title}}`
Live	Author. "Title". Journal: 1-3, 5-7. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: 1-3, 5-7. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=((1-3)),5-7\|title=Title}}`
Live	Author. "Title". Journal: 1-3, 5–7. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: 1-3, 5–7. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=((((1-3)),((5-7))))\|title=Title}}`
Live	Author. "Title". Journal: ((1-3)),((5-7)). `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: ((1-3)),((5-7)). `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=((1-3)),((5-7)),9-10\|title=Title}}`
Live	Author. "Title". Journal: 1-3, 5-7, 9–10. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: 1-3, 5-7, 9–10. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|pages=((1-3)),5-7,((9-10))\|title=Title}}`
Live	Author. "Title". Journal: 1-3, 5–7, 9-10. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal: 1-3, 5–7, 9-10. `{{cite journal}}`: `\|author=` has generic name (help)

--Matthiaspaul (talk) 02:19, 4 November 2020 (UTC)[reply]
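The behaviour shown in the comparisons above can be approximated with a short Python sketch. It is not the module's Lua code; the function names are invented for illustration, and the rule used to tell global markup from per-item markup (the text between the outer `((` and `))` must itself have balanced double parentheses) is an assumption that happens to reproduce all nine rows.

```python
import re

SEPARATORS = re.compile(r"\s*[,;]\s*")       # list items split on , or ;
RANGE = re.compile(r"^(\w+)\s*-\s*(\w+)$")   # simple "a-b" hyphenated range


def _is_balanced(text):
    """True if every '))' in text closes an earlier '(('."""
    depth = 0
    i = 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "((":
            depth += 1
            i += 2
        elif pair == "))":
            depth -= 1
            if depth < 0:
                return False
            i += 2
        else:
            i += 1
    return depth == 0


def strip_markup(text):
    """Return (inner, True) if text as a whole is wrapped in ((...))."""
    if len(text) >= 4 and text.startswith("((") and text.endswith("))"):
        inner = text[2:-2]
        if _is_balanced(inner):
            return inner, True
    return text, False


def format_pages(value):
    """Convert hyphens to en dashes except where ((...)) says not to."""
    value = value.strip()
    inner, accepted = strip_markup(value)
    if accepted:                      # global accept-this-as-is markup
        return inner
    items = []
    for item in SEPARATORS.split(value):
        inner, accepted = strip_markup(item)
        if not accepted:
            inner = RANGE.sub("\\1\u2013\\2", inner)
        items.append(inner)
    return ", ".join(items)


print(format_pages("((1-3)),5-7,((9-10))"))
```

One limitation of the sketch: a comma inside per-item markup, as in `((1,2)),3`, would be split apart, which a real implementation would have to guard against.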

The parameter evaluation for |volume= internally uses parts of the same code for list item evaluation, hyphen-to-dash conversion, and accept-this-as-is markup recognition as used for |issue=, |pages=, etc. above. However, a bug in the somewhat-heuristic code deciding if a volume value should be presented in boldface or not prevented this from being executed if the given argument was longer than 4 characters. This has now been fixed as well.

As before, the volume is shown in boldface only if it is a single number consisting of either Arabic or Roman digits only or if it is not longer than 4 characters in total, that is, ranges are displayed in boldface only if they are very short, and list items framed with the accept-this-as-is markup are never shown in boldface. However, given the many requests in the past asking to not display volumes in boldface at all, this can be seen as a feature as well to optionally suppress boldface also for short volume values: ((1)), ((X)), ((1-2)), ((1–2)).

Extended content

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|title=Title\|volume=2}}`
Live	Author. "Title". Journal. 2. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal. 2. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|title=Title\|volume=((2))}}`
Live	Author. "Title". Journal. 2. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal. 2. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|title=Title\|volume=X}}`
Live	Author. "Title". Journal. X. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal. X. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|title=Title\|volume=((X))}}`
Live	Author. "Title". Journal. X. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal. X. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|title=Title\|volume=1-2}}`
Live	Author. "Title". Journal. 1–2. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal. 1–2. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|title=Title\|volume=((1-2))}}`
Live	Author. "Title". Journal. 1-2. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal. 1-2. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|title=Title\|volume=1–2}}`
Live	Author. "Title". Journal. 1–2. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal. 1–2. `{{cite journal}}`: `\|author=` has generic name (help)

Cite journal comparison
Wikitext	`{{cite journal\|author=Author\|journal=Journal\|title=Title\|volume=((1–2))}}`
Live	Author. "Title". Journal. 1–2. `{{cite journal}}`: `\|author=` has generic name (help)
Sandbox	Author. "Title". Journal. 1–2. `{{cite journal}}`: `\|author=` has generic name (help)

--Matthiaspaul (talk) 20:40, 4 November 2020 (UTC)[reply]
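Read literally, the boldface rule described above might look like the following Python sketch. It is an interpretation of the prose only, not the module's heuristic: the function name is hypothetical, "Roman digits" is assumed to mean uppercase I, V, X, L, C, D, M, and length is counted in characters rather than bytes.

```python
import re


def volume_is_bold(value):
    """Guess whether a |volume= value would be rendered in boldface."""
    value = value.strip()
    # Values framed with accept-this-as-is markup are never bold.
    if value.startswith("((") and value.endswith("))"):
        return False
    # A single number in Arabic or Roman digits is bold at any length.
    if re.fullmatch(r"[0-9]+", value) or re.fullmatch(r"[IVXLCDM]+", value):
        return True
    # Anything else is bold only if it is at most 4 characters long.
    return len(value) <= 4


for sample in ["2", "((2))", "X", "1-2", "((1-2))", "1-200"]:
    print(sample, volume_is_bold(sample))
```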

If this is a way to circumvent/subvert the module styling, please find another solution or revert yourself. --Izno (talk) 21:01, 4 November 2020 (UTC)[reply]

This would be pointless as the volume evaluation code has always been based on heuristics trying to cover the most common cases in the most desirable way for most users, but it never ruled out potentially invalid entries. The fixed code is an improvement on this, but it still does not rule out all corner-cases, also to keep the changes minimal and the code small.

If the above mentioned behaviour (which was not some deliberately coded feature) would be actually undesired it might be possible to add extra code to explicitly test for this condition and disallow it, but I think it is easier to just not enter them this way (as before). And to rule out these combinations, that code would have to be added to the original code as well, so nothing would be gained by reverting.

However, I mentioned this possibility because we have had many requests in the past to streamline the display of volumes (that is, to not bold them at all), so some users might even find this useful (if documented accordingly). The existing heuristics were the result of trying to find a compromise so that some short and special types of volumes would be displayed in boldface whereas others would not. This works exactly like before.

--Matthiaspaul (talk) 22:40, 4 November 2020 (UTC)[reply]

An aside: I doubt that the "existing heuristics" was the result of any compromise. If I remember correctly, some years back, somebody suggested that long volume labels be unbolded because of reasons (probably purely esthetic). The initial "discussion" was barely 3 comments long, IIRC. And that was it, |volume= was reclassified into the bipolar bin. As you state, many people have asked for a resolution either way (all bold font or all regular). It must be somebody's pet cause, because nothing has transpired. Other than that, if your edits cause no harm and correct a bug (personally I was not aware of it) then I don't see why they shouldn't stand. 98.0.246.242 (talk) 03:43, 5 November 2020 (UTC)[reply]

FWIW, here are some links to former discussions regarding the bolding/non-bolding of the volume label:

--Matthiaspaul (talk) 21:07, 16 November 2020 (UTC)[reply]

Italics

I want to italicize the newspapers in Dietrich Adam but it comes up with an error. Please allow the option to do it manually, I hate it when things are controlled by a template.† Encyclopædius 13:12, 5 November 2020 (UTC)[reply]

Use {{cite news}} and |newspaper=

{{cite news |url=https://www.spiegel.de/kultur/tv/dietrich-adam-ist-tot-friederich-stahl-in-sturm-der-liebe-a-548adb45-64fe-49ce-8e71-58a7cce9c3a9 |title=Schauspieler Dietrich Adam ist tot |newspaper=Der Spiegel |date=4 November 2020|access-date=5 November 2020 |language=de}}

"Schauspieler Dietrich Adam ist tot". Der Spiegel (in German). 4 November 2020. Retrieved 5 November 2020.

Did the error message help text not answer this question?

—Trappist the monk (talk) 13:19, 5 November 2020 (UTC)[reply]

Wikipedia doesn't actually force anyone to use citation templates. The only requirement is that the style you use looks identical to the one in the rest of the article. Glades12 (talk) 13:48, 6 November 2020 (UTC)[reply]

Request for the "nbk" (NCBI bookshelf) attribute for "cite book"

Please add the "nbk" attribute for the "cite book" template to specify the NCBI NBK number. You already have the "pmc" and "pmid" attributes, but the "nbk" is different. It refers to the NCBI bookshelf site, which has a different URL format from PubMed Central. The URL to the bookshelf looks like https://www.ncbi.nlm.nih.gov/books/NBK557634/ (where 557634 is the NCBI NBK number). My idea is that when you specify the "nbk" in "cite book", the direct URL to the book at the NCBI site will be generated. Currently, NCBI bookshelf books cannot be accessed directly from Wikipedia or other Wikimedia sites that allow the "cite book" template. Maxim Masiutin (talk) 19:42, 6 November 2020 (UTC)[reply]

Weird category text

What's going on with Category:CS1 errors: dates? A bunch of sectioned text just appeared today, that don't have to do with dates. Does it have to do with the {{#lst}} stuff? I don't understand how those work. kennethaw88 • talk 22:14, 6 November 2020 (UTC)[reply]

Thanks for reporting this. A couple of hours ago I swapped some sections at Help:CS1 errors to reestablish the alphabetical order of entries, however, I must have overlooked something. As Izno reverted me, the effect should already have been gone by now. To be sorted out.

--Matthiaspaul (talk) 23:03, 6 November 2020 (UTC)[reply]

Fixed. --Matthiaspaul (talk) 11:04, 7 November 2020 (UTC)[reply]

Triple curly

From Women in the Byzantine Empire:

{{cite book| author = | chapter = | chapter-url = | format = | url = | title = [[The Oxford Dictionary of Byzantium]] | orig-year = | agency = ed. by Dr. [[Alexander Kazhdan]] | edition = |location= N. Y. |date = 1991 |publisher= |volume= {{{том|}}} | pages = {{{страницы|}}}| series = | isbn = 0-19-504652-8| ref = {{harvid|Kazhdan|1991}}}}

Produces:

The Oxford Dictionary of Byzantium. N. Y. 1991. ISBN 0-19-504652-8. {{cite book}}: Unknown parameter |agency= ignored (help)CS1 maint: location missing publisher (link)

Are triple curly-brackets {{{том|}}} and {{{страницы|}}} error or feature? -- GreenC 16:09, 7 November 2020 (UTC)[reply]

The template variables are in the first version of that article. cs1|2 does not see them because they are empty strings by the time the template is passed to Module:Citation/CS1.

—Trappist the monk (talk) 16:30, 7 November 2020 (UTC)[reply]

(edit conflict) It's an error caused by copying and pasting the template from the Russian Wikipedia when the article was created. I found only one other instance of this problem in article space, so it looks like it is not a big problem. – Jonesey95 (talk) 16:35, 7 November 2020 (UTC)[reply]

This is good news as finding the template's terminus }} when there are triple curly brackets embedded raised some edge case complications, now they can just be logged and removed. -- GreenC 00:40, 8 November 2020 (UTC)[reply]
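For the log-and-remove step, a rough Python sketch could look like this. It assumes that only triple-brace template variables with an empty default, such as `{{{том|}}}`, need handling; the function name is made up, and a real tool would need more care with nested markup.

```python
import re

# Matches {{{name|}}}: a template variable whose default is the empty string.
EMPTY_VARIABLE = re.compile(r"\{\{\{([^{}|]*)\|\}\}\}")


def strip_empty_variables(wikitext):
    """Return (cleaned wikitext, list of variable names that were removed)."""
    found = EMPTY_VARIABLE.findall(wikitext)
    cleaned = EMPTY_VARIABLE.sub("", wikitext)
    return cleaned, found


sample = "|volume= {{{том|}}} | pages = {{{страницы|}}}| series = "
cleaned, found = strip_empty_variables(sample)
print(found)      # names to log
print(cleaned)    # citation text without the variables
```

Since such variables are already empty by the time the template is evaluated, removing them should not change the rendered citation.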

Epic citations

Occasionally come across citations that might be described as "epic". From Parallel (operator):

<ref name="Cajori_1928">{{cite book |author-first=Florian |author-last=Cajori |author-link=Florian Cajori |title=A History of Mathematical Notations – Notations in Elementary Mathematics |chapter=§ 184, § 359, § 368 |volume=1 |orig-date=September 1928 |publisher=[[Open court publishing company]] |location=Chicago, US |date=1993 |edition=two volumes in one unaltered reprint |pages=[https://archive.org/details/historyofmathema00cajo_0/page/193 193, 402–403, 411–412] |isbn=0-486-67766-4 |lccn=93-29211 |url=https://archive.org/details/historyofmathema00cajo_0/page/193 |access-date=2019-07-22 |quote-pages=402–403, 411–412 |quote=§359. […] ∥ for parallel occurs in [[William Oughtred|Oughtred]]'s ''Opuscula mathematica hactenus inedita'' (1677) [p. 197], a posthumous work (§ 184) […] §368. Signs for parallel lines. […] when [[Robert Recorde|Recorde]]'s sign of equality won its way upon [[the Continent]], vertical lines came to be used for parallelism. We find ∥ for "parallel" in [[John Kersey the elder|Kersey]],{{citeref|A|ref=FC-A}} [[John Caswell|Caswell]], [[William Jones (mathematician)|Jones]],{{citeref|B|ref=FC-B}} Wilson,{{citeref|C|ref=FC-C}} [[William Emerson (mathematician)|Emerson]],{{citeref|D|ref=FC-D}} Kambly,{{citeref|E|ref=FC-E}} and the writers of the last fifty years who have been already quoted in connection with other pictographs. Before about 1875 it does not occur as often […] Hall and Stevens{{citeref|F|ref=FC-F}} use "par{{citeref|F|ref=FC-F}} or ∥" for parallel […] {{anchor|FC-A}}[A] [[John Kersey the elder|John Kersey]], ''{{citeref|Kersey (the elder)|1673|Algebra|style=plain}}'' (London, 1673), Book IV, p. 177. {{anchor|FC-B}}[B] [[William Jones (mathematician)|W. Jones]], ''Synopsis palmarioum matheseos'' (London, 1706). {{anchor|FC-C}}[C] John Wilson, ''Trigonometry'' (Edinburgh, 1714), characters explained. {{anchor|FC-D}}[D] [[William Emerson (mathematician)|W. Emerson]], ''Elements of Geometry'' (London, 1763), p. 4. 
{{anchor|FC-E}}[E] {{ill|Ludwig Kambly{{!}}L. Kambly|de|Ludwig Kambly}}, ''Die Elementar-Mathematik'', Part 2: ''Planimetrie'', 43. edition (Breslau, 1876), p. 8. […] {{anchor|FC-F}}[F] H. S. Hall and F. H. Stevens, ''Euclid's Elements'', Parts I and II (London, 1889), p. 10. […]}} [https://monoskop.org/images/2/21/Cajori_Florian_A_History_of_Mathematical_Notations_2_Vols.pdf]</ref>

Might we have a page to document epic/creative usage of a single CS1|2 citation. -- GreenC 14:49, 8 November 2020 (UTC)[reply]

Already exists, though likely, very few of us know of it: Module talk:Citation/CS1/Rogues gallery.

—Trappist the monk (talk) 15:05, 8 November 2020 (UTC)[reply]

It seems the main problem here is the misused |quote=. Personally I would only use that parameter to quote items relevant to the publication itself (from the verso, index, toc etc.). I would use footnotes for any quoted content. 65.204.10.231 (talk) 15:33, 8 November 2020 (UTC)[reply]

There is one even longer in Exponentiation. I think it will be the specimen for the museum gallery. -- GreenC 02:56, 11 November 2020 (UTC)[reply]

Now on display (last entry). -- GreenC 03:05, 11 November 2020 (UTC)[reply]

Epic enough to have its own page in article space. 208.251.187.170 (talk) 13:08, 11 November 2020 (UTC)[reply]

Improving COinS metadata output

Investigating the COinS metadata output I have spotted some areas for possible improvement on various levels. Since most of them are small and/or affect corner-cases only they aren't worth individual threads polluting the TOC, so I will combine them into this thread.

There will be more, but so far there have been only two changes, both related to the metadata generated for identifiers which have no predefined &rft.<id-name> or &rft_id=info.<id-name> tags associated with them within COinS. For such identifiers, the template uses the &rft_id=<id-link> tag to provide URLs to the external resource. The code assembling such URLs uses prefix and suffix definitions from a table defining the various properties for the identifiers. While the suffix was added to the visible URLs, there was a bug omitting to add the suffix to the identifier URLs for COinS as well. This has been fixed. However, this is an internal change only and has no impact on the actually generated metadata because none of the identifiers defined so far actually used a suffix.

On the receiver side, users of the identifier data passed through via URLs may want to retranslate it back into a human-readable form "<id-name> <id-number>". While it is sometimes possible to derive the identifier type from the URL, this is not always the case. For example, DOI and bioRxiv as well as JFM and Zbl identifiers both resolve to the same URLs, respectively:

DOI <id-number> → "&rft_id=//doi.org/<id-number>" → ?
bioRxiv <id-number> → "&rft_id=//doi.org/<id-number>" → ?
JFM <id-number> → "&rft_id=//zbmath.org/?format=complete&q=an:<id-number>" → ?
Zbl <id-number> → "&rft_id=//zbmath.org/?format=complete&q=an:<id-number>" → ?

This is not a problem in the DOI case, because a predefined info:doi tag exists and thus is used by the metadata generator instead of creating a URL for it.

DOI <id-number> → "&rft_id=info:doi/<id-number>" → DOI <id-number>

However, to make the URLs more usable on the receiver side, the generator now appends a URI #fragment to the URLs indicating the name of the identifier. This is transparent for browsers (should this metadata be copied and pasted into the address line of a browser), but is readable for humans and scripts, which can thereby pick up the original name and translate the URL back into the "<id-name> <id-number>" form for storage in their database. Examples:

bioRxiv <id-number> → "&rft_id=//doi.org/<id-number>#id-name=bioRxiv" → bioRxiv <id-number>
JFM <id-number> → "&rft_id=//zbmath.org/?format=complete&q=an:<id-number>#id-name=JFM" → JFM <id-number>
Zbl <id-number> → "&rft_id=//zbmath.org/?format=complete&q=an:<id-number>#id-name=Zbl" → Zbl <id-number>
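As an illustration of the receiver side, here is a minimal Python sketch of how a script might recover the identifier name from such a URL. The function name `split_id_url` and the sample identifier numbers are hypothetical; only the `#id-name=` fragment convention comes from the examples above.

```python
from urllib.parse import urlsplit, parse_qs


def split_id_url(rft_id):
    """Return (id_name, url_without_fragment) for an rft_id URL value.

    id_name is None when the URL carries no "#id-name=..." fragment.
    """
    parts = urlsplit(rft_id)
    # The fragment has the key=value shape "id-name=JFM", so parse_qs can read it.
    names = parse_qs(parts.fragment).get("id-name", [None])
    # Drop the fragment to get back the plain resolver URL.
    url = parts._replace(fragment="").geturl()
    return names[0], url


# Hypothetical identifier numbers, used for illustration only.
print(split_id_url("//doi.org/10.1101/123456#id-name=bioRxiv"))
print(split_id_url("//zbmath.org/?format=complete&q=an:0001.00001#id-name=Zbl"))
```

A receiver could then store the pair as "<id-name> <id-number>" once it has stripped the known URL prefix for that identifier.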

There are some interesting concepts for further encoding information in URI fragments to describe a resource or make it automatically actionable on the client's side. If we found a low-footprint scheme formally describing the URL as a link to information related to a specific entity of a named identifier, this could be further refined.

--Matthiaspaul (talk) 17:36, 10 November 2020 (UTC) (updated 22:45, 10 November 2020 (UTC), updated 14:26, 16 November 2020 (UTC))[reply]

I believe one or another of your changes has caused the error in test_Zbl in Module talk:Citation/CS1/errors. --Izno (talk) 19:53, 10 November 2020 (UTC)[reply]

Thanks, according to Module_talk:Citation/CS1/testcases/errors this should be fixed now (but fixing this I spotted another issue in the existing code still to be fixed). --Matthiaspaul (talk) 23:57, 10 November 2020 (UTC)[reply]

URL in identifier

Bunce, Mrs. Oliver Bell (1 September 1897). "The Turkish Compassionate Fund". The Decorator and Furnisher. doi:10.2307/25585322. JSTOR https://www.jstor.org/stable/25585322. {{cite web}}: Check |jstor= value (help); External link in |JSTOR= (help)

|JSTOR= should emit an error. --Izno (talk) 18:49, 10 November 2020 (UTC)[reply]

|jstor= is one of three external identifiers that don't get some sort of check (the others are |osti= and |rfc=). |jstor= can hold a variety of identifiers:

And then there is stuff like this that doesn't work:

Because there is such a diversity of |jstor= identifiers, we may not be able to validate them.

I think that |osti= and |rfc= are simple numeric identifiers. Likely we have not bothered to check these because there are relatively few uses of these identifiers. The highest |rfc= number seems to be between 8000 and 9000. The highest |osti= number seems to be between 22000000 and 23000000. So these two could be given simple limit checks like we do for |pmc=.

—Trappist the monk (talk) 23:53, 10 November 2020 (UTC)[reply]

Sounds about right for RFC. Not familiar with OSTI.

As for JSTOR, here are some ideas: flag a value that looks like a URL, or that contains spaces, as an error. We should already have URL detection from title checking, which would have caught at least two pages. (Not sure about schemeless URLs?) --Izno (talk) 01:48, 11 November 2020 (UTC)[reply]

Cite book comparison
Wikitext	`{{cite book|rfc=1|title=Title}}`
Live	Title. RFC 1.
Sandbox	Title. RFC 1.

Cite book comparison
Wikitext	`{{cite book|rfc=10000|title=Title}}`
Live	Title. RFC 10000. `{{cite book}}`: Check `|rfc=` value (help)
Sandbox	Title. RFC 10000. `{{cite book}}`: Check `|rfc=` value (help)

Cite book comparison
Wikitext	`{{cite book|osti-access=free|osti=1|title=Title}}`
Live	Title. OSTI 1. `{{cite book}}`: Check `|osti=` value (help)
Sandbox	Title. OSTI 1. `{{cite book}}`: Check `|osti=` value (help)

Cite book comparison
Wikitext	`{{cite book|osti=23000001|title=Title}}`
Live	Title. OSTI 23000001.
Sandbox	Title. OSTI 23000001.

—Trappist the monk (talk) 00:14, 15 November 2020 (UTC)[reply]

Has anyone seen OSTIs lower than 1018? Otherwise we could raise the lower limit from 1 to 1018.

--Matthiaspaul (talk) 23:08, 15 November 2020 (UTC)[reply]

Since I have so far found no lower OSTI numbers supported by the OSTI site, and only considerably higher numbers in WP, I have now changed the lower bound to 1018 to catch at least some "stray digit" errors:

Cite book comparison
Wikitext	`{{cite book|osti=0|title=Title}}`
Live	Title. OSTI 0. `{{cite book}}`: Check `|osti=` value (help)
Sandbox	Title. OSTI 0. `{{cite book}}`: Check `|osti=` value (help)

Cite book comparison
Wikitext	`{{cite book|osti=1017|title=Title}}`
Live	Title. OSTI 1017. `{{cite book}}`: Check `|osti=` value (help)
Sandbox	Title. OSTI 1017. `{{cite book}}`: Check `|osti=` value (help)

Cite book comparison
Wikitext	`{{cite book|osti=1018|title=Title}}`
Live	Title. OSTI 1018.
Sandbox	Title. OSTI 1018.

Cite book comparison
Wikitext	`{{cite book|rfc=0|title=Title}}`
Live	Title. RFC 0. `{{cite book}}`: Check `|rfc=` value (help)
Sandbox	Title. RFC 0. `{{cite book}}`: Check `|rfc=` value (help)

Please report if you find a lower number somewhere.

--Matthiaspaul (talk) 23:59, 16 November 2020 (UTC)[reply]
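For readers following along, the limit checks discussed in this thread can be sketched in Python roughly as below. This is not the module's code: the table name `LIMITS` and the function `id_in_range` are hypothetical, the lower bounds follow the test cases above, and the upper bounds are placeholders taken from the rough estimates earlier in the thread, so the values actually used in the module may differ.

```python
import re

# Lower bounds follow the test cases in this thread (RFC 0 and OSTI 1017 are
# rejected, RFC 1 and OSTI 1018 are accepted). Upper bounds are assumed
# placeholders based on the estimates given above, not confirmed values.
LIMITS = {
    "rfc": (1, 9000),
    "osti": (1018, 23000000),
}


def id_in_range(name, value):
    """Return True if value is a plain decimal number within the limits for name."""
    if not re.fullmatch(r"[0-9]+", value):
        return False  # not a simple numeric identifier
    low, high = LIMITS[name]
    return low <= int(value) <= high


print(id_in_range("osti", "1017"))
print(id_in_range("osti", "1018"))
```

Such a check only catches values outside the known range (for example a stray extra digit); it cannot tell whether an in-range number refers to the intended document.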

Both URL scheme and space detection could be useful, although I couldn't find any JSTORs starting with "http:", etc. (probably fixed by you already?). I found about 20 citations with invalid JSTORs starting with "www.jstor.org", though. So, an identifier value starting with the domain name from the URL prefix from /Configuration could be a good pattern as well in general, but, given that the other identifiers have more sophisticated validation checks already, it would only make sense to add this to JSTOR - and it still wouldn't catch someone just entering garbage...

--Matthiaspaul (talk) 16:10, 16 November 2020 (UTC)[reply]

Yeah, but at best it's a maintenance category or a properties category while we review to see what looks like trash. If we were to do something like that, we'd want to exclude obvious ones like DOI-like identifiers, as a first case. --Izno (talk) 16:31, 16 November 2020 (UTC)[reply]

A test for stray spaces and "http(s)://" at the start of the identifier string has been added to the JSTOR code.

Cite book comparison
Wikitext	`{{cite book|jstor=141294|title=Title}}`
Live	Title. JSTOR 141294.
Sandbox	Title. JSTOR 141294.

Cite book comparison
Wikitext	`{{cite book|jstor=141 294|title=Title}}`
Live	Title. JSTOR 294 141 294. `{{cite book}}`: Check `|jstor=` value (help)
Sandbox	Title. JSTOR 294 141 294. `{{cite book}}`: Check `|jstor=` value (help)

Cite book comparison
Wikitext	`{{cite book|jstor=141dfdfdf29 4|title=Title}}`
Live	Title. JSTOR 4 141dfdfdf29 4. `{{cite book}}`: Check `|jstor=` value (help)
Sandbox	Title. JSTOR 4 141dfdfdf29 4. `{{cite book}}`: Check `|jstor=` value (help)

Cite book comparison
Wikitext	`{{cite book|jstor=http://141294|title=Title}}`
Live	Title. JSTOR http://141294. `{{cite book}}`: Check `|jstor=` value (help)
Sandbox	Title. JSTOR http://141294. `{{cite book}}`: Check `|jstor=` value (help)

Cite book comparison
Wikitext	`{{cite book|jstor=https://141294|title=Title}}`
Live	Title. JSTOR https://141294. `{{cite book}}`: Check `|jstor=` value (help)
Sandbox	Title. JSTOR https://141294. `{{cite book}}`: Check `|jstor=` value (help)

However, there is still an older bug invalidating strings with spaces (also present in the live code).

--Matthiaspaul (talk) 16:50, 19 November 2020 (UTC)[reply]

Should be fixed now by encoding the id as well.

--Matthiaspaul (talk) 20:22, 19 November 2020 (UTC)[reply]
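The JSTOR test described in this subthread (reject whitespace anywhere and "http(s)://" at the start) could be sketched in Python along these lines. The function name `jstor_looks_valid` is hypothetical and this is not the module's actual code; as noted above, values starting with "www.jstor.org" or plain garbage would still pass.

```python
import re


def jstor_looks_valid(value):
    """Return False for values containing whitespace or starting with http(s)://."""
    if re.search(r"\s", value):
        return False  # stray space (or other whitespace) inside the identifier
    if re.match(r"https?://", value, re.IGNORECASE):
        return False  # a URL was pasted instead of the bare identifier
    return True


for sample in ["141294", "141 294", "141dfdfdf29 4", "http://141294", "https://141294"]:
    print(sample, jstor_looks_valid(sample))
```

The samples are the same values used in the comparison tables above; only the first one passes.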

Add an iaident parameter

CS1 templates are very complex and ever changing, and writing a bot to enhance certain references, such as book references, to make them more easily accessible to readers can have unintended side-effects, consequences that may actually make things worse. I propose adding two new parameters to the CS1 templates. The first one is iaident. When this is populated, the module can figure out where to put the link to archive.org. If a URL is lacking, the link goes where any URL would normally go; if a URL is present, the module can perhaps append the link to the citation in some way, like "View at archive.org" or something like that. The URL would be https://archive.org/details/<iaident>. The second parameter would be iaoffset. In certain cases where pages don't link properly, iaoffset would be used to direct the server to the correct page/location of the media being viewed. This is the raw location. When used, the URL simply becomes https://archive.org/details/<iaident>/page/n<iaoffset>.

These two additions will have no impact on existing citations and will allow a more harmonious addition of readable page previews to citations without stepping on anyone's toes, or accidentally breaking something in an existing reference.—CYBERPOWER (Chat) 13:28, 16 November 2020 (UTC)[reply]
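To make the proposed URL scheme concrete, here is a minimal Python sketch of the two URL forms described in the proposal. The function name `ia_url` is hypothetical, and the parameter names simply mirror the proposed iaident and iaoffset; nothing here reflects an existing CS1 implementation.

```python
def ia_url(iaident, iaoffset=None):
    """Build the archive.org URL for an item, optionally pointing at a raw scan offset."""
    url = "https://archive.org/details/" + iaident
    if iaoffset is not None:
        # The "n" prefix addresses the raw scan location rather than a printed page number.
        url += "/page/n" + str(iaoffset)
    return url


print(ia_url("sixmonthsatwhit02carpgoog"))
print(ia_url("sixmonthsatwhit02carpgoog", 10))
```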

We already have provision for archive links - why do we need special provision for the Internet Archive? They don't need any further advertising here. Nigel Ish (talk) 14:07, 16 November 2020 (UTC)[reply]

Nigel Ish, what I proposed is not an archive link, it's a link to a book scan at Internet Archive for readers to preview in an attempt to improve verifiability. The addition of these links is already approved, so the claim they are advertising is false. Internet Archive has nothing to gain from "advertising" their service. They are not making any revenue off of it. For example, you have a Cite Book reference with no link to be able to view the book. That's what this will serve. It only serves to make it easier for readers and editors to verify a claim on Wikipedia. I don't see how this does anything but help Wikipedia's core principles. —CYBERPOWER (Chat) 14:43, 16 November 2020 (UTC)[reply]

I am not sure I understand. As noted above, there's an archive url parameter already, for works that can be found in an archive. And |via= can inform the reader that the version of the work they are reading is published in an archive. If the work is only found in an online archive, then what is cited is the archive, likely via {{cite web}}. The particulars of the citation will make this obvious. I don't know what this has to do with bots "enhancing references" or how complexity can be reduced by adding even more specialized parameters. 65.204.10.231 (talk) 14:13, 16 November 2020 (UTC)[reply]

To explain more clearly, archive URL is for archives of websites. What I'm proposing is not an archive of a web page. It's a media URL of a book, magazine, whatever, that is stored at Internet Archive. As it currently stands, these URLs are placed in the url section, but doing that may have other consequences such as clashing with title-link, or something else I, or another botop, may be unaware of. The proposal is to just put this info in its own parameter so the template can deal with it appropriately. —CYBERPOWER (Chat) 14:47, 16 November 2020 (UTC)[reply]

Archive URLs point to any item archived online, be it webpage, book, video etc. As mentioned previously, when one cites a scanned item at Internet Archive, one is actually citing the archive. The source (in this case a website) is the Internet Archive. The scanned item (they are all digitized by scanning or other means) is an entry (webpage location) in that website. There is no need for an identifier, and I still don't understand how bots enter into this. If you feel something like that is needed, you can always make a wrapper for {{cite web}} as a single-source/special purpose template for Internet Archive. There are several examples. 50.74.165.202 (talk) 16:44, 16 November 2020 (UTC)[reply]

There are over 600,000 citations that link scanned books. Examples. It does seem kind of silly we don't use the ID system for this, it is one of the most frequently linked things on enwiki. There are 3.7 million {{cite book}} templates and if all these were in cite books (most are) that is 16%. -- GreenC

Most identifier parameters do not contain "id" or "identifier" in their name, so if this is introduced please just call it "ia" or "internetarchive". Note that we already have OpenLibrary identifiers that can be used to link a large part of IA books (but not other content).

I have no opinion on whether using an identifier is preferable to using the URL, but I support the stated goal (to facilitate linking books). Maybe it can simply be achieved by some Lua transformations on the URL? Nemo 16:24, 16 November 2020 (UTC)[reply]

Which reminds me that we should put |ol= into the metadata to make it easier for third-parties to correlate the data. (The technical reason for why we don't include it already is because different OL identifiers require different prefixes and this doesn't fit very well into the current implementation.)

--Matthiaspaul (talk) 16:47, 16 November 2020 (UTC)[reply]

Nemo bis, No objections to the naming conventions. —CYBERPOWER (Chat) 17:01, 16 November 2020 (UTC)[reply]

(edit-conflict) So, what you both are asking for is basically an identifier for archive.org, so that it does not occupy the title link? I like this idea, and if this identifier would be included in the list of auto-linking targets, it would be as convenient to use as if it would occupy |url= by itself but only be considered by the template when |url= is not specified as well. This would free |url= for other uses. If this is what you propose, I would support it. Ideally, though, this parameter would not take a complete URL such as "https://archive.org/details/sixmonthsatwhit02carpgoog" as a value, but just an id (like "Identifier=sixmonthsatwhit02carpgoog"). How does this correspond with the "Identifier-ark=ark:/13960/t40s07c8h"? Is it possible to derive the former from the latter (ark)?

Is my assumption correct that these scanned documents do not need to be archived any more because they can be considered to be archived already, that is, these links will be permanent? This would be another argument for having a specific identifier parameter for them and leave |url= with its |archive-url= companion for links which actually need |archive-url= to prevent link-rot.

--Matthiaspaul (talk) 16:38, 16 November 2020 (UTC)[reply]

We are not in the business of developing identifiers, nor extracting homebrewed ones from URL fragments. Neither is this a novel idea; similar ideas have been discussed before. It hasn't happened for the reasons already spelled out here. This is more or less superfluous. Adds complexity. Brings nothing extra to discovery. Hasn't anyone noticed that editors can insert custom ids? In |id= an editor can insert the source's own identifying scheme, if any. 50.74.165.202 (talk) 17:01, 16 November 2020 (UTC)[reply]

Matthiaspaul, everything at Internet Archive is intended to be there permanently. There are some very rare exceptions to that rule, but what is saved to the Internet Archive will generally stay there forever. —CYBERPOWER (Chat) 17:14, 16 November 2020 (UTC)[reply]

I'm actually not aware of Identifier-ark. What does it do? —CYBERPOWER (Chat) 17:16, 16 November 2020 (UTC)[reply]

On the page (https://archive.org/details/sixmonthsatwhit02carpgoog) I linked above (nothing special, just the first example I found writing this), the entry "Identifier" contains the value "sixmonthsatwhit02carpgoog", and the entry "Identifier-ark" the value "ark:/13960/t40s07c8h", respectively. I have seen those "ark" identifiers in other IA pages related to scanned books, that's why I am interested in how they are related. --Matthiaspaul (talk) 18:01, 16 November 2020 (UTC)[reply]

Matthiaspaul, okay, I just wanted to be sure, but they are completely unrelated. It is not possible to derive either value from the other. —CYBERPOWER (Chat) 13:05, 17 November 2020 (UTC)[reply]

I support the addition of a |ia= with the caveat that it should be documented to take the Internet Archive identifier (and, yes, these are unique identifiers assigned by IA; they just don't have a resolver that abstracts the identifier from the physical address (URL)) of the scan where the information it supports was found, rather than any old scan of some book that may or may not be the same work in the same edition in a copy sufficiently identical to the original to support WP:V. People will still use it sloppily of course, but if the definition is strict we at least pull the trend in the right direction over time. This also means we treat it as an identifier and not a convenience link (those can go in |url=). This means the derived URL should not be auto-promoted to the |url=. It also means the parameter should not be bot-populated unless other information in the template uniquely identifies the scan to which it refers. IA book scans are a great resource and we should take advantage of it to the fullest extent practical, but not uncritically and sloppily.

I don't see the case for the proposed |iaoffset= parameter, and at first blush it would seem to be conceptually in conflict with everything else in CS1. --Xover (talk) 18:57, 16 November 2020 (UTC)[reply]

Xover, iaoffset is needed in the event the page number itself is not providing a working link to the target page of the book. iaoffset will change the link to the raw location of the book you want to view, which will always work. It's hopefully not going to be needed often. Use cases are roman numerals or numberless pages being referenced. —CYBERPOWER (Chat) 13:07, 17 November 2020 (UTC)[reply]

I have seen digitized blobs of many journals/magazines/collections in one file. Would this |ia-offset= (provisional name) be useful to point to the start of the relevant work as well?

However, I'm not too fond of adding two parameters for this. Perhaps, in those cases where it is needed, it should be allowed to just append /page/n<iaoffset> to the identifier... '/' is obviously a character which can never occur in the identifier. Are there other "reserved" characters? What is the format of these identifiers (as RegEx or similar)?

--Matthiaspaul (talk) 13:44, 17 November 2020 (UTC)[reply]

Matthiaspaul, n<iaoffset> is a pointer to the raw page scan location of the work. For example, n5 would take you to the 5th image scan of the media, which would probably be the cover page, or book information and copyright. n10 may take you to a page in the book with the page number iii. Conversely, dropping the n will take you to the book's page 10. In most cases the n prefix doesn't need to be used, but there are cases where it is required so the link goes straight to the desired page that has the information needed to verify the reference. —CYBERPOWER (Chat) 13:54, 17 November 2020 (UTC)[reply]
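If the offset were appended to the identifier value with "/page/...", as floated above, a consumer would need to split the value again and tell a raw scan location (n prefix) from a printed page number. A minimal Python sketch under that assumption follows; the function name `split_ia_value` and the returned labels "leaf" and "printed" are hypothetical.

```python
def split_ia_value(value):
    """Split "<ident>[/page/<page>]" into (ident, kind, page).

    kind is "leaf" for an n-prefixed raw scan location, "printed" for a
    printed page number, and None when no page part is present.
    """
    ident, sep, page = value.partition("/page/")
    if not sep:
        return ident, None, None
    if page.startswith("n") and page[1:].isdigit():
        return ident, "leaf", int(page[1:])
    # Anything else is treated as a printed page label and kept as text.
    return ident, "printed", page


print(split_ia_value("sixmonthsatwhit02carpgoog/page/n10"))
print(split_ia_value("sixmonthsatwhit02carpgoog/page/10"))
```

This relies on "/" never occurring inside the identifier itself, which is the assumption stated in the question above.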

Is there a document describing the inner format (if there is any) of these identifiers for validation checks, or are they just strings of random length containing random characters without checksum or date information? Who composes these identifiers and according to which rules?

--Matthiaspaul (talk) 15:01, 17 November 2020 (UTC)[reply]

Matthiaspaul, nope. There is no hidden information in these strings. They're effectively almost random. —CYBERPOWER (Chat) 21:01, 17 November 2020 (UTC)[reply]

@Cyberpower678: I understand its intended functionality, but I still don't see the case for adding it. No other identifier supported in CS1 links directly to a specific page (caveat: there are some field-specific ones in there that I'm not that familiar with), but to the work as such or a specific copy of it, and that's quite good enough. Linking directly to a specific point in a source is at best a convenience, and in some contexts can even be a (very very minor) inconvenience. Matthiaspaul's example above (linking to a specific article within a magazine or a specific issue within a whole volume collection of a periodical) is the best use case for this, but even in those instances it falls into "convenience" territory and fails to justify the addition of a dedicated parameter IMO (and the same goes for the additional complexity of trying to encode it into the identifier; identifiers should generally be opaque). --Xover (talk) 14:34, 17 November 2020 (UTC)[reply]

It seems that we have heard this type of request before, particularly for a google books 'id'. If I remember correctly, those requests were rejected because the 'id' isn't a persistent id and in fact, isn't an id at all, but merely a token in the url query string. I also recall Semantic Scholar's wish for an identifier. They originally wanted us to use the forty character path element from their url:

https://www.semanticscholar.org/paper/041a49f7fdc8eef74ac2e52a768011ed0c29d0ce

Before we would let them have a cs1|2 identifier, we required them to create a simpler form, their corpus ID which they then map to whatever url they want:

https://api.semanticscholar.org/CorpusID:219352572

|s2cid=219352572

The |ia-identifier=sixmonthsatwhit02carpgoog seems a lot the same to me.

HathiTrust uses the handle system to link to books and to specific places in that book. For example, their copy of Six Months at the White House with Abraham Lincoln is here:

https://hdl.handle.net/2027/uc1.$b301895

and to link to page 15 they give this as the handle:

https://hdl.handle.net/2027/uc1.$b301895?urlappend=%3Bseq=23
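The two handle URLs above follow a simple pattern that can be sketched in Python. The function name `hathitrust_handle_url` is hypothetical, and the `urlappend` form is inferred only from the example given here, not from any HathiTrust documentation.

```python
from urllib.parse import quote


def hathitrust_handle_url(handle, seq=None):
    """Build a hdl.handle.net URL, optionally pointing at a scan sequence number."""
    url = "https://hdl.handle.net/" + handle
    if seq is not None:
        # ";seq=23" is percent-encoded to "%3Bseq=23", matching the example above.
        url += "?urlappend=" + quote(";seq=" + str(seq), safe="=")
    return url


print(hathitrust_handle_url("2027/uc1.$b301895"))
print(hathitrust_handle_url("2027/uc1.$b301895", 23))
```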

I could imagine an IA corpus ID (something with a check-digit would be good) so: |iacid=<corpus ID> for the book and if a particular scan is desired then perhaps something like |iacid=<corpus ID>.n<scan ID>. cs1|2 would then build a handle system url that internet archive can redirect to the appropriate location.

Why isn't Internet Archive listed at Special:BookSources?

—Trappist the monk (talk) 12:41, 18 November 2020 (UTC)[reply]

All this is well and good, but also a moot point since any such id is not necessary. It adds nothing that cannot easily be done now, without it. Instead of wasting time in trinkets, I would direct everybody's energies into fixing the many design and logical flaws in the cs1/cs2 system. 65.204.10.231 (talk) 13:42, 18 November 2020 (UTC)[reply]

(edit-conflict) I have run into cases in a citation where I wanted to include a "genuine" URL to some document/site but also had a link to a digitized copy of the work at Google Books or Internet Archive, so I had to append some of those links after the citation as convenience links. I have also seen editors or bots/scripts "fighting" over those entries by replacing the URL in |url= by one of the Google- or IA-type ones. It would have been much better, if those extra resources could be listed among the identifiers, so that they don't occupy the place of |url= any more and the bots would have a dedicated place where to put them without disturbing anyone. If parameters like |ia= or |gbooks= (provisional names) would be included in the list of auto-linking identifiers, they could still show up as title links if none of the other links take precedence.

However, as Trappist correctly pointed out, it only makes sense for "identifiers" which are established and stable long-term and don't need an archived link to prevent link-rot (because they are already sort-of-links-to-archived-copies). Also, it would be great if they would be shorter and follow some logical system (or we'd have to devise some way to link to them without showing the value)...

As Cyberpower and GreenC both have good connections to IA, they likely know who to ask at IA to make this happen.

--Matthiaspaul (talk) 16:28, 18 November 2020 (UTC)[reply]

Matthiaspaul, identifiers don't change. Once assigned, they are permanent. —CYBERPOWER (Chat) 21:44, 18 November 2020 (UTC)[reply]

BTW. They already have property assignments in Wikidata:

ia: P724
gbooks: P675

So, if we'd have corresponding parameters for them they could be used by {{cite Q}} as well.

--Matthiaspaul (talk) 17:34, 18 November 2020 (UTC)[reply]

Trappist the monk, IA identifiers however are persistent and do map to a specific scan. I'm not sure what exactly you are asking here. They are not tokens. The addition of /page/<page> further points to a specific location of said scan. This will never change. Furthermore, the use of page, p, pp, pages, can be used by the module to assist in said pointing unless overridden by the offset parameter, or by the specification of /page/<page> in the identifier param. —CYBERPOWER (Around) 16:02, 18 November 2020 (UTC)[reply]

This will never change. Maybe; maybe not. Whatever mechanism IA uses is proprietary to IA. It seems better to me to avoid proprietary systems and use a system supported by many users so the handle system seems to fit; cs1|2 already supports |hdl= so we don't have to craft something special for IA.

I'm not sure that I see the need for a separate identifier. The primary use of cs1|2 templates is (supposed to be) to identify the source that the en.wiki editor consulted to support our article. I have never really felt comfortable with bots adding, and especially replacing, urls that the bot surmises may link to the source the editor consulted. Unless these bots have learned how to mindread, the bot does not and cannot know with any certainty what source the editor consulted. If editors want to blue-link titles to sources available at IA, they can use |url= to link to the source that they consulted.

The only question I asked, and that you did not answer, was: Why isn't Internet Archive listed at Special:BookSources?

—Trappist the monk (talk) 20:20, 18 November 2020 (UTC)[reply]

Trappist the monk, I can't answer that question. I'm not familiar with the functions of Special:BookSources. I don't understand your argument of proprietary. The strings are arbitrary, and unique to the book scan it's linked to. A bot does not need to mind read to ISBN match a book to something stored at Internet Archive. ISBNs are also unique, so there's no mindreading going on here. A unique identifier to a book, added by a human, is being matched to a unique identifier at IA. —CYBERPOWER (Chat) 21:41, 18 November 2020 (UTC)[reply]

In concept ISBNs are unique. In practice, they are not always unique. In past discussion on this page, Editors noted that ISBNs are not always unique because different editions may have different pagination, different covers, etc. But ISBNs are why I asked about Special:BookSources. If it is possible to search IA with an ISBN then IA should be listed at Special:BookSources; if google and amazon, why not IA? Get IA listed at Special:BookSources and there will be no need for a special identifier in cs1|2. A listing at Special:BookSources does not prevent editors from adding direct links with |url= to the facsimile at IA, and may increase the use of IA urls for books; better to link to IA than to google or amazon, isn't it? Google and amazon are right there at the top of the list at Special:BookSources; is it any wonder that editors looking for courtesy links use them?

Does citoid know about books at IA? If not, why not? I know that citoid knows about worldcat which has abominably poor metadata. If you can demonstrate that the metadata at IA are as good or better than the metadata at worldcat, I would think it a no brainer for citoid to use IA, especially because IA has copies of the books it indexes whereas worldcat does not.

The strings are arbitrary... Arbitrary. That's certainly part of it for me. The strings are arbitrary and, for the example in this discussion, sixmonthsatwhit02carpgoog, seem to suggest that google is where I will land if I click on that 'identifier'. Arbitrary does not look systematic, it does not look professional. Editors at discussions here and elsewhere have complained that readers won't click on identifiers because they don't understand the meaning of the initialisms and so are intimidated. I think that our readers are smarter than that; especially readers who have gotten to the point of following an article far enough that the references matter.

I don't think that a proprietary system that uses arbitrary strings benefits en.wiki. I have a hard time believing it whenever anyone says [this] will never change. This is the internet; nothing on the internet is static. A non-proprietary system, supporting multiple users is, I think, a better long-term choice for en.wiki because the stable identifier abstracts to the actual url of the source. That url can change as source providers upgrade their technology and internal data handling without it impacting us.

—Trappist the monk (talk) 00:23, 19 November 2020 (UTC)[reply]

A couple of points here…

I agree, and have previously suggested to both Cyberpower and Markjgraham, that they should first pursue options for making IA links easy for humans to add, specifically through Special:BookSources and Citoid. I am worried by their failure to pursue these options and read it as indication that they are only really interested in approaches that let them bulk-add links to IA via bot (cf. WP:VPP § Stop InternetArchiveBot from linking books and WP:BOTN § VPPOL discussion closed: linking by InternetArchiveBot). Bots are not a good match for this problem, and wishing screws were nails does not make the hammer any more suited.

That being said, the identifiers for works at IA have several of the important properties of identifiers (vs. addresses). They are unique, have a controlled syntax, are stable over time; and these properties are backed by guarantee from a generally well respected organisation of sufficient demonstrated longevity for our purposes. The properties it lacks are abstraction (it maps directly to an address in a static way) and a facility for resolving the identifier to an address other than the resource's current canonical address. It is also a proprietary identifier, and one backed by only a single organisation. However, this is no worse than |jstor=, and in some ways better because unlike JSTOR's "Stable URL", IA does actually treat this as an identifier. It is picked by the uploader, often according to a suggested schema, but it is assigned and managed by IA; and, crucially, it shows up in various APIs on their side where e.g. JSTOR would have used the URL (i.e. they actually treat it as an identifier in practice). It would be better if IA registered a HDL or DOI for each scan, but I don't see this as a bright line. I don't think an identifier's visual appearance, or the presence of certain substrings, are fair objections. Identifiers should be opaque except any defined hierarchy (DOI prefixes and such), and if they are too long their display can be truncated (or people will choose not to add them).

Specific params for such identifiers also make it easier for users to discover (and thus actually make use of) them than generic ones, and make it easier to add multiple links where that is relevant. Having spent far far too many hours manually cleaning up article references I very much appreciate every additional identifier available, because even nominally stable identifiers like DOIs die in the timescales we care about. I don't know any services mirroring IA specifically (unlike JSTOR and Project MUSE that often both have copies of a given journal issue), but just as an illustration we have a lot of IA works uploaded at Commons. Being able to point both at the original at archive.org and the alternate copy at Commons will save somebody's behind a decade down the line when IA decides to annoy the publishers enough to get sued out of existence (or whatever).

Finally, there is not a 1:1 relationship between an ISBN and a specific scan of a specific copy of a specific edition of a specific work. Starting from an ISBN you can get to a search that lists lots of these, but you can't point at only one. That's (part of) why bot adding these links is a bad idea and Special:BookSources is the most appropriate avenue for making IA accessible at volume. But starting in the other end, you certainly can add the identifier of the specific scan you consulted when adding the reference. And sometimes the ability to specify a copy of a book (there are multiple advanced academic degrees made based on the copy-to-copy differences in the First Folio), and even the scan used of that copy (the same copy scanned by both Google and IA may have material differences in quality (hint: Google's scanner operators exhibit not a single fig given about quality)), is important.

Bottom line, for me, is that while this is not a no brainer, I ultimately fall down on the side of wanting this parameter. I also wish IA would actually participate here, and discuss issues surrounding linking, discoverability, metadata (theirs is almost as bad as Worldcat's, just in different ways), but absent that I'll settle for ways we can more effectively make use of IA as a resource. --Xover (talk) 09:35, 19 November 2020 (UTC)[reply]

And then there is this 'identifier': northangerabbeyb00aust_1. Apparently, accuracy is not a criterion in creating these 'identifiers'. Some sort of numerical corpus ID (just take the next available number) would be much better than seeing an identifier naming Northanger Abbey in a citation for Pride and Prejudice: https://archive.org/details/northangerabbeyb00aust_1. That url was added by bot. It does illustrate the offset issue. The cited page is vii so the page link that the bot added did not work (since removed) but, had the bot written [https://archive.org/details/northangerabbeyb00aust_1/page/n9 vii] it would have worked: vii.

—Trappist the monk (talk) 14:16, 19 November 2020 (UTC)[reply]

Correct. Pages can be referred to by the physical leaf number or by the printed page number. Anything without a printed page number, such as anything before the printed "Page 1", uses the "/page/n10" syntax, e.g. the 10th page leaf from the start. If the printed page number can't be asserted due to scanning errors, etc., it uses the "n" leaf system. Determining (asserting) the printed page number from an OCR scan is not always possible, and is indeed technically challenging, so this is the default method to get to a page when page assertions are unavailable. -- GreenC 15:43, 19 November 2020 (UTC)[reply]
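The two addressing forms described above can be sketched as follows. This is only an illustration based on the URL shapes quoted in this thread; the function and parameter names are hypothetical, and whether a printed-page link actually resolves depends on IA having page assertions for that scan.

```python
from typing import Optional


def ia_page_url(identifier: str,
                leaf: Optional[int] = None,
                printed_page: Optional[str] = None) -> str:
    """Build an archive.org details URL, optionally pointing at a page.

    Assumption: the path shapes follow the examples in this discussion,
    i.e. /page/<printed page> and /page/n<leaf number>.
    """
    base = "https://archive.org/details/" + identifier
    if printed_page is not None:
        # Printed page number; only works when IA could assert it.
        return base + "/page/" + str(printed_page)
    if leaf is not None:
        # Physical leaf number; the fallback when no assertion exists.
        return base + "/page/n" + str(leaf)
    return base


# The Northanger Abbey item from this thread, addressed by leaf.
print(ia_page_url("northangerabbeyb00aust_1", leaf=9))
```

Addressing by leaf is the safer choice for front matter such as page vii, which is the case where the bot-added link failed.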

I wonder why this subject invites such elaborate discussion. All IA items are online. There is already a standardized, constantly utilized, familiar locator (the URL) to easily reach the referenced archive, as well as in-source locations such as specific pages (in the case of archived print media). Is there any reason for IA to have preferential treatment over other archives? Archives, just like any other source, are not automatically reliable. Afaik, IA's archiving protocols are opaque, and the resulting archives not vetted. Granted that the last time I looked at IA governance was several years ago, but I was surprised to find out that there were no official "Archivist" positions at the organization. That is like having libraries without trained librarians. Not that university archiving operations are much better. I have seen horrible scans of well known works in such institutions. In some cases, really bad version control, with a different archive of the same original showing up seemingly randomly, no doubt thanks to some mysterious algorithm. But do go ahead and try to make sense of all this if that is your thing. 98.0.246.251 (talk) 01:59, 20 November 2020 (UTC)[reply]

Discussion is good, for as long as it remains constructive and aims at seeking the best solution to a problem such as this one.

I too am somewhat sceptical of unmanned bot actions for tasks where editorial judgement might be necessary.

I nevertheless support the addition of this identifier because it is also useful for editors manually improving citations. There is often more than one link that could be added to |url= and it would be good to have a separate place for at least the most common and established providers of content to free the |url= parameter and its companion |archive-url= for better purposes in order to improve the quality and usefulness of citations and to fight link-rot. Both GB and IA identifiers have proven to be stable for many years (with minor exceptions), more stable than many URLs to other sites, but in the hypothetical case that they would suddenly change their link formats, change their identifiers or change their services in an unacceptable way, it would be trivially easy for us to centrally adjust or mute the corresponding template output, that is, it gives us more control.

Still, it would be great if IA could introduce some abstraction layer on top of their identifiers first, so that they become shorter and do not contain potentially misleading human-readable text fragments.

--Matthiaspaul (talk) 20:42, 21 November 2020 (UTC)[reply]

Well, my comment was centered on the opinion that there is no pressing problem to add anything. The idea that identifiers can be used as failovers for URLs, may not really hold water. For the simple reason that practically all ids are basically wrappers for, or reformatted abstractions of, URLs. One could argue that some ids may be using a different repository, or other (supposedly) authoritative service, or just simply a mirror that may stay up. But all of these can break too, and I do not know that we have a way to judge the future stability of the underlying infrastructure. I assume some, such as ISBNs (that resolve at web servers run by trade-affiliated entities) are more robust than others, simply because they are by now necessary for commerce. But even ISBN resolvers are known to have gone down. 98.0.246.242 (talk) 01:56, 22 November 2020 (UTC)[reply]

Obviously, we cannot predict the future. However, I don't know when they were originally introduced, but both IA and GB identifiers have proven to be static for more than a decade already, and from the descriptions on their web sites they both see them as permanent long-term identifiers for use in public interfaces, not as short-term or internal-only handles accidentally leaked to the outside world which could change/be renumbered the next time they set up their databases.

https://archive.org/services/docs/api/metadata-schema/index.html

http://blog.archive.org/2011/03/31/how-archive-org-items-are-structured/

https://developers.google.com/books/docs/v1/using#ids

So, it doesn't look as if they intend to change them (for better or worse) in the foreseeable future.

--Matthiaspaul (talk) 22:32, 23 November 2020 (UTC)[reply]

To reiterate, nobody will stop you if you wish to insert any "official" or semi-official identifier in |id=, regardless of whether such is well maintained or not. But there has to be a more compelling reason to formalize these into yet more parameters. Not every secondary identifier must be coded, documented and explained. This particular citation system is already overly complex and there is a good chance that the needs of the non-expert reader are not met. The litmus test: the most complex citation possible should be understood by the least knowledgeable reader possible. 107.14.54.1 (talk) 01:21, 24 November 2020 (UTC)[reply]

ISBN line breaks

Moved from Template talk:Citation § ISBN line breaks

– {{u|Sdkb}} ^talk 20:05, 16 November 2020 (UTC)[reply]

Screenshot; look at ref 114

During the ongoing FA review for Biblical criticism, I noticed that some ISBNs in the citations with dashes (e.g. Bauckham, currently ref 114) break onto multiple lines. This makes them marginally harder to read, so I think it would be preferable if they were non-breaking. Would it be possible to place a {{no wrap}} around the input for |ISBN= and other parameters that might have the same issue? {{u|Sdkb}} ^talk 18:09, 16 November 2020 (UTC)[reply]

In my browser, ISBNs and the "ISBN" text are always nowrapped, no matter how I modify the window width. Perhaps you could create a demonstration page in your sandbox, or upload a screen shot. – Jonesey95 (talk) 18:22, 16 November 2020 (UTC)[reply]

@Jonesey95: Screenshot added. {{u|Sdkb}} ^talk 18:34, 16 November 2020 (UTC)[reply]

reference info for Biblical criticism
unnamed refs	69
named refs	132
self closed	229
Refn templates	8
cs1 refs	208
cs1 templates	215
rp templates	296
webarchive templates	9
use xxx dates	dmy
cs1|2 dmy dates	11
cs1|2 ymd dates	3
cs1|2 last/first	196
cs1|2 author	4

List of cs1 templates

cite book (169)
Cite book (2)
cite encyclopedia (2)
cite journal (14)
Cite journal (2)
cite news (1)
Cite web (11)
cite web (14)

explanations

As far as I know, there has only been one previous discussion about preventing the rendered isbn from wrapping (there was an earlier discussion where it was mentioned). The discussion did not gain sufficient support.

Why now, all of a sudden? There are a lot of FAs that use cs1|2 and that have |isbn= with hyphenated isbns; the category has 6,541 articles of which 4,774 have hyphenated isbns; see this search.

A better venue is Help talk:Citation Style 1 because Biblical criticism does not use {{citation}}.

—Trappist the monk (talk) 18:59, 16 November 2020 (UTC)[reply]

Trappist the monk, I wasn't aware of that previous discussion; thanks for the link. The "why now" is just that I happened to notice it now while doing that review. And I'll move this to that venue.

While there's not uniformity in the prior discussion, it does look like there's enough support that consensus might develop with further discussion. What I notice is that there is a non-breaking space between the ISBN label and the number itself. Surely that would be a better breaking spot than any of the hyphens within the number? We should either change that to a breaking space, make the number non-breaking, or both, but definitely not neither. {{u|Sdkb}} ^talk 20:02, 16 November 2020 (UTC)[reply]

We also recently touched this in Help_talk:Citation_Style_1#Nbsp_in_|author,_|last,_and_equivalents_for_other_contributors

We currently frame ISBNs in <bdi>.

I would support making the numbers for ISBN, SBN, ISSN, EISSN and ISMN identifiers, as well as all dates in suitable date formats (except in the |orig-date= parameter), non-wrapping. If this wouldn't make the non-wrapping string too long, this would ideally include the identifier names as well, but at a minimum we should keep the numbers from wrapping.

--Matthiaspaul (talk) 20:49, 16 November 2020 (UTC)[reply]
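As a rough illustration of the "make the number non-breaking" option discussed above, the sketch below wraps the rendered identifier in a span carrying the standard CSS declaration white-space: nowrap. It is not how Module:Citation/CS1 renders identifiers; the function name is hypothetical and the ISBN shown is a placeholder, not one taken from the article.

```python
from html import escape


def nowrap_identifier(label: str, value: str, include_label: bool = False) -> str:
    """Render an identifier so its hyphenated number cannot wrap.

    Assumption: plain inline CSS is used here for illustration only;
    the real templates may use a class or a different mechanism.
    """
    number = '<span style="white-space: nowrap;">' + escape(value) + "</span>"
    if include_label:
        # Keep label and number together as one unbreakable unit.
        return ('<span style="white-space: nowrap;">'
                + escape(label) + " " + escape(value) + "</span>")
    # Otherwise allow a break between label and number, but not inside it.
    return escape(label) + " " + number


# Placeholder ISBN purely for demonstration.
print(nowrap_identifier("ISBN", "978-0-00-000000-2"))
```

With include_label left off, the only permitted break point is the ordinary space between the label and the number, which matches the suggestion that this is a better place to break than any hyphen.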

Following the example of many other messages containing short symbols/abbreviations (for example with volumes), to avoid odd-looking line breaks the sandboxed template now utilizes &nbsp; in the message fragments used to display " et al.", " ed." (for edition) and "§ " and "§§ " (sections).

--Matthiaspaul (talk) 13:59, 17 November 2020 (UTC)[reply]

Cite OEIS generates invalid HTML

While updating Happy number, I tried to add "Cited in (an OEIS citation)", but noticed that every citation generates an id "CITEREFSloane" by default, which is incorrect HTML with more than one citation. When I tried to specify an explicit |ref= I got a cite error "Unrecognised parameter". I could not immediately see why that was, so I created the link by a bodge. This of course continued to annoy me, so I had another look this evening.

Apart from the constant id, there were two problems which are fixed in this (current) revision (testcases). The link after the final refs testcase jumps to the test citation for the live template and there are now no errors for the ref parameter displayed.

We also need to correct the default ref id. I propose a default id of

CITEREF<editor-last>_"<sequenceno>"

for which the user would add something like

{{sfn|Sloane "A12345"}} or {{harvtxt|Sloane "A12345"}}

to link to this, which seems both reasonably simple and clear. The quotes around the sequence number correspond to the quotes around the full entry title in the citation. You can see this in the (current) sandbox. In the testcases, the link after the next-to-last testcase for dates jumps to the test citation, but the live citation still has the incorrect id. Of course, I will update the documentation accordingly.
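The proposed default can be sketched as below, purely to show how the anchor and the matching short-footnote call line up. The helper names are hypothetical, and the assumption that the space between the name and the quoted sequence number becomes an underscore in the anchor is taken from the pattern proposed above, not from the template code.

```python
def oeis_anchor(editor_last: str, sequence_no: str) -> str:
    """Return the proposed default anchor: CITEREF<editor-last>_"<sequenceno>"."""
    return "CITEREF" + editor_last + '_"' + sequence_no + '"'


def oeis_sfn(editor_last: str, sequence_no: str) -> str:
    """Return the {{sfn}} wikitext an editor would write to link to that anchor."""
    return "{{sfn|" + editor_last + ' "' + sequence_no + '"}}'


def has_duplicate_anchors(anchors: list) -> bool:
    """True when two citations would share an id (the invalid-HTML case)."""
    return len(set(anchors)) != len(anchors)


print(oeis_anchor("Sloane", "A12345"))
print(oeis_sfn("Sloane", "A12345"))
# Constant ids collide; ids that include the sequence number do not.
print(has_duplicate_anchors(["CITEREFSloane", "CITEREFSloane"]))
print(has_duplicate_anchors([oeis_anchor("Sloane", "A12345"),
                             oeis_anchor("Sloane", "A67890")]))
```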

There may be other cite wrappers with the same problem now that cite * generate ids by default. Parameter check lists also need themselves to be checked.

~~Just as I finished preparing this, I notice that the testcases no longer display the missing error messages for the |foo= and |date= parameters. I can't see any reason for this at present.~~ They appear in preview mode.

Comments welcome, especially "yes, please do it" of course. --Mirokado (talk) 22:54, 20 November 2020 (UTC)[reply]

{{Cite OEIS}} is not a cs1|2 template. Problems with that template are best addressed at its talk page. If there is something wrong with the underlying {{cite web}}, then we want to know about it.

—Trappist the monk (talk) 23:09, 20 November 2020 (UTC)[reply]

OK, copied most of this to Template talk:Cite OEIS#Generates invalid HTML for further comments.
"Other cite wrappers causing the same problem now that cite * generate ids by default" is certainly something relevant to this page, even if there is no really easy central solution. If someone is bored on a wet Saturday afternoon, here is something for them to look at. --Mirokado (talk) 00:24, 21 November 2020 (UTC)[reply]

Those other wrapper templates, like {{Cite OEIS}}, must adapt if they haven't already done so. This is really no different from wrapper templates needing to adapt when old forms of parameter names that the wrappers use are deprecated and support for them withdrawn. The issue that you are complaining about, automatic CITEREF anchor creation, changed nothing because |ref=harv was specified with this edit to {{Cite OEIS}}. That setting became superfluous when cs1|2 began creating automatic CITEREF anchors. With this edit, {{Cite OEIS}} lost the superfluous |ref=harv setting and gained the ability to set the citation's CITEREF anchor externally.

—Trappist the monk (talk) 00:59, 21 November 2020 (UTC)[reply]

Undated sources

At present a source without a stated date uses the format date=n.d., and displays as
The newspaper. n.d. Retrieved 6 December 2015.
This is rather obscure to the reader. I would suggest either that date=n.d. be retained in the cite parameters, but displayed to the reader as "Undated", or that date=undated be allowed and displayed. (A display of "No date" for parameter n.d. would be OK.)

A parameter that tells editors that a reference is undated also saves an attempt to find and add a date, in the same way as the recommended author= does.

Example with date=n.d.:
"Pooley Bridge, Cumbria". Britain Express. n.d. Retrieved 6 December 2015.

Example with unsupported date=Undated:
"Pooley Bridge, Cumbria". Britain Express. Undated. Retrieved 6 December 2015. {{cite web}}: Check date values in: |date= (help)

Best wishes, Pol098 (talk) 13:35, 23 November 2020 (UTC)[reply]

This is rather obscure to the reader. Really? Why do you believe that readers are incapable of understanding this rather common initialism? It is perfectly acceptable to omit |date= when the source is not dated. Similarly, it is perfectly acceptable to write |date= for the benefit of editors if you think it appropriate.

Beyond incompetent readers, is there any substantive reason for cs1|2 to deviate from what is, apparently, accepted practice among the various external style guides?

—Trappist the monk (talk) 13:53, 23 November 2020 (UTC)[reply]

"Beyond incompetent readers ..." Requiring readers to be "competent" (and not necessarily English speakers; English Wikipedia is used worldwide) is not a good idea. Dropping "n.d." into the middle of a reference isn't necessarily clear ("Date=n.d." would be clearer, though "Undated" is better). To answer the question as asked: there is no substantive reason beyond "incompetent readers"; but that is enough for what is a trivial change without consequences (unless I have missed something) which will help readability. Let's see what others say. Best wishes, Pol098 (talk) 14:58, 23 November 2020 (UTC)[reply]

(edit-conflict) Our target audience includes "incompetent readers". Our goal as an encyclopedia for everyone is to improve their education and competence. (Personally, I would not call someone "incompetent" just for not knowing what "n.d." or "3 (12): 7–8" means.)

While "n.d." is one accepted practice to indicate a "no date given" condition, it is only one of them. There are different styles for denoting this, from variations on the abbreviation (with or without space, in different cases and with varying punctuation) to spelling it out as "no date" or "undated" (in different cases and possibly bracketed). While most people who are not aware of the abbreviation should be able to guess that "n.d." means "no date" if given instead of a date, others might not ("not documented", "not displayed", "new data", "next date", "named date", "no dummy"?). Our general philosophy is to avoid abbreviations which might not be understood by everyone.

As I have stated in the past already, I'm all in favour of tokenizing such special cases (we already do this in some cases, e.g. with "et al.", although this one is special also in other ways). This has several other advantages as well:

Improved machine-readability
Consistency within articles and across the project in regard to how to indicate this condition
Control over the display output and metadata format should the recommended output format change over time (think of the discussions regarding how to display volumes, issues and pages) or if we would want to support other metadata standards in the future (beyond COinS) where this condition might be codified somehow. Even if we would not change the output format from "n.d.", it might be already helpful for readers if we'd display a tooltip with its expanded meaning. And in the metadata, it could be changed to "[n.d.]" to indicate a descriptive date rather than an actual date.
Easier localisation into other languages (for the same reason why we prefer |language=fr over |language=French). For example, in a German citation one would typically write "o. D." ("ohne Datum") rather than "n.d.", but "k. D." ("kein Datum") is seen as well. Likewise, there are abbreviations like "o. J." (without year), "o. O." (without location), "o. A." (without author) and "Anon." (for anonymous author(s)).

Regarding HTML comments, you wrote that author= would be the recommended form. It is possible that this has changed, but the last time I looked the recommended form was author=. Either way, this shows that HTML comments, as useful as they often are, are not a good method to indicate common states like this because they are more complicated to use for editors and therefore are not used consistently, thereby making it difficult to machine-read them. Special tokens such as |date=none, |author=none, |author=staff, |author=anon are much preferable to them.

--Matthiaspaul (talk) 17:14, 23 November 2020 (UTC)[reply]

Yeah, incompetent might be a bit strong, but en.wiki is one of two English language Wikipedias. For those who do not understand commonplace citation initialisms, abbreviations, and symbols used throughout the English language publishing world (and consequently in cs1|2), perhaps the other English language Wikipedia is a better choice. But, were it an issue, I would have thought that editors at simple.wiki would have tweaked (or asked us for assistance in tweaking) simple:Module:Citation/CS1/Configuration to accommodate their readers.

I have said in the past, and will likely say in the future, that cs1|2 is not APA, CMOS, Bluebook, or any other citation style. I am comfortable with cs1|2 not being any of those, but, I do not think that cs1|2 should be made to be so different from other citation styles that we abandon the commonly-used citation initialisms, abbreviations, and symbols that English-language readers have come to expect.

If it is to be believed that n.d. is rather obscure to the reader and must be fixed, it must follow that all of the other citation initialisms, abbreviations, and symbols used by cs1|2 are also rather obscure to the reader, mustn't it? If we believe that to be true, then we must discontinue use of all standard English-language citation initialisms, abbreviations, and symbols. We must replace: 'ed.' → editor, 'eds.' → editors, 'ed.' → edition, '§' → section, '§§' → sections, 'Vol.' → volume, 'no.' and 'No.' → issue or number, 'p.' → page, and 'pp.' → pages. And lest we forget it, 'et al.' → and others.

—Trappist the monk (talk) 18:41, 23 November 2020 (UTC)[reply]

The last time this topic was raised appears to be Help talk:Citation Style 1/Archive 55 § The n.d. keyword for undated sources (includes links to two other discussions).

—Trappist the monk (talk) 15:31, 23 November 2020 (UTC)[reply]

(edit-conflict) Given that we already use the keyword "none" in various other places, I would suggest to, at the minimum, support something like |date=none. However, if there are more similar conditions (as in the none/staff/anon example for authors above), more keywords could be introduced for them as well.

The keyword "none", indicating that this information is not given in the source, should be distinguished from the condition, that the information should not be displayed but would still be used in reference anchor generation and be provided in the metadata (for which I suggested the keyword "off" recently introduced for |title=), and the condition, that the information is simply unknown to the editor at present (but might be given in the source), which should not be indicated by a special token, but is often indicated to other editors by providing an empty |date= parameter (which, however, is sometimes removed by other editors "cleaning up").

I'm open in regard to the best output format, be it "n.d.", "no date", or something else. However, the good thing is that once we have introduced a tokenized input for this condition, we are free to centrally change the output at any later time should this become necessary.

--Matthiaspaul (talk) 17:14, 23 November 2020 (UTC)[reply]
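The tokenizing idea can be illustrated with a minimal sketch: editors write a keyword, and a single central table decides what readers see. Nothing here reflects the actual module code; the table, the function name and the keyword |date=none are assumptions taken from the suggestion above.

```python
# Hypothetical central configuration: keyword -> displayed text.
DATE_KEYWORDS = {"none": "n.d."}


def render_date(value: str, keywords: dict = DATE_KEYWORDS) -> str:
    """Return the display text for a |date= value.

    Recognised keywords are replaced by the centrally configured text;
    anything else is passed through unchanged (no date validation here).
    """
    cleaned = value.strip()
    return keywords.get(cleaned.lower(), cleaned)


print(render_date("none"))                       # uses the default table
print(render_date("6 December 2015"))            # ordinary dates pass through
print(render_date("none", {"none": "no date"}))  # output changed in one place
```

Changing the displayed text, or localising it, then means editing one table entry rather than every citation that uses the keyword.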

Addition to generic title

Hello, I was wondering if articles with "Subscribe to read" in the reference title could be added to Category:CS1 errors: generic title. There are currently over 1,000 usages of these in titles. Thanks. Keith D (talk) 14:35, 23 November 2020 (UTC)[reply]

Appears to be associated with Financial Times:

Cite web comparison
Wikitext	`{{cite web|title=Subscribe to read|url=https://www.ft.com/content/2d2a9afe-6829-11e5-97d0-1456a776a4f5|website=Financial Times}}`
Live	"Subscribe to read". Financial Times. `{{cite web}}`: Cite uses generic title (help)
Sandbox	"Subscribe to read". Financial Times. `{{cite web}}`: Cite uses generic title (help)

—Trappist the monk (talk) 15:46, 23 November 2020 (UTC)[reply]
