Module talk:Citation/CS1


Language parameter

user:Trappist_the_monk, and others :)

Hi! See this task. We are wondering if there's a way to tell Cite templates in VE that they don't need to add the language parameter if the source language is the same as the wiki's. Do you think this is feasible, and most importantly, that it makes sense to do so? (AWB removes that parameter anyway.) Thanks a lot for your input! --Elitre (WMF) (talk) 11:33, 30 April 2015 (UTC)

I don't use VE so I don't know anything about how it works. But is that really your question? The phabricator task doesn't mention VE.
If the question is should VE populate |language=English (or the ISO 639-1 code en) then I would say no, don't do it. Editors here complain about citation template clutter whether it's in the rendering (access dates or English language annotation) or in the wikitext. The exception – isn't there always an exception? – is when the source is in multiple languages: |language=fr, de, en, then you should include English in the |language= parameter.
Did I answer your question?
Trappist the monk (talk) 12:22, 30 April 2015 (UTC)
I oversimplified :) mw:Citoid is a tool in VE which auto-generates references. It relies on Cite templates and TemplateData, so we're looking for a way to change the former or the latter to tell it to not populate that field, precisely for the reason you provided. Maybe now the Phabricator task makes more sense? Thank you! --Elitre (WMF) (talk) 12:32, 30 April 2015 (UTC)
I don't use Citoid or VE either, but those would be the places to focus. In one of them, you may be able to have some sort of explanatory note for editors that they should not populate |language= with the language of the WP that they are editing. – Jonesey95 (talk) 12:45, 30 April 2015 (UTC)
Thank you for weighing in. While it's certainly possible to improve such messaging in case of need, we aren't talking about editors adding that parameter manually. See a brief description of how Citoid works. Best, --Elitre (WMF) (talk) 12:50, 30 April 2015 (UTC)
Doesn't Citoid create the cite template? If so, how could the template, which hasn't yet been created, tell citoid how it is to be populated? Why am I so confused?
Trappist the monk (talk) 13:27, 30 April 2015 (UTC)
@Trappist the monk and Jonesey95: Sorry for the confusion, possibly my fault for not explaining my suggestion clearly enough near the bottom of that phab task.
Briefly, the aim (of my suggestion) is to make our Module work the same way as the French Modules do - to not render the language if it's set to the local default (English) - for example, in this revision, the 1st and 2nd citations both include |langue = français, but that info isn't rendered in the reflist below. That fixes the issue of having a useless string of text shown to readers (I assume that's the main reason we automatically remove it, when someone includes it in a citation).
This would:
  1. make the citation's language (meta-info) programmatically available for researchers (rather than them having to just assume a missing language=..)
  2. let Citoid freely add any detected language (without having to worry about exceptions)
  3. Remove one task from the Checkwiki cleanup list at enwiki (thereby leading to fewer bot-edits like (half of the changes in) this diff)
    (Seems to be #21 at Wikipedia:WikiProject Check Wikipedia/List of errors)
See Zebulon84's explanation in this comment, for links to the French module (with line numbers) and a few details on the nuances.
I hope that's a bit clearer, and seems like a sensible course of action. [I tried searching for past discussions about this, but it could've been in dozens of locations, and the topic keywords are very common which makes it even harder! If you know of any details/discussions/edge-cases that need to be considered, please point them out. I'm not a developer, so am not sure what else needs to be considered. Thanks for your thoughts, and your previous work on this complex code. :) ] Quiddity (WMF) (talk) 18:27, 30 April 2015 (UTC)
The more-or-less equivalent code in en:WP's Module:Citation/CS1/sandbox (sandbox because it is slightly different from the current live module and until now was the intended update) is at line 1762. There, we look to see how many languages are listed in |language=. If only one language is listed and if that language is English or en then we add the page to Category:CS1 maint: English language specified and then go on to render the (in English) annotation text. We could, instead of or in addition to categorizing, simply return an empty string when English is the only language in the parameter value. That more-or-less mimics fr:WP.
As far as past conversations about the language parameter, those, if they exist, would most likely be in this page's archive or in the archives of Help talk:Citation Style 1. Or not. When I wrote Monkbot task 6 I adopted the philosophy stated in the box at {{en icon}}. If I remember correctly, that's the only place that I found that addressed the issue of English language annotation in en:WP.
There has been little to no reaction from CS1|2 template users with regard to the changes I've made to |language= support. Still, if your proposal is to proceed, it might be best to, at the least, notify Help talk:Citation Style 1 with its 160 watchers since only 60 watch this page (and who probably are also watching H:CS1).
Trappist the monk (talk) 19:55, 30 April 2015 (UTC)
Pinging people there. Thank you. --Elitre (WMF) (talk) 08:28, 1 May 2015 (UTC)
I also apologize for the confusion: my point instead was a bit different. I'm taking for granted that not having the language parameter in the template (not just not having a redundant language code in the reflist) is what the community here currently desires and the reason why it gets deleted by bots. I'm not challenging the reasoning behind this, I'm asking if there's a way to achieve this by changing the template behaviour, if possible and, as I stated in my first post, if it really makes sense for everybody. If it doesn't, and Quiddity's suggestion makes more sense, or if you have a different one, great! We'd just like to understand what everybody's preference is, here. HTH. --Elitre (WMF) (talk) 19:32, 30 April 2015 (UTC)
I'm not sure I understand what you mean by changing the template behaviour. An editor (or Citoid) creates a CS1|2 template and Module:Citation/CS1 renders it. Here, we can't do anything about what editors or Citoid create. We can, though, decide how we render the citation that the reader sees. The fr:WP solution is to not display language annotation when the specified language is an acceptable form of French. The |language=English parameter is still in the wikitext. Is this what you mean by changing behavior?
Trappist the monk (talk) 19:55, 30 April 2015 (UTC)
What the French are doing is a partial solution - you wouldn't provide redundant information, but you'd still need cleanup; if an empty string (when English is the only language in the parameter value) was returned, wouldn't editors here still instruct the bot to go and delete "language=en" from wikitext anyway? I think they would, unless they decided this is not an error which needs to be corrected anymore. Since bots at enwiki currently get rid of "language=en" entirely, I'm just wondering if there is a way to prevent that parameter from being added in the first place (if this is a desirable outcome, obviously, and only when the source language matches the wiki one). Again, we're just flagging this to understand whether something needs to be done, and by whom :) Best, --Elitre (WMF) (talk) 08:28, 1 May 2015 (UTC)
The only way to keep bots from deleting something is to deny them access to the wikitext. Automated tools that create CS1|2 templates place those templates into the wikitext. The template processing tool (here, Module:Citation/CS1) has no power to rewrite the template in wikitext; it can only read and act upon what it reads where that act is to translate the data contained in the template into a correctly rendered HTML citation for display in a reader's browser.
The only place to prevent |language= parameter insertion is at the inserter: Citoid or Visual Editor. The French solution handles the cases where the inserter is a human editor constructing CS1|2 citations and where the community have decided that language annotation for sources in the local WP language need not be displayed.
Trappist the monk (talk) 10:06, 1 May 2015 (UTC)
Trappist the monk, thanks a lot for the explanation. Let me know if anything changes at some point as a result of this conversation. best, --Elitre (WMF) (talk) 06:01, 5 May 2015 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Elitre (WMF): I have changed the en:WP Module:Citation/CS1/sandbox so that citations at en:WP with |language=en or |language=English will not display the language annotation:

Cite book compare
Wikitext: {{ cite book | language=en | title=Title }}
Live: Title (in English).
Sandbox: Title.

Cite book compare
Wikitext: {{ cite book | language=English | title=Title }}
Live: Title (in English).
Sandbox: Title.

Trappist the monk (talk) 00:16, 7 May 2015 (UTC)
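
The sandbox behaviour compared above can be illustrated with a minimal Lua sketch. It is not the module's code: the function name language_annotation and the table LOCAL_LANGUAGE are hypothetical, and the real module's translation of ISO 639-1 codes to language names, its maintenance categories, and its list formatting are all left out. The sketch only shows the rule discussed in this thread: return an empty string when English is the only language listed, and keep the annotation (English included) when several languages are listed.

```lua
-- Hypothetical sketch; names are illustrative, not taken from Module:Citation/CS1.
-- Values of |language= that count as "the local wiki language" on en:WP.
local LOCAL_LANGUAGE = { en = true, english = true }

local function language_annotation(language_param)
    -- Split the comma-separated parameter value and trim each item.
    local names = {}
    for item in language_param:gmatch("[^,]+") do
        item = item:match("^%s*(.-)%s*$")
        if item ~= "" then
            names[#names + 1] = item
        end
    end

    if #names == 0 then
        return ""
    end
    -- English as the ONLY language: render nothing.
    if #names == 1 and LOCAL_LANGUAGE[names[1]:lower()] then
        return ""
    end
    -- Otherwise keep every listed language. Codes are not expanded to
    -- names here; that lookup is omitted from this sketch.
    return "(in " .. table.concat(names, ", ") .. ")"
end

print(language_annotation("en"))          -- prints an empty line
print(language_annotation("French"))      -- (in French)
print(language_annotation("fr, de, en"))  -- (in fr, de, en)
```

Note that the |language= parameter stays in the wikitext either way; only the rendered annotation changes.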

You know, thinking about how often WPMED folks (especially the translation task force) copy citations to other Wikipedias, it might be a good idea to keep the |language=en information. Maybe we should hide it, rather than removing it. WhatamIdoing (talk) 17:38, 5 May 2015 (UTC)
That is a good point. Paging Doc James, who does a lot of this copying, I believe. – Jonesey95 (talk) 18:25, 5 May 2015 (UTC)
I think it is good information to provide. Yes we are adding En references to nearly 100 other Wikipedias. Doc James (talk · contribs · email) 19:34, 5 May 2015 (UTC)

Author parsing

So after the latest blowup over template changes, I think adopting the "vauthors=" mechanism from Template:Vcite2 journal is worth serious consideration. The current approach is very unambiguous, but having the first and last name of every author as a separate parameter makes for very bloated markup, and it's definitely a burden on those of us who are still hand-typing citations. I support the goal of making our references into something semantically meaningful, because I think that many useful tools for article curation could be built if we had a reliable way of identifying which articles were supported by which references, which journals, books, authors, etc. But the content has to come first: if the semantic markup scheme is deterring authors from editing, there is a serious problem. In the big picture, it's much more important to have people identify sources in medical articles as being WP:MEDRS (or not) than it is for those references to emit correct metadata. (I think the medical articles tend to be a bigger friction point because they often have very large author lists that bloat enormously.)

So, is it possible to graft the parsing code for "vauthors=" onto the "authors=" parameter here? I assume "authors=" already dumps its entire string as a single author into the COinS metadata, so it's unlikely to make things worse. And is it possible to parse "et al." or "''et al.''" at the end of that string without needing explicit markup? I have enough experience of HTML parsing to know that this kind of "do what I mean" parsing is risky, but given the debacle of the Visual Editor rollout, we're going to have to cope with hand-writing these templates for quite a while. Explicitly marking up every author may be less prone to error, but this isn't the first time there has been pushback about this, and I think this solution might make the process of structuring references a lot less objectionable. Choess (talk) 02:18, 6 May 2015 (UTC)

I haven't given the question of how would |vauthors= parsing work within Module:Citation/CS1 a great deal of thought but I'm pretty sure that it could be done. And it would produce better metadata because a requirement of |vauthors= would be that author (and editor) name lists would be required to adhere to the Vancouver system. Because |authors= does not have that kind of requirement, a wide variety of formats can be found.
Module:Citation/CS1 already recognizes a variety of et al. forms. It is that recognition that causes the population of Category:CS1 maint: Explicit use of et al. That same recognition strips the et al. from the author and editor lists before they are sent to metadata and causes the module to render the author and editor lists with the standardized form of et al. appended. The cs1|2 templates' remit includes handling and placement of static text but it is necessary to have a proper and consistent mechanism to inform the template when it should render certain static text. This is why we have |display-authors=etal.
When editors feel free to add non-author-name text et al. to author-name parameters, I think that they then feel free to add other non-author-name text to author-name parameters in spite of instruction to the contrary in all cs1|2 documentation. Et al. is relatively easy to detect and compensate for; other text, not so easy; if it were, I'd have a category full of pages that have such cs1|2 template parameters.
Trappist the monk (talk) 09:29, 6 May 2015 (UTC)
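
To make the |vauthors= idea concrete, here is a rough Lua sketch of how a Vancouver-style name list could be split into last names and initials. Everything in it is an assumption made for illustration: the function name parse_vauthors is hypothetical, names are assumed to be comma-separated with one or two uppercase ASCII initials after the last name, and anything that does not fit that shape is kept whole (as a corporate author might be). It does not handle et al. and is not how the module does or will do it.

```lua
-- Hypothetical sketch of Vancouver-style name parsing; not module code.
-- Input:  "Yao WM, Amsler C, Particle Data Group"
-- Output: a list of tables with .last and (optionally) .first
local function parse_vauthors(vauthors)
    local names = {}
    for entry in vauthors:gmatch("[^,]+") do
        entry = entry:match("^%s*(.-)%s*$")  -- trim surrounding whitespace
        if entry ~= "" then
            -- Last token of one or two uppercase letters is taken as initials.
            local last, initials = entry:match("^(.-)%s+(%u%u?)$")
            if last and last ~= "" then
                names[#names + 1] = { last = last, first = initials }
            else
                -- No trailing initials: keep the text as a single name.
                names[#names + 1] = { last = entry }
            end
        end
    end
    return names
end

for _, name in ipairs(parse_vauthors("Yao WM, Amsler C, Particle Data Group")) do
    print(name.last, name.first)
end
```

Because each name ends up as a separate last/first pair, the metadata could in principle carry individual authors instead of one undifferentiated string, which is the advantage over |authors= described above.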

Et al 2

In English, abbreviations are set off with a comma, e.g. like this. This applies to "et al." equally, i.e. Smith, Jones, et al. When used with two or more names, the APA style expects it, viz. http://blog.apastyle.org/apastyle/2011/11/the-proper-use-of-et-al-in-apa-style.html. Similarly for Grammarist. As does ICMJE. Chicago says to use a comma unless there's only one name written in full. That's everybody except Trappist the monk, so I've restored my edit. Any third opinions? --RexxS (talk) 22:00, 13 May 2015 (UTC)

When I reverted your edit, in my edit summary I wrote: cs1|2 not bound by ICMJE; historically, cs1|2 has not used a comma before et al.
I've written this before: cs1|2 are not APA, are not Chicago, are not Bluebook, are not LSA, are not ICMJE, nor are they any other style. Certainly cs1|2 have been influenced by these styles but are not beholden to them.
Here is a simple {{cite book/old}} using {{citation/core}}. It has nine authors so it generates et al. in place of the ninth:
Author1; Author2; Author3; Author4; Author5; Author6; Author7; Author8 et al. Title. 
This form has been in place since this edit to {{citation/core}} on 7 October 2009. That style continues in use to the present day in Module:Citation/CS1.
In the edit summary of your revert of my revert you wrote: not bound by your preferences either - see talk. I have made no claim of personal preference with regards to a comma preceding et al.; if you can show where I have, please do so, otherwise, please do not put words into my mouth that I have not spoken.
Trappist the monk (talk) 00:16, 14 May 2015 (UTC)
RexxS has made a bold edit, and that edit has been reverted. Now we discuss. That's how WP works. Let's not have an edit war in a sandbox.
One of the things that we typically do on this page, or on Help talk:Citation Style 1, which is watched by more editors and serves as a better place to discuss changes to the module, is suggest a change and show some examples of how the change would be implemented. Then the change, if it meets with approval (or at least tentative approval, or perhaps aggressive lack of interest, or outright ambivalence), can be implemented in the sandbox and examples of the before/after rendering can be shown.
A suggestion has been made to insert a comma before "et al." in author and editor lists. Shall we attempt to implement that change in the sandbox and then display some test cases here to see if it works as intended? – Jonesey95 (talk) 03:21, 14 May 2015 (UTC)
One of the problems that besets Wikipedia is that it is being fossilised by editors who insist that "we have done it this way in the past" is an argument against any change. It isn't any argument at all. @Trappist the monk: Your revert summary made two points, neither of which provided any objective reason why it would be better to have no separator before " et al." Your argument is clearly then nothing more than your personal preference, and I make no apologies for pointing that out to you. I have provided multiple objective reasons: the use of commas before abbreviations is standard English grammar; all other style guides that I know of require a comma. I know we're not obliged to follow other style guides, but a lack of obligation to do something is poor excuse for not doing it, and I'd ask why you would not want to adopt a style that was consistent with what readers see in almost every other serious publication? --RexxS (talk) 11:58, 14 May 2015 (UTC)
Do not presume to think that I am opposed to change; I am not. For evidence of that look at the history of this module; read Help talk:Citation Style 1 and its archives.
Lest you continue to put words into my mouth that I have not spoken, let me definitively state my position with regards to punctuation preceding et al.: I am neither in favor of nor opposed to punctuation preceding et al. in editor- and author-name-lists; in short, I do not care.
If the community are content to have et al. rendered without preceding punctuation, then I accept that. If the community determines through discussion that cs1|2 should render et al. with preceding punctuation, then I accept that.
Trappist the monk (talk) 13:14, 14 May 2015 (UTC)
  • Support – The only argument presented above for not including a serial comma before "et al." is that it hasn't been included since 7 October 2009, which is not a strong argument. Furthermore the removal of the comma was apparently made with no discussion. As journal style guides overwhelmingly support including a comma, the argument in favor is much stronger. Boghog (talk) 03:26, 14 May 2015 (UTC)
  • Partial support: I think that "Smith, Alan; Brown, Jane; et al." makes sense. In the standard CS1 style, I support changing to use a semicolon before "et al.", not a comma, because other authors are separated from one another by semicolons. If the separator of choice is a comma (i.e. |mode=cs2|name-list-format=vanc), then use a comma before "et al." – Jonesey95 (talk) 03:39, 14 May 2015 (UTC)
    |mode= does not change the style of separators used in the author and editor name lists:
    |mode=cs1:
    Last1, First1; Last2, First2; Last3, First3; Last4, First4. Title. 
    |mode=cs2:
    Last1, First1; Last2, First2; Last3, First3; Last4, First4, Title 
    but |name-list-format=vanc does:
    Last1 First1, Last2 First2, Last3 First3, Last4 First4. Title.  Vancouver style error (help) – the error appears here because the example uses enumerated names
    Trappist the monk (talk) 11:28, 14 May 2015 (UTC)
Good point. A semicolon should be used unless |name-list-format=vanc or |mode=cs2 in which case a comma should be used. Boghog (talk) 04:28, 14 May 2015 (UTC)
I corrected myself above. I'm not used to these new formatting parameters, but I knew there was some way to have commas separating names. – Jonesey95 (talk) 13:47, 14 May 2015 (UTC)
The new |vauthors= and |veditors= cause the module to rewrite their content as a last-first list and then render it in Vancouver system style without requiring |name-list-format=vanc.
Trappist the monk (talk) 14:05, 14 May 2015 (UTC)
  • Partial support per Jonesey95 for consistency with the existing formatting and how this is done in other style guides. Imzadi 1979  04:07, 14 May 2015 (UTC)
  • Support. Serial comma is the norm here, and so far no real objections have been raised. "Standard English grammar" is of course a red herring, as these are citations rather than sentences. My only concern would be to clarify usage when combining individuals with corporate authors. LeadSongDog come howl! 12:19, 15 May 2015 (UTC)
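
The proposal that emerges from the comments above (put the list's own separator before "et al.": a semicolon in the default style, a comma when names are rendered in Vancouver style) can be sketched in Lua as follows. The function render_name_list and its arguments are hypothetical and exist only for this illustration; the sketch ignores the Chicago single-name exception and the corporate-author question raised above, neither of which the discussion has settled.

```lua
-- Hypothetical sketch of the proposed rendering; not module code.
-- names:     list of already-formatted author names
-- vancouver: true when names are rendered Vancouver style (comma-separated)
-- etal:      true when "et al." should be appended
local function render_name_list(names, vancouver, etal)
    local separator = vancouver and ", " or "; "
    local text = table.concat(names, separator)
    if etal then
        -- Reuse the name separator so "et al." is set off like any other item.
        text = text .. separator .. "et al."
    end
    return text
end

print(render_name_list({ "Smith, Alan", "Brown, Jane" }, false, true))
-- Smith, Alan; Brown, Jane; et al.
print(render_name_list({ "Smith A", "Brown J" }, true, true))
-- Smith A, Brown J, et al.
```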

Redirect this page to Help talk:Citation Style 1?

Should we redirect this page to Help talk:Citation Style 1? That page is watched by 160 editors, while this one is watched by only 60, and they are essentially the same forum. I think we should have just one discussion location for issues relating to the CS1 templates. – Jonesey95 (talk) 03:21, 14 May 2015 (UTC)

I would support this.
Trappist the monk (talk) 11:33, 14 May 2015 (UTC)
Given the above discussion occurred, absolutely. --Izno (talk) 19:49, 14 May 2015 (UTC)
And see the proposals below. I support the redirect. Just merge the existing discussions into the talk over there, if still active, archive them otherwise.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  02:15, 14 June 2015 (UTC)

Et al 3

Please see this thread at the Helpdesk. The author name Sheetal is being rendered as She et al. in {{citation}}. Something to do with the regex for et al., apparently. Thanks. - Sitush (talk) 18:32, 26 May 2015 (UTC)

(ec) For details, compare:

  • {{citation |first=Sheetal |last=Ranjan |chapter=Crimes Against Women in India |editor-first=N. Prabha |editor-last=Unnithan |title=Crime and Justice in India |year=2013 |publisher=SAGE Publications |isbn=978-8-13210-977-8 |url=https://books.google.co.uk/books?id=k_6HAwAAQBAJ}}
  • Ranjan, She et al. (2013), "Crimes Against Women in India", in Unnithan, N. Prabha, Crime and Justice in India, SAGE Publications, ISBN 978-8-13210-977-8 

This is evidently caused by the over-eager regexp in the following code line:

local pattern = ",? *'*[Ee][Tt] *[Aa][Ll][%.']*$"

which will recognize an "et al" mark even if it has neither a space between the two words nor a word boundary before it. Could we have a "\w" check or something of the sort built in to the beginning of that regexp to avoid this?

Fut.Perf. 18:34, 26 May 2015 (UTC)

Specifically, I'd recommend replacing ",? *" with "(, *| +)" (i.e. either a comma plus optional space, or at least one space to separate the "et al" string from the preceding text). Fut.Perf. 06:53, 27 May 2015 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── Lua doesn't support the regex alternation operator (|). Fixed in the sandbox I think:

Citation compare
Wikitext: {{ citation | chapter=Crimes Against Women in India | last=Ranjan | editor-last=Unnithan | isbn=978-8-13210-977-8 | first=Sheetal | publisher=SAGE Publications | title=Crime and Justice in India | editor-first=N. Prabha | chapter-url=//books.google.co.uk/books?id=k_6HAwAAQBAJ&pg=PA249 | year=2013 }}
Live: Ranjan, She et al. (2013), "Crimes Against Women in India", in Unnithan, N. Prabha, Crime and Justice in India, SAGE Publications, ISBN 978-8-13210-977-8
Sandbox: Ranjan, Sheetal (2013), "Crimes Against Women in India", in Unnithan, N. Prabha, Crime and Justice in India, SAGE Publications, ISBN 978-8-13210-977-8

Trappist the monk (talk) 13:19, 30 May 2015 (UTC)
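
For readers following along, one way to get the effect of the suggested "(, *| +)" prefix without alternation is to try two Lua patterns in turn. The sketch below is an illustration only and is not claimed to be what the sandbox actually does; the helper name strip_etal and the table ETAL_PATTERNS are hypothetical. It keeps the rest of the original pattern unchanged and only requires that "et al" be preceded by a comma or by at least one space.

```lua
-- Hypothetical sketch; not the sandbox code.
-- Lua patterns have no alternation, so the two permitted prefixes
-- (comma plus optional spaces, or one or more spaces) are tried in turn.
local ETAL_PATTERNS = {
    ", *'*[Ee][Tt] *[Aa][Ll][%.']*$",  -- e.g. "Smith, et al."
    " +'*[Ee][Tt] *[Aa][Ll][%.']*$",   -- e.g. "Smith et al."
}

-- Returns the name with any trailing "et al." removed, plus a flag
-- saying whether something was removed.
local function strip_etal(name)
    for _, pattern in ipairs(ETAL_PATTERNS) do
        local stripped, count = name:gsub(pattern, "")
        if count > 0 then
            return stripped, true
        end
    end
    return name, false
end

print(strip_etal("Sheetal"))        -- Sheetal   false
print(strip_etal("Smith et al."))   -- Smith     true
print(strip_etal("Smith, et al"))   -- Smith     true
```

With the original single pattern, "Sheetal" matches because both the leading comma and the spaces are optional; requiring one or the other is what leaves the name intact.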

Hm. Like practically everything else that happens here, you are talking in cryptic terms that simply fly over the head of most people. Does fixing in the sandbox mean that there is going to be a proper fix or am I supposed to work it out by deploying whatever hack you did there? Female infanticide in India still shows the error. - Sitush (talk) 19:56, 7 June 2015 (UTC)
The sandbox is the development environment for the citation templates. Changes are introduced and tested there, typically after a discussion like this one. Once the changes have been tested, the changes are moved to the main module, which makes them active in the templates. Because the citation templates are used millions of times in articles, the main module code is changed only once every few months. – Jonesey95 (talk) 00:48, 8 June 2015 (UTC)
Thank you. Your response makes sense, although I'd query the testing bit given the number of bug reports that seem to appear here ;) - Sitush (talk) 01:47, 8 June 2015 (UTC)

Let's add a collaboration parameter.

Large science projects will very often have a massive list of authors. See for example, the 2012 Review of Particle Physics list of authors. The usual way of citing these massive collaborations is typically to have "J. Smith et al. (Collaboration name)" or similar (for a Wikipedia example, see [1].) This is usually achieved with the less-than-desirable a)

 |author1=W.-M. Yao ([[Particle Data Group]])
 |author2=...
 |display-authors=1
 |year=2012
 

which yields the broken/incorrect W.-M. Yao (Particle Data Group) et al. (2012), or sometimes with b)

 |last1=Yao |first1=W.-M.
 |last2=... |first2=...
 |coauthors=et al. ([[Particle Data Group]])
 |year=2012

which yields a correct Yao, W.-M et al. (Particle Data Group) (2012), and other similar hacks.

The real/best solution would be to add a |collaboration= parameter that would allow writing c)

 |last1=Yao |first1=W.-M.
 |last2=... |first2=...
 |display-authors=1
 |collaboration=[[Particle Data Group]]

or alternatively d)

 |last1=Yao |first1=W.-M.
 |last2=... |first2=...
 |display-authors=1
 |collaboration=Particle Data Group |collaboration-link=Particle Data Group

in order to generate the correct Yao, W.-M et al. (Particle Data Group) (2012).

This should apply across the board, in both {{citation}} and {{cite xxx}} styles. Headbomb {talk / contribs / physics / books} 19:45, 7 June 2015 (UTC)
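
As a rough illustration of the rendering that options c) and d) are meant to produce, here is a small Lua sketch. It is not module code and not a proposed implementation: the function render_author_block and its arguments are hypothetical, and it only assembles the "Author et al. (Collaboration) (Year)" string from the example above, wrapping the collaboration in a wikilink when a separate link target is supplied.

```lua
-- Hypothetical sketch of the requested output; not module code.
local function render_author_block(lead_author, collaboration, collaboration_link, year)
    local text = lead_author .. " et al."
    if collaboration and collaboration ~= "" then
        local label = collaboration
        -- Option d): a separate link target produces a piped wikilink.
        if collaboration_link and collaboration_link ~= "" then
            label = "[[" .. collaboration_link .. "|" .. collaboration .. "]]"
        end
        text = text .. " (" .. label .. ")"
    end
    if year and year ~= "" then
        text = text .. " (" .. year .. ")"
    end
    return text
end

print(render_author_block("Yao, W.-M", "Particle Data Group", nil, "2012"))
-- Yao, W.-M et al. (Particle Data Group) (2012)
```

Keeping the collaboration out of the author-name parameters is the point of the request: the name fields then hold only names, and the collaboration text is placed by the template.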

You can now do |last1 = Yao |first1=W.-M. |display-authors=etal rather than |last1=Yao |first1=W.-M. |last2=... |first2=... |display-authors=1. --Izno (talk) 16:47, 8 June 2015 (UTC)
Yes, but that's not really relevant to the collaboration parameter. Headbomb {talk / contribs / physics / books} 01:50, 9 June 2015 (UTC)
Which is why I marked it up with a small?... I suppose I could have done an <aside>...</aside> to be all HTML5-groovy... --Izno (talk) 03:02, 9 June 2015 (UTC)
  • Discussions like this should be at Help talk:Citation Style 1. The Module talk:Citation/CS1 talk page is for discussion of how to code up the consensuses arrived at over there.  — SMcCandlish ¢ ≽ʌⱷ҅ʌ≼  02:11, 14 June 2015 (UTC)
I really, really doubt anyone would oppose this. The current options (A and B, above) cannot possibly be what consensus wants. The only tricky bit would be whether we want a |collaboration= + |collaboration-link=, but personally I would leave that to template coders to decide on. Headbomb {talk / contribs / physics / books} 12:39, 14 June 2015 (UTC)