User:Jason Quinn/Cite template parameters do have a mostly determined logical ordering

From Wikipedia, the free encyclopedia

Wikipedia is based on textual data. Every article is ultimately created by editing text, so-called "wikitext". Some of that wikitext consists of "templates" that process text. Among the most common templates are our cite and citation templates, used to aid formatting the references in our articles. These cite templates consist of the template name followed by a sometimes large number of template parameters for things like the title and author names. Editors usually pay no attention to the ordering of these parameters, but they should. This essay will help explain why and suggest a nearly "perfect" way of ordering the parameters that "just makes sense". The main idea is that parameters that go together should be grouped together and that information should roughly follow the order of the presentation.

Key big ideas

  • The cite template parameters should roughly be in the order in which they tend to display.
  • Parameters that are closely related should be closely grouped.
  • Very roughly speaking, the ordering of the parameters should flow from general to specific (e.g., from information about the volume itself, like author, title, and year, to information about the location within the volume, like page) and from important to less important (the author is clearly of very high importance, while an LCCN is more candy than key).
  • Parameters with information intended to be read by humans should be near the front and computer-readable parameters (like URLs) should be near the back. I like to use the parameter |url= as a signal to editors that everything until the end of the reference can be semi-ignored while copy-editing the source.
  • Spacing matters in cite templates! There is an almost objectively best choice for spacing, which is |last=Bloggs with a space before the pipe, and not | last= Bloggs or | last = Bloggs or any other permutation. This is best because of how text wrapping works, for instance in Wikipedia's traditional source editor. All the other variations make the source code harder to parse and make editors prone to introducing errors because the text wrapping is confusing. This is one point where editors might object that it's a pointless change, so I try not to make spacing changes by themselves, but it's also important for an article's source to be consistent in style. Articles almost never are. So, preemptively, to editors who wish to object to tweaks in spacing: if an editor is fully copy-editing an article's references, let them work in the style they like. Almost nobody does this dull, laborious task of making articles consistent, so unless you wish to do it too, don't bother somebody who's actually doing the valuable work because you don't like the way they go about it.
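The spacing convention in the last point can be illustrated with a small script. This is only a sketch under simplifying assumptions: the function name normalize_spacing is made up for illustration, the template is assumed to be flat (no nested templates and no piped wikilinks inside values), and closing the template with no space before "}}" is a choice made here, not something the essay specifies.

```python
def normalize_spacing(template: str) -> str:
    """Rewrite '| last = Bloggs' style parameters as ' |last=Bloggs'.

    Assumption: a single flat {{...}} template, so every '|' separates
    two parameters. Real wikitext with nested templates or piped links
    would need a proper parser.
    """
    inner = template.strip()
    if not (inner.startswith("{{") and inner.endswith("}}")):
        raise ValueError("expected a single {{...}} template")
    parts = inner[2:-2].split("|")
    name = parts[0].strip()
    params = []
    for part in parts[1:]:
        # Split on the first '=' only, so URLs containing '=' survive.
        key, sep, value = part.partition("=")
        if sep:
            params.append(f"|{key.strip()}={value.strip()}")
        else:
            # Positional parameter with no '=': just trim it.
            params.append(f"|{part.strip()}")
    # One space before each pipe, none around '='.
    return "{{" + " ".join([name] + params) + "}}"


print(normalize_spacing("{{cite book | last = Bloggs | first= Joe |title=A Book }}"))
```

Only whitespace is changed; parameter names, values, and order are left alone.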

Specific smaller ideas

Many of my edits include changes like this:

  • As per the cite template documentation, use the parameters |last= and |first= rather than |author= when appropriate.
  • If there is only one author |last= and |first= should be used and not |last1= and |first1=. Conversely, if there's more than one author, then |last1= and |first1= should be used and not |last= and |first=.
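The numbering rule in the second point is mechanical enough to sketch in code. The helper below is hypothetical (the name fix_author_numbering is an assumption), works only on a list of parameter names, and deliberately ignores related parameters such as |author-link=.

```python
import re

# Matches "last", "first", "last1", "first2", and so on.
_AUTHOR_KEY = re.compile(r"^(last|first)(\d*)$")


def fix_author_numbering(keys):
    """Apply the one-author / many-authors numbering rule to parameter names.

    One author:   last1/first1 -> last/first
    Many authors: last/first   -> last1/first1
    """
    numbers = set()
    for key in keys:
        match = _AUTHOR_KEY.match(key)
        if match:
            # An unnumbered name counts as author number 1.
            numbers.add(int(match.group(2) or 1))
    multiple = len(numbers) > 1

    fixed = []
    for key in keys:
        match = _AUTHOR_KEY.match(key)
        if match and multiple and match.group(2) == "":
            fixed.append(match.group(1) + "1")
        elif match and not multiple and match.group(2) == "1":
            fixed.append(match.group(1))
        else:
            fixed.append(key)
    return fixed


print(fix_author_numbering(["last1", "first1", "title"]))
print(fix_author_numbering(["last", "first", "last2", "first2"]))
```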

Motivating example

When the author name is given with |last= and |first=, why would you not want to keep those two things close together in the list of parameters? Unfortunately, it is not uncommon to see them separated "in the wild". The problem is that an editor seeing |last= by itself might think that |first= is missing, and when the parameter list is long, it can be difficult to find the matching |first= and easy to overlook it, which causes them to add another one. In the end it just causes wasted time and effort. They go together like peas and carrots and should be adjacent.

It's less clear in which order the author parameters should appear. Should it be |last= then |first=, or |first= then |last=? Here there's more leeway, but putting |last= first is better. Why? Two reasons. It matches the most common presentation order of the cite template, which helps locate the source text for a given reference in the article. It also helps when alphabetizing references by author last name, as is often done in the References and Bibliography sections.

There's one more parameter to mention, |author-link=, which is commonly used and associated with an author. It should come after the name parameters such that the ordering is,

|last= |first= |author-link=

There's a common alias for |author-link= which is |authorlink=. I actually prefer the dashless version because it's less visual clutter. But I am flexible either way on this. More important is that the article is consistent in its usage throughout.

With this idea explained in slow detail, let's move on to the first "group" of parameters that has a rationale behind its ordering: the "Authors group". Each of the following groups could have a similar explanation of why it is a group and why the suggested order is what it is, but I will mostly be brief.

Obvious groupings

Authors group

The parameters related to an author should go together like so

|last= |first= |author-link=

or, if there's more than one author,

|last1= |first1= |author-link1= ... |lastN= |firstN= |author-linkN=

where "N" stands for the number of the last author.

There are several assumptions here justified in the § Motivating example:

  • |author-link='s go after the names.
  • |last= comes before |first=.

Lastly, sometimes you'll see |display-authors=, which should obviously come after the actual author names.

Editors group

Similarly, after the authors should go the editors.

|editor-last= |editor-first= |editor-link=

for just one editor, or, if there's more than one,

|editor1-last= |editor1-first= |editor1-link= ... |editorN-last= |editorN-first= |editorN-link=

The same advice about numbering for authors applies to editors: If there's just one, don't number the parameters and if there's more than one editor, number the first.

Url group

Common cite template parameters are |url= and |access-date= (also written |accessdate=), the latter of which should only be used if |url= is present. These two parameters go together like peas and carrots. It makes no sense to have them far apart from each other; they should be next to each other. In fact, having them far apart introduces several problems. It makes noticing that both are there much harder, so sometimes editors will change the value of |url= without updating |accessdate= because they didn't even notice |accessdate= was used. When they are randomly scattered, it also slows down the human parsing the cite template, which makes editing less productive. It also makes sense that |accessdate= goes after |url= because it depends on the URL. Conclusion: these two parameters should appear in this order:

|url= |accessdate=

Similarly, there's often this pair

|archive-url= |archive-date=

which clearly belong together in that order.

You'll also see |url-status=. This most tightly refers to |url=, so it should be closer to it, and the logical ordering should be:

|url= |accessdate= |url-status= |archive-url= |archive-date=
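The rules for this group (an access date only with a URL, the members adjacent, and in the order above) could be checked automatically. The following is a rough sketch, not an existing tool: the function name url_group_issues and the wording of the messages are assumptions, and it looks only at parameter names, treating |access-date= and |accessdate= as the same parameter.

```python
# Suggested order of the URL group, taken from the line above.
URL_GROUP = ["url", "accessdate", "url-status", "archive-url", "archive-date"]
# Assumption: treat the hyphenated spelling as the same parameter.
ALIASES = {"access-date": "accessdate"}


def url_group_issues(keys):
    """Return a list of problems with the URL group in a parameter-name list."""
    canon = [ALIASES.get(key, key) for key in keys]
    issues = []

    # The access date only makes sense when there is a URL.
    if "accessdate" in canon and "url" not in canon:
        issues.append("access date given without url")

    # The group members should sit next to each other.
    positions = [i for i, key in enumerate(canon) if key in URL_GROUP]
    if positions and positions[-1] - positions[0] + 1 != len(positions):
        issues.append("url group parameters are not adjacent")

    # The members should follow the suggested order.
    present = [key for key in canon if key in URL_GROUP]
    if present != sorted(present, key=URL_GROUP.index):
        issues.append("url group parameters are out of order")

    return issues


print(url_group_issues(["title", "url", "access-date", "url-status"]))
print(url_group_issues(["url", "title", "accessdate"]))
```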

Publisher group

There's another obvious pair

|publisher= |location=

Title group

|title= |trans-title= |edition= |chapter=

It's obvious that |trans-title=, if it exists, should come directly after |title=. Also, |edition= is very closely connected to |title= and should stay near it.

The use of |chapter= is rarer but not uncommon. It is kind of an anomaly because it describes a part of a source instead of the whole source. But due to the way it is rendered, this needs to be part of the title group.

Date group

|date= |orig-date=

OR

|year= |orig-year=

A preliminary sketch for a group ordering

[Authors group] [Editors group] [Date group] [Title group] [Publisher group] [Url group]

This group order basically follows the citation presentation for both CS1 and CS2 cite templates. It really helps source code editors to have the cite parameters match up roughly with the cite template's presentation.
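To make the sketch concrete, here is one way the group order could be applied to a list of parameters. It is a minimal illustration under stated assumptions, not an existing tool: the names GROUP_ORDER, ALIASES, and order_params are invented here, |display-authors= is given its own slot right after the authors, and parameters the sketch does not cover (such as |isbn=) are placed just before the URL group, which is a choice made for this example.

```python
import re

# Groups in the preliminary order; names within a group are in suggested order.
GROUP_ORDER = [
    ["last", "first", "author-link"],                 # Authors group
    ["display-authors"],                              # after all author names
    ["editor-last", "editor-first", "editor-link"],   # Editors group
    ["date", "orig-date", "year", "orig-year"],       # Date group
    ["title", "trans-title", "edition", "chapter"],   # Title group
    ["publisher", "location"],                        # Publisher group
    ["url", "accessdate", "url-status", "archive-url", "archive-date"],  # Url group
]
# Assumption: treat these spellings as the same parameter.
ALIASES = {"access-date": "accessdate", "authorlink": "author-link"}


def _rank(key):
    """Sort key: (group index, author/editor number, position within group)."""
    digits = re.search(r"\d+", key)
    number = int(digits.group()) if digits else 0
    base = re.sub(r"\d+", "", key)        # "editor2-last" -> "editor-last"
    base = ALIASES.get(base, base)
    for group_index, names in enumerate(GROUP_ORDER):
        if base in names:
            return (group_index, number, names.index(base))
    # Unknown parameters: just before the Url group (an assumption).
    return (len(GROUP_ORDER) - 1.5, 0, 0)


def order_params(params):
    """Reorder (name, value) pairs; sorted() is stable, so ties keep their order."""
    return sorted(params, key=lambda pair: _rank(pair[0]))


example = [
    ("url", "https://example.org"), ("first", "Joe"), ("title", "A Book"),
    ("isbn", "123"), ("last", "Bloggs"), ("accessdate", "1 January 2020"),
    ("year", "1999"),
]
print([name for name, _ in order_params(example)])
```

Because the sort is stable, parameters that fall outside the sketch keep their original relative order.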

One thing to point out here. It is much nicer to have the URL group near the end of the citation. URLs tend to be long strings of semi-random characters. Putting most of the material of human interest before the URL stuff helps ensure it's read and scrutinized.

But we are missing some key parameters here, especially things like metadata related to the source (ISBNs, DOIs, etc.). Generally we will want to group these together, but this requires more discussion.

Metadata group

For cite journal

For this we will need to identify some subgroups of metadata. Let's start with a group commonly seen with {{cite journal}}.

|volume= |issue=

These two clearly go together and should appear as shown. However, they are often used with |page= (or |pages=). Now |page= is a very special parameter. It is the first parameter that does not give information that applies to the entire source. It's an internal detail about a source. Generally speaking, we will want this order: |volume= |issue= |page=. But ideally we specify internal details only after we have completely specified the source details. This is why the {{rp}} template is sometimes used after the ref tags. Unfortunately, the presentation that {{rp}} produces is kind of clunky and it's often avoided for that reason.

But generally speaking we would like things like this:

[Details that apply to the whole source] followed by [Details that specify location within the source]

but unfortunately there are technical hurdles to doing this ideally within MediaWiki using ref tags and templates.

Long story short, we want |page= as far toward the end of the list of parameters as possible. But this runs into a problem: that "unreadable" URL group is there. The most practical solution is to put |page= just before the URL group, which has the benefit of keeping it near |issue=.

These aren't the only metadata parameters. There's a whole bunch of them (|isbn=, |sbn=, |jstor=, etc.). Generally speaking, I put these other metadata parameters after |publisher=, since the publisher is often responsible for creating these IDs, but before the |volume= |issue= |page= subgroup. While I tend to put the ubiquitous |isbn= directly after |publisher=, for the most part the order of the less common ones doesn't matter at all.

For cite book

For books, things are a bit different. There, |volume= seems better suited to be part of the title group, so I'd usually use |title= |volume=.

Spacing

Odds-n-ends

Some parameters don't have a particularly obvious place to go. Just do your best in those cases. Also, the set of parameters itself is still evolving and changing, so copy-editors who like to clean up article source need to stay abreast of that and adapt.

See also