Wikipedia talk:Manual of Style: Difference between revisions

Pi zero (talk | contribs)
Line 596: Line 596:
:::* None of the other systems that have been mentioned here is "opposite" to logical quotation, so in calling it ''logical quotation'' there is no implication that any of these other systems is "the opposite of logical", even if there were such a thing as "the opposite of logical" in this situation (which there isn't).
:::[[User:Pi zero|Pi zero]] ([[User talk:Pi zero|talk]]) 00:17, 31 May 2009 (UTC)

::::Calling one system "logical" implies that the others are less logical or illogical entirely. It might be nice if it didn't, but it does. [[User:Darkfrog24|Darkfrog24]] ([[User talk:Darkfrog24|talk]]) 00:48, 31 May 2009 (UTC)


== [[WP:MOS#Quotation marks]] ==

Revision as of 00:48, 31 May 2009

WikiProject Manual of Style
This page falls within the scope of the Wikipedia:Manual of Style, a collaborative effort focused on enhancing clarity, consistency, and cohesiveness across the Manual of Style (MoS) guidelines by addressing inconsistencies, refining language, and integrating guidance effectively.
This page falls under the contentious topics procedure and is given additional attention, as it is closely associated with the English Wikipedia Manual of Style and the article titles policy. Both areas are subjects of debate.
Contributors are urged to review the awareness criteria carefully and exercise caution when editing.
For information on Wikipedia's approach to the establishment of new policies and guidelines, refer to WP:PROPOSAL. Additionally, guidance on how to contribute to the development and revision of Wikipedia's policy and guideline documents is available, offering valuable insights and recommendations.

See also
Wikipedia talk:Writing better articles
Wikipedia talk:Article titles
Wikipedia talk:Quotations
Wikipedia talk:Manual of Style (dates and numbers)
Wikipedia talk:Manual of Style/quotation and punctuation

En dashes vs. hyphens

Following the section here on en dashes, I moved Ural-Altaic languages to Ural–Altaic languages. However, I've gotten complaints saying that a hyphen is used in the literature, and that takes precedence over the MOS. Since punctuation varies from source to source, it doesn't seem that clear-cut to me. So I'd like your input:

  1. Does the punctuation of academic literature take precedence over Wikipedia's MOS? (per complaint, "they are inappropriate in established linguistic names", as in #2 below)
  2. When a language family is named after two languages (Yuki–Wappo, named after the Yuki and Wappo languages) or geographic areas (Niger–Congo, spoken along the Niger and Congo rivers), and neither is a prefix, should we use the en dash? A hyphen is always used in the lit, but is this an orthographic issue, or a punctuation issue?
  3. What if one of the names is shortened to its root form? "Uralic–Altaic languages" I think should be en-dashed, and "Uralo-Altaic languages" clearly should be hyphenated, but what about "Ural-Altaic/Ural–Altaic languages", which is Ural Mountains plus Altai Mountains plus the suffix -ic?

I've also gotten more general complaints:

"En-dashes are also ridiculous since they are not easy to type. Use them in mathematical formulas, but not in connected English text or in hyphenated vocabulary items." [I'm not sure what use they would have in math formulas.]

kwami (talk) 00:50, 8 March 2009 (UTC)

This seems to me a case where a hyphen is correct. The use is for conjunction, not disjunction. There is not a from–to, versus, or opposition sense between the two terms that would indicate en-dash usage. But I'm not an expert; wait for other opinions. -- Tcncv (talk) 06:52, 8 March 2009 (UTC)
You could argue Niger-Congo is a from-to (spoken from the Niger to the Congo). kwami (talk)
See the "Usage guidelines" subsection under Dash#En dash. --Wulf (talk) 09:10, 8 March 2009 (UTC)
Yes, that is what we are discussing, but what is your interpretation? -- Tcncv (talk) 18:51, 8 March 2009 (UTC)
To me these seem very much like Bose–Einstein condensate, which the CMS would not require an en dash for, but which has an en dash in the title here on Wikipedia. I've never seen that phrase with an en dash in the academic lit either, so it does seem to be a good illustration for my question.
Another objection I've heard (see my talk page) is that readers looking things up will be constantly redirected from hyphenated search strings, wasting their time and concentration wondering how what they entered was wrong. Is this a concern for anyone else? If it is, should we set up a bot to fix links to en-dashed article titles? kwami (talk) 10:04, 8 March 2009 (UTC)
I am retracting my earlier opinion. It appears that the Wikipedia MoS also prefers en-dash usage for "and" relationships, which (I believe) include both the Bose–Einstein condensate and Niger–Congo languages cases. It appears that other style guides are mixed on this issue, with some (such as Chicago) preferring hyphens. Again, I am not an expert. (As a side note, is it just me, or is it confusing to have the "and" condition covered under "disjunction"?)
Other opinions are requested. -- Tcncv (talk) 18:51, 8 March 2009 (UTC)
This is another case in which WP:MOS has been written to "reform" English, rather than to record what it does. This should be fixed. Septentrionalis PMAnderson 23:14, 8 March 2009 (UTC)
This is also an archaic preservation of typesetting style which is inappropriate for computer usage. Hyphens are all that are required. The distinction between an en-dash and a hyphen is strictly a holdover from the printing industry. In linguistics, we never use en-dashes in formulations like Niger-Congo and Ural-Altaic and Amto-Musan, etc. An additional objection to the silly wording of this MOS concerning en-dashes is their use in "and" constructions. That would require them in "hyphenated" names, then, as well, such as Meredith Whitney-Bowes, for example. I am looking at the January 2008 issue of International Journal of American Linguistics right now. Page 2 "Patla-Chicontla Totonac" [hyphen], page 59 "Uto-Aztecan" [hyphen], page 89 "Proto-Cholan" [hyphen]. These are constructions of language names. However, in the formulation on page 141, an en-dash is correctly used in the formulation "Cherokee—English Dictionary". "Cherokee-English" is not an accepted linguistic formulation and the dictionary is clearly "from" Cherokee "to" English. On page 142, we also see "Eastern Ojibwa—Chippewa—Ottawa Dictionary" with en-dashes. Thus, the formulation "from—to" is a correct usage of the en-dash, while the "and" construction is not. ("Niger-Congo" is not a "from—to" construction, but is an "and" construction--"the languages of the Niger and Congo Basins"). (Taivo (talk) 22:26, 9 March 2009 (UTC))[reply]
Playing devil's advocate, "proto-" and "Uto-" would always be hyphenated, because they're prefixes.
I'm not so sure "Niger-Congo" is an "and" formulation: language families are frequently named for their geographic extremes, in effect 'the languages from A to B'.
What about cases where one or both of the joined names contain more than one word? Or hyphenating an already hyphenated name, as often happens with "proto-"? kwami (talk) 23:03, 9 March 2009 (UTC)
Linguistic usage always prevails--hyphens all the way. And we all know that the "Niger-Congo" family extends far beyond the Niger and Congo Rivers. Indeed, Atlantic-Zambezi would be a more accurate "extension" description. The argument for "geographical" extension works (in archaic typesetting terms if necessary at all) only for terms that are not accepted linguistic names. Accepted linguistic names should always be hyphenated because they are proper names. In examining linguistic usage, the only time one finds en-dashes (or, better, em-dashes) is in forms (as cited above) that are dictionary names, "From Cherokee to English". These are not geographical ranges, but "translation ranges" only. But, in the end, en-dashes are silly retentions from typesetting and have no real function in the modern, computerized world. (Taivo (talk) 23:58, 9 March 2009 (UTC))[reply]
Your "proper names" argument may be the way to go. That would take care of hyphenated surnames as well. (Except that dictionary titles are also proper names. "Unitary terms", perhaps?) However, em dashes are not appropriate for dictionaries. An em dash would give a bizarre reading, rather like a colon, as though the title were "Cherokee" and the work were an English dictionary. kwami (talk) 00:19, 10 March 2009 (UTC)
When combining hyphenated forms into larger units, older sources that were typeset sometimes used en-dashes to combine hyphenated forms. Thus, in the Handbook of Native American languages, Volume 10, Southwest (1983, typeset by linotype), on page 115 we find "Proto-" added to "Uto-Aztecan" with an en-dash. But in Mithun's The Languages of Native North America (1999, computer typeset) we find "Athapaskan-Eyak-Tlingit" with all hyphens (page 346 ff) and on page 123 "Proto-Uto-Aztecan" with all hyphens. In linguistics books, all the linotype-typeset (precomputer) books I checked have occasional (although not universal) en-dashes in places and all the computer-typeset books I checked have hyphens all the way. This WP:MOS appears to be a misguided attempt to turn back the clock to a precomputerized typesetting era. (Taivo (talk) 00:12, 10 March 2009 (UTC))[reply]
What we'll end up with then is differing punctuation standards depending on the topic of the article. That doesn't seem to be a tenable situation. (I don't see the point of the AET example, but pUA captures the diff.)
There's also the issue of precision. Within linguistics, the meaning of these names is obvious. However, they're not always so obvious to the non-linguist. Granted, hyphens are not wrong, but en dashes help disambiguate. This reminds me of punctuation in quotations. A final period or comma may come before or after the quotation mark, depending on the style guide we're following. However, here on WP we've decided to follow logical order, as being an encyclopedia warrants precision in such matters. kwami (talk) 00:19, 10 March 2009 (UTC)
Sorry for not being more specific--Athapaskan-Eyak-Tlingit is a combination of Athapaskan-Eyak with Tlingit. What you are implying about the precision comment is that books published by linguists are imprecise and that Wikipedia is somehow more precise. Ahem. Within linguistics, hyphens are now standard usage for all proper names of languages. That's the Manual of Style which should be followed for all language and linguistics articles. Within our field, we get to establish what is standard usage and what is not. These are proper names in the same way that Meredith Baxter-Birney is a proper name. When you start using an en-dash in her name, then you have a valid argument for using them in linguistics proper names. Otherwise, there is no valid contemporary reason for using en-dashes in linguistic names when the specialists within that field don't use en-dashes. (Taivo (talk) 00:33, 10 March 2009 (UTC))[reply]
Oh, yeah, I got that much. It's just not clear to me that AET isn't just a list of the three branches of the family, without trying to subclassify them. Yes, I agree with your surname analogy, as I've said above. kwami (talk) 00:38, 10 March 2009 (UTC)
One further point about en-dashes within linguistic proper names. If linguists are using hyphens in all proper names of languages and language groups, then who will be deciding which names get en-dashes and which ones don't? Non-linguists? I hardly think that they have the authority to decide such matters. Out in the world of linguistics, there aren't any en-dashes, so adding them into Wikipedia articles is actually a falsification of the data. (Taivo (talk) 00:36, 10 March 2009 (UTC))
It's a punctuation standard, not decided on a case-by-case basis. So there is no "decision". Illustrations from some of the Papuan families: Trans–New Guinea, East Bird's Head–Sentani, Left May–Kwomtari, Ramu–Lower Sepik, Yele–West New Britain, Reefs–Santa Cruz, but a hyphen in Eastern Trans-Fly. (Per the MOS, most of these should actually have spaces: East Bird's Head – Sentani.) I don't know about the spaces, but even if we stick with hyphenating Niger-Congo, I think we should follow Linotype-level precision for protolanguages (which does not contradict linguistic custom), and keep the en dashes in these Papuan families, which otherwise are ambiguous. I think precision is valuable for its own sake, even if specialists don't bother with it. kwami (talk) 00:50, 10 March 2009 (UTC)[reply]
But you are applying two different things in your examples. First, "proto-" and "trans-" are prefixes and prefixes should always be attached with hyphens and not with en-dashes. "co-operate" should never have an en-dash any more than "proto-" or "trans-". Thus, your example of Trans-New Guinea is in a different category than an example such as Athapaskan-Eyak-Tlingit. None of these examples from New Guinea are complex, all are simple: A + B. The only ambiguous cases are where you have formulations such as: A+B + C+D. But again I ask, who is going to make the decision where to put en-dashes and where to put hyphens? If you put en-dashes throughout in Athapaskan-Eyak-Tlingit, then you've violated your argument about precision. I'm not willing to trust these decisions to anyone except the original linguist author, but they all have used hyphens. So using en-dashes here and there where the original authors did not is a falsification of the data. (Taivo (talk) 00:59, 10 March 2009 (UTC))[reply]
You're conflating different phenomena. The MOS description isn't very clear: Indo-European and Proto-Indic both take hyphens, because they involve prefixes. Proto–Indo-European, however, takes an en dash, because the prefix docks to an already hyphenated form. This is established usage in linguistics, as you yourself showed. (It being obsolete is a different argument entirely.) Besides using en dashes when conjoining already hyphenated terms, it's also standard to use them when conjoining multi-word terms, as in the Papuan examples. That has nothing to do with something like Niger-Congo, where your surname argument is convincing. kwami (talk) 01:16, 10 March 2009 (UTC)
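The rule described in the comment above (hyphen for simple joins, en dash when either part already contains a hyphen or a space) can be sketched in a few lines of Python. This is only an illustrative reading of that comment, not MOS wording; the function name `join_terms` and the constants are hypothetical.

```python
EN_DASH = "\u2013"  # the en dash character
HYPHEN = "-"

def is_compound(term: str) -> bool:
    """A term counts as compound if it already contains a space or a hyphen."""
    return " " in term or HYPHEN in term

def join_terms(left: str, right: str) -> str:
    """Join two name parts under the rule paraphrased from the thread (an assumption):
    use an en dash when either part is itself multi-word or hyphenated,
    and a plain hyphen otherwise."""
    separator = EN_DASH if is_compound(left) or is_compound(right) else HYPHEN
    return f"{left}{separator}{right}"

if __name__ == "__main__":
    for parts in [("Niger", "Congo"), ("Proto", "Indo-European"),
                  ("East Bird's Head", "Sentani"), ("Trans", "New Guinea")]:
        print(join_terms(*parts))
```

Under this sketch "Niger" + "Congo" stays hyphenated, while "Proto" + "Indo-European" and "Trans" + "New Guinea" receive an en dash, matching the examples given in the thread. It does not attempt the nested cases debated later, such as Timor-Alor-Pantar.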
Just for a control, I looked all the way back at Voegelin and Voegelin's Classification and Index of the World's Languages (1977, long before computerized typesetting) and they used hyphens. As just a sample, on page 243, I found "Athapascan-Eyak", "Na-Dene", "Sino-Tibetan-Na-Dene", "Kuki-Chin", "Naga-Kuki-Chin", and "non-Indo-European". All with nothing but hyphens and some of them constructed themselves from other hyphenated forms. (Taivo (talk) 00:40, 10 March 2009 (UTC))
The quality of the linguistics has little to do with the quality of the printing or typesetting. For all we know, V&V wrote the book on a manual typewriter and expected the typesetters to take care of such issues, and the typesetters didn't know the difference with these unfamiliar names. And even if they chose to be imprecise, I don't think we should go with the lowest common denominator. I can go along with "Na-Dene", as that is consistent with broader English usage, but "Sino-Tibetan-Na-Dene" is just stupid. Within the linguistic community, okay, everyone knows what they mean. But for a broader audience it definitely needs an en dash: "Sino-Tibetan–Na-Dene". kwami (talk) 00:50, 10 March 2009 (UTC)
  • This sounds like support for Do what reliable sources in English do, leaving the present elaborate distinction as a rule of thumb when sources conflict, or taking them out altogether.
  • Which, if either, should we do?
  • Other comments?
    • I should think the difference between the two junctions in non-Indo-European worth marking, myself; but if sources don't....00:44, 10 March 2009 (UTC)

(outdent) Actually, using en-dashes does violate linguistic custom. All linguists are currently using hyphens for "proto-" and most have used hyphens in the past. The problem is that any use of en-dashes violates contemporary linguistic usage and much of past usage. En-dashes are extinct in linguistic literature and were never very common even in the past. (Taivo (talk) 00:43, 10 March 2009 (UTC))

Yes, "Proto-Indo-European" is almost universally hyphenated. However, this is not restricted to linguistics: prefixes on already-hyphenated forms are generally hyphenated themselves, regardless of the field. Therefore I think this is an argument for amending the MOS, not for making linguistics an exception.
I don't have a problem with hyphenating "Proto-Indo-European", "Niger-Congo", and "Amto-Musan". However, I do object to hyphenating "Sino-Tibetan–Na-Dene", "East Bird's Head–Sentani", and "Yele–West New Britain", as the results are difficult to parse.
I'm finding that TNG is often not conjoined at all: "Trans New Guinea", but that when it is, it is often en-dashed: "Trans–New Guinea". It seems that the current trend is to write it as three separate words, despite the fact that one is a prefix. This is a clear indication that people find hyphenation problematic in this case. kwami (talk) 01:41, 10 March 2009 (UTC)

Looking through the MoS Talk archives, it is apparent that dash usage is either the most recurring topic or is close to it. I think that points to a systemic problem either in the way Wikipedia defines its dash guidelines or in its expectations of its editors, and the repeated debates are a distraction from other MoS issues. Short of abandoning dashes (which I'm sure has no chance of happening), I think the dash guidelines should start with a clear and concise summary on which the rest of the guideline can build. The best, clearest, and most concise summary I've come across in recent discussion was by The Duke of Waltham (talk · contribs) in this earlier discussion.


Of course, hyphens have many other uses such as prefixes (Proto-Indo-European) and phrasal adjectives (hard-boiled egg), but I think there is general agreement on these uses. Unfortunately (IMHO), the Wikipedia guideline does not distinguish the conjunction ("and") cases from the disjunction ("to" and "versus") cases and specifies that en-dashes be used for both. Thus the conjunction in "Michelson-Morley experiment" uses an en-dash, and as I interpret the current guideline, Ural-Altaic languages should also use an en-dash. However, for those familiar with other styles such as The Chicago Manual of Style, this seems unnecessary.

I would propose relaxing the dash guideline to recognize (and even encourage) the use of hyphens as an alternative to en-dashes for conjunctions. Disjunctions would continue to use en-dashes. There is precedent for this in the allowance of spaced en-dashes as an alternative to unspaced em-dashes for interruption. -- Tcncv (talk) 03:23, 10 March 2009 (UTC)

Since the linguistic usage of hyphenated forms falls within the "conjunction" guideline, this follows current linguistic usage of hyphens as found in the journals. (Taivo (talk) 03:38, 10 March 2009 (UTC))

"Michelson-Morley experiment" sounds like an experiment by some guy named Michelson Morley, whereas "Michelson–Morley experiment" makes it clear that there were two people, Michelson and Morley. This is, I believe, an important distinction to maintain.

VikSol's comments from my talk page:

The requirement for n-dashes is an instance of prescriptivism, as in prescriptive grammar. I think practical utility is a much more important consideration than formulaic correctness. I doubt very many readers ever notice when an n-dash is used rather than a hyphen or are even aware of the existence of both. I think the requirement for n-dashes in certain formations is a harmless conceit as long as it doesn't interfere with the use of the encyclopedia. In this case, it does. Practically no one has an n-dash key on their keyboard and only a few know where to find one. This is proved by the fact that the editors who replace hyphens almost always use the "& n d a s h ;" command > "–", which makes text harder to read for editors, rather than a "physical" n-dash "–". Evidently, it is not widely known that the physical n-dash even exists. Because the n-dash character is a hangover from the print industry, and no one has it on their keyboards, everyone who types in a search is going to use the hyphen, e.g. "Eskimo-Aleut languages" rather than "Eskimo–Aleut languages". The result is that every single search of this kind is going to bring up the "Redirected from Eskimo-Aleut languages" message or the like. The reader feels this as a slap in the face, wondering "what did I do wrong?" S/he may then scrutinize the typed message to see if it was mistyped and lose time figuring out an n-dash was required or, more likely, give up in frustration. By this time precious seconds have been wasted, the reader's chain of concentration is likely broken, and their reaction to Wikipedia begins to turn from positive to negative, because we are not anticipating their predictable reactions.
If it was possible to devise a fix whereby searches using hyphens automatically produced the article with n-dash without the "Redirected" message, we would again have a harmless conceit, especially if this fix was automatic and did not require a further effort by the editor, but it would, IMHO, be a waste of time, which is what is scarcest on this planet. In sum, the MOS guideline, if it requires n-dashes in titles, should be changed. Most Wikipedia guidelines are not rigid, recommending that common sense be used and the particular situation considered. If this isn't so here (Kwami, could we have a link to the guideline?), one would want to know why not. Why would this particular bit of typographical traditionalism be allowed to run roughshod over common sense and practicality?

and,

Wikipedia does not generally follow typesetting principles derived from the print industry but remains close to what people type on their computer screens. For example, it puts a line between paragraphs rather than indenting the first line. The purpose of this I think is to maintain editability, i.e. to make it easy for the average user to edit Wikipedia without special knowledge. This is really one of the last holdouts in the computer world to the era of DOS and other user-editable operating systems, which has since been totally eclipsed by systems that freeze out the user and keep him dependent on a handful of corporate monopolies. It's the last glimpse of a world as it might have been. The mere fact that people don't naturally type an n-dash under any circumstances is a sufficient argument against using it, in my opinion. Why spend all this time and effort putting in n-dashes when almost nobody is ever going to notice whether an n-dash or a hyphen was used or not? As I see it, it would be better to remove the use of n-dashes altogether, thereby making the use of dashes/hyphens consistent with the general principle of Wikipedia typography, namely that it's not an imitation of print typesetting but sacrifices some of its refinements in order to maintain direct contact with the average user. All this in a totally non-dogmatic spirit, I hope it's clear. User:VikSol
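The search concern in the comments above (a hyphenated query failing to match an en-dashed title directly) can be illustrated with a small Python sketch that folds en dashes to hyphens before comparing. This is a hypothetical illustration only: `normalize_title` and `find_title` are made-up names, and nothing here describes how Wikipedia's search or redirects actually work.

```python
from typing import Iterable, Optional

EN_DASH = "\u2013"  # the en dash character

def normalize_title(title: str) -> str:
    """Fold en dashes to plain hyphens so both spellings compare equal."""
    return title.replace(EN_DASH, "-")

def find_title(query: str, titles: Iterable[str]) -> Optional[str]:
    """Return the stored title whose normalized form matches the query, if any.
    Hypothetical lookup: a linear scan over an in-memory collection."""
    key = normalize_title(query)
    for title in titles:
        if normalize_title(title) == key:
            return title
    return None

if __name__ == "__main__":
    stored = ["Eskimo\u2013Aleut languages", "Na-Dene languages"]
    # A hyphenated query still finds the en-dashed title.
    print(find_title("Eskimo-Aleut languages", stored))
```

The design choice is to normalize both sides at comparison time, so stored titles keep whatever punctuation they were given.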

Ah, here's another example which I think cries out for an en dash: Trans-Fly–Bulaka River, as opposed to Eastern Trans-Fly. kwami (talk) 07:52, 10 March 2009 (UTC) kwami (talk) 07:42, 10 March 2009 (UTC)

If you want to write new text, kwami, with en-dashes, ok, but don't go changing existing text or existing article titles. There are tons more useful things that you could be doing in the linguistics articles other than turning hyphens into en-dashes. That's just a waste of time, IMHO. (Taivo (talk) 02:23, 11 March 2009 (UTC))
Actually, I've reverted all the changes along the lines of Niger-Congo. kwami (talk) 02:29, 11 March 2009 (UTC)
These should be counter-reverted; we should not invent usage. Septentrionalis PMAnderson 16:48, 11 March 2009 (UTC)
No, Pmanderson, these should not be "counter-reverted". Wikipedia is the inventor of usage here, not linguists. Linguists nearly universally use hyphens here now and have generally abandoned en-dashes. Wikipedia should follow the field, not the other way round. And, Kwami, you have a point about readability, but at what point do we end the hyphen/en-dash madness? How about South Bird's Head-Timor-Alor-Pantar, which is composed of South Bird's Head + (Timor + (Alor-Pantar))? Should we then use an em-dash to add another layer of detailed understanding: South Bird's Head—Timor–Alor-Pantar (I don't know if I got the right symbols inserted since their appearance here on the edit page is not the same as their appearance on the article page--another argument for just using hyphens). (Taivo (talk) 19:53, 11 March 2009 (UTC))[reply]
I've reverted to just using en dashes to join multi-word terms. So "Bird's Head–Timor-Alor-Pantar". One en dash to join Bird's Head, with a space in it, to Timor-Alor-Pantar, with hyphens in it. From what I've seen in print, you generally use en dashes for the highest level in the taxonomy, and reduce everything else to hyphens.
Em dashes would never be used. If we want to be sticklers, the way to join phrases which contain spaces would be with en dashes with spaces. So theoretically we could have "Bird's Head – Timor–Alor-Pantar". However, I don't see any point in doing that, as it doesn't improve legibility, and have reverted the few families where I had spaced en dashes following the MOS. kwami (talk) 21:18, 11 March 2009 (UTC)
Actually, Taivo, in the classifications I'm familiar with, TAP is not a family with two branches, Timor and Alor-Pantar, but rather one with multiple branches spread over three islands, Timor, Alor, and Pantar. Therefore simple hyphens are all that is needed: Timor-Alor-Pantar. Where the en dash would come in is in "West Timor–Alor-Pantar", as it specifies that "West" applies only to Timor, not to the whole of *Timor-Alor-Pantar. Omitting the en dash would imply that it might contrast with *East Timor-Alor-Pantar, rather than with East Timor, as it actually does. kwami (talk) 21:26, 11 March 2009 (UTC)
(Makasai-)Alor-Pantar is contrasted with ungrouped languages of Timor in Ethnologue, Ruhlen, and International Encyclopedia of Linguistics (which all follow Wurm's classification). In fact, none of my references have Alor-Pantar ungrouped--all group them together against the languages of Timor. I'd be curious as to who isn't following Wurm's lead in this particular grouping. But the point still remains--if you want to use en-dashes for clarity, then you must use an em-dash when you have A+(B+(C+D)). The journals, however, are still in favor of hyphens all the way, though. OK, I just decided to look at a journal that I don't subscribe to and found a real mess--Oceanic Linguistics. I love OL and read it on-line regularly, but it's got a real mess. In the first article of the Dec 2008 issue I found "Proto-Oceanic" (hyphen), "Central Malayo-Polynesian" (hyphen), "Central-Eastern Malayo-Polynesian" (hyphens), but "South Halmahera–West New Guinea" (en-dash), "Pre–Proto-Oceanic" (en-dash and hyphen), "Proto–Central-Eastern Malayo-Polynesian" (en-dash and hyphens), "Proto–Central Malayo-Polynesian" (en-dash and hyphen), "Proto–Western Malayo-Polynesian" (en-dash and hyphen), and, perversely, "Proto-Eastern-Malayo-Polynesian" (all hyphens-not a typo, but consistently throughout the article). So some "proto-"s have hyphens and some have en-dashes. It also has "Proto–Trans–New Guinea" (en-dashes). In the third article of that issue, I found "Proto-North–Central Vanuatu" (hyphen and en-dash, not a typo, but consistently). Contrast this with the first Squib of that issue which has "Proto–North-Central Vanuatu" (en-dash and hyphen, not a typo, but consistently). In the fourth article, there was "Timor-Alor-Pantar" (all hyphens). My point is that there is no real consistency even when editors of articles subject to typography try to distinguish between en-dashes and hyphens. Wikipedia is not a typeset article and I find it pretentious to think that it is. 
And when the same editor in a prestigious journal like Oceanic Linguistics mixes en-dashes and hyphens in the same form in two different articles, it just proves what a confusion they really are and not the enlightenment you would like them to be. And to think that we are encouraging non-linguists to use en-dashes...... (Taivo (talk) 23:38, 11 March 2009 (UTC))
Actually, Taivo, you pretty much prove my point.
  • In none of your sources is Timor a genetic node apart from Alor-Pantar. Therefore Timor-Alor-Pantar is not an (A+(B+C)) cladistic description, but an (A+B+C) geographic description like Niger-Congo, and hyphens are all that is needed. Your OL citation supports this.
  • Good examples from OL, which show that en dashes still are used in the linguistic lit. It's actually not bad at all. There are only a few inconsistencies. "Proto-" vs "Proto–" is a case in point: en dash when prefixed to a hyphenated term, hyphen otherwise. Perfectly consistent except for "Proto-Eastern-Malayo-Polynesian", which we'd expect to be like Proto–Central Malayo-Polynesian. Evidently an author who doesn't use en dashes, but IMO that's still a pretty good batting average. And as noted above, hyphens may be universally substituted for en dashes, so really it's just a stylistic difference, just as placing punctuation inside or outside quotation marks is a matter of style. Same consistency with the other prefixes, Pre– and Trans–.
  • "Proto-North–Central Vanuatu" is clearly an error. Perhaps a typo in the custom spell checker? Are you really claiming we must abandon en dashes because you found a typo?
  • No, we never use em dashes for compounds. I don't know where you get the idea that we "must" do this. Any sources to support your claim?
  • As far as professionals getting it wrong, so what? I bet they misspell words too. Should we abandon standard spelling because the professionals sometimes get it wrong? But the professionals very rarely got it wrong: One abandoned en dashes for all hyphens, which is a stylistic difference, while only one name was actually incorrect, and that in only one of the articles you found it in. So no, I would not agree that this is "a mess", but rather an excellent guide to what we should be doing. kwami (talk)
Okay, another few cents' worth: (1) What the discussion above shows is the Byzantine complexity of the rules for using n-dashes, along with their variability, not only between English and French etc. usage but within English usage itself (and we haven't even really factored in here the differences between American and British usage, and various schools thereof). If people dripping with graduate degrees, specialized knowledge, and long versed in Wikipedia can't figure it out, who can? Obviously, there is no chance that the average editor coming to Wikipedia for the first time can. (2) It is impossible to devise a practically applicable standard for the use of n-dashes, because there are so many different ways to think about an expression like Proto-Uto-Aztecan and the like. (3) Linguistics provides some ways to think about these problems, and indeed it is the science best placed to do so. (a) What we have here is arguably a conflict between the two poles that govern language change, ease of expression and ease of understanding. E.g. it's easier to assimilate sounds to nearby sounds but after a certain point harder to understand the result. Languages generally arrive at a compromise. (b) The fundamental problem is not of our making and has no solution: it's that English is conflicted about the process of compounding. In German for instance there is never any doubt about whether a word should be compounded or not: Urindogermanisch is one word, unlike its English equivalent, 'Proto-Indo-European' or more literally 'Proto-Indo-Germanic', but on the other hand it's disarticulatable into its component elements, rather like the inflections of an agglutinating language, whereas in English once elements are joined in a word they stay joined, if the compounding has passed the point of hyphenation (or n-dashing). Thus English, unlike German, has all sorts of different levels of compounding, sometimes linked to accent and type of word, sometimes not: e.g.
some manuals of style will tell you to write "the twentieth century" (nominal) but "twentieth-century events" (adjectival). But this principle is not consistently observed, even in principle, i.e. the language is unsettled on its principles of composition. (4) We could use hyphens only in article titles, as in "Na-Dene languages", and n-dashes inside the articles, as in "Na–Dene languages". But the result would be that using a hyphen in the search box and hitting "Go" would find the article title, but clicking on the "Search" button would fail to find the instances with n-dashes inside articles. I haven't actually checked this, and perhaps there is some fix. (5) But - and I strongly agree with Taivo on this - why go to all this effort? Every time someone edits one of these articles, an army of bots must crawl into place, we editors concerned with linguistics issues must drop what we are doing and spring into action, and the poor sap who has used a hyphen sees his edits squashed by some know-it-all (as he is likely to see it). (6) No one can ever hope to master the Byzantine complexities governing the use of hyphens versus n-dashes. To do so would require an army of lawyers, who would be just as productive as the English real estate lawyers of the 19th century who liked to keep lawsuits going for generations (a steady income, don't you know). It is as much as we can do to keep up with the two-way distinction between hyphens and m-dashes, as illustrated by the fact that some people prefer an m-dash as a mark of punctuation, others an n-dash with two spaces (potentially micro-spaces - the Byzantine complexities multiply). (7) It's true that an n-dash can help to differentiate expressions of the A-B + B-C form. But, referring to point (3a) above on the struggle between ease of expression and ease of understanding, is it worth it? (a) Most people are unaware of the distinction and don't notice it, so the benefits are reserved to an elite. 
We could try to educate people about this, but the fact is that spontaneous understanding is not there, and we are not writing for typesetters alone. (b) If we use n-dashes, we must define when they are to be used, and as this entire discussion shows, there is no realistic way to do so. Every attempt to do so founders on the lack of any clear standards in either American or British / Commonwealth usage, and this lack of clear standards is ultimately related to the fluctuating status of compounding in the English language and English orthography. (8) In conclusion, given the practical and real impossibility of defining any standards that are (a) consistent and (b) simple enough to be generally used, I suggest that Wikipedia should abandon the use of n-dashes altogether, even in date ranges (where, again, nobody types an n-dash spontaneously), but at a minimum and most definitively in all linguistic names of languages and language families. That is a way to end the confusion, to ensure that searches find what they are looking for, and to simplify the tasks of editors, and it will work. Regards to all, VikSol (talk) 23:53, 11 March 2009 (UTC) PS- I think that Taivo's point above that Wikipedia is not typeset deserves to be taken very seriously. For example, footnote numbers and other superscripts bump up the line separation unevenly - hardly esthetic, but tolerated. This is a much more serious defect than any wandering between n-dashes and hyphens, but so far there's no practical fix. We are a long way from the refined standards of the print industry (when it managed to apply these), but then again, democratic culture has its advantages. Why imitate something that is different by nature? VikSol (talk) 00:06, 12 March 2009 (UTC)[reply]
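The search concern raised in point (4) of the comment above (a hyphenated query missing n-dashed text) could in principle be handled by folding dash characters before comparison. The following is only a minimal sketch under that assumption; `fold_dashes` and `same_title` are hypothetical names, and nothing here describes how MediaWiki's search actually behaves.

```python
# Hypothetical helpers: fold dash-like characters to the ASCII hyphen-minus
# before comparing a search query with a title. This is an illustration,
# not a description of MediaWiki's real search behavior.
DASH_FOLD = str.maketrans({
    "\u2013": "-",  # en dash
    "\u2014": "-",  # em dash
    "\u2212": "-",  # minus sign
})


def fold_dashes(text: str) -> str:
    """Return text with en dashes, em dashes and minus signs replaced by '-'."""
    return text.translate(DASH_FOLD)


def same_title(query: str, title: str) -> bool:
    """Compare two strings while ignoring dash type and letter case."""
    return fold_dashes(query).casefold() == fold_dashes(title).casefold()


# A hyphenated query matches an n-dashed title once both are folded.
print(same_title("Na-Dene languages", "Na\u2013Dene languages"))
```

With such folding, the choice of dash in the displayed text would not affect whether a typed hyphen finds it.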
IMO, it's all quite clear. We can discuss which style guidelines are most appropriate to follow, but if we're going to throw up our hands and say "we're confused!", we might as well abandon spellings, pronunciations, calendars, technical terms, and units of measurement we find confusing. The use of the en dash as Taivo illustrated in OL serves a valuable disambiguating function, and IMO it should be preserved. kwami (talk) 00:27, 12 March 2009 (UTC)[reply]
But we are confused! If you are saying, "we now use hyphens, but use n-dashes only to disambiguate", as in A-B–C-D, fine, then we have a simple principle, but one that innovates. If this does not inhibit searches, then it does no great harm. It remains true that no one will notice it, except for a few ultra-professionals. So my point above about "is it worth it?" stands. If you want to try to spell out some clear principles we should follow, then fine, I'll look at them with interest. But at the moment no coherent set of principles is in view. Also, I am not advocating orthographic anarchy, but a clear and simple principle: hyphens in all compounds, n-dashes not at all (preferably) or only in date-range expressions like 1900–1910 (because the usage is relatively well established). A principle simple enough for everyone to follow. The idea is that we sacrifice a little ease in comprehension (the possibility of disambiguating A-B–C-D from A-B-C-D) for a lot of ease of expression. No doubt, there is a real advantage either way. But the relative balance of advantages seems clear to me. VikSol (talk) 01:15, 12 March 2009 (UTC)[reply]

[outdent] En dashes when compounding words which contain spaces or already contain hyphens. En dashes when compounding two people's names, vs. hyphens for compounding the name of a single person. Both pretty standard. Comprehension on the part of the reader trumps ease of input for the editor. Of course, there are situations where we may disagree on usage, but that's no different than differences on capitalization. Just came across an example: An editor abbreviated "East Fijian-Polynesian" as "East" in a table, evidently thinking it meant East (Fijian-Polynesian), when actually it meant (East Fijian)-Polynesian. With an en dash, "East Fijian–Polynesian", the structure of the compound is clear. kwami (talk) 03:01, 12 March 2009 (UTC)[reply]

You ignored one of my main points, Kwami. The same editor used both Proto-North–Central Vanuatu and Proto–North-Central Vanuatu. It was not a typo since each was 100% consistent within the article in which it occurred. The rules are silly and confusing to the point that the same person used two different combinations for the same term on two different days. And don't go all warm and bubbly because one of the journals I regularly read uses en-dashes. The other half-dozen or so that I regularly read do not. Indeed, an increasing amount of linguistics is being published from camera-ready copy and not being typeset at all. It should go without saying that virtually all camera-ready copy is being done with hyphens and not en-dashes. I second everything that VikSol is saying about the needlessness of en-dashes in an on-line, user-edited format such as Wikipedia. And your suggestion that this can be done with bots is absolutely ludicrous. Bots are not thinking machines, they are stupid computer programs that don't know anything beyond 1 or 0. In the 1960s, the Air Force and NASA agreed that solid rocket propulsion would be called a "motor" and liquid rocket propulsion would be called an "engine". The next proposal that came out of Thiokol included the mechanical replacement of "engine" with "motor" for consistency. Thus, the buildings were protected by an independent fire department and its "fire motors". I can't tell you how many times I've seen bots do silly things here. You think I'm going to trust a computer program to correctly place en-dashes when I don't trust anyone who doesn't have a linguistics degree? Get real. En-dashes are a relic from an age of typesetting and have no place in Wikipedia. (Taivo (talk) 04:33, 12 March 2009 (UTC))[reply]
Ah, I didn't catch that it was the same editor. Still, the fact that all but one term was correct, and even that term was correct in one of the articles, tells me it's not all that difficult for other people. You make it sound as if I'm the only one who understands this. And if someone makes a mistake, so what? This is a wiki, and someone else will come along and correct it, just as they do with capitalization, quotations, and other formatting issues. Most people will continue to write with hyphens, and that's fine. In the infrequent cases where a hyphenated name is ambiguous, we can make it more precise. I never said bots should make the decision, but once an article is moved to a name with an en dash, bots can fix the redirects and other mentions of the name. I don't see what the problem is: precision vs. ease of data entry, and meanwhile it's okay to use the easier form of data entry.
Anyway, you've convinced me to abandon the more extreme interpretation of when to use en dashes, and there are few linguistic articles which compound multi-word terms. kwami (talk) 09:43, 12 March 2009 (UTC)[reply]
Kwami, I am concerned that you have been replacing hyphens with n-dashes in articles and article titles such as "Proto-Chukotko-Kamchatkan", now changed to "Proto–Chukotko-Kamchatkan". (1) I had the impression that the discussion here was moving toward a consensus, but the principles to be followed have not been spelled out comprehensively. Please have a little more patience with Taivo and me and the other persons concerned until the positions are clearly defined. If some parties then don't get their way, fine, but we should have the principles spelled out clearly as well as the grounds for decision. (2) Adding an n-dash after "Proto" raises two real concerns: (a) As we all agree, it has the drawback of complicating searches, by bringing up a "Redirected" message. (b) A further concern is that, if n-dashes are used after "Proto", this conflicts with the use of n-dashes to disambiguate expressions like Uralo-Indo-European by transforming them into Uralo–Indo-European, since we then have to sometimes speak of Proto–Uralo–Indo-European - two uses of an n-dash that follow conflicting rules, indicating a contradictory and therefore confusing system. I think we should try to achieve consensus here before changing any more article titles. Regards, VikSol (talk) 02:25, 13 March 2009 (UTC)[reply]
Concern noted, VikSol. Not very many articles are affected, so it won't be hard to undo, and I'm quite willing to compromise on the protolanguages.
An alternative to your "Uralo–Indo-European" example would be "Uralo-Indoeuropean". You're right, with only two levels of conjunction (hyphen and en dash), you can get two en dashes in terms like "Proto–Uralo–Indo-European". I've seen this in print, actually, with "Proto–Trans–New Guinea". As for other hyphenated protolanguages, I can't see that there would actually be much chance of miscomprehension, so a hyphen on proto- wouldn't be problematic. On the other hand, besides being typographically correct, we've seen that en dashes are still used in the linguistic literature for such protolanguages.
Given that the world's most cited protolanguage, pIE, is (nearly?) always doubly hyphenated, I don't see a problem with deciding to hyphenate prefixes (proto-, macro-, pre-, post-) on all hyphenated family names. There are so few en-dashed names that we can take them on a case-by-case basis. Where I really think that in the interest of clarity we should have en dashes is in families which join two multi-word terms. "Uralo-Indo-European" looks like a tripartite name composed of Uralic, Indic, and European. Since most of these will be extremely obscure names (otherwise someone would have come up with something shorter!), we can't expect people to understand them just through familiarity. kwami (talk) 07:05, 14 March 2009 (UTC)[reply]

[replying to multiple people at once, so outdent] Taivo: You said "this WP:MOS appears to be a misguided attempt to turn back the clock to a precomputerized typesetting era." However, using that argument would suggest that proportional fonts are unnecessary and we should all just use Courier (with two spaces after periods, no less). Hyphens are sometimes used on computers not because the underlying thinking has changed, but merely due to simple technical limitations. Therefore, typographically correct characters should be used whenever possible. I'm in favor of an en dash in this case. "The [computer] is not a typewriter". --Wulf (talk) 05:45, 13 March 2009 (UTC)[reply]

There is a fundamental difference between proportional fonts and en-dashes. One is automatic and the other is not. We don't need to insert any special commands in order to use proportional fonts--the kerning is built into the font. It also does not require the use of any special characters. It takes the characters typed on the keyboard and mechanically spaces them proportionally. An en-dash is a fundamentally different thing--it is a character that is not found on anyone's keyboard. It is a highly specialized creature that (as I illustrated above with the same editor using en-dashes in two different places in the same word) has no real rules of usage outside the world of typography (and even then the rules are arcane and not-well-known). It is not an ASCII character, it is not a character in any phonetic font, it is just a leftover from another era. It is not "the typographically correct" character, it is just an archaic option. (And, BTW, I do use two spaces after periods.) (Taivo (talk) 06:11, 13 March 2009 (UTC))[reply]
Okay, now you're making sense. You just accidentally proposed that MediaWiki convert double hyphens to en dashes. Also, the "leftover from another era" is when we had to cram as many characters as we could into 7-8 bits. Wikipedia does not use ASCII, nor does anybody else these days. By your reasoning, we should use the asterisk as a multiplication sign because the true multiplication sign "is not an ASCII character". There was never a sea change in typography, just a comparatively very brief period of technical limitation which we have now passed... --Wulf (talk) 03:52, 14 March 2009 (UTC)[reply]
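To make the ASCII point concrete, here is a small sketch showing the code points involved and what an automatic double-hyphen conversion might look like. The conversion function is purely hypothetical: it is an assumption for illustration, not an existing MediaWiki feature.

```python
import re

HYPHEN_MINUS = "-"   # U+002D, the only one of the three within ASCII
EN_DASH = "\u2013"   # U+2013
EM_DASH = "\u2014"   # U+2014

# Print the code point, ASCII membership and UTF-8 length of each character.
for name, ch in [("hyphen-minus", HYPHEN_MINUS),
                 ("en dash", EN_DASH),
                 ("em dash", EM_DASH)]:
    size = len(ch.encode("utf-8"))
    print(f"{name}: U+{ord(ch):04X}, ASCII: {ch.isascii()}, UTF-8 bytes: {size}")


def double_hyphen_to_en_dash(text: str) -> str:
    """Hypothetical conversion: turn exactly two consecutive hyphens into an en dash.

    Runs of three or more hyphens and single hyphens are left untouched.
    """
    return re.sub(r"(?<!-)--(?!-)", EN_DASH, text)


print(double_hyphen_to_en_dash("1900--1910"))
```

A rule like this would let editors type only keyboard characters while the stored text carries the en dash, though it says nothing about when an en dash is the right choice.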
Fortunately, without your "help", kwami decided that current linguistic usage superseded Wikipedia's misplaced efforts at making editing more difficult rather than less difficult. (Taivo (talk) 04:57, 14 March 2009 (UTC))[reply]
Maybe I'm missing something here... Remind me why you believe Ural-Altaic should have a hyphen, yet Bose–Einstein condensate gets to keep its en dash? --Wulf (talk) 08:33, 14 March 2009 (UTC)[reply]
For me, it's simply a matter of disambiguating. "Ural-Altaic" means basically the same thing, whether you read it as one family spread from the Urals to the Altai, or the combined Uralic and Altaic families. Bose-Einstein, however, could be misunderstood as somebody named "Bose Einstein". (Not likely with that name, perhaps, but much more ambiguous with other names.) kwami (talk) 09:57, 14 March 2009 (UTC)[reply]
Because Ural-Altaic (hyphen) is standard usage among linguists and has never had an en-dash in it. Linguistic usage favors hyphens over en-dashes. And, as Kwami says, there's no ambiguity, but ambiguity is not so much a factor in contemporary linguistic usage. Our field uses hyphens generally now and it isn't Wikipedia's place to try to impose its will on it. Ural-Altaic is conjunctive, not distributional in nature and most linguists will interpret it as conjunctive. (Taivo (talk) 11:02, 14 March 2009 (UTC))[reply]
Hmm, it took me a while to figure out that Gay-Lussac is one person but Boyle–Mariotte are two, when I was in high school. --80.104.235.34 (talk) 12:32, 14 March 2009 (UTC)[reply]
I'm just not sure how what most people in a particular field happen to use has to do with Wikipedia's standardized style manual and the consistent application thereof. Should we now have separate style manuals for each WikiProject? (And, speaking of WikiProjects, shouldn't WikiProject Typography be consulted on this?) --Wulf (talk) 21:37, 14 March 2009 (UTC)[reply]
Yes, that's a good example for this project page. kwami (talk) 21:18, 14 March 2009 (UTC)[reply]

en dashes vs hyphens (cont.)

(1) For years, everyone has been happily naming articles "Proto-Indo-European language" and the like and finding them in searches without any difficulty. Thus, established and settled usage on Wikipedia is to use hyphens in all names of languages. Kwami has been innovating in changing this established and settled usage. But this usage has never posed the slightest practical problem. Changing it will not increase the encyclopedia's ease of use. It will, on the contrary, decrease it by afflicting users with constant "Redirected from ..." messages, among other problems, including but not limited to increased difficulty of editing and the need to constantly update edits.

It's true the current MOS guidelines can be interpreted to require n-dashes in article titles when they are used in language names. But the more fundamental question is: is it a good idea to do so?

When an n-dash is used in a range of numbers, such as 1914–1918, it is an ideogram, read in practice as “to” in most instances. According to Tcnv above, it disjoins the numbers. When an n-dash is used to write a compound, it is used to conjoin, the opposite usage. Thus, the use of an n-dash in these cases follows a different rule in each case and the two rules are directly contradictory. This is our first warning that we are entering arcane territory here, with no safe footing for the average user of language.

(2) Kwami has flagged the complicated instance of Uralic-Altaic versus Uralo–Altaic and Ural–Altaic. I believe this illustrates the impossibility of arriving at a system simple and logical enough for the average person to utilize.

For Uralic-Altaic, there is no problem. Uralic-Altaic means “Uralic and Altaic”. It is similar to Sanskrit dvandva compounds, a well-known form in linguistics.

For Uralo-Altaic, the issue is more complicated. At first glance, Uralo- looks like a prefix, like proto-, neo-, geo-, turbo-, as well as trans-, pre-, etc. But Indo-European developed the use of -o as a combination form, followed in this by several of its daughter languages, in some cases by inheritance (e.g. Greek), in others by drift (e.g. Avestan). This is what is going on here. Uralo-Altaic means “Uralic and Altaic”, but the primary suffix -ic has been replaced by the secondary suffix -o. It appears to be a combination of prefix and nominal, but in fact it is a combination of two nominals.

Similarly, in Ural-Altaic, the -ic suffix has been elided in the first element, a procedure well known in languages, rather like gapping in syntax.

Uralic-Altaic, Uralo-Altaic, and Ural-Altaic, then, are all identical in meaning, in spite of first appearances.

As these examples show, the distinction between coordinate forms (as Uralic-Altaic obviously is) and prefixed forms (as Uralo-Altaic appears to be at first glance) is not always easy to tell.

Furthermore, there is no way to tell from the form of the first element what its function is. For example, Indo-European is the language family from which many of the languages of India and Europe are derived — in this case the elements are coordinate — but Indo-Aryan is those forms of Aryan spoken in India — in this case “Indo-” is a prefix qualifying “Aryan”.

As I understand it, because of such issues Kwami has now abandoned the “complicated” version of using n-dashes in favor of a somewhat simpler system, detailed below.

(3) Let me try to sum up the evolving positions. I think there has been and will continue to be some movement in everybody’s position and this is the purpose of the discussion.

Taivo and I, along with various other people (see discussion of “curly quotes” in the section just archived), would ideally like to see n-dashes eliminated from Wikipedia and entirely replaced with hyphens both in compound words (including language names, such as Proto-Uralic) and in number ranges (such as 1914–1918). But, above all, we would like to see the existing de facto custom of hyphenating all language names continued.

I am beginning to grasp what Kwami has been trying to get across about the advantages of n-dashes in disambiguation, e.g. Uralo–Indo-European versus Uralo-Indo-European. I think these advantages are real and must be weighed in the balance.

Kwami is taking the view that:

  • In names of languages that are compounds, the hyphen is the basic form – the default. Example: Indo-European.
  • The hyphen is replaced with the n-dash in several different circumstances:
When an element is added to a name separated by a space. Example: Trans–New Guinea.
When an element is added to a name that is already hyphenated, in several specific circumstances:
When a simplex name is added to a name that is hyphenated. E.g.: Uralo–Indo-European.
When two hyphenated language names are conjoined. Example: Indo-European–Hamito-Semitic.
When a prefix is added to a hyphenated name. Example: Proto–Indo-European.

The most important point here is that, as I understand it, Kwami is now advocating a system in which a first compounding is indicated with a hyphen, a second with an n-dash. Thus we get Indo-European, but Uralo–Indo-European and Proto–Indo-European.
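As a rough illustration of the two-level system just summarized, the choice of joiner can be sketched as a single test on the parts being joined. This is only my reading of the rules as listed above; `join_compound` is a hypothetical name, and treating a part that already contains an n-dash as "complex" is an assumption of the sketch.

```python
HYPHEN = "-"
EN_DASH = "\u2013"


def join_compound(left: str, right: str) -> str:
    """Join two name elements under the two-level rule sketched above.

    A hyphen is the default. An en dash is used when either element is
    itself complex, i.e. already contains a space, a hyphen or an en dash.
    """
    def is_complex(part: str) -> bool:
        return any(mark in part for mark in (" ", HYPHEN, EN_DASH))

    joiner = EN_DASH if is_complex(left) or is_complex(right) else HYPHEN
    return f"{left}{joiner}{right}"


# Examples taken from the list above.
for left, right in [("Indo", "European"),
                    ("Trans", "New Guinea"),
                    ("Uralo", "Indo-European"),
                    ("Indo-European", "Hamito-Semitic"),
                    ("Proto", "Indo-European")]:
    print(join_compound(left, right))
```

Note that the function needs the two elements as separate inputs: it cannot recover the intended grouping from an already-hyphenated string, which is the part that requires human judgment.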

(4) There are several problems with this.

(a) A fairly serious problem is the fluctuation that results from these principles in prefixing “Proto-”. For example, we get “Proto-Uralic” (hyphen), but “Proto–Chukotko-Kamchatkan” (n-dash). Here there is no advantage whatsoever in disambiguation, since the expressions are totally unambiguous: a proto-language being a single language by definition, there is no possibility of misunderstanding Proto-Chukotko-Kamchatkan as “Proto-Chukotko plus Kamchatkan”.

(b) Another problem is: what do you make of expressions like Pre-Proto-Indo-European, actually fairly frequent in some works? What about Proto-Uralo-Indo-European or Pre-Proto-Uralo-Indo-European? Obviously, we have long since run out of different forms of hyphens and dashes.

There is a simple solution to these arcana and inconsistencies: eliminate — or more precisely continue to avoid — n-dashes and keep using hyphens, as everyone has been doing on Wikipedia for years.

VikSol (talk) 22:54, 14 March 2009 (UTC)[reply]

Hyphens are almost always an acceptable substitute for en dashes. As you've pointed out, en dashes are sometimes useful for disambiguation. For me, that's the relevant issue, not legalistic adherence to the guideline. So, for example, per the MOS, pIE should be en-dashed, and with several other protolanguages, en dashes are found in the literature. In the IE lit, however, it's always hyphenated, or nearly always so. Per your point in (4a), there is no ambiguity, so on the balance I'd say we should probably go with hyphens. The MOS after all is a guideline, and we need to take other considerations into account.
However, with something like Trans–New Guinea or Indo-European–Hamito-Semitic, en dashes are found in the lit, or sometimes there is no established usage, and there is potential ambiguity. Here I think the advantage of en dashes is the overriding factor. Also, there are relatively few such language families, and even fewer have dedicated articles (most are branches intermediate between better-established families which do have articles), so they're not disruptive.
As for your question in (4b), there are only two levels, hyphen and en dash. Once you reach an en dash, everything from there on out is also an en dash: Proto–Indo-European–Hamito-Semitic, Proto–Trans–New Guinea. (The latter at least attested in the ling lit.) Theoretically you might think we'd need further dab'ing. However, human language is not infinitely recursive. We quickly reach a cognitive processing limit, which IMO is why we don't see many terms where a third level would be useful. In the very few cases where we come across such terms, we could go with the en dashes, or take advantage of acronyms, which is what is generally found in the lit anyways: proto-TNG, pre-proto-TNG, etc.
I occasionally see hyphens replaced with spaces, as in "Trans New Guinea" and "Meso Philippines". I don't see any advantage to such usage, but maybe someone else here does? kwami (talk) 23:52, 14 March 2009 (UTC)[reply]
I think that the linguists here--kwami, vik-sol, and myself--seem to have come to a workable solution for 99% of all cases--hyphens all the way. Now, the "problems" and "ambiguous cases" that kwami cites are mostly smoke and mirrors since no one talks in the literature about Proto-Uralic-Indo-European in any realistic sense since there are virtually no contexts in which such an artificial formulation would be used. There are probably only a dozen truly ambiguous cases that are actually likely to be used in Wikipedia and they are nearly all in New Guinea. We can arm wrestle over each of them if they start to cause problems of interpretation, but since the number of people who are actually ever going to write or edit (or even read) an article on a language of the Trans-Fly-Bulaka River family can be counted on the fingers of one hand (with a few fingers left over), the problem is probably moot. I don't give a hoot about the non-linguistic uses of en-dashes versus hyphens, so the non-linguists who have been involved in this discussion can argue about the Barnes–Noble Paradigm versus the Barnes-Noble Paradigm and I don't really care. (Taivo (talk) 00:59, 15 March 2009 (UTC))[reply]
The only top-level families are Trans–New Guinea, East Bird's Head–Sentani, Ramu–Lower Sepik, and Yele–West New Britain (assuming that's valid), all in New Guinea. There are also some branches of Austronesian such as South Halmahera–West New Guinea, also mostly in NG, or at least in Melanesia. kwami (talk) 01:19, 15 March 2009 (UTC)[reply]
But I honestly don't think there's any real ambiguity in any of these. I find the use of an en-dash after a prefix especially inappropriate (trans-, proto-, pre-). But, in actuality, these are very minor issues since the use of any of these is so rare (except, perhaps, for Trans-New Guinea). (Taivo (talk) 01:44, 15 March 2009 (UTC))[reply]
I think it's telling that the trend for TNG seems to be writing it as three words, "Trans New Guinea", despite trans being a prefix. Maybe people object to treating "trans-new" as if it were a unit? And South Halmahera–West New Guinea is difficult to parse with just a hyphen. kwami (talk) 01:58, 15 March 2009 (UTC)[reply]
Do you have any published sources for this trend? (Taivo (talk) 03:43, 15 March 2009 (UTC))[reply]
Again, what does this have to do with linguistics/linguists? I see this as a simple typography issue... You'll notice that the punctuation, hyphen and dash articles all belong to Category:Typography -- not Category:Linguistics or anything related. --Wulf (talk) 03:59, 15 March 2009 (UTC)[reply]
Wulf, Thanks for the links, they are most useful. I quote from the article Dash:
====Usage guidelines====
The en dash is used instead of a hyphen in compound adjectives for which neither part of the adjective modifies the other. That is, when each is modifying the noun. This is common in science, when names compose an adjective as in Bose–Einstein condensate. Compare this with "award-winning novel" in which "award" modifies "winning" and together they modify "novel". Contrast "Franco-Prussian War", "Anglo-Saxon", etc., in which the first element does not strictly modify the second, but a hyphen is still normally used. The Chicago Manual of Style recognizes but does not mandate this usage and uses a hyphen in Bose-Einstein condensate.
Thus, "Bose–Einstein", taken as a supposedly unshakable example of the use of an n-dash, is contradicted by the most prestigious manual of all, The Chicago Manual of Style. There could be no better example of the confusion that reigns in this area, which we must not inflict on Wikipedia users. VikSol (talk) 04:59, 15 March 2009 (UTC)[reply]

Let me try to characterize the discussion to this point:

(1) There is consensus that the prefix “Proto-” does not need to be n-dashed, since there is no ambiguity. I will hazard that the same principle would apply to “Pre-”, as in the title of Winfrid P. Lehmann’s book “Pre-Indo-European”.

(2) The next question to consider, I think, is whether this principle applies to all prefixes, or to these prefixes only? I suggest it should apply to all prefixes, on these grounds:

  • The treatment of prefixes should be consistent, as much as practical.
  • Prefixes, by their nature, do not give rise to ambiguities of the type “Indo-Germanic-Semitic” (which I recently had to use to translate Hermann Möller’s indogermanisch-semitisch).
  • The confusion of a prefix with a language name is nonexistent or so rare as to be unimportant. As far as I know there are no languages named Pre, Trans, Macro, or any other letter combination identical to an English prefix. (I did once wonder whether Macro-Ge involved a language called Macro, but one gets beyond such things.)

In consequence, all prefixes should be hyphenated, since they do not involve ambiguity.

(3) But what of the case where the prefixed expression involves two separate words, as in Trans–New Guinea? Here there does appear to be a frequent usage of an n-dash. However, with regard to Trans–New Guinea, it seems to me that, “Trans-” being simply a prefix like “Proto-” and “Pre-”, and no more ambiguous than them, there is no reason to n-dash it simply because the following words are not hyphenated.

(4) This leaves the case of disambiguation, but let’s leave that for later.

My suggestion, then, is that we adopt the principle that a hyphen should follow all prefixes in language names. Examples: Proto-Indo-European, Pre-Indo-European, Pre-Proto-Indo-European, Macro-Ge, Trans-Eurasian, Trans-New Guinea.
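The proposal above is mechanical enough to sketch as a substitution. The prefix list below is an assumption drawn from the prefixes named in this discussion (proto-, pre-, post-, macro-, trans-), and `hyphenate_prefixes` is a hypothetical name; a real list would need to be agreed on.

```python
import re

EN_DASH = "\u2013"

# Assumed prefix list, taken from the prefixes mentioned in this discussion.
PREFIXES = ("Proto", "Pre", "Post", "Macro", "Trans")

# Match any listed prefix at a word boundary, immediately followed by an en dash.
_PREFIX_DASH = re.compile(r"\b(" + "|".join(PREFIXES) + r")" + EN_DASH)


def hyphenate_prefixes(name: str) -> str:
    """Replace an en dash directly after a listed prefix with a hyphen.

    En dashes elsewhere in the name (e.g. between two family names)
    are left as they are.
    """
    return _PREFIX_DASH.sub(r"\1-", name)


for name in ["Proto\u2013Indo-European",
             "Trans\u2013New Guinea",
             "Pre\u2013Proto\u2013Indo-European",
             "Uralo\u2013Indo-European"]:
    print(hyphenate_prefixes(name))
```

Because the rule keys on a fixed list of prefixes rather than on the structure of the compound, it leaves the disambiguation cases untouched.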

VikSol (talk) 04:37, 15 March 2009 (UTC)[reply]

Concluding remarks (?)

(1) If I am not mistaken, there is now consensus that hyphens should be used after all prefixes. (Assuming my argument above about forms like Trans-New Guinea is accepted.)

We could provide a more linguistically precise definition of “prefixes” here, but this does not seem to be of immediate relevance.

(2) The remaining issue on the table is disambiguation.

Taivo and I have signaled that we will not fight this one to the bitter end.

There is general agreement that the decision depends on balancing competing considerations. I will try to sum these up.

(3) There is a genuine advantage to the use of an n-dash to disambiguate terms. The forms concerned are primarily:

A + B-C. Example: Uralo–Indo-European.

A-B + C. Example: Indo-Germanic–Semitic.

A-B + C-D. Example: Indo-European–Hamito-Semitic.

Also theoretically possible and sometimes really occurring are such forms as:

A + B-C + D. Example: Korean–Japanese-Ryukyuan–Ainu. (Made-up term, discussed below.)

A B + C. Example: East Fijian–Polynesian.

A (B) + C D. Example: North–Central Vanuatu.

A (B-C) + D E-F. Example: Central–Eastern Malayo-Polynesian.

etc.

(4) Let me point out that many of the so-called ambiguous forms are not that ambiguous when closely considered. The language is pretty smart and already has built-in ways to avoid ambiguity. In particular, most of the compound junctures are disambiguated – in the spoken language itself – by the combination form -o or the use of English terms like "West" which could never (as a practical matter) constitute a language name. So actually such forms as South Halmahera-West New Guinea, North-Central Vanuatu, and Central-Eastern Malayo-Polynesian are not ambiguous at all.

What happens in such cases is that we get again into the hair-splitting we encountered in such series as Uralic-Altaic, Uralo-Altaic, and Ural-Altaic. The grounds for deciding whether a hyphen or an n-dash is needed are so obscure, subject to individual interpretation, and hypertechnical that no non-linguist can reasonably be expected to grasp them all, and no two linguists may agree on all interpretations.

In other words, a disambiguation that produces ambiguity is no progress.

(5) In other cases, solutions may be possible short of the use of an n-dash. For example, some forms can be disambiguated by combining them, a possibility Kwami has raised above. For example, we could write Indogermanic-Semitic rather than Indo-Germanic-Semitic. This is justified by usage fairly often. For example, the terms Afroasiatic and Afro-Asiatic are both in current use.

Other forms can be avoided in practice. For example, some linguists prefer Uralo-Indo-European to Indo-Uralic, but the shorter term is much more prevalent. Joseph Greenberg spoke of Japanese-Ryukyuan, but Korean-Japanese-Ainu, presumably to avoid a lengthy and ambiguous term. Indo-European–Hamito-Semitic could be abbreviated to Indo-Semitic.

This raises the reflection that the very complex names tend to be reserved for new proposals and controversial groupings. When a language family is well established it tends to get a simple name, for obvious practical reasons: it’s simpler to work with and linguists already know what languages it groups. I note in Kwami’s list of language families (Template:Language families) that none of the established upper-level families have very complex names – at most something like Yele–West New Britain.

Usually, a new or controversial proposal will not get its own article but will be explained in some other context, e.g. Korean-Japanese-Ainu is explained under “Altaic languages” and “Classification of Japanese”. The famous but controversial proposals already have short names, e.g. Nostratic and Amerind.

What I am trying to get at is that the ambiguity problem is one of very limited scope, so limited that the occasional useful n-dash is likely to puzzle people, since they will have encountered it so rarely.

(6) Other objections may be catalogued as follows.

  • Most people do not know that such a character as an n-dash exists. I discussed this whole set of issues with one of the top legal draftsmen in the country, a Harvard JD, who had never heard of n-dashes. This is after twenty years of work in a field that demands extreme precision of language. A character that is not recognized by probably over 99% of readers does not disambiguate anything. It just gives the impression the typography is inconsistent (even when it’s not). And the few who recognize it, such as Kwami, already know perfectly well what the expressions mean.
  • The latest edition of The Chicago Manual of Style gives increased preference to hyphens over n-dashes and is also concerned to adjust typography to the computer era. I think these things are not an accident and that close scrutiny of the manual would reveal that it is because of the computer era that the n-dash is falling from favor.
  • The issue of searches is of great importance. When n-dashes were adopted, there was no way to search a text electronically. It did not matter to a reader glancing across pages, flipping through a book, or reading down the columns of an index whether the typesetter had used hyphens or n-dashes. Today it does. Yes, our computers are dumb and can’t even do accents properly. But this is the way things are. I am not sure that the cognitive dissonance provoked by adding an n-dash key to the computer keyboard would be worth it. Compare Martinet’s Economie des changements phonétiques on the disadvantages of having too many phonemes that are too similar.

Obviously, we cannot have forms with hyphens in titles and forms with n-dashes in the text. It’s all or nothing.

(7) My sense is that, given the minimal advantages of disambiguation in practice, the rarity of the character that would result, the practical impossibility of defining usable criteria, and most crucially the issue of searches, on balance the n-dash should be avoided in language names. Let the physicists sort out whether they want to use “Bose–Einstein” or, per the new Chicago Manual of Style, “Bose-Einstein” (see Dash).

I too like the advantages of being able to disambiguate Indo-Germano-Semitic and similar expressions. But there are workarounds and these may be preferable to adopting a character of rare application, obscure usage, and diminishing currency.

VikSol (talk) 21:24, 15 March 2009 (UTC)

I came late to the game. For technical language such as linguistics or physics, I would think the Wikipedia MoS should defer to the field's own usage, so it should be Bose–Einstein condensate; but if linguists really don't care about the en-dash–hyphen distinction, then technical linguistic terms should appear as they do in the linguistics literature.
I have to agree with Wulf that CMS15 has much more to do with the dark ages of computer typography than with what formal published work should contain. I think it's notable that the fields that are particular about their dashes—math, physics, and computer science—are the fields that have had access to powerful typesetting software (TeX/LaTeX) the longest. I would argue that Wikipedia should do what it can to make typographically beautiful articles even if that means typography geeks like ourselves are running around putting in directional quote marks and en-dashes.
As for searches, I think that's the job of the search engine and of redirect pages. Google doesn't have any problem with a search for "Bose-Einstein condensate" (although Google does find the page with the hyphen that redirects to the page with the en-dash). —Ben FrantzDale (talk) 23:35, 15 March 2009 (UTC)[reply]
Just a comment--Wikipedia is not typeset and never will be because it would never pass peer-review. As careful as we specialists are with individual articles, this is still a user-edited document, and, as such is a computer-only thing. Therefore, following Viksol's admonishment that this should be easy for computer searches is paramount. Leave out the en-dashes and directional quotation marks because they are just arrogance. (Taivo (talk) 00:52, 16 March 2009 (UTC))[reply]
Where do you keep getting this idea that typesetting == letterpress or something? As the relevant Wikipedia article opens, “typesetting involves the presentation of textual material in graphic form…”. As far as Wikipedia being a “user-edited… computer-only thing”, must I mention Wikipedia:Books – or the article in Nature which compared Wikipedia with the Encyclopedia Britannica? (Although what does peer review or being in print have to do with good typography anyway?) You also complained about searching, but – as Ben had already said – Google already handles it fine. There is also a bug filed with Mozilla regarding the find bar, and other browsers will assuredly follow shortly. MediaWiki already utilizes Unicode normalization, and it would be fairly straightforward to normalize searches as well (which would mean that Unicode characters and their equivalents would be properly treated as just that – equivalent).
Oh, and you’ll notice that this post is written using only proper, semantic, typographically-correct characters – all of which were entered using the buttons below every Wikipedia edit box. Proper punctuation and typography are no more arrogant than proper spelling and grammar. —Wulf (talk) 03:38, 17 March 2009 (UTC)[reply]
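As a rough illustration of the search-normalization idea raised above (a sketch only, not a description of how MediaWiki, Google, or any browser actually handles it): the standard Unicode normalization forms such as NFKC do not by themselves turn an en dash into a hyphen-minus, so a search layer would need an explicit folding step on top. The names `DASH_FOLD`, `fold_dashes`, and `matches`, and the particular set of characters folded, are assumptions made for this example.

```python
import unicodedata

# Hypothetical folding table: dash-like characters that are treated as a
# plain hyphen-minus for matching purposes only (the stored text is untouched).
DASH_FOLD = {
    ord("\u2010"): "-",  # HYPHEN
    ord("\u2011"): "-",  # NON-BREAKING HYPHEN
    ord("\u2012"): "-",  # FIGURE DASH
    ord("\u2013"): "-",  # EN DASH
    ord("\u2014"): "-",  # EM DASH
    ord("\u2212"): "-",  # MINUS SIGN
}

def fold_dashes(text: str) -> str:
    """Return a search key in which dash variants compare equal to '-'."""
    # NFKC handles compatibility forms; on its own it leaves EN DASH as it is.
    normalized = unicodedata.normalize("NFKC", text)
    return normalized.translate(DASH_FOLD).casefold()

def matches(query: str, title: str) -> bool:
    """True if query and title are equal once dashes and case are folded."""
    return fold_dashes(query) == fold_dashes(title)
```

With a step like this, a query typed with a hyphen would match a title stored with an en dash, while the displayed title keeps whatever punctuation the article uses.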
The issue has been resolved for linguists. You can believe the conceit, Wulf, that Wikipedia is on a par with EB, but there's not a college professor that I know who will accept it as a legitimate source for a term paper. (Taivo (talk) 04:56, 17 March 2009 (UTC))
Wikipedia is not “a legitimate source for a term paper” because it’s a tertiary source, not so much because it’s unreliable (not that it is). But that’s why we have a thorough citation system. I have to ask: if you think so little of Wikipedia, why bother contributing? —Wulf (talk) 08:58, 17 March 2009 (UTC)
Great post, and thanks for pointing out the connection between availability of powerful typesetting software and being particular about dashes and such. I’d never noticed that. —Wulf (talk) 03:38, 17 March 2009 (UTC)

If it makes no practical difference which form we use, then it doesn't really matter. But when using a hyphen is misleading, I don't think that we should dumb down an article because we think our readers won't understand what a dash is. That's the same argument people make for abandoning the IPA: spelling pronunciations are precise enough for our purposes, my dictionary doesn't use the IPA, it's too much to ask people to learn just to use Wikipedia, etc. If linguists expect others to learn the IPA in order to figure out how to pronounce the name of a moon, a literary character, or a chemical element, then I don't see why others can't expect linguists to learn basic punctuation. kwami (talk) 09:24, 17 March 2009 (UTC)

Usage, my friend--description and not prescription. That's the key to linguistics and why using hyphens to describe our field is much more important than trying to impose an en-dash where all our colleagues use hyphens. That's been my point all along--linguistic usage is hyphens all the way. VikSol has more subtle (and just as valid) arguments, but my principal point has always been usage takes precedence over all other factors. And Wulf's point, that Wikipedia is a tertiary source, makes usage in the primary and secondary sources all the more important since a tertiary source should never impose its will upon the more important sources. So, since usage is most important, and since the vast majority of primary and secondary sources use only hyphens.... (Taivo (talk) 12:10, 17 March 2009 (UTC))[reply]
One thing I still don't understand is how using an n-dash in an expression like Trans–New Guinea instead of Trans-New Guinea disambiguates anything.
If the purpose of an n-dash is to express a higher level of separation, then Trans–New Guinea means a language called "Trans" plus a language called "New Guinea".
I guess the idea is that the space in New Guinea represents a higher level of separation than the hyphen in Mixe-Zoque, so the space needs to be trumped by a still higher level of separation, that of an n-dash. But an n-dash, being a conjoining symbol here, like the hyphen, logically indicates a lower level of separation than a space. The poor reader has nowhere to turn:
  • If he guesses the n-dash is a disjoiner, as in "the evolution–creation debate", then he reads the expression as meaning "the Trans language plus the New Guinea language".
  • If he guesses the n-dash is a conjoiner, and therefore weaker than a space, then he reads "the Trans-New language of Guinea" or "the Trans-New form of the Guinea language".
  • If he is aware of the principle that prefixes receive hyphens, not n-dashes (on which Taivo, Kwami, and I have reached consensus for expressions where no space occurs), he expects Trans-New Guinea, and wonders why an n-dash was used instead of a hyphen.
I think this usage just adds a layer of confusion and should be dropped. I am not speaking here to the merits of cases where the n-dash really disambiguates. VikSol (talk) 23:25, 17 March 2009 (UTC)[reply]
TNG is not named for a language, but for a geographic area, like Niger-Congo. It is trans-(New Guinea). With a hyphen, it would imply (trans-new) Guinea, which is on the wrong continent. The use of an en dash when joining hyphenated or interspaced terms is a basic rule of punctuation. True, we can substitute a hyphen without much loss of comprehension. But then we could also drop capitalization without much loss of comprehension: trans-newguinea. That doesn't mean we should.
As for Taivo's point, you're proposing that we use a different system of punctuation for each field of knowledge in an attempt to remain authentic to the lit, which would be a complete mess. This is just punctuation. True, the literature should be considered, but we should come up with one standard for wikipedia. Most linguistics sources use seriffed fonts too. Should we force all linguistics articles to display with seriffed fonts in an attempt to be authentic? kwami (talk) 00:06, 18 March 2009 (UTC)[reply]

Actually, the WP:MOS already specifies that very thing:

An overriding principle on Wikipedia is that style and formatting should be applied consistently within articles, though not necessarily throughout the encyclopedia as a whole. One way of presenting information may be as good as another is, but consistency within articles promotes clarity and cohesion.
The Arbitration Committee has ruled that the Manual of Style is not binding, that editors should not change an article from one guideline-defined style to another without a substantial reason unrelated to mere choice of style, and that revert-warring over optional styles is unacceptable.
Where there is disagreement over which style to use in an article, defer to the style used by the first major contributor.

Just quoting what the WP:MOS already says. (Taivo (talk) 03:46, 18 March 2009 (UTC))

…How does that quote back you up here? Nowhere does it advocate a different style for each field of knowledge. It just says we shouldn’t modify existing articles back and forth as the guide changes – nothing about it not being ideal to have consistent styling across the entire site. If we were to have different styling for each field, then there would be no central Manual of Style (its duties being performed by a myriad of WikiProject subpages). There’s a bit of flawed logic in your mere inclusion of that quote in the first place, but the circular nature of it would require a rather long explanation… Suffice it to say I think that quote has no relevance in this discussion. —Wulf (talk) 00:20, 23 March 2009 (UTC)[reply]
Good riddance to a "central Manual of Style". Yes, each field should be allowed to practice its own habits within Wikipedia. Otherwise, it is falsification of data to change a style just because some dilettante in Wikipedia with no experience in the field desires it. The quote clearly states that the style of the original editor has priority. (Taivo (talk) 04:28, 23 March 2009 (UTC))[reply]
Bit of a problem with that… There is a manual of style, and I seriously doubt it’ll ever go away. It may give preference to original article styles, but page titles are far more important and should be uniform. Besides, it seems the quote you are using regarding consistency within articles is more about American vs. British English than mandated proper typography. For example, see the most recent reference given for the relevant section in the MoS:

Wikipedia does not mandate styles in many different areas; these include (but are not limited to) American vs. British spelling, date formats, and citation style. Where Wikipedia does not mandate a specific style, editors should not attempt to convert Wikipedia to their own preferred style, nor should they edit articles for the sole purpose of converting them to their preferred style, or removing examples of, or references to, styles which they dislike. [emphasis added]

Wikipedia:Naming conventions#Special characters says “for the use of hyphens and dashes in page names, see Manual of Style (dashes)”, which says “when naming an article, a hyphen is not used as a substitute for an en dash that properly belongs in the title…”. Furthermore, Dash says “the en dash is used instead of a hyphen in compound adjectives for which neither part of the adjective modifies the other.”. —Wulf (talk) 14:05, 23 March 2009 (UTC)[reply]

Re "This discussion is in need of attention from an expert on the subject", please bear with an intrusion from a newcomer working in a different research community.

From the foregoing discussion it seems plain that there can be no such expert, in any universal sense, because it seems plain that different communities have different en-dash conventions.

For instance, in my own community (mainly classical physics, mathematical physics, and climate research) there is an obsolescent convention of the "Bose--Einstein" sort.

The trend represented by recent editions of the Chicago Style Manual is also exhibited by some of the recently founded online journals in my field, specifically, those of the European Geosciences Union. They all follow the suggestion that in-text en dashes should all be replaced by hyphens. Surely that's the way of the future.

The tiny minority of readers who care about the difference can easily think of some of the hyphens as "really" being en dashes.

I agree that the issue of searches is of great importance... EdgeworthMcIntyre (talk) 18:08, 28 March 2009 (UTC)

PS: Wikipedia could do a great service to humanity by slightly redesigning its typography such that en dashes look exactly the same as hyphens. It would be easy to make hyphens a touch longer and thinner, and en dashes a touch shorter and thicker. In the computer code they could all be hyphens, helping toward bug-free searches.

It would be wonderful if everyone's opinion as to how to arrange hyphens and en dashes were equally well served by what's on the screen. Indeed, perception psychology ("categorical perception" etc) tells us that those with the strongest opinions might well, in fact, see the particular arrangement they like. EdgeworthMcIntyre (talk) 11:33, 29 March 2009 (UTC)[reply]

Well, another solution would be a major military conflict over the issue of dashes....... Michael Hardy (talk) 13:30, 29 March 2009 (UTC)
A Hyphen War, perhaps? ^^ —Wulf (talk) 15:36, 29 March 2009 (UTC)
I'm fine with tweaking MOS to be more tolerant of hyphens. - Dan Dank55 (push to talk) 04:09, 31 March 2009 (UTC)
Ditto. — SMcCandlish [talk] [cont] ‹(-¿-)› 05:31, 14 April 2009 (UTC)
  • Who on earth is even going to notice the difference between a hyphen and an en-dash? This really is just such a waste of time. Much more offensive, to me, is the statement in the MOS that an EM dash should not have a space either side of it. I think it should, and then it doesn't matter whether what is written between the spaces is an em dash, an en dash, or a hyphen. Alarics (talk) 21:28, 12 April 2009 (UTC)
Sometimes it does, sometimes it doesn't, depending upon usage. — SMcCandlish [talk] [cont] ‹(-¿-)› 05:31, 14 April 2009 (UTC)
Check a style guide. Em dashes should technically have hair spaces, which is impractical on computers (more so than dashes). With proportional (i.e. not monospaced) fonts, the hair spaces are (at least in theory) taken care of by the font. However, I seem to recall one style guide recommending en dashes over em dashes just to avoid all the confusion over it, as nobody argues for a lack of spaces around en dashes. —Wulf (talk) 18:06, 22 April 2009 (UTC)[reply]

Moving forward again

This discussion is getting long-winded and going off on tangents. For my part I am (surprisingly?) in agreement with Septentrionalis/PMAnderson that MOS is being overly prescriptive rather than descriptive on some aspects of this issue, and also agree with many related points raised by Tcncv and Taivo. I also agree with much of what MOS says about use of en-dashes in the sense of "to" or "through", as in "1990–1998", as well as the juxtapositional use as in "Canada–UK relations", but feel as many do here that "Ural–Altaic" is taking it too far. I think such a usage is a misconstrual of the purpose of en-dashes, to the extent there is any consensus (including off-Wikipedia) on their use to begin with. So, the question before us is what should MOS say on the matter? I think we need to refocus on what we can agree en-dashes are actually useful for (with reference to an overall sense of what off-WP style guides say), and reformulate from there. I would suggest that deference is generally given to the hyphen, based on a preponderance of external evidence, from post-Internet communication styles, to current academic journals, and so on. — SMcCandlish [talk] [cont] ‹(-¿-)› 05:31, 14 April 2009 (UTC)

Coming late to this debate, that sounds good to me. (I am currently engaged in a discussion over Weaire-Phelan structure vs. Weaire–Phelan structure.) The Chicago Manual of Style and most search engine hits are on my side, but WP:ENDASH is against me. The Dash article says that "A 'simple' compound used as an adjective is written with a hyphen; at least one authority considers name pairs, as in the Taft-Hartley Act to be 'simple',[5] while most consider an en dash appropriate there[citation needed]." That "most" seems highly suspect in the light of this discussion, and I really do wonder whether that missing citation actually exists. If it doesn't, then WP:MOS should surely be revised to follow the real world. -- Cheers, Steelpillow (Talk) 20:35, 2 May 2009 (UTC)
Wikipedia talk:Manual of Style/Archive 105#Guideline-by-guideline citation of sources. Wavelength (talk) 21:53, 2 May 2009 (UTC)
Also coming lamentably late to the discussion, but wanted to harp a bit on the notion that establishing a (good) style guide, unlike doing (good) linguistics, is always a fundamentally prescriptive endeavor. That said, style is not prescriptive in the sense that using a given feature (hyphen, en dash, whatever) is the ultimately correct way of using a language, but insofar as consistency in presentation is desirable for a given publication (in this case, Wikipedia as a whole). While it is certainly in Wikipedia's best interest to reflect the informed usage of experts, it is a confusion of logical levels to refer to such as "descriptivism", at least when it comes to establishing style.
While I have extreme respect for linguists and linguistic expertise, linguists are not the experts to whom we should ultimately defer in matters of typographical style. Scientists' work involves technical distinctions within their fields, not those of typography, and (more to the point) scholars publish in journals, which have their own styles and their own economy of production. In my editorial experience, periodical literature (including many, but not all, scientific and medical journals) tends away from hyphen/en-dash distinctions, whereas you'll find them quite closely observed in other contexts and fields.
To cite the typographical tendencies of experts in a specific field, especially when using their technical vocabulary, as evidence to call for a specific stylistic usage in a broad context (WP) is a little mixed up. To say "linguistic usage always prevails" strikes me as a creeping prescriptivism of the wrong kind – although a linguist's expertise regarding the geographical extent of a language family (and thus whether the terms Niger-Congo and Ural-Altaic are disjunctive or otherwise) is most welcome in deciding on how to implement the local typographical style. But to that end, the linguist is the expert on the language family and the substantive aspects of its naming, not necessarily on how the name is punctuated. And to call a style that differs from the specialist's usage an attempt to "reform" the language misses the mark.
Of course, the point of style is to give coherence and consistency, deviations from which can detract from the publication's voice (in this case, an encyclopedic voice). I do think that en dashes provide useful distinctions in formatted text and shouldn't be tossed away as some archaism based on a subset of formatting conventions. When it comes to specifics, I pretty much agree with SMcCandlish's position above, but wanted to point out my reservations about some of the discussion that led to it.
Just a quick additional comment on the discussion above: hyphen usage with prefixes is only relevant in contexts where the prefix is attached to a capitalized item (Trans-Siberian, but transcontinental), or where hyphenless treatment results in phonological ambiguity (re-elect). Affixes in English are generally run in without space except where circumstances demand otherwise. Sorry if that was already obvious to the group here. /Ninly (talk) 06:38, 8 May 2009 (UTC)[reply]

Proposal

I have just checked through some maths books for conjunctive name pairs. Cambridge University Press (e.g. Cromwell's Polyhedra) use en dashes. Allen Lane (e.g. Mlodinow's Euclid's window) use hyphens. The only Dover books I have to hand are older works (e.g. Coxeter's Regular polytopes, 2nd Edn), also using hyphens. I find the same name-pair with hyphen in one book and en dash in another.

En dashes are a pain to maintain. The practical approach for Wikipedia is to use hyphens unless there is a clear, referenced usage of en dashes in any particular field - irrespective of the publisher.

I propose to amend WP:MOS accordingly. However I do not know the etiquette - should I just do so, or is there a protocol to work through first? Certainly, the discussion aspect has been done to death. Should we vote on it? -- Cheers, Steelpillow (Talk) 09:28, 4 May 2009 (UTC)

  • My vote is for. -- Cheers, Steelpillow (Talk) 09:28, 4 May 2009 (UTC)
    Specifically, I see no reason to flout The Chicago Manual of Style, which prescribes hyphens. -- Cheers, Steelpillow (Talk) 11:07, 19 May 2009 (UTC)
  • Opposed in the interest of stylistic consistency (comments in above subsection) and because I don't think en dashes are a pain to maintain. /Ninly (talk) 06:42, 8 May 2009 (UTC)
  • Support. ndashes provide no visible benefits to users, while mdashes w/o spaces look too much like hyphenated terms. Both ndashes and mdashes are a pain to use - especially in tools like refTools, where the edit box "Insert" facility doesn't work. As for the 7-char HTML entities that you have to get absolutely right first time otherwise the text becomes a dog's breakfast ...! --Philcha (talk) 09:41, 8 May 2009 (UTC)
  • Oppose for the same reasons as Ninly —Wulf (talk) 03:21, 9 May 2009 (UTC)
  • Oppose. Endashes and emdashes aren't hyphens, so there is no need to behave as if they were. If you don't like endashes and emdashes, don't use them and someone will clean up after you eventually. Headbomb {ταλκκοντριβς – WP Physics} 04:33, 9 May 2009 (UTC)
  • Oppose: Those who say that typographical conventions for using em- and en-dashes should be changed (to hyphens) for the computer era have it backwards. Single and double hyphens were used on typewriters in place of dashes because typewriters did not have keys for dashes, and there were practical limitations on the number of typewriter keys. Computers do away with that and other limitations, and permit the correct use of all typographical conventions. There is no reason to "dumb down" dashes to hyphens. Finell (Talk) 12:10, 9 May 2009 (UTC)
  • Oppose: I agree with previous votes of opposition that many editors are willing to clean up em 'n' en for people who find it too cumbersome to employ them, and that computers (and WP's edit tools) allow for the easy application of these characters. Continue to use dashes properly, even when scholarly or published sources misuse them (except written quotations). TEPutnam (talk) 19:37, 18 May 2009 (UTC)
  • Support as abolishing recent and pointless distinctions, unsupported by usage. The idea that CUP is typeset on typewriters is praeposterous. Septentrionalis PMAnderson 19:40, 18 May 2009 (UTC)
Comment: I don’t think anybody is suggesting that typewriters are still used in 2009 (that would indeed be preposterous), just that typewriters were a limitation in the past that made some people forget proper typography… See the link that TEPutnam gave. —Wulf (talk) 02:20, 22 May 2009 (UTC)
[See Ding, click clack -- typewriter is back. -- Wavelength (talk) 03:01, 22 May 2009 (UTC)]
  • Support The linguists have shredded xdashes (see above), the Chicago MOS is moving towards hyphens, xdashes are hard to distinguish from hyphens at WP's default font and size (which is what unregistered readers see) and xdashes are a PITA. --Philcha (talk) 08:03, 28 May 2009 (UTC)
  • Oppose. I prefer having a way of visually distinguishing between something named after a single person with a hyphenated name and something named after two people. These things are very common in mathematics (e.g. the Birch–Swinnerton-Dyer conjecture, named after Birch and Swinnerton-Dyer). —David Eppstein (talk) 15:04, 28 May 2009 (UTC)
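To illustrate the distinction described in the last vote above (a sketch only; the helper name `split_name_pair` is hypothetical): if the en dash is reserved for joining separate names and the hyphen for joining the parts of a single name, the components can be recovered mechanically. That is not possible when both levels are written with hyphens.

```python
EN_DASH = "\u2013"

def split_name_pair(term: str) -> list[str]:
    """Split a compound on en dashes only, leaving hyphenated names intact."""
    return [part.strip() for part in term.split(EN_DASH)]

# With the en dash, the two-person reading is recoverable:
print(split_name_pair("Birch\u2013Swinnerton-Dyer"))         # ['Birch', 'Swinnerton-Dyer']
# With hyphens only, nothing marks where one name ends and the next begins:
print(split_name_pair("Birch-Swinnerton-Dyer"))              # ['Birch-Swinnerton-Dyer']
# The same applies to the language-family compounds discussed earlier:
print(split_name_pair("Indo-European\u2013Hamito-Semitic"))  # ['Indo-European', 'Hamito-Semitic']
```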

Does anyone really believe that the outcome of this discussion will make one iota of difference to the quality of experience that our readers get from this encyclopedia? Phil Bridger (talk) 22:27, 24 May 2009 (UTC)

Yes, actually. —Wulf (talk) 07:33, 26 May 2009 (UTC)
Same here. When readers who aren't grammar/punct nuts look at a piece of poorly punctuated writing, they can tell that it looks sloppy and amateurish, even if they are not immediately able to figure out why they feel so or where the mistakes are. Darkfrog24 (talk) 04:24, 30 May 2009 (UTC)

Punctuation: Quotation marks: Inside or outside

Out of curiosity, how did this odd little community decide that periods and commas belong outside the quotation marks? This goes against traditional academic standards, rules set by MLA, the APA, Harvard and others. What books did you consult? What books have you read? Have you taken English courses recently? If you're British, then you are forgiven. That's your academic convention after all. But for all the Americans here, what the hell y'all thinking?

For example, see Quotation Marks: Teaching the Basics by Susan Collins, The McGraw-Hill Handbook of English Grammar by Mark Lester, or The Associated Press guide to Punctuation by René J. Cappon. Better still, pick up any old book from your local library. Have you glanced at the featured articles on Wikipedia? Have you seen which style they have adopted? Best, Miguel Chavez (talk) 07:23, 13 May 2009 (UTC)[reply]

I'm American, and generally prefer American usage. Some Britishisms absolutely make me itch — whilst, aluminium, dice used as singular. But on this one I'm with the Brits. In this case they just happen to be right. Quotation marks enclose that which is being quoted; if the thing you're quoting doesn't have the punctuation mark, then it shouldn't be there. --Trovatore (talk) 07:29, 13 May 2009 (UTC)[reply]
The funny part is I agree with you. It has always made logical sense to me to adopt the British style. But here's the rub, it's not up to you or me to decide these things! There is a long history of precedent, and grammatical rules have already been put in place. They are being taught in schools, enforced in our universities, and are adopted by almost every English speaking scholar, editor, and publishing house. If Wikipedia is a tool of education -- which we are led to believe -- then we do a disservice to this aim by advocating a convention that will be rejected by most learned institutions. There are practical reasons for keeping grammar and punctuation universal. But it seems that this group wants to play by their own rules. Miguel Chavez (talk) 07:50, 13 May 2009 (UTC)
This isn't the American Wikipedia. It's the English-language Wikipedia. Accordingly, the project has had to address the issue of dealing with national variations in English.
As your original post admits, the punctuation style employed here (sometimes called the "logical" style) is the one common in British English. On the other hand, we use the double quotation marks of American English. There is no perfect solution. This is the one we've adopted. One could make a sensible argument for saying that each editor uses the style that's considered academically correct in his or her home country, but that would produce jarring changes of style within a single article (sometimes within a single sentence). See Wikipedia:Manual of Style#Quotation marks. At one point that section or some other MoS provision characterized our approach as splitting the difference between AE and BE usage. We also have rules about the spelling differences between different versions of English. Sometimes "neighbour" is correct and sometimes "neighbor" is correct. JamesMLane t c 08:11, 13 May 2009 (UTC)[reply]
I understand that we should shy away from American parochialism. But that by itself is not an argument. This little group here has decided to adopt a style used by a minority of English speakers, and one which is at odds with the preponderance of English speaking academics and academic institutions. It is rejected by the Modern Language Association (the folks on the literary side), the American Psychological Association (the scientific side), as well as the good folks at Harvard. My point is this. A lot of kids read Wikipedia, and they -- for better or worse -- are going to incorporate what they see here into their writing styles. And you know what, you're going to piss their teachers off. Why, 'cause you think you know better. If this was any other subject, all we'd have to do is consult a list of authoritative texts. Evidence would be presented, and a rational consensus would ensue. I have a feeling that this would be a futile exercise in this case. As an avid and faithful reader, I can only cringe. Best, Miguel Chavez (talk) 09:25, 13 May 2009 (UTC)[reply]
"you're going to piss their teachers off". Their teachers should be worrying about the kids' incompetence with basic spelling and grammar, rather than which of two legitimate, established stylistic conventions they adhere to. Ilkali (talk) 10:47, 13 May 2009 (UTC)[reply]

Reading the bit about pissing the teachers off it occurred to me that these teachers really haven't that right. If teachers want students to use the US style, surely they'd have to teach it since by no means does it follow logically that something not part of the quote belongs within. If students pick up what they see here and copy it, that'll show the teachers that there's a gap to be filled and give them the opportunity to reinforce the crazy illogical system that the good folks at Harvard peddle. Better still, if enough students copy WP, US academia might swing toward logic ... But, no, WP is no tool to be used for pushing some style or other; however, we are free to adopt one and the one we've adopted by rational consensus is the logical style. Should we worry too much about WP's influence on American academia ... wouldn't we be overrating ourselves? I'm sure America has enough inertia to continue down its current illogical punctuation path in spite of us. There are worse things than logical punctuation on the net for kids to copy. JIMp talk·cont 10:21, 13 May 2009 (UTC)[reply]

This seems to be just another case of the phenomenon described above under #Too many people with too much spare time?, anyway. [1][2] --Hans Adler (talk) 10:43, 13 May 2009 (UTC)[reply]
Hate to point out the obvious, but this coming from a person reading the Wikipedia Manual of Style, clicking on the discussion tab, and reading the pedantic discussion therein. Miguel Chavez (talk) 19:42, 13 May 2009 (UTC)[reply]
The teachers are right because that's the standard everyone has agreed to (minus the folks on the island). It's funny to me because you hear over and over again that Wikipedia wants to be considered a "serious encyclopedia," and it's this kind of make-your-own-rules crap that makes it look like a joke.
In science, as well as other academic disciplines, there is a process called peer review. That means if you think you have a better idea you, as a professional, submit your idea to be reviewed by a panel of experts also trained in that field of expertise. If your ideas have some semblance of merit, they are published. And it is through publication and argument that one's ideas can become orthodoxy. When this happens, and the argument has been won, you start to see your ideas published in encyclopedias, textbooks, and taught in the lower grade levels. This is how the academic process is done. Not so here. If the "logical" style passes the peer review process and manages to become incorporated into most English style manuals then I will concede. But until then every one of you who thinks that "this", is the right way to use punctuation is wrong. And of course there are worse things in the world to worry about (red herring), but anyone who has a grasp of basic grammar and punctuation can't help but get irritated. At the very least there should be a warning that the Wikipedia MoS departs from most English style guides. Best, Miguel Chavez (talk) 19:42, 13 May 2009 (UTC)[reply]
The internet is not America. Again: The internet is not America. Calling a stylistic convention "wrong" because it does not match what you were taught completely misses the point of this page, which is to provide guidance to editors in the face of multiple regionally or contextually prevailing conventions. Ilkali (talk) 21:14, 13 May 2009 (UTC)[reply]
The Internet may not be America, but the readers of the English language Wikipedia overwhelmingly are. If we use American spellings in articles (which we do in most), then we need to be consistent. DreamGuy (talk) 22:38, 13 May 2009 (UTC)[reply]
I'm a firm believer in the peer review process. I know I don't have a doctorate in English, and even if I did I wouldn't have the hubris to think that I could speak for my entire professional community. As it is, a preponderance of English speakers, English departments, professional writers, and publishing houses adopt American conventions of punctuation. As mentioned previously, several major British newspapers have even adopted the American style. You may believe you are in the right, but at least be humble enough to admit you are on the losing side. This whole discussion reminds me of the Intelligent Design crowd who can't win the debate in academic circles so they peddle their ideas online and try to sneak them through the back door. Best, Miguel Chavez (talk) 01:18, 14 May 2009 (UTC)[reply]
Don't forget that the American style was originally the British style, and that the problem comes from the US and Canada (unlike Australia) not following the reform that happened in the UK. A recent change towards the logical system doesn't look like "on the losing side" to me. Also it seems to me that roughly half the native speakers of English live in the logical quotation area, and many of the others prefer logical quotation anyway. And then we have the fact that the Chicago Manual of Style, while clearly preferring the American style, admits there are precision problems and permits logical style where these matter. This affects some (admittedly few) of our articles.
Also the reason for the American system is that it looks better in conventional typography. Given the generally abysmal quality of web typography both on screen and in print, this is simply irrelevant in our context. --Hans Adler (talk) 01:52, 14 May 2009 (UTC)[reply]
I'm curious as to how you arrived at your sum. I have it by at least a factor of three, possibly four. And if you sample published volumes and periodicals, as well as guidelines adopted by educational institutions and publishing houses, I would imagine the figure rises significantly higher. Best, Miguel Chavez (talk) 06:28, 14 May 2009 (UTC)[reply]
You still don't seem to understand the nature of the exercise here. The point isn't to decide which style convention is "correct", because that doesn't exist. The point is to select one according to our goals as a global encyclopedia, and it was decided that logical quotation best meets those goals. Ilkali (talk) 08:41, 14 May 2009 (UTC)[reply]
I think I understand well enough. Some of you folks think you have come up with an improved style of punctuation. It's not quite the American system and it's not quite the British system. It's sort of a bastard child of both. So impressed with yourselves you have dubbed it the "logical system." It is not accepted by most of the world's academic institutions and used by very few if any writers, scholars and editors. But this does not dissuade you one bit. Rather than adopt a well recognized system used by universities and publishing houses, you just mandate that all users on Wikipedia must conform to the style which you happen find more intuitively pleasing or logical, at least to your mind. Never mind that most users will either reject it as sloppy punctuation, or worse still, adopt it in classrooms only to be marked down by their instructors. You say there is no correct way to use punctuation. Well not quite. There is, depending on your geographic location and the system you adopt. Most systems adopt the American style in the way of punctuation. As such the Wikipedia MoS should reflect this, or at least be flexible enough to allow editors the freedom to make the decision themselves by not mandating preference. The fairest solution would be a compromise of sorts. Outline the differences in punctuation by the competing styles and let the editors decide which is more suitable for their prospective article. Best, Miguel Chavez 23:57, 14 May 2009 (UTC)[reply]
"Some of you folks think you have come up with an improved style of punctuation". ...What? You think logical quotation was invented on Wikipedia?
"you just mandate that all users on Wikipedia must conform to the style which you happen find more intuitively pleasing or logical". Oh, shut up. If you want to actually discuss the merits of different styles of quotation and how closely they match Wikipedia's core aims then I'm happy to engage you on that. I'm not interested in hissy fits. Ilkali (talk) 10:57, 15 May 2009 (UTC)[reply]
No, he means that the combination of "logical" punctuation and the use of double quotation marks (as default) is novel. I don't think we are the first instance, by any means; but British style guides do recommend single quotes, and the CMOS does say that British style should have single quotes if used, presumably to keep the comma near the preceding word. Septentrionalis PMAnderson 16:41, 15 May 2009 (UTC)[reply]
Our choice of quotation glyphs is as unrelated to our choice of internal vs external punctuation as is our choice of color vs colour. The idea that this is about British vs American style is a persistent error. Ilkali (talk) 17:17, 15 May 2009 (UTC)[reply]
Thanks Septentrionalis. I appreciate you handling my light work. You got it spot on. Miguel Chavez 08:06, 16 May 2009 (UTC)[reply]
Double quotes are preferred for technical reasons (when searching for abcd the internal search engine will find "abcd" but it won't find 'abcd'); I wouldn't object to allowing single quotes in articles written in British English if/when that is fixed. --A. di M. (formerly Army1987) — Deeds, not words. 10:25, 16 May 2009 (UTC)[reply]
Confused as to how that would affect a search? Best, Miguel Chavez 00:21, 17 May 2009 (UTC)[reply]
Because search engines will see single quotes as part of the words being searched for, and double quotes as string delimiters. Since the quotes can always be changed after you cut and paste into the search engine, this is a weak argument, but it should be in the guideline. Septentrionalis PMAnderson 15:54, 17 May 2009 (UTC)[reply]
One of Wikipedia's core aims is civility. It has been plain for years that this rule (especially as it now stands, without acknowledgment that MOS is making a choice) is not conducive to civility. Septentrionalis PMAnderson 16:41, 15 May 2009 (UTC)[reply]

A handful of editors established this some time ago. Some of them belong to the Dominions, where schoolmarms seem to be very fierce about "logical" punctuation; one of them was an American engineer who posted at length about his grudge against his liberal arts professors (they took off marks for putting commas outside). It has the advantage of presenting quotations precisely as written; on the other hand, it is prone to error, and open to difficult cases (especially when the only source uses the other convention). A rational MOS would say this, and let editors choose - as long as each article was consistent. Septentrionalis PMAnderson 18:39, 13 May 2009 (UTC)[reply]

Makes sense to me. Miguel Chavez (talk) 19:53, 13 May 2009 (UTC)[reply]

In articles using American spelling we use American punctuation rules. I don't care if the MoS currently says otherwise. It won't be the first nor the last time the MoS says something silly that the vast majority of editors just ignore outright. On articles using British spellings by all means use British punctuation rules, otherwise no. DreamGuy (talk) 22:38, 13 May 2009 (UTC)[reply]

"In articles using American spelling we use American punctuation rules". Makes no sense. The two are completely separate things. Ilkali (talk) 22:44, 13 May 2009 (UTC)[reply]
As a general rule of thumb, this seems to be the sensible position. And it's a policy I have adopted and one I think most editors here on Wikipedia have adhered to. Articles discussing British subjects, like the British ethologist Richard Dawkins for example, ought to employ British parochialisms. Articles which touch upon general subjects, however, ought to employ general rules of punctuation, as defined by groups like the MLA, APA and other reputable sources. Best, Miguel Chavez (talk) 00:22, 14 May 2009 (UTC)[reply]
Do you really expect anyone to take your comments seriously if you use POV language like this? Johnbod (talk) 13:29, 16 May 2009 (UTC)[reply]
My only expectation is for people to address the arguments being made. However I'm a little confused as to why I should behave as though I didn't have a point of view when I clearly do? Best Miguel Chavez (talk) 02:39, 17 May 2009 (UTC)[reply]
The long-standing consensus is that articles with no obvious relation to any one place can be written in any dialect of English, provided that it is consistent and that idioms which can be easily misunderstood by speakers of other dialects are avoided. The fact that there are style guides for American English which are reputable sources doesn't make British rules "parochialisms": there also are style guides for British English which are reputable sources. --A. di M. (formerly Army1987) — Deeds, not words. 09:55, 15 May 2009 (UTC)[reply]
No, but insisting on only British format everywhere is parochialism, and impractical parochialism; too many of our editors do not use it, or have never heard of it. Septentrionalis PMAnderson 16:41, 15 May 2009 (UTC)[reply]
"too many of our editors do not use it, or have never heard of it". I don't think you get the point of style guides. If we were just describing what our editors already do, this page wouldn't be a guideline. Ilkali (talk) 17:17, 15 May 2009 (UTC)[reply]
Guidelines are what our editors generally agree on doing; see WP:Consensus (and indeed WP:Policies and guidelines). There is no point to a volunteer organization, which (by policy) accepts anyone, having anything else; nothing else is enforceable, useful, or conducive to civility. (There are other ends the futile effort at prescription can serve, chiefly ego-inflation, but few of them are socially useful; do any of them contribute to the encyclopedia?) Septentrionalis PMAnderson 17:45, 15 May 2009 (UTC)[reply]
"Guidelines are what our editors generally agree on doing [...]. There is no point to a volunteer organization, which (by policy) accepts anyone, having anything else". It doesn't seem like people can agree not to vandalise pages or edit war, either. Let's get rid of the associated policies. Ilkali (talk) 18:33, 15 May 2009 (UTC)[reply]
On the contrary, that is a perfect example. There is general agreement (including both sides of the edit war, talking about each other) that edit-warring should not be done; so we have policy against it. Septentrionalis PMAnderson 19:41, 15 May 2009 (UTC)[reply]
No, most people have an "unless I'm right" clause in their internal WP:3RRs. Assuming I'm right about that, should we write it into the real policy?
Guidelines aren't there to predict or describe or affirm what people already do. They're supposed to guide. They're there to say "in situation X, do Y". The fact that not everybody will automatically follow the guideline, or that some might reject it, does not negate its purpose or its value. Ilkali (talk) 20:03, 15 May 2009 (UTC)[reply]
Does it matter, then, what they guide or whether editors generally agree with it? Reductio ad absurdum does work, of course, on the answer No; but I don't want to leap to a conclusion. Septentrionalis PMAnderson 20:43, 15 May 2009 (UTC)[reply]
Guidelines are supposed to show what the best current practices are. (I've read that somewhere, but I don't remember where.) --A. di M. (formerly Army1987) — Deeds, not words. 10:25, 16 May 2009 (UTC)[reply]
The problem is the consensus you mentioned is not being followed through here. The MoS is explicitly endorsing one style while rejecting the other, even though the one being rejected is far better recognized and implemented more often by most academic institutions. I described the British system as a parochialism because that's what it is: a localized phenomenon. Although many English speakers use it, its use is still in the minority. Best, Miguel Chavez (talk) 08:25, 16 May 2009 (UTC)[reply]
See my comments to PManderson about what a style guide is and isn't. Our job here isn't to describe what our editors already do. There are more concerns here than just who's in the majority.
I would also point out that logical quotation is not restricted to Britain. Scholars of all countries often use it for its precision, like we do here. Ilkali (talk) 08:43, 16 May 2009 (UTC)[reply]
Well this is a public encyclopedia, and decisions ought to be made by majority consensus. A cursory look at the articles seems to favor the American style. As for "logical quotation," its use in public literature is exceedingly narrow, even in the scientific literature, which prefers the APA style. Miguel Chavez 00:21, 17 May 2009 (UTC)[reply]
"decisions ought to be made by majority consensus". No, that's stupid.
"its use in public literature is exceedingly narrow, even in the scientific literature, which prefers the APA style". Are you still pretending choice of glyphs and choice of internal-vs-external are somehow the same thing? Ilkali (talk) 19:00, 18 May 2009 (UTC)[reply]
First things first. If you are under the impression that you and your cadre have ownership of this article you are terribly mistaken. This is an open project, and as long as our contributions are reasonable (and supported by mainstream academic scholarship), then they ought to be considered and decided upon by consensus. My opinion on this issue is simple. Wikipedia ought to adopt general and broadly recognized styles of punctuation. I believe this principle, although unspoken, is why we prefer American preferences of spelling. Not because the American style is superior, but because it will be recognized by a preponderance of Wikipedia users. I understand that this issue is controversial, oddly enough, so I'm willing to give in to plurality, and put up for consideration that the Wikipedia MoS allow editors the flexibility to choose for themselves which style is most appropriate for their prospective articles. To the second point. When I stated that logical quotation was "exceedingly narrow," I meant just that, and nothing more. That is, the practice of placing punctuation marks outside the quotation when the punctuation mark is not part of the quoted speech, and inside the quotation when it is, is disproportionately narrow as compared to the style which is generally used on the North American continent and adopted by most editors and publishing houses. This, together with the fact that most academic bodies adopt the American style, makes for a persuasive argument in favor of adopting inside punctuation. The type of glyph was not really an issue with me, as my argument is completely consistent with Wikipedia's choice of glyph. But there is something to the fact that the odd coupling of styles reduces it further still. Best, Miguel Chavez (talk) 05:56, 19 May 2009 (UTC)[reply]
"as long as our contributions are reasonable (and supported by mainstream academic scholarship), then they ought to be considered and decided upon by consensus". Nobody's criticised your contributions on the basis of anything but their merit. Don't cry persecution just because people aren't agreeing with you.
"we prefer American preferences of spelling". No we don't.
"I'm willing to give in to plurality, and put up for consideration that the Wikipedia MoS allow editors the flexibility to choose for themselves". Style guides exist for a reason. Logical quotation is plainly superior for our purposes, and I've yet to see any coherent arguments that our explicit preference for it has ever caused any harm, other than to the sensitivities of stylo-dominionistic Americans. Ilkali (talk) 11:54, 19 May 2009 (UTC)[reply]
"Don't cry persecution just because people aren't agreeing with you." First, "persecution" was never at issue. This is about the possessiveness and arrogance on your part in refusing to amend the guidelines in the face of reasonable (and ongoing) dissent. Second, I think I've easily split the difference here. After all, I'm not the one getting all bothered about consensus. Third, I think our preference for American spelling is quite evident, and I'll leave it for readers to decide for themselves. Fourth, you've done nothing to address my arguments other than go on about how "plainly superior" logical quotation is. Never mind that most of the academic and literary world disagrees with you. But what do they know, right? In any case, my argument has nothing to do with which style is theoretically superior (the answer is none). The argument is based on four factors. 1, which style is better recognized by (English speaking) Wikipedia readers? 2, which style is in accordance with general principles of academic convention? 3, which style is backed by the preponderance of editors and publishing houses? 4, could the coupling of American and British styles confuse students on the North American continent, thus leading to the adoption of a style which will be rejected by their institutions of learning? Your arguments so far have been: so what, shut up, and this has been decided long ago -- go away. So far, not so impressed. Best, Miguel Chavez (talk) 20:22, 19 May 2009 (UTC)[reply]

Out of curiosity, is there some reason to suppose (as this thread seems mostly inclined to) that the choice of logical quotation was made arbitrarily on the basis of personal style preferences, rather than for the reasons that the page itself actually states? Pi zero (talk) 00:12, 14 May 2009 (UTC)[reply]

Here's the tirade about liberal artsy stuff, anyway.
The reasons the page states are valid, as far as they go; so are the reasons Miguel Chavez would urge on the other side; so are the cautions of the Chicago Manual of Style (§6.10) that the British style requires extreme authorial precision and occasional decisions by the editor or typesetter. We should state all of them. Septentrionalis PMAnderson 05:18, 14 May 2009 (UTC)[reply]
The purpose of writing (whether in English or any other human language) is to be understood. Thus clarity of communication is paramount, and should decide any issues about grammatical rules. Consequently, I put punctuation inside when it is part of what I am quoting, or outside when it is part of my sentence structure. Thus one might have a period inside the quotation and also one outside it, if the quotation is a full declarative sentence and forms the last word in one of my declarative sentences. For example, one might say
Byzantine Emperor Manuel II Palaiologos said "Show me just what Muhammad brought that was new and there you will find things only evil and inhuman, such as his command to spread by the sword the faith he preached.".
I do not care whether we call this American, British, or logical. It is the system I use. JRSpriggs (talk) 00:51, 15 May 2009 (UTC)[reply]
That is actually logical, but it is neither of the systems under discussion. This shows the ineffectiveness of the present prohibition.
Again, who opposes acknowledging that there are at least two systems (since Wikipedia is doubtless actually using both) and describing the reasons to choose one or the other? Septentrionalis PMAnderson 03:37, 15 May 2009 (UTC)[reply]

What about:

In British English, punctuation marks are placed inside the quotation marks only if they are part of the quoted text:

Arthur said, "The situation is deplorable and unacceptable." (The period is part of the quoted text.)
Arthur said that the situation was "deplorable". (The period is not part of the quoted text.)

In American English, commas and periods are normally placed inside the quotation marks regardless of whether they are part of the quoted text:

Arthur said that the situation was "deplorable."

Nevertheless, the British style can also be used in American English in scientific and technical contexts where the standard American style would be misleading:

In the vi text editor, a line can be deleted from the file by typing "dd". (Putting the period inside the quotation marks would suggest that it also must be typed, but that would delete two lines.)

--A. di M. (formerly Army1987) — Deeds, not words. 09:55, 15 May 2009 (UTC)[reply]
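
For what it's worth, the two placement rules in the proposed wording above can be sketched in a few lines of Python. This is only an illustration of the rules as described in this thread, not text from any style guide; the function name `place_stop` and its parameters are made up for the example, and it handles only periods and commas belonging to the enclosing sentence.

```python
def place_stop(quoted: str, stop: str = "", style: str = "logical") -> str:
    """Wrap `quoted` in double quotation marks and attach `stop`.

    `stop` is punctuation that belongs to the enclosing sentence
    ("" if there is none). `style` is "logical" or "american".
    Hypothetical helper, written only to illustrate the thread.
    """
    # If the quoted text already ends with its own terminal mark, the
    # enclosing sentence's period is dropped rather than doubled
    # (assumption: this matches both styles as proposed above).
    if stop == "." and quoted and quoted[-1] in ".!?":
        return f'"{quoted}"'
    # American style: periods and commas go inside regardless of
    # whether they are part of the quoted text.
    if style == "american" and stop in (".", ","):
        return f'"{quoted}{stop}"'
    # Logical style (and all other marks): the stop stays outside
    # because it is not part of the quoted text.
    return f'"{quoted}"{stop}'


print("Arthur said that the situation was " + place_stop("deplorable", "."))
print("Arthur said that the situation was " + place_stop("deplorable", ".", "american"))
print("a line can be deleted by typing " + place_stop("dd", "."))
```

The vi case shows where the difference matters: under the logical rule the reader can take everything between the quotation marks as exactly what to type.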

"The British" or "American format" might be better. Americans do use the British style, sometimes; and conversely. Septentrionalis PMAnderson 16:41, 15 May 2009 (UTC)[reply]
Sounds pretty good. Miguel Chavez (talk) 08:13, 16 May 2009 (UTC)[reply]

We have a clear guideline here. It is generally accepted, and it works. PLEASE leave it alone. See #Too many people with too much spare time? above. Thank you. Finell (Talk) 01:17, 16 May 2009 (UTC)[reply]

If it were generally accepted, this small section wouldn't attract complaints every other month. Septentrionalis PMAnderson 03:00, 16 May 2009 (UTC)[reply]
Not necessarily. There are plenty of Wikipedians who will argue about anything, and especially about standards. Finell (Talk) 13:01, 16 May 2009 (UTC)[reply]
Granted. But if that were all that were going on, there'd be an argument about every section, because the content wouldn't matter. But that's not the case; show me the last argument about the section on Celestial bodies, although it is open to criticism. This section, however, justifiably annoys people, and should be revised closer to WP:ENGVAR, even if we choose to express a preference for "logical" quotation. Septentrionalis PMAnderson 17:35, 16 May 2009 (UTC)[reply]
As the project page states, the standard is based on precision: "it is used by Wikipedia both because of the principle of minimal change, and also because the method is less prone to misquotation, ambiguity, and the introduction of errors in subsequent editing.". That is also why it is called logical quotation. It has nothing to do with WP:ENGVAR, and adding WP:ENGVAR to the explanation would muddle something that is now clear. Finell (Talk) 22:02, 16 May 2009 (UTC)[reply]
That is the (fairly feeble) argument for it; and editors should indeed consider it. The "imprecision" consists of the fact that some readers, faced with the other system, which uses ," as a compound sign, will be parochial enough to read the single comma as part of the quotation although it need not be. Editors should be aware of that possibility, and adopt "logical" punctuation when it will be a problem - or recast to avoid the comma; but it's a single comma. Septentrionalis PMAnderson 15:46, 17 May 2009 (UTC)[reply]

The reference to WP:ENGVAR does deserve clarification: I do not mean that we should adapt quotation systems to the national variety of English used. We should, when there are two commonly used systems of typography in English, mention that there are two, and discourage switching between them save for good reasons and by consensus. Engvar does that for color/colour; we should do it more widely. Septentrionalis PMAnderson 15:46, 17 May 2009 (UTC)[reply]

"If it were generally accepted, this small section wouldn't attract complaints every other month." The last time an objection to logical quotation was raised was six months ago, by you (in a discussion of timewasting complaints). Prior to that there was a discussion in August 2008 (again involving you); and prior to that, there was a discussion in May 2008, again involving you. That certainly adds up to far less than a complaint "every other month", as you must be aware as a participant in the last three unsuccessful and frequently-rejected "challenges". chocolateboy (talk) 23:18, 16 May 2009 (UTC)[reply]
This debate is ridiculous. If editors want to use the most well known system of punctuation, which is adopted by most universities, books, newspapers and periodicals then they should. Miguel Chavez 00:21, 17 May 2009 (UTC)[reply]
Good luck with that. [3][4] chocolateboy (talk) 01:27, 17 May 2009 (UTC)[reply]
That small edit is actually how I came to find out about this unusual policy of yours. I find it, well, odd that an article dealing exclusively with American popular culture should be so infused with British parochialisms. Miguel Chavez (talk) 02:27, 17 May 2009 (UTC)[reply]
  • I should remind those who wave WP:POINT around: it prohibits doing actions which you do not support, or which damage the encyclopedia, to make a point. Doing actions which you do support, and which help the encyclopedia, despite a guideline is WP:IAR; that is supported by policy. However, those who do this must state what benefit they see, and may be reversed unless supported by local consensus. Septentrionalis PMAnderson 15:50, 17 May 2009 (UTC)[reply]

¶ For whatever little it may be worth, I crossed the Atlantic thrice between the ages of 6 and 11, ending up in the U.S. with a mild chauvinistic prejudice in favour of British style & spelling. But sometimes American practice makes more sense to me on a relatively objective logical basis and sometimes British. I think American double-quotes are far better because they avoid double-takes when an apostrophe (prime, etc.) is encountered either in the middle of a quotation or closely outside one. [The bakers' son swore that ‘when I visited Goldsmiths', they didn't have the goods’ that had been described.] That's still a problem with apostrophes near an enclosed quotation, but those are less common. But I don't like introducing punctuation that isn't in the original, no matter what the aesthetic advantages might be, because it can either mislead the reader or impose on him or her the burden of trying to reconstruct what the original looked like. On the other hand, if you're quoting a whole sentence that ends with a full-stop/period (or would, if it were written down from speech) then by almost the very same logic, leave it inside the quotation marks. This rule, by the way, is almost imperative (from my point of view) for other stops such as exclamation points [!] and question/interrogation marks [?] because doing otherwise could distort the import of the original quotation; so why not follow it for periods/full-stops and commas? —— Shakescene (talk) 18:15, 24 May 2009 (UTC)[reply]

I'll admit it. I sort of hope logical punctuation is eventually adopted by the academic community -- even though it's aesthetically inferior with its lack of symmetry and uniformity -- but until that day comes I'm going to stick with "mainstream" academic convention. Maybe it's because I'm a first born, and studies show that we tend to be less rebellious. I suppose that also explains why I continue to use Windows. Best, Miguel Chavez (talk) 09:38, 25 May 2009 (UTC)[reply]
I do not think we should call the British system "logical." It biases people against American punctuation by implying that it is illogical. It's not more logical; it's just a different way of solving an old typographical problem. It's perfectly logical to spell "color" without a U, but if I'm writing an article about London wall paintings, I'd best use London spelling.
I propose that we instead refer to the British punctuation rule as it applies to commas and periods as "the stop rule" or "the stop system" because it relies on the location of the stop within the sentence. It's descriptive, nearly self-explanatory and, hopefully, won't tick anyone off.
If the matter of British or American punctuation is being revisited, then here are my two cents: Why not use the guideline that is already in place for spelling? Articles that are about specifically American or British topics use the spelling that more closely relates to the subject matter, and articles that don't apply to either stick with the system established by the original contributor.Darkfrog24 (talk) 19:44, 29 May 2009 (UTC)[reply]

Is it "aesthetically inferior"? Isn't that in the eye of the beholder? As for calling it "logical", this is no bias against Americans: it is logical. What is part of the quote goes inside; what is not doesn't. What could be more logical? As has been noted in this and prior discussions on the topic, we have a different situation from US versus Commonwealth spelling. American punctuation changes the quote; this is why it has been judged inappropriate. JIMp talk·cont 21:04, 29 May 2009 (UTC)

Calling one system "logical" implies that its alternative is illogical. For example, factions in the abortion debate imply that their opponents are "pro-death" or "anti-choice." Another way to mitigate this is to give the American system a name of its own and use both in the article, such as in an article that refers to both the pro-life and pro-choice factions by their own selected names (or an article that refers to the two systems as "British" vs. "American" or as "logical" vs. "consistent," etc.). As for what could be more logical, it can be argued that it is more logical to treat periods and commas consistently as opposed to switching back and forth. The name "logical," with regard to the stop rule, is arbitrary. Darkfrog24 (talk) 22:02, 29 May 2009 (UTC)[reply]
If a style were called by any other name, it would smell just as logical ... or consistent ... Well, the American style is illogical. To put something within inverted commas regardless of whether it is part of the quote is illogical. This is no anti-American bias nor is it arbitrary. Nor do I see how logic could lead us to the idea that we should prefer the kind of "consistency" involved in putting fullstops and commas (and what about other punctuation marks?) within the quote regardless of whether they belong there. Consistency, on the other hand, it might be argued, leads us directly to logical punctuation, for what could be more consistent than having everything within the inverted commas being part of the quote and everything outside not? I'd thus argue that logical quotation is more consistent or conforms to a higher level of consistency than the alternative. So to use the term consistent to describe the American style and thus imply that the logical style is inconsistent would not be correct. The term æsthetic punctuation is sometimes used but beauty is in the eye of the beholder (and I can name you one beholder who sees nothing pretty here). There is, however, another name which is often used for the American style: typesetters' punctuation, since it was the early typesetters who came up with this style for practical reasons, reasons which no longer apply. JIMp talk·cont 18:14, 30 May 2009 (UTC)
It is not illogical and calling it so offends me deeply. As you've just demonstrated with the word "consistent," words can apply or not apply to something depending on how the reasoning is worked out. Your argument against the logic of American punctuation would only hold water if it were not understood that using a period or comma to end a quote is just part of the quoting process. It is. And in American English, full stops and commas are put there because they do belong there.
I wouldn't object to calling the stop rule "technical punctuation," because it is used in technical documents. Darkfrog24 (talk) 18:32, 30 May 2009 (UTC)[reply]
Two points.
  • Logical here means "using the logical status of the punctuation as a basis for deciding where to put it", so the use of that term has nothing whatever to do with whether or not it makes sense to use that logical status as the basis for the decision. (I actually do think it makes a lot of sense to do so in Wikipedia, because it maximizes information delivered to the reader, and delivering information to the reader is the mission of Wikipedia; but I digress.) Since logical here doesn't mean "making sense", but rather "using information about logical status", it's not at all clear that it's even meaningful to talk about what is the "opposite" of logical, and even if there were such a thing as its opposite, that opposite wouldn't be "illogical".
  • None of the other systems that have been mentioned here is "opposite" to logical quotation, so in calling it logical quotation there is no implication that any of these other systems is "the opposite of logical", even if there were such a thing as "the opposite of logical" in this situation (which there isn't).
Pi zero (talk) 00:17, 31 May 2009 (UTC)
Calling one system "logical" implies that the others are less logical or illogical entirely. It might be nice if it didn't, but it does. Darkfrog24 (talk) 00:48, 31 May 2009 (UTC)[reply]

[Just noticed this (re: "inside or outside"), which I point out in an article talk page and am copying here, for others' information, if useful:] An example called a "sentence fragment" in the section linked on "Quotation marks" ("Come with me."--mispunctuated as "Come with me") is actually not a "fragment" of a sentence; it is a full sentence: an imperative command ("Come with me." is a sentence based on the imperative usage of the verb "to come"; "Come." signifies "I am telling you to come," just as "Come with me." signifies "I am telling you to come with me." (It may appear elliptical ["a fragment"] to some, but it is an imperative.) --NYScholar (talk) 00:47, 26 May 2009 (UTC) [Also added to section above: #Punctuation: Quotation marks: Inside or outside. --NYScholar (talk) 00:57, 26 May 2009 (UTC)][reply]

[Please see Imperative mood#Usage: in the above example ("Come with me." v. "Come with me") in the current version of this project page re: "inside or outside" quotation marks, what is labeled as "correct" is actually incorrect, and vice versa. (I added Wikified link above too.)--NYScholar (talk) 03:08, 27 May 2009 (UTC)][reply]

Centre-facing images and L2 headers

A recent discussion on WP:ANI#Rotational has raised a couple of possible conflicts between two MOS guidelines. Please could people offer their opinions on the following issues?

  1. How should right-facing portraits be handled in the lead section? MOS suggests that "It is often preferable to place images of faces so that the face or eyes look toward the text." Is it appropriate to swap positions of the TOC and the lead image, as in this edit?
  2. Is it appropriate to convert L2 headers to L3 for aesthetic reasons, such as this edit? Papa November (talk) 18:04, 18 May 2009 (UTC)[reply]
Re (2), I asked the same thing on this page a little while ago, but didn't get much feedback.[5] It seems completely obvious to me that the top-level headers in each article should be level 2 (==) not level 3 (===). — Carl (CBM · talk) 18:09, 18 May 2009 (UTC)[reply]
  • For what it is worth, I offer the following;
    1. This is a conflict of competing goods; it is appropriate to do it either way, since some readers will value both. If the switched way were widely offered and widely rejected, it might be appropriate to consider that very few readers find "facing inward" the Most Important Thing; but I don't see that that is true. Few readers care.
    2. This esthetic judgment can (and should) be implemented by adjusting reader settings. But if an occasional article has level 3 headers, so what?
In neither case are these worth revert-warring about. Septentrionalis PMAnderson 19:09, 18 May 2009 (UTC)

The first dot point of the "images" section is "Start an article with a right-aligned lead image or InfoBox." I think the wording suggests that this trumps "It is often preferable to place images of faces so that the face or eyes look toward the text.", and so it should. Specifically,

  1. No, the layout in that diff looks horrible. If that is the result of allowing left-aligned lead images, then the MOS should be clarified to make it even more clear that they are discouraged.
  2. I largely agree with Septentrionalis' "if an occasional article has level 3 headers, so what?"; but note that that diff also demoted the References section to level 4, making it subordinate to the Asteraceae section. Look at the contents box:

105 Acanthaceae 106 Rubiaceae 107 Asteraceae 107.1 References

That is completely wrong; indefensible even.

Hesperian 23:56, 18 May 2009 (UTC)
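The nesting described above can be reproduced with a minimal sketch. This is a simplified numbering model written for illustration, not MediaWiki's actual table-of-contents code; the function name `toc_numbers` is hypothetical, and the sketch assumes no header is shallower than the first one on the page. The real contents box shows 105 to 107; the example uses three entries only.

```python
def toc_numbers(headers):
    """Assign dotted contents-box numbers to (level, title) pairs.

    Simplified model (an assumption, not MediaWiki's code): a header
    deeper than the previous one opens a sub-level, and a header at the
    same level as an open one increments that level's counter.
    """
    counters = []  # one running counter per open nesting depth
    levels = []    # the wikitext header level for each open depth
    numbered = []
    for level, title in headers:
        # Close any open depths that are deeper than this header.
        while levels and levels[-1] > level:
            levels.pop()
            counters.pop()
        if levels and levels[-1] == level:
            counters[-1] += 1      # sibling of the previous header
        else:
            levels.append(level)   # first header at a new, deeper level
            counters.append(1)
        numbered.append((".".join(str(c) for c in counters), title))
    return numbered


# References demoted to level 4 becomes a child of the last level-3 header.
demoted = toc_numbers([(3, "Acanthaceae"), (3, "Rubiaceae"),
                       (3, "Asteraceae"), (4, "References")])
# References kept at the same level as the other entries gets its own number.
same_level = toc_numbers([(3, "Acanthaceae"), (3, "Rubiaceae"),
                          (3, "Asteraceae"), (3, "References")])
```

Under this model the demoted References header comes out as "3.1", a subsection of Asteraceae, while the same-level version comes out as "4".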

Actually the number of articles with a level-3 references section is vanishingly small (under 3000). I checked. Once you remove from that the ones that have "references" as a subsection of "notes and references", and a swath of stubs that seem to have been autogenerated with level 3 headers, the number left is small enough to attribute solely to users unfamiliar with how to format wikitext. — Carl (CBM · talk) 00:01, 19 May 2009 (UTC)[reply]
I would not have switched the image and the TOC, but I think the switched version looks well enough. But the matter seems to have been settled by blocking the advocate. Septentrionalis PMAnderson 14:44, 19 May 2009 (UTC)[reply]
He has not been blocked this time; a discussion at ANI, that I did not participate in, led to a sanction that he cannot revert this sort of change if someone else reverts him. He is still able to edit normally in every other way. Which seems like a reasonable solution to me. — Carl (CBM · talk) 14:52, 19 May 2009 (UTC)[reply]
"It is often preferable to place images of faces so that the face or eyes look toward the text." Is that really a modern stylistic rule? Or does it belong to the 1930s and 1940s, when vehicles in movies (especially trains) were expected to be pointed to the right when moving eastward and left when moving westward? --John Nagle (talk) 17:37, 19 May 2009 (UTC)[reply]
It's still a rule. It looks odd when a face is pointing off the page; in addition, readers' eyes tend to follow the direction of the face, and we don't want to point them away from the text. SlimVirgin talk|contribs 17:45, 19 May 2009 (UTC)[reply]
  • Looking at the examples (and several other versions) here's my 3 cents
    1. Right-facing portraits in the lead should be on the LEFT. But if they are enclosed in a valid infobox then the right side is okay. Right-facing portraits look awkward and horrible when placed on the right side if they are not in an Infobox
    2. Yes it is appropriate to convert L2 headers to L3 headers if the situation warrants it (i.e. for aesthetic reasons). I'm not sure if aesthetics was the original claim at the diff referenced above, but the page itself calls for L3 headers. L2s are reserved for Section Headings: The whole page there is one section (L2 level). Consequently, every entry there is, by definition, a sub-heading (L3). Sub-headings are supposed to be L3. Joe Hepperle (talk) 10:04, 22 May 2009 (UTC)[reply]
      • If there are no section headings, the entire article is at the L1 level, not the L2 level. The L1 header is the article title; more can be produced with a single = sign but convention is that we do not use them. An L3 header is for a subsubheading. — Carl (CBM · talk) 05:21, 23 May 2009 (UTC)[reply]

Archives of this page

Why do archives 8, 9, 66 and 67 of this page appear to be redlinks in the {{talkheader}} at the top of this page?  Skomorokh  19:44, 18 May 2009 (UTC)[reply]

I was wondering the same thing... — Cheers, JackLee talk 04:15, 19 May 2009 (UTC)[reply]
Editing Wikipedia talk:Manual of Style/Archive 8 has this message: 14:18, 25 February 2009 Od Mishehu (talk | contribs) deleted "Wikipedia talk:Manual of Style/Archive 8" ‎ (G2: Test page). -- Wavelength (talk) 14:45, 24 May 2009 (UTC)[reply]
The other three do not have such a message. -- Wavelength (talk) 14:47, 24 May 2009 (UTC)[reply]
It doesn't really help, but I can confirm that Archive 8 had a single edit to it that was actually a test edit before it was deleted. Why the numbering sequence is wrong, I do not know. MBisanz talk 15:21, 28 May 2009 (UTC)[reply]
This is what I discovered from the Internet Archive Wayback Machine.
http://web.archive.org/web/*/http://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Manual_of_Style/Archive_8
http://web.archive.org/web/*/http://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Manual_of_Style/Archive_9
http://web.archive.org/web/*/http://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Manual_of_Style/Archive_66
http://web.archive.org/web/*/http://en.wikipedia.org/w/index.php?title=Wikipedia_talk:Manual_of_Style/Archive_67
-- Wavelength (talk) 19:46, 28 May 2009 (UTC)
I found the following information.
  • Archive 7 ends on 16 February 2005 and Archive 10 begins on 22 January 2005.
  • Archive 65 ends on 14 February 2007 and Archive 68 begins on 1 February 2007.
-- Wavelength (talk) 22:29, 29 May 2009 (UTC)
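A minimal sketch of what those boundary dates show, using only the four dates listed above. The helper name `gap_days` is hypothetical. A negative gap means the later-numbered archive begins before the earlier one ends, so there is no uncovered period between them; whether that means archives 8, 9, 66 and 67 were simply skipped in the numbering is an inference, not something the dates prove.

```python
from datetime import date

# (previous archive, next archive) -> (end of previous, start of next),
# taken from the dates reported in the comment above.
boundaries = {
    (7, 10): (date(2005, 2, 16), date(2005, 1, 22)),
    (65, 68): (date(2007, 2, 14), date(2007, 2, 1)),
}


def gap_days(prev_end, next_start):
    """Days between the end of one archive and the start of the next.

    Zero or negative means the two archives touch or overlap in time.
    """
    return (next_start - prev_end).days


gaps = {pair: gap_days(end, start) for pair, (end, start) in boundaries.items()}
for (prev, nxt), days in gaps.items():
    print(f"Archive {prev} -> Archive {nxt}: gap of {days} days")
```

Both gaps are negative (-25 and -13 days), so in each case the surrounding archives overlap rather than leave a hole.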

What's wrong with MOS

Let me start with what's right with MOS: the notice that when English speakers do a thing in two different ways, as with WP:ENGVAR or the Oxford comma, that both are widely accepted and should be left alone.

Beyond that, it would be harmless if MOS stated what is universally accepted about English prose (End every sentence or fragment with punctuation. would be one example - and even to that there might be exceptions).

[Also, as below, certain conventions "to prevent factual errors or ambiguity". This principle does not include (as some would argue) conventions that choose among alternatives, and which might prevent ambiguity provided the readers knew that particular paragraph of MOS; very few of our readers will. 18:34, 19 May 2009 (UTC)]

If, when there are two widely accepted ways of doing things, as with "logical" punctuation, we stated advantages and disadvantages, and left the editor to make up his mind, that would be treating our editors like adults. Even if we then expressed a preference between the alternatives, that would be informative, respectful, and useful.

But we do not; explanations are routinely removed, and arbitrary choices imposed, in the interests of a chimerical Wikipedia-wide "consistency". Our articles will never be consistent: some have infoboxes, some don't; some use color, some use colour; some use asyndeton, some don't.

Even on those issues on which MOS pretends to impose "consistency", they are not consensus, but the opinion of a handful of editors. (More editors have objected to the requirement of "logical" punctuation than have ever supported it; but they don't stick around.) Their imposition is a massive violation of WP:NOT, which is policy: Instruction creep should be avoided. Wikipedia's policies and guidelines are descriptive, not prescriptive. They represent an evolving community consensus for how to improve the encyclopedia and are not a code of law.

I dispute 90% of MOS, therefore, individually and in parts. Whatever comes of the current Arbitration, I shall be taking it off my watchlist, as beyond present repair; I hope that (as with other entrenched disasters) I will find it better after a long intermission.

I encourage those who think they can fix this divisive and polemical page to continue to try; indeed, I post this in an effort to assure them that they are not alone, and have support, even in absentia.

Comments? Septentrionalis PMAnderson 15:06, 19 May 2009 (UTC)[reply]

That perfect consistency can never be achieved does not make it worthless to try for greater consistency. —David Eppstein (talk) 15:36, 19 May 2009 (UTC)[reply]
Greater consistency in dashes and quote-marks is not worth very much to begin with; that striving for it is futile does make the effort vacuous. Septentrionalis PMAnderson 18:47, 19 May 2009 (UTC)[reply]
Your problem with the Manual of Style is that it is a manual of style. That's not going to change. Ilkali (talk) 15:58, 19 May 2009 (UTC)[reply]
The present state of MOS is inconsistent with being a guideline. In the long run, one of them will change. The present decrees of MOS are inconsistent with the English language; in that case, there is little doubt which will change.
But it really doesn't matter; the force of a guideline (except to newbies who don't realize how little tags mean) rests in the persuasiveness of the arguments it makes and the consensus it represents. MOS has no arguments, and (with the exceptions above) represents no consensus. Septentrionalis PMAnderson 18:47, 19 May 2009 (UTC)[reply]
If, as you say, "The present decrees of MOS are inconsistent with the English language", can you provide examples of those inconsistencies and sources to support your assertion? -- Wavelength (talk) 23:35, 19 May 2009 (UTC)[reply]
  • For one, the effort to bury aesthetic or typographer's punctuation by silence. There are two methods of dealing with the question, both with style guides that support them and recommend against the other (I mention this because it is discussed a few sections up, with citations).
  • For another, the guideline on Celestial bodies, which says that we should use The greatest cobalt deposit on earth with a lower case earth, unless another celestial body is mentioned in the same sentence. This is not what is done; if the availability of cobalt on Mars is the subject, Earth is capitalized whether Mars is in the same sentence or not.
  • But there are hundreds; the whole section on Figures and words is an effort to patch a misbegotten hard rule quarried out of a collection of rules of thumb so that it doesn't produce too many howling violations of idiom. Septentrionalis PMAnderson 20:31, 20 May 2009 (UTC)[reply]
At a minimum, a manual of style should prescribe some enforceable rules that prevent factual errors or ambiguity. Some that should be in this MOS are:
      • The decimal point is the period (full-stop), never the comma
      • Billion and trillion have the short-scale meaning, never the long-scale meaning
      • Dates in the format YYYY-MM-DD are governed by ISO-8601 (or are not so governed, but we should make up our minds)
      • All-numeric dates in other than the YYYY-MM-DD format are forbidden
      • The citation format should be specified in a manner so that pages, volumes, and issues can be clearly distinguished.
      • When a name for a unit of measure, such as gallon, indicates different quantities in different countries, there is no presumption about which national variety of the unit is intended and any article that fails to specify the variety is in error.
If we wanted to, we could confine the manual of style to those kinds of pronouncements, and for the rest, give a list of well-regarded manuals such as the Chicago Manual. --Jc3s5h (talk) 16:18, 19 May 2009 (UTC)[reply]
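The ambiguity behind the all-numeric date rule in the list above can be sketched briefly. This is an illustration only, assuming slash-separated dates with a four-digit year; the helper name `readings` is hypothetical. It uses Python's standard `datetime.strptime` and `date.fromisoformat`.

```python
from datetime import date, datetime


def readings(text):
    """Return the distinct dates a D/M/Y-or-M/D/Y string could mean."""
    found = set()
    for fmt in ("%d/%m/%Y", "%m/%d/%Y"):  # day-first, then month-first
        try:
            found.add(datetime.strptime(text, fmt).date())
        except ValueError:
            pass  # not a valid date under this reading
    return sorted(found)


ambiguous = readings("03/04/2009")    # 4 March or 3 April
unambiguous = readings("25/05/2009")  # only the day-first reading is valid
iso = date.fromisoformat("2009-05-25")  # YYYY-MM-DD has a single reading
```

A string such as 03/04/2009 yields two different dates depending on the reader's convention, whereas the YYYY-MM-DD form parses to exactly one. The sketch says nothing on the separate question raised in the list of whether such dates should be treated as governed by ISO 8601.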
I agree to some extent with PMA. We should lay down some basic style guidelines, and we should insist on consistency within articles. But if we try to pin down every tiny thing, in the end we'll succeed mostly in undermining the MoS. It would be good if we could have a small but solid core of rules, and thereafter advice outlining choices and explaining who tends to do what where. SlimVirgin talk|contribs 17:53, 19 May 2009 (UTC)[reply]
(ec) All of Jc3s5h's examples are quite reasonable; as a quibble, it might be possible to permit the decimal comma provided there is a clear note explaining what's going on. They make up another valid category: avoiding unclarity.
Let me, however, encourage reduction towards that minimum. I look forward to seeing a revised and simplified MOS in some months. Good luck. Septentrionalis PMAnderson 17:59, 19 May 2009 (UTC)[reply]
There are a few other useful parts, like WP:LAYOUT (excluding WP:MOSIMAGE), which actually help editors and readers. But on the whole MOS is WP:CREEP - for example dictating what kinds of dash should be used for page ranges in citations. And as for making every last minutia of MOS a requirement for FA, well, if I'm going to spend that much time researching, it will be on a subject I enjoy. --Philcha (talk) 18:17, 19 May 2009 (UTC)
You've expressed my sentiments exactly, especially with regard to "treating our editors like adults." A very thoughtful post. Best, Miguel Chavez (talk) 07:46, 21 May 2009 (UTC)[reply]

Pmanderson indicates that when there are alternative styles which lead to factually different meanings, readers cannot or will not rely on the MOS to decide which alternative is intended in an article. If this is so, then the MOS should be deleted, and every single article should be written in an unambiguous style. For example, if a journal is cited in a style that does not spell out or abbreviate the terms "volume", "issue", and "page(s)" then it must contain a statement about which manual of style the citations follow. --Jc3s5h (talk) 18:52, 19 May 2009 (UTC)

Is it sufficient to be clear about which number is which, without necessarily following any model? "Vol. 5, Issue 2, pp. 143-158" will give any reader the location of the article, which is the reason to cite it in the first place. It may be preferable to follow either the standard citation method in the field, or how the journal cites itself, but is it important? Septentrionalis PMAnderson 18:59, 19 May 2009 (UTC)[reply]
Style guides at WP are "divisive and polemical" only inasmuch as a single editor, above, has been running this line for three years; it has largely been a singular crusade. I hear no complaints from our article editors, whether those who prepare and maintain featured content or those who contribute to other articles, about the nature and function of MoS. Most editors, like me, are grateful to have a resource that clears up doubt. Every professional publication has a style guide, and WP's is significantly less fine-grained than the major hard-copy style guides. Those who feel there is insufficient stylistic latitude might ponder on the overwhelming proportion of stylistic decisions editors make (would it be 95%?) that are not covered by MoS. That is as it should be. There is no solid evidence that editors out there do not like the function and status of MoS as it has been over years. Tony (talk) 05:20, 20 May 2009 (UTC)
They were divisive and polemical before I ever saw them, they are now, and they will continue to be. Septentrionalis PMAnderson 20:35, 20 May 2009 (UTC)[reply]
Tony, you "hear no complaints from our article editors" because you listen in the wrong places, mainly at FAC. See: the quotes at User:Philcha#GA.2C__FA.2C_etc. and Category:Hyphen Luddites. How many hours' study does it take to become thoroughly familiar with MOS? I'd rather spend the time researching publications that help to build content. --Philcha (talk) 07:09, 20 May 2009 (UTC)
"How many hours' study does it take to become thoroughly familiar with MOS?" Nobody is forced to spend time learning it. If you follow the wrong style, the worst that will happen is that someone will leave you a message suggesting that you adjust to something different. Even that's optional. As long as you don't deliberately change things from good style to bad, there's no problem. Ilkali (talk) 07:37, 20 May 2009 (UTC)[reply]
Pilcha: "How many hours' study" does it take to learn how to write to a professional level? That is the requirement of featured articles. "Good articles" are usually not written well, and are full of inconsistencies (not all, but most). I'd say reading through the MoSs would be a quick-sure way of improving one's writing ability, but not sufficient to achieve a professional standard, since only a tiny part of style is treated by the style guides (grammar, redundancy, logic, and flow are too multifarious and detailed to treat, and would quickly end up being over-prescriptive). Probably you'd want to become more familiar with some parts than others—that's certainly the case for me, and I quickly consult any part I'm unfamiliar with, where on occasion it is necessary. I very much like most of what is on your page, Pilcha, including general advice on article writing; but I'm unsure I agree with your stance on the style guides, since not everyone can do as well as you can without its advice. Perhaps there's a case for the writing of a beginner's guide to the style guides (with links for the reader who wants a greater level of detail on any particular matter). Tony (talk) 07:50, 20 May 2009 (UTC)
Tony, who's this "Pilcha" whose views you're commenting on?
The fatal flaw in your case is that I've seen many examples of lousy writing that complies with MOS, and it's quite easy to write decent to good prose that ignores MOS. Which is more beneficial for readers? On which would time be better spent? Your own excellent advice on User:Tony1/How to satisfy Criterion 1a (of FAC - "engaging, even brilliant prose") illustrates my point - only one section of it considers MOS, and it's the one I ignore, while enthusiastically recommending the rest. --Philcha (talk) 08:12, 20 May 2009 (UTC)
I think these "criticisms" would apply to all WP policies and guidelines - but it was never claimed that they would enable perfection to be achieved, nor that they should be followed blindly without exception, nor that editors should read and master the whole set of them before writing anything. They are (or should be) useful references to be consulted when you're not sure about something or when editors take differing views. Of course the MoS and other project pages need a great deal of improvement, but that can be achieved by proposing concrete changes and discussing them calmly. --Kotniski (talk) 09:31, 20 May 2009 (UTC)[reply]
Hi, Kotniski, re "discussing them calmly", I do not find MOS exciting :-)
Re "nor that they should be followed blindly without exception", I wish - see WP:WIAFA.
Re "They are (or should be) useful references to be consulted when you're not sure about something or when editors take differing views", it's impossible to find anything in it, and there's no overview, index or any thing that would help. What little I know about MOS is based on review comments that cited relevant pages - and with which I did not always agree.
Re "that can be achieved by proposing concrete changes", an incremental strategy only works for upgrading a reasonably good product. I don't think MOS qualifies. --Philcha (talk) 12:31, 20 May 2009 (UTC)
Well, the difficulty of finding the information is something that could be worked on (although it's not totally disastrous at the moment - there's a main MoS page and a sidebar that guides you to the various subpages). If you think the FA people impose MoS rules more rigidly than was intended, then that's more a matter for FA than for MoS (it says at the top of this and all other guidelines that there will be occasional exceptions). Do you have any other complaints?--Kotniski (talk) 13:15, 20 May 2009 (UTC)[reply]
I'm not the only one who has complaints - see User:Philcha#GA.2C__FA.2C_etc..
Re "the difficulty of finding the information is ... not totally disastrous at the moment", I disagree. The sidebar on the right seems to reflect mainly which groups have at various times in the past kicked up a fuss. OTOH I see no easy navigation to WP:LAYOUT, which, with one possible exception, is very useful. In fact I'm astonished that the "create article" dialogue does not automatically insert a template (subst, not transcluded) that sets up the standard sections.
The "possible exception" to the usefulness of WP:LAYOUT is WP:MOSIMAGE. This seems to be going through one of its moderate phases, but I've seen periods when it has outright forbidden the explicit sizing of images - which is nuts.
The difficulty of navigating and finding stuff is partly a consequence of the size of MOS, and would be reduced if MOS were slimmed down. The other problem is that WP's internal search engine is pretty useless. Although that's not MOS' fault, I suggest MOS needs to adapt to the fact.
I'm having a skim through MOS now to see what I think as good / desirable but inadequate / bad and will post my thoughts here when I think I have enough material. --Philcha (talk) 14:18, 20 May 2009 (UTC)[reply]
PS I partially agree with "that's more a matter for FA than for MoS". The sad fact is that WP has its fair share of people who get their kicks out of bossing others around, by no means confined to FAC, and I think WP guidelines should be careful not to provide weapons for such people. --Philcha (talk) 14:22, 20 May 2009 (UTC)[reply]
  • I am glad to see that so much productive conversation has been sparked by what was intended to be a farewell. I think this would make a good essay, which is why I have continued to engage; it would be better if written by someone who intends to stick around. Septentrionalis PMAnderson 20:03, 20 May 2009 (UTC)[reply]
One characteristic of Wikipedia is that, even if an editor makes a mistake, another editor can correct the mistake. Therefore, there is no urgency for an editor to learn all the details of the Manual of Style, although each editor would do well to continue learning at a reasonable speed. There is also no need to remove from the Manual useful guidelines which some editors appreciate. Also, not all editors have easy access to the Chicago Manual of Style and other manuals of style. Please let the guidelines remain for those who appreciate them, and, if some editors ignore some guidelines, then please let other editors correct their mistakes.
-- Wavelength (talk) 20:47, 20 May 2009 (UTC)
Another thing that MOS could usefully do is to report what Chicago, Oxford and so on say about a given question. Those are reasons to choose between alternatives. But MOS doesn't. I hope to return to find things otherwise.
There is also no need to remove from the Manual useful guidelines which some editors appreciate. This is too broad. If many, or (as sometimes happens) more, editors disapprove of them, then they should be removed. Guidelines are supposed to be consensus, and such "guidelines" aren't; nor are they useful, except in stirring up controversy. Septentrionalis PMAnderson 21:00, 20 May 2009 (UTC)[reply]
Nobody is forced to spend time learning it. If you follow the wrong style, the worst that will happen is that someone will leave you a message suggesting that you adjust to something different. Even that's optional. As long as you don't deliberately change things from good style to bad, there's no problem.

Would any of this were true; if it were, MoS would not be a problem, and such of its regulations as are unwise would sink into a harmless obscurity. If it can be arranged to become true, I would leave this page forever, instead of long enough to let it evolve without me. But the worst that will happen is that bots will be driven through articles to fix what doesn't need fixing (including altering direct quotations), and useful articles will be turned down for FA because they don't comply with a prejudice codified in some obscure MOS subpage. Worse yet, incompetent and dishonest articles can pass FA, because "correcting" MOScruft looks like a full review, even if the accuracy, neutrality, verifiability, and clarity of the article have not been considered.

The last sentence has other problems: is it really meant to imply that "good style" is whatever MOS approves of, no matter how obscure and illiterate, and "bad style" is what MOS disapproves of? I hope not. Septentrionalis PMAnderson 20:52, 20 May 2009 (UTC)

¶ I'd been stuck on the talk page of WP:MOSNUM so long that I'd forgotten that a Mother Page existed, and suggested a shorter working Manual for essential questions of accessibility, readability and clarity. Then when I was reminded (by a reference to this very section) that such an overall summary MoS did in fact exist, I checked the kilobyte count in this article's edit history, and saw that it exceeded 144,000. If you added the counts of all the two-dozen subsidiary manuals (excluding linguistic and technical ones) to that, I'm not sure what literary work's word count would be exceeded: Dickens, Shakespeare, the New Testament? the Bible? Certainly if you tried to read all the discussion archives, too, you'd be planning on a long programme of study.

It's true that printed Manuals of Style (Modern Language Association, Associated Press, U.S. Government Printing Office, New York Times, University of Chicago Press, Clarendon Press, etc.) are bound volumes of some length, but they're largely for the use of full-time paid professional writers, reporters, professors, copy-editors and editors. Fewer than two dozen people are actually getting paid for any function by Wikipedia.

There's nothing wrong with guides to usage and manuals of style as such, but I think there's a parallel with the argument summaries listed under phrases like WP:Snowball and WP:Not. Although people often forget, they're not arguments in and of themselves, just summaries or collections of other arguments that might be valid ones to use in reaching a consensus for a particular question on a particular page. Similarly, as Septentrionalis suggested above, a listing of the various reasons that different manuals (and different discussions at Wikipedia) have come to different conclusions, or to similar conclusions, about some question of usage, is extremely valuable, but (in the absence of some unarguable danger of confusion, mistake, error, ambiguity or unreadability) the actual context of the particular article, especially the needs of that article's likely readers, rather than some Universal ('bot-enforced) Wikipedia Rule, should be the basis for any decision that might be needed. —— Shakescene (talk) 00:57, 21 May 2009 (UTC)

Responding to Septentrionalis' point, I believe the value for Wikipedia of a MoS lies in a few areas, which include:

  • Avoiding edit wars. (Edit wars over style have typically been vicious, incomprehensible to onlookers, & lame.)
  • Achieving some degree of consistency in Wikipedia articles. (For example, order of end material sections.)
  • Providing guidance for perplexed editors. (AFAIK, the examples of Manuals of Style quoted above by Septentrionalis & Shakescene do not discuss proper format for article titles, hyperlinks, nor naming conventions.)

I would hope that these points are enough of a chore for anyone. The secret to a minimum of stress & frustration on Wikipedia is not knowing how to succeed in disputes, but knowing how to avoid them in the first place.

Trying to make the MoS proscriptive can only end in frustration. Who is going to ban an editor from Wikipedia for, say, repeatedly splitting infinitives? Who would avoid being banned for complaining about it endlessly? -- llywrch (talk) 21:41, 21 May 2009 (UTC)

  • Waste of time I think that this conversation is basically one big chatty discussion about pet peeves and irritations. Honestly, people, there's the village pump for this kind of stuff, and I hear there's an IRC chat too. If PMA wanted to (keep) dispute(ing) "90%" of MOS, then he could do that line-by-line and point-by-point. If you have a specific proposal for changes, I suggest taking it to another section. As for PMA's "goodbye", it's worth remembering what that usually means in wikiland. WhatamIdoing (talk) 21:55, 21 May 2009 (UTC)[reply]
I have seen dozens of specific proposals made; almost all of them have been greeted with the same falsehood: whatever MOS says is immemorial consensus, even if only one or two editors can actually be found who support it or indeed understand it. (I have made some in the course of this discussion, and they stand unresponded to.) I do not have more energy to waste on this WP:OWN and WP:CREEP violation; but I would like to see it cleaned up in a year's time, and salute those who seem willing to do so. Septentrionalis PMAnderson 02:51, 22 May 2009 (UTC)
    • This is about the broad picture, not over the hundreds of different details (which are just mentioned in the discussion immediately above for illustration). There are two (and of course more) different approaches to how a manual of style should be employed, how much detail it needs, and how prescriptive or permissive it should be. The editors on both sides are hard-working and sincere, with the best interests of Wikipedia, its editors and its readers at heart, but when we argue about these questions in the context of one of the hundreds of different details, the debate can degenerate into personal sniping and irritation (exactly what everyone wants to avoid). And where better to discuss approaches to the Manual of Style than on its Talk Page? —— Shakescene (talk) 23:40, 21 May 2009 (UTC)[reply]

Wikipedia:Manual of Style (Japan-related articles) is highly useful and our Japan-related articles would be a mess without it. Do not touch. --Apoc2400 (talk) 15:09, 22 May 2009 (UTC)

That's actually a fairly good guideline; it abides by consensus, and provides explanations; when it chooses between systems of romanization, it chooses a widely intelligible one, and says why. Most of it is a straight application of WP:Use English; it has shortfalls as a page-naming convention, but it seems to be being ignored where it is over-eager, as indeed it ought to be. Septentrionalis PMAnderson 15:42, 22 May 2009 (UTC)[reply]
I was consciously and (at least once) explicitly excluding the whole class of language-related pages from my own discussion of the family of two-dozen MoS pages and sub-pages, because those linguistic/national pages really deal with very different and very specific issues that most English-Wikipedia editors won't often encounter (unlike, for example, WP:MOSNUM or WP:MOSICON). —— Shakescene (talk) 05:44, 23 May 2009 (UTC)[reply]

Length of time in years

The length of time sometimes gets put into Wikipedia, which has a potential to lead to incorrectness. For example, there are many pages which say something like: "She has been married to her husband for twenty-eight years."

The problem with such a statement is that in five years from now, who will know how many years those people have been married for? Therefore, I can think of a couple of ways to resolve this problem:

  • Wikipedia includes a mechanism for inserting a hidden date, and automatically updates the duration.
  • Wikipedia style guides against such a statement in favor of: "She has been married to her husband since 1991."

Is this really an issue? Or is there another way to improve the durability of Wikipedia? It just seems like Wikipedia's not going to remain fresh if people continue to write the way that they are. Twocs (talk) 07:48, 23 May 2009 (UTC)[reply]

The "Precise language" section of the MoS already suggests avoiding such statements. If you want to use auto-updating counts, "for {{age|1991|03|17}} years" outputs "for 33 years", automatically updated on 17 March every year. (I don't know if a similar template spelling out the number exists.) But personally I prefer "since 1991". --A. di M. (formerly Army1987) — Deeds, not words. 14:34, 23 May 2009 (UTC)
{{Numtext|{{age|1991|03|17}}}} gives "thirty-three". See {{Numtext}} for a statement about overhaul/subtemplates. Sswonk (talk) 19:59, 23 May 2009 (UTC)[reply]
Awesome. Thanks! Twocs (talk) 08:13, 24 May 2009 (UTC)
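For anyone curious what an auto-updating count like the one above has to compute, here is a minimal sketch in Python. It is not the source of {{age}}; the function name `completed_years` is made up for illustration. It counts whole years between two dates and only adds the current year once the anniversary has been reached.

```python
from datetime import date


def completed_years(start: date, today: date) -> int:
    """Return the number of whole years elapsed from start to today.

    Hypothetical helper for illustration; not the {{age}} template's code.
    """
    years = today.year - start.year
    # Subtract one if this year's anniversary has not yet been reached.
    if (today.month, today.day) < (start.month, start.day):
        years -= 1
    return years


# Example: a marriage dated 17 March 1991, viewed on the day of this thread.
print(completed_years(date(1991, 3, 17), date(2009, 5, 23)))
```

The count changes on 17 March each year, which is why "since 1991" stays correct with no computation at all.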

About U.S. Customary units in US-specific articles

Perhaps the guidelines have to be changed in favor of SI as en-wiki isn't us-wiki and only Americans have feeling of U.S. Customary. SkyBonTalk\Contributions 19:22, 23 May 2009 (UTC)[reply]

There's a good reason for keeping U.S. customary units in U.S.-related articles, apart from keeping Americans happy! The sources for such articles will usually use customary units, and converting to SI can be misleading. For example, imagine a distance quoted as 1 mile; we would usually convert that as 1.6 km, but the real distance might be 1.5 km or 1.7 km (both of which we could convert as 1 mi, or we could be more precise). If we write the distance as "1 mile (1.6 km)", it is unambiguous that the metric measurement is a conversion of the customary measurement, and not the other way round. Physchim62 (talk) 07:55, 26 May 2009 (UTC)
I completely agree. Saying that something was "2.54 (or even 2.5) cm (1 inch)" long or weighed "0.45 kg (1 lb)" can indicate misleading precision (as I said in some recent discussion that may now be archived). —— Shakescene (talk) 08:02, 26 May 2009 (UTC)
I agree with Physchim62 and Shakescene, and (drifting slightly off-topic) would add that this is a more general problem. It's not uncommon to see converted values given to more precision than can reliably be inferred from the source. Just as a made-up example, something like 1 mile (1.61 km).
I don't know that the MOS can really do much about this latter problem, though, as it's not really a style issue. Anyone who understands units and physical quantities will already know not to do this, and for those who don't, it's not intuitive that they should look in a style manual to find out about it. --Trovatore (talk) 09:39, 26 May 2009 (UTC)[reply]
Actually, I would say that a style manual is probably the best place to advise editors about this point. I was unaware of this issue until another editor pointed it out to me, and to be honest I am still not entirely sure how much precision a converted quantity should be expressed in. I have been operating on the assumption that the converted quantity should generally be expressed to the same precision as the original quantity, though if the original quantity is an integer (e.g., 1 mile) it is acceptable to express the converted quantity (e.g., 1.6 km) to an additional decimal place. Is this correct? If so, the {{Convert}} template can be used as follows: "{{convert|1|mi|km|1}}". — Cheers, JackLee talk 10:31, 26 May 2009 (UTC)[reply]
Yes, the {{convert}} template is a great tool for non-specialists, and is maintained by people who know what they're talking about (even if I don't always agree with them on the finest of fine details!). Adding a single extra significant figure to the conversion, as in "1 mile (1.6 km)" is fine, and advisable if the converted measurement is less than about five units. If there is a demand, we can get some more specific guidelines together: however, I'm not sure that there is a huge problem that needs solving here. Physchim62 (talk) 12:05, 26 May 2009 (UTC)[reply]
I think it depends on the kind of measurement: if it is a nominal (defined) value, as in two miles or pint glass, then it's OK to give one or two more significant digits, e.g. 3.2 km and 0.57 l (or even 3.22 km and 0.568 l); if it is an estimate or an order of magnitude, as in "X is a village about 2 miles north of Y", then converting to 3.2 km rather than to 3 km makes very little sense. (In this particular example, the value is intrinsically indeterminate to within that level of precision, unless you specify which point of each settlement you're considering, or both settlements are less than 50 metres across.) I would tolerate an exception if the significand of the conversion is 1, to avoid rounding for example 1.4 down to 1, but refusing to round 4.8 up to 5 when the source doesn't actually justify such a precision is pointless. The template as it currently exists yields "15 kilometres (9.3 mi)" using its default precision, which is ridiculous. (On the other hand, I don't like the paranoia by which the template reduces the precision by one digit when the conversion factor is greater than 2, yielding "84 kilograms (190 lb)" and "53 inches (130 cm)" rather than "84 kilograms (185 lb)" and "53 inches (135 cm)"; I would increase the threshold to 10.) --A. di M. (formerly Army1987) — Deeds, not words. 00:22, 27 May 2009 (UTC)
BTW, if the source value is a defined value, I think it is generally better to explicitly state whether or not the conversion also is: "210 mm × 297 mm (approx. 8.28 in × 11.69 in)" or "8.5 in × 11 in (215.9 mm × 279.4 mm exactly)". --A. di M. (formerly Army1987) — Deeds, not words. 00:32, 27 May 2009 (UTC)[reply]
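To make the precision question above concrete, here is a small Python sketch of rounding a converted value to a chosen number of significant figures. It is not how {{convert}} is implemented; `round_sig` and `convert` are hypothetical names. The only facts assumed are the exact definitions 1 mile = 1.609344 km and 1 lb = 0.45359237 kg.

```python
from math import floor, log10

KM_PER_MILE = 1.609344   # exact by definition of the international mile
KG_PER_LB = 0.45359237   # exact by definition of the avoirdupois pound


def round_sig(value: float, sig: int) -> float:
    """Round value to sig significant figures (hypothetical helper)."""
    if value == 0:
        return 0.0
    # Position of the last digit to keep, relative to the decimal point.
    digits = sig - 1 - floor(log10(abs(value)))
    return round(value, digits)


def convert(value: float, factor: float, sig: int) -> float:
    """Multiply by a conversion factor, then round to sig significant figures."""
    return round_sig(value * factor, sig)


# "1 mile" has one significant figure; one extra digit gives 1.6 km.
print(convert(1, KM_PER_MILE, 2))
# "about 2 miles" kept at one significant figure gives 3 km.
print(convert(2, KM_PER_MILE, 1))
# 84 kg at two versus three significant figures: 190 lb versus 185 lb.
print(convert(84, 1 / KG_PER_LB, 2), convert(84, 1 / KG_PER_LB, 3))
```

The choice of `sig` is the editorial judgment being debated here; the code only shows that the same source value yields 190 lb or 185 lb depending on that choice.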

En-dashes and image file corruption

File:Henry Fuseli — Hamlet and the Ghost.JPG
Corrupted version: Horatio, Marcellus, Hamlet, and the Ghost (Artist: Henry Fuseli 1798).
File:Henry Fuseli- Hamlet and his father's Ghost.JPG
Correct display, with hyphen in filename: Horatio, Marcellus, Hamlet, and the Ghost (Artist: Henry Fuseli 1798).

Mostly copy/pasted from a post to FAC talk where this is also relevant: from 17 May until today, two illustrations at an existing featured article failed to display because an attempt to implement MoS compliance corrupted the filenames (which contained hyphens).[6] The surprising thing is that this filename corruption remained uncorrected for so long at the article Hamlet. The problem was quite noticeable and prominent: redlinks appeared in place of these images and one of them was very high on the page. After correcting the problem, posting to article talk, and notifying the user who performed these edits, I am also posting here because this went unnoticed and unfixed for a week at Shakespeare's most famous play: one wonders how many other articles (featured or otherwise) may have been damaged in a similar manner. On my other account I caught similar problems at non-featured articles several months ago. Respectfully submitted, Hamlet, Prince of Trollmarkbugs and goblins 18:07, 24 May 2009 (UTC)

[Aside:] I similarly wonder all the time why the Wikimedia software, unlike almost all similar software, is case-sensitive, which vastly complicates, and often frustrates, searches and wikilinking, as well as pulling up the correct image or file. —— Shakescene (talk) 18:20, 24 May 2009 (UTC)[reply]
Well, most of these images are hosted at Commons. That project is vastly multilingual, so we can't tell them to stop using hyphens in filenames. Hamlet, Prince of Trollmarkbugs and goblins 18:28, 24 May 2009 (UTC)[reply]
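One way a dash-fixing script could avoid this kind of damage is to skip file names when it substitutes dashes. The Python sketch below is only an illustration under stated assumptions: it is not the script that edited Hamlet, the function name is invented, the file name in the example is hypothetical, and it only recognises `[[File:…]]` and `[[Image:…]]` links (not galleries or infobox parameters).

```python
import re

# Matches the target of a file link up to the first "|" or "]",
# e.g. "[[File:Some - name.JPG". Assumption: links use File: or Image:.
FILE_TARGET = re.compile(r"\[\[(?:File|Image):[^|\]]*", re.IGNORECASE)


def dash_fix_outside_filenames(wikitext: str) -> str:
    """Replace spaced hyphens with spaced en dashes, leaving file names untouched."""
    out = []
    pos = 0
    for m in FILE_TARGET.finditer(wikitext):
        # Text before the file name may be changed.
        out.append(wikitext[pos:m.start()].replace(" - ", " \u2013 "))
        # The file name itself is copied verbatim.
        out.append(m.group(0))
        pos = m.end()
    out.append(wikitext[pos:].replace(" - ", " \u2013 "))
    return "".join(out)


# Hypothetical file name, used only to show the behaviour.
sample = "[[File:Example - picture.JPG|thumb|Act 1 - the Ghost]] pages 1 - 2"
print(dash_fix_outside_filenames(sample))
```

The caption and the running text get en dashes, while the file name keeps its hyphen, so the image link still resolves.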

Article Titles

I think some mention should be made in this section of when italics should be used in the title. For example, species and genera should have their titles in italics. This can be done either by removing the "name" section from the taxobox if the article name is the same e.g. [7] or by adding {{italictitle}} if the name section is different (e.g. it's a common name in the taxobox) or there is a bracketed part in the title e.g. on Homo (genus). I'm not sure, but should books and songs also have italicised titles? I assume using {{italictitle}} would be easiest for this too. Smartse (talk) 08:45, 27 May 2009 (UTC)

Where is the subject-verb agreement subsection of Grammar[?]

I need MoS guidance on how Wikipedia treats plural-only noun forms: "I bought a pair of shoes yesterday." Shall I treat the subsequent pronoun as plural (the shoes) or singular (the pair) for Wikipedia style? If there is a more specific Wiki article on subject-verb agreement rules, I'd like some direction to it. How does Wikipedia treat collective nouns? I'm just seeking to know what Wikipedia wants. Fdssdf (talk) 23:12, 27 May 2009 (UTC)[reply]

In my experience, there are some circumstances in which a collective noun should be treated as singular, and other circumstances in which it should be treated as plural, depending on whether you're discussing the collection as a unit, or discussing its constituent elements. I generally rely on my instincts as a native speaker to tell me when to do which, and only in a very clear-cut situation would I be likely to take action when challenged on it — the best action being to rephrase the entire passage so as to avoid the question (an excellent way to achieve consensus, assuming you can find a strong enough alternative).
That said, there is actually a rather laughable claim in the MoS about this, at WP:PLURALS, that in North American English such words are invariably plural. Having lived all my life in North America, I know this to be codswallop. Pi zero (talk) 01:43, 28 May 2009 (UTC)[reply]

E.g.: Politics: "Politics is a controversial subject." Cf. the Wikipedia lead para., beginning: "Politics is the process by which groups of people make decisions." To me these constructions illustrate correct grammar (subject-verb agreement); however, I hear all around me (in the U.S.) "Politics are...." constructions, as if Politics were a plural term. (One needs to look up the difference between the adjective politic and the noun politics (which happens to end in an s, so that there is no distinction between the singular version politics and the plural version politics, as in the example given in Wiktionary: "What are your politics?"). Lack of sufficient education of Americans in grammar and syntax (in high school) makes many errors of this kind unnoticeable to many Americans. This is not, in my view as an American professor of English language and literature, a matter of "varieties of English" or one's personal "instincts"; this is a matter of knowledge of current conventional usage rules of English grammar and syntax. One needs to use a dictionary and a style guide when in doubt, not depend on Wikipedia's ("optional") style guidelines in its Manual of Style. Some matters are not "stylistic"; they are matters of conventional and/or current usage defined in the most up-to-date books on grammar and syntax. Such discussions in Wikipedia's own MOS need to be documented with verifiable citations to reliable sources, following WP:CITE. Otherwise, Wikipedia's editors and readers cannot consider them dependable (reliable). --NYScholar (talk) 02:46, 28 May 2009 (UTC)[reply]

For the general acknowledgment that Wikipedia editors and readers need to consult style guides for "details" (including such matters of subject-verb agreement) of style, grammar, syntax, etc., see WP:MOS#Further reading. There is an assumption here that Wikipedia's editors and readers have knowledge of basic conventions of current usage of grammar, mechanics, and style, and that, if they do not have such knowledge, they need to do such "further reading" about matters that Wikipedia's style manual may not be discussing in "detail" and/or about matters that Wikipedia's own editors and readers may be engaged in disputing. (See current talk page above and the lengthy archives of this project talk page.) --NYScholar (talk) 02:56, 28 May 2009 (UTC)[reply]
WP:MOS#Further reading  encourages editors to be familiar with other style guides; that's not at all prescriptivist. The objective here is to achieve phrasing for Wikipedia articles that is clear, professional, and fosters a stable consensus. We want the vast majority of editors at any given article to be able to agree that the phrasing there is good English in the formal register. If two fluent native speakers disagree on a point, and find each other's choice of phrasing truly cringe-worthy, then it would actually make the situation worse to have WP:MOS weighing in on the question: those two editors need to iron out a solution they can both live with, which is most likely to be something different than either of them had originally envisioned — and that's much less likely to happen if WP:MOS says that one of them is "right". A good consensus solution would leave both editors satisfied with a job well done; WP:MOS ramming a resolution down one of their throats would be apt to diminish that editor's commitment to Wikipedia, which in the long run, averaged over many such situations, is far more damaging to Wikipedia than any particular phrasing of the article would have been. Pi zero (talk) 05:34, 28 May 2009 (UTC)[reply]
I really do not understand some of the emphases in your response. I did not say anything about being "prescriptivist". I understand that "Further reading" is to provide "guidance" to those who feel that they need it. The template for "Style"--{{Style}} makes it very clear that there are Style guides for more information about "details" of writing and editing that Wikipedia's own MOS (WP:MOS) does not cover. (Please see the rest of my comment, which is only a series of observations from my own perspective; it is not intended to be "prescriptive", but just to present a perspective gleaned from contributing to Wikipedia since June 2005.) --NYScholar (talk) 17:12, 28 May 2009 (UTC)[reply]
I don't find "what is cringe-worthy" in any way a practical (or practicable) guideline: what makes one person (and all readers are people) "cringe" may not make another person "cringe"; that is no guideline at all. There is much in Wikipedia that makes me "cringe"; but I do not comment on it all, because I know those things may not make others "cringe". (I cannot say "that makes me cringe" as a means of convincing a student, writer, or another editor of the lack of feasibility of a written construction; I have to point to a reliable style guide [or dictionary, depending on matter at hand] for them to refer to as a source.) Cringing is clearly relative to one's personal, cultural, and educational background, professional training and experience, and knowledge; "cringing" is highly subjective. [Where is so-called cringe-worthiness presented in Wikipedia as either a policy or a guideline? I never noticed the ref. to it before seeing this talk page. (Maybe I missed it in WP:LOP?)] --NYScholar (talk) 17:50, 28 May 2009 (UTC)[reply]

Back to the example given by the initial poster of this section: re: "a pair of shoes" (notice the indefinite singular article a) and shoes: the former is singular in number ("a pair"), while the latter is plural in number ("shoes"). In the syntax (order of words) and grammar of an English sentence, regarding "subject-verb agreement", one generally would encounter agreement of the verb with the subject in number (not with the direct object or indirect object). (Note that "subject-verb agreement" or "subject-pronoun agreement" are different kinds of agreement in English grammar.) (cont.)

Concerning the poster's original question about how to refer back to the sentence subsequently: the number of the pronoun reference depends on whether the writer is referring back to the entire pair of shoes or to the shoes; it is up to the writer. In this case, it appears to me, it (referring to "a pair of shoes" or to "the pair of shoes") would clearly refer to the entire pair, whereas they refers back to shoes in the sentence (i.e., to both "shoes"). --NYScholar (talk) 20:09, 28 May 2009 (UTC)
[In reply to a specific other question raised by the initial poster: for a very rudimentary [and incomplete] discussion in Wikipedia of some kinds of "agreement", see Verb#Agreement; a simple search for "subject-verb agreement" turned it up. --NYScholar (talk) 23:55, 28 May 2009 (UTC) (Please see also the templates at top of that article.) --NYScholar (talk) 00:01, 29 May 2009 (UTC)][reply]
[See also another templated related subject article: Agreement (linguistics). --NYScholar (talk) 00:05, 29 May 2009 (UTC)][reply]

[…So as not to take up so much space, I'm moving a long ("hidden" templated) comment that I wrote over an extended period of time to my own user space: one can find it in my sandbox, accessible from either my user page or my talk page headers (if one wants more information related to the matter of MOS re: "agreement" and/or "further reading"). I'll leave it in my sandbox for a while and remove it later. Otherwise one can find it in the editing history. --NYScholar (talk) 00:32, 29 May 2009 (UTC)][reply]

Protection

This came up on WP:RFPP and I protected for edit warring for 2 weeks. Unfortunately, this locks out everyone as well.

If the two folks involved would be willing to agree here to not edit war again, I'll happily unlock the page, with the provision that if you two go at it again, either myself or someone else will block you.

What do you say? rootology/equality 20:46, 29 May 2009 (UTC)

Per a good faith request on RFPP, I've unprotected the page. Any further edit warring can and will likely lead to blocks. Please discuss, not edit war. rootology/equality 21:28, 29 May 2009 (UTC)

This is what I proposed, which led to the unblock:
  1. Warn the two warriors that they will be blocked if they edit the MoS, and that the dispute should be resolved by consensus on the MoS talk page.
  2. Unblock the MoS.
  3. The section involved should be restored to what it was before the current dispute, and should not be changed in substance unless a clear consensus for change is reached on the MoS talk page. The consensus, if reached, should be implemented by editors other than the warriors.
  4. Warn everyone, on the MoS talk page, that further editing of the section involved that does not conform to this resolution will result in blocks of the offending editors.

I am creating a subsection below for resolution of the content dispute. Will someone who is familiar with the dispute please make the subsection heading more specific? Thank you. Finell (Talk) 21:39, 29 May 2009 (UTC)[reply]

What edit war?

What edit war? I got a message from Root, so I suppose I'm meant to be involved. Darkfrog24 (talk) 22:23, 29 May 2009 (UTC)
I see that someone's removed my changes, so I'm going to assume that my addition of the lines regarding Wikipedia's use of quotation marks is what this is about. Please correct me if this is not the case.
I posit that this is not an edit war. The first time I typed in my changes, I mistakenly wrote "inside" instead of "outside." In all likelihood, this is why they seemed untrue to Kotniski. However, since I had not yet noticed the typo, I responded to K (in the change description; see History) as if K had been responding to the fact that the stop rule is not considered correct in American English. Also visible in the page history, I noticed the typo shortly after and corrected it.
It may be relevant that I did answer Kotniski's question "why??" but it seemed more appropriate to do so on K's talk page than here.Darkfrog24 (talk) 22:51, 29 May 2009 (UTC)[reply]
I am also assuming that this message does not refer to my correction of the passage regarding quotation marks and punctuation to address proper usage of colons and semicolons, but please confirm. Darkfrog24 (talk) 22:59, 29 May 2009 (UTC)[reply]
In restoring the section to an earlier state, I interpreted the duration of the edit war to include the entire consecutive sequence of edits by the two parties involved. It did seem to me as I was reversing them that, absent an edit war, most (perhaps all) of them would not have stood anyway without getting consensus first. Pi zero (talk) 23:36, 29 May 2009 (UTC)[reply]
Seems reasonable on your part. Still not sure what this is about myself.Darkfrog24 (talk) 23:39, 29 May 2009 (UTC)[reply]
(Double take) What, not even the one about colons? "Punctuation" implies "all punctuation," which is not the case. It's my understanding that colons and semicolons go outside the quote marks in both British and American English. And what about the parallel construction? If Wikipedia did the same thing both times, select a way of doing things that's accepted in some places but not others for technical reasons, then why wouldn't a decision to describe them in the same way pass scrutiny? Darkfrog24 (talk) 01:03, 30 May 2009 (UTC)
Wikipedia has not, so far as I can see, selected a way of doing things that's accepted in some places but not others (at least, not in this case). It has adopted its own house style for technical reasons, and since it says all punctuation (I agree that that is what the current wording means), that is the house style unless consensus is successfully built to change it. Note that strictly according to that house style, it would be very unusual for a colon to go inside the quotation marks anyway; your example with a colon outside the quotation marks is, in fact, consistent with the house style.
BTW, the practice of tagging things as British or American seems potentially inflammatory, though obviously not intended to be; they aren't British or American, they're the house style. I might support a proposal (depending on its particulars) to add a sentence... somewhere... stating explicitly that these things are the house style, independent of what variety of English may be preferred by any given article. The wording and placement of such a sentence would be tricky, and seeking consensus would be a very wise discretion. Pi zero (talk) 02:10, 30 May 2009 (UTC)[reply]
By "acceptable in some places but not in others," I mean that Wikipedia's policy of using the stop rule for commas and periods with quote marks is acceptable in British English but not in American. Since Wikipedia has done this twice--once with commas, periods and quotes and once with choosing double quotes over single--it makes sense to describe them both the same way. You will note that the section on double quotes includes an explanation of why this form was chosen. I simply changed it so that it matched the one in the section about commas, periods and quote marks.
Also, if placing colons outside the quote marks is the house style, then the MoS should reflect that. Right now it says to put "punctuation," which I read as "all punctuation," inside or outside the quote marks depending on the stop. Because neither British nor American English permits this in the case of colons and semicolons, I figured that it was an oversight on the part of some previous contributor. One of the changes I made earlier today corrected this.Darkfrog24 (talk) 02:18, 30 May 2009 (UTC)[reply]
To be clear here (although the point has been expressed in other words elsewhere): Saying that the house style usually causes a trailing colon to go outside the quote marks is different from saying that it always does so. It usually does so, but only because a trailing colon usually isn't part of what is being quoted. The particular example you had added to the MOS is one in which logical quotation happens to be in agreement with the British and American styles. The reason you gave in the example is incorrect under the current house rules: the colon doesn't go outside in that example because it is a colon; rather, it goes outside because it is not part of the thing being quoted (i.e., the word "gender"). Pi zero (talk) 16:24, 30 May 2009 (UTC)[reply]
Found a source on the matter. See below. Matters of advisory notes, parallel construction and what exactly the war is supposed to be about still stand.Darkfrog24 (talk) 17:55, 30 May 2009 (UTC)[reply]

Request for clarification

It was User:MBisanz who asked for, and briefly obtained, temporary full protection of the project page because, he said, "Two experienced users edit warring." If there is really any question about who was edit warring and what they were warring about, please ask User:MBisanz. Evidently User:Pi zero knows who the two warriors are. Otherwise, please identify the subject of the dispute and discuss it below the next heading, so some progress can be made. If it is the subject of the MOS's prescription of so-called logical placement of punctuation inside or outside quotations, that subject was discussed to death above, there was no consensus to change the long-standing policy, and that section of the MOS should be restored to what it was before without further pointless argument. Editors' denials of an edit war or their participation do nothing to advance the project. Admins generally do not block a page for edit warring in the absence of an edit war. Thank you. Finell (Talk) 00:02, 30 May 2009 (UTC)

No, the changes that I made that were reversed by Pi zero were not the subject of any discussion previously shown on this talk page and did not alter any of Wikipedia's rules. I only described regional status, added a line about semicolons and colons, and matched some phrases, as you can see either just above your own post or in the page history.Darkfrog24 (talk) 00:37, 30 May 2009 (UTC)[reply]
Update: MBisanz's talk page says that said user is taking a wikibreak and won't be back until June 1. Darkfrog24 (talk) 13:20, 30 May 2009 (UTC)
My complaint was about [8] and [9] which wholesale undid another user's contributions while there was no concurrent discussion at the talk page about the actions. MBisanz talk 21:38, 30 May 2009 (UTC)[reply]
Thanks for clearing that up, MBisanz.Darkfrog24 (talk) 22:05, 30 May 2009 (UTC)[reply]
I answered Kotniski's question on said user's talk page, here: [10]. This only pertains to the line about single quotes vs. double. If you look in the page history, you'll see that Kotniski and I had no further dispute on the other one. I'm confident that this matter was put to rest, at least as far as the two of us are concerned, before the page was blocked. Darkfrog24 (talk) 22:09, 30 May 2009 (UTC)

Discussion of dispute concerning {please insert subject of dispute}

Colons and semicolons with quotation marks

The section on quotation marks with other punctuation says "Punctuation," implying all punctuation, is placed in or outside the quote marks according to the stop rule. However, in both British and American English, colons and semicolons are always placed outside the quote marks. It seems likely that the phrasing is an oversight on the part of some previous contributor. What do you guys think of "Colons and semicolons are placed outside the quotation marks. Question marks, exclamation points, periods and commas are placed inside the quotation marks only if the sense of the punctuation is part of the quotation." Darkfrog24 (talk) 04:15, 30 May 2009 (UTC)

See Quotation mark#Punctuation: "In both styles, question marks and exclamation marks are placed inside or outside quoted material on the basis of logic, but colons and semicolons are always placed outside." -- Wavelength (talk) 05:42, 30 May 2009 (UTC)
What's the point of describing styles that we don't prescribe? In logical quotation, colons and semi-colons are treated identically to any other punctuation, so there's no need to talk about them specially. Ilkali (talk) 10:02, 30 May 2009 (UTC)
Logical quotation is neither the British style nor the American style. Logical quotation is sometimes used by scientific and technical publications because of its precision, regardless of what variety of English they use. Both the British style and the American style treat colons and semicolons differently from other punctuation, but logical quotation does not treat them any differently than any other punctuation. Some of the phrasing over at Quotation mark#Punctuation may be too easily misunderstood on this point. The term logical quotation there is presented in a sentence that also mentions typesetters' quotation, apparently another name for the American style, and then in describing use of logical quotation it says that it is sometimes used "even in the U.S.". The phrase "even in the U.S." apparently assumes that the use of logical quotation in the U.S. is more surprising than its use elsewhere, since logical quotation deviates more from American style than it does from British style; but the phrase might be misconstrued as equating logical quotation with the British style, and the seeming contrast of logical quotation with typesetters' quotation doesn't do anything to disabuse readers of that impression. Perhaps the article section Quotation mark#Punctuation could be made clearer, but this isn't the place to get into that. Pi zero (talk) 13:12, 30 May 2009 (UTC)
With regard to just colons and semicolons in this MoS for now, it still looks like "Punctuation" is just a typo. I'm checking the NASA style guide, British sources, U.S. sources, scientific societies, anything I can find online, and so far no one ever seems to put colons or semicolons inside the quotation marks, regardless of how they treat periods and commas. Many of the style guides aren't available online, though. Darkfrog24 (talk) 13:44, 30 May 2009 (UTC)
I took a search through the archive of this page. It doesn't seem that this matter has come to anyone's attention before. Suggests that colons/semicolons are easy to overlook. Darkfrog24 (talk) 13:53, 30 May 2009 (UTC)
Suggests that they follow the same rules as everything else in logical quotation and don't merit special mention. Ilkali (talk) 14:09, 30 May 2009 (UTC)

That's the beauty of logical quotation: it's logical. There's nothing special about colons or semicolons. The rule is simple: what's part of the quote goes within & what's not goes outside. JIMp talk·cont 17:38, 30 May 2009 (UTC)

After much searching, I have found one source that conforms with this position. Whether or not Wikipedia should employ this policy is another question, but for now I'm satisfied that "Punctuation" was not necessarily a typo. Darkfrog24 (talk) 17:53, 30 May 2009 (UTC)
OK, I'm having a little trouble following this. In logical quotation, the only time the last symbol before the end-quote would be a semicolon is if that's what the person being quoted wrote. It's a little hard to imagine such a case. Maybe something like
The note continued, "my typewriter's period key is broken, so I must end this missive with a semicolon;".
but I think we can agree that this is sort of bizarre. The question that could arise is, in the non-logical style (don't like to call it American, because I love this country), are semicolons ever transported inside the quote, even though they would logically be outside? That is, would you ever write
He said, "when you shoot, shoot — don't talk;" however, he did not in fact shoot.
?
The answer I was taught in school is no, but my brother-in-law's dissertation moved semicolons inside the quotes in this manner and he claimed the MLA backed him up. --Trovatore (talk) 23:22, 30 May 2009 (UTC)