Wikipedia talk:Manual of Style: Difference between revisions

From Wikipedia, the free encyclopedia
Yndajas (talk | contribs)


--[[User:Yndajas|Yndajas]] ([[User talk:Yndajas|talk]]) 22:42, 28 July 2018 (UTC)

* The short answer is no. The only reason Wikipedia can include detailed historical records about otherwise non-notable people (such as the per-ward results) is that they faithfully reflect the sources used. If the election council for York puts out a statement updating their records, I'd support updating it. As it is, I would much rather remove the entire section rather than make a name change that isn't supported by [[WP:PSTS|secondary sources]] (just the person's own statements); it isn't obvious how to prove that the person changing their name is the same one that ran for office. The emotional issues aren't relevant; this person's [http://andylaw.co own website] uses their own name. [[User:power~enwiki|power~enwiki]] ([[User talk:Power~enwiki|<span style="color:#FA0;font-family:courier">π</span>]], [[Special:Contributions/Power~enwiki|<span style="font-family:courier">ν</span>]]) 23:18, 28 July 2018 (UTC)
** <small>As a side note, this isn't really the right forum, but as this is a new user who was directed here, I see no reason to move this discussion. [[User:power~enwiki|power~enwiki]] ([[User talk:Power~enwiki|<span style="color:#FA0;font-family:courier">π</span>]], [[Special:Contributions/Power~enwiki|<span style="font-family:courier">ν</span>]]) 23:18, 28 July 2018 (UTC)</small>

Revision as of 23:18, 28 July 2018

HTML entities

Greetings all, I'm currently updating the style-checking code that reports to Wikipedia:Typo Team/moss, and I need some clarity on which HTML character entity references (things like &amp;) are allowed or preferred. Variations that are not allowed or which are disfavored would be brought to the attention of human editors, along with other suspected style and spelling errors. There are occasional mentions of such entities in the Manual of Style, but no general rules that I could find. I would propose the following:

HTML character entity references

(edited to reflect the below comments)

HTML character entity references are a way to tell a web browser to render a certain character without including that character in the web page directly. Characters may be referenced by name, decimal number, or hexadecimal number. For example, "&euro;" is the same as "&#x20AC;", "&#8364;", or including the character "€" directly. For a comprehensive list, see List of XML and HTML character entity references. Wikipedia editors are encouraged to follow these guidelines to make it easier for editors to read and understand wikitext, especially those not familiar with HTML notation.
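As a quick illustration of the equivalence described above (not part of the proposal itself), here is a minimal Python sketch. It assumes only the standard-library `html` module; the variable names are made up for the example.

```python
import html

# Three spellings of the same character reference, plus the literal character.
forms = ["&euro;", "&#x20AC;", "&#8364;", "\u20ac"]

# html.unescape resolves named, decimal, and hexadecimal references alike.
decoded = [html.unescape(form) for form in forms]

# True when every form decodes to the same single character (U+20AC).
all_same = len(set(decoded)) == 1
print(decoded, all_same)
```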

  • In general, it is preferable to write characters directly instead of using an HTML entity reference. Wikipedia stores articles with Unicode, so any character that could possibly be referenced can also be input directly. The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards. Editors can also use the Unicode input method provided by their operating system.
  • Numeric references should not be used when there is a named reference available. For example, &minus; should be used instead of &#8722;
  • References must be used when the character itself cannot be used for technical reasons. For example, "]" cannot appear in wikilinks that use "[[" and "]]" to mark the start and end. The <nowiki> tag can also be used to prevent interpretation of special characters as wiki markup.
  • Named references are preferred when the characters themselves are easily confused. This includes:
    • Whitespace. The regular ASCII space " " should be typed directly, but entities should be used for others like "&nbsp;" and "&ensp;".
    • Dashes and similar characters. The regular ASCII hyphen-minus "-" should be entered directly, but other characters might be entered with entities. For example, &minus; is generally preferred because "−" looks very similar to "-" in some web browsers. See Wikipedia:Manual of Style § Dashes for more usage guidelines.
    • Prime (′) and related symbols that resemble quote marks
  • Other guidelines ask that the Unicode characters not be used at all (except when the character itself is being discussed):

Initial discussion

What do folks think? -- Beland (talk) 19:39, 14 July 2018 (UTC)[reply]

  • Another set of characters to avoid are the superscript-digits (at least when used with a mathematical meaning). See MOS:MATH#Superscripts and subscripts. —David Eppstein (talk) 19:46, 14 July 2018 (UTC)[reply]
    • Good catch, I'll add a link to the list of exceptions. -- Beland (talk) 21:10, 14 July 2018 (UTC)[reply]
  • I disagree that mdash isn't easily confused -- in some fonts it definitely is. I'd pretty much advocate that everything not on a standard English keyboard (whatever the "standard English keyboard" is) should be symbolically represented by either a & form or a template. And I'm a little worried that the typo team link at the start of the OP talks about flagging "violations of the Wikipedia:Manual of Style"; I fear this will slide all too easily into a project to blindly "fix violations". EEng 20:06, 14 July 2018 (UTC)[reply]
    @EEng: OK, I'll drop the emdash example. As for scope...well, this is already a project to fix violations of the Manual of Style and English spelling and grammar, though it's never done blindly. In some cases it would be safe to make a bot to make certain substitutions (like converting numerical to named references), but that would require approval by Wikipedia:Bot requests to make sure it didn't have any unwanted side effects. Not sure why that is something to be afraid of; if we think a certain form is better for editors, that seems useful. We don't do that for spelling mistakes because there could be a good reason to keep the misspelling. Could you explain a bit why you feel it's better for an editor to come across say, &trade; instead of ™ when opening an article for editing? -- Beland (talk) 21:00, 14 July 2018 (UTC)[reply]
    I'm fine with replacing numerical refs and &trade; and so on; in fact I welcome it because, as I mentioned, I generally think everything not on standard keyboards should be expressed symbolically in the wiki source. It's the vague statement at Wikipedia:Typo_Team/moss that you're gonna find "violations of the Wikipedia:Manual of Style" that worries me. I don't mind automatically identifying apparent "violations", but what worries me is that that might slide into automatic "fixes" – worried because MOS isn't rigid, it needs to be applied with common sense, exceptions apply, etc. EEng 21:24, 14 July 2018 (UTC)[reply]
    Re replacing characters with entities or the reverse: what I don't want to see is slow-motion edit wars where one group of editors or bots regularly replace characters by entities and a different group regularly replace entities by characters. That sort of thing just clutters watchlists for no good reason. So I'd rather either see a very clear specification of which things should be expanded and which should be left as unicode (probably difficult to attain consensus for) or (more likely) something like WP:RETAIN where edits of this type are discouraged. —David Eppstein (talk) 21:32, 14 July 2018 (UTC)[reply]
    Absolutely agree. A hard-won consensus in advance will consume 1/1000 the editor time and energy wasted on a zillion skirmishes and rage-reverts all over the project. And certainly some part of that consensus might be that some things come under RETAIN (though honestly the less RETAIN stuff we have the better). EEng 21:35, 14 July 2018 (UTC)[reply]
    An explicit list would be great for me, since I have to code that into software anyway. I'll whip up a table. FTR, as of April there were a grand total of 7 numerical references the moss software could find, and I changed all of them just now. -- Beland (talk) 01:15, 15 July 2018 (UTC)[reply]
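For readers wondering what the numeric-to-named substitution discussed in this thread might look like in code, here is a rough Python sketch. It is not the moss software's actual implementation; the function name `numeric_to_named` is hypothetical, and it relies on the standard-library table `html.entities.codepoint2name`, which only covers the HTML 4 names.

```python
import re
from html.entities import codepoint2name

# Matches decimal (&#8722;) and hexadecimal (&#x2212;) character references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

def numeric_to_named(wikitext):
    """Swap numeric references for named ones when an HTML 4 name exists."""
    def replace(match):
        hex_digits, dec_digits = match.groups()
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)
        name = codepoint2name.get(codepoint)
        # Leave the reference untouched when no name is known (e.g. &#91;).
        return f"&{name};" if name else match.group(0)
    return NUMERIC_REF.sub(replace, wikitext)

print(numeric_to_named("5 &#8722; 3 costs &#x20AC;2 &#91;sic&#93;"))
```

Because references without a known name are passed through unchanged, a report built on this would still need a human to decide what to do with the leftovers.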

The proposal should be revised to make it clear how it relates to the advice already in the MOS at WP:MOS#Keep markup simple,

An HTML character entity is sometimes better than the equivalent Unicode character, which may be difficult to identify in edit mode; for example, &Alpha; is explicit whereas Α (the upper-case form of Greek α) may be misidentified as the Latin A.

Also the proposal should indicate where this addition would go into the MOS; context matters.

The proposal contains the statement "The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards." That's only partially true; in the version I use, there are a variety of special characters to choose from, but when I hover over them, there isn't any little hint that pops up telling me what the name of the character is. So it is hard to be sure if a character is an n dash or a minus. In another case, it's hard to tell a prime from an apostrophe. I've learned to tell an n dash from a hyphen, but I'll bet there's lots of editors who can't. Jc3s5h (talk) 22:18, 14 July 2018 (UTC)[reply]

Hmm, thumbnails for special characters would make a great feature improvement for the web UI. I agree it's a bit of a pain; I always have to paste characters into a search engine to figure out what they are. If we're making a big table of what should be which, maybe it would need to be on its own subpage? I'm agnostic as to where this goes, and I'm open to suggestions; I don't think it matters as long as it's easy to find. -- Beland (talk) 01:15, 15 July 2018 (UTC)[reply]
FTR, I have filed a feature request for the popup text to include the character name at [1] for anyone who wants to comment or follow along at home. Thanks for the suggestion! -- Beland (talk) 06:57, 16 July 2018 (UTC)[reply]
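Until something like that exists in the UI, one way to identify a pasted character without a search engine is to ask a Unicode database for its name. A minimal Python sketch, assuming only the standard-library `unicodedata` module (the helper name `describe` is made up here):

```python
import unicodedata

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Look-alikes mentioned in this thread: hyphen, dashes, minus, apostrophe, prime.
for ch in ["-", "\u2013", "\u2014", "\u2212", "'", "\u2032"]:
    print(describe(ch))
```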

Second draft

(Edited to reflect the below discussion)

HTML character entity references are a way to tell a web browser to render a certain character without including that character in the web page directly. Characters may be referenced by name, decimal number, or hexadecimal number. For example, &euro; is the same as &#x20AC;, &#8364;, or including the character directly. For a comprehensive list, see List of XML and HTML character entity references [2].

In choosing between the numeric reference, named reference, and direct character methods, Wikipedia never uses the numeric reference when a named reference is available, and it usually prefers direct character input over named references (and edits in this direction are made by semi-automated systems like AutoWikiBrowser). For example, &minus; should be used instead of &#8722;, and é should be used instead of &eacute;. Wikipedia stores articles with Unicode, so any character that could possibly be referenced can also be input directly. The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards. Editors can also use the Unicode input method provided by their operating system. There are some exceptions where named references are preferred, to avoid confusion and to circumvent technical limitations. The <nowiki> tag can also be used instead of character escaping to prevent interpretation of special characters as wiki markup. These preferences are detailed in the table below, and some instances where a given character is preferably not used at all (except where that character is itself the topic of discussion) are noted. Wikipedia editors are encouraged to follow these guidelines to make it easier for editors to read and understand wikitext, especially those not familiar with HTML notation.
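To make the "direct character unless there is an exception" rule concrete, here is a small Python sketch. The keep-list `KEEP_AS_ENTITY` is purely illustrative and is an assumption of this example, not the set of exceptions proposed in this draft; the function name is likewise hypothetical. Entity names come from the standard-library `html.entities.name2codepoint` table (HTML 4 names only).

```python
import re
from html.entities import name2codepoint

# Assumption: an illustrative keep-list, not the exceptions agreed in this thread.
KEEP_AS_ENTITY = {"amp", "lt", "gt", "nbsp", "minus"}

NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def named_to_literal(wikitext):
    """Replace named references with literal characters, except kept ones."""
    def replace(match):
        name = match.group(1)
        # Unknown names and names on the keep-list are left exactly as written.
        if name in KEEP_AS_ENTITY or name not in name2codepoint:
            return match.group(0)
        return chr(name2codepoint[name])
    return NAMED_REF.sub(replace, wikitext)

print(named_to_literal("caf&eacute; &amp; 5&nbsp;&minus;&nbsp;3 &trade;"))
```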

Category | Preferred forms | Exceptions and notes
ASCII characters | ! " % & ' + < = > [ ] | Sometimes proximity to other characters causes misinterpretation of &, <, >, [, ], or ' as part of HTML markup or wiki markup. In these cases, use &amp;, &lt;, &gt;, &#91;, &#93; or &apos;.
Latin and Germanic letters | À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ Œ œ Š š Ÿ | Instead of ligatures (Æ, æ, Œ, œ) write two separate letters, except in proper names and in text in languages in which they are standard – see Wikipedia:Manual of Style § Ligatures.
Greek letters | Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ ς σ τ υ φ χ ψ ω ϑ ϒ ϖ | When written standalone (not part of a Greek word with other Greek characters), the following can be used to reduce confusion with similar-looking Latin alphabet letters: &Alpha; &Beta; &Epsilon; &Zeta; &Eta; &Iota; &Kappa; &Mu; &Nu; &Omicron; &Rho; &Tau; &Upsilon; &Chi; &kappa; &omicron; &rho;. μ (mu) and Σ (sigma) are nearly identical to µ (micro) and ∑ (sum), but the other characters are not used in Wikipedia so there is no potential for confusion.
Quote marks | &lsquo; &rsquo; &sbquo; &ldquo; &rdquo; &bdquo; &acute; &prime; &Prime; | ASCII quote marks are generally preferred. Wikipedia:Manual of Style/Dates and numbers § Specific units says not to use &prime; and &Prime; for inches and feet.
Dashes | –/&ndash; —/&mdash; &horbar; &shy; | &horbar; is not used by Wikipedia. For more info on &shy; (optional hyphen) see MOS:SHY.
Whitespace and non-printing | &nbsp; &ensp; &emsp; &thinsp; &zwnj; &zwj; &lrm; &rlm; | &ensp;, &emsp;, &zwnj;, and &zwj; are generally unnecessary. For more info on text direction, see MOS:RTL.
Math | × ÷ √ ∝ ∝ ¬ ± ∂ ∇ ℵ ℜ ℑ ℘ ∀ ∃ ∈ ∉ ∋ ∅ ∏ ∑ ∠ &and; (∧ confused with ^) &or; (∨ confused with v) ∩ ∪ ∫ ∴ ∼ ≅ ≈ ≠ ≡ ≤ ≥ ⊂ ⊃ ⊄ ⊆ ⊇ ⊕ ⊗ ⊥ ⌈ ⌉ ⌊ ⌋ &lang; (⟨ confused with <) &rang; (⟩ confused with >) | In some cases TeX markup is preferred to Unicode characters; see Wikipedia:Manual of Style/Mathematics § Typesetting of mathematical formulae. × (&times;) is used in article titles and also for hybrid species. ∑ (sum) should not be used; Wikipedia uses the nearly identical Σ (sigma).
Currency | ¢ £ ¤ ¥ € $ |
Non-English punctuation | ¿ ¡ « » &lsaquo; &rsaquo; | &lsaquo; and &rsaquo; are not used by Wikipedia; < and > can be used instead.
Dots | &middot; &bull; &sdot; | "..." is preferred to "…" - see MOS:ELLIPSIS. Wiki markup should be used instead of these for lists; see Wikipedia:Manual of Style/Lists § List layout.
Diacritics | ¨ ¸ ‾ ˜ ˆ |
Arrows | ← ↑ → ↓ ↔ ↵ ⇐ ⇑ ⇒ ⇓ ⇔ |
Other symbols | ¦ § © ® ™ ° µ ¶ † ‡ ƒ ‰ ◊ ♠ ♣ ♥ ♦ | µ (micro) is not used by Wikipedia; use μ (lowercase Greek letter mu) instead - see Wikipedia:Manual of Style/Dates and numbers § Specific units
Superscript and subscript | ¹ ² ³ ª º | Do not use Unicode subscripts and superscripts like these for numbers, per Wikipedia:Manual of Style/Superscripts and subscripts; use <sup> and <sub> instead.
Fractions | ¼ ½ ¾ &frasl; | These are not used unless discussing the characters themselves; for alternatives, see Wikipedia:Manual of Style/Dates and numbers § Fractions and ratios


Above is a draft of a definitive list of whether the HTML reference or the character itself should be used, as suggested by other editors above. I noticed a few things:

  • Both the characters and the references are widely used for endash and emdash; allow both for now?
  • mu and micro are rarely if ever used in the same context; the direct form seems preferable? Same for sum and sigma?
  • ∼ (&sim;) and ~ (ASCII tilde) seem to be used interchangeably but &sim; itself is used very rarely.

-- Beland (talk) 08:12, 15 July 2018 (UTC)[reply]

  • usually prefers direct character input over named references – That's too sweeping. I can see this is gonna take a lot of discussion. For starters, pinging David Eppstein for his thoughts on literal or symbolic for math symbols (not meaning to imply there's one simple answer to that). Not pinging SM because he'll find his way here without doubt and his user name is too hard to get right and it's late and I'm tired. EEng 08:32, 15 July 2018 (UTC)[reply]
    • I think it's very important to spell out &minus; as otherwise it's too difficult to distinguish from &ndash;. Otherwise I don't feel strongly but I know I have seen legions of random AWB users replace &times; (e.g.) by its unicode character. So we should not encourage replacements that go the other way. —David Eppstein (talk) 16:30, 15 July 2018 (UTC)[reply]
      • Okay, that seems useful to note. -- Beland (talk) 21:55, 15 July 2018 (UTC)[reply]
    • @EEng: Well, if I'm counting right, out of the 252 named references, in 28 instances (11.1%), the proposal is recommending to use the reference over the character itself, and in 27 instances (10.7%) it's either not making a recommendation or different options are used in different circumstances. That leaves 78.2% of the time where the character itself is being recommended over the named reference. That seems to qualify as "usually"; am I missing something? -- Beland (talk) 21:55, 15 July 2018 (UTC)[reply]
      You're counting entries in the table; I'm counting occurrences in the wild i.e. I'd wager that the population of ndash + mdash in articles is greater than that of all those other characters put together, and those two should always be coded by name or template, IMHO. EEng 02:37, 16 July 2018 (UTC)[reply]
      @EEng: Ah, would it make more sense to say "for most characters prefers" rather than "usually prefers"? -- Beland (talk) 02:42, 16 July 2018 (UTC)[reply]
      At this point I don't know if anything needs to be said at all. I'm a bit unclear about something. Right now much or most of this advice, to the extent it's somewhere in MOS, is distributed among the various relevant sections. You're not proposing to insert this giant table somewhere, are you? Because then it will be in two places which will need to be kept in sync. EEng 03:36, 16 July 2018 (UTC)[reply]
  • WP:MOSNUM always uses the Greek letter mu or the html entity &mu; as the metric prefix for micro. I know some Unicode characters were created for obscure reasons such that Wikipedia has no interest in using those characters; I infer from its low numerical code value that &micro; (U+00B5, µ) exists as a way of coding the micro symbol that was used in some pre-Unicode character codes that didn't provide for most Greek letters, to permit round-tripping between those older character codes and Unicode. According to the Unicode Consortium, the Greek letter character is preferred.[1] Maybe use the Greek letter mu directly, whether in a Greek word, the archaic stand-alone symbol for micrometer, or the metric prefix, and explicitly encourage editors to replace µ (U+00B5) with μ (U+03BC). Jc3s5h (talk) 10:31, 15 July 2018 (UTC)[reply]
    • Ah, OK, I'll change that. Is sigma also preferred over sum in all cases? -- Beland (talk) 02:28, 16 July 2018 (UTC)[reply]
      • Looks like sum is rarely used compared to sigma which is used a lot, so I'll put in the same advice and see if that meets with popular approval. -- Beland (talk) 02:31, 16 July 2018 (UTC)[reply]
  • As a comment, is convenient in templates when you want a whitespace. --Izno (talk) 21:58, 15 July 2018 (UTC)[reply]
    • Ah, this points out to me that the regular space (which is U+0020, decimal 32) actually doesn't have a named reference, so it probably doesn't belong on this chart.
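The µ-to-μ and ∑-to-Σ replacements suggested in this thread are simple one-for-one character swaps, so they can be sketched in a few lines of Python. This is only an illustration; the table and function names are made up, and it uses nothing beyond the built-in `str.translate`.

```python
# Code points discussed above: MICRO SIGN -> Greek small mu,
# N-ARY SUMMATION -> Greek capital sigma.
REPLACEMENTS = {
    0x00B5: "\u03bc",
    0x2211: "\u03a3",
}

def prefer_greek(text):
    """Swap the look-alike symbols for the Greek letters, leaving the rest alone."""
    return text.translate(REPLACEMENTS)

sample = "5 \u00b5m, \u2211 of terms"
print(prefer_greek(sample))
```

A swap like this is blind to context, so anything quoting the characters themselves (as this discussion does) would need to be excluded by hand.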

References

  1. ^ Beeton, Barbara; Freytag, Asmus; Sargent, Murray III (30 May 2017). "Unicode® Technical Report #25". Unicode Technical Reports. Unicode Consortium. p. 11.

EEng made a good find, that &dollar; was missing. It turns out that this is because List of XML and HTML character entity references only goes up to HTML 4, and HTML 5 has a ton more, listed here. Given the length of the resulting table if we include all of them, maybe we should just say "use the character itself except for those listed below" and list the ones where named references should be used? (And maybe continue to list the characters that should not be used at all?) -- Beland (talk) 03:53, 16 July 2018 (UTC)[reply]

I still don't understand why, to a first approximation, we're not saying that everything other than a-zA-Z0-9`~!@#$%^&*()-_=+[]{};':",./<>? should be given via &foo; or {some template}. Also, the table mixes advice on how to express various characters with advice on whether and when to use various characters. Not saying that's bad, just worth noting. EEng 04:12, 16 July 2018 (UTC)[reply]
I think accented Roman letters should certainly be written as e.g. á not &aacute;. More generally I am in favor of using unicodes over html entities or templates in most cases, with exceptions for characters like &amp; (when written next to something that would cause it to expand to a different entity) or &minus; (because there is too much possibility for confusion with other dash-like characters). Also, as an aside, the text above about avoiding ligatures is too strong; when these characters occur in the standard spelling of a name (e.g.), we should write them that way even when we are writing in English. —David Eppstein (talk) 04:25, 16 July 2018 (UTC)[reply]
Re accented Romans, I did say "to a first approximation". Re ligatures, the text says "except proper names" -- is that not enough? EEng 05:05, 16 July 2018 (UTC)[reply]
I did a quick database check, and as of April 2018, – is more popular than &ndash; by a ratio of about 10.6:1.
My thought on combining "how" and "whether" is that it's entirely likely the answer to the question "how do I put this character into Wikipedia?" is "please don't, use this other one", so having it all in one place is handy. -- Beland (talk) 05:28, 16 July 2018 (UTC)[reply]
The fact that literal ndash is 10X as common as symbolic just shows how much work we have to do -- in my edit window it's very hard to tell ndash from hyphen or mdash unless they're next to each other. I'm fine with combining both kinds of advice, though (again) I'm not sure exactly where this big table is gonna go. EEng 06:38, 16 July 2018 (UTC)[reply]
Well, you were using your guess that the numbers were the other way around as an argument for a wording change. The current preponderance might be evidence that most editors prefer the raw characters, or maybe it's just what people do because the UI is designed to encourage that. The fact that the UI is the way that it is may be an indication that there is not great support for using &ndash; and friends. I can generally tell the difference between dashes of different lengths, though if some people can't, that may be an indication that it just doesn't matter that much. In any case, given the lack of consensus on this, the current proposal is to remain neutral on the choice for ndash and mdash, and let editors decide on a page-by-page basis. In contrast, for other characters like ∀ and °, which can be clearly distinguished by everyone, I haven't heard a good argument for why those shouldn't just be used directly. -- Beland (talk) 23:26, 17 July 2018 (UTC)[reply]
  • Well, you were using your guess that the numbers were the other way around – No, you're mixing up two different things. I conjectured that ndashes and mdashes, together, make up the bulk (counting each use separately) of all these not-on-the-keyboard characters; that was without regard to how those characters were expressed (literal vs. symbolic).
  • that the UI is the way that it is may be an indication that there is not great support for using &ndash; and friends – WP's facilities and interfaces are full of debris that's little used or even "impossible" to use (e.g. template parameters that want to present information that an RfC has determined should never be presented). Trying to infer how things are spozed to be based on things you see in the UI will get you way off track very, very fast.
  • I can generally tell the difference between dashes of different lengths – So can I, easily in the rendered page, but in the wikisource only with a bit of effort, if I make a point of looking. It's that last bit that's the rub: in the rendered page an ndash vs. mdash look like – vs. —, but in the wikisource they're much more similar i.e. vs. . (What you see in that sentence may depend on your skin, so your mileage may vary.) Thus it's easy in copyediting to not notice that the wrong one is present, and that's why symbolic names should be used instead. (If we really cared we'd suggest that hyphens be rendered as &hyp; as well. I actually tried that once in an article but got laughed off the stage, so we'll just have to live with using the literal -. What I usually do is when I see e.g. a date range like 1899-1920, I just change the literal hypheny-dashy thing that's there to &ndash;, so that I know it's the right thing.)
  • I haven't heard a good argument for why those shouldn't just be used directly – Clearly a quotation in a language using a non-Roman script should just present that text literally. For everything else, there are a lot of pros and cons relating to how many different special symbols are used (in a given article), the extent to which each one is used repeatedly, how potentially confuse-able they are for one another or for something else not even used on the page, the likely sophistication of editors who might work on the article, and a lot more. Here's a random example: WP:MOSNUM says arcminutes should be denoted by a prime and not an apostrophe or a single quote i.e. ′ but not ‘ or ' . Once again, you have to be looking to notice if the wrong one is there; thus MOSNUM suggests that the markup &prime; be used to save editors squinting. Unfortunately different considerations come into play for different symbols, so separate analyses are needed in each case. That's why I predicted this discussion would take a long time.
EEng 03:32, 18 July 2018 (UTC)[reply]
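The date-range habit described above (turning whatever dash-like character sits between two years into &ndash;) can be sketched as a regular-expression substitution in Python. The pattern and function name are assumptions made for illustration; a real tool would need human review, since four-digit pairs such as parts of phone numbers or catalogue numbers would also match.

```python
import re

# Two four-digit numbers joined by a hyphen-minus, en dash, em dash, or minus sign.
YEAR_RANGE = re.compile(r"\b(\d{4})[-\u2013\u2014\u2212](\d{4})\b")

def normalize_year_ranges(wikitext):
    """Rewrite the joining character in year ranges as the &ndash; reference."""
    return YEAR_RANGE.sub(r"\1&ndash;\2", wikitext)

print(normalize_year_ranges("active 1899-1920, revived 1950\u20141960"))
```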

As for the general direction of the advice, using characters directly seems to be the recommended best practice for web development generally. It's more WYSIWYG and easier for web editors to read and think about. It also fits the goal of not forcing editors to learn HTML in order to be able to use Wikipedia; they can just input and edit these characters in the same way they do elsewhere like Word or phone apps or other web sites. We also have a UI right below the text-being-edited box which encourages people to add the characters directly; it would be weird if the advice is to generally use the references because that's not what the system is designed to encourage. The escaping system was originally designed to allow input of special characters that were part of SGML or HTML itself (like angle brackets). Later it became a way to work around the limitations of ASCII. But modern web sites all use Unicode now, as does Wikipedia, so it's a bit of an obsolete workaround. I think any system where you have to learn a special language for telling a computer something is less user-friendly than a system where you can express your intention in the way you would express it to other humans. -- Beland (talk) 06:29, 16 July 2018 (UTC)[reply]

I think we should treat it like citations: citations are hard, both inside Wikipedia and outside. Just see what happens in any university freshman humanities class where citation expectations are rigorously enforced for the first time in most students' lives. So at Wikipedia we're satisfied if the first editor gives some way to find the source; gnomes can improve the citation format later. And the tools to do the improvement exist.
Similarly, editors who are not skilled with markup can do the best they can with the visual editor and other editors can improve it. The editors who make the improvements need the tools to do so, and bots must not overrule their contributions by converting html entities to characters.
The idea that you can write documents and web pages with purely WYSIWYG tools is only true if you're writing something simple, or you're a slob. That's why Microsoft Word has a little paragraph symbol so you can turn on the display of paragraph marks. That's why WordPress has two editing tabs, WYSIWYG view, and HTML view. The Wikipedia editors are quite primitive, hence the need for HTML entities continues. Jc3s5h (talk) 10:54, 16 July 2018 (UTC)[reply]
I agree contributions of new editors should be welcomed whether or not they follow this sort of guideline; I added language to that effect in the draft. -- Beland (talk) 23:26, 17 July 2018 (UTC)[reply]

General comment This discussion may affect WP:CHECKWIKI error 11. The error is currently deactivated. -- 11:10, 16 July 2018 (UTC)

  • A couple of quick responses:
    1. Wrap the table's characters-as-such, not just the HTML character entities, with <code>...</code> or perhaps with {{kbd}}, whatever looks better (semantically, it can be either – it's code when viewed in the wikitext but also input when you're entering it). If we don't like any of the faint-background effects, use bare <kbd>...</kbd>, which just uses monospace. I would go with <code> because the table already uses a light grey and it blends in well, while also not requiring any template calls.
    2. That for which we're providing entity codes should also be shown as characters.
    3. That for which we're showing characters but recommending/allowing entity codes should also be shown as those codes.
    4. "ASCII characters": Present the characters in the same order as the codes in the later column.
    5. "Greek letters": Change "but the other characters are not used" to "but these latter two characters are not used".
    6. "Dashes": This is a misuse of the slash character and results in confusing typographical gibberish: "–/&ndash; —/&mdash;". Try: "– (&ndash;), — (&mdash;),". Also, "For more info on ­ (optional hyphen) see MOS:SHY" is a misuse of parentheses (round brackets), seemingly for some kind of emphasis. Should just remove them.
    7. "Whitespace and non-printing": should also include &hairsp;; like &thinsp; it is generally only used for kerning in templates and such; there is usually not any reason to manually insert either into an article.
    8. "&lsaquo; and &rsaquo; are not used by Wikipedia; < and > can be used instead" is wrong; they are not the same character and should not be confused. If we need to illustrate French quotation style, etc., use the correct characters, not less-than and greater-than, which serve an entirely different purpose. This is pretty much exactly like hyphen vs. dash vs. minus.
 — SMcCandlish ¢ 😼  07:28, 17 July 2018 (UTC)[reply]
The weird "shy" line was due to a typo preventing &shy; from showing up at all. I fixed that. You're right about lsaquo; I must have messed up something when scanning the database for it. I'll change that and other points you mention in the next draft, as applicable. Thanks for reading! -- Beland (talk) 00:37, 18 July 2018 (UTC)[reply]

Third draft

Posted to Wikipedia:Manual_of_Style/Text_formatting#HTML_character_entity_references

Proposed as new subsection titled "HTML character entity references" under Wikipedia:Manual of Style § Miscellaneous, replacing the second paragraph of "Keep markup simple".

HTML character entity references are a way to tell a web browser to render a certain character without including that character in the web page directly. Characters may be referenced by name, decimal number, or hexadecimal number. For example, &euro; is the same as &#x20AC;, &#8364;, or including the character directly.

On Wikipedia, characters should be used directly unless doing so is confusing for editors or causes technical problems. Numerical references should not be used if a named reference is available. For example, &minus; should be used instead of &#8722;, and é should be used instead of &eacute;. Edits favoring these conventions are made by semi-automated systems like AutoWikiBrowser. For a comprehensive list of available named references, see [3].

Wikipedia stores articles with Unicode, so any character that could possibly be referenced can also be input directly. The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards. Editors can also use the Unicode input method provided by their operating system. There are some exceptions where named references are preferred, to avoid confusion and to circumvent technical limitations. The <nowiki> tag can also be used instead of character escaping to prevent interpretation of special characters as wiki markup.

Characters to avoid |
Avoid Instead use Note
(&hellip;) ... (i.e. 3 periods) See MOS:ELLIPSIS.
Unicode Roman numerals like Latin letters equivalent (I II i ii) MOS:ROMANNUM
Unicode fractions like ¼ ½ ¾ &frasl; {{frac}}, {{sfrac}} See MOS:FRAC.
Unicode subscripts and superscripts like ¹ <sup></sup> <sub></sub> See WP:SUPSCRIPT. In article titles, use {{DISPLAYTITLE:...}} combined with <sup></sup> or <sub></sub> as appropriate.
µ (&micro;) μ (&mu;) See MOS:NUM#Specific units
Ligatures like Æ æ Œ œ Separate letters (AE ae OE oe) Generally avoid except in proper names and text in languages in which they are standard. See MOS:LIGATURES.
(&sum;) (&#8719;) (&horbar;) Σ (&Sigma;) Π (&Pi;) (&mdash;) (Not to be confused with \sum and \prod, which are used within <math> blocks.)
(&lsquo;) (&rsquo;) (&sbquo;) (&ldquo;) (&rdquo;) (&bdquo;) ´ (&acute;) (&prime;) (&Prime;) ` (&#96;) Straight quotes (" and ') Use {{coord}}, {{prime}} and {{pprime}} for mathematical notation; elsewhere use straight quotes unless discussing the characters themselves. See MOS:QUOTEMARKS.
(&lsaquo;) (&rsaquo;) « (&laquo;) » (&raquo;) Use &lang; and &rang; for math notation. In foreign quotations normalize angle quote marks to straight, per MOS:CONFORM, except where internal to non-English text, per MOS:STRAIGHT.
&ensp; &emsp; &thinsp; &hairsp; Normal space These are sometimes used for precision positioning in templates but rarely in prose, where non-breaking (&nbsp;) and regular spaces are normally sufficient. Exceptions: MOS:ACRO, MOS:NBSP.
In vertical lists

(&bull;) · (&middot;) (&sdot;)

* Proper wiki markup should be used to create vertical lists. See HELP:LIST#List basics.
&zwj; &zwnj; see note Used in certain foreign-language words, see zero-width joiner/zero-width non-joiner. Should be avoided elsewhere.
£ for GBP, keep ₤ for Italian Lira and other lira currencies that use ₤ (see the main article for that currency) MOS:CURRENCY; find broken instances
Potentially confusing or technically problematic characters |
Category coded form (direct form) Notes
Miscellany &amp; (&) &lt; (<) &gt; (>) &#91; ([) &#93; (]) &apos; (') &#124; (|) Use these characters directly in general, unless they interfere with HTML or wiki markup. Apostrophes and pipe symbols can alternatively be coded with {{'}} and {{!}} or {{pipe}}. See also character-substitution templates and WP:ENCODE.
Greek letters &Alpha; (Α) &Beta; (Β) &Epsilon; (Ε) &Zeta; (Ζ) &Eta; (Η) &Iota; (Ι) &Kappa; (Κ) &Mu; (Μ) &Nu; (Ν) &Omicron; (Ο) &Rho; (Ρ) &Tau; (Τ) &Upsilon; (Υ) &Chi; (Χ) &kappa; (κ) &omicron; (ο) &rho; (ρ) In isolation, use coded forms to avoid confusion with similar-looking Latin letters; in a Greek word or text, use the direct characters.
Quotes &lsquo; (‘) &rsquo; (’) &sbquo; (‚) &ldquo; (“) &rdquo; (”) &bdquo; („) &acute; (´) &prime; (′) &Prime; (″) &#96; (`) Can be confused with straight quotes (" and '), commas, and with one another. MOS:STRAIGHT generally requires conversion to straight quotes, except when discussing the characters themselves or sometimes with non-English languages. See next row for prime characters.
Apostrophe-like ' ` ´ ʻ ʼ ʽ ʾ ʼ ʽ ʻ ʼ
Dashes, minuses, hyphens &ndash; (–) &mdash; (—) &minus; (−) - (hyphen) &shy; (soft hyphen) Can be confused with one another. For dashes and minuses, both forms are used (as well as {{endash}} and {{emdash}}). Soft hyphens should always be coded with the HTML entity or template. Plain hyphens are usually direct, though at times {{hyphen}} may be preferable (e.g. Help:CS1#Pages). See MOS:DASH, MOS:SHY, and MOS:MINUS for guidelines.
Whitespace &nbsp; &emsp; &ensp; &thinsp; &hairsp; &zwj; &zwnj; In direct form these are nearly impossible to distinguish from a normal space. See also MOS:NBSP.
Non-printing &lrm; &rlm; In direct form these are nearly impossible to identify. See MOS:RTL.
Mathematics-related &and; () &or; () &lang; () &rang; () Can be confused with x ^ v < >. In some cases TeX markup is preferred to Unicode characters; see MOS:FORMULA. Use {{angbr}} instead of ) / ()
Dots &sdot; (⋅) &middot; (·) &bull; (•) Can be confused with one another. Interpuncts (&middot;) are common in horizontal lists and to indicate syllables in words. Multiplication dots (&sdot;) are used for math. In practice, the dots are used directly instead of the HTML entities.
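
To show why the Greek letters row recommends coded forms in isolation, the sketch below decodes a few of the listed references and compares them with the Latin letters they resemble. The chosen pairs are only a sample picked for illustration; the Unicode names come from Python's standard `unicodedata` module.

```python
import html
import unicodedata

# Sample (reference, Latin lookalike) pairs taken from the Greek letters row.
pairs = [("&Alpha;", "A"), ("&Rho;", "P"), ("&omicron;", "o")]

rows = []
for ref, latin in pairs:
    greek = html.unescape(ref)
    # Record the Unicode name and whether the two characters are actually equal.
    rows.append((ref, unicodedata.name(greek), greek == latin))

for row in rows:
    print(row)
```

None of the pairs compare equal, even though many fonts render them identically, which is the confusion the table is trying to head off.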

FTR, as of the July 1, 2018 database dump, &lsqb; is used about 329 times and &lbracket; is used about 91 times, so I picked the more common one. -- Beland (talk) 15:04, 18 July 2018 (UTC)[reply]

  • While I still have my reservations about where this is going and the amount of effort it will take to iron all the bugs out, I'm warming up to this. EEng 15:35, 18 July 2018 (UTC)[reply]
  • The table asserts the &Prime; html entity resembles the ASCII backtick (`), and even has something displayed that looks like a backtick. But this is the real result of the &Prime; html entity: ″. The table is just a mass of stuff and I wouldn't be able to find anything in there to make corrections. Jc3s5h (talk) 16:46, 18 July 2018 (UTC)[reply]
    • @Jc3s5h: Sorry, the backtick was missing from the second table; I just fixed that. It was rather exhausting to catalog everything and try to format it properly, so I didn't get a chance to double-check things. You're right about it being hard to read, so I also put each character in the second table on its own line, to make matching up characters and references easier. Is that clear enough now? Is it making the table too long? -- Beland (talk) 23:21, 18 July 2018 (UTC)[reply]
      • In the table, as rendered, &Prime; appears twice. Each time the character next to it is `, which is U+0060 and is named GRAVE ACCENT. But this is wrong; it should look like a double prime and is U+2033. It is used to mark seconds of time or seconds of arc; a backtick is completely wrong for that. Jc3s5h (talk) 00:04, 19 July 2018 (UTC)[reply]
        • Ah, that was caused by a capitalization error in the first table. Fixed! -- Beland (talk) 05:47, 20 July 2018 (UTC)[reply]
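
The distinction raised in this thread can be checked mechanically. The sketch below is illustrative only and not taken from the discussion: it decodes the references involved with Python's standard library and records their code points and Unicode names. It also shows that entity names are case-sensitive, which is the kind of capitalization slip mentioned above.

```python
import html
import unicodedata

# References discussed in the thread; &Prime; and &prime; differ only by case.
refs = ["&Prime;", "&prime;", "&#96;"]

info = {}
for ref in refs:
    char = html.unescape(ref)
    info[ref] = (f"U+{ord(char):04X}", unicodedata.name(char))

for ref, (codepoint, name) in info.items():
    print(ref, codepoint, name)
```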
Mostly looking good. I would put this at the bottom of MOS:TEXT, probably. Maybe in a section called "Unicode characters". We could see about cross-referencing it in various places.  — SMcCandlish ¢ 😼  02:07, 19 July 2018 (UTC)[reply]
Gave the boxes a spinshine/reorganization. Headbomb {t · c · p · b} 18:29, 19 July 2018 (UTC)[reply]

I posted this to Wikipedia:Manual_of_Style/Text_formatting#HTML_character_entity_references (there's another section there that talks about Unicode PUA and RTL characters) and cross-referenced from Wikipedia:Manual of Style § Miscellaneous. Feel free to edit the live version as needed. -- Beland (talk) 05:56, 20 July 2018 (UTC)[reply]

And thanks to everyone for greatly improving this section from the initial draft! It will be a great help to me in writing the code that will flag less-than-clear usage. -- Beland (talk) 05:57, 20 July 2018 (UTC)[reply]

Might be worth adding a comment in the Greek notes that the same sort of thing applies to Cyrillic letters that look like Latin and Greek ones; use the entity codes for clarity when discussing particular characters, but use the Unicode in actual Russian, Ukrainian, etc. words. We probably needn't dwell on the details, since there's another proposal open for centralizing all the scattered Cyrillic-related material to one page. Then again, that's mostly to be about transliteration, so maybe the Greek section in the table should be Greek and Cyrillic?  — SMcCandlish ¢ 😼  04:11, 22 July 2018 (UTC)[reply]

Instances of character references for Cyrillic letters seem to be relatively rare. I don't see any on a casual skim through this report, though I'd have to go through the entire alphabet to definitively say they are never used. Unlike Greek letters, they aren't in common use for scientific and mathematical purposes. I think it would be simpler and probably more user-friendly just to say to use the Cyrillic characters directly, which is what the draft is currently proposing. -- Beland (talk) 07:58, 22 July 2018 (UTC)[reply]
Works for me.  — SMcCandlish ¢ 😼  15:47, 26 July 2018 (UTC)[reply]

Reversion of addition of third draft

So after I posted the tables proposed above, David Eppstein reverted, with the edit summary "what part of "I think you should be more patient"..."Try proposing something narrower and more specific" do you not understand?".

I think I did not see those remarks by David Eppstein and SMcCandlish because they were posted in the discussion ("Fraction slash" below) about the "Slashes" section of the main MOS page, which I did not check for comments before updating the "Text formatting" MOS subpage. SMcCandlish wanted a one-word change to the "Slashes" section, which he implemented. I think David Eppstein was commenting on the change he reverted, as he then wrote:

I'm not convinced that the html section is needed at all. It is more material for a guidebook on html than style guidance for Wikipedia editors. And you appear to have the purpose of using the new section as a bludgeon to begin a massive project of automatically reformatting characters in Wikipedia, which I think is a bad idea (watchlist clutter for no visible change to articles).

"Bludgeon" sounds pretty ugly and mean. I started a project to spell-check all Wikipedia, which is intended to improve its readability and credibility. Along the way I noticed that editors have also occasionally misspelled HTML character entity references. I thought as long as we're cleaning up the misspellings, we might as well clean up any undesirable forms, because right now we don't seem to be representing them consistently. I started this discussion because I couldn't find any guidance in the Manual of Style to help me write the code to correctly flag undesirable forms vs. ignore desirable forms.

Mediawiki markup uses this part of HTML syntax, and if we have a preferred form for these things we'd want to communicate that to editors, and the Manual of Style is the place to document choices of style rather than technical how-to for the benefit of editors, so I don't understand the criticism that this is not the right place for this sort of guideline. Especially since Wikipedia:Manual of Style#Keep markup simple already discusses exactly this point, and the other sections linked from the proposed tables also address which characters are preferred.

We already encourage editors to make edits that have no reader-visible changes but do have editor-visible changes intended to make wikitext easier to read and thus articles easier to edit. That's the whole point of Wikipedia:WikiProject Wikify and wikification. I do agree there are some edits that don't improve readability all that much that aren't that worthwhile on their own, like changing "==xx==" to "== xx ==". This seems less trivial than that. I'd also note we have Wikipedia:HTML5, a project which is doing nothing but replacing obsolete HTML tags with newer ones, with hopefully no user-visible changes.

There are less than 20,000 articles that even have HTML character entity references at all, less than 3.5% of all articles. Even if we changed all of them today, given the sheer volume of changes to the encyclopedia it would not be a big deal, and in reality it will probably take months or years to manually change all the instances, if that's what we want to do. At worst, editors who notice these changes happening will be educated about the desired way of doing things, and be more likely to input characters that way when adding new text.

Given that editors seem to use characters a lot more than references, and given that characters are built into the Wikipedia UI, it seems a lot less disruptive to move toward characters than away from them.

To illustrate the difference it makes to editors, consider an editor who comes across "S&atilde;o Paulo" in wikitext. To most people who are not web developers, that looks like a typographical error. Some English-speaking people might correct it to "Sao Paulo" which is often seen in English, or, getting the idea there might be an accent there, to "Sáo Paulo", which is incorrect. "São Paulo" is what Portuguese speakers are expecting to see - it's what they type with their keyboards, and it's what appears in Word docs and on the Portuguese Wikipedia and on Google Translate, and in the readable parts of other web sites. With "São Paulo", everyone knows exactly what's going on, and there's no need to waste time doing a search on the meaning of "atilde" or "&atilde" or whatnot.

If I were making the rules, I think I'd keep it simple and say to use characters directly except for otherwise invisible characters and those that cause technical problems when used directly. I'd actually be fine if we used ASCII hyphens for all of our dashes, but I'm not complaining if people who can see the difference on their monitors want to upgrade some of them to emdashes to make things look pretty as in the golden years of paper typography. That would make a much smaller table than the one proposed above, but given that other editors seem to feel more strongly about making it easy to tell the difference between certain lookalike characters, I think that table now represents a pretty good compromise. Leaving dashes and quotes as they are takes the biggest chunks of potential work off the table, anyway.

Given that this is proposing a simple general rule and then listing all the desirable exceptions to it, I'm not sure that a narrower proposal would make sense. The volume of comments has been relatively small, so it seems that having multiple discussions about the same topic would just burn more editor time. I am, however, open to actionable suggestions. -- Beland (talk) 08:03, 22 July 2018 (UTC)[reply]

@David Eppstein: Did you have any thoughts in response? -- Beland (talk) 18:46, 23 July 2018 (UTC)[reply]
I don't think we should be setting up automatic processes that make neither a visible change to article content nor a semantic difference to the markup of the articles. And I don't think we should be prescribing such things in the MoS and by doing so encouraging such processes. —David Eppstein (talk) 18:54, 23 July 2018 (UTC)[reply]
@David Eppstein: OK, would you be happy if the guideline said that all such changes be made manually? -- Beland (talk) 20:27, 23 July 2018 (UTC)[reply]
Still not strong enough. I would prefer that such changes be made only as part of other substantive changes to articles (more or less what usually happens now with AWB users; see WP:AWBRULES #4). —David Eppstein (talk) 20:35, 23 July 2018 (UTC)[reply]
OK, I think that will lead to undesirable forms lingering around for a long time for no particularly good reason. -- Beland (talk) 20:56, 23 July 2018 (UTC)[reply]
(And I think leaving those forms around would generate higher cognitive load and more work for editors than the messages generated by removing them.) -- Beland (talk) 21:01, 23 July 2018 (UTC)[reply]
(ec with D.E.) Way TLDR. I warned you that this would take a LOT of work and patience before it would be ready to become part of MOS. Your table, without question, inadvertently trods on a lot of toes in the form of established ways various groups of editors do things in various topic areas. It would be wonderful to systematize and summarize and centralize all this but, like I said, it's gonna be a lot of work. And it's one thing to come up with a guide for future editing; it's quite a different one to use it for some mass-change project. To be blunt, if you think that "Even if we changed all of them today, given the sheer volume of changes to the encyclopedia it would not be a big deal" then there are some things you really don't understand; if you made changes like this to 3% of articles in one day, or one week, or even one month, you'd be strung up by your URLs.

I haven't been following that last week of discussion so I don't know where we are and what the open issues are, but if you want this to see the light of day you need to be prepared to keep plugging for quite some time to work through all the details with all interested parties (not that I even know how to find them). I've gone through an effort like this myself elsewhere in MOS and it can be an exhausting task, though you will be quite rightly congratulated by all in the end if you can pull it off, because it will be a very useful achievement for the project. EEng 19:05, 23 July 2018 (UTC)[reply]

What does "ec with D.E." mean? If you think I should consult more people, but don't know how to go about doing that, that's not really an actionable suggestion. -- Beland (talk) 20:25, 23 July 2018 (UTC)[reply]
It means "edit conflict"; EEng and I wrote our comments in parallel. —David Eppstein (talk) 20:36, 23 July 2018 (UTC)[reply]
@EEng: As far as I know, the only open issue is whether these improvements would justify their own systematic edits. To a large degree, this is just codifying current practice so we can clean up stragglers, so I don't expect very many objections. -- Beland (talk) 07:11, 24 July 2018 (UTC)[reply]
This will need much wider exposure before you can have that kind of confidence. EEng 07:17, 24 July 2018 (UTC)[reply]
@EEng: I was only referring to issues that had been raised by editors who have already heard of the proposal. But how would you like to see me go about getting wider exposure? -- Beland (talk) 21:27, 25 July 2018 (UTC)[reply]

How do other editors feel about David Eppstein's proposal for a rule that "such changes be made only as part of other substantive changes to articles"? Personally, I don't see the need for that, given the arguments I made above, but of course I'll implement whatever the consensus is. -- Beland (talk) 20:46, 23 July 2018 (UTC)[reply]

This is a whole lot of stuff being discussed at once. I'll cover it in the order in which I'm seeing it come up above:

  1. My "Try proposing something narrower and more specific" (and David Eppstein's "I think you should be more patient", from what I can tell) were from the discussion below, on fraction-slash, and have nothing to do with the discussion above about having a handy quick-reference table on characters and their entities and what to do with them on WP. (Well, my comment didn't; I can't read David's mind.) That should be restored, toward the bottom of Wikipedia:Manual of Style/Text formatting I would think.
  2. "I'm not convinced that the html section is needed at all" no longer seems to have a referent. The table version 3 has no such sectioning.
  3. This point by Beland is correct: "Mediawiki markup uses this part of HTML syntax, and if we have a preferred form for these things we'd want to communicate that to editors, and the Manual of Style is the place to document [it]".
  4. Beland's entire "We already encourage editors to make edits that have no reader-visible changes but do have editor-visible changes ..." paragraph and the two that follow it are correct.
  5. David says: "I don't think we should be setting up automatic processes that make neither a visible change to article content nor a semantic difference to the markup of the articles." I can't find anywhere that this has been suggested, and it would already be governed by WP:COSMETICBOT. Beland seems to want to use this for AWB/GENFIXES purposes, but that's not automated. It's semi-automated, and entirely permissible when done in the course of more substantive edits.
  6. Consequently, "I don't think we should be prescribing such things in the MoS and by doing so encouraging such processes" doesn't really track. A) We do in fact have preferences, recorded willy-nilly throughout MoS (e.g. use ... not … or &hellip;, at MOS:ELLIPSIS; and use μ or &mu; not &micro;, at MOS:UNITSYMBOLS; and so on), so the idea that it's off-topic or out-of-scope for MoS doesn't fly. B) MoS has already been updated with a footnote against automated "enforcement" of MoS stuff, including cross-references to the COSMETICBOT policy and to ArbCom decisions about it. The fact that someone could go on a bot-mediated enforcement rampage is not an argument against MoS having line-items about various stuff; the fact that we have rules against doing that is already sufficient to address the rare problem. The fact that someone just lost their AWB access as a result of doing something like that should discourage a repeat. Rules do not need 100% compliance to be useful, nor does failure to achieve 100% compliance mean they're insufficient; otherwise civil society would be impossible.
  7. David's "I would prefer that such changes be made only as part of other substantive changes to articles": We can include something about this, but not making up a new rule just for this, only pointing out the existing ones. MoS is not an editing or behavioral policy nor a dispute resolution board. This is already covered by WP:MEATBOT policy and WP:AWBRULES, and is just how WP:GENFIXES works. The aforementioned footnote can simply be recycled from the main MoS page to wherever this table will live.
  8. EEng says: "Your table, without question, inadvertently trods on a lot of toes in the form of established ways various groups of editors do things in various topic areas." That's not "without question"; prove it, please. Then we can integrate whatever tweaks are necessary. And sometimes toes have to be stepped on, anyway. Not everything some gaggle of people at a wikiproject are doing is a good idea, nor do they get to just make up their own rules and force others to comply; site-wide concerns override local ones (WP:CONLEVEL policy).

    And what was once an okay idea can become a poor one over time as circumstances change. E.g., the cutover last month to a new HTML linter for the parser broke all kinds of stuff that used to be "okay" or "we don't care", but which is no longer okay, and thus we now do care. The most obvious of these is that unclosed inline elements used to be forcibly closed at the opening of a block element and this is no longer the case, resulting in badly broken, mis-rendering HTML in at least tens of thousands of pages. People have been cleaning this up, including with semi-automation tools like AWB and JWB, yet no one is having a shit-fit about it. People will have shit-fits about such activity if it's PoV pushing (e.g. changing all "U.S." to "US", or changing all unspaced em-dash parenthesizing to use spaced en dashes), but they don't lose it over technical cleanup. Another example is that <br> breaks the output of at least two of the available edit-mode syntax highlighters, and needs to be changed to <br />; I've already fixed one "Help:"-namespace page from the 2000s that was recommending <br>, and there are probably some others that need fixing in this regard.

  9. The obvious way to proceed is for EEng to document these "toes" he says are being stepped on; for discussion to ensue, with any needed adjustments being made to the table; and then – if we really think it's necessary – do an adoption RfC on table version 4.

 — SMcCandlish ¢ 😼  16:38, 26 July 2018 (UTC)[reply]

  • My point about the toes is simply that, from experience, people tend to be very set in their ways about low-level details such as direct (literal) pasting in of characters vs. coded form (and, where a coded form is used, both the & forms and template forms have their enthusiastic adherents). So the wider this is advertised and discussed the better, to save WP:WHINE-ing down the road.
  • I think the table needs to recognize that there are much-used template forms e.g. {ndash}
  • If we're going to all this trouble, I'd like to see a shift to a preference for coded forms of mdash and ndash, instead of the current even-handed statement. It's just crazy-making that you can't tell if the right character is present (depending on your font and platform of course). This of course would be a stepping on of some toes.
  • I'm still hoping to get an explanation of why Whitespace other than the non-breaking &nbsp; and regular space should be avoided in prose.
EEng 05:27, 27 July 2018 (UTC)[reply]
Because they cause copy-pasting errors/oddities, clash with find/replace searches, don't play nice with screen readers, mess with alignment/justification, and there's pretty much no point to them in any sort of prose. "James&thinsp;Dean was an actor." is pure nonsense. Headbomb {t · c · p · b} 11:06, 27 July 2018 (UTC)[reply]
Yeah, we only use these for special kerning purposes. If there's some case where we're regularly using thin and hair spaces and it's not spacing tweaks in tight material in template output, feel free to point out where we're doing it, and it can be accounted for (if it's a good idea). Other stuff:
  • As for the "set in their ways about ... direct (literal) pasting in of characters", that's irrelevant, because MoS doesn't constrain editors in any way as to adding new material. You can edit WP without ever complying with anything MoS says, as long as you're following WP:CCPOL, and not a) changing guideline-compliant material to be non-compliant, or b) reverting people making non-compliant material be compliant.
  • Re "the table needs to recognize that there are much-used template forms e.g. {ndash}" – sure. That's not an objection to the table, it's an expansion suggestion.
  • On changing to &ndash;: I actually proposed that several years ago for the same reason, and did not get consensus. Apparently the average editor, with their fonts, can see the difference clearly, and people were dismissive of the idea because the editing tools below the edit window provide a button for directly inserting the Unicode character. I think, therefore, this is a lost cause. Editors having trouble seeing the difference between –, —, −, and - need to use WP:User CSS or their browser's font settings to use a font for editing that works better for them. I wrote instructions on how to do this at Help:User style#User CSS for a monospaced coding font. It's not absolutely perfect; the minus and hyphen are still hard to distinguish. If I find a better, free coding font than Roboto Mono I'll put it at the front of the font stack.
 — SMcCandlish ¢ 😼  13:07, 27 July 2018 (UTC)[reply]
Or you can use WP:WIKIED WP:WIKED which marks them as different in the edit window. Headbomb {t · c · p · b} 13:18, 27 July 2018 (UTC)[reply]
I'm assuming you didn't really mean m:Wiki Education Foundation. EEng 17:44, 27 July 2018 (UTC)[reply]
Yes, my bad, fixed. I meant WP:WIKED. Headbomb {t · c · p · b} 18:18, 27 July 2018 (UTC)[reply]
  • Obviously no one's suggesting James&thinsp;Dean so that isn't helpful, and BTW I just checked and text search on Chrome has no problem understanding that thin space is a space. Now and then I've used hsp to adjust "something in italics"[5] to "something in italics"[5] (your mileage may vary, of course) and I'm sure I've used thinsp now and then though I can't recall where. Take a look at this change [4].
  • I'm not so sure that the evidence is that "Apparently the average editor, with their fonts, can see the difference clearly". I suspect instead that the great majority of editors don't even know there is a difference (and just use hyphen), most of those who know the difference are inserting directly using the click-to-insert gizmo but don't really notice or care what it looks like in the edit window since they never look back, and the very small number of us who are copyediting and checking these things have learned to deal somehow with the difficulty of distinguishing them – in my case, wherever I see a direct/literal character which I know should be an ndash but I'm not sure, I just change it to {ndash} so I know it's right. But I'd rather we encouraged editors to use a coded form in the first place to save that trouble. Unfortunately that would create a new flashpoint for my next point, which is...
  • MoS doesn't constrain editors in any way as to adding new material – You know that and I know that, but as sure as day follows night someone's gonna paste in a direct rho, someone else is gonna change that to &rho; (as recommended in the table), and the first guy's gonna change it back, saying "I like it this way." Having said that, looking over the whole table now I don't see very many cases where that might happen (unless we adopt a recommendation to use coded forms of ndash and mdash) but I still think the wider this is advertised for comment in advance the less trouble there will be.
EEng 17:44, 27 July 2018 (UTC)[reply]
Do you have any example of where thinsp/ensp/emsp/hairsp should be used in prose? Because you have none, and no one can come up with any use for them in prose. Until you have such counterexamples, the "avoid them in prose" position has consensus, and the "allow them" position is simply your own preference to not disallow them because of reasons which are never explained. Headbomb {t · c · p · b} 18:41, 27 July 2018 (UTC)[reply]
I guess you didn't read my post above because the first bullet point gives one. I've been very up-front about my wish that we could recommend coded dashes over direct dashes, instead of just trying to force it into the table. Please have the same courtesy about your apparent wish to flatly forbid thinsp and hsp. Is such a provision already present in MOS? EEng 18:45, 27 July 2018 (UTC)[reply]
Such a provision is the current state of Wikipedia. No one writes "something in italics"[5], and they shouldn't start to do so either. Not sure what that has to do with dashes. Headbomb {t · c · p · b} 18:49, 27 July 2018 (UTC)[reply]
Is a blanket ban on thinsp and hsp already in MOS or not? EEng 18:56, 27 July 2018 (UTC)[reply]
The only use I can recall for which I manually employ thin space is between § and the section number that follows it, to split the difference between "§ 1.2.3" and "§1.2.3" styles. This is just a personal habit of mine; there's no rule about it. The only use I've ever had for hair space, outside of a template, is between em dash and an author name when attributing a quotation: "Humor is Mankind's greatest blessing." — Mark Twain. Also not a rule; it just looks better. Neither of these uses is vital. But they're not objectionable. So, we have a handful of use cases we can document, and then discourage it otherwise. Put it in a footnote, probably. I'm a big fan of footnoting "there are some geeky exceptions" stuff instead of clouding the central advice. On horizontal marks: Well, you can try proposing glyph-to-code conversion if you want, but don't hold your breath. With my font tweaking solution, I have no difficulty at all telling en dashes and hyphens apart, in rendered or source view. "The wider this is advertised": Sure, but not while we're still banging on it just with 3 or 4 people. Iron out the obvious kinks, or even more surely than day follows night, people will "strongly oppose" the whole thing on the basis of some nitpick we should have already anticipated.  — SMcCandlish ¢ 😼  20:58, 27 July 2018 (UTC)[reply]
Obviously I meant we elite should get it in the best form we can before inviting the hoi polloi to look at it. EEng 21:29, 27 July 2018 (UTC)[reply]
Thin space is needed for the correct typography of some mathematics formulas. E.g. (from something off-wiki I was working on today) without thin space: ; with thin space: . The thin space makes it much more clear that this is a product of two subformulas rather than some strange binary-operator usage of the exclamation point. —David Eppstein (talk) 21:18, 27 July 2018 (UTC)[reply]
All great use cases (though I'm sure there are more we're not thinking of) so you see why I objected to These are sometimes used for precision positioning in templates but should not be used in prose. Use either non-breaking (&nbsp;) or regular spaces. So who's OK with my formulation These are sometimes used for precision positioning in templates but rarely in prose, where non-breaking &nbsp; and regular space are normally sufficient (with or without a footnote as suggested by SM)? I'm fine with the rest of what SM has said. EEng 21:27, 27 July 2018 (UTC)[reply]

I feel like keeping to the spirit of Wikipedia:Manual of Style#Keep markup simple means saying that &thinsp and &hairsp should not be used around italics, dashes, and §, since either a regular space or no space works just fine. And I agree with that general approach; HTML is not well-suited to pixel-perfect character control, and as long as there are no horribly ugly problems like actually-overlapping characters I don't think we should fuss about that sort of small thing. This sort of layout issue may be better addressed by making web browsers render text more beautifully than by throwing in a bunch of site-specific directives.

If we were to start putting &thinsp around, say, emdashes, then I think that would be a good argument for doing that in an {{emdash}} template, since we'd want it everywhere consistently. I don't think it's a good idea to do that sort of fine-control typography on an article-by-article basis, since then it will not be done consistently.

If {{endash}}, &endash, and – all do exactly the same thing with no fancy spacing, I can see an argument for having two different ways to do it (one HTML-free and one for easier identification), but three ways seems like too many, when two of them serve almost exactly the same purpose.

That said, I'd rather publish the new tables with some of the rows marked as disputed/under discussion than hold the whole thing until there's consensus on every single part, so at least we can start making progress on the items that everyone agrees on, which seems like 95% of it. -- Beland (talk) 02:16, 28 July 2018 (UTC)[reply]

  • keeping to the spirit of Wikipedia:Manual of Style#Keep markup simple means saying that &thinsp and &hairsp should not be used around italics, dashes, and § – No, what the linked guideline says is "Other things being equal, keep markup simple... Use HTML and CSS markup sparingly". That's not "should not be used".
  • HTML is not well-suited to pixel-perfect character control, and as long as there are no horribly ugly problems like actually-overlapping characters – It may not be well-suited, but at times we need to do the best we can, and we're not talking about "pixel-perfect". David Eppstein's example is an excellent one in which neither regular space nor no space is at all acceptable.
  • I'd rather publish the new tables with some of the rows marked as disputed/under discussion – Well, I think we have our hands full just coming up with tables which faithfully and uncontroversially centralize what is now scattered all over creation. And that would be quite an achievement. Changes to what's being recommended should be a follow-on effort.
EEng 03:47, 28 July 2018 (UTC)[reply]
David Eppstein's example doesn't use &thinsp; in the wikitext, so it seems to be out of scope of what I'm proposing. It is rendered with <math>\phi!\,2^\phi</math>. Though wouldn't that be a good place to use the dot operator if that's appropriate? Surely a very subtle spacing difference isn't the best way to clarify the notation. -- Beland (talk) 17:22, 28 July 2018 (UTC)
Dot operator is for noobs. Writing for noobs may be appropriate in some Wikipedia articles but it can be condescending in other contexts. —David Eppstein (talk) 19:28, 28 July 2018 (UTC)[reply]
@EEng: Are you arguing that the additional complexity of using &thinsp and &hairsp in prose is worthwhile, and if so in what situations? -- Beland (talk) 17:22, 28 July 2018 (UTC)[reply]
I'm arguing that this isn't the time or place ...
... to get into the weeds of changing the current guidelines, rather than just summarizing and centralizing them.
... to tell a practicing mathematician what notation he should use.
EEng 18:12, 28 July 2018 (UTC)[reply]

Exception for superscripts/subscripts in titles

@Beland: Issue to add to the resolution stack: WP:Manual of Style/Titles#Typographic effects specifically advises use of Unicode superscripts and subscripts and such when available for use in titles of works, because they copy-paste correctly (by contrast, the output of E=mc<sup>2</sup> copy-pastes as E=mc2), and can be used in citation templates without boogering the COinS output. I'm wondering if this conflicts with anything in MOS:NUM and MOS:TM, and the main MoS page. If so, we need to figure out how to reconcile that.  — SMcCandlish ¢ 😼  22:56, 27 July 2018 (UTC)
Well, Wikipedia:Manual of Style/Superscripts and subscripts is marked as inactive, but I resolved what little conflict there seems to be by adding an exception for titles on that page, with a cross-reference to Wikipedia:Manual of Style/Titles § Typographic effects. I added the same cross-reference and exception to the proposed table. -- Beland (talk) 02:00, 28 July 2018 (UTC)[reply]
User:Headbomb reverted the table change with the edit summary "that page has nothing that contradicts the advice given here. This also applies to titles too via {{DISPLAYTITLE}}". Before my edit, I read that row as recommending not to use Unicode superscripts and subscripts at all, and after the edit to recommend not using them except when needed in titles. The linked page in fact says: "To ensure correct copy-pasting, it is preferable to use Unicode superscript or subscript characters when possible, rather than HTML or wiki markup, which are purely typographic (Unicode ² is not the same character as 2 with superscript markup). Special characters can be used in citation templates." which to me contradicts the "don't use, ever" advice before the edit. Actually my edit was incorrect, the exception is not for Wikipedia article titles, but for titles of works generally, so I'd have to reword it if restoring. (SMcCandlish mentioned that but I was reading too quickly.) But does that at least make sense as I explained it, or am I missing something? -- Beland (talk) 02:30, 28 July 2018 (UTC)[reply]
The only exception should be for an article on the unicode characters themselves. Everything else should be done via a DISPLAYTITLE, e.g. (−1)F, or AC0. Titles of works are no exceptions there. Something like H2O: The Book should be located at H20: The Book and formatted via {{DISPLAYTITLE:''H<sub>2</sub>O: The Book''}}, not located at H₂O: The Book, and then formatted as H2O: The Book throughout the rest of the article. Headbomb {t · c · p · b} 02:47, 28 July 2018 (UTC)[reply]
H20? Holy heavy hydrogen, Batman! EEng 03:00, 28 July 2018 (UTC)[reply]
Well, there is ISBN 1492615323. But the same would apply to H2O (American band) / H2O (Scottish band), etc... Headbomb {t · c · p · b} 03:18, 28 July 2018 (UTC)[reply]
OK, I made another go at noting exceptions to the general "don't use" rule in the table. Does that look better? -- Beland (talk) 17:36, 28 July 2018 (UTC)[reply]

Please someone step in to resolve the most stupid revert war ever

EEng keeps messing with the table layout, forcing them to take huge amounts of vertical space, breaking consistency, scaling/zoom functionality, and forcing unnatural breaks for AFAICT, no real reason but personal preferences. What looks better, [5] + [6] (inline) or [7] + [8] (random vertical breaks)? Headbomb {t · c · p · b} 10:52, 27 July 2018 (UTC)[reply]

Works better allowed to naturally flow; viewport sizes vary radically. The version with forced line breaks does waste a bunch of vertical space on my big-ass monitor. When I reduce window width sharply to simulate a mobile device, it wraps awkwardly, because the browser wraps as needed, plus there are forced line breaks, and they're at cross purposes.  — SMcCandlish ¢ 😼  12:02, 27 July 2018 (UTC)[reply]
Also note to EEng (talk · contribs) (posting this here since your userpage is too slow to use), when you refer to collective things, they take the plural form. The hyphen is considered... but Hyphens are considered, not Hyphen is considered.... Headbomb {t · c · p · b} 14:48, 27 July 2018 (UTC)[reply]
I have to go do my laundry but perhaps when I get back we can talk about this calmly and without the self-certainty. EEng 15:03, 27 July 2018 (UTC)[reply]
OK, that's the whites done, so I have a minute. Look, we've all been through this, where we're seeing different things on different platforms, and it's not helpful to say simply "looks horrible" without thinking about what the other person is seeing and what they're trying to achieve. While in general (all other things being equal) the conservation of a table's horizontal and vertical space is a priority in order to make it easier for the reader to absorb its content, in the present example (or one of them) there was the competing desire to present the various dashes and so on in a stacked form to allow the reader to see how confusing they can be. That may or may not have been worth the slight additional vertical space consumed, but it's not ridiculous either, and Headbomb simply ignored my repeated explanations of that instead of engaging in a discussion of the competing desiderata.
As for plural and so on, "Hyphen is considered" is simply a telegraphic form of "The hyphen is considered", and is just as correct as "Q is considered the hardest letter to use in Scrabble." You're more concerned with strict formalism than is appropriate outside article space.
I've been many times thanked for my careful reforms of previously incomprehensible tables such as those at MOSNUM and WP:PROTECTION, so I do know what I'm doing even if you're not able to always see what I'm aiming at. But I'm not sufficiently interested in these minutiae to worry about them, at least until this proposal goes live and its content is in final form. EEng 18:25, 27 July 2018 (UTC)[reply]
Replace "hyphens/minuses/dashes" with "car/turnip/leaf" and see how it doesn't make any sense. "Car should always be..." makes no sense. "Cars should always be..." does. Headbomb {t · c · p · b} 18:44, 27 July 2018 (UTC)
Replace it by Q to see there must be more to it than you seem to think: "Q is usually followed by u" makes complete sense – or would you insist on "The Q is usually followed by the u"? Or, God forbid, "Q's are usually followed by u's"? Can't you just let anything go? EEng 19:07, 27 July 2018 (UTC)[reply]
First, using "The" makes this singular. But if you remove it and have "Q is usually followed by u" you're using a mention, and the analogous situation would be something like "- is usually followed by ;", not "hyphen is usually followed by semicolon" (the grammatically correct way of having a use would be "Hyphens are usually followed by semicolons"). Headbomb {t · c · p · b} 19:13, 27 July 2018 (UTC)[reply]
Apparently the answer to my question is No. Oh, and see WP:MISSSNODGRASS. EEng 19:38, 27 July 2018 (UTC)
To get back to the original question, the different versions of the tables look the same to me when I use a narrow window. With a wide window, I prefer the ones with the explicit breaks; I think it makes the markup examples clearer to break them into lines like that. The extra vertical space doesn't bother me; if you have a wide window, you probably also have a tall window. —David Eppstein (talk) 20:15, 27 July 2018 (UTC)[reply]

Fraction slash

By my reading of Wikipedia:Manual of Style/Dates and numbers § Fractions and ratios, it looks like / (ASCII slash) is used in inline fractions instead of ⁄ (fraction slash &frasl;). This conflicts with Wikipedia:Manual of Style § Slashes which recommends &frasl; over /. The "Slashes" section also recommends {{frac}} but fails to mention all the other things that are recommended instead, like <math> and writing out in English. I think the simplest way to resolve this would be to change the "Slashes" section from:

  • in a fraction (7/8), though the "fraction slash" (7&frasl;8, producing 7⁄8) or {{frac}} template ({{frac|7|8}}, producing 7⁄8) are preferred

to:

Does that make sense?

Oh, and the two other instances of "fraction slash" in that section would just need to be changed: "(and fraction slashes)" -> "(and slashes in fractions)" and "to slash or fraction slash" -> "to slash".

-- Beland (talk) 07:11, 15 July 2018 (UTC)[reply]

  • Please don't misunderstand me because I don't want to discourage you – I think it's great that you're trying to clear this kind of markup and coding and typesetting stuff up. But you already have one thread open which – trust me – will take a LOT of editor time and attention to bring to fruition, and one such discussion at a time is enough. Let me suggest you withdraw this question for now and pick it up later. And by withdraw I mean delete the thread (including this comment of mine). Otherwise someone will inevitably post something beginning, "Well, let me just say..." and before you know it it will be a discussion after all. EEng 08:41, 15 July 2018 (UTC)[reply]
    • Well, this is a question about a conflict between the existing guides that needs to be resolved regardless of the outcome of the above discussion. I think it's also a bit orthogonal to ask how the character should be expressed vs. whether or not it should be expressed at all; some people may have feelings on this much narrower question. I'm sure I can walk and chew gum at the same time. -- Beland (talk) 21:24, 15 July 2018 (UTC)[reply]
  • I doubt you will achieve consensus on this. The usual slash is far far more convenient than anything else and there is too little benefit for the other slash. And really the <math> template should be used for anything serious; everything else like {{frac}} is a workaround for some rendering issues rather than a universal solution. —David Eppstein (talk) 16:27, 15 July 2018 (UTC)[reply]
    • Well, that sounds entirely compatible with the proposed change...does anyone actually disagree with that? -- Beland (talk) 21:24, 15 July 2018 (UTC)[reply]
  • You've had two good replies. Please take the hint and do not stir too many pots at one time. Johnuniq (talk) 22:48, 16 July 2018 (UTC)[reply]
To be more explicit: It is definitely not the case that {{frac}} is acceptable for use in mathematics articles. Maybe the standards are different for non-mathematical articles using fractional weights and measures. And your frasl example looks horrible also (the digits are too close to the slash); I think frasl is intended only for use with the raised/lowered fractions like those produced by {{frac}} that are disallowed in mathematics articles here. I think neither of these should be encouraged. —David Eppstein (talk) 23:32, 16 July 2018 (UTC)[reply]
Yes. If it were used in running text, it would need some kind of template to do kerning with CSS, and thus we have {{frac}} already.  — SMcCandlish ¢ 😼  07:06, 17 July 2018 (UTC)

Well, since I'm removing encouragement to use frasl and {{frac}}, and people don't seem to like those, it seems like there is consensus for this change. Please correct me if I'm wrong. The only remaining encouragement for either of those things is on Wikipedia:Manual of Style/Dates and numbers § Fractions and ratios which says {{frac}} is to be used in limited circumstances and discourages HTML character fractions that would use frasl. (So if you still have a problem with that guidance, the talk page for that policy page is probably the place to take it up.) -- Beland (talk) 14:57, 18 July 2018 (UTC)[reply]

You seem very aggressive in interpreting disagreement with parts of your proposal as consensus for everything else. I think you should be more patient. I have certainly not yet agreed that this is, in general, a good direction to go. —David Eppstein (talk) 01:52, 19 July 2018 (UTC)[reply]
Yes. Try proposing something narrower and more specific now.  — SMcCandlish ¢ 😼  02:02, 19 July 2018 (UTC)[reply]
@David Eppstein: @SMcCandlish: Are you objecting to something about fraction slashes or something about the HTML entities section which is being discussed above? What is it that you would want to see changed, or how would you want to see the scope narrowed? -- Beland (talk) 15:30, 20 July 2018 (UTC)[reply]
Re-reading what you originally posted, you claimed there's a conflict between two sections but this doesn't seem clear. You can write "2/3" or use {{frac|2|3}} (which uses &frasl;). If there's anything to resolve, it's probably to change Wikipedia:Manual_of_Style#Slashes to stop saying to use &frasl; inline in general; it should be / in cases like "2/3". We're only using fraction-slash in templates that super/sub script the numerals. We don't want people to use <math> markup for basic inline fractions; it's for more complicated usage. But that section is about slashes, including certain ways of doing fractions, not about fractions in general, so we don't need to go into writing out "two-thirds" or using math markup or [not] using Unicode fraction glyphs; just this: use / not ⁄ for fractions with digits, or use the frac template if you want super/sub-scripted fractions. A cross-ref to the MOS:NUM section on fractions should be sufficient otherwise. Your "though other techniques are usually preferred" isn't correct; sometimes another technique is preferred.  — SMcCandlish ¢ 😼  15:56, 20 July 2018 (UTC)[reply]
@SMcCandlish: So, the proposed change has already been made to this page. Am I reading correctly that the only modification you are requesting to that is changing "usually" to sometimes? -- Beland (talk) 01:35, 21 July 2018 (UTC)[reply]
That's the part that caught my eye; I already tweaked it on the live page.  — SMcCandlish ¢ 😼  02:24, 21 July 2018 (UTC)[reply]
I'm not convinced that the html section is needed at all. It is more material for a guidebook on html than style guidance for Wikipedia editors. And you appear to have the purpose of using the new section as a bludgeon to begin a massive project of automatically reformatting characters in Wikipedia, which I think is a bad idea (watchlist clutter for no visible change to articles). —David Eppstein (talk) 16:15, 20 July 2018 (UTC)[reply]
@David Eppstein: OK, this is not the right section for this discussion. Would you like to append this to the previous discussion, start a new section, or something else? -- Beland (talk) 01:35, 21 July 2018 (UTC)[reply]
Well, I had a long reply I didn't want to sit on any longer, so I copied the above and posted my reply in a new subsection of the "HTML entities" discussion above, "Reversion of addition of third draft". -- Beland (talk) 08:07, 22 July 2018 (UTC)[reply]
But what is "the html section"? I'm not seeing such a subsection at either of the MoS sections under discussion, nor in the wording stuff under discussion here. I strongly agree that "a massive project of automatically reformatting characters in Wikipedia ... is a bad idea"; much of the WT:MOS discussion over the last month has been about two (three, counting a revert spree) of mass "MoS enforcement" waves of robot-like edits, over which at least one person lost their WP:AWB permissions (should've been at least two, if you ask me).

The only thing I've seen in this discussion which should be implemented across a bunch of articles – and only as an add-on to more substantive changes – is conversion of things like "2⁄3" to "2/3". I would suggest asking at WP:GENFIXES if someone can think of a way to add that to AWB's "General Fixes" scripts in such a way that it a) does not clobber mentions of the &frasl; entity, &#8260; entity, &#x2044; entity, or Unicode character outside of fractions; b) does not replace it in constructions like <sup>2</sup>⁄<sub>3</sub> or in templates that [correctly] use it; and c) (preferably) does replace the entities with the Unicode glyph in cases that are of the form <sup>2</sup>⁄<sub>3</sub>. If someone wants to convert those to {{frac}} or to <math> markup, they need to do that on a case-by-case basis, since any given markup may have been used for a contextually legitimate reason.
 — SMcCandlish ¢ 😼  02:09, 21 July 2018 (UTC)[reply]
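The three conditions above (a, b, c) can be sketched in Python as follows. This is only an illustration under stated assumptions, not AWB's actual "General Fixes" code: the function name `fix_fraction_slashes` and the regular expressions are hypothetical, templates are matched naively (nested templates are not handled), and only `<nowiki>` and `<code>` are treated as "mentions".

```python
import re

FRACTION_SLASH = "\u2044"

# The three entity spellings named in the comment, plus the raw character.
SLASH_FORMS = r"(?:&frasl;|&#8260;|&#[xX]2044;|\u2044)"

# Plain digits on both sides of the fraction slash, e.g. 2&frasl;3.
PLAIN_FRACTION = re.compile(r"(\d+)" + SLASH_FORMS + r"(\d+)")

# Superscript/subscript construction, e.g. <sup>2</sup>&frasl;<sub>3</sub>.
SUP_SUB_FRACTION = re.compile(
    r"(<sup>\d+</sup>)" + SLASH_FORMS + r"(<sub>\d+</sub>)"
)

# Spans left untouched (assumption: non-nested templates and literal mentions).
PROTECTED = re.compile(
    r"(\{\{.*?\}\}|<nowiki>.*?</nowiki>|<code>.*?</code>)",
    re.DOTALL | re.IGNORECASE,
)


def fix_fraction_slashes(wikitext: str) -> str:
    """Apply rules a) to c) to unprotected text only."""
    parts = PROTECTED.split(wikitext)
    # With one capturing group, even indexes hold the unprotected text.
    for i in range(0, len(parts), 2):
        # c) keep the fraction slash in sup/sub form, but as the Unicode glyph
        text = SUP_SUB_FRACTION.sub(
            lambda m: m.group(1) + FRACTION_SLASH + m.group(2), parts[i]
        )
        # plain digit fractions get the ordinary ASCII slash
        parts[i] = PLAIN_FRACTION.sub(r"\1/\2", text)
    return "".join(parts)
```

A bare mention such as "the &frasl; entity" is left alone because it is not flanked by digits, which covers rule a) without any special handling. Whether a given instance should instead become {{frac}} or <math> markup remains a case-by-case decision, as noted above.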

@SMcCandlish:: As of July 1, 2018, there were only 8 articles detected using &frasl;: Assouad dimension, City and South London Railway, Discrete valuation ring, Q-derivative, Winding number. I think I already manually fixed all of them except the first one. The moss project picked up only three instances of numerical references in April, and I already converted all of those. So, none of those will result in a "spree". As for the fraction slash character itself, it's more widely used, appearing in article titles and also English text, it looks like because some people have confused it with a slash. I can add a rule that will post any word containing that character to the moss complaint list and just point editors at Wikipedia:Manual_of_Style/Dates_and_numbers#Fractions_and_ratios to resolve such instances using their judgment. We do have literally about half a million possible typos to get through just involving the letters a-z, so I'm not sure how quickly that list would be cleared unless a particular editor or editors feel that the resulting typography is just an abomination that needs to be removed from the face of the Internet. For people using automated editing tools, maybe it needs to be a "detect the problem but don't try to automatically fix it" thing, since the preferred solution depends on context. -- Beland (talk) 02:54, 21 July 2018 (UTC)[reply]
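The "detect the problem but don't try to automatically fix it" approach could look something like the sketch below. It is not the actual moss code; the function name `words_with_fraction_slash` is hypothetical, and splitting on whitespace is an assumed definition of "word".

```python
FRACTION_SLASH = "\u2044"  # Unicode FRACTION SLASH, distinct from ASCII "/"


def words_with_fraction_slash(text: str) -> list[str]:
    """Return every whitespace-delimited word containing U+2044.

    The text itself is never modified; the result is meant for a report
    that human editors resolve using their own judgment.
    """
    return [word for word in text.split() if FRACTION_SLASH in word]
```

For example, `words_with_fraction_slash("mix 1\u20442 cup and/or 3/4")` reports only the word containing the fraction slash and ignores the ordinary slashes.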

Okay. But which "moss" thing are we talking about? WP:MOSS = MOS:SPELL (Wikipedia:Manual of Style/Spelling). That's not a project and isn't a list of pages with stuff to fix.  — SMcCandlish ¢ 😼  03:12, 21 July 2018 (UTC)[reply]
@SMcCandlish: Wikipedia:Typo Team/moss. -- Beland (talk) 14:22, 21 July 2018 (UTC)[reply]
Ah, so! I added a disambiguation hatnote atop WP:Manual of Style/Spelling.  — SMcCandlish ¢ 😼  14:29, 21 July 2018 (UTC)[reply]

Merge proposed: WP:NCCOMICS to MOS:COMICS (which is already ~50% NC material)

 – Pointer to relevant discussion elsewhere.

Please see Wikipedia talk:Manual of Style/Comics#Merge in WP:NCCOMICS

Gist: We have WP:Manual of Style/Comics, the top half of which is naming-conventions material. Then we have WP:Naming conventions (comics), a competing comics naming convention. This is a silly WP:POLICYFORK. Having a combined guideline is thus proposed, based on successfully combined MoS/NC pages in other topics.  — SMcCandlish ¢ 😼  08:42, 19 July 2018 (UTC)[reply]

Merge the Cyrillic advice to one guideline

We have a problem. All of these pages overlap, and none of them are actually guidelines:

The non-mainspace pages are redundant and hard to find, likely to conflict and diverge, and not authoritative. They're moribund and all but forgotten, yet listed at Wikipedia:Romanization as if they're guidelines (it also lists articles like Romanization of Kyrgyz as if they are). Mostly what they say is not really naming-convention material in particular, but general MoS material that also happens to apply to article titles. They have inconsistent names and organizational approaches.

I think these should just be merged into a single WP:Manual of Style/Cyrillic, with a general table, footnoted as needed for specific languages where there are variances (or perhaps use different table rows for this?). Have language-specific sections with detailed notes. If anything in it is truly a naming convention (i.e., applies only to titles), this can be put in a separate paragraph, with a shortcut, like WP:NCUKRAINIAN or whatever, as needed; the page will cross-categorize as both an MoS and an NC guideline. We're already doing this with various topical MoS/NC pages, and with WP:SAL, and it works fine (better, actually, than splitting this information across multiple pages). We should actually be doing more of this; see, e.g., the note above about erasing the pointless WP:POLICYFORK that we have between WP:NCCOMICS and MOS:COMICS (which has its own naming conventions section).  — SMcCandlish ¢ 😼  08:55, 19 July 2018 (UTC)

  • I can agree on this, as long as we remember how many languages (most of them not even Slavic) use the Cyrillic alphabet with very different phonetics. A unified page can become quite bloated. However, because it's not supposed to be a narrow «Englification of Russian», it had better be «Latinization (Romanization) of Cyrillic». Tacit Murky (talk) 15:36, 20 July 2018 (UTC)
    Sure. We have little actual material to cover that isn't Russian or Ukrainian. Most subjects on en.WP that might have a name in any of the Siberian languages also have a name in English or in Russian that will be more familiar to our readers. I would think we should consolidate and arrange the existing Cyrillic latinisation material at Wikipedia:Romanization and not add to it unless/until we see a need to do so.  — SMcCandlish ¢ 😼  01:17, 21 July 2018 (UTC)
  • @Beland: You seem to have a good eye for the table tweaking. Care to give this one a go?  — SMcCandlish ¢ 😼  01:17, 21 July 2018 (UTC)[reply]
My only interest in Slavic language words is that they be tagged with <lang> to indicate to spell/grammar checkers that they are not English, and to hint to TTS systems what pronunciation system they should use. -- Beland (talk) 01:41, 21 July 2018 (UTC)[reply]
Sure. Now that {{lang}} has been reworked, a bunch of people are working on doing this consistently, though it's very gradual.  — SMcCandlish ¢ 😼  03:13, 21 July 2018 (UTC)[reply]

Link city and state in ledes of U.S. college and university articles?

The vast majority of articles about U.S. colleges and universities begin with a sentence like this: "<Institution> is a <list of adjectives> college/university in <city>, <state>." In many cases, both the city and state are linked to their respective articles. In some cases, they both link only to the city. Is there a firm consensus that the MOS favors or discourages one of these two approaches? (Jweiss11 and I had a brief discussion about this on Jweiss11's Talk page if anyone would like a little bit more background.) ElKevbo (talk) 14:06, 20 July 2018 (UTC)

WP:MOSLINK discourages overlinking, and discourages bunched linking where possible. It might be useful to link the city, provided it's not a well-known city such as LA, NYC, Chicago, or a host of others that English-speakers are likely to be familiar with. But I'm struggling to see why the US state is worthy of a link as well. Is there something I'm missing? This is better raised at WT:MOSLINK. Tony (talk) 14:18, 20 July 2018 (UTC)[reply]
Am I correct in inferring that the concern here is that state names are familiar to most readers and thus don't need a link? I ask not only because of the current discussion about linking but because that also ties into another question I have (which isn't related to the MOS) which concerns the inconsistent inclusion of ", United States" in the lead sentence of these articles.
It's also worth noting that part of this discussion is related to the fact that many colleges and universities are public and therefore governed by their respective states so we're not just concerned with geography. ElKevbo (talk) 14:29, 20 July 2018 (UTC)[reply]
MOS:SEAOFBLUE discourages back-to-back links. If it is being mentioned, the city is the location of interest, even if its name is being qualified by the state. The state link is inevitably linked in the city article.—Bagumba (talk) 14:46, 20 July 2018 (UTC)[reply]
The issue here is the back-to-back bunching of a more specific wikilink with a less specific wikilink when just the more specific wikilink will do. We should also note here that Template:Infobox university has separate fields for city and state, which render back-to-back wikilinks. Perhaps this should be remedied? Jweiss11 (talk) 15:51, 20 July 2018 (UTC)
WP:USPLACE is also a factor here. Except for a few very notable exceptions, the articles for towns and cities located in the United States already include the name of the State in their titles (using the format "<City, State>"). For example, the city of Ann Arbor, Michigan (linked to in the article for the University of Michigan)... is formatted as: "[[Ann Arbor, Michigan]]", NOT "[[Ann Arbor]], [[Michigan]]".
However, there are those few exceptions... for example, our article on the city of Chicago doesn't include the name of the State (Illinois) in the title (Personally, I think it should, but consensus has deemed otherwise). Now... this will impact our article on DePaul University (which is in Chicago). The question is: do we want to include a link to Illinois, or is the link to Chicago enough? Blueboar (talk) 01:04, 21 July 2018 (UTC)[reply]
  • It's all this stuff combined. SEAOFBLUE and USPLACE, and also relevance combined with user-interface smarts: "... university in Cleveland, Ohio" is redundant because any reader that wants geographical (or, more narrowly, human geography) info about the institution will get it from Cleveland, Ohio. They're not likely to click on Cleveland (which already has a link to Ohio), then come back to the university article and click on Ohio, a link which is of low (over-generalized) relevance to the university topic. Otherwise we might as well do "... university in Cleveland, Ohio, United States, Western and Northern Hemispheres". Heh.  — SMcCandlish ¢ 😼  01:26, 21 July 2018 (UTC)
Something else we should consider: some of this is simply due to lazy writing... Let me give an example: While it is helpful for the University of Notre Dame article to specify what state Notre Dame is located in... there is absolutely no need to specify what state the University of Michigan or Ohio State University are in (the name of the institution kind of gives that fact away). So... we could avoid the entire "sea of blue" issue and use piped links (writing: "The University of Michigan has its main campus in the city of Ann Arbor" or "Ohio State University is located primarily in the city of Columbus"). Trying to make everything follow a consistent pattern can limit your options. 02:07, 21 July 2018 (UTC) — Preceding unsigned comment added by Blueboar (talk • contribs)
Yep.  — SMcCandlish ¢ 😼  03:15, 21 July 2018 (UTC)[reply]
Assuming of course that there are no American equivalents to the University of Warwick, which isn't based in Warwick but in nearby Coventry. Nigel Ish (talk) 10:51, 21 July 2018 (UTC)
When the First Unitarian Church of Berkeley moved to Kensington, they decided not to rename themselves First Unitarian Church of Kensington. (It'll come to you.) EEng 11:12, 21 July 2018 (UTC)[reply]
Sure, when something that was named for a location isn't actually in that location, the lead does need to more clearly specify where it actually is. However, that scenario is highly unlikely for universities named after US states. Blueboar (talk) 11:53, 21 July 2018 (UTC)[reply]
Highly unlikely?? I suppose you've never heard of Washington University. —David Eppstein (talk) 18:55, 21 July 2018 (UTC)[reply]
Named after the man, not the State... but since not everyone knows that, it could be confusing... so, sure, that would be one where we would include the state location as well as the city. Blueboar (talk) 22:42, 21 July 2018 (UTC)[reply]
(edit conflict) I've never even visited the United States, so I don't know, but I have to imagine the probability of a school with such a name existing, even apart from EEng's example above, is overwhelmingly high. That said, writing the lead sentence to say Fu University is a university located in Notfu, Wisconsin. should be discouraged anyway; if the fact that it is located somewhere in spite of its name is important enough to be noted in the lead, then it should be noted separately (Despite its name, Fu University is actually located in Notfu [for reason X].), not in the lead sentence where all it will do is confuse readers and potentially cause them to believe the page has been vandalized. As for linking, I'm inclined to say case-by-case: most Japanese university articles seem to include such links, and not doing so with American institutions because everyone knows what an Ohio is reeks of WP:SYSTEMIC. Hijiri 88 (やや) 12:00, 21 July 2018 (UTC)[reply]
That's the approach I take to such cases (not just universities but "names that don't make sense" in general). It also drives me nuts when I see a "Fu University is a university ..." construction or the like, anyway. It's terrible writing that treats our readers like they've had lobotomies.  — SMcCandlish ¢ 😼  17:52, 21 July 2018 (UTC)[reply]
Like Hijiri88, I'm wary of the assumption that most readers automatically recognize the names of most U.S. states. ElKevbo (talk) 13:32, 21 July 2018 (UTC)[reply]
I don't think the question is whether the state name should be included, but whether it should be linked. For example, should Yale University link to [[New Haven, Connecticut]] or [[New Haven, Connecticut|New Haven]], [[Connecticut]]? Natureium (talk) 13:52, 21 July 2018 (UTC)[reply]
I would say the first... no need to link to the state article separately. Send the reader to the article on the city ... as that will probably give more relevant information when coming from a university article (such as what neighborhood the university is in, or if there has been any “town vs gown” history, or if there are other universities in the same town, etc)... the reader can get to the article on the state from there. Blueboar (talk) 23:02, 21 July 2018 (UTC)[reply]
I know what the question is - I'm the one who originally asked it. :) But one editor has proposed omitting the location entirely in cases where the institution's name includes the location. ElKevbo (talk) 14:12, 21 July 2018 (UTC)[reply]
The answer is that there is no single “correct” way to do it... there are lots of “correct” ways; and wording that works at one article, may not work at another. That said... in general... a well written article phrases things to avoid unnecessary repetition and avoids over linking. Blueboar (talk) 23:02, 21 July 2018 (UTC)[reply]

@ElKevbo: would you agree that we have consensus that we should not link city and state back to back? Jweiss11 (talk) 17:30, 27 July 2018 (UTC)[reply]

Yes, especially in instances where the title of the city's article also includes the state, i.e., nearly all cases. ElKevbo (talk) 18:23, 27 July 2018 (UTC)[reply]

Trypophobia article – using wording from quoted text

Opinions are needed at Talk:Trypophobia#Latest changes. The discussion concerns whether or not it is fine to quote this source as much as desired without the use of quotation marks, and whether or not we should always use a source's exact words. Regarding the latter, the question is whether it's WP:Original research to use our own wording as opposed to a source's exact words and whether wording like this needs to be tagged as WP:Weasel. The discussion additionally concerns stating things in Wikipedia's voice when sources disagree, the research is new, and/or there is no consensus in the literature on the matter.

On a side note: The Trypophobia article contains an image that some find distressing. So a heads up on that. Flyer22 Reborn (talk) 15:31, 20 July 2018 (UTC)[reply]

Already commented there, even before the ping. --Tryptofish (talk) 16:55, 21 July 2018 (UTC)[reply]

Ordering of gendered titles when a gender-neutral equivalent is unavailable?

The lead of our RWBY article includes the phrase "Huntsmen" and "Huntresses", following my adding of quotes, since it's in-universe terminology. That point is important since we can't just say "hunters", which to the best of my knowledge is gender-neutral, because what they actually do in-universe is more military service. However, virtually all the important characters in the show, including the four title "Huntresses", are female, which makes me wonder if it would be better to say "Huntresses" and "Huntsmen".

Setting aside the "don't use in-universe terminology" solution (which I personally like but would never fly on this kind of article), what's the policy here? Are we allowed to decide based on factors like the above point about the prominence of female characters to change the order? I suspect (can't seem to get hit-counts from GNews on a mobile device...) that the majority of third-party reliable sources (as well as at least one Wikia site) prioritize "Huntsmen", but that could be because of latent sexism, which is concerning...

Hijiri 88 (やや) 08:16, 21 July 2018 (UTC) (Stricken because a more careful check revealed no one puts "Huntresses" first. That might be deliberate satire on the fictional patriarchy being portrayed, but that gets into OR and SPECULATION territory. Sorry for jumping the gun. My bad. Hijiri 88 (やや) 08:48, 21 July 2018 (UTC) )[reply]

To be honest I think you might be overthinking it. It's probably just as simple as the various sources out there copying the press releases/blurbs from Rooster Teeth themselves rather than any latent sexism. Rooster Teeth (as far as I can tell) always use 'Huntsmen and Huntresses' in that order, unless specifically referring to a particular character. Only in death does duty end (talk) 08:27, 21 July 2018 (UTC)[reply]
Well, yeah, but wouldn't that just be latent sexism on RT's part? (Note that I say "latent" sexism because I don't for a second think there's any malice on their part, just habits ingrained in them from being raised in a patriarchal culture.)
That said, I did just Google RWBY "Huntresses and Huntsmen" and found that there was a total of one news result. (I initially didn't bother with this search since I assumed the above mobile device problem might make it pointless.) So I guess the question is kinda moot because, sexism aside, Wikipedia probably shouldn't be using in-universe terminology that even the producers don't use.
Hijiri 88 (やや) 08:48, 21 July 2018 (UTC)[reply]

Do we italicize the names of toy franchises?

I'm not really sure where to ask this, but I guess I'll ask it here. Should the name of a toy franchise be italicized? See Lego Friends. I've seen this a few places lately, including at Transformers, but not at Garbage Pail Kids. So, I'm confused. Thanks, Cyphoidbomb (talk) 01:26, 22 July 2018 (UTC)[reply]

No. If anything, the GPK case has a better claim to italics, since they're technically a published serial work (collectible cards) not toys. The GPK article is stylistically all over the place, mis-capitalizing in headings, putting "scare quotes" around Facebook, etc. Haven't looked at the Transformers stuff. I would guess that someone's been italicizing it because they've seen some other franchises italicizing and are just copy-catting the style.

Anyway, we used to have clear instructions about this, and I wonder if they've been lost or dis-clarified. They were to not italicize a franchise, trilogy or other book or film series, fictional universe, or other mass of works and products, or reference thereto, unless and except where it is named after the title (not partial title) of the original work in the series: Thus, these are okay: the Star Trek franchise, Asimov's Foundation series, the Star Wars Extended Universe novels; but these are not: Tolkien's Middle-earth fiction and the films and games developed from them, The Harry Potter series, the Marvel Cinematic Universe. Some serial works come with an over-arching series title in addition to the works' titles, e.g. The Chronicles of Thomas Covenant;[a] this shouldn't be italicized, as it just confuses as to what the franchise name versus the book titles are. This may be a matter of MOS:TITLES not being fully centralized yet. As with MOS:BIO until the recent merges, the work-title-related stuff has been scattered through various guidelines.

The off-WP styles vary, but overall seem pretty close to this rule, or even to italicizing less, i.e. not italicizing franchise names at all. Sometimes they're put in quotes, sometimes italicized, more often neither. This un-stylizing habit pre-dates common fiction franchises, and evolved from treatment of non-fiction series of works (which are distinguished from single works published in a number of volumes, as many reference works and major monographs, like The Golden Bough are).
 — SMcCandlish ¢ 😼  04:05, 22 July 2018 (UTC)[reply]

  1. ^ Off-topic public service message: I must warn everyone away from the Thomas Covenant series. It's enormous – each book in the trilogy of trilogies is about 500–900 pages – and almost as detailed as any other F&SF series there is, but once you wade through it, you find that much of the plot was futile, and you'll be disgusted by the protagonist, not caring much about the fools going along with him. If I could reclaim any fiction-reading time I've ever spent it would be that series, and I stopped at book 6 after fighting the urge to do so at book 4. So, if you just must investigate it, stop at book 3, before it really goes off the rails.

The endless "fan-capping" problem

We really need to do something to clarify the wording in the main guideline and (where applicable) at MOS:CAPS, MOS:TM, MOS:TITLES, WP:NCCAPS, WP:OFFICIALNAME, etc., to stem the tide of fandom-based over-capitalization: "WP has to use the weird capitalization in this company logo/movie poster/album cover just because it's official and because business/entertainment magazines do it".

On a nearly daily basis there are either a) new RMs to move articles to MoS-non-compliant names to mimic logos and other marketing, or b) fandom-based opposition to an MoS-compliance RM, because it doesn't match the over-capitalization on the album cover or the director's blog. A common variant of this is denial that the MOS:5LETTER rule (capitalize a preposition in the title of a work only if it is five letters or longer) exists or can be applied, simply because it's not the style used in news journalism.

One egregious case was the repeated fight to try to move the song article "Do It like a Dude" to "Do It Like A Dude" complete with capital A (yes, really) as well as capitalized preposition, to match the font styling on the cover of the single. An ongoing one is Talk:Spider-Man: Far From Home#Requested move 14 July 2018, swamped with WP:ILIKEIT votes that are ignoring all WP:P&G arguments.

This is sucking up way too much editorial time at RM, and the discussions are always circular rehash. It's a constant firehose of WP:LAME. Something in the guideline wording needs to be adjusted to curtail this stuff. I can take a stab at it at some point soon, but would rather have some additional eyes and brains on it. Where are the weak spots? Why is it not getting through? How can so many people, even when pointed directly at the guidelines, all saying the same thing in different wording, still somehow not understand?
 — SMcCandlish ¢ 😼  14:43, 23 July 2018 (UTC)[reply]

  • I know this is "locking the barn door after the horse has fled"... but this all stems from our decision to put article titles in sentence case instead of title case. That was a bad decision, looking back at it (90% of the arguments would have been avoided if we had decided to use title case)… unfortunately, fixing that decision is now unworkable. Blueboar (talk) 14:51, 23 July 2018 (UTC)[reply]
    I felt that way when I first arrived; I really hated the sentence casing. But if we'd picked title case it would have made disambiguation a lot messier, and would make it a lot harder to tell whether something was about a proper noun without actually going to the article. I think that's why sentence case prevailed. A decision way before my time.  — SMcCandlish ¢ 😼  19:58, 23 July 2018 (UTC)[reply]
  • (edit conflict) I agree with the thrust of SMcCandlish's post. Indiscriminate capping is indeed a problem—and Blueboar, in my view rendering article titles in sentence case was the sanest formatting decision ever taken in the early days of en.WP. Sometimes we could be forgiven for feeling that fan-capping and vanity-capping is a violation of WP:POV, at least in spirit. Every self-respecting publishing house imposes its own rules (especially WRT capping, which has no influence on google searches). One reason is that inconsistency drags down the subtle sense of a publication's authority. Tony (talk) 14:59, 23 July 2018 (UTC)[reply]
    Of course you agree! You two are the main promoters of this absurd rule. WP has no business fiddling with the title of works where the actual title is clear, especially using its own home-cooked rules. I'd welcome discussing the question, and would hope that commonsense would prevail, and WP:COMMONNAME rule supreme, as it should. Johnbod (talk) 15:05, 23 July 2018 (UTC)[reply]
    Been over this already [9] (many, many times). And COMMONNAME is not a style policy; WP:AT and the naming conventions defer to the Manual of Style on style questions.  — SMcCandlish ¢ 😼  20:10, 23 July 2018 (UTC)[reply]
  • A consistent factor in the debate is the issue of "common name" versus "common style". Those who support the current position of the MoS argue that name and style can be separated, so that we decide on the name based on sources, but then apply our own styling based on the MoS. I fully accept this general approach, but long ago (during the debates about the capitalization of organism names) I asked those who espoused name/style separation to provide clear definitions and explanations of when orthography could be changed, as it is 'merely' style, and when it could not, as it conveys important components of meaning. I think it's important for those who support the status quo to try to step back and see things from a less committed perspective. For example, the MoS regards some capitalization in sources as 'mere' style, and so of no semantic importance, even though the capitalization is clearly noticed by many editors, who faithfully copy the source. On the other hand, the MoS imposes style choices, such as length of dash, that evidence shows many (if not most) editors don't notice and so don't naturally copy. This difference seems inconsistent to many editors, which, I think, is one reason for the endless re-opening of old debates.
So to answer User:SMcCandlish's question, I think that more rigorous and reasoned definitions and explanations of the differences between "name" and "style", as relevant to the MoS, might help.
What doesn't help or contribute to reaching consensus is giving positions you don't agree with pejorative labels, like "fan-capping", when many editors see it as just following the sources they use, which in other contexts is laudable. Peter coxhead (talk) 19:11, 23 July 2018 (UTC)[reply]
The whole problem with the idea (as has been explained to Johnbod, et al., many times, but they just refuse to hear or accept it), is that different kinds of publications have one rule or another about how to treat prepositions in titles of works. The same work will have its title rendered differently depending on who's writing about it, in the real world. News journalism usually uses a four-letter rule (sometimes even three, depending on publisher, with marketing style leaning toward one letter), at one extreme, while academic journals tend to go with never capitalizing any prepositions, even long ones like throughout and alongside. We and various others have a middle-ground approach, a compromise. But if all you read is newspapers and magazines, all you're exposed to, pretty much, is the four-letter rule, and you get the impression that it's The One True Way to write English. This is obviously an illusion. Pick any well-known book with "from" or "with" in the middle of its title. You'll find that journalism sources render it "From" or "With", academic ones virtually always lower-case them, and other kinds of publications vary widely.

There is no "official spelling" obeyed by everyone. It's a fantasy. And a weird one. I have never in my entire life encountered a book author, movie director, etc., throwing a public tantrum because a book review or a film journal used "from" but the artiste's marketing materials use "From". Seriously, no one cares, except a small number of Wikipedia editors. WP:RM routinely resolves to follow MOS:5LETTER, time after time. Yet people who focus on entertainment magazines and websites as their sources never, ever stop trying to forcibly capitalize "with" and "from" in works they like. They don't go around doing this to titles of obscure works of non-fiction, or songs people have probably not heard unless they're over 60; they do it with current pop-culture topics that they're big into. It appears to be another strain of the "I want to capitalize this because it makes it seem important" thing; the same emphasis-caps urge that we have to deal with a lot more broadly. But this fannish version of it is just really common, and really tendentious.  — SMcCandlish ¢ 😼  19:55, 23 July 2018 (UTC)[reply]

Perhaps it is time to accept that our MOS is out of sync with what our editors want. Continually telling people “but what you want is WRONG” is pointless if no one wants to listen. Blueboar (talk) 20:16, 23 July 2018 (UTC)[reply]
Except that if these voices of complaint are coming from fans (and reading between the lines, dedicated fans) there's a bit of COI - not actionable! - here to demand that MOS is wrong. The argument reminds me of the past situation with MMA and current with wrestling in general that "but for our area, we need our rules!" They're not seeing the bigger picture that a MOS is meant to provide, which is a general reading and editing consistency for WP. I would say the community is listening, but simply not accepting the argument that the one topic area needs special rules here, particularly one based on pop culture. --Masem (t) 20:52, 23 July 2018 (UTC)[reply]
Right. The same 10 or so people – out of around 30,000 monthly editors – pursuing pop-culture over-capitalization again and again no matter how many times consensus turns against them is not an indication that our guidelines are broken.  — SMcCandlish ¢ 😼  06:05, 24 July 2018 (UTC)[reply]
This problem can be summed up (kind of) with the Wikipedia Stephen King story collection Four past Midnight title, which seems wrong on many levels. It's not the name that King uses, nor his fans (and no, I am not a dedicated fan, read him long ago but he lost me in the decades he started writing 7,000 page books), nor the world at large. This one example stands out as defining "what's wrong" with the hard-and-fast rule on how many letters a word has to have to be capitalized in a title. The other major problem is MOS says that if something isn't capitalized "consistently" (which some editors define as always, no exceptions) then it must be lower-cased, even if the vast majority of sources and common sense itself deem that upper-case is the way to go. That MOS point often gives naming-rights to a few people, those who write the sources. They, probably out of ignorance or research-laziness, fail to upper-case something, and that error then flows into Wikipedia where it can be pointed to as non-consistency, and thus brings non-common use styling into this project. Solving it should not mean adding even stricter language to MOS, but loosening it up to allow common-sense and most familiar names in English to be considered as important and viable components in capitalization decisions. Randy Kryn (talk) 00:17, 24 July 2018 (UTC)[reply]
Here is the n-gram for Four past Midnight, published in 1990, which seems relevant to this discussion. Randy Kryn (talk) 03:55, 24 July 2018 (UTC)[reply]
I've suggested before that we could consider a change, to capitalize short prepositions that often are not prepositions (Past, Like, etc.). Just because various style guides treat all prepositions, by length, as a class doesn't mean we are forced to, especially given doubt among linguists that the "preposition" categorization is actually valid rather than an obsolete idea from early-20th-century approaches to language (for an easy-reading explanation of this, and a tremendous amount of good writing advice, see Steven Pinker's The Sense of Style, which covers it in detail without miring the reader in linguistics jargon; IIRC, it's covered in ch. 4, "The Web, the Tree, and the String", which should be required reading before anyone can edit this site. >;-) No one's bothered with an RfC suggesting such a change, and instead they just try to re-re-re-litigate their preferences at RM after RM. It's a productivity drain for everyone. And that is all it is. It's not editorial cluster A's preferences versus cluster B's, it's A's versus something like 5 consistent guidelines and thousands of previous RM closes. But such a change still would have no effect on "with" and "from". Over-capitalizers of these just need to let it go. It's a classic specialized-style fallacy, the silly notion that sources reliable about a topic (e.g. who has been cast in an upcoming Spider-Man movie) are reliable sources for how Wikipedia must write and style prose about the subject.  — SMcCandlish ¢ 😼  06:19, 24 July 2018 (UTC)[reply]

PS: An N-gram on a pop-culture topic is utterly meaningless for capitalization analysis of titles of works, because around 90% of the material written about such topics is entertainment journalism, which all follows the four-letter (or even shorter) rule. I.e., it's circular reasoning, begging the question, cherry picking (in the off-WP sense, a.k.a. fallacy of incomplete evidence), all at once. I think people have difficulty with this because they mistake COMMONNAME for a style policy and don't understand the reason we have the policy and why it's not a style policy. It exists so people looking for David Johansen don't end up at Buster Poindexter; it has nothing to do with forcing particular nitpicks of typography, and we have at least 5 guidelines against doing that to mimic "official" stylization.  — SMcCandlish ¢ 😼  06:38, 24 July 2018 (UTC)[reply]

@SMcCandlish: returning to your initial question, neither Wikipedia:Article titles#Article title format nor Wikipedia:Manual of Style#Article titles explicitly cover the situation where the title of an article is the title of a work. Perhaps it would help to add something here, or at least to add links to other places in the MoS. When What Is To Be Done? is used as an example in the MoS, you can understand why editors might choose "Four Past Midnight" or capitalize "with" in the title of a work. Peter coxhead (talk) 06:59, 24 July 2018 (UTC)[reply]
Hmm. That appears to be an error. I'm surprised it wasn't noticed sooner. Interestingly, even WikiProject Russia's literature task force has it as What Is to Be Done?, so it's odd that the "To" version has arisen. Anyway, no such over-capitalization shows up in any example of titles at MOS:TITLES, MOS:CAPS, etc., as far as I can tell. Anyway, "the situation where the title of an article is the title of a work" might well be part of the issue. Maybe the assumption that covering it at MOS:TITLES is enough is a poor assumption. A cross-reference, at least, couldn't hurt.  — SMcCandlish ¢ 😼  07:27, 24 July 2018 (UTC)[reply]
RM opened: Talk:What Is To Be Done?#Requested move 24 July 2018  — SMcCandlish ¢ 😼  10:18, 25 July 2018 (UTC)[reply]
  • Blueboar, when you write "what editors want", are you sure you don't mean "what Blueboar wants"? Tony (talk) 04:57, 24 July 2018 (UTC)[reply]
    Nope. I have not been involved in the discussions about any of these titles. Blueboar (talk) 10:38, 24 July 2018 (UTC)[reply]
    But you have been in many similar ones. >;-)  — SMcCandlish ¢ 😼  10:18, 25 July 2018 (UTC)[reply]
  • I echo the objection to the use of pejorative phrases like "fan-capping" and to the implication that the motives of anyone involved have anything to do with their level of fan involvement. The highest principle in WP:Article titles, that topics are named based on reliable, secondary sources, stems from WP:Verifiability. These are core POLICIES compared to a MOS guideline which continues to expand beyond its initial purpose of addressing technical limitations of Wiki software and is now becoming a WP:OR bible of usage. The problems related above stem from MOS advocates pushing this set of guidelines too far to the front. A MOS is fine for describing how we should handle original, subjective prose in articles, but cannot be used to override hard facts (like titles of works and other proper names) which are directly presented in reliable sources. So no, we cannot continue to alter commonly-accepted titles of works based on a set of guidelines we've created ourselves - except where the common presentation of such is incompatible with the wiki software or other practical concerns. -- Netoholic @ 08:27, 24 July 2018 (UTC)[reply]
    • WP:NOT#NEWS is also policy: "Wikipedia is not written in news style." Meanwhile the policies you want to cite you are badly misinterpreting; they dictate nothing whatsoever about style.  — SMcCandlish ¢ 😼  10:23, 25 July 2018 (UTC)[reply]
  • I agree with Netoholic. Not only in the pejorative way this topic is stated, but in the general "tail-wagging-the-dog" mentality here. I'm sorry, but who the hell do we think we are? We have zero moral right to tell an author their title is wrong just because it offends some rando on the internet's idea of proper grammar. Can unusual styling be promotional? Yes. Does that mean we should never use unusual stylings to maintain NPOV? Absolutely not. Maybe the author/director/producer chose such a style for promotional reasons, or maybe they had an artistic purpose. If you don't know the reason, then you have a moral obligation to the author's freedom of expression to respect their artistic choice and use their styling. Period. Anything else is bollocks. oknazevad (talk) 10:00, 24 July 2018 (UTC)[reply]
    Since you're just agreeing with Neto, see reply to him just above. Your thing about there being some kind of One True Title has also already been covered in detail [10]. It's a fantasy. The exact same work's title will be rendered with "From" in a newspaper and "from" in a film journal. And no one cares, except a dozen or so people who won't stop going on and on about it on Wikipedia. The entire notion is pure WP:OR.  — SMcCandlish ¢ 😼  11:25, 25 July 2018 (UTC)[reply]
  • Hi everyone. Personally I think Masem sums it up best: MOS is supposed to provide a feeling of consistency across all topics and articles. As for the specifics, I'm also in favour of applying standardised capitalisation in nearly all cases (as stated above, it makes it much easier for readers of an encyclopedia, rather than following the whims of branding and advertisers). There are a few cases where reliable sources all tend toward using the owner's stylisation (e.g. iPod, eBay, etc.), which is absolutely the right thing to do, but for titles of individual comic books, cartoons, etc. there are often not enough serious reliable sources that use consistent style guides and have professional editorial oversight that cover them for us to follow their conventions (e.g. it's not unusual for pop-culture artifacts like this to be reviewed by one or two borderline reliable websites by semi-professional writers, and only edited cursorily – this isn't really a strong precedent to follow, and I would say that, unlike iPod etc. above, there is in practice no fully established consensus among reliable sources on how these specific stories should be capitalised, as they are not covered widely enough). ‑‑YodinT 13:49, 24 July 2018 (UTC)[reply]
    • Also, following the main thread of the discussion, I'd agree with pretty much everyone above that this whole process is a productivity sink. I can't see a way forward that would help with this in practice, but would just say that my impression is that it's essentially about what casual editors see as looking "right" rather than trying to make their topic more important (though no doubt they might also try to do this in other ways) – they look up Four past Midnight or whatnot, think to themselves "that just doesn't seem right", and then (and normally only if they really care about the topic...) they try to get it changed to be all initial capitals. We absolutely could change our style guidelines on the capitalisation of titles of works (following modern journalistic style guidelines for example), but then the exact same process as above would play out in reverse, but for fans of traditional grammar/capitalisation, who would invest the same amount of energy in trying to get it reverted back to what we have now. On the one hand, the grammar-fans might perhaps be more likely to understand that it's our convention, and just to accept our style guidelines even if they disagree... (maybe a bit optimistic...), but on the other my impression is that it might alienate them further from Wikipedia in a way that pop-culture fans wouldn't be put off. Just a few thoughts. ‑‑YodinT 14:30, 24 July 2018 (UTC)[reply]
      I've said for a long time that people are free to open a proposal to change to the four-letter rule. They never do it. They only want to squabble for additional capitalization for particular works that they care about (modern, popular stuff with shiny logos and entertainment-press buzz). I've given this a lot of thought, and there are really 5 possible courses; it's unclear which would be least painful for the community: 1) Don't actually resolve it, and instead keep at the same circular rehash at RM after RM until, I guess, people just die off or something. 2) Revise approx. 5 guidelines and 2 policies to be clearer about this and to put a stop to it. 3) Adopt the four-letter rule, requiring tens of thousands of page moves and millions of inline edits, against surely stiff resistance (WP didn't end up with the 5-letter rule by accident, it was chosen and has remained consensus the entire time). 4) Make up a "five-letter plus" rule that makes exceptions for words some people don't like lower-casing (e.g., "like" as a preposition – but this would still have no effect on "from" and "with" debates, and would still move probably 1000+ pages and involve at least tens of thousands of in-article changes). 5) Fire up dramaboards to restrain people from any more of this WP:IDHT / WP:TE activity at RM. The "6th option" isn't one: It's just not going to happen that we'll randomly apply whatever stylization seems to be the majority usage for any given work, since it would produce utter chaos. Every proposition in this direction has died. We have explicit policies against this idea, like WP:CONSISTENCY and (given that most of the result will be from news, not better sources, and will be following news style guides) WP:NOT#NEWS. It's also not within WP:COMMONNAME ambit at all.
The entire dozen years I've been here, the same few people have been pushing a "COMMONNAME is a style policy" hypothesis, and RM consensus tells them they're wrong again and again and again and they just won't accept it. WP:AT and the naming conventions guidelines defer to MoS on style matters, on purpose. All WP:NCCAPS is is MOS:CAPS applied to titles. We should probably just merge them.  — SMcCandlish ¢ 😼  12:16, 25 July 2018 (UTC)[reply]
  • I generally look at it as: If we wouldn't change the capitalisation on someone's name, we shouldn't on the name of other works. Where the capitalisation as part of the title is clearly evident (most often in books, film titles etc) and not a function of the logo/trademark (often seen in companies with allcaps/lowercase etc) then really WP:COMMONNAME does apply. Capitalisation has never been accepted as solely a style issue rather than a naming issue, which is why the RFC is getting the results it is. And really the point of the MOS is that it is meant to be a guide of best practice for the majority of situations with some exceptions. On this issue there are too many exceptions, which can easily be argued to make the MOS guidance not useful. If it causes more problems than it solves, it's not useful guidance. What would eliminate most of the conflict on ENWP would be stating that where the article title matches a creative work, capitalisation is deferred to local consensus. But that is never going to get consensus either. Only in death does duty end (talk) 14:49, 24 July 2018 (UTC)[reply]
    Already been over this [11]. The RM is a joke. It is nothing but fans arguing that because a primary source and some entertainment newa – all using the four-letter rule – style it by their rule that this "proves" WP's MoS is wrong – about a movie that doesn't even exist yet. It's absurd. They only thing is proves is that, yes, those writers do follow their four-letter rule, which was never in question – not in this case and not in hundreds of previous RMs to fix overcapitalization based on marketing or news style (which are the same style – the capitalization schemes and other "rules" promulgated in marketing style guides like that of the American Marketing Association are direct directly from the AP Stylebook. If you try to get a job in marketing or PR (in the US, where Hollywood is) there's about a 99% chance that a requirement of the position will be detailed knowledge of AP style. "I'm following news, not marketing" is a non-argument; they're the same style. It is not WP style. Actual policy: "Wikipedia is not written in news style." And of course WP:NOT#PROMO also equates, in this context, to not written in marketing style either. It's why we have MOS:TM. PS: I takes no time at all – despite AP style's near hegemony on news typography – to find news source that don't do "From"; the further they are from the Hollywood press in at least one of subject matter, location, or dependence on print publishing, the faster you see "from". [12][13][14][15][16][17][18][19][20][21][22]. Many of these are even entertainment-related publishers, but they're "webby" ones. A few veer back and forth between spellings [23]  — SMcCandlish ¢ 😼  12:16, 25 July 2018 (UTC)[reply]
  • I'd support general statement to use title case for titles of articles, where the title is the title of a work (book, play, poem, media, etc.), unless sources on the work specify otherwise. It has nothing to do with being a "fan". Alanscottwalker (talk) 21:54, 24 July 2018 (UTC)[reply]
    @Alanscottwalker: No one is arguing otherwise. The issue is that what "title case" means is a bit different depending on house style. We've had a particular one – a mainstream compromise between the extremes of news style and of academic style – for a long time. A few editors can't stand it and just won't drop the stick.  — SMcCandlish ¢ 😼  11:12, 25 July 2018 (UTC)[reply]
    Except I would stress dependence on common sources for the subject, better bringing it in line with the spirit of titling - 5 letters is just arbitrary, instead of, say, four -- it will always be arbitrary and thus the subject of dispute. So, having recourse to sources is the usual way to form evidence-based decisions. Alanscottwalker (talk) 11:25, 25 July 2018 (UTC)[reply]
    But "sources" predictably do something different depending on what house style they follow, and almost all coverage of movies is going to be from entertainment press following AP Stylebook which is not our stylebook. It's rather like suggesting that because the average American eats more cheeseburgers than pupusas, that cheeseburgers are proven to be better food, when you limit your dataset of food choices only to Americans. It's abuse of statistics. The end result of a course like this would be pretending WP:CONSISTENCY policy doesn't exist, because all movies and TV shows and pop albums with "from" or "with" in their titles will end up at "From" and "With" and other works (classical music, influential novels, paintings, etc.) with the same names will end up at "from" and "with". That would certainly cause endless and much more widespread dispute. Probably all it takes to end this dispute is clearer, synchronized guideline and policy wording. Basically, "yes, WP does have a 5-letter rule; no, we do not randomly use different capitalization on the whims of what the source pile this month is showing." Why? Because works reliable for facts about a subject (who's in the movie, etc.) are not reliable sources for how English must be used to write about that subject in an encyclopedia.  — SMcCandlish ¢ 😼  12:27, 25 July 2018 (UTC)[reply]
No one cares that you do not like the way movie sources do it. If you are writing and reading about that subject, that will be the natural, expected, and understood way. -- Alanscottwalker (talk) 12:48, 25 July 2018 (UTC)[reply]
Until you read a film studies journal or other higher-quality source, not just newspapers and e-zine doing reviews and regurgitating studio press releases and star interview comments (all primary-source material). We don't write WP based on the style of the numerically most common source type, or, for example, all our video game and rock music articles would read like gamer-zine and Kerrang! material, and our articles on medical and science topics would be ponderous, jargon-encrusted, and impenetrable to the average reader (or even to specialists in other fields). PS: You are not a mind-reader. You have no idea whatsoever what I "like". In point of fact, I use the four-letter rule in my off-site writing and in my music playlist. I'm just able to separate my personal preferences from what Wikipedia's guidelines are, something an increasingly tedious handful of people seem to have great difficulty doing.  — SMcCandlish ¢ 😼  15:00, 26 July 2018 (UTC)[reply]
  • Per the debate at the Spiderman RM, I'd be in favor of honoring what reliable sources use first, and use style guidelines when sources are mixed. There are repeated invocations of "chaos" on standards like this, but really, this is how most article titles are decided already - what do the sources use? Sometimes there are problems when there's a split between "scholarly" and "popular" sources, and some titles are descriptive titles that really are invented by Wikipedia (History of XYZ from 1900-1963 or whatever), sure. But it's usually pretty workable and the vast majority of article titling works fine without any trips to RM at all. Additionally, titles that are not in sources at all and are not descriptive titles are almost universally a bad idea, and the few times they've been tried (e.g. due to an AmEnglish / BritEnglish naming dispute) often get moved again later. That said, that's just my two cents, and I'd be happy to be wrong if I was outvoted. The more interesting thing is SMcCandlish's contention that, basically, everyone else is constantly wrong and it's eating up too much time with this "endless problem." Doesn't that imply that the proposed standards SMcCandlish supports are not supported by the editor community at large? Style pages reflect consensus, they don't guide it. If the variance is as bad as claimed, then that seems a strong signal that something is wrong with the current guidelines, and they aren't receiving wide community support among the editors, and should be changed to reflect what does have support. SnowFire (talk) 23:15, 24 July 2018 (UTC)[reply]
    Repeat: [24]. The amount of variance isn't bad, the amount of time wasted by the same handful of "give me marketing mimicry or give me death" people is what's bad. And actually read the RM thread. There is no "reliable sources versus style guidelines" issue; it's news style versus WP style. All the sources cited are following news style, and WP does not as a matter of actual policy (not guideline).  — SMcCandlish ¢ 😼  10:18, 25 July 2018 (UTC)[reply]
The thing is, WP style can change if we need it to... So, the question is: should we change our style to better sync up with the style commonly used by news media? Blueboar (talk) 10:42, 25 July 2018 (UTC)[reply]
If this had been a good idea 17 years ago (evidently it wasn't or, why did consensus prefer the five-letter rule?), the cost to do so now would be staggering. Thousands and thousands of articles would have to move, and it would mean changing literally millions of in-text references, all because about a dozen people just can't seem to understand that every publisher has a house style and ours is not the AP Stylebook. It's really that simple.  — SMcCandlish ¢ 😼  11:09, 25 July 2018 (UTC)[reply]
Ah, the Founding Fathers! Fortunately it is easier to change MOS than the US constitution. I wonder how many people actually participated in that decision - do you have a link? I rather doubt that the effects would be on the scale you claim - many creators or publishers of works follow something similar. It sounds like it's time to rebrand this as "the endless MOS strange rule problem". Johnbod (talk) 14:10, 25 July 2018 (UTC)[reply]
Wait, "cost"? There's no significant cost, unless someone specifically takes it upon themselves to go forth and tweak every applicable page name and in-text reference personally and immediately. These things can spread out over time. --tronvillain (talk) 14:27, 25 July 2018 (UTC)[reply]
Spreading it out over time does not eliminate the cost, it just hides it. Every minute spent by an editor making a change that doesn't objectively improve the encyclopedia is a minute wasted. This is one of the reasons MoS regulars are so strongly resistant to willy-nilly changes. The majority of style matters are ultimately arbitrary (and this one absolutely is); the value in having a line item about it is consistency for readers and dispute reduction for editors. No particular style is "correct" in some absolute sense. So, changing from one capitalization scheme to another because a few editors won't stop venting about it, or even because at some point 50.001% of editors prefer it, is a bad idea. WP:NOT#DEMOCRACY exists for a reason. There are serious productivity costs involved in changing stuff based on either mob-rule whims or the "argumentum ad nauseam until I win by wearing down the opposition" technique.  — SMcCandlish ¢ 😼  15:00, 26 July 2018 (UTC)[reply]
I suppose I'm just repeating myself, but since my argument is continually misconstrued and strawman'd by SMcCandlish: I am not suggesting that "marketing mimicry" be used. I am suggesting reliable sources be used, which is not the same thing for when there's a crazy style official title and another more standard title used in the media. You can disagree with that too, but can you at least argue against my actual stance? SnowFire (talk) 17:15, 25 July 2018 (UTC)[reply]
There are multiple kinds of reliable sources, and they use different styles. A "with" or "from" in the title of the exact same work will be capitalized based on the type and house style of the publication writing about it, not the work about which they are writing. This is what you won't seem to absorb, yet it takes only seconds to prove it. "Gone with the Wind" dominates in scholarly writing [25] as does "Far from the Madding Crowd" [26], while "Gone With the Wind" is more common (barely) in news writing [27] as is "Far From the Madding Crowd" [28] (when using it as a title rather than a clichéd phrase). Book sources lean lower-case but less strongly than journals [29][30], plus N-gram [31]; there are some book publishers who go four-letter. The style you are arguing for as if it were the only one, or the only permissible one, or the "correct" one, is AP Stylebook style, which is preferred by the entertainment press and most newspapers. It's also what marketing style is based on. On this particular matter they are the same style. The distinction you are trying to draw between entertainment-press sources and the logos, posters, covers and other marketing simply doesn't exist. Meanwhile, you're ignoring the distinction between the newsy sources, academic sources, and mainstream book sources, all of which approach title case differently (and often differently from publisher to publisher, even within one of those categories – some news publishers have a three-letter rule, for example). For any given work, the type of source that makes up the majority of coverage of that work will vary, and will even change over time (e.g., lots of film and other journals write about movies, but usually several years after their release, after their social impact, critical reception, etc., have built up and can be analyzed).  — SMcCandlish ¢ 😼  15:34, 26 July 2018 (UTC)[reply]
Also, to respond to your actual argument: WP:NOT#NEWS has nothing to do with this topic. That's a guideline about not having WP articles about the Tacoma Truck Fair or the like. But it doesn't really matter; even if we granted that NOT-NEWS also means "ignore newspapers for anything related to capitalization", I'm not saying that news media is by any means an exclusive source. There's tons of other published media that discusses art/novel/movie/song titles as well. If books written on the topic all refer to something using a particular styling, those too are valid sources for what the title (capitalization and all) actually is, and not forbidden by NOTNEWS, and are perfectly valid applications of verifiability for article content & style. SnowFire (talk) 17:30, 25 July 2018 (UTC)[reply]
Except not. It's a policy, not a guideline. It covers many things, not just what you say it does. Among these various points: "Wikipedia is also not written in news style." Amusingly, books written about a subject like this generally do not use the styling you want (demonstrated above [32]), nor do news publishers all use it (demonstrated in same post, and for that Spider-Man movie in particular, here), so your "If books written on the topic all ... refer to something using a particular styling" idea is inapplicable twice over.  — SMcCandlish ¢ 😼  15:41, 26 July 2018 (UTC)[reply]

Citations in the lead

This text was added to WP:MEDMOS, based on this discussion.

Consensus was not gained that this change is in concurrence with project-wide MOS. It has not been determined that statements in the leads of medical articles are more likely than any other type of article to be challenged, and the main reason for this push for citations in the lead has been for the (external) translation project, which translates only the leads of medical articles (a separate problem in and of itself). Many examples have been given over the years of how this demand for citations in the lead compromised the summary aspect of article leads. SandyGeorgia (Talk) 14:57, 24 July 2018 (UTC)[reply]

Beta-Hydroxy beta-methylbutyric acid provides an example of this new trend, with up to six citations per sentence. SandyGeorgia (Talk) 15:11, 24 July 2018 (UTC)[reply]
Sorry, whats the issue? MOS is not incompatible with WP:V which can require inline citations in the lead, and neither is MEDMOS, it just lays out two advantages to doing so. Apart from biographies and fringe science, Medical articles are certainly one of a group of 'likely to be challenged for claims of fact' (especially when it intersects with fringe/alternative medicine) that would require citations per WP:V, so having them there in advance isnt an issue, nor is there anything in the site-wide MOS that says you cant. 'Its not necessary' encyclopedia wide is not incompatible in any way with MEDMOS 'its not necessary but here are a couple of reasons why you might want to for these articles'. Only in death does duty end (talk) 15:11, 24 July 2018 (UTC)[reply]
What is at issue is broader discussion to keep MEDMOS in sync with site-wide policy and guideline. If the claims put forward about the reasons for requiring extreme lead citations in medical articles are true (I disagree that they are), then the reasoning should be included in a site-wide guideline, not just MEDMOS. If false, the wording is extraneous. Citations can always be provided in leads for any articles if consensus is developed on an individual article. The push here is to demand them for the purposes of an external project (translation), which in and of itself has resulted in compromised quality of articles, as the focus is on the lead rather than the body of the articles. Forcing citations into leads in many cases has rendered it difficult to write a summarizing lead. The extreme to which this has gone is seen at Beta-Hydroxy beta-methylbutyric acid, where there are up to six citations per sentence. If that is the direction we want lead citations to go, then it should be a general guideline, not just a medical guideline. Broader input, beyond the increasingly walled garden at the Medicine project, should go into this decision. SandyGeorgia (Talk) 15:18, 24 July 2018 (UTC)[reply]
But it is in line with the wider MOS. Neither states you are or are not required to have citations in the lead. MEDMOS gives two reasons why it's preferable for medical articles to do so. Those two reasons do not exist for every article and so would be inappropriate to add to a site-wide MOS. And the wording as written is hardly a 'demand'. If you want to write a medical article without citations in the lead, you are still free to do so. But someone may come along and add them later - functionally you cannot prevent that - because if you did attempt to remove them over a style issue, they would just cite WP:V and add them anyway. There are plenty of examples of local topic-specific guidelines that do not apply to other topics. It's only a problem when they are in conflict, and there is no conflict here. (The problem with BHBMA looks more to be citation overkill where multiple citations are used for single relatively short sentences, where one citation for the lead would do with the others in the body) Only in death does duty end (talk) 15:24, 24 July 2018 (UTC)[reply]
I believe Sandy is referring to MOS:LEADCITE. --Izno (talk) 15:32, 24 July 2018 (UTC)[reply]
It is untrue that "If you want to write a medical article without citations in the lead, you are still free to do so." I intended to bring Dementia with Lewy bodies to FA standard, and was told in the rudest possible terms that it would be strenuously opposed unless I (over)cited the lead. I was forced to cite the lead, which makes it harder to write a compelling summary. And it has not been determined site-wide that leads of medical articles should be an exception. MEDMOS and MEDRS have been widely accepted partly because of efforts in the past to make sure they stayed in sync with broader policy and guideline. Citation overkill, and substandard citations in leads to meet the needs of the translation project, are an issue across medical articles. SandyGeorgia (Talk) 15:34, 24 July 2018 (UTC)[reply]
Its not untrue at all. If you want to write a 'featured article' you might have to jump through extra hoops but thats the price you pay for writing a featured article. You can write a standard medical article perfectly fine without citations in the lead (unless WP:V comes into play). No one can stop you. And once again, MEDMOS is not stating it is an exception to wider site guidelines, its merely stating there are a couple of reasons why you might want to do it differently for medical articles. (Izno, its also not in conflict with LEADCITE - which itself accepts certain types of articles are more likely to require citations in the lead). Only in death does duty end (talk) 15:43, 24 July 2018 (UTC)[reply]
Overcitation of leads is not a requirement for FAC. If it is to be the case for medical articles, then it should have consensus beyond the walled garden of the medicine project. Hence, the broader discussion that should have been initiated before the change was made to MEDMOS. SandyGeorgia (Talk) 15:44, 24 July 2018 (UTC)[reply]
Why would there be a broader discussion on changing a guideline that only applies to medical articles? You have yet to point out where there is an actual conflict between what MEDMOS says currently and wider site guides. They all currently state (with the exception of where WP:V comes in) that lead citations are not required. Only in death does duty end (talk) 15:50, 24 July 2018 (UTC)[reply]
When individuals (unsupported by broader consensus even in discussions at WP:MED) are forcing non-site-wide practices into FAs and local guideline pages, a broader discussion is optimal. And, as already pointed out (and mentioned over the years), great care was taken in earlier years to make sure that MEDRS and MEDMOS stayed in sync with site-wide policy and guideline. Taking local pages beyond what has site-wide acceptance jeopardizes years of careful work. (Not to mention the damage to article leads that results from this practice.) SandyGeorgia (Talk) 18:56, 24 July 2018 (UTC)[reply]
You are repeating yourself but you are not actually answering the question. They are currently in sync with site-wide practice. FAC is largely irrelevant as it can (and regularly does) mandate higher standards than are required for articles (to be published). If you are having a problem with getting a featured article label on an article because the editors involved in featured articles want it done in a certain way, that is not a MOS issue (I would be surprised if any FA reviewers asked for citations in the lead except for controversial content, as FA requires that it's compatible with the MOS. And since both with/without citations is compatible with the MOS and MEDMOS....). What is the conflict between MEDMOS and the wider site best-practices please? Only in death does duty end (talk) 18:59, 24 July 2018 (UTC)[reply]
I strongly support the addition of the new language. I also note that, as written, it does not say that anything is mandatory. But what it says accurately reflects present-day editing norms, not limited to medical content. Maybe long ago in wiki-years it was otherwise – I don't know. But it is perfectly acceptable for editors in a specific topic area such as this one to form a consensus that content about that topic should generally follow more stringent sourcing guidelines than what applies site-wide. After all, MEDRS sets requirements for secondary sourcing that do not apply in other subject areas, and that's a good thing. And there is no valid reason for FAC to dictate otherwise. Good scholarly writing requires this style of attribution, and although Wikipedia is written for a general audience rather than a scholarly one, the special burden of our health-related content (that it can potentially influence health decisions made by our readers, with very significant real-world consequences) makes it reasonable to treat material in the lead as subject to "citation needed". --Tryptofish (talk) 20:11, 24 July 2018 (UTC)[reply]
I'm pretty sure FAC doesnt actually do this at all, will ping @Ealdgyth: (who does a fair amount of FA reviews as I recall). But currently the wording at MEDMOS isnt more stringent, it just says there are some good reasons to do it. But you dont have to. Only in death does duty end (talk) 20:31, 24 July 2018 (UTC)[reply]
I should clarify that I did not mean literally that FAC dictates to MEDMOS. I was trying to communicate that the fact that there was a difficult FA review is not a valid reason to say that the new wording should be removed from MEDMOS. Also, it's been my experience that the objections to cites in the leads of health articles mostly come from editors who have been active at FAC. In any case, sorry if that was unclear. And I don't mean to start a FAC versus MED grudge match. --Tryptofish (talk) 20:41, 24 July 2018 (UTC)[reply]
No I meant my experience of reading FA-articles is that FA promotes articles regardless of cites in the lead or not - even a quick look at the medical (and non-medical) FA's shows examples of both - so I think FA is a non-issue when it comes to cites in the lead debate. Only in death does duty end (talk) 20:53, 24 July 2018 (UTC)[reply]
I'm finding it curious to be told what is and isn't practice at FAC :) :) I suggest there is probably no medical editor on Wikipedia who knows same to the extent that I do. Perhaps Graham or Cas though. Only, I would be interested in knowing which articles you have written and how you have composed and cited the leads for them ? SandyGeorgia (Talk) 21:57, 26 July 2018 (UTC)[reply]
Or you could link to the discussion where you were told you had to have multiple cites in the lead to write a FA. Diffs please. Only in death does duty end (talk) 22:06, 26 July 2018 (UTC)[reply]
I suspect you may be the only person in this discussion who doesn't know where to find them, and there are pages of discussion, including a draft RFC. OID, I am still wondering if you have ever built an entire article and then summarized its content to a lead, and if so, just how you personally do so? An example of an article and lead as you build them might help me see things from your perspective. For my perspective, you can look through scores of medical FAs and others to see that leads do not always need to be cited. If someone demanded citations in leads under my FAC tenure, that demand would be ignored because it is an invalid, unactionable demand. As Ealdgyth can also explain. SandyGeorgia (Talk) 01:37, 27 July 2018 (UTC)[reply]
You stated above that you were told unless you used multiple cites in the lead your FA would be opposed. Please provide a diff. Given you have been complaining about it, this shouldn't be hard. Only in death does duty end (talk) 02:07, 27 July 2018 (UTC)[reply]

Yeah, it's really just a matter of whether something in the lead is likely to be viewed as controversial (or has already been contested). Well, at a stub it may also be a matter of whether the claim in question exists outside the lead; many stubs are nothing but a lead. :-) Anyway, I tend to agree with Only in death; it's just a fact that medical claims are more likely to be controverted. It's also a fact that WP:MOS has no control over whether WP:FAC can demand something above and beyond what MoS does; I would surmise this will also be true of WP:V and WP:CITE and their pools of regulars; the FAC crowd aren't going to listen to them saying "X is not actually required", either. From FAC's viewpoint, it is required if you want the WP:FA icon.

I don't agree with this insularity at all, mind you, but I observe that it's happening. FAC is a wikiproject, and the FA label is something that wikiproject hands out based on their own criteria. At least as of late 2016, there was quite a bit of hostility over there toward complying with anything in MoS that the people in that echo chamber don't like, which is actually a real WP:CONLEVEL problem. I long ago stopped thinking of FA as an "official" WP process, but as just some drama I don't want to be involved in. It's like a Boy Scout merit badge that will cost you a limb if you're not part of the in-crowd. I've been here over 12 years, and have GAs under my belt but no attempts at FA – it's just that off-putting. So, I definitely feel SandyGeorgia's pain on this aspect of the matter.

To get back to WP:MEDMOS, I don't see that there's a conflict between it and the main MoS (at least not on this point, and it does have a conflict with WP:PSTS that I've been trying to get resolved for about 3 years, so I'm not saying the page is perfect). It may go above and beyond MoS's basic requirements (and, as Tryptofish points out, even above basic WP:V / WP:RS / WP:NOR requirements). Lots of MoS subpages do similarly for particular things, just as various notability and naming conventions guidelines are more persnickety than WP:N and WP:AT respectively. The central policies and guidelines are minimums, not limits – within reason. Is citing medical claims in the lead really unreasonable?
 — SMcCandlish ¢ 😼  10:05, 25 July 2018 (UTC)[reply]

SM, FA delegates/coordinators are fully empowered to ignore even what you refer to as FA regulars, when their commentary is not within WP:WIAFA, so I am not sure of any relevance of any of your statements above; No, X cannot be required at FAC by whim because someone wants it-- no matter how vociferously they oppose. I have promoted FACs with multiple opposes, and archived FACs with 28 Supports. There are almost no areas where FAC goes beyond MOS; where the standards do is spelled out at WP:WIAFA, which has not changed since the 3,000+ FACs I promoted. The effect of one editor demanding citations based on personal preference has no relevance to FAC-- it does, though, serve to discourage editors from wanting to waste time bringing articles to standard. Is demanding citations in the lead unreasonable?

So, back to the issue; should MEDMOS stay in sync with MOS? Citations in the lead are not required. I contest that medical leads have content any more likely to be challenged than many other areas-- this is a made-up meme. And yes, overcitation in leads makes it difficult to write compelling prose. SandyGeorgia (Talk) 21:57, 26 July 2018 (UTC)[reply]

Well the example of BHBMA appears to be a problem of WP:CITEOVERKILL rather than just having citations. 6 cites instead of 1 is excessive if the citations are used elsewhere in the article and they are just confirming each other. If a sentence is being constructed such that it requires 5/6 cites to source specific claims in the sentence, then really thats something that should be re-written for the lead. There doesnt appear to be any conflict between sources (the main reason for warring citation spam) so unless someone somewhere is demanding 4,5,6 cites for non-controversial info I dont see what the holdup is in slimming them down unless they are actually required for WP:V purposes - but some of the sentences are quite short and I am pretty sure you dont need 5 citations for what is a single statement. Here as part of the FAC process @Doc James: actually questioned the amount of references used. So that indicates to me neither the Med project or FA are requiring that sort of excessive citation in the lead. The GA review here also brings up excessive citations, but also shows that there are issues with material being challenged. Only in death does duty end (talk) 10:49, 25 July 2018 (UTC)[reply]
We've discussed this within WPMED on multiple occasions. There is nothing contradictory between the MOS and MEDMOS. Natureium (talk) 14:39, 25 July 2018 (UTC)[reply]
At this point in this discussion, I think that the bottom line for me is that there is sufficient consensus for the added wording at MEDMOS. --Tryptofish (talk) 17:32, 25 July 2018 (UTC)[reply]
The LEADS are better if referenced. But nothing in the LEAD should really require more than one or two refs (simply pick the best). The LEADS however do not require references, but if referenced with MEDRS compliant sources it would be fairly controversial to try to remove them.
It appears to be claimed that the ONLY reason to reference the leads is to support creation of medical content in other languages (and this is positioned as a bad thing). This, however, is not the case. While it is one reason to reference the lead, it also makes them easier to discuss and improve, as one can more easily verify whether the content in the lead is well supported. Doc James (talk · contribs · email) 08:34, 26 July 2018 (UTC)[reply]
would agree w/ Doc James in terms of the lede--Ozzie10aaaa (talk) 12:20, 26 July 2018 (UTC)[reply]
Concur also with Only in death and Doc James that 1 ref per claim is sufficient. We seem to have this dispute, to the extent it really is one, because of "cite stacking" in the lead, not because the lead has any citations in it at all.  — SMcCandlish ¢ 😼  14:44, 26 July 2018 (UTC)[reply]
At this point, there is still not a single editor from outside of the usual bunch weighing in on this discussion. Ah, but such is Wikipedia these days. I have raised a concern about the direction the MED pages are going, after years of carefully keeping them in sync with site-wide pages; nothing has been or will be done because there are no new eyes on the topic. Carry on. No medical editors are writing top quality content, so resolution one way or another won't have much effect. Regards, SandyGeorgia (Talk) 21:57, 26 July 2018 (UTC)[reply]
I said earlier that "I don't mean to start a FAC versus MED grudge match", and that ^ is what I was concerned about. Peace. --Tryptofish (talk) 22:28, 26 July 2018 (UTC)[reply]
Understood. It was Only in Death (an editor I never encountered in the Featured processes) who stated that "If you want to write a 'featured article' you might have to jump through extra hoops but thats the price you pay for writing a featured article;" and seems to have less than thorough knowledge of the FA process, because there is no requirement to cite leads in FAs. I agree that the FA portion of this discussion should not be relevant, but we do have an example showing that the way a few medical editors are interpreting the new wording in the guideline is extreme, as happened in that case. As you know, rather than rock the boat, I ceded and cited the lead fully at dementia with Lewy bodies even though that should not be needed, and was not needed. But that is how this wording is being extended in application. A very good example of that can be seen with:
  • Medications for one symptom may worsen another.[11]
There is no reason to have to cite a general statement like that in the lead; that is overcitation of the lead, and this sort of thing leads to clunky writing. Regards, SandyGeorgia (Talk) 01:52, 27 July 2018 (UTC)[reply]
I sure do remember that. For whatever it may be worth, one fish's opinion is that "Medications for one symptom may worsen another.[11]" and "Medications for one symptom may worsen another." do not differ from each other in terms of clunkiness of writing. I realize that this is subjective, but I think I'm no slouch when it comes to clarity or engaging-ness of writing (aside from that hyphen I just put there). --Tryptofish (talk) 18:29, 27 July 2018 (UTC)[reply]
The word you're looking for is engagiosity. EEng 18:33, 27 July 2018 (UTC)[reply]
Snort, laugh. Says the editor who has made the cites in Phineas Gage so convoluted that they are clunky. [FBDB] --Tryptofish (talk) 18:49, 27 July 2018 (UTC)[reply]
You stated explicitly that you were told taking Dementia to FA would be opposed unless the lead was cited multiple times. It's also an indisputable fact that you have to do more to get an article to FA standard than is normal. I have found no evidence from trawling through FA pages, or the med project, or your contribution history, that as a project either FA or med have said or implied that you are/were required to cite in the lead. If they have, it's well hidden. In fact, as I already said above, the only evidence I have found (using your own example) is that they asked for the complete opposite (fewer cites in the lead). So at this point you really need to provide some actual evidence in the form of diffs because so far you have made a number of misleading statements, as well as being extremely insulting towards both the FA and med wikiprojects. Only in death does duty end (talk) 05:32, 27 July 2018 (UTC)[reply]
  • The translation issue seems to be a red herring. If only the lead of a foreign language article is translated then, in the English translation version, it is no longer the lead; it has become the body of the article. And, as for the general point, MOS:CITELEAD makes it clear that "there is not ... an exception to citation requirements specific to leads." This means that the lead of any article may require citations, if challenged, and so medical articles are just a likely case, rather than being special in this regard. Andrew D. (talk) 22:36, 26 July 2018 (UTC)[reply]
  • Concur. There is no reason for medical articles to be treated any differently than any other class of article, with respect to citations in the lead. The site-wide guideline covers it. MEDRS and MEDMOS tried (in the past anyway), not to extend beyond site-wide policy and guideline, but to explain how those policies and guidelines applied to biomedical content. Going beyond what is required for any other type of article is likely to result in a backlash, and accomplishes nothing. If people want to translate, that is fine, but they can seek out the citations as needed (which they should be reading anyway, although they don't always.) Regards, SandyGeorgia (Talk) 01:32, 27 July 2018 (UTC)[reply]
  • Leaning toward concurrence with Only in death. Where is the evidence of either a) FAC requiring lead citations in med articles, or b) MEDRS actually diverging from MoS or from WP:CITE? Like, please actually quote the material where this alleged WP:POLICYFORK is happening.  — SMcCandlish ¢ 😼  11:56, 27 July 2018 (UTC)[reply]
After suffering through this long discussion I'm leaning towards simply choosing death, period. EEng 18:48, 27 July 2018 (UTC)[reply]
[citation needed] --Tryptofish (talk) 18:52, 27 July 2018 (UTC)[reply]
  • I would not want MEDMOS to say that there has to be an inline cite for every sentence in the lead. And maybe there has been a problem with editors disagreeing about whether a cite is really necessary for a particular sentence in a particular lead. But that's not a reason to say that permitting lead cites automatically creates a problem.
I think that it would be a problem if MOS set a requirement, and MEDMOS tried to say that the site-wide requirement would not apply to med pages. But that is not the case here. I see nothing wrong with MEDMOS suggesting (and it is a suggestion rather than a requirement) something that MOS says is OK but not required. MOS says that some pages can have cites in the lead and other pages don't have to. MEDMOS just says that putting cites in the lead is recommended. On the other hand, MEDRS has long said, with good consensus, that there are situations where primary sources are impermissible, whereas RS does not make that kind of prohibition. Because this is a situation where a specific topic has editors who want to go beyond the minimum required by MOS, rather than to ignore requirements set by MOS, this is not a violation of MOS. And there really are valid reasons to encourage lead cites in health-related pages. --Tryptofish (talk) 18:46, 27 July 2018 (UTC)[reply]

"Typographic conformity" section cleanup

MOS:CONFORM was out-of-step (for many years now) with two points of actual practice, and the section was also mix-and-matching grammatical structures in its list items. I also realized that we were not dealing with the special considerations that apply to doing CONFORM stuff with titles of works (including factors that can actually break citations), so I've cross-referenced a summary of that at Wikipedia:Manual of Style/Titles#Typographic conformity (MOS:TITLECONFORM)  — SMcCandlish ¢ 😼  09:38, 25 July 2018 (UTC)[reply]

Rhyme scheme patterns

A rhyme scheme is a pattern that appears in the lines of a poem, and generally letters are used to notate the pattern, for example "ABAB CDCD". Sometimes these sequences are very long, the longest I could find on Wikipedia being "abacabadabacabaeabacabadabacabafabacabadabacabaeabacabadabacaba". There is considerable variation in how this notation is capitalized and punctuated:

  • ABAB
  • "ABAB"
  • abab
  • "abab"
  • "A,B,A,B"
  • "AB AB"

In some cases, the main article says that the notation requires a specific capitalization. For example, "aBaBccDDeFFeGG" distinguishes masculine and feminine rhymes with lower vs. upper case. I think it would be nice, for human readability reasons, to settle on a consistent style for this notation. To me the sequences are nicely distinguished from sentence prose when they are either in quotes, all caps, or both. I'm open to whatever poetry-editing editors want to do, but for the sake of having a starting point for discussion, how about the below? I have asked for input from Wikipedia talk:WikiProject Poetry. -- Beland (talk) 22:00, 25 July 2018 (UTC)[reply]


Unless otherwise required by a specific notation, rhyme schemes should generally be written when appearing in prose:

  • In all capital letters
  • Enclosed in double quotation marks
  • Without italics
  • Using spaces to separate groups (not commas or other punctuation)

Example: This poem uses the "ABAB" rhyming pattern.
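
For illustration only, the four rules listed above could be sketched in Python roughly as follows. The function name normalize_rhyme_scheme is hypothetical, and the treatment of separators is an assumption rather than part of the proposal: whitespace is taken to mark a group boundary, and commas or dashes inside a group are dropped. Mixed-case notations are left untouched, since their capitalization may be required by the notation.

```python
import re

def normalize_rhyme_scheme(raw: str) -> str:
    """Hypothetical helper: rewrite a rhyme scheme in the proposed prose style."""
    # Drop wiki italics markup ('') and any surrounding double quotation marks.
    text = raw.replace("''", "").strip().strip('"').strip()
    # Assumption: whitespace separates groups; other punctuation is decoration.
    groups = [re.sub(r"[^A-Za-z]", "", part) for part in text.split()]
    groups = [g for g in groups if g]
    letters = "".join(groups)
    # Only all-lowercase schemes are capitalized; mixed case may be meaningful
    # (for example, masculine vs. feminine rhymes), so it is preserved.
    if letters.islower():
        groups = [g.upper() for g in groups]
    return '"' + " ".join(groups) + '"'

if __name__ == "__main__":
    for sample in ['abab', '"A,B,A,B"', 'a-b-a, b-c-b, c-d-c', 'aBaBccDDeFFeGG']:
        print(sample, "->", normalize_rhyme_scheme(sample))
```

A checker built along these lines would more plausibly flag variants for a human editor than rewrite them automatically, because spacing and case can carry meaning that a script cannot judge.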


Why the quotation marks? What information do you think you are conveying by including this extra baggage in the notation? (Also, I suspect that when spaces are used it may often be because they are meaningful, e.g. to separate stanzas. And there are more styles currently in use than you list; e.g. a-b-a, b-c-b, c-d-c.) —David Eppstein (talk) 23:28, 25 July 2018 (UTC)[reply]
I'm not attached to the quote marks for all-caps sequences. If you parse the sequences as proper nouns, they make sense without them. If you parse them as a sequence of symbols not part of the sentence but being quoted from somewhere else, I would think about them like MOS:WORDSASWORDS, where quote marks are one of the options (and prettier than italics, which I don't see often used for this purpose in poetry articles). If we were doing all-lowercase, then the sequences would blend in with sentences much more easily and I think there'd be a stronger argument for quote marks. Yes, the spaces are meaningful, though without a lot of research I'm not sure I could catalog all the ways people use them, so I tried to be generic. Do you think "stanzas" is a better word than "groups"? I did some searches for the style with dashes and didn't find attestation for that, but that may merely be a limitation or my own misuse of the search engines. I'm open to that or other styles as well if people like them better, though my personal opinion is that dashes just bulk up the string without adding clarity. -- Beland (talk) 19:45, 26 July 2018 (UTC)[reply]
  • Overprescription, micromanagement, MOSbloat. And you really are stirring too many pots at once. EEng 02:39, 26 July 2018 (UTC)[reply]
Pinging Phil wink who has put a lot of thinking and work into this issue. Also pointing out as of relevance: WP:POETRY#Scansion, Wikipedia talk:WikiProject Shakespeare/Archive 5#Scansion and meter in the sonnets. --Xover (talk) 07:47, 26 July 2018 (UTC)[reply]
The linked discussions are beautiful examples of knowledgeable editors working out article content for themselves in a specific and important topic area. Why does MOS need to overbear that? EEng 10:18, 26 July 2018 (UTC)[reply]
Well, I'm asking those knowledgeable editors if we can agree on a single style for a given notation, rather than having different styles on different articles due to smaller discussions coming up with different answers. I'm not sure what you mean by the MOS overbearing; if there's a preferred style for something, this is the place to document it (either by explaining in detail or pointing to WikiProject guidelines). The point of project-wide consistency is to make article content easier to digest (e.g. when reading a bunch of different articles about rhyming poems) and to make the project look polished, professional, and credible. -- Beland (talk) 19:45, 26 July 2018 (UTC)[reply]
  • I strongly suspect that there are already-published norms about this sort of thing, "out there". If there's consensus among editors who work on poetry material a lot that there's a preferred way to do this (and I'd bet that David Eppstein's suspicion that spacing can be semantically meaningful is correct), maybe we could add something about it in a poetry and lyrics section at WP:Manual of Style/Writing about fiction. This above stuff seems pretty half-baked at this point, though. And scansion and rhyme schemes are not the same thing. The two scansion threads pointed out do seem consistent with each other, probably because WP:POETRY#Scansion is basically a de facto guideline. So, it could be the start of a MOS:FIC section on poetry. PS: I agree that adding quotation marks around such markup is pointless. — Preceding unsigned comment added by SMcCandlish (talk · contribs) 16:40:00, 26 July 2018 (UTC)[reply]
  • As always:
A. It is an axiom of mine that something belongs in MOS only if (as a necessary, but not sufficient test) either:
  • 1. There is a manifest a priori need for project-wide consistency (e.g. "professional look" issues such as consistent typography, layout, etc. -- things which, if inconsistent, would be noticeably annoying, or confusing, to many readers); OR
  • 2. Editor time has been, and continues to be, spent litigating the same issue over and over on numerous articles, either
  • (a) with generally the same result (so we might as well just memorialize that result, and save all the future arguing), or
  • (b) with different results in different cases, but with reason to believe the differences are arbitrary, and not worth all the arguing -- a final decision on one arbitrary choice, though an intrusion on the general principle that decisions on each article should be made on the Talk page of that article, is worth making in light of the large amount of editor time saved.
B. There's a further reason that disputes on multiple articles should be a gating requirement for adding anything to MOS: without actual situations to discuss, the debate devolves into the "Well, suppose an article says this..."–type of hypothesizing -- no examples of which, quite possibly, will ever occur in the real life of real editing. An analogy: the US Supreme Court (like the highest courts of many nations) refuses to rule on an issue until multiple lower courts have ruled on that issue and been unable to agree. This not only reduces the highest court's workload, but helps ensure that the issue has been "thoroughly ventilated", from many points of view and in the context of a variety of fact situations, by the time the highest court takes it up. I think the same thinking should apply to any consideration of adding a provision to MOS.

I'd like to see, at the least, evidence for A1 or A2 before we even think about embarking on such a debate, because if MOS does not need to have a rule on something, then it needs to not have a rule on that thing. EEng 21:55, 26 July 2018 (UTC)[reply]

For A1, Rhyme scheme uses both lowercase-in-quotes and uppercase-no-quote styles in its prose, but mostly uses uppercase when explaining the different notations. I think this looks ugly and unprofessional because it is inconsistent, and the inconsistency continues when compared to other poetry pages, found by a moss scan:

-- Beland (talk) 02:23, 28 July 2018 (UTC)[reply]

I don't understand what that list is supposed to demonstrate. EEng 02:31, 28 July 2018 (UTC)[reply]

Old vs New spelling

Off-topic?
 – Seems to be a content dispute.

Maybe some editors here would like to offer an opinion about the naming of the 1st and the 2nd President of Indonesia, Sukarno and Suharto (discussion). It would be especially appreciated if editors who don't know much about Indonesia but are experienced with the MOS would comment there. Thank you. Hddty. (talk) 04:26, 26 July 2018 (UTC)[reply]

This is something that should be resolved at the Indonesian project, and, it has gone around in circles for the last ten years, it really is something that people enjoy having arguments about. I fail to see how coming to a generic page like this will help at all. JarrahTree 08:56, 26 July 2018 (UTC)[reply]
@JarrahTree: This is an invitation, I hope many editors discuss this topic. Hddty. (talk) 09:03, 26 July 2018 (UTC)[reply]
Good luck then - this is a problem where languages and spelling variations simply bring more trouble than they are worth, in the end. JarrahTree 09:39, 26 July 2018 (UTC)[reply]
The only thing like a names-related dispute I see at either article's talk page is at Talk:Sukarno about "Ahmad" or "Ahhmad" as an alleged forename, which was apparently made up by a journalist, and was not actually used by him nor used for him in many reliable sources. It is and should be mentioned somewhere in the article (not as his real name) because some readers are fairly likely to encounter it and search for it here. But this isn't an MoS matter, it's a content issue.

The discussion at Wikipedia talk:WikiProject Indonesia#Perfected spelling/Ejaan yang disempurnakan (EYD) is also a content dispute, not an MoS matter. Just open a WP:RM discussion at each article's talk page. I predict that attempts to move these articles to Soekarno and Soeharto will fail for WP:COMMONNAME and WP:RECOGNIZABLE and WP:USEENGLISH reasons. While both articles should include such spellings in the lead, they're not what is most commonly used in English-language reliable sources for these subjects.  — SMcCandlish ¢ 😼  14:27, 26 July 2018 (UTC)[reply]

I beg to differ; from my length of time working inside the Indonesia project, it is a MOS issue. (I can remember 2006 and 2007 discussions over this issue.) Focusing on the Sukarno/Suharto issue as the only example does not give adequate context. Despite the fervour for one size fits all ideas - some WikiProjects actually have had the determination to designate usage within the domain of their project scope - as to specific guidelines over translation and transliteration issues (which actually developed into accepted practice and MOS for the articles found within the projects) - for years the Indonesian project has had a circular argument, trying to establish a guideline within the project about the issue... but as I said above to Hddty good luck... JarrahTree 14:53, 26 July 2018 (UTC)[reply]
Which still doesn't sound like an MoS issue. For example, I was at both articles, noticed that Suharto mentioned the Soeharto spelling in the body, so I tried copying it into the lead as a MOS:BOLDSYN. I got reverted, on the basis that not only did Suharto not use it, he corrected people who did use it. Elsewhere we have sources saying that when the spelling reforms were introduced, people were allowed to choose which they preferred for their own names. And that spelling is not often used for Suharto in RS in the English language. Ergo, the reversion was fine and is definitely a content not style matter. The objection is that while Soeharto can be mentioned somewhere, because someone might actually search on that spelling and it is a redirect, it's too trivial for the lead. That's a content relevance argument, not a "style rules" one. A potential compromise might be something like "(also sometimes referred to as Soeharto in Indonesian)", linking directly to the section on the orthographic reform, but the same indiscriminate/trivia objection would likely be raised.

If WP:WikiProject Indonesia wants – after all this time – to develop a WP:Manual of Style/Indonesia-related topics and see if it gains support as a WP:PROPOSAL (such things often fail, being full of wikiproject attempts to defy site-wide rules rather than apply them), they can give it a try (though we really need to merge these things into one page, divided by country/culture, and with redundancy eliminated). Until then, WP:LOCALCONSENSUS rightly applies at a per-article level, since there's no site-wide rule anyone's contradicting by deciding which u versus oe or whatever spelling(s) should appear in the lead for particular cases, or be used as the title.

Relevant open threads: WT:WikiProject Indonesia#Enhanced Indonesian Spelling System: a content dispute. WT:WikiProject Indonesia#RFC: MOS for Indonesia articles and scripts: an unrelated (but style) matter on limiting use of Indonesian script (based on limitation of Indic script in articles about India; that was undertaken because there are too many writing systems in India, which may or may not be true of Indonesia). WT:WikiProject Indonesia#Perfected spelling/Ejaan yang disempurnakan (EYD): an article titles dispute, and covered by WP:COMMONNAME and WP:USEENGLISH.
 — SMcCandlish ¢ 😼  11:48, 27 July 2018 (UTC)[reply]

Update to MOS:TENSE

I just noticed that MOS:TENSE suggests that discontinued products are given present tense. Here's an example [[33]]. While this seems counter-intuitive, I'm wondering if the same logic applies to companies. Almost every defunct or acquired company uses past tense in their articles, and that seems to be correct usage. I'd like to hear from anyone who can explain why we use present tense for discontinued products, and if the same doesn't apply to companies, perhaps we should add a specific carve out to the MOS:TENSE section making this clear. I'm happy to take a shot at it - just want to make sure there's consensus. TimTempleton (talk) (cont) 19:12, 26 July 2018 (UTC)[reply]

Discontinued products have the potential to exist still, whereas dissolved companies no longer exist. Acquired companies should probably use present tense for continuing operations (though something like "began making product in X" is probably preferable) and past tense for historical events. --Izno (talk) 19:29, 26 July 2018 (UTC)[reply]
Good point. Looks like clarification should then be added. I'll wait a bit and see if there's any dissent. TimTempleton (talk) (cont) 19:40, 26 July 2018 (UTC)[reply]
MOS:TENSE already says that though. :) --Izno (talk) 19:54, 26 July 2018 (UTC)[reply]
I was referring to the corporate carve out. Something like:
Acquired or otherwise defunct companies are referred to using the past tense.
TimTempleton (talk) (cont) 20:32, 26 July 2018 (UTC)[reply]
They're already included in "subjects that no longer meaningfully exist as such". Is this a frequent enough problem that we need to say something specific about it? If we did, that wording would need another qualifier, something like:
Acquired or otherwise defunct companies are referred to using the past tense, unless they still operate under the same name as a merged business division.
Seems more trouble than it's worth. Another approach might be a footnote immediately after "as such":
{{efn|Examples of cases for past tense: defunct organizations, countries that don't exist any longer, offices or other roles or titles that were eliminated, bands that have broken up, lost works of literature or art, services that were ended, business divisions merged out of existence, torn down buildings, etc.}}
Following the logic in this essay, we really should be a) more resistant to adding clarifications and examples when there's not a clear need for it, and b) using footnotes more often when a case marginally needs mention, so that we keep the main text leaner.  — SMcCandlish ¢ 😼  11:27, 27 July 2018 (UTC)[reply]

A recent edit changed "Capitalize names of particular institutions ..." to "Capitalize names of institutions ..." on the grounds that "particular" is superfluous here. I undid the edit, but it has been restored by another editor.

I entirely agree that logically "particular" is redundant, but ultra-clarity is useful in the MoS. Leaving just the plural "names of institutions" might imply that e.g. the Universities of Oxford and Cambridge is correct.

One compromise might be "Capitalize the name of an institution ...", although I still think the original is fine. Peter coxhead (talk) 05:58, 27 July 2018 (UTC)[reply]

This isn't the biggest issue in the world, but my point is that the word "particular" isn't doing any lifting whatsoever in that sentence. Certainly it isn't resolving your exemplified issue - since the 'universities of Oxford and Cambridge' are (two) particular institutions; resolution depends upon the word "names", since that formulation isn't a "name" (as an aside I note that the UK Parliament used the capitalisation in this very phrase, back in the 1920s: "An Act to make further provision with respect to the Universities of Oxford and Cambridge and the Colleges therein", although of course usage has evolved since then). Looking at a few style guides I can't find any guidance in any of them for dealing with the situation where several organisations are listed together preceded by the plural of a word that is common to all of their titles, anyway, but I agree with you that lower case feels appropriate. In any event, your alternative wording in the singular also works. MapReader (talk) 07:09, 27 July 2018 (UTC)[reply]
I can't find explicit guidance in the MoS re the plural case (SMcCandlish: can you help?), but there have been various relevant discussions, e.g. Talk:List of mayors of Birmingham#Requested move 5 September 2017 which have all upheld the view that where "X of Y" is a title, and so X and Y are capitalized, in "xs of Y", x is not capitalized. (As it happens, I don't agree that lower case feels appropriate – I would capitalize – but I'm pretty sure the consensus here is otherwise.) Peter coxhead (talk) 09:46, 27 July 2018 (UTC)[reply]
I put it back in. It was added because people were misreading it as meaning to do, e.g., "all of the State Legislatures in the United States", "she attended both Harvard and Princeton Universities", etc. The "Universities of Oxford and Cambridge and the Colleges" style was preferred by certain style guides in the early to mid-20th century but is excoriated in most of them today. It's the same style that results in "XYZ Corporation is a Canadian shoe manufacturer. J. Q. Foobar founded the Corporation in 1978." This style is virtually never used today except in legal writing (within actual legal documents, not legal journal prose, etc.), and in a small sliver of especially stilted business English, usually in internal self-referential material (i.e., your CEO's memo is apt to call your own company "the Company" but will not refer to a meeting with a rep from XYZ Corporation as a negotiation between "the Company and the Corporation"). WP doesn't write this way.

There may be another way to get at this than by inserting "particular", but it does seem to be having the desired effect, even if MapReader and Popcornduff might be skeptical that it should. There's been a marked decline in the over-capping style over the last several years.

PS: "xs of Y" is always going to look off to a subset of readers, those for whom such a title seems to be usually capitalized (or at least where they don't notice much when it's not). LC is the norm in the British press now – in-page search these articles for "lords mayor", and there were zero hits at either The Guardian or The Economist for this phrase capitalized: [34][35][36][37][38][39][40]. It's not like Wikipedia made it up. :-) There'll always be a hint of perceptual dissonance because a title like "Lord Mayor" is almost always encountered attached to someone's name. Similarly, many Americans always want to capitalize "president" in reference to the US head of state; some of them even want to do it with adjectival constructions like "presidential race".
 — SMcCandlish ¢ 😼  11:11, 27 July 2018 (UTC)[reply]

The reduction in capping is because the Internet and SMS are driving society towards ignoring capitals altogether. It is certainly most unlikely to have anything to do with the word "particular" in this sentence of the MoS, where it appears utterly redundant. MapReader (talk) 11:58, 27 July 2018 (UTC)[reply]
  • I was going to make the same point MapReader made even before I saw they'd written it. In the words of the Guardian style guide: "The tendency towards lowercase, which in part reflects a less formal, less deferential society, has been accelerated by the explosion of the internet: some web companies, and many email users, have dispensed with capitals altogether."
I have to say I don't understand what "particular" is adding here, even after reading the arguments about plurals. To me, as someone who uses style guides a lot professionally, it's not clear that "particular" is meant to clarify that you don't capitalise in lists or whatever. I wouldn't have inferred that at all. Popcornduff (talk) 12:45, 27 July 2018 (UTC)[reply]
Except the lower-casing trend started way before the Internet existed as a general public medium. There is no doubt that computer-mediated communication is accelerating the trend, but it doesn't matter. It's not WP's job to defend tradition against the abuses of callow youth, but (with regard to style matters) to reflect actual practice in contemporary formal written English (and to work around key WP-specific issues here and there, where there's a technical problem – thus, e.g., our avoidance of curly quotes).

On the original topic, I'll just repeat: There may be another way to get at this than by inserting "particular", but it does seem to be having the desired effect. Rather than argue about one word, maybe suggest better wording that makes the point more clearly.
 — SMcCandlish ¢ 😼  13:14, 27 July 2018 (UTC)[reply]

While I feel SMcC is stretching credibility beyond a sensible point by arguing that this word is doing any good whatsoever, it's just one word, and isn't doing any harm apart from padding. So as the OP I say let's drop the matter. MapReader (talk) 14:36, 27 July 2018 (UTC)[reply]
Well, I'm also fishing for "Better wording would be ...". I tend to just directly edit this stuff, but others here seem to feel more strongly that the current wording isn't good, rather than that it's not ideal. I'm not sure how to more tightly get at something like "Capitalize the names of institutions, including their official names and conventional short forms that are treated as proper names, but do not capitalize a word from such a name (university, corporation, etc.) when used apart from the name, nor in plural form when two or more institutions sharing the term are mentioned back-to-back." I guess we could use that exact wording in a footnote, but I don't think anyone wants to see something like that inline in the main guideline text.  — SMcCandlish ¢ 😼  14:10, 28 July 2018 (UTC)[reply]

 You are invited to join the discussion at Wikipedia talk:WikiProject Radio Stations#Call sign meanings. The discussion is about using bold caps in expanding radio call sign letters, e.g. KIEM (Keep Informed Every Minute), and whether radio stations should have an exception to the MOS:ACRO guideline: Do not apply italics, boldfacing, underlining, or other highlighting to the letters in the expansion of an acronym that correspond to the letters in the acronym. Thanks. Reidgreg (talk) 14:19, 27 July 2018 (UTC)[reply]

Deadnaming trans people

I'm coming up against an issue regarding deadnaming. I appear in the article City of York Council election, 2015 under my deadname. I edited this but the edit was revoked by Sam Blacketer. I understand but disagree with Sam's concerns about/concept of accuracy and also feel that there are a number of key reasons why deadnaming should be avoided in almost all contexts anyway. The discussion between Sam and me can be found on Sam's talk page under the section "Deadnaming trans people".

In summary:

  1. Deadnaming trans people can be dangerous - sometimes very seriously so - both from a mental health/gender dysphoria perspective and from a risk of outing perspective. The Radical Copyeditor's Style Guide for Writing About Transgender People stated in section 2.4.2: "Using a trans person’s birth name or former pronouns without permission (even when talking about them in the past) is a form of violence". In the recent academic article How do you wish to be cited? Citation practices and a scholarly community of care in trans studies research articles, Thieme and Saunders discuss citation practices including the issues of deadnaming through records.
  2. It is common on Wikipedia to refer to people by the name they go by/are best known by in other domains e.g. celebrities with stage names. The stakes are higher for trans people and deadnaming.
  3. There is evidence of my name change and my request to be referred to only as Ynda Jas (not as my deadname) online, and I can provide a deed poll and bank statements in my new name if there's any question about the validity of this change (Sam hasn't questioned it).
  4. I have submitted a request with City of York Council for their website to be updated.
  5. The Manual of Style appears to favour referring to trans people under their current ("self-designated"/"self-expressed") name/pronouns/identity and states that "this applies in references to any phase of that person's life", though it is a little ambiguous about whether this applies only to biographical pages or also to references elsewhere. Sam stated that if I had a biographical page things might be different, but does that not create a two-class system of people who are worthy of not being deadnamed (and their current/actual name/identity respected) and people who are effectively unworthy (as a result of not being Wikipedia-noteworthy enough to have their own biographical page)?
  6. Sam feels that election records should reflect what was recorded at the time as a matter of accuracy. I argue it would be accurate - more accurate in fact - to use Ynda Jas since the person who stood for election was Ynda Jas, not my deadname. My deadname is simply an old label for Ynda Jas (me). Surely the main purpose of the page is to act as a record of who ran for election, not what names were on the ballot paper?
  7. Sam went on to talk about how the page should reflect that it was a cis male who stood for election - it wasn't, I simply wasn't out as trans/non-binary then (my interpretation of gender - pretty uncontroversial in contemporary Western trans communities - is that trans people who are not fluid in their identity have always been the gender they identify with; they simply haven't always known it/been out to themselves or the rest of the world). Either way, the page wouldn't reflect whether people saw me as cis or trans, whether it uses my deadname or Ynda Jas, because (A) that information is not on the page - it's literally just a name! - and (B) people had no real basis on which to assume I was cis or trans at that time. My gender was not on the ballot paper. There may have been campaign materials using the pronouns by which I was referred to at the time, but this argument feels like clutching at straws, and the issue is inconsequential relative to the impact of deadnaming. In my case, I would give permission to include a footnote to say that I was on the ballot paper under my deadname, but this should be the exception not the rule - see point (1).

Could there be clearer guidance on this in the Manual of Style, particularly in relation to non-biographical pages? If there needs to be further discussion before a decision is made on this, it should involve trans people.

--Yndajas (talk) 22:42, 28 July 2018 (UTC)

  • The short answer is no. The only reason Wikipedia can include detailed historical records about otherwise non-notable people (such as the per-ward results) is that they faithfully reflect the sources used. If the election council for York puts out a statement updating their records, I'd support updating it. As it is, I would much rather remove the entire section than make a name change that isn't supported by secondary sources (just the person's own statements); it isn't obvious how to prove that the person changing their name is the same one that ran for office. The emotional issues aren't relevant; this person's own website uses their own name. power~enwiki (π, ν) 23:18, 28 July 2018 (UTC)
    • As a side note, this isn't really the right forum, but as this is a new user who was directed here, I see no reason to move this discussion. power~enwiki (π, ν) 23:18, 28 July 2018 (UTC)