Wikipedia talk:Manual of Style: Difference between revisions

== "Respectable" results ==
The phrase [https://en.wikipedia.org/w/index.php?title=Special:Search&limit=500&offset=0&profile=default&search=%22in+a+respectable%22&searchToken=6jd6oll8hsx5uuq76g4z59jzk "X finished in 'a respectable' Yth position"] seems to me to be un-encyclopedic. "Respectability" is a subjective term. Thoughts? [[User:Bogger|Bogger]] ([[User talk:Bogger|talk]]) 14:36, 7 August 2018 (UTC)

: It's subjective and it doesn't really add information to the sentence, so unless it's from paraphrasing a source I'd say it was [[MOS:IDIOM]] that was better avoided. Do you think it might be covered under different rules or that it should be added somewhere in particular? -- [[Special:Contributions/109.79.181.42|109.79.181.42]] ([[User talk:109.79.181.42|talk]]) 16:52, 7 August 2018 (UTC)

Revision as of 16:52, 7 August 2018

HTML entities

Greetings all, I'm currently updating the style-checking code that reports to Wikipedia:Typo Team/moss, and I need some clarity on which HTML character entity references (things like &amp;) are allowed or preferred. Variations that are not allowed or which are disfavored would be brought to the attention of human editors, along with other suspected style and spelling errors. There are occasional mentions of such entities in the Manual of Style, but no general rules that I could find. I would propose the following:

HTML character entity references

(edited to reflect the below comments)

HTML character entity references are a way to tell a web browser to render a certain character without including that character in the web page directly. Characters may be referenced by name, decimal number, or hexadecimal number. For example, "&euro;" is the same as "&#x20AC;", "&#8364;", or including the character "€" directly. For a comprehensive list, see List of XML and HTML character entity references. Wikipedia editors are encouraged to follow these guidelines to make it easier for editors to read and understand wikitext, especially those not familiar with HTML notation.

  • In general, it is preferable to write characters directly instead of using an HTML entity reference. Wikipedia stores articles with Unicode, so any character that could possibly be referenced can also be input directly. The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards. Editors can also use the Unicode input method provided by their operating system.
  • Numeric references should not be used when there is a named reference available. For example, &minus; should be used instead of &#8722;.
  • References must be used when the character itself cannot be used for technical reasons. For example, "]" cannot appear in wikilinks that use "[[" and "]]" to mark the start and end. The <nowiki> tag can also be used to prevent interpretation of special characters as wiki markup.
  • Named references are preferred when the characters themselves are easily confused. This includes:
    • Whitespace. The regular ASCII space " " should be typed directly, but entities should be used for others like "&nbsp;" and "&ensp;".
    • Dashes and similar characters. The regular ASCII hyphen-minus "-" should be entered directly, but other characters might be entered with entities. For example, &minus; is generally preferred because "−" looks very similar to "-" in some web browsers. See Wikipedia:Manual of Style § Dashes for more usage guidelines.
    • Prime (′) and related symbols that resemble quote marks
  • Other guidelines ask that the Unicode characters not be used at all (except when the character itself is being discussed):

Initial discussion

What do folks think? -- Beland (talk) 19:39, 14 July 2018 (UTC)[reply]

  • Another set of characters to avoid are the superscript-digits (at least when used with a mathematical meaning). See MOS:MATH#Superscripts and subscripts. —David Eppstein (talk) 19:46, 14 July 2018 (UTC)[reply]
    • Good catch, I'll add a link to the list of exceptions. -- Beland (talk) 21:10, 14 July 2018 (UTC)[reply]
  • I disagree that mdash isn't easily confused -- in some fonts it definitely is. I'd pretty much advocate that everything not on a standard English keyboard (whatever the "standard English keyboard" is) should be symbolically represented by either a & form or a template. And I'm a little worried that the typo team link at the start of the OP talks about flagging "violations of the Wikipedia:Manual of Style"; I fear this will slide all too easily into a project to blindly "fix violations". EEng 20:06, 14 July 2018 (UTC)[reply]
    @EEng: OK, I'll drop the emdash example. As for scope...well, this is already a project to fix violations of the Manual of Style and English spelling and grammar, though it's never done blindly. In some cases it would be safe to make a bot to make certain substitutions (like converting numerical to named references), but that would require approval by Wikipedia:Bot requests to make sure it didn't have any unwanted side effects. Not sure why that is something to be afraid of; if we think a certain form is better for editors, that seems useful. We don't do that for spelling mistakes because there could be a good reason to keep the misspelling. Could you explain a bit why you feel it's better for an editor to come across say, &trade; instead of ™ when opening an article for editing? -- Beland (talk) 21:00, 14 July 2018 (UTC)[reply]
    I'm fine with replacing numerical refs and &trade; and so on; in fact I welcome it because, as I mentioned, I generally think everything not on standard keyboards should be expressed symbolically in the wiki source. It's the vague statement at Wikipedia:Typo_Team/moss that you're gonna find "violations of the Wikipedia:Manual of Style" that worries me. I don't mind automatically identifying apparent "violations", but what worries me is that that might slide into automatic "fixes" – worried because MOS isn't rigid, it needs to be applied with common sense, exceptions apply, etc. EEng 21:24, 14 July 2018 (UTC)
    Re replacing characters with entities or the reverse: what I don't want to see is slow-motion edit wars where one group of editors or bots regularly replace characters by entities and a different group regularly replace entities by characters. That sort of thing just clutters watchlists for no good reason. So I'd rather either see a very clear specification of which things should be expanded and which should be left as unicode (probably difficult to attain consensus for) or (more likely) something like WP:RETAIN where edits of this type are discouraged. —David Eppstein (talk) 21:32, 14 July 2018 (UTC)[reply]
    Absolutely agree. A hard-won consensus in advance will consume 1/1000 the editor time and energy wasted on a zillion skirmishes and rage-reverts all over the project. And certainly some part of that consensus might be that some things come under RETAIN (though honestly the less RETAIN stuff we have the better). EEng 21:35, 14 July 2018 (UTC)[reply]
    An explicit list would be great for me, since I have to code that into software anyway. I'll whip up a table. FTR, as of April there were a grand total of 7 numerical references the moss software could find, and I changed all of them just now. -- Beland (talk) 01:15, 15 July 2018 (UTC)[reply]
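
For readers wondering what the "numerical to named" check discussed above could look like, here is a minimal Python sketch. It is not the moss code; the function name `suggest_named` and the regular expression are assumptions made for illustration, and the lookup relies on Python's standard `html.entities.codepoint2name` table, which only covers the HTML 4 names.

```python
import re
from html.entities import codepoint2name  # HTML 4 names only

# Matches decimal (&#8722;) and hexadecimal (&#x2212;) numeric references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")

def suggest_named(wikitext):
    """Return (numeric_ref, named_ref) pairs for refs that have an HTML 4 name."""
    suggestions = []
    for m in NUMERIC_REF.finditer(wikitext):
        codepoint = int(m.group(1), 16) if m.group(1) else int(m.group(2))
        name = codepoint2name.get(codepoint)
        if name is not None:
            suggestions.append((m.group(0), f"&{name};"))
    return suggestions

sample = "5 &#8722; 3, &#x20AC;10, &#91;x&#93;"
result = suggest_named(sample)
print(result)
```

In this sketch `&#91;` and `&#93;` are left alone because the HTML 4 table has no names for the square brackets, which matches the proposal's use of numeric references for them.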

The proposal should be revised to make it clear how it relates to the advice already in the MOS at WP:MOS#Keep markup simple,

An HTML character entity is sometimes better than the equivalent Unicode character, which may be difficult to identify in edit mode; for example, &Alpha; is explicit whereas Α (the upper-case form of Greek α) may be misidentified as the Latin A.
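
As an illustration of the quoted advice, a small Python sketch can turn a look-alike character into its named reference. The helper name `to_named_reference` is hypothetical, and the lookup again assumes the HTML 4 table in Python's `html.entities` module.

```python
from html.entities import codepoint2name

def to_named_reference(ch):
    """Return '&name;' if HTML 4 defines a name for ch, else ch unchanged."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else ch

greek_alpha = to_named_reference("\u0391")  # Greek capital alpha
latin_a = to_named_reference("A")           # Latin capital A, no named reference
print(greek_alpha, latin_a)
```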

Also the proposal should indicate where this addition would go into the MOS; context matters.

The proposal contains the statement "The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards." That's only partially true; in the version I use, there are a variety of special characters to choose from, but when I hover over them, there isn't any little hint that pops up telling me what the name of the character is. So it is hard to be sure if a character is an n dash or a minus. In another case, it's hard to tell a prime from an apostrophe. I've learned to tell an n dash from a hyphen, but I'll bet there's lots of editors who can't. Jc3s5h (talk) 22:18, 14 July 2018 (UTC)[reply]

Hmm, thumbnails for special characters would make a great feature improvement for the web UI. I agree it's a bit of a pain; I always have to paste characters into a search engine to figure out what they are. If we're making a big table of what should be which, maybe it would need to be on its own subpage? I'm agnostic as to where this goes, and I'm open to suggestions; I don't think it matters as long as it's easy to find. -- Beland (talk) 01:15, 15 July 2018 (UTC)[reply]
FTR, I have filed a feature request for the popup text to include the character name at [1] for anyone who wants to comment or follow along at home. Thanks for the suggestion! -- Beland (talk) 06:57, 16 July 2018 (UTC)[reply]
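
Until the editing UI shows character names, one workaround is to ask the Unicode database directly. This is a minimal sketch using Python's standard `unicodedata` module; the helper name `identify` is an assumption for illustration.

```python
import unicodedata

def identify(chars):
    """Return 'U+XXXX NAME' for each character, using the Unicode database."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, 'UNNAMED')}" for ch in chars]

# Hyphen-minus, en dash, em dash, minus sign, apostrophe, prime.
names = identify("-\u2013\u2014\u2212'\u2032")
for line in names:
    print(line)
```

Pasting a suspect character into the string is enough to tell an en dash from a minus sign, or a prime from an apostrophe.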

Second draft

(Edited to reflect the below discussion)

HTML character entity references are a way to tell a web browser to render a certain character without including that character in the web page directly. Characters may be referenced by name, decimal number, or hexadecimal number. For example, &euro; is the same as &#x20AC;, &#8364;, or including the character directly. For a comprehensive list, see List of XML and HTML character entity references [2].
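
The equivalence of the three reference styles can be checked with a short Python sketch, using the standard library's `html.unescape` as a stand-in for what a browser does when it decodes references.

```python
import html

# Named, hexadecimal, and decimal references for the euro sign (U+20AC).
forms = ["&euro;", "&#x20AC;", "&#8364;"]
decoded = [html.unescape(f) for f in forms]

print(decoded)
print(all(ch == "\u20ac" for ch in decoded))
```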

In choosing between the numeric reference, named reference, and direct character methods, Wikipedia never uses the numeric reference when a named reference is available, and it usually prefers direct character input over named references (and edits in this direction are made by semi-automated systems like AutoWikiBrowser). For example, &minus; should be used instead of &#8722;, and é should be used instead of &eacute;. Wikipedia stores articles with Unicode, so any character that could possibly be referenced can also be input directly. The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards. Editors can also use the Unicode input method provided by their operating system. There are some exceptions where named references are preferred, to avoid confusion and to circumvent technical limitations. The <nowiki> tag can also be used instead of character escaping to prevent interpretation of special characters as wiki markup. These preferences are detailed in the table below, and some instances where a given character is preferably not used at all (except where that character is itself the topic of discussion) are noted. Wikipedia editors are encouraged to follow these guidelines to make it easier for editors to read and understand wikitext, especially those not familiar with HTML notation.

Category | Preferred forms | Exceptions and notes
ASCII characters | ! " % & ' + < = > [ ] | Sometimes proximity to other characters causes misinterpretation of &, <, >, [, ], or ' as part of HTML markup or wiki markup. In these cases, use &amp;, &lt;, &gt;, &#91;, &#93; or &apos;.
Latin and Germanic letters | À Á Â Ã Ä Å Æ Ç È É Ê Ë Ì Í Î Ï Ð Ñ Ò Ó Ô Õ Ö Ø Ù Ú Û Ü Ý Þ ß à á â ã ä å æ ç è é ê ë ì í î ï ð ñ ò ó ô õ ö ø ù ú û ü ý þ ÿ Œ œ Š š Ÿ | Instead of ligatures (Æ, æ, Œ, œ) write two separate letters, except in proper names and in text in languages in which they are standard – see Wikipedia:Manual of Style § Ligatures.
Greek letters | Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ ς σ τ υ φ χ ψ ω ϑ ϒ ϖ | When written standalone (not part of a Greek word with other Greek characters), the following can be used to reduce confusion with similar-looking Latin alphabet letters: &Alpha; &Beta; &Epsilon; &Zeta; &Eta; &Iota; &Kappa; &Mu; &Nu; &Omicron; &Rho; &Tau; &Upsilon; &Chi; &kappa; &omicron; &rho;. μ (mu) and Σ (sigma) are nearly identical to µ (micro) and ∑ (sum), but the other characters are not used in Wikipedia so there is no potential for confusion.
Quote marks | &lsquo; &rsquo; &sbquo; &ldquo; &rdquo; &bdquo; &acute; &prime; &Prime; | ASCII quote marks are generally preferred. Wikipedia:Manual of Style/Dates and numbers § Specific units says not to use &prime; and &Prime; for inches and feet.
Dashes | –/&ndash; —/&mdash; &horbar; &shy; | &horbar; is not used by Wikipedia. For more info on &shy; (optional hyphen) see MOS:SHY.
Whitespace and non-printing | &nbsp; &ensp; &emsp; &thinsp; &zwnj; &zwj; &lrm; &rlm; | &ensp;, &emsp;, &zwnj;, and &zwj; are generally unnecessary. For more info on text direction, see MOS:RTL.
Math | × ÷ √ ∝ ∝ ¬ ± ∂ ∇ ℵ ℜ ℑ ℘ ∀ ∃ ∈ ∉ ∋ ∅ ∏ ∑ ∠ &and; (∧ confused with ^) &or; (∨ confused with v) ∩ ∪ ∫ ∴ ∼ ≅ ≈ ≠ ≡ ≤ ≥ ⊂ ⊃ ⊄ ⊆ ⊇ ⊕ ⊗ ⊥ ⌈ ⌉ ⌊ ⌋ &lang; (⟨ confused with <) &rang; (⟩ confused with >) | In some cases TeX markup is preferred to Unicode characters; see Wikipedia:Manual of Style/Mathematics § Typesetting of mathematical formulae. × (&times;) is used in article titles and also for hybrid species. ∑ (sum) should not be used; Wikipedia uses the nearly identical Σ (sigma).
Currency | ¢ £ ¤ ¥ € $ |
Non-English punctuation | ¿ ¡ « » &lsaquo; &rsaquo; | &lsaquo; and &rsaquo; are not used by Wikipedia; < and > can be used instead.
Dots | &middot; &bull; &sdot; | "..." is preferred to "…" - see MOS:ELLIPSIS. Wiki markup should be used instead of these for lists; see Wikipedia:Manual of Style/Lists § List layout.
Diacritics | ¨ ¸ ‾ ˜ ˆ |
Arrows | ← ↑ → ↓ ↔ ↵ ⇐ ⇑ ⇒ ⇓ ⇔ |
Other symbols | ¦ § © ® ™ ° µ ¶ † ‡ ƒ ‰ ◊ ♠ ♣ ♥ ♦ | µ (micro) is not used by Wikipedia; use μ (lowercase Greek letter mu) instead - see Wikipedia:Manual of Style/Dates and numbers § Specific units
Superscript and subscript | ¹ ² ³ ª º | Do not use Unicode subscripts and superscripts like these for numbers, per Wikipedia:Manual of Style/Superscripts and subscripts; use <sup> and <sub> instead.
Fractions | ¼ ½ ¾ &frasl; | These are not used unless discussing the characters themselves; for alternatives, see Wikipedia:Manual of Style/Dates and numbers § Fractions and ratios
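
As one example of how a row of this table might be checked mechanically, here is a hedged Python sketch for the "Superscript and subscript" row. The function name `flag_superscript_digits` is hypothetical; it relies on the `<super>` compatibility decompositions in Python's `unicodedata` module and only flags characters whose base is a digit, so ª and º are ignored.

```python
import unicodedata

def flag_superscript_digits(text):
    """Return (index, character, suggested markup) for Unicode superscript digits."""
    hits = []
    for i, ch in enumerate(text):
        parts = unicodedata.decomposition(ch).split()
        # Superscript characters decompose as '<super> XXXX'.
        if len(parts) == 2 and parts[0] == "<super>":
            base = chr(int(parts[1], 16))
            if base.isdigit():
                hits.append((i, ch, f"<sup>{base}</sup>"))
    return hits

hits = flag_superscript_digits("x\u00b2 + y\u00b3")
print(hits)
```

The sketch only reports candidates; whether to replace them is left to a human editor, in line with the exception for text that discusses the characters themselves.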


Above is a draft of a definitive list of whether the HTML reference or the character itself should be used, as suggested by other editors above. I noticed a few things:

  • Both the characters and the references are widely used for endash and emdash; allow both for now?
  • mu and micro are rarely if ever used in the same context; the direct form seems preferable? Same for sum and sigma?
  • ∼ (&sim;) and ~ (ASCII tilde) seem to be used interchangeably but &sim; itself is used very rarely.

-- Beland (talk) 08:12, 15 July 2018 (UTC)[reply]
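
On the mu/micro and sum/sigma question, the pairs are distinct code points even though they look alike. The following Python sketch shows the difference and one possible replacement table; the name `PREFERRED` and the choice to map sum to sigma are assumptions taken from the draft table, not an existing tool.

```python
import unicodedata

pairs = [("\u00b5", "\u03bc"), ("\u2211", "\u03a3")]
pair_names = [(unicodedata.name(a), unicodedata.name(b)) for a, b in pairs]
for a_name, b_name in pair_names:
    print(a_name, "vs", b_name)

# Unicode compatibility normalization already folds the micro sign into mu,
# but it leaves the summation sign alone.
micro_nfkc = unicodedata.normalize("NFKC", "\u00b5")
sum_nfkc = unicodedata.normalize("NFKC", "\u2211")

# Hypothetical replacement table following the draft's recommendations.
PREFERRED = str.maketrans({"\u00b5": "\u03bc", "\u2211": "\u03a3"})
fixed = "10 \u00b5m".translate(PREFERRED)
print(fixed)
```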

  • usually prefers direct character input over named references – That's too sweeping. I can see this is gonna take a lot of discussion. For starters, pinging David Eppstein for his thoughts on literal or symbolic for math symbols (not meaning to imply there's one simple answer to that). Not pinging SM because he'll find his way here without doubt and his user name is too hard to get right and it's late and I'm tired. EEng 08:32, 15 July 2018 (UTC)
    • I think it's very important to spell out &minus; as otherwise it's too difficult to distinguish from &ndash;. Otherwise I don't feel strongly but I know I have seen legions of random AWB users replace &times; (e.g.) by its unicode character. So we should not encourage replacements that go the other way. —David Eppstein (talk) 16:30, 15 July 2018 (UTC)
      • Okey, that seems useful to note. -- Beland (talk) 21:55, 15 July 2018 (UTC)[reply]
    • @EEng: Well, if I'm counting right, out of the 252 named references, in 28 instances (11.1%), the proposal is recommending to use the reference over the character itself, and in 27 instances (10.7%) it's either not making a recommendation or different options are used in different circumstances. That leaves 78.2% of the time where the character itself is being recommended over the named reference. That seems to qualify as "usually"; am I missing something? -- Beland (talk) 21:55, 15 July 2018 (UTC)[reply]
      You're counting entries in the table; I'm counting occurrences in the wild i.e. I'd wager that the population of ndash + mdash in articles is greater than that of all those other characters put together, and those two should always be coded by name or template, IMHO. EEng 02:37, 16 July 2018 (UTC)[reply]
      @EEng: Ah, would it make more sense to say "for most characters prefers" rather than "usually prefers"? -- Beland (talk) 02:42, 16 July 2018 (UTC)[reply]
      At this point I don't know if anything needs to be said at all. I'm a bit unclear about something. Right now much or most of this advice, to the extent it's somewhere in MOS, is distributed among the various relevant sections. You're not proposing to insert this giant table somewhere, are you? Because then it will be in two places which will need to be kept in sync. EEng 03:36, 16 July 2018 (UTC)[reply]
  • WP:MOSNUM always uses the Greek letter mu or the html entity &mu; as the metric prefix for micro. I know some Unicode characters were created for obscure reasons such that Wikipedia has no interest in using those characters; I infer from its low numerical code value that &micro; (U+00B5, µ) exists as a way of coding the micro symbol that was used in some pre-Unicode character codes that didn't provide for most Greek letters, to permit round-tripping between those older character codes and Unicode. According to the Unicode Consortium, the Greek letter character is preferred.[1] Maybe use the Greek letter mu directly, whether in a Greek word, the archaic stand-alone symbol for micrometer, or the metric prefix, and explicitly encourage editors to replace µ (U+00B5) with μ (U+03BC). Jc3s5h (talk) 10:31, 15 July 2018 (UTC)
    • Ah, OK, I'll change that. Is sigma also preferred over sum in all cases? -- Beland (talk) 02:28, 16 July 2018 (UTC)[reply]
      • Looks like sum is rarely used compared to sigma which is used a lot, so I'll put in the same advice and see if that meets with popular approval. -- Beland (talk) 02:31, 16 July 2018 (UTC)[reply]
  • As a comment, &#32; is convenient in templates when you want a whitespace. --Izno (talk) 21:58, 15 July 2018 (UTC)
    • Ah, this points out to me that the regular space (which is U+0020) actually doesn't have a named reference, so it probably doesn't belong on this chart.
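
The percentages in Beland's tally earlier in this thread (252 named references, 28 where the reference is recommended, 27 with no single recommendation) can be re-derived with a few lines of Python. The variable names are illustrative only.

```python
total = 252
prefer_reference = 28
no_single_recommendation = 27
prefer_character = total - prefer_reference - no_single_recommendation

def pct(n):
    """Share of the total, as a percentage rounded to one decimal place."""
    return round(100 * n / total, 1)

shares = (pct(prefer_reference), pct(no_single_recommendation), pct(prefer_character))
print(prefer_character, shares)
```

This counts table entries, not occurrences in articles, which is the distinction EEng raises in reply.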

References

  1. ^ Beeton, Barbara; Freytag, Asmus; Sargent, Murray III (30 May 2017). "Unicode® Technical Report #25". Unicode Technical Reports. Unicode Consortium. p. 11.

EEng made a good find, that &dollar; was missing. It turns out that this is because List of XML and HTML character entity references only goes up to HTML 4, and HTML 5 has a ton more, listed here. Given the length of the resulting table if we include all of them, maybe we should just say "use the character itself except for those listed below" and list the ones where named references should be used? (And maybe continue to list the characters that should not be used at all?) -- Beland (talk) 03:53, 16 July 2018 (UTC)[reply]
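
The HTML 4 versus HTML 5 gap is easy to see from Python's standard library, which ships both tables: `html.entities.name2codepoint` for HTML 4 and `html.entities.html5` for the HTML 5 names (whose keys include the trailing semicolon). A minimal sketch:

```python
from html.entities import name2codepoint, html5

dollar_in_html4 = "dollar" in name2codepoint   # HTML 4 table
dollar_in_html5 = html5.get("dollar;")         # HTML 5 table

print(dollar_in_html4, dollar_in_html5)
print(len(name2codepoint), len(html5))
```

Any tool that builds its list from the HTML 4 table alone would miss `&dollar;` in the same way.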

I still don't understand why, to a first approximation, we're not saying that everything other than a-zA-Z0-9`~!@#$%^&*()-_=+[]{};':",./<>? should be given via &foo; or {some template}. Also, the table mixes advice on how to express various characters with advice on whether and when to use various characters. Not saying that's bad, just worth noting. EEng 04:12, 16 July 2018 (UTC)[reply]
I think accented Roman letters should certainly be written as e.g. á not &aacute;. More generally I am in favor of using unicodes over html entities or templates in most cases, with exceptions for characters like &amp; (when written next to something that would cause it to expand to a different entity) or &minus; (because there is too much possibility for confusion with other dash-like characters). Also, as an aside, the text above about avoiding ligatures is too strong; when these characters occur in the standard spelling of a name (e.g.), we should write them that way even when we are writing in English. —David Eppstein (talk) 04:25, 16 July 2018 (UTC)[reply]
Re accented Romans, I did say "to a first approximation". Re ligatures, the text says "except proper names" -- is that not enough? EEng 05:05, 16 July 2018 (UTC)[reply]
I did a quick database check, and as of April 2018, – is more popular than &ndash; by a ratio of about 10.6:1.
My thought on combining "how" and "whether" is that it's entirely likely the answer to the question "how do I put this character into Wikipedia?" is "please don't, use this other one", so having it all in one place is handy. -- Beland (talk) 05:28, 16 July 2018 (UTC)[reply]
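
On the &amp; exception mentioned above (an ampersand written next to text that would otherwise expand to a different entity), here is a short Python sketch using the standard `html` module to stand in for a browser's decoding.

```python
import html

as_entity = html.unescape("&minus;")        # decodes to the minus sign, U+2212
as_literal = html.unescape("&amp;minus;")   # decodes to the literal text "&minus;"
escaped = html.escape("&minus;")            # escaping the ampersand keeps the text literal

print(as_entity, as_literal, escaped)
```
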
The fact that literal ndash is 10X as common as symbolic just shows how much work we have to do -- in my edit window it's very hard to tell ndash from hyphen or mdash unless they're next to each other. I'm fine with combining both kinds of advice, though (again) I'm not sure exactly where this big table is gonna go. EEng 06:38, 16 July 2018 (UTC)
Well, you were using your guess that the numbers were the other way around as an argument for a wording change. The current preponderance might be evidence that most editors prefer the raw characters, or maybe it's just what people do because the UI is designed to encourage that. The fact that the UI is the way that it is may be an indication that there is not great support for using &ndash; and friends. I can generally tell the difference between dashes of different lengths, though if some people can't, that may be an indication that it just doesn't matter that much. In any case, given the lack of consensus on this, the current proposal is to remain neutral on the choice for ndash and mdash, and let editors decide on a page-by-page basis. In contrast, for other characters like ∀ and °, which can be clearly distinguished by everyone, I haven't heard a good argument for why those shouldn't just be used directly. -- Beland (talk) 23:26, 17 July 2018 (UTC)
  • Well, you were using your guess that the numbers were the other way around – No, you're mixing up two different things. I conjectured that ndashes and mdashes, together, make up the bulk (counting each use separately) of all these not-on-the-keyboard characters; that was without regard to how those characters were expressed (literal vs. symbolic).
  • that the UI is the way that it is may be an indication that there is not great support for using &ndash; and friends – WP's facilities and interfaces are full of debris that's little used or even "impossible" to use (e.g. template parameters that want to present information that an RfC has determined should never be presented). Trying to infer how things are spozed to be based on things you see in the UI will get you way off track very, very fast.
  • I can generally tell the difference between dashes of different lengths – So can I, easily in the rendered page, but in the wikisource only with a bit of effort, if I make a point of looking. It's that last bit that's the rub: in the rendered page an ndash vs. mdash look like – vs. —, but in the wikisource they're much more similar i.e. – vs. —. (What you see in that sentence may depend on your skin, so your mileage may vary.) Thus it's easy in copyediting to not notice that the wrong one is present, and that's why symbolic names should be used instead. (If we really cared we'd suggest that hyphens be rendered as &hyp; as well. I actually tried that once in an article but got laughed off the stage, so we'll just have to live with using the literal -. What I usually do is when I see e.g. a date range like 1899-1920, I just change the literal hypheny-dashy thing that's there to &ndash;, so that I know it's the right thing.)
  • I haven't heard a good argument for why those shouldn't just be used directly – Clearly a quotation in a language using a non-Roman script should just present that text literally. For everything else, there are a lot of pros and cons relating to how many different special symbols are used (in a given article), the extent to which each one is used repeatedly, how potentially confuse-able they are for one another or for something else not even used on the page, the likely sophistication of editors who might work on the article, and a lot more. Here's a random example: WP:MOSNUM says arcminutes should be denoted by a prime and not an apostrophe or a single quote i.e. ′ but not ‘ or ' . Once again, you have to be looking to notice if the wrong one is there; thus MOSNUM suggests that the markup &prime; be used to save editors squinting. Unfortunately different considerations come into play for different symbols, so separate analyses are needed in each case. That's why I predicted this discussion would take a long time.
EEng 03:32, 18 July 2018 (UTC)[reply]

As for the general direction of the advice, using characters directly seems to be the recommended best practice for web development generally. It's more WYSIWYG and easier for web editors to read and think about. It also fits the goal of not forcing editors to learn HTML in order to be able to use Wikipedia; they can just input and edit these characters in the same way they do elsewhere like Word or phone apps or other web sites. We also have a UI right below the text-being-edited box which encourages people to add the characters directly; it would be weird if the advice is to generally use the references because that's not what the system is designed to encourage. The escaping system was originally designed to allow input of special characters that were part of SGML or HTML itself (like angle brackets). Later it became a way to work around the limitations of ASCII. But modern web sites all use Unicode now, as does Wikipedia, so it's a bit of an obsolete workaround. I think any system where you have to learn a special language for telling a computer something is less user-friendly than a system where you can express your intention in the way you would express it to other humans. -- Beland (talk) 06:29, 16 July 2018 (UTC)[reply]

I think we should treat it like citations: citations are hard, both inside Wikipedia and outside. Just see what happens in any university freshman humanities class where citation expectations are rigorously enforced for the first time in most students' lives. So at Wikipedia we're satisfied if the first editor gives some way to find the source; gnomes can improve the citation format later. And the tools to do the improvement exist.
Similarly, editors who are not skilled with markup can do the best they can with the visual editor and other editors can improve it. The editors who make the improvements need the tools to do so, and bots must not overrule their contributions by converting html entities to characters.
The idea that you can write documents and web pages with purely WYSIWYG tools is only true if you're writing something simple, or you're a slob. That's why Microsoft Word has a little paragraph symbol so you can turn on the display of paragraph marks. That's why WordPress has two editing tabs, WYSIWYG view, and HTML view. The Wikipedia editors are quite primitive, hence the need for HTML entities continues. Jc3s5h (talk) 10:54, 16 July 2018 (UTC)
I agree contributions of new editors should be welcomed whether or not they follow this sort of guideline; I added language to that effect in the draft. -- Beland (talk) 23:26, 17 July 2018 (UTC)

General comment: This discussion may affect WP:CHECKWIKI error 11. The error is currently deactivated. -- 11:10, 16 July 2018 (UTC)

  • A couple of quick responses:
    1. Wrap the table's characters-as-such, not just the HTML character entities, with <code>...</code> or perhaps with {{kbd}}, whatever looks better (semantically, it can be either – it's code when viewed in the wikitext but also input when you're entering it). If we don't like any of the faint-background effects, use bare <kbd>...</kbd>, which just uses monospace. I would go with <code> because the table already uses a light grey and it blends in well, while also not requiring any template calls.
    2. That for which we're providing entity codes should also be shown as characters.
    3. That for which we're showing characters but recommending/allowing entity codes should also be shown as those codes.
    4. "ASCII characters": Present the characters in the same order as the codes in the later column.
    5. "Greek letters": Change "but the other characters are not used" to "but these latter two characters are not used".
    6. "Dashes": This is a misuse of the slash character and results in confusing typographical gibberish: "–/&ndash; —/&mdash;". Try: "– (&ndash;), — (&mdash;),". Also, "For more info on ­ (optional hyphen) see MOS:SHY" is a misuse of parentheses (round brackets), seemingly for some kind of emphasis. Should just remove them.
    7. "Whitespace and non-printing": should also include &hairsp;; like &thinsp; it is generally only used for kerning in templates and such; there is usually not any reason to manually insert either into an article.
    8. "&lsaquo; and &rsaquo; are not used by Wikipedia; < and > can be used instead" is wrong; they are not the same character and should not be confused. If we need to illustrate French quotation style, etc., use the correct characters, not less-than and greater-than, which serve an entirely different purpose. This is pretty much exactly like hyphen vs. dash vs. minus.
 — SMcCandlish ¢ 😼  07:28, 17 July 2018 (UTC)[reply]
The weird "shy" line was due to a typo preventing &shy; from showing up at all. I fixed that. You're right about lsaquo; I must have messed up something when scanning the database for it. I'll change that and other points you mention in the next draft, as applicable. Thanks for reading! -- Beland (talk) 00:37, 18 July 2018 (UTC)[reply]

Third draft

Posted to Wikipedia:Manual_of_Style/Text_formatting#HTML_character_entity_references

Proposed as new subsection titled "HTML character entity references" under Wikipedia:Manual of Style § Miscellaneous, replacing the second paragraph of "Keep markup simple".

HTML character entity references are a way to tell a web browser to render a certain character without including that character in the web page directly. Characters may be referenced by name, decimal number, or hexadecimal number. For example, &euro; is the same as &#x20AC;, &#8364;, or including the character directly.

On Wikipedia, characters should be used directly unless doing so is confusing for editors or causes technical problems. Numerical references should not be used if a named reference is available. For example, &minus; should be used instead of &#8722;, and é should be used instead of &eacute;. Edits favoring these conventions are made by semi-automated systems like AutoWikiBrowser. For a comprehensive list of available named references, see [3].

Wikipedia stores articles with Unicode, so any character that could possibly be referenced can also be input directly. The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards. Editors can also use the Unicode input method provided by their operating system. There are some exceptions where named references are preferred, to avoid confusion and to circumvent technical limitations. The <nowiki> tag can also be used instead of character escaping to prevent interpretation of special characters as wiki markup.

Characters to avoid
Avoid | Instead use | Note
… (&hellip;) | ... (i.e. 3 periods) | See MOS:ELLIPSIS.
Unicode Roman numerals like Ⅰ Ⅱ ⅰ ⅱ | Latin letters equivalent (I II i ii) | MOS:ROMANNUM
Unicode fractions like ¼ ½ ¾ &frasl; | {{frac}}, {{sfrac}} | See MOS:FRAC.
Unicode subscripts and superscripts like ¹ | <sup></sup> <sub></sub> | See WP:SUPSCRIPT. In article titles, use {{DISPLAYTITLE:...}} combined with <sup></sup> or <sub></sub> as appropriate.
µ (&micro;) | μ (&mu;) | See MOS:NUM#Specific units
Ligatures like Æ æ Œ œ | Separate letters (AE ae OE oe) | Generally avoid except in proper names and text in languages in which they are standard. See MOS:LIGATURES.
∑ (&sum;) ∏ (&#8719;) ― (&horbar;) | Σ (&Sigma;) Π (&Pi;) — (&mdash;) | (Not to be confused with \sum and \prod, which are used within <math> blocks.)
‘ (&lsquo;) ’ (&rsquo;) ‚ (&sbquo;) “ (&ldquo;) ” (&rdquo;) „ (&bdquo;) ´ (&acute;) ′ (&prime;) ″ (&Prime;) ` (&#96;) | Straight quotes (" and ') | Use {{coord}}, {{prime}} and {{pprime}} for mathematical notation; elsewhere use straight quotes unless discussing the characters themselves. See MOS:QUOTEMARKS.
‹ (&lsaquo;) › (&rsaquo;) « (&laquo;) » (&raquo;) | | Use &lang; and &rang; for math notation. In foreign quotations normalize angle quote marks to straight, per MOS:CONFORM, except where internal to non-English text, per MOS:STRAIGHT.
&ensp; &emsp; &thinsp; &hairsp; | Normal space | These are sometimes used for precision positioning in templates but rarely in prose, where non-breaking (&nbsp;) and regular spaces are normally sufficient. Exceptions: MOS:ACRO, MOS:NBSP.
In vertical lists: • (&bull;) · (&middot;) ⋅ (&sdot;) | * | Proper wiki markup should be used to create vertical lists. See HELP:LIST#List basics.
&zwj; &zwnj; | see note | Used in certain foreign-language words, see zero-width joiner/zero-width non-joiner. Should be avoided elsewhere.
₤ | £ for GBP, keep ₤ for Italian Lira and other lira currencies that use ₤ (see the main article for that currency) | MOS:CURRENCY; find broken instances
Potentially confusing or technically problematic characters
Category | Coded form (direct form) | Notes
Miscellany | &amp; (&) &lt; (<) &gt; (>) &#91; ([) &#93; (]) &apos; (') &#124; (|) | Use these characters directly in general, unless they interfere with HTML or wiki markup. Apostrophes and pipe symbols can alternatively be coded with {{'}} and {{!}} or {{pipe}}. See also character-substitution templates and WP:ENCODE.
Greek letters | &Alpha; (Α) &Beta; (Β) &Epsilon; (Ε) &Zeta; (Ζ) &Eta; (Η) &Iota; (Ι) &Kappa; (Κ) &Mu; (Μ) &Nu; (Ν) &Omicron; (Ο) &Rho; (Ρ) &Tau; (Τ) &Upsilon; (Υ) &Chi; (Χ) &kappa; (κ) &omicron; (ο) &rho; (ρ) | In isolation, use coded forms to avoid confusion with similar-looking Latin letters; in a Greek word or text, use the direct characters.
Quotes | &lsquo; (‘) &rsquo; (’) &sbquo; (‚) &ldquo; (“) &rdquo; (”) &bdquo; („) &acute; (´) &prime; (′) &Prime; (″) &#96; (`) | Can be confused with straight quotes (" and '), commas, and with one another. MOS:STRAIGHT generally requires conversion to straight quotes, except when discussing the characters themselves or sometimes with non-English languages. See next row for prime characters.
Apostrophe-like | ' ` ´ ʻ ʼ ʽ ʾ ʼ ʽ ʻ ʼ |
Dashes, minuses, hyphens | &ndash; (–) &mdash; (—) &minus; (−) - (hyphen) &shy; (soft hyphen) | Can be confused with one another. For dashes and minuses, both forms are used (as well as {{endash}} and {{emdash}}). Soft hyphens should always be coded with the HTML entity or template. Plain hyphens are usually direct, though at times {{hyphen}} may be preferable (e.g. Help:CS1#Pages). See MOS:DASH, MOS:SHY, and MOS:MINUS for guidelines.
Whitespace | &nbsp; &emsp; &ensp; &thinsp; &hairsp; &zwj; &zwnj; | In direct form these are nearly impossible to distinguish from a normal space. See also MOS:NBSP.
Non-printing | &lrm; &rlm; | In direct form these are nearly impossible to identify. See MOS:RTL.
Mathematics-related | &and; (∧) &or; (∨) &lang; (⟨) &rang; (⟩) | Can be confused with x ^ v < >. In some cases TeX markup is preferred to Unicode characters; see MOS:FORMULA. Use {{angbr}} instead of ) / ()
Dots | &sdot; (⋅) &middot; (·) &bull; (•) | Can be confused with one another. Interpuncts (&middot;) are common in horizontal lists and to indicate syllables in words. Multiplication dots (&sdot;) are used for math. In practice, the dots are used directly instead of the HTML entities.
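To make the general rule and its exceptions concrete, here is a rough Python sketch of a converter that replaces named references with the direct character unless the name is on a keep-coded list. The `KEEP_CODED` set is an assumed, partial reading of the second table (markup characters, whitespace, non-printing characters, soft hyphen, and the dashes and minus, which the draft leaves in either form), and `to_direct` is a hypothetical helper, not the checker's actual code. It does not attempt the "Greek letter in isolation" case, which needs human judgment.

```python
import re
from html.entities import html5

# Assumed subset of references that stay coded (or are left alone).
KEEP_CODED = {
    "amp", "lt", "gt",                               # can clash with markup
    "nbsp", "ensp", "emsp", "thinsp", "hairsp",      # invisible whitespace
    "zwj", "zwnj", "lrm", "rlm", "shy",              # non-printing
    "ndash", "mdash", "minus",                       # both forms in use
}
NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def to_direct(text):
    """Replace named references with direct characters, except kept ones."""
    def swap(match):
        name = match.group(1)
        if name in KEEP_CODED:
            return match.group(0)
        # html5 maps "eacute;" -> "é"; unknown names are left untouched.
        return html5.get(name + ";", match.group(0))
    return NAMED_REF.sub(swap, text)

print(to_direct("caf&eacute; &amp; S&atilde;o&nbsp;Paulo"))
```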

Discussion of third draft

FTR, as of the July 1, 2018 database dump, &lsqb; is used about 329 times and &lbracket; is used about 91 times, so I picked the more common one. -- Beland (talk) 15:04, 18 July 2018 (UTC)
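A usage count of this kind can be approximated with a simple tally over page text. The following is a sketch only: the sample strings are made up, and reading a real database dump (and skipping nowiki or code sections) is left out.

```python
import re
from collections import Counter

NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def count_named_refs(pages):
    """Tally how often each named reference appears across page texts."""
    counts = Counter()
    for wikitext in pages:
        counts.update(NAMED_REF.findall(wikitext))
    return counts

# Hypothetical stand-in for pages read from a dump.
sample_pages = ["a&lsqb;1] b&lsqb;2]", "c&lbracket;3]"]
counts = count_named_refs(sample_pages)
print(counts.most_common())
```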

  • While I still have my reservations about where this is going and the amount of effort it will take to iron all the bugs out, I'm warming up to this. EEng 15:35, 18 July 2018 (UTC)
  • The table asserts the &Prime; HTML entity resembles the ASCII backtick (`), and even has something displayed that looks like a backtick. But this is the real result of the &Prime; HTML entity: ″. The table is just a mass of stuff and I wouldn't be able to find anything in there to make corrections. Jc3s5h (talk) 16:46, 18 July 2018 (UTC)
    • @Jc3s5h: Sorry, the backtick was missing from the second table; I just fixed that. It was rather exhausting to catalog everything and try to format it properly, so I didn't get a chance to double-check things. You're right about it being hard to read, so I also put each character in the second table on its own line, to make matching up characters and references easier. Is that clear enough now? Is it making the table too long? -- Beland (talk) 23:21, 18 July 2018 (UTC)
      • In the table, as rendered, &Prime; appears twice. Each time the character next to it is `, which is U+0060 and is named GRAVE ACCENT. But this is wrong; it should look like a double prime and is U+2033. It is used to mark seconds of time or seconds of arc; a backtick is completely wrong for that. Jc3s5h (talk) 00:04, 19 July 2018 (UTC)
        • Ah, that was caused by a capitalization error in the first table. Fixed! -- Beland (talk) 05:47, 20 July 2018 (UTC)
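Mix-ups like the one in this thread (entity names differing only in capitalization) can be caught by printing the code point and Unicode name behind each reference. A small sketch with the Python standard library; `describe` is a hypothetical helper name.

```python
import html
import unicodedata

def describe(ref):
    """Return (code point, Unicode name) for a single character reference."""
    char = html.unescape(ref)
    return f"U+{ord(char):04X}", unicodedata.name(char)

for ref in ["&Prime;", "&prime;", "&#96;"]:
    print(ref, *describe(ref))
```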
Mostly looking good. I would put this at the bottom of MOS:TEXT, probably. Maybe in a section called "Unicode characters". We could see about cross-referencing it in various places.  — SMcCandlish ¢ 😼  02:07, 19 July 2018 (UTC)
Gave the boxes a spinshine/reorganization. Headbomb {t · c · p · b} 18:29, 19 July 2018 (UTC)

I posted this to Wikipedia:Manual_of_Style/Text_formatting#HTML_character_entity_references (there's another section there that talks about Unicode PUA and RTL characters) and cross-referenced from Wikipedia:Manual of Style § Miscellaneous. Feel free to edit the live version as needed. -- Beland (talk) 05:56, 20 July 2018 (UTC)

And thanks to everyone for greatly improving this section from the initial draft! It will be a great help to me in writing the code that will flag less-than-clear usage. -- Beland (talk) 05:57, 20 July 2018 (UTC)

Might be worth adding a comment in the Greek notes that the same sort of thing applies to Cyrillic letters that look like Latin and Greek ones; use the entity codes for clarity when discussing particular characters, but use the Unicode in actual Russian, Ukrainian, etc. words. We probably needn't dwell on the details, since there's another proposal open for centralizing all the scattered Cyrillic-related material to one page. Then again, that's mostly to be about transliteration, so maybe the Greek section in the table should be Greek and Cyrillic?  — SMcCandlish ¢ 😼  04:11, 22 July 2018 (UTC)

Instances of character references for Cyrillic letters seem to be relatively rare. I don't see any on a casual skim through this report, though I'd have to go through the entire alphabet to definitively say they are never used. Unlike Greek letters, they aren't in common use for scientific and mathematical purposes. I think it would be simpler and probably more user-friendly just to say to use the Cyrillic characters directly, which is what the draft is currently proposing. -- Beland (talk) 07:58, 22 July 2018 (UTC)
Works for me.  — SMcCandlish ¢ 😼  15:47, 26 July 2018 (UTC)

Reversion of addition of third draft

So after I posted the tables proposed above, David Eppstein reverted, with the edit summary "what part of "I think you should be more patient"..."Try proposing something narrower and more specific" do you not understand?".

I think I did not see those remarks by David Eppstein and SMcCandlish because they were posted in the discussion ("Fraction slash" below) about the "Slashes" section of the main MOS page, which I did not check for comments before updating the "Text formatting" MOS subpage. SMcCandlish wanted a one-word change to the "Slashes" section, which he implemented. I think David Eppstein was commenting on the change he reverted, as he then wrote:

I'm not convinced that the html section is needed at all. It is more material for a guidebook on html than style guidance for Wikipedia editors. And you appear to have the purpose of using the new section as a bludgeon to begin a massive project of automatically reformatting characters in Wikipedia, which I think is a bad idea (watchlist clutter for no visible change to articles).

"Bludgeon" sounds pretty ugly and mean. I started a project to spell-check all Wikipedia, which is intended to improve its readability and credibility. Along the way I noticed that editors have also occasionally misspelled HTML character entity references. I thought as long as we're cleaning up the misspellings, we might as well clean up any undesirable forms, because right now we don't seem to be representing them consistently. I started this discussion because I couldn't find any guidance in the Manual of Style to help me write the code to correctly flag undesirable forms vs. ignore desirable forms.

Mediawiki markup uses this part of HTML syntax, and if we have a preferred form for these things we'd want to communicate that to editors, and the Manual of Style is the place to document choices of style rather than technical how-to for the benefit of editors, so I don't understand the criticism that this is not the right place for this sort of guideline. Especially since Wikipedia:Manual of Style#Keep markup simple already discusses exactly this point, and the other sections linked from the proposed tables also address which characters are preferred.

We already encourage editors to make edits that have no reader-visible changes but do have editor-visible changes intended to make wikitext easier to read and thus articles easier to edit. That's the whole point of Wikipedia:WikiProject Wikify and wikification. I do agree there are some edits that don't improve readability all that much that aren't that worthwhile on their own, like changing "==xx==" to "== xx ==". This seems less trivial than that. I'd also note we have Wikipedia:HTML5, a project which is doing nothing but replacing obsolete HTML tags with newer ones, with hopefully no user-visible changes.

There are less than 20,000 articles that even have HTML character entity references at all, less than 3.5% of all articles. Even if we changed all of them today, given the sheer volume of changes to the encyclopedia it would not be a big deal, and in reality it will probably take months or years to manually change all the instances, if that's what we want to do. At worst, editors who notice these changes happening will be educated about the desired way of doing things, and be more likely to input characters that way when adding new text.

Given that editors seem to use characters a lot more than references, and given that characters are built into the Wikipedia UI, it seems a lot less disruptive to move toward characters than away from them.

To illustrate the difference it makes to editors, consider an editor who comes across "S&atilde;o Paulo" in wikitext. To most people who are not web developers, that looks like a typographical error. Some English-speaking people might correct it to "Sao Paulo" which is often seen in English, or, getting the idea there might be an accent there, to "Sáo Paulo", which is incorrect. "São Paulo" is what Portuguese speakers are expecting to see - it's what they type with their keyboards, and it's what appears in Word docs and on the Portuguese Wikipedia and on Google Translate, and in the readable parts of other web sites. With "São Paulo", everyone knows exactly what's going on, and there's no need to waste time doing a search on the meaning of "atilde" or "&atilde" or whatnot.

If I were making the rules, I think I'd keep it simple and say to use characters directly except for otherwise invisible characters and those that cause technical problems when used directly. I'd actually be fine if we used ASCII hyphens for all of our dashes, but I'm not complaining if people who can see the difference on their monitors want to upgrade some of them to emdashes to make things look pretty as in the golden years of paper typography. That would make a much smaller table than the one proposed above, but given that other editors seem to feel more strongly about making it easy to tell the difference between certain lookalike characters, I think that table now represents a pretty good compromise. Leaving dashes and quotes as they are takes the biggest chunks of potential work off the table, anyway.

Given that this is proposing a simple general rule and then listing all the desirable exceptions to it, I'm not sure that a narrower proposal would make sense. The volume of comments has been relatively small, so having multiple discussions about the same topic would, it seems, just burn more editor time. I am, however, open to actionable suggestions. -- Beland (talk) 08:03, 22 July 2018 (UTC)

@David Eppstein: Did you have any thoughts in response? -- Beland (talk) 18:46, 23 July 2018 (UTC)
I don't think we should be setting up automatic processes that make neither a visible change to article content nor a semantic difference to the markup of the articles. And I don't think we should be prescribing such things in the MoS and by doing so encouraging such processes. —David Eppstein (talk) 18:54, 23 July 2018 (UTC)
@David Eppstein: OK, would you be happy if the guideline said that all such changes be made manually? -- Beland (talk) 20:27, 23 July 2018 (UTC)
Still not strong enough. I would prefer that such changes be made only as part of other substantive changes to articles (more or less what usually happens now with AWB users; see WP:AWBRULES #4). —David Eppstein (talk) 20:35, 23 July 2018 (UTC)
OK, I think that will lead to undesirable forms lingering around for a long time for no particularly good reason. -- Beland (talk) 20:56, 23 July 2018 (UTC)
(And I think leaving those forms around would generate higher cognitive load and more work for editors than the messages generated by removing them.) -- Beland (talk) 21:01, 23 July 2018 (UTC)
(ec with D.E.) Way TLDR. I warned you that this would take a LOT of work and patience before it would be ready to become part of MOS. Your table, without question, inadvertently trods on a lot of toes in the form of established ways various groups of editors do things in various topic areas. It would be wonderful to systematize and summarize and centralize all this but, like I said, it's gonna be a lot of work. And it's one thing to come up with a guide for future editing; it's quite a different one to use it for some mass-change project. To be blunt, if you think that Even if we changed all of them today, given the sheer volume of changes to the encyclopedia it would not be a big deal then there are some things you really don't understand; if you made changes like this to 3% of articles in one day, or one week, or even one month, you'd be strung up by your URLs.

I haven't been following that last week of discussion so I don't know where we are and what the open issues are, but if you want this to see the light of day you need to be prepared to keep plugging for quite some time to work through all the details with all interested parties (not that I even know how to find them). I've gone through an effort like this myself elsewhere in MOS and it can be an exhausting task, though you will be quite rightly congratulated by all in the end if you can pull it off, because it will be a very useful achievement for the project. EEng 19:05, 23 July 2018 (UTC)

What does "ec with D.E." mean? If you think I should consult more people, but don't know how to go about doing that, that's not really an actionable suggestion. -- Beland (talk) 20:25, 23 July 2018 (UTC)
It means "edit conflict"; EEng and I wrote our comments in parallel. —David Eppstein (talk) 20:36, 23 July 2018 (UTC)
@EEng: As far as I know, the only open issue is whether these improvements would justify their own systematic edits. To a large degree, this is just codifying current practice so we can clean up stragglers, so I don't expect very many objections. -- Beland (talk) 07:11, 24 July 2018 (UTC)
This will need much wider exposure before you can have that kind of confidence. EEng 07:17, 24 July 2018 (UTC)
@EEng: I was only referring to issues that had been raised by editors who have already heard of the proposal. But how would you like to see me go about getting wider exposure? -- Beland (talk) 21:27, 25 July 2018 (UTC)

How do other editors feel about David Eppstein's proposal for a rule that "such changes be made only as part of other substantive changes to articles"? Personally, I don't see the need for that, given the arguments I made above, but of course I'll implement whatever the consensus is. -- Beland (talk) 20:46, 23 July 2018 (UTC)

This is a whole lot of stuff being discussed at once. I'll cover it in the order in which I'm seeing it come up above:

  1. My "Try proposing something narrower and more specific" (and David Eppstein's "I think you should be more patient", from what I can tell) were from the discussion below, on fraction-slash, and have nothing to do with the discussion above about having a handy quick-reference table on characters and their entities and what to do with them on WP. (Well, my comment didn't; I can't read David's mind.) That should be restored, toward the bottom of Wikipedia:Manual of Style/Text formatting I would think.
  2. "I'm not convinced that the html section is needed at all" no longer seems to have a referent. The table version 3 has no such sectioning.
  3. This point by Beland is correct: "Mediawiki markup uses this part of HTML syntax, and if we have a preferred form for these things we'd want to communicate that to editors, and the Manual of Style is the place to document [it]".
  4. Beland's entire "We already encourage editors to make edits that have no reader-visible changes but do have editor-visible changes ..." paragraph and the two that follow it are correct.
  5. David says: "I don't think we should be setting up automatic processes that make neither a visible change to article content nor a semantic difference to the markup of the articles." I can't find anywhere that this has been suggested, and it would already be governed by WP:COSMETICBOT. Beland seems to want to use this for AWB/GENFIXES purposes, but that's not automated. It's semi-automated, and entirely permissible when done in the course of more substantive edits.
  6. Consequently, "I don't think we should be prescribing such things in the MoS and by doing so encouraging such processes" doesn't really track. A) We do in fact have preferences, recorded willy-nilly throughout MoS (e.g. use ... not … or &hellip;, at MOS:ELLIPSIS; and use μ or &mu; not &micro;, at MOS:UNITSYMBOLS; and so on), so the idea that it's off-topic or out-of-scope for MoS doesn't fly. B) MoS has already been updated with a footnote against automated "enforcement" of MoS stuff, including cross-references to the COSMETICBOT policy and to ArbCom decisions about it. The fact that someone could go on a bot-mediated enforcement rampage is not an argument against MoS having line-items about various stuff; the fact that we have rules against doing that is already sufficient to address the rare problem. The fact that someone just lost their AWB access as a result of doing something like that should discourage a repeat. Rules do not need 100% compliance to be useful, nor does failure to achieve 100% compliance mean they're insufficient; otherwise civil society would be impossible.
  7. David's "I would prefer that such changes be made only as part of other substantive changes to articles": We can include something about this, but not making up a new rule just for this, only pointing out the existing ones. MoS is not an editing or behavioral policy nor a dispute resolution board. This is already covered by WP:MEATBOT policy and WP:AWBRULES, and is just how WP:GENFIXES works. The aforementioned footnote can simply be recycled from the main MoS page to wherever this table will live.
  8. EEng says: "Your table, without question, inadvertently trods on a lot of toes in the form of established ways various groups of editors do things in various topic areas." That's not "without question"; prove it, please. Then we can integrate whatever tweaks are necessary. And sometimes toes have to be stepped on, anyway. Not everything some gaggle of people at a wikiproject are doing is a good idea, nor do they get to just make up their own rules and force others to comply; site-wide concerns override local ones (WP:CONLEVEL policy).

    And what was once an okay idea can become a poor one over time as circumstances change. E.g., the cutover last month to a new HTML linter for the parser broke all kinds of stuff that used to be "okay" or "we don't care", but which is no longer okay, and thus we now do care. The most obvious of these is that unclosed inline elements used to be forcibly closed at the opening of a block element and this is no longer the case, resulting in badly broken, mis-rendering HTML in at least tens of thousands of pages. People have been cleaning this up, including with semi-automation tools like AWB and JWB, yet no one is having a shit-fit about it. People will have shit-fits about such activity if it's PoV pushing (e.g. changing all "U.S." to "US", or changing all unspaced em-dash parenthesizing to use spaced en dashes), but they don't lose it over technical cleanup. Another example is that <br> breaks the output of at least two of the available edit-mode syntax highlighters, and needs to be changed to <br />; I've already fixed one "Help:"-namespace page from the 2000s that was recommending <br>, and there are probably some others that need fixing in this regard.

  9. The obvious way to proceed is for EEng to document these "toes" he says are being stepped on; for discussion to ensue, with any needed adjustments being made to the table; and then – if we really think it's necessary – do an adoption RfC on table version 4.

 — SMcCandlish ¢ 😼  16:38, 26 July 2018 (UTC)

  • My point about the toes is simply that, from experience, people tend to be very set in their ways about low-level details such as direct (literal) pasting in of characters vs. coded form (and, where a coded form is used, both the & forms and template forms have their enthusiastic adherents). So the wider this is advertised and discussed the better, to save WP:WHINE-ing down the road.
  • I think the table needs to recognize that there are much-used template forms e.g. {ndash}
  • If we're going to all this trouble, I'd like to see a shift to a preference for coded forms of mdash and ndash, instead of the current even-handed statement. It's just crazy-making that you can't tell if the right character is present (depending on your font and platform of course). This of course would be a stepping on of some toes.
  • I'm still hoping to get an explanation of why Whitespace other than the non-breaking &nbsp; and regular space should be avoided in prose.
EEng 05:27, 27 July 2018 (UTC)
Because they cause copy-pasting errors/oddities, clash with find/replace searches, don't play nice with screen readers, mess with alignment/justification, and there's pretty much no point to them in any sort of prose: "James&thinsp;Dean was an actor." is pure nonsense. Headbomb {t · c · p · b} 11:06, 27 July 2018 (UTC)
Yeah, we only use these for special kerning purposes. If there's some case where we're regularly using thin and hair spaces and it's not spacing tweaks in tight material in template output, feel free to point out where we're doing it, and it can be accounted for (if it's a good idea). Other stuff:
  • As for the "set in their ways about ... direct (literal) pasting in of characters", that's irrelevant, because MoS doesn't constrain editors in any way as to adding new material. You can edit WP without ever complying with anything MoS says, as long as you're following WP:CCPOL, and not a) changing guideline-compliant material to be non-compliant, or b) reverting people making non-compliant material be compliant.
  • Re "the table needs to recognize that there are much-used template forms e.g. {ndash}" – sure. That's not an objection to the table, it's an expansion suggestion.
  • On changing to &ndash;: I actually proposed that several years ago for the same reason, and did not get consensus. Apparently the average editor, with their fonts, can see the difference clearly, and people were dismissive of the idea because the editing tools below the edit window provide a button for directly inserting the Unicode character. I think, therefore, this is a lost cause. Editors having trouble seeing the difference between –, —, −, and - need to use WP:User CSS or their browser's font settings to use a font for editing that works better for them. I wrote instructions on how to do this at Help:User style#User CSS for a monospaced coding font. It's not absolutely perfect; the minus and hyphen are still hard to distinguish. If I find a better, free coding font than Roboto Mono I'll put it at the front of the font stack.
 — SMcCandlish ¢ 😼  13:07, 27 July 2018 (UTC)
Or you can use WP:WIKIED WP:WIKED which marks them as different in the edit window. Headbomb {t · c · p · b} 13:18, 27 July 2018 (UTC)
I'm assuming you didn't really mean m:Wiki Education Foundation. EEng 17:44, 27 July 2018 (UTC)
Yes, my bad, fixed. I meant WP:WIKED. Headbomb {t · c · p · b} 18:18, 27 July 2018 (UTC)
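For anyone who cannot tell the horizontal marks apart by eye in source view, regardless of font or editor gadget, a few lines of Python will name them. This is only an illustrative sketch; `label_marks` is a hypothetical helper and the lookalike set is limited to the four characters discussed above.

```python
import unicodedata

# Hyphen-minus, en dash, em dash, minus sign.
LOOKALIKES = "-\u2013\u2014\u2212"

def label_marks(text):
    """List the code point and Unicode name of each dash-like character."""
    return [
        (f"U+{ord(char):04X}", unicodedata.name(char))
        for char in text
        if char in LOOKALIKES
    ]

print(label_marks("1914\u20131918, well-known, \u22125"))
```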
  • Obviously no one's suggesting James&thinsp;Dean so that isn't helpful, and BTW I just checked and text search on Chrome has no problem understanding that thin space is a space. Now and then I've used hsp to adjust "something in italics"[5] to "something in italics"[5] (your mileage may vary, of course) and I'm sure I've used thinsp now and then though I can't recall where. Take a look at this change [4].
  • I'm not so sure that the evidence is that Apparently the average editor, with their fonts, can see the difference clearly. I suspect instead that the great majority of editors don't even know there is a difference (and just use hyphen), most of those who know the difference are inserting directly using the click-to-insert gizmo but don't really notice or care what it looks like in the edit window since they never look back, and the very small number of us who are copyediting and checking these things have learned to deal somehow with the difficulty of distinguishing them – in my case, wherever I see a direct/literal character which I know should be an ndash but I'm not sure, I just change it to {ndash} so I know it's right. But I'd rather we encouraged editors to use a coded form in the first place to save that trouble. Unfortunately that would create a new flashpoint for my next point, which is...
  • MoS doesn't constrain editors in any way as to adding new material – You know that and I know that, but as sure as day follows night someone's gonna paste in a direct rho, someone else is gonna change that to &rho; (as recommended in the table), and the first guy's gonna change it back, saying "I like it this way." Having said that, looking over the whole table now I don't see very many cases where that might happen (unless we adopt a recommendation to use coded forms of ndash and mdash) but I still think the wider this is advertised for comment in advance the less trouble there will be.
EEng 17:44, 27 July 2018 (UTC)
Do you have any example of where thinsp/ensp/emsp/hairsp should be used in prose? Because you have none, and no one can come up with any use for them in prose. Until you have such counter examples, the "avoid them in prose" position has consensus, and the "allow them" position is simply your own preference not to disallow them, for reasons which are never explained. Headbomb {t · c · p · b} 18:41, 27 July 2018 (UTC)
I guess you didn't read my post above because the first bullet point gives one. I've been very up-front about my wish that we could recommend coded dashes over direct dashes, instead of just trying to force it into the table. Please have the same courtesy about your apparent wish to flatly forbid thinsp and hsp. Is such a provision already present in MOS? EEng 18:45, 27 July 2018 (UTC)
Such a provision is the current state of Wikipedia. No one writes "something in italics"[5], and they shouldn't start to do so either. Not sure what that has to do with dashes. Headbomb {t · c · p · b} 18:49, 27 July 2018 (UTC)
Is a blanket ban on thinsp and hsp already in MOS or not? EEng 18:56, 27 July 2018 (UTC)
The only use I can recall for which I manually employ thin space is between § and the section number that follows it, to split the difference between "§ 1.2.3" and "§1.2.3" styles. This is just a personal habit of mine; there's no rule about it. The only use I've ever had for hair space, outside of a template, is between em dash and an author name when attributing a quotation: "Humor is Mankind's greatest blessing." — Mark Twain. Also not a rule; it just looks better. Neither of these uses is vital. But they're not objectionable. So, we have a handful of use cases we can document, and then discourage it otherwise. Put it in a footnote, probably. I'm a big fan of footnoting "there are some geeky exceptions" stuff instead of clouding the central advice. On horizontal marks: Well, you can try proposing glyph-to-code conversion if you want, but don't hold your breath. With my font tweaking solution, I have no difficulty at all telling en dashes and hyphens apart, in rendered or source view. "The wider this is advertised": Sure, but not while we're still banging on it just with 3 or 4 people. Iron out the obvious kinks, or, even more surely than day follows night, people will "strongly oppose" the whole thing on the basis of some nitpick we should have already anticipated.  — SMcCandlish ¢ 😼  20:58, 27 July 2018 (UTC)
Obviously I meant we elite should get it in the best form we can before inviting the hoi polloi to look at it. EEng 21:29, 27 July 2018 (UTC)
Thin space is needed for the correct typography of some mathematics formulas. E.g. (from something off-wiki I was working on today) without thin space: ; with thin space: . The thin space makes it much more clear that this is a product of two subformulas rather than some strange binary-operator usage of the exclamation point. —David Eppstein (talk) 21:18, 27 July 2018 (UTC)
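A hypothetical illustration of the kind of case described, written in LaTeX, where `\,` is the thin space. The product of two factorials below is an assumed example and not necessarily the formula referred to in the comment above.

```latex
% Without a thin space the second factor can read as the operand
% of a binary "!" operator.
$a!b!$

% With a thin space (\,) it reads as a product of two subformulas.
$a!\,b!$
```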
All great use cases (though I'm sure there are more we're not thinking of) so you see why I objected to These are sometimes used for precision positioning in templates but should not be used in prose. Use either non-breaking (&nbsp;) or regular spaces. So who's OK with my formulation These are sometimes used for precision positioning in templates but rarely in prose, where non-breaking &nbsp; and regular space are normally sufficient (with or without a footnote as suggested by SM)? I'm fine with the rest of what SM has said. EEng 21:27, 27 July 2018 (UTC)

I feel like keeping to the spirit of Wikipedia:Manual of Style#Keep markup simple means saying that &thinsp and &hairsp should not be used around italics, dashes, and §, since either a regular space or no space works just fine. And I agree with that general approach; HTML is not well-suited to pixel-perfect character control, and as long as there are no horribly ugly problems like actually-overlapping characters I don't think we should fuss about that sort of small thing. This sort of layout issue may be better addressed by making web browsers render text more beautifully than by throwing in a bunch of site-specific directives.

If we were to start putting &thinsp around, say, emdashes, then I think that would be a good argument for doing that in an {{emdash}} template, since we'd want it everywhere consistently. I don't think it's a good idea to do that sort of fine-control typography on an article-by-article basis, since then it will not be done consistently.

If {{endash}}, &ndash;, and – all do exactly the same thing with no fancy spacing, I can see an argument for having two different ways to do it (one HTML-free and one for easier identification), but three ways seems like too many, when two of them serve almost exactly the same purpose.

That said, I'd rather publish the new tables with some of the rows marked as disputed/under discussion than hold the whole thing until there's consensus on every single part, so at least we can start making progress on the items that everyone agrees on, which seems like 95% of it. -- Beland (talk) 02:16, 28 July 2018 (UTC)[reply]

  • keeping to the spirit of Wikipedia:Manual of Style#Keep markup simple means saying that &thinsp and &hairsp should not be used around italics, dashes, and § – No, what the linked guideline says is "Other things being equal, keep markup simple... Use HTML and CSS markup sparingly". That's not "should not be used".
  • HTML is not well-suited to pixel-perfect character control, and as long as there are no horribly ugly problems like actually-overlapping characters – It may not be well-suited, but at times we need to do the best we can, and we're not talking about "pixel-perfect". David Eppstein's example is an excellent one in which neither regular space nor no space is at all acceptable.
  • I'd rather publish the new tables with some of the rows marked as disputed/under discussion – Well, I think we have our hands full just coming up with tables which faithfully and uncontroversially centralize what is now scattered all over creation. And that would be quite an achievement. Changes to what's being recommended should be a follow-on effort.
EEng 03:47, 28 July 2018 (UTC)[reply]
David Eppstein's example doesn't use &thinsp; in the wikitext, so it seems to be out of scope of what I'm proposing. The formula is rendered with <math>\phi!\,2^\phi</math>. Though wouldn't that be a good place to use the dot operator if that's appropriate - surely a very subtle spacing difference isn't the best way to clarify the notation? -- Beland (talk) 17:22, 28 July 2018 (UTC)
Dot operator is for noobs. Writing for noobs may be appropriate in some Wikipedia articles but it can be condescending in other contexts. —David Eppstein (talk) 19:28, 28 July 2018 (UTC)[reply]
I know, how about × or * ? I larnd bout them in algbra. EEng 00:18, 29 July 2018 (UTC)[reply]
@EEng: Are you arguing that the additional complexity of using &thinsp and &hairsp in prose is worthwhile, and if so in what situations? -- Beland (talk) 17:22, 28 July 2018 (UTC)[reply]
I'm arguing that this isn't the time or place ...
... to get into the weeds of changing the current guidelines, rather than just summarizing and centralizing them.
... to tell a practicing mathematician what notation he should use.
EEng 18:12, 28 July 2018 (UTC)[reply]

@David Eppstein: @EEng: Given the above discussion, do either of you have any remaining objections to posting the revised guidelines? -- Beland (talk) 00:07, 5 August 2018 (UTC)[reply]

Yes. I have the same objections I have already discussed. Why do you think that would have changed? —David Eppstein (talk) 00:08, 5 August 2018 (UTC)[reply]
Honestly, I've completely lost track of where we are. I'd be happy to trust SMcCandlish to recapitulate what the outstanding objections (by me, by DE, or by anyone else) seem to be. EEng 05:46, 5 August 2018 (UTC)[reply]

A germane (if insane) Help page

Assuming we're still doing this, let's while we're at it do something about this insane pile of technical minutiae: WP:How_to_make_dashes. EEng 23:48, 4 August 2018 (UTC)[reply]

I tidied that up so it's a shorter read if you don't want to learn how to type by keyboard "shortcut". I'll add a link to the fourth draft. -- Beland (talk) 22:55, 6 August 2018 (UTC)

Exception for superscripts/subscripts in titles

@Beland: Issue to add to the resolution stack: WP:Manual of Style/Titles#Typographic effects specifically advises use of Unicode superscripts and subscripts and such when available for use in titles of works, because they copy-paste correctly (whereas the output of E=mc<sup>2</sup> copy-pastes as E=mc2), and can be used in citation templates without boogering the COinS output. I'm wondering if this conflicts with anything in MOS:NUM and MOS:TM, and the main MoS page. If so, we need to figure out how to reconcile that.  — SMcCandlish ¢ 😼  22:56, 27 July 2018 (UTC)
Well, Wikipedia:Manual of Style/Superscripts and subscripts is marked as inactive, but I resolved what little conflict there seems to be by adding an exception for titles on that page, with a cross-reference to Wikipedia:Manual of Style/Titles § Typographic effects. I added the same cross-reference and exception to the proposed table. -- Beland (talk) 02:00, 28 July 2018 (UTC)[reply]
User:Headbomb reverted the table change with the edit summary "that page has nothing that contradicts the advice given here. This also applies to titles too via {{DISPLAYTITLE}}". Before my edit, I read that row as recommending not to use Unicode superscripts and subscripts at all, and after the edit to recommend not using them except when needed in titles. The linked page in fact says: "To ensure correct copy-pasting, it is preferable to use Unicode superscript or subscript characters when possible, rather than HTML or wiki markup, which are purely typographic (Unicode ² is not the same character as 2 with superscript markup). Special characters can be used in citation templates." which to me contradicts the "don't use, ever" advice before the edit. Actually my edit was incorrect, the exception is not for Wikipedia article titles, but for titles of works generally, so I'd have to reword it if restoring. (SMcCandlish mentioned that but I was reading too quickly.) But does that at least make sense as I explained it, or am I missing something? -- Beland (talk) 02:30, 28 July 2018 (UTC)[reply]
The only exception should be for an article on the Unicode characters themselves. Everything else should be done via a DISPLAYTITLE, e.g. (−1)<sup>F</sup>, or AC<sup>0</sup>. Titles of works are no exceptions there. Something like H2O: The Book should be located at H20: The Book and formatted via {{DISPLAYTITLE:''H<sub>2</sub>O: The Book''}}, not located at H₂O: The Book, and then formatted as H2O: The Book throughout the rest of the article. Headbomb {t · c · p · b} 02:47, 28 July 2018 (UTC)
H20? Holy heavy hydrogen, Batman! EEng 03:00, 28 July 2018 (UTC)[reply]
Well, there is ISBN 1492615323. But the same would apply to H2O (American band) / H2O (Scottish band), etc... Headbomb {t · c · p · b} 03:18, 28 July 2018 (UTC)[reply]
OK, I made another go at noting exceptions to the general "don't use" rule in the table. Does that look better? -- Beland (talk) 17:36, 28 July 2018 (UTC)[reply]
Reverted and clarified. Unicode superscripts shouldn't be used anywhere in titles, except for articles dealing with the Unicode characters themselves. Copy-pasting issues are irrelevant to how things should be properly formatted and displayed, and copy-pasting H<sub>2</sub>O is no harder than copy-pasting H₂O. This is also an accessibility concern, as screen-readers will often choke on Unicode superscripts. Headbomb {t · c · p · b} 14:16, 5 August 2018 (UTC)

Please someone step in to resolve the most stupid revert war ever

EEng keeps messing with the table layout, forcing them to take huge amounts of vertical space, breaking consistency, scaling/zoom functionality, and forcing unnatural breaks for AFAICT, no real reason but personal preferences. What looks better, [5] + [6] (inline) or [7] + [8] (random vertical breaks)? Headbomb {t · c · p · b} 10:52, 27 July 2018 (UTC)[reply]

Works better allowed to naturally flow; viewport sizes vary radically. The version with forced line breaks does waste a bunch of vertical space on my big-ass monitor. When I reduce window width sharply to simulate a mobile device, it wraps awkwardly, because the browser wraps as needed, plus there are forced line breaks, and they're at cross purposes.  — SMcCandlish ¢ 😼  12:02, 27 July 2018 (UTC)[reply]
Also note to EEng (talk · contribs) (posting this here since your userpage is too slow to use), when you refer to collective things, they take the plural form. The hyphen is considered... but Hyphens are considered, not Hyphen is considered.... Headbomb {t · c · p · b} 14:48, 27 July 2018 (UTC)[reply]
I have to go do my laundry but perhaps when I get back we can talk about this calmly and without the self-certainty. EEng 15:03, 27 July 2018 (UTC)[reply]
OK, that's the whites done, so I have a minute. Look, we've all been through this, where we're seeing different things on different platforms, and it's not helpful to say simply "looks horrible" without thinking about what the other person is seeing and what they're trying to achieve. While in general (all other things being equal) the conservation of a table's horizontal and vertical space is a priority in order to make it easier for the reader to absorb its content, in the present example (or one of them) there was the competing desire to present the various dashes and so on in a stacked form to allow the reader to see how confusing they can be. That may or may not have been worth the slight additional vertical space consumed, but it's not ridiculous either, and Headbomb simply ignored my repeated explanations of that instead of engaging in a discussion of the competing desiderata.
As for plural and so on, "Hyphen is considered" is simply a telegraphic form of "The hyphen is considered", and is just as correct as "Q is considered the hardest letter to use in Scrabble." You're more concerned with strict formalism than is appropriate outside article space.
I've been many times thanked for my careful reforms of previously incomprehensible tables such as those at MOSNUM and WP:PROTECTION, so I do know what I'm doing even if you're not able to always see what I'm aiming at. But I'm not sufficiently interested in these minutiae to worry about them, at least until this proposal goes live and its content is in final form. EEng 18:25, 27 July 2018 (UTC)[reply]
Replace "hyphens/minuses/dashes" with "car/turnip/leaf" and see how it doesn't make any sense. "Car should always be..." makes no sense. "Cars should always be..." does. Headbomb {t · c · p · b} 18:44, 27 July 2018 (UTC)
Replace it by Q to see there must be more to it than you seem to think: "Q is usually followed by u" makes complete sense – or would you insist on "The Q is usually followed by the u"? Or, God forbid, "Q's are usually followed by u's"? Can't you just let anything go? EEng 19:07, 27 July 2018 (UTC)[reply]
First, using "The" makes this singular. But if you remove it and have "Q is usually followed by u" you're using a mention, and the analogous situation would be something like "- is usually followed by ;", not "hyphen is usually followed by semicolon" (the grammatically correct way of having a use would be "Hyphens are usually followed by semicolons"). Headbomb {t · c · p · b} 19:13, 27 July 2018 (UTC)[reply]
Apparently the answer to my question is No. Oh, and see WP:MISSSNODGRASS. EEng 19:38, 27 July 2018 (UTC)
To get back to the original question, the different versions of the tables look the same to me when I use a narrow window. With a wide window, I prefer the ones with the explicit breaks; I think it makes the markup examples clearer to break them into lines like that. The extra vertical space doesn't bother me; if you have a wide window, you probably also have a tall window. —David Eppstein (talk) 20:15, 27 July 2018 (UTC)[reply]

Fourth draft

Proposed for posting to Wikipedia:Manual of Style/Text formatting § HTML character entity references and replacing the second paragraph of "Keep markup simple" at Wikipedia:Manual of Style § Miscellaneous with a link to this new section.

HTML character entity references are a way to tell a web browser to render a certain character without including that character in the web page directly. Characters may be referenced by name, decimal number, or hexadecimal number. For example, &euro; is the same as &#x20AC;, &#8364;, or including the character directly.
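As a quick illustration of the equivalence described above, here is a minimal Python sketch. It uses only the standard library's `html.unescape`; the variable names are illustrative and nothing here is part of the proposed guideline text.

```python
import html

# Three spellings of the same character reference, plus the direct character.
forms = ["&euro;", "&#x20AC;", "&#8364;", "€"]

# html.unescape resolves named, decimal, and hexadecimal references alike.
decoded = [html.unescape(form) for form in forms]

for form, char in zip(forms, decoded):
    print(f"{form!r} decodes to {char!r} (U+{ord(char):04X})")

# All four forms resolve to the same single code point, U+20AC.
all_same = len(set(decoded)) == 1
```

A browser performs the same resolution when rendering a page, which is why the choice between these forms is invisible to readers and matters only to editors looking at the source.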

On Wikipedia, characters should be used directly unless doing so is confusing for editors or causes technical problems. Numerical references should not be used if a named reference is available. For example, &minus; should be used instead of &#8722;, and é should be used instead of &eacute;. For a comprehensive list of available named references, see [9].
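The rule that numeric references give way to named ones could be sketched roughly as follows. This is an assumption-laden illustration, not the checker used for Wikipedia:Typo Team/moss: `prefer_named` is a hypothetical helper, and Python's `html.entities.codepoint2name` covers only the HTML 4.01 names, so it is narrower than the full list the draft links to.

```python
import re
from html.entities import codepoint2name

# Matches decimal (&#8722;) and hexadecimal (&#x2212;) numeric references.
NUMERIC_REF = re.compile(r"&#(?:[xX]([0-9A-Fa-f]+)|([0-9]+));")


def prefer_named(text: str) -> str:
    """Replace numeric references with named ones where a name exists."""

    def swap(match: re.Match) -> str:
        hex_digits, dec_digits = match.groups()
        codepoint = int(hex_digits, 16) if hex_digits else int(dec_digits)
        name = codepoint2name.get(codepoint)
        # Leave the reference untouched when no HTML 4.01 name is known.
        return f"&{name};" if name else match.group(0)

    return NUMERIC_REF.sub(swap, text)


print(prefer_named("5 &#8722; 3 costs &#x20AC;2"))  # 5 &minus; 3 costs &euro;2
```

Whether a given named reference should then stay coded or become the direct character still depends on the exceptions listed in the tables; this sketch handles only the numeric-to-named step.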

Wikipedia stores articles with Unicode, so any character that could possibly be referenced can also be input directly. The web site's editing pages have built-in special character support to make it easy to input characters not typically found on keyboards. Editors can also use the Unicode input method provided by their operating system. There are some exceptions where named references are preferred, to avoid confusion and to circumvent technical limitations. The <nowiki> tag can also be used instead of character escaping to prevent interpretation of special characters as wiki markup.

Please note: It is always OK, whether using manual or semi-automated means, to fix broken HTML entities by replacing them with characters or correct HTML entities (whichever is preferred in the specific case). (Fully automated fixes would need bot approval.) However, when changing existing text from a disfavored to favored form, especially when making large numbers of changes, WP:MEATBOT asks that editors making manual edits please pay attention to the context and be aware of exceptions to the guidelines. When using automated and semi-automated tools, remember that WP:COSMETICBOT and WP:AWBRULES ask that these tools not be used to make changes of this type unless accompanied by a more substantive (reader-visible) change. Check Wikipedia error 11 is disabled for this reason.

Characters to avoid
Avoid | Instead use | Note
… (&hellip;) | ... (i.e. 3 periods) | See MOS:ELLIPSIS.
Unicode Roman numerals like Ⅰ Ⅱ ⅰ ⅱ | Latin letters equivalent (I II i ii) | MOS:ROMANNUM
Unicode fractions like ¼ ½ ¾ &frasl; | {{frac}}, {{sfrac}} | See MOS:FRAC.
Unicode subscripts and superscripts like ¹ | <sup></sup> <sub></sub> | See WP:SUPSCRIPT. In article titles, use {{DISPLAYTITLE:...}} combined with <sup></sup> or <sub></sub> as appropriate.
µ (&micro;) | μ (&mu;) | See MOS:NUM#Specific units
Ligatures like Æ æ Œ œ | Separate letters (AE ae OE oe) | Generally avoid except in proper names and text in languages in which they are standard. See MOS:LIGATURES.
∑ (&sum;) ∏ (&#8719;) ― (&horbar;) | Σ (&Sigma;) Π (&Pi;) — (&mdash;) | (Not to be confused with \sum and \prod, which are used within <math> blocks.)
‘ (&lsquo;) ’ (&rsquo;) ‚ (&sbquo;) “ (&ldquo;) ” (&rdquo;) „ (&bdquo;) ´ (&acute;) ′ (&prime;) ″ (&Prime;) ` (&#96;) | Straight quotes (" and ') | Use {{coord}}, {{prime}} and {{pprime}} for mathematical notation; elsewhere use straight quotes unless discussing the characters themselves. See MOS:QUOTEMARKS.
‹ (&lsaquo;) › (&rsaquo;) « (&laquo;) » (&raquo;) |  | Use &lang; and &rang; for math notation. In foreign quotations normalize angle quote marks to straight, per MOS:CONFORM, except where internal to non-English text, per MOS:STRAIGHT.
&ensp; &emsp; &thinsp; &hairsp; | Normal space | These are sometimes used for precision positioning in templates but rarely in prose, where non-breaking (&nbsp;) and regular spaces are normally sufficient. Exceptions: MOS:ACRO, MOS:NBSP.
In vertical lists: • (&bull;) · (&middot;) ⋅ (&sdot;) | * | Proper wiki markup should be used to create vertical lists. See HELP:LIST#List basics.
&zwj; &zwnj; | see note | Used in certain foreign-language words, see zero-width joiner/zero-width non-joiner. Should be avoided elsewhere.
£ for GBP, keep ₤ for Italian Lira and other lira currencies that use ₤ (see the main article for that currency) MOS:CURRENCY; find broken instances
Potentially confusing or technically problematic characters
Category | Coded form (direct form) | Notes
Miscellany | &amp; (&) &lt; (<) &gt; (>) &#91; ([) &#93; (]) &apos; (') &#124; (|) | Use these characters directly in general, unless they interfere with HTML or wiki markup. Apostrophes and pipe symbols can alternatively be coded with {{'}} and {{!}} or {{pipe}}. See also character-substitution templates and WP:ENCODE.
Greek letters | &Alpha; (Α) &Beta; (Β) &Epsilon; (Ε) &Zeta; (Ζ) &Eta; (Η) &Iota; (Ι) &Kappa; (Κ) &Mu; (Μ) &Nu; (Ν) &Omicron; (Ο) &Rho; (Ρ) &Tau; (Τ) &Upsilon; (Υ) &Chi; (Χ) &kappa; (κ) &omicron; (ο) &rho; (ρ) | In isolation, use coded forms to avoid confusion with similar-looking Latin letters; in a Greek word or text, use the direct characters.
Quotes | &lsquo; (‘) &rsquo; (’) &sbquo; (‚) &ldquo; (“) &rdquo; (”) &bdquo; („) &acute; (´) &prime; (′) &Prime; (″) &#96; (`) | Can be confused with straight quotes (" and '), commas, and with one another. MOS:STRAIGHT generally requires conversion to straight quotes, except when discussing the characters themselves or sometimes with non-English languages. See next row for prime characters.
Apostrophe-like | ' ` ´ ʻ ʼ ʽ ʾ ʼ ʽ ʻ ʼ |
Dashes, minuses, hyphens | &ndash; (–) &mdash; (—) &minus; (−) - (hyphen) &shy; (soft hyphen) | Can be confused with one another. For dashes and minuses, both forms are used (as well as {{endash}} and {{emdash}}). Soft hyphens should always be coded with the HTML entity or template. Plain hyphens are usually direct, though at times {{hyphen}} may be preferable (e.g. Help:CS1#Pages). See MOS:DASH, MOS:SHY, and MOS:MINUS for guidelines.
Whitespace | &nbsp; &emsp; &ensp; &thinsp; &hairsp; &zwj; &zwnj; | In direct form these are nearly impossible to distinguish from a normal space. See also MOS:NBSP.
Non-printing | &lrm; &rlm; | In direct form these are nearly impossible to identify. See MOS:RTL.
Mathematics-related | &and; (∧) &or; (∨) &lang; (⟨) &rang; (⟩) | Can be confused with x ^ v < >. In some cases TeX markup is preferred to Unicode characters; see MOS:FORMULA. Use {{angbr}} instead of ) / ()
Dots | &sdot; (⋅) &middot; (·) &bull; (•) | Can be confused with one another. Interpuncts (&middot;) are common in horizontal lists and to indicate syllables in words. Multiplication dots (&sdot;) are used for math. In practice, the dots are used directly instead of the HTML entities.
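The Whitespace and Non-printing rows above say the direct forms are nearly impossible to tell apart from a normal space. A rough sketch of how a tool might surface them for an editor follows; the `INVISIBLE` mapping and `expose_invisible` helper are hypothetical names chosen for this example, not an existing tool's API, and the character list is limited to the entities named in the table.

```python
# Hypothetical mapping from hard-to-see characters to their named references.
INVISIBLE = {
    "\u00a0": "&nbsp;",    # non-breaking space
    "\u2003": "&emsp;",    # em space
    "\u2002": "&ensp;",    # en space
    "\u2009": "&thinsp;",  # thin space
    "\u200a": "&hairsp;",  # hair space
    "\u200d": "&zwj;",     # zero-width joiner
    "\u200c": "&zwnj;",    # zero-width non-joiner
    "\u200e": "&lrm;",     # left-to-right mark
    "\u200f": "&rlm;",     # right-to-left mark
    "\u00ad": "&shy;",     # soft hyphen
}

# str.maketrans accepts a dict of single characters to replacement strings.
_TABLE = str.maketrans(INVISIBLE)


def expose_invisible(text: str) -> str:
    """Rewrite invisible characters as named references; leave the rest alone."""
    return text.translate(_TABLE)


print(expose_invisible("10\u00a0km"))  # 10&nbsp;km
```

Ordinary spaces are deliberately absent from the mapping, so normal prose passes through unchanged.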

Discussion of fourth draft

@David Eppstein: I thought your opinions might have changed or been refined in response to the comments by SMcCandlish in the discussion of the third draft. SMcCandlish said some interesting things about how to formulate advice against disruptive editing, which I think helped evolve my position. I've tried to integrate both your views in the new paragraph in the above fourth draft. How does that sound to you? -- Beland (talk) 23:35, 6 August 2018 (UTC)[reply]

My opinion is still that we should not make invisible and semantics-neutral changes to articles except as part of more substantive edits to the same articles, and that your suggestions here seemed aimed at doing that. If you reassure me that no such automation is intended, and that your proposed tables are intended purely for the use of human editors, I may be willing to take your proposals more seriously. But even then, I don't see this as something that is so important to standardize that it should be codified in the MOS. —David Eppstein (talk) 00:50, 7 August 2018 (UTC)[reply]
@David Eppstein: Pretty sure no one in WP:BAG would ever approve a bot whose only purpose is changing &euro; to € or &ndash; to – (or vice versa) without strong consensus to do so per WP:COSMETICBOT. Some of those could end up as minor WP:GENFIXES (and only when "Unicodify page" is manually enabled), but that's already an option for a lot of things. I think (not sure) AWB exposes invisible characters (non-breaking spaces to &nbsp; for instance), and I know for a fact that WP:WikED does it. I'd support changing obscure hex (&#x20AC;) and dec (&#8364;) codes to their regular (€) or readable (&euro;) equivalents on a character-per-character basis though. Headbomb {t · c · p · b} 02:28, 7 August 2018 (UTC)
I definitely don't mind turning hexes into unicodes if it's part of more substantive edits. But we don't need complicated tables to do that. —David Eppstein (talk) 02:45, 7 August 2018 (UTC)[reply]
@David Eppstein: Well, given that there's a complicated guideline that's being enforced, it seems like it has to be written down somewhere? It took some work to figure out what generally desirable practices are, and what the exceptions are that we need to watch out for. Is there somewhere else you think it would be more appropriate to codify this? -- Beland (talk) 06:32, 7 August 2018 (UTC)[reply]
"I worked hard on it" is never a valid reason for accepting something. And I think it's important to put more attention into learning what our actual practices are, than into making high-handed decisions about what they should be. —David Eppstein (talk) 06:35, 7 August 2018 (UTC)[reply]
@David Eppstein: That's not really what I was getting at. The reason it's worth documenting is not that I personally put work into it, it's that someone else would have to do the same work over again (for any given character) to figure out "which way should I write this?" or "is this the right style or should I change it?" later on. Documenting the preferred practice should save time in both looking for that information, in being confused by disfavored styles, and in changing disfavored to favored styles. I think we have done a good job learning what our actual practices are; I have looked at frequencies in database dumps, and we've done a consultation where people have pointed out what various groups consider errors vs. acceptable style. What's the argument against putting this in the MoS? -- Beland (talk) 15:45, 7 August 2018 (UTC)[reply]
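For anyone wanting to check usage frequencies themselves, a frequency count of the kind mentioned above could be approximated along these lines. This is only a sketch under stated assumptions: it is not the script actually run against the database dumps, and `count_references` and the sample string are made up for the illustration.

```python
import re
from collections import Counter

# Named (&ndash;), decimal (&#8364;), and hexadecimal (&#x20AC;) references.
REFERENCE = re.compile(r"&(?:#[xX][0-9A-Fa-f]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);")


def count_references(wikitext: str) -> Counter:
    """Tally each character reference exactly as it is spelled in the source."""
    return Counter(REFERENCE.findall(wikitext))


sample = "A&nbsp;B &ndash; C&nbsp;D &#8364;5 &#x20AC;6"
counts = count_references(sample)

# Most frequent spellings first.
for reference, total in counts.most_common():
    print(reference, total)
```

Counting spellings separately, without decoding them first, is what shows which form editors actually type for a given character.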

Merge the Cyrillic advice to one guideline

We have a problem. All of these pages overlap, and none of them are actually guidelines:

The non-mainspace pages are redundant and hard to find, likely to conflict and diverge, and not authoritative. They're moribund and all but forgotten, yet listed at Wikipedia:Romanization as if they're guidelines (it also lists articles like Romanization of Kyrgyz as if they are). Mostly what they say is not really naming-convention material in particular, but general MoS material that also happens to apply to article titles. They have inconsistent names and organizational approaches.

I think these should just be merged into a single WP:Manual of Style/Cyrillic, with a general table, footnoted as needed for specific languages where there are variances (or perhaps use different table rows for this?). Have language-specific sections with detailed notes. If anything in it is truly a naming convention (i.e., applies only to titles), this can be put in a separate paragraph, with a shortcut, like WP:NCUKRAINIAN or whatever, as needed; the page will cross-categorize as both an MoS and an NC guideline. We're already doing this with various topical MoS/NC pages, and with WP:SAL, and it works fine (better, actually, than splitting this information across multiple pages). We should actually be doing more of this; see, e.g., the note above about erasing the pointless WP:POLICYFORK that we have between WP:NCCOMICS and MOS:COMICS (which has its own naming conventions section).  — SMcCandlish ¢ 😼  08:55, 19 July 2018 (UTC)

  • I can agree on this — as long as we remember how many languages (most of them are not even Slavic ones) are using Cyrillic alphabet with so different phonetics. A unified page can become quite bloated. However, because it's not supposed to be a very particular «Englification of Russian», it's better be «Latinization (Romanization) of Cyrillic». Tacit Murky (talk) 15:36, 20 July 2018 (UTC)[reply]
    Sure. We have little actual material to cover that isn't Russian or Ukrainian. Most subjects on en.WP that might have a name in any of the Siberian languages also have a name in English or in Russian that will be more familiar to our readers. I would think we should consolidate and arrange the existing Cyrillic latinisation material at Wikipedia:Romanization and not add to it unless/until we see a need to do so.  — SMcCandlish ¢ 😼  01:17, 21 July 2018 (UTC)
  • @Beland: You seem to have a good eye for the table tweaking. Care to give this one a go?  — SMcCandlish ¢ 😼  01:17, 21 July 2018 (UTC)[reply]
My only interest in Slavic language words is that they be tagged with <lang> to indicate to spell/grammar checkers that they are not English, and to hint to TTS systems what pronunciation system they should use. -- Beland (talk) 01:41, 21 July 2018 (UTC)[reply]
Sure. Now that {{lang}} has been reworked, a bunch of people are working on doing this consistently, though it's very gradual.  — SMcCandlish ¢ 😼  03:13, 21 July 2018 (UTC)[reply]
I've notified various relevant wikiprojects, including Russia, Ukraine, Caucasia, Europe, Languages, and Film's Soviet and post-Soviet cinema task force.  — SMcCandlish ¢ 😼  07:04, 29 July 2018 (UTC)[reply]
  • Since there are over a hundred languages that use the Cyrillic alphabet, it may be fine to merge the top-level pages about Cyrillic naming and romanization into a single guideline, including a bare summary for each language like naming the romanization system(s) used, but keep the details and any romanization tables in language-specific pages. That said, romanization tables in the Wikipedia: namespace should be replaced by links to the encyclopedic articles. Michael Z. 2018-07-31 19:22 z

Link city and state in ledes of U.S. college and university articles?

The vast majority of articles about U.S. colleges and universities begin with a sentence like this: "<Institution> is a <list of adjectives> college/university in <city>, <state>." In many cases, both the city and state are linked to their respective articles. In some cases, they both link only to the city. Is there a firm consensus that the MOS favors or discourages one of these two approaches? (Jweiss11 and I had a brief discussion about this on Jweiss11's Talk page if anyone would like a little bit more background.) ElKevbo (talk) 14:06, 20 July 2018 (UTC)

WP:MOSLINK discourages overlinking, and discourages bunched linking where possible. It might be useful to link the city, provided it's not a well-known city such as LA, NYC, Chicago, or a host of others that English-speakers are likely to be familiar with. But I'm struggling to see why the US state is worthy of a link as well. Is there something I'm missing? This is better raised at WT:MOSLINK. Tony (talk) 14:18, 20 July 2018 (UTC)[reply]
Am I correct in inferring that the concern here is that state names are familiar to most readers and thus don't need a link? I ask not only because of the current discussion about linking but because that also ties into another question I have (which isn't related to the MOS) which concerns the inconsistent inclusion of ", United States" in the lead sentence of these articles.
It's also worth noting that part of this discussion is related to the fact that many colleges and universities are public and therefore governed by their respective states so we're not just concerned with geography. ElKevbo (talk) 14:29, 20 July 2018 (UTC)[reply]
MOS:SEAOFBLUE discourages back-to-back links. If it is being mentioned, the city is the location of interest, even if its name is being qualified by the state. The state link is inevitably linked in the city article.—Bagumba (talk) 14:46, 20 July 2018 (UTC)[reply]
The issue here is the back-to-back bunching of a more specific wikilink with a less specific wikilink when just the more specific wikilink will do. We should also note here that Template:Infobox university has separate fields for city and state, which render back-to-back wikilinks. Perhaps this should be remedied? Jweiss11 (talk) 15:51, 20 July 2018 (UTC)
WP:USPLACE is also a factor here. Except for a few very notable exceptions, the articles for towns and cities located in the United States already include the name of the State in their titles (using the format "<City, State>"). For example, the city of Ann Arbor, Michigan (linked to in the article for the University of Michigan)... is formatted as: "[[Ann Arbor, Michigan]]", NOT "[[Ann Arbor]], [[Michigan]]".
However, there are those few exceptions... for example, our article on the city of Chicago doesn't include the name of the State (Illinois) in the title (Personally, I think it should, but consensus has deemed otherwise). Now... this will impact our article on DePaul University (which is in Chicago). The question is: do we want to include a link to Illinois, or is the link to Chicago enough? Blueboar (talk) 01:04, 21 July 2018 (UTC)[reply]
  • It's all this stuff combined. SEAOFBLUE and USPLACE, and also relevance combined with user-interface smarts: "... university in Cleveland, Ohio" is redundant because any reader that wants geographical (or, more narrowly, human geography) info about the institution will get it from Cleveland, Ohio. They're not likely to click on Cleveland (which already has a link to Ohio), then come back to the university article and click on Ohio, a link which is of low (over-generalized) relevance to the university topic. Otherwise we might as well do "... university in Cleveland, Ohio, United States, Western and Northern Hemispheres". Heh.  — SMcCandlish ¢ 😼  01:26, 21 July 2018 (UTC)
Something else we should consider: some of this is simply due to lazy writing... Let me give an example: While it is helpful for the University of Notre Dame article to specify what state Notre Dame is located in... there is absolutely no need to specify what state the University of Michigan or Ohio State University are in (the name of the institution kind of gives that fact away). So... we could avoid the entire "sea of blue" issue and use piped links (writing: "The University of Michigan has its main campus in the city of Ann Arbor" or "Ohio State University is located primarily in the city of Columbus"). Trying to make everything follow a consistent pattern can limit your options. 02:07, 21 July 2018 (UTC) — Preceding unsigned comment added by Blueboar (talk • contribs)
Yep.  — SMcCandlish ¢ 😼  03:15, 21 July 2018 (UTC)[reply]
Assuming of course that there are no American equivalents to the University of Warwick, which isn't based in Warwick but in nearby Coventry. Nigel Ish (talk) 10:51, 21 July 2018 (UTC)
When the First Unitarian Church of Berkeley moved to Kensington, they decided not to rename themselves the First Unitarian Church of Kensington. EEng 11:12, 21 July 2018 (UTC)[reply]
Sure, when something that was named for a location isn't actually in that location, the lead does need to more clearly specify where it actually is. However, that scenario is highly unlikely for universities named after US states. Blueboar (talk) 11:53, 21 July 2018 (UTC)[reply]
Highly unlikely?? I suppose you've never heard of Washington University. —David Eppstein (talk) 18:55, 21 July 2018 (UTC)[reply]
Named after the man, not the State... but since not everyone knows that, it could be confusing... so, sure, that would be one where we would include the state location as well as the city. Blueboar (talk) 22:42, 21 July 2018 (UTC)[reply]
(edit conflict) I've never even visited the United States, so I don't know, but I have to imagine the probability of a school with such a name existing, even apart from EEng's example above, is overwhelmingly high. That said, writing the lead sentence to say Fu University is a university located in Notfu, Wisconsin. should be discouraged anyway; if the fact that it is located somewhere in spite of its name is important enough to be noted in the lead, then it should be noted separately (Despite its name, Fu University is actually located in Notfu [for reason X].), not in the lead sentence where all it will do is confuse readers and potentially cause them to believe the page has been vandalized. As for linking, I'm inclined to say case-by-case: most Japanese university articles seem to include such links, and not doing so with American institutions because everyone knows what an Ohio is reeks of WP:SYSTEMIC. Hijiri 88 (やや) 12:00, 21 July 2018 (UTC)[reply]
That's the approach I take to such cases (not just universities but "names that don't make sense" in general). It also drives me nuts when I see a "Fu University is a university ..." construction or the like, anyway. It's terrible writing that treats our readers like they've had lobotomies.  — SMcCandlish ¢ 😼  17:52, 21 July 2018 (UTC)[reply]
Like Hijiri88, I'm wary of the assumption that most readers automatically recognize the names of most U.S. states. ElKevbo (talk) 13:32, 21 July 2018 (UTC)[reply]
I don't think the question is whether the state name should be included, but whether it should be linked. For example, should Yale University link to [[New Haven, Connecticut]] or [[New Haven, Connecticut|New Haven]], [[Connecticut]]? Natureium (talk) 13:52, 21 July 2018 (UTC)[reply]
I would say the first... no need to link to the state article separately. Send the reader to the article on the city ... as that will probably give more relevant information when coming from a university article (such as what neighborhood the university is in, or if there has been any “town vs gown” history, or if there are other universities in the same town, etc)... the reader can get to the article on the state from there. Blueboar (talk) 23:02, 21 July 2018 (UTC)[reply]
I know what the question is - I'm the one who originally asked it. :) But one editor has proposed omitting the location entirely in cases where the institution's name includes the location. ElKevbo (talk) 14:12, 21 July 2018 (UTC)[reply]
The answer is that there is no single “correct” way to do it... there are lots of “correct” ways; and wording that works at one article, may not work at another. That said... in general... a well written article phrases things to avoid unnecessary repetition and avoids over linking. Blueboar (talk) 23:02, 21 July 2018 (UTC)[reply]

@ElKevbo: would you agree that we have consensus that we should not link city and state back to back? Jweiss11 (talk) 17:30, 27 July 2018 (UTC)[reply]

Yes, especially in instances where the title of the city's article also includes the state i.e., nearly all cases. ElKevbo (talk) 18:23, 27 July 2018 (UTC)[reply]
Yes. I'd go further and not even insert the city-name if it's blazing out from the name of the institution. Tedious for readers. Tony (talk) 04:16, 30 July 2018 (UTC)[reply]
I disagree but this isn't the place to discuss it. ElKevbo (talk) 13:08, 30 July 2018 (UTC)[reply]

Citations in the lead

This text was added to WP:MEDMOS, based on this discussion.

Consensus was not gained that this change is in concurrence with project-wide MOS. It has not been determined that statements in the leads of medical articles are more likely than any other type of article to be challenged, and the main reason for this push for citations in the lead has been for the (external) translation project, which translates only the leads of medical articles (a separate problem in and of itself). Many examples have been given over the years of how this demand for citations in the lead compromised the summary aspect of article leads. SandyGeorgia (Talk) 14:57, 24 July 2018 (UTC)[reply]

Beta-Hydroxy beta-methylbutyric acid provides an example of this new trend, with up to six citations per sentence. SandyGeorgia (Talk) 15:11, 24 July 2018 (UTC)[reply]
Sorry, what's the issue? MOS is not incompatible with WP:V which can require inline citations in the lead, and neither is MEDMOS, it just lays out two advantages to doing so. Apart from biographies and fringe science, medical articles are certainly one of a group of 'likely to be challenged for claims of fact' (especially when it intersects with fringe/alternative medicine) that would require citations per WP:V, so having them there in advance isn't an issue, nor is there anything in the site-wide MOS that says you can't. 'It's not necessary' encyclopedia wide is not incompatible in any way with MEDMOS 'it's not necessary but here are a couple of reasons why you might want to for these articles'. Only in death does duty end (talk) 15:11, 24 July 2018 (UTC)[reply]
What is at issue is broader discussion to keep MEDMOS in sync with site-wide policy and guideline. If the claims put forward about the reasons for requiring extreme lead citations in medical articles are true (I disagree that they are), then the reasoning should be included in a site-wide guideline, not just MEDMOS. If false, the wording is extraneous. Citations can always be provided in leads for any articles if consensus is developed on an individual article. The push here is to demand them for the purposes of an external project (translation), which in and of itself has resulted in compromised quality of articles, as the focus is on the lead rather than the body of the articles. Forcing citations into leads in many cases has rendered it difficult to write a summarizing lead. The extreme to which this has gone is seen at Beta-Hydroxy beta-methylbutyric acid, where there are up to six citations per sentence. If that is the direction we want lead citations to go, then it should be a general guideline, not just a medical guideline. Broader input, beyond the increasingly walled garden at the Medicine project, should go into this decision. SandyGeorgia (Talk) 15:18, 24 July 2018 (UTC)[reply]
But it is in line with the wider MOS. Neither states you are or are not required to have citations in the lead. MEDMOS gives two reasons why it's preferable for medical articles to do so. Those two reasons do not exist for every article and so would be inappropriate to add to a site-wide MOS. And the wording as written is hardly a 'demand'. If you want to write a medical article without citations in the lead, you are still free to do so. But someone may come along and add them later - functionally you cannot prevent that - because if you did attempt to remove them over a style issue, they would just cite WP:V and add them anyway. There are plenty of examples of local topic-specific guidelines that do not apply to other topics. It's only a problem when they are in conflict, and there is no conflict here. (The problem with BHBMA looks more to be citation overkill where multiple citations are used for single relatively short sentences, where one citation for the lead would do with the others in the body) Only in death does duty end (talk) 15:24, 24 July 2018 (UTC)[reply]
I believe Sandy is referring to MOS:LEADCITE. --Izno (talk) 15:32, 24 July 2018 (UTC)[reply]
It is untrue that "If you want to write a medical article without citations in the lead, you are still free to do so." I intended to bring Dementia with Lewy bodies to FA standard, and was told in the rudest possible terms that it would be strenuously opposed unless I (over)cited the lead. I was forced to cite the lead, which makes it harder to write a compelling summary. And it has not been determined site-wide that leads of medical articles should be an exception. MEDMOS and MEDRS have been widely accepted partly because of efforts in the past to make sure they stayed in sync with broader policy and guideline. Citation overkill, and substandard citations in leads to meet the needs of the translation project, are an issue across medical articles. SandyGeorgia (Talk) 15:34, 24 July 2018 (UTC)[reply]
It's not untrue at all. If you want to write a 'featured article' you might have to jump through extra hoops but that's the price you pay for writing a featured article. You can write a standard medical article perfectly fine without citations in the lead (unless WP:V comes into play). No one can stop you. And once again, MEDMOS is not stating it is an exception to wider site guidelines, it's merely stating there are a couple of reasons why you might want to do it differently for medical articles. (Izno, it's also not in conflict with LEADCITE - which itself accepts certain types of articles are more likely to require citations in the lead). Only in death does duty end (talk) 15:43, 24 July 2018 (UTC)[reply]
Overcitation of leads is not a requirement for FAC. If it is to be the case for medical articles, then it should have consensus beyond the walled garden of the medicine project. Hence, the broader discussion that should have been initiated before the change was made to MEDMOS. SandyGeorgia (Talk) 15:44, 24 July 2018 (UTC)[reply]
Why would there be a broader discussion on changing a guideline that only applies to medical articles? You have yet to point out where there is an actual conflict between what MEDMOS says currently and wider site guides. They all currently state (with the exception of where WP:V comes in) that lead citations are not required. Only in death does duty end (talk) 15:50, 24 July 2018 (UTC)[reply]
When individuals (unsupported by broader consensus even in discussions at WP:MED) are forcing non-site-wide practices into FAs and local guideline pages, a broader discussion is optimal. And, as already pointed out (and mentioned over the years), great care was taken in earlier years to make sure that MEDRS and MEDMOS stayed in sync with site-wide policy and guideline. Taking local pages beyond what has site-wide acceptance jeopardizes years of careful work. (Not to mention the damage to article leads that results from this practice.) SandyGeorgia (Talk) 18:56, 24 July 2018 (UTC)[reply]
You are repeating yourself but you are not actually answering the question. They are currently in sync with site-wide practice. FAC is largely irrelevant as it can (and regularly does) mandate higher standards than are required for articles (to be published). If you are having a problem with getting a featured article label on an article because the editors involved in featured articles want it done in a certain way, that is not a MOS issue (I would be surprised if any FA reviewers asked for citations in the lead except for controversial content as FA requires it's compatible with the MOS. And since both with/without citations is compatible with the MOS and MEDMOS....). What is the conflict between MEDMOS and the wider site best-practices please? Only in death does duty end (talk) 18:59, 24 July 2018 (UTC)[reply]
I strongly support the addition of the new language. I also note that, as written, it does not say that anything is mandatory. But what it says accurately reflects present-day editing norms, not limited to medical content. Maybe long ago in wiki-years it was otherwise – I don't know. But it is perfectly acceptable for editors in a specific topic area such as this one to form a consensus that content about that topic should generally follow more stringent sourcing guidelines than what applies site-wide. After all, MEDRS sets requirements for secondary sourcing that do not apply in other subject areas, and that's a good thing. And there is no valid reason for FAC to dictate otherwise. Good scholarly writing requires this style of attribution, and although Wikipedia is written for a general audience rather than a scholarly one, the special burden of our health-related content (that it can potentially influence health decisions made by our readers, with very significant real-world consequences) makes it reasonable to treat material in the lead as subject to "citation needed". --Tryptofish (talk) 20:11, 24 July 2018 (UTC)[reply]
I'm pretty sure FAC doesn't actually do this at all, will ping @Ealdgyth: (who does a fair amount of FA reviews as I recall). But currently the wording at MEDMOS isn't more stringent, it just says there are some good reasons to do it. But you don't have to. Only in death does duty end (talk) 20:31, 24 July 2018 (UTC)[reply]
I should clarify that I did not mean literally that FAC dictates to MEDMOS. I was trying to communicate that the fact that there was a difficult FA review is not a valid reason to say that the new wording should be removed from MEDMOS. Also, it's been my experience that the objections to cites in the leads of health articles mostly come from editors who have been active at FAC. In any case, sorry if that was unclear. And I don't mean to start a FAC versus MED grudge match. --Tryptofish (talk) 20:41, 24 July 2018 (UTC)[reply]
No I meant my experience of reading FA-articles is that FA promotes articles regardless of cites in the lead or not - even a quick look at the medical (and non-medical) FA's shows examples of both - so I think FA is a non-issue when it comes to cites in the lead debate. Only in death does duty end (talk) 20:53, 24 July 2018 (UTC)[reply]
I'm finding it curious to be told what is and isn't practice at FAC :) :) I suggest there is probably no medical editor on Wikipedia who knows same to the extent that I do. Perhaps Graham or Cas though. Only, I would be interested in knowing which articles you have written and how you have composed and cited the leads for them ? SandyGeorgia (Talk) 21:57, 26 July 2018 (UTC)[reply]
Or you could link to the discussion where you were told you had to have multiple cites in the lead to write a FA. Diffs please. Only in death does duty end (talk) 22:06, 26 July 2018 (UTC)[reply]
I suspect you may be the only person in this discussion who doesn't know where to find them, and there are pages of discussion, including a draft RFC. OID, I am still wondering if you have ever built an entire article and then summarized its content to a lead, and if so, just how you personally do so? An example of an article and lead as you build them might help me see things from your perspective. For my perspective, you can look through scores of medical FAs and others to see that leads do not always need to be cited. If someone demanded citations in leads under my FAC tenure, that demand would be ignored because it is an invalid, unactionable demand. As Ealdgyth can also explain. SandyGeorgia (Talk) 01:37, 27 July 2018 (UTC)[reply]
You stated above that you were told unless you used multiple cites in the lead your FA would be opposed. Please provide a diff. Given you have been complaining about it, this shouldn't be hard. Only in death does duty end (talk) 02:07, 27 July 2018 (UTC)[reply]

Yeah, it's really just a matter of whether something in the lead is likely to be viewed as controversial (or has already been contested). Well, at a stub it may also be a matter of whether the claim in question exists outside the lead; many stubs are nothing but a lead. :-) Anyway, I tend to agree with Only in death; it's just a fact that medical claims are more likely to be controverted. It's also a fact that WP:MOS has no control over whether WP:FAC can demand something above and beyond what MoS does; I would surmise this will also be true of WP:V and WP:CITE and their pools of regulars; the FAC crowd aren't going to listen to them saying "X is not actually required", either. From FAC's viewpoint, it is required if you want the WP:FA icon.

I don't agree with this insularity at all, mind you, but I observe that it's happening. FAC is a wikiproject, and the FA label is something that wikiproject hands out based on their own criteria. At least as of late 2016, there was quite a bit of hostility over there toward complying with anything in MoS that the people in that echo chamber don't like, which is actually a real WP:CONLEVEL problem. I long ago stopped thinking of FA as an "official" WP process, but as just some drama I don't want to be involved in. It's like a Boy Scout merit badge that will cost you a limb if you're not part of the in-crowd. I've been here over 12 years, and have GAs under my belt but no attempts at FA – it's just that off-putting. So, I definitely feel SandyGeorgia's pain on this aspect of the matter.

To get back to WP:MEDMOS, I don't see that there's a conflict between it and the main MoS (at least not on this point, and it does have a conflict with WP:PSTS that I've been trying to get resolved for about 3 years, so I'm not saying the page is perfect). It may go above and beyond MoS's basic requirements (and, as Tryptofish points out, even above basic WP:V / WP:RS / WP:NOR requirements). Lots of MoS subpages do similarly for particular things, just as various notability and naming conventions guidelines are more persnickety than WP:N and WP:AT respectively. The central policies and guidelines are minimums, not limits – within reason. Is citing medical claims in the lead really unreasonable?
 — SMcCandlish ¢ 😼  10:05, 25 July 2018 (UTC)[reply]

SM, FA delegates/coordinators are fully empowered to ignore even what you refer to as FA regulars, when their commentary is not within WP:WIAFA, so I am not sure of any relevance of any of your statements above; No, X cannot be required at FAC by whim because someone wants it-- no matter how vociferously they oppose. I have promoted FACs with multiple opposes, and archived FACs with 28 Supports. There are almost no areas where FAC goes beyond MOS; where the standards do is spelled out at WP:WIAFA, which has not changed since the 3,000+ FACs I promoted. The effect of one editor demanding citations based on personal preference has no relevance to FAC-- it does, though, serve to discourage editors from wanting to waste time bringing articles to standard. Is demanding citations in the lead unreasonable?

So, back to the issue; should MEDMOS stay in sync with MOS? Citations in the lead are not required. I contest that medical leads have content any more likely to be challenged than many other areas-- this is a made-up meme. And yes, overcitation in leads makes it difficult to write compelling prose. SandyGeorgia (Talk) 21:57, 26 July 2018 (UTC)[reply]

Well the example of BHBMA appears to be a problem of WP:CITEOVERKILL rather than just having citations. 6 cites instead of 1 is excessive if the citations are used elsewhere in the article and they are just confirming each other. If a sentence is being constructed such that it requires 5/6 cites to source specific claims in the sentence, then really that's something that should be re-written for the lead. There doesn't appear to be any conflict between sources (the main reason for warring citation spam) so unless someone somewhere is demanding 4,5,6 cites for non-controversial info I don't see what the holdup is in slimming them down unless they are actually required for WP:V purposes - but some of the sentences are quite short and I am pretty sure you don't need 5 citations for what is a single statement. Here as part of the FAC process @Doc James: actually questioned the amount of references used. So that indicates to me neither the Med project nor FA are requiring that sort of excessive citation in the lead. The GA review here also brings up excessive citations, but also shows that there are issues with material being challenged. Only in death does duty end (talk) 10:49, 25 July 2018 (UTC)[reply]
We've discussed this within WPMED on multiple occasions. There is nothing contradictory between the MOS and MEDMOS. Natureium (talk) 14:39, 25 July 2018 (UTC)[reply]
At this point in this discussion, I think that the bottom line for me is that there is sufficient consensus for the added wording at MEDMOS. --Tryptofish (talk) 17:32, 25 July 2018 (UTC)[reply]
The LEADS are better if referenced. But nothing in the LEAD should really require more than one or two refs (simply pick the best). The LEADS however do not require references, but if referenced with MEDRS compliant sources it would be fairly controversial to try to remove them.
It appears to be claimed that the ONLY reason to reference the leads is to support creation of medical content in other languages (and this is positioned as a bad thing). This, however, is not the case. While it is one reason to reference the lead, it also makes them easier to discuss and improve as one can verify that the content in the lead is well supported or not more easily. Doc James (talk · contribs · email) 08:34, 26 July 2018 (UTC)[reply]
would agree w/ Doc James in terms of the lede--Ozzie10aaaa (talk) 12:20, 26 July 2018 (UTC)[reply]
Concur also with Only in death and Doc James that 1 ref per claim is sufficient. We seem to have this dispute, to the extent it really is one, because of "cite stacking" in the lead, not because the lead has any citations in it at all.  — SMcCandlish ¢ 😼  14:44, 26 July 2018 (UTC)[reply]
At this point, there is still not a single editor from outside of the usual bunch weighing in on this discussion. Ah, but such is Wikipedia these days. I have raised a concern about the direction the MED pages are going, after years of carefully keeping them in sync with site-wide pages; nothing has been or will be done because there are no new eyes on the topic. Carry on. No medical editors are writing top quality content, so resolution one way or another won't have much effect. Regards, SandyGeorgia (Talk) 21:57, 26 July 2018 (UTC)[reply]
I said earlier that "I don't mean to start a FAC versus MED grudge match", and that ^ is what I was concerned about. Peace. --Tryptofish (talk) 22:28, 26 July 2018 (UTC)[reply]
Understood. It was Only in Death (an editor I never encountered in the Featured processes) who stated that "If you want to write a 'featured article' you might have to jump through extra hoops but thats the price you pay for writing a featured article;" and seems to have less than thorough knowledge of the FA process, because there is no requirement to cite leads in FAs. I agree that the FA portion of this discussion should not be relevant, but we do have the example of the way the new wording in the guideline is being interpreted by a few medical editors is extreme, as happened in that case. As you know, rather than rock the boat, I ceded and cited the lead fully at dementia with Lewy bodies even though that should not be needed, and was not needed. But that is how this wording is being extended in application. A very good example of that can be seen with:
  • Medications for one symptom may worsen another.[11]
There is no reason to have to cite a general statement like that in the lead; that is overcitation of the lead, and this sort of thing leads to clunky writing. Regards, SandyGeorgia (Talk) 01:52, 27 July 2018 (UTC)[reply]
I sure do remember that. For whatever it may be worth, one fish's opinion is that "Medications for one symptom may worsen another.[11]" and "Medications for one symptom may worsen another." do not differ from each other in terms of clunkiness of writing. I realize that this is subjective, but I think I'm no slouch when it comes to clarity or engaging-ness of writing (aside from that hyphen I just put there). --Tryptofish (talk) 18:29, 27 July 2018 (UTC)[reply]
The word you're looking for is engagiosity. EEng 18:33, 27 July 2018 (UTC)[reply]
Snort, laugh. Says the editor who has made the cites in Phineas Gage so convoluted that they are clunky. [FBDB] --Tryptofish (talk) 18:49, 27 July 2018 (UTC)[reply]
You stated explicitly you were told taking Dementia to FA would be opposed without citing multiple times in the lead. It's also an indisputable fact that you have to do more to get an article to FA standard than is normal. I have found no evidence from trawling through FA pages, or the med project, or your contribution history, that as a project either FA or med have said or implied that you are/were required to cite in the lead. If they have, it's well hidden. In fact, as I already said above, the only evidence I have found (using your own example) is that they asked for the complete opposite (fewer cites in the lead). So at this point you really need to provide some actual evidence in the form of diffs because so far you have made a number of misleading statements, as well as being extremely insulting towards both the FA and med wikiprojects. Only in death does duty end (talk) 05:32, 27 July 2018 (UTC)[reply]
  • The translation issue seems to be a red herring. If only the lead of a foreign language article is translated then, in the English translation version, it is no longer the lead; it has become the body of the article. And, as for the general point, MOS:CITELEAD makes it clear that "there is not ... an exception to citation requirements specific to leads." This means that the lead of any article may require citations, if challenged, and so medical articles are just a likely case, rather than being special in this regard. Andrew D. (talk) 22:36, 26 July 2018 (UTC)[reply]
  • Concur. There is no reason for medical articles to be treated any differently than any other class of article, with respect to citations in the lead. The site-wide guideline covers it. MEDRS and MEDMOS tried (in the past anyway), not to extend beyond site-wide policy and guideline, but to explain how those policies and guidelines applied to biomedical content. Going beyond what is required for any other type of article is likely to result in a backlash, and accomplishes nothing. If people want to translate, that is fine, but they can seek out the citations as needed (which they should be reading anyway, although they don't always.) Regards, SandyGeorgia (Talk) 01:32, 27 July 2018 (UTC)[reply]
  • Leaning toward concurrence with Only in death. Where is the evidence of either a) FAC requiring lead citations in med articles, or b) MEDRS actually diverging from MoS or from WP:CITE? Like, please actually quote the material where this alleged WP:POLICYFORK is happening.  — SMcCandlish ¢ 😼  11:56, 27 July 2018 (UTC)[reply]
After suffering through this long discussion I'm leaning towards simply choosing death, period. EEng 18:48, 27 July 2018 (UTC)[reply]
[citation needed] --Tryptofish (talk) 18:52, 27 July 2018 (UTC)[reply]
  • I would not want MEDMOS to say that there has to be an inline cite for every sentence in the lead. And maybe there has been a problem with editors disagreeing about whether a cite is really necessary for a particular sentence in a particular lead. But that's not a reason to say that permitting lead cites automatically creates a problem.
I think that it would be a problem if MOS set a requirement, and MEDMOS tried to say that the site-wide requirement would not apply to med pages. But that is not the case here. I see nothing wrong with MEDMOS suggesting (and it is a suggestion rather than a requirement) something that MOS says is OK but not required. MOS says that some pages can have cites in the lead and other pages don't have to. MEDMOS just says that putting cites in the lead is recommended. On the other hand, MEDRS has long said, with good consensus, that there are situations where primary sources are impermissible, whereas RS does not make that kind of prohibition. Because this is a situation where a specific topic has editors who want to go beyond the minimum required by MOS, rather than to ignore requirements set by MOS, this is not a violation of MOS. And there really are valid reasons to encourage lead cites in health-related pages. --Tryptofish (talk) 18:46, 27 July 2018 (UTC)[reply]
  • In my view, medical articles are among the crown jewels of en.WP. But I just HATE the lazy, disruptive practice of tagging every single sentence. Tony (talk) 04:22, 30 July 2018 (UTC)[reply]

Rhyme scheme patterns

A rhyme scheme is a pattern that appears in the lines of a poem, and generally letters are used to notate the pattern, for example "ABAB CDCD". Sometimes these sequences are very long, the longest I could find on Wikipedia being "abacabadabacabaeabacabadabacabafabacabadabacabaeabacabadabacaba". There is considerable variation in how this notation is capitalized and punctuated:

  • ABAB
  • "ABAB"
  • abab
  • "abab"
  • "A,B,A,B"
  • "AB AB"

In some cases, the main article says that the notation requires a specific capitalization. For example, "aBaBccDDeFFeGG" distinguishes masculine and feminine rhymes with lower vs. upper case. I think it would be nice, for human readability reasons, to settle on a consistent style for this notation. To me the sequences are nicely distinguished from sentence prose when they are either in quotes, all caps, or both. I'm open to whatever poetry-editing editors want to do, but for the sake of having a starting point for discussion, how about the below? I have asked for input from Wikipedia talk:WikiProject Poetry. -- Beland (talk) 22:00, 25 July 2018 (UTC)[reply]


Unless otherwise required by a specific notation, rhyme schemes should generally be written when appearing in prose:

  • In all capital letters
  • Enclosed in double quotation marks
  • Without italics
  • Using spaces to separate groups (not commas or other punctuation)

Example: This poem uses the "ABAB" rhyming pattern.
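
For illustration only, the draft rules above could be applied mechanically along the following lines. This is a minimal Python sketch and not code from the Wikipedia:Typo Team/moss tooling; the function name normalize_rhyme_scheme is hypothetical. Two choices are assumptions that this discussion has not settled: a mixed-case scheme is treated as "required by a specific notation" and its case is left alone, and punctuation inside a group (commas, hyphens) is simply dropped while existing spaces are kept as group separators.

```python
import re


def normalize_rhyme_scheme(raw: str) -> str:
    """Rewrite a rhyme scheme in the draft style: capitals, double quotes, spaces between groups."""
    # Drop surrounding whitespace and any existing quotation marks.
    scheme = raw.strip().strip("\"'")
    # Existing whitespace is assumed to be meaningful, so it defines the groups.
    groups = scheme.split()
    # Remove commas, hyphens, and other punctuation inside each group.
    groups = [re.sub(r"[^A-Za-z]", "", group) for group in groups]
    groups = [group for group in groups if group]
    letters = "".join(groups)
    # Mixed case may carry meaning (e.g. masculine vs. feminine rhymes),
    # so only uniformly cased schemes are converted to capitals.
    if letters.islower() or letters.isupper():
        groups = [group.upper() for group in groups]
    return '"' + " ".join(groups) + '"'


if __name__ == "__main__":
    for sample in ["abab", "A,B,A,B", "AB AB", "a-b-a, b-c-b, c-d-c", "aBaBccDDeFFeGG"]:
        print(sample, "->", normalize_rhyme_scheme(sample))
```

Whether spaces always mark stanza boundaries, and whether hyphenated forms should survive, would still need to be decided by editors; the sketch only shows that the four bullet points are specific enough to check automatically.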


Why the quotation marks? What information do you think you are conveying by including this extra baggage in the notation? (Also, I suspect that when spaces are used it may often be because they are meaningful, e.g. to separate stanzas. And there are more styles currently in use than you list; e.g. a-b-a, b-c-b, c-d-c.) —David Eppstein (talk) 23:28, 25 July 2018 (UTC)[reply]
I'm not attached to the quote marks for all-caps sequences. If you parse the sequences as proper nouns, they make sense without them. If you parse them as a sequence of symbols not part of the sentence but being quoted from somewhere else, I would think about them like MOS:WORDSASWORDS, where quote marks are one of the options (and prettier than italics, which I don't see often used for this purpose in poetry articles). If we were doing all-lowercase, then the sequences would blend in with sentences much more easily and I think there'd be a stronger argument for quote marks. Yes, the spaces are meaningful, though without a lot of research I'm not sure I could catalog all the ways people use them, so I tried to be generic. Do you think "stanzas" is a better word than "groups"? I did some searches for the style with dashes and didn't find attestation for that, but that may merely be a limitation or my own misuse of the search engines. I'm open to that or other styles as well if people like them better, though my personal opinion is that dashes just bulk up the string without adding clarity. -- Beland (talk) 19:45, 26 July 2018 (UTC)[reply]
  • Overprescription, micromanagement, MOSbloat. And you really are stirring too many pots at once. EEng 02:39, 26 July 2018 (UTC)[reply]
Pinging Phil wink who has put a lot of thinking and work into this issue. Also pointing out as of relevance: WP:POETRY#Scansion, Wikipedia talk:WikiProject Shakespeare/Archive 5#Scansion and meter in the sonnets. --Xover (talk) 07:47, 26 July 2018 (UTC)[reply]
The linked discussions are beautiful examples of knowledgeable editors working out article content for themselves in a specific and important topic area. Why does MOS need to overbear that? EEng 10:18, 26 July 2018 (UTC)[reply]
Well, I'm asking those knowledgeable editors if we can agree on a single style for a given notation, rather than having different styles on different articles due to smaller discussions coming up with different answers. I'm not sure what you mean by the MOS overbearing; if there's a preferred style for something, this is the place to document it (either by explaining in detail or pointing to WikiProject guidelines). The point of project-wide consistency is to make article content easier to digest (e.g. when reading a bunch of different articles about rhyming poems) and to make the project look polished, professional, and credible. -- Beland (talk) 19:45, 26 July 2018 (UTC)[reply]
  • I strongly suspect that there are already-published norms about this sort of thing, "out there". If there's consensus among editors who work on poetry material a lot that there's a preferred way to do this (and I'd bet that David Eppstein's suspicion that spacing can be semantically meaningful is correct), maybe we could add something about it in a poetry and lyrics section at WP:Manual of Style/Writing about fiction. This above stuff seems pretty half-baked at this point, though. And scansion and rhyme schemes are not the same thing. The two scansion threads pointed out do seem consistent with each other, probably because WP:POETRY#Scansion is basically a de facto guideline. So, it could be the start of a MOS:FIC section on poetry. PS: I agree that adding quotation marks around such markup is pointless. — Preceding unsigned comment added by SMcCandlish (talk • contribs) 16:40:00, 26 July 2018 (UTC)[reply]
  • As always:
A. It is an axiom of mine that something belongs in MOS only if (as a necessary, but not sufficient test) either:
  • 1. There is a manifest a priori need for project-wide consistency (e.g. "professional look" issues such as consistent typography, layout, etc. -- things which, if inconsistent, would be noticeably annoying, or confusing, to many readers); OR
  • 2. Editor time has been, and continues to be, spent litigating the same issue over and over on numerous articles, either
  • (a) with generally the same result (so we might as well just memorialize that result, and save all the future arguing), or
  • (b) with different results in different cases, but with reason to believe the differences are arbitrary, and not worth all the arguing -- a final decision on one arbitrary choice, though an intrusion on the general principle that decisions on each article should be made on the Talk page of that article, is worth making in light of the large amount of editor time saved.
B. There's a further reason that disputes on multiple articles should be a gating requirement for adding anything to MOS: without actual situations to discuss, the debate devolves into the "Well, suppose an article says this..."–type of hypothesizing -- no examples of which, quite possibly, will ever occur in the real life of real editing. An analogy: the US Supreme Court (like the highest courts of many nations) refuses to rule on an issue until multiple lower courts have ruled on that issue and been unable to agree. This not only reduces the highest court's workload, but helps ensure that the issue has been "thoroughly ventilated", from many points of view and in the context of a variety of fact situations, by the time the highest court takes it up. I think the same thinking should apply to any consideration of adding a provision to MOS.

I'd like to see, at the least, evidence for A1 or A2 before we even think about embarking on such a debate, because if MOS does not need to have a rule on something, then it needs to not have a rule on that thing. EEng 21:55, 26 July 2018 (UTC)[reply]

For A1, Rhyme scheme uses both lowercase-in-quotes and uppercase-no-quote styles in its prose, but mostly uses uppercase when explaining the different notations. I think this looks ugly and unprofessional because it is inconsistent, and the inconsistency continues when compared to other poetry pages, found by a moss scan:

-- Beland (talk) 02:23, 28 July 2018 (UTC)[reply]

I don't understand what that list is supposed to demonstrate. EEng 02:31, 28 July 2018 (UTC)[reply]
Yeah, is this supposed to indicate a consistent stylistic preference, or is this just a partial list of a style Beland is objecting to?  — SMcCandlish ¢ 😼  00:27, 31 July 2018 (UTC)[reply]
@EEng: This was just to demonstrate that different articles use inconsistent conventions, since you didn't want to make a rule unless there was "evidence" for a need for consistency. Is this what you were asking for? -- Beland (talk)
What are you talking about? All your examples use a single consistent format. EEng 22:40, 6 August 2018 (UTC)[reply]
@EEng: Well, as listed above, all the examples are inconsistent with the article rhyme scheme, which is internally inconsistent. And actually it's more complicated than that; the scan downcased all the contents for de-duplication purposes. (Sorry, I forgot it was doing that.) So for example Ballad of Eric actually uses "ababC". If you only read the article, it is a bit unclear why the "C" is capitalized, but rhyme scheme notes that this is sometimes done to indicate verbatim repetition of an entire line. But in other cases capitalization is used to indicate gender, so it might help to be more explicit. Some further irregularities: Nachtlied (Reger) uses italics. Ottava rima is internally inconsistent, using both "abababcc" and "a-b-a-b-a-b-c-c". -- Beland (talk) 07:32, 7 August 2018 (UTC)[reply]
Beyond EEng's A1/A2/B prerequisites, we're also missing any evidence that Beland has surveyed professional best practices in writing rhyme schemes and has come to the proposal above as a condensation of those best practices, or has brought any subject expertise at all to bear on the issue. The comment above suggesting an unfamiliarity with the word "stanza" is not promising. —David Eppstein (talk) 02:00, 31 July 2018 (UTC)[reply]
WT:WikiProject Poetry has been notified of the discussion (by Beland). I've added more notices, to WT:MOSFICT, and to the talk pages of the songs, music, and classical music wikiprojects, to try to attract more participants.  — SMcCandlish ¢ 😼  12:41, 1 August 2018 (UTC)[reply]
@David Eppstein: I have not researched professional practices beyond a quick search for what seems to be most popular on other web sites that discuss rhyme schemes, and I have no interest in doing any further research. In general, I have no interest in reading poetry or reading about poetry, so I'll take the grade "not promising" with pride. I just happened to notice that Wikipedia uses this notation inconsistently, and I need some guidance from folks who do care about this stuff so I can correctly program my database scanner and advise other editors how to fix this type of problem. I chose my proposed style based on what looks good to me, as a mostly arbitrary starting point for discussion. If following a professional standard is important to you, feel free to cite one you want us to use. -- Beland (talk) 22:34, 6 August 2018 (UTC)[reply]
You are proud of your ignorance of professional practice, and you want to use that ignorance as the basis for making policy decisions here? Do you think that the professionals might somehow have failed to address issues of how to notate this kind of information? Or that the best practices of experts are something to be avoided? That being uninformed makes your aesthetic judgements purer? That as someone not even interested in the subject of poetry, you are going to recognize and address the informational needs that such a notation should provide? That seems an amazingly backwards and anti-intellectual position for a Wikipedia editor to take. —David Eppstein (talk) 22:40, 6 August 2018 (UTC)[reply]
@David Eppstein: I am not arguing that ignorance gives me any greater insight into what style should be used; if anything, it gives me less. I'm just saying that as a non-poetry person, I don't really care which style is chosen, as long as we are consistent. I'm providing a default option if no one else cares either. It sounds like you want the style chosen to reflect professional practice of some kind; if so, whose practices do you want us to follow? -- Beland (talk) 23:46, 6 August 2018 (UTC)[reply]
I poked through List of style guides; the MHRA Style Guide for example does not address this issue at all. I don't know if MLA Style Manual does, but I couldn't download it for free. If you do a web search for "abab cdcd" you'll see both abab cdcd and ABAB CDCD used in about equal measure. I don't think this comes up often enough to have a generally accepted style for a general audience. If there's a particular journal or something you care to follow, I'm open to suggestions. -- Beland (talk) 00:36, 7 August 2018 (UTC)[reply]
If you think that general-purpose style guides were what I meant by the best practices of experts, then all I can say is that you're so ignorant that you don't even know you're ignorant. What I meant (as I thought would have been obvious) were the writings of literary scholars when writing about rhyme schemes. You know, the sort of writing that we should be using as references in our articles about rhyme schemes? Maybe we could find something like this in the sort of textbook that would be used for undergraduate-level poetry classes? As a random and probably too-technical example (not intended to be in any way definitive) this paper on Glück writes "lower case letters stand for masculine clausulae and upper case letters for feminine clausulae. X and x indicate respectively blank feminine and blank masculine clausulae." If we're going to write about style guides for rhyme schemes at all (something I'm still not convinced we need), these sorts of intricacies are something we need to understand, so that we don't end up writing a lobotomized style guide that can only be used for lobotomized articles. —David Eppstein (talk) 01:09, 7 August 2018 (UTC)[reply]
@David Eppstein: Well, there's no need to be insulting; I checked general-purpose style guides because they were easy to find and I'm trying to put in some effort to help get this discussion to some sort of resolution. I assumed you were talking about academic journals, which is why I asked if there were any in particular you wanted to argue in favor of. I don't read poetry journals because I hate poetry. Yes, rhyme scheme clearly documents that some notations use a mix of uppercase and lowercase letters. I read that, and I mentioned it in the preface to my proposal, and that's why I included in the proposal "Unless otherwise required by a specific notation". Maybe it wasn't clear what I was getting at? I expect there are a lot of different academic notations (and actually that article lists quite a few) but what Wikipedia seems to use most of the time is the dead-simple notation we used in high school freshman English class, which can be freely styled either in all-uppercase or all-lowercase. If you look at reference 5 on Summum Bonum (poem), you'll see that it uses all-lowercase with no spaces and no punctuation. But if you search Google Books for ababababcd as found in Christis Kirk on the Green you'll see that some academic authors use spaces and some don't. In general, I think the way Wikipedia deals with cases like this where there are heterogeneous external conventions is simply to declare a house style and adapt all content it's quoting to that style. We even do this for direct quotes to some degree: Wikipedia:Manual_of_Style#Typographic_conformity. We don't use single quote marks for articles about British things even though British sources almost always do, because that doesn't fit the house style. 
Because there are several different variants of this notation, some of which use capitalization and spacing in a meaningful way, an entirely different way to handle this would be to require articles to follow the convention for the particular subnotation listed at rhyme scheme. That would require making that master article internally consistent; for example, right now it uses both "abab" and ABAB CDCD EFEF GHGH to describe traditional rhymes, which should be using the same subnotation. We could also just say something like "Unless differences between uppercase and lowercase letters are being used in a meaningful way, all letters should be uppercase" and "Unless spacing or punctuation is being used in a meaningful way (such as to separate stanzas), patterns should be written without spaces or dashes; spaces are preferred over dashes when indicating groups of lines." -- Beland (talk) 07:32, 7 August 2018 (UTC)[reply]
  • I need some guidance from folks who do care about this stuff so I can correctly program my database scanner and advise other editors how to fix this type of problem – How about if you just don't do that? That would save us all a huge amount of trouble. EEng 22:44, 6 August 2018 (UTC)[reply]
@EEng: Because that would leave an ugly inconsistency that makes articles harder to read. This should not be a lot of trouble; we just need to make an arbitrary choice between multiple available styles already in use. If this is taking a lot of work, maybe we're over-thinking it. -- Beland (talk) 23:46, 6 August 2018 (UTC)[reply]
We also need some way to resolve these cases so that we don't waste editor time examining the same ones over and over again because they keep getting detected as spelling errors. If we resolve them inconsistently, then it seems to me we're wasting the effort that was put into detecting them in the first place and manually resolving them. -- Beland (talk) 00:21, 7 August 2018 (UTC)[reply]
{{sic}} handles the spelling error thing. I'm still waiting for evidence there's any problem here MOS needs to solve. Several people here have expressed exasperation at your stirring so many pots. This project of yours to automate something or other -- is it going to beget an endless stream of these bids to regiment everything big and small? EEng 00:27, 7 August 2018 (UTC)[reply]
If we put {{sic}} on all of these before deciding on a style, then once we eventually do decide on a style, we'll have to go back and change half of them. That seems like a lot of wasted work in comparison to just deciding on the style sooner rather than later. If we happen to decide to capitalize them, we won't need {{sic}} at all, and we will also be able to ensure we haven't missed any. I'm sure we'll have more questions as we make progress fixing more known spelling problems. But it won't be endless; there shouldn't be any more style questions to decide than the staff of a print encyclopedia would have to decide, and the MoS and WikiProjects have already decided a very large number of questions. I don't know why having two outstanding style rule questions on this page, for a project with 5.6 million articles, would be unreasonable, especially as there are 10 other style question discussions also on this page. I don't expect all the same people to care enough about all possible questions, as different people are subject matter experts in different areas. As for "evidence", do you not consider the existing style inconsistencies I've pointed out as a problem that should be solved? -- Beland (talk) 00:53, 7 August 2018 (UTC)[reply]

Someone else is going to have to take over. This is hopeless. EEng 03:21, 7 August 2018 (UTC)[reply]

A recent edit changed "Capitalize names of particular institutions ..." to "Capitalize names of institutions ..." on the grounds that "particular" is superfluous here. I undid the edit, but it has been restored by another editor.

I entirely agree that logically "particular" is redundant, but ultra-clarity is useful in the MoS. Leaving just the plural "names of institutions" might imply that e.g. the Universities of Oxford and Cambridge is correct.

One compromise might be "Capitalize the name of an institution ...", although I still think the original is fine. Peter coxhead (talk) 05:58, 27 July 2018 (UTC)[reply]

This isn't the biggest issue in the world, but my point is that the word "particular" isn't doing any lifting whatsoever in that sentence. Certainly it isn't resolving your exemplified issue - since the 'universities of Oxford and Cambridge' are (two) particular institutions; resolution depends upon the word "names", since that formulation isn't a "name" (as an aside I note that the UK Parliament used the capitalisation in this very phrase, back in the 1920s: "An Act to make further provision with respect to the Universities of Oxford and Cambridge and the Colleges therein", although of course usage has evolved since then). Looking at a few style guides I can't find any guidance in any of them for dealing with the situation where several organisations are listed together preceded by the plural of a word that is common to all of their titles, anyway, but I agree with you that lower case feels appropriate. In any event, your alternative wording in the singular also works. MapReader (talk) 07:09, 27 July 2018 (UTC)[reply]
I can't find explicit guidance in the MoS re the plural case (SMcCandlish: can you help?), but there have been various relevant discussions, e.g. Talk:List of mayors of Birmingham#Requested move 5 September 2017 which have all upheld the view that where "X of Y" is a title, and so X and Y are capitalized, in "xs of Y", x is not capitalized. (As it happens, I don't agree that lower case feels appropriate – I would capitalize – but I'm pretty sure the consensus here is otherwise.) Peter coxhead (talk) 09:46, 27 July 2018 (UTC)[reply]
I put it back in. It was added because people were misreading it as meaning to do, e.g., "all of the State Legislatures in the United States", "she attended both Harvard and Princeton Universities", etc. The "Universities of Oxford and Cambridge and the Colleges" style was preferred by certain style guides in the early to mid-20th century but is excoriated in most of them today. It's the same style that results in "XYZ Corporation is a Canadian shoe manufacturer. J. Q. Foobar founded the Corporation in 1978." This style is virtually never used today except in legal writing (within actual legal documents, not legal journal prose, etc.), and in a small sliver of especially stilted business English, usually in internal self-referential material (i.e., your CEO's memo is apt to call your own company "the Company" but will not refer to a meeting with a rep from XYZ Corporation as a negotiation between "the Company and the Corporation"). WP doesn't write this way.

There may be another way to get at this than by inserting "particular", but it does seem to be having the desired effect, even if MapReader and Popcornduff might be skeptical that it should. There's been a marked decline in the over-capping style over the last several years.

PS: "xs of Y" is always going to look off to a subset of readers, those for whom such a title seems to be usually capitalized (or at least where they don't notice much when it's not). LC is the norm in the British press now – in-page search these articles for "lords mayor", and there were zero hits at either The Guardian or The Economist for this phrase capitalized: [10][11][12][13][14][15][16]. It's not like Wikipedia made it up. :-) There'll always be a hint of perceptual dissonance because a title like "Lord Mayor" is almost always encountered attached to someone's name. Similarly, many Americans always want to capitalize "president" in reference to the US head of state; some of them even want to do it with adjectival constructions like "presidential race".
 — SMcCandlish ¢ 😼  11:11, 27 July 2018 (UTC)[reply]

The reduction in capping is because the Internet and SMS are driving society towards ignoring capitals altogether. It is certainly most unlikely to have anything to do with the word "particular" in this sentence of the MoS, where it appears utterly redundant. MapReader (talk) 11:58, 27 July 2018 (UTC)[reply]
  • I was going to make the same point MapReader made even before I saw they'd written it. In the words of the Guardian style guide: "The tendency towards lowercase, which in part reflects a less formal, less deferential society, has been accelerated by the explosion of the internet: some web companies, and many email users, have dispensed with capitals altogether."
I have to say I don't understand what "particular" is adding here, even after reading the arguments about plurals. To me, as someone who uses style guides a lot professionally, it's not clear that "particular" is meant to clarify that you don't capitalise in lists or whatever. I wouldn't have inferred that at all. Popcornduff (talk) 12:45, 27 July 2018 (UTC)[reply]
Except the lower-casing trend started way before the Internet existed as a general public medium. There is no doubt that computer-mediated communication is accelerating the trend, but it doesn't matter. It's not WP's job to defend tradition against the abuses of callow youth, but (with regard to style matters) to reflect actual practice in contemporary formal written English (and to work around key WP-specific issues here and there, where there's a technical problem – thus, e.g., our avoidance of curly quotes).

On the original topic, I'll just repeat: There may be another way to get at this than by inserting "particular", but it does seem to be having the desired effect. Rather than argue about one word, maybe suggest better wording that makes the point more clearly.
 — SMcCandlish ¢ 😼  13:14, 27 July 2018 (UTC)[reply]

While I feel SMcC is stretching credibility beyond a sensible point by arguing that this word is doing any good whatsoever, it's just one word, and isn't doing any harm apart from padding. So as the OP I say let's drop the matter. MapReader (talk) 14:36, 27 July 2018 (UTC)[reply]
Well, I'm also fishing for "Better wording would be ...". I tend to just directly edit this stuff, but others here seem to feel more strongly that the current wording isn't good, rather than that it's not ideal. I'm not sure how to more tightly get at something like "Capitalize the names of institutions, including their official names and conventional short forms that are treated as proper names, but do not capitalize a word from such a name (university, corporation, etc.) when used apart from the name, nor in plural form when two or more institutions sharing the term are mentioned back-to-back." I guess we could use that exact wording in a footnote, but I don't think anyone wants to see something like that inline in the main guideline text.  — SMcCandlish ¢ 😼  14:10, 28 July 2018 (UTC)[reply]

Deadnaming trans people

The following discussion is closed. Please do not modify it. Subsequent comments should be made in a new section. A summary of the conclusions reached follows.
 — SMcCandlish ¢ 😼  14:11, 29 July 2018 (UTC) (non-admin closure)[reply]

I'm coming up against an issue regarding deadnaming. I appear in the article City of York Council election, 2015 under my deadname. I edited this but the edit was revoked by Sam Blacketer. I understand but disagree with Sam's concerns about/concept of accuracy and also feel that there are a number of key reasons why deadnaming should be avoided in almost all contexts anyway. The discussion between Sam and me can be found on Sam's talk page under the section "Deadnaming trans people".

In summary:

  1. Deadnaming trans people can be dangerous - sometimes very seriously so - both from a mental health/gender dysphoria perspective and from a risk of outing perspective. The Radical Copyeditor's Style Guide for Writing About Transgender People stated in section 2.4.2: "Using a trans person’s birth name or former pronouns without permission (even when talking about them in the past) is a form of violence". In the recent academic article How do you wish to be cited? Citation practices and a scholarly community of care in trans studies research articles, Thieme and Saunders discuss citation practices including the issues of deadnaming through records.
  2. It is common on Wikipedia to refer to people by the name they go by/are best known by in other domains e.g. celebrities with stage names. The stakes are higher for trans people and deadnaming.
  3. There is evidence of my name change and my request to be referred to only as Ynda Jas (not as my deadname) online, and I can provide a deed poll and bank statements in my new name if there's any question about the validity of this change (Sam hasn't questioned it).
  4. I have submitted a request with City of York Council for their website to be updated.
  5. The Manual of Style appears to favour referring to trans people under their current ("self-designated"/"self-expressed") name/pronouns/identity and states that "this applies in references to any phase of that person's life", though it is a little ambiguous about whether this applies only to biographical pages or also to references elsewhere. Sam stated that if I had a biographical page things might be different, but does that not create a two-class system of people who are worthy of not being deadnamed (and their current/actual name/identity respected) and people who are effectively unworthy (as a result of not being Wikipedia-noteworthy enough to have their own biographical page)?
  6. Sam feels that election records should reflect what was recorded at the time as a matter of accuracy. I argue it would be accurate - more accurate in fact - to use Ynda Jas since the person who stood for election was Ynda Jas, not my deadname. My deadname is simply an old label for Ynda Jas (me). Surely the main purpose of the page is to act as a record of who ran for election, not what names were on the ballot paper?
  7. Sam went on to talk about how the page should reflect that it was a cis male who stood for election - it wasn't, I simply wasn't out as trans/non-binary then (my interpretation of gender - pretty uncontroversial in contemporary Western trans communities - is that for those trans people who are not fluid in their identity, they have always been the gender they identify with, they simply haven't always known it/been out to themselves or the rest of the world). Either way, the page wouldn't reflect whether people saw me as cis or trans whether it uses my deadname or Ynda Jas because (A) that information is not on the page - it's literally just a name! - and (B) people had no real basis on which to assume I was cis or trans at that time. My gender was not on the ballot paper. There may have been campaign materials using the pronouns with which I was referred at the time, but this argument feels like clutching at straws and an issue of inconsequential significance relative to the impact of deadnaming. In my case, I would give permission to include a footnote to say that I was on the ballot paper under my deadname, but this should be the exception not the rule - see point (1).

Could there be clearer guidance on this in the Manual of Style, particularly in relation to non-biographical pages? If there needs to be further discussion before a decision is made on this, it should involve trans people.

--Yndajas (talk) 22:42, 28 July 2018 (UTC)[reply]

  • The short answer is no. The only reason Wikipedia can include detailed historical records about otherwise non-notable people (such as the per-ward results) is that they faithfully reflect the sources used. If the election council for York puts out a statement updating their records, I'd support updating it. As it is, I would much rather remove the entire section rather than make a name change that isn't supported by secondary sources (just the person's own statements); it isn't obvious how to prove that the person changing their name is the same one that ran for office. The emotional issues aren't relevant; this person's own website [link redacted] uses their old name. power~enwiki (π, ν) 23:18, 28 July 2018 (UTC)[reply]
    • As a side note, this isn't really the right forum, but as this is a new user who was directed here, I see no reason to move this discussion. power~enwiki (π, ν) 23:18, 28 July 2018 (UTC)[reply]
Distraction
Not a new user at all. Yndajas has been here since 2007.  — SMcCandlish ¢ 😼  03:55, 29 July 2018 (UTC)[reply]
Yndajas is not technically a new editor, but Yndajas's sporadic contribution history, which includes significant gaps in editing, indicates a high likelihood of being unfamiliar with a number of Wikipedia rules and protocols. Flyer22 Reborn (talk) 07:11, 29 July 2018 (UTC)[reply]
I've encountered editors like that, and they tend to essentially be newbies. Flyer22 Reborn (talk) 07:15, 29 July 2018 (UTC)[reply]
This one makes lots of convoluted policy arguments, so not very noob. But this is immaterial, really. I didn't mean to side-track us into this.  — SMcCandlish ¢ 😼  13:16, 29 July 2018 (UTC)[reply]
I agree that this is not a MOS issue, but in cases where there is reliable sourcing, WP:BLP applies and the current name should be used, with citation. --Tryptofish (talk) 23:29, 28 July 2018 (UTC)[reply]
@power~enwiki - I assume you read through the discussion between Sam and me if you came across my old website. In which case, you'll note I referred to it as my old website, and linked to a page on my new website where I am explicit on how I should be cited. I'm in the process of shifting content over to my new website - when that's complete I'll probably just set up a redirect to the new website until the domain and hosting expires, so that I'm not deadnamed by that site (other than the URL) anymore. The fact I haven't had time to get all that done yet is not reason to believe I'm still intentionally referring to myself by my deadname, nor reason for resistance to change in other areas of my web presence. Regarding whether it's definitely me, I doubt you'll deem it a reliable source, but here are three images I'm tagged in with my deadname in comments from that election campaign, including one on the podium with the elected Green councillor: 1 | 2 | 3. As you'll see, it's the same person that you'll find in various places online under the name Ynda Jas today. As I said, I am also happy to show you my deed poll and whatever else you deem relevant. These aren't the kinds of things you put in a newspaper, so what counts as a reliable source needs consideration - I can provide deed poll, my old and new websites, social media posts and so on, but not every trans person could make these readily available and will you accept them anyway? There needs to be a clear process by which trans people can have references to them updated as a matter of safety even if not out of respect for people's identities Yndajas (talk) 01:54, 29 July 2018 (UTC)[reply]
Unproductive bickering.
  • TL, stopped reading at Using a trans person’s birth name or former pronouns without permission (even when talking about them in the past) is a form of violence. Please get real. EEng 23:35, 28 July 2018 (UTC)[reply]
    Get real? Are you disputing the fact that mental health is real and to purposefully disregard it is violent, or that outing people as trans can endanger them and that to endanger is an act of violence? This is the reality for many of us, and Wikipedia shouldn't be contributing to it Yndajas (talk) 01:54, 29 July 2018 (UTC)[reply]
    Outing people? What??? If you don't want your old identity tied to your new one then why are you going around doing it yourself? [17] Not referring to you in the way you prefer might be thoughtless or unkind, or it may just be an inescapable consequence of reporting the facts, but it's not violence. To pretend it is trivializes real violence and real victims of violence.
    And BTW I'm gay so you can skip the lectures on oppression. EEng 02:24, 29 July 2018 (UTC)[reply]
    As I've said elsewhere, I'm privileged enough that I can be out, but I still experience real gender dysphoria and one source of this is being referred to by incorrect pronouns and an incorrect name. Gender dysphoria contributes to depression. To cause someone emotional distress is harmful, and to cause harm - especially when doing so consciously - is violence. This is not trivial.
    As far as I'm aware, Wikipedia edit histories aren't indexed by search engines, so someone would have to dig deep to find that I made an edit in which I claim my deadname as my deadname, but either way as I say it's not being out (in my case) that's the problem (and as I also said, in my case as I am able to be out I don't mind a footnote explaining I ran under a different name, but this shouldn't be standard). The problem for me is the continued presence of old labels. Yndajas (talk) 02:39, 29 July 2018 (UTC)[reply]
    Many people will in fact dispute this use of the word "violent"; it's a particular subcultural usage that is at odds with the historical and still-current general usage of the term outside that context. Getting into an argument about that here will not help you (as demonstrated by where this is already going; I got edit-conflicted 3 times just trying to say this and head this off).  — SMcCandlish ¢ 😼  02:41, 29 July 2018 (UTC)[reply]
    I think responding to mockery ("Get real") on a serious issue with a clear and well-reasoned explanation of why this is violence (you'll find this kind of definition in many places - a result of greater awareness and destigmatisation of mental health) should not be a point of reprimand Yndajas (talk) 02:58, 29 July 2018 (UTC)[reply]
    It wasn't mockery. Trivializing real violence by pretending that hurt feelings are comparable is muddling hyperbole. EEng 03:03, 29 July 2018 (UTC)[reply]
    If you do the slightest bit of research you'll find that suicide rates among trans people are a very big issue. Being misgendered, deadnamed etc all contribute to gender dysphoria, which is a source of depression. This is not trivial. Yndajas (talk) 03:12, 29 July 2018 (UTC)[reply]
    No one said the definition doesn't exist, it's just not going to be productive discussion here.  — SMcCandlish ¢ 😼  03:00, 29 July 2018 (UTC)[reply]
    I'll be happy to stop having that discussion once it's no longer questioned that this is a serious mental health issue on a par with physical violence (and sometimes a physical violence threat itself, though not for me through deadnaming). Yndajas (talk) 03:12, 29 July 2018 (UTC)[reply]
    That's a straw man. Since no one actually did question whether this is a serious mental health issue, I will take you at your word that this unproductive argument will end.  — SMcCandlish ¢ 😼  03:17, 29 July 2018 (UTC)[reply]
  • This seems to be another Bruce vs. Caitlyn Jenner type situation. The question is whether we should use the historical “name of record” when mentioning someone in a historical context. It isn’t really a case of “deadnaming” ... it is a case of adhering to the historical record in a specific context. Similar to how we use “Bruce” in our articles on the Olympics, even though we use “Caitlyn” in all other contexts. Blueboar (talk) 01:10, 29 July 2018 (UTC)[reply]
    Except the Manual of Style seemingly already suggests that's not good practice? And it is deadnaming whether you like the label or not - it's exactly what deadnaming means. Unless Caitlyn has okayed being deadnamed, you should not be doing it here nor should it be the practice in articles on the Olympics. Not good enough. As I argued in my original post here, surely accuracy about who did what is more important than what name that person was labelled with at the time? Ynda Jas is the who in this case, not my deadname. Yndajas (talk) 01:54, 29 July 2018 (UTC)[reply]
    This isn't a deadnaming case. We're correctly reporting that the City of York says that those names were on the ballot. You aren't WP:Notable (by the standards in that guideline – don't feel bad, nearly no one here is!) and don't have your own article here, so the only person drawing a connection between that name on the ballot and you under your current name is you. It would be an unverifiable distortion of the facts to change that article to say that Ynda Jas was on the ballot. Deadnaming is continuing to refer to trans people in the present by their old names; it has nothing to do with expunging old records or falsifying what they say in an encyclopedia. If someone shows up at your birthday party and (despite knowing of your name change) continues to refer to you by your old name, that would be deadnaming. Our own Deadnaming article section is consistent with this interpretation, as are offsite articles like this one at Healthline.  — SMcCandlish ¢ 😼  02:41, 29 July 2018 (UTC)[reply]
    The table for each ward includes the heading "Candidate" not "Name on the ballot" - I was the candidate, and I am Ynda Jas not [deadname]. "Name on the ballot" would be a historical matter, but the "candidate" is living in the present with the name Ynda Jas, and that page is referring to them by their deadname in the present. Using someone's deadname is deadnaming them, and Wikipedia is doing just that. Wikipedia is a live, readily-updatable resource. I updated it to accurately state who the candidate was and then that change was reverted - that reversion was an act of deadnaming as is the continued presence of it on a resource that can easily be edited at any time.
    There needs to be a process by which non-notable people can stop being deadnamed, or as I said in my original post on this page it creates a two-class system where you have to earn your right not to be deadnamed, which is unacceptable. How can we provide reliable resources if you won't accept our personal websites, other websites, social media presence or some combination of these? And in the case of people who can't be open about being trans, what if even these can't be provided? Yndajas (talk) 02:53, 29 July 2018 (UTC)[reply]
    Nice try, but "Candidates" in that context means "Names on the ballot". It's a context of what the reliable election-related documentary sources say, not what people are saying on their personal websites or in online discussion boards like this talk page.  — SMcCandlish ¢ 😼  02:59, 29 July 2018 (UTC)[reply]
    I disagree. A candidate is a/the person who stands for a position. I am that person. If it was a record of the names on the ballot, it should be much clearer about that. If it was clearly that (it's not), there should at least be a right to redact such information Yndajas (talk) 03:05, 29 July 2018 (UTC)[reply]
    It doesn't matter if you disagree. We're not going to change how we source articles to make one person happy. I strongly suspect that just going and changing the table column to read "Name on the ballot" would do nothing to assuage you.  — SMcCandlish ¢ 😼  03:06, 29 July 2018 (UTC)[reply]
    Done: [18].  — SMcCandlish ¢ 😼  03:13, 29 July 2018 (UTC)[reply]
    The thing is I'm talking here about an issue that affects many, many trans people, not just me. And for others it's often a more serious issue. It's not just to make me happy. And no, I wouldn't be assuaged by that change, but I would understand the intensely rigid resistance a little better and it would be having a different discussion about how to deal with the deadnaming issue in a different context Yndajas (talk) 03:17, 29 July 2018 (UTC)[reply]
    And the issues now are that (1) you're literally going off-template re: council election results pages in order to continue to justify deadnaming a trans person and (2) while the column and the cell now accurately correspond, you're still deadnaming a trans person, and we need another solution. If there's no solution that satisfies Wikipedia editors, then the information should just be removed altogether. Deadnaming is not a solution, no matter how accurate it is in context (which it now is, and previously wasn't, for clarity). Yndajas (talk) 03:25, 29 July 2018 (UTC)[reply]
    Your interpretation of what deadnaming means in the context of historical material is at odds with how the community interprets it (see three back-to-back and truly massive Village Pump RfCs 1, 2, 3; and WP:LGBT's own WP:TRANSNAME material which suggests nothing like retroactively changing names in historical material). Repeat: MoS is as it is on this point because of those RfCs. It will not change because you're angry. The only change that might happen (though probably will not) is a change to sourcing policy, and that cannot be effectuated here. Using a parameter of the template to do something the template is designed to do isn't going "off-template". You said yourself that the issue was the "Candidates" wording and that "Name on the ballot" would be different. I predicted that such a change would not actually satisfy you, and this prediction has proven correct.  — SMcCandlish ¢ 😼  04:11, 29 July 2018 (UTC)[reply]
    (1) It appears to be standard to use "Candidate" on these pages - maybe that's not what you call a template, but using "Name on the ballot" is still non-standard. So this is adopting a non-standard practice to justify deadnaming. (2) In saying "Candidate" and "Name on the ballot" are different with regards to historical accuracy, I was never suggesting it would be okay to deadname simply based on it being historically accurate. I'm not suggesting historical inaccuracy, but a solution that forces deadnaming is not a solution. Reverting to "candidate" - as is standard - and naming the candidate (not how they were referred to at the time) is a solution. Yndajas (talk) 12:31, 29 July 2018 (UTC)[reply]
    There is no formal standard in play there. We could present the same information as a list and without any templates at all. Repeating the same argument after it's already been addressed is "proof by assertion". Here is your own statement again, agreeing that "Name on the ballot" would just be historical information, but stating that "Candidate" makes it a claim about the person in the present. I implemented (in good faith though with skepticism) the solution you proposed, and now you're claiming I'm "adopting a non-standard practice to justify deadnaming" and "forcing deadnaming", which is a pretty crappy thing to say, and which reeks of WP:Gaming the consensus-building process (specifically point no. 2). This really needs to stop. For the fifth time: this venue cannot and will not give you what you want. You can make a case for it at WT:BLP.  — SMcCandlish ¢ 😼  13:16, 29 July 2018 (UTC)[reply]
  • We literally had a discussion about this 3 weeks ago. Do we need to have yet another one? EEng might need to start a counter for gender-related discussions. --Izno (talk) 02:35, 29 July 2018 (UTC)[reply]
    Who had this discussion? What was the outcome? Do you have a link? Yndajas (talk) 02:41, 29 July 2018 (UTC)[reply]
    See the archive box at the top of this page; the search feature in it (try "deadname" and "deadnaming") will find relevant discussions. The recent one was Wikipedia talk:Manual of Style/Archive 203#MOS:GENDERID.  — SMcCandlish ¢ 😼  03:05, 29 July 2018 (UTC)[reply]
Ad hominem commentary
  • That most recent one is from someone who is very clearly transphobic. An earlier thread in 2015 suggests a history of scepticism around the concept of deadnaming, SMcCandlish: "I know quite a few TG people, and only a small minority are into this "deadname" stuff and trying to erase their past"... Yndajas (talk) 03:42, 29 July 2018 (UTC)[reply]
    Except it's not transphobic, it's a statement about who the person knows and what their apparent opinions on the matter are (as interpreted by that person). "Doesn't agree with my viewpoint" doesn't translate to "transphobic". And this is not a venue to engage in name-calling because of the discretionary sanctions that apply here.  — SMcCandlish ¢ 😼  04:04, 29 July 2018 (UTC)[reply]
    As a trans person, I find this deeply transphobic: "I recommend that this guideline be revised to encourage using pronouns objectively based on someone's actual gender as opposed to their perceived, projected, or desired gender. Wikipedia is an encyclopedia, not an outlet for propagating the subjective delusions or preferences of biographical subjects." It's saying that how a person identifies is not their actual gender, and essentially implying that trans people are delusional and/or gender is a choice. It's not name-calling to call something what it is - which is ironically all I'm asking for in this thread - and to threaten sanctions for calling out transphobia is dangerous. Yndajas (talk) 12:31, 29 July 2018 (UTC)[reply]
    That poster was already indefinitely blocked as a troll; you're complaining about trolling that was already dealt with and which had no impact on the consensus discussion. This is not a dispute resolution venue anyway; if you find someone saying transphobic things, try WP:ANI. Next, it's fallacious handwaving to unjustifiably call someone names, get warned against doing so, switch to applying the same term to something that actually qualifies for the label, then complain about the notice as if the inappropriate labeling the first time didn't happen. Please see also WP:NOT#FORUM and WP:Talk page guidelines. This page is not for endless "sport debate" about TG-related or any other topics; it is for working on MoS, and asking how to apply it to a particular article.  — SMcCandlish ¢ 😼  13:16, 29 July 2018 (UTC)[reply]

This thread is off-topic here. This is not a style discussion, it's a demand to change our sourcing approach with regard to living persons. You should probably raise this at WT:BLP.  — SMcCandlish ¢ 😼  03:05, 29 July 2018 (UTC)[reply]

  • But this topic is about the Manual of Style's guidance on non-biographical pages? Yndajas (talk) 03:25, 29 July 2018 (UTC)[reply]
    • Which was set by three back-to-back community RfCs and isn't going to change because you don't like it. It cannot change to do what you want it to do, because that would directly conflict with our sourcing policies. This is not a venue that can change the sourcing policies.  — SMcCandlish ¢ 😼  04:15, 29 July 2018 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Italics advice is too scattered! (archived discussion from 2016 per hatnotes, plus ongoing discussion thwarting)

Discussion thwarting that prompted me to dig all this up: When I clicked on (Discuss) in the hatnote on Wikipedia:Manual of Style/Titles § Italics, the wikilink did not take me to the discussion section that was named after that # of the (Discuss)—even though the resulting URL in my location bar was correct—revealing once again an entrenched behavior that left me staring at a large box of archive page links. This on-going issue that pops up especially in the more active talk pages such as the Manual of Style (MOS) talk page must be improved.

It takes far too long to excavate the exact archive page where the original discussion section is located then bring the discussion back to light and to remember to repoint the (Discuss) wikilinks in the sets of hatnote templates, which does not encourage people to BEBOLD and discourages editor community involvement, especially amongst newbies or the exhaustingly disabled such as myself. I have seen that archive box so many times and have sighed and moved on ~99% of the time. Sorry that I can only dump this topic here, but my real-life limitations currently won't let me do anything else and something should really be done about this, perhaps in the template code in hatnotes that create the (Discuss) wikilinks. Thanks in advance to anyone who can follow-up on this.


Italics advice is too scattered: The 2016 hatnote Template:Merge portions to was added to Wikipedia:Manual of Style/Titles § Italics with the corresponding template added to the other page.

The |reason= parameter states:

We badly need to consolidate the scattered titles-related advice, and have a whole page for it. Only cross-reference and a summary of title-related material need be left here.

I just now appended to that parameter the following:

NOTE from July 2018 by Geekdiva: This proposal by User:SMcCandlish has had its discussion (without responses) archived at:
& a corresponding mention archived at:

Request for assistance: I have to ask someone reading this to repoint the (Discuss) wikilinks to the final discussion section. Due to what I mentioned in the first section, putting this here is the best my real-life limitations mean I can do at the moment. I can't even discuss it now and probably won't find my way back here! Heh. Thank you in advance for your time. —Geekdiva (talk) 11:40, 29 July 2018 (UTC)[reply]

@Geekdiva: That's a lot of text without a clearly discernible rationale. It looks like you just want merge tags on the MoS pages to point to current threads, but that you're not raising anything about any actual merge proposals. Is this correct? That's easy enough to do, though I'm not sure of the value of pointing to an old archive thread instead of to the current talk page where people can start a new one. To address that particular merge: most of the MOS:TITLES material has been consolidated. It's been an ongoing project. After doing a bunch of that and a bunch of MOS:BIO consolidation, I left it alone for a while in case people want to raise issues or objections. Since there haven't been any (other than some temporary confusion about the BIO merge from people who didn't notice its original merge discussion), I don't see a reason not to continue, starting with that particular bit at MOS:ITALICS.  — SMcCandlish ¢ 😼  14:20, 29 July 2018 (UTC)[reply]

wonderous or wondrous?

If it is sooth though seeketh, seeketh the soothsayer.

Re this edit, should it be wonderous or wondrous? See [19] and [20]. Also see [21][22],[23][24]. --Guy Macon (talk) 15:20, 29 July 2018 (UTC)[reply]

Wondrous is the normal spelling, as your links show. I wouldn't be surprised if wonderous is attested somewhere, but it's at the very least unusual.
So it appears that the diff you gave is a routine correction of a spelling mistake. I don't understand why you've brought it to this page. --Trovatore (talk) 19:28, 29 July 2018 (UTC)[reply]
  • OED lists wondrous as current, wonderous as 15th to 18th C. EEng 19:41, 29 July 2018 (UTC)[reply]
    Forsooth? Stonied, I am!  — SMcCandlish ¢ 😼  20:10, 3 August 2018 (UTC)[reply]

Block quotations and pull quotes again

Some while back, we decided to try removing all mention of pull quotes from MOS:QUOTES, and along with it went the advice to not abuse pull quote templates like {{Cquote}}, {{Rquote}}, {{Quote frame}} (the "giant quotation marks" and "quote-framing" stuff). Since then I've seen a rise in misuse of these decorative templates in articles (even aside from the fact that the then-extant cases were mostly not cleaned up – I replaced hundreds of them with {{quote}} but there are thousands more).

I think we should re-institute the advice, at least about the templates, and also cover non-templated attempts at decorative quotation formatting. E.g., I recently encountered <blockquote style="background:none; margin-right:5em; margin-left:0; border-left:solid 4px #ccc; padding:1.5em;"> being used to stylize all the block quotations in a GA:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

I'm hard-pressed to understand why anyone would think "grey vertical bar = block quotation", and this style is likely to conflict with other page elements, like images, in ugly ways (we have no way to control window width and content flow, so it is inevitable that this CSS gimmick is going to end up juxtaposed directly against a left-floated image for many readers). This decorative style is also wasting a significant amount of vertical space.

Either we need MoS to just discourage quotation décor flat-out, or we need to have an RfC to set a standard blockquote stylization, if the community somehow feels that normal block quotation style is faulty:

Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.

It's detrimental to site-wide "look and feel" consistency for people (who probably have no background in usability and accessibility) to go around making up idiosyncratic formatting and installing it in our articles. We might even address this as a more general matter (e.g. do not install rounded-corner tables because you think they look cool, etc. – except on your user page).

A side matter: We also need to advise to put reference citations before the block quote, or in {{Quote}}'s parameters for this; they should not be inside the quote, or they're incorrectly marked up as part of the original material rather than WP's meta-material.
 — SMcCandlish ¢ 😼  00:02, 3 August 2018 (UTC)[reply]
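
As a rough illustration of the citation-placement point above, here is a minimal wikitext sketch. The author, source details, and quoted text are placeholders, and it assumes {{quote}} accepts a |text= parameter; check the template documentation before relying on it.

```wikitext
<!-- Citation sits in the introductory sentence, outside the quoted material -->
As Smith argued in 1999:<ref>Smith, J. (1999). ''Example Book''. p. 12.</ref>
{{quote|text=Lorem ipsum dolor sit amet, consectetur adipiscing elit.}}

<!-- Discouraged: the citation is marked up as if it were part of the original text -->
{{quote|text=Lorem ipsum dolor sit amet, consectetur adipiscing elit.<ref>Smith, J. (1999). ''Example Book''. p. 12.</ref>}}
```

The first form keeps the reference as Wikipedia's meta-material rather than as part of the quotation itself.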

Boilerplate duplications in related articles

Where a large number of articles are very closely related and contain sections that are interchangeable, or nearly so, and duplicated across those articles, should Wikipedia encourage use of a boilerplate system instead of copying the same section over three hundred or more articles? I think the "template" system might end up being too rigid? Any suggestions? Can this be done? Or should we keep on cut-and-pasting such sections? Thanks! Collect (talk) 18:55, 3 August 2018 (UTC)[reply]

Example? Sounds like exactly what templates are for. We also now have sectional transclusion.  — SMcCandlish ¢ 😼  20:10, 3 August 2018 (UTC)[reply]
Sectional transclusion – is that one of those gender reassignment surgeries? I just can't keep up with these modern developments. EEng 01:14, 4 August 2018 (UTC)[reply]
Nah it's when the foetus is taken out by C-section, then put into the father, so he can lug it around for a while.  — SMcCandlish ¢ 😼  02:37, 4 August 2018 (UTC)[reply]
The cookie-cutter/templated approach isn’t wrong ... it is just a bit lazy in my mind. The important thing for me is that we also continue to allow articles to NOT follow the mold. We should not fall into the trap of thinking that all articles in the topic area have to follow the pattern just because most do. Blueboar (talk) 20:19, 3 August 2018 (UTC)[reply]
  • An example of such interchangeable sections? EEng 21:26, 3 August 2018 (UTC)[reply]
    It's a paragraph rather than a section, but one example is {{Johnson solid}}. I'm not convinced this is a good idea, but having it consolidated and properly sourced in one place like this is at least better than its creator's habit of spewing this material over hundreds of related articles by copy and paste without proper sourcing. —David Eppstein (talk) 21:29, 3 August 2018 (UTC)[reply]
    And this would be used in each of the 92 articles, or something? And that's really needed, instead of just saying, in each of the 92 articles, "A dimorphic protoplasm is a Johnson solid", and letting the reader click to learn what that is? EEng 21:34, 3 August 2018 (UTC)[reply]
    Is it really needed? I doubt it, but the editor who pushes this material is difficult to rein in, and as I said, tends to copying and pasting of sort-of-related material on lots of sort-of-related articles. At least this way the copyedits and sourcing get propagated to all copies at once. —David Eppstein (talk) 22:05, 3 August 2018 (UTC)[reply]
    Here's another example: Look at List of birds of Gibraltar. There's a paragraph after each family name that could be turned into a template. Similar paragraphs are used throughout the "List of birds of ..." series. There was once an effort to set them up that way, but it is not now used. @Basar: was involved in that and I think Template:Bird list header was part of it. The system had not been maintained and had only been partially implemented, so now each list has similar material, but of course individual editors have made changes and now there's a lot of variation. I'm not sure using templates would be an improvement, but that's an example.  SchreiberBike | ⌨  04:01, 4 August 2018 (UTC)[reply]
Also 461 cases for "qualified for the 1980 U.S. Olympic team but was unable to compete due to the 1980 Summer Olympics boycott. (She) did however receive one of 461 Congressional Gold Medals created especially for the spurned athletes." as one of the shorter examples. There are scads of other examples, of course, but I was unsure whether "template" results would make the material appear "of a piece" with the rest of the articles or simply make it stand out, as most templates now appear to do. Collect (talk) 14:25, 4 August 2018 (UTC)[reply]
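
For anyone unfamiliar with the sectional transclusion mentioned above, a minimal sketch follows. The page name and section label are hypothetical, and it assumes the labeled section transclusion parser function (#lst) is enabled on the wiki.

```wikitext
<!-- On the source page (hypothetical title: "Shared boilerplate") -->
<section begin=boycott />Text that should appear identically in every related article.<section end=boycott />

<!-- In each article that reuses the paragraph -->
{{#lst:Shared boilerplate|boycott}}
```

Edits to the marked section would then propagate to every article that transcludes it, while any article that should not follow the mold can simply omit the call and keep its own wording.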

Plural(s)

I wish to draw your attention to a problem of plural(s). Many editors and even the manual of style have difficulty with the plural(s) being used to describe group(s) that possibly contain one or more item(s). (Reference(s) is another common case.)

The two examples currently in the MOS are fragment(s) and article(s). In both cases I would suggest it would be clearer and simpler to use the plural and not include parentheses at all.

I'm overdoing it here for emphasis, but it reduces readability as it gives readers something extra to process (and then almost always ignore). (Editors should in general ask themselves if phrases in parentheses are worth including at all.) Try to read a sentence out loud: if there is too much or too little punctuation, it takes just that little more thought about what to say or not say. Then comes the issue of accessibility: anything that reduces readability is inevitably even worse when read by a screen reader. A screen reader will generally ignore the extra punctuation, or sometimes will pause, and if set to a verbose reading mode will read all the punctuation in pedantically painful detail.

On the basis that this extra punctuation is of no benefit and actively reduces readability, I suggest updating the MOS to discourage the practice, or at least force editors to justify any special cases where it may be necessary. There are different ways to look at it, as a question of plurals, punctuation, or use of parentheses, so I defer to your judgement as to the best place to warn against this well-intentioned but misguided overuse of punctuation. -- 109.77.248.175 (talk) 20:08, 4 August 2018 (UTC)[reply]

"Respectable" results

The phrase "X finished in 'a respectable' Yth position" seems to me to be un-encyclopedic. "Respectability" is a subjective term. Thoughts? Bogger (talk) 14:36, 7 August 2018 (UTC)[reply]

It's subjective and it doesn't really add information to the sentence, so unless it's from paraphrasing a source I'd say it was MOS:IDIOM that was better avoided. Do you think it might be covered under different rules or that it should be added somewhere in particular? -- 109.79.181.42 (talk) 16:52, 7 August 2018 (UTC)[reply]