Wikipedia talk:Manual of Style/Lead section: Difference between revisions

m Archiving 2 discussion(s) to Wikipedia talk:Manual of Style/Lead section/Archive 23) (bot
:::::I think in many cases it would suffice to have a reliable source that backs the general pronunciation rules regarding foreign names that English speakers may even have no idea how to pronounce and make a note that it is just the general language guidance of how to pronounce. This would be more helpful to the project than requiring a reliable source for each specific person's name and how the person pronounces it, instead of the name in general. Although of course if there are such sources, they could be preferred.
:::::Examples being [[List of Indian monarchs]], [[List of heads of state of Mexico]], [[Mayor of Paris#List of officeholders]]. <span style="border-radius:8em;padding:0 7px;background:orange">[[User:Thinker78|<span style="color:white">'''Thinker78'''</span>]]</span> [[User talk:Thinker78|(talk)]] 19:43, 23 January 2024 (UTC)

== wp:lede deletions ==

Hello. An editor has repeatedly deleted from a lede the description of what is indeed the subject of most of the article, and is the most notable aspect of the subject's bio. (After failing - in discussion with other editors - to have the article changed to be simply a redirect.) Curiously, the editor is citing mos:lede as a rationale. Can someone please join the discussion [https://en.wikipedia.org/w/index.php?title=Patsy_Widakuswara&diff=prev&oldid=1206883375 here]? Thanks. [[Special:Contributions/2603:7000:2101:AA00:95FD:29F8:EB8A:7855|2603:7000:2101:AA00:95FD:29F8:EB8A:7855]] ([[User talk:2603:7000:2101:AA00:95FD:29F8:EB8A:7855|talk]]) 20:49, 13 February 2024 (UTC)

Revision as of 20:49, 13 February 2024

Manual of Style
This page falls within the scope of the Wikipedia:Manual of Style, a collaborative effort focused on enhancing clarity, consistency, and cohesiveness across the Manual of Style (MoS) guidelines by addressing inconsistencies, refining language, and integrating guidance effectively.
This page falls under the contentious topics procedure and is given additional attention, as it is closely associated with the English Wikipedia Manual of Style and the article titles policy. Both areas are known to be subjects of debate.
Contributors are urged to review the awareness criteria carefully and exercise caution when editing.
For information on Wikipedia's approach to the establishment of new policies and guidelines, refer to WP:PROPOSAL. Additionally, guidance on how to contribute to the development and revision of Wikipedia's policy and guideline documents is available, offering valuable insights and recommendations.

MOS:BIOFIRSTSENTENCE

I have a question that came up in a recent close review for which I never received a definitive answer. The closer argued that an occupation could not be mentioned in the first sentence per MOS:BIOFIRSTSENTENCE (also mentioned in MOS:FIRSTBIO). The reasoning was that the occupation was contentious. I don't want to relitigate that close or that article, but I'd like to see some clarification on MOS:BIOFIRSTSENTENCE. Can an occupation be a contentious, value-laden label? Nemov (talk) 13:08, 20 October 2023 (UTC)[reply]

Not sure why the OP is being cagey about it, but this is about the term "journalist" being used.  — SMcCandlish ¢ 😼  19:02, 20 October 2023 (UTC)[reply]
  • Any claimed fact can be controversial, the more so the less it is supported in independent reliable sources.  — SMcCandlish ¢ 😼  19:03, 20 October 2023 (UTC)[reply]
    First of all, what is cagey about asking for clarification? I would remind you to assume good faith. Journalist is an occupation. It wouldn't matter if the occupation is a plumber. My reading of MOS:BIOFIRSTSENTENCE is for value-laden terms. It seems by your interpretation if there's any argument/controversy with an occupation then per MOS:BIOFIRSTSENTENCE it cannot be in the first sentence. That is fine, but then I would suggest wording MOS:BIOFIRSTSENTENCE to be clearer in that regard. Nemov (talk) 19:23, 20 October 2023 (UTC)[reply]
    I didn't imply anything faith-wise, I'm just observing that you posted an over-generally phrased question without providing sufficient detail, making us go dig the detail out on our own. Not terribly helpful. Anyway, I don't see anything unclear about BIOFIRSTSENTENCE. If I were notable and I claimed to be a licensed plumber as well as a writer and an IT consultant, WP should not just include the plumber claim without independent sourcing. More to the point of this particular case, sources appear to be in disagreement about whether what Ngo does is journalism, which is probably a more serious matter than not finding any sources that address whether he's a journalist at all (i.e., there is no WP:ABOUTSELF wiggle room to even contemplate). Anyway, when there's a conflict in what sources are saying, this becomes a WP:DUE policy matter and has nothing to do with style guidelines. Or to put it another way, what BIOFIRSTSENTENCE says is simply not relevant when the claimed occupation is disputed, because it is not a style question of any kind but a fact-establishment content question. We establish the facts first, and decide how to style them after the fact.  — SMcCandlish ¢ 😼  19:58, 20 October 2023 (UTC)[reply]
    I appreciate you spending time to respond to this, but I'm not here to debate Ngo. I'm curious about the application of MOS:BIOFIRSTSENTENCE because that's what the closer used to justify the action. The closer said[[1]] that per MOS:BIOFIRSTSENTENCE, we ought to omit labels that are contentious in the first sentence of the lead. Debates about the lead sentence come up a lot in biography discussions. If it's just a bad justification on the closer's part that's fine, they could have justified it differently. Nemov (talk) 20:28, 20 October 2023 (UTC)[reply]
    I don't really see a problem. BIOFIRSTSENTENCE has language that is clearly moderated to comply with NPOV and related policies: the opening paragraph of a biographical article should neutrally describe the person .... One, or possibly more, noteworthy positions, activities, or roles that the person held, avoiding subjective or contentious terms. If Ngo's claimed status as a "journalist" is disputed in the RS material, then it's not a neutral description and is subjective and contentious. It might have been a more solid close to cite NPOV policy directly. But I don't think it matters much, since the close was correct either way, and the P&G pages that are applicable are not in conflict, so which one was cited isn't very important. There are lots of pages here that re-state a rule from another page in summary form, and this generally isn't a problem (as long as a WP:POLICYFORK doesn't develop over time).  — SMcCandlish ¢ 😼  21:04, 20 October 2023 (UTC)[reply]
    Again, I'm not challenging the close and this isn't about Ngo. My question could be any article. Can an occupation be rejected on BIOFIRSTSENTENCE/WP:CONTENTIOUS as a value-laden label? That's how this was justified. I believe the closer got to the right answer as you have pointed out, but should have said the occupation was "subjective" instead of claiming it's value-laden. Nemov (talk) 21:25, 20 October 2023 (UTC)[reply]
    In short, "yes". If you agree with the close in the first place, what is prompting you to ask the question? "Value-laden" (i.e. subjective or contentious and not neutral) is just as valid a rationale as a bare "subjective".  — SMcCandlish ¢ 😼  21:28, 20 October 2023 (UTC)[reply]
    I'm asking the question because to claim an occupation is a value-laden term doesn't match what's written at WP:CONTENTIOUS. I would recommend amending it if that's how editors want to interpret it. As Aquillion mentions below, "anything can be controversial." So what's the point in having the distinction in the first place? Nemov (talk) 21:54, 20 October 2023 (UTC)[reply]
Anything can be controversial if the sources present it as controversial; anything can be value-laden depending on the context. If there is clear disagreement among sources of comparable weight about something, then we can't state it as a fact in the article voice, per WP:NPOV (Avoid stating seriously contested assertions as facts.) I don't think there is a meaningful distinction between contentious, contested, and value-laden - they're all different ways of saying "do non-opinion sources generally agree on this and state it as uncontested fact." If high-quality non-opinion sources agree on something and state it as fact, then it is uncontested, uncontentious, and not value-laden; and likewise, if there's disagreement among them or they state it in a plainly skeptical manner, then it should be treated as contested, contentious, and value-laden. That's the only threshold that matters - how editors feel about a term doesn't come into it; nor is there some sort of list of "verboten" terms or anything like that. (That said, it should be easy to see that some professions are more prestigious than others, and especially for people in media-related fields can carry value judgments about the value and veracity of their work, as well as their overall methods and intent. Whether or not someone is described using those terms can therefore become a value-laden judgment, so it's unsurprising that there would be cases where the sources would conflict or treat them as controversial. "Propagandist" is also a profession; do you think we could use it in the lead sentence of a bio when there's disagreement over them among the highest-quality sources? What about "prostitute?") --Aquillion (talk) 21:47, 20 October 2023 (UTC)[reply]
@Nemov, I hope that the original dispute is long settled and nearly forgotten, but I wanted to circle back to this idea that an occupation is a value-laden term. This is probably not helpful (meaning: practical) language for discussions. A bona fide occupation (e.g., butcher, baker, candlestick maker) is not a value-laden term. But:
  • Some things people do to produce money, or to keep themselves occupied, involve activities that can be described in ways that tend to express an opinion or judgement about the person's activities (an oppressed prostitute, or an empowered sex worker? a professional gambler, or a gambling addict? a business owner or a crime boss? a terrorist or a freedom fighter or a mercenary?). These are sometimes value-laden terms.
  • The right of certain individuals to claim certain careers may be in doubt (e.g., an author who's never been published, a consultant with no clients, a politician who has lost every election...).
Disputes about whether someone should be called a journalist don't really involve "value-laden terms" per se. Instead, the question is what it means for someone to be a journalist, and whether the person really is one. WhatamIdoing (talk) 05:47, 15 November 2023 (UTC)[reply]

There is a dispute regarding whether it is DUE to mention A Haunting in Venice, a film adaptation, in the lead of Hallowe'en Party, its source material. TL;DR, proponents argue that the film is the most notable among the handful of adaptations, as evidenced by the fact that it is the only one to have a standalone article and that it has the most WP:SIGCOV; opponents argue that all of the adaptations are equally notable and it is therefore not appropriate to single out the film in the lead. You are invited to weigh in, thanks. InfiniteNexus (talk) 00:06, 31 October 2023 (UTC)[reply]

Note: Started an RfC about this, see Talk:Hallowe'en Party#RfC on mention of film adaptation in the lead. Thanks. InfiniteNexus (talk) 00:04, 19 November 2023 (UTC)[reply]

A wording dispute about technical material

We have this presently:

Make the lead section accessible to as broad an audience as possible. In general, introduce useful abbreviations but avoid difficult-to-understand terminology, symbols, mathematical equations and formulas. Where uncommon terms are essential, they should be placed in context, linked, and briefly defined. The subject should be placed in a context familiar to a normal reader. For example, it is better to describe the location of a town with reference to an area or larger place than with coordinates. Readers should not be dropped into the middle of the subject from the first word; they should be eased into it.

This was recently changed to the following (with the change annotated here like this, for visual clarity):

Make the lead section accessible to as broad an audience as possible. In general, introduce useful abbreviations but avoid difficult-to-understand terminology, symbols, mathematical equations and formulas where such usage would conflict with the goal of making the article as accessible to as wide an audience as possible. Where uncommon terms are essential, they should be placed in context, linked, and briefly defined. The subject should be placed in a context familiar to a normal reader. For example, it is better to describe the location of a town with reference to an area or larger place than with coordinates. Readers should not be dropped into the middle of the subject from the first word; they should be eased into it.

The rationale for the addition was "put back in original wording here. We actually have articles *about* equations and other highly technical subjects. The guideline should not be read as excluding these from the lede." The rationale for the reversion was "No such goal - not the original text".

I'm not inclined to dig back through page history to determine when such wording was added the first time, by whom, or for what rationale. It's more sensible to just discuss whether we think such wording would be appropriate to have here.  — SMcCandlish ¢ 😼  20:15, 16 November 2023 (UTC)[reply]

I agree with the rationale, but I'm not sure the underlined text itself expresses the point very well. To paraphrase the given rationale, excluding an equation from an article about that equation would be perverse.
Furthermore, I'm a bit puzzled as to why abbreviations are singled out as acceptable. Why is it fair to introduce a "useful abbreviation" but not a useful symbol? The same "be careful about introducing the unfamiliar" ethos should apply across the board. XOR'easter (talk) 20:46, 16 November 2023 (UTC)[reply]
I'm not certain as to what is meant by "useful". My reading has been that it is okay to introduce an abbreviation and then use it to avoid repeating a long phrase. Hawkeye7 (discuss) 21:05, 16 November 2023 (UTC)[reply]
That sounds fair, in principle. My concern is that by the same token, one should be able to introduce a symbol and then use it to avoid repeating its definition or otherwise spilling a lot of words. XOR'easter (talk) 21:10, 16 November 2023 (UTC)[reply]
The original was inserted here. To me the additional sentence is not only repetitive, but demands a "goal" that we do not have and which conflicts with our mission. Our goal is to construct an encyclopaedia. Some articles will, of their nature, be quite specialised and of interest only to the specialist reader. Difficult-to-understand mathematical formulae and the like are absolutely essential in an article where that is the subject. @Tito Omburo: Hawkeye7 (discuss) 21:02, 16 November 2023 (UTC)[reply]
My take is that whether "equations" are used is a fairly blunt proxy for comprehensibility. Some of the most impenetrable introductory paragraphs in math articles are written entirely in prose, whereas is something that almost anyone can understand, but still might be less than beautiful in the first sentence.
So we should consider the issue of mathematical notation somewhat separately from the broader question of how to present technical material to readers who may not quite have the background for it.
I think it is reasonable to say, not as a hard rule but as a general stylistic preference, that mathematical notation should usually be avoided in the lead sentence, and maybe even the lead paragraph, except in cases where the article is specifically about an equation or similar formal entity (quadratic formula, Pell equation for example).
Note that this is specifically an aesthetic consideration; it is not really about comprehensibility. --Trovatore (talk) 21:23, 16 November 2023 (UTC)[reply]

This isn't in any way about "excluding an equation from an article"; it is only about lead sections of articles, and this discussion is going to be needlessly heated and increasingly nonsensical unless this distinction is understood and maintained.  — SMcCandlish ¢ 😼  21:29, 16 November 2023 (UTC)[reply]

The {{od}} template fails to make it completely clear whom you're responding to. From the content, I think you're responding to Hawkeye7, is that correct? --Trovatore (talk) 21:36, 16 November 2023 (UTC) [reply]
Sorry for accidentally eliding that above (I meant to include "the intro of" and didn't notice I had omitted it until re-reading just now). But... excluding an equation from the lead of an article about that equation is still absurd. I really doubt that anyone wants to remove the illustration of clefs from the lead of Clef. That's what the page is about; removing it in the name of "clarity" would rightly be seen as backwards. XOR'easter (talk) 21:41, 16 November 2023 (UTC)[reply]
Well, I was quoting a particular editor clearly, but making a general point: if this discussion gets mired in "what should be in the article at all" instead of remaining focused on "what should be in the lead section" then it's not going to go anywhere useful. This has implications for other statements above, like "consider the issue of mathematical notation somewhat separately from the broader question of how to present technical material". "In the lead" makes a big difference here, in what such a consideration would entail.  — SMcCandlish ¢ 😼  21:59, 16 November 2023 (UTC)[reply]
Another concern: a bajillion times over the years, I've said some variation of "the lead is meant to summarize the body". When a subject is notation-intense, omitting that notation from the lead entirely could well make for a defective summary. (I first wrote "dishonest summary", but that could be construed as implying ill intent.) Now, depending on the subject, it might still make sense to exclude fancy notation from the opening line, or from the first paragraph. The pragmatic choice will depend upon the topic, the length of the article, and the plausible intended audience. XOR'easter (talk) 22:12, 16 November 2023 (UTC)[reply]
I'm having trouble reading the changed text. Can we instead say

In general, introduce useful abbreviations but avoid difficult-to-understand terminology, symbols, mathematical equations and formulas <s>where</s> unless such usage would conflict with the goal of making the article as accessible to as wide an audience as possible.

Even with this change, it's still hard to understand. The pile-on wording of "do this avoid that unless the first thing" is convoluted. 67.198.37.16 (talk) 22:22, 16 November 2023 (UTC)[reply]
BTW, this change would address XOR'easter's concern: if some highly technical article requires some exceptional lead with unusual wording, it's allowed, because that would meet "the goal of making the article as accessible to as wide an audience as possible." 67.198.37.16 (talk) 22:29, 16 November 2023 (UTC)[reply]
"Avoid unless it would conflict with the goal of making the article accessible" means not using this technical content in the cases where it is needed, but allowing this content in cases where it is unnecessary. You are changing the meaning to the opposite of what it should be. —David Eppstein (talk) 22:38, 16 November 2023 (UTC)[reply]
Yeah, there seems to be a series-of-negatives problem going on.  — SMcCandlish ¢ 😼  22:42, 16 November 2023 (UTC)[reply]
Thus illustrating that technical content may be a sufficient condition for producing difficult-to-read text, but it is not a necessary condition. —David Eppstein (talk) 22:52, 16 November 2023 (UTC)[reply]
I mean look, even journal articles ordinarily start with prose. It's not really about comprehension. It just doesn't look nice to start with symbols; it looks like you've wandered into somebody's notes rather than a polished article. I think we can make some such point, maybe not for the whole lead section, but at least for the lead paragraph.
And yes, there does need to be an exception for articles that are specifically about an equation or other symbolic entity. --Trovatore (talk) 23:18, 16 November 2023 (UTC)[reply]
I think that is broadly true, but not universally true. Abstracts and opening paragraphs of journal articles do break out the notation if it's sufficiently well-established in their fields that they don't have to define it first. "Our algorithm runs in time", etc. There are terms invented by incorporating notation into words: ∞-category, for example (and heck, ∞-groupoid has to have the symbol in the article title!). XOR'easter (talk) 02:22, 17 November 2023 (UTC)[reply]
I said "ordinarily". Yes, there will be exceptions, but as a general rule, it's probably better to defer heavy symbol usage to, at least, the second paragraph. I think it just looks nicer. I don't have a deeper reason than that, at least not that I've analyzed well enough to elucidate. --Trovatore (talk) 03:19, 17 November 2023 (UTC)[reply]

The original was changed with this very misleadingly summarized and undiscussed edit, which clearly changed the meaning to an injunction against including equations in the lede. This leads to all sorts of perverse problems, as editors who have experience editing mathematics and technical subjects have already remarked. Tito Omburo (talk) 23:20, 16 November 2023 (UTC)[reply]

Also, I am confused by the "no such goal" edit summary. If it is not a goal of this guideline to eliminate mathematics from lede sections of articles on mathematics, perhaps it makes sense to cut out the proscription on equations altogether. Seems like a classic case of WP:BEANS. Tito Omburo (talk) 23:24, 16 November 2023 (UTC)[reply]

As someone trying (perhaps poorly) to just facilitate the discussion happening, without "having a dog in the fight", I want to suggest that several of you continuing to edit the pertinent material in the guideline page back and forth while the discussion is going on kind of defeats the purpose of the discussion, which is to come to some consensus about what that material should say and why.  — SMcCandlish ¢ 😼  23:35, 16 November 2023 (UTC)[reply]

Listing large US cities by state in broadcasting article leads

I've had this come up in an FAC (Wikipedia:Featured article candidates/WSNS-TV/archive1) and wanted some clarity on the topic. Some broadcasting articles are on stations located in and licensed to very large, undisambiguated-title-by-state-per-AP Stylebook US cities. Which of these should be preferred?

Pinging for visibility: Mvcg66b3r and MaranoFan. Sorry for double pings, but SMcCandlish asked me to move this over. Sammi Brie (she/her • tc) 04:03, 19 November 2023 (UTC)[reply]

  • My personal take on this would be to either link the city to its article, or give the long version, but not both, and prefer the former if there's an infobox that gives the long version. Even a lot of major city names are technically ambiguous (cf. San Francisco (disambiguation)). The rationale for linking would be that, while we don't normally link this class of major metro cities when they are mentioned in passing (e.g. in "Smith moved to Chicago in 2014"), in the lead of an article about a radio/TV station, the market it serves is directly pertinent to fully understanding the subject, so the link is justified.  — SMcCandlish ¢ 😼  04:15, 19 November 2023 (UTC)[reply]
    @SMcCandlish, I should have linked them in the examples above, but normally, they are. Examples revised. Sammi Brie (she/her • tc) 04:21, 19 November 2023 (UTC)[reply]
    In that case, I would just go with San Francisco, though even San Francisco, California would be preferable to San Francisco, California, United States. We generally don't put "United States" after a US state name, except sometimes in infoboxes (for no reason I've ever seen articulated). The usual presumption that people know where and what San Francisco or Chicago are goes double for entire US states.  — SMcCandlish ¢ 😼  04:41, 19 November 2023 (UTC)[reply]
    @SMcCandlish Nikkimaria has gone at me for not having country mentions in articles before. (The relevant infoboxes have a country field.) Sammi Brie (she/her • tc) 04:57, 19 November 2023 (UTC)[reply]
    Well, I guess it's good to have a general discussion then and come to a clearer consensus about what to do.  — SMcCandlish ¢ 😼  05:28, 19 November 2023 (UTC)[reply]
  • Option A. It's been my take that the ledes in US TV station articles have long been problematic, in that they are overlinked, leading to WP:SEAOFBLUE. For example: WFTY-DT; "It is owned by TelevisaUnivision alongside Newark, New Jersey–licensed UniMás co-flagship WFUT-DT (channel 68) and Paterson, New Jersey–licensed Univision co-flagship WXTV-DT (channel 41)". I feel the excess verbiage could be removed, without lessening the information in the lede. Example: "It is owned by TelevisaUnivision alongside Newark, New Jersey–licensed WFUT-DT (channel 68) and Paterson, New Jersey–licensed WXTV-DT (channel 41)", conveys the same information and allows the reader to choose whether or not they want to click on the wiki-link for more information regarding the sister stations. - BlueboyLINY (talk) 20:41, 19 November 2023 (UTC)[reply]
    @BlueboyLINY I've been very attuned to SEAOFBLUE issues in our leads. We have a lot of kludgy lead paragraphs in our topic. Our other issue is that, generally, only articles I've improved have adequate summary leads of their contents. Sammi Brie (she/her • tc) 21:24, 19 November 2023 (UTC)[reply]
    BlueboyLINY, it is not clear from your example text what you think should and shouldn't be linked, since you didn't include any links in any of it. This makes your rationale and your desired outcome rather hard to understand.  — SMcCandlish ¢ 😼  21:32, 19 November 2023 (UTC)[reply]
    This is the relevant excerpt with links:

    It is owned by TelevisaUnivision alongside Newark, New Jersey–licensed UniMás co-flagship WFUT-DT (channel 68) and Paterson, New Jersey–licensed Univision co-flagship WXTV-DT (channel 41), which WFTY simulcasts on its respective second and third digital subchannels.

    Sammi Brie (she/her • tc) 21:43, 19 November 2023 (UTC)[reply]
    And this is the excerpt with links I feel are unnecessary removed:

    It is owned by TelevisaUnivision alongside Newark, New Jersey–licensed WFUT-DT (channel 68) and Paterson, New Jersey–licensed WXTV-DT (channel 41), which WFTY simulcasts on its respective second and third digital subchannels.

    - BlueboyLINY (talk) 03:00, 20 November 2023 (UTC)[reply]
    I would rewrite as:
    '''WFTY-DT''' (channel 67) is a [[television station]] licensed to [[Smithtown (CDP), New York|Smithtown, New York]], United States, serving [[Long Island]] as an affiliate of the [[True Crime Network]]. It is owned by [[TelevisaUnivision]] alongside [[Newark, New Jersey]]–licensed [[UniMás]] [[Flagship (broadcasting)|co-flagship]] [[WFUT-DT]] (channel 68) and [[Paterson, New Jersey]]–licensed [[Univision]] co-flagship [[WXTV-DT]] (channel 41), which WFTY [[Simulcast|simulcasts]]…
    +
    '''WFTY-DT''' (channel 67) is an American [[television station]] licensed to [[Smithtown (CDP), New York|Smithtown, New York]], serving [[Long Island]] as an affiliate of the [[True Crime Network]]. It is owned by [[TelevisaUnivision]] alongside [[Newark, New Jersey|Newark]]–licensed [[UniMás]] [[Flagship (broadcasting)|co-flagship]] [[WFUT-DT]] (channel 68) and [[Paterson, New Jersey|Paterson]]–licensed [[Univision]] co-flagship [[WXTV-DT]] (channel 41), which WFTY [[Simulcast|simulcasts]]…
    — HTGS (talk) 22:14, 23 November 2023 (UTC)[reply]
  • While we are the English Wikipedia, I think it stands to reason that a vast majority knows that San Francisco is a city in the US. So, I would go with Option A. Where necessary, like with New York City, I would just use a piped link. - NeutralhomerTalk • 13:56, 20 November 2023 (UTC)[reply]

FA numbers

@Femke, the numbers of sentences in FAs was based on a non-random sample of 61 articles. I looked again at the first 10, using the specific version that was promoted. Here are the numbers for each (words/sentences):

  • 361/16
  • 391/15
  • 232/9
  • 244/9
  • 399/19
  • 245/10
  • 334/15
  • 361/15
  • 343/16
  • 137/7

The range for sentences is 7 to 19, and if you exclude the most extreme, it's either 9 to 16 or 10 to 15 sentences per lead.

The mean word count is 305 words per lead, with most of them falling either around 250 or (a little more frequently) 350.

The words per sentence count has a range of 20 to 27, with a mean of 23.

This is similar to the numbers for last December's TFAs, which you can find here. I suggest that instead of raising the number of sentences per lead to 12, which is less accurate, you consider changing the "300 words" to a range (e.g., "250 to 350 words"). WhatamIdoing (talk) 06:17, 27 November 2023 (UTC)[reply]
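The summary figures quoted in this comment can be rechecked from the ten words/sentences pairs listed in it. The following is only a minimal sketch: the variable names and the `trimmed_range` helper are illustrative assumptions, and rounding to whole numbers is assumed to match how the figures were reported.

```python
from statistics import mean

# (words, sentences) pairs for the ten promoted leads listed in the comment
leads = [(361, 16), (391, 15), (232, 9), (244, 9), (399, 19),
         (245, 10), (334, 15), (361, 15), (343, 16), (137, 7)]

words = [w for w, _ in leads]
sentences = [s for _, s in leads]
words_per_sentence = [w / s for w, s in leads]


def trimmed_range(values):
    """Range after dropping the single smallest and single largest value
    (hypothetical helper, one reading of 'exclude the most extreme')."""
    ordered = sorted(values)[1:-1]
    return ordered[0], ordered[-1]


summary = {
    "sentence_range": (min(sentences), max(sentences)),
    "sentence_range_trimmed": trimmed_range(sentences),
    "mean_words": round(mean(words)),
    "wps_range": (round(min(words_per_sentence)),
                  round(max(words_per_sentence))),
    "mean_wps": round(mean(words_per_sentence)),
}
print(summary)
```

Under these assumptions the sketch reproduces the 7 to 19 sentence range (9 to 16 after trimming), the mean of about 305 words, and the 20 to 27 words-per-sentence range with a mean of 23.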

Thanks for giving the background here. I'm happy with the recent change by Tpbradbury to 200-400 words.
I was surprised to see the combo 10 sentences for 300 words, as that would imply an average 30 words per sentence, above the maximum length (not maximum average) of 25 words in a sentence the UK government uses to assure readability.
In the non-random sample of 61 articles TFA list, the median sentence length is 21 words, which comes closer to what I expect, even though there are two outliers with 30+ words. So for the 200-400, a rough number of sentences would be 10 to 20, taking that median and rounding for simplicity. Happy for that to be added, or for the number of sentences to be omitted altogether. It's a bit odd to have a small range of sentences with a wider length range. —Femke 🐦 (talk) 17:35, 27 November 2023 (UTC)[reply]
I think you assumed that the smaller number of sentences (10) had the same number of words (300) as the larger number of sentences (15). In practice, 10-sentence leads tend to have 230–260 words in them, and 15-sentence leads tend to have 300–350 words in them.
We should probably change the "200" to 250. Only a small percentage of FAs have leads as short as 200. WhatamIdoing (talk) 17:43, 27 November 2023 (UTC)[reply]
That's indeed how I read the 10 sentences vs 300 words.
You're right it's a bit asymmetric: of the same 61 articles TFA sample, the 10% percentile is 193, 20% is 232, 50% is 282, 80% is 399 and 90% is 446. So a range of 250 to 400 words makes sense if we round to the nearest 50 and take the 20% and 80% percentile.
If we take the same percentiles (20% and 80%), we'd get 10 to 18 sentences. The alternative calculation of dividing word count by median sentence length gives us 11 to 19 sentences, which doesn't feel nicely rounded off. —Femke 🐦 (talk) 17:54, 27 November 2023 (UTC)[reply]
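For anyone wanting to check the arithmetic in this exchange, here is a minimal Python sketch. It uses only figures quoted above (the 250 to 400 word range, the median of 21 words per sentence, and the pairing of 10 sentences with 300 words). The helper name `sentences_for` is made up for illustration and does not come from any tool used in the discussion.

```python
# Figures quoted in the thread; sentences_for is a hypothetical helper.
MEDIAN_WORDS_PER_SENTENCE = 21  # median from the 61-article TFA sample


def sentences_for(word_count, words_per_sentence=MEDIAN_WORDS_PER_SENTENCE):
    """Estimate how many sentences a lead of word_count words contains."""
    return word_count / words_per_sentence


low = sentences_for(250)   # lower end of the proposed word range
high = sentences_for(400)  # upper end of the proposed word range

# Words per sentence implied by pairing 10 sentences with 300 words.
implied_length = 300 / 10

print(f"{low:.1f} to {high:.1f} sentences")
print(f"{implied_length:.0f} words per sentence")
```

Truncating the two quotients gives the 11 to 19 mentioned above; rounding to the nearest whole number would give 12 to 19 instead.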
10 to 18 sentences sounds good. Gawaon (talk) 19:22, 27 November 2023 (UTC)[reply]
Those numbers (250–400 and 10–18) also match the December 2022 counts (excluding the smallest 6 and the largest 6). WhatamIdoing (talk) 19:34, 27 November 2023 (UTC)[reply]
I don't think it's self-evident that this is the appropriate way to come up with the figures (perhaps we should take a look at articles promoted in the last two, three, or five years to get a reasonably large sample without getting too many articles that do not reflect current best practices? Perhaps we should take into consideration that certain kinds of articles are likely to be over-/underrepresented among featured articles and have on average shorter/longer leads than others? Perhaps we should not be looking at absolute word counts but relative ones?), but more importantly I think this is taking an overly quantitative approach to an issue that is inherently mostly qualitative. The most important thing is that the figures only be used descriptively, not prescriptively. Being somewhat fuzzy about it (such as by using a range, and preferably a fairly broad one) helps here. TompaDompa (talk) 00:33, 28 November 2023 (UTC)[reply]
The counts can only be determined by hand; feel free to pick your own set and count them yourself. Another dataset will do no harm.
This information is already presented in a strictly descriptive manner: "Most Featured articles have a lead length of..." – not "You must" or "You should". WhatamIdoing (talk) 01:11, 28 November 2023 (UTC)[reply]
What is the aim of including these numbers (which will be misapplied)? What are we gaining? SandyGeorgia (Talk) 01:15, 28 November 2023 (UTC)[reply]
I, for one, find them useful guidance. Gawaon (talk) 03:10, 28 November 2023 (UTC)[reply]
I tend to agree that this is mostly a qualitative not quantitative matter, and that these numbers will be misapplied, but maybe we still need something like them anyway, perhaps with a strong statement that these are just a ballpark estimate and not grounds for forcing a split or deletion of material. PS: "10 to 18" is also weird to me, oddly arbitrary. It would make more sense as "10 to 20", without having much effect on the other numbers.  — SMcCandlish ¢ 😼  04:22, 28 November 2023 (UTC)[reply]
Well, that's why I rounded down the first time, but 10 to 18 is more accurate.
I'd much rather have word and sentence counts than paragraph counts. This page has made length-related suggestions since the first version in 2004, when length considerations took up half the page. Whatever benefit we thought we were getting by creating this page in the first place, IMO we'll get those benefits plus greater clarity by replacing the suggested paragraph count with a suggested word or sentence range. WhatamIdoing (talk) 05:00, 28 November 2023 (UTC)[reply]
Why "replacing"? Both are currently there, and it makes a lot of sense to have both (or rather, all three: paragraphs, sentences, and words). Gawaon (talk) 07:32, 28 November 2023 (UTC)[reply]
The problem we've had (for years) with the paragraph counts is that people say "Oh, five paragraphs is too many – this lead is too long – look, I removed a line break, and now it is the right length!" Or "One paragraph is too little – this lead is too short – look, I pressed Return in the middle of the paragraph to make two single-sentence paragraphs instead of one two-sentence paragraph, and now it is the right length!" WhatamIdoing (talk) 16:03, 28 November 2023 (UTC)[reply]
I agree that "10 to 18 is more accurate". Rounding it up to 20 would feel arbitrary, as would rounding down the lower word count to 200 (as it has been for a short while, meanwhile reverted). Gawaon (talk) 07:34, 28 November 2023 (UTC)[reply]
The page is going WP:CREEPy; we still have no evidence that "Most Featured articles have a lead length of about three paragraphs, containing 12 to 15 sentences, or 250–400 words", as that was apparently based on one sample of TFAs for one month (according to the discussion at WP:SIZE, and why is this being discussed in two different places), and it could reflect a skew towards certain kinds of articles that are over-represented at WP:FA (eg hurricanes). We shouldn't be imposing stuff on a guideline that editors will misinterpret (because they always do), and we can't make generalizations like this about FAs without considering the topic. SandyGeorgia (Talk) 07:43, 28 November 2023 (UTC)[reply]
It's based on two sample sets (December 2022 + all of WPMED's FAs), both of which had the same results.
(Hurricanes were one of my concerns about FAs; most of them seem to have shorter than usual leads, with two paragraphs.) WhatamIdoing (talk) 15:53, 28 November 2023 (UTC)[reply]
Well, as another example of differences, the MED FAs include bios, which are different than medical conditions. So same issue ... it's hard to separate length from topic. SandyGeorgia (Talk) 16:03, 28 November 2023 (UTC)[reply]
When you get the same results in two separate studies, the odds of a third study producing different results are pretty small. But if you (or anyone else) would like to pick a third sample set, please feel free to do so, and please share your results. WhatamIdoing (talk) 17:01, 28 November 2023 (UTC)[reply]
WhatamIdoing, I suspect this is not a good representative month for this data. That month happened to have three one-paragraph leads, which are actually quite rare (unless FAC has gone way off the rails). If these kinds of numbers are to be used, a broader sample is called for. SandyGeorgia (Talk) 09:18, 28 November 2023 (UTC)[reply]
It could be; you could pick another month and see what you find. Looking at Wikipedia:Featured content, 12 of the most recent 15 FAs have three paragraphs in the lead; one has four paragraphs and two have two paragraphs. Three paragraphs is mean, median, and mode in that small sample set. WhatamIdoing (talk) 17:19, 28 November 2023 (UTC)[reply]

I'd be happy to take a more representative sample of FAs and redo the analysis later. I find some word count really useful here; I often quote it to say that 600+ word leads are intimidating and difficult to read. Having guidance on the number of paragraphs without guidance on words can lead people to misinterpret as well and write very bloated paragraphs. To further avoid misinterpretation, we may want to widen to a 10-90 percentile interval instead. —Femke 🐦 (talk) 08:24, 28 November 2023 (UTC)[reply]

If we want to be all statistical about it, we could follow the 68–95–99.7 rule. We're currently taking the inner 80%, which is a bit more than one standard deviation from the median. WhatamIdoing (talk) 17:09, 28 November 2023 (UTC)[reply]
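As a rough check on how central intervals relate to standard deviations, here is a small Python sketch using the standard library's `statistics.NormalDist`. It assumes lead lengths are roughly normally distributed, which nobody in this thread has tested, and the helper names are hypothetical. It compares an inner 80% band (10th to 90th percentile) with the 20th to 80th percentile band used for the 250 to 400 figure, which is an inner 60%.

```python
from statistics import NormalDist

# Assumption: a standard normal curve (mean 0, SD 1) stands in for lead lengths.
standard_normal = NormalDist()


def half_width_in_sd(coverage):
    """Standard deviations either side of the centre that enclose `coverage`."""
    return standard_normal.inv_cdf(0.5 + coverage / 2)


def coverage_within(k):
    """Share of a normal distribution lying within k standard deviations."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


inner_80 = half_width_in_sd(0.80)  # 10th to 90th percentile
inner_60 = half_width_in_sd(0.60)  # 20th to 80th percentile
one_sd = coverage_within(1)

print(f"inner 80%: about {inner_80:.2f} SD either side")
print(f"inner 60%: about {inner_60:.2f} SD either side")
print(f"within 1 SD: about {one_sd:.1%}")
```

Under that assumption, an inner 80% band is a little wider than one standard deviation either side, while a 20th to 80th percentile band is a little narrower.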
The one-paragraph leads are outliers, and three in one month should be an extreme anomaly. (If they're not, something is wonky at FAC.) I'm less interested in re-doing the numbers than I am in seeing better qualifiers put on the text. SandyGeorgia (Talk) 17:23, 28 November 2023 (UTC)[reply]
It says "Most Featured articles have a lead length of about three paragraphs". What would you change that to? "Most Featured articles have a lead length of about three paragraphs, and almost never one or five"? "Most Featured articles have a lead length of about three paragraphs, based on multiple samples, all of which found that three was the most common number of paragraphs"? WhatamIdoing (talk) 19:10, 28 November 2023 (UTC)[reply]
Most featured articles have a lead length of about three paragraphs; lead length varies depending on the topic and content area, but is rarely less than two or more than five paragraphs. SandyGeorgia (Talk) 19:25, 28 November 2023 (UTC)[reply]
I'm not opposed to that, though one could express it even simpler: "Most featured articles have a lead length of about two to four paragraphs." Gawaon (talk) 19:35, 28 November 2023 (UTC)[reply]
Brings us right back to the same problem-- no context, the uninitiated will then oppose five or one, which is what we're trying to avoid. They happen, albeit rarely, and they are acceptable. SandyGeorgia (Talk) 19:43, 28 November 2023 (UTC)[reply]
There are still the words "most" and "about" even in my proposal, but – anyway. Gawaon (talk) 20:06, 28 November 2023 (UTC)[reply]
(edit conflict) I think coords can deal with people at FAC overinterpreting stuff if we're clear that these are just descriptive and not prescriptive. I don't think people oppose for such reasons, really? These guidelines are most useful for showing newbies that a 2 sentence lead is really not that good, and that the leads of featured articles do not become a bloated mess, but typically stay under 500 (550) words.
I can't tell who wrote this, but there is no problem with FAC Coords (who know how to apply guidelines). As explained over at the split discussion (??? Why ??) at the talk page of WP:SIZE, the problem is not FAC, but how less experienced editors will use/interpret this data. SandyGeorgia (Talk) 20:44, 28 November 2023 (UTC)[reply]
@SandyGeorgia, what makes you believe that the lead varies according to the topic of the article? Could you articulate an example, like "Music articles tend to have short leads" or "Biographies tend to have long leads"? I'm hoping to find out what the difference is between "varies by the needs of the specific article" (which I assume we could all agree is true) and "varies by subject" (good luck guessing whether your subject tends to be shorter or longer).
I just ran another small set at User:WhatamIdoing/Sandbox#Most recent FAs (15 articles). 80% of them have exactly three paragraphs. 100% of them have between 10 and 17 sentences. 80% of them have a word count between 250 and 400 (and the 20% that don't are within a rounding error of 250, so 100% of them have "about" 250–400 words in the lead).
Someone else has claimed that lead length varies by length of the article; this might be true, but it's a fairly minor effect (articles 3x median have a lead that is in the upper half; articles that are 0.6x median have a lead that is in the bottom half). WhatamIdoing (talk) 21:15, 28 November 2023 (UTC)[reply]
Hurricanes and ships are often short; medical and scientific articles (climate change) (not bios) are often longer. Medical articles may have (proportionally) longer leads because what we should include is somewhat prescribed at WP:MEDORDER (that is, we hit classification, signs and symptoms, diagnosis, cause, prognosis, epidemiology, history, cultural, etc ... we have a checklist other content areas might not have). Bios vary. I think I'd be convinced on the number range if we took Femke's analysis of articles passing FAR and looked at all of them (it's unclear to me why some are left out). But even that then would be misleading, as hurricanes aren't passing FAR because there's a CCI ... SandyGeorgia (Talk) 21:20, 28 November 2023 (UTC)[reply]
Also, in medical, lead size relating to article size falls apart because we cover the MEDORDER bits no matter how short the article. That is, Ajpolino's lead at the very short Buruli ulcer (not much known) is probably similar to the much longer lung cancer, as he has to hit all the MEDORDER sections anyway. No such list in most other kinds of content ... SandyGeorgia (Talk) 21:22, 28 November 2023 (UTC)[reply]
From the WPMED-tagged FAs, I get a mean of 371 words per lead for all of them (biographies, basic science, etc.), and 380 words per lead for only the diseases, drugs, etc. I doubt that difference is either statistically significant or of practical relevance. WhatamIdoing (talk) 21:38, 28 November 2023 (UTC)[reply]
OK, I guess that answers that. SandyGeorgia (Talk) 21:38, 28 November 2023 (UTC)[reply]
But that's well above the 300 average mentioned before ... (sorry for piecemeal responses, heading out the door soon, trying to finish up). SandyGeorgia (Talk) 21:39, 28 November 2023 (UTC)[reply]
It's within the 250–400 range, but longer than the 300ish median. WhatamIdoing (talk) 21:48, 28 November 2023 (UTC)[reply]
There's a separate problem looking at what's coming out of FAC; I've seen no evidence that leads are being reviewed, and plenty that they're not. There is one editor (Dying) who basically spends an entire day reviewing the lead of every WP:TFA and copyediting the blurbs-- which is work that should be happening at FAC. So, again, interested in Femke's analysis of what's coming out of FAR, as those tend to be more complex articles, and get more indepth review (but then FAR misses the many short hurricanes, as they are all quagmired in a CCI). And some *very* short articles are coming out of FAC of late, another concern. SandyGeorgia (Talk) 21:38, 28 November 2023 (UTC)[reply]

Analysis with some more articles

I've finished the analysis using articles that were kept at a featured article review. Typically these might be a bit more meaty as the more core articles tend to be saved. Given that Sandy suspected that we're running out of meaty articles to run at today's featured article, I thought this would be a nice dataset to complement the first one with. The rounded values of the combined set (either the 5 to 95% or the 10 to 90% interval) are 200 to 500 words. This gives a nice range to avoid people over-interpreting but still provides some guidance to avoid people writing intimidatingly long, unreadable introductions. What do you guys think? For the sentence interval, we could say between 10 and 20 sentences (80% interval: 9-22, 90% interval: 7-23 sentences). —Femke 🐦 (talk) 20:12, 28 November 2023 (UTC)[reply]

Looking at these articles (and some others), 500 words in the lead already feels somewhat too long for me. Personally I would be more comfortable with the old range of up to 400 words, or 450 as a "compromise". It's similar with the sentence range, but I guess 20 vs. 18 sentences is close enough that it doesn't matter much. Did you also check the paragraph count? Gawaon (talk) 20:40, 28 November 2023 (UTC)[reply]
I didn't, but feel free to add it to the table. —Femke 🐦 (talk) 20:47, 28 November 2023 (UTC)[reply]
Femke ... Looking at articles kept at FAR does broaden the sample to something meaningful, so good for that. If you browse the old data I used to keep at Wikipedia talk:Featured article statistics, you'll see that many of the historically very short and very long articles have ended up defeatured. In the case of short, they end up merged elsewhere, in the case of long, reasons vary (but often related to maintenance or people waking up and realizing the prose wasn't tight anyway). And, I suspect that FAC regulars are no longer rigorous about reviewing leads, so I'm glad you branched out in your data selection. So it's interesting that you came up with a good and broader range on kept FARs. Which articles were at the low end? That is, is there a trend in content area? And I do suspect that looking at words may be more helpful than paragraphs, so we don't end up with something like India, where it seems five paragraphs of information have been artificially shoved into four, perhaps to "comply" with an over-interpretation of a guideline. SandyGeorgia (Talk) 20:52, 28 November 2023 (UTC)[reply]
Never mind; I see your list now. Not sure it's complete-- where did you pull it from so that it missed Lung cancer, Hanford Site, J. K. Rowling, Belton House, Diocletian, and many others (see Wikipedia:Featured article review/FASA/Records) for example? SandyGeorgia (Talk) 20:56, 28 November 2023 (UTC)[reply]
And speed of light passed FAR with five paras, but now has seven (sigh). SandyGeorgia (Talk) 21:03, 28 November 2023 (UTC)[reply]
For that article, I see 4 paragraphs/18 sentences/503 words at the original promotion, 5 paragraphs/23 sentences/571 words at the end of FAR, and 6 paragraphs/22 sentences/546 words today. It is an illustration of the principle that adding blank lines does not always mean that the lead has gotten longer. WhatamIdoing (talk) 21:48, 28 November 2023 (UTC)[reply]
Yes; for that reason, focus on paras is misleading (eg India). If we go back to something you stated on the other page (it's not FAC/FAR we have to worry about but newer editors who need guidance), I'm trying to understand how using a word or para or sentence count won't end up mis-used by the very people who need the help. The problems are always the same: POV pushing-- the attempt to get everything you can that supports the editor's POV in to the lead or infobox for maximum exposure. That's the problem that needs solving here, and I fear that the word count will just encourage the POV pushers to chunk in POV right up to the maximum mentioned in the guideline, with no sense of context around the numbers. Most editing isn't like the J. K. Rowling FAR, where experienced editors set aside their differences and wrote the lead following policy and sticking within a valid word range. Many lead problems revolve around POV, with not always experienced editors fighting it out. The huge emphasis on this guideline page needs to stay on due weight and summarizing key points according to preponderance of sources... and any mention of words, sentences, paras has to be highlighted as clearly secondary to that primary aim. That underlies my concern about reading too much in to these numbers, because any way we calculate them, I can tell you where there are sampling issues in FAs. SandyGeorgia (Talk) 22:10, 28 November 2023 (UTC)[reply]
First, I think we need to remember that we have many more leads that are too short than those that are too long. There are eight articles tagged with Template:Lead too short for every one with Template:Lead too long. Furthermore, some of these (e.g., Damon Runyon) are probably tagged because they have multiple shorter paragraphs but a very normal word count. Encouraging people to increase the size of the lead is not necessarily problematic, on average.
Second, we are not talking about adding a paragraph count. We have had a paragraph count in this page since literally its first revision. I am talking about providing better guidance for longer leads than what we have had since literally the first revision, which basically said (and still says) "Hey, the max is usually four paragraphs – so remove a couple of blank lines to comply. Four long paragraphs is definitely better than five shorter ones". If your main concern is the small number of articles with very long leads, then providing sentence or word counts should reduce the problem of people "hiding" overly long leads by cramming 900 words into four l-o-n-g paragraphs. You can't hide 900 words by changing the white space but keeping the same content. If we say that 250–400 words is the common length for FAs, then 900 is not going to be accepted as typical or desirable, and the only solution will be to remove some of that POV pushing that was chunked in there. I'm okay with that.
There is another option: We could remove all numeric guidance about lead length completely. We will stop saying anything about the number of paragraphs. We will not add information about sentences or word counts. Each editor will form their own personal opinions about whether all leads should be restricted to one paragraph, or that "too long" is two thousand words, or whatever else they personally feel like. There will be no standards at all. I do not recommend this approach. WhatamIdoing (talk) 22:30, 28 November 2023 (UTC)[reply]
I think talking about POV pushing is distracting here. This is mainly about readability. A readable article makes sure that there is enough information in the lead that many people don't have to read beyond it, while preventing the problem of very long leads, which are too intimidating.
I've never found myself in discussion with people disagreeing that a lead was too short, whereas I have very often found myself in discussions with people who wanted to defend a 600 or 700 word lead that was difficult to parse. Anyway, whether we make it 200 to 500, or 250 to 400, this will give some guidance to people: there must be a reason that these are the typical values. I think the absence of the guidance is more likely to be abused than its presence, especially if we keep it descriptive rather than prescriptive.
In terms of the selection, I just took the first 30 articles from the very old and old list. Slightly more from the very old because the old list doesn't have 15 saved yet. Hence a few missing articles. —Femke 🐦 (talk) 19:54, 29 November 2023 (UTC)[reply]

Lists in leads

I noticed some disagreement in the Solar System article regarding a possible presentation of lists in the lead section by @CactiStaccingCrane. To me, this article seems like a place where such a relatively rare presentation is plausible—the topic involves a definition of a medium-sized (not small as to fit into prose, not large as to not belong in the lead) list. The consensus in reversion seems to be that prose is generally always preferred—if that should be the case, then it should be stated explicitly here, I think.
Personally, I think there is nothing inherently wrong with lists in the lead section in specific cases, perhaps like the Solar System article—save for the argument that it may require too much vertical space, but I'm skeptical of that. Thoughts? Remsense 21:37, 3 December 2023 (UTC)[reply]

Pinging @Praemonitus and @OmegaMantis as they have recently reverted my edits. CactiStaccingCrane (talk) 01:03, 4 December 2023 (UTC)[reply]
This is horrible, and not a good idea. Also, considering that the article passed FAR a little over a year ago, why does that lead need to be messed with at all? XOR'easter. SandyGeorgia (Talk) 01:58, 4 December 2023 (UTC)[reply]
I, as a general-interest lay reader, read this lede this afternoon, and it sure felt that bulleting would help. After reading that lede, I was just a little bewildered by all those details and lists, and can remember only a little.
This is ironic (and sad) because Solar System went onto my Watchlist 10 or 12 years ago when I praised the then-current lede as nearly ideal: enough information for the casual reader and enough pointers to entice that reader to sections in the body of the article. But, as with many good ledes, successive additions and amendments over the years have destroyed those qualities. The lede to New York City has suffered similar elephantiasis by accretion. @Remsense, CastiStaccingCrane, SandyGeorgia, and Nikkimaria: —— Shakescene (talk) 02:18, 4 December 2023 (UTC)[reply]
Yep ... if a lead needs a list, it probably has too much detail. SandyGeorgia (Talk) 02:24, 4 December 2023 (UTC)[reply]
SandyGeorgia, if this is the consensus, I think it would be worthwhile to explicitly state in this guideline. Remsense 02:25, 4 December 2023 (UTC)[reply]
I've never seen anyone suggest adding a list to a lead before, so that seems WP:CREEPy. SandyGeorgia (Talk) 02:26, 4 December 2023 (UTC)[reply]
SandyGeorgia, perhaps. This is not my first time to the best of my recollection, but I figured it was worth a chat about. Remsense 02:28, 4 December 2023 (UTC)[reply]
New York City (858 words in the lead) ... ack! A good reason we should be giving some guidance on lead size, too. SandyGeorgia (Talk) 02:25, 4 December 2023 (UTC)[reply]
How do you feel about the Solar System lede as it presently exists? Remsense 02:36, 4 December 2023 (UTC)[reply]
It's fine. It was fine when it passed FAR. From the standpoint of proportionally summarizing the rest of the article the way ledes are supposed to do, it was better in the version that passed FAR. XOR'easter (talk) 02:52, 4 December 2023 (UTC)[reply]
I'm not trying to start a discussion over the presentation in particular except as a general example of the guideline. I have no real opinion on that presentation other than it being plausible. Remsense 02:22, 4 December 2023 (UTC)[reply]
Perhaps we should link prose in the lead here. Moxy- 02:27, 4 December 2023 (UTC)[reply]
Tend to agree with "if a lead needs a list, it probably has too much detail." And we don't need to add a line item about this to MOS:LEAD per WP:CREEP and MOS:BLOAT. This kind of dispute is virtually unheard of (probably because everyone intuitively understands that lists are not how to write a lead), so there is no reason to address it here. MoS is already over-long, and no new rule should be added to it unless it is needed to deal with long-term, recurrent editorial strife of a particular sort.  — SMcCandlish ¢ 😼  09:08, 4 December 2023 (UTC)[reply]
I probably should have reverted the lead lists as soon as they appeared in the Solar System, but I am tired of fighting every little format issue. I was asked to take another look by CactiStaccingCrane and immediately tagged the list issue. Sorry if it caused such a fuss. While it is a detail-oriented article, to me the lead was just fine without the bullet lists. Praemonitus (talk) 15:17, 4 December 2023 (UTC)[reply]
One of the general problems with bulleted lists is that they draw attention, which means that they draw attention away from the sentences and paragraphs. Our guidance in the lead is probably correct, but our guidance elsewhere may be less than perfect. We recommend that people write something like:

The main causes of scaryitis are age and a high-fat diet.  Some less common risk factors include:

  • eating bacon,
  • drinking alcoholic beverages,
  • breathing dust,
  • running after dark,
  • talking to strangers, and
  • playing in the street.
It's easy for the eye to just skip right past the common points and focus on the list. Since everything in the lead is supposed to be important (at some level), we don't really want to format the lead so that the eye skips over any of it. WhatamIdoing (talk) 16:35, 4 December 2023 (UTC)[reply]
Yes, this is a good additional reason to not use vertical lists in the lead.  — SMcCandlish ¢ 😼  02:22, 5 December 2023 (UTC)[reply]

An example

Wow, look at this one as an example of how not to write a lead sentence:

The Red Sea (Modern Arabic: البحر الأحمر, romanized: al-Baḥr al-ʾAḥmar, Medieval Arabic: بحر القلزم, romanized: Baḥr al-Qulzum; Biblical Hebrew: יַם-סוּף, romanized: Yam Sūp̄ or Hebrew: הַיָּם הָאָדְוֹם, romanized: hayYām hāʾĀḏōm; Coptic: ⲫⲓⲟⲙ ⲛ̀ϩⲁϩ Phiom Enhah or ⲫⲓⲟⲙ ⲛ̀ϣⲁⲣⲓ Phiom ǹšari; Amarigna: ቀይ ባሕሪ Qey Bahr; Sidama language: Duumo Baara; Tigrinya: ቀይሕ ባሕሪ Qeyih Bahri; Somali: Badda Cas; Afar: "Qasa Bad") is a seawater inlet of the Indian Ocean, lying between Africa and Asia. CUA 27 (talk) 22:12, 20 December 2023 (UTC)[reply]

Wow, that is so unreadable. Good example of when a footnote would be a better choice. Schazjmd (talk) 22:19, 20 December 2023 (UTC)[reply]
Holy crap. We need to diff that as a prime example of what not to do (then actually fix it).  — SMcCandlish ¢ 😼  13:41, 2 January 2024 (UTC)[reply]
It was fixed weeks ago. Gawaon (talk) 16:27, 2 January 2024 (UTC)[reply]
I don't think it belongs in a footnote or anywhere in the article. What's next, translate Indian Ocean into every one of the hundreds of languages of India, every language of Pakistan, every language of Indonesia, every language of Kenya and Tanzania and Mozambique and South Africa and Australia, Malagasy, Somali, Burmese? And so on? Largoplazo (talk) 22:35, 2 January 2024 (UTC)[reply]
Well... it's not unreasonable to provide the "local" names as alternates. But I'd suggest doing that in an infobox or a section towards the end of the article (similar to ==Etymology==), especially when there are more than about two alternate names. WhatamIdoing (talk) 21:48, 29 January 2024 (UTC)[reply]

@SMcCandlish, Gawaon, MicrobiologyMarcus, Tpbradbury, SandyGeorgia, Femke, Thinker78, Fgnievinski, CUA 27, and Eloquence: @WhatamIdoing, Remsense, BlueboyLINY, Sammi Brie, InfiniteNexus, Trovatore, XOR'easter, David Eppstein, Hawkeye7, and Lowercase sigmabot III: Hello. Which is the correct interpretation of the "Pronunciation" section? I've always thought it was this: when there's a name whose pronunciation isn't commonly well-known to English-speakers (which includes a huge number of foreign names), the pronunciation should be added in the original language (that's why the Help/IPA pages exist). I'm asking because a user, Steelkamp, has started removing loads of pronunciations of Italian names because of "MOS:LEADPRON"; see here. Do any of you know the pronunciation of the Italian name "Zappia"? Or "Terenzini"? There are about 10,000 pages containing the pronunciation of Italian names: if his interpretation of the section "Pronunciation" is the right one, almost all of them should be removed. The same goes for any other language (Spanish, Dutch, Japanese...). I think that the correct interpretation is mine, but I'd like to hear your opinion. Thanks. Thiswouldbeauser (talk) 12:21, 2 January 2024 (UTC)[reply]

None of these people are Italians. They are all Australians or other English-speakers whose surnames happen to be of Italian origin, and who would not be using "full-Italian" pronunciations of their names, unless there's some unusual exception among them, so the pronunciation removals appear to be the correct move here. Injecting Italian-IPA pronunciation guides for them is WP:OR that doesn't reflect the reality of how these people are referred to or (in all probability) how they say their own names. It's about the same as putting an Irish Gaelic pronunciation key of [ˈʃaːn̪ˠ ˈkahəsˠiː] on the article of an American named Sean O'Casey. Removing the Italian pronunciation guide from actual Italians like Giovanni Gronchi or Pippo Barzizza, on the other hand, would be a mistake and should be reverted.  — SMcCandlish ¢ 😼  14:07, 2 January 2024 (UTC)[reply]
@Steelkamp, do you have any thoughts about how editors should determine which names do/don't benefit from pronunciation information? I didn't see any names in the list you removed that personally felt a need for the IPA, but perhaps other people have different ideas.
@Thiswouldbeauser, Template:Infobox person supports |pronunciation= parameter, and sometimes a suitable compromise is to provide the information in the infobox (if one already exists) instead of in the first sentence.
The ideal is a recording from the Wikipedia:Voice intro project: then you know exactly how that person says their name at a given point in time. There is not necessarily a fixed pronunciation. I know several people who have changed the pronunciation of their names, and several more whose pronunciation depends on which language they're speaking. WhatamIdoing (talk) 21:41, 2 January 2024 (UTC)[reply]
I for one have no idea how to pronounce Gaetano. I don't think nationality is relevant but rather how self-evident the pronunciation is. Per MOS:LEADPRON, If the name of the article has a pronunciation that is not apparent from its spelling, include its pronunciation in parentheses after the first occurrence of the name. If anything, It is preferable to move pronunciation guides to a footnote or elsewhere in the article if they would otherwise clutter the first sentence. Therefore, I don't think this removal instance was appropriate. Although there is a point about OR and probably best to let a local handle the pronunciation. But if we come to this, pronunciation of names can be atomized by each subjective pronunciation by each household. Then we would need reliable sources to find out the pronunciation of each person's name even if the spelling is completely the same as the next person. Because how do we know how they want to pronounce their names? I think it may suffice to provide the general pronunciation of the name.
I would say interpretation of LEADPRON is on a case by case basis and subject to consensus. Thinker78 (talk) 04:30, 3 January 2024 (UTC)[reply]

Wait, are you saying that they're removing the pronunciations in English or in another language of the names of native English-speaking people from English-speaking countries whose names originate in a language other than English? If the latter, good! Indeed, just a few days ago I removed the transcription into Greek of the name of US actor Jason Mantzoukas. It's just not relevant to an article about someone how their ancestors whose native language wasn't English might have spelled or pronounced their name. Similarly, the article about Kai Bird notes in the "early life" section that his father named him for a university acquaintance of his, Kai-Yu Hsu. I thought that was already of borderline relevance, but when the article went on to tell us that Kai means "mustard" in Chinese (; Mandarin: gài or jiè), I couldn't take it any more. What matters is how their name is pronounced and written in English, and the etymology is almost absolutely going to be irrelevant. Largoplazo (talk) 22:47, 2 January 2024 (UTC)[reply]
I think not including pronunciation can provide for phonetics confusion in some cases, like in the Kai bird (kai as in may or kai as in kite?). Regards, Thinker78 (talk) 04:41, 3 January 2024 (UTC)[reply]
My point is that a phonetic representation, if given at all, should be only for how he pronounces his name, not how his namesake pronounced it in Chinese. Largoplazo (talk) 10:31, 3 January 2024 (UTC)[reply]
There is also WP:MOSPRON,

If a common English rendering of the foreign name exists (Venice, Nikita Khrushchev), its pronunciation, if necessary, should be indicated before the foreign one. For English words and names, pronunciation should normally be omitted for common words or when obvious from the spelling; use it only for foreign loanwords (coup d'etat), names with counterintuitive pronunciation (Leicester, Ralph Fiennes), or very unusual words (synecdoche).

Pronunciation should be indicated sparingly, as parenthetical information disturbs the normal flow of the text and introduces clutter. In the article text, it should be indicated only where it is directly relevant to the subject matter, such as describing a word's etymology or explaining a pun. Less important pronunciations should be omitted altogether, relegated to a footnote, or to a dedicated section in the article or infobox.

If there was already a pronunciation, instead of taking it out I would say it should be considered how evident its pronunciation is and whether it is best instead to move it to a footnote. Regards, Thinker78 (talk) 04:52, 3 January 2024 (UTC)[reply]
There are three reasons why I removed those pronunciations:
  1. Because of Wikipedia:Manual of Style/Lead section#Pronunciation, which states that "If the name of the article has a pronunciation that is not apparent from its spelling, include its pronunciation in parentheses after the first occurrence of the name." The pronunciations I removed are mostly obvious from their spelling. For example, the pronunciation of "Tony Zappia" is obvious from how it is spelt. There may be some borderline cases, such as Gaetano "Guy" Zangari, which I would not oppose re-adding the pronunciation.
  2. Because none of them had any citations. Even if the pronunciation is not obvious, there should be a source indicating how someone in Australia (not Italy) would pronounce the name.
  3. Because those pronunciations were originally added to Wikipedia by a sock puppet of a banned user. User:Thiswouldbeauser is another one of this person's sock puppets, and this account should be blocked and globally locked.
Steelkamp (talk) 09:19, 3 January 2024 (UTC)[reply]
"how someone in Australia (not Italy) would pronounce the name" is the key matter here, since an Italian-language pronunciation would almost certainly be wrong for an Australian. In many languages, pronunciation is fixed and is actually algorithmically determinable, but maybe sources are proper for how a particular Italian-originating name is pronounced in Australia (or the US, or other places). That said, WP:V only requires that non-controversial claims be verifiable not verified, so any modern public figure's pronunciation would in fact be verifiable with enough leg work, like finding them mentioned in a TV news report. That is, the lack of citations doesn't look like a good removal reason if someone adds an Australian pronunciation that is probably correct, even if the OR of claiming that an Australian uses an Italian-language pronunciation is a good reason for removal (the claim that this is the pronunciation used is not a non-controversial one, since it is very unlikely to be true). The sock reason isn't very good either, though; we don't blanket auto-revert everything that socks and banned editors do. Just the unlikelihood of Australians using Italian pronunciation is sufficient.  — SMcCandlish ¢ 😼  10:14, 3 January 2024 (UTC)[reply]
  • About 2: "Because none of them had any citations". Per MOS:LEADCITE,

The necessity for citations in a lead should be determined on a case-by-case basis by editorial consensus.

  • About 3: "Because those pronunciations were originally added to Wikipedia by a sock puppet of a banned user."

Anyone is free to revert any edits made in violation of a ban or block, without giving any further reason and without regard to the three-revert rule. This does not mean that edits must be reverted just because they were made by a banned editor (changes that are obviously helpful, such as fixing typos or undoing vandalism, can be allowed to stand), but the presumption in ambiguous cases should be to revert.

But I see that Thiscouldbeauser made the reverted edit on 7 June 2023 whereas they were blocked on 19 June 2023. Therefore, the editor at the time they made the edit was not yet even blocked so the freely revert provision does not apply.

In general, it isn't advisable to try to revert every single edit a sockpuppet has ever made. We shouldn't restore typos, bad grammar, misinformation, or BLP violations just because a sockpuppet is the one who fixed the problem. On the flip side, clear vandalism or purely disruptive edits should be reverted whether or not an obvious sockpuppet did it. Try to take a reasonable sampling of the account's edits, and mass-revert them only if it appears that none of them are good.

Regards, Thinker78 (talk) 04:29, 5 January 2024 (UTC)[reply]
@Steelkamp Thinker78 (talk) 04:56, 5 January 2024 (UTC)[reply]
  • Generic answer.

Thank you, I've just read your replies. All right. Should the consensus be that the Italian pronunciations of not Italian persons' names have to be removed (I hope it won't be), then any of you can remove each of the remaining Italian pronunciations of Australian politicians. But, in that case, I also strongly invite you to remove the Italian pronunciations of the following "Americans" (not "Italians") whose surnames happen to be of Italian origin: Martin Scorsese, Francis Ford Coppola, Leonardo DiCaprio, Robert De Niro, Al Pacino, Joe Pesci... To begin with. These are just a few of the many "American" (not "Italian") actors with Italian surnames, I can provide more of them and I can provide many more similar cases out of America and of the world of cinema too. The same will go for other languages than Italian, obviously. This would be the correct move, right? Come on... I hope that the consensus will be different!

  • Specific answer to my accuser.

I'm replying to your 3 points @Steelkamp:.

  1. So you knew that in Italian the surname "Zappia" isn't read /'zapja/ but /dzap'pia/? Assuming that you knew, how many English-speakers know well Italian phonetics without committing errors, in your opinion?
  2. The local pronunciation according to English phonetics of Italian names is something else than the original pronunciation of the name in Italian, one thing doesn't exclude the other. Do you think it's useful to add the English pronunciation of the Italian name? Please add it, I have no objections. But then add this piece of information instead of removing information that's already there!
  3. Oh my... This is the real reason for all you've done, isn't it? Ok, ok... It was my mistake, let me explain. I'm not Thiscouldbeauser, I'm a different person. Why did I choose such a similar username to register? I'm wondering myself; it was a stupid move. I did it because I noticed a lot of badly transcribed Italian pronunciations, and it was Thiscouldbeauser who added them. He isn't Italian, almost certainly he's Australian and it's clear that he doesn't know the correct Italian pronunciations that he added. Look at any of his edits about pronunciations: none of them is similar to mine. So I registered an account with a similar name to correct them. If I hadn't chosen this name, or had edited those pages one hour before or after, you wouldn't have noticed and reverted my corrections. Luckily, most of the pronunciations had already been corrected by other users, so I had just to fix a few. But now you've removed a large part of them... Read the generic answer I've just written a few lines above. However, I'm not Thiscouldbeauser. You've asked for a check: the checkers will verify that I'm writing from Italy because my current IP is Italian. I bet that Thiscouldbeauser (and all his previous sockpuppets) had Australian IPs. This will be the ultimate proof that I'm another person whose only mistake was choosing a banned user's name.

I'd like this last point to be read also by @Extraordinary Writ:. Thiswouldbeauser (talk) 10:39, 3 January 2024 (UTC)[reply]

The pronunciations in Italian of "Coppola", "Pesci", etc., are irrelevant to the respective articles' subjects and should be removed. The pronunciation Zappia uses in English isn't 100% obvious (stressed on the "Zap" or on the "i"?) and it's useful to provide it. I just noticed that Tony Zappia was born in Italy so the Italian pronunciation might be considered relevant. Largoplazo (talk) 12:23, 3 January 2024 (UTC)[reply]
If articles on Americans like Martin Scorsese are using Italian IPA keys to give pronunciations, then that is wrong and should be fixed. That does not mean pronunciation keys for such people should be removed forever, since the pronunciation of such a name is not obvious, and the correct pronunciation in English for that person is easy to establish. The others mostly would also benefit from that as well. Maybe the Australian ones would, too. But in any case where one of our articles is giving patently incorrect pronunciation using an Italian IPA key for American or Australian or other such persons, just removing the falsehood is the correct move. Someone is free to put a corrected version in again later.  — SMcCandlish ¢ 😼  01:22, 4 January 2024 (UTC)[reply]
Many people from immigrant families will pronounce their name one way within their family and/or ethnic community, and another way to their adopted community (or general public or media as the case may be). One shouldn't presume the pronunciations are necessarily wrong without checking the article body/history/citation first. If an RS says Scorsese's grandmother pronounced it such-and-such with him, then that pronunciation may belong equally in the lede, and it may well be appropriate to use Italian IPA in such a case. SamuelRiv (talk) 04:19, 13 January 2024 (UTC)[reply]
Thank you. Regards, Thinker78 (talk) 04:23, 13 January 2024 (UTC)[reply]

Well, whatever your final decision will be, I'll accept it. To me, one more piece of information is better than one less. At least I'm happy that I've made this discussion possible, which will be a point of reference for future cases about this matter. Thiswouldbeauser (talk) 14:00, 4 January 2024 (UTC)[reply]

I think the point we're trying to get across is that an Italian-language pronunciation guide for an Australian or American who happens to have an Italian-derived name is usually going to not be information at all but misinformation, and should be replaced by a pronunciation guide for how that name is treated in the native form of English in these countries. And this might even vary by biographical subject. E.g. Ian MacKaye and his family pronounce it /məˈkaɪ/ while various other Americans may use /məˈkeɪ/; what the "proper" Scottish usual pronunciation is (/məˈkaɪ/, if you're wondering) isn't relevant to their biographies. And even that's a Scottish English pronunciation; perhaps more pertinently to this discussion, the non-Anglicized pronunciation in the Scottish Gaelic language – /mĩçˈkʲɤj/ – would be even less relevant (maybe not relevant even to a native-born Scot unless they are a Gaelic speaker). This also actually raises a side point about "Italian"; what we English-speakers think of as Italian is really a standardized and "prestige" version of the Florentine dialect within the Italian dialect continuum, and there are many other dialects, with quite a lot of pronunciation divergence; thus, the regionally "proper" pronunciation of an Italian surname is apt to vary considerably from general-Italian defaults. E.g., a Sardinian or Venetian surname would have a non-English pronunciation guide that might have considerable differences from how that same string of letters would be pronounced in the widespread Florentine version of Italian. This kind of consideration can affect a lot of languages and surnames; e.g. a variant of mine, McCandless, should be properly rendered in an Ulster Scots pronunciation key, which is closer to Scottish than to other Irish English dialects, because it is an Ulster Scots version of the name (not counting the later diaspora in North America, Australia, etc.).
It gets much more complicated with something like Chinese, which is really a wide geographical cluster of distinct languages that just share a common writing system; a Cantonese name would not be given a Mandarin pronunciation guide, and so on, nor would either likely apply to a Canadian with a Chinese-derived family name.  — SMcCandlish ¢ 😼  14:44, 4 January 2024 (UTC)[reply]
To emphasize the point: Some languages have a "correct" pronunciation for names, and all people with that name use that pronunciation. I have heard that this is so for French, for instance. It is not true for English. Many common names use varied pronunciations and often the only way to know which one to use is to find out from the subject. So please do not add pronunciations to biographies of people in English-speaking countries unless you have a high-quality source for that pronunciation. Using the pronunciation from another language is very likely to be wrong. Using a typical pronunciation from the same dialect of English is still likely to be wrong enough of the time to cause problems. —David Eppstein (talk) 06:21, 5 January 2024 (UTC)[reply]
I don't share your opinion necessarily about such requirement as I indicated previously. The necessity for citations in a lead should be determined on a case-by-case basis by editorial consensus. Regards, Thinker78 (talk) 06:39, 5 January 2024 (UTC)[reply]
No. For living people, WP:BLP is very clear: Wikipedia must get the article right. Be very firm about the use of high-quality sources. All quotations and any material challenged or likely to be challenged must be supported by an inline citation to a reliable, published source. This cannot be evaded by editorial opinion or local consensus. Material in the lead that summarizes later sourced material can avoid repeating the citation in the lead, but that's not what we're talking about here. —David Eppstein (talk) 07:05, 5 January 2024 (UTC)[reply]
I don't see what I said contradicts WP:BLP. As I indicated, The necessity for citations in a lead should be determined on a case-by-case basis by editorial consensus. The applicability of policies is also subject to consensus. Editors have different interpretations about them all the time. Regards, Thinker78 (talk) 02:48, 6 January 2024 (UTC)[reply]
You are quoting guidelines about whether to include citations for material in the lead that summarizes later sourced material. Pronunciations in the lead are not a summary of later material, and cannot piggyback on the citations for later sourced material. Therefore, they need their own citations. Being in the lead is not a valid excuse for committing original research: there is not, however, an exception to citation requirements specific to leads. —David Eppstein (talk) 07:17, 6 January 2024 (UTC)[reply]
Per Verifiability,

All quotations, and any material whose verifiability has been challenged or is likely to be challenged, must include an inline citation to a reliable source that directly supports the material.

Not all cases need a citation. Regards, Thinker78 (talk) 03:42, 7 January 2024 (UTC)[reply]
I think youse two might be talking past each other a little. If the lead says something, even in a BLP, that is also in the body with a citation, there is no need to also put the citation in the lead, unless the claim is something that might seem controversial to or "likely to be challenged" by the reader. Otherwise our leads on many articles, especially BLPs would basically be unreadably festooned with a citation or two or five every few words. But if the lead makes a claim that is not in the body, then obviously it needs a citation for it right there in the lead. (Same goes for making claims in an infobox that are not in the article, though this is usually a mistake, except for specialized infoboxes, like {{Taxobox}} and various medicine and chemistry ones, that contain various technical detailia).  — SMcCandlish ¢ 😼  16:57, 7 January 2024 (UTC)[reply]
I think you basically are saying what Eppstein is saying though. Regards, Thinker78 (talk) 04:45, 8 January 2024 (UTC)[reply]
Well, agreeing with someone wouldn't be a fault (and is something I should arguably do more often). V doesn't require an inline citation for a claim in the lead (or anywhere else) that isn't challenged or likely to be challenged, but years after that material was written our site-wide standard is to use inline citations for everything; the days of "general references" (dumped at the bottom of the page without any demonstration what claims are sourceable to which locations in what sources) being good enough indefinitely are long over. BLP has since then also imposed more firmly stated restrictions on BLPs, but they basically appear to boil down to the same thing. Even outside a BLP, all material has to be verifiable; if it is not yet verified, then it is apt to be challenged (e.g. by deletion, by {{citation needed}} or {{dubious}}, by complaint on the talk page, etc.). The BLP difference is a presumption toward deletion. The solution (BLP or not) is thus to just provide an inline citation. But for lead information that also appears elsewhere in the article and is cited there, it need not be cited in the lead, unless it is controversial (challenged or likely to be challenged).  — SMcCandlish ¢ 😼 
Like any other piece of information on Wikipedia, proper name pronunciations should be sourced, not be the product of original thought by individual editors. This is especially true for BLPs. --JBL (talk) 19:50, 5 January 2024 (UTC)[reply]

I'm adding just a couple of considerations.

  • Besides Zappia, also "Guy" Zangari, Carlo Furletti, Franca Arena and "Phil" Barresi were born in Italy, this should be a good reason to re-add their pronunciations in Italian; I've searched the web for info about some of the remaining ones and almost all are somehow attached to their roots like most of us Italians, but I don't think that this would be enough for them.
  • Adding sources for the pronunciations of Italian names isn't something useful actually, not just because Italian language has an almost 1-1 letter/sound correspondence and so it's rare that the pronunciation of an Italian name is challenged, but also and especially because the Help/IPA pages pointed to by the pronunciations normally contain sources for the pronunciations themselves, for example in the Italian help page they're in the section "External links".

Helpful information, I hope. Thiswouldbeauser (talk) 15:40, 8 January 2024 (UTC)[reply]

Where someone was born isn't necessarily dispositive of anything. The elder of my sisters was born in the UK but does not use a British pronunciation of her name (/sɑːrə/ rather than /srə/). What is apt to make much more difference is where they have spent most of their life.  — SMcCandlish ¢ 😼  10:30, 9 January 2024 (UTC)[reply]
Certainly but the intended name pronunciation really came from the people who named a person. I would say then that both the usage by the named person and the people who named her are appropriate pronunciations. Also, if a person has a difficult to pronounce name which readers have no idea how to pronounce, I find it helpful to at least provide the pronunciation according to certain general common usage rather than none. Regards, Thinker78 (talk) 17:45, 9 January 2024 (UTC)[reply]
"the intended name pronunciation really came from the people who named a person" is an OR kind of idea we couldn't employ (i.e., we neither have any sources saying this is how name pronunciation is approached, nor would there be sources available to verify such a particular with regard to the name of virtually anyone ever). It's not "helpful" to provide a pronunciation that is correct for someone from, say, Italy, when it may have no bearing on how it's pronounced with regard to a particular person in, say, Australia; it would usually be downright misinformation. We really can't go by anything but either evidence about that person's self-declared pronunciation (e.g. in an interview), or how RS in A/V media (TV news, etc.) pronounce it with regard to that specific person, or maybe sourceable information on how it's pronounced in general in a particular country or other region, but that last is really iffy, since it may be wrong with regards to that individual, at least in a diaspora country like Australia or the US. I don't think it would be problematic to give a Mexican Spanish pronunciation of the Spanish name of someone from Mexico, since the output is pretty much algorithmically predictable. Such a claim has been made about French, but I'm skeptical because of the langues d'oïl / langues d'oc split, and it's even more dubious in Italy because of the dialect continuum there; same in Spain, and China, etc.
For Ireland, you could probably get away with a Hiberno-English key for any common anglicized Irish name, but when it came to a native Irish Gaelic one, it would become problematic again, because there are significant dialectal differences between "school Irish" and several native Gaeltacht dialects that are not mutually consistent (even the very common given name Máire has very different pronunciations in different places, from roughly "Moya", through "Mahrə", to something an English speaker would probably parse as "Mazrə", though our own article on that name is not accounting for this). Anyway, it is better for us to lack information, for want of sources, than to make up potentially (even likely) false information with guesswork.  — SMcCandlish ¢ 😼  23:36, 9 January 2024 (UTC)[reply]
  • is an OR kind of idea we couldn't employ. We could, ceteris paribus.
  • or maybe sourceable information on how it's pronounced in general. This, because otherwise pronunciation guides would be excluded for any name before audio recording was invented or for anyone whose name doesn't have such source. Also, sometimes names are illegible by people who don't know the language.
  • it would usually be downright misinformation / there are significant dialectal differences / it is better for us to lack information, for want of sources, than to make up potentially (even likely) false information with guesswork. An explanatory footnote can be provided.
Regards,
Thinker78 (talk) 00:34, 10 January 2024 (UTC)[reply]
No, we really couldn't. OR is forbidden in BLPs, and if you carry out your insistence on pushing OR into actual BLPs, you are likely to get blocked. —David Eppstein (talk) 01:29, 10 January 2024 (UTC)[reply]
@David Eppstein you are likely to get blocked. Are you warning me as an administrator or are you speculating? Thinker78 (talk) 04:00, 10 January 2024 (UTC)[reply]
@David Eppstein Also, next time try at least for clarification instead of making likely wrong interpretations of what I intended to say. Key word is "ceteris paribus". If you don't know what it means, ask. Sincerely, Thinker78 (talk) 04:02, 10 January 2024 (UTC)[reply]
Even though it was inappropriate for you to warn me of a block (whether in an administrative capacity or in a speculative capacity) when you are involved in this discussion, giving the appearance that you are trying to unduly quash the discussion, I will clarify anyway what I intended to say.
SMcCandlish wrote,

"the intended name pronunciation really came from the people who named a person" is an OR kind of idea we couldn't employ (i.e., we neither have any sources saying this is how name pronunciation is approached, and there would be no sources available to verify such a particular with regard to the name of virtually anyone ever).

I replied quoting only a small portion and addressing it with a bit, "is an OR kind of idea we couldn't employ." We could, ceteris paribus.
I thought it would develop in discussion; I did not expect someone, especially an administrator, to raise an issue with it in an undue fashion.
When I said "we could", I was challenging the notion that "is an OR kind of idea" the intended name pronunciation really came from the people who named a person. Also that "there would be no sources available to verify such a particular with regard to the name of virtually anyone ever". There are no guarantees and I don't agree with the idea that there would be no sources available whatsoever for verification of the name pronunciation as intended by the people who named a person. It is entirely a possibility that there could be reliable sources that back verification of the pronunciation by said people.
When I said "ceteris paribus" my intention was to indicate that all things being equal (the meaning of ceteris paribus), meaning availability of reliable sources that back the pronunciation as used by the named person or by the people who gave them their name, WE COULD use the intended name pronunciation really came from the people who named a person. Sincerely, Thinker78 (talk) 04:33, 10 January 2024 (UTC)[reply]
I really don't see any point to this digression. There is no need, ever, to say (with obscure Latinisms or otherwise) what amounts to "it wouldn't be OR if we have reliable sources saying it", since by definition it is then no longer OR but reliably sourceable. [sigh] And pointing out that if someone pushes OR into actual BLPs they'll probably get blocked isn't a threat to block you in particular, much less a suppression of your input, it's an observation of what would likely happen if someone did push OR into actual BLPs (and a perfectly reasonable thing for an admin to say in either capacity or for anyone to say). I just used the word "block" twice in this post, agreeing with David Eppstein. Does that make me threatening anyone? It's a word that comes up often in discussions of policy, like this one.  — SMcCandlish ¢ 😼  01:05, 13 January 2024 (UTC)[reply]
Not to belabor this, but there's a difference between how you represent what DE wrote and the literal words they wrote, and imo they were indeed inappropriate. And fwiw in this thread, I still have no idea how "Zappia" is pronounced for Italian-Americans, Italian-Australians, Italian-Italians, or within Tony Zappia's family. (Never heard of him.) I guess in reply to Steelkamp and Thismustbeauser: it's not obvious how Italian speakers will pronounce stuff unless they're from some specific area. Immigrants are right out. SamuelRiv (talk) 06:12, 15 January 2024 (UTC)[reply]

Here's a worthwhile test-case, even about an Australian since that's what this started with: Robert Menzies has no pronunciation key. This Scottish surname is traditionally pronounced roughly "Meng-iss" or "Ming-iss" in Scotland and by some in the Scottish diaspora (because the z in it was originally not a z but a yogh (ȝ), which was typographically replaced with a z as a stand-in in early print), but is often pronounced "Men-zeez" in the diaspora, especially from the 20th century onward (plus also some other variants like "Meng-giss" or "Ming-giss"). So how did this particular Menzies pronounce his name during his lifetime, how is it typically pronounced by Australians in reference to him today, if they differ then which should be present, and by what RS route would we establish either?  — SMcCandlish ¢ 😼  22:27, 22 January 2024 (UTC)[reply]

It is regular practice to offer multiple ways of pronunciation in the first sentence in some cases. Check MOS:PRONPLACEMENT, it has an example. I have to mention that I have a Spanish last name which I pronounce one way in Spanish and I pronounce it differently for ease of pronunciation of others who speak English. Regards, Thinker78 (talk) 22:39, 22 January 2024 (UTC)[reply]
Contemporary Australian news coverage found by searching for his name on YouTube uses the "MEN-zeez" pronunciation. I find that personally convincing, but inadequate as the sort of reliable source we could use for a pronunciation key in an article. —David Eppstein (talk) 17:42, 23 January 2024 (UTC)[reply]
I don't see why that's inadequate? If we had a contemporary Australian news article saying "Mr Menzies (pronounced MEN-zeez)" I think that would clearly be fine; I don't see why news in video format should be assumed to be less reliable - except, of course, when there are conflicting sources? In this case, there are numerous reports from the time, and they all seem consistent on this matter. TSP (talk) 18:36, 23 January 2024 (UTC)[reply]
Well, for one thing, their accent is international newscaster, not bogan; I don't think that should make a difference in this case, but it's something to be careful of. For another, I have heard (Australian) sportscasters disagree among themselves about the proper pronunciation of Australian names (in the case I'm thinking of, whether the first syllable of the name Maya uses the vowel from day or from night). —David Eppstein (talk) 18:43, 23 January 2024 (UTC)[reply]
Robert Menzies seems well documented. For example:

Menzies, Robert Gordon Australian statesman men-ziz /ˈmɛnziz/ This name can be pronounced either as above or as ming-iss. The latter pronunciation is traditional in most parts of Scotland

— Oxford BBC Guide to Pronunciation (2006)
But that's the BBC in 2006. Regional accents are more common on the BBC nowadays and then Australian accents are another moving target too. Trying to track this across time and geographies seems beyond our scope. Andrew🐉(talk) 19:21, 23 January 2024 (UTC)[reply]
I'm not sure that's worse than with any use of sources, really - caution always needs to be taken on how authoritative a source can be expected to be on a given subject. For minor figures, the broadcaster very probably never asked, and we can omit if there's inconsistency or we don't think we can trust the sources. (I'm reminded of this clip from early in Jeremy Corbyn's career where the BBC calls him Robin Corbyn - he was a fairly obscure figure, they got his name wrong, it happens.) But where we have a wide assortment of news broadcasts from a country, talking about someone who was that country's Prime Minister for nearly 20 years, with a consistent pronunciation, I don't see any reason to assume that none of the journalists ever checked - any more than we need to look at the wealth of printed material and say "are we sure he didn't spell it Menzeys?".
As far as I can see, this section is discussing the question "should we include pronunciation guides where we don't have a reliable source relating to the specific individual?" - in this case, I think we do.
(It would, it's true, be even better if we had a recording of him saying it himself - annoyingly I've found at least two speeches where Menzies said his own name, and of which recordings exist, but both recordings omit the relevant section!) TSP (talk) 19:26, 23 January 2024 (UTC)[reply]
I think in many cases it would suffice to have a reliable source that backs the general pronunciation rules for foreign names that English speakers may have no idea how to pronounce, and to make a note that it is just the general language guidance on pronunciation. This would be more helpful to the project than requiring a reliable source for each specific person's name and how the person pronounces it, instead of the name in general. Although of course if there are such sources, they could be preferred.
Examples being List of Indian monarchs, List of heads of state of Mexico, Mayor of Paris#List of officeholders. Thinker78 (talk) 19:43, 23 January 2024 (UTC)[reply]

wp:lede deletions

Hello. An editor has repeatedly deleted from a lede the description of what is indeed the subject of most of the article. And is the most notable aspect of the subject's bio. (After failing - in discussion with other editors - to have the article changed to be simply a redirect). Curiously, the editor is citing mos:lede as a rationale. Can someone please join the discussion here? Thanks. 2603:7000:2101:AA00:95FD:29F8:EB8A:7855 (talk) 20:49, 13 February 2024 (UTC)[reply]