Wikipedia talk:Article size: Difference between revisions
Resetting archive period to status quo (180 days) before it was first changed by another editor in June, though I would likely prefer a longer period. Should be discussed given the size of the talk page.
Revision as of 23:26, 2 July 2022
This is the talk page for discussing improvements to the Article size page.
Archives: 1, 2, 3, 4, 5, 6
Auto-archiving period: 6 months
See WP:PROPOSAL for Wikipedia's procedural policy on the creation of new guidelines and policies. See how to contribute to Wikipedia guidance for recommendations regarding the creation and updating of policy and guideline pages.
Clarification needed for "article splitting activists"
There seems to be a recent trend of a couple of people (@Blubabluba9990, @Zsteve21, and @Onetwothreeip) using the Wikipedia:Database_reports/Articles_by_size page and going around to each page and trying to split articles or edit them in some ways incorrectly to try to shrink the size. Is there any description that can be added to this page to clarify that simply trying to split articles because they're relatively large, and ONLY because they're relatively large, is not good editing etiquette? This is especially a problem when the splitting is done by non-subject-matter experts, who seem to commonly make mistakes when splitting and who act without consulting the regular editors of the pages. Ergzay (talk) 04:01, 15 October 2021 (UTC)
- This guideline already states that such editorial decisions should obtain consensus. The rest seems to fall somewhat within WP:BOLD. If an editor is being perhaps too bold, the best course of action is probably direct engagement with the users. CMD (talk) 06:09, 15 October 2021 (UTC)
- I've split and reduced many articles over the last few years, mostly without any controversy at all. I can't stress enough that the vast majority of articles >450,000 bytes that I have split have been without any opposition from other editors. I'm sorry if some editors supporting such actions have been uncivil, but the etiquette is clearly a matter of how it is done rather than it being done at all and I have always sought to uphold the highest standards of civility, even when faced with spurious accusations of vandalism, sockpuppeting, bad-faith editing or other abuses. Sometimes I have disagreements with editors over splitting or condensing articles, and that's fine, we work them out. I am willing to offer advice or assistance to Blubabluba9990, Zsteve21 or any other editors that wish to help in the size area, but ultimately they will be accountable for their own actions.
- Most of all I would like to stress to everyone that civility should be of the highest importance. Sometimes editors feel that they own a certain article, and can feel offended when other editors seek to make the article congruent with Wikipedia's guidelines and the vast majority of other articles. This should be considered, although obviously we don't let editors make decisions for an article as if they are the owner(s). An editor who has never edited a particular article has as much right to make changes as an editor who has done most of the work on it.
- I would also be the first person to say that editors who have worked on articles for a significant amount of their time are often those who know best the most optimal way to split an article, or to otherwise reduce its size. These articles may not be split or reduced in the way that one might anticipate, but it happens eventually in some way or another. Onetwothreeip (talk) 06:39, 15 October 2021 (UTC)
- @Onetwothreeip "can feel offended when other editors seek to make the article congruent with Wikipedia's guidelines and the vast majority of other articles" Except this is not true. You're not trying to make articles congruent with Wikipedia's guidelines. You're trying to make articles congruent with your own opinion that many articles should be much smaller than they are now. You've made your own guidelines that you think should be followed, and that is fine, but then you go on to assert that those personal guidelines are Wikipedia's guidelines which is simply a form of gaslighting. Ergzay (talk) 20:52, 15 October 2021 (UTC)
- The articles we are talking about are the extremely long articles, several times larger than the average article size. Those articles are inconsistent with the great majority of Wikipedia articles which are much smaller. Articles being split when they get large is a normal process on this project. Onetwothreeip (talk) 21:59, 15 October 2021 (UTC)
- So they're several times larger than the average article. So what? Are they several times larger than the average well-developed, comprehensive article? And even if so, again: so what? Different topics have different needs. And are you really using Wikipedia:Database_reports/Articles_by_size, which reports the source size of each page, not the amount of readable prose? This is the worst kind of gnoming.
- You say "Articles being split when they get large is a normal process on this project" -- yeah, a normal process when carried out by people who have an interest in a topic and have thought about how it might be best presented, not drive-bys who fancy themselves working "in the size area". I'll say it again: worst kind of gnoming.
- Exhibit A: Talk:Glossary_of_engineering#Splitting_this_article was a complete waste of time -- yours; that of anyone else interested in the article; and that of anyone wanting to use the article, since you've uselessly broken it into two pieces so that readers have to jump around. You also broke intra-article links while you were at it. Tell us what you achieved there? And while you're at it, convince the rest of us that you even understand the difference between source size and rendered size (or, if you like, readable size). EEng 23:10, 15 October 2021 (UTC)
- Yes, they are several times larger than the average well-developed comprehensive article, and their excessive size is either an issue itself, caused by another issue, or both. I don't know what you mean by using that particular page, that's simply a weekly summary of the largest articles by the size of the source code. I did not split that particular article you are mentioning, but I'm happy to defend the splitting of any articles I've split myself, or any other issues to do with this area. All I am concerned with is that the articles and Wikipedia itself are improved. Onetwothreeip (talk) 23:21, 15 October 2021 (UTC)
- I've had a look at the split of that article and it seems fine to me. There doesn't seem to be any issues with intra-article links being broken. One of the two halves hadn't been renamed yet, but I've done that now. It looks like the only issue in this example was that editors were too concerned about process. Onetwothreeip (talk) 23:33, 15 October 2021 (UTC)
- Onetwothreeip: Before we go on... where do you get your statistics on the average size of well-developed, comprehensive articles? You say you didn't split Glossary_of_engineering -- that's right, you merely told others it was a good idea [1][2], and now say that splitting it into two arbitrary halves "seem fine". So I'm going to insist that you defend that decision. You have still failed to give any indication of what the benefit was, so I repeat the challenge: how did it help anything? Because here are eight ways it hurt:
- (1) Readers have to think about which of two arbitrary subpages (A-L, M-Z) has the entry they're looking for;
- (2) If you're searching for a word or phrase, you have to do it on two different pages;
- (3) Intra-article links are broken (contrary to what you say -- if you think they're not, then you're not competent to be splitting articles);
- (4) Even once the intra-article links are fixed, it will take significantly longer to follow such links (in 1/2 the cases);
- (5) Countless incoming inter-article links are now broken, and I don't see you rushing to find and fix them;
- (6) Fixing (5) will create pointless churn of watchlists;
- (7) Adding new links from other articles is now harder, since editors have to remember how the list is split;
- (8) Everyone's time has been wasted marveling at this personal crusade you've created for yourself so that you can feel you're doing something useful, which you're not.
- Now, again: what was the benefit of the split? And, specifically, when do you bunch plan to find and fix all the broken intra-article links and incoming links? EEng 04:06, 16 October 2021 (UTC)
- By comparing the sizes of these super-large articles, which are often but not always those with the most source code, with the sizes of what are considered our better articles, such as featured articles. I did express that it would be good for the article to be split, but that doesn't endorse any possible split. The splitting that did indeed take place of that article, I support.
- This is not the right place to discuss the merits of splitting the article, and I'm happy to discuss that on my user talk page. I will briefly address the points you raise. (1) assumes the reader is looking for a specific entry, which is not true. If they wanted the definition of one specific word or phrase, they would use the main search function. (2) is essentially the same point as (1). (3), you'll have to be specific which links you're referring to, but you are admitting in (4) that it's a fixable problem and I don't accept that it takes longer. The same can be said of (5), keeping in mind that I didn't split the article myself. If I did, I would be attentive to particular issues arising from the split. (6), added activity on watchlists is negligible, (7) is not true as the previous links still apply, and (8) it's up to you if you want to spend your time discussing this, that's not my fault or the fault of anyone splitting the article. There are thousands, if not millions, of articles that could use my attention or the attention of any editor, and since I don't have the capacity to address all of the articles we have, I decide which articles I focus on. Onetwothreeip (talk) 06:07, 16 October 2021 (UTC)
- "I did express that it would be good for the article to be split, but that doesn't endorse any possible split." -- What??? EEng 06:51, 16 October 2021 (UTC)
- (1) Of course they may be looking for a particular entry. By your reasoning we ought to have a thousand individual pages instead of one (or, I guess, two) consolidated pages.
- (2) Your response makes no sense at all. Let's say I'm interested in engineering terms related to the word heat sink. I have to search two different pages.
- (3) No, I don't have to be specific what links I'm referring to. If you can't find them without my help then (I repeat) you're not competent to be dealing with article splits.
- (4) So it's someone else's job to fix the broken intra-article links (I guess because you don't even know how to find them). And of course it takes longer in half the cases, since now half of the intra-article links are now inter-article links, so that you have to load a new page to follow it. Do you really not grasp that?
- (5) The point remains.
- (6) I guess watchlist churn is unimportant to you, but to those who actually tend to articles it's a significant timewaster.
- (7) What are you talking about? Someone wanting to add a link to a particular entry on what used to be a single page now has to go look to see how the page was split. Many will perhaps be completely unaware that it was split, and unknowingly link to the old article, which no longer exists.
- (8) What about the participants at WP:Administrators'_noticeboard/IncidentArchive1026#Undiscussed_split? Was that up to them as well? Are you just an innocent onlooker, or are you the editor whose activities are raising so much concern?
- I'll note again that, for all the above threadbare excuses for why nothing too bad resulted from the split, you still haven't responded to the most important question asked: What was the benefit?
- While you struggle to find an answer to that, let's look (as you suggest) at an article you yourself did split. This article [3] was a handy collection of statistics on the 2021 German elections. Apparently because its source was 400K+ (which is a result of every line of every table carrying an external link as a source, not because there's unusually much material in the article, for an article of this kind) you decided to split off one arbitrary piece [4]. Why that piece? How does that better serve the reader? In fact, do you have any idea of how that material relates to the rest of the material? Do you have even the foggiest idea of the significance of what you did, or how it might affect a reader interested in the elections? Let me guess: no.
- Pinging in Rosguill, who closed the ANI discussion linked in (8).
- EEng 06:59, 16 October 2021 (UTC)
- The opposite of splitting something into a thousand individual articles is to combine a thousand different articles into one. My comment on a talk page saying that it would be good for an article to be split doesn't mean I support every possible way to split an article. Articles should be neither too small nor too large, but often the large size is because of another problem.
- I'm willing to take this extensive discussion to my talk page, but I think you should take a break from your computer as you're getting needlessly heated. To respond briefly, on 1 and 2, readers search using the search bar in the top right. I can't address which links you're talking about in 3, 4 and 5 if you don't tell me which links you're talking about. 6 is bizarre, because edits shouldn't be discouraged on the basis that they appear in watchlists. It only takes a few edits to fully split an article anyway. 7, the old article destination has links to both.
- This next article you mention was never "a handy collection of statistics on the 2021 German elections". It was and remains an article about opinion polling for a federal German election.
- "Why that piece?" It was an especially large part of an article which was not the core content for the article and worthy of its own article.
- "How does that better serve the reader?" Both the content that remains in the main article and the article split off are more accessible to readers.
- "In fact, do you have any idea of how that material relates to the rest of the material?" Yes, the content is about opinion polling for the election; voting intention polling and favourability of the lead candidates. That article is one I have been reading for years and is currently on my watchlist. If you wish to follow up about this article, I invite you to take the discussion to my talk page. Onetwothreeip (talk) 07:26, 16 October 2021 (UTC) (Note: Much of the comment to which this is a response, timestamped 06:59, was added in subsequent edits after I had first read EEng's comment, so my response didn't cover all of what they added afterwards. Onetwothreeip (talk) 02:27, 17 October 2021 (UTC))
- @Onetwothreeip Note, normally with opinion polling you include a constituency prediction based on the opinion polling. They go hand in hand and splitting that article was incorrect based on how the two pieces of information are normally presented together (look elsewhere on Wikipedia where similar information is presented and those pieces of information are on the same page). I'm going to revert that split of the German article. Ergzay (talk) 15:51, 16 October 2021 (UTC)
- Good idea. Replace the old split page with #REDIRECT [[destination page]] {{R from merge}}. EEng 19:06, 16 October 2021 (UTC)
- That is not true. Constituency results, predictions and polling are typically separate from the other articles in an election series when there is enough content to justify a separate article. Onetwothreeip (talk) 21:21, 16 October 2021 (UTC)
- We're having the discussion here, now, because the real issue isn't any particular article, but your idea that arbitrary splits based on size, and with little or no attention to the effect on the presentation of the material, are somehow helpful. We are trying to help you see that, but you seem unable to engage the issues I've raised -- for example, after doing years of splits you still don't seem to know how to find links broken by a split, and when I've referred to watchlist churn caused by fixing broken links, you responded by saying it "only takes a few edits to fully split an article", which shows you still don't understand the issue. So we'll put that stuff aside to focus on this one thing: I've asked over and over what the benefit was of these splits, and the best you've come up with is "Both the content that remains in the main article and the article split off are more accessible to readers". Sorry, but that makes no sense. How in the world does splitting the article make any content "more accessible to readers"? EEng 19:06, 16 October 2021 (UTC)
- Sometimes it's appropriate to split large articles, but that's not always the best solution. Often there are other solutions not only to the issue of an article being exceptionally large, but other issues which also happen to greatly increase the source size of the article.
- You can't say I haven't engaged with what you've been saying; I've taken each point you've made and responded. What you mean to say is that I am not agreeing with the opinions you've presented.
- It is not controversial at all to say that articles being extremely large are harder to read for readers, and harder to edit for editors. You can read the Wikipedia guidelines to see more on that. Onetwothreeip (talk) 21:25, 16 October 2021 (UTC)
- Other editors will decide whether you've engaged my concerns. I'm assuming your statement that "articles being extremely large are harder to read for readers, and harder to edit for editors" is an attempt to answer my request that you explain how (as you claimed) splitting articles makes them more accessible to readers. Let's say that's true, at least all other things being equal (which they rarely are). But how does that apply to the engineering glossary, which isn't "read", or to the German polling, which also isn't "read" (though someone might want to use it to find trends and so on -- a use case you've neatly hobbled by isolating a big part of the data from all the rest)? Please explain. EEng 02:01, 17 October 2021 (UTC)
- Both those articles are read, viewed and accessed by readers. Those verbs can be used interchangeably with my previous use of "read", which should cover those articles. Only one of those articles you mention have I actually split, and I've very easily defended it (and also the split of another article by a different editor). Even if you disagree with an article split I have made, what you should have done is revert the split or raise it with me. In the few circumstances that I have made a split that was contested, this is what editors who opposed it have done. Then we go to the talk page and work it out, coming to an agreeable conclusion as per WP:BRD. Onetwothreeip (talk) 02:21, 17 October 2021 (UTC)
- Read, viewed, and accessed are certainly not interchangeable -- you don't "read" a glossary the way you might read the bio of some senator. But in any event, you still haven't said in what way the split makes it easier for a reader to read, view, or access the material, especially given that you've broken it into two pieces that can't be considered together. Again, please explain. EEng 18:18, 17 October 2021 (UTC)
- @EEng: Wow that split to glossary of engineering is horrendous. Is there any way to revert these types of things? Ergzay (talk) 15:08, 16 October 2021 (UTC)
- That's going to be a bit harder. More urgent is to put a stop to all this ongoing split nonsense. EEng 19:06, 16 October 2021 (UTC)
- Referring to 123IP, after years of disruption and IDHT refusals to accept the concerns of myriad editors, I think the only solution is a topic ban against splitting of any kind, including discussion of the subject, as much of the disruption is on talk pages. This excessive focus on article size is weird and counterproductive. -- Valjean (talk) 20:18, 16 October 2021 (UTC)
- Not true at all Valjean, I have a long record of collaboration on article talk pages with editors I disagree with. You've lied about me before, which you admitted to after being called out by other editors, so I don't think you are being or will be constructive in advising me. I would much prefer to have disagreements over content than whatever personal issues you may have with me, stemming from our interactions on contentious articles. Onetwothreeip (talk) 21:29, 16 October 2021 (UTC)
- Well I've never run into you before, and my analysis is exactly the same. I think we should wait to hear from the admin who closed the ANI thread on this two years ago, and then decide how to move forward. EEng 02:01, 17 October 2021 (UTC)
- You should've raised any concerns you had with any of my edits on the talk pages of those articles. You're overreacting. Onetwothreeip (talk) 02:08, 17 October 2021 (UTC)
- EEng, I see quite a bit of heated discussion about article splitting philosophy, and a handful of editors asserting that specific edits were poor. I don't have a strong opinion about the topic. Onetwothreeip has made an adequate effort to respond to several complaints here. It's clear that several editors disagree with Onetwothreeip about article organization, and that Onetwothreeip's changes are discovered by said editors long after they have been made, creating a scenario where you're objecting to a pattern of behavior rather than challenging individual edits. I would need to see much stronger consensus that Onetwothreeip's recent edits were undesirable and reckless to justify a sanction. signed, Rosguill talk 06:02, 17 October 2021 (UTC)
- I should have been clearer that it's indeed the pattern of behavior that's of concern here. Obviously a sanction (read: topic ban, as was proposed at ANI last time) would need careful evidence and a community discussion. I was just interested in your thoughts about the situation, given that you closed that discussion (and so perhaps have the best sense of its gestalt); I wasn't suggesting that you do anything. EEng 18:18, 17 October 2021 (UTC)
- 123IP, that's an oddly hypocritical personalization, considering you're charging me with actually lying about you. I suggest you strike that and stick to the issue, which happens to be your attitude toward article splitting. -- Valjean (talk) 02:06, 17 October 2021 (UTC)
- Your entire comment was about myself personally. I would much rather discuss issues to do with editing articles. Onetwothreeip (talk) 02:10, 17 October 2021 (UTC)
This entire discussion shows that there needs to be additional guidance on article size. While I think it is unlikely that this is a coordinated effort, there is a group of editors whose objective seems to simply be to split articles. It would be instructive to look at how the list of longest articles has evolved over the last year, especially the editors and their rhetorical tactics and creative use of the Wikipedia policies. Just some examples from a recent "split battle":
- "The largest, second largest and third largest articles should be split, or in some way have their size reduced." [Obviously an impossibility–there will always be a largest article.]
- "The article is almost at 500,000 bytes, so it is not consistent with WP:SIZE." [Again, obviously wrong, as the reference only discusses readable prose]
- "Our size guidelines do allow for articles to exceed 100,000 bytes, but this article is a few times larger than that." [The first part contradicts the second bullet above, the second part is irrelevant.]
- "The reason for splitting this article is best summarised as making it easier for readers to access and view the overall content, which may be better done over more than one article." [A segue to the alternate argument, point out a problem that doesn't exist.]
- "I don't think you would be convinced by anything I would show you." [The fallback approach when the others are failing.]
- "The prose size limits are there for the ease of the reader in reading the main content of the article, which is typically the written prose for most articles." [A made-up "rule"]
- "When assessing the size of a prose article, we typically don't consider tables, images and other elements to be the primary content of the article, but that's obviously not tenable for articles which primarily contain those elements." [Another made-up rule, where the second part contradicts the first part.]
As I said, there needs to be some written policy on this because once started, the assault never stops. VarmtheHawk (talk) 16:19, 17 October 2021 (UTC)
- Well summarized. Can I trouble you for diffs for the above, or links to the discussions? EEng 18:23, 17 October 2021 (UTC)
- Those quotes all appear to be mine. In the first one, I was referring to the articles that are currently the very largest, not all articles. All the other quotes are correct in their context. What's most important is to take an approach that evaluates each article's needs separately, so for example if an article's content is mostly in tables, we would evaluate the size of the content within the tables. Onetwothreeip (talk) 21:21, 17 October 2021 (UTC)
- Thank you for explaining that "The largest, second largest and third largest articles should be split..." referred to "articles that are currently the very largest, not all articles." Those of us with a public education had trouble figuring that one out. You might note that this is in direct opposition to your last sentence above. Maybe you could enlighten us as to what your position is on this issue, quantitatively if possible. VarmtheHawk (talk) 05:23, 18 October 2021 (UTC)
- See Tall_poppy_syndrome#Etymology. EEng 14:33, 25 October 2021 (UTC)
- Great analogy. But, as noted below, some tall poppies are not really tall at all; all are equal but some are more equal than others. And, as I frequently point out, there will always be a largest article. VarmtheHawk (talk) 17:52, 25 October 2021 (UTC)
Discussion about proposed solution
We obviously need some clearly stated official wording to guide editors when the idea of splitting is broached. We are not talking about normal content editing here. Splitting is rarely necessary and should always be preceded by a thorough discussion and near 100% consensus for splitting. It should NEVER be a BOLD move.
With other content changes that may be considered controversial, BOLD does not apply, but sometimes a passing editor is not aware of any controversy and they make a BOLD controversial edit. In such cases, they should follow BRD when their edit is reverted and not restore their change. They should allow the status quo version to remain untouched until a discussion has produced a very solid consensus. With normal content editing, BOLD is okay once, but if there are objections, caution should then rule. We are not talking about normal content editing here.
Proposal: It should be plainly stated here and at BOLD that:
Article splitting (which is never a normal content type edit) is a de facto controversial change that excludes appeals to BOLD. Splitting is too consequential a change to do as normal editing and using BRD. The possibility of edit warring over a split should be excluded. A split should only happen after an official RfC reaches a very clear consensus, determined by outside observers, not the one wishing to do the splitting.
Let's discuss and improve this suggested wording. -- Valjean (talk) 19:01, 17 October 2021 (UTC)
- Wikipedia doesn't have a problem of bold edits which split articles. Any objected bold edit which splits articles gets reverted and they don't get reinstated unless there's consensus. I would certainly self-revert a bold edit splitting an article if I was asked to do so. Onetwothreeip (talk) 21:24, 17 October 2021 (UTC)
- History has shown that to not be the case. BOLD splits have often caused problems. This proposal would prevent the many debacles that have led to much debate, edit wars, disruption, wasted time, and strong warnings which have been ignored. Let's plug that open pit so more people don't fall into it and even more editors have to waste time pulling them out and cleaning up the mess they have made. We would not be here if this proposal had been our guideline for splits. We're here because it hasn't been. Following this proposal would also prevent the need to threaten topic blocks for BOLD edits that did not enjoy consensus and the ensuing, long, IDHT discussions that have often followed. That's history. Let's enforce the basic principle that is supposed to work here, which is collaborative editing. Let's stop the kind of solo editing that creates problems. Splits are not normal content editing, so they should be treated differently. -- Valjean (talk) 02:14, 18 October 2021 (UTC)
Splits are not normal content editing
– That's a really good point. In many cases they're more akin to RMs, and should be treated as such (in -- I repeat -- many cases, but by no means all). EEng 06:00, 18 October 2021 (UTC)
- Moves are actually often very routine and unremarkable. RMs are in effect only for contested moves. Onetwothreeip (talk) 06:20, 18 October 2021 (UTC)
- What part of
in many cases ... I repeat -- many cases, but by no means all
do you not understand? You have an extremely annoying habit of responding to fragments of what others say. EEng 17:47, 18 October 2021 (UTC)
- You're here because you want to be here, in your case because you saw a comment on my talk page. "Bold splits" have a very simple solution when they are contested: they are reverted. That is what's happened every single time they were contested before and is normal process. If someone is edit warring over it, that's a specific matter solved through our usual processes. Onetwothreeip (talk) 02:51, 18 October 2021 (UTC)
- I'm not quite sure what the first sentence means, but simply repeating the current policy doesn't really add much to the discussion. Valjean has raised a valid point, and I don't think the policy anticipated the destructive editing that is occurring. Case in point is zsteve21. He says, and I quote: "I am just a novice Wikipedia editor on my own who wants to make articles have manageable markup sizes." Notwithstanding the bizarre nature of that statement, do we want a novice editor making bold split decisions as he has numerous times in his impressive 2-month experience with Wikipedia? I don't much care about "List of Hallmark Movies" but don't think an article like "Glossary of Engineering" should be messed with without a discussion with the large number of expert contributors. VarmtheHawk (talk) 05:23, 18 October 2021 (UTC)
- At this point I don't know if we need a guideline change, or a handful of topic bans, or both. Your last example there is pretty scary. EEng 05:46, 18 October 2021 (UTC)
- If a particular editor is making bad edits, that's a specific matter and not a flaw of policy. It would be helpful if editors could raise what they think are the bad edits. These are all pretty solvable simply by reverting such bad edits. Onetwothreeip (talk) 06:22, 18 October 2021 (UTC)
- This is not about "good" or "bad" edits. Splits can be either. They are bad when made as BOLD edits without a pre-existing consensus for "if" and exactly "how" it should be done.
- Splits are not normal editing. They are very different and should be governed by different rules. They should not be subject to back-and-forth and BRD editing procedures. All the preliminary work should be done on the talk page, with no attempts to perform the split until a consensus is reached. -- Valjean (talk) 17:35, 18 October 2021 (UTC)
Summary to-date. Since this discussion appears to be winding down, I thought it would be useful to summarize the major points that have been made.
- There is a group of editors whose mission is to split the largest articles, regardless of merit.
- This group will use a myriad of arguments, generally untrue, irrelevant or exaggerated, and will continue to recycle these arguments until the contributors of the article are worn down.
- Counterarguments by the subject matter experts to these arguments are met with derision or requests to prove their counterarguments by comparing their work to other Wikipedia articles; responses are never good enough.
- They support the destructive process of "bold splitting" and believe that it is easy to counter, despite evidence to the contrary.
- They are very familiar with Wikipedia policies, much more so than an average editor working on an article.
- They do not support any changes to policy.
- They are particularly adept at using the SIZE argument, making it mean whatever they feel like at the time.
Please feel free to add to this list or to show that any of them are untrue. VarmtheHawk (talk) 17:41, 18 October 2021 (UTC)
- There may be more but that's a great start. Perfect description of the behavior in display at Talk:Glossary_of_engineering:_A–L#Reverting_the_split. A friend suggests that
What's needed is much more prominent guidance at WP:SPINOUT that this should only be done to well-established articles when there is both a strong consensus and adequate subject expertise to make a sensible subdivision. WP:HASTE is too wishy-washy about how maybe you might think for five seconds before breaking out the chainsaw.
I think that's a great framework to work from. EEng 17:02, 19 October 2021 (UTC)
- In looking into the background of this issue, I've noticed that, in the last six months, the top 10 longest articles have been targeted and split by this group, relegating them to a lower spot on the list. Of the current top 20, almost all are identified as targets for splitting. The first two, List of chess grandmasters and List of Falcon 9 and Falcon Heavy launches are subject to fierce debate (in addition to the revisit of Glossary of engineering). Yet the third, List of The Amazing Spider-Man issues has no such comments. Even more interesting is the fact that #7 and #18 are about Donald Trump and his presidency and both exceed 100k in readable prose. I wish someone would comment on this dichotomy. Perhaps a list of longest articles by readable prose?
- What often goes unsaid is that an article once split may again be subject to another split as the list narrows. The attempt to call a moratorium on further discussion of splitting List of Falcon 9 and Falcon Heavy launches was, of course, met with derision.VarmtheHawk (talk) 17:54, 19 October 2021 (UTC)
- When you say "top 10 longest", you mean longest per that database report Ergzay mentioned in his OP at the top of this thread? EEng 18:04, 19 October 2021 (UTC)
- Hopefully fixed.VarmtheHawk (talk) 18:53, 19 October 2021 (UTC)
- My point is that that report is about wikisource size, which is completely irrelevant. EEng 01:24, 21 October 2021 (UTC)
- The longest articles purely by prose would mostly be related to Donald Trump and recent American politics, in my experience. Onetwothreeip (talk) 06:49, 20 October 2021 (UTC) Onetwothreeip (talk) 06:16, 20 October 2021 (UTC)
That's not even remotely true (see, for example, Douglas MacArthur), but I think the picture on this issue is getting clearer:
- The arguments usually presented towards splitting an article are frequently misstated, and the articles in question are almost always in compliance with WP:AS.
- There is not a "one size fits all" approach that works.
- Many articles should be split or otherwise reduced in size, but any argument for doing that should include a valid reason. For example, if one section is considerably more detailed than the rest of the article, that may be a candidate.
- Appearance on Special:LongPages (and similar reports based on wikisource size) has zero weight in arguing for an article to be split (e.g., List of The Amazing Spider-Man issues);[further explanation needed] all arguments should relate to the amount of material the reader sees, distinguishing article prose vs. tabular (and similar) material vs. notes and references vs. images and other visuals, etc., and take into account the distribution of material into sections, the nature of the topic, ways readers are likely to approach the material, and so on.
- The list of prose-size breakpoints (e.g. "almost certainly split" at 100K) is just something someone wrote on the back of an envelope 15 years ago, yet certain editors treat it like the Ten Commandments.
- Because of WP:HASTE, articles should not be split boldly. If an editor feels that an article needs to be split, they should make a concrete proposal and consensus should be reached. Significant weight should be given to the opinions of the subject matter experts.
If the vote is against, the issue should be put to rest until the article has major changes.
In particular, this practice of constantly throwing up arguments, with the corresponding scramble to respond, really should stop.VarmtheHawk (talk) 23:52, 20 October 2021 (UTC)
- Subject to User:VarmtheHawk's approval, I've made some changes to the above list. EEng 01:37, 21 October 2021 (UTC)
- Yes, no problem. The comment above reflects that the List of The Amazing Spider-Man issues is not materially different from the lists being contested and yet no one has a problem with it. What I meant to say on the proposed deletion was: "If the vote is against, the issue should be put to rest. Should the article significantly change through the addition of material, the issue could be revisited." Either way is fine with me.VarmtheHawk (talk) 01:48, 21 October 2021 (UTC)
- There's always a general discouragement of re-raising a question too soon, but specific language saying "You can't raise this again until X" would be very unusual, and there's no special reason for it here. It would become a point of contention, trust me. EEng 01:58, 21 October 2021 (UTC)
- VarmtheHawk, I agree, and would like to add that article splitting is so consequential that long discussions by advocates should be avoided. They should only try to split articles where they meet no resistance from other editors. The need for splitting, and manner of doing so, should be readily apparent to all and uncontroversial. If there is much resistance, they should move on and not try to press the point. -- Valjean (talk) 15:19, 21 October 2021 (UTC)
Rewrite of WP:SIZERULE section
I did an initial rewrite of WP:SIZERULE as we seemed to be making no progress in the discussion. It now uses words instead of byte size, as humans don't read bytes, we read words. I took the previous values and used a length of "5" for the word length, which is intentionally small, both to uprate the length of articles to be considered and to factor in spaces and other punctuation characters. Ergzay (talk) 01:51, 26 October 2021 (UTC)
- Switching from bytes to words has one VERY important effect, which is to stop people from stupidly looking at the size of the source instead of the amount of readable prose. EEng 02:09, 26 October 2021 (UTC)
- Can we recommend how people can find this info? Maybe using Wikipedia:Prosesize?VR talk 21:24, 28 October 2021 (UTC)
- Ergzay I'm not sure if the "5" factor makes sense. I'm looking at today's main page article and it's 27,527 bytes or 2,345 words, meaning a factor of 12. Yesterday's main page article had a factor of 39. VR talk 21:32, 28 October 2021 (UTC)
- For God's sake, the 27k is the wikisource size, not the readable prose size, which is 14k. If we can't get stuff like this straight this conversation is doomed. EEng 05:39, 29 October 2021 (UTC)
- @Vice regent I wasn't trying to match it exactly. I intentionally was doing it to slightly expand the maximum size of articles as the rule was written back when computers and phones in general were less performant and couldn't handle large page sizes. (The sizes were originally added before 2006 when flip phones were the norm even in the US.) Exploring the page history is illuminating. Ergzay (talk) 23:11, 28 October 2021 (UTC)
- @Ergzay: ok but I don't see this as merely a slight expansion. The last few main page FAs have had these many words: 3339, 2345, 4799, 3308, 3003, 3931, 2055, 2869, for an average of ~3,200 words/featured article. So maybe we should recommend splitting a lot earlier than 20,000 words.VR talk 00:24, 29 October 2021 (UTC)
- I don't think the factor of bytes to words should matter. The point of the size rule is to separate out the byte count as the reason to split the article, as the byte count includes tables, references, re-worded links, and other things that should not be counted for the reason to split articles. Wikipedia uses XTools to calculate prose words and characters, and I verified by copying over the article page for 1989 (Taylor Swift album) to Microsoft Word, deleting the tables and photos, and using the Word Count tool on the Review tab. That gave a word count of 5,132 words, for 34,053 characters, for an average word size of 6.64, compared to the XTools calculation of 4,819 prose words and 30,423 prose bytes or characters with an average word size of 6.31, not the factor of 39 mentioned above. This is a reasonable size article that is not close to needing to be split. If we compare a word count of 4,819 to an article byte size of 188,818 for those stuck on byte size, then if this article grew to 20,000 words, the article might have a byte size of 784,000 bytes, maybe an imposingly large article size, about 50% larger than the article sizes of 500,000 that the article editors have been pursuing. I suppose we could compromise on 15,000 words, which if we extrapolated the Taylor Swift album article, would take it to 588,000 bytes, in range with other extra large articles, and almost 95,000 prose bytes, at which point it is a good idea to recommend splitting the article. Mburrell (talk) 03:11, 29 October 2021 (UTC)
- 15,000 words is nearly 5 times more than the average featured article that has ~3,200 words (1989 (Taylor Swift album) is the biggest FA; I'm using FAs on main page in last week as a random sample). I'm seeing plenty of other FAs also in the 2,000-5,000 word count range. I think moderately sized articles are easier to read and maintain and that's what we should strive for.VR talk 04:25, 29 October 2021 (UTC)
- I'm going to draw a line in the sand right here and now on this (and I'm sorry if you feel picked on -- not my intention):
- (a) Our convenience in editing is of ZERO consequence. All that matters is what serves our readers best.
- (b) Articles are very rarely "read". Most, er, readers read the lead, read or skim the first section or two, and then dip in here and there according to level of interest or what they're after, possibly using the TOC as a guide. Talking about making articles "easy to read" (read top to bottom, that is) is a red herring.
- EEng 06:00, 29 October 2021 (UTC)
- @Vice regent To be frank I'm in favor of deleting the section entirely. The rule was originally written in a time frame of the internet when it was dominated by low performance devices with very low amounts of internal memory. That is not the norm now even for the most underpowered of Android phones. I changed it to words to "repurpose" the size section for something useful, as splitting on fixed byte size in this day and age is frankly ridiculous. Ergzay (talk) 05:54, 29 October 2021 (UTC)
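The bytes-per-word factors quoted in this thread differ mainly because they divide different byte counts by the word count. As a rough illustration only, the sketch below recomputes the ratios from the figures given in the comments above; the helper name `bytes_per_word` is hypothetical, and the 14,000-byte prose figure is an approximation of the "14k" mentioned above.

```python
# Bytes-per-word ratios for the figures quoted in this thread.
# The helper name is hypothetical; the numbers are copied from the comments above.

def bytes_per_word(size_bytes: int, words: int) -> float:
    """Average number of bytes per word for a given size measurement."""
    return size_bytes / words

measurements = {
    # label: (size in bytes, word count)
    "Main page FA, wikitext size": (27_527, 2_345),
    "Main page FA, readable prose (approx. 14k)": (14_000, 2_345),
    "1989 (Taylor Swift album), XTools prose": (30_423, 4_819),
    "1989 (Taylor Swift album), pasted into Word": (34_053, 5_132),
}

for label, (size, words) in measurements.items():
    print(f"{label}: {bytes_per_word(size, words):.2f} bytes/word")
```

Dividing the wikitext size by the word count gives a factor near 12, while the readable-prose measurements all land between roughly 6 and 7 bytes per word, which is why the choice of byte count matters more than the exact factor.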
- I think there still needs to be byte-size considerations, particularly when you get to pages like tables and lists that do not use a lot of prose. While it is important to not have extensively long prose articles and thus reasons to split, we also don't want pages that are extremely large in byte-size for readers on slower/limited connections (5g and fast connections are *still* not universal). You probably need to have both word count and byte size, though word count should be the leading reason to split. --Masem (t) 23:14, 28 October 2021 (UTC)
- I think this suggestion conflates two different discussions, because lists and tables are excluded from the paragraph about readable prose. The Lists, tables and summaries section does not have a size limit specified, but these days there is an unofficial splitting logic that reduces tables and lists when they approach 500,000 characters.Mburrell (talk) 03:11, 29 October 2021 (UTC)
- Let's make the unofficial official. And I'd prefer splitting much before 500,000 characters. VR talk 04:27, 29 October 2021 (UTC)
- Characters are not bytes though. Wikimarkup isn't readable characters either. None of these metrics are good as they are all open to interpretation. It's better to have no rule at all, or better to have a "no extreme articles" rule and "split where appropriate" rules, rather than constantly trying to chop articles to smaller sizes purely based on their size. Some topics need lots of references and sources which inflate the size to extreme sizes despite the page itself being small. Others have massive amounts of prose with few sources and could often have sections split out and summarized when they become too bloated. There is no hard and fast rule based on size on when something should be split, so writing it down in a page like this just gives excuses to trolls to come and split a page you're working on (I had the poor experience to encounter such a troll recently which is what caused me to come to this page and start this effort to fix this page). Ergzay (talk) 06:12, 29 October 2021 (UTC)
- @Masem When you say "not 5G", I don't even have 5G nor does anyone I know. 5G isn't even relevant. Byte size is a historical remnant of the time when 2G (or slower) was the norm and devices had memory sizes that were given in terms of single digit megabytes. The size limits were originally added even before the iPhone 1 came out with 128 MB of onboard RAM (and little left for the web browser) which was huge for the time. The era of trying to limit page sizes to such an extreme extent is long past. Modern websites even clock into the megabytes (which I agree is too much), but trying to chop webpages at the 100kb or 200kb mark is just absurd. Ergzay (talk) 06:00, 29 October 2021 (UTC)
- While I think this whole guideline is in dire need of reform, I vigorously object to etching in stone a new set of numbers coming off the back of some envelope in 2021, to replace the previous written-in-stone numbers that came off the back of some other envelope 15 years ago. And I absolutely cannot believe there's still talk about "byte size", meaning the size of the wikisource -- which is absolutely irrelevant to what the reader sees, the cost of downloading, or anything else. And even if you fixed that goofup in the discussion and talked about HTML (etc.) size instead, that's still irrelevant to download cost. Are there people in this day and age who don't realize that images completely dominate download cost? And is there anyone who thinks this guideline will be substantially changed without an eventual RfC? EEng 04:56, 29 October 2021 (UTC)
I'm going to muscle in here to order you all to do something, on pain of excommunication (and I've got a personal pipeline to the pope, so I can arrange that if really necessary). Before anyone says one more word, everyone needs to go to Preferences > Gadgets > Browsing and check the box that says Prosesize: add a toolbox link to show the size of and number of words in a page. That adds a "Page size" link to the toolbox to the left of each article. When you click that, it barfs back a bunch of statistics for the article you're looking at, including
- Prose size (text only): XX kB (YYYY words) "readable prose size"
Those, and only those, are the numbers we should be discussing (at least when we're talking about prose, not tables and quotes and stuff).
After everyone does the above, I'll allow discussion to resume. The Great and Powerful Oz has spoken! EEng 05:46, 29 October 2021 (UTC)
- I have reverted the change, which did more than a slight upgrade to length, it actively changed the level of guidance so that it was far more encouraging of longer articles. Humans read words not bytes, but bytes are used as a proxy here, on the assumption that 10,000 words is around 50,000 bytes. The SIZERULE section is a supplementary rule of thumb for the rough 10,000 word guideline. Further, the edit removed mention of two tools which can measure the byte proxy, with no replacements. Regarding overall technical size, the byte size of the SIZERULE section explicitly refers to prose size, so it does not correspond to the download size, the loading size, or similar considerations. If there is a technical issue relating to overall technical size (eg. Wikipedia:Template limits), it would need to be reflected in the Markup size section. CMD (talk) 05:38, 29 October 2021 (UTC)
- I made the change and I frankly agree. I'm primarily in favor of deleting the section entirely, but I kept the section and rearranged it as a concession. If the preference is that it's entirely bad I'd prefer to delete it. Ergzay (talk) 06:01, 29 October 2021 (UTC)
- I've deleted the section as an alternative rather than trying to massage it into something more appropriate for this day and age. If this is agreeable (or no comments in a few weeks) I'll also go fix all the now broken links to the section. Ergzay (talk) 06:04, 29 October 2021 (UTC)
- I love your enthusiasm, and personally I'm for it, but just killing all numbers is never going to fly without substantial discussion. Or maybe it will -- wouldn't that be wonderful? EEng 06:15, 29 October 2021 (UTC)
- Right now I'm trying to drive more discussion on why people think fixed size limits are a good idea at all. As far as I'm aware, if you have internet at all wikipedia web page sizes are going to be smaller than the rest of almost anywhere else on the internet. Even the largest pages (that don't have images). I'd like one person to arrive with real numbers that can justify such limits. As so far it's just been hand waving. Ergzay (talk) 06:19, 29 October 2021 (UTC)
- I think EEng's doctrinal invocation should be noted here. The presence and absence of images is not relevant to the SIZERULE subsection, and overall web page sizes similarly do not relate to the subsection's purpose or intention. If the issue relates to images and overall web pages, I am not sure why SIZERULE is being edited in any direction. CMD (talk) 06:44, 29 October 2021 (UTC)
- I think you're agreeing with me but to be honest I'm not sure what you just said. EEng 06:47, 29 October 2021 (UTC)
- @Chipmunkdavis My primary impetus for starting this discussion is people (gnomes/trolls) abusing SIZERULE to go around to many disparate articles that they are not involved with, trying to split them, often ignoring discussion or not seeking to bring in the usual editors of the page to consult their opinions. Then if you go against their splitting of the article they immediately point to SIZERULE and use the wikimarkup size as proof that they are doing good work by chopping other people's articles into smaller pieces. I wish to stop this behavior. How we get there I am not particular on. Cutting off their incorrect use of a very old article written in the days of 2G and sub 100 MB memory phones by deleting/modifying/etc the section they are using seems like a good start. In either case the article should be changed because it is outdated for the modern era. Ergzay (talk) 07:54, 29 October 2021 (UTC)
- Anyone pointing to wikimarkup size to support SIZERULE is not applying SIZERULE, and if done consistently should be handled as disruptive behaviour as in any other area of the wiki. With regards to 2G and sub 100 MB, such factors are not relevant to SIZERULE, which is more or less a part of MOS, and not too attached to how modern our era is. CMD (talk) 08:22, 29 October 2021 (UTC)
- @Chipmunkdavis How is it part of MOS? Isn't it just something that people just got used to as it encrustified into "this is just how we've always done things"? Look at the section I created below. It definitely started as a technical limitation when the rule was originally created. Ergzay (talk) 11:43, 29 October 2021 (UTC)
- It is similar to MOS in how it is treated, providing general parameters for the formatting of our articles. MOS is indeed pretty encrustified. CMD (talk) 12:34, 29 October 2021 (UTC)
- Well I'm up for getting MOS changed. It deserves to be with regards to this. Ergzay (talk) 12:51, 29 October 2021 (UTC)
- That's fine, but it should have a wide discussion involving the areas that use it, such as the GAN and FAC processes. CMD (talk) 05:52, 30 October 2021 (UTC)
- Today's FA is Climate change, which is a complex topic with scientific, political, and economic dimensions. Despite its complexities, it is covered in a prose size of only 53,000 bytes (8298 words). Each of the article's main sections has its own article, and what's left behind is a summary (WP:SUMMARYSTYLE). To me, limiting article size is not about bandwidth limitations; it's about article quality. VR talk 15:50, 31 October 2021 (UTC)
- Took a look at Climate change. You are correct that the article has a prose size of 53 kB, 8294 words. It has a wiki-text of 263 kB. So if we took the 8294 words, scaled it up to 20,000 words proposed for a size limit, we would have a prose size in bytes of 128 kB, and a total wiki-text size of about 634 kB, not that we are trying to use wiki-text size, so just using that to compare to currently enforce unofficial standards. This makes it a little larger, but not excessively larger, so if I am reading your statement as a comment on article size, it seems to say that a proposed 20,000 word limit would be acceptable? Or are you suggesting that a smaller 15,000 word limit would be more acceptable (a scaled 95 kB prose text, 475 kB wiki-text)? Maybe I am missing the thrust of your argument on the discussion on article size. Are you saying article size does not matter, as long as every article is written to the quality of a featured article standard? Could you expand on what you are trying to state in terms of article size? Thanks. Mburrell (talk) 20:50, 31 October 2021 (UTC)
- @Mburrell: yes I think 15,000 words should be the limit. Although personally I'd prefer even lower, as the policy page does quote 10,000 words as ideal from a human attention span perspective[5]. Smaller pages force us to summarize content, which is incredibly useful to the average reader (they can always go to the spun-off article if they want more detail). VR talk 23:11, 31 October 2021 (UTC)
- Or they could keep reading the current article if they want more detail. I'm sorry, but this discussion is built on sand. I've just removed the passage asserting that articles should be X words at most because humans read at Y words per minute and can only concentrate for 40 minutes -- cited to a book on management (not psychology, or education, or anything like that) -- and which at the same time links to the article attention span -- which, interestingly, says that adults can concentrate for 5 to 6 hours. It's all a mess of conjecture and OR, founded only on a few random editors' unsupported assertions about what our readers want or need. EEng 00:56, 1 November 2021 (UTC)
- Can we have an RfC to decide this? Guidelines should reflect a broader and stronger consensus than we have here.VR talk 01:15, 1 November 2021 (UTC)
- I too would like to see an RfC. I am mostly in agreement with the changes that User:EEng is doing, but I agree that we need a broad and strong consensus to change a project page. I would not mind a fuzzy upper limit where reducing or splitting a prose article should be discussed, and I would not mind setting the relative (not absolute) limit at either 15k or 20k, and I wouldn't mind an upper fuzzy limit on lists and tables as well, but it should be based on community agreement, or real size logic, and not the hand-waving logic that EEng has been excising. I think the current modifications to the article give a good discussion point for the RfC, but I would like to see community buy-in on the changes. Mburrell (talk) 02:13, 1 November 2021 (UTC)
- As I note in a separate section below, Wikipedia:Splitting is a parallel page that very much overlaps this one (or the way this one was until the axe was taken to it recently) re the triggers and considerations for splitting, plus it gives detailed how-to on carrying out splits. I think the thing to do is to take this conversation over there, or get the participants there over here, and hash it out among us -- before opening any RfC. On the down side, I now have an external commitment that's going to take up a lot of time, and may not be as attentive as I know all of you would love me to be. EEng 02:52, 1 November 2021 (UTC)
- As there's no recent discussion there, best to continue it here. And yes the RfC would affect the wording both here and at Wikipedia:Splitting. What exactly will we be asking in an RfC? a) no size guideline, b) prose-based size guideline of a max of 10 or 15 or 20 k words, c) a guideline that is based on both prose and other markup? Sorry just hypothesizing. VR talk 16:35, 1 November 2021 (UTC)
- There shouldn't be any use of markup in the size guideline, if we have a guideline at all, as that criterion existed historically only for technical reasons that no longer apply.
- @EEng I know you said you'd be busy soon, but you're leading this very well so far with your edits. If you want additional people to comment just ping me however and I'll stop by. I'll be watching this page but probably not regularly looking. Ergzay (talk) 05:37, 2 November 2021 (UTC)
- It's impossible to overemphasize the following: There are clearly editors over at WP:Splitting who would be interested in what we've been doing here, and my guess is they don't have this page watchlisted, which explains why there's been so little comment so far. They need to be brought into the discussion before we start thinking about a project-wide RfC. It's just that I've been hesitant to open that door given my other commitments. EEng 11:05, 2 November 2021 (UTC)
- Comment - so I came to this page this morning, looking for the usual article size guide table, only to find it's gone completely... and on the back of a few bold edits by just two or three editors, which has already been reverted once by Chipmunkdavis. So I've reverted again, pretty much for the identical reasons given by CMD above. The guidelines on article length are a longstanding and highly-used aspect of the MOS, and I cite the 60kb "probably should be split" guidance frequently at FAC and elsewhere. Of course, there are exceptions, and the guidance already gives advice about not being hasty, but the general guidance is sound.. and it's not just about length of time to load the page (something which is still a factor for those in the global south who don't enjoy the advanced internet connections that we do), it's also a simple issue of readability and good article design. If changes of this magnitude are to be effected, it needs to be via a sitewide RFC, and with extremely good reasons set out as to why having long articles is suddenly fine and dandy, when it never has been before. — Amakuru (talk) 10:17, 4 November 2021 (UTC)
- @Amakuru They're longstanding because they've been forgotten about with the advancement of technology. Please see my summary in the documentation down below. The rules are abused by trolls/gnomes to chop articles on the basis of wiki markup size rather than some reasonable standard about clarity or topic coverage being too wide. There's numerous articles that have been long standing but have been ruined by the adventurism of these types of people. The size rules date to the era of 2G pre-smartphone phones and when many people had dialup internet and should be discarded. Ergzay (talk) 15:26, 4 November 2021 (UTC)
- Also, do note @Vice regent recently put an item on the talk page for starting discussion to head into an RFC. Ergzay (talk) 15:27, 4 November 2021 (UTC)
- It really doesn't matter how the guidelines came into being 15 years ago, the point is that they are in effect today and they are used regularly to inform size decisions and I see no evidence that the rules of thumb contained in those tables are not relevant now. As noted previously, the recent FA articles on climate change and Earth both come in at significantly below 60kb of prose size, yet these are among the most complex topics that one could possibly seek to write an article on. So it is not only possible to write articles that aren't too long, it is also desirable. From a stylistic standpoint as much as from a technology one. And, on that topic, the "advancement of technology" you mention may be significant in the western world, but as someone with experience working in Africa, I can assure you that bandwidths and data rates there can still be limited.
- If the guidance re bytesize is misunderstood by those whom you characterise as "trolls/gnomes", then the solution is to clarify the language around this guidance so that it's crystal clear that we refer to prose sizing rather than Wiki markup sizing. The solution is not to throw the whole guidance out altogether, just because a few people misunderstand it. Also, the concept of prose size as a byte count is well-established and already used to ensure minimum article sizes in processes such as WP:DYK and destubathons, so let's not pretend this is an archaic and little-understood metric. Cheers — Amakuru (talk) 15:57, 4 November 2021 (UTC)
- Confusion re source size vs prose length is the least of it: the history shows that the numerical limits are simply made up, the end product of a cascade of arbitrary transformations applied to an original, real, 32K limit on source size (which of course no longer applies), leavened by some nonsense about human attention span. They're built on nothing, and while it may be comforting to feel you're guided by some kind of authority [6], it's not healthy for articles to be cut apart on such a basis.
- There were dozens of changes made, the substantive ones explained in edit summaries. If you think something should be restored or changed, do that (after duly considering the reason offered for the original change, of course), but blindly reverting everything because you miss the comfort of someone telling you what to do instead of deciding for yourself, no. EEng 18:12, 4 November 2021 (UTC)
- The edits start here [7]. I propose we keep them. If no one says what they don't like about them in the next few days I'll be putting them back. EEng 06:35, 10 November 2021 (UTC)
- I didn't like the edit here.VR talk 13:07, 10 November 2021 (UTC)
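- For anyone wanting to check the scaling figures quoted above for Climate change (8294 words, 53 kB prose, 263 kB wiki-text), here is a minimal sketch of the arithmetic. It assumes, as that comment does, that byte size grows linearly with word count; the function name `scale_sizes` is made up for illustration and is not part of any Wikipedia tool.

```python
def scale_sizes(words, prose_kb, wikitext_kb, target_words):
    """Linearly scale prose and wiki-text sizes (in kB) to a target word count.

    Assumption: bytes per word stays constant as the article grows.
    """
    factor = target_words / words
    return prose_kb * factor, wikitext_kb * factor


# Figures quoted in the discussion for Climate change.
WORDS, PROSE_KB, WIKITEXT_KB = 8294, 53, 263

for target in (15_000, 20_000):
    prose, wikitext = scale_sizes(WORDS, PROSE_KB, WIKITEXT_KB, target)
    print(f"{target} words: ~{prose:.1f} kB prose, ~{wikitext:.1f} kB wiki-text")
```

At 20,000 words this gives roughly 128 kB of prose and 634 kB of wiki-text, matching the figures quoted; at 15,000 words it gives a little under 96 kB and 476 kB, which the comment above states as 95 kB and 475 kB.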
Documentation of the history of WP:SIZERULE
Here I will document the history of SIZERULE and show how little it has been updated in recent years. SIZERULE first appeared with the creation of this page on March 7th, 2003. At the time the max page size was given as 30K. It was explicitly written at that time as a technical limitation of browsers of that period. Smartphones didn't exist in 2003, and data plans for phones, such as they were, often used 2G or worse. Some people had cable internet but most people still used dialup. At some point in the intervening years the value was tweaked to 32K, likely to be base 2, and there were additional changes clarifying that the limit was only for the article, and not for lists, as the meaning started to drift away from it being for a technical reason. Some time before 2005 or so a clarification was added that said that mobile browsers and some web browsers crop any pages longer than 32KB and refuse to load any more. On January 17th, 2006 the limit was increased to 50kb. On February 22nd, 2007 the limit was increased to 100kb. And there it has sat for 13 years. With a technological and digital revolution happening around it, we still keep chopping articles to 100kb in wikitext length for "technical reasons".
Does this not strike anyone else as utterly ridiculous? Ergzay (talk) 08:19, 29 October 2021 (UTC)
- It's not only utterly ridiculous, but completely and totally ridiculous as well. And here's more ridiculousness: that early guideline was talking about the size of the wikisource [8], but then suddenly someone apparently just stuck in the words "of readable prose", thereby completely changing the meaning [9].
- Then in 2006 someone actually proposed (AND I AM NOT MAKING THIS UP) a "Mandatory breakup committee":
First, an editor tries to establish consensus: the issue is brought up on the talk page, and it is suggested that the regulars break up the article into subtopics, with short summary paragraphs (w/ main article attachments), see thermodynamics as an example, so that the main page gets below a certain limit. Second, if plan #1 stifles out in argument and indecision to act, for a number of consecutive weeks, then an breakup arbitration committee notice is placed on the talk page, putting an ultimatum deadline, such that either the regulars break up the page to below a certain limit by that date or an external breakup committee, enforced by a team of administrators, will do so.
- This cookie-cutter approach persists to this day (see elsewhere on this very page) and must be resisted at all costs. As one editor put it (elsewhere on the page just linked):
The persons providing the justification for limits on article size are predominantly "techies" for whom the writing part is a chore compared to the joy of formatting pages, blocking miscreants and otherwise engaging in the plumbing aspects of html page production. These are the folks who theorize that readers will get bored with articles that are longer than x kb (notice how the limits are in kb and not words - very instructive) often because of their own inadequacies in that department. What is lost in this discussion is that some articles are well written and can hold the readers' interest far longer than much of the mediocre prose found in other entries.
- Just so! I absolutely support removing the numerical limits, which are fashioned from whole cloth and based on no evidence whatsoever about what readers want or need. EEng 13:47, 29 October 2021 (UTC)
- I'm afraid you're both mistaken. There are many content creators, including yours truly and SandyGeorgia, who emphasize the importance of writing concise articles and using summary style when they get too long. Anything over 10,000 words is unlikely to pass at FAC. (t · c) buidhe 10:34, 7 January 2022 (UTC)
- Note to self and to my fellow editors: Don't forget Wikipedia:Splitting, which repeats the stupid character count cutoffs, claiming they're based on an assertion that readers can concentrate for 30-40 minutes, citing our very own article Attention span -- which says nothing like that, rather says 5-6 hours. We all know this varies tremendously depending on the reader, motivation, nature of material, and 50 other things, and figures like 30-40 minutes (or 5-6 hours, for that matter) are just pulled out of the air. This page and that page need to be harmonized somehow; they're really both trying to do the same thing. I know! Let's merge them! EEng 22:55, 30 October 2021 (UTC)
- I agree that the attention span argument is quite unfounded; in my experience many readers don't even read a full Wikipedia article from top to bottom anyway; they are looking for a specific piece of information. But I think that's not the only reason why you'd want to limit how long articles can get. Extremely long articles have longer load times and are harder to navigate. I've spent some time at Wikipedia talk:Manual of Style lately and that page is so long that I constantly get lost. If we want to aid readers in finding information then splitting extremely long articles up into ones that are overseeable seems like a good idea to me. ―Jochem van Hees (talk) 16:41, 2 November 2021 (UTC)
- I'm afraid I must disagree with some of what you say. The load-time argument is completely fallacious given that most articles' download time is driven almost entirely by images. And saying something about articles based on your experience with MOS is like saying you drive a compact car because you had trouble finding the bathroom on a 747. They're completely different animals. EEng 22:44, 2 November 2021 (UTC)
- Sorry I only noticed your reply just now. I'm not sure which fallacy my argument has, nor where you got that statistic for download time from. I'm not at all an expert on this but I did a quick test by loading the page Border control, which is not only huge in page size but also has loads of images. According to Chrome's network devtools, loading the page content took longer than any of the images; the content took 833ms while the images were anywhere between 10ms and 130ms, and they were downloaded in parallel. (That's with my relatively good wired connection; it will take significantly longer on a weak mobile wireless connection.) In any case, if you're really that dissatisfied with me using the MOS as an example, I only used that example because that happened recently and was therefore quickly on my mind. I have had similar issues during the UEFA Euro 2020 when I often wanted to look up the latest developments but had to scroll all the way down each time; especially on mobile it's hard to find stuff. Or an article that I have worked on myself, List of Eurovision Song Contest entries, which was ginormous before we split it into two. ―Jochem van Hees (talk) 12:13, 5 November 2021 (UTC)
- @Jochem van Hees The mobile site being a bad user experience is mostly a result of them collapsing every table section heading by default. You can scroll from the top of the page to the bottom of the page in an instant though as a single flick can move very quickly from top to bottom. Ergzay (talk) 16:17, 6 November 2021 (UTC)
- I was just trying to replace some references in an article whose entire size (not just prose) is 131K. It was slowing my browser down to edit the entire article, so I had to edit it section by section. Does this happen to others too? VR talk 13:08, 10 November 2021 (UTC)
- @Vice regent I've never had that problem personally though for large articles (the one I edit commonly is over 400K) however it sometimes takes a couple seconds to load the page and then submit the edit for the page, but there is no problem browsing the page or editing the page. I've heard that the "visual editor" is extremely bad/slow for Wikipedia. Are you using that? Ergzay (talk) 20:09, 10 November 2021 (UTC)
- No, always source editor. Maybe I have too many windows or tabs open? VR talk 20:32, 10 November 2021 (UTC)
- I'm not sure. I'm on Firefox on a more recent Macbook M1 but I had no problems on my 2015 Macbook Pro I used to use. Ergzay (talk) 21:25, 10 November 2021 (UTC)
- Yes. I live in a first world country and use a recent laptop with an updated browser with a fast broadband connection. Editing starts to slow down around 40-50k wikitext and when it gets much longer than that, either you have to live with a lot of lag or go section by section. (t · c) buidhe 10:31, 7 January 2022 (UTC)
- Then go section by section. That's what section editing is for. It's incomprehensible that you're putting the convenience of your editing over the needs of the reader. EEng 00:32, 8 January 2022 (UTC)
- And I am also in a developed country with modern technology yet am finding J.K. Rowling hard to edit. There are still very good reasons for the size limits, not all related to technology, and one of the key historical issues left out of the initial analysis is attention span and average time to read the page. I suggest the page has not changed because it is still useful as is. SandyGeorgia (Talk) 00:27, 8 January 2022 (UTC)
- "one of the key historical issues left out of the initial analysis is attention span and average time to read the page" – And the evidence about attention span, and types of users who want to "read the page" versus use it in other ways is ... where? EEng 00:32, 8 January 2022 (UTC)
- Our experience at WP:FAC and WP:FAR (where you don't contribute) shows that too-long articles accumulate bloat, and are very difficult to write and maintain, which has a detrimental downstream effect on the article quality and therefore the reader experience compared to an article kept to an appropriate length. A reader who wants more detail on a specific aspect should visit a sub-article. (t · c) buidhe 00:41, 8 January 2022 (UTC)
- That a "too long" article is ... well, too long, is a tautology. The question is: how long is too long? At what point should a particular article have a chunk split off? It's self-evident that articles should be, just as you say, "an appropriate length", but that takes judgment based on the topic, not some stupid one-size-fits-all table based on, AFAICT, just something someone arbitrarily wrote down 15 years ago.
- I really appreciate the "where you don't contribute" throwaway, because it gives me a chance to remind everyone that FAC reliably produces articles which conform to a checklist of mindless rules but which are often pretty awful, sometimes laughably so. As well expressed in the essay User:Physchim62/Situation Normal: All FACked up, "ensuring that featured articles meet the featured article criteria is NOT the end in itself." EEng 01:19, 8 January 2022 (UTC)
wp:size under discussion
The redirect page wp:size is currently being discussed at WP:RFD. --George Ho (talk) 10:13, 23 December 2021 (UTC)
Change size guideline proposal
I propose changing the "Probably should be divided" in Wikipedia:Article size#size guideline from 60kB to 100kB and the "Almost certainly should be divided" from 100kB to 200kB. This rule was made in 2007, when devices didn't have the capacity to navigate long articles smoothly. 15 years later, we have advanced technology and this rule is ridiculous. Ak-eater06 (talk) 20:15, 25 June 2022 (UTC)
- The rule does not only relate to computer processing speed; it relates to reading and attention span. I think it fine as is. SandyGeorgia (Talk) 20:24, 25 June 2022 (UTC)
- The idea that any more than 1/10 of 1% of visitors have the desire to read an article from top to bottom is absurd, as is reasoning based on such an idea. Overall article size is just one of many considerations in trying to answer the following question: What structure (of this one article, or of a group of related articles) best allows readers of various kinds to satisfy their knowledge-needs? Stupid size formulas, blindly applied, are not the answer to that question. EEng 20:50, 25 June 2022 (UTC)
- User:EEng#s I agree, this size rule is infuriating. Take Barack Obama's presidency for example, his policies are split into a DOZEN or so articles (economic policy, energy policy, East Asia policy, space policy, etc.) in the name of the presidency article being "too long". It just makes it more disorganized. Ak-eater06 (talk) 21:01, 25 June 2022 (UTC)
- Our stats Research:Which parts of an article do readers read Moxy- 04:20, 28 June 2022 (UTC)
- I agree that the guideline shouldn't be changed. Reading and attention spans have not substantially increased during the existence of Wikipedia. While articles do not need to be written as shortly as possible, they should not be difficult to read in their entirety. Onetwothreeip (talk) 03:10, 26 June 2022 (UTC)
- It's really unbelievable to see the attention span argument trotted out over and over and over. There's no evidence anything on this page is more than stuff a few editors made up one day. EEng 21:37, 26 June 2022 (UTC)
- My proposal is changing the "Probably should be divided" from 60kB to 80kB. Ak-eater06 (talk) 19:30, 26 June 2022 (UTC)
- While I'm happy to see even the tiniest move in the direction of sanity, it's really just deck chairs on the Titanic. As you can see from various comments on this page, there's a core group (a) utterly dedicated to the ridiculous idea that any significant proportion of visitors to an article have the intention of reading it top to bottom and (b) willing to translate that myth into word counts derived via baseless "attention span" and "reading rate" numbers from low-quality, one-size-fits-all sources. I hope your change sticks, but if it doesn't just do what I do: ignore this page's nonsense and structure articles according to the needs of each topic. Let those with no judgment of their own apply this Procrustean bed to articles unlucky enough to attract their attention. EEng 21:37, 26 June 2022 (UTC)
- EEng if you want an example of people defending article-splitting insanity, check out this discussion where my proposal to merge the four articles into one of Stephen Harper's tenure was shot down 4-2. Four users opposed merging due to this stupid size rule. Stephen Harper's tenure as Canadian prime minister is divided into four separate articles (Premiership, domestic policy, foreign policy and environmental policy)! Same with his successor, Justin Trudeau (Premiership, domestic policy, and foreign policy).
- You say to "ignore this page's nonsense and structure articles according to the needs of each topic" and while I try to, these users who voted against merging in the discussion I linked to always prevent me from merging due to WP:MERGEPROP and when I do follow WP:MERGEPROP my merger proposals get shot down due to people citing the size guideline.
- Coming back to my Harper example, it is extremely frustrating to know people have to flip between four different pages when we can easily have his tenure in one, clean article. Ak-eater06 (talk) 06:01, 27 June 2022 (UTC)
- @Ak-eater06: I haven't read all the arguments on the relevant article talk pages but I think you would find success in merging the Premiership, Domestic policy and Environmental policy articles, while keeping the Foreign policy article separate. Onetwothreeip (talk) 08:57, 27 June 2022 (UTC)
- User:Onetwothreeip I did that and got reverted. They cited WP:Mergeprop and once again the stupid size rule. Ak-eater06 (talk) 17:31, 27 June 2022 (UTC)
- I feel for you, but as things stand, reforming this page is one of those situations in which the ratio (effort to overcome mindless idiocy) / (benefit) is just too high. I wish you luck. EEng 12:22, 27 June 2022 (UTC)
- EEng and Ak-eater06 are right. "Too long" is not the issue it once was, so the proposed change makes sense. To understand this, imagine a book. No matter the size of a normal book, it is always easier to find a bit of content as long as it's between the book's covers. Those who want to know everything about the topic/story will read the whole book/article, regardless of length. Splitting articles into separate locations (different book volumes) makes it much more difficult to find stuff, and increases the chance of important info never being seen by the reader.
- Nowadays most readers search a page for information, so keeping it all in one place makes most sense. Few read the whole article. They may read the whole lead, and may skip to interesting parts, but that's all, unless they search for key words and phrases.
- One editor's single-minded, pathological, obsession with splitting long articles is usually very destructive and contrary to the needs of 95% of our readers. Almost no one benefits from it. (Yes, you know who you are.)
- EEng is right: "What structure... best allows readers of various kinds to satisfy their knowledge-needs? Stupid size formulas, blindly applied, are not the answer to that question." -- Valjean (talk) (PING me) 22:48, 26 June 2022 (UTC)
User:Valjean and User:EEng I updated it again to reflect common sense more...hope my change sticks. Thanks for your efforts :) Ak-eater06 (talk) 01:06, 28 June 2022 (UTC)
- A good change that brings us out of the dark ages. -- Valjean (talk) (PING me) 01:09, 28 June 2022 (UTC)
- You really should have consensus before changing a guideline. And these discussions would be more effective if the snark and insult throughout were lowered. SandyGeorgia (Talk) 10:16, 28 June 2022 (UTC)
- Oppose: the main purpose of the guideline is not to ensure articles can load (although those with too much wikitext can still pose an issue for some readers) but to optimize the length and level of detail for readers' attention spans and ensure that the most important information remains accessible. Really, if you're above 7000 to 8000 words for most topics it's better to split off another article. (t · c) buidhe 20:02, 30 June 2022 (UTC)
- "if you're above 7000 to 8000 words for most topics it's better to split off another article" – And we know that because the guideline says so, and therefore the guideline is correct. That's logic! EEng 22:18, 30 June 2022 (UTC)
- No, I know this based on my experience writing featured articles. (t · c) buidhe 04:33, 1 July 2022 (UTC)
- We're really ringing the changes here. Sometimes it's the bandwidth/plight-of-the-third-world argument. Other times it's the I-read-something-somewhere-about-attention-spans argument. Now it's the I-write-featured-articles-argument-so-I-know-best argument. (And the way you say it, it's almost as if you imagine it will impress people!)
- If your assertion truly reflects a universal truth, then editors will discover it for themselves when they apply their good judgment to particular editing situations as they arise; they won't need the FAC elites to show them the way. EEng 05:14, 1 July 2022 (UTC)
Should the size guideline be removed?
The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.
This guideline currently contains a table described as "Some useful rules of thumb for splitting articles, and combining small pages". Should this table (and the two short paragraphs of attendant notes) be retained or removed? XOR'easter (talk) 18:21, 28 June 2022 (UTC)
Survey
- I support removing the size guideline. Wrote my reasoning above. Ak-eater06 (talk) 04:18, 28 June 2022 (UTC)
- I also support the removal of numerical "limits", though it might be helpful to give editors an idea what the distribution of articles sizes is. And there's plenty of room for guidelines about how to usefully think about article length as an important aspect of topic organization and presentation. I suggest other editors review the earlier threads on this page to get an idea of the recent history of this issue. EEng 04:27, 28 June 2022 (UTC)
- Oppose simple scrapping, without any alternatives being proposed. There is no hard limit as is, there is a range of guidelines which can be applied differently if situations warrant it (depending on how the overall topic is structured with regards to WP:SUMMARYSTYLE for example). As it stands, we have articles whose topics fill multiple books. Unless the proposal is to have book length articles (which is one reading of one of the quotes above I suppose), then there are obviously going to be some guidelines on the matter. The quote from EEng above is correct that size is "one of many considerations" and that this guideline shouldn't be "blindly applied", however those are not reasons to scrap this guideline, they apply inherently to all guidelines on en.wiki. CMD (talk) 05:22, 28 June 2022 (UTC)
- I'm sympathetic to most of what you say, but the problem is that this particular guideline is peculiarly susceptible to being mindlessly "enforced". EEng 06:43, 28 June 2022 (UTC)
- Oppose it's not about "devices [not having] the capacity to navigate long articles smoothly", it's about brains not having the capacity to navigate long articles smoothly. Indeed, the attention span of our average user now is likely even less than it was when Wikipedia was first formed, back in the year 6 B.I. ("before iPhone"), at a time when everybody read books (for fun! not just for school!) instead of being glued to their smartphones. Which, of course, have now colonized brains with the idea that a "full page of information" is whatever fits on a six-inch (15cm) diagonal screen. If it were only about device capacity, then articles could be a thousand times longer than they are now, and a couple years from now, a million times longer. Mathglot (talk) 07:28, 28 June 2022 (UTC)
- It's great to see that the outmoded, 20-year-old made up stuff about attention spans and reading speed -- the justification for this guideline until now -- is being replaced by new and modern made up stuff about attention spans and reading speed. EEng 13:17, 28 June 2022 (UTC)
- Oppose These are guidelines, not rules, designed to say that Wikipedia articles should be neither too large nor too small. They should contain as much information as a narrow subject can allow, and when there is the opportunity to expand the overall content by splitting an article into one or more, that opportunity is usually taken. Guidelines like these are necessary to promote good editorial standards, such as articles neither being "book-length" when it is more appropriate for a topic or list to cover more than one article, nor many articles being one sentence long when they could be reasonably merged into one article under a broader topic. Onetwothreeip (talk) 07:44, 28 June 2022 (UTC)
- Oppose per Mathglot and Onetwothreeip, and keep the limits where they are. WP:NOTTEXTBOOK is policy. Miserably long articles are ... miserable to read, difficult to check, often filled with bloat and text-to-source integrity issues ... and unencyclopedic. WikiBooks is a sister project for those who want to write books. SandyGeorgia (Talk) 10:49, 28 June 2022 (UTC)
- Adding on, now that the non-neutrally-framed RFC has been converted to a still-less-than-neutrally-framed RFC containing a large and one-sided introduction. As a huge percentage of Wikipedia content is sub-standard, allowing articles to sprawl even larger will only add to that problem, and make articles harder to hold to any reliability and harder to check for core policies. Wikipedia is an encyclopedia, not a book; allowing articles to sprawl even larger will only add to Wikipedia's already existing quality problems, and make it harder to prune the extraneous trivia and enforce any standards of writing. I cannot recall having seen an article larger than 10,000 words of prose that couldn't be trimmed and wasn't full of unnecessary detail. This proposal will make an already bad situation worse. SandyGeorgia (Talk) 19:33, 28 June 2022 (UTC)
- Oppose I think there are solid reasons to avoid using the numbers given as gospel (that's why it's a guideline), but I think the proposal seems to act like article splits are the only option when given Wikipedia's purpose as a general encyclopedia and focus on summary style the other answer that should probably be considered first in every case is whether to just cut less-important or extraneous details/information. Der Wohltemperierte Fuchs talk 11:01, 28 June 2022 (UTC)
- This is completely illogical. Extraneous detail should be cut from an article regardless of the article's length. And cutting "less important" material just to fit some arbitrary size guideline is even worse than splitting just to fit some arbitrary size guideline; at least in a split the material is still somewhere. EEng 13:21, 28 June 2022 (UTC)
- Your post would make the same point more effectively if you cut the first sentence, which unnecessarily personalizes comments made by another editor; it seems perfectly logical to me that too-long articles are often that way because of extraneous information. Also, see WP:BLUDGEON re the entirety of this page. SandyGeorgia (Talk) 13:44, 28 June 2022 (UTC)
- I'm sorry, but you're just repeating the illogic. If an article is "too long" because of extraneous information, then the extraneous information should be removed. And if the article isn't "too long", but has extraneous information, then the extraneous information should be removed. Same either way. The two things aren't related. Saying that we should keep an otherwise unfounded length guideline because it incidentally prompts people to do things they should be doing anyway, and meanwhile also prompts people to do things they shouldn't be doing anyway (like removing "less important" material, or splitting the article) is (and here I'll say it again) completely illogical. As for WP:BLUDGEON, I'll just respond by pointing you to WP:BAREASSERTIONSABOUTATTENTIONSPANANDWHATREADERSWANTREPEATEDOVERANDOVERWITHNOEVIDENCETOBACKTHEMUP. EEng 14:13, 28 June 2022 (UTC)
- Oppose. Not everyone lives in a first world country with the latest technology and fast broadband. Long pages DO take longer to load and ARE difficult to impossible to edit for some people. If wikipedia wants to be the encyclopedia that everyone can access and edit, then pages must be accessible to all. DrKay (talk) 16:44, 28 June 2022 (UTC)
- Doctor, I like you, so it pains me to point out that this is a completely false argument. Any one image in an article takes more download bandwidth than all the text put together. EEng 17:34, 28 June 2022 (UTC)
- I know. That's why I, and others who were in similar circumstances, spent 10 years accessing wikipedia with images turned off. DrKay (talk) 20:13, 28 June 2022 (UTC)
- Support removal of quantitative number-of-kilobytes rules, which are too easy to enforce blindly and AFAICT have no empirical foundation in either technical or human limitations, having been changed from one round figure to another without methodical study of what is slow to transfer over which network and why. Moreover, the specific table being talked about refers to "Readable prose size", not "amount of data that a browser has to download", so if we are basing our guidelines on the latter, we should still scrap the existing table as a distraction. Concerns about attention spans are more relevant to article organization than size; "get to the important stuff quickly" is an argument for a good lede but not an argument for a short page overall. Long articles can be hard to check, but material spread across multiple articles can be even harder. Pages grow out of sync, content gets added to the main article rather than the more appropriate sub-article because the main article gets more traffic, etc. The Wikipedia is not a textbook policy mentioned above strikes me as a red herring. Content can be long and not-textbook-like, or short and textbook-like. Textbookishness is about working step-by-step through mathematical calculations, asking leading questions, and other such stylistic choices. For example, if Speed of light (FA, ~141K) were written like a textbook, it would probably start with the Maxwell equations, write them for vacuum conditions, derive a wave equation, deduce the propagation speed of the resulting waves, etc. Likewise, Pi (FA, ~158K) presents mathematics encyclopedically rather than textbookily. XOR'easter (talk) 16:55, 28 June 2022 (UTC)
- Textbookily is a great word! I'm adding it to my spellchecker. EEng 17:34, 28 June 2022 (UTC)
- Speed of light has a prose size of 46 kB, within the lower bounds of the currently suggested length, while Pi has a prose size of 64 kB, only slightly over the currently suggested length. Holding these up as examples of well-written articles would seem to support the current guidelines. CMD (talk) 01:21, 29 June 2022 (UTC)
- Except that people aren't thinking of "prose size", even though that's what the table is nominally about; they're making arguments based on data transfer, which includes everything that DYKcheck doesn't put in yellow. XOR'easter (talk) 01:35, 29 June 2022 (UTC)
- That may be the case, but there are no quantitative rules relating to data transfer to remove. (Although we should probably create one for WP:PEIS.) I'm not sure we have great tools to handle data issues at the moment, an easy way to disable images on mobile Wikipedia is probably a good start, but out of scope here. CMD (talk) 02:00, 29 June 2022 (UTC)
- Disabling images and auto-collapsing infoboxes are both ideas for mobile browsing that feel like they should have been implemented long ago. But if data issues are "out of scope here", then we fall back on the question of whether long articles (by the "readable prose" metric) are too long to be useful. To perhaps further clarify: I picked the first two Featured articles that came to mind which could illustrate the difference between textbook-style and encyclopedia-style writing. For long FA's, Allied logistics in the Southern France campaign has 79,068 characters of "readable prose", Harry S. Truman manages a whopping 84,021, Sonic the Hedgehog has 62,573, Vampire has 61,025, Intelligent design edges the line with 59,137, Pink Floyd build a wall of 71,290, Paul McCartney scores 82,506, and The Beatles need to break up at 90,185. Perhaps appropriately, Byzantine Empire breaks the 105 barrier with 105,111. I just don't think the guideline provides more than an illusion of objectivity, and it's so good at doing that that it becomes a risk. XOR'easter (talk) 02:08, 29 June 2022 (UTC)
- I'm getting a slightly different 103 kb for Byzantine Empire, but either way that is a prime example of an article that needs a judicious cutting of extraneous information. Over half (59 kB) of its length is in one section! I'm not sure what the risk being mentioned is, but this seems an example of why length guidelines may be useful to focus minds. An article should fail FACR4 if it has a single section that is longer than the entirety of Speed of light. Happy for this to be moved into the discussion section, if it gets longer. CMD (talk) 02:51, 29 June 2022 (UTC)
- The risk is that people take it seriously and make judgments based on it, when they are really just numbers plucked from nowhere and backed by nothing. Should we have some guidelines about what makes a page egregiously big? Quite possibly. Is what we've got now even a reasonable starting point for that? I only find myself growing more convinced that it isn't. XOR'easter (talk) 03:27, 29 June 2022 (UTC)
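The featured-article figures quoted in this thread can be checked against the numbers under dispute with a few lines of arithmetic. The sketch below is only illustrative: the character counts are the ones quoted above, while the two thresholds (60 kB and 100 kB, read here as 1 kB = 1,000 characters of readable prose) are assumptions taken from figures mentioned in this discussion, not from the guideline table itself. The names `classify` and `summarize` are hypothetical helpers.

```python
# Readable-prose character counts as quoted in the thread above.
PROSE_CHARS = {
    "Allied logistics in the Southern France campaign": 79_068,
    "Harry S. Truman": 84_021,
    "Sonic the Hedgehog": 62_573,
    "Vampire": 61_025,
    "Intelligent design": 59_137,
    "Pink Floyd": 71_290,
    "Paul McCartney": 82_506,
    "The Beatles": 90_185,
    "Byzantine Empire": 105_111,
}

# Assumed thresholds: figures mentioned in this discussion,
# interpreted as 1 kB = 1,000 characters (an assumption).
DIVIDE_THRESHOLD = 60_000
UPPER_THRESHOLD = 100_000


def classify(chars: int) -> str:
    """Place a readable-prose character count into one of three buckets."""
    if chars > UPPER_THRESHOLD:
        return "over 100k"
    if chars > DIVIDE_THRESHOLD:
        return "over 60k"
    return "under 60k"


def summarize(sizes: dict) -> dict:
    """Group article titles by the bucket their prose size falls into."""
    buckets = {"under 60k": [], "over 60k": [], "over 100k": []}
    for title, chars in sizes.items():
        buckets[classify(chars)].append(title)
    return buckets


if __name__ == "__main__":
    for label, titles in summarize(PROSE_CHARS).items():
        print(f"{label}: {len(titles)} article(s): {', '.join(titles)}")
```

Under these assumed thresholds, most of the quoted featured articles land between the two figures, which is the point both sides of the thread are arguing over.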
- Support A size limit from 15 years ago limits military and political pages far too much. Presidency articles are divided into far too many sub-pages and SCOTUS rulings need more room to explain the background and legacy of rulings. It should at least be doubled. Jon698 (talk) 17:59, 28 June 2022 (UTC)
- I don't see how attention spans matter. If you don't have the will to scroll down a Wikipedia page you most likely don't have the will to learn anything. Longer pages would give people the ability to improve their attention spans as no other popular website offers as much information about a wide array of topics as Wikipedia. Jon698 (talk) 18:04, 28 June 2022 (UTC)
- Also I just thought about this. Somebody with an attention span too short to read a Wikipedia article would probably not be willing to search for information across five separate pages that it was diced up into. Jon698 (talk) 18:06, 28 June 2022 (UTC)
- User:Jon698 I like your presidency point. Obama's presidency for example, is split into a DOZEN or so articles (economic policy, energy policy, East Asia policy, South Asia policy, space policy, etc.). Why need an East Asia article and South Asia article when you can merge them with the foreign policy article? Well, you can't unfortunately because there is this certain group of editors that will claim it breaks the size guideline. So now if readers want to find one policy, they will have to flip between a dozen articles. Ak-eater06 (talk) 18:09, 28 June 2022 (UTC)
- I haven't seen the specific articles in question, but this line of argument chases a red herring. Articles are rarely created or split just due to the size guidelines. More commonly, they are created because someone thinks "Wikipedia should have an article on this". For example, Obama's foreign policy pivot towards the Asia-Pacific was widely remarked about at the time. It is undoubtedly a notable topic, that many would be interested in writing an article on. Could you nonetheless merge it with another article? Possibly yes, but there are reasons such merges tend not to happen and these reasons emphatically are not primarily related to this size guideline. If it were, the result would not be the current proliferation of stubs throughout Wikipedia. CMD (talk) 01:28, 29 June 2022 (UTC)
- User:Chipmunkdavis. I understand. Thank you for the clarification. However, I know for sure that Stephen Harper's tenure is divided into four articles (Premiership, domestic policy, foreign policy, and environmental policy) in the name of one combined article being too long. Same with his successor, Justin Trudeau (tenure is divided into premiership, domestic policy, and foreign policy). Ak-eater06 (talk) 02:56, 29 June 2022 (UTC)
- Support It's nonsense to think that Wikipedia should be written to the lowest common denominator. I read a lot of articles, but usually just the intro and the parts I'm interested in. I'm not going to repeat the arguments above as to why the limits should be abolished, but would like to point out that there are thousands of articles that do not conform. What should we do about them, convert Wikipedia into a Reader's Digest look-alike? Dr. Grampinator (talk) 18:34, 28 June 2022 (UTC)
- Oppose removal. For me, it's less about device limitations and more about human reader attention span. A 100kb article is just too long. But there are still also technical limitations: long articles can be very difficult to edit, both because the source is difficult to navigate and because the browser scripts used to edit articles don't handle the length well. —David Eppstein (talk) 19:20, 28 June 2022 (UTC)
- Support. Splitting is sometimes needed, but often it's just a topic of a debate that ends not in two (for example) nice smaller articles, but in two chunks of text that should be read together. And besides, why does it even matter how many readers read the whole article? People can look for some specific details, they can just read the intro, ctrl-f for something, etc. And the book comparison is a good one, IMO - it's much better to have everything related in one place than to have a dozen scrappy little articles that nobody would ever maintain (even if readers would read a small article from top to bottom, nobody would read all the articles split from it). Artem.G (talk) 19:26, 28 June 2022 (UTC)
- Oppose removal. But I would prefer a change on the limits. A several MB page makes my computer hang, while a 100 KB page is fine. Now that we have better technology, it may be time for more limits. weeklyd3 (message me | my contributions) 19:33, 28 June 2022 (UTC)
- Oppose It's a guideline, not policy. 100kb ish should be sufficient for a single article, but I wouldn't object to increasing it a little bit, not sure how much (I'm resisting using the 640kb quote). Needs an upper limit, don't want even a 1mb page really. -Kj cheetham (talk) 19:33, 28 June 2022 (UTC) P.S. If people are calling on it too strictly to the point of being detrimental to articles, I'm sure it could be better worded - that doesn't justify removing it completely. -Kj cheetham (talk) 21:28, 28 June 2022 (UTC)
- Oppose – although I would not count prose in footnotes and infoboxes, which require essentially no effort to gloss over. Sure, my computer can handle a bloated article, but they are beyond painful to read, verify, and balance. (As a reader, too, assessing reliability.) The argument that "you only read what you're interested in" doesn't really hold up because you have to find it in the first place! There's no cmd-F on phones. Even the pi article that XOReaster cited, in my opinion, deserves a splitting of the "roles and characterizations" section. Yes, it's probably only GA and FA reviewers who actually read through the entirety of longer articles, and I think MOS:REPEATLINK should be abolished on that account. But when I want an overview, I read a few paragraphs of every section. Why have a sprawling article when it can be easily split into more enjoyable and compact articles? 100 KB doesn't need to go anywhere.
EEng: Can you give some specific examples where this guideline was invoked to reduce an article's size (whether through decruftification, concisification, splitting...) and clearly reduced the encyclopedic quality of the article? (By "clear", I mean something that reasonable and/or experienced editors would agree on, independent of size guidelines, not personal preference.) Just saw the examples in the preceding section. Will have to assess them to see whether I agree. Ovinus (talk) 21:29, 28 June 2022 (UTC)
- Oppose. First of all, the notion that page size doesn't matter any more from a technical point of view is nonsense and shows a little bit of ignorance of our worldwide audience. Perhaps that's the case for people in the western world with fast internet and fancy machines, but a lot of people around Africa and other areas don't have such luxury. Loading pages and also editing them in the code editor is definitely still an issue. Secondly, as noted by Sandy and others, a size guide is definitely needed from an encyclopedic point of view. The idea that we might combine all the different pages on Obama's presidency into one megalith is absurd. Even if readers don't read from top to bottom, an article should still be a coherent and summary style overview of what is in its scope rather than a free for all. I use these size guidelines regularly and I expect them to stay. — Amakuru (talk) 21:52, 28 June 2022 (UTC)
- In anticipation of the remark that photos take up the most bandwidth: Images load asynchronously in modern browsers. You can see this in action by (in Chrome) opening developer tools, going to Network, and changing the "No throttling" option to something else, to simulate a poor connection. We first have today's TFA (copied to my sandbox), which has a fair amount of text and a fair number of images. In the graph, green indicates that a resource has been requested and a response is being waited for. Turquoise indicates that the resource is being downloaded. The blue line indicates the DOMContentLoaded event (DCL), which is slightly different from FCP (First Contentful Paint) but all that matters is that's approximately when the page becomes usable, which happens after the HTML and CSS have been loaded. The images, at that point, are only partially loaded and appear as blank or half-filled, but the text may be scrolled through and read. The images finally load after some time, as indicated by the red line, and the page is finished. Now, observe WP:FAC, a page of impressive size but with no (large) images. In this run, the CSS was also cached, giving it a leg up. But it was still slower than Red panda to initially load, because the dominant time is the HTML. Finally, we observe List of Johnson solids, an article replete with images, but because the HTML is small, the text loads quickly (while the images take a full minute!). My data collection here was rushed but someone can do a rigorous test. With a slow connection, HTML size matters. Ovinus (talk) 22:47, 28 June 2022 (UTC)
- Oppose removing entirely, but definitely support using character or word counts instead of the bytecounts currently in the table. Some sort of rule of thumb is useful to have (even if it is, in fact, entirely arbitrary), and this one seems to have served reasonably well. I'm not convinced that having even more long, unmaintainable articles will serve the project or the reader. (Also "it's a guideline, not policy" needs to be said more often in general.) -- Visviva (talk) 22:02, 28 June 2022 (UTC)
- The current table is, to my understanding, already a character count. CMD (talk) 01:32, 29 June 2022 (UTC)
- To mine, too. Mathglot (talk) 02:05, 29 June 2022 (UTC)
- Right, I didn't mean to imply otherwise. But it seems unnecessarily roundabout and confusing to have a table with page sizes in kilobytes and a note saying "remember, by bytes we mean characters of readable prose." Especially when bytes are a common metric for other kinds of page size that this section isn't about. Using normal human units up front makes things clearer. -- Visviva (talk) 04:23, 29 June 2022 (UTC)
- Oppose it's a useful guideline which has guided my thoughts on many long articles. But it's also not a mandate of length. It reflects a reality that 100kb of readable prose is quite a lot to sift through, and that unless you have a compelling reason, it really should be more concise. I'm working on American Civil War right now, and it's at 99k readable prose, and I am being conscious about length, using summary style, and putting extraneous detail in sub articles. I want to take it to GA, and maybe it will be over 100k when I do so. But the mere fact that we have a guideline that points out 100k is a good max is keeping conciseness in my thoughts. Our editors are not always good at being concise, me included!
- My fear is that by removing this we push articles to be longer, without being better. Shorter does not mean less quality. My favorite example of that is World War II, which is a GA but only 82k of readable prose, despite being a very very broad topic.
- Now, I could see revising some of the numbers up a bit. 60k, more like 75k. Or maybe listing examples of FAs at different size levels? But I don't think getting rid of this useful rule of thumb is gonna improve the encyclopedia. CaptainEek Edits Ho Cap'n!⚓ 05:59, 29 June 2022 (UTC)
- Oppose Oh god no. There are too many unwieldy articles already, this would only encourage more. - LCU ActivelyDisinterested ∆transmissions∆ °co-ords° 12:46, 29 June 2022 (UTC)
Discussion
The size guideline rule was made in 2007, when devices didn't have the capacity to navigate long articles smoothly. 15 years later, we have advanced technology and this rule is ridiculous. Let's remove it altogether. People can split articles without being influenced by this obsolete rule.
Some other arguments:
The idea that any more than 1/10 of 1% of visitors have the desire to read an article from top to bottom is absurd, as is reasoning based on such an idea. Overall article size is just one of many considerations in trying to answer the following question: What structure (of this one article, or of a group of related articles) best allows readers of various kinds to satisfy their knowledge-needs? Stupid size formulas, blindly applied, are not the answer to that question.
"Too long" is not the issue it once was, so the proposed change makes sense. To understand this, imagine a book. No matter the size of a normal book, it is always easier to find a bit of content as long as it's between the book's covers. Those who want to know everything about the topic/story will read the whole book/article, regardless of length. Splitting articles into separate locations (different book volumes) makes it much more difficult to find stuff, and increases the chance of important info never being seen by the reader.
Nowadays most readers search a page for information, so keeping it all in one place makes most sense. Few read the whole article. They may read the whole lead, and may skip to interesting parts, but that's all, unless they search for key words and phrases.
One editor's single-minded, pathological, obsession with splitting long articles is usually very destructive and contrary to the needs of 95% of our readers. Almost no one benefits from it.
It's time. Ak-eater06 (talk) 04:18, 28 June 2022 (UTC)
Is there any good research on readability size? We have editor retention stats for Wikipedia... but is there accessibility data? On a side note, a site-wide RfC should take place, as mentioned in previous talks and edit summaries. Moxy- 04:37, 28 June 2022 (UTC)
- I hope everyone involved here is not thinking a guideline can be changed based on a survey that a) is not neutrally positioned, and b) is not a site-wide RFC. SandyGeorgia (Talk) 18:09, 28 June 2022 (UTC)
- I have changed the section heading and tried my hand at posing the question in a neutral way. Ak-eater06's opening statement is now under the "Survey" heading, where it can be read as part of a !vote. XOR'easter (talk) 18:25, 28 June 2022 (UTC)
- I think the large, non-neutral introduction should be moved here, to the discussion section. SandyGeorgia (Talk) 19:34, 28 June 2022 (UTC)
- It should be moved to WP:VPP, have a proper introduction, and be put on WP:Centralized discussion. This is major, even if it has some enthusiastic supporters. Ovinus (talk) 21:39, 28 June 2022 (UTC)
- I've left a message for the original poster here (in the interest of first trying to work it out with the editor); if this isn't resolved quickly, other steps will be needed. I will be busy for several hours; hopefully others will follow up there in my absence. What an unfortunate approach to a long-standing guideline page. SandyGeorgia (Talk) 21:50, 28 June 2022 (UTC)
- It looks like the post has been moved, though there's still a bit of cleanup to do (an "above" should now be a "below", and there's a dangling reference to it before the !votes start). XOR'easter (talk) 22:23, 28 June 2022 (UTC)
- I'll delete that then ... now that all is moved ... SandyGeorgia (Talk) 03:02, 29 June 2022 (UTC) Diff of deleted post, leftover from when large non-neutral block of quotes was at beginning of RFC. SandyGeorgia (Talk) 03:05, 29 June 2022 (UTC)
How important is the speed of content appearing versus the number of bytes, if people are on a so-and-so-many-Mb-per-month data plan? In other words, for the purpose of conserving limited network resources, is time-to-text-rendering actually the measure of "bandwidth" that matters? I'm concerned that, by merely throwing some round numbers into a table, we are patting ourselves on the back for serving a global audience without having done the serious work to determine how to do the job properly. The same goes for loading pages in either of the editors, running scripts, etc. If we need a guideline, we need a guideline, not guesstimates. XOR'easter (talk) 23:24, 28 June 2022 (UTC)
- Regarding the changing of the discussion question after the start of the discussion here, I agree it's a more neutral phrasing, but I do not think the RfC is about the table itself, but about the guideline as a whole. It wouldn't make sense to treat the table in isolation, leaving everything else in place. CMD (talk) 01:37, 29 June 2022 (UTC)
Let's settle on a compromise...shall we?
It appears my proposal to abolish the size guideline may have been a bit too radical for some of you...which explains the overwhelming rejection in the recent RfC.
As such, I propose a compromise. How about we increase the kB limit on "probably should be divided" and "almost certainly should be divided" on the Wikipedia:Article size#Size guideline? I personally think "probably should be divided" should be increased from 60kB to 100kB and "almost certainly should be divided" should be increased from 100kB to 125kB.
Readable prose size | What to do |
---|---|
> 125 kB | Almost certainly should be divided |
> 100 kB | Probably should be divided (although the scope of a topic can sometimes justify the added reading material) |
> ?? kB | May need to be divided (likelihood goes up with size) |
< 40 kB | Length alone does not justify division |
< 1 kB | If an article or list has remained this size for over a couple of months, consider combining it with a related page. Alternatively, the article could be expanded; see Wikipedia:Stub. |
Ak-eater06 (talk) 13:26, 29 June 2022 (UTC)
- This is an effective doubling of current article size recommendations, so it would be good to hear the rationale for it. CMD (talk) 13:45, 29 June 2022 (UTC)
- Might help if there was a coherent rationale for the current recommendations. EEng 16:57, 29 June 2022 (UTC)
- Yeah, doubling numbers that were arbitrary and unmotivated will just make a new set of numbers that continue to be arbitrary and unmotivated. XOR'easter (talk) 18:50, 29 June 2022 (UTC)
- This is somewhat my view. Some sort of guideline here is useful to provide a goal. While I'm not too torn on what they are, if picking between two arbitrary numbers, sticking with the current ones makes more sense than picking a new one, ceteris paribus. CMD (talk) 02:24, 30 June 2022 (UTC)
- My own sense is that sticking with numbers just because we've stuck with them so far does nothing but enshrine arbitrary choices for the sake of having something to point at and call traditional. The "readable prose size" numbers are explicitly not about technical limitations, since everything from download sizes to script functionality will depend upon all the other bytes that don't count as "readable prose". Nor do they relate to reader attention spans: if people stop reading after the first few paragraphs, it doesn't matter whether the "readable prose" they leave unread is 10kB or 100kB. The numbers are just an excuse to call an article "bloated" without reading it, and thus force editors to maintain six articles instead of one. But, fuck it. Nobody ever gives up the illusion of numerical objectivity. The community will never agree that there's a problem, let alone on how to solve it, and I only push myself closer to another month-long burnout if I try to care. XOR'easter (talk) 03:59, 30 June 2022 (UTC)
- Let me give you some guidelines that I found in a crappy management manual from 1993: If your burnout is only a few days long, you should probably combine it with some other burnouts. If your burnout is a few weeks long, it might need to be divided into several shorter burnouts. If it's a month long, it should almost certainly be divided. EEng 04:10, 30 June 2022 (UTC)
- See the comment in the RFC above by ActivelyDisinterested. Same thing. This kind of increase would permit several of the already discussed and way too long articles to continue with, and even expand, their unnecessary bloat. SandyGeorgia (Talk) 13:46, 29 June 2022 (UTC)
- Would oppose this change on the same grounds. (Not technical ones, and I don't think that aspect is as important as the content-related objections.) All I'd suggest is amending the parts of this guideline which refer to bandwidth etc. and updating them to line up with today's practice, so that they aren't used as a questionable justification. And if you open an RfC, put it at a more central location. Ovinus (talk) 15:01, 29 June 2022 (UTC)
- I agree with Ovinus. The guideline overall could be better worded, which is perhaps a more worthwhile use of time than this. -Kj cheetham (talk) 17:20, 29 June 2022 (UTC)
- Ak-eater06, I understand your consternation at the incoherent opposition to reforming this relic of the 20th century web, but a carefully planned project to bring enlightenment to the benighted will be required to get anywhere at all on this. There are just too many editors who need Norman to coordinate instead of applying judgment of their own. Add in the editors who confuse bandwidth with latency and stuff like that, and it's just hopeless. EEng 16:57, 29 June 2022 (UTC)
- While it's good that the rejection of your previous proposal hasn't disheartened you in attempting to improve the guideline area, I don't think there is much utility in changing the figures of the guidelines. Onetwothreeip (talk) 07:34, 30 June 2022 (UTC)
- WP:SIZE is a relic of a bygone era; I consider it obsolete, and encourage other editors to do likewise. It lacks a rationale, and violates Wikipedia:Purpose:
to benefit readers by acting as a widely accessible and free encyclopedia; a comprehensive written compendium that contains information on all branches of knowledge.
For a decade now, people have been saying that Douglas MacArthur is too long, and then went ahead and added more material to it. Dividing articles is not simple, and when I attempted to do it with American logistics in the Siegfried line campaign (dividing it into transportation and services and supply), the result was highly unsatisfactory. Dividing the Galileo project into articles about the spacecraft and the project made it harder for readers to find what they wanted, and an editor trashed the article history in the process. Hawkeye7 (discuss) 20:34, 30 June 2022 (UTC)
- I can see an immediate way to cover MacArthur by moving the excessive details of his WWII activities (which include a lot of events not directly related to him) to a separate article and leaving high-level summaries there. Same with Galileo. The problem seems to be that editors don't leave good summaries behind when content is split, to make it easy to see the high-level details of the split content and to be clear that more details can be found elsewhere. Masem (t) 21:15, 30 June 2022 (UTC)
- Size would be only one consideration of articles such as Douglas MacArthur and Galileo project. There would be other considerations that ensure difficulty in splitting or reducing the size of those articles, and reduction certainly shouldn't be pursued without consideration. I don't think guidelines should be changed based on errors made by editors in a few articles, which are much more to do with the editors of those articles than these guidelines. Onetwothreeip (talk) 07:47, 1 July 2022 (UTC)
- Agreed; the decision to split articles should never be dictated by this guideline. Douglas MacArthur has grown organically over time. I have created two sub-articles, though: Douglas MacArthur's escape from the Philippines and Relief of Douglas MacArthur. The point illustrated by Galileo is that people complained bitterly about not being able to find information that was plainly in the sub-article. Simply splitting off material and summarising is unacceptable; the amount of detail in an article still has to be "balanced" to avoid being UNDUE. Hawkeye7 (discuss) 07:55, 1 July 2022 (UTC)
- Outliers don't make good examples for the purposes of this discussion; that MacArthur hasn't yet been split doesn't mean it can't be or shouldn't be. SandyGeorgia (Talk) 17:06, 1 July 2022 (UTC)
- Strong oppose: among other reasons, articles this long are very hard to maintain at a good level of quality, as I've found in my experience at FAC and FAR. (t · c) buidhe 04:36, 1 July 2022 (UTC)
- By the way everyone, did buidhe mention that he writes featured articles? EEng 16:53, 1 July 2022 (UTC) P.S. He also writes featured articles. Featured articles too!
- I've never quite understood this argument. If you have 12,000 words of featured content, why is it harder to maintain that in one article, compared to maintaining 7,000 featured words in one article and 3,000 featured words in each of two subarticles? (The split-up total will always be greater due to summary sections repeating information, and indeed that gives additional maintenance trouble in terms of the common content getting out of synch.) This argument only works if you don't bother with making the two subarticles featured, but how that improves the encyclopedia I'm not sure. Wasted Time R (talk) 19:18, 1 July 2022 (UTC)
- Look, people who write featured articles dwell on Mt. Olympus. From their lofty perch they look down and take delighted amusement in the feeble editing efforts of the rest of us. They also breathe pure, rarefied air and gorge themselves on a magic ambrosia; by consumption of these they are endowed with uncanny powers of composition denied to us pathetic little people. So stop arguing, insect. EEng 21:28, 1 July 2022 (UTC)