Wikipedia:Village pump (idea lab): Difference between revisions

From Wikipedia, the free encyclopedia
::{{u|Thebiguglyalien}}, that's also very useful insight. How do you think we can improve the organization of the guidance? Any particular areas of concern? — [[User:Ixtal|Ixtal]] <sup>( [[User talk:Ixtal|T]] / [[Special:Contributions/Ixtal|C]] ) </sup> &#8258; <small> [[Non nobis solum]]. </small> 10:19, 3 March 2023 (UTC)
::I'll second Thebiguglyalien's comment: too much "stuff", poorly organized. We can expect editors to go through [[Help:Introduction]]. But practically no one reads [[Wikipedia:Contributing to Wikipedia]] or any of those pages (I certainly didn't). They're both too basic and too verbose, making them pointless, while far less prominent pages, like MILHIST's Academy or [[User:Tony1/Noun_plus_-ing|userspace essays]], are more useful by a mile. Nobody reads a manual before driving their new car; our help pages shouldn't try to address everything, they should focus on frequently asked questions, like how to copy edit (with ''specific'' guidance and examples), and how to find high-quality sources (too bad we can't just link out to LibGen/Sci-Hub, but people can still find tons of books/papers for free on the web or archive.org, yet most don't know how to). [[User:DFlhb|DFlhb]] ([[User talk:DFlhb|talk]]) 11:54, 3 March 2023 (UTC)

== Motto of the day ==

According to what I know, the main page has been relatively static for quite some time now. Perhaps we could add the [[Wikipedia:Motto of the day|motto of the day]] to it? '''[[:User: The Bestagon|<span style="color: green; text-decoration: inherit;"><sup><big>The</big></sup></span>]]<span style="color:#ff4000">⬡</span>[[:User talk: The Bestagon|<span style="color: green; text-decoration: inherit;"><sub><big>Bestagon</big></sub></span>]]''' 12:32, 3 March 2023 (UTC)

Revision as of 12:32, 3 March 2023

The idea lab section of the village pump is a place where new ideas or suggestions on general Wikipedia issues can be incubated, for later submission for consensus discussion at Village pump (proposals). Try to be creative and positive when commenting on ideas.
Before creating a new section, please note:

Before commenting, note:

  • This page is not for consensus polling. Stalwart "Oppose" and "Support" comments generally have no place here. Instead, discuss ideas and suggest variations on them.
  • Wondering whether someone already had this idea? Search the archives below, and look through Wikipedia:Perennial proposals.

Discussions are automatically archived after remaining inactive for two weeks.

« Archives, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57

Avoiding driving off experts from Wikipedia

I fear experts may be disincentivized from contributing, or driven off by WP:FETCH-like behavior.

One problem is that, even with WP:V, nothing can substitute for true expertise. I can read research papers and academic books and watch physics/math lectures, but I'm still not going to be able to contribute to these topics with the level of understanding that an expert has. So when experts come here, contribute, and are reverted because of lack of sourcing, not because of any specific content objection, I think that's excessively burdensome. Why not just add a {{cn}} tag?

See this discussion, which seems representative of a more widespread problem: it focuses on procedural issues, instead of content (the only commenter who has discussed the content is the expert whose edits were reverted).

Interested in people's thoughts. Amusingly, WP:BURDEN and WP:PRESERVE were discussed at length recently (see here and here); but those discussions only barely addressed technical articles; most of ours are languishing, due to lack of expert attention, and I think we need to discuss that as its own subject.

How do we make it easier for experts to improve these articles, while minimizing the risk of unverifiable content being added? Surely, we have a ways to go. DFlhb (talk) 00:36, 4 January 2023 (UTC)[reply]

Are there any unique factors when working with an expert that would warrant its own policy considerations? In the meantime, I wonder if our policies need to be updated to more clearly state what is considered an inappropriate removal or restoration. And as a behavioral guideline, I think WP:DONTBITE should be applied a little more firmly. Thebiguglyalien (talk) 07:02, 4 January 2023 (UTC)[reply]
I agree that the WP:BITE policy applies regardless of who is involved. That said, WP:ACADEME is a nice essay that I share with academics who struggle with the Wiki style of editing versus research papers. ~ 🦝 Shushugah (he/him • talk) 08:39, 4 January 2023 (UTC)[reply]
I sometimes point to Wikipedia:Expert editors. Gråbergs Gråa Sång (talk) 11:41, 4 January 2023 (UTC)[reply]
(1) Those are both very good links. I've been thinking about this question. I have given up editing WP in my own professional field following two events. The first was an attempt to improve an article on a method, where I encountered a bigger expert coming the other way, who politely refused all possible changes to the article. Basically they were editing from the university that developed the method in the first place, and were unable to accept the concept that anyone outside their own colleagues has made any significant contribution since, or that the method has in any way been improved or enlarged upon. I find this rather a pity, so the article remains as it was: good, but basically a fossil, remembering a happy past. The second was an attempt at cleaning up another article, which is full of references to minor primary research papers containing good(ish) ideas that went nowhere, while missing quite important concepts now used in commercial scientific instruments. I became frustrated because it was clearly a target for semi-knowledgeable grad students and early-stage professionals, using a bit of Wikipedia editing as part of their training, and unable to realise that every research paper on the subject claims that it's an important advance. Meanwhile I couldn't add the things that are really being used, because the best sources for them are heavily linked to the manufacturers, so they're not seen as independent (Wikipedia has a very strong nose for even the slightest commercial COI). After a while, I realised that as an educational resource, Wikipedia is pretty rubbish: it's better to devote one's time to institutional and personal web-resources. Which brings us to point 2:
(2) Few experts understand that Wikipedia is an encyclopaedia. It's supposed to be readable by any person of normal educational and academic background. It's not supposed to be a specialist reference resource for use by professionals and experts. There are big divides on this by subject area. For example, look at Mensural notation, a very technical and complex subject in music. The article is readable, and makes sense, requiring little previous knowledge. Now look at Integral, which is a very basic concept in maths. The article kicks off with the first sentence of the lead: "In mathematics, an integral assigns numbers to functions in a way that describes displacement, area, volume, and other concepts that arise by combining infinitesimal data." Basically unless you know what an integral is, that sentence might as well be written in Chinese, but if you do, it makes perfect sense. The whole article is a subtle fight between those who wanted to explain integrals in a way that an average shop assistant could understand if they wanted, versus those who wanted to make sure that every sentence conforms to a mathematician's 100% rigorous approach, and the result is an article that's worthless: it doesn't tell an expert anything they didn't know already, and it's a very poor way for anyone else to find out what an integral is. In many ways, I wish experts would stay away unless they are both experts in the subject matter, and in writing about it in a non-technical way. I suspect that many experts see the subject from too close, and those who are good at writing encyclopaedically about their subject don't do so here, not only because they're fed up of being bitten and challenged by non-experts, but also because they've realised there are better places to do it.
(3) The OP also touched on the matter of citation, and suggested we should be a bit gentle with experts who are writing from knowledge, writing correct information, but not citing. I'd advise avoiding discussing this at the same time as discussing how to include experts anyway. Unfortunately WP has two groups of editors: the delete-uncited-on-sight, and the don't-disrupt-the-flow-by-deleting-correct-statements people. Both sides are utterly convinced that they are Totally Right. Any attempt at discussing the subject merely ends with both sides declaring there is no point in discussing it because not only are they Right, but everyone knows they're Right. It's a doomed discussion, and poisons discussion of anything else. I'd keep it separate! Elemimele (talk) 13:21, 4 January 2023 (UTC)[reply]
Touching on part of your theme, after first trying to edit in the area in which I received my PhD (50 years ago, but never worked in it), I decided I couldn't keep up with the grad students in the field. Similarly, I have stayed away from the field I worked in for 20 years before retiring. Now, I edit what interests me, and don't recall running into any editors who claimed to be experts in the various areas I work in, although experts would be welcome to help sort out competing and ambiguous sources. - Donald Albury 20:28, 4 January 2023 (UTC)[reply]
Personally, I'd hesitate to extrapolate from the sampling of experts who have edited Wikipedia in the past to generalize that very few experts understand that Wikipedia is an encyclopedia targeted at the general public. There are plenty of experts who understand the need to tailor messages for the intended audience. For those making a living in their field of expertise, I agree that in many cases, there is limited upside in contributing to Wikipedia, versus finding other outlets for public education and potentially becoming a source cited by Wikipedia. isaacl (talk) 22:18, 4 January 2023 (UTC)[reply]
I don't think interaction with experts is the main problem; it's more about reducing the risk of their edits getting lost in the fray. Probably the biggest disincentive for contributions is the idea that "it won't matter, 'cause it won't last" (some likely even would add: "and a nonexpert would eventually destroy my improvements anyway"). Maybe having a guideline recommending that these reverted edits be placed in a talk page header, so they can get reviewed, even a year, 3 years, 5 years later? (better late than never). Or, adding a link to the talk page header with recommendations specific to expert contributors: "state you're an expert, say what's wrong, and give a source that could be used to fix it", to let them know that they can help us a lot with just a few minutes of their time, without them feeling they need to learn how Wikipedia works and just not bothering.
Or having some way for experts to contribute to articles in their userspace, not have these drafts deleted after some arbitrary time, and similarly link them in a talk page header so they can be reviewed at some point? These types of more m:Eventualist approaches seem like they would be most fruitful, without holding experts to lower contribution standards than other editors. DFlhb (talk) 20:36, 4 January 2023 (UTC)[reply]
But how would you deal with the fact that we have no mechanism for verifying who is, indeed, an expert? You cannot label an edit as being made by an expert unless you have verified that the user making the edit is indeed an expert. After the Essjay controversy, Jimmy Wales proposed that Wikipedia adopt a system for verifying experts, but the community said no. While that was 15 years ago, I would be surprised if the community is ready to officially recognize experts. If it is ready, there is the question of the bureaucracy that would be needed to administer it. Donald Albury 21:04, 4 January 2023 (UTC)[reply]
I love Jimbo's idea. One mechanism might be the VRT. Also just learned there's a relevant draft proposal on verification; posting in case others haven't seen. DFlhb (talk) 21:24, 4 January 2023 (UTC)[reply]
But there are so many huge problems. (1) lots of experts won't want to be identified or identifiable, or 'outed', and will not get involved; (2) if experts can be validated by staffers in private, how do we have the transparency to know they're really experts, or do we have an unknown clique with special editing privileges; (3) how do we retain casual-experts who happen to spot an error while drifting past as readers, and correct it (often as an IP editor); and (4) do we actually want an encyclopaedia operating in Britannica mode, written by experts rather than everyone who can find a source? It's a good beast, but a different beast. You might end up with a not-very-good encyclopaedia written by the sort of ex-expert who has time on their hands and nothing much else to do because they're not actually all that good at what they do. Elemimele (talk) 21:48, 4 January 2023 (UTC)[reply]
The problems are so huge that they would completely change the nature of Wikipedia, and undo over two decades of work that has pretty much put traditional general encyclopedias out of business. For verification to work it would need a huge bureaucracy to support it, and you would still have the problem that extremist POV-pushers would constitute the majority of "experts", as others would not be prepared to go through this process in order to do voluntary work. Self-certification would be even worse, as many people over-estimate their expertise. Phil Bridger (talk) 22:18, 4 January 2023 (UTC)[reply]
This seems like a pretty strong misrepresentation of the proposed idea. You're implying that it's some overhaul of editing when it would just be a way for editors to verify their credentials if they liked. Also, there are already systems in place for verifying private information to the WMF. Thebiguglyalien (talk) 22:46, 4 January 2023 (UTC)[reply]
Just want to add to this: we're going back into the "it's an interaction problem" territory, the opposite of what I support. The larger goal is to fix the issue of good ideas being almost "lost to time" in talk page archives or revision histories. The credential verification isn't even a requirement for my ideas; I'm brainstorming ways to make m:Eventualism work better; that's all. DFlhb (talk) 23:27, 4 January 2023 (UTC)[reply]
The problem is that the only decision-making mechanism on English Wikipedia is based on consensus. If some edit is identified as a "good idea", then editors will work at putting it into the article now. It's operationally difficult to maintain a list of ideas that aren't determined to be good now, but that a different set of editors might think are good in future, because any edit can meet that definition. isaacl (talk) 00:02, 5 January 2023 (UTC)[reply]
Citizendium is an object lesson in recognizing experts. Articles could be written by "authors", but only "editors" could approve an article. In academic fields, editors had to hold a PhD in the field, and be working in the field they were an editor for. In non-academic fields they let someone who had published articles on the subject be an editor. Even though I have an earned PhD, I could not have been an editor for that field because I had not held a position in the field. Check out [citizendium.org Citizendium] and see how well it is doing these days. Donald Albury 01:48, 5 January 2023 (UTC)[reply]
The Essjay rule still applies though. If you edit an article and are not an expert, then you are a sock, and will be dealt with accordingly. The level at which articles need to be pitched is always problematic; there is no micro-Wikipedia. Nor are we in the business of lies-to-children. We try to pitch the article at the general reader, but we also know that the more complicated a subject, the more likely it is that the reader has expertise. Anyone looking at an article on integration will be at least a high schooler, for that is when the subject is taught. My third grade math text said: "a circle is a set of points". What a mind-blowing concept! So if it is good enough for the third grade, the rest should have no problems. Hawkeye7 (discuss) 03:20, 5 January 2023 (UTC)[reply]
The reader I have in my head when I look at Integral is a high-school kid's grandmother: the kid has come home talking about integrals and she wants to know what the kid is going on about. She's intelligent, but her maths education happened 60 years ago (and at a time when girls were expected to cook, not integrate), so she turns to the world's best general reference book for help. It'd be disgraceful of us not to do our best! But what's she going to make of the first sentence of the section on Lebesgue integrals: "It is often of interest, both in theory and applications, to be able to pass to the limit under the integral"? Lebesgue did a better job of explaining in terms of loose change in his pocket. The diagrams in the article are much, much better than the text. To write about integrals in an encyclopaedic way that is useful to an intelligent grandmother you need someone who's an expert not only at maths, but also at little old ladies. Elemimele (talk) 14:26, 5 January 2023 (UTC)[reply]
There are multiple issues raised here. The first is that while "it'd be disgraceful of us not to do our best!", our top level articles are seldom examples of our best work. They are very hard to write! So the experts prefer writing up more specific but more manageable topics. (I intended to rewrite one within my own field of expertise over the holiday, but found it more congenial to write about the guy who built Disneyworld.) The second issue is how we can cater for the level of background knowledge of the reader, which determines what information they are seeking. There are three cases in your example: the grandmother (who sounds very much like my own, who attended a domestic arts school back in the 1950s), who has little background; the high schooler, who would be in year 10 or 11, when the topic of integration is introduced; and the college student, who will be encountering the Lebesgue integral. (A crucial concept, as noted above, was slyly slipped into the third grader's text, but this was part of the New Math movement of the 1960s.) The Lebesgue integral subarticle can assume that level of knowledge; in most cases, the more specific an article, the more we can infer about the reader. But what about this article? We Wikipedians know that "Formal definitions" means "college level math in this section" but most readers don't know that. We now have a tug of war among editors that is common to many mathematical articles. What is the logical ordering? I would argue for pushing that section down the article, and bringing the section on the Fundamental Theorem of Calculus (which the high schooler will encounter) up. But other editors will argue that the ordering is more logical the way it is: with the detailed proofs and concepts coming first. In other words, the issue is pedagogical, not mathematical. Hawkeye7 (discuss) 20:41, 5 January 2023 (UTC)[reply]
That last sentence is spot on. The most valuable experts here are those who consider the pedagogical side. Elemimele (talk) 06:57, 6 January 2023 (UTC)[reply]
I think user boxes were supposed to help with encouraging multiple people with experience and expertise to comment, and know they will be supported. But if you ask for help on a user talk then you could be accused of spamming. Wakelamp d[@-@]b (talk) 03:30, 2 February 2023 (UTC)[reply]
As a PhD-possessing expert who edits almost exclusively in my field of work and study, I'm not really sure that this is really a problem of Wikipedia editors driving off experts. I think these things are true but I'll only speak for myself: This is a really weird place to write and it's different from how I write in almost any other context. Moreover, because most of my interactions are with people with whom I have no relationship and have much less familiarity with the subject, I find myself explaining things over and over again, sometimes things that are glaringly obvious to me and my colleagues but unknown outside of those circles. For example, a few days ago an editor was asking me why we rely on the Carnegie Classification of Institutions of Higher Education to label some U.S. universities as "research universities;" it's a very legitimate question and one that I should be able to answer but it's questioning such a basic, common practice in my discipline that it was equal parts frustrating and amusing (amused at myself for struggling to answer such a basic, reasonable question - not amusement at another editor's ignorance!). Similarly, I've reverted many edits made by editors who confuse a capital campaign with an endowment; another very obvious distinction to me but clearly not obvious for many other people.
In my discipline, I think that we bear most of the fault for not wanting to engage here and contribute to this public good. This is a weird place with a community and practices unlike any other so it takes a lot to stick with it and learn your way around. There are certainly things we can do to make it easier for new editors. But I haven't experienced much that is specific for experts who are new editors that must be changed. Deferring to someone else's expertise without evidence is definitely not the way to go. It would certainly make my editing here easier in some instances but it would be a massive change in our foundational culture and practices, a change that I would not support and I doubt would garner significant support project-wide. ElKevbo (talk) 04:12, 6 January 2023 (UTC)[reply]
Some of you may also be interested in Wikipedia:Teahouse#Non-expert review guild? by GuineaPigC77. Whatamidoing (WMF) (talk) 18:13, 6 January 2023 (UTC)[reply]
I think the focus on "real" experts, credentials, whether they can talk to non-experts, etc. is perhaps a bit beside the problem exposed in the original post. Those of us who remember the Web before it had version numbers (through, say, the early 2000s?) will recall that a significant proportion of it used to be individually-written web pages on the author's little niche interest. (A representative sample for the youngsters.) Sometimes the people who wrote these sorts of things were Genuine Certified Experts as regarded the subject they were writing about; more often, perhaps, they were adjacent to the subject. e.g., the person writing lucidly about the Lebesgue integral and its applications might be a programmer implementing a math library rather than a professor of mathematics. It might be someone with no formal credentials at all about the subject, but a passionate amateur student; superficially interested people generally couldn't be bothered to engage deeply enough and for long enough to do this sort of thing. I think Wikipedia absorbed a lot of that passionate amateurism, and rechanneling it was responsible for a lot of our early growth.
Unfortunately, as Wikipedia has become a load-bearing part of the noosphere, we've had to face an increasingly complex threat model. Many more people now edit Wikipedia, not out of a sort of naive enthusiasm for knowledge, but because of a desire to promote (or suppress) some person, organization, or ideology; and the widespread consumption of our information means that errors, even those made by a well-meaning but ignorant enthusiast, can have a great deal of impact. As a result, the way we interpret content policy has become increasingly rigid and compulsive, and focused on protecting us from the lowest common denominator editor. Even if the modal editor is a crook or an idiot, firm application of policy will (we hope) result in them creating accurate articles, will they, nill they. The problem is that this general trend in policy and the interpretation of policy is paid for by the slow, gradual immiseration of editors who are knowledgeable about a particular topic. When trying to make a specialized topic intelligible to a lay reader, you will almost always find that certain pieces of disciplinary knowledge, like those ElKevbo mentions above, are assumed to be understood by the reader of the reliable sources you are using to write the article, and you will not be able to cite a clear, explicit statement of that piece of knowledge from the source. The conscientious editor will find that the particular statements drawn from the reliable sources are intimately intermixed with background knowledge, and is faced with an extended hunt through peripherally relevant sources to gain an explicit warrant for those pieces of background knowledge. It is, frankly, exhausting, a strain on working memory, and deters sustained contribution.
I don't know what to suggest as a solution. These changes to policy happened organically, and for a reason. But I think our current approach will tend to be self-perpetuating; having adapted our policies to editors who can't be trusted to know what they're doing or to act honestly, we will select for an editor base of that type. Choess (talk) 03:50, 7 January 2023 (UTC)[reply]
I understand your comment about filling in the background. Much of my editing is about history and/or archaeology. As I dig through sources I run into events, places, and concepts I have not heard of, and which are mentioned only in passing in a source, and for which no article yet exists in enWP. So I try to fill in those gaps. Such attempts all too often turn into a descent into a multi-branching rabbit hole. Donald Albury 23:26, 7 January 2023 (UTC)[reply]
I was just checking through my notes for MSc lectures I'll be giving shortly, and note that I explicitly warned last year's students not to read any of the relevant articles about the subject on Wikipedia, as the articles are riddled with errors, out of date, and full of trivialities that went nowhere; instead I furnished the students with a list of mainstream textbooks and review articles, and links to generous professors in the US who've put good teaching materials on their own websites. It's a bit depressing reading my own opinion. But there's no way I'm going to try to clear up that mess. I hope (and genuinely believe) that my subject is particularly badly covered, and that I'm not misinforming myself when I use Wikipedia as a reader on other topics. The trouble is, situations like this make me wonder whether other experts feel the same about their fields, and undermine my faith. Elemimele (talk) 13:29, 9 January 2023 (UTC)[reply]
I'm not an expert in the areas that I mainly edit in, so I rely on the reliable sources I can find (including the books from academic and other reputable presses I have accumulated in the last 15 years). I do find when working on existing articles that much of the content is not supported by citations, or is supported by citations to blogs, promotional sites, well-meaning but ill-informed "official" sites, or off-line sources that I am not familiar with, and cannot find coverage of on-line. All too often, the cited sources do not support, in part or at all, the content preceding the citation. I also look back at my early work on WP and cringe. In one early article I got a city name wrong. I saw the mistake seven years later, and after searching to see which editor had introduced the error, was embarrassed to discover that I had made the error when I wrote the article. Yet, I use WP all the time to look up something I know little or nothing about. I will also continue to do what I can to improve the content of WP, however little that may be in the grand scheme of things. Donald Albury 17:43, 9 January 2023 (UTC)[reply]
You've put your finger on one of the problems that some technical experts have here: that they encounter non-experts relying on reliable sources. But the non-experts don't realise that you sometimes have to have some expertise in the subject to recognise a reliable source. The articles in my field, an instrumentation-related branch of science, are riddled with trivia fished out of journals that might as well be re-titled "Annual Reviews of Whacky Ideas that Went Nowhere", but because they're peer-reviewed review articles, they're automatically deemed Reliable. Some of these ideas are 10, 20 years old but no one has ever developed them any further because they don't work. No one ever publishes a subsequent article saying "we read this idea and tried to do it, but it failed", that's not how publishing works. Instead, the fringe ideas just fade away. If you're writing a review for professionals, people expect to be told a few whacky things that they don't know; professionals know this, and don't expect the ideas to be mainstream. But non-experts don't know, so all this stuff gets trotted out in a Wikipedia article as though it were the bread-and-butter of the subject, with not the slightest attempt at distinguishing between what's done and what someone once briefly thought might be worth a try. Meanwhile, I found when I first tried to add some information about what people actually do, the real stuff, it would get reverted because the sources that best support this are often produced by the producers of the instruments, and so it's deemed non-independent stuff, tainted by commercialism. Or it's teaching information produced by labs that do it, in which case it'd be reverted as a "blog". But very often there are five manufacturers all writing more-or-less the same thing because they're describing, accurately, and up-to-date, what is actually done (i.e. 
it's not really non-independent, because since they all write the same thing, they might as well be writing about one another's instruments, and not their own), or multiple labs, of very high-quality output, are producing similar teaching material, so their "blogs" are mutually supported by the fact everyone who knows anything is saying the same thing. Wikipedia is like a bunch of people who want to know how a back-hoe works, but who refuse to listen to JCB, or a group of back-hoe operators, because they are utterly convinced that a guy called Bert who loves going to truck shows and wrote a book about it is somehow more reliable than a team of experts who actually build the things. And that's a problem. But I do think this depends enormously on the field concerned. There are almost certainly some fields where genuine experts are rare, and well-informed amateurs armed with good sources might be better. But again, it's really hard to assess your own ability to edit in a field without having the expertise. In a sense, the whole of Wikipedia is founded on ignoring the Dunning Kruger effect, and bits of it seem to get away with it quite well! And sorry, I'm rather changing views on experts here, having previously complained that experts are sometimes rubbish at explaining their subject. Elemimele (talk) 23:13, 9 January 2023 (UTC)[reply]
Is it because the experts get outnumbered? It is such a pity that contacting editors that have expertise listed in their infobox is spamming :-( Wakelamp d[@-@]b (talk) 12:21, 24 February 2023 (UTC)[reply]

This is an excellent way of putting peer reviewed content over here, "Category:Wikipedia articles published in peer-reviewed literature", and then that little open book sign appears on the right where the GA and FA symbols are; this somehow seems to connect to this thread on experts. Though this does seem to address expert-level-content more than experts-as-individuals. FacetsOfNonStickPans (talk) 11:43, 10 January 2023 (UTC)[reply]

How long does it take them to figure out I'm not actually an expert in their field? Hawkeye7 (discuss) 17:43, 12 January 2023 (UTC)[reply]

One other factor that drives experts from Wikipedia -- & IMHO prevents more than a limited amount of serious development on any subject -- is the lack of any real payback for the work. Outside Wikipedia if one writes an essay or monograph on a topic, the writer expects to receive something in return: money, or credit, or simply ownership. Instead a Wikipedian donates their time, expertise, & incidental costs of writing an article to the project, after which it becomes the property of everyone, & (as the slogan reads) "anyone can edit it". Yes, we contribute to Wikipedia out of love, but unless a contributor gets something tangible from the contribution this work is in effect unrequited love. The result is that only a fraction of the already small group of contributors will doggedly fight to keep an article usable & the information correct, & even then (as pointed out above) those few may have a subtly incorrect or out of date understanding of the topic.

While this does not discourage any contributions, it has a dampening effect: one is going to limit one's editing time, research time, & enthusiasm for a given topic if one is fighting ignorance without any recognition of one's efforts. Articles will reach a point of improvement, then stay at that point -- or degrade if the original author has moved on -- due to this lack of recognition & the policy about WP:OWNERSHIP. Now I'm not saying we should throw out those policies; after all, this concerns a central part of Wikipedia culture, & we can all point to instances where this required radical altruism has helped Wikipedia. However, having been made aware of the negative impact this required altruism has, perhaps we should think about loosening this requirement. Or accept Wikipedia is doomed to being only so useful. -- llywrch (talk) 19:48, 20 January 2023 (UTC)[reply]

I am joining this conversation a little too late, but as an expert who left Wikipedia, I thought I would share my experience. In my case, the problem was a fellow expert who wanted to influence articles from a particular perspective. Eventually things escalated to the point that the editor decided to publish papers in tangentially related journals not really in the subject area, for example something in the sociology or philosophy of mathematics rather than mathematical statements. Friends of his would then change articles to the desired POV, with a citation that said exactly what would dovetail into the WP article. As is pointed out above, it takes "some expertise in the subject to recognise a reliable source". Editors who were not experts understandably felt that the point of view was verifiable and notable since it was published, and even I have to admit it seemed custom-written for the Wikipedia article. Since it grew out of the dispute, this made sense. I felt my choices were to go down the path of writing my own crazy articles, or just accept this.
As a side note, a lot of the disagreement was about when/where/how to discuss infinitesimal ideas. I am personally horrified to see it pointed out that the jargon made it into the lead sentence of integral. But in terms of full disclosure, I have edited that sentence (see here if you're curious), so I shouldn't be considered an unbiased observer. And I am not interested in reopening old grievances. Thenub314 (talk) 15:58, 9 February 2023 (UTC)[reply]
Because so many editors would rather revert than add, e.g., a {{cn}}, I have decided that WP:BOLD is often not tenable and use the talk pages heavily prior to putting in the time to do major edits. I really wish that MOS would discourage reverts for reasons where adding a template is more appropriate.
Also, there are subjects to which I will no longer contribute because some editors have created a hostile work environment. A more robust dispute resolution mechanism, not requiring consent by all editors involved, would help. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 01:34, 13 February 2023 (UTC)[reply]

I don't think this is going to actually happen, but I have had the following idea for a while: the Wikimedia Foundation should hire a small group of academics as paid editors, whom I call chief editors. In the very early days of Wikipedia, Larry Sanger was a paid editor who was in the position of editor-in-chief. The chief editors would play a similar role. Their primary role would be to make the editorial judgement of which reliable sources an article should be based on. As pointed out above, consensus is not a good way to select the best sources for articles; especially for scholarly articles, an approach based on credentials is better. The chief editors would be hired by the foundation, perhaps based on nominations/recommendations from the Wikipedia community. We the community should nominate those who have a good understanding of NPOV and have some academic degrees as well. Being paid by the foundation should also help avoid the COI concern. Anyway, just an idea. -- Taku (talk) 13:35, 13 February 2023 (UTC)[reply]

AGREED - DFlhb, I would never recommend anyone to this site. It doesn't matter what you know or what you can prove, the only thing that really matters is whether you have more privileges than another user or have a friend who does. The person with the most power almost always wins at the end of the day. I won't name names, but I know some users here who have retired more people than Mike Tyson did by coming up with endless challenges and objections to the point of making users quit. The site is about 20% of people who are interested in helping other people learn about subjects and 80% of people who use this site for power trips. KatoKungLee (talk) 02:41, 24 February 2023 (UTC)[reply]

A semi-automated tool for IP vandal-fighting based on revert statistics

I'd like to discuss an approach to IP vandal-fighting, that would introduce a semi-automated tool to lighten the load on admins, while hopefully improving our catch rate. Currently, semi-protection is used on article (and other) pages, which protects them from editing by unregistered users, when disruption becomes excessive. A weakness of this approach currently, is that it is a manual process, requiring use of the {{Edit semi-protected}} template at the Talk page, or a request at Wikipedia:Requests for page protection.

My proposal has two parts. The first addresses "identification", i.e., the "when disruption becomes excessive" part, and attempts to automate that. The second part would use the result of the identification process to either flag/report the page, or semi-protect the page automatically, under some conditions. The proposal does not address, and brings no improvement to, vandalism from registered users.

Part 1 — Identification

The identification process is based on the theory that the proportion of reverts on a page tells us something about vandalism. This is a very rough association, as contentious topics are also likely to have a higher proportion of reverts which are not due to vandalism. The process proposed here attempts to tease out the difference.

The proposal depends on the gathering of some statistics about reverts on an article, which might be something like:

  1. ip-edits: number of IP edits in last time-period (a series; week, month, quarter, year?)
  2. ip-reverted: number of IP edits in last time-period that were reverted
  3. reg-edits: number of registered user (non-bot?) edits in last time-period
  4. reg-reverted: number of registered user edits in last time-period that were reverted

(maybe also ip-reverts: number of IP edits in last time-period that are reverts of another user edit)

We would then derive:

  • ip-reverted-pct: percent of IP edits reverted in last time period (w/m/q/y)
  • reg-reverted-pct: percent of registered user edits reverted in last time period
  • reverted-index: calculate ip-reverted-pct / reg-reverted-pct
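As a rough sketch of the derived values above: the class and function names here are hypothetical, and returning None or infinity when a denominator is zero is an assumption that the proposal itself does not specify.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RevertStats:
    """Raw counts for one page over one time period (names are illustrative)."""
    ip_edits: int
    ip_reverted: int
    reg_edits: int
    reg_reverted: int


def pct(reverted: int, total: int) -> Optional[float]:
    """Percent of edits reverted, or None when there were no edits at all."""
    return None if total == 0 else 100.0 * reverted / total


def reverted_index(s: RevertStats) -> Optional[float]:
    """ip-reverted-pct / reg-reverted-pct for one page."""
    ip_pct = pct(s.ip_reverted, s.ip_edits)
    reg_pct = pct(s.reg_reverted, s.reg_edits)
    if ip_pct is None or reg_pct is None:
        return None  # not enough data to say anything
    if reg_pct == 0:
        # Assumption: treat "no registered edits reverted" as an infinite ratio
        return float("inf") if ip_pct > 0 else None
    return ip_pct / reg_pct
```

For a page with 42 IP edits (25 reverted) and 92 registered edits (1 reverted), this gives an index of roughly 55.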

Over time, we would also have data available to calculate mean values over a large number of articles for the derived statistics, which would yield:

  • mean-ip-reverted-pct: mean, over all sampled articles, of ip-reverted-pct in last time period (w/m/q/y)
  • mean-reg-reverted-pct: mean, over all sampled articles, of reg-reverted-pct in last time period
  • mean-reverted-index: calculate mean-ip-reverted-pct / mean-reg-reverted-pct

Possibly we could have dual, or multiple sets of these values, depending on "contentious topic" status, or other factors. Possibly some appropriate db query could provide a rough initial approximation for these values for starters, until we have more data.

Part II — response

The meat of the proposal, involves comparing the derived reverted-index to mean-reverted-index, and depending on some conditions (minimum thresholds of accumulated data, etc.) and some configurable parameters (one standard deviation from the mean?) would then do something, which could be:

  • generate a human-readable report, for review by semi-protection team (or pending changes?) and further action
  • semi-protect the article on the fly, if conditions and thresholds are met (maybe stricter/higher threshold, 2 SDs?, than just for reporting?)
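The comparison described above might be sketched as follows. Everything here is an assumption for illustration: the function name, the minimum-data threshold of 10 IP edits, and the use of 1 and 2 sample standard deviations as the report and protect cut-offs (the proposal only floats these as possibilities).

```python
from statistics import mean, stdev


def classify(index, population, ip_edits,
             min_ip_edits=10, report_sds=1.0, protect_sds=2.0):
    """Compare one page's reverted-index against a population of indexes.

    population: finite reverted-index values from many other pages.
    Returns "protect", "report", or "none".
    """
    if ip_edits < min_ip_edits:
        return "none"  # too little accumulated data to act on
    mu = mean(population)
    sd = stdev(population)  # sample standard deviation; needs >= 2 values
    if index > mu + protect_sds * sd:
        return "protect"
    if index > mu + report_sds * sd:
        return "report"
    return "none"
```

Keeping the two cut-offs as parameters reflects the idea that automatic protection should need a stricter threshold than a report for human review.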

Note: about 5% of edits to a page are vandalism, per this study (2006 data).

Note that any on-the-fly protection could have a built-in throttle almost for free; that is, once protected, the ip-reverted numbers on articles having a lot of vandalism would start to fall, and soon show up in the weekly/monthly figures as drops in values. A counterpart to on-the-fly semi-protection could examine just those articles and monitor against some other (configurable) param, and automatically remove semi-protection when that "drop" threshold was reached. This would mitigate any collateral damage caused by semi-protecting an article based on high revert-ratios that in fact was not related to vandalism. (I would argue that any such collateral damage would represent far less damage to the project in the aggregate than letting all of it through and requiring someone to notice it and process it manually; I'd call it simply, "an abundance of caution", but that's a separate discussion.)

This proposal makes some key assumptions about the meaning of these statistics, and whether they correlate well, or at all, with vandalism, and for that we would probably need a small starter set of data from a handful of articles which could then be assessed by humans to see whether there's a fit. Someone like the Quarry folks (pinged below) might know how to gather a starter set like this for a few articles, so we can test the theory and see if it holds up.

The hope is, that by providing some automation of the detection of possible IP vandalism, with the concomitant possibility of semi-automated or automated semi-protection, we can greatly improve the detection of IP vandalism, somewhat lighten the load on users, who have to stop what they are doing when they notice suspected cases in order to find out how to report it and then report it, and vastly lighten the load of admins having to deal with semi-protection requests, which are probably only a small fraction of the known IP vandalism, and an even smaller fraction of all IP vandalism, with a possible submerged iceberg out there, that no one even notices.

Adding possibly interested users; ClueBot operators: @Cobi, Rich Smith, and DamianZaremba:; Xtools revert stats: MusikAnimal; db query folks: @Cryptic, Novem Linguae, Certes, and Joe Roe:; WP:AIV folks: @Yamla, ToBeFree, Daniel Case, Wldr, Zzuuzz, Bbb23, Materialscientist, Hut 8.5, HJ Mitchell, and KrakatoaKatie:; recent related WP:ANI discussion participants: @Boing! said Zebedee, NinjaRobotPirate, and Kusma:, and pt-wiki users @Renato de Carvalho Ferreira, JMagalhães, MisterSanderson, Érico, and PauloMSimoes: (who took part in the August 2020 IP-banning discussion[English] on pt-wiki.) Thanks, Mathglot (talk) 00:07, 5 February 2023 (UTC)[reply]

Re-ping, due to typo: User:Widr. Mathglot (talk) 00:21, 5 February 2023 (UTC)[reply]
You provided a thorough and thoughtful discussion but I don't see where you've provided a specific proposal. North8000 (talk) 00:42, 5 February 2023 (UTC)[reply]
I guess I implied a proposal, without really stating it:
Let's gather the required revert stats (maybe we have them already?) and create the infrastructure necessary to calculate the needed values, generate reports on them, and maybe auto-semiprotect those pages for which it exceeds a certain threshold.
But as this is still in brainstorming stage, I wasn't quite ready to make a specific proposal, since I'm sure I'll benefit from feedback and new information, which likely will change the proposal. But still, clarity is always better, so this is my first take at a specific proposal. Thanks, Mathglot (talk) 00:56, 5 February 2023 (UTC)[reply]
Thanks for the ping; I'll wait for data before judging this. My initial thought was "this can't work without a distinction between vandalism and good-faith edits in the data", but then again, page protection is applied to prevent disruptive editing in general. ~ ToBeFree (talk) 01:01, 5 February 2023 (UTC)[reply]
(edit conflict) ToBeFree, the proxy for that distinction is the comparison of article revert-index to the mean; i.e., how the ratio of reverted IP edits to reverted user edits compares to articles in the encyclopedia more generally. If it's twice or ten times as much, why is that? Still, this calculation is a blunt instrument, and as I envisage it, just a starting point. Sharpening it might involve factoring in contentiousness of the article, and other features of the reverted IPs, or the reverting users. I know when I look at my long Watchlist, that I skip or pay less attention to reverts by editors I trust, than ones I don't recognize, and when I see the edit summary "Fix typo" on an edit by an IPv6 which adds +338 bytes to the article, I examine it very carefully. I'm sure we all have an instinct about what to look for in a revert, and this is just a very rough first cut to try to translate that instinct into something quantifiable based on actual data we can deal with and base calculations upon. Necessarily, the first version won't be a very good approximation, but if it's feasible at all, we have to start somewhere. That's why I pinged some of the ClueBot folks, because they may be doing something like this, though for a different task. Mathglot (talk) 01:23, 5 February 2023 (UTC)[reply]
That comparison is surely useful, but does not provide the distinction I was referring to, as there is no way to see from the data whether a specific article has a high amount of good-faith disruptive IP editing or bad-faith disruptive IP editing. Fortunately, this distinction is probably not needed as both should lead to protection. ~ ToBeFree (talk) 01:26, 5 February 2023 (UTC)[reply]
ToBeFree, I see, thanks for that clarification. Certes has already weighed in with some data below, which is exciting to see; I'm still looking at it, to see what it can tell us. Rather than misinterpret you again, can you describe the data you would like to see, or think of a query (in English) that would provide the data you are waiting for? Mathglot (talk) 01:38, 5 February 2023 (UTC)[reply]
That's actually already what I had been waiting for; I'll reply below. ~ ToBeFree (talk) 01:43, 5 February 2023 (UTC)[reply]
I've done a few preliminary counts for the week 00:00 28 Jan to 00:00 4 Feb (excluding 4 Feb, as its bad edits may not yet be reverted). Registered editors made 2,253,389 edits of which 35,146 (1.56%) were reverted. IPs made 220,447 edits of which 41,862 (18.99%) were reverted, a ratio of 12.18. I don't have the standard deviation but I've listed the articles with a ratio over 25, limiting it to those with more than 10 reverted IP edits (to eliminate articles with one IP vandal and no registered vandals). Results are here. Most of them have an infinite ratio because no registered editors made a reverted change last week. Of course, some reverted edits may be a good-faith vandal fighter having their work undone by a persistent vandal. Certes (talk) 01:08, 5 February 2023 (UTC)[reply]
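The site-wide figures quoted above can be checked with a few lines of arithmetic. The variable names are just for illustration; the counts are the ones given in the comment.

```python
# Counts for the week 28 Jan to 4 Feb 2023, as quoted above
reg_edits, reg_reverted = 2_253_389, 35_146
ip_edits, ip_reverted = 220_447, 41_862

reg_pct = 100 * reg_reverted / reg_edits  # percent of registered edits reverted
ip_pct = 100 * ip_reverted / ip_edits     # percent of IP edits reverted
ratio = ip_pct / reg_pct                  # site-wide reverted-index

print(f"registered: {reg_pct:.2f}%  IP: {ip_pct:.2f}%  ratio: {ratio:.2f}")
# registered: 1.56%  IP: 18.99%  ratio: 12.18
```

Note that the ratio is computed from the unrounded percentages; dividing the rounded 18.99 by 1.56 gives a slightly different figure.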
Excellent; thanks very much for this. This is just what we need, some hard data we can begin to look at, so we can have a discussion based on the data, and not speculation. Mathglot (talk) 01:28, 5 February 2023 (UTC)[reply]
Thank you very much! Looking at the evaluated revision history of "1939_Japanese_expedition_to_Tibet", which is displayed as "11 ip_total, 11 ip_reverted, 1 reg_total, 0 reg_reverted", the data quality could probably be improved by counting successive edits from the same IP address as only one contribution, and counting it as reverted if at least one of the edits (alternatively, any of them) has a mw-reverted tag. ~ ToBeFree (talk) 01:47, 5 February 2023 (UTC)[reply]
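One way to sketch the suggestion above, collapsing successive edits by the same editor into a single contribution: the input format and function name are assumptions, and the IP shown in the usage note is a documentation-only example address.

```python
from itertools import groupby


def collapse_runs(edits, require_all=False):
    """Collapse consecutive edits by the same editor into one contribution.

    edits: chronological list of (editor, was_reverted) tuples for one page.
    A run counts as reverted if any of its edits was reverted, or only if
    all of them were when require_all is True.
    """
    collapsed = []
    for editor, run in groupby(edits, key=lambda e: e[0]):
        flags = [was_reverted for _, was_reverted in run]
        reverted = all(flags) if require_all else any(flags)
        collapsed.append((editor, reverted))
    return collapsed
```

With this, ten reverted edits in a row from 192.0.2.1 followed by one edit from a registered user would count as one reverted IP contribution and one registered contribution, rather than ten and one.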
Quarry 71078: Articles where edits are much more likely to be reverted if from IPs
ns rc_title ip_total ip_reverted reg_total reg_reverted
0 1939 Japanese expedition to Tibet (edit | talk | history | links | watch | logs) 11.0 11.0 1.0 0.0
0 2011 United States debt-ceiling crisis (edit | talk | history | links | watch | logs) 22.0 21.0 5.0 0.0
0 2017 Snooker Shoot Out (edit | talk | history | links | watch | logs) 13.0 13.0 1.0 0.0
0 2022 European heat waves (edit | talk | history | links | watch | logs) 40.0 11.0 21.0 0.0
0 2023 German Masters (edit | talk | history | links | watch | logs) 42.0 25.0 92.0 1.0
0 2023 North Island floods (edit | talk | history | links | watch | logs) 24.0 16.0 349.0 4.0
0 2023 Pakistan Super League squads (edit | talk | history | links | watch | logs) 22.0 22.0 2.0 0.0
0 2023 Pakistani general election (edit | talk | history | links | watch | logs) 13.0 12.0 11.0 0.0
0 Afra Saraçoğlu (edit | talk | history | links | watch | logs) 32.0 16.0 6.0 0.0
0 America's Next Top Model (season 7) (edit | talk | history | links | watch | logs) 15.0 14.0 1.0 0.0
0 America's Next Top Model (season 9) (edit | talk | history | links | watch | logs) 34.0 13.0 8.0 0.0
0 American football positions (edit | talk | history | links | watch | logs) 11.0 11.0 4.0 0.0
0 Barbapapa: One Big Happy Family! (edit | talk | history | links | watch | logs) 12.0 11.0 4.0 0.0
0 Benigembla (edit | talk | history | links | watch | logs) 66.0 11.0 3.0 0.0
0 Brahui language (edit | talk | history | links | watch | logs) 14.0 13.0 2.0 0.0
0 Cascadia movement (edit | talk | history | links | watch | logs) 13.0 13.0 5.0 0.0
0 Criminal stereotype of African Americans (edit | talk | history | links | watch | logs) 11.0 11.0 9.0 0.0
0 Delicious Party Pretty Cure (edit | talk | history | links | watch | logs) 33.0 16.0 11.0 0.0
0 Disney Princess (edit | talk | history | links | watch | logs) 11.0 11.0 2.0 0.0
0 Elimination Chamber (2023) (edit | talk | history | links | watch | logs) 27.0 23.0 76.0 1.0
0 Fateh Burj (edit | talk | history | links | watch | logs) 13.0 11.0 1.0 0.0
0 Food 4 Less (edit | talk | history | links | watch | logs) 18.0 13.0 9.0 0.0
0 Georges Méliès (edit | talk | history | links | watch | logs) 13.0 13.0 1.0 0.0
0 Gum (edit | talk | history | links | watch | logs) 12.0 12.0 3.0 0.0
0 Harshvardhan Rane (edit | talk | history | links | watch | logs) 33.0 12.0 3.0 0.0
0 Hero: 108 (edit | talk | history | links | watch | logs) 18.0 16.0 3.0 0.0
0 Hurlock, Maryland (edit | talk | history | links | watch | logs) 16.0 16.0 4.0 0.0
0 Indonesia's Next Top Model (season 3) (edit | talk | history | links | watch | logs) 79.0 15.0 32.0 0.0
0 Intimidation (edit | talk | history | links | watch | logs) 11.0 11.0 4.0 0.0
0 Janet Nguyen (edit | talk | history | links | watch | logs) 89.0 16.0 1.0 0.0
0 Jerry Lawson (engineer) (edit | talk | history | links | watch | logs) 12.0 12.0 10.0 0.0
0 Jesse Lee Soffer (edit | talk | history | links | watch | logs) 16.0 13.0 5.0 0.0
0 JimJam (edit | talk | history | links | watch | logs) 15.0 13.0 7.0 0.0
0 Jon Snow (character) (edit | talk | history | links | watch | logs) 12.0 12.0 7.0 0.0
0 Joseph Barboza (edit | talk | history | links | watch | logs) 16.0 15.0 9.0 0.0
0 Justin Jefferson (edit | talk | history | links | watch | logs) 14.0 14.0 11.0 0.0
0 Lam Kor-wan (edit | talk | history | links | watch | logs) 41.0 14.0 8.0 0.0
0 Lara Secondary College (edit | talk | history | links | watch | logs) 19.0 19.0 3.0 0.0
0 Larry Page (edit | talk | history | links | watch | logs) 16.0 14.0 7.0 0.0
0 List of Asian stadiums by capacity (edit | talk | history | links | watch | logs) 52.0 15.0 4.0 0.0
0 List of Fuller House episodes (edit | talk | history | links | watch | logs) 12.0 11.0 2.0 0.0
0 List of Go, Diego, Go! episodes (edit | talk | history | links | watch | logs) 18.0 18.0 4.0 0.0
0 List of Pretty Cure films (edit | talk | history | links | watch | logs) 24.0 15.0 1.0 0.0
0 List of current automobile manufacturers by country (edit | talk | history | links | watch | logs) 13.0 12.0 3.0 0.0
0 List of equipment of the South African Army (edit | talk | history | links | watch | logs) 12.0 12.0 4.0 0.0
0 List of former TV channels in the United Kingdom (edit | talk | history | links | watch | logs) 15.0 12.0 3.0 0.0
0 List of militaries by country (edit | talk | history | links | watch | logs) 20.0 18.0 10.0 0.0
0 List of programs broadcast by Treehouse TV (edit | talk | history | links | watch | logs) 62.0 12.0 9.0 0.0
0 Lucas Merolla (edit | talk | history | links | watch | logs) 30.0 14.0 9.0 0.0
0 Maybe (Machine Gun Kelly song) (edit | talk | history | links | watch | logs) 13.0 13.0 4.0 0.0
0 Melinda Dillon (edit | talk | history | links | watch | logs) 12.0 11.0 7.0 0.0
0 Mike Little (edit | talk | history | links | watch | logs) 12.0 12.0 4.0 0.0
0 Oberholzer murder (edit | talk | history | links | watch | logs) 11.0 11.0 1.0 0.0
0 Pedro Porro (edit | talk | history | links | watch | logs) 15.0 12.0 32.0 1.0
0 Philippine peso (edit | talk | history | links | watch | logs) 14.0 11.0 3.0 0.0
0 Quicksilver (wrestler) (edit | talk | history | links | watch | logs) 18.0 18.0 4.0 0.0
0 Real World/Road Rules Challenge: Battle of the Sexes 2 (edit | talk | history | links | watch | logs) 13.0 13.0 2.0 0.0
0 Roberrt (edit | talk | history | links | watch | logs) 12.0 11.0 6.0 0.0
0 Roger Tuivasa-Sheck (edit | talk | history | links | watch | logs) 15.0 15.0 3.0 0.0
0 Sara Corrales (edit | talk | history | links | watch | logs) 11.0 11.0 1.0 0.0
0 Sony Pictures Imageworks (edit | talk | history | links | watch | logs) 11.0 11.0 3.0 0.0
0 St Mary's Rochfortbridge GAA (edit | talk | history | links | watch | logs) 17.0 13.0 1.0 0.0
0 Statement (computer science) (edit | talk | history | links | watch | logs) 12.0 12.0 3.0 0.0
0 Stetson University College of Law (edit | talk | history | links | watch | logs) 14.0 14.0 1.0 0.0
0 Superbook (edit | talk | history | links | watch | logs) 13.0 12.0 6.0 0.0
0 Swashbuckle (TV series) (edit | talk | history | links | watch | logs) 27.0 27.0 10.0 0.0
0 Syria Palaestina (edit | talk | history | links | watch | logs) 14.0 14.0 9.0 0.0
0 Tejasswi Prakash (edit | talk | history | links | watch | logs) 13.0 11.0 3.0 0.0
0 The Friendly Type (edit | talk | history | links | watch | logs) 16.0 14.0 1.0 0.0
0 The Theory and Practice of Oligarchical Collectivism (edit | talk | history | links | watch | logs) 12.0 12.0 2.0 0.0
0 Third Servile War (edit | talk | history | links | watch | logs) 13.0 13.0 1.0 0.0
0 UKTV (edit | talk | history | links | watch | logs) 18.0 18.0 3.0 0.0
0 VRT NWS Journaal (edit | talk | history | links | watch | logs) 13.0 13.0 5.0 0.0
0 Varma (surname) (edit | talk | history | links | watch | logs) 11.0 11.0 3.0 0.0
0 Verona Villafranca Airport (edit | talk | history | links | watch | logs) 11.0 11.0 6.0 0.0
0 Walter Reed (edit | talk | history | links | watch | logs) 14.0 14.0 3.0 0.0
0 Williams Street (edit | talk | history | links | watch | logs) 65.0 30.0 1.0 0.0
0 XEFE-AM (edit | talk | history | links | watch | logs) 12.0 12.0 8.0 0.0
0 Čačak (edit | talk | history | links | watch | logs) 11.0 11.0 3.0 0.0
1 Talk:Ce Acatl Topiltzin (edit | subject | history | links | watch | logs) 11.0 11.0 2.0 0.0
1 Talk:Isla Bryson case (edit | subject | history | links | watch | logs) 14.0 13.0 130.0 3.0
1 Talk:Metohija (edit | subject | history | links | watch | logs) 12.0 12.0 18.0 0.0
1 Talk:Numbers (TV series) (edit | subject | history | links | watch | logs) 12.0 12.0 3.0 0.0
118 Draft:The Bad Batch (2016 film) (edit | talk | history | links | watch | logs) 68.0 19.0 2.0 0.0

SQL query:

-- Per-page counts of IP and registered edits, and how many of each carry the mw-reverted tag
SELECT rc_namespace, rc_title,
 SUM(CASE WHEN actor_user IS NULL THEN 1 ELSE 0 END) AS ip_total,
 SUM(CASE WHEN actor_user IS NULL AND ct_tag_id IS NOT NULL THEN 1 ELSE 0 END) AS ip_reverted,
 SUM(CASE WHEN actor_user IS NOT NULL THEN 1 ELSE 0 END) AS reg_total,
 SUM(CASE WHEN actor_user IS NOT NULL AND ct_tag_id IS NOT NULL THEN 1 ELSE 0 END) AS reg_reverted
FROM recentchanges
JOIN actor ON actor_id = rc_actor
LEFT JOIN change_tag ON ct_rc_id = rc_id AND ct_tag_id = 590 /* mw-reverted */
WHERE rc_timestamp BETWEEN '20230128' AND '20230204'
-- Group by namespace as well, so that a page and its talk page (same rc_title) are not merged
GROUP BY rc_namespace, rc_title
-- More than 10 reverted IP edits, and an IP/registered revert ratio over 25 (0.0001 avoids division by zero)
HAVING ip_reverted > 10 AND ip_reverted/ip_total * reg_total/(CASE WHEN reg_reverted = 0 THEN 0.0001 ELSE reg_reverted END) > 25
ORDER BY rc_namespace, rc_title
Certes' query as Wikitable. Mathglot (talk) 01:53, 5 February 2023 (UTC)[reply]
Links and current SQL source added ~ ToBeFree (talk) 01:59, 5 February 2023 (UTC)[reply]
Hi, Certes; I forked to 71079 trying to add percent columns, but I'm guessing you can't re-use previously established 'AS' labels in subsequent parts of the statement. You'll see what I tried to do in creating cols ip_rvt_pct and reg_rvt_pct. Or, does every col have to be a SUM because of the GROUP BY? Is that fixable? Thanks, Mathglot (talk) 02:52, 5 February 2023 (UTC)[reply]
The column alias doesn't become usable during the main SELECT, only at the GROUP BY stage. There are workarounds but basically you need to repeat the expression, e.g. (SUM(CASE WHEN actor_user IS NULL AND ct_tag_id IS NOT NULL THEN 1 ELSE 0 END) * 100) / SUM(CASE WHEN actor_user IS NULL THEN 1 ELSE 0 END). Certes (talk) 12:15, 5 February 2023 (UTC)[reply]
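One of the workarounds alluded to above is to compute the sums in a derived table and the percentage in an outer SELECT, where the aliases are ordinary column names. The sketch below uses Python's built-in sqlite3 module as a stand-in for the replica database, with a made-up toy table rather than the real recentchanges schema, so treat it only as an illustration of the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy table: one row per edit (hypothetical schema, not recentchanges)
conn.execute("CREATE TABLE edits (title TEXT, is_ip INTEGER, reverted INTEGER)")
conn.executemany(
    "INSERT INTO edits VALUES (?, ?, ?)",
    [("Example", 1, 1), ("Example", 1, 1), ("Example", 1, 0),
     ("Example", 1, 0), ("Example", 0, 0), ("Example", 0, 1)],
)

query = """
SELECT title, ip_total, ip_reverted,
       100.0 * ip_reverted / ip_total AS ip_rvt_pct  -- aliases usable here
FROM (
    SELECT title,
           SUM(CASE WHEN is_ip = 1 THEN 1 ELSE 0 END) AS ip_total,
           SUM(CASE WHEN is_ip = 1 AND reverted = 1 THEN 1 ELSE 0 END) AS ip_reverted
    FROM edits
    GROUP BY title
) AS per_page
"""
result = conn.execute(query).fetchall()
print(result)
```

The inner query does the grouping once; the outer query can then reuse ip_total and ip_reverted without repeating the SUM expressions.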
I think a bot which tries to automatically identify pages with high rates of IP vandalism and either protect the page or file a request somewhere (maybe WP:RFPP) might well be workable, but I'd suggest:
  1. Semiprotection is much more likely to be appropriate if lots of edits are being reverted in a short space of time. If a page gets 10 IP edits in a year and all are reverted then it probably isn't appropriate to semiprotect the page. If it gets 10 reverted IP edits in a day then semiprotection is a lot more likely to be appropriate.
  2. The number of reverts might be more useful than the number of edits. For example Disney Princess is included in the above query output because an IP made 10 edits in a row which were reverted in one edit. This isn't appropriate for semiprotection. If the IP was reverted 10 times then semiprotection is more likely to be useful.
  3. You could try to determine whether the edits are likely to be vandalism by looking at the edit summary. If an edit is reverted with a standard rollback message, or if the edit summary mentions vandalism or spam then semiprotection is more likely to be appropriate. If people are writing custom edit summaries for the reverts then it's less likely.
  4. The number of IPs making the edits is also important. If one IP is making all the edits then protection probably isn't a good idea and it would make sense to just block that IP (or report them to AIV) instead. If edits are coming from multiple IPs or one vandal on a range then semiprotection is more likely to be useful.
  5. You could also look at the people doing the reverts. If IPs are being reverted by experienced editors (especially multiple experienced editors) then protection is more likely to be appropriate. If on the other hand an IP is reverting another IP, or an IP is reverting a new user, then automatic protection is probably a bad idea (you might protect the page in a vandalised version and then stop the person reverting the vandalism from fixing it).
Hut 8.5 11:09, 5 February 2023 (UTC)[reply]
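A few of the points above (volume in a short window, who is doing the reverting, and how many IPs are involved) could be combined into a first-pass triage function along these lines. The function name, the input format, the return labels, and the threshold of 5 reverts are all hypothetical, and the sketch ignores IP ranges and edit summaries entirely.

```python
def suggest_action(reverted_edits, min_reverts=5):
    """Suggest a response for one page over a short window.

    reverted_edits: list of (ip, reverted_by_experienced_editor) tuples,
    one per reverted IP edit.
    """
    if len(reverted_edits) < min_reverts:
        return "none"  # too few reverts in the window to matter
    if not all(experienced for _, experienced in reverted_edits):
        # An IP or new user did some of the reverting: protecting might
        # lock in the vandalised version, so leave it to a human.
        return "human review"
    distinct_ips = {ip for ip, _ in reverted_edits}
    if len(distinct_ips) == 1:
        return "report IP to AIV"  # blocking one IP beats protecting the page
    return "semi-protection candidate"
```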
We might also look at the long-term revert % of regular editors rather than this week's (because most articles have no such reverts this week). For efficiency, that suggests a two-stage process:
  1. Find pages where many different IPs have been reverted recently (that's quick)
  2. For those pages only, calculate the long-term reversion % of registered editors, to filter out contentious topics
In practice, it might be worth not doing the second step: if IPs keep arguing over whether India is better than Pakistan, we may still want to protect.
quarry:query/71090 lists pages where more than 50% of IP edits have been reverted, and at least five different IPs have been reverted, over the previous two days. Certes (talk) 13:09, 5 February 2023 (UTC)[reply]
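The filter described for that query (more than 50% of IP edits reverted, and at least five different IPs reverted) can be expressed in a few lines. This is not the Quarry query itself, only a sketch of the same conditions over an assumed in-memory list of IP edits; the function name and input format are hypothetical.

```python
from collections import defaultdict


def flag_pages(ip_edits, min_pct=50.0, min_ips=5):
    """Return pages meeting both conditions, sorted by title.

    ip_edits: iterable of (page, ip, was_reverted) for IP edits in the window.
    """
    total = defaultdict(int)
    reverted = defaultdict(int)
    reverted_ips = defaultdict(set)
    for page, ip, was_reverted in ip_edits:
        total[page] += 1
        if was_reverted:
            reverted[page] += 1
            reverted_ips[page].add(ip)
    return sorted(
        page for page in total
        if 100.0 * reverted[page] / total[page] > min_pct
        and len(reverted_ips[page]) >= min_ips
    )
```

Counting distinct reverted IPs, rather than reverted edits, keeps a single prolific vandal from flagging a page that would be better handled by a block.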
It might be desirable to calculate statistics on the source of IP vandalism by ASN and restrict anonymous edits from problematic networks. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:47, 5 February 2023 (UTC)[reply]
@Certes, I think you could restrict the query to the mainspace, as I understood the proposal to be primarily aimed at the mainspace. You could probably also remove bots and similar edits (e.g., AWB).
@Mathglot, without saying anything at all about the merits of your proposal, I wonder if the upcoming m:IP Masking work will introduce a complication. About a year or so from now, there won't be "unregistered editors"; there will instead be automatically created "temporary accounts". User:172.0.0.1 will become User:*2022-1234 instead (or something like that). This is, in the backend, a third account type. There are advantages (privacy for the user; better communication options for the rest of us), but whatever you build now might break when this happens. I would expect it to be fixable, but I would also expect pretty significant hiccups in the transition process. Whatamidoing (WMF) (talk) 00:00, 8 February 2023 (UTC)[reply]
We only need to know whether each edit came from an account or an IP, and which IP edits come from the same IP. That data should remain available after IP masking, though it may move to different database tables. The main risk may be that this tool will become redundant because IP masking makes vandal-fighting impractical and we reluctantly have to turn away unregistered editors altogether, but that's a separate debate. Certes (talk) 00:11, 8 February 2023 (UTC)[reply]
Right now, we have two categories of accounts:
  • Registered editors
  • Unregistered editors
After IP masking, we will (probably) have three categories of accounts:
  • Registered editors
  • Unregistered editors
  • Temporary editors
with the expectation that there won't actually be anyone in the "Unregistered editors" list (though it will continue to exist in theory). I don't know what kind of update will be necessary to the code (maybe it will be as simple as finding all the bits that say "unregistered" and replacing them with "temporary"), but if you're calculating this dynamically, then there will be a transition period. If the switch happens on February 1 (and assuming a 30-day look-back period) then on February 1, you'll have 30 days' worth of unregistered editors and 0 days' worth of temporary editors – a problem, if you want to use only temporary editors to calculate which articles are at risk from temporary editors today. Halfway through the month, you'll have 15 days' of unregistered editors and 15 days of temporary editors. This could give you weird numbers. Once the time period is over, then all should be well, but during the transition, it might require some extra effort. Whatamidoing (WMF) (talk) 01:36, 8 February 2023 (UTC)[reply]
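The transition arithmetic above can be made concrete with a small helper. The function name is hypothetical, and the year in the usage note is arbitrary, since no year is given for the hypothetical February 1 switch.

```python
from datetime import date


def lookback_split(today, switch_date, lookback_days=30):
    """Split the look-back window by account type.

    Returns (days of unregistered-editor data, days of temporary-account data)
    within the lookback_days ending today.
    """
    days_since_switch = (today - switch_date).days
    temp_days = min(max(days_since_switch, 0), lookback_days)
    return lookback_days - temp_days, temp_days
```

For a switch on date(2024, 2, 1), the split is (30, 0) on the day of the switch and (15, 15) on 16 February; once 30 days have passed, only temporary-account data remains in the window.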
I think unregistered and temporary editors are the same thing for this purpose: they're not logged in. The only problem, if and when we have a transition, is that someone who was 123.45.67.89 yesterday might be Hidden-IP-246810 tomorrow and we're deliberately prevented from connecting the two. It will be as if everyone changes their IP address at once (possibly renewing their licence to vandalise, Eternal September-style). But we still don't know the details of IP masking, and some of us still have hope that it will never be achieved, so let's not let that tail wag the dog. Certes (talk) 01:57, 8 February 2023 (UTC)[reply]
@Certes, I believe the plan is for admins and other trusted users (think Wikipedia:Requests for permissions, not an RFA-like process) to have access to the IP for all temporary users. It just won't be visible to anyone with internet access.
IP masking will happen; there is no longer any doubt about that. If you want to follow it, then I suggest watching m:IP Editing: Privacy Enhancement and Abuse Mitigation#Statements from the Wikimedia Foundation Legal department. Whatamidoing (WMF) (talk) 22:25, 21 February 2023 (UTC)[reply]
I'm hoping that we can just look at IP edits and not have to consider edits by registered accounts. If so then we can assume edits are not by AWB or bot (or that if they are, we certainly want to know about them!). Certes (talk) 00:16, 8 February 2023 (UTC)[reply]
Mainspace is certainly the main focus but if IPs are being reverted in other namespaces then something is amiss and needs attention. If they're vandalising a template or WP:something then we want to know, and you have to be a real nuisance to get reverted repeatedly on a Talk: page. It doesn't cost much to check all the namespaces while we're there. Certes (talk) 00:19, 8 February 2023 (UTC)[reply]
(edit conflict)@Whatamidoing (WMF): I am aware of the IP masking project (didn't know it was called that) and have had it in mind since the beginning, but since I don't have sufficient details about it, and the basic idea here seemed intricate enough to begin with without adding the complication of new types of account (yet), I wanted to expose the outlines of the idea here to see how it might evolve, and to get some good folks thinking about it. I think we could still evolve the basic idea a bit more, thinking about different forms of queries to test, gathering more suggestions (as you just did to Certes) and more data to prove (or refute) the principles behind it, and then at some point, if it's worth considering a transition from brainstorming level (VPI) to proposal level (VPR), we could discuss at that point whether it's worth building a simple prototype or report based on these queries (or others) which might be the germ of a useful additional tool for vandal fighting. If all of those "if's" line up as I hope they will, then, perhaps, it would be worth asking ourselves whether we should go ahead and build something simple (a 1x/week bot report?) with the foreknowledge that it will have to change once the new account type arrives, or do we just shelve it for the time being and wait for the change? It's hard to answer that now with little data to go on, but I'm biased to doing something sooner rather than later, partly because it might expose the idea to more eyeballs (and therefore more, and better evolution), partly because I don't know if I really trust the timeline I'm hearing about the IP account transition, given some other expected dates that later changed (not blaming, just trying to be realistic), and partly because if the "old way" (IP accounts) actually works, then for 12, or 18, or 24 months, we've got one more tool in our toolshed for vandal fighting for that period of time, and I'm all in favor of that. 
(Maybe also because I'm eager to see if this could actually be of benefit, and if so, I'd like to try and get something concrete out there to look at, which might help bootstrap it to the next level.) Certainly if you (or anyone from WMF) has any inside knowledge about IP account transition that would doom this or make it unworkable, the sooner we find out about that, the better; likewise if we know that there will be hiccups (how big?) that would be useful knowledge, too. Thank you for mentioning this; it was inevitable that this would be raised at some point, and it's just as well that you did so now. Mathglot (talk) 00:43, 8 February 2023 (UTC)[reply]
I think that a bot report has a lot of potential, though the exercise in general reminds me of Regression toward the mean. In healthcare, there's a drive to address high-need patients based on the medical care they needed last year, and it often ends with researchers noting that high-need patients in year 1 are often not high-need patients in year 2, regardless of whether you do anything. It could be that using "yesterday's" unwanted edits to predict where "tomorrow's" unwanted edits will happen is just not very effective. For example, looking at the first article in the table above, an IP was reverted last week, but that was the first time an IP was reverted on that article in about a dozen years. It could be another dozen years before it happens again. Whatamidoing (WMF) (talk) 01:43, 8 February 2023 (UTC)[reply]
Collecting statistics by ASN would definitely require support on the IP masking side, and privacy concerns would limit who could run the bot(s). --Shmuel (Seymour J.) Metz Username:Chatul (talk) 15:10, 8 February 2023 (UTC)[reply]
@Chatul, are you aware of any work currently being done that depends on the Autonomous System Number? Whatamidoing (WMF) (talk) 22:27, 21 February 2023 (UTC)[reply]
No, there would have to be new code to look up the ASN for the IP address, and there would have to be some analysis of privacy concerns to ensure that the new code would not allow unauthorized users to identify anonymous posters. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 13:59, 26 February 2023 (UTC)[reply]

Improved template messages

Hi there, I think that it would be a good idea to make the template messages able to link to specific paragraphs or text.


Thanks,


Blutankalpha Blutankalpha (talk) 12:20, 10 February 2023 (UTC)[reply]

Could you please provide some specific examples illustrating your concern? Template messages do often include links to shortcuts, which I would think is usually good enough. DonIago (talk) 14:07, 10 February 2023 (UTC)[reply]
Templates can only point to things a wikilink can point to, so they can point to sections (e.g. Saturn#Orbit_and_rotation) but not to a line of text on another page (which could easily be edited away). — xaosflux Talk 14:13, 10 February 2023 (UTC)[reply]
OK, that is a good point.
This matter is closed
Blutankalpha Blutankalpha (talk) 10:27, 20 February 2023 (UTC)[reply]

In the news criteria

In the news has an issue in that the selection process for articles is almost entirely on the basis of editors' personal preferences and biases - current editors, in the case of non-recurring stories, and past editors in the case of recurring stories.

This has resulted in a systematic bias issue that prioritizes topics of relevance and concern to Wikipedia editors, rather than to the world in general.

This can be prominently seen at ITNR. Defined very broadly, 73 of the listed events cover international topics. Of the rest:

  • 23 European (11 British, 1 German, 1 Spanish, 8 generally; 1 per 32 million people)
  • 16 North American (13 American, 2 Canadian, 1 generally; 1 per 36 million people)
  • 6 Asian (3 Indian, 2 Japanese, 1 generally; 1 per 760 million people)
  • 5 Oceanic (4 Australian, 1 New Zealand; 1 per 9 million people)
  • 2 South American (2 generally; 1 per 211 million people)
  • 1 African (1 generally; 1 per 1216 million people)

It also violates the core policies of WP:OR and WP:NPOV. By deciding which events are significant enough to publish ITN through our personal criteria we are engaging in original research and placing undue weight on those events that are included based solely on those personal preferences.

To fix this we need more objective criteria for inclusion at ITN, and my initial proposal is for one of the following:

  1. The event has received significant original (not syndicated or from a wire service) coverage in a wide variety of reliable international news sources.
  2. The event has received significant original (not syndicated or from a wire service) coverage in a majority of 21 selected reliable international news sources.

The second proposal would involve selecting those sources; I expect these would include Agence France-Presse, Associated Press, BBC, Deutsche Welle, The New York Times, Reuters, and The Times of India, while the others would be the subject of considerable discussion. It has the advantage of being more objective and better suited to reducing systematic bias by forcing editors to consider coverage in a wider range of sources, particularly if the list includes non-English language sources, but it may also be harder to get an initial consensus for due to difficulty in creating the list of sources.
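As a rough illustration of how the second proposal's test might be applied, here is a minimal sketch. Everything beyond the seven source names listed above is an assumption: the function name `meets_coverage_threshold` is hypothetical, the other fourteen panel entries are placeholders, and "majority" is read as strictly more than half of the panel (11 of 21), which the proposal does not spell out. Whether coverage counts as "significant" and "original" is still a human judgment made before this check.

```python
def meets_coverage_threshold(original_coverage, panel):
    """Return True if a strict majority of panel sources covered the event.

    `original_coverage` is the set of sources that editors judged to have
    run significant, non-syndicated coverage. Sources outside the panel
    are ignored.
    """
    covered = set(original_coverage) & set(panel)
    return len(covered) > len(panel) / 2

# The seven sources named in the proposal.
NAMED = [
    "Agence France-Presse", "Associated Press", "BBC", "Deutsche Welle",
    "The New York Times", "Reuters", "The Times of India",
]
# Placeholders standing in for the 14 sources still to be agreed on.
PANEL = NAMED + [f"Placeholder source {i}" for i in range(1, 15)]

print(len(PANEL))                                    # size of the panel
print(meets_coverage_threshold(NAMED, PANEL))        # only the seven named
print(meets_coverage_threshold(PANEL[:11], PANEL))   # eleven panel sources
```

With a panel of 21, coverage in only the seven named sources would fall short, which is one reason the makeup of the remaining list matters so much.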

The current criteria requiring that the article has been updated and that it meets a minimum standard of quality would remain unchanged.

Please provide both any thoughts you have on the two proposals listed, and any alternative proposals that you may have. BilledMammal (talk) 15:12, 14 February 2023 (UTC)[reply]

  • I think this is a very minor concern. Most of what doesn't get through ITN or gets through ITN is mostly determined by the quality of the article update. Which is as it should be, the purpose of the main page is to tell people about Wikipedia articles that are of a high quality. Everything else is irrelevant. ITN is not "Tell the world what Wikipedia thinks is important", that's a terrible way to think about it. Instead, it is "What is something recent that Wikipedia readers may appreciate reading a high quality Wikipedia article about" with an emphasis on the high quality. I'm far less concerned with playing cultural gatekeeper and making sure that the main page is not sullied by things from certain geographic areas, and I'm pretty much only concerned that Wikipedia articles are improved, and that we let readers know what those articles are. --Jayron32 15:55, 14 February 2023 (UTC)[reply]
    ITN is not "Tell the world what Wikipedia thinks is important" – The problem is that this is exactly what it's become. Most people !voting on blurbs don't even mention article quality. It's just the personal opinions of editors as to whether they feel it's an important news story or not. If you want ITN to be centered around article quality, then it's going to need considerable reform. Thebiguglyalien (talk) 20:21, 14 February 2023 (UTC)[reply]
    +1, this is my impression of it, too. Levivich (talk) 20:33, 14 February 2023 (UTC)[reply]
    +1, One article that I worked up to GA borderline passed ITN despite it winning the video game equivalent of an Oscar, with the event attracting tens of millions of viewers. If it was based upon article quality, it would have been accepted much more readily, but the main opposition was centered around video games being "not important enough for ITN". The Night Watch (talk) 15:37, 15 February 2023 (UTC)[reply]
    “It's just the personal opinions of editors as to whether they feel it's an important news story or not.”
    That’s typically how consensus works, and from skimming this discussion it seems to be a lot of the usual suspects on ITN who disagree with the mere idea of consensus because it usually goes against their opinions. The Kip (talk) 16:11, 15 February 2023 (UTC)[reply]
    That's not how it is supposed to work at Wikipedia. At Wikipedia, we ignore commentary that is not based in a sound footing of Wikipedia policy and guidance. If people are commenting with rationales that don't align with established principles at Wikipedia, we're supposed to ignore those comments when assessing consensus. From WP:CONSENSUS, and I quote, "Consensus is ascertained by the quality of the arguments given on the various sides of an issue, as viewed through the lens of Wikipedia policy." (bold mine). ITN has written standards; if people ignore those written standards in their commentary, then their opinion should not be taken into account. So no, we should not let consensus discussions be overwhelmed by people whose opinions run counter to long-established Wikipedia principles. --Jayron32 13:36, 16 February 2023 (UTC)[reply]
    The problem of late is that we have stories that bring a lot of "drive by" !votes (editors that are maybe at ITN the first or second time in their history) and vote, such as in death blurbs "very important person" without further explanation. When that happens en masse, the ITN admins in charge of posting seem to not consider the "views through a WP policy lens" and are more or less !vote counting. (The posting of Betty White's death and Carrie Fisher's were examples of that). That also applies to general stories too. Not so much that the recent one on LeBron breaking the scoring record being flooded with such but there were definitely drive-bys that were basically "Support - a key record was broken" without any further explanation. And that was briefly posted before pulled as the posting admin seems to take those !votes into account when they should have been discounted.
    That's getting away from this point on importance based on RSes, but it is also tied to it, since I've seen frequent calls on stories that get !votes like "All over front pages" or "leading story on (key sites)", which may indicate importance to news media, but not to an encyclopedia per NOTNEWS (eg the whole mess with the US House's Speaker election). I do believe there are some objective standards that incorporate frequency and type of coverage by RSes to evaluate relative importance to the world body of sources, but it should definitely not be the sole driver (in addition to quality) for consideration. Masem (t) 13:58, 16 February 2023 (UTC)[reply]
    "Not a news ticker" means it isn't our job to tell people what is important, it is our job to highlight quality content only. It doesn't really matter much whether or not we think something should or shouldn't be "important", we should be showing people quality article content, period. Any decision making that attempts to assess worthiness is making ITN a news ticker. Instead, it should be a tool to direct people to high quality articles about current events. --Jayron32 14:23, 16 February 2023 (UTC)[reply]
    Disagree, as the more we blindly include topics based on quality followed by news prevalence without any further consideration, the more we mirror what news media believes important, and we become a news ticker to them. It should be obvious that not every story that gets huge promotion on the news makes for a reasonable topic (first and foremost we rely on enduring coverage and not bursts, which is what the typical story in the news is), so we clearly need a filter to eliminate burst-news coverage. Then we need to add the issue of systematic bias: even with a good selection of worldwide press, we will still find ourselves favoring Western topics, and hence the need to increase the visibility of events in less covered places that are significant (eg the Turkey/Syria quake has been getting less coverage than all the 2024 campaign hijinks or even the Ohio rr story). That's why blindly following news media makes us more like a news ticker. Masem (t) 14:49, 16 February 2023 (UTC)[reply]
    You can disagree, but your rant about bias is irrelevant here. You fix bias by making articles about underrepresented topics better. The idea that we should eliminate bias by making Wikipedia articles on Western topics worse seems like a bad idea. --Jayron32 15:10, 16 February 2023 (UTC)[reply]
    Far be it from me, a clueless non-admin, to make a judgment about how admin work usually goes at ITN, but I've found that Masem's observations are more or less correct, that the significance of a story is based on who shows up to vote and in what quantity. So any story that provokes an emotional reaction from editors will likely sway the balance because we weigh all of these "significant/not significant" votes equally. This comes down to us not having a clear objective criterion for significance on WP:ITNCRIT, instead going based off of a subjective consensus. I expect that when The Boat Race is nominated for 2023, despite the article being a FA year-after-year (the quality that Jayron32 is looking for), there will be a visceral outpouring of opposition, more than we'd see for any other sporting event, based on the fervor that was accumulated during the discussion calling for its removal from WP:ITN/R.
    In regards to bias, the difficulty with nominating underrepresented topics is that we are still stuck in a precedent-based mindset of "well, we never posted this before, we shouldn't post it now" or "it's not disastrous enough" or "I've never heard of it". Those are the sorts of acts that perpetuate systemic bias. Indeed, it's present across all of ITN. These sorts of arguments would never fly in an AfD, so it's a real shame that they can be effectively weaponized in this setting. Moreover, newer contributors who are nominating something for ITN for the first time tend to be driven off by petulant line-toeing semi-regulars who scream at them "Of course this isn't notable, SNOW close, why the fuck did you nominate this?" which is most discouraging. It's a big part of why people complain about ITN's atmosphere.
    I don't agree with Masem on a lot of things. I think during the start of the 2022 Russo-Ukrainian War, he took an incredibly skeptical line in challenging just about everything the reliable sources stated about events such as massacres or missile strikes. But I think he's right on this one. WaltClipper -(talk) 13:40, 17 February 2023 (UTC)[reply]
    Are you disagreeing with me that we should make articles on underrepresented topics better? --Jayron32 13:43, 17 February 2023 (UTC)[reply]
    I disagree with you that our job is to only highlight quality content, because even if that might be our mission, that's not what the criteria states and that's not consistent with our established consensus which calls for identifying a story's significance. I agree with you on making articles on underrepresented topics better, but if our goal is to get it onto ITN and onto the Main Page, I don't believe that improving it to even an FA will be enough to get it posted to ITN. WaltClipper -(talk) 13:53, 17 February 2023 (UTC)[reply]
    Let me also hasten to add that I want our job to be highlighting only quality content, just like you do, but I don't think we'll get everybody to drink from that cup. --WaltClipper -(talk) 13:57, 17 February 2023 (UTC)[reply]
    The larger problem is that I think we (WP overall) have lost sight of why NOTNEWS exists. Newspapers serve one function - to report information as fast and broadly as possible - while an encyclopedia serves a different one - to summarize a topic for posterity. If we put those functions into a Venn Diagram, there would definitely be an overlap in that there are topics that started as news that become clearly important for enduring knowledge. And with covering current events, we generally had been pretty good about being predictive that an event is going to have the long tail that makes for a good encyclopedic topic. But we are getting a lot of new current event articles that may be well-backed from newspapers as part of their function, but fail to prove out as long-term events of significance. In other words, we have editors trying to write like a newspaper and thus we get a lot of noise at ITN, particularly from overrepresented areas. I think we do need to tweak how editors approach NOTNEWS and NEVENT, understanding that just a mere burst of coverage is not necessarily quality sourcing for an encyclopedic article. Whereas for news stories from underrepresented areas that fit into encyclopedic content, there may be the long tail of coverage but the number of works covering it will be low, and that's why I think any type of "counting" of story coverage is a problem that feeds, not fights, systematic bias.
    I'm willing to hear about any system to help improve objectivity and reduce the drive-by voting, and there may be something in source counting, but I don't see an obvious solution that doesn't still create a systematic bias problem. Masem (t) 15:46, 17 February 2023 (UTC)[reply]
  • The change proposed runs counter to the stated goal, but that doesn't matter because the premise here is ridiculous. ITN/C aggressively combats bias and engages in affirmative action by applying lower standards to quality and significance for under-represented regions. GreatCaesarsGhost 16:44, 14 February 2023 (UTC)[reply]
I'd support trying either of these proposals, or pretty much anything that will make ITNC more objective and less subjective. It's always felt bizarre to me that widespread media coverage does not count as "significance" at ITNC, and instead, "significance" is just the sum of subjective opinions of participating editors. As between these two proposals, I'm split; I can see both the upsides and downsides of specifying a number. Other reform ideas: eliminate blurbs altogether and just post the links to articles (so everything would look like ongoing or RD, and we could use the extra space for more pictures, so we could have, e.g. three picture slots and no blurbs); replace ITN with a "most-edited articles" or "most-viewed articles" list; eliminate ITN altogether and rearrange the main page with the remaining elements (FA/FL/FP, DYK, OTD). Levivich (talk) 20:42, 14 February 2023 (UTC)[reply]
Wikipedia is not a newspaper, and ITN is not a news ticker. We're there on the front page to feature quality articles about recent events that happen to be in the news, not to feature news stories that have quality articles. That's why significance and quality are the key determinants in ITNC discussions. Following the news does not give us the broad range of topics that we want ITN on the Main Page to be. I do think that the "significance" factor has been watered down and/or weakened, which started with the whole issue of mass shootings in the US, and there are lots of bitter feelings on that, which has made objective evaluation of significance far more difficult to come to on other topics. Masem (t) 03:45, 15 February 2023 (UTC)[reply]
  • I have for a long time thought that this section has departed from its encyclopedic purpose, which is to highlight Wikipedia articles about subjects that are in the news, rather than specially created articles about the news events themselves, which should usually be deleted per WP:NOTNEWS rather than highlighted on our front page. People are given the impression that this is a news site rather than an encyclopedia. Phil Bridger (talk) 20:57, 14 February 2023 (UTC)[reply]
    ITN was established on the basis of how fast the community worked to produce a quality article on 9/11, and that type of effort has been repeated multiple times since. Now part of the problem is that NOTNEWS and NEVENT often go disregarded because nearly every event is claimed to be notable because of a burst of coverage (notable requires more enduring coverage). We really need to be more enforcing on NOTNEWS, which should help with some of the topic noise at ITN (currently exemplified by the UFO shootdowns). Masem (t) 21:48, 14 February 2023 (UTC)[reply]
    I know I'm not with general consensus here, but we should really only be covering topics that are covered by secondary sources. Most articles about news events are primary sources. Of course newspapers sometimes publish secondary sources, such as reviews of a situation, but the general opinion here seems to be that we should accept lots of primary sources, with some geographical distribution, as the basis for an article. Phil Bridger (talk) 19:30, 16 February 2023 (UTC)[reply]
  • Good idea. I want ITN to reflect what's actually in the news, not what some of us wish got attention. Maybe 21 sources is too high an estimate for a practical mainstream core, but the seven offered above are certainly a good start. InedibleHulk (talk) 21:27, 14 February 2023 (UTC)[reply]
  • And yeah, even a perfect score in the coverage department won't allow a crap article posting, quality still matters. InedibleHulk (talk) 21:31, 14 February 2023 (UTC)[reply]
  • Will not oppose. Well, we could try one of these, but I don't know what's going to happen. Many of the folks on ITN think it'll just heavily prioritize celebrity news or gossip. But who knows, it might work. I don't think we should do away with our current ITN/R items wholesale based on this new proposal. Just continue to nominate those for addition or removal on a case-by-case basis. WaltClipper -(talk) 21:55, 14 February 2023 (UTC)[reply]
Support something along these lines. I would add a stipulation that topics be in the "world" or "national" news sections of each publication to prevent NOTNEWS pop culture/sports/etc. creeping in. JoelleJay (talk) 23:07, 14 February 2023 (UTC)[reply]
I think I agree with that, but the wording is a little ambiguous; I assume you mean any event that is classified as either international, related to a country, or related to a continent? BilledMammal (talk) 03:49, 15 February 2023 (UTC)[reply]
Yeah, news articles that are classified under those sections of the newspaper. Like The Times of India homepage has a ticker menu at the top with "India" and "World", and articles map their directory path back to the parent category (e.g. here where it says NEWS / WORLD NEWS / CHINA NEWS / [article title]). JoelleJay (talk) 20:24, 15 February 2023 (UTC)[reply]
@JoelleJay, how do you expect your criteria to work for something like the Nobel Prize in Literature, which should be in the Culture section? What about the election of a pope or patriarch, which should be listed in the Religion section of a newspaper? WhatamIdoing (talk) 23:52, 21 February 2023 (UTC)[reply]
If we're going to be using some set number of sources, editors would have to come to a consensus on which sections of each paper qualify for assessing significance. JoelleJay (talk) 01:24, 22 February 2023 (UTC)[reply]
I don't really expect this model to work. In addition to the problem of choosing which wire services/newspapers count (nothing from Pakistan, home to 100 million English speakers? nothing from Nigeria, the third-largest population of native English speakers on Earth?), I think editors will be unhappy with the results. We'll want the culture section when the Nobel Prize in Literature is announced but not when the next Harry Potter-equivalent is released. There is a significant fraction of editors who want Wikipedia to feel "serious", and relying on external sources that don't have a bias towards "serious" will not achieve their goals. WhatamIdoing (talk) 02:56, 22 February 2023 (UTC)[reply]
I would suggest that it is better to collect evidence without changing the process first. It should be possible to determine, given an ITNC nomination, how many times it appears in the selected list of sources. Or alternatively, figure out a means to determine the top 5 stories of each work each day and count repetition across sources. With, say, a good two weeks or a month of data, it would be far easier to understand the impacts on ITNC without actually changing it. My gut remains that this type of an approach will overwhelm ITNC with Western and English topics per WP:BIAS, but it would be best to prove that wrong before making any change. --Masem (t) 03:38, 15 February 2023 (UTC)[reply]
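The evidence-gathering step suggested above (take each source's top stories for a day and count how often a story repeats across sources) could be tallied roughly as follows. This is a sketch only: the function name, the source labels, and the story labels are all hypothetical sample data, and it assumes someone has already matched differently worded headlines to the same story label, which is the hard part in practice.

```python
from collections import Counter

def count_story_repetition(top_stories_by_source):
    """Count, for each story label, how many sources listed it.

    `top_stories_by_source` maps a source name to that source's top
    stories for one day. Each source is counted at most once per story.
    """
    counts = Counter()
    for stories in top_stories_by_source.values():
        counts.update(set(stories))  # set() guards against duplicates
    return counts

# Hypothetical one-day sample; labels are placeholders, not real data.
sample = {
    "Source A": ["story 1", "story 2", "story 3"],
    "Source B": ["story 1", "story 3", "story 4"],
    "Source C": ["story 1", "story 5", "story 5"],
}
counts = count_story_repetition(sample)
for story, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
    print(f"{story}: listed by {n} of {len(sample)} sources")
```

Run daily for two weeks or a month, the tallies could then be compared against what ITNC actually posted in the same period.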
I don't think such a test will work as I don't know what sources the community will choose and the result of this change will depend heavily on that. My overall position is that something has to be done; we can try this, and if after trying it for a few months we discover it doesn't work we can try something else. BilledMammal (talk) 03:49, 15 February 2023 (UTC)[reply]
Then the first thing is to determine the "jury" of news sources, first and foremost, before even applying that. This has far too many working parts to implement without evidence and other testing beforehand, and could fundamentally break ITN if it's not thought out well. Masem (t) 13:17, 15 February 2023 (UTC)[reply]
My preference would be to start with a limited number of sources and then expand it over time. BilledMammal (talk) 13:47, 17 February 2023 (UTC)[reply]
  • We should mainly use English-language sources as this is our working language and it's the international lingua franca. But there are lots of news sites and channels now which present their content in English even if it's not their domestic language. For example, see 22 English-language news outlets in Europe to follow and Top European newspapers in English. Andrew🐉(talk) 12:10, 15 February 2023 (UTC)[reply]
    However, there are times when key news happens first in foreign language sources (particularly SE Asia and South America), so using the known RSes from those regions is fine as well. Another flaw in this system is that not all news breaks first in English. Masem (t) 13:27, 15 February 2023 (UTC)[reply]
This is a terrible idea. Firstly, because ITN is not, and should not be, purely a news ticker. Secondly because far from avoiding systemic bias, having a special list of 'blessed' sources that determine what is worthy of ITN will deeply entrench that bias. I also don't think we should be reliant on English-language sources, which will also drastically skew our coverage. If a Hindi, or Chinese, or Brazilian Portuguese source is the main source for a significant story, we should reflect that. We had great trouble in this respect with trying to build consensus on the Nagorno-Karabakh blockade story - very few of the sources were even in the Latin alphabet, but that did not mean we should not attempt good, unbiased coverage of it. GenevieveDEon (talk) 13:14, 15 February 2023 (UTC)[reply]
  • "Our criteria are too subjective, so let's post less stuff." Yeah, that will work. 2603:3005:42DF:4000:C512:B59A:D574:391A (talk) 17:36, 15 February 2023 (UTC)[reply]
  • I think the last thing we should do is have any sort of minimum source requirement. That really isn't the issue we are currently facing. Most are concerned about not enough items being posted. I think a minimum source threshold will only exacerbate preexisting balance issues at ITN, as quite frankly most media coverage nowadays is about what generates more clicks. For example, many nations in Africa suffer from power instability and are subject to military coups, but usually this is just a passing news story for many Western publications. However, a transfer of power (especially one done by force) is clearly more noteworthy and impactful than some of the events we have and will post, such as the Ohio train derailment, which may not get posted but is on the cusp as of right now. DarkSide830 (talk) 18:45, 15 February 2023 (UTC)[reply]
  • Somebody has to ask… Given WP:NOTNEWS, why do we even HAVE an “In the news” section on the main page? Blueboar (talk) 15:00, 16 February 2023 (UTC)[reply]
    One, WP:NOTNEWS does not mean we do not cover current events. Two, per WP:ITN, the purpose of the section is:
    • To help readers find and quickly access content they are likely to be searching for because an item is in the news.
    • To showcase quality Wikipedia content on current events.
    • To point readers to subjects they might not have been looking for but nonetheless may interest them.
    • To emphasize Wikipedia as a dynamic resource.
    --WaltClipper -(talk) 18:09, 16 February 2023 (UTC)[reply]
    As such, it seems that citing WP:NOTNEWS at WP:ITNC is misguided, when the MP section is literally titled "In the news".—Bagumba (talk) 16:53, 19 February 2023 (UTC)[reply]
    On the contrary, it's probably the most important place to remind editors of that policy. "In the news" exists to showcase Wikipedia articles about topics that are in the news, not to provide a news service. Phil Bridger (talk) 18:02, 19 February 2023 (UTC)[reply]
    One of the reforms we should do is to get rid of the "showcase" purpose, as pretty much no article about current events is worth showcasing; there's not enough time to bring them up to GA or FA quality. We flatter ourselves with "showcase". The purpose of ITN is (and should be) to help readers find articles about topics that are in the news. Levivich (talk) 15:38, 20 February 2023 (UTC)[reply]
    The minimum standard is WP:ITNQUALITY, which does not claim to be GA/FA. —Bagumba (talk) 15:51, 20 February 2023 (UTC)[reply]
    Exactly. ITNQUALITY isn't good enough to be worth "showcasing", thus ITN doesn't fulfill the purpose of showcasing quality content, thus we should remove this purported purpose (not to be confused with a purported porpoise) from the list of purposes (not to be confused with the list of porpoises). Levivich (talk) 16:08, 20 February 2023 (UTC)[reply]
    Unless you are proposing the same pruning of article quality that DYK has, this is unworkable. We recognize many articles about news topics that do have encyclopedic purpose can get up to a perceived quality within a day or so, but the process to get through GA or FA is far longer than that, so we'll accept something that we can tell is likely to be of high quality in the short term. Masem (t) 16:10, 20 February 2023 (UTC)[reply]
    The Main Page of WP is to showcase quality articles in their relevant sections. We can't remove that. We are not expecting GA/FA quality (though clearly will accept them), just as DYK doesn't expect those. The purpose of ITN is to help readers find encyclopedic articles about topics in the news, which means that not all topics in the news will necessarily be featured. Wikinews is better suited for the latter function where there is no encyclopedic requirement. Masem (t) 15:53, 20 February 2023 (UTC)[reply]
    Sometimes I think if I said "we should turn the light on" you would respond with a paragraph explaining that it is dark here. Levivich (talk) 16:10, 20 February 2023 (UTC)[reply]
    You keep introducing a meaning of ITN that is not based on the actual meaning of ITN, instead wanting to turn it into a news ticker. Masem (t) 16:13, 20 February 2023 (UTC)[reply]
    Hahaha yes that's called "change". I keep suggesting change, you keep explaining how things are as if I don't know. But seriously: please stop, it's annoying af. Levivich (talk) 16:16, 20 February 2023 (UTC)[reply]
    Things are as they are for a reason. The changes you want to make would turn Wikipedia into something it is not meant to be. Blueboar (talk) 16:22, 20 February 2023 (UTC)[reply]
    Oh man, you're not seriously pulling out the "things are as they are for a reason" line? Trust me: that line is a weak argument wherever it's deployed; it's akin to rhetorical surrender, because "a reason" is not necessarily a good reason. Sometimes things are as they are for a bad reason, like in this case: editor vanity, wanting to "showcase" average work. Levivich (talk) 16:28, 20 February 2023 (UTC)[reply]
  • I do think we need some change to ITN, because looking through the candidates often makes me wonder what does "significance" mean? Some sports news is considered significant, some is not. Some mass shootings are significant, some are not. Some deaths are notable, some are not. We need some more clearly defined criteria, otherwise it just becomes heavily subjective. My question about the first proposal is what does "wide variety" mean? Does that mean 5 sources, 10, 15? Natg 19 (talk) 18:04, 21 February 2023 (UTC)[reply]
    You're pretty much on the mark. Although people have their own individual standards for what constitutes a significant story, in the end, overall significance is based on a headcount. That's the dirty word that people around here don't like to say, but I feel it's true in the case of ITN. A !vote that says "it's notable" has just as much weight as one that says "it's not notable". WaltClipper -(talk) 17:10, 22 February 2023 (UTC)[reply]
    ITNC is the only page on Wikipedia I can think of where closers seem to never weigh votes; it's pure headcount. Even a vote that says "only significant in one country" gets weighed, despite being against the instructions on the very page. IMO, pretty much all the problems with ITN would be fixed if closers applied our WP:PAGs and weighed votes when closing discussions. If they explicitly stated the kinds of votes that they weren't considering (the kind that are contra PAGs, contra WP:ITN), editors would eventually stop making those kind of votes, and the whole enterprise would improve. But, alas, easier said than done. I'd do it myself if it didn't require running for RFA. Levivich (talk) 17:16, 22 February 2023 (UTC)[reply]
    Ditto on the RFA bit. Can't see myself as having "a need for the tools" if it's just to administer to one area of Wikipedia. WaltClipper -(talk) 17:32, 22 February 2023 (UTC)[reply]
    Oh I disagree there; there are many examples of successful RFA candidates who ran specifically to work in just one area (e.g. SPI, DYK, CCI), and I think "main page admin" or "ITN admin" is a perfectly valid reason for someone to run for RFA. I'd never encourage anyone to run for RFA because I think the process is awful, but Walt if you're inclined to do so, I'd say go for it. Unlike me, I think you'd actually pass and ITN could use more admins. (ERRORS, too.) Levivich (talk) 17:43, 22 February 2023 (UTC)[reply]
    I'd only run if three people nominate me in good faith, one of them preferably being another admin. I've long bemoaned the gauntlet, and I also have a ton of skeletons in my closet that would be dredged up from my early editing days. WaltClipper -(talk) 17:48, 22 February 2023 (UTC)[reply]
    As a further note, the only "guidelines" ITN has are the article has to be of sufficient quality (WP:ITNQUALITY), have "updated content" and the significance section, which itself states ultimately, there are no rules or guidance beyond two: (1)The event can be described as "current", that is the event is appearing currently in news sources, and/or the event itself occurred within the time frame of ITN. (2)There is consensus to post the event. There is a lot of explainer text following these two "rules", but unfortunately, this process is way too subjective to determine what is significant, so we need something to clarify "significance". Natg 19 (talk) 18:39, 22 February 2023 (UTC)[reply]
    My WP:HOWITNWORKS essay tries to clarify it, although honestly it explains the problem more than it tries to solve it. WaltClipper -(talk) 19:12, 22 February 2023 (UTC)[reply]
    We have worked to address the issue of deaths, in that as long as the person has an article, it qualifies for the RD line once quality has been assessed. That might still leave debates over whether the person should get a blurb about their death, but that's not core to this concern. ITNR is also there to make sure a wide variety of recurring world events get covered (again, barring quality issues). One thing that gets us, and why using any type of source-based counting causes problems, is that per NOTNEWS, we shouldn't be covering topics that have a burst of coverage but no long tail as their own story. We frequently have the heaviest discussions on such "burst" news coverage, and we need to adhere to NOTNEWS better by focusing on topics with long coverage of events. --Masem (t) 18:47, 22 February 2023 (UTC)[reply]

Alternative - quality only

The issue is that our current assessment of significance is subjective. The alternative to using a less subjective method of determining significance is to remove the significance requirement entirely; change "In the news" to "Good articles on recent events" and have the requirements be that the article meets the good article criteria (with some leeway for stability given the event was recent) and that the event covered is more recent than the oldest currently listed.

This does open the possibility of abuse by paid editors to increase the profile of their product so articles likely to be of interest to paid editors would also need to be excluded. BilledMammal (talk) 09:05, 19 February 2023 (UTC)[reply]

A less stringent requirement than good article class would be B class, but there have been past objections to using the class system for ITN on the grounds that it is too subjective below good article class. BilledMammal (talk) 09:09, 19 February 2023 (UTC)[reply]
I'll go over the reasons why this won't work. First, it'll make the systemic bias issue worse than it is now, since topics from underrepresented regions will be harmed by the lack of extensive sources and the lack of people working on those articles to bring them up to the lofty requirements of the B or GA criteria. Second, having the lack of a significance standard will hyper-prioritize minuscule developments in those areas of Wikipedia which are well-developed, such as American politics, sports, celebrity news, business news, gaming, etc., and while that might not sound so bad in theory, you will have a very hard time getting ITN users to buy into that. Finally, it runs contrary to the goal of "[emphasizing] Wikipedia as a dynamic resource", since the increased quality standards would result in a stagnant article base. WaltClipper -(talk) 15:58, 19 February 2023 (UTC)[reply]
I'll second Walt Clip's comments here. And, in the end, do we really have such a colossal issue here that we need to make such a massive change that a lot of editors aren't going to buy into? DarkSide830 (talk) 16:19, 19 February 2023 (UTC)[reply]
I don't think we'll have enough GAs to fill the pipeline. However, drop that, and it could just be "new articles" or "recently-updated articles" that meet the minimum ITN requirements (like DYK). There's also the possibility of somehow combining DYK and ITN. Levivich (talk) 17:18, 23 February 2023 (UTC)[reply]
As in, all articles, regardless of significance, as long as they meet the quality requirements? I would support that, so long as we include appropriate protections about it being abused by UPE's. BilledMammal (talk) 01:49, 24 February 2023 (UTC)[reply]

Trending Topics

I believe supplanting with a "trending topics" of some kind is long over-due. We had posted this idea and got some amount of interest, but that thread as with many other ideas died because we did not know where to take it. In some sense the iOS app already does this.

Repeating the posting from here Wikipedia_talk:In_the_news/Archive_96#Trending_Topics

Background: We often get comments tying stories and nominations to their potential popularity particularly as measured by page views. However, we all broadly agree that we should not be conflating WP:ITN significance with WP:PAGEVIEWS. Also, we agree that WP:ITN is not a news ticker.
Suggestion: I think this might be the time to introduce a trending topics section either as a part of the WP:ITN box or outside of that. It does reflect quite poorly if our mainpage after all these years is still fairly static in its content refresh capability and is not dynamic, i.e. tailored either based on audience interest (trending topics), geographic interest (trending near you), or personalized recommendations (tailored for you). Trending topics reflects the lowest level of personalization but is still dynamic, whereas tailored for you is the highest level of personalization, while trending near you is in between. This can either be text-based links or better still, images. Requires some amount of creative thinking and might not be in the remit of this group which is largely in a maintenance and operations mode.

Complexity: This is not an easy problem to solve since it requires a technical solution, which might or might not exist within the Wikipedia realm. Furthermore, there will have to be new sets of processes including of reviews and such that might need to be baked in.

Next Steps: Would love to get this group's input on the interest for such an idea. More importantly who would be the right group to take this idea forward, if at all.

Some good ideas came up there. But, we could not take it further. Good luck. Ktin (talk) 02:27, 23 February 2023 (UTC)[reply]

I like this idea (trending topics), and I especially like it if it can be more than "just" WP:PAGEVIEWS (call it "general trending"), such as the examples you gave, "trending near you" (geographic) or "tailored for you" (personalized). Here are my concerns about it:
  1. I'm no expert in these things, but I don't think there is a technical impediment, I believe a bot could use Pageviews API to grab page views and then write them to a Wikipedia page, which could be updated periodically (not sure what frequency is possible/desirable). I could be very wrong about this, as I've never tried to do it before.
  2. I have no idea if the Pageviews API lets you break it down geographically. I don't think MediaWiki has the capability to deliver personalized suggestions (for logged-in or logged-out users), but I might be wrong about that, too. In theory this is doable, though, as many other websites do it.
  3. I'm concerned about "false positives", which could be exploited. See [1]. This happened in January with topviews reporting Index (statistics) (don't know why) and Cleopatra (see [2]). However, we could account for this issue by allowing human editors (admins) to override the bot algorithm and exclude certain pages when appropriate.
  4. Human editors could also override the bot to pull listings based on poor quality, if we wanted to do that.
  5. It's possible to do this and not have it automated at all, but it would slow down the rate of update.
Still, I think it's an idea worth exploring, mostly because it takes the "significance" requirement of ITN out of the subjective hands of editor opinion and puts it into the objective hands of readership statistics. Levivich (talk) 17:16, 23 February 2023 (UTC)[reply]
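For illustration only, the selection step described in points 1, 3 and 4 above could look roughly like the sketch below. It assumes the pageview counts have already been obtained somehow (no real API or dump is queried here), and the function name, the sample titles "Example article A/B" and all numbers are made up for the example. The exclusion list stands in for the human (admin) override.

```python
from typing import Dict, Iterable, List, Tuple


def trending_topics(
    views: Dict[str, int],
    excluded: Iterable[str] = (),
    limit: int = 5,
) -> List[Tuple[str, int]]:
    """Return the `limit` most-viewed titles, skipping human-excluded ones."""
    blocked = set(excluded)
    candidates = [
        (title, count) for title, count in views.items() if title not in blocked
    ]
    # Highest view count first; ties are broken alphabetically so the
    # output does not depend on dictionary ordering.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates[:limit]


# Hypothetical counts, not real statistics.
sample_views = {
    "Cleopatra": 900_000,
    "Index (statistics)": 750_000,
    "Example article A": 120_000,
    "Example article B": 80_000,
}

# Pages a human reviewer has flagged as false positives.
admin_exclusions = ["Cleopatra", "Index (statistics)"]

print(trending_topics(sample_views, admin_exclusions, limit=2))
```

Keeping the override as a plain list means it could live on an ordinary wiki page that admins edit, though that is only one possible design.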
Pageviews API is heavily rate limited; I don't think that is a viable solution. However, the WMF does do (daily?) dumps of pageviews, so pulling that file could work as well. Unfortunately, the WMF does not provide view localization, although I am certain that they do have this information and might provide it if requested?
MediaWiki does support content localization; we already see this with banners. It will take some work to make it work for page content but I believe this is something we can do and don't require the WMF for.
I would generally support a solution like this; we could also use weighted pageviews, to give extra weight to articles whose content has recently been updated.
On the topic of false positives, does anyone know why the pageviews of Cat might have spiked by a factor of ten yesterday? BilledMammal (talk) 01:58, 24 February 2023 (UTC)[reply]
It's probably getting recommended by a virtual assistant, same as Cleopatra. Curbon7 (talk) 02:02, 24 February 2023 (UTC)[reply]
My cat told me to stop asking questions. Levivich (talk) 06:17, 26 February 2023 (UTC)[reply]
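As a rough sketch of the "weighted pageviews" suggestion above: one simple way to give extra weight to recently updated articles is to multiply the raw count by a fixed boost when the last edit falls inside a time window. The boost factor (2.0) and the window (two days) are arbitrary assumptions for the example, not values anyone proposed, and the function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def weighted_views(
    views: int,
    last_edited: datetime,
    now: datetime,
    boost: float = 2.0,
    window: timedelta = timedelta(days=2),
) -> float:
    """Scale a raw view count up if the article was edited recently."""
    # Articles edited within `window` of `now` get the boost;
    # everything else keeps its raw count.
    weight = boost if now - last_edited <= window else 1.0
    return views * weight


now = datetime(2023, 2, 24, tzinfo=timezone.utc)
recently_updated = weighted_views(100, now - timedelta(hours=3), now)
stale = weighted_views(100, now - timedelta(days=10), now)
print(recently_updated, stale)
```

A smoother decay (for example, a weight that shrinks with the age of the last edit) would avoid the hard cutoff, at the cost of one more parameter to argue about.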
  • Question. Who can take this idea forward to see if it has some merit toward implementation? Should I be posting this in some specific group? Thoughts? Ktin (talk) 16:40, 1 March 2023 (UTC)[reply]

search algorithm

Create a smarter search algorithm or improve the display of the output from the present search algorithm.

This is a problem that Google and other search engines have which make searching much more difficult by including items completely unrelated to what is desired.


I recently searched Wikipedia for "Vines of Belize".

I was amazed that the "found" items were anything that included the word "of".  Hence it included topics like Coinage "of" the world, Inflation, Deflation, Stagflation, and other topics worthless to my search.


In normal English usage, the phrase "of Belize" is a prepositional qualifier that limits the noun that it is qualifying.


Similarly, if I had searched for "Belize vine", the word "Belize" is used as an adjective qualifying the noun that follows it.  I didn't phrase my search as "Belizean vine" because the word "Belizean" is unlikely to be used in the article if the article merely lists some of the countries in which the vine is found.


A smarter search algorithm would first search for the main noun (vine) and then eliminate those entries that don't include the word "Belize".


Instead, Wikipedia's search algorithm returns any entry that contains "Vines" or "of" or "Belize".


Although the above suggestion would require that your search algorithm be smart enough to understand word usage, there is a simpler implementation that eliminates having to be this smart about a given language.


Although not as smart as the change I'm suggesting, a simpler search would return just those entries that contain all 3 words, or at least order the list so that entries that contain all 3 words are listed first, followed by entries that contain just 2 of the search words, followed by entries that contain just 1 of the search words.


Another improvement could be made by eliminating all words that are articles (such as "a", "an", and "the"), eliminating connectors (like "and" and "or"), and common prepositions (like "of", "on", "in", "for", etc). These words could be language dependent.
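A minimal sketch of the two simpler ideas above (dropping common words, then keeping only entries that contain every remaining word) might look like this. It is not how Wikipedia's search works; the stop-word list is just the English examples given above, the function names are invented for the example, and matching is on exact words only (so "vine" would not match "vines").

```python
import re
from typing import Iterable, List, Set

# English-only examples taken from the suggestion; a real list would be
# language dependent and longer.
STOP_WORDS: Set[str] = {"a", "an", "the", "and", "or", "of", "on", "in", "for"}


def query_terms(query: str, stop_words: Set[str] = STOP_WORDS) -> List[str]:
    """Lower-case the query, split it into words and drop stop words."""
    words = re.findall(r"\w+", query.lower())
    return [word for word in words if word not in stop_words]


def contains_all_terms(text: str, terms: Iterable[str]) -> bool:
    """True if every search term appears as a whole word in `text`."""
    words = set(re.findall(r"\w+", text.lower()))
    return all(term in words for term in terms)


terms = query_terms("Vines of Belize")
print(terms)
print(contains_all_terms("Vines found in Belize and Mexico", terms))
print(contains_all_terms("Coinage of the world", terms))
```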


Wikipedia does not do this, and thus wastes their users' time by mixing it all up so people have to wade thru hundreds of irrelevant articles.

It just occurred to me that there is another change you could make that would not require that you change your current search algorithm !

Add an option to the displayed list of found entries, perhaps call it "Smart ordering" that takes the output from your search algorithm and reorders them to list those containing all of the search words first.

Personally, I wouldn't care how the remaining are ordered if the top entries contain ALL of my search terms, but the result would be much more valuable to the average Wikipedia user if the remaining were ordered by the number of search words actually found.
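The "Smart ordering" option could, in principle, be a pure post-processing step on whatever list the existing search returns. The sketch below assumes the results are available as plain text entries (the function name and the sample entries are illustrative); it counts how many distinct search words each entry contains and sorts on that count. Python's sort is stable, so entries with the same count stay in the order the search engine gave them.

```python
import re
from typing import Iterable, List


def smart_order(results: List[str], terms: Iterable[str]) -> List[str]:
    """Reorder `results` so entries matching more search words come first."""
    wanted = {term.lower() for term in terms}

    def matched_count(entry: str) -> int:
        words = set(re.findall(r"\w+", entry.lower()))
        return len(wanted & words)

    # Stable sort: ties keep the search engine's original relative order.
    return sorted(results, key=matched_count, reverse=True)


engine_output = ["Coinage of the world", "Belize", "Vines of Belize", "Inflation"]
reordered = smart_order(engine_output, ["Vines", "of", "Belize"])
print(reordered)
```

Because nothing is removed, an entry that matches none of the words simply sinks to the bottom of the list.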

You might even include a check box by the original search to exclude words that are articles ("a", "an", and "the"), connectors ("and", "or"), and prepositions ("of", "for", etc.).

Or you could include this check box beside the "Smart ordering" button on the displayed results page.

-- Darrel Joy -- a frequent Wikipedia user 190.197.122.145 (talk) 19:56, 16 February 2023 (UTC)[reply]

Poll Readers/IP editors on Vector change

A few logistic questions first, since I know we can post banners, but don't know the limits.

  • Can we anonymously track response by users to a binary question? Anonymous being the same extent they are granted now as a passive reader.
  • Can we target IP editors specifically? Is there a part of the process they see but public does not? And can a banner/link/something be inserted there?

Proposal:

1. Put up for 72 hours a banner asking "Do you prefer the old look or new look"?

2. Should the banner be shown to all users or only ip editors?

Editors have other ways to change the look and give feedback. IP editors may not avail themselves and readers are not expected to. This seems a reasonable way to judge whether the dislike felt internally is equally as passionate outside. Slywriter (talk) 05:53, 17 February 2023 (UTC)[reply]

The Vector22 team are already planning on doing surveys of readers - they may already be active, come to think of it, given the timeline. I know that they held off doing it in the first week because they wanted to avoid change shock affecting the outcome in favour of V10 - I don't know why they couldn't do a survey in both 1st and 3rd weeks. Nosebagbear (talk) 09:52, 20 February 2023 (UTC)[reply]

I think that this proposed guideline is ready for being a notability guideline. Can someone start discussion for making this proposed guideline into notability guideline? Thanks. ​​​​​​​𝐋𝐨𝐫𝐝𝐕𝐨𝐥𝐝𝐞𝐦𝐨𝐫𝐭𝟕𝟐𝟖🧙‍♂️Let's Talk ! 03:45, 18 February 2023 (UTC)[reply]

WP:PROPOSAL outlines the instructions. Curbon7 (talk) 04:14, 18 February 2023 (UTC)[reply]
... Didn't we just go through this? --WaltClipper -(talk) 17:13, 19 February 2023 (UTC)[reply]
Can we just not. --Jayron32 14:03, 20 February 2023 (UTC)[reply]
  • FYI - an RFC on the proposal was held (see talk page) … and has already been SNOW closed as “unsuccessful”. Blueboar (talk) 15:31, 20 February 2023 (UTC)[reply]

Idea: wiki file format

Here I go, hoping I'm not trying to reinvent the wheel

I think it would be a great idea if wiki could create a couple of standard file formats, to group hyperlinked text into a single file (therefore replacing the file/folder thing enforced by HTML).

My idea would be a single archive file containing:

  • the text content in wiki code, with each page as a file
  • a folder containing the necessary media associated with pages
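To make the layout concrete, here is a minimal sketch using an ordinary ZIP archive as the container. ZIP, the `pages/` and `media/` folder names, the `.wiki` extension and the function name are all assumptions made for the example; nothing here is an existing wiki format. Page titles containing "/" would create nested folders and would need escaping in a real design.

```python
import io
import zipfile
from typing import Dict


def build_bundle(pages: Dict[str, str], media: Dict[str, bytes]) -> bytes:
    """Pack wikitext pages and media files into one in-memory ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # One file per page, holding the raw wikitext.
        for title, wikitext in pages.items():
            archive.writestr(f"pages/{title}.wiki", wikitext)
        # Media referenced by the pages goes in its own folder.
        for name, data in media.items():
            archive.writestr(f"media/{name}", data)
    return buffer.getvalue()


bundle = build_bundle(
    pages={
        "Moth": "A '''moth''' is an insect. See [[Attacus atlas]].",
        "Attacus atlas": "''Attacus atlas'' is a large [[Moth]].",
    },
    media={"moth.png": b"placeholder bytes, not a real image"},
)

with zipfile.ZipFile(io.BytesIO(bundle)) as archive:
    bundle_names = sorted(archive.namelist())
    moth_text = archive.read("pages/Moth.wiki").decode("utf-8")

print(bundle_names)
```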

It would be a great tool, for a lot of stuff:

  • I'm always wishing I had a wiki dedicated for this or that base of knowledge (in an association, to group all acronyms, and other interconnected notions; in development, to share knowledge on a project; and, for myself as a TTRPG player, to keep lore on our world updated and easy to browse)
  • one can imagine publishing "books" in that format, effectively replacing pdf/doc files, a bit in the same spirit as latex files but more in a CBR/ebook spirit
  • it would be awesome to be able to "extract" a collection of Wikipedia pages (determining which and suggesting things to add with the see also feature and links in the already selected page) for offline use, effectively creating a sharable, savable "reading list" from the android app.
  • it would reduce bandwidth use in that respect, and therefore, maintenance cost and environmental impact.
  • it also would mean that Wikipedia would have the first "interpreter" of that file type, but that would/could rapidly change: people could want different skin, style, etc for their wiki file readers.

On top of that, I said typeS of file, one can imagine variations of this:

  • one with only the text content
  • one with the media attached
  • one with the modification history
  • etc.

In any case, if the Android application can be "split" between a sort of reader and a sort of browser/research query manager, it would be a perfect implementation of the KISS principle and UNIX philosophy. Alefith (talk) 03:07, 21 February 2023 (UTC)[reply]

@Alefith maybe a partial answer, there is a standard export format for wikipages, in XML. Here is an example of what it produces: Special:Export/Stigmella_corylifoliella. That is for the current version, using Special:Export you can also get the full page history with all attribution. Notably, it does not contain images. — xaosflux Talk 19:22, 22 February 2023 (UTC)[reply]
We also have a PDF renderer, see this example for that same article. — xaosflux Talk 19:24, 22 February 2023 (UTC)[reply]
Yes these are parts of the functionality I'm looking for, but it's impossible to do what I want to do.
Let's take a use case, so that it's clearer:
Let's consider three pages: Moth, Attacus atlas, and Stigmella corylifoliella.
what I want is:
  • to be able to compare the two type of moth (say in different tab, or better yet in a split window)
  • to add one type of moth, or insect, or whatever, or edit something (without having to type HTML/XML, or anything more than wikitext)
  • that the links towards any of the pages encapsulated in the file (any of the three mentioned, or any other I created afterwards) are pointing towards the page and not towards the wikipedia (or source wiki) website.
Basically, a file would be the data of a "mini-wiki". And you could consult or modify the pages with a wiki website, or with other tools (say, a different wiki engine, or one can imagine offline apps, etc.)
As far as I understand, the only two things I can do with the tools you proposed are:
  • having one pdf / XML file with every page concatenated (that probably won't even work because of table of content conflicts) and then relink every link in the pages that point towards an encapsulated page (ie moth, attacus atlas, and stigmella corylifoliella) so that they point to the right section of the file, and not towards the source wiki.
  • having one XML file for each wiki page, and manually switch any time I want to consult a linked page
Basically, going from hypertext data to text data.
You could say what I'm looking for is HTML with the markup limited to wikitext (and therefore the syntax as simple as wikitext as well), links always relative and always towards included pages, no scripting, and the styling handled externally.
Honestly, if that existed, I'm pretty sure at some point it could be used a lot, say, for documentation of IT project, etc. Basically, it would be a new *.txt . Literally a hypertext file to replace a text file, hey how about *.htxt as an extension? Alefith (talk) 21:47, 22 February 2023 (UTC)[reply]
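As a rough illustration of the "links always relative and always towards included pages" requirement, a reader for such a file would need to tell which wikilinks resolve inside the bundle and which do not. The sketch below is only an approximation: the regular expression handles plain `[[Title]]`, `[[Title|label]]` and `[[Title#Section]]` links but not templates or nested markup, titles are compared exactly (no first-letter case folding), and the function names are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Captures the page title of a wikilink, ignoring any #section or |label part.
WIKILINK = re.compile(r"\[\[([^\]|#]+)(?:#[^\]|]*)?(?:\|[^\]]*)?\]\]")


def link_targets(wikitext: str) -> List[str]:
    """Return the page titles that the wikilinks in `wikitext` point to."""
    return [title.strip() for title in WIKILINK.findall(wikitext)]


def split_links(
    wikitext: str, bundled_titles: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Separate link targets into (inside the bundle, outside the bundle)."""
    bundled = set(bundled_titles)
    inside: List[str] = []
    outside: List[str] = []
    for title in link_targets(wikitext):
        (inside if title in bundled else outside).append(title)
    return inside, outside


text = (
    "The [[Attacus atlas|atlas moth]] is a [[Moth]]; "
    "see also [[Lepidoptera#Taxonomy]]."
)
inside, outside = split_links(
    text, {"Moth", "Attacus atlas", "Stigmella corylifoliella"}
)
print(inside, outside)
```

A reader could open the "inside" targets from the file itself and either disable the "outside" ones or send them to the source wiki.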
I don't think this is what you're looking for, but there is also the "parse" endpoint, example output. — xaosflux Talk 23:30, 22 February 2023 (UTC)[reply]
Also, the rvprop output, example here. — xaosflux Talk 23:31, 22 February 2023 (UTC)[reply]
No it isn't but thanks! Alefith (talk) 13:06, 27 February 2023 (UTC)[reply]

Concern about biases towards older, foreign and obscure topics

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I like to write a lot about older and foreign things. However, this nearly always ends up leading to a few problems:

- The internet wasn't really used widely until the last 20 years. Therefore, there's not a lot of sources for various things like there are now. Take a sporting event for example. I can go to multiple local news sites or national sports sites and find results of various games. I can get takes on how everyone performed, along with videos and interviews to further prove this. That's not really possible when you go into things from prior to the 1990's because....

- If you want to find out about things that weren't major general events prior to the 1990's, you are going to have to get a book, magazine or newspaper. Finding any of these things from prior to the 1990's is going to cost you money unless you get lucky and it's in a library or online for free. Worse, when it comes to newspapers, you likely cannot even find copies of various newspapers and such anywhere due to people not saving them or not releasing them publicly.

- Now these two things are problems, but then we dip into problem #3 - foreign sources. Finding the book/newspaper and hoping it covered what you wanted (which is an expensive guessing game, because you don't know what's in the book until you have read it) is hard enough, but then you need to be able to read it. Sure, if it's in German or Spanish, we can probably use Google Translate, but if the source is in a language like Chinese or Japanese, you can't type in Chinese/Japanese on an American keyboard without knowledge of the language. And in general, a lot of the world is still not online which limits possible sources. The vast majority of internet articles online are coming from richer countries where more people have computers and the internet.

Overall, I think Wikipedia needs to change how they view articles on older and foreign topics. It's discriminatory on money, location and language and for article writers, you are already kind of facing the gauntlet to begin with against people with more power than you. KatoKungLee (talk) 19:02, 22 February 2023 (UTC)[reply]

  • Are you suggesting that we allow people to add stuff without supporting it with a source? If so, NOPE. Blueboar (talk) 19:19, 22 February 2023 (UTC)[reply]
    • Blueboar - No? KatoKungLee (talk) 20:00, 22 February 2023 (UTC)[reply]
      • No. Verifiability is a fundamental principle of Wikipedia. We would rather have no information on a topic than have unverifiable (and therefore potentially inaccurate) information. Blueboar (talk) 20:54, 22 February 2023 (UTC)[reply]
        • Blueboar - Verifiability relies on having the resources to verify. I can find hundreds of articles on local people at my library. I'm not going to find the same resources on people from Uzbekistan, even if they were equal levels of notable. I'd have to go to Uzbekistan or spend large amounts of money to buy the materials necessary along with having some knowledge of the Uzbek language.
        • Blueboar - I was given this site to look at for old German newspaper sources (https://www.deutsche-digitale-bibliothek.de/newspaper/item/LJZFOK36AZB3AL7MEMMIVB7KFWYNONQX?query=%22neumayr%22+%22tsv+1860%22&hit=1&issuepage=9). I'm sure local German libraries have more newspapers like this, but local libraries where I live don't for Germany. Some of the information in that source might be useful, however as you might notice, the font used is unusual and unreadable compared to something like this (https://de.wikipedia.org/wiki/Skorpion_I.) despite being in the same language. You'd have to know German and be able to get comfortable with the font just to even begin the search. Asking someone to get a hold of these resources and expecting them to be able to read them is not a realistic expectation for something like this.KatoKungLee (talk) 14:39, 24 February 2023 (UTC)[reply]
          FYI that typeface is Fraktur, and if you only speak the language, guessing the unfamiliar letters from the context is not hard at all -- similar to how as an English speaker you can probably understand all of the text in this picture, if with some effort ;) Daß Wölf 19:19, 24 February 2023 (UTC)[reply]
  • In what way does "Wikipedia needs to change how they [sic] view articles on older and foreign topics"? We already have the policies WP:NONENG and WP:SOURCEACCESS. Phil Bridger (talk) 19:36, 22 February 2023 (UTC)[reply]
    • Phil Bridger - I think having less scrutiny on older and foreign topics would be helpful since so much of it requires spending money or being in various locations, which are both discriminatory. Many books are expensive. Many books are out of print and difficult if not impossible to find. Some publishers don't want to ship to other countries either and many types of media were never made available outside of their country. It would be easier and cheaper for me in many situations to write my own book based on sources that would not be allowed here than to actually track down the original books that they came from. It sounds odd, but it would work.
While the resource request section is cool, I would guess that 99.9% of resource requests are never granted, making it unreliable as a tool to get resources from.KatoKungLee (talk) 20:00, 22 February 2023 (UTC)[reply]
I've always been successful when using Resource Request, and looking at the latest archive nearly every request is resolved, so I don't know where you're getting the 99.9% figure from. -- LCU ActivelyDisinterested transmissions °co-ords° 15:16, 24 February 2023 (UTC)[reply]
  • I would prefer if editors used books and newspapers far more than they currently do, regardless of the topic. There are many places to get them for free. We even have WP:The Wikipedia Library that provides open access to a ridiculous number of articles and books for regular editors. Beyond that, Archive.org is an invaluable resource for books, especially older books. Thebiguglyalien (talk) 20:07, 22 February 2023 (UTC)[reply]
    • Thebiguglyalien - Newspapers are tough. You have to be in the area to get access to a lot of them and they often don't have full collections. And who knows what the situation is like in non-US countries, since you would have to know the language to get anywhere with it. You also have to hope it has what you are looking for, since there's no guarantee they covered it. While archive.org and other sites are great, the selection is also not complete. KatoKungLee (talk) 20:39, 22 February 2023 (UTC)[reply]
      • TWL does not have many newspapers that are not in English. --Rschen7754 04:11, 23 February 2023 (UTC)[reply]
      newspapers.com has absolutely extensive records of newspapers in the US, and I know there are similar repositories for the UK and Aus. Curbon7 (talk) 19:57, 24 February 2023 (UTC)[reply]
  • It would be a deep disservice to older, "foreign", and obscure topics if we ran a policy of allowing their articles to be inherently less reliable or inherently lower quality. CMD (talk) 02:56, 23 February 2023 (UTC)[reply]
  • Indeed, I raised this subject here. And this could also happen to plot summaries in TV/movie articles older than around 2000 pretty easily. I'm not saying to throw out our sourcing guidelines, but remember WP:NODEADLINE and use common sense. --Rschen7754 04:11, 23 February 2023 (UTC)[reply]
    • TV plots are a good example. The only way you could prove that the episode was about something was if you had a TV Guide. Collecting those to prove a summary is an expensive task when anyone who has seen the episode would know what it's generally about. The people who are going to challenge it though would rather raise questions about it than spend 10 minutes and find out first hand.
    • A general common sense rule is also badly needed here as some of the most vocal people here seem to have a total lack of it.KatoKungLee (talk) 14:18, 24 February 2023 (UTC)[reply]
      Summaries can be referenced to the work, this is common practice. As to this idea it has been floated in one form or another multiple times. If anyone had any common sense they would let it go. -- LCU ActivelyDisinterested transmissions °co-ords° 14:21, 24 February 2023 (UTC)[reply]
  • WP:V is non-negotiable, regardless of the subject. Using non-English or difficult to access sources is fine, but sources must exist. If no sources exist then the content is WP:OR and should be removed. -- LCU ActivelyDisinterested transmissions °co-ords° 20:41, 23 February 2023 (UTC)[reply]
    Agree. This proposal is basically saying “researching obscure topics is hard, can we pretty please cut corners?” Which the answer to is obviously no. Dronebogus (talk) 20:50, 23 February 2023 (UTC)[reply]
    • Dronebogus - I can find hundreds of sources at the library on local figures, because I live in the area. I'm not going to find the same resources on people from Uzbekistan. Local libraries and such are just not going to carry that type of information like a local library in Uzbekistan would. The only way to get the dozens of sources to prove notability for a new topic would be to go to Uzbekistan or to spend thousands of dollars on books hoping that the book has the information you want. And as I learned with this ((https://www.deutsche-digitale-bibliothek.de/newspaper/item/LJZFOK36AZB3AL7MEMMIVB7KFWYNONQX?query=%22neumayr%22+%22tsv+1860%22&hit=1&issuepage=9)), even if I have the material, I have to be able to read it and I have to have a program that can recognize the font. My programs cannot recognize that font. Now imagine trying to do this with languages that don't have Roman characters. Do you really think that's possible for someone who isn't a native speaker of a language? I don't. KatoKungLee (talk) 14:47, 24 February 2023 (UTC)[reply]
    At this point I've likely spent 40+ hours searching out sources for obscure language orthographies, finding such works isn't easy but referencing isn't optional. -- LCU ActivelyDisinterested transmissions °co-ords° 21:02, 23 February 2023 (UTC)[reply]
  • I don't think KatoKungLee is suggesting to ignore the verifiability guidelines, but rather, to lessen the notability restrictions on people with significant accomplishments from foreign countries in the pre-internet era, for which finding sources is a very difficult task. BeanieFan11 (talk) 21:55, 23 February 2023 (UTC)[reply]
    • BeanieFan11 - This. I can go to the local library and pull up dozens of sources on a local figure from the 1930s to make them seem notable, because local libraries have the old newspapers, books and magazines that would probably have that information. If I went to the same library to get that info on a local figure in Uzbekistan in the 1930s, I'm not going to find anything. Getting the books or sources needed for that person would require me to either spend a ton of money or to be in Uzbekistan to do research, and I don't have the resources to do something like that. And even if I did, I'd still have to be able to read Uzbek to a degree to find information. Essentially asking someone to visit a foreign country, purchase expensive books or know a foreign language to submit an article is insanity. KatoKungLee (talk) 14:30, 24 February 2023 (UTC)[reply]
      • On the contrary, I would consider it much more insane to accept an article written by someone who doesn't have access to reliable sources on the topic and wouldn't understand them if they did. Caeciliusinhorto-public (talk) 14:41, 24 February 2023 (UTC)[reply]
        • Caeciliusinhorto-public - How do you get the local sources if you don't speak the language or aren't in the area? You have to hope someone else writes about it or does the work for you. Relying on secondhand information isn't that good either. Original research isn't allowed here and publishing first hand accounts isn't allowed here either.KatoKungLee (talk) 14:57, 24 February 2023 (UTC)[reply]
          • You've essentially answered your own question. If there's a subject you can't write about, then you let someone else do it. If no one else does, then it doesn't get covered. It's an unfortunate truth, but that doesn't make it less true. There are things we can try to do to mitigate this, such as recruiting more bilingual editors, but non-Anglosphere topics will always be slightly more difficult to write about than Anglosphere topics, and that's something we have to acknowledge and work with. Thebiguglyalien (talk) 15:59, 24 February 2023 (UTC)[reply]
            • I think having some information about a subject is better than having no information at all. You can always add more to an article and just having it invites people to improve it. Nobody can improve an article that doesn't exist and with increased scrutiny on here, fewer articles are getting published period. KatoKungLee (talk) 16:59, 24 February 2023 (UTC)[reply]
          • KatoKungLee exactly my point. If you don't have access to reliable sources you shouldn't be creating an article. Caeciliusinhorto-public (talk) 16:00, 24 February 2023 (UTC)[reply]
            • Not only do people's ideas of reliable sources differ on the site, but people's knowledge of reliable sources also differs on the site. An article can always be improved, and the article's existence usually invites improvement. An article that doesn't exist doesn't really encourage much group research. KatoKungLee (talk) 16:59, 24 February 2023 (UTC)[reply]
              • Existing articles can also be made worse. In fact, my experience suggests that most articles on obscure topics don't change very much at all for long periods of time until either (a) someone comes along with access to actual reliable sources and makes a concerted effort to make the article better or (b) someone comes and inserts misinformation, making the article worse. The first case is not made easier by weakening our requirements for reliable sources, but the second is. I would argue that an article which misinforms readers is worse than no article at all, so any proposed reform which makes it easier to add misinformation is one which makes Wikipedia worse, not better. Caeciliusinhorto (talk) 17:37, 24 February 2023 (UTC)[reply]
@KatoKungLee The solution to your issue is actually very simple. If you don't have access to the sources, then you don't write the article. Leave it for other editors who do understand the language and/or do have access to the sources. Roger (Dodger67) (talk) 20:11, 24 February 2023 (UTC)[reply]
I can get dozens of references on local people where I live at the library that other people can't who aren't local. I can't do the same for people from foreign, poorer and smaller countries. It doesn't mean they aren't important, it's just unreasonable to expect people to have to spend large amounts of money and time to hunt down sources that are not available in their country.KatoKungLee (talk) 16:47, 25 February 2023 (UTC)[reply]

There is also a "bias" that works in the reverse direction. When somebody cites a less-available source, it is harder to scrutinize whether or not the source supports what was written. North8000 (talk) 16:56, 24 February 2023 (UTC)[reply]

  • The unfortunate thing is that lots of people want to use Wikipedia for promotion, whether for financial gain or simply to promote what they are interested in as a hobby or obsession. That means that an established, trusted editor has to have access to the sources, whether that is physical or linguistic access. Many editors seem to be frightened of anything that is not on the Internet or is not in English, even more so if it is not in the Roman alphabet. I don't think there's an awful lot we can do about that. What would you suggest, more specifically than "having less scrutiny"? Phil Bridger (talk) 18:19, 24 February 2023 (UTC)[reply]
I was just noting that the "bias" works both ways. I can't think of any change to make with regard to wp:verifiability, not any big problem there. The bigger problem is with a combination of wp:notability and wp:before. We need to establish that including and identifying GNG sources is a key part of developing articles. Not the job of somebody else to prove a negative amongst a pile of non-English or inaccessible sources. In other words, get rid of or modify wp:before. Sincerely, North8000 (talk) 19:28, 24 February 2023 (UTC)[reply]
Some of you might be interested in this opportunity:
The mw:Editing team is starting a new project, mw:Edit Check. The basic story is: Imagine a world in which the visual editor prompts editors to add inline citations. The problem is: How much is too much? It'd be annoying if you get interrupted after each character, but pointless if you never see it. If you're looking at a diff in your watchlist or Special:RecentChanges, what are you looking for (that a computer might be able to recognize)?
They're hosting a meeting in Google Meet this coming Friday, 3 March 2023. More information is available at mw:Editing team/Community Conversations#3 March 2023. I hope that many of you will be able to attend, but if you aren't, please consider leaving your advice to the team on the talk page.
Thanks, Whatamidoing (WMF) (talk) 20:33, 24 February 2023 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Users should try to attempt to improve articles first and mark them for deletion second

I don't want to name any specific articles here in the interest of keeping the peace.

In the last week, I've personally edited at least 7 different articles to make them more eligible for being on Wikipedia. These articles had notices on them and I went in and fixed them. I didn't know anything about a lot of these subjects and I don't even know how to do a lot of things on Wikipedia. The fixes probably took me less than 5 minutes.

While I agree that people should do the best they can with articles and should put effort into finding sources, I think it's wrong to mark articles for deletion without any attempts to improve the articles themselves. In many situations, fixes seem to take even less time than marking them for deletion does.

I'd like to suggest that a rule be made where people should first attempt to fix articles before deleting them. Or perhaps something could be done about users whose first move is to delete an article without ever thinking of improving it. No, not every article can be fixed or is worthy, but the mindset of immediately going for the deletion is very concerning.

KatoKungLee (talk) 21:11, 23 February 2023 (UTC)[reply]

Based on some comments you've posted in a couple of AfDs, I suspect you're thinking of my edits. Please feel free to look through my edit history, but I can assure you that I spend plenty of time improving articles that have problems. I also nominate a lot of articles for deletion, but only after looking to see if they can be improved first. Since I work on footballer biographies most of the time, it is only natural that nominating articles for deletion is a large part of what I do - for example, we have over 50,000 football articles assessed as stubs with low importance; many of those are simply not ever going to meet our notability policies and guidelines. Jogurney (talk) 21:23, 23 February 2023 (UTC)[reply]
I did not mention anyone specifically by name. Please don't assume.KatoKungLee (talk) 21:36, 23 February 2023 (UTC)[reply]
Absolutely editors should be fixing articles that they believe should be part of Wikipedia, but fixing an article you don't believe should be part of Wikipedia doesn't really make any sense. -- LCU ActivelyDisinterested transmissions °co-ords° 21:25, 23 February 2023 (UTC)[reply]
I suggest a better idea would be that editors creating articles should have to add sources proving the article's notability, and so don't waste the time of other editors who have to check if the articles are notable or not. -- LCU ActivelyDisinterested transmissions °co-ords° 21:52, 23 February 2023 (UTC)[reply]

I'd like to suggest that a rule be made where people should first attempt to fix articles before deleting them

The problem from my perspective is the amount of work it could take to improve an article and dig up sources, and how the people who don't want the article to exist are often not motivated to work on it. Placing burdens like that on people who are already in "good deed mood" rather than "this is important mode" and "this is interesting mode" seems unworkable. The cases that come to mind are things like niche psychological or medical diagnoses with missing or poor sources which, when they exist, are old and tangential. The burdens I would place on the deleter would be more along the lines of "don't delete an article that contains a version of the article you would be happy with if you removed material" and "actually look at the sources". Talpedia (talk) 21:35, 23 February 2023 (UTC)[reply]
  • This "rule" you are speaking of already exists! One of the "requirements" before nominating an article for deletion is to follow WP:BEFORE. Of course, some nominators do not do this, or do not spend enough time on this, but it is already part of the process of AfD, and editors have been "warned" or chastised for not doing this. Natg 19 (talk) 21:53, 23 February 2023 (UTC)[reply]
    Greater and more explicit enforcement of Wikipedia:Deletion is not cleanup might help. Imagine if selected editors were told, "Look, you keep nominating articles for deletion because the current version is ugly, but that's not how notability works. AFD is not a place for demanding that other editors clean up an article and add sources. If you do this again, you'll be TBANned from nominating articles for deletion." WhatamIdoing (talk) 22:04, 23 February 2023 (UTC)[reply]
    Only if we equally BAN editors from article creation who consistently don't reference their articles properly. Wasting other editors' time by not fulfilling BURDEN is disruptive. -- LCU ActivelyDisinterested transmissions °co-ords° 00:11, 24 February 2023 (UTC)[reply]
    @ActivelyDisinterested, please see my note about the meeting above. What's a "proper" level of citations? How could that be calculated in software? Whatamidoing (WMF) (talk) 20:36, 24 February 2023 (UTC)[reply]
    I would have thought a good rule of thumb is each block of text for additions: if someone has pressed return they have completed some specific point (you can see this behaviour in a lot of Wikipedia's articles). That's not the "correct" answer, but it is based on a design standpoint. I don't think an absolutely correct answer is possible, but prompting after each sentence would be aggravating, while waiting for an entire edit wouldn't encourage the desired behaviour (note the block of text could be one sentence, but could be several). How to deal with edits that change details is harder, maybe some form of lesser nudge asking if the changes require a new reference. -- LCU ActivelyDisinterested transmissions °co-ords° 21:26, 24 February 2023 (UTC)[reply]
    Whether that rule is a rule is debated; some editors do not consider it to be required and there is no consensus on whether it is. For example, when nominating mass created articles many editors think it is reasonable to put in the same level of effort that the creator put in, which precludes a WP:BEFORE. BilledMammal (talk) 23:26, 23 February 2023 (UTC)[reply]
    One of the questions at the, soon to be defunct, AfD RFC was going to be whether BEFORE should be obligatory or not. As it stands it's an unanswered question. -- LCU ActivelyDisinterested transmissions °co-ords° 00:09, 24 February 2023 (UTC)[reply]
  • It is important to remember that a nomination for deletion is just that - a nomination. It is the beginning of a discussion, not the end of it. Lots of articles get nominated and end up being kept. That said, see WP:BURDEN… it is up to those who wish to keep the article to bring it up to minimum standards. We are all volunteers, and you cannot make someone else do the work - you can only make yourself do it. Blueboar (talk) 22:01, 23 February 2023 (UTC)[reply]
    OTOH, the minimum standard is "it would be possible to find sources, if someone spent enough time and effort to do so". There is no minimum standard for what the current version of the article must look like. A 100% unsourced substub saying nothing more than "Cancer is a kind of disease" is not eligible for deletion. WhatamIdoing (talk) 22:06, 23 February 2023 (UTC)[reply]
    I think we should explore making unsourced articles eligible for draftification that can only be contested by adding a source. BilledMammal (talk) 23:26, 23 February 2023 (UTC)[reply]
    "A 100% unsourced substub saying nothing more than "Cancer is a kind of disease" is not eligible for deletion. " That's a problem. Such an "article" is worse than useless and should be eradicated on sight with no discussion needed.--User:Khajidha (talk) (contributions) 04:07, 25 February 2023 (UTC)[reply]
Users should try to ensure that articles they create will meet notability criteria first, and move them into article space second... AndyTheGrump (talk) 22:07, 23 February 2023 (UTC)[reply]
  • The debate between “inclusionists” and “deletionists” is as old as Wikipedia. It isn’t going to end soon. But we can strive for a middle ground. I have long felt that if we are going to ask deletionists to do a BEFORE search prior to nomination, we should also ask inclusionists to do an AFTER fix of the article should the result be “keep”. While it is very frustrating when deletionists nominate an article that can easily be sourced… it is equally frustrating when inclusionists say that they found sources, but never bother to actually cite them IN the article. Blueboar (talk) 02:03, 24 February 2023 (UTC)[reply]
    There's a very big difference between finding enough sources online to convince yourself that the topic is notable (typically takes less than 10 minutes) on the one hand and, on the other, writing reliable encyclopedic content (which will involve hunting for more sources than the bare minimum required for notability, reading those sources, and then carefully writing up the text in a way that doesn't accidentally misinterpret those sources: this takes at least hours). – Uanfala (talk) 14:19, 25 February 2023 (UTC)[reply]
    @Blueboar I totally agree. We've got WP:BEFORE. We need WP:AFTER. Deletion decisions should be made based entirely on what's in the article. If people find better sources during a discussion, they should update the article to include them. Otherwise, all we end up with is an article that's still not sourced, and that's a disservice to our readers. I've largely stopped nominating articles for deletion because the arguments at AfD drive me nuts. -- RoySmith (talk) 18:51, 24 February 2023 (UTC)[reply]
    Why? Did someone impose a WP:DEADLINE that can be triggered by an AFD nomination, and forget to tell the rest of us? Maybe the rule should be that if you nominate an article for deletion, and your BEFORE search proved inadequate, then it's your job, as the sloppy nominator, to take any sources I provided to you, on the silver platter of the AFD page, and stick them in the article yourself. Why should I have to do any extra work at all, just because some editor was so stupid as to think a modern, thousand-bed teaching hospital probably never had anything written about it? WhatamIdoing (talk) 20:40, 24 February 2023 (UTC)[reply]
    Maybe the solution is if the AfD closes as keep the nominator should remove all unreferenced information from the article. Then per WP:BURDEN anyone wanting to return the removed text would have to reference it correctly. -- LCU ActivelyDisinterested transmissions °co-ords° 21:33, 24 February 2023 (UTC)[reply]
    How does blindly removing content, especially after I've handed you the sources needed for the article, help any readers? Why should an editor being stupid (every basically functional adult knows that it's not possible to build a huge teaching hospital without both government agencies and newspapers taking notice of the event) result in an article being gutted by the stupid editor? WhatamIdoing (talk) 00:36, 25 February 2023 (UTC)[reply]
    Why would an editor stupidly waste their own and other editors' time finding good sources that they then don't add to an article, why? Improve the article, everyone wants good articles. Maybe stupid ideas are required to combat stupidity. -- LCU ActivelyDisinterested transmissions °co-ords° 01:10, 25 February 2023 (UTC)[reply]
    Perhaps everyone can avoid calling editors stupid? For better or worse, different editors have different expectations about how to work collaboratively towards building better articles. Preferring one method or another doesn't mean one editor is less intelligent than another. isaacl (talk) 22:17, 25 February 2023 (UTC)[reply]
You're not going to have any luck with this, unfortunately. The trend for what people (or at least the subset of people who participate in the discussions on this topic) want to happen is actually in the opposite direction.
In lieu of that, the best way to accomplish what you are suggesting is to proactively source and improve articles before they are nominated. On that note, there are about 129,000 candidates. Gnomingstuff (talk) 21:25, 25 February 2023 (UTC)[reply]


We shouldn't try to force this (nor bad-idea-wp:before) onto the ham-handed scale of inclusionist vs. deletionist. With some exceptions (e.g. wp:not and wp:speedy type) which are generally not disputed, where the article clearly should not even exist, AFD is about wp:notability. A key part of the job of building the article is establishing wp:notability, which generally means finding and including a couple of GNG sources. Without that, in Wiki terms, you haven't really created anything. Like me handing you a windshield wiper and saying "here's the car I built, it just needs somebody to complete it" :-) Once that is done, it doesn't go to AFD. AFD is about articles that shouldn't exist, not about articles that need improvement. Sincerely, North8000 (talk) 19:49, 24 February 2023 (UTC)[reply]

I don't agree that "establishing notability", by which you seem to mean "proving notability to the satisfaction of editors unfamiliar with the subject" is a key requirement for an article. A 100% unsourced substub saying nothing more than "Cancer is a kind of disease" is not eligible for deletion because Cancer is an incontestably notable subject. Proving that will happen, automatically and incidentally, when the article is expanded (because WP:MEDRS), but notability depends on the real world, not on the current state of the article, even if the current state of the article is little more than a maker's nameplate and a windshield wiper. WhatamIdoing (talk) 20:47, 24 February 2023 (UTC)[reply]
An article containing only "Cancer is a kind of disease" should be pushed to draft space, and continually creating unsourced articles is disruptive behaviour as it's a massive time sink for other editors. -- LCU ActivelyDisinterested transmissions °co-ords° 21:30, 24 February 2023 (UTC)[reply]
Why should an article about an obviously notable subject be hidden in draft space? The one thing that we know about draftspace is that it will get less attention from other editors. Is that what you really want to accomplish with an article on a clearly notable subject? Do you want to establish a rule that effectively says "If you don't want to spend several hours working on this yourself, then don't create it at all"? WhatamIdoing (talk) 00:15, 25 February 2023 (UTC)[reply]
Because it's obviously not ready for main space. -- LCU ActivelyDisinterested transmissions °co-ords° 01:12, 25 February 2023 (UTC)[reply]
@WhatamIdoing: Of course there is a range of situations. I think that your Cancer example is so non-typical (new article= has not had an article in Wikipedia with absolutely obvious wp:notability) that I don't think that it is useful for the discussion other than to illustrate the point that wp:notability goes with the subject not the content of the article. Also proving is a pretty extreme degree, beyond what I was talking about which is that if you don't have some sources, you haven't done the basic job of starting an article. But my main point to the OP was: put in a couple GNG sources, and you don't need to worry about AFD, even if the article quality is low. And my main vague idea that I'm promoting is that step #1 of writing an article is finding sources. North8000 (talk) 21:50, 24 February 2023 (UTC)[reply]
Yeah, creating an article based on sources that don't establish notability, or especially based on no sources at all, is functionally equivalent to adding the subject's name to a categorized list of "maybe-notable" topics. Either the material in the article is unencyclopedic fluff sourced to trivial mentions/non-independent bodies, or it's not sourced and should be deleted; regardless, it doesn't serve readers as a trustable encyclopedia entry. JoelleJay (talk) 23:50, 24 February 2023 (UTC)[reply]
Does it? Do you really think that having little blue clicky numbers is what causes readers to trust an article? I don't. There are multiple factors that affect readers' willingness to trust a page, and I doubt that little blue clicky numbers even make the top 10. Readers barely notice them and almost never click on them. The biggest factor seems to be that the page aligns with their expectations.
I'm also not talking about "maybe-notable" topics. First of all, readers don't care whether the topic is Wikipedia:Notable. They care whether the topic contains the information they're looking for. Think about it: Have you ever heard anyone complain that Wikipedia had the information they were searching for, and they thought this was a bad thing? Can you imagine what that would even sound like? "Yeah, @JoelleJay, you people over at Wikipedia are totally screwing up. I put <niche subject> in Google, and Wikipedia had a short little article on it that answered my question. What a bunch of garbage. Why did you all even bother writing the thing I needed?" I've never heard anything even remotely like that, and I'll bet nobody reading this page has, either.
Second, I find that these are still red links:
Do you doubt that any of these are notable subjects? I don't, and I doubt that you have any qualms about those subjects either.
I believe that an unsourced substub that says "French Renaissance sculpture is sculpture produced during the French Renaissance", followed by a ==See also== list of relevant sculptors would be better for readers than nothing at all, which is what we have now. Do you really believe that nothing at all would be more informative/educational/helpful to readers? WhatamIdoing (talk) 00:30, 25 February 2023 (UTC)[reply]
Ok, I have sneakily shot all these foxes by redirecting them to the relevant section of Sculpture in the Renaissance period. This is a recent machine-translated version of the Spanish article, and has many, many faults, but its sections are still much better than nothing. Btw, we didn't even have Italian Renaissance sculpture until I did it for last year's Core Contest. Johnbod (talk) 17:15, 25 February 2023 (UTC)[reply]
People complain all the time that the wikipedia entry for X topic is totally useless and/or has outdated, incomplete, or incorrect information. They will be and are rightfully pissed when what they're expecting is an encyclopedia article and what they get is a stub with three sentences and a database ref, or 30 sentences of unsourced material they have no way of easily verifying, or 30 sentences of pure trivia sourced to an interview. An article that is not based on notability-granting sources cannot provide an encyclopedic summary of the topic even if it really is notable, so what questions could it be answering anyway? Of course your Renaissance examples shouldn't exist in mainspace! Having some See Also links makes it even worse since they're inevitably providing an unbalanced view of the topic to boot, so yes readers are better off not getting a shitty info-sparse stub SEO'd to the top of Google results. A dedicated resource where that info is actively curated is going to be orders of magnitude more helpful than a tautology and random context-free links. JoelleJay (talk) 01:39, 25 February 2023 (UTC)[reply]
Pages can always be edited to be made more up to date even by people who don't know much about the subject. Pages that don't exist can't be. There's also little guarantee that the sources will always be available as books go out of print and website links get broken.KatoKungLee (talk) 16:52, 25 February 2023 (UTC)[reply]
JoelleJay, to say these Renaissance sculpture articles should not exist in mainspace seems to me in contradiction to the way this encyclopedia evolved. Originally every article was like that. The goal was to make progress, not write only articles that had some minimum number of references. There are far fewer articles that are in that state (clearly notable, completely unreferenced) any more, but to say they are a bad thing is to say that the community's attitude to article creation has changed, and I don't think that's true. WP:REDYES is similar: it exists because incompleteness is seen as not inherently bad -- the missing material encourages others to write to fill the gap. Perhaps your take on this is becoming more common, but I would be surprised if it has become the majority view. Mike Christie (talk - contribs - library) 12:26, 25 February 2023 (UTC)[reply]
If it has no content it should just be redirected, for instance we don't have an article on "German Renaissance sculpture" but we have a very informative article on German Renaissance or even Sculpture in the Renaissance period#German sculpture. Why should we give readers a poorly authored article when we have a fairly informative article on the greater subject or a well written section covering the topic? If someone wants to create an article that actually offers the reader more value they can then remove the redirect. This isn't the early days, we already have many good articles that are more informative than some stub that blocks readers from finding what they are looking for. -- LCU ActivelyDisinterested transmissions °co-ords° 13:16, 25 February 2023 (UTC)[reply]
I agree that if there's an appropriate redirect target that's better than an unreferenced stub. That wouldn't be the case for many articles related to recent events, though. If something like the Staffordshire Hoard were to be discovered today, an article titled Worcestershire Hoard with one sentence and no references would be fine, as far as I can see. It's there to be improved and it's better than a red link, which in turn is better than no link. If the point is these situations would be rare, then I agree with you, but when these rare situations happen I don't think there's anything wrong with these stubs. Mike Christie (talk - contribs - library) 13:24, 25 February 2023 (UTC)[reply]
The fringes of all arguments are why there is WP:IAR; if something new is emerging then other editors will shortly expand the article. It shouldn't, though, be standard practice for topics that may or may not be notable; that is the area where it wastes other editors' time. -- LCU ActivelyDisinterested transmissions °co-ords° 16:59, 25 February 2023 (UTC)[reply]
An article that gives so little information is of no value to readers, even if it is on a notable topic that can be expanded by other editors. I don't see how this is debatable. If the article is taking up a title that would be much better served as a redirect to a more comprehensive section on another page, it is actively doing a disservice to readers. We aren't in the early days of Wikipedia where people visiting had no preconceptions of accuracy or completion or encyclopedicity; nowadays readers are arriving from Google search results anticipating an encyclopedia article, and especially when the topic is something "obviously notable" they will expect it to be much more informative than a couple sentences. JoelleJay (talk) 23:30, 26 February 2023 (UTC)[reply]
An article that gives so little information is of no value to readers is an opinion, not a fact. It is also an opinion that is not shared by everyone. I have personally found an apparently abandoned substub to be very useful on occasion. (I have also found several long articles to be useless, including material that was cited. Does anyone happen to know what it means to provide log management and analytics services that leverage machine-generated big data to deliver real-time IT insights? I know what all the individual words mean, but I still have no idea what this company does.)
I wonder how you know that nowadays readers are arriving from Google search results anticipating an encyclopedia article, and especially when the topic is something "obviously notable" they will expect it to be much more informative than a couple sentences. Have you talked to any non-editors about their expectations? Have you asked them if they would rather find nothing at all, than to find only a basic definition or substub, like "Alice Athlete (1899–1962) competed in the Olympics for Country"? WhatamIdoing (talk) 00:12, 28 February 2023 (UTC)[reply]
Have you talked to any non-editors about their expectations? Of course I have? People in my lab and my social community constantly use Wikipedia as an overview source and importantly as a place to find comprehensive refs to read for themselves. As soon as they learn I edit Wikipedia they have complaints about how uninformative and incomplete particular pages are (usually this is for individual genes/proteins; some of the sporadic biomed edits I make are following up on such comments). Most of them get there from the top results on Google, which they expect to contain some information. JoelleJay (talk) 04:17, 28 February 2023 (UTC)[reply]
But do they say that they would rather have nothing at all? BeanieFan11 (talk) 16:04, 28 February 2023 (UTC)[reply]
Well, they regard clicking on the Wikipedia link as a complete waste of time, so I would say yes. JoelleJay (talk) 23:33, 1 March 2023 (UTC)[reply]
Irrespective of whether a lay reader's trust is informed by the presence of blue clicky numbers, the blue clicky numbers are the epistemic cornerstone of everything we do here. signed, Rosguill talk 07:07, 25 February 2023 (UTC)[reply]
I believe that the little blue clicky numbers are important to editors, but they do not seem to be important to readers. WhatamIdoing (talk) 00:13, 28 February 2023 (UTC)[reply]
They're important to building a reliable encyclopedia; readers indirectly care about them, even if they don't directly do so. BilledMammal (talk) 01:32, 28 February 2023 (UTC)[reply]
I am a reader before I am an editor, and I find myself quite annoyed when I click into a Wikipedia article only to find it is essentially nothing. For me this is most common for species articles, which are often a single sentence or two. Maybe it has a link to the IUCN page. In that case, I have wasted my time and could have gone directly there from the Google search. An unsourced sentence and a list of see alsos sounds even worse than that. CMD (talk) 13:17, 25 February 2023 (UTC)[reply]

Trivia in year articles and Template:Year article header

All the year articles have an intro generated by Template:Year article header which is mostly trivia. For example, the intro for 1754 is:

1754 (MDCCLIV) was a common year starting on Tuesday of the Gregorian calendar and a common year starting on Saturday of the Julian calendar, the 1754th year of the Common Era (CE) and Anno Domini (AD) designations, the 754th year of the 2nd millennium, the 54th year of the 18th century, and the 5th year of the 1750s decade. As of the start of 1754, the Gregorian calendar was 11 days ahead of the Julian calendar, which remained in localized use until 1923.

I think a lot can be removed. No need to say the day of the week for 1st January. I would also remove the whole "the 754th year of the 2nd millennium, the 54th year of the 18th century, and the 5th year of the 1750s decade", just keeping the century and millennium.

Any thoughts? Vpab15 (talk) 18:34, 26 February 2023 (UTC)[reply]

It's useful information, and I like the idea of a standard paragraph like this for all year articles. I agree about trimming the detail per Vpab except I'd keep day of the week (where else would you find that information?). I think we could also ditch the Roman numerals, as I do not think English Wikipedia has many Roman readers. I'd move the whole paragraph further down in the lead, perhaps it could be the standard last paragraph of the lead. Then there is the question of what do we write for the first paragraph to replace it, though? Levivich (talk) 18:58, 26 February 2023 (UTC)[reply]
I always imagined the Roman numerals might be a useful search term to include for readers who have seen them in the credits of movies and TV shows.
Overall I think including these kinds of calendrical factoids is in scope for main year articles, but I agree that "the 754th year of the 2nd millennium, the 54th year of the 18th century, and the 5th year of the 1750s decade" serves little purpose, and it duplicates what’s in the infobox immediately to its right. Barnards.tar.gz (talk) 15:22, 1 March 2023 (UTC)[reply]
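
For reference, the calendrical facts in the 1754 intro quoted above are all plain arithmetic on the year number. The following is a minimal sketch of that arithmetic, not the template's actual implementation: the names `to_roman` and `year_facts` are hypothetical, and the sketch assumes CE years after 1582 and uses Python's proleptic Gregorian `datetime.date`.

```python
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]

def to_roman(n: int) -> str:
    """Convert a positive integer to Roman numerals (subtractive form)."""
    pairs = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
             (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
             (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I")]
    out = []
    for value, symbol in pairs:
        count, n = divmod(n, value)
        out.append(symbol * count)
    return "".join(out)

def year_facts(year: int) -> dict:
    """Hypothetical helper: derive the facts stated in a year intro."""
    prev = year - 1
    # Days by which the Gregorian calendar is ahead of the Julian calendar
    # on 1 January. The gap only grows during a century year, so the
    # preceding year is used.
    gap = prev // 100 - prev // 400 - 2
    # datetime.date is proleptic Gregorian, so this is Gregorian 1 January.
    gregorian_jan1 = date(year, 1, 1)
    # Julian 1 January falls `gap` days later on the Gregorian calendar.
    julian_jan1 = gregorian_jan1 + timedelta(days=gap)
    return {
        "roman": to_roman(year),
        "gregorian_jan1": WEEKDAYS[gregorian_jan1.weekday()],
        "julian_jan1": WEEKDAYS[julian_jan1.weekday()],
        "gregorian_lead_days": gap,
        "millennium": prev // 1000 + 1,
        "year_of_millennium": prev % 1000 + 1,
        "century": prev // 100 + 1,
        "year_of_century": prev % 100 + 1,
        "decade": year - year % 10,
        "year_of_decade": year % 10 + 1,
    }

if __name__ == "__main__":
    for key, value in year_facts(1754).items():
        print(f"{key}: {value}")
```

Since every value is a pure function of the year number, trimming the intro is a question of what to display rather than what can be computed.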

New RFC-like process involving academia

The previous discussion on expert contributors was wide-ranging, but I'd like to request feedback on a more specific and actionable idea:

Push for the WMF to start a program involving select, reputable university departments. Editors can formulate a question on an academic or scientific topic; once that question gets vetted, it is shared with these departments, who are invited to provide short statements stating the academic consensus on the topic. These academic statements would inform our content discussions.

Rationale:

  • Why couldn't we accomplish the same thing on our own by just searching through peer-reviewed literature? Because (a) passed peer-review ≠ matches academic consensus, we're all Dunning-Krugers here, and (b) academics tend to not publish things that their peers would consider obvious, that may not be obvious to us.
  • It may help us de-escalate our high-conflict contentious topic areas. See for example the Holocaust in Poland debacle (and ArbCom case). It would also be a first step to countering civil POV pushing, which is currently completely unremedied, and it would make us less reliant on the vigilance of editors with the "right" POV.
  • It may also help address what Elemimele argued in the previous discussion: Ideas That Went Nowhere after being published in peer-reviewed journals, whose current standing is hard for us to assess since they haven't been addressed in more recent sources. We widely include such things, which poses NPOV concerns.
  • These statements would come from whole departments, not from individual academics, so we'd minimise COI and possible FRINGE concerns. We'd be the ones doing the editing, so this overcomes the WMF's unwillingness to get involved in content. And this doesn't depend on academics being willing to waste inordinate time learning how Wikipedia works and making their own contributions. It finally provides them with an approachable venue to offer content feedback, which is something academics mostly gave up on doing a decade ago due to their feedback often being ignored.

DFlhb (talk) 23:22, 26 February 2023 (UTC)[reply]

I'm sympathetic to the goals, especially w/r/t the Ideas That Went Nowhere problem, but this doesn't sound workable.
First, while individual academics might have the free time and interest to contribute to a project like Wikipedia, there are little to no incentives for a whole department to assign their already overworked resources to improving Wikipedia. I don't see how this is a more approachable venue to offer content feedback than e.g. article talk pages. Influential, sure; approachable, no.
Second, the invitation-based selection process sounds fraught with problems w/r/t e.g. systemic bias. There's also the problem of how is WMF (or the community) supposed to determine who the right departments to invite are, if the whole underlying thesis is that the WMF/community does not have the expertise to evaluate the state of research.
Third, I see no reason to believe that having a department ostensibly sign off on a comment would actually do much if anything to address WP:COI or WP:FRINGE. Just as individuals have their idiosyncrasies, so do communities such as departments. Framed in terms of the Hierarchy of Influences model, at the very best it removes the effect of the individual, but that is already largely achieved through our normal editing policies' requirements for reliable sources etc. At the worst (in the very realistic scenario where the statements by "departments" are de facto prepared by that one academic who has a bit of free time and wants to contribute to the project) this would instead entrench the idiosyncrasies of said individual, hiding them behind a veneer of "academic consensus".
Fourth, w/r/t academics tend to not publish things that their peers would consider obvious: in my experience, this is because rather often these obvious things are more informed guesses or hunches and less positions well-grounded in data or research. If anything, the citation-based incentives of the modern academic world incentivize publishing a paper that says "that one thing everyone thinks is true, but doesn't have a reference for, indeed is true".
Fifth, I don't see how this could be aligned with our policies on e.g. original research and synthesis. Either the comments simply regurgitate reliable sources - in which case this reduces to a more complicated multi-tiered variant of the standard editing process - or they contain OR/SYNTH, in which case we'd have to carve out some kind of an exception to those policies for a cabal of "approved original researchers", which, uh, I think would have a snowball's chance in hell of being accepted by the community. If the idea is that we'd interpret the comments as subject matter expert-authored self-published sources, then this just sounds like a lot of extra bureaucracy for what could be achieved by asking a department to write a blog post on their departmental website. Ljleppan (talk) 08:05, 27 February 2023 (UTC)[reply]
I'm not sure that adding incentives is as important as removing disincentives. Shmuel (Seymour J.) Metz Username:Chatul (talk) 16:58, 27 February 2023 (UTC)[reply]
Removing disincentives is useful for attracting individuals (and certainly a worthy goal), but probably not enough to attract organizational partners (like departments). Overcoming the organizational inertia is no small task even if we find someone to "internally champion" participation, if all they have to go with is "it'd be nice" and the organization is likely already resource starved. Ljleppan (talk) 05:28, 28 February 2023 (UTC)[reply]
"then this just sounds like a lot of extra bureaucracy for what could be achieved by asking a department to write a blog post on their departmental website" hmm, despite my concerns about the policy in general this is something I've occasionally wanted and I might support a process for this. A problem I found in psychology was "pop-science" diagnoses propped up by a network of blog posts that would be better understood through other concepts. A good example is Victim mentality, which might be better understood through Trauma and things in Victimology. I guess WP:PARITY is the reason I'm happier with this - if there are no sources, any source by an academic would be useful. On the other hand, I don't know how motivated most academics are by "there is insufficient literature for Wikipedia to adequately address a topic to its exacting standards". Talpedia (talk) 12:13, 28 February 2023 (UTC)[reply]
I think this could, in principle, already be done within the scope of the subject-matter expert exception of WP:SPS. WP:MEDRS would naturally rule out that stuff for biomedical claims, but that's probably for the best, and WP:EXCEPTIONAL should rule out any outrageous stuff in general.
W/r/t I don't know how motivated most academics are by..., it's indeed tricky. Popularizing science or science comms tend to either be ignored, or are given very low weight in contrast to most other academic activities, when it comes to academic evaluations etc. so there's little external motivation. It's also not helped that we'd not be simply asking "hey prof so-and-so, what do you think about X?" by email, but instead "could you write a thing about your view on a potentially controversial topic, copy edit it to your standard of public writing, and post it for eternity on e.g. your department's blog". I'd imagine many (if not most) academics would be happy to give a brief answer to a private emailed query, but asking them to write something they'd be willing to sign their name on is a much bigger ask. Ljleppan (talk) 07:22, 1 March 2023 (UTC)[reply]
Are you sure? Naïvely, I would expect academics to want their names on what they wrote, and I would be hesitant to ask them to write anything anonymously. --Shmuel (Seymour J.) Metz Username:Chatul (talk) 10:37, 1 March 2023 (UTC)[reply]
Things like this have been done before, usually with individual experts rather than whole departments. See Wikipedia:BMJ/Expert review for one example of an organized program.
The individuals who have agreed to do this in the past have generally suggested smaller changes and were happy to provide sources. Some want us to WP:SELFCITE for them (and sometimes we reach out to the person because their source was so useful, so that seems fair), but most are happy to recommend sources that they didn't write as well. There's no need at all to worry about academics trying to add "material—such as facts, allegations, and ideas—for which no reliable, published sources exist" (=the actual definition of OR according to our policy) because these people literally have their jobs because they know how to cite credible sources. Also, much of our content can be found in textbooks instead of recent papers, and sometimes what we need is someone to tell us to remove certain claims that the field generally considers disputed, unimportant or nonsense. WhatamIdoing (talk) 02:21, 28 February 2023 (UTC)[reply]
The problem comes with certain claims that the field generally considers disputed, unimportant or nonsense. Either this is something that can be easily identified from RS (i.e. a textbook or something like that says "X is generally not considered a credible theory", or it's not even discussed by others, at which point WP:FRINGE's Fringe theories may be excluded from articles about scientific topics when the scientific community has ignored the ideas. and A conjecture that has not received critical review from the scientific community [..] may be included in an article about a scientific subject only if other high-quality reliable sources discuss it as an alternative position. come into play), at which point there is little need to set up this massive bureaucracy, or it's not easily identifiable from RS, and we're at or approaching OR/SYNTH/a situation where there is no verifiable consensus.
I'd also highlight that organizations - consisting of individuals as they are - have all kinds of idiosyncrasies, just as the individuals do. Every single research organization has their own collective "hot takes" on exactly the kinds of issues where this proposal is intended to be useful. "Ask a group of people who work together" is not the panacea this proposal appears to assume it is. It might filter out the most burning hot takes, but I fear it will only entrench and hide behind a smoke screen the types of medium-hot takes that we'd -- presumably -- need the most help with. Ljleppan (talk) 05:59, 28 February 2023 (UTC)[reply]
All very good points. DFlhb (talk) 11:06, 28 February 2023 (UTC)[reply]
Kind of doubtful. I think academic review and input is great.
My concerns are that you get this "false consensus" effect in academia. Basically, if you aren't working directly in a topic your understanding will be based on "the sort of things that people say" and what *must* be true rather than what you are certain of. I feel that a whole lot of Wikipedia is and should be at odds with the consensus within the wider field. It should be the consensus of the few dozen academics who have actually spent time grappling with or synthesising the topic.
I can't help but feel that this approach seeks an authority that doesn't exist, and then elects one that won't necessarily care very much about what is actually true: "it's just science communication". I sort of view Wikipedia as acting *against* this sort of "expertisation". We take the things that academics are happy to share with one another - where they can't just win by "being an expert" - and then share this with the readers, thereby imposing academic scrutiny by proxy on issues. Normal science communication seems to be more along the lines of "what some guy said to a writer with a science background, simplified in line with their values". Talpedia (talk) 11:26, 28 February 2023 (UTC)[reply]
Also, fun fact: the Dunning-Kruger effect was a concept created by citogenesis with Wikipedia, and the original research applied to assumptions of competence within the population rather than ability to perform given tasks - though there has been later research, which may look into this. My impression is that the "generalised" version, complete with diagrams including "mount stupid", isn't well supported by evidence and is a convenient fiction that lines up with people's biases about expertise. Talpedia (talk) 11:37, 28 February 2023 (UTC)[reply]

OpenAI and ChatGPT

Have you tried to chat with OpenAI's bot ChatGPT? It is amazing, breathtaking to say the least. But what are the implications for WP? As I understand it, ChatGPT has a "training data pool", with info retrieved from WP, among other sites. Whilst I think OpenAI is a great tool for enhancing education and research, I think there are potential dangers. I would like to suggest that we, the WP, suggest to OpenAI that it only include reviewed articles, i.e. Good articles or better. Have you guys discussed OpenAI issues anywhere else in WP? I'd like to read others' perspectives on the issue. Cinadon36 11:13, 28 February 2023 (UTC)[reply]

There are some discussions going on at the talk page of Wikipedia:Large language models if you are interested. Vpab15 (talk) 12:26, 28 February 2023 (UTC)[reply]
I think it is actually better for LLMs to include “junk” in their training input, especially if low-quality articles are clearly distinguished from the Good articles, as this allows the AI to learn what good looks like and what bad looks like. Barnards.tar.gz (talk) 16:04, 1 March 2023 (UTC)[reply]
Large language models are good at reproducing coherent language, and bad at writing properly referenced encyclopedia articles. They produce something that at first glance looks like a really good Wikipedia article, but don't (for example) actually read, cite, synthesize, and create coherent articles from source texts. They can give something that has all of the trappings of a really good Wikipedia article on the surface, but is actually nonsense once you scratch it away. --Jayron32 20:51, 1 March 2023 (UTC)[reply]
I feel you are a little harsh here @Jayron32. You are setting the bar too high. We are at the beginning here. I have asked ChatGPT a couple of questions in my field of expertise and the replies I got were moderate to good. Which is astonishing, especially if you keep in mind that this is a new technology. Yes, I have also noticed the problem with references. Also, I feel there is an element of vagueness and cliches? Maybe, I am not sure. But it is a project that is still developing. Cinadon36 08:02, 3 March 2023 (UTC)[reply]
I too can search for common text strings and copy and paste the information in a way that shows no understanding on my end, but I know enough basic English to reproduce reasonable text. I can even search and replace synonyms and alter sentence structure so that it isn't readily obvious where I got the information from. It's a more efficient version of that. --Jayron32 11:46, 3 March 2023 (UTC)[reply]

Rather than mass get rid of stubs because they're stubs (like some are suggesting), why not improve them?

An idea: At User:BilledMammal/Lugnuts Olympian stubs, there's discussion on whether to propose mass getting rid of Lugnuts' stubs by the thousands because they're stubs. I have what I think would be a much better plan: start a project called something like Wikipedia:WikiProject Fix Lugnuts' stubs where we give rewards for those who can do so, e.g. I'd be willing to give a barnstar to someone who can expand two of his so-called "permastubs" into something that would pass the criteria at WP:DYK; one for three further; and one more for each five additional. I think something like this would be much more productive for the encyclopedia than mass getting rid of stubs (and something like this where we give rewards for stub article improvers would work for any topic area, I think). Thoughts on something like this? BeanieFan11 (talk) 02:21, 1 March 2023 (UTC)[reply]

We should keep and work to improve all sports biography stubs that include at least one reference to a source providing significant coverage of the subject, excluding database sources, as is required by Wikipedia:Notability (sports). We should delete or redirect all such stubs that do not meet that minimum requirement after an AfD debate of a week or more. Information about such non-notable athletes can be presented in list articles, or team/season articles. Such freestanding biography articles can easily be recreated if the necessary significant coverage is discovered and cited. Find the significant coverage first, and then write the article. Cullen328 (talk) 02:38, 1 March 2023 (UTC)[reply]
That proposal isn't to get rid of them, it is to temporarily move them to draft space unless adopted by a wikiproject or user, in which case the articles would be moved to project or draft space. It also isn't because they are stubs; it is because of WP:GNG and WP:SPORTSCRIT #5 issues and because the quantity makes it impossible to deal with through our normal processes. BilledMammal (talk) 02:41, 1 March 2023 (UTC)[reply]
Both of you are missing my main point: should we create a project where we give rewards for stub improvers? BeanieFan11 (talk) 13:51, 1 March 2023 (UTC)[reply]
I don't think the point was missed. WikiProjects are just groups of editors sharing common goals who choose to collaborate together on a WikiProject page. If you can find other people interested in participating in a project that works together on improving stubs, great! The key first step is finding interested editors; after that, the group can decide how it should organize (within of course bounds of general community consensus). isaacl (talk) 17:41, 1 March 2023 (UTC)[reply]
AGREE - First, the 24/7 scheming against Lugnuts is harassment and it shouldn't be allowed here. It does not matter whether Charles Ponzi or Jesus wrote an article. All that matters is the article's content. We do not want a wikipedia where articles get changed and deleted over the latest hottest trend of the week.
Mass deletions of articles or mass drafting of articles is completely against everything this site should be. Massive red flags should be raised about what's really going on at this site if that is ever seriously considered. Willie Magee (cyclist) has no involvement in anything related to Albert Johnstone. These two articles should not be compared in any way.
Articles should also not have to meet standards that do not exist yet. If a law is made tomorrow that every wikipedia article has to mention the word "jello" in it or be deleted, 99% of wikipedia articles would have to be deleted. That would never be possible even if everyone on the site worked on it and tons of good articles would be gone for silly reasons.
BeanieFan11 is 100% correct. Wikipedia users should always try to improve an article over going for the speedy deletion. Users wanting to get articles deleted should have to show proof that they performed a basic Google search and checked online newspaper databases, Google Books and other recommended free wikipedia sourcing options first. They should also have to open up a help topic on one of the sections here looking for more sources. There's plenty of people who would be interested in helping, just we have to know about it. KatoKungLee (talk) 17:36, 1 March 2023 (UTC)[reply]
It sounds like DYK already does this? If anyone fixes up a stub into something more (and that article has at least one item of interest), they can get a shiny DYK question mark. CMD (talk) 17:52, 1 March 2023 (UTC)[reply]
What I am suggesting is having a project which gives out barnstars to those who can expand stubs – I think a barnstar would be more motivating than a small DYK question mark. BeanieFan11 (talk) 18:02, 1 March 2023 (UTC)[reply]
I would have thought it the other way around, although I'm not sure how this could be studied. Wikipedia:De-stub-athon may interest you. CMD (talk) 18:08, 1 March 2023 (UTC)[reply]
It doesn't really matter—if there are people interested in doing one, the other, or both, they can just do it! isaacl (talk) 23:50, 1 March 2023 (UTC)[reply]
  • No to both - No, we shouldn't delete (or delete-via-draftification) those stubs. The stubs do not have much positive or negative effect on the project or anyone else. Mass creation of so many stubs was undesirable, but also not disallowed, and the creator has since been indeffed. We also shouldn't be celebrating those creations by creating a special event just for them. We have millions of stubs that need attention, with many subjects a whole lot more pressing than an Olympian. I cannot imagine there would be consensus for mass-draftify or mass-delete, and while anyone can start an article improvement drive if they wish, I'll register my opinion that it's not a good idea, either [to focus on Lugnuts' articles]. Deal with them on a case-by-case basis if it's an area you care about, knowing that the creator won't be doing it any more, and that they're not doing any harm or any good if you choose to ignore them. — Rhododendrites talk \\ 18:16, 1 March 2023 (UTC)[reply]
  • The correct solution is two parts: 1) fix the stubs that you can fix (if you want to) and just do it yourself, without asking anyone to help at all (because this is a volunteer organization and no one can make anyone else help with anything) 2) If you find a stub that you believe cannot be fixed (and, no one else can tell you that you don't believe it, it's your belief) then tag it for deletion. That's how we proceed. We neither delete them all (because some may have reliable sources we can use to improve them) nor do we demand that they are all improved (because some of them may never be able to be improved). Instead, we simply take on whatever work we feel like doing to make Wikipedia better, and we don't tell anyone else they need to do it also. That means we don't tell anyone they have to improve the articles along with us (we just do it) and we don't tell anyone they have to tag the articles for deletion (we just do it). In the entire history of Wikipedia, among all editors that have ever interacted with the encyclopedia, nothing has ever gotten done because we demanded that others do it. Instead of starting threads telling people what they should do, just do it yourself. If you don't feel like doing it yourself, don't. --Jayron32 19:03, 1 March 2023 (UTC)[reply]
    I'm not sure how proposing a WikiProject constitutes "demanding" anything of anybody, and I'm not sure how anyone could draw that conclusion in good faith. The entire purpose of the village pump is to bring things to a wider audience. Gnomingstuff (talk) 18:21, 2 March 2023 (UTC)[reply]
  • No need to delete or immediately expand them all. Just let them be stubs until interested editors expand them, or nominate them for deletion, like any other stubs on Wikipedia... ---Another Believer (Talk) 19:06, 1 March 2023 (UTC)[reply]
If someone wants to put forward a suggestion on something, let's wait until they do, rather than going over this now and again if and when anything is suggested. -- LCU ActivelyDisinterested transmissions °co-ords° 19:46, 1 March 2023 (UTC)[reply]
@ActivelyDisinterested: Well, the proposal has been made: Wikipedia:Village pump (proposals)#RfC on draftifying a subset of mass-created Olympian microstubs. BeanieFan11 (talk) 15:15, 2 March 2023 (UTC)[reply]
Thanks for alerting me, I'll comment there. -- LCU ActivelyDisinterested transmissions °co-ords° 15:18, 2 March 2023 (UTC)[reply]
It would be great if you can find a group of editors interested in improving the articles in question. Ultimately, everyone wants more comprehensive articles. isaacl (talk) 00:08, 3 March 2023 (UTC)[reply]

A way of resolving "No consensus"

In an RfC, when there are two choices and no consensus can be reached, we are left with no decision. If no consensus generally means we stay with what we had before, then at least we have something, but sometimes there is no established rule to fall back on and we are left with mixed usage. In that case, Wikipedia is inconsistent, looks sloppy, and nobody's happy.

However, at that point we could ask another question to see if there is a consensus that either option is better than no decision, then a coin could be flipped or the closer could make a decision based on a majority rather than a consensus.

I proposed this in one case and the answer was that we don't do that here. I've been involved in several such RfCs on matters of style and Wikipedia remains inconsistent. I think either choice is often better than no choice.

What think you? Has this been suggested before? Any other ideas? Thank you.  SchreiberBike | ⌨  23:45, 2 March 2023 (UTC)[reply]

Additional guidance for new editors

Hi! Curious if y'all think there are any aspects of editing where new editors may need more guidance/advice than what our current PAGs and explanatory essays provide, such as RFCs. — Ixtal ( T / C ) Non nobis solum. 01:26, 3 March 2023 (UTC)[reply]

I'll suggest the opposite: there's too much guidance, and the way it's organized makes it really difficult for new editors to know where to start. Thebiguglyalien (talk) 05:04, 3 March 2023 (UTC)[reply]
Thebiguglyalien, that's also very useful insight. How do you think we can improve the organization of the guidance? Any particular areas of concern? — Ixtal ( T / C ) Non nobis solum. 10:19, 3 March 2023 (UTC)[reply]
I'll second Thebiguglyalien's comment: too much "stuff", poorly organized. We can expect editors to go through Help:Introduction. But practically no one reads Wikipedia:Contributing to Wikipedia or any of those pages (I certainly didn't). They're both too basic and too verbose, making them pointless, while far less prominent pages, like MILHIST's Academy or userspace essays, are more useful by a mile. Nobody reads a manual before driving their new car; our help pages shouldn't try to address everything, they should focus on frequently asked questions, like how to copy edit (with specific guidance and examples), and how to find high-quality sources (too bad we can't just link out to LibGen/Sci-Hub, but people can still find tons of books/papers for free on the web or archive.org, yet most don't know how to). DFlhb (talk) 11:54, 3 March 2023 (UTC)[reply]

Motto of the day

According to what I know, the main page has been relatively static for quite some time now. Perhaps we could add the motto of the day to it? TheBestagon 12:32, 3 March 2023 (UTC)[reply]