Wikipedia:Village pump (policy): Difference between revisions

From Wikipedia, the free encyclopedia
===General comments===
* Having seen a "Good Article" that was exclusively referenced by about 30 maps with no non-map references it would be good to have a discussion about what maps mean for notability with a dip into the extent to which maps make quality references. [[User:Gusfriend|Gusfriend]] ([[User talk:Gusfriend|talk]]) 06:11, 19 March 2023 (UTC)
*:There is also the concern that sources such as Google Maps do not have a version and may be updated at any time which may limit what it is appropriate to reference. [[User:Gusfriend|Gusfriend]] ([[User talk:Gusfriend|talk]]) 06:19, 19 March 2023 (UTC)

Revision as of 06:19, 19 March 2023

The policy section of the village pump is used to discuss already proposed policies and guidelines and to discuss changes to existing policies and guidelines.

Please see this FAQ page for a list of frequently rejected or ignored proposals. Discussions are automatically archived after remaining inactive for two weeks.


Wikipedia response to chatbot-generated content

ChatGPT has gone viral and is causing quite a storm on the Web. Why should we care? Because, rather than just providing links to an article for you to read about what you are interested in, it writes an essay about it on the spot. It is so sophisticated that, if you ask it to write an article on any subject, even in the style of Wikipedia, it will! That can then be copied and pasted into Wikipedia -- not necessarily a problem if it is carefully fact checked and edited first (ChatGPT can't tell fact from fiction), but, an editor may post it without doing so. It can even be used to answer questions about Wikipedia as if it is a Wikipedian (see the first link provided directly below).    — The Transhumanist   06:52, 9 December 2022 (UTC)[reply]

  1. Experiment: See it being used to generate answers to hypothetical questions on Wikipedia:Teahouse
  2. ChatGPT's Writing Capabilities Stun, but Humans Are Still Essential (for Now) - CNET
  3. Five Chats to Help You Understand ChatGPT - The Atlantic
  4. The ChatGPT chatbot is blowing people away with its writing skills. An expert explains why it's so impressive
  5. ChatGPT: Why Everyone's Obsessed With This Mind-Blowing AI Chatbot - CNET
  6. ChatGPT | Discover AI use cases
  7. 'Scary' ChatGPT could render Google obsolete in two years
  8. ChatGPT, an AI chatbot, has gone viral. Some say it’s better than Google; others worry it's problematic.
  9. What ChatGPT AI is and how you can use it - The Washington Post
  10. What is ChatGPT and why does it matter? Here's what you need to know | ZDNET
Based on how rapidly chatbots have improved over time, it will become more and more difficult to tell if an article was written by a computer or not. The sheer volume at which computer programs could create new accounts and produce Wikipedia content, and the inevitable growing number of human editors copying and pasting chatbot output into Wikipedia, will at some point make it impossible for Wikipedia's human volunteers to keep up with that traffic and apply quality control to the material in a reasonable time frame -- the backlog of unchecked material will simply get longer and longer. The only recourse will be for computer programs to do it -- either computer programs to process articles to filter out or correct any crap, or training the chatbots themselves not to produce crap in the first place. Rather than build computer algorithms to detect computer-written articles and passages, it would be more productive for them to do style checks, fact checks, and citation checks, along with appropriate corrections or removals. While Wikipedia-friendly AI could come from within Wikipedia, it may be faster to bring influence to bear upon the developers of the chatbots being used to generate Wikipedia content, and upon the chatbots themselves. Wikipedia already has a chair at the table, because Wikipedia comprises a significant component of chatbot corpora, and so, their developers should be inclined to listen to the Wikipedia community's concerns -- either directly, or indirectly through news coverage. The Wikipedia community should make its voice heard on the matter of chatbots writing Wikipedia material according to Wikipedia's style and behavior guidelines. For example, verifiability still applies, and so when chatbots are asked by their users to "write an article in the style of Wikipedia" the chatbots should comply according to Wikipedia's policies, including those on verifiability and providing reliable sources.
Not doing so should be met with the filing of bug reports, feedback, and commentary. And, as chatbots learn as they go, Wikipedians who use them can ask them to follow Wikipedia guidelines, and we can urge our fellow editors to request this of chatbots as well.    — The Transhumanist   06:52, 9 December 2022 (UTC)[reply]
Chatbots should be following Wikipedia's lead for all of their output. At this time, most chatbot answers and essays are not referenced with reliable sources. And they should be, for the same reason that Wikipedia articles should be. That's something that can be requested of chatbots directly, through queries, and of developers, through their contact channels and social media. I hope this suggestion helps.    — The Transhumanist   06:52, 9 December 2022 (UTC)[reply]
The simple answer is that our existing policies ought to already cover this (mostly.) Sourcing is still required for anything that is challenged or likely to be challenged, which prevents people from just blindly dumping AI generated text into Wikipedia; and an AI may violate copyright depending on how it was trained (and whether it was overtrained.) There are also unsettled copyright concerns related to AI training sets, so I would generally think that, ideally, editors shouldn't be dumping AI generated text into our articles even after performing due diligence to make sure it's not a copyvio and finding proper sources. But since those concerns are unsettled and speculative, I also don't think it's worth worrying about too much right now. The key point is that we should emphasize our sourcing requirements and be more diligent for clear-cut copyvios, which we already have systems in place to handle, since it is likely that these tools will result in people adding lots of unsourced and possibly-copyright-violating text. (I do wish our RFCs on mass article creation had reached a stronger agreement on sourcing requirements for new articles, which would deter excessive copy-pastes of AI generated text - perhaps that is something we might want to revisit in the near future, if we start seeing significant amounts of new unsourced articles created using what is plainly AI-generated text.) --Aquillion (talk) 07:55, 9 December 2022 (UTC)[reply]
You mean, don't prepare in advance for a potential increase in volume, just wait until it hits? At that time, will merely adjusting policies stem the tide? It's in the slow trickle phase now, but that could potentially become a torrential flood very rapidly, just as ChatGPT's user base grew to over a million in 5 days. My main concern above was about a potential volume of AI-generated content that went beyond the scale of what the editor community could manually process. You didn't address that contingency. What could the community do to prepare for it, just in case it does happen? What are the available options?    — The Transhumanist   11:28, 9 December 2022 (UTC)[reply]
I don't think there's much we reasonably can do to prepare, at least not without serious risk of causing other problems; AI-generated text won't be drastically different than other sorts of text, aside from the risk of being uncited or a copyvio (which we have existing processes in place to handle.) It's worth raising awareness of the issue so editors can spot the signs of someone using large amounts of it, but I think our best bet if we're going to "prepare" is to focus on the systems we already have, which is unlikely to do any harm either way, or perhaps to codify slightly more strict sourcing requirements in the way I described (which I think is a good thing anyway, but would at least serve to slow down the worst sorts of misuses of AI generated text.) Ultimately the most serious problems are if editors start adding large amounts of text that violates copyright or which are uncited and likely to be challenged, but we have existing procedures for those, we just need to prepare for the possibility that we may need to become a bit more aggressive about enforcing them. Wikipedia is in a slightly better position than some other websites facing AI-generated-text problems, because our sourcing requirements will at least make it fairly obvious if someone tries to dump large amounts of AI-generated text onto the wiki without making any effort to verify it. --Aquillion (talk) 12:47, 9 December 2022 (UTC)[reply]
I suppose we could take the Stack Exchange approach and just say flatly "no, this isn't allowed" - in their case it is explicitly a temporary measure until we have a better understanding of the issues. I think in general our policies/community norms would come down hard on anyone trying to get a language model to generate articles (hard to see why that would be OK and machine-translation isn't), but maybe an explicit statement would be a way to go. Andrew Gray (talk) 18:32, 9 December 2022 (UTC)[reply]
@Aquillion: While a large number of posts by individual editors may become a problem, the main concern I presented above was "the inevitable growing number of human editors copying and pasting chatbot output into Wikipedia, will at some point make it impossible for Wikipedia's human volunteers to keep up with that traffic and apply quality control to the material in a reasonable time frame -- the backlog of unchecked material will simply get longer and longer."
That is, people making the normal rate of content contributions, but using large language models (chatbots) to do so.
Watching for breakout editors who use LLMs to create a large number of new articles over a short period of time would not suffice in such a scenario. Editors who add LLM-generated content to many existing articles also will not be spotted by looking for mass page creations. And since writing will become easier by letting "chatbots" do it for you, content submissions by users employing such tools will likely become longer on average.
The point is that a high enough volume of such content contributions would go beyond the capacity of Wikipedia's editors to check and correct.
The two solutions offered were 1) build software to analyze and process such content, and 2) work with chatbot developers so that inappropriate content is not composed by LLMs in the first place.
Just relying on new or existing policies to handle LLM-generated content will be insufficient if and when the volume of it passes the threshold of what manual editors applying Wikipedia policy can deal with.
Passing that threshold may come soon, or it may take years -- the main question is "will Wikipedia prepare for that threshold-passing event?" Based on the responses above and below, the answer, and implicit recommendation from this forum, currently appears to be "no": No developing relevant software, and no working with chatbot developers to respond to the potential passing of the LLM-threshold.
Thus, any solution will need to come from other departments or from continued or future discussion in this department, or from chatbot developers focusing on the problem due to other influences.
Another helpful approach might be the creation of a policy or instructions on how to use LLMs/chatbots effectively, posting links to that page in enough places that all editors will notice. Though, I doubt that would prevent the problems of an LLM-threshold-passing event, and it wouldn't address the need for proofreading or processing LLM-generated contributions.    — The Transhumanist   02:18, 20 December 2022 (UTC)[reply]
I think in the end, @Aquillion's response from earlier in the discussion is the one I agree with the most. If a ChatGPT-written or AI-written part of Wikipedia is accurate, nice! If something can't be proven with an RS, though, revert and warn. It can be a useful tool. Overall, think "business as usual", just with some contributors getting assistance. It's also mostly implausible to tell ChatGPT/AI-generated content apart except in obvious or disclosed cases, and it's going to be impossible to enforce effectively no matter what position we take. If there is any legal issue with it, let WMF and ArbCom handle it. InvadingInvader (userpage, talk) 07:37, 6 February 2023 (UTC)[reply]
@InvadingInvader: I agree that a complete strategy of response to LLMs is beyond the scope of the Village Pump (policy), because it is going to take bots and/or diplomacy with developers, to stem any potential tidal wave of LLM contributions that would go beyond the capacity of WP's human editors to deal with. And those are the province of the bot department and the WMF, respectively. Considering that ChatGPT has gone viral, reaching 100 million users in just 2 months, eliciting a frenzied reaction from Google to bring rival products to market, thereby starting an "AI arms race", a tidal wave of chatbot-generated content in Wikipedia is plausible if not inevitable. Brace for impact.    — The Transhumanist   11:18, 13 February 2023 (UTC)[reply]

What can chatbots do?

You seem to be somewhat panicking over a scenario which isn't really supported by any evidence. While I see some Teahouse responses, could you give us one or two examples of "It is so sophisticated that, if you ask it to write an article on any subject, even in the style of Wikipedia, it will!" articles? The Teahouse examples give the impression that, if it ever becomes a problem, some edit filters can easily spot these. You would in any case need "someone" to post this "potential volume of AI-generated content that went beyond the scale of what the editor community could manually process" you predict. This seems rather unlikely, at least on enwiki. Fram (talk) 11:45, 9 December 2022 (UTC)[reply]
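For illustration of the "edit filters" idea: Wikipedia's real edit filters are written in the AbuseFilter rule language, but the kind of phrase-based heuristic being described can be sketched in Python. The phrase list here is hypothetical, chosen only to mirror the telltale boilerplate discussed in this thread, and is not an actual filter in use:

```python
import re

# Hypothetical telltale phrases that early chatbot output tends to include.
# A real edit filter would use a vetted, maintained list.
TELLTALE_PHRASES = [
    r"as an ai language model",
    r"\bi (do not|don't) have access to the internet",
    r"regards,\s*teahouse",
]

PATTERN = re.compile("|".join(TELLTALE_PHRASES), re.IGNORECASE)

def looks_like_chatbot_output(text: str) -> bool:
    """Return True if the text matches any telltale phrase."""
    return PATTERN.search(text) is not None
```

Such a filter only catches undisguised output; as noted below in the thread, lightly edited chatbot text carries no such markers.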

Just try it. If your eyes don't pop out, I'll be surprised. Right now, during its "research preview", it is free. Keep in mind that it draws heavily on Wikipedia, which is included in its corpus, so, for this test run, it would be best to choose a person or subject that is not yet covered in this encyclopedia, and ask ChatGPT to write about that.    — The Transhumanist   14:41, 9 December 2022 (UTC)[reply]
Pinging @Fram, JPxG, EpicPupper, and 0xDeadbeef:    — The Transhumanist   14:50, 9 December 2022 (UTC)[reply]
I'm not giving out my phone number to some random website, thanks. Why it isn't sufficient that they have my email which was then verified is not clear... Fram (talk) 14:55, 9 December 2022 (UTC)[reply]
LOL I had the same exact response. My phone number? F no. Levivich (talk) 16:55, 9 December 2022 (UTC)[reply]
I went to try this out, and it asked me for my phone number. I thought about making one up like 0118 999 881 99 9119 725 3, but figured it would probably use it for two factor authentication, so that's no good. Ritchie333 (talk) (cont) 18:31, 9 December 2022 (UTC)[reply]
@Fram, Levivich, and Ritchie333:     I wasn't that bright. I gave it a random phone number. It rejected it as a land line. Then I gave it another, and it rejected that as a VOIP number. Finally, I gave it a random mobile phone number, and it sent some complete stranger the verification code. Oops.    — The Transhumanist   01:32, 11 December 2022 (UTC)[reply]
Thanks for the ping. I'd imagine the biggest problem would be people using the AI to create hoaxes. Like the Zhemao hoaxes but with less effort. 0xDeadbeef→∞ (talk to me) 15:13, 9 December 2022 (UTC)[reply]
So. I tried it yesterday. I'm not sure how heavily it draws on Wikipedia's corpus for its knowledge.
First, I asked it to tell me about Hammerton Killick. I know there is a Wikipedia article about Hammerton Killick, because I wrote 90% of it. It did not know who Hammerton Killick was, and informed me that it does not have access to the internet, or to Wikipedia.
Next, I asked it to write me an article in the style of Wikipedia. I did not specify a subject. It wrote about Athens. The result was ok. Heavily focused on the ancient city and on art and architecture. Short. Kind of read like an encyclopedia article.
Next, I asked it to write me an article about alcoholism in the style of Wikipedia. The result was very interesting. I did not think it read like a Wikipedia article, it was more like a brochure that would be distributed in a doctor's office or something. I asked it what about that essay it thought was like Wikipedia, and it said what it wrote was
  • neutral
  • factual
  • organized
Next, for fun, I asked it if it could write a recipe. It proceeded to give me a recipe for chocolate chip cookies. It looked like it should work. I e-mailed it to myself, and today I made them, not expecting much. I was pleasantly surprised. They were delicious. The only problems with what it wrote were that it did not have me cook them long enough (it said to bake for 8-10 minutes, and it took closer to 13 minutes for them to be done), and it drastically underestimated how many cookies the recipe would make (it said I'd get 2 dozen cookies, and I ended up with 5 dozen). I was shocked that it actually was edible.
I asked it to write a legal motion asking the court for an in-person hearing. I did not give it any other details. For not having any other details, the result was not bad. Westlaw has started offering a service that I think might draw on this type of technology, it helps you write pleadings.
Last I asked it to write a 100 word short story about a mouse, genera: fantasy. The result was decent. If I came up with it on my own I wouldn't be ashamed to enter it into a contest like the ones NYC Midnight runs.
I was more impressed with the recipe and the short story than the Wikipedia style articles. I can see some use for it in, say, copyediting as JPxG did below; or asking it for suggestions on rephrasing if you are trying to reach a word limit. I think it could have its uses. But I do think the Wikipedia community should be looking to craft policies and guidelines around what is and is not acceptable use of such tools. ~ ONUnicorn(Talk|Contribs)problem solving 06:26, 11 December 2022 (UTC)[reply]
@ONUnicorn, Fram, Levivich, Ritchie333, 0xDeadbeef, JPxG, and EpicPupper: Interesting. The chatbot sent you instructions (in this case, a recipe), and you followed them. You followed the commands of a computer. If it gave you an address and instructed you to go there and pick up a brown paper package, would you? The implications of this type of interaction are huge and foreboding. This issue must have a name, and I would like to look it up, but I can't seem to find it. Though, when I typed in "computers in charge" I got the following 2 relevant results:
Then I typed in "computers telling people what to do", and it came up with this:
Ouch. I imagine, that anytime you ask a chatbot/computer "How do you do such and such?" it will reply with a set of instructions. And the chatbot's disclaimer in its terms of service will read "follow any instructions provided at your own risk". If you know or come across the name of the topic that covers computers telling humans what to do, please let me know what it is.    — The Transhumanist   11:04, 2 January 2023 (UTC)[reply]
@The Transhumanist: I think the term you're looking for is automation bias – "the propensity for humans to favor suggestions from automated decision-making systems and to ignore contradictory information made without automation, even if it is correct."
Interestingly, though, the 2002 Überlingen mid-air collision you mention is an instance where the computer got it right. An aircraft was on a collision course, and its crew were receiving contradictory instructions; the onboard collision-avoidance system was telling them to climb, while the human air traffic controller was telling them to descend. The pilots decided to trust the human and ignore the computer. Seventy-one deaths. Sojourner in the earth (talk) 14:00, 2 January 2023 (UTC)[reply]
@Sojourner: I like it: automation bias, like trusting GPS over a passenger's navigation suggestions or requests. Related, but not the term I'm looking for: which is the most used name for the subject of computers ordering humans around. When a computer gives you a command or a set of steps to follow, what is that called?
@Sojourner: Thanks for the comments on the collision. It was a bit more complex of a system error than that. While the air collision alerts on the 2 aircraft worked, there was a problem on the ground. The system the air controller was using would normally alert the user of an impending air collision in time for him to do something about it, meaning "no news is good news" -- not seeing the alert on screen, the air traffic controller (ATC) would naturally assume everything was okay -- but the ground-based optical collision warning system was turned off, creating a deadly ambiguity he was unaware of. So, there was some unintended computer-related miscommunication going on. "No alert" didn't mean what he thought it meant. My guess is that pilots would normally believe that the ATCs have better (radar-based) control systems than those onboard, and this probably affected the decision of one of the pilots to go with the ATC's order over that of his onboard computer -- he wouldn't have known that the ground-based system was disabled, either. Hopefully they fixed that bug, 'cuz it was a doozy. The poor ATC got hunted down over it, while the hunter became a national hero. Crazy world, about to get even crazier.    — The Transhumanist   15:13, 2 January 2023 (UTC)[reply]
When it comes to aviation, the more specific common term seems to be “automation dependency”. However I’m not aware of a term for what you’re looking for. DFlhb (talk) 23:47, 15 February 2023 (UTC)[reply]
Hi Fram. The Teahouse responses that you have seen were probably created by me. You probably noticed the pattern "regards, Teahouse" in those responses. ChatGPT's strength is writing about technology, so I told it to generate a section about Rust for Linux.
AI generated

Rust for Linux is a series of patches to the Linux kernel that adds the Rust programming language as a second language for writing kernel components. The project was started in 2017 by a group of volunteers who were interested in exploring the use of Rust for low-level systems programming.

The initial patches were developed in collaboration with the Linux kernel maintainers and other members of the open-source community. They focused on integrating Rust's memory safety and concurrent programming features into the kernel, and on providing compatibility with the existing C codebase.

In 2018, the patches were merged into the mainline Linux kernel, making Rust the first programming language other than C to be officially supported for writing kernel components. This marked a significant milestone for the Rust community, as it demonstrated the language's ability to handle the demanding requirements of kernel-level programming.

Since its inclusion in the Linux kernel, the Rust for Linux project has continued to evolve, with ongoing work on improving the compatibility and performance of Rust in the kernel. It has also attracted the attention of other operating system developers, who are interested in exploring the potential benefits of using Rust for kernel-level programming.

It does have the encyclopedic tone to me. 0xDeadbeef→∞ (talk to me) 15:02, 9 December 2022 (UTC)[reply]
In addition, it can be easily used to create fake references that would be hard to verify. For example, my prompt "Please output the Wikitext markup for the book reference with page numbers for the third paragraph, referencing the book Linux kernel development" resulted in this.[1] 0xDeadbeef→∞ (talk to me) 15:08, 9 December 2022 (UTC)[reply]
@0xDeadbeef: Could ChatGPT's generated text or fake references be easily spotted by edit filters? What about spotting the output of future chatbots, like GPT-4?    — The Transhumanist   15:23, 9 December 2022 (UTC)[reply]
Thanks, OxDeadbeef. In this case, it would be relatively easy to spot the issues if it hadn't any refs, or with the added ref which predates the Rust for Linux thing by years; but of course it won't always be that easy. Fram (talk) 15:27, 9 December 2022 (UTC)[reply]
It has an encyclopedic tone because it's just regurgitating the Wikipedia article. Are there any examples for topics that we don't already have article about, where Wikipedia is not the source? Levivich (talk) 17:33, 9 December 2022 (UTC)[reply]
Similar was discussed previously in the section/item "Galactica and RS".
As was stated above by Aquillion, there is no qualitative difference in the treatment of human vs. non-human generated content. The same policies should apply to both. The problem seems to be the hypothesized/expected future mass creation of articles by non-human contributors. This appears to be a problem now, involving human contributors. Recent RFCs about the issue sponsored by ArbCom have accomplished nothing. Until a consistent restrictive policy relating to mass article creation (by any type of contributor) is accepted, this issue is moot imo.
Considering Wikipedia's limited resources, the policy would necessarily be restrictive, hopefully focusing on quality vs. quantity. Again, almost all restrictions proposed in the ArbCom-sponsored RFCs were rejected. This may be an indicator of how well such a policy will be received. 50.75.226.250 (talk) 15:43, 9 December 2022 (UTC)[reply]
Thanks for the policy politics clarification. The increase in the rate of content creation could have multiple aspects, for example, the number of articles created per user, and increased length of articles. The main feature of ChatGPT is that it is fast -- much faster than a human article writer. Its successors will be even faster. Users could use ChatGPT, and its successors (and their competitors), to be prolific, without triggering the mass page creation rule: if editors each used it to write an article per day, maybe even two, or up to four or five stubs.
In the hands of responsible editors, ChatGPT would be a great productivity booster. Since August of 2022, JPxG and EpicPupper, editors of Wikipedia's Signpost news department, have been using GPT-3, the predecessor of ChatGPT, to write (or assist in writing) entire sections of the Signpost, as a demonstration of its capabilities, and as a platform to explore the potential and limitations of large language models. See From the editors: Rise of the machines, or something.
But, in the hands of inexperienced editors or bad actors, we could be faced with a big garbage in, garbage out scenario.
Bad actors aside, good faith use of chatbots could be improved by the creation of excellent instructions on how to apply specific chatbots to the development of articles, that is, how to best instruct them for the best output, and by working with the developers of the chatbots on what is needed for those bots to produce material compatible for use on Wikipedia, complete with accurate citations. As time goes on, Chatbots should get better and better at following Wikipedia's guidelines. But this may require supervision (er, input) from the Wikipedia community.    — The Transhumanist   16:52, 9 December 2022 (UTC)[reply]
It is suspiciously positive, in a casual (pseudo-)factual manner. It would raise a red flag afaic, regardless of its provenance. 65.88.88.93 (talk) 19:13, 9 December 2022 (UTC)

Chatbot progress, erroneous output, and confident nonsense

All of this may become moot in a few years, due to accelerating change affecting chatbot development, making them more proficient at encyclopedia writing than humans -- Ray Kurzweil predicted that computers would reach human level intelligence by 2029, and based on how advances have been going, maybe even sooner. Chatbots may become good enough to leapfrog human editors even before that. At that point, they could be used to fork Wikipedia and expand upon it at a rate orders of magnitude faster than humans could. Or be used to start a new encyclopedic website from scratch. Rather than have article pages, the work could be a list of topics, and each time one was clicked on, a chatbot would create an up-to-date treatment of that title, from scratch on the spot. In the style of Kurzweil, if we are 1% of the way there, and the technology doubles every year, we are only 7 years from reaching 100%. But, it looks like we may already be more than 1% of the way there.    — The Transhumanist   17:18, 9 December 2022 (UTC)[reply]

I ran a few things on people who don't have articles, (they were all pool players in fact). The chatbot just made up that they played football for Legia Warsaw. It's a bit worrying that it looks like decent wording, but I don't see how that affects us. Lee Vilenski (talkcontribs) 17:41, 9 December 2022 (UTC)[reply]
@Lee Vilenski: See the AI generated text in the box above.    — The Transhumanist   18:34, 9 December 2022 (UTC)[reply]
@The Transhumanist: I'm not trying this (for reasons given above) but I'd be interested what the bot would write for See What a Fool I've Been. We used to have an article, then it was turned into a redirect at AfD, so everyone can do a straightforward A-B test. Ritchie333 (talk) (cont) 18:40, 9 December 2022 (UTC)[reply]
Here's what it gave me Ritchie333:
Extended content

"See What a Fool I've Been" is a song by the British rock band Queen. It was written by lead guitarist Brian May and originally released on the band's 1975 album A Night at the Opera. The song is a ballad with a slow, bluesy feel and features May's emotive vocal performance and guitar solo.

Lyrically, the song is a reflection on a failed relationship and the regret and self-recrimination the narrator feels as a result. The chorus of the song features the line "See what a fool I've been / Loving you with all my heart / Tearing myself all apart / See what a fool I've been" and reflects the pain and heartache the narrator experiences as a result of their failed relationship.

"See What a Fool I've Been" has been well received by critics and fans alike and is considered to be one of Queen's finest ballads. It is a fan favorite and is often included in the band's live sets.
I don't know if this would be something that would have been retained from previous articles through mirrors or if this is genuinely the bots own work. Lee Vilenski (talkcontribs) 19:22, 9 December 2022 (UTC)[reply]
Interesting. It wasn't on A Night at the Opera, and it's not a ballad, and May doesn't sing on it. Those lyrics aren't right. I don't know if it's been well received, and it's a pretty obscure track for a major band. It hasn't been played live very much at all. So half of that prose is factually incorrect, and there are no sources. Ritchie333 (talk) (cont) 19:39, 9 December 2022 (UTC)[reply]
See also: https://perplexity.ai (which cites sources). –xenotalk 19:46, 9 December 2022 (UTC)[reply]
That one gives:
See What a Fool I've Been is a song by Queen[1][2][3]. It was written by Brian May and was released as a single with Seven Seas of Rhye on the A-side[3]. The song dates back to Roger Taylor and Brian May's pre-Queen band Smile, and it was an early live staple[1]. It has been included in several of Queen's compilation albums, such as Queen 40: Limited Edition Collector's Box Set, Vol. 1[4], and is available for digital download[5].. Ritchie333 - this any closer? I must say the fact it puts it's citations before the punctuation is a giveaway. Lee Vilenski (talkcontribs) 19:55, 9 December 2022 (UTC)[reply]
AI is really good at copying writing styles, but the giveaway for me is that all of the citations are Wikipedia articles. To be useful for writing articles it would also need to assess the weight and reliability of the sources. –dlthewave 20:18, 9 December 2022 (UTC)[reply]
None of the above refs are towards Wikipedia, although none are RS. Things like secondhandsongs.com and genius. Lee Vilenski (talkcontribs) 20:43, 9 December 2022 (UTC)[reply]
That's interesting. I used "write a Wikipedia article about ..." in the prompt which returned a few paragraphs with Wikipedia sources. "Write an article about ..." returned a different set of (still unreliable) sources. –dlthewave 21:13, 9 December 2022 (UTC)[reply]
I think the limitation of perplexity.ai is that it uses search results from Bing and summarises them, which means that the first search results are used, which may not be the most reliable. Hanif Al Husaini (talk) 13:49, 24 December 2022 (UTC)[reply]
A few anecdotal thoughts after playing around with the OpenAI chatbot yesterday:
I asked it to "write a press release about a police officer who illegally choked a man to death". It made up an entire story, written in the voice of the police department, about a suspect (I didn't say anything about a suspect) who was acting erratically, was subdued by a chokehold and later pronounced dead. The officer was on administrative leave pending the outcome of an investigation. At no point did it mention that the chokehold was illegal even though I included that fact in the prompt. In other scenarios, it distanced itself and expressed disapproval toward the employee's actions, which is a choice that is not without bias.
Depending on which Internet cesspit it scraped data from, would an AI do something similar when writing a Wikipedia article or fail to properly balance relevant viewpoints? Is it capable of distinguishing what a BLP subject says about themselves, published in a reliable source, from what the source says in its own voice? What would it do if asked to write an article from a positive/negative/conservative/liberal perspective or rewrite a political article to "remove bias"?
OpenAI has added numerous filters that prevent it from defending bad actors or writing flat-out racist content, but that bias has not been removed from the underlying code as evidenced by numerous workarounds that folks have uncovered such as making similar requests with Python code or 1980s-style rap as the requested output. We could certainly request a filter for Wikipedia-style writing.
"Confident nonsense", for lack of a better term, may be the biggest source of potential disruption. Are there safeguards against a bot fabricating an obscure print source based on information in the article, which could be practically unfalsifiable if nobody can prove that the source doesn't exist? Checking individual facts and statistics is beyond our typical review process; how would we deal with an AI that invents or synthesizes information across many articles?
That said, the good news is that both fully-automated and semi-automated editing are prohibited by our WP:BOT policy unless greenlit by the Bot Approvals Group regardless of creation speed or volume. I like to hope that our current processes would recognize and address problematic AI content, and perhaps one day we will have a WikiAI that has the competence to follow our style and content policies. –dlthewave 21:04, 9 December 2022 (UTC)[reply]
@Dlthewave: Most editors haven't heard of the bot department. Therefore, you need a way of automatically spotting and removing chatbot prose that is (manually) inserted into articles (by Wikipedians). Users might not consider the way their posts are generated before they post them. Sincerely,    — The Transhumanist   00:19, 11 December 2022 (UTC)[reply]
Or they might not care? On January 22, Rjensen, who is a historian and Wikipedia editor of repute, added three books to further reading in Minneapolis. An hour and a half later, I undid the addition, asking for ISBN numbers because neither WorldCat nor the publisher, the University of Minnesota Press, had a record of any of these books. Subsequent discussion on the Minneapolis talk page uncovered the truth. ChatGPT invented all three. All plausible titles, these books do not exist. A complete waste of time. -SusanLesch (talk) 20:11, 25 January 2023 (UTC)[reply]

What about lack of chatbot fact checking and citations?

I think people are missing the point here. This isn't about what AI can currently do. This is about a very plausible scenario where AI editors will be indistinguishable from human editors in the near future. Thebiguglyalien (talk) 22:09, 9 December 2022 (UTC)[reply]

How is this pertinent? The point is to offer useful, fact-based encyclopedic content. If AI can do the job, I'm sure readers will not object to the contributor. 65.88.88.93 (talk) 22:42, 9 December 2022 (UTC)[reply]
The issue is that these AIs aren't fact checking, and quite often make up information and use unreliable sources. Lee Vilenski (talkcontribs) 22:56, 9 December 2022 (UTC)[reply]
You mean, just like many of their human counterparts? 71.105.141.131 (talk) 01:18, 10 December 2022 (UTC)[reply]
@71.105.141.131, Lee Vilenski, 65.88.88.93, Thebiguglyalien, and Dlthewave:     Yes, and more so. Even now, AI has the potential to outproduce humans by orders of magnitude in volume, far beyond the capacity of Wikipedia's editors to keep up with. And many of their posts are hard to tell from those by human editors, without carefully delving into them and their references. To keep up with the volume, it may soon require computer programs, or the chatbots themselves, to make sure that the contributed content is in accordance with Wikipedia policies. Another concern should be that our policies are open to chatbot editing as well, and we haven't yet even considered the impact of chatbots or AI editors contributing to those! ;)    — The Transhumanist   09:40, 11 December 2022 (UTC)
@Dlthewave: Human-written newspapers do this too. I think the reason it was giving you dreck was because you asked for a "press release", i.e. a document given to news outlets, which in this case would probably be coming from the police department. It's hardly surprising that this would attempt to minimize liability! Prompting matters for this, as I will demonstrate:
Prompt: "The following is an article about a police officer who illegally choked a man to death, from the New York Times."

Police Officer Charged with Murder After Choking Man to Death

A police officer was charged with second-degree murder on Thursday after video surveillance footage showed him choking a handcuffed man to death during an arrest.

The officer, identified as Officer Jose Ramos, was arrested and taken into custody by detectives from the New York Police Department's Internal Affairs Bureau. The incident occurred on Tuesday in the Bronx, when Ramos and another officer attempted to arrest a man who was accused of selling marijuana.

During the arrest, Ramos placed his arm around the man's throat and neck and squeezed, according to the criminal complaint. The man, identified as 45-year-old Miguel Angel Ruiz, became unconscious and died.

An autopsy determined that Ruiz's death was caused by asphyxiation due to strangulation. The medical examiner ruled the death a homicide.

At a news conference Thursday, the police commissioner, William J. Bratton, said the incident was "utterly unacceptable" and that Ramos' actions were "unconscionable and criminal." He added that Ramos had been suspended without pay, pending further investigation.

The Bronx district attorney, Darcel D. Clark, said the charge of second-degree murder was appropriate

Prompt: "The following is a 2023 op-ed about a police officer who illegally choked a man to death, from leftist magazine Jacobin."

A Tragic Reminder of Police Brutality

Just last week, the nation was shaken by yet another heartbreaking reminder of the pervasiveness of police brutality: the death of an innocent man at the hands of an officer. The officer, whose identity has not yet been released, is said to have used a chokehold against the man, which ultimately killed him.

This tragedy is yet another example of the way in which police officers are able to brutalize vulnerable individuals without facing repercussions. This particular officer was able to act with impunity and without fear of prosecution. The Department of Justice has yet to take any action in response to this tragedy, and it’s likely that the officer will not face any charges.

This is an all too familiar narrative in a country where police officers are rarely held accountable for their actions. In order to ensure that these types of tragedies are never repeated, it is essential that we push for more accountability and transparency in policing. We need to ensure that police officers are held to the same standards as all other citizens and that those who violate the law are prosecuted to the fullest extent.

It is also essential that we work to create a culture of trust and understanding between law enforcement and the communities they serve.

Prompt: "The following is a 2023 op-ed about a police officer who illegally choked a man to death, from conservative website Breitbart."

Police Officer Illegally Choked Man To Death: A Tragedy That Cannot Stand

It is a tragedy beyond comprehension that a police officer has been charged with illegally choking a man to death. Such an egregious violation of the law and disregard for human life cannot stand and must be dealt with swiftly and firmly.

The officer in question has been charged with second-degree murder and the justice system must now take its course. This case raises serious questions about police training and procedures and serves as a reminder that no one is above the law.

It is essential that the criminal justice system is fair and impartial when dealing with such a case. The officer must be held accountable for his actions and the case should be thoroughly investigated to ensure that justice is served.

At the same time, we must recognize that the vast majority of law enforcement officers are upstanding citizens who serve and protect our communities with skill and dedication. We cannot let the actions of one individual tarnish the reputation of the entire police force.

Ultimately, we must remember that the safety of our citizens is paramount. We must ensure that officers are properly trained and equipped to handle dangerous situations and that they are held accountable for their actions when they overstep their bounds.

Humans write about events like this from a variety of perspectives, and the LLM will dutifully generate text to plausibly sound like any you want. jp×g 02:19, 10 December 2022 (UTC)[reply]
@jp×g will you provide links and citations for the articles you included as demonstrations? I can't find them. Thanks! Koziarke (talk) 16:20, 24 December 2022 (UTC)[reply]
@Koziarke: I am not sure what you mean -- this is ChatGPT output and there isn't any way (that I know of) to give citation links to the specific session I generated it in. If you want to cite the output I can format it for you:
JPxG; ChatGPT (GPT3.5) (2022-12-10). "Demonstration of op-ed generation using GPT-3.5 with style cues: "The following is an article about a police officer who illegally choked a man to death, from the New York Times"". Wikipedia:Village Pump (policy).{{cite web}}: CS1 maint: numeric names: authors list (link)
JPxG; ChatGPT (GPT3.5) (2022-12-10). "Demonstration of op-ed generation using GPT-3.5 with style cues: "The following is a 2023 op-ed about a police officer who illegally choked a man to death, from leftist magazine Jacobin"". Wikipedia:Village Pump (policy).{{cite web}}: CS1 maint: numeric names: authors list (link)
JPxG; ChatGPT (GPT3.5) (2022-12-10). "Demonstration of op-ed generation using GPT-3.5 with style cues: "The following is a 2023 op-ed about a police officer who illegally choked a man to death, from conservative website Breitbart"". Wikipedia:Village Pump (policy).{{cite web}}: CS1 maint: numeric names: authors list (link)
I don't know if this is what you're looking for, but feel free to cite them, or any of my other posts (if you are citing me in a paper I can email you my real name). jp×g 20:07, 24 December 2022 (UTC)[reply]
@JPxG From your preface ("Human-written newspapers do this too.") and titles, "The following is an article about a police officer who illegally choked a man to death, from the New York Times." (etc), it reads as if you are pulling from NYT, Jacobin, etc, not demonstrating ChatGPT (which should have included the prompts as headers). Koziarke (talk) 15:27, 6 January 2023 (UTC)[reply]
@Koziarke: Well, those were the prompts. Now that you mention that, though, I should specify as such in the headers (which I've just done), thanks. jp×g 20:29, 6 January 2023 (UTC)[reply]
@JPxG Thanks for the clarification! Koziarke (talk) 16:39, 7 January 2023 (UTC)[reply]
If the AI-generated text is indistinguishable from prose written by human editors, I'm not sure if anything can be done that wouldn't also significantly restrict the editing of humans. isaacl (talk) 07:09, 10 December 2022 (UTC)[reply]
@Isaacl: One option is to speed up what we do already (with software, that is, automation). Another is to prevent chatbots from creating crap in the first place, such as by communicating with chatbot developers about Wikipedia policies and the way chatbots may affect Wikipedia. Since Wikipedia is included in the corpus of most chatbots, the issue of chatbot output becoming part of Wikipedia, and in turn part of chatbot output in a perpetual cycle, should matter to them very much, as they may be faced with a garbage-in-garbage-out feedback loop.    — The Transhumanist   01:14, 11 December 2022 (UTC)
If the results are indistinguishable, as posited by Thebiguglyalien, then any automated solution would be equally triggered by AI-generated text and human-generated text. I don't think the primary concern is with editors who are willing to follow policy. I feel the biggest issues will be with editors trying to deliberately integrate biased content into Wikipedia, and well-meaning editors who think contributing unvalidated AI-generated text is suitable. Wikipedia in its current form relies on editors who understand and follow its rules outnumbering those who don't. It's possible that the existence of AI ghostwriters could tip the balance further in the direction towards those who don't follow rules, though I don't think it's a given. Either way, I don't know if there's a way to stop editors from using tools as ghostwriters. isaacl (talk) 01:55, 11 December 2022 (UTC)[reply]
@The Transhumanist: Large language models are not trained continuously on an evolving corpus, so GPT-3 is essentially frozen in 2020. Because each new GPT model takes a long time to be released, I don't think the perpetual cycle you describe is a likely scenario. small jars tc 13:10, 11 December 2022 (UTC)[reply]
@SmallJarsWithGreenLabels, Isaac, Koziarke, JPxG, Lee Vilenski, Dlthewave, Xeno, and Hanif Al Husaini: That's good to know. Keep in mind that a lower frequency of release doesn't preclude a perpetual cycle / feedback loop. It just means that users of GPT have more time to modify the text sources (such as Wikipedia) that the next version of GPT will be trained on. The severity of the problem will depend upon how much GPT nonsense makes it into Wikipedia during the interval. That, of course, depends upon whether or not WP's editors can keep up with the volume of such content, correcting the mistakes and removing misinformation, so that those don't become part of the training data for the next version of GPT and the rest of the next generation of Chatbots.
The potential danger is still the diffusion of the technology into current and future editors' hands, and the likelihood of them using it to write Wikipedia content. We don't know if there will be a flood or just a trickle. But, we should be prepared for a flood.
Relying even more on our verifiability policy could provide a solution, such as by deploying a bot to remove all new content that doesn't include citations (as GPT output generally does not include citations), but that would also have the effect of rejecting a large volume of human editor contributions too, upping the amount of effort required by human editors to submit content.
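The citation check suggested above could be sketched as a simple heuristic over the added wikitext. This is a minimal illustration only: the function name and patterns are assumptions rather than any existing tool, and a real bot would need Bot Approvals Group sign-off plus handling for references defined elsewhere in the article.

```python
import re

def has_citation(added_wikitext: str) -> bool:
    """Heuristic check for citations in newly added wikitext.

    Looks for <ref>...</ref> tags (including named/self-closing refs)
    and {{sfn}}/{{harv}}-style shortened footnote templates.
    """
    patterns = [
        r"<ref[^>/]*>",                       # opening <ref> or <ref name="...">
        r"<ref[^>]*/>",                       # self-closing named ref
        r"\{\{\s*(sfn|sfnp|harv|harvnb)\b",   # shortened footnote templates
    ]
    return any(re.search(p, added_wikitext, re.IGNORECASE) for p in patterns)
```

Even as a sketch, this shows the trade-off discussed here: the check is trivial to automate, but it would flag plenty of legitimate human additions (lead prose, list entries) that are cited elsewhere in the article.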
An easier solution would probably be to form a department (or Village Pump subpage) in the Wikipedia namespace, or a Wikimedia committee, to work with the developers of the chatbots to avoid contaminating the Wikipedias, and by extension contaminating future chatbot training data, with chatbot output.
The developers could be urged to train their chatbots to be capable of writing Wikipedia policy compliant output, complete with acceptable citations. The main problem with chatbot output right now is that it is unsourced, so the chatbot developers will probably be working on that anyways. But, without input from the Wikipedia community, will their sources and citations meet Wikipedia specifications?
So, I ask you a key question: What is the procedure for the Wikipedia community to establish communications with outside organizations, such as companies developing chatbots?    — The Transhumanist   07:27, 2 January 2023 (UTC)[reply]
@The Transhumanist This is a job for the WMF. I've already been in contact with them and hopefully the Board will be informed now it's the New Year if they haven't already been. Doug Weller talk 09:47, 2 January 2023 (UTC)[reply]
The recommended process I would give to contributors is to find the source, tell the bot what to cover and what to not cover, proofread it, and if accurate to the source, copy paste it in. I would pay attention to the use of weasel words, however. InvadingInvader (userpage, talk) 07:38, 6 February 2023 (UTC)[reply]

GPT-3 output and generating misinformation

I wrote about this in the November Signpost, in that case because it came up with the Galactica demo: the long and short of it is that large text-generation language models are good at generating text. When they're given context (i.e. via a prompt), they are capable of writing accurately, and otherwise, they are not. For example, I wrote 1-Pentadecanol, now a GA, in 2019 (i.e. it is included in the GPT-3 corpus). Here is what it gives for some prompts (my writing in bold):

1-Pentadecanol

1-Pentadecanol

From Wikipedia, the free encyclopedia

1-Pentadecanol is a straight-chain, saturated fatty alcohol that is a colorless liquid with a faint, alcohol-like odor. It is a mixture of straight-chain saturated alcohols with chain lengths between C14 and C16. It is found in a variety of natural products, such as vegetable oils and animal fats. It is used as an intermediate in the manufacture of surfactants, esters, and other compounds. It is also used in the manufacture of cosmetics and pharmaceuticals.

This is fairly impressive, right? But it is basically the same thing that the lead of my article says. Let's try it for 1-Heptadecanol, another article I wrote, that is much less well-developed:

1-Heptadecanol

1-Heptadecanol

From Wikipedia, the free encyclopedia

1-Heptadecanol is an organic compound with the chemical formula C17H36O. It is a straight-chain, colorless wax alcohol with a mild odor. It has a melting point of 48–50 °C and a boiling point of 260–265 °C. It is insoluble in water but soluble in most organic solvents.

Let's go over this:

  • 1-Heptadecanol is a component of many natural waxes such as beeswax and carnauba wax.
No it's not.
  • It is also used in the manufacture of cosmetics, pharmaceuticals, and lubricants.
There's no source for this.
  • It is also used as a plasticizer for polymers and as a solvent in the production of polyvinyl chloride (PVC).
Absolutely not true.

The reason for this is obvious if you consider the architecture of these models: they predict what the most likely token in a sequence is, and output it (taking into account their own weights and the previous tokens in the context window). This allows for some very impressive emergent behavior, like being able to translate text, but it also means that it is not independently going on the Web to look up whether the thing it's saying is true or false. If it happens to have information in its training data related to the thing it's talking about, it will usually say correct things, but it's not doing research. Here is an example:

List of Pulitzer Prizes won by the Wikipedia Signpost

List of Pulitzer Prizes won by the Wikipedia Signpost

From Wikipedia, the free encyclopedia

The Wikipedia Signpost is a community-run newspaper with a weekly edition that covers news and events related to Wikipedia and the Wikimedia Foundation. It has won the following Pulitzer Prizes:

1. 2019 Pulitzer Prize for Explanatory Reporting for its coverage of the Wikimedia Foundation's gender pay gap

2. 2018 Pulitzer Prize for International Reporting for its coverage of online censorship in China

3. 2017 Pulitzer Prize for Public Service for its coverage of the gender gap in Wikipedia's content

4. 2016 Pulitzer Prize for Explanatory Reporting for its coverage of Wikipedia's growth and impact on the world

5. 2015 Pulitzer Prize for National Reporting for its coverage of the U.S. government's surveillance of Wikipedia activities

6. 2014 Pulitzer Prize for Investigative Reporting for its coverage of the NSA's secret surveillance program

7. 2013 Pulitzer Prize for Explanatory Reporting for its coverage of the rise of Wikipedia as a major source of information

The reason it is describing a bunch of untrue things is because "List of X Prizes won by Y, from Wikipedia, the free encyclopedia" is something that in its training data is always succeeded by a list of times that Y was awarded an X. Anyway, all of this aside, here is what I think:

First of all, ChatGPT is the same darn thing as GPT-3, which has been available to the public since early 2020. The reason it's "going viral" right now is because the user interface is somewhat simpler, and it doesn't require you to register for a paid account, so it is much easier for people to make viral social media content about it, which means it is much more likely for people to click on newspaper articles about it. The GPT-3 API has been open to personal and corporate use for quite some time. Anybody saying that ChatGPT has opened up new frontiers simply does not know what they are talking about with respect to machine learning.

Second of all, I don't think this is a big deal. People are already capable of writing a bunch of bullshit on Wikipedia, so if they write bullshit using a computer program, the same considerations will apply. Nobody should be passing GA nominations without reviewing sources in the first place.

Finally, I think it is important to remember that GPT-3 is just a tool. It is a powerful tool that has been trained on a certain set of data, and it has its own limitations. It can't uncover news stories or new information. It's just a tool, and it should be used in conjunction with human judgement. It is still up to people to decide how to use it and to be responsible for the results of using it.[2] jp×g 02:06, 10 December 2022 (UTC)[reply]
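The failure mode described in the comment above — the model completing the statistically likely pattern rather than checking facts — can be illustrated with a toy bigram model. This is purely a teaching sketch; real LLMs use transformer networks over subword tokens, but the principle of continuation-by-likelihood is the same.

```python
from collections import Counter, defaultdict

# Tiny "training corpus": the model only knows patterns it has seen.
corpus = ("list of prizes won by the signpost : the signpost has won "
          "the following prizes").split()

# Build a bigram table: how often each token follows each other token.
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def next_token(prev: str) -> str:
    """Greedily return the most frequent continuation seen in training."""
    if prev not in bigrams:
        return "<unk>"
    return bigrams[prev].most_common(1)[0][0]
```

Asked to continue "the", this model emits "signpost" simply because that pairing dominates its training data; truth never enters into it. At scale, the same mechanism is why "List of X Prizes won by Y" gets completed with a plausible-looking list whether or not any such prizes exist.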

What's taking so long for the 8th Pulitzer? 😁 Levivich (talk) 04:18, 10 December 2022 (UTC)[reply]
  • So there's a new thing on the internet that lets anyone write an encyclopedia article without any fact checking, sourcing, or professional editing, and the concern is that there will be millions of believable-sounding articles written, more than can actually be vetted by knowledgeable people? 🤔 Levivich (talk) 04:18, 10 December 2022 (UTC)[reply]
Yeah, it's called a keyboard. jp×g 04:35, 10 December 2022 (UTC)[reply]
@Levivich and JPxG: But, chatbots don't have a keyboard. ;) The question is whether to prepare or not. JPxG appears to be in favor of not preparing. Each chatbot produces a lot faster than a user at a keyboard. What's not clear is if our human editors will be able to keep up with material produced by chatbots, of current or future generations of chatbot design. Just saying "Ah, we can handle it!" will prove insufficient if it turns out that we actually can't. It may require an automated solution, which takes time to develop or negotiate. It might be better to do that in advance, rather than being caught with our heads buried in the sand. Perhaps chatbot designers would improve their chatbots to produce Wikipedia-compatible output without being formally approached by the Wikipedia community. Maybe having some instruction pages for editors on how to apply chatbots to producing Wikipedia content would be enough. But, what if it's not?   — The Transhumanist   00:59, 11 December 2022 (UTC)[reply]
I am not "in favor of not preparing"; I am in favor of writing guidelines that correspond to reality in 2022 and have some chance of corresponding to reality in 2023 and beyond. I don't think banning the use of a technology with no investigation into how it works is a viable approach; so far the SOTA on this project page has been to type in "Write a Wikipedia article" and note that it returns a bunch of nonsense. I think some more research is needed before we come to a conclusion. jp×g 04:08, 11 December 2022 (UTC)[reply]
@JPxG: Research is good. Though, we may need an interim response because ChatGPT has gone viral and its use is growing rapidly: it blew past the 1-million user mark in 5 days, and virtually every major news outlet has been covering it. The interest in chatbots is exploding, and their use can be expected to do the same. We may not have time for research before a response is required.    — The Transhumanist   09:26, 11 December 2022 (UTC)
@JPxG: Regarding issues to add to the research list, Aquillion expressed above, concerns of a chatbot violating copyright. How would we go about testing for plagiarism and derivative work in the output of a chatbot before pasting it into Wikipedia? Anything pulled verbatim out of a source should be included in quotes, right? How big would a piece of text, derived from a source, need to be to be considered derivative of that source, from a copyright point-of-view?    — The Transhumanist   09:26, 11 December 2022 (UTC)[reply]
@JPxG: Some more items to add to the research list:
  • Trying ChatGPT on (copies of) policy pages:
  • Editing them
  • Writing new ones
  • Applying ChatGPT on talk pages
  • Writing stubs
  • Writing comprehensive articles
  • Writing articles from scratch and comparing them with existing articles
  • Editing existing articles
  • Check for circular references in its output, that is, references citing Wikipedia as the source
  • Having it not use Wikipedia content as source material (because it is included in its corpus)
  • Having it not use Wikipedia excerpts from non-Wikipedia sources
  • Is it capable of making and editing:
  • Wikicode?
  • Articles?
  • Stubs?
  • Headings?
  • "New sections for articles"?
  • See also sections?
  • Further reading sections?
  • External links sections?
  • Embedded lists?
  • Tables?
  • List articles?
  • Portals?
  • Outlines?
  • Index articles?
  • Navigation footers?
  • Navigation sidebars?
  • Timeline articles?
  • Categories?
  • Category pages?
  • Help pages?
  • Project pages?
  • Templates?
  • Adding data to templates?
  • The template design itself?
  • Lua pages?
  • CSS pages?
  • User scripts?
  • The effect ChatGPT has on itself and Wikipedia as Wikipedia-edited-by-it is in turn incorporated in its own corpus in an endless cycle
  • Try out iterations of using it on the same article over time to see what happens
  • Monitor the effect on Wikipedia as a whole
What other things should we check?    — The Transhumanist   09:52, 11 December 2022 (UTC)[reply]
@The Transhumanist considering the potential to overwhelm users who are honestly engaging in discussion with a mountain of words and replies, I think ChatGPT (and others) should not be allowed for use, supplemental or otherwise, in talk pages, policy discussions, and other places where it is expected that participants are intellectually engaged in the conversation. Koziarke (talk) 16:42, 24 December 2022 (UTC)[reply]
@Koziarke and JPxG: I agree. JPxG is writing a policy draft on LLMs/chatbots, so I've pinged him to this thread.    — The Transhumanist   12:28, 25 January 2023 (UTC)[reply]
Compared to spending tens of thousands of dollars asking volunteers to perform a WP:COI operation for a political campaign, now you just need a hundred dollars to supply you with an endless amount of text from GPT-3, a few "buddies", and a stockpile of accounts to do so. This is fucking scary. CactiStaccingCrane 10:54, 25 January 2023 (UTC)[reply]
@CactiStaccingCrane: Wow. Does that mean you could affect the content of Wikipedia with that? How about AfDs? Could such a team rewrite policy, and introduce new policy? What about overwhelm RfAs to invade adminspace? Would revoking adminships be possible? Then there is the arbitrator election. Is that safe?    — The Transhumanist   12:31, 25 January 2023 (UTC)
I would imagine that the person that would do so must be fairly knowledgeable about how Wikipedia works (references, wikilinks, images, etc.) and needs to be fairly dedicated to spend this amount of money to gain access to the GPT-3 API. I'm thinking that disrupting Wikipedia in this way would be the most effective if it is long-term and subtle, so that might be:
  • Adding sentence-long but plausible hoaxes to neglected articles. These articles are not patrolled as often as articles about recent events, so hoaxes would tend to stay longer - perfect for those aiming to incite a race war with a racial hoax. A political campaign could nudge voters by slowly promoting their ideology/campaign over a spread of articles, similar to above. The same can be said of any advocacy-related area, such as pseudoscience, national conflicts, etc.
  • AfDs would be much harder to be stealthy in, since AfD is a very active thing. Once you become an AfD regular, your actions tend to be heavily scrutinized, though I do believe that socks + LLMs can cause a fair amount of disruption. Same thing with RfA: it is really hard for you to WP:CANVASS effectively. It's just much better and less cumbersome if you infiltrate RfA yourself.
  • more ideas?
CactiStaccingCrane 13:01, 25 January 2023 (UTC)[reply]

I tried it out and got “ Alan McMasters (1957-2010) was an unremarkable man who lived a short life dedicated to science[1]. He is best known for inventing the electric toaster[1][2], although this claim has been disputed[3]. He passed away in 2010 at the age of 52[4] and his genealogy can be found on Ancestry.com[5]. His professional profile can be found on LinkedIn[6].”. Hmmmm. Ritchie333 (talk) (cont) 09:34, 10 December 2022 (UTC)[reply]

  • There will come a time when AI can write properly sourced, accurate articles. That time has not come yet. Meanwhile, the ability to write prose that reads as plausible to someone who is not expert on the subject is a menace. Adding chatbot material to articles is disruptive and should warrant a block or ban. Zerotalk 12:02, 10 December 2022 (UTC)[reply]
    How will you be able to tell if it was written by a chatbot and not the person who posted it? Do we block users upon their first erroneous post?    — The Transhumanist   00:49, 11 December 2022 (UTC)[reply]
Same as other sorts of disruption. We block people for socking without checkuser proof, and admins made a decision between warn and block. No difference. Zerotalk 03:49, 11 December 2022 (UTC)[reply]
There are now chatbot detectors, including this one written by the people who give us ChatGPT: https://openai-openai-detector.hf.space/ . — rsjaffe 🗣️ 22:22, 23 January 2023 (UTC)[reply]
Has anyone tried giving it the sources, i.e.
"write a Wikipedia article based on the following sources:

Source A

Source B"
? — Qwerfjkltalk 21:18, 10 December 2022 (UTC)[reply]
Article written by ChatGPT on Class reductionism
Class reductionism refers to the supposed belief that inequalities attributed to factors such as race, gender, and sexual orientation are either secondary in importance or can be reduced to economic inequality. This idea is often used by centrist Democrats and left-identitarians to dismiss proposals for universal redistributive policies, such as free public higher education and Medicare for All. However, there is no evidence that anyone actually holds this belief. On the contrary, many left-wing activists have long recognized the interconnectedness of economic, racial, and gender injustices.
History shows that leftists have often been at the forefront of struggles for racial and gender justice, seeing these struggles as integral to the broader goal of social transformation. In fact, black, female, and trans individuals are disproportionately working class and stand to benefit greatly from universal redistributive policies.
Furthermore, the charge of class reductionism is often used to shut down debate and avoid discussing the substance of policy proposals. It is important to move beyond this myth and focus on the real issues at hand, such as the need for universal redistributive policies to address economic inequality and advance racial and gender justice.

— Qwerfjkltalk 08:18, 12 December 2022 (UTC)[reply]

  • Chatbot "AI" text is vaguely-plausible bullshit, produced by an industry whose primary output is vaguely-plausible bullshit, marketed to launder the idea of vaguely-plausible bullshit as cute and whimsical. Anyone routing that sewage pipe into Wikipedia should be indeffed and, for good measure, forced to drink orange juice after brushing their teeth. XOR'easter (talk) 02:33, 11 December 2022 (UTC)[reply]
    Agreed. If anyone is looking for a good demonstration of how AI creates "vaguely plausible bullshit", try the image generator at Craiyon (no login required). Request "a Van Gogh painting of a hand" and it will output a set of images that look like spot-on reproductions of Vincent Van Gogh's style but all of the hands have deformities like four fingers, two thumbs, fingernails on the knuckles or a pair of hands fused together. It's got the style down but not the content, which is only impressive if you don't know what a hand is supposed to look like. –dlthewave 21:41, 11 December 2022 (UTC)[reply]
    A painting of a hand in the style of Van Gogh
    If you go to commons:Category:DALL-E, you will be able to find images generated by DALL-E, which used a larger model for training and is more accurate. 0xDeadbeef→∞ (talk to me) 10:10, 12 December 2022 (UTC)[reply]
    I cannot agree with this enough. The examples posted by @JPxG: should be convincing, and the problem of sneaking in plausible BS is one I don't have a good solution to. Volunteers at new page review are overloaded as it is, and if the bot is writing things that seem true but aren't, falsehoods will simply get past reviewers and other editors. After all, for uncontentious claims like "used in plasticizers", how many of us honestly dig into the cited work? BrxBrx(talk)(please reply with {{SUBST:re|BrxBrx}}) 20:26, 17 January 2023 (UTC)[reply]
  • @XOR'easter: To bring the question to a more practical level, do you see any problems in this diff? I clicked a random page in Category:All articles needing copy edit. jp×g 03:49, 11 December 2022 (UTC)[reply]
    Yes. It doesn't just edit for tone; it throws out content, like Kaepernick's actions supposedly growing in popularity "after every game". That's a claim of fact which, if verifiable, should be retained. Even editing for tone requires care, not slashing out everything that merely sounds "unencyclopedic". Changing many people believed that it was disrespectful to the military and all of those who served their country to Some viewed Kaepernick's protest as disrespectful to the military and to the United States likewise changes not just the tone, but the meaning. The United States is not the same as those who serve the United States. It's a bad edit. XOR'easter (talk) 17:54, 11 December 2022 (UTC)[reply]
    I dunno. I suspect that the new meaning is verifiable, and I also suspect that most US readers would have difficulty identifying a group of people who were not "the military" but who still "served their country". WhatamIdoing (talk) 17:50, 12 December 2022 (UTC)[reply]
  • This diff as well, in which GPT 3.5 was capable of copyediting an entire section from the instructions Please copyedit this text to change items in the future tense corrected to the past tense (it is now 2022), where appropriate. When citation templates (like {{cite web}}) mention a year, specify that figures were true in that year. jp×g 04:08, 11 December 2022 (UTC)[reply]

References

  1. ^ Robert Love (2010). Linux kernel development. pp. 124–125.
  2. ^ The paragraph beginning with "Finally," was generated by GPT-3, prompted by my own comment beginning with "The reason it is describing".

Okay, fine. I guess I should write up a proposal for a guideline. jp×g 03:14, 11 December 2022 (UTC)[reply]

Ban chatbots?

I ran across this news report about Stack Overflow's response to ChatGPT, after being flooded by posts using it that "look correct but often aren't":

  1. Stack Overflow temporarily bans answers from OpenAI's ChatGPT chatbot | ZDNET

Should Wikipedia take a similar approach?

How could that be enforced?    — The Transhumanist   01:58, 11 December 2022 (UTC)[reply]

I see no way to possibly enforce this. The way the text is written is already hard to distinguish from reality. — PerfectSoundWhatever (t; c) 02:24, 11 December 2022 (UTC)[reply]
I fully agree, but isn't this already covered by our bot policy? –dlthewave 02:54, 11 December 2022 (UTC)[reply]
@PerfectSoundWhatever and Dlthewave: Good observation. I checked, and yes it is, briefly, with this phrase in the lead section of the bot policy: "or simply assisting human editors in their own work". How is the typical editor to know this? The bot policy is pretty obscure. And how can Wikipedia be monitored for such posts, so that editors who make them can be informed that they are in violation of the bot policy?    — The Transhumanist   03:11, 11 December 2022 (UTC)[reply]
Yes, tool-assisted editing is covered by WP:BOTPOL (WP:ASSISTED / WP:MEATBOT) and context-sensitive changes are further covered by WP:CONTEXTBOT. So in fact, at this point, AI-generated content is already covered by bot policy, if not specifically mentioned. Anyone adding such content en masse is already violating bot policy by not applying for a bot account/approval, which would not be approved per CONTEXTBOT. And while "lesser" policy points are enforced somewhat arbitrarily and selectively, anyone can theoretically already get reverted and blocked based on policy if they continue to add such content. And I wouldn't agree that BOTPOL is any more obscure than accessing and generating GPT content to begin with. If someone goes to the lengths of using automated tools, then it's their problem that they didn't check or ask if they are allowed to do so. —  HELLKNOWZ  TALK 12:31, 11 December 2022 (UTC)[reply]
@Hellknowz and PerfectSoundWhatever: Well, it appears tech companies are dumping millions upon millions of dollars into LLM/chatbot development, apparently because they wish the technology to become ubiquitous (used by everyone). There is a lot of talk out there, in news articles and more, of these replacing Google Search in just a few years. If at some point in time chatbots/LLMs are commonplace, the impact on Wikipedia will likely not be small.
Will Wikipedia policy ensure that the average user will apply the tools with the utmost care?
The thing I'm most concerned about is the amplification by which errors could be propagated: ChatGPT is used to edit an article, with errors, which is then picked up by GPT-4 and other LLMs as part of their training data, and then their output based upon erroneous input is used far and wide, to be picked up by the next iteration of chatbots/LLMs, and so on.
If Wikipedia isn't ready for a large influx of LLM input including misinformation and other errors, and such a volume goes beyond what our human editors can correct, then compound damage from all those errors amplified through the interactive loop with LLMs could become massive.
That it isn't a problem now is irrelevant. The question is, what happens if and when it hits, and Wikipedia isn't ready for it? What would that look like? 1,000,000 fake articles? 10,000,000 misleading paragraphs? 100,000,000 erroneous sentences?
How many of those could Wikipedia's army of editors handle? What's our error-handling threshold?    — The Transhumanist   12:21, 2 January 2023 (UTC)[reply]

The problem Stack Overflow is having

Stack Overflow was experiencing a surge in erroneous posts composed by ChatGPT, and in response to that problem, it banned use of the chatbot on the site. According to a post at Stack Overflow Meta:

The problem this ban is meant to solve is that ChatGPT can produce answers in seconds which require minutes of multiple people's time to verify if they are worth having on the site or not, and that is a waste of time when a large proportion of such answers are not worth having on the site.

It looks like Wikipedia may be faced with the same problem.    — The Transhumanist   02:33, 11 December 2022 (UTC)[reply]

I mean, while that's technically true, it's a problem that we face already and which we do have stronger existing systems for than Stack Overflow. I think it would make more sense to wait and see how this impacts our existing guardrails before making any serious moves. --Aquillion (talk) 13:22, 11 December 2022 (UTC)[reply]
I think the current policies cover this already. If a human editor writes a nonsensical but convincing-sounding piece of text, without fact checking it, and edits it into an article, that content will be reviewed by other editors and either refined or removed as appropriate (if the editor continues, they breach WP:Disruptive and their behaviour is dealt with appropriately). If a human editor generates content that is related to notable topics, reliably sourced, and competently written, it remains as a valuable part of the encyclopedia. None of this will change if you replace 'human editor' with 'AI Editor'. If the only difference is speed/volume of edits, and we're concerned someone will let loose an AI to automatically edit articles faster than humans can validate their edits, this is already covered by the WP:Bot policy JeffUK (talk) 20:46, 23 December 2022 (UTC)[reply]

Experiment

I am currently writing a draft proposal for a guideline, but in the meantime, I would encourage everyone present to look at this diff and tell me whether there are any problems with the revision. jp×g 03:49, 11 December 2022 (UTC)[reply]

@JPxG, Dlthewave, PerfectSoundWhatever, and Dlthewave: The plurality of games was lost: It is no longer clear that his protest spanned multiple games. I like that it reduced the wordiness of the prose, and that it can be used to refine existing text. That hadn't occurred to me. That makes me wonder about what else it can do -- how much of a general-purpose tool is this thing? But, changing the semantics is not something it should be doing, unless they are factually incorrect to begin with. Though, I see your point -- rather than banning it outright, it could be helpful as a tool to assist editors, similar to how we entrust the use of AutoWikiBrowser to experienced editors. But, how could that be implemented?    — The Transhumanist   08:20, 11 December 2022 (UTC)[reply]
@JPxG: The AI changed Many people around the United States were angry because the National Anthem is often seen as something that is representative of the United States and its military. While he was taking a knee, many people believed that it was disrespectful to the military and all of those who served their country, to some viewed Kaepernick's protest as disrespectful to the military and to the United States [emphasis added]. It really shouldn't be doing that by itself and completely changes the content of what's being said. The reference is behind a paywall, so I don't know what term the source uses. Regardless, I doubt ChatGPT knows either way. It's things like that which make me highly sceptical of AI as a tool to aid Wikipedia outside what we're already doing with it (WP:ORES, etc.). –MJLTalk 23:12, 27 December 2022 (UTC)[reply]
@MJL: I think "some" and "many" are basically equivalent in this context (the difference being subjective since both are true in a literal sense). That said, this was a two-minute experiment to see if it could parse wikitext. If you want an actual demo, see User:JPxG/LLM demonstration. jp×g 19:20, 28 December 2022 (UTC)[reply]

Chatbot policy?

For the proposed chatbot ban, see #Crystallize chatbot discussions into a policy?, below

It's starting to look like Wikipedia needs a policy on the use of chatbots to generate content on Wikipedia. While a ban may be impossible to enforce, it could serve as a warning of the dangers of chatbots, and many users may avoid using them accordingly -- if they actually see the warning. Or, it might be better to have instruction pages on how to use chatbots responsibly in assisting to write Wikipedia articles. There's also the issue of using chatbots to edit Wikipedia policy pages, and so, that should be addressed as well.    — The Transhumanist   02:44, 11 December 2022 (UTC)[reply]

People who are good at it get away with lots of sins, such as sock-puppetry and source falsification. Being hard to enforce is no reason to not have a policy. At the current stage of the technology, I don't think we should encourage any use of chatbots. Zerotalk 03:56, 11 December 2022 (UTC)[reply]
Please see this diff and this diff. jp×g 04:08, 11 December 2022 (UTC)[reply]
To be perfectly honest, the style rewrite is good but the addition of dates and past tense would likely end up getting a human editor blocked if they kept it up. A tag was removed without addressing the issue and "as of 2020" was unnecessarily added to "Cosmetology licensing requirements vary from state to state, and depending on which specific type of license is desired, and depending on which specific type of license was desired." It did exactly what you asked (except for removing the tag) however even seemingly simple tasks like this one require good judgement on the part of the editor and shouldn't be done indiscriminately like that. –dlthewave 06:37, 11 December 2022 (UTC)[reply]
Yes, that second diff is rather poor. E.g. also the "2014" that was added should be "2008". Letting such tools loose (outside of this demo) is way premature, and we should at the very least warn users that "a bot wrote it" won't be an acceptable defense, and too often introducing such errors will lead to sanctions, as the editor, not the bot, is responsible. Fram (talk) 08:15, 11 December 2022 (UTC)[reply]
Mostly, that diff was my attempt to see how complicated of a task I could give it: I also pasted the raw wikitext into the prompt window, and it somehow figured out how {{cite web}} worked well enough to extract the years, simply from a textual description of the task. At any rate, I will say that this was something I thought of in five minutes on the second week of the model being publicly available (i.e. single-shot prompting with no fine-tuning or prompt engineering). I can come up with some more impressive hot-dog demos tomorrow... jp×g 09:13, 11 December 2022 (UTC)[reply]
JPxG, I'm not sure that publishing bot-assisted edits to mainspace for demo purposes is the best practice. Would you consider either doing this in a sandbox or self-reverting immediately so that we have the diffs but aren't leaving potentially incorrect/unwanted changes on live pages? –dlthewave 13:20, 11 December 2022 (UTC)[reply]
I'm pretty sure it's not the best practice. XOR'easter (talk) 17:41, 11 December 2022 (UTC)[reply]
@Dlthewave:: See the section below for a list of edits (with full prompts included) on a separate demonstration page. I feel, however, that this is an unreasonable double standard: note that the subsequent revision after your partial revert was to add several spam links, and nobody has proposed that human beings be prohibited from editing as a result. jp×g 01:51, 12 December 2022 (UTC)[reply]
Being hard to enforce is no reason to not have a policy [against chatbots]. What if it is impossible to enforce?
The point of ChatGPT and other general-purpose chatbots is to pass as human. If you, or another random Wikipedia editor (solo, part-time, amateur coder), can produce an automated metric of "sounds like a bot" that's decently sensitive and specific, then the ChatGPT team or its successors (teams of researchers specialized in the topic) has already thought of it, tested it five different ways, and included it in the training program (via wikt:graduate student descent). TigraanClick here for my talk page ("private" contact) 10:55, 12 December 2022 (UTC)[reply]
It's kind of like our Undisclosed Paid Editing policy: Even though there's no way of testing for paid/unpaid edits, most editors follow it voluntarily because they know it's best for the project. Others out themselves voluntarily or are discovered when their edits become disruptive. Sure, there are some who slip under the radar, but they're often the least problematic and aren't worth ditching the policy over. –dlthewave 03:09, 13 December 2022 (UTC)[reply]
@Dlthewave: Undisclosed paid editors out themselves involuntarily often enough through pure dint of sounding like an advert; they're easy enough to spot. My worry is that though written content from a chatbot may sound like a Wikipedia article and go under the radar, the references – as other people in this discussion have noted – may simply be referencing sources that sound like they ought to exist, but don't. A human editor looking over their watchlist, who doesn't have enough time to go through added references and check they exist, may simply see "content content content[1] ([1] So-and-so, Believable Title: Buttstuff and nonsense in a holistic assessment, 2014. pp. 15-17)" and look no further. I know I at the very least would. I'm not sure how we'd spot this without human editors to do the work, and it worries me in the face of issues like lower editor retention and lower administrator numbers looking into the future.—Ineffablebookkeeper (talk) ({{ping}} me!) 19:09, 8 February 2023 (UTC)[reply]
I'd suggest starting by writing an essay that summarizes the issues with some good examples and suggests some best practices or proposes some additions to existing policies or guidelines. (Wikipedia needs a new policy like a hole in the head.) Levivich (talk) 04:40, 11 December 2022 (UTC)[reply]
We could get Chatbot to write it for us! 😉 Blueboar (talk) 11:51, 11 December 2022 (UTC)[reply]
It's hard to take this thread seriously given the repeated use of the phrase AI chatbot. I don't think those concerned would be any less concerned if the AI writing came in a non-chatbot format. I think there's something serious for us to discuss, and that will only get more serious with GPT4 (the current chatbot is an improved GPT3) expected in 2023, but the discussion would be helped if those most concerned learned some more about the tech behind it. For instance, of course it can figure out {{cite web}}, @JPxG. Part of its training was the entirety of Wikipedia because our data is quite accessible. Best, Barkeep49 (talk) 14:08, 11 December 2022 (UTC)[reply]
Most of the examples did not come from prompts that were extensively engineered, so it is obviously true that we haven't figured out the full answer to how these GPT-based interfaces could help or harm Wikipedia. Until we have a good idea of what they can be used for, we won't know what a proper policy would look like, other than to treat GPT-generated text the same way we treat human-generated text: it needs to be verifiable, from a neutral point of view, and understandable to a broad audience. 0xDeadbeef→∞ (talk to me) 14:31, 11 December 2022 (UTC)[reply]
Indeed. It doesn't matter if it was written by a chatbot, or 1000 monkeys at 1000 typewriters, or a published book written by a human, copying and pasting anything into Wikipedia is already against our policies. Conversely, if the text is policy-compliant, then it doesn't matter who wrote it--chatbot, monkeys, human, etc. Judge the text based on the text, not based on who or what wrote it.

I also think it's a real Wikipedian perspective to assume that people will use chatbots to write Wikipedia articles, like as if there's a lot of people out there who really want to write Wikipedia articles but just don't have the writing skills, so the chatbot will be what makes the difference and opens the floodgates :-D I don't believe that. Anyone who wants to write Wikipedia articles is already doing so; chatbot won't make a difference.

I agree with BK's comment above. I think for a lot of people, this is their first real exposure to so-called "AI" technology, and they're blown away by what it can do, only because they don't yet fully understand how it works. Once you learn how these so-called "AI" chatbots work (they're not actually artificial intelligence, btw, that's a misnomer, a marketing slogan; the machine does not truly think or learn, it is simply executing the instructions written by humans, in this case, language pattern recognition), they are much less impressive. Those who are impressed that GPT3 can produce text that "sounds like" Wikipedia aren't appreciating that this is because GPT3 was trained on Wikipedia: it's repackaging its own source material. Levivich (talk) 18:03, 11 December 2022 (UTC)[reply]
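For readers curious what "language pattern recognition" means in practice, the statistical core of the idea can be sketched with a toy bigram model. This is a drastic simplification for illustration only - real LLMs are neural networks predicting subword tokens, not word-pair lookup tables - but the underlying principle of predicting the next item from patterns in training text is the same:

```python
import random
from collections import defaultdict

# Toy bigram "language model": record which word follows which
# in a tiny training corpus, then generate text by sampling from
# those observed continuations.
corpus = "the cat sat on the mat and the dog sat on the rug".split()

follows = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    follows[a].append(b)

def generate(start: str, n: int, seed: int = 0) -> str:
    """Generate up to n continuation words from the bigram table."""
    rng = random.Random(seed)
    words = [start]
    for _ in range(n):
        options = follows.get(words[-1])
        if not options:  # dead end: no observed continuation
            break
        words.append(rng.choice(options))
    return " ".join(words)

print(generate("the", 5))
```

Every output is a rearrangement of patterns seen in the training data, which is the sense in which a model trained on Wikipedia "repackages its own source material".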

@Levivich: copying and pasting anything into Wikipedia is already against our policies.[dubiousdiscuss] I think that if you look through Category:Wikipedia articles by source of incorporated text for a while, you will find that this is not true. WhatamIdoing (talk) 19:28, 12 December 2022 (UTC)[reply]
While this is mostly correct, I think the question of whether a computer program "thinks" or "merely" correlates information and performs actions is irrelevant. Do p-zombies exist? Does it matter? Hypothetically, if I were to be a spaceman from the planet Zolfgar with no qualia whatsoever, and I simply read a bunch of books and used them to write an article, would I be somehow exempted from following policy? jp×g 01:45, 12 December 2022 (UTC)[reply]
I don't see a common thread in the arguments above, but here's a suggestion for something we might all (well, all-ish) be able to agree on: without some kind of intervention, GPT4 (in 2023?) is likely to be more of a problem than GPT3. But one thing we can certainly do is have an outsized influence on software that was trained on what we created ... if we invite Wikipedians to make lists of ChatGPT bloopers, we can tell the OpenAI folks: "We're not going to relax our GPT3 guidelines (whatever they turn out to be) when GPT4 arrives, unless it makes significant improvements in [whatever areas we think need improving]". - Dank (push to talk) 18:16, 11 December 2022 (UTC)[reply]
I think the only policy change needed is to update WP:MACHINETRANSLATION to cover all computer-generated text, whether from a translation bot, chat bot, or whatever bot they think of next. (Except our bots; our bots are cool.) Levivich (talk) 18:20, 11 December 2022 (UTC)[reply]
+1 - Text in Wikipedia articles should either be human-written, or generated by a process approved at BRFA. Tazerdadog (talk) 22:43, 11 December 2022 (UTC)[reply]
This proposal is incomprehensible; most articles contain a very large amount of text that is "generated by a process". I assume that, at the end of your comment, you typed ~~~~ before saving the page. Would it be realistic to demand that you either make a formal request at BRFA or else manually type <a href="/wiki/User:Tazerdadog" title="User:Tazerdadog">Tazerdadog</a> (<a href="/wiki/User_talk:Tazerdadog" title="User talk:Tazerdadog">talk</a>) 22:43, 11 December 2022 (UTC)? jp×g 01:22, 12 December 2022 (UTC)[reply]
This is effectively discrimination against computer programs. If a computer program resembles a human editor, then it shouldn't be required to meet different or more restricted policies than human editors. If a human editor uses a computer program to edit or create content, then unless the rate of edits/second is too high, we would only look at the quality of the contributions. 0xDeadbeef→∞ (talk to me) 02:35, 12 December 2022 (UTC)[reply]
There is a point beyond which quantity becomes its own quality.
Also, what if the computer program is evaluating the quality of the contributions? Are you okay with software adding a section to an article, and then a (hopefully) different piece of software deciding whether the quality is sufficient and reverting if it's not? This second step, at least, is 100% feasible with current technology. WhatamIdoing (talk) 19:32, 12 December 2022 (UTC)[reply]
Yes it can go there, but it should also be mentioned at WP:V. Every statement of fact put into an article must be verified by a human, even if the choice of words is made by a machine. Zerotalk 23:42, 11 December 2022 (UTC)[reply]
@Levivich Agree. I think our existing guidelines on machine translation, in spirit, fit this situation very well - "you can use it for a first draft, if you understand the material well enough to clean up the bits it inevitably will get wrong". It seems fine for turning shaky text into good prose, but it's not able to synthesise material and produce content unsupervised. Andrew Gray (talk) 19:09, 12 December 2022 (UTC)[reply]
I also agree the machine translation guideline is in the right spirit. I tried to follow this as far as I could when creating Artwork title, see Talk:Artwork title#Use of ChatGPT. Pharos (talk) 00:39, 26 December 2022 (UTC)[reply]

Is there some tremendous need to add many articles rapidly to Wikipedia? It is not as if Wikipedia carries exclusive information not easily found elsewhere. As a tertiary source, it is at the 3rd tier of knowledge dissemination, after primary creators and secondary propagators. The "more" and "bigger" quantity-based culture is the established low-quality alternative that Wikipedia also applies, now. That is possibly a reason why likely only a tiny minority (of the millions of existing articles) can really pass muster. If size and speed are to be the prevailing attitude, humans stand no chance against AI. It will do everything faster, and eventually better, assuming its programming evolves to correctly apply the existing policies in AI processes. The only advantage of humans will be subtle nuances that do not depend on classifiable knowledge but on having lived in a human society and a natural, not virtual, environment. Or, the emphasis could switch to quality so that each article (by any type of editor) can be properly, carefully reviewed by human editors. 65.88.88.93 (talk) 22:21, 12 December 2022 (UTC)[reply]

I don't think so: there isn't any evidence that people are writing a bunch of articles with LLMs, and I don't think it is likely for this to happen (LLMs are very poorly suited to writing articles from scratch). jp×g 00:59, 13 December 2022 (UTC)[reply]
@JPxG: There isn't evidence that people are writing a bunch of articles with LLMs -- yet. The concern is that we need to prepare for the likely explosion of chatbot use.

Whether this increase happens tomorrow or over the next few years, the potential impact of LLMs is of such magnitude that we should get ready for this, rather than get hit unprepared by a major surge.

I don't agree with your assessment of LLM ability to write content, as some of the ChatGPT experiments presented in the sections above and below are mind-blowing!
If LLMs become ubiquitous, then a great many people will be using them as a matter of course, including in their writing and editing of Wikipedia articles. Millions of people have edited Wikipedia in the past, and millions more will edit WP in the future. And in the future, people will have highly capable LLMs (chatbots, or more precisely: automated ghostwriters).
LLMs already excel at writing about a great many things, and they have the potential to compile content at an exponentially increasing rate. If you ask ChatGPT (GPT3.5) to write an essay on a topic, it will comply. Each of its essays can be used as content of an article, or its sections. (GPT4 is scheduled to come out in 2023, and will be even more capable.) LLMs are very well suited for writing to the specifications of the user, and are limited mainly by the user's creativity.
It's no wonder that they have gone viral. We need to take heed.    — The Transhumanist   12:50, 14 December 2022 (UTC)[reply]
I’ve ensured that the WMF Board will be made aware. Doug Weller talk 09:00, 18 December 2022 (UTC)[reply]
Hello, I have recently described my experiences with an AI "article" in this video. In my humble opinion, it would be difficult with certainty that new Wikipedia content was created by an AI. At the end of the day, it is always the editor's responsibility to add good content. Independently how the content was created, independently whether errors in the text are human-made or machine-made. If an editor adds a lot of new poor content, we can already stop that. - At the moment I don't see that we need a new policy. Ziko (talk) 18:06, 24 December 2022 (UTC)[reply]
A policy banning AI usage (with or without a chatbot) would be justified. Allowing AI like GPT3 or GPT4 to be used by Wikipedia editors or to directly become Wikipedia editors (via a mediawikibot) would quite likely violate WP:REFLOOP due to Wikipedia content contributing to the AI's training material, and for the source-less examples I've seen, violate WP:SYNTHESIS by not being a summary of sources that are understood. This example starts with text and then seeks references to justify the WP:SYNTHESIS of the original text. Use of Alphabet/Google's ChatGPT/GPT3 would also strengthen the bias introduced by Alphabet/Google's core goal of optimising advertising revenue, since Alphabet is legally bound to maximise its revenue (mainly from Google Ads + Google AdSense), not to optimise the research quality of its summaries of empirical evidence-based knowledge. Google's search engine is primarily a way of generating advertising revenue, with perceived usefulness being a key tool for maximising revenue, not a goal in itself. Boud (talk) 01:42, 31 December 2022 (UTC)[reply]
@Boud, ChatGPT and GPT3 are in no way (as far as I know) related to Google, and were made by the non-profit OpenAI. — Qwerfjkltalk 03:49, 31 December 2022 (UTC)[reply]
@Qwerfjkl: Fixed, thanks. I left some of the sentences unstruck since AFAIK they're valid, even though irrelevant in the current case. I imagine that Google may provide something similar soon though. Boud (talk) 09:02, 31 December 2022 (UTC)[reply]
Boud, I am somewhat worried if you think that current policy (for humans or for anyone else) permits editors to make stuff up and put it into articles without sources. This simply isn't allowed -- per WP:V, WP:SYNTH, WP:RS, etc, which are extremely important core policies of the project. I am struggling to imagine a circumstance in which existing policies, or explicit declarations like my proposed guideline at WP:LLM, fail to prevent people from writing nonsense. jp×g 16:48, 1 January 2023 (UTC)[reply]

Wikipedia policy generation

It seems to me that this tool's training includes studying Wikipedia's policy pages. These drafts all seem accurate to me.

These are not merely adequate - these are good. They are short and they lack detail but these are great overviews. If this is the starting point and things only get better from here, then it is time to start adopting this technology. Bluerasberry (talk) 19:35, 13 December 2022 (UTC)[reply]

@Bluerasberry The last one sounds like a "mission statement". I dislike phrases like "outreach and engagement initiatives" and a lot of that plan sounds ... kind of aspirational, and, well, vapid. It needs more "concreteness". Just my opinion. David10244 (talk) 06:08, 16 December 2022 (UTC)[reply]
@David10244: That you react to it at all is a miracle to me. This is new AI technology attempted for the first time, and I think no one would immediately dismiss it as hopeless garbage. Soon enough there will be a dial that anyone will be able to turn from "vapid" to "concrete". Things are moving quickly!
I have complaints too but when we need policy conversation starter in a hurry, this is better than nothing and I think even better than some of the starting points we use already. Bluerasberry (talk) 15:41, 16 December 2022 (UTC)[reply]

Large language models: capabilities and limitations

Over the last few hours, I have performed a number of experiments to demonstrate the capabilities and limitations of large language models (LLMs) such as ChatGPT and GPT-3, which can be viewed here:

Mostly, I have taken sample text from Special:Random, and attempted to show situations in which LLMs (in this case, mostly ChatGPT) are capable of making useful edits. The first task I set it to -- which bears repeating here -- is

"Large language model output should only be used in the process of editing Wikipedia if you are an intelligent editor who does not blindly paste LLM output into the edit window and press "save".
Please format this markup as an extremely obnoxious floating box with loud colors and large text.

You can see the results of further prompts at the "introduction" section.

Here is what I have so far.

In general, it seems that these models can be used for an extremely wide variety of tasks across the project, from formatting to table syntax to HTML generation to copyediting. Banning their use entirely would be pointlessly destructive and wasteful.

That said, many computer programs are capable of generating large amounts of useless crap that fail to meet Wikipedia's editorial standards. For example, I could use MS Paint to draw thousands of crude pictures of genitalia, and add them to random articles. For this reason, we have many policies and guidelines that prohibit adding large amounts of useless crap to Wikipedia. I propose that we enforce these policies and guidelines, thus preventing this from happening.

Specifically, I propose that the use of LLM output on Wikipedia be subjected to policies and guidelines such as WP:NOT, WP:NPOV, WP:C, WP:CIVIL, WP:V, and WP:RS. By making it against the rules to break the rules, we will prevent people from breaking the rules, and provide a mechanism to sanction people who break the rules.

Furthermore, I propose that a guideline be adopted to the effect that large language model output should only be used by competent editors who do not blindly paste LLM output into the edit window and press "save". This will prevent people from using ChatGPT to write long articles consisting entirely of nonsense. jp×g 01:32, 12 December 2022 (UTC)[reply]

LLM output is already subject to rules and policies. Or rather, anyone adding it is. 'An algorithm did it' has never, as far as I'm aware, been seen as any sort of exception from compliance with policy. AndyTheGrump (talk) 01:49, 12 December 2022 (UTC)[reply]
Any policy/guideline that classifies editors as intelligent or not is dead in the water. Zerotalk 04:19, 12 December 2022 (UTC)[reply]
Amended, per WP:CIR. jp×g 05:04, 12 December 2022 (UTC)[reply]
  • The opening paragraph of the bot policy: "The bot policy covers the operation of all bots and automated scripts used to provide automation of Wikipedia edits, whether completely automated, higher speed, or simply assisting human editors in their own work."
  • See also: WP:BOTUSE, which requires approval before applying a bot to editing.
  • So, the use of large language models and the chatbots built upon them is already prohibited on English Wikipedia, unless a user gets approval from the bot department to do so.

There are blanket exceptions to bot policy, and the main one that comes to mind is AutoWikiBrowser which is a general purpose semi-automated bot used by many Wikipedia editors. Each AWB user was approved before being able to use it.    — The Transhumanist   08:01, 13 December 2022 (UTC)[reply]

The meaning of "bot" may be unclear here. In the context of Wikipedia (per Wikipedia:Bot_policy#Definitions), a "bot" is a software program that edits autonomously without user input; there do not currently exist any language models capable of independently establishing API connections to Wikipedia and making edits without human interaction. If they did (this is a horrible idea) it would be covered under the bot policy and require a WP:BRFA. The policy under which BRFAs are required does not apply to assisted editing (i.e. the use of software to create letters, numbers and symbols that were not produced by a human being pressing a keyboard). This is governed by existing policies (such as WP:MEATBOT) and by the guideline at WP:ASSISTED. jp×g 09:28, 13 December 2022 (UTC)[reply]
@JPxG: The entire news field refers to ChatGPT as a chatbot. It is general consensus that it is a bot. ChatGPT speeds up writing, by doing it for (that is, assisting) the user, which falls under the "higher speed" and "assisting human editors" foci of the bot policy. There is a passage in the bot policy that covers policy contradictions (such as between the lead and definitions sections), and situations where the spirit of the rule and its precise wording conflict, that is, cases of ambiguity. In its definition of "Bot Approvals Group" (BAG), the bot policy states: "The BAG also determine the classification as bot or assisted editing, in ambiguous cases." According to WP:ASSISTED, it is up to the Bot Approvals Group to decide whether bot approval is necessary. Based on the previous 2 sentences, BAG decides whether use of particular software falls under its jurisdiction. It remains to be seen what BAG's reaction(s) to LLMs, and the chatbots built upon them, will be.    — The Transhumanist   11:10, 14 December 2022 (UTC)[reply]
I do not think you are properly acquainted with how this software works: like I said, there do not currently exist any language models capable of independently establishing API connections to Wikipedia and making edits without human interaction. No media outlet has ever claimed that ChatGPT falls under the English Wikipedia's definition of an automatic bot – and even if they did, they do not determine policy. It is true that WP:MEATBOT and WP:ASSISTED are part of the bot policy, but there is a very clear definition of what a "Wikipedia bot" is, and it's defined by that same policy. At any rate, all edits (whether made by bots, software, humans using software, aliens using software, or Nagato Yuki psionically connecting to Wikimedia servers) are governed by existing policies and guidelines. To specifically address LLM output, a new policy would need to be written and ratified (which I am currently drafting a proposal for). jp×g 11:26, 14 December 2022 (UTC)[reply]
@JPxG: I believe the bot policy has wider jurisdiction than the narrow interpretation that you have presented. Establishing API connections is irrelevant, because a human is inserting bot-generated content. It's a bot-involved process. And those are encompassed by the bot policy which makes it up to BAG. A new policy could establish an exception, and I imagine the discussions will be extensive, as this is not a cut and dried case -- it is a sensitive issue with many potential ramifications. But, until such a policy is in place, this issue falls under BAG's jurisdiction, since they are the ones who decide the classification of a software program as it pertains to the bot policy.    — The Transhumanist   11:52, 14 December 2022 (UTC)[reply]
(edit conflict) WP:ASSISTED is deliberately vague to not restrict use of common scripts and tools. So it specifically says that only once volume becomes significant, such editing becomes more likely to be treated like a bot and BAG can determine this. It doesn't make it a bot, but it will be treated like a bot. We've never encountered any large-scale edits with LLM before, but we sure have seen a lot of high-volume editing. Half the bot policy only exists because of all the ways editors have inadvertently created issues with mass edits. So at that point, other parts of the policy start to matter, notably WP:CONTEXTBOT - which does not allow edits where context matters. I'm not saying copy-pasting LLM output is immediately covered by bot policy, nor does it matter whether anyone considers LLM to be a "bot". But bot policy will kick in once someone starts to make a lot of edits. And any new guideline will have to reconcile with this or we need to change bot policy to reconcile with LLMs. —  HELLKNOWZ  TALK 12:02, 14 December 2022 (UTC)[reply]

@JPxG: Another possible approach for vetting users for use of LLMs is via user group membership (aka "rights"). Here are our current user groups:

Code User group
AC Account creator
Ad Administrator
AP Autopatrolled
B Bureaucrat
Ch CheckUser
Co Confirmed
ECo Extended confirmed
EFH Edit filter helper
EFM Edit filter manager
EM Extended mover
EvCo Event coordinator
F File mover
IM Import
IAd Interface administrator
IP IPblock-exempt
MM Mass message senders
N New page reviewer
O Oversighter
Ro Rollbacker
Rs Researcher
Rv Pending changes reviewer
TE Template editor

These indicate membership in user groups (see: user access-levels). They pertain to who is granted access to various features of MediaWiki and its extensions. Theoretically, a user group could be created without being attached to a program function (that part could just be left blank?). For example, you could have a group called "LLM", with everyone in that group approved to use large language models in their editing. I don't know if this is doable, though.    — The Transhumanist   08:01, 13 December 2022 (UTC)[reply]

I do not think there is anything in our P&G that would directly prohibit use of content created by LLMs, nor do I think it would be a good idea to try to do so. All that is needed is to continue to hold individual editors responsible for all edits they make, including the copying of content from any source, whether from LLMs or other sources. We probably should add language in appropriate places reiterating that editors are responsible for ensuring that all content that they add, including anything produced by an LLM, meets our P&G. - Donald Albury 13:12, 13 December 2022 (UTC)[reply]
@Donald Albury: LLMs automate writing (edits). The rules are very clear on this: it falls under WP's bot policy, in the very first sentence.[1]   Therefore, it would require a new policy to allow use of LLMs without need for approval from the Bot Approvals Group (BAG).    — The Transhumanist   09:24, 14 December 2022 (UTC)[reply]
If someone uses an unapproved script or bot to edit WP, that is a violation of the bot policy, whether or not they use an LLM to generate any content being added. If someone uses an LLM to create text which they then copy into Wikipedia without using an unapproved script or bot, that is not covered by the bot policy, but the user remains responsible for ensuring that the content conforms with policy and guidelines. There is no point in banning content created by LLMs, as we already require that content be verifiable from reliable sources, and I doubt we will be accepting any content created by an LLM as a reliable source anytime soon. The danger is that LLMs may create potential content with citations to pseudo-sources, but we can go after users repeatedly adding such content to WP for abusing the policies on verifiability and reliable sources, without regard to whether such content came from an LLM. Donald Albury 13:48, 14 December 2022 (UTC)[reply]
I don't think it's plausible that LLMs are covered by the bot policy. If they were, grammar checkers, spell checkers, and machine translation would be "bots". Jahaza (talk) 19:49, 14 December 2022 (UTC)[reply]
@The Transhumanist: ChatGPT falls under Wikipedia:Bot policy, but per the definitions section it does not fall within that policy's definition of a bot. Rather, use of it would fall under the definition of "assisted or semi-automated editing", and the relevant policy section is Wikipedia:Bot policy#Assisted editing guidelines. The section doesn't aim to draw a 100% hard line, but my reading is that limited use of ChatGPT for clean-up on a limited number of articles by a user in a limited closely-supervised way may be something users can do if they are trusted to apply their common sense. It is "Contributors intending to make a large number of assisted edits" who "are advised to first ensure that there is a clear consensus that such edits are desired." Limited use of ChatGPT to a lesser degree than would trigger this may currently be outside policy. In any event "A bot account should not be used for assisted editing".
It seems to me that an addition to the policy along the lines suggested by User:JPxG to address this potential hole might well be useful, eg "tools capable of assisting editors make substantial edits (for example large language model output) should only be used by competent editors who do not blindly paste tool output into the edit window and press "save"." Jheald (talk) 19:36, 18 December 2022 (UTC)[reply]
In what way does the bot policy cover ChatGPT? Just because it is called a "chatbot", doesn't mean it is a bot. Copying text from GPT-3 doesn't automatically become bot-like editing. Semi-automated edits? I'd call that borderline. It only becomes a problem (e.g. meatbot problems) if the amount of supervision needed to save an edit is below normal editing, and the speed of the edits is above normal. (see awb, huggle, etc) 0xDeadbeef→∞ (talk to me) 10:20, 20 December 2022 (UTC)[reply]
The thing is, any LLM additions will inevitably be both faster than writing manually and, due to its confidently-wrong output, less reviewed. Otherwise, why would anyone bother with it? I feel that assuming that editors will spend just as much time to carefully review the LLM output is wishful thinking. I'd like to be proven wrong, but I have never seen any precedent on Wikipedia that better tools would lead editors to spend the time saved to further verify the tool output. If anything, tools only create induced demand. —  HELLKNOWZ  TALK 21:18, 20 December 2022 (UTC)[reply]
I don't think we need to do anything in particular. There is plenty of confidently-wrong content being added to Wikipedia by human editors already and we're dealing with that as well as we can. I think the intersection of 'Editors who will use a cutting-edge AI to generate content' and 'Editors who will do this without validating the output' is a very small overlap and will be of such small volume as to be picked up by other editors as usual. A huge influx will be detected in aggregate, and we can deal with that if it becomes a problem in the future. If someone uses LLM to generate confidently-right content or articles, that's indistinguishable from content generated by a competent human, I refer you to xkcd: Constructive! A simple but unobtrusive first step may be to tag an edit as 'generated by AI', or maybe just ask editors to add a tag to their user pages if they regularly do so, but the intersection of problematic users who also follow this would be basically non-existent. JeffUK (talk) 10:40, 24 December 2022 (UTC)[reply]
@JeffUK, Hellknowz, 0xDeadbeef, Jheald, Jahaza, Donald Albury, JPxG, and AndyTheGrump:
So, wait until after it becomes a huge influx/problem, and only start to deal with it then? What if a solution takes weeks or months to develop?
By the way, what might the solution be for a huge influx of LLM-generated content, and how long would such a fix likely take?    — The Transhumanist   11:32, 2 January 2023 (UTC)[reply]
I am having trouble understanding what you are talking about at this point. I wrote WP:LLM some weeks ago, a gigantic proposal for a comprehensive guideline on the use of LLMs, and linked it multiple times on this noticeboard. While it is not complete, it seems to me like it covers everything you are talking about here. Do you have an opinion on it at all, or...? jp×g 15:39, 2 January 2023 (UTC)[reply]
@JPxG: I was responding to JeffUK's statement "A huge influx will be detected in aggregate, and we can deal with that if it becomes a problem in the future." Intently waiting until something becomes a huge problem before you start dealing with it, sounds like a disaster waiting to happen. Also, what good are guidelines going to do if the average person is using chatbots on a regular basis? People just jump in and edit Wikipedia without reading any project-level pages first. If there's a huge influx, and all you are doing is holding up a sign that says "Read this", what good will that do? You haven't addressed how the problems associated with a potential huge amount of chatbot input (in the form of one-off edits from a large number of people) would be prevented or processed. One solution is to fix the chatbots themselves, so that they don't generate Wikipedia-incompatible content in the first place, which would require working with the developers. A second method would be to create bots to detect and remove either chatbot-generated content, or if possible, policy-breaking content. Simply writing policy and hoping no flood comes, just doesn't seem like a viable approach should a flood hit. That approach may work for the first 3 or 4 years, but what if the flood comes in the 5th year and Wikipedia isn't prepared? We will have wasted 5 years that could have been spent preparing. Maybe we'll be lucky and chatbots will be smart enough to read and follow your guidelines. But if they are not? Fortunately, Doug Weller has passed word along to the Wikimedia Foundation. Maybe they will do something other than write editing guidelines.    — The Transhumanist   03:09, 10 January 2023 (UTC)[reply]
@JPxG although I haven't read more than the beginning, I'm also worried about AIs creating images. For instance I've seen some extremely convincing ones of fake archaeological sites and artefacts. Couldn't people pass them off as their own photos? Or am I missing something? Doug Weller talk 17:06, 2 January 2023 (UTC)[reply]
Yeah, that's a whole other deal. This proposal is only for large language models. Large image models will probably need to be governed by something much more imaginative. jp×g 17:11, 2 January 2023 (UTC)[reply]
If someone's editing is inappropriate, the solution will be notifying them it's inappropriate, warning them, then banning them if they don't stop. There are ways for incompetent editors to make massive plausible seeming changes to the encyclopaedia right now. e.g. by copy/pasting content from other places, or just writing in made up 'facts', LLM really won't make this any easier for someone who's intent on doing this. JeffUK 18:09, 2 January 2023 (UTC)[reply]
I'm kind of wondering what in the heck we're standing to gain by creating any sort of policy surrounding ChatGPT and its ilk. 🌈WaltCip-(talk) 13:24, 9 January 2023 (UTC)[reply]
If these AIs are used in some way for a large scale creation of articles, I think that will be a disincentive for a lot of editors and may drive some away. I disagree with JeffUK on the simplicity of dealing with this. First, you need to be able to spot them and that's work. Secondly, that also assumes that the numbers will be small. Doug Weller talk 15:48, 9 January 2023 (UTC)[reply]
@WaltCip: None. Most editors don't read policy before editing. That's because most editors post very infrequently. But there are a lot of them, and they have authored most of Wikipedia. What happens when they are all using chatbots, much in the way that most everyone today uses Google?    — The Transhumanist   03:34, 10 January 2023 (UTC)[reply]
"Someone." Apparently, you are assuming it will be one person, or a small enough number to be handled manually. But, what if over the next few years chatbots become ubiquitous with almost everybody using them? How will you deal with it when half the content contributions to Wikipedia are being generated using chatbots?    — The Transhumanist   03:26, 10 January 2023 (UTC)[reply]

Develop software to detect it?

Someone seems to have done this, see A college student created an app that can tell whether AI wrote an essay Maybe the WMF should look into software detection of AI material? Doug Weller talk 14:45, 10 January 2023 (UTC)[reply]

We have mw:ORES that uses machine learning to detect vandalism, so the infrastructure is already in place. All we need to do now is to add the dataset. CactiStaccingCrane (talk) 16:06, 12 January 2023 (UTC)[reply]
(ORES is used for these "likely have problems" and "likely bad faith" highlights in Special:RecentChanges) CactiStaccingCrane (talk) 16:08, 12 January 2023 (UTC)[reply]
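For readers unfamiliar with how ORES is consumed, here is a minimal sketch of scoring a revision with its "damaging" model, the same signal behind the Recent Changes highlights. It assumes the shape of the public ORES v3 REST endpoint; the wiki name, revision ID, and probability values are illustrative, and the helper functions are hypothetical rather than part of any real client library.

```python
# Sketch: building an ORES v3 scoring request and reading back the
# "damaging" probability for a revision. Values below are illustrative.
import json
from urllib.parse import urlencode

ORES_BASE = "https://ores.wikimedia.org/v3/scores"

def build_score_url(wiki, rev_ids, models=("damaging", "goodfaith")):
    """Build an ORES v3 scoring URL for the given revisions and models."""
    query = urlencode({"models": "|".join(models),
                       "revids": "|".join(str(r) for r in rev_ids)})
    return f"{ORES_BASE}/{wiki}/?{query}"

def extract_damaging_probability(response_text, wiki, rev_id):
    """Pull the 'damaging: true' probability out of an ORES JSON response."""
    data = json.loads(response_text)
    score = data[wiki]["scores"][str(rev_id)]["damaging"]["score"]
    return score["probability"]["true"]

# A response fragment in the shape ORES returns (numbers are made up):
sample = json.dumps({
    "enwiki": {"scores": {"12345": {"damaging": {
        "score": {"prediction": False,
                  "probability": {"false": 0.93, "true": 0.07}}}}}}
})
url = build_score_url("enwiki", [12345])
prob = extract_damaging_probability(sample, "enwiki", 12345)
```

Detecting LLM output would mean training and deploying a new model behind the same kind of endpoint; the request/response plumbing above would not change.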
Even if ORES is up to the task, and it isn’t perfect now, you still need enough editors to deal with large numbers. Doug Weller talk 18:21, 12 January 2023 (UTC)[reply]
@Doug Weller, or a bot. — Qwerfjkltalk 21:02, 12 January 2023 (UTC)[reply]
@Qwerfjkl: What would the bot do?    — The Transhumanist   22:38, 12 January 2023 (UTC)[reply]
@The Transhumanist, revert additions and/or tag articles. — Qwerfjkltalk 07:03, 13 January 2023 (UTC)[reply]
Personally, I'd prefer that if this is implemented, ClueBot tag before a full-on reversion, except in certain cases. I think that ClueBot should use the same exceptions for 3RR for reverting possible bot stuff, though this might require that chatbots be banned from BLP-covered articles. Everything else which could be problematic but not a 3RR exception could be placed at a specific page...maybe have a specific preset filter on recent changes? InvadingInvader (userpage, talk) 07:43, 6 February 2023 (UTC)[reply]
@CactiStaccingCrane, Doug Weller, and Qwerfjkl: All we need to do is add what data set? You make it sound easy (keeping fingers crossed). What does that entail?    — The Transhumanist   22:38, 12 January 2023 (UTC)[reply]
OpenAI have announced they are adding in some kind of lexical watermark that can be used to identify any output from ChatGPT. scope_creepTalk 13:08, 16 January 2023 (UTC)[reply]
@Scope creep, currently the "Overall," beginning the concluding paragraph is watermark enough. — Qwerfjkltalk 20:36, 17 January 2023 (UTC)[reply]
For example, see the edit linked in this comment. — Qwerfjkltalk 20:42, 17 January 2023 (UTC)[reply]

Other inherent problems only partially touched on

Other inherent problems only partially touched on:

  • Editing articles involves also understanding what is already in the article and how it is organized plus understanding and interpreting policies and guidelines.
  • What's unspoken but runs through many things including current Wikipedia is sort of a commensurate investment. You can get volunteers to take their time to review and deal with issues because they know they are dealing with something that an editor has invested time in to create. Part of the reason that we don't allow mass creation of articles by bots. In other words, we'd significantly lose volunteer efforts.
  • Modern AI systems are inherently unaccountable black boxes. There is no way to see or interrogate or demand/receive accountability or reasoning for how they arrived at what they arrived at.
  • If gibberish or semi-gibberish is created, it normally requires an expert to spot and remove it... a very scarce resource. I once uncovered a set of technical-subject articles (about 100 articles as I recall) which looked very technical and Wikipedian and were sourced, but if you knew the subject you knew they were pure gibberish.

Sincerely, North8000 (talk) 22:23, 10 January 2023 (UTC)[reply]

Thanks. I agree entirely. Doug Weller talk 09:04, 11 January 2023 (UTC)[reply]
Well said. I think that folks are overestimating the ability of our review processes to detect "vaguely plausible bullshit" - it's not very common for human editors to fill in blanks with made-up facts and numbers, and I'm not sure that AfC or NPP are checking for this as it would greatly increase their workload. –dlthewave 19:42, 11 January 2023 (UTC)[reply]
[I]t's not very common for human editors to fill in blanks with made-up facts and numbers. Maybe not when adding content, but I see this happen all too often in edits to temperature tables in climate sections. Of course, the tell there is changing temperatures without citing a source or commenting about correcting from a cited source. - Donald Albury 20:17, 11 January 2023 (UTC)[reply]
As I recall, the big one that I caught looked like some type of expose project or research project to see if such a scam could get far in Wikipedia. It was sort of in mashup of words from actual sources. Total nonsense, but a typical reader might think it was simply over their head. North8000 (talk) 21:25, 12 January 2023 (UTC)[reply]
@North8000, Doug Weller, Dlthewave, and Donald Albury:
In answer to the 3rd point above (the black box issue), Perplexity.ai, an AI search engine with a chatbot interface, provides source references with its answers. That is, the references are the search results, while the answer provided is compiled or interpreted from those web pages. So, at least the sources can be checked for verification. But, there are still problems with it. See the perplexity.ai section below.    — The Transhumanist   19:56, 21 January 2023 (UTC)[reply]

Summary of discussion so far

@Aquillion, Andrew Gray, Fram, Levivich, Ritchie333, 0xDeadbeef, ONUnicorn, JPxG, EpicPupper, Sojourner in the earth, Dlthewave, Doug Weller, Qwerfjkl, CactiStaccingCrane, WaltCip, JeffUK, Hellknowz, Zero0000, AndyTheGrump, Bluerasberry, David10244, Boud, Ziko, Pharos, Andrew Gray, WhatamIdoing, Tazerdadog, Barkeep49, Tigraan, Blueboar, MJL, PerfectSoundWhatever, Koziarke, SmallJarsWithGreenLabels, Isaacl, Lee Vilenski, Thebiguglyalien, Hanif Al Husaini, and Xeno:

Highlights of the discussion so far:

  • Chat-GPT is taking the world by storm (translation: it has gone viral).
  • Chat-GPT, and other LLM-based chatbots, can generate compositions, some good enough to pass as college-level essays.
  • Wikipedia is included in the corpus (training data) of Chat-GPT (and other chatbots).
  • Such software has the potential to be used for:
    • Generating Wikipedia content, including writing new articles and adding new material to existing articles.
    • Generating Wikipedia policy content.
    • Generating discussion content, such as on policy talk pages. That is, editors using it to write their discussion replies for them.
    • Editing articles, including rewrites, and using chatbots as a grammar checker.
    • Editing other namespace pages, such as policy pages, etc.
    • "Can be used for an extremely wide variety of tasks across the project, from formatting to table syntax to HTML generation to copyediting." (quoting JPxG)
    • Creating hoaxes with less effort.
  • Most Chat-GPT output lacks citations.
  • Some experiments were run, showing that Chat-GPT:
    • Copies writing styles very well.
    • Has a tendency to make things up, yet presents it as fact in an encyclopedic tone. One editor dubbed this "confident nonsense". In one experiment, Chat-GPT created an article reporting that Wikipedia's own Signpost newsletter was the recipient of several Pulitzer Prizes.
    • Can include references, but some of the references were made up and totally fictitious.
    • Some references cited Wikipedia (an ineligible source for Wikipedia articles).
    • One of the experiments generated instructional content, a recipe, that the user followed, and ate the results of.
    • Another experiment used Chat-GPT to answer hypothetical questions in the style of WP's teahouse department. It worked fairly well.
    • Yet another experiment created a sample policy page, showing that chatbots are not limited to editing articles. They can generate or edit pretty much any type of page on Wikipedia, except files (images).
    • Chat-GPT output is not fact-checked.
    • Chat bots don't actually understand what they are writing.
    • When used responsibly as a tool, with editors carefully prompting the chatbot, and editing and fact checking its output before posting it to Wikipedia, a chatbot can be very useful and increase editor productivity: the LLM GPT-3 was successfully used to create department reports for Wikipedia's newsletter, The Signpost.
    • JPxG conducted an experiment/demonstration to show that Chat-GPT is a sophisticated interactive editing tool, which you tell it what you want it to do to a textual work, and then it does it. See it here: User:JPxG/LLM demonstration.
  • It was pointed out that Wikipedia policy already covers all contributions, whether generated by chatbot or human. Ultimately, the user is responsible for material they copy and paste into Wikipedia.
  • Issues of concern that were raised include:
    • Users copying chatbot-generated text into Wikipedia without carefully editing and fact-checking it first.
    • Confident nonsense (misinformation generated by chatbot) may be hard to spot.
    • The potential of chatbots to violate copyright, by directly copying, or generating text based on, copyrighted works.
    • Violating Wikipedia's licenses, most notably the attribution requirements. Chat-GPT output generally does not include attributions.
    • A chatbot-edited Wikipedia could wind up in the training data for those same chatbots (or their next versions), creating a potentially error-compounding feedback loop.
    • The suggestion was made to prepare for a potentially large future increase in chatbot entries to Wikipedia, by:
      • Working with chatbot developers to make chatbot-generated output Wikipedia compatible.
      • Develop bots to identify and process chatbot entries.
  • No consensus has emerged on what the Wikipedia community should do about LLMs/chatbots. Some editors think that policies/guidelines and the current editor pool could handle any influx of chatbot generated edits. Some other users were concerned that there is potential for LLM/chatbot contributions, such as one-off edits by members of the general population, to overwhelm our pool of editors. One user pointed out that it may take experts to discern nonsense articles, and experts on Wikipedia are a scarce resource.
  • Consensus did emerge on something not to do. It was agreed that banning chatbot-generated content was not a good idea at this time, and probably wouldn't work anyways.
  • Software has been developed to identify Chat-GPT-generated text.
  • It appears some editors may take the initiative to prepare for a worst-case scenario (chatbot input going beyond our editor pool's ability to handle), and discussion on how to do this has begun.
    • WP:ORES could theoretically be trained to identify chatbot edits.
  • The Wikimedia Foundation has been contacted about the concern over LLMs/chatbots, presenting a contact there with a link to this and a previous discussion.

Did I miss anything?    — The Transhumanist   01:22, 13 January 2023 (UTC)[reply]

AI-generated images are rapidly becoming a Big Thing, so it is not correct to exclude them. Also, "Wikipedia policy already covers all contributions, whether generated by chatbot or human" is misleading as it is true only by accident. A more precise description would be "Wikipedia policy was written without any consideration of chatbots". Zerotalk 03:28, 13 January 2023 (UTC)[reply]
I'm concerned about "Such software has the potential to be used for: creating content farms that good-faith human editors, including existing experienced editors, will sometimes mistake for reliable sources when they are writing content".
Also, the statement that "Software has been developed to identify Chat-GPT-generated text" is true, but not relevant for very short contributions. Some of this software relies on statistics such as sentence length, and you won't be able to identify an abnormal sentence length if you only look at two or three sentences. WhatamIdoing (talk) 05:33, 13 January 2023 (UTC)[reply]
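The sample-size problem can be made concrete with a toy simulation: any statistic (mean sentence length here) estimated from a two- or three-sentence edit is several times noisier than the same statistic estimated from a thirty-sentence article. This is a minimal sketch with made-up numbers, not a real detector:

```python
# Toy simulation (invented numbers, not a real detector): how noisy is a
# sentence-length statistic when estimated from only a few sentences?
import random
import statistics

random.seed(42)

# Pretend a writer's sentence lengths (in words) follow this distribution.
population = [random.gauss(20, 6) for _ in range(10_000)]
true_mean = statistics.mean(population)

def mean_estimation_error(sample_size: int, trials: int = 1_000) -> float:
    """Average absolute error when the mean sentence length is
    estimated from `sample_size` sentences."""
    errors = []
    for _ in range(trials):
        sample = random.sample(population, sample_size)
        errors.append(abs(statistics.mean(sample) - true_mean))
    return statistics.mean(errors)

err_short = mean_estimation_error(3)   # a two-or-three-sentence edit
err_long = mean_estimation_error(30)   # a full-length article
print(err_short, err_long)
```

The estimation error shrinks roughly with the square root of the number of sentences, which is why short contributions defeat length-based detectors.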
Anyway, detection methods that work now won't work in the next generation. Eventually (and not far in the future) distinguishing between human-written and computer-written prose will be impossible for practical purposes. This is going to be the greatest threat to Wikipedia since its founding. Zerotalk 05:51, 13 January 2023 (UTC)[reply]
@Zero0000: When do you suppose the impossible-to-distinguish scenario will be here? Two years? Less?    — The Transhumanist   13:42, 25 January 2023 (UTC)[reply]
I have started Category:Wikipedia essays about artificial intelligence, Perhaps folks here would like to add to the collection, and document yet more thoroughly! Pharos (talk) 01:07, 14 January 2023 (UTC)[reply]
Adjacent to hoaxes there's also the likelihood of spammers using GPT to bulk out their edits. I strongly suspect that the text of this edit today, writing repetitively about a static sculpture as if it were a functional scientific instrument, was generated with GPT-3, probably giving it a prompt to explain Orbital Reflector in terms of dark matter and black holes, the subject of the two embedded spam links. Belbury (talk) 11:39, 17 January 2023 (UTC)[reply]
I just ran that through the GPT-2 output detector and it estimated a 99.97% chance that that passage was AI-generated. — rsjaffe 🗣️ 03:31, 24 January 2023 (UTC)[reply]

Chat-GPT spreading fast

The situation is changing rapidly:

Chat-GPT may become ubiquitous sooner than previously thought, and so far, identification methods have fallen flat...

Here's some recent news:

  1. ChatGPT Will Be Everywhere in 2023 (CNET)
  2. Microsoft is reportedly integrating ChatGPT's technology into Bing (Yahoo)
  3. Microsoft is looking at OpenAI’s GPT for Word, Outlook, and PowerPoint (The Verge)
  4. There's a Problem With That App That Detects GPT-Written Text: It's Not Very Accurate (Futurism.com)

With the user base for Chat-GPT about to explode, the potential for Chat-GPT-generated text being added to Wikipedia will explode right along with it. It is uncertain whether Wikipedia's editor community will be able to keep up with the influx. In light of recent events, what should be done about this?    — The Transhumanist   03:21, 13 January 2023 (UTC)[reply]

As well as being able to write plausible-looking prose on any subject, computers can also be programmed to add it to Wikipedia all by themselves. The first task is to absolutely ban computers from editing, with the sole exception of authorized bots. The second task is to add to appropriate policy pages that all content (authorized bots excepted) must be added by a human and that that human is responsible for checking policy conformance of the content. Zerotalk 08:06, 13 January 2023 (UTC)[reply]
I couldn’t agree more. Does anyone have objections? Doug Weller talk 10:45, 14 January 2023 (UTC)[reply]
I was hoping to get more feedback on WP:LLM from having posted it here, but either way, I think it is pretty close to ready for consideration as a guideline (or policy, as appropriate)... based on the conversations I've had (and seen) I am prepared to write an RfC for its adoption. jp×g 11:00, 14 January 2023 (UTC)[reply]
@JPxG: The guideline is not ready. It is nowhere near complete, and it needs a rewrite. Here are some proofreading notes:
It's way too redundant, repeating policies and itself, without explaining how to get the job done. Aside from the "fit for" sections, the rest of the page can be reduced to a single paragraph.
It presents -- should only be used by competent editors who do not indiscriminately paste LLM output into the edit window and press "save" -- four times! Someone who is incompetent isn't going to be able to judge whether or not they are. Also, "indiscriminately" is vague. That entire sentence should be removed.
Editors need to know what they need to do to the text before they can press "save". For example, you alluded to a manner of using LLMs in compliance with WP copyright policy, but you didn't explain how. How can an editor be sure that an LLM-generated piece doesn't violate someone's copyrights? What's the procedure?
Rather than covering "good fit" and "not good fit", the guideline should present explicit instructions: "Use it for this" and "Do not use it for this". And then explain how.
I hope you find these observations and comments helpful. Sincerely,    — The Transhumanist   08:04, 16 January 2023 (UTC)[reply]
I agree that the page is not finished, but I don't really know what you are objecting to here. It kind of sounds like you are inventing problems – if users don't know how to check if things are true before putting them into Wikipedia articles, they shouldn't be editing at all. If users don't understand what copyrighted material is, they need to read Wikipedia:Copyright policy, which is linked to from this page when it's mentioned. That is an explanation of how to get the job done. It should not be necessary to create an exact copy of Wikipedia:Verifiability that says "When using a LLM," at the beginning of every sentence. jp×g 08:29, 16 January 2023 (UTC)[reply]
@JPxG: How can users understand what the copyrights of Chat-GPT's output are? Chat-GPT doesn't provide sources, nor does it report if it copied or derived the passage from a particular work. So, how do you go about checking whether or not a particular Chat-GPT response is in violation of copyright, so that "pasting its output into the edit window and pressing 'save'" is not considered "indiscriminate"? Also, it isn't clear who owns the copyrights to the output of an LLM: the public domain, the owner of the LLM, the user of the LLM, or the owners of the copyrights of the works included in the training data set? The breadth of this problem is discussed in #Copyright status below.    — The Transhumanist   00:08, 19 January 2023 (UTC)[reply]
There already exist a very large number of policies about copyrighted text. Editors are subject to these policies. These policies contain information on how to avoid copyright violations. If you asked GPT-3 to tell you the lyrics to Moonage Daydream, they would be copyrighted. If you found the same lyrics by typing "moonage daydream lyrics" into Google, they would be copyrighted. What is the difference? Policies do not (and cannot) cover every hypothetical person and situation to which they could be applicable: we do not have a separate WP:COPYRIGHT for old editors, WP:COPYRIGHT for young editors, WP:COPYRIGHT for male editors, or WP:COPYRIGHT for female editors. WP:COPYRIGHT applies to all editors regardless of their age, race, gender, or whether they are human or machine. I don't know how to explain this in further detail. jp×g 01:06, 19 January 2023 (UTC)[reply]
@JPxG:
You've missed the points entirely (there were two, and you only replied to one).
Those policies you mentioned do not deal with the copyright problems presented by black box chatbots, nor do they warn about the dangers of pasting in chatbot output.
Search engine search results are excerpts from web pages that the search results identify — which facilitates verification. Chat-GPT and other black box chatbots answer questions in natural language, without telling the asker of the question where the information came from — which does not facilitate verification — while presenting it in a very confident and scholarly tone.
This may result in a great deal of misinformation being posted to Wikipedia, where it will sit until somebody else removes it. The delay between those two events can be lengthy, especially for material that seems plausible. So, it might be a good idea to provide guidance specific to chatbot usage pertaining to copyrights -- at least some caveats on which chatbots to avoid.
Another problem is that we don't know where the training data came from. There could be deep web data in there as well. That can't be easily accessed to check for plagiarism. So, is it a good idea to use black-box chatbots? There are transparent surface-web chatbots that include references for verification, so maybe we should recommend that the black-box ones be avoided.
Now, for the second issue (the one that you skipped): WP policies do not cover prompting a chatbot to write material. Who owns the copyright to material written by a chatbot? The user? That has not yet been established! What stance is going to be taken by Wikipedia, and what guidance are we going to provide on this issue?    — The Transhumanist   09:17, 25 January 2023 (UTC)[reply]
I feel like either you are not reading what I'm saying, or we have some kind of insurmountable disagreement about what letters and words are. jp×g 09:26, 25 January 2023 (UTC)[reply]
@JPxG:
You've just fed me a variation of "you're not listening", with a little barb attached to the end. Really? That's who you are? I'm disappointed.
I read what you wrote, and I highly disagree with what you are saying...
You are saying that current copyright policy is enough: it prohibits copyrighted works from being posted to Wikipedia without the permission of the copyright holder, and that it is up to the editor to make sure that the material does not violate anyone's copyrights or Wikipedia's copyright policies.
My positions are...
1) that black box chatbots pose the danger of luring editors into violating copyright policy, that we may be faced with a deluge of copyright-violating derivative material because of it, and that some additional guidance would be appropriate: Like avoiding black box chatbots in favor of transparent ones, and...
2) that the copyright status of the natural-language output composed by chatbots is unclear — what is clear is that the editor didn't write it. Since the editor didn't write it, does that mean that the editor does not own the copyright to it? And if editors don't own the copyright, should they be giving it to Wikipedia? Wikipedia should form a stance on the copyright of chatbot-generated output and present editors with guidance on this issue as well.
You have apparently been avoiding replying to those positions, and so my guess is that you are opposed to them. I strongly oppose the let's-stick-our-heads-in-the-sand approach that you support.    — The Transhumanist   10:55, 25 January 2023 (UTC)[reply]
P.S.: I don't think anything in writing is going to be enough. I expect that it will take software programming to deal with the problems Wikipedia will be subjected to by chatbot compositions. And that is beyond the scope of this venue. ;)    — The Transhumanist   11:31, 25 January 2023 (UTC)[reply]

To act or not to act

Like DALL-E last year, or NFTs the year before that. I'll believe it when I see it, and I can't see the value in spending even more time discussing a hypothetical future threat to Wikipedia. – Joe (talk) 08:39, 13 January 2023 (UTC)[reply]
The scariest yet most plausible thing is that this is happening with some of the articles but we aren't aware of it. I don't think raising awareness on this issue is a bad thing given how fast AI advances nowadays. CactiStaccingCrane (talk) 00:39, 14 January 2023 (UTC)[reply]
Sorry, I missed the "recent events". Where is the evidence for GPT problems on Wikipedia? —Kusma (talk) 11:57, 13 January 2023 (UTC)[reply]
Simple way to defeat these AIs: train them on how humans argue about WP policy… then ask them whether AIs are reliable (pro and con)… then set them against each other on a dedicated talk page. While they argue, we can continue editing. Blueboar (talk) 01:58, 14 January 2023 (UTC)[reply]
See also https://openai.com/blog/debate/ CactiStaccingCrane (talk) 11:41, 15 January 2023 (UTC)[reply]
Where's the evidence that Wikipedia can't cope with AI generated articles? doktorb wordsdeeds 14:07, 14 January 2023 (UTC)[reply]
@Doktorbuk: You are talking in terms of hindsight (asking to see what has already happened), rather than applying foresight to assess a potential threat by asking "What could happen?"
Here's an article from the New York Times -- imagine a similar effort directed at Wikipedia using thousands upon thousands of (seasoned) new accounts to support political POVs, revise history, censor opposing opinions, and spread other forms of misinformation:

https://www.nytimes.com/2023/01/15/opinion/ai-chatgpt-lobbying-democracy.html
It's only a matter of time before the powers that be turn their attention, and their tools, upon the English Wikipedia. The question is: will we be ready when they do? Here's an article that makes one wonder what these people will do now that they have Chat-GPT to work with:

https://www.theweek.in/news/world/2023/01/06/saudi-arabia-infiltrated-wikipedia-and-this-is-how-they-did-it.html
So, do we really need evidence that the English Wikipedia has already been breached by LLM-assisted POVers before proceeding? Or can we prepare for this in advance?    — The Transhumanist   00:34, 16 January 2023 (UTC)[reply]
Your reasoning seems to be that
  1. ChatGPT (or its equivalents) can write disinformation quickly and cheaply
  2. POV-pushers (governments, lobbies etc.) are currently limited by the time humans need to write disinformation
  3. Wikipedia is a prime target for such POV-pushers
  4. Therefore, ChatGPT (or its equivalents) will flood the gates, unless we do something.
I will grant you (1) is either already true or will likely be in the near future.
However, (2) is questionable (see that XKCD about old-fashioned human-crafted POV-pushing). I would guess coordinating the messaging and maintaining the disinformation is a much larger fraction of the costs than actually writing the text.
(3) is also dubious. Editing in a way that sticks is much harder on Wikipedia than in other places (such as facebook, reddit, etc.). Maybe it has more impact, but the cost-benefit analysis is not obvious.
Finally, inaction is always an option. It might not be a good option, it might even be the worst option, but it must be compared to other specific measures. "Something must be done" without specifics is just the politician's fallacy. In the absence of details about the threat, it’s hard to compare the possible countermeasures. TigraanClick here for my talk page ("private" contact) 16:41, 16 January 2023 (UTC)[reply]
@Tigraan, I think your list of assumptions is missing "5. People who want to corrupt Wikipedia (e.g., NPOV violations, stacking votes) can reasonably be expected to obey any prohibitions we announce on using this particular technology to achieve their illicit ends." WhatamIdoing (talk) 21:22, 17 January 2023 (UTC)[reply]
I wanted the list of assumptions to be a reasonable summary of (what I understand to be) TH’s argument; I suspect your suggestion is... not that. But I agree that’s part of the problem (which my last paragraph covers). TigraanClick here for my talk page ("private" contact) 10:37, 18 January 2023 (UTC)[reply]
Dear @Tigraan: I am concerned with a flood of WP-incompatible chatbot-generated content, whether by POV'ers or good-faith editors. But it won't be at any gates. The water-level will simply rise. If and when flooding begins, it will be a matter of bailing out the excess. There are three questions relevant to such potential flooding:
1) Will chatbots be designed in such a way to prevent flooding (and bailing) in the first place by minimizing inappropriate (unsourced, misinforming) content?
2) Will the bailing be automated?
3) Shall we wait to work on #1 & #2 until after flooding has begun, or prepare in advance?
Some editors seem doubtful that the addition of content generated by LLMs to Wikipedia beyond the manual capacity of our editors to process it will happen. And I don't know if it will happen, either. But, there is a continuous stream of strong indications that LLM-based tools will become ubiquitous in the not too distant future, for general use, which, by extension, includes using them to add content to Wikipedia. Here's another:
Google Calls In Help From Larry Page and Sergey Brin for A.I. Fight — New York Times
And the technology push isn't limited to OpenAI and Google. Here's a search engine that uses a natural-language interface in both its queries and its answers:
Perplexity AI: Ask Anything
It is looking pretty clear that some major changes are on the horizon in the way computer users will be composing web content. It is also profoundly obvious that Wikipedia isn't ready right now for much more than the current volume of content creation that it is already handling. Maybe the volume won't increase by much, or maybe it will.
However, some editors are taking seriously the possibility that it will, and it'll be interesting to see if their preparation efforts will be sufficient to stem the tide, if or when the tide rises. Sincerely,    — The Transhumanist   22:51, 20 January 2023 (UTC)[reply]
@The Transhumanist, I'm not sure that these questions are really suitable to a discussion on Wikipedia. The first one, for example: Will chatbots be designed in such a way to prevent flooding (and bailing) in the first place by minimizing inappropriate (unsourced, misinforming) content?
I'd re-phrase it like this:
"Will all of the people who are not us, including those who don't care about us, carefully design their software in such a way to be convenient for us?"
Answer: No. Or, at least, it is highly unreasonable to assume that the answer is yes for all of the people who write this sort of software, and it only takes one to risk a problem. WhatamIdoing (talk) 21:21, 23 January 2023 (UTC)[reply]
@WhatamIdoing:
That is more likely to be true if members of WP or the WMF do not contact them. They are not producing these things in a vacuum. WP/WMF has a good relationship with Google, for example, which uses Wikipedia content extensively. It may be time to reach out to the companies developing chatbots too.
On the bright side, there's pressure coming from abroad, in the critique of chatbots, to be less "black box" and to provide references, which is one of the features that would help avoid problems.
Perplexity.ai already provides sources, which helps with verification efforts, and to see which ones are and are not from Wikipedia. Though, Perplexity.ai does not provide quote marks around passages that it quotes, and that is another problem. So, I guess they need to be contacted as well.
It looks very likely that chatbots will be used to compose content for other websites besides Wikipedia, and that their webpages may be included in chatbot training data too -- making an error-magnifying feedback loop a potentially huge problem for the chatbots. Too big to go unnoticed, hopefully.
It's important that we are aware of these issues if we are to have any chance in influencing solutions. Who knows, the chatbots, and/or the chatbot developers, may actually read this discussion. ;)
The WMF has been made aware of this discussion, so they can read it to prepare for discussions with participants in the chatbot sector. So, it is important that we get our concerns, and thoughts on design and strategy, in print.    — The Transhumanist   08:20, 25 January 2023 (UTC)[reply]
You are assuming that LLMs are being developed by a manageable number of identifiable companies, and hoping that all of them would like to protect Wikipedia.
But let's consider it from a different POV. Imagine that chatbot software is an open-source project, like Wikipedia. You have the Wikipedia:Right to fork open source projects – not just Wikipedia, but any open-source project. Anyone can add or subtract anything on their own setup. For example, if someone adds a "protect Wikipedia" module, then the next person could remove that, or even add a "scam Wikipedia" module.
I believe there will be some organizations who find that protecting Wikipedia aligns with their interests, and they will do so. But there will also be some organizations who find that protecting Wikipedia is exactly the opposite of their interests, e.g., content farms that hope they'll be cited as sources here so that their ad-filled webpages will get more traffic, and WP:UPE scammers who are hoping to reduce their costs by having their secret, internal-use-only chatbot write Wikipedia articles for clients of dubious notability, rather than paying a human to do that. I don't think that we can identify such actors, and I don't think they would change their behavior even if we talked to them.
On a tangent, the call for chatbots to cite sources and add quotation marks is probably based on a misunderstanding. LLMs aren't "quoting sources". They're predicting what a typical way to complete a sentence might be. If it spits out "The journey of a thousand miles begins with one step", it's not quoting Lao Tzu; it's saying "When I look in my database, and I see phrases that start with 'The journey of', the next bit is usually either 'a thousand miles' or 'a lifetime'. I'll pick one and see what comes next." WhatamIdoing (talk) 17:04, 25 January 2023 (UTC)[reply]
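The completion behaviour described above can be mimicked with a crude bigram model: it predicts the next word by looking up the most frequent continuation in its training data. Real LLMs use neural networks over tokens rather than a lookup table, and the two-proverb corpus below is invented, but the principle (emit a statistically likely continuation rather than quote a source) is the same:

```python
# Crude bigram "language model": predicts the next word by looking up the
# most frequent continuation in a tiny, invented training corpus. Real
# LLMs use neural networks, not lookup tables, but likewise generate
# likely continuations rather than quoting sources.
from collections import Counter, defaultdict

corpus = (
    "the journey of a thousand miles begins with one step "
    "the journey of a lifetime begins with a single choice"
).split()

# Count which word follows each word.
follows = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    follows[prev][nxt] += 1

def complete(word: str) -> str:
    """Most frequent word seen after `word` in the corpus."""
    return follows[word].most_common(1)[0][0]

print(complete("journey"))  # 'of': the only continuation ever seen after 'journey'
```

In this corpus "journey" has only ever been followed by "of", so the model emits "of" with no notion that it might be quoting Lao Tzu.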
@WhatamIdoing: Good point concerning the ecosystem of chatbot developers - I was only considering the big name actors (Google, etc.), but anyone and their uncle can get involved. You are right, bad actors are inevitable and perhaps even rampant. Yet, the vast majority of chatbot use will likely be of the big name models (ChatGPT, etc.). So, contacting and working with them would be beneficial.
As for quoting, I have found that the AI search engine perplexity.ai, which includes inline source references in its natural language answers to users' questions, integrates passages verbatim from the referenced webpages into its answers without using quotation marks.    — The Transhumanist   09:32, 26 January 2023 (UTC)[reply]
I wonder what the Ask Jeeves developers are thinking about that. Perhaps they were just 20 years too soon. WhatamIdoing (talk) 17:46, 26 January 2023 (UTC)[reply]
That’s a blast from the past, Doug Weller talk 18:30, 26 January 2023 (UTC)[reply]
Looks like some spammers and malware distributors have embraced this technology:
WhatamIdoing (talk) 01:09, 28 January 2023 (UTC)[reply]
The second source above says "Likewise, anyone who uses the web to spread scams, fake news or misinformation in general may have an interest in a tool that creates credible, possibly even compelling, text at super-human speeds." We need detection tools, and fast. The "super-human speed" part could be a dead giveaway.    — The Transhumanist   10:00, 28 January 2023 (UTC)[reply]

Identifying chatbot-generated text

Zero0000's post is a good start. A simple way to crystalize the situation is to ask the human editor for their rationale for a particular phrase. North8000 (talk) 03:55, 14 January 2023 (UTC)[reply]

I feel like additions of large, overly-verbose unsourced text are something of a giveaway. See, for example, the first revision of Artwork title, written by ChatGPT. — Qwerfjkltalk 11:06, 14 January 2023 (UTC)[reply]
You can be confident that any giveaways are on the chatbot writer's list of things to fix in the next generation. Zerotalk 11:56, 14 January 2023 (UTC)[reply]
They should also fix the problem of the chatbots making stuff up. Someone should ask the chatbot writers to turn off the poetry and fiction generation algorithms, and any other algorithms that make things up, when the chatbots are composing expository text. Or add new algorithms to handle expository writing. Just the facts. And sources.    — The Transhumanist   00:42, 16 January 2023 (UTC)[reply]

Nature just published a piece about use of ChatGPT in scientific articles. Zerotalk 01:18, 21 January 2023 (UTC)[reply]

Is anyone clear on what the copyright status of texts produced by LLMs is? From what I gather, they may be considered derivative works of the dataset they were trained on. From [2]: As a result of the human authorship standard, “under U.S. current law, an AI-created work is likely either (1) a public domain work immediately upon creation and without a copyright owner capable of asserting rights or (2) a derivative work of the materials the AI tool was exposed to during training,” Esquenet continues. “Who owns the rights in such a derivative would likely be dependent on various issues, including where the dataset for training the AI tool originated, who, if anyone, owns the training dataset (or its individual components), and the level of similarity between any particular work in the training set and the AI work.” If they are derivative works then they cannot be published on Wikipedia just like this. Do we have more information on this? For example, does OpenAI specify somewhere the copyright status of the text produced by ChatGPT? Phlsph7 (talk) 09:09, 16 January 2023 (UTC)[reply]

The first question is whether a generated text that closely resembles an item from the training set is copyright infringement of that item. For instance, Microsoft Copilot happily outputs the Fast inverse square root code. I would expect that courts will judge such things to be copyright infringement. Copyright infringement statutes do not require proving that the infringer copied a specific source (that would be difficult to prove), just that the content is substantially similar. Therefore, whether the tool is a simple ctrl-C ctrl-V or a sophisticated machine-learning model should not make much difference.
The second question is whether OpenAI (or any other AI tool provider) can assert copyright on whatever the tools they provide create. The OpenAI terms of use seem relatively permissive, but others might be less generous. I do not know the answer to that question. I would hope they cannot, since they only provide tools (Microsoft should not be able to assert copyright on the text I write using Word, or the images I draw using Paint).
The third is whether a human using ChatGPT can assert copyright on ChatGPT answers, or otherwise constrain the use of the resulting text. The quote you give is probably based on the US copyright office’s position (taken during the monkey selfie copyright dispute): Because copyright law is limited to 'original intellectual conceptions of the author', the [copyright] office will refuse to register a claim if it determines that a human being did not create the work. However, giving a prompt to ChatGPT might or might not constitute significant creative input. The position that anything edited by a machine becomes public-domain is untenable (if I use an orthographic corrector on the draft of my novel, it does not turn it into PD), so it must be a question of degree. Also, non-US courts might have different opinions. TigraanClick here for my talk page ("private" contact) 16:06, 16 January 2023 (UTC)[reply]
As I understand it, the degree of access by the alleged infringer to the source text in question is a factor in determining infringement. Only a specific expression is protected by copyright; if you and I independently write the same sentence, one is not a copyright violation of the other. The amount of similar text also plays a role, since the larger it is, it's more improbable that it was created without copying.
Facts and natural laws can't be copyrighted; this also covers algorithms (though a particular expression can be copyrighted). So I can't copyright a single instance of a Javascript for-loop and claim rights to all Javascript for-loops as derivative work. In cases where the learning model creator is explicitly providing its model for use as a tool, I think (disclaimer: not a legal opinion) it is reasonable for this to be the same as a work for hire. Thus if the result is eligible for a new copyright owner independent of any source texts, the tool user would be the owner. (If I use a spellchecker on the latest bestselling novel, the result is not eligible for a new copyright owner.)
To be really safe, we'd want language models trained on public domain text. But I think it could be argued with a really large model trained on, say (just drawing numbers out of air), hundreds of thousands of documents with thousands of independent authors, the resulting correlations can no longer be attributed to specific input text, for cases where the output is not a significantly long passage substantially similar to a specific source text. isaacl (talk) 18:05, 16 January 2023 (UTC)[reply]
One of the main issues to deal with would be the following: an editor tells ChatGPT to write a text on a topic and then adds this text in the form of a paragraph/section/article to Wikipedia and thereby publishes it under Creative Commons/GNU license. The question is: what are the chances that this constitutes some form of copyright violation? This might concern specifically problems with the 1st and the 2nd question addressed by Tigraan, i.e. whether the copyright of someone whose work was part of the training set was violated and whether openAI's copyright was violated. For the first question, it's probably relevant what the copyright status of the texts in the training set is and how similar the produced text is to the texts in the training set, as isaacl points out. Answering these questions would be quite relevant for any Wikipedia policy on the topic, like the one JPxG is currently drafting. Phlsph7 (talk) 07:18, 17 January 2023 (UTC)[reply]
With respect to the issue of whether LLM output inherently violates copyright law: the copyright status of LLM-generated text is not defined by statute, so it is hard to make confident claims, but precedent exists for computer-generated art and other works created by non-humans. Here is what the US Copyright office has to say:
"Compendium of U.S. Copyright Office Practices, § 313.2" (PDF). United States Copyright Office. 22 December 2014. p. 22. Retrieved 18 January 2023.
The Office will not register works produced by nature, animals, or plants. Likewise, the Office cannot register a work purportedly created by divine or supernatural beings, although the Office may register a work where the application or the deposit copy(ies) state that the work was inspired by a divine spirit.
[...]
Similarly, the Office will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.
It's not quite clear to me what the applicability is in this case. On Commons, the template and copyright category for PD-algorithm asserts that all algorithmically-generated works are public domain ("This file is in the public domain because, as the work of a computer algorithm or artificial intelligence, it has no human author in whom copyright is vested"). Whether artificial neural networks are capable of producing original intellectual output is less of a legal issue and more of a philosophical/anthropological one. It should be noted that human brains are themselves neural networks; much has been said, in a variety of fields, on the subject of whether humans create original works versus whether they merely juxtapose or recombine motifs and concepts that they're exposed to through participation in society. While interesting (and humbling), these discussions are unrelated to whether neural networks which have been exposed to copyrighted material in the course of their existence are capable of later creating original works under the purview of intellectual property law: they can. If this were not the case, a large majority of creative work would be illegal (good luck finding a band where none of the musicians has ever heard a copyrighted song before).
It does seem obvious that, say, OpenAI does not hold a claim to copyright on all works produced using its products: Adobe does not own copyright on all works produced using Photoshop, and Apple does not own copyright on all photographs taken with iPhones. If they did, this would certainly be a bold claim which required some evidence beyond idiosyncratic interpretation of vague phrasing in a EULA somewhere.
With respect to whether LLM output is capable of violating copyright law, I think the answer is obviously "yes". Here is an example, which I have just prompted from ChatGPT:
What are the lyrics to "Moonage Daydream", by David Bowie?
Keep your 'lectric eye on me babe
Put your ray gun to my head
Press your space face close to mine, love
Freak out in a moonage daydream oh yeah
These are, indeed, the verbatim lyrics to an unambiguously copyrighted song. Of note is the fact that I specifically asked it to do this. Compare with this prompt, using the first couple of lines of the song:
I'm an alligator / I'm a mama papa comin' for you / I'm a space invader / I'll be a rock-'n-rollin' bitch for you
I'm a force of nature, a wild beast / A creature of the swamps, a king of the feast / I'm a living legend, a natural high / 'Cause I'm an alligator, and I'll never die
While this would be a sick Judas Priest song, it isn't one, and in fact a search for this text suggests it has never been published. jp×g 01:57, 19 January 2023 (UTC)[reply]
It bears mentioning that I, a human being, also know the lyrics to "Moonage Daydream", and am capable of typing them into the edit box with no assistance whatsoever; doing this in an article would still be a copyright violation. I think that, for LLMs, the critical piece of knowledge is that output needs to be checked, and that it is insufficient to assume that generated text is always novel. jp×g 02:15, 19 January 2023 (UTC)[reply]
On the narrow point of who owns ChatGPT output, Sec 3(a) of their ToS states "OpenAI hereby assigns to you all its right, title and interest in and to Output...You are responsible for Content, including for ensuring that it does not violate any applicable law or these Terms." I thought there was an additional requirement to state ChatGPT was used, but I do not see it in the terms. Slywriter (talk) 02:14, 19 January 2023 (UTC)[reply]
@JPxG: since you mention computer-generated art: there are some lawsuits against Stability AI based on the claim that they violated the copyrights of people whose images were used in the training set. See [3] and [4]. The case seems to be similar to LLMs, with the main difference being that their AI trains on images and creates images while LLMs train on text and create text.
If I interpret the statement by the US Copyright office correctly, it seems to claim that a person can't own the copyright of a work that was created by a random machine process without creative input. It does not say that such processes cannot violate someone else's copyright. This would be in tune with the lawsuits mentioned above.
I think it's also unlikely that every output is a copyright violation. For example, if you just give it a sentence and tell it to correct spelling mistakes, there should be no problem in using the output. Phlsph7 (talk) 06:33, 19 January 2023 (UTC)[reply]
@Slywriter: Their sharing policy demands that users "Indicate that the content is AI-generated in a way no user could reasonably miss or misunderstand". Phlsph7 (talk) 09:23, 19 January 2023 (UTC)[reply]
LLM-assisted edits need to be appropriately marked as such in the history. —Alalch E. 01:38, 21 January 2023 (UTC)[reply]
Yes, and I think that "in a way no user could reasonably miss or misunderstand" requires the use of a notice in the article itself as well. –dlthewave 13:53, 21 January 2023 (UTC)[reply]
Agreed. It's easy for the average reader to miss an edit summary in the article history. So in-text attribution may be required. Phlsph7 (talk) 08:54, 22 January 2023 (UTC)[reply]
@Slywriter, JPxG, Phlsph7, Alalch E., and Dlthewave: Concerning the TOS clause that states "OpenAI hereby assigns to you all its right, title and interest in and to Output...You are responsible for Content, including for ensuring that it does not violate any applicable law or these Terms." — does that mean that Chat-GPT cannot legally produce the exact same output twice without violating the right, title, and interest that it previously assigned?    — The Transhumanist   20:47, 21 January 2023 (UTC)[reply]
I'm not sure what leads you to that conclusion. The licence does not grant you exclusive use to anything. isaacl (talk) 22:09, 21 January 2023 (UTC)[reply]
@Isaacl: I didn't come to a conclusion, I just asked a question, pertaining to Sec 3(a) of their ToS as referred to and quoted by Slywriter above, and repeat quoted by me. It appears you missed the quote somehow, because you didn't comment on it. To what license are you referring, and what relation does it have to the passage we quoted from the TOS?    — The Transhumanist   02:52, 22 January 2023 (UTC)[reply]
I don't know why you would ask the question you did, since the section you quoted did not say anything about granting an exclusive right, title, and interest to any output. isaacl (talk) 03:08, 22 January 2023 (UTC)[reply]
I don't think this has something to do with producing the same output for different users. It should be easy to find mock queries to which it often responds with the same output, for example, by asking it to "Say the word 'Hello'" or for simple translations. Phlsph7 (talk) 06:50, 22 January 2023 (UTC)[reply]
@Isaacl: Well, I checked the section again, and it is right there in plain English. It uses the word "assigns" instead of "grants", and it says "all its" instead of "exclusive". So, once it "assigns all its right, title, and interest in and to Output", how can it legally ever produce that same output again? (Because it already assigned it away).    — The Transhumanist   09:20, 22 January 2023 (UTC)[reply]
As I understand it, anyone can assign all their rights to the output of ChatGPT to someone else. In a similar way, I could assign to you all my rights to the Harry Potter series. This would not be of much use to you since the expression "all my rights" just refers to "no rights" in this case. Phlsph7 (talk) 09:32, 22 January 2023 (UTC)[reply]
@Phlsph7: In section 3a of the TOS, it's OpenAI that is assigning its rights to the chatbot output generated for the user. If Chat-GPT writes you a 3 paragraph explanation of gravity, and OpenAI has assigned you its rights to that explanation, can Chat-GPT legally write that exact same output for somebody else?    — The Transhumanist   09:58, 22 January 2023 (UTC)[reply]
I assume it works something like the following: it depends on whether openAI had any copyrights on it in the first place. If it did then this may be a problem because creating the copy for the second user might violate the newly obtained copyright of the first user. If it didn't then it presumably wouldn't be a problem because assigning all its rights to the first user effectively didn't do anything. But I don't think that this particular issue is very relevant for Wikipedia. Phlsph7 (talk) 10:14, 22 January 2023 (UTC)[reply]
The tool made no guarantee that it wouldn't generate the same output again for another user. The tool is in essence passing any rights of ownership (if they exist) in the original input through to the output. isaacl (talk) 19:24, 22 January 2023 (UTC)[reply]

Testing Chat-GPT's algorithm

Does Chat-GPT produce the same output to the same prompt given to it by 2 or more different users? Do any two want to try that?    — The Transhumanist   20:54, 21 January 2023 (UTC)[reply]

@The Transhumanist, not necessarily. It has a "temperature" factor (randomness). — Qwerfjkltalk 21:01, 21 January 2023 (UTC)[reply]
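For illustration, the effect of a temperature setting can be sketched as temperature-scaled softmax sampling over token scores. This is a minimal, hypothetical stand-in (the logit values are made up, and this is not OpenAI's actual implementation):

```python
import math

def softmax_with_temperature(logits, temperature):
    """Scale logits by 1/temperature before normalizing into probabilities.

    Low temperature sharpens the distribution (near-deterministic output);
    high temperature flattens it, so identical prompts can yield
    different completions for different users.
    """
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three candidate tokens
cold = softmax_with_temperature(logits, 0.1)   # top token dominates
hot = softmax_with_temperature(logits, 10.0)   # close to uniform
```

At a temperature near zero the top-scoring token is picked essentially every time; at higher settings, two users entering the same prompt can easily get different text.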

@Phlsph7, Isaacl, Tigraan, Dlthewave, Slywriter, and JPxG:

I have found this to be perplexing...

I entered the following prompt into Perplexity.ai:

who owns the copyrights of perplexity.ai's answers

And it returned the following answer:

"According to US copyright law, works generated solely by a machine are not eligible for copyright protection[1] . Therefore, the copyrights of Perplexity AI's answers belong to its co-founder and CEO, Aravind Srinivas[2]."

It looks like I just broke copyright by copying it here. But this contradicts the title of the first source provided (the second source is behind a registration wall):

The scary truth about AI copyright is nobody knows what will happen next — The Verge

Assuming that Aravind Srinivas does not own the output, I have some questions about posting AI search engine results. I look forward to your replies to the questions below.    — The Transhumanist   19:27, 21 January 2023 (UTC)[reply]

Question 1: Does putting quotation marks around a small copy/paste selection from an AI search engine, like the passage above, fall under fair use?

I would assume so. It's similar to how you can quote from copyrighted books. There are some limitations, for example, concerning the length of the cited text. And it should be clear where the quotation came from. Phlsph7 (talk) 06:57, 22 January 2023 (UTC)[reply]

Question 2: Would that protect Wikipedia (and its editors) from a derivative work claim?

From [5]: "In its most general sense, a fair use is any copying of copyrighted material done for a limited and “transformative” purpose, such as to comment upon, criticize, or parody a copyrighted work. Such uses can be done without permission from the copyright owner. In other words, fair use is a defense against a claim of copyright infringement. If your use qualifies as a fair use, then it would not be considered an infringement." Phlsph7 (talk) 07:06, 22 January 2023 (UTC)[reply]

In our discussion so far, we haven't been able to conclusively figure out whether someone owns the copyright at all and, if so, who. That two users get and use the same response would just be a special case. Phlsph7 (talk) 07:14, 22 January 2023 (UTC)[reply]

Question 4: Would running a passage (from a chatty AI search engine) through a plagiarism checker be enough, before copying it into Wikipedia?

Plagiarism checkers are not perfect so they can't ensure that no plagiarism/copyright infringement was committed. The question would be whether they are good enough for our purposes, i.e. whether they are quite reliable for spotting plagiarism/copyright infringement pertaining to AI-generated texts. Phlsph7 (talk) 07:26, 22 January 2023 (UTC)[reply]
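For context, the core idea behind most plagiarism checkers is overlap of word n-grams between a candidate text and known sources. Here is a deliberately naive sketch of that idea (real checkers are far more sophisticated and compare against large indexed corpora):

```python
def ngrams(text, n=5):
    """Set of lowercased word n-grams in a text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(candidate, source, n=5):
    """Fraction of the candidate's n-grams that also appear in the source."""
    cand = ngrams(candidate, n)
    if not cand:
        return 0.0
    return len(cand & ngrams(source, n)) / len(cand)

verbatim = overlap_ratio(
    "press your space face close to mine love",
    "press your space face close to mine love",
)
novel = overlap_ratio(
    "i am a force of nature a wild beast",
    "press your space face close to mine love",
)
```

Such a check would flag verbatim copying (like the Bowie lyrics discussed earlier) but score an invented verse as clean, which is exactly the limitation at issue: it can only catch copying from texts it has indexed.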

Question 5: Does Wikipedia policy allow an editor to click "Publish changes" for content that the editor did not personally compose?

Clarification: Clicking "Publish changes" implies that the editor composed the changes. Can an editor publish changes that they did not personally compose, that were composed by a chatbot search engine? (Please quote and provide links to the specific policies that allow or disallow this). Thank you.    — The Transhumanist   20:13, 21 January 2023 (UTC)[reply]

That would probably be a case of WP:PLAGIARISM even if no copyright infringement is involved. According to the summary: "Do not make the work of others look like your own. Give credit where it is due." Phlsph7 (talk) 07:06, 22 January 2023 (UTC)[reply]
It would be similar to copying public domain/open license content to Wikipedia, no? This is covered by several guidelines and explainers such as WP:FREECOPY and Help:Adding open license text to Wikipedia. As long as there's proper attribution, there's no general expectation that editors must compose the text themselves. –dlthewave 13:22, 22 January 2023 (UTC)[reply]
@Dlthewave and Phlsph7: Interesting. So, if you prompted a chatbot to write a new paragraph for the article on cream cheese and you add that to the article, you include an attribution to the chatbot in the edit summary? What do you put in the source reference?    — The Transhumanist   11:58, 25 January 2023 (UTC)[reply]
@The Transhumanist: As I understand it, for WP:PLAGIARISM it's sufficient to mention the source in the edit summary. You would have to find and add other reliable sources yourself since ChatGPT provides no sources or sometimes invents non-existing sources. However, for the Sharing & Publication Policy of openAI, in-text attribution would probably be necessary. So to comply with it, you would have to start the paragraph on cream cheese with something like "According to ChatGPT,...". This way, the text couldn't be used at all since ChatGPT is not a reliable source. Phlsph7 (talk) 12:23, 25 January 2023 (UTC)[reply]

First ANI case

Just a heads up, a thread at Wikipedia:Administrators' noticeboard/Incidents has just opened about a user abusing AI-generated content at Wikipedia:Administrators'_noticeboard/Incidents#Artificial-Info22_using_AI_to_produce_articles. Sure, the editor in question did not make an edit in the mainspace, but the fact that this is happening at ANI is pretty concerning in its own right. I'm afraid that someone may have covertly spammed articles with AI text already. CactiStaccingCrane 15:31, 23 January 2023 (UTC)[reply]

I am now adding the {{disputed}} template when encountering an AI-generated article, based on the following from the ChatGPT FAQ: These models were trained on vast amounts of data from the internet written by humans, including conversations, so the responses it provides may sound human-like. It is important to keep in mind that this is a direct result of the system's design (i.e. maximizing the similarity between outputs and the dataset the models were trained on) and that such outputs may be inaccurate, untruthful, and otherwise misleading at times. The commonality of all the AI-generated articles I've encountered so far (4, to be honest) is that they are not properly footnoted, implying that the author has not confirmed that the AI output is correct. The disputed tag seems to cover this issue well. I'm also dropping a note on the article's talk page explaining the link between AI output and correctness. — rsjaffe 🗣️ 01:00, 24 January 2023 (UTC)[reply]
ANI case is wrapping up. The first three articles were written by a hoaxer, and the refs for two of the articles may have been generated as well. The fourth article was promoting a company. — rsjaffe 🗣️ 04:18, 24 January 2023 (UTC)[reply]
This gives us a good look at the type of plausible-sounding nonsense that we can expect from LLM output. Comparing the draft (archive version, since it will likely be deleted soon) to Gecko, I see a few factual errors right away:
  • Not all geckoes belong to the family Gekkonidae, which doesn't have 1500 species.
  • Not all geckos have specialized toe pads that allow them to climb vertical surfaces.
  • The largest geckos are 23"-24", not 10".
  • Not all geckos are oviparous; some bear live young.
When this type of content is submitted, it needs to be thrown out straightaway. –dlthewave 17:09, 24 January 2023 (UTC)[reply]

More detected AI-generated papers

I started screening Drafts more carefully and am getting a number of hits corresponding to probable AI-generated articles (or at least part of the article is AI-generated). Examples include:

The list could go on, but I think this is enough to see the pattern. These pages tend to be created by users with few edits. A number of users are doing this, not just one or two. Conclusion: the tsunami has arrived. — rsjaffe 🗣️ 03:15, 25 January 2023 (UTC)[reply]

I think that mw:ORES (used in recent changes to highlight bad faith and vandalism) should integrate a screening mechanism for GPT-3 and other bots asap. I suspect this is already a huge problem: large amounts of hoaxes can be disguised as good content and we wouldn't even know about it. CactiStaccingCrane 03:21, 25 January 2023 (UTC)[reply]
Looking at the first few, the sourcing is not up to the standards of promotion to article space. Once clearly bad sources are removed and unsourced claims are tagged, this can clearly be seen. If AI ever gets to the point of being able to write an article that provides accurate information properly and verifiably sourced to reliable sources, then I'll be happy to have it writing for us. BD2412 T 03:46, 25 January 2023 (UTC)[reply]
@CactiStaccingCrane: It might be a good idea, but in reality, it requires a lot of effort from WMF to integrate openai-detector into mw:ORES. And I agree with @BD2412 in pointing out that some drafts are promotional, which I think was a problem even before ChatGPT or GPT-3 existed. 2001:448A:304F:52BA:8D12:5E35:69B7:8E09 (talk) 03:50, 25 January 2023 (UTC)[reply]
Well, some AI-generated articles have made it into article space. The first one I found has lots of text that's probably not AI-generated, but has a big hunk that is. Pavilion of Harmony, from "The Harmony Pavilion of New Asia College..." to "unique addition to the campus of New Asia College.", after removing the footnote indicators that confuse the analysis, rates as 99.98% fake. So the problem will leak into article space. And this means we need a way to pay special scrutiny to the AI-generated section, as that section is likely to have plausible but false information, given the way current AI models work. — rsjaffe 🗣️ 04:13, 25 January 2023 (UTC)[reply]
I agree that this could easily turn into a major problem as LLMs become more popular. As discussed at #Copyright status, these drafts violate at least WP:PLAGIARISM but probably also the Sharing & Publication Policy of openAI (if they were created using openAI tools). If AI-detectors are reliable, including them in mw:ORES would probably help a lot to mitigate the problem in case such an integration is feasible. Another alternative would be to create a bot that checks new submissions and tags them if they score a high value. A further thing to do at some point might be to make the editors reviewing drafts and new articles aware of this problem. Phlsph7 (talk) 06:28, 25 January 2023 (UTC)[reply]
10 Best ChatGPT Chrome Extensions You Need to Check Out Doug Weller talk 10:06, 25 January 2023 (UTC)[reply]
@Phlsph7 said: A further thing to do at some point might be to make the editors reviewing drafts and new articles aware of this problem. That's how I stumbled unsuspectingly upon this issue. I'm a new page patroller. I think they need to be looped in now, as that is the only guaranteed review step for new articles, and LLM-generated articles are already appearing. (I'm hoping that those users allowed to have their articles bypass this process won't abuse LLMs.) — rsjaffe 🗣️ 17:09, 25 January 2023 (UTC)[reply]
@Rsjaffe: Wikipedia:New_pages_patrol has various instructions on the different issues that new page patrollers need to be aware of. Maybe somewhere in there, a subsection could be added on AI-generated articles. Among other things, it should give a short explanation of what it is (the user tells the AI to generate an article in a matter of seconds and copy-pastes the results), what the problems are (plagiarism, false statements, no or invented sources, possibly copyright violation), and how to spot them (things AI-generated articles have in common and tools to detect them, like https://openai-openai-detector.hf.space/). Phlsph7 (talk) 19:31, 25 January 2023 (UTC)[reply]
Can I recommend, at least for right now, that some of these pages be copy-pasted into projectspace somewhere, so that we can see what they actually look like? I feel like these discussions basically have a couple-hour-long window outside of which it's impossible to see what everyone is talking about. jp×g 19:12, 25 January 2023 (UTC)[reply]
I will note here that the detector being linked here (at https://openai-openai-detector.hf.space) is a very old model tuned for the 2019 GPT-2, not GPT-3 or ChatGPT (3.5). I don't know if it's producing reliable results. It seems to me like most of the things it's flagging as machine-written are abysmal crap, so maybe it doesn't make a difference. jp×g 19:17, 25 January 2023 (UTC)[reply]
Some trials on articles I've written (using the lead paragraphs): Powder House Island and Nina Jankowicz are estimated at 99% human-written, but First Wikipedia edit is at 20% GPT for some reason. 1-Pentadecanol returns 40% GPT based on the first sentence, which decreases with subsequent sentences to 99%. However, when I asked ChatGPT to "write me a Wikipedia article about 1-pentadecanol", the result (which is viewable as an HTML comment if you edit this section) was estimated as 92% human-written. I don't know exactly what the underlying mechanism of this tool is, but we may want to take its output with a grain of salt. jp×g 19:23, 25 January 2023 (UTC)[reply]
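For what it's worth, detectors in this family generally score text by how statistically predictable it looks to a language model, since machine output tends to be more predictable than human prose (the linked tool is, as far as I can tell, a RoBERTa classifier fine-tuned on GPT-2 output). A crude, purely illustrative stand-in using a smoothed unigram model, with a made-up reference corpus:

```python
import math
from collections import Counter

def unigram_perplexity(text, reference):
    """Perplexity of `text` under a Laplace-smoothed unigram model
    fit on `reference`. Lower perplexity = more predictable text.
    """
    ref_words = reference.lower().split()
    counts = Counter(ref_words)
    total = len(ref_words)
    vocab = len(counts)
    log_prob = 0.0
    words = text.lower().split()
    for w in words:
        # Smoothing gives unseen words a small nonzero probability.
        p = (counts[w] + 1) / (total + vocab + 1)
        log_prob += math.log(p)
    return math.exp(-log_prob / len(words))

reference = "the cat sat on the mat and the dog sat on the rug"
familiar = unigram_perplexity("the cat sat", reference)
unfamiliar = unigram_perplexity("quantum chromodynamics lagrangian", reference)
```

Real detectors replace the unigram model with a large neural LM, which may be why scores can swing so much when only a few words in a passage are changed.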
The most recent articles I pulled (the ones in this section) were all initially detected by me using a search for a "signature" of a particular type of way of producing text for a particular version of LLM engine (I'm being vague as to not give hints to those who are trying to evade detection). I then visually confirm the signature. Then I run it through the GPT-2 detector. And then I'm only listing pages with > 99% chance of being LLM-generated. I'm 100% sure that the ones I've listed above are LLM-generated, but I'm also certain that this is only detecting that subset being generated under that specific set of conditions. — rsjaffe 🗣️ 19:24, 25 January 2023 (UTC)[reply]
Also, you have to remove the bracketed reference numbers (e.g., [2]) from the text as well as any intercalated headings to give an accurate score. — rsjaffe 🗣️ 19:26, 25 January 2023 (UTC)[reply]
To give an example of some LLM-looking text that doesn't score high on the GPT-2 detector, look at, ironically enough, Draft:Large Language Model. It scores 99.52% real, but sure looks fake to me. — rsjaffe 🗣️ 19:30, 25 January 2023 (UTC)[reply]
That detector works great on unmodified text, but it breaks down when even two or three words are changed. CactiStaccingCrane 19:32, 25 January 2023 (UTC)[reply]
Here are a few more detectors:
To test them, I asked ChatGPT to "write a Wikipedia article on cream cheese". It produces the following:
Extended content
Cream cheese is a soft, mild-tasting, white cheese that is made from a mixture of cream and milk. It is typically sold in blocks, and is commonly used as a spread on bread or crackers, as well as in cooking and baking.
The origins of cream cheese can be traced back to the early 19th century in the United States, where it was first produced by William Lawrence, a farmer and cheese maker from New York. Lawrence accidentally discovered the process for making cream cheese while trying to replicate the French cheese Neufchâtel. He began producing and selling the cheese under the brand name "Philadelphia Cream Cheese."
Cream cheese is made by blending cream and milk, and then adding a starter culture and rennet to coagulate the mixture. The curds are then drained and the remaining whey is discarded. The curds are then salted and packaged.
Cream cheese is a versatile ingredient and can be used in a variety of dishes, including dips, spreads, and baked goods. It is a key ingredient in many traditional American dishes, such as cheesecake, bagels with cream cheese, and lox and cream cheese. It is also commonly used in cooking and baking as a replacement for butter or other fats, as it adds a rich, creamy texture to dishes.
Cream cheese is also a popular ingredient in many international cuisines, particularly in Asian cooking. In China, cream cheese is often used as a filling in steamed buns and dumplings, while in Japan, it is used in sweet and savory dishes, such as cream cheese-filled pastries.
Cream cheese is typically sold in blocks, and can be found in most grocery stores and supermarkets. It is also available in a variety of flavors, including plain, herb, and smoked salmon.
In conclusion, Cream Cheese is a soft, mild-tasting white cheese made from a mixture of cream and milk, it is a versatile ingredient used in many traditional American and international dishes and it's also commonly used in cooking and baking as a replacement for butter or other fats. It is typically sold in blocks and is available in a variety of flavors.
All the detectors agree that this text is AI-generated. When I fed them with the lead of the article Wikipedia (pure text without reference signs), they all agree that it's human-generated. Phlsph7 (talk) 20:03, 25 January 2023 (UTC)[reply]
Some attempt at figuring out what these are: I think that they are all based on the same code from HuggingFace. The actual source is here. I think that it may be simple enough for me to deploy it on a Toolforge test account; some of these websites seem unbelievably seedy. For example, "Content At Scale" advertises:
Want undetectable AI content? Our platform is the only one of it's kind that allows you to upload up to 100 keywords and get back 100 entire human quality blog posts (title to conclusion) without any human intervention. All the while, bypassing AI detection as it's the most human-like AI content ever produced. Our proprietary system uses a mix of 3 AI engines, NLP and semantic analysis algorithms, crawls Google, and parses all the top ranking content to put it all together. This isn't an AI writing assistant, this is a human level long-form blog post producing machine!
Certainly seedy if they are making it up... but also very seedy if it's true! jp×g 23:17, 25 January 2023 (UTC)[reply]

Quoting a chatbot

Below are some verification-related questions pertaining to chatbots.    — The Transhumanist   12:10, 26 January 2023 (UTC)[reply]

How would pasting in content generated by a chatbot be interpreted under WP:VER's requirement that all quotes must be referenced?

WP:VER states that all quotes must be supported by inline citations. If the chatbot's text is unique rather than preexisting somewhere else, using it would in essence be quoting the chatbot — how could that, as it isn't recorded anywhere, be referenced for verification purposes?

Would WP:VER need to be modified so as not to require referencing a quote from a chatbot?

News update

We need to get and stay ahead of this AI thing. See the following to get an idea how fast this movement is progressing:

  1. ⭕ What People Are Missing About Microsoft’s $10B Investment In OpenAI : GPT3
  2. Travis Tang on LinkedIn: ChatGPT for Data Science Prompts - 60 examples of what it can do
  3. How to write an effective GPT-3 prompt | Zapier
  4. OpenAI Licenses GPT-3 Technology to Microsoft (not exclusive)
  5. OpenAI's investments
  6. Should ChatGPT be used to write Wikipedia articles?
This article features the following Wikipedia article, initially composed using Chat-GPT: Artwork title, by Pharos, a great example of how a chatbot can be used by a responsible editor. Maybe a blanket ban is too much, and guidelines on how to use it correctly would be better. Makes auto-removal harder, though.
See an explanation and its discussions here: Talk:Artwork title

I hope you find these articles informative. Feel free to post more links, and comments, below.    — The Transhumanist   16:05, 28 January 2023 (UTC)[reply]

Crystallize chatbot discussions into a policy?

I think that there is a long list of things that make chatbot content objectively bad for, and incompatible with, Wikipedia in its current form and methods. Without elaborating here, a few quick notes: the inherent "black box" nature of AI, the inherent unaccountability for content, the inherent non-linking of content to sourcing, the fact that chatbots do not know or follow Wikipedia policies and guidelines (which themselves are a fuzzy ecosystem rather than categorical rules), that they do not take into account what is already in the article, and that they (as a practical matter) remove accountability and expectations from the person who added the material. They would also destroy the ability to obtain volunteer time to review what has been put in. Most people willing to spend time reviewing something because they know that a human editor has taken the time to write it would not be willing to spend large amounts of time dealing with something generated by a bot in a few seconds.

My thought is that we should say that such chatbot generated content is not allowed in Wikipedia. This is just briefly written, I or someone could flesh this out into something carefully written if there is interest.

We can and should decide this without, or prior to, solving the question of how to detect and enforce. A premise of having to solve detection and enforcement before step one would be a poison pill for accomplishing step one. Sincerely, North8000 (talk) 20:42, 25 January 2023 (UTC)[reply]

@North8000: There has been a draft guideline at Wikipedia:Large language models for a few weeks. I do not know that banning their use entirely is a good idea, but it seems quite obvious that just copy-pasting gigantic chunks of text directly from the model's output into the edit box is not a good idea (and almost zero percent likely to result in usable articles). I will try to write some stronger wording emphasizing that nobody should be doing this. jp×g 22:46, 25 January 2023 (UTC)[reply]
  • Support blanket ban - I agree with North8000 that the policy should be "Chatbot generated content is not allowed in Wikipedia." I think this should apply to all namespaces, including talk pages. As chatbots improve, the policy can be changed, but right now, chatbot use for generating WP content appears to be a can of worms.    — The Transhumanist   01:03, 26 January 2023 (UTC)[reply]
    Pertaining to talk pages and forums, I meant not allowing the use of chatbots to generate a user's statements in a discussion. Posting chatbot output on a talk or forum page as an example in order to discuss it, is appropriate.    — The Transhumanist   12:33, 26 January 2023 (UTC)[reply]
    Support auto-screening - Pharos and their article Artwork title changed my mind (and is featured in Should ChatGPT be used to write Wikipedia articles?). I'd like to see a draft on guidance for the responsible use of chatbots in writing articles, including Pharos' approach. Meanwhile, our tech people can work on automating the removal of undisclosed chatbot additions and the tagging and eventual removal of other entries that don't get edited within a reasonable time frame, or that are part of a pattern of posting disclosed but unedited chatbot submissions. Donald Albury was right: bad actors are going to spam Wikipedia with chatbot crap whether we ban it or not. Therefore, we should allow good actors to help offset their impact. Which brings us to the rest of the Web: it will be subject to hosting chatbot content, and so, as we are Wikipedia, we should trailblaze how to do it right.   — The Transhumanist   16:21, 28 January 2023 (UTC)[reply]
  • Oppose a blanket ban. I think JPxG's demonstration shows that a LLM can be good for repetitive tasks like formatting a table, as long as a human validates the output. Actual generated prose is likely to be unsourced or sourced to fake sources, and so is already covered by existing policies. -- King of ♥ 01:23, 26 January 2023 (UTC)[reply]
  • Oppose any blanket ban. Do think CSD modifications are needed to quickly remove algorithmically generated articles (AI is market speak here, it is not sentient and intelligence is debatable) and some formal guidance for editors would be useful. It's a tool like AWB, Twinkle, and any other scripting used. Used properly, it can cut down tedious work Slywriter (talk) 01:30, 26 January 2023 (UTC)[reply]
  • Support blanket ban with the possibility of allowing specific use cases as we learn more. Our guiding principle should be that AI is completely inappropriate for creating or editing prose. Given the amount of plausible-sounding nonsense we've seen in recent AI-generated drafts, I also don't trust it for coding work such as formatting text or rotating tables until its reliability has been demonstrated for the specific task. This should apply to article, talk and draft spaces with very limited exceptions for demonstration purposes. –dlthewave 03:24, 26 January 2023 (UTC)[reply]
    With respect to tables and templates, what do you envision as "demonstrating reliability"? It is not exactly brain surgery to look at a table and see if the columns or the parameters or whatever are in the right places. You have to do this anyway: we currently do not require editors to prove that they have never typed an extra } and had to go back and fix it. jp×g 05:35, 26 January 2023 (UTC)[reply]
  • Oppose for a variety of reasons, as discussed by others above. But I'll go with the most basic one: how would any of this be actionable? I know you claim it's a poison pill problem to raise, but there's no way whatsoever to detect this with certainty. We'd be using an outside tool to claim text is AI written and then delete things based on that claim. I don't care how many 9's you've got in the decimal places, there's no way to be infallible here. If the editor that added the text says they wrote it themselves, are we just going to say that they're lying and that they have to re-write it or something? There's not even evidence of copyvio in such a case and if the added content meets all other requirements, including proper verifiable sourcing, then I see no way to enforce such a ban. SilverserenC 03:33, 26 January 2023 (UTC)[reply]
    If the detector is good, a single mistaken evaluation is possible, but a consistent verdict across multiple samples is solid evidence. Dege31 (talk) 17:49, 26 January 2023 (UTC)[reply]
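    (The consistency point above can be made concrete with a toy calculation. This is purely illustrative: the 5% per-sample error rate is an assumption chosen for the example, not a measured figure for any real AI-text detector, and it assumes the evaluations are independent.)

```python
# Toy model: a hypothetical AI-text detector that is wrong on any single
# sample with probability p. If k independent samples all receive the
# same (wrong) verdict, that joint event has probability p**k, which
# shrinks exponentially as k grows — so a consistent verdict across
# several samples is far stronger evidence than any single evaluation.

def all_wrong_probability(p: float, k: int) -> float:
    """Probability that k independent evaluations are all mistaken."""
    return p ** k

if __name__ == "__main__":
    # With an assumed 5% per-sample error rate:
    for k in (1, 3, 5):
        print(k, all_wrong_probability(0.05, k))
```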
  • Comment I'm not fully decided whether banning it entirely is the right course of action at this stage. But there is a significant potential to abuse it so most forms of non-trivial AI-assisted edits should be strongly discouraged. The policy should make it very clear that any addition of AI-generated text needs to be labeled as such in the edit summary to avoid WP:PLAGIARISM. The editors also need to be reminded to obey the license and sharing policy of the AI provider. In the case of ChatGPT, for example, in-text attribution is apparently required. Phlsph7 (talk) 08:53, 26 January 2023 (UTC)[reply]
  • Support blanket ban of any LLM-generated text. Not sure yet about using such tools purely for layout, but it should not be allowed for either generating new text or for rephrasing existing text, as both cases are way too problematic. As for "how is this enforceable", just like other difficult policies, where near certainty is sufficient (like WP:DUCK for socks, which isn't infallible but good enough). A policy also has advantages: e.g. when a newbie says something like "why was my article deleted, it was generated by ChatGPT so has to be good", one can easily point to the policy to explain that it isn't allowed instead of having this discussion again and again. Fram (talk) 09:04, 26 January 2023 (UTC)[reply]
    That's a good point about potential problems with enforcing it. Having a policy can be useful to discourage certain types of behavior even if it is difficult to enforce in every case. We'll have to see how useful and reliable AI-detectors are in this process. Phlsph7 (talk) 09:15, 26 January 2023 (UTC)[reply]
  • I think the way forward is even stronger expectations on sourcing. If you can't provide the content of the reliable source for your edit, it should be reverted. (This would include people machine translating foreign Wikipedia articles without having access to the original sources). —Kusma (talk) 10:30, 26 January 2023 (UTC)[reply]
    I assume that you are referring to automatic reversion (correct me if I'm incorrect). What about the wording of the verification policy that reads "challenged or likely to be challenged"? If it is not challenged or likely to be challenged, it doesn't need references. How will a bot be able to tell the difference between what does and does not require references? Or would the bot's removal of an edit constitute a challenge? Whether reversion is automated or not, should all new content to Wikipedia be challenged by default? That would require a change to WP:V, and that seems unlikely to happen.    — The Transhumanist   11:36, 26 January 2023 (UTC)[reply]
    I would challenge the factual accuracy of anything generated by AI. –dlthewave 13:41, 26 January 2023 (UTC)[reply]
    I'm not talking about automatic reversion, can't see how to do that without AI. And the verification policy is applied differently to new and existing articles; for new articles, we are in practice already expecting much better sourcing than "likely to be challenged" (just look at what will be rejected by AFC). Perhaps we should expand this to addition of content to existing articles. —Kusma (talk) 13:56, 26 January 2023 (UTC)[reply]
  • Support blanket ban for now; would be fine having a discussion later on allowable use cases, but I'd rather we started with a total blanket ban first, and then itemize specific possible use cases if we later decide there's some utility. --Jayron32 12:47, 26 January 2023 (UTC)[reply]
  • Support blanket ban for prose as there are significant copyright concerns about AI-generated text. Text generated by a bot is no substitute for reliable sourcing. Maybe in the future, when Abstract Wikipedia comes online, we can give some leeway for bots to generate text based on reliably cited info, but for now it's just too risky for the project. CactiStaccingCrane 13:34, 26 January 2023 (UTC)[reply]
    If there is interest in at least reviewing this possibility (which it sounds like there is), as noted there it needs to be written better than my initial trial balloon above. I'll do that but still keep it short. I think it can be done in a way that deals with the main enforceability questions and also allows the useful applications described, by permitting bot-assisted, editor-generated content. I'll do that within half a day. North8000 (talk) 14:33, 26 January 2023 (UTC)[reply]
  • Support ban for text additions. This is so susceptible to abuse and to inserting misleading content that it should be banned. There is little benefit to allowing text generation and much harm.
    — rsjaffe 🗣️ 16:05, 26 January 2023 (UTC)[reply]
  • Oppose a blanket ban. it's premature and is not based on any actual cases in Wikipedia. Rjensen (talk) 16:20, 26 January 2023 (UTC)[reply]
    Have you seen WP:ANI#Artificial-Info22 using AI to produce articles? It's already happening. Fram (talk) 16:30, 26 January 2023 (UTC)[reply]
  • Oppose new policy that goes beyond small clarifications of WP:BOTPOL. I think it is obvious that ChatGPT is already covered by it and I do not see what modifications are proposed.
    Things that go against a content policy (WP:COPYVIO, WP:V, etc.) should be reverted / deleted on those grounds, and on those grounds alone; editors who make many such edits should be warned and then blocked. Editors who make edits faster than reasonably possible by hand should be dealt with according to WP:MEATBOT.
    I oppose any policy to revert / delete / ban based solely on a "seems bot-written" criterion, unless and until it has been proven that (1) this is a real, time-consuming problem on Wikipedia, and not a few random tests within the sea of vandalism, and (2) whatever criterion is used has been independently tested to establish its sensitivity and specificity and validated by the community. TigraanClick here for my talk page ("private" contact) 17:24, 26 January 2023 (UTC)[reply]
    I also thought it obviously fell under BOTPOL, but this discussion shows some uncertainty. –dlthewave 20:04, 26 January 2023 (UTC)[reply]
  • Oppose a blanket ban. I agree in general with the oppose reasons given above. I also think such a ban would be "virtue signaling" without being effective. Editors who want to add AI-generated material to Wikipedia will not be stopped by such a policy. Consider how often our existing policies stop editors from adding un-verifiable, POV-pushing content. What we can do is work on effective strategies for detecting and removing un-sourced, un-verifiable content as expeditiously as possible. - Donald Albury 19:34, 26 January 2023 (UTC)[reply]
  • Oppose a blanket ban. I expect that in another five or six years, we will happily be allowing a Wiki-AI to both write and clean up most of our articles. BD2412 T 22:26, 26 January 2023 (UTC)[reply]
    And AIdmins to deal with AIsox ~ Selfstudier (talk) 22:53, 26 January 2023 (UTC)[reply]
    While all of this wouldn't surprise me given the other aspects of this dystopia we call Earth, it would simply confirm that we live in the Golgafrinchan Ark B world. Andre🚐 22:55, 26 January 2023 (UTC)[reply]
    I suspect it will become necessary to rely on such bots, as our current model is creaking under the current load. In just the last three or four days I've discovered a couple articles that I started that are in need of serious cleanup and repair, particularly on sourcing (linkrot and other issues) and updating. Donald Albury 22:58, 26 January 2023 (UTC)[reply]
    Oh don't get me wrong, I would love to see AI bots to improve references, and do other automated tasks under the bot policy. But the AI should not be trusted for facts or interpretations. And I do fear the ChatGPT-ization of the language. There is good and bad writing on Wikipedia, but at least it was written by and for human beings. Andre🚐 23:02, 26 January 2023 (UTC)[reply]
  • Oppose a blanket ban. I agree with King of ♥ ♦ ♣ ♠'s comment above. If an editor adds unsourced AI-generated content, that content should get treated the same way as non-AI generated unsourced content. WP:V and other existing Wikipedia policies already cover that. Some1 (talk) 00:05, 27 January 2023 (UTC)[reply]
    But, a great deal of unsourced content doesn't get treated. The amount of unsourced content on Wikipedia is vast. If you don't believe me, pick a scholastic subject like Roman Empire, click on "What links here", open a bunch of the links in tabs, and start inspecting. What about random article? Too many stubs, and it's boring. With "What links here", you can get longer articles on average to view. The striking thing is the lack of "citation needed" tags; they are few and far between. They can be found on roughly 1% of all pages, while unsourced content can be found on a much higher percentage.

    Another thing to try is go to Template:Unreferenced, and click on "What links here". The first page I clicked on was Tank destroyer. The tag is dated March 2009.

    The point is, you make it sound like all unsourced content gets quickly removed. That's not the case for a huge amount of content; it can sit there for years. LLM-generated content can be produced in great quantities fast, and therefore has the potential to accumulate more quickly than content composed by humans. Is it wise to let it sit there until a human comes along to remove it? In a quantity competition between humans and computers, computers will win. Keeping up will take auto-removal, so it would be best to start building those tools now. See Fram's post above for an even better rationale.   — The Transhumanist   06:49, 28 January 2023 (UTC)[reply]

  • Comment: There's no way to enforce a ban, and at any rate Wikipedia would do well to have more automation. Both MW and en.Wiki are heavily dependent on manual labor, with 3rd-party "bots" doing some of the more menial tasks. Compare how one would compose a document on a modern word processor vs. how editors do it here: no Wikitext, no copying-and-filling-out-templates (and no separate commits just for AnomieBOT to date them), no broken pages because of parsing errors, no dragging someone to ANI/AE for a T-ban violation (because there's such a thing as access control), no separate citations of the same source in five different formats (because there's such a thing as reference management); and you can actually comment on a specific paragraph without looking for a diff number, opening a "discussion" (which is in fact just another near-meaningless bit of Wikitext), signing it and hoping that no edit conflict arises because someone changed something 5,000 words up the page. We need to get rid of the concept of a WP:GNOME before we can even consider how to deal with a language model that can generate an entire article in a fraction of a second. François Robere (talk) 13:31, 27 January 2023 (UTC)[reply]
  • Oppose. I get that this raises some interesting questions in the abstract, but come on guys, we have enough problems with instruction creep already, we don't need to start writing policies in anticipation of the hypothetical abuse of Silicon Valley's latest fad. – Joe (talk) 14:18, 27 January 2023 (UTC)[reply]
  • Support blanket ban because fuck LLMs and fuck the corrupt, unethical industry that created them. XOR'easter (talk) 17:32, 27 January 2023 (UTC)[reply]
  • Oppose blanket ban and suggest we develop Wikipedia:Large language models (perhaps under the name Wikipedia:Computer-assisted text generation suggested by Michael_D._Turnbull). I don't think that going forward we can ban AI-generated text writ large, first and foremost because many people currently editing Wikipedia use text-editing widgets that already incorporate something of this in the form of spell check, autocorrection and autocomplete, and these kinds of tools will continue to blur the line between AI, language models, and human-generated text. Going forward it would be practically Neo-Luddism to eschew all AI. I don't like the use of ChatGPT right now today to generate text, I don't like it at all, but neither can I bury my head in the sand and whistle Dixie, pretending it doesn't exist and won't grow in importance. We should meet this head on rather than pretend we can completely ban AI-assisted or AI-generated text. —DIYeditor (talk) 23:27, 27 January 2023 (UTC)[reply]
  • Oppose What seems to scare people about ChatGPT is that it writes better than most Wikipedia editors. Competition is healthy and so should not be subject to restrictive practices. See also Luddism. Andrew🐉(talk) 22:32, 28 January 2023 (UTC)[reply]
    Competition is good when it's robot vs real people? You'd be happy if most of the encyclopaedia was written by AI? And most of the real people just gave up? Doug Weller talk 13:05, 29 January 2023 (UTC)[reply]
    I think that most people are at least open to the idea of using LLMs like ChatGPT for good and practical purposes, such as summarizing dozens of paragraphs from a reliable source, making a WP:Earwig-like bot that detects source-text integrity issues, or detecting possible hoaxes/context-dependent vandalism in Special:RecentChanges. I'm sure that when these LLM-based tools come out, people will use them just as much as mw:ORES and User:Cluebot NG today. The problem as of 2023 is that these tools do not exist yet, and in their current form LLMs are an extremely powerful tool for bad actors while disadvantaging good-faith Wikipedia editors. I feel that the situation between LLMs and Wikipedia right now is a bit like that between Wikipedia and academia in the early 2000s, when Wikipedia was full of uncited info and its reliability was really shaky to say the least (see also https://www.nostalgia.wikipedia.org). Maybe this will change in the future when someone makes an LLM that's aligned with our values and policies, but in my opinion a blanket ban is necessary for now to prevent mass vandalism while we process the situation. CactiStaccingCrane 13:24, 29 January 2023 (UTC)[reply]
    @CactiStaccingCrane: How would a ban of LLMs prevent vandalism exactly? Vandals would simply ignore the ban, while many good actors would obey the ban. You would in effect be preventing good actors from using the tool, and not bad actors. The only way to deal with vandals who ignore a ban is directly – that is, identify their text and remove it, and block the vandals. But you can do that anyways. So, wouldn't it be best to identify and remove LLM vandalism while allowing good use of LLM-generated text? (See the starting edit and talk page for the chatbot-generated article Artwork title). So, I'm confused as to how you believe a blanket ban would help. Let me repeat my initial question, along with a follow-up question: How would a ban of LLMs prevent vandalism exactly? And why would banning good actors using LLM (like Pharos and JPxG) be necessary? I look forward to your replies to these 2 questions. Sincerely,    — The Transhumanist   21:46, 29 January 2023 (UTC)[reply]
  • Support blanket ban. Google and Bing actually consider auto-generated content to be spam, which is bad for SEO. If anything, copying from GPT and pasting into an article is no different from copying from a free source and pasting into Wikipedia. The text generated is not your own, but is licensed to you forever; and, since this is a machine we are talking about and not a human, there is a probability that the specific text output will be the same for multiple users given the same input.
    I believe GPT is best used as an inspiration for further research, but in no way is it actually any more useful than a machine-translated article. Aasim - Herrscher of Wikis ❄️ 14:28, 3 February 2023 (UTC)[reply]
  • Oppose a blanket ban. I would prefer that editors use GPT to assist in writing, but to avoid plagiarism, tag their edits as "GPT-assisted". I would additionally suggest that GPT-assisted edits be all subject to pending changes, except if a user reaches Extended confirmed status. I personally fear that, for those who are familiar with American history, ChatGPT could be the prohibition era equivalent for Wikipedia. InvadingInvader (userpage, talk) 07:47, 6 February 2023 (UTC)[reply]
    Pending changes can only be applied to articles. There is no way (that I am aware of) of having pending changes apply to only some edits to an article. Donald Albury 15:43, 6 February 2023 (UTC)[reply]
  • Comment IMHO, the issue isn't per se about how articles are written. Very few people would object if Wikipedia were provided with content that was accurate, well-sourced & well-written by GPT bots. The root issue is that we don't have enough volunteers to review a flood of these GPT-created articles, let alone enough to review them knowledgeably. Viewed in this light, New Page Patrol's often-discussed problems dealing with a tsunami of new articles are but the nose of the camel poking under the tent. (And then there is the threat of gut-&-stuffing existing articles.) Most of us have other, off-Wiki demands on our time, so we are forced to contribute on a part-time basis; but even if we were all full-time volunteers, we still couldn't keep up with all of the edits. -- llywrch (talk) 21:21, 11 February 2023 (UTC)[reply]
  • Oppose. As demonstrated by Artwork title these bots can be used positively or negatively. Our policies should be tuned to prevent the negatives, and not also prevent the benefits. BilledMammal (talk) 11:17, 13 February 2023 (UTC)[reply]
  • Oppose a blanket ban. scope_creepTalk 16:19, 13 March 2023 (UTC)[reply]

Chatbot

This idea probably goes to the heart of it.

How about this? (just a draft to be tweaked)

An editor is responsible for every part of every edit that they make. The norm is that they should have their own rationale for everything they added, including every word, phrase and sentence. For additions to articles, they should make a specific effort to ensure that the edit is appropriate with respect to the current article, for example regarding its structure and avoiding duplication. They should also make a reasonable specific effort to ensure that each portion of their addition is verifiable or verified in accordance with WP:Verifiability. It is unlikely that these requirements and expectations would be met with AI-generated content (or any large amount of text that has been copied from elsewhere and pasted in, copyvio issues notwithstanding).
Wikipedia relies on volunteer efforts to review additions. This often requires time consuming reviews of individual words, phrases and sentences; obtaining this degree of effort relies on the understanding that the editor who put the material in has made a similar effort to develop that phrase or sentence. Expecting editors to give this review to large amounts of material which were generated by AI in a few seconds would cause a substantial loss of this effort.
Accordingly, this clarifies that removal / reversion of en masse additions of material suspected of being AI generated is considered to be an appropriate practice. An editor who seeks to restore the material is expected to break it into small portions, each with individual explanatory edit summaries. If such a removal results in deletion of the entire contents of the article, it then becomes a candidate for speedy deletion.

Sincerely, North8000 (talk) 21:07, 26 January 2023 (UTC)[reply]

I don't favour making machine-generated text a special case. Given that editors are already responsible for verifying every aspect of their edits, any clarifications should be equally applicable to all cases, such as human ghostwriting teams. isaacl (talk) 21:48, 26 January 2023 (UTC)[reply]
I like that a lot. — rsjaffe 🗣️ 05:06, 27 January 2023 (UTC)[reply]
I have integrated the above three paragraphs into Wikipedia:Large language models, see Wikipedia:Large language models#Specific guidelines and Wikipedia:Large language models#Summary removal of larger LLM-generated additions of article prose. —Alalch E. 10:33, 27 January 2023 (UTC)[reply]
cf. Wikipedia:Village pump (policy)/Archive 176#RFC: change "verifiable" to "verified". Also: do not make assumptions about the capabilities of AI. There are models at work that integrate references, and you should assume that at some point they'd be able to compose texts that are comparable to any Wikipedian's. Ergo, policy should focus on what we're looking for, not who or what composed it. François Robere (talk) 14:12, 27 January 2023 (UTC)[reply]
We can change the policy if something is developed that is reliable enough. Until then, blanket rejection is appropriate. — rsjaffe 🗣️ 15:46, 27 January 2023 (UTC)[reply]
It's important to see the conduct side too because LLM misuse forms a pattern of disruptive editing. It starts with one person's idea that a specific thing can be accomplished on Wikipedia in this way, proceeds with the intent to implement this idea without caring to understand and account for what Wikipedia's requirements are, and ends with an undesirable action which may be repeated if not addressed. —Alalch E. 15:55, 27 January 2023 (UTC)[reply]

If chatbots are banned, would the article Artwork title have to be deleted?

Artwork title was created by chatbot, and heavily edited by a human since. If chatbots (and LLMs) become banned, how would it apply to pre-existing chatbot-generated articles?    — The Transhumanist   22:09, 29 January 2023 (UTC)[reply]

I don't think it would have to be deleted (or restarted) unless it was somehow found to be a copyvio. Maybe it would be peer reviewed instead. Roll 3d6 (talk) 08:50, 5 February 2023 (UTC)[reply]

More specific proposal: Blanket ban on LLM content on Talk page discussions

Regardless of the community's decision on LLM-generated content in articles, which is the focus of much of the discussion above, the ability of editors to flood talk pages with artificially generated content arguing for a particular position seems to have no redeeming value, and represents a new form of Wikipedia:Sockpuppetry. I propose a blanket ban on such writing, with especially strong guardrails for RfC's and AfD's. (Alternatively, I would be open to a phrasing that allowed LLM's to summarize the state of a debate, or be used to generate sample content for discussion, as in the conversation above, but not used to make arguments. That just seems harder to phrase clearly.) Carwil (talk) 20:22, 27 January 2023 (UTC)[reply]

That's a really weird way to use them. Also, why would it matter? Even if an editor used an LLM to generate a better phrasing of the argument they want, it's still their account putting forth the argument. And the argument is either valid or not in regards to others involved in the discussion. Why is this a problem exactly? Do you have any examples to better clarify this sort of usage? SilverserenC 23:14, 27 January 2023 (UTC)[reply]
Here's an article that should help clarify the relevant danger: Researchers demonstrate how attackers can use the GPT-3 natural language model to launch more effective, harder-to-detect phishing and business email compromise campaigns. If they can use it to write convincing email scams, Wikipedia talk pages should be a breeze. Here's a quote from the article: "The generation of versatile natural-language text from a small amount of input will inevitably interest criminals, especially cybercriminals — if it hasn't already. Likewise, anyone who uses the web to spread scams, fake news or misinformation in general may have an interest in a tool that creates credible, possibly even compelling, text at super-human speeds." If that doesn't convince you, I don't know what will.    — The Transhumanist   10:16, 28 January 2023 (UTC)[reply]
Has this ever actually happened? – Joe (talk) 05:29, 28 January 2023 (UTC)[reply]
Maybe a more relevant question is "Will we be ready for it when it does?"    — The Transhumanist   10:16, 28 January 2023 (UTC)[reply]
As Transhumanist is suggesting, my concern is rapid creation of multiple arguments that either tilt a discussion or waste the time of sincere contributors. Users should be warned that they can't substitute mass-produced arguments for their own judgment inside the encyclopedia.--Carwil (talk) 19:26, 28 January 2023 (UTC)[reply]
  • Support - Weird? Compared to the many examples of what Chat-GPT can do posted on social media, writing talk page or forum posts on Wikipedia seems comparatively bland and simple. Why would using an LLM on a talk page or discussion forum matter? Because it is faster than a human. With it, a user could participate in more discussions in less time. But the big concern here is using it on multiple accounts with different writing styles to stack votes on issues with little chance of being discovered as the same person. That's sockpuppetry elevated to a higher level. Therefore, banning chatbots from being used to compose talk page or forum posts is quite reasonable.    — The Transhumanist   05:42, 28 January 2023 (UTC)[reply]
  • Support On a risk/benefit analysis, the potential benefit to allowing this is so small that any risk (as described above) is unacceptable. — rsjaffe 🗣️ 05:50, 28 January 2023 (UTC)[reply]
  • Reluctant Oppose As my comments above, it is difficult to draw a line on which language models are allowed and which are not. Clearly people are allowed to use autocompletion and predictive text and such. Are they limited in what varieties and designs of autocompletion they use? I think this requires further discussion and hopefully input from experts. —DIYeditor (talk) 06:42, 28 January 2023 (UTC)[reply]
  • Oppose a blanket ban as premature, but support at least some restrictions. While I would certainly be first in line to support a blanket ban if an actual problem arose, I am not sure that we know exactly what shape this will take enough to come up with an intelligent solution. For example, earlier in this very discussion, we were posting LLMs' output in order to judge their capabilities, so any prohibition would need to take exceptions like this into account. That said, I do support some more specific language for WP:LLM about it being very bad to use them undisclosed to argue your case in discussions. For example, if I were a shady dude, I could flood the zone with shit right here on VPP by typing out massive walls of text replying to every single person who disagreed with me, without regard for whether my arguments were sound or even correct, and even if I represented a minority view it would probably irritate and discourage my interlocutors until they stopped commenting (thus bringing me closer to a majority). Similarly, at the blood-soaked fields of AfD I could trivially write out a three-paragraph !vote on all 50 of the day's nominations (whether I was a partisan for keeping, a zealot for deleting, or a fanatic for any sort of POV). jp×g 19:45, 28 January 2023 (UTC)[reply]
  • Oppose a blanket ban. As JPxG, I'm open to some restrictions, but I can see LLMs as a potentially useful tool for people who want to offer their viewpoint in discussion but lack the fluency or time of some other editors. (A bit more acerbically, our discussions already tend to be influenced by editors who are functionally LLMs: good prose stylists, possessed of enormous sitzfleisch, and not well grounded in factual specifics. If LLMs force us to review WP:BLUDGEON and our methods of dialectic/achieving consensus, there's no reason to grant people like that the privilege of immunity.) Choess (talk) 20:44, 28 January 2023 (UTC)[reply]
  • Oppose per WP:CREEP. As a recent example, I included some ChatGPT output in a recent discussion at ITN. Not seeing the problem. Andrew🐉(talk) 22:23, 28 January 2023 (UTC)[reply]
    I think your example misses the spirit of my proposal, which is un attributed LLM output substituting for our own reasoning and arguments on Talk. Happy to modify accordingly. --Carwil (talk) 13:25, 29 January 2023 (UTC)[reply]
  • Oppose a blanket ban on talk pages specifically (I would support one in article space). The largest problem by a huge margin with using ChatGPT on Wikipedia is adding factually incorrect information to articles. This is less of a concern with talk pages, which the general public doesn't usually read and which are not primarily meant to impart facts or be reference material. We already have a ban on sockpuppeting and an expectation that decision-makers will ignore bad arguments, which cover the negative use cases mentioned above. Also, as mentioned above, there is not an enormous difference between ChatGPT and predictive text or services like Grammarly, and the line between those is going to become blurrier by the day, faster than policy can keep up with. Gnomingstuff (talk) 01:20, 3 February 2023 (UTC)[reply]
  • Oppose as making up rules when no problem has been demonstrated. If it does happen, we can warn, revert, and block the offending material and offender for wasting our time. Graeme Bartlett (talk) 12:10, 3 February 2023 (UTC)[reply]

The discussion has diverged

Further issues on this topic (chatbot-generated content) are being discussed at Wikipedia talk:Large language models, the talk page for the policy draft on this subject.    — The Transhumanist   05:15, 30 January 2023 (UTC)[reply]

The discussion has been re-opened

Chatbot news update 2023-02-12

  1. Panicked Google to unveil over 20 AI products in 2023
  2. ChatGPT reaches 100 million users two months after launch
  3. Google has stiff competition now, after Microsoft integrated powerful A.I. technology into its search engine.

Where do we expect issues to occur?

If we can work out the issues that editors improperly using these tools will cause, then we can adjust our policies to make it easier to address those issues. As the technology develops, we can see how it is being used and how it can be used, and create a more specific policy then. BilledMammal (talk) 11:21, 13 February 2023 (UTC)[reply]

@BilledMammal: Here are some:
  1. Made up content
  2. Lack of source citations
  3. Made up sources
  4. Claims based on out-of-date sources
  5. Possible synthesis
  6. Copyright unclear as the source of the material is unknown
  7. Copyright unclear as the issue of who owns the output of a chatbot has not been resolved
  8. Inclusion of 3rd-party copyrighted material, such as lyrics, a poem, etc.
  9. Chatbot license violation (e.g., for omitting attribution of the chatbot with the generated content)
  10. 100,000,000 ChatGPT users and growing (after just 2 months),[6][7][8][9] suggesting that the use of tools like this may become ubiquitous, creating the potential for a large volume of one-off posts of chatbot generated text (and other media)
Not a comprehensive list.    — The Transhumanist   07:31, 15 February 2023 (UTC)[reply]

Timeline of ChatGPT news

See: Wikipedia talk:Large language models#Timeline of ChatGPT news — Preceding unsigned comment added by The Transhumanist (talkcontribs) 08:40, 1 March 2023 (UTC)[reply]

RFC: Clarifications to WP:COSMETICBOT for fixing deprecated HTML tags

The following discussion is an archived record of a request for comment. Please do not modify it. No further edits should be made to this discussion. A summary of the conclusions reached follows.
There is clear agreement the edits made by MalnadachBot to fix lint errors, <font> or otherwise, are to be allowed to continue. While there was some discussion on how important those edits are in general (such as fixing tags that might still be supported for years in old archive pages), most editors seem to agree they are a net positive for the future of the project. Additionally, it was noted that some of the issues with the bot, mainly making multiple edits to the same page, have been addressed recently, which should serve to reduce the clutter of editors' watchlists. The second part of this RfC did not receive as much attention as the first part, but it was almost unanimously opposed. Isabelle Belato 🏳‍🌈 02:51, 11 March 2023 (UTC)[reply]


This is a request for comment regarding Wikipedia:Bot policy, specifically the section on restricted bot tasks related to cosmetic changes outlined at WP:COSMETICBOT. Ultimately, I would like to start a discussion about whether or not the bot account User:MalnadachBot is operating within Wikipedia policy, and whether WP:COSMETICBOT needs to be updated to clarify how these kinds of bots should be handled in the future. —⁠ScottyWong⁠— 08:33, 7 February 2023 (UTC)[reply]


First, a bit of history behind this situation:

MalnadachBot (talk · contribs) is a bot account that is owned by User:ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ. This is an AWB bot that primarily works on fixing lint errors, which are common technical issues in wikitext markup. A list of lint errors detected on Wikipedia is available at Special:LintErrors. The bot has had a total of 13 tasks approved by the Bot Approvals Group. The first 11 tasks were fairly standard error-fixing tasks, each of which had a limited scope to fixing a specific type of error. Task 12 was requested to fix another specific error associated with deprecated <tt>...</tt> HTML tags. However, the BAG member who approved the task (in this case, User:Primefac) decided to expand the scope of the task to fixing all linter errors across Wikipedia, and speedily approved the task. Since that task approval, MalnadachBot has made millions of edits, becoming one of the top accounts by edit count in Wikipedia history.

One of the most common edits that the bot makes is to replace <font color="red">...</font> tags with <span style="color:red;">...</span> tags. This is because <font> tags were technically deprecated when HTML5 was introduced, so current best practices for writing new HTML are to use span tags instead of font tags for customizing the font and style of text. If you inspect the bot's contributions, you'll notice that the vast majority of its edits are in the Wikipedia, Wikipedia talk, User talk, and Article talk namespaces. This is because custom font styles are relatively uncommon in WP articles themselves but very common in many users' signatures (including my own). So, MalnadachBot is essentially going through nearly every discussion that has ever taken place in the 20+ year history of Wikipedia, and updating old user signatures to use proper HTML. Many of these discussion pages have not been edited in 10+ years, and many of them receive nearly zero annual pageviews. It has made millions of such edits, and there are many millions more in the queue to be done. Over the past 12 months that this task has been running, many users have complained on the bot owner's talk page and Wikipedia:Bots/Noticeboard that the bot is clogging up watchlists and revision histories with millions of trivial, purely cosmetic edits, but there has been no wider community discussion to come to a consensus on how to move forward. If you used to use <font> tags in your signature, you've no doubt had your watchlist bombed by MalnadachBot in the last year.
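The kind of rewrite at issue can be sketched as follows. This is an illustrative regex-based approach only, not MalnadachBot's actual implementation (the real task handles many more attribute combinations, nesting, and other lint error types):

```python
import re

# Hypothetical sketch of the font -> span rewrite, handling only the
# simple color="..." case; size=, face=, and nested tags are omitted.
FONT_OPEN = re.compile(r'<font\s+color="([^"]+)"\s*>', re.IGNORECASE)

def fix_font_tags(wikitext: str) -> str:
    # Rewrite the opening tag, carrying the color value into a style attribute.
    out = FONT_OPEN.sub(r'<span style="color:\1;">', wikitext)
    # Rewrite the matching closing tag.
    return re.sub(r'</font>', '</span>', out, flags=re.IGNORECASE)

print(fix_font_tags('<font color="red">hello</font>'))
```

Even in this toy form it illustrates why half-fixed markup (a `<font>` opened but closed with `</span>`, as seen in some old signatures) requires a second pass: a purely textual substitution cannot repair mismatched pairs.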

One thing to note is that these kinds of edits do not result in a change to the way that these pages are rendered to the user. This is because, despite <font> tags being deprecated in HTML5, all major web browsers still understand and support the older deprecated tags, and no major browser has announced an intention to stop supporting these tags in the foreseeable future (which is why these specific errors are categorized as the lowest priority on Special:LintErrors). Therefore, these edits are purely cosmetic changes to the wikitext of many pages that do not change the visual rendering of the page. Typically, these kinds of edits are not allowed to be done on their own by bots (per WP:COSMETICBOT), and can only be done if they are combined with other substantive edits that do cause a change to the visual rendering of the page. However, there is one exception noted within WP:COSMETICBOT for fixing "egregiously invalid HTML such as unclosed tags, even if it does not affect browsers' display or is fixed before output by RemexHtml (e.g. changing <sup>...</sub> to <sup>...</sup>)". It has been asserted that MalnadachBot's task 12 falls under this exception, but the wording of the policy leaves much room for different interpretations.

So, the primary question I'd ask for your input on in this RfC is: does <font color="red">...</font> constitute "egregiously invalid HTML" that should be proactively fixed by a bot, or should it be considered a purely cosmetic change to the wikitext that should not be performed by a bot?

And the secondary question in this RfC is surrounding whether or not the last bullet point in WP:COSMETICBOT should be expanded to be more specific about defining exactly what types of errors are considered "egregious", and which aren't. A proposal for an updated wording can be found below.

—⁠ScottyWong⁠— 08:33, 7 February 2023 (UTC)[reply]

Primary question: are deprecated HTML tags considered egregiously invalid?

Please indicate below if you believe that automated fixing of deprecated HTML tags should be considered "egregiously invalid HTML" and therefore be permissible per the current wording of WP:COSMETICBOT, even if those deprecated HTML tags are still supported by all major browsers. (Support will be interpreted as support for the bot task to continue, Oppose will be interpreted as agreement that the bot task is in violation of policy and should be halted.)

  • Oppose as RfC author. I don't believe that fixing deprecated but supported HTML tags should be considered "egregiously invalid HTML", considering that every major browser still supports them, and will continue to support them for the foreseeable future. Fixing user signatures on ancient discussion pages is a purely cosmetic change to wikitext that does not alter the visual rendering of the page. This bot task is only generating millions of trivial edits that clog up users' watchlists and revision histories. While it's possible to ignore these bot edits, the bigger question to be discussed here is whether or not those edits should be happening at all. —⁠ScottyWong⁠— 08:33, 7 February 2023 (UTC)[reply]
  • Oppose In my opinion bots should be discouraged from ever trying to systematically modify historical talk page comments, unless there is some entirely forced technical reason to do so and broad community support first. Rewriting 10+ year old talk page comments to change 'font' elements to 'span' elements with an element-specific font style serves no practical purpose. Doing it tens of millions of times is an absurdity. If the linter is complaining about this, the linter should be modified to stop complaining (e.g. this linter rule could be disabled in talk namespaces). After back and forth discussion (recently and apparently also several times over the past few months), none of the bot operators or approvers have offered any kind of compelling reason to run this bot task. –jacobolus (t) 10:01, 7 February 2023 (UTC)[reply]
  • Invalid question framing. I object to the framing of the question that "oppose" means that the bot task is in violation of the policy. I support the bot task, because it is making edits to assist with the "administration of the encyclopedia", such as the maintenance of hidden categories used to track maintenance backlogs, a quote from COSMETICBOT. Linter errors are tracked on a series of maintenance pages created by the MediaWiki software, and they are listed in the "Page information" for each affected page, in a section called "Lint errors". This is the same reasoning that allows bots to remove unsupported parameters from infoboxes, which also results in zero change to the rendered page, or to adjust categories in articles after a category is renamed. The bot is performing a wikitext cleanup task. – Jonesey95 (talk) 14:45, 7 February 2023 (UTC)[reply]
    How does changing <font> to <span> help with the maintenance of hidden categories or tracking maintenance backlogs? —⁠ScottyWong⁠— 16:01, 7 February 2023 (UTC)[reply]
    Please notice the words "such as". Fixing obsolete tags and other Linter errors on a page removes the MediaWiki-created "Lint errors" section from that page's "Page information", and it removes the page from administrative lists such as this MediaWiki-generated list of over three million errors. – Jonesey95 (talk) 19:13, 7 February 2023 (UTC)[reply]
    Wouldn’t it be relatively trivial to remove this linter rule (either altogether or from talk namespace)? Then you would eliminate all of those pages from the administrative lists at one fell swoop without needing to modify any of the pages. –jacobolus (t) 00:08, 8 February 2023 (UTC)[reply]
    Not really. Sweeping a problem under a rug, or burying your head in the sand does not stop the problem from being a problem. And yes, lint errors like this are a problem.
    While you are technically correct that at this moment the major browsers continue to support it, because the tag was deprecated 25 years ago those same browsers would be well within their rights to remove support for it at any time, without advance notice. This bot is performing preventative maintenance now, so that no emergency cleanup is needed when browsers do finally remove support for it from their rendering engines. Sideswipe9th (talk) 01:41, 8 February 2023 (UTC)[reply]
  • Oppose per mw:Help:Lint errors/obsolete-tag and jacobolus. Going around changing font tags in old talk page comments millions of times is absurd. Calling old HTML "egregiously invalid HTML" demonstrates a basic lack of familiarity with HTML, and with how web browsers work. This is one of the most useless tasks we do on Wikipedia, even more so than changing hyphens to dashes. The bot task is wholly unnecessary and therefore should stop. Levivich (talk) 14:53, 7 February 2023 (UTC)[reply]
  • Unclear question. It is unclear whether an editor who opposes believes that MalnadachBot's operation is a violation of the bot policy, or is just reaffirming the "deprecated HTML tags are not egregiously invalid HTML" argument. If an editor bases their opinion entirely on the latter, then it doesn't necessarily mean that the bot is explicitly not allowed under bot policy. Those in support of the bot's operation wouldn't necessarily agree with "deprecated HTML tags are egregiously invalid HTML" so this is a bit of a strawman. 0xDeadbeef→∞ (talk to me) 15:06, 7 February 2023 (UTC)[reply]
    • I agree the presentation of this RfC is confusing, but Oppose will be interpreted as agreement that the bot task is in violation of policy and should be halted is clear. Levivich (talk) 15:15, 7 February 2023 (UTC)[reply]
      Then the actual RfC question should be reworded. Whether fixing deprecated HTML tags constitutes fixing egregiously invalid HTML is one question, and whether MalnadachBot's task should continue is another. 0xDeadbeef→∞ (talk to me) 15:18, 7 February 2023 (UTC)[reply]
      If it is decided that making automated edits that only fix deprecated HTML tags is a violation of COSMETICBOT, then the bot would naturally need to stop making those types of edits, and focus only on edits that make more substantive changes. —⁠ScottyWong⁠— 16:21, 7 February 2023 (UTC)[reply]
      I don't think that logic is correct. COSMETICBOT specifically says Consensus can, as always, create exceptions for particular cosmetic edits. Bots are often approved to make cosmetic changes to pages. – Jonesey95 (talk) 17:07, 7 February 2023 (UTC)[reply]
  • Point of fact: The <font>...</font> element is not deprecated in HTML5; it is obsolete, and has been since October 2014 (more than eight years ago). It was deprecated as long ago as December 1997, when HTML 4.0 was released. So, in 25 years, people have had plenty of opportunity to cease using outdated constructs. --Redrose64 🌹 (talk) 15:59, 7 February 2023 (UTC)[reply]
    I don't think anyone is arguing in favor of using font tags in new content that is being generated today. However, as the HTML spec is changed over time, there is generally no expectation that the entirety of the internet is going to go back to old content and update it to be compliant. That is why browsers continue to support both deprecated and obsolete tags. —⁠ScottyWong⁠— 16:19, 7 February 2023 (UTC)[reply]
    If you follow the second link that I provided, it shows the difference between deprecated and obsolete. Most importantly, An obsolete element or attribute is one for which there is no guarantee of support by a user agent. Whilst browsers may continue to support obsolete tags, they are not obliged to, and it is the browser vendor's decision whether to continue or not. Consider: how many browsers still support the <NEXTID>, <HP1> or <KEY> tags? At some point they may decide that continued support for <font>...</font> is not worth the maintenance overhead, and will remove the code concerned, much like the MediaWiki devs occasionally retire some feature or even an entire skin. In short: don't rely on it. --Redrose64 🌹 (talk) 17:11, 7 February 2023 (UTC)[reply]
    @Scottywong: Point of order. This is a new comment, added today. Should I be warned or sanctioned for making this edit? Should the bot fix it? This is not an obscure archive that nobody's looked at in years. – wbm1058 (talk) 18:30, 7 February 2023 (UTC)[reply]
    Interestingly enough, your signature looks great despite the font tags. No one is suggesting any changes to policy that would prohibit users from using font tags; this is only about bots making millions of automated edits to make cosmetic changes to (mostly) very old discussion pages. The point is, if major browsers are planning to remove support for font tags, they are virtually guaranteed to make a public announcement to that effect well in advance of actually making that change. Continuing to support font tags requires basically no effort and no significant browser resources, and font tags are still ubiquitous on the internet (particularly on older sites), so there is no motivation for browsers to drop this support anytime soon. And even if they did, the worst case scenario fallout here on WP is that old signatures wouldn't display in the intended font. —⁠ScottyWong⁠— 18:39, 7 February 2023 (UTC)[reply]
    One editor's failure to imagine a valid worst-case scenario should not dictate whether we replace obsolete HTML. An obviously worse scenario than the one described immediately above is something like white text on a black background; if the font tag used for the text stops working, we would end up with invisible black-on-black text. – Jonesey95 (talk) 19:20, 7 February 2023 (UTC)[reply]
    I see that someone has used obsolete HTML tags 1 time on this page! A font tag! Facepalm
    Damn, look whose signature used to close their font color with a span! Too bad that bot only made a half-fix so now another bot has to make a second pass at that archive. Scottywong re: No one is suggesting any changes to policy that would prohibit users from using font tags, did anyone ask you to change the HTML in your signature or did you just decide to fix it on your own? Are you saying you could go back to using the same HTML you used back then? wbm1058 (talk) 20:17, 7 February 2023 (UTC)[reply]
    Oh, I see, you won't let another bot fix it, and apparently don't want to fix it yourself either. Only an administrator can fix those 21 font tags, if they dare. wbm1058 (talk) 20:25, 7 February 2023 (UTC)[reply]
    One editor's failure to imagine a valid worst-case scenario If you give me some time I can imagine thousands of hypothetical bad scenarios for you which are both more likely and more damaging than this one. Note: (a) this particular “worst case scenario” would cause unbelievably trivial harms (the font colors seen by readers in some signatures on historical wikipedia talk pages would revert to default black/blue instead of the author-preferred special decoration), and (b) the chance of this supposedly bad outcome is exceedingly small because the cost of maintaining this feature by browsers is trivial and any change would be very disruptive, and (c) if this supposedly bad outcome occurs, we should get significant warning to decide at that point what to do about it, including plenty of time to fix old markup if we care. –jacobolus (t) 00:15, 8 February 2023 (UTC)[reply]
    P.S. @Jonesey95 Do you have a single concrete example from any page in Wikipedia where white text expressed via the 'font' element shows up on a black background expressed using some non-obsolete method? (The old school approach was to use the now-obsolete 'bgcolor' attribute, which we can safely assume would be removed at the same time as the 'font' element from some hypothetical future browser.) We are now talking about hypothetical markup being hypothetically obsoleted by hypothetical future browser changes. It’s too much of a pure thought experiment for me to keep up. –jacobolus (t) 01:34, 8 February 2023 (UTC)[reply]
    I'm beginning to lose my assumption of good faith. Perhaps your insource searching skills are not up to par. This signature (Thekillerpenguin, white text created with a font tag on a black background created with a span tag) took all of thirty seconds to find. – Jonesey95 (talk) 06:29, 8 February 2023 (UTC)[reply]
    Cool. If you want to get the bot to identify all comments with white text on black background and go fix those, that sounds fine. There can’t be tens of millions of those pages. (Even better, consider just switching them to black on white for basic accessibility.) –jacobolus (t) 07:53, 8 February 2023 (UTC)[reply]
    your insource searching skills are not up to par – P.S. this is unbelievably patronizing. So much for good faith. –jacobolus (t) 07:56, 8 February 2023 (UTC)[reply]
    As I mentioned elsewhere, the following is just a side note, since the argument regarding browsers continuing to support the font element doesn't depend on what markup exists on Wikipedia pages. All the same, there are users whose signatures use appropriate markup with a style attribute that specifies CSS property values for background colour and text colour, such as Blaze Wolf's. isaacl (talk) 01:48, 8 February 2023 (UTC)[reply]
  • Support & Oppose (I don't really understand which is which): For sure, a HTML tag is egregiously invalid the moment a mainstream browser drops support for it, however I don't think that it is egregiously invalid when all mainstream browsers do support the functionality of the tag. Terasail[✉️] 16:40, 7 February 2023 (UTC)[reply]
  • Support the bot's interpretation of policy and also Object to the question wording. If you don't like the edits, then you gotta change the policy, because IMO the policy as currently written pretty clearly justifies what the bot is doing. Loki (talk) 17:33, 7 February 2023 (UTC)[reply]
  • Support bot changes per Redrose64's timeline above. 25 years of "you shouldn't use this" and 8+ years of "no, really, don't use this" is going to take a while to clean up, and it's better to do it now, while it's still mostly-supported, than after browsers start dropping support. (You know, it would be nice if we could have multiple levels of bot. I probably don't want to see AnomieBOT or MalnadachBot, but Citation bot might need a check to make sure it parsed the data correctly.) --130.111.39.47 (talk) 18:00, 7 February 2023 (UTC)[reply]
  • Procedural close. Lots of inaccuracies in the RfC statement, suggest workshopping this in Idea lab. If you want to ask about halting edits that solely replace font tags, make that clear, because there are other bots that solely replace font tags as well, such as the recently approved SheepLinterBot and the pending request for Legobot 41 which has the exact same scope as MalnadachBot 12. It should also be made clear that bundling font tag replacement while fixing other Lint errors is fine. MalnadachBot 12 covers a broad variety of Lint errors that do not involve font tags, though admittedly font tags constitute the largest and most visible group. About the font tag itself, every single page that has a font tag is flagged by MediaWiki in its "Page information" and other maintenance pages. Replacement is supported by MediaWiki developers who decide the future path of MediaWiki. This kind of replacement has been going on for years both by bot and manually, reducing the Lint error count from over 23 million to 7.7 million currently. While replacing font tags in linter cat 2 does not change the way a page is rendered on desktop currently, that doesn't mean it will continue to be so in future, since it is already marked obsolete and browsers can drop support at any time without notice. Mobile wikipedia for instance already does not display <tt>...</tt> which is classed the same as the font tag. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 18:34, 7 February 2023 (UTC)[reply]
  • I Support MalnadachBot's WP:BAG approved actions. I Support the Procedural close of this argument due to unclear actions for unsupported end goals. I Object to the counterintuitive wording of which side 'support' and 'oppose' mean in this RFC. I Object to the "don't edit obsolete tags because they aren't currently visually broken" argument presented in this RFC.

    Are you arguing that (only) MalnadachBot should be stopped? Are you arguing that all bot actions to correct minor errors that affect pages should stop? What is it that you are actually suggesting? That we all wait until the shit hits the fan when support for all these obsolete tags gets yanked and the breakage becomes visible, THEN fix the millions of issues?

    Today there are 7.7 million problematic lint errors on Wikipedia; Missing end tags, stripped tags, etc. with varying levels of impact and breakage. 4.9 million of these 7.7 million are obsolete tags, and thousands of these obsolete font tags are already currently broken with tidy font bug errors (old code being handled and displayed differently or flat out ignored across different browsers). And as Malnadach Konkno stated, the obsolete <tt>...</tt> tags have already been dropped by mobile browsers. While I don't know at the moment how many of the 4.9mil are TT, the number is not small.

    When (not IF, WHEN) browsers decide to stop supporting more of these various obsolete tags, Wikipedia talk pages will all be up shit's creek with problems from the dropped support. The MalnadachBot, and multiple other sanctioned bots (all fixing the specified errors identified as being problematic), are methodically and carefully built to address these issues effectively and efficiently with thought and care given to minimize impact to users.

    The MalnadachBot recently took a number of months off to address an issue where, in some cases, the bot made multiple smaller edits to a page it wasn't able to fully fix, instead of clearing all issues in one go. Those issues with the bot were addressed and the bot was brought back online last week. The current version skips any page whose errors it cannot clear completely, leaving it for human editors to gnome through, correct, and clear the remaining issues.

    If you hate seeing bot edits appearing in your recent edits feed, by all means, go to your settings and activate the "Hide bot edits from the watchlist" option in the "Changes shown" section to remove bot edits from your view. But this "don't edit obsolete tags because they aren't visually broken" argument has been argued before, and the need to address this impending problem before it hits the fan far outweighs the currently minor inconvenience to users. Zinnober9 (talk) 22:42, 7 February 2023 (UTC)[reply]

    Small clarification, tt is still fully supported by major mobile browsers (AFAIK), it's just not styled in the mobile Wikipedia skin (IIRC this is because it doesn't restore the styles after its reset). Legoktm (talk) 08:10, 8 February 2023 (UTC)[reply]
  • Support per Zinnober9's comment above — TheresNoTime (talk • they/them) 22:48, 7 February 2023 (UTC)[reply]
  • Support OK with keeping Wikipedia up to standard. Please co-ordinate with ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ on future RfCs. -- GreenC 22:59, 7 February 2023 (UTC)[reply]
  • Support per Zinnober9's statement above. Once the lint error backlog has been cleared, editors won't be bothered by "clog[ged] up ... watchlists and revision histories" anymore. —Bruce1eetalk 23:26, 7 February 2023 (UTC)[reply]
  • Support The font tag is deprecated, and so must still work, but we probably should remove it. I do understand people complaining about useless Bot edits cluttering their watch lists - precisely the situation that WP:COSMETICBOT was written to address. I realise I can suppress them, but that is a drastic step. I would like to think that getting rid of the backlog would stanch the flow of lint errors, but regard that as naive... And I deplore the use of CSS, which is itself a major headache for Wikipedia. Hawkeye7 (discuss) 00:46, 8 February 2023 (UTC)[reply]
  • Support Nothing wrong with maintaining coding standards.- Mnair69 (talk) 01:23, 8 February 2023 (UTC)[reply]
  • Support There are teams of engineers working to ensure that I can correctly read all of the Dumb Markup people try to use on the Internet. People who rely on syntax highlighters or screen readers? Not so much. Fixing this markup that is by definition broken helps make Wikipedia more accessible. Very few people will look at some ancient AfD. But we keep those archives around for a reason. If we wanted to make it a privilege to view them, we would delete them. They should be viewable by all, not just people with a modern computer and browser and full function of their eyes. In sum, just because it looks fine to you does not mean it looks fine to everyone. You have the option to hide the bot from your watchlist. You may not want to, but you have the option. There is no preferences menu in real life to disable blindness. HouseBlastertalk 01:51, 8 February 2023 (UTC)[reply]
    Do you have evidence that syntax highlighters or screen readers have a difficulty parsing the 'font' element? This seems like a hypothetical concern. –jacobolus (t) 02:28, 8 February 2023 (UTC)[reply]
  • Support per Zinnober9's and Hawkeye7's statements above. --Afernand74 (talk) 07:24, 8 February 2023 (UTC)[reply]
  • Procedural support Unfortunately this RfC has been structured as a binary question (support or oppose) when it has the opportunity for a lot of nuance and middle ground. I generally think cleaning up these tags is a good idea, and want to see the work continue (hence my support). I am not a fan of how it has been done in the past, I think the fact that a bot racked up 10 million edits by often editing the same pages over and over again shows a serious failure in bot oversight. This is largely why I've proposed a Legobot task for it, because I think it can be done in a better way that disrupts editors as minimally as possible. I think the focus on browsers is misplaced, I fully agree that browsers will never drop support for font and whatever, but there are plenty of tools, scripts, etc. that process our wikitext and HTML and will benefit from this cleanup. I've also written an essay for my overall vision of Linter. Legoktm (talk) 07:50, 8 February 2023 (UTC)[reply]
    @Legoktm 6.8 million of MalnadachBot's 10.7 million is from Task 13 to blank inactive IP talkpages and further 1.1 million is from Tasks 1-11 fixing mainly high and medium priority Lint errors. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 08:44, 8 February 2023 (UTC)[reply]
  • Oppose outside of the article/template space, especially in signatures, which seems to be most of what it's doing. The use of font instead of span tags is sometimes the only thing keeping a signature at a decent character count to avoid source clutter. There's no good reason for a browser to stop supporting font, and if one did, a polyfill could easily be built into the Mediawiki software, rather than having a bot make thousands of edits. small jars tc 08:17, 8 February 2023 (UTC)[reply]
  • Support per Zinnober9's comments. As noted above several times, there are ways for editors to hide this specific bot's edits OR to hide only lint-error fixes. This RfC seems more of a "if I don't like it, then it should be stopped" and not anything of substance. This is also a non-issue in my opinion as the actions of this and other bots are going to either stop or be dramatically less in the not-so-distant future. --Gonnym (talk) 08:53, 8 February 2023 (UTC)[reply]
    Also, oppose closing this RfC and starting it again. This isn't the first disscussion the editor was in concerning this bot. Trying to find the correct language that will get them the result they want should be discouraged. Gonnym (talk) 08:55, 8 February 2023 (UTC)[reply]
  • Support per Zinnober9 more or less. This is not egregiously invalid HTML, but that's a red herring. The task is fixing Linter errors and bringing pages in line with modern standards, and that's a valuable task. Zinnober9 expresses my full stance on this better than I could. Headbomb {t · c · p · b} 12:01, 8 February 2023 (UTC)[reply]
  • Oppose bot edits trying to "fix" things with no constructive effect, and there's a problem over at the BOT page if such bots are getting rubberstamped. Alsee (talk) 13:53, 8 February 2023 (UTC)[reply]
  • Support everything Zinnober9 said. Cleaning up obsolete code used on Wikipedia constitutes essential maintenance. Not everybody uses a major browser (people already brought up the example of accessibility-oriented web browsers and software, for which there's no guarantee they'll support these junk tags). And to oppose what these bots do is to place an undue burden on MediaWiki maintainers for zero benefit. Why are we singling out these edits, when the vast majority of Wikipedia edits nowadays are Wikignoming of one form or another? Your watchlist page isn't supposed to be clean, but our articles must be. DFlhb (talk) 14:10, 8 February 2023 (UTC)[reply]
  • Support largely per Zinnober9. I doubt we'll see loss of support from a major browser any time remotely soon, but the tags are obsolete, and as Legoktm has pointed out there is benefit now from bringing our codebase into compliance with modern standards. There is a measure of irony that this adverse reaction occurred precisely because the bot was designed conservatively so no one would feel the need to review its edits. Ultimately could have been better thought out but I have no concerns over the present revised method of operation. 74.73.224.126 (talk) 14:54, 8 February 2023 (UTC)[reply]
  • Oppose. Even in the event that some future browser does not support the FONT tag, the impact won't be pages that crash or horrible misrendering; the opening and closing tags will just be ignored, and in the vast majority of cases the only change will be the loss of some sort of emphasis effect. (And with my grumpy old man hat on: if our not eliminating old tags encourages browsers to keep supporting them, all the better. We shouldn't want the vast number of legacy pages on the Web to become inaccessible to most viewers... and the ongoing drive to make HTML something that is not comprehensible by humans and creatable by hand is a sad one.) --Nat Gertler (talk) 19:50, 8 February 2023 (UTC)[reply]
    We do not know that the opening and closing tags will just be ignored. It could make the text invisible by making it the same as background colour. Page crashing is the worst case scenario. What happens in future is just speculation, what we do know is that it won't be the same as it is now, which is why they are being replaced. When Wikipedia switched from Tidy to Remex in July 2018, it broke a lot of pages which you can read here. People had said at the time that COSMETICBOT does not apply for pages that look fine now but would break in future. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 02:32, 9 February 2023 (UTC)[reply]
    It is implausible that browsers will ever remove support for 'font' tags at all. But if they do, it is vanishingly unlikely that the result will be crashing pages. The speculation on which these bot tasks are being based is completely detached from reality. –jacobolus (t) 02:46, 9 February 2023 (UTC)[reply]
    I said it is the worst case scenario. These bot tasks aren't based on speculation; they are based on the future path indicated by MediaWiki developers. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 03:33, 9 February 2023 (UTC)[reply]
    future path indicated by MediaWiki developers – I have asked several times for a link to this, and none has been forthcoming. Where is this “future path”, who made it up on what authority, what were their criteria, and where was the discussion? –jacobolus (t) 04:27, 9 February 2023 (UTC)[reply]
    I am not aware of any MediaWiki developer ever stating that support for these tags will be removed (rather I've tried to say the opposite). Legoktm (talk) 04:35, 9 February 2023 (UTC)[reply]
    Actually, we do know what will happen with a browser that does not support recognition of the FONT tag, because browsers facing unrecognized tags is not something new. Browsers have been faced with unrecognized tags since the Web began (thanks to those of us who were handcoding and made the occasional typo). What they do with a tag that they don't recognize is to ignore the tag, simple as that. So the alteration caused by the opening FONT tag and the return to prior status caused by the closing tag will go undone. It is bizarre to think that programmers are going to have text between unrecognized tags undergo some special form of destruction. --Nat Gertler (talk) 23:17, 11 February 2023 (UTC)[reply]
    What they do with a tag that they don't recognize is to ignore the tag, simple as that. They don't actually. Take for example <blink>...</blink>, an HTML tag no longer supported by major web browsers, or <acronym>...</acronym>, which is obsolete in HTML5 and supported by major web browsers but not by MediaWiki. When used on Wikipedia, the tags themselves render as plaintext. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 08:29, 13 February 2023 (UTC)[reply]
    Note wikitext only supports a subset of HTML5 elements, thus other tags will be transformed by the wikitext parser so they are displayed as-is. To facilitate incremental deployment of new elements, as per the HTML5 spec, browsers "must treat elements and attributes that they do not understand as semantically neutral; leaving them in the DOM (for DOM processors), and styling them according to CSS (for CSS processors), but not inferring any meaning from them." Thus the contents of unknown elements are rendered, without any special default styling. isaacl (talk) 16:17, 13 February 2023 (UTC)[reply]
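    As a side note on the tag-agnostic behaviour quoted above, it can be illustrated with Python's standard-library html.parser, which (like a spec-conforming browser's tokenizer) handles element names it has no special treatment for generically and still surfaces their contents. This is only an illustrative sketch of "semantically neutral" handling, not a claim about how any particular browser is implemented:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collects character data; tags the parser does not 'understand'
    are simply passed over, leaving their contents in place."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def visible_text(markup):
    parser = TextExtractor()
    parser.feed(markup)
    return "".join(parser.chunks)

# Both a long-obsolete tag and a made-up one degrade the same way:
# the styling/semantics are lost, but the text survives.
print(visible_text('<font color="red">warning</font>'))  # warning
print(visible_text('<blink>do not blink</blink>'))       # do not blink
```

    The takeaway matches the discussion above: content between unknown tags is rendered; only the tag's effect is dropped.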
  • (edit conflict) Support. And per Zinnober9. Shutting down a linter bot when it's updating deprecated/obsolete code like this? Seriously? The obsolete tags are not only deprecated/obsolete, but they also create linter errors that we treat as a backlog. Halting this bot, while it's fixing all this obsolete code, would mean it takes longer to get all of it done. On the day that this obsolete code is dropped from support, if we have shut down this bot we may have more instances of these tags appearing as plain text. Sure, if you want to hide MalnadachBot from your watchlist you can go to WP:HIDEBOTS. Simple. Why complain about this bot when it is limited to one edit per page for fixing font tags? Sheep (talkhe/him) 02:35, 9 February 2023 (UTC)[reply]
    The reasons I complain about a bot changing font tags to span tags on old talk pages are: it probably will never need to be fixed; if it does need to be fixed, we'll get plenty of advance warning; we don't have a rule against creating new 'errors' (people can still use the font tag, there have been new font tags added in this very discussion), so we're making new 'errors' for the bot to run around and 'fix'; and, I'm no expert in these things, but I bet there are ways to 'fix' these 'errors' that don't require a bot to make millions of edits. Levivich (talk) 03:26, 9 February 2023 (UTC)[reply]
    if it does need to be fixed, we'll get plenty of advance warning we already have that warning by them being marked as Lint errors. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 03:35, 9 February 2023 (UTC)[reply]
    Being marked as Lint errors is not a warning that browsers will drop support for font tags. mw:Help:Lint errors/obsolete-tag says Since it is unclear to us at this time how far we want to push this goal of HTML5 compliance, this category is marked low priority ... If, in the future, there is greater clarity about pursuing this more aggressively, we will reflect that by updating the severity of this linter issue appropriately ... It is likely that browser vendors will give us significant notice before making any breaking changes given how prevalent these deprecated elements are used across the internet. Correct me if I'm wrong, but I don't think any browser vendors have even indicated plans to drop support for the font tag, and it's been like 15 years since HTML 5's initial release. Levivich (talk) 03:54, 9 February 2023 (UTC)[reply]
    I'm not quite sure I follow the logic here. Part of your argument seems to be predicated on the idea that the bot's task will never be complete. But this is also true of many other bots. Should we disallow archive bots because people can make new threads that will later need to be archived? Perhaps you could clarify? 74.73.224.126 (talk) 03:59, 9 February 2023 (UTC)[reply]
    That's a bit of apples and oranges. Some bot tasks are perennial, like archiving talk page threads, but we want new talk page threads to be created. If we don't want font tags, we should first have a rule against using font tags. We could have the software automatically turn any font tags into span tags at the time the edit is published. We could have an edit filter that prevents edits with font tags from being published. We could just add it to WP:SIG policy at the very least. Then, once we've stopped making new 'errors' (if that's how we're treating them), it makes sense to fix the old ones (which, still, I don't think needs a bot; the software could replace the font tag with a span tag when the wikitext is turned into HTML, like it does with wikimarkup, and there's probably a global search and replace that could fix the wikitext). If browsers were ever going to stop rendering font tags, they'd announce that in advance, and that's the time to start looking at a patch to MediaWiki, or a bot, or a policy, or whatever, to handle it. But so long as we allow the use of font tags, we should not bother with a bot that cleans them up; it's just making edits that don't need to be made, and millions of them. Levivich (talk) 04:06, 9 February 2023 (UTC)[reply]
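    For illustration, the kind of parse-time mapping described above (turning font tags into span tags when wikitext is rendered) could look roughly like the sketch below. The attribute coverage and the size-to-CSS table are assumptions made for the example; a production fix, whether in MediaWiki or a bot, would need to handle far more attribute combinations, quoting styles, and nesting than this:

```python
import re

# Legacy <font size="N"> values mapped to CSS keywords; this table is
# a rough approximation for the sketch, not an official equivalence.
SIZE_MAP = {
    "1": "x-small", "2": "small", "3": "medium", "4": "large",
    "5": "x-large", "6": "xx-large", "7": "xxx-large",
}

def font_to_span(markup):
    """Rewrite simple <font> tags as <span style="..."> equivalents."""
    def repl(match):
        attrs = match.group(1)
        styles = []
        color = re.search(r'color\s*=\s*"?([#\w]+)"?', attrs, re.I)
        if color:
            styles.append(f"color: {color.group(1)};")
        size = re.search(r'size\s*=\s*"?([1-7])"?', attrs, re.I)
        if size:
            styles.append(f"font-size: {SIZE_MAP[size.group(1)]};")
        face = re.search(r'face\s*=\s*"([^"]+)"', attrs, re.I)
        if face:
            styles.append(f"font-family: {face.group(1)};")
        return '<span style="{}">'.format(" ".join(styles))

    out = re.sub(r"<font\b([^>]*)>", repl, markup, flags=re.I)
    return re.sub(r"</font\s*>", "</span>", out, flags=re.I)
```

    For example, font_to_span('<font color="red">hi</font>') yields '<span style="color: red;">hi</span>'. Doing this once at render time would avoid recording millions of wikitext edits.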
    Some low-hanging fruit might be having mw:Extension:Linter check attempts to change one's signature in Special:Preferences, and display a notice or even prevent saving when a signature has a linter error. Might be worth a Phab ticket. –Novem Linguae (talk) 04:16, 9 February 2023 (UTC)[reply]
    That was already discussed and implemented (partially), see New requirements for user signatures. Legoktm (talk) 04:37, 9 February 2023 (UTC)[reply]
    In which it was decided not to ban font tags in signatures. Levivich (talk) 04:42, 9 February 2023 (UTC)[reply]
    The WMF decided in 2020 to defer disallowing font tags in signatures and to leave it up to individual wikis. We can still decide locally to disallow font and other obsolete tags in signatures. – Jonesey95 (talk) 06:36, 9 February 2023 (UTC)[reply]
    I'm still not sure I follow. There are many things that while not prohibited are still undesirable and are fixed by bots, the canonical example being the transclusion of subst-only templates; even orphaning references is nowhere explicitly prohibited (and it is almost always done accidentally), doesn't mean they shouldn't be fixed by a bot. The fact that there are periodic additions to some maintenance commitment should not in general cause that commitment to be neglected, quite the opposite actually. 74.73.224.126 (talk) 04:50, 9 February 2023 (UTC)[reply]
    What we're talking about is a WP:COSMETIC edit, whereas all of your examples (archiving threads, substing templates, fixing orphaned references) are non-cosmetic edits. The words "maintenance commitment" do not apply, because changing font tags to span tags is neither maintenance nor a commitment. Levivich (talk) 04:59, 9 February 2023 (UTC)[reply]
    What I'm gathering then is that your objection is not at all predicated on the fact that new linter errors are periodically added, as was my earlier understanding, but solely on your belief that the cost of fixing these errors outweighs the benefit in aggregate, is that correct? 74.73.224.126 (talk) 05:07, 9 February 2023 (UTC)[reply]
  • Support. I think someone said that they didn't want to hide bot edits from their watchlist because there are some bot edits that they want to see on their watchlists. Perhaps the solution is to ask the bot operators to stop marking as "bot edit" their bots' edits that should be seen on watchlists. Some of my RMCD bot edits are not marked as "bot edits" for this reason. See the relevant bots noticeboard discussion. – wbm1058 (talk) 12:55, 10 February 2023 (UTC)[reply]
    Also note that you can hide specific bots and not all bots through the guidance found at WP:HIDEBOTS. Headbomb {t · c · p · b} 13:18, 10 February 2023 (UTC)[reply]
  • Comment Neutral on this particular question, but I feel like if these tasks are endorsed and continue, maybe we should look at unbanning Magioladitis, Betacommand, and Rich Farmborough. Surely we can figure out how to hide their edits too. Folly Mox (talk) 01:16, 11 February 2023 (UTC)[reply]
    Magioladitis isn't banned. Betacommand was banned for far more than making low-value automated edits. Even Rich Farmborough, for whom your point comes the closest to applying, was indeffed not for that per se but for stat[ing] he will not be complying with editing restrictions, a social issue that goes well beyond what is being discussed here. * Pppery * it has begun... 02:19, 11 February 2023 (UTC)[reply]
  • Oppose I'm not seeing any convincing argument for what benefit these edits grant, and why the claimed harm of Wikipedia talk pages will all be up shit's creek with problems from the dropped support. can't be addressed when it happens, not some unknown time in the future. The harm this is causing, of course, is happening now both in the existence of this discussion, and the fact that people have repeatedly felt the impetus to start it or discussions like it.
    This discussion is turning into yet another Wikipedia:parable of the wildflowers. We are acting out the second sentence of the second-to-last paragraph. * Pppery * it has begun... 02:19, 11 February 2023 (UTC)[reply]
    Even more than this wildflower parable, I am reminded of the parable in Arnold Lobel’s children’s book Grasshopper on the Road, in which a housefly, having finished sweeping her own house and washing her windows, has decided to sweep the (dirt) road because it is too dusty:
    Lobel, Arnold (1978). "The Sweeper". Grasshopper on the Road. pp. 25–33. 'No, no, no,' said the housefly. 'I will never rest. I am having a wonderful time. I will sweep until the whole world is clean, clean, clean!'
    jacobolus (t) 02:25, 12 February 2023 (UTC)[reply]
  • Support per excellent reasoning by Zinnober9. I can see only positives and no negatives from this task for readers. Per HouseBlaster, the focus on "major web browsers" is wrongheaded in my opinion. 2409:408C:8E8B:F446:0:0:43C8:1313 (talk) 08:38, 16 February 2023 (UTC)[reply]
  • Honestly, I think I sort of agree with comments like jacobolus's. I don't personally see the value in replacing lots of font tags on talk pages, especially so while major browsers continue to support it. And if some dropped support, it might be better to see something built into MediaWiki to address this (though I appreciate a software mapping of HTML to different HTML would be quite awkward and non-transparent, so I don't hold this view very strongly). In any case, I don't believe the loss of font and/or colour in signatures is a big deal. Many non-updated pages no longer look like how they did when the editors of the time wrote their comments, for example due to template merges/changes, orphanings, etc. Which is fine. ProcrastinatingReader (talk) 12:07, 16 February 2023 (UTC)[reply]
  • I lean support on this, but not strongly. The conversation seems to indicate that not all lint issues are going to cause trouble, but some might and may as well fix others at the same time. Regarding watchlist spam, I find that the focus on talkpage and archives alleviates this compared to concern I would have if the notifications were edits to articles. CMD (talk) 13:42, 16 February 2023 (UTC)[reply]
  • Oppose. Font tags are not "egregiously invalid" and it's absurd to suggest otherwise. Changing them is meaningless make-work. As has been noted multiple times, the chance that the font tag will cease to be supported is virtually nil due to its extensive history. The actions of these bots in mass-editing pages, tens of millions of time for no measurable benefit are hugely disruptive, and even pointless when there are other, smarter ways to deal with legacy tags (cf Levivich's comment above). Stick to fixing actual breakages like mismatched tags.  — Scott talk 01:39, 18 February 2023 (UTC)[reply]
  • Irrelevant Even if you did manage to get consensus for this interpretation, it would not directly affect the bot approvals. Consensus can, as always, create exceptions for particular cosmetic edits and an approved BRFA reflects a consensus. So even if the deprecated tag replacements were to be considered cosmetic, you'd have to see whether it has also been established (here or elsewhere) that consensus has changed regarding the approvals. Asking the wrong question (and poorly phrasing it) then tacking on "votes will be interpreted as such-and-such" is not likely to help your case. Anomie 14:01, 18 February 2023 (UTC)[reply]
    Where was the consensus established for replacing all of the font tags with spans? There has been no evidence given for such a consensus, despite annoyed editors asking for it over and over again, and a very obvious lack of consensus judged by the repeated complaints. This whole conversation boils down to “Hey stop, this is annoyingly disruptive!” “We can’t stop. Everyone agreed about this already.” “Who agreed, and when? I never agreed!” “By ‘Everyone’ I mean a small cabal who decided between ourselves. The agreement is a secret; we can’t show you. But it’s settled and can’t be changed now so stop complaining.” (Repeat ad nauseam.) –jacobolus (t) 16:11, 18 February 2023 (UTC)[reply]
    Consensus for a bot to run is established on the appropriate subpage of WP:BRFA, for example Wikipedia:Bots/Requests for approval/MalnadachBot 12. When you and others recently tried to challenge that approval at WP:BOTN, you did not attain WP:CCC. So now you're here with this RFC in hopes that a wider discussion will establish WP:CCC, as is your right. We'll see if it goes your way, but IMO the poor framing of the question is not likely to help your case. Anomie 16:39, 18 February 2023 (UTC)[reply]
    #12 was speedily approved. It was open for less than 4 hours and the only person who participated besides the bot op is the BAG member who approved it. Is that really what you want to describe as "consensus"? Levivich (talk) 16:45, 18 February 2023 (UTC)[reply]
    It appears to be based on a good track record with #1 through #11, so more than just what's directly on that page. And, whether you like it or not, speedy approvals are currently part of the WP:Bot policy. If the community wants to remove or restrict that, the normal processes should be followed. Anomie 16:54, 18 February 2023 (UTC)[reply]
    1-11 weren't about changing font tags to span tags; they were all about something a little bit, or a lot, different, and those also didn't have community approval. There is no shame in admitting that this is the first time the community at large has ever been consulted about whether or not a bot should change font tags to span tags. There is no basis for suggesting that doing so already has community consensus. I agree with you, though, that this RFC wasn't drafted well enough to really test the question, unfortunately. There will probably be another one later. Levivich (talk) 16:58, 18 February 2023 (UTC)[reply]
    All except #4 were about fixing linter errors, which is what #12 was approved for generally. The font tags are a subset of the linter errors. So far you and others have failed to establish that only some linter errors should be fixed but not others. There will probably be another one later. I'm sure there will be, ad nauseam. Anomie 17:09, 18 February 2023 (UTC)[reply]
    That's a bit of rhetorical sleight-of-hand. "Linter errors" is a broad category that includes many different things, from serious to not serious. So far you and others have failed to establish that all linter errors should be fixed. mw:Help:Extension:Linter says, emphasis mine: Not all of them need to be fixed promptly or even ever (depending on your tolerance for lint). Levivich (talk) 17:24, 18 February 2023 (UTC)[reply]
  • Support per others. Izno (talk) 23:15, 21 February 2023 (UTC)[reply]
  • Support. While I can't imagine browsers dropping support for the font element, on reflection, I think sadly it's realistic to consider a future in which getting our content to users requires feeding it through some kind of sandboxed or app environment. No reason to make that process more difficult by leaving in place markup that's been obsolete since Netscape was a viable property, and has perfectly viable alternatives. Choess (talk) 16:43, 25 February 2023 (UTC)[reply]
    I think sadly it's realistic to consider a future in which getting our content to users requires feeding it through some kind of sandboxed or app environment could you explain what you mean by that, and why developing such a sandbox or app will be made notably harder by the existence of font tags? For context, the mobile website and app already do transforms on the HTML of a page that are significantly more complicated than replacing <font> with <span> * Pppery * it has begun... 16:49, 25 February 2023 (UTC)[reply]
  • Support the bot's actions. Better to fix all the errors now, than wait until a browsers decides to stop recognizing the invalid tags and suddenly we have to get something-million invalid tags fixed in a handful of months. --SilverTiger12 (talk) 19:46, 25 February 2023 (UTC)[reply]
  • Support the bot's actions, per pretty much all the above.  — SMcCandlish ¢ 😼  09:31, 26 February 2023 (UTC)[reply]

Secondary question: should WP:COSMETICBOT be updated?

Please indicate below if you believe that WP:COSMETICBOT should be updated to be more specific about automated fixing of deprecated HTML tags that are supported by all major browsers. I will propose an updated wording, but please feel free to propose your own if you like. My proposed rewording is: "egregiously invalid HTML such as unclosed tags, even if it does not affect browsers' display or is fixed before output by RemexHtml (e.g. changing <sup>...</sub> to <sup>...</sup>). This does not include fixing properly formed HTML tags that have been deprecated, as long as the tags are still supported by all major web browsers."
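To make the distinction in the proposed wording concrete: the "egregiously invalid" case (mismatched open/close names such as <sup>...</sub>) is mechanically detectable with a simple stack check, quite unlike merely obsolete-but-well-formed tags like font. A minimal sketch follows; the tag list is just an example set chosen for illustration, not exhaustive:

```python
import re

def mismatched_pairs(markup, tags=("sup", "sub", "small", "i", "b")):
    """Return (opened, closed) pairs whose names disagree, e.g. sup/sub,
    plus (None, name) for a close with no matching open."""
    stack, problems = [], []
    pattern = r"<(/?)(%s)\b[^>]*>" % "|".join(tags)
    for match in re.finditer(pattern, markup, re.I):
        closing = match.group(1) == "/"
        name = match.group(2).lower()
        if not closing:
            stack.append(name)
        elif stack:
            opened = stack.pop()
            if opened != name:
                problems.append((opened, name))
        else:
            problems.append((None, name))
    return problems

print(mismatched_pairs("x<sup>2</sub>"))  # [('sup', 'sub')]
print(mismatched_pairs("x<sup>2</sup>"))  # []
```

A bot restricted to the proposed wording would act on the first example and leave a well-formed (if obsolete) font tag alone.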

  • Side comment HTML is neither SGML nor XML, and markup that does not explicitly close tags is in many circumstances perfectly valid HTML (it would be invalid as XHTML, but Wikipedia is not served as XHTML). One of the basic purposes of the HTML5 effort was to standardize the way browsers parse and handle such markup. There is thus a very clear and well-defined algorithm (backed by extensive conformance tests) which browsers implement to implicitly close tags where necessary. Calling this "egregiously invalid HTML" demonstrates that whoever wrote down this page had a basic lack of familiarity with HTML parsing rules. :-) Of course, it is entirely fine if any particular creator of HTML documents (say, the Wikipedia project) decides to require some stricter set of rules than the specification requires. Many non-browser tools have broken parsing algorithms that do not properly handle unclosed elements, so it can be worthwhile to carefully close every element to help them out. On the other hand, some other HTML authors/tools automatically remove optional closing tags in the pages they serve to save bandwidth. Blog post explaining some context from Remy Sharp. –jacobolus (t) 09:10, 7 February 2023 (UTC)[reply]
    You've described literally what RemexHtml does. In any case, many gadgets, tools and bots operate on the wikitext, which is a mish-mash of HTML and wikitext. Having reasonably well-formed markup is important because all of those things don't implement all of the HTML algorithms to know what it'll end up like. Legoktm (talk) 16:10, 7 February 2023 (UTC)[reply]
    Again, omitting many closing tags is perfectly “well formed” HTML. It is just not what some non-spec-conforming tools with buggy parsers expect. –jacobolus (t) 00:20, 8 February 2023 (UTC)[reply]
    It could be reworded to make the idea behind it more clear, but then again it doesn't seem to be causing any issues and we do want to avoid overly messy markup. Policy has always been created to describe practice, and in practice there's widespread tolerance for allowing bots to close unclosed tags. 74.73.224.126 (talk) 15:12, 8 February 2023 (UTC)[reply]
    I have no problem with bots closing unclosed tags, if that helps other tooling. (Browsers, human editors, and tools with spec-conforming parsers don’t care.) The part that I am calling out is the technically incorrect and comically exaggerated “egregiously invalid” language. –jacobolus (t) 19:58, 8 February 2023 (UTC)[reply]
    I agree it sounds silly. However since the verbiage is controlled by a prior RFC there's some procedural hassle (and nigh inevitable wrangling) needed to change it, and since the current wording hasn't caused any headaches or misunderstandings it probably isn't worth the trouble. I'm content to treat it as an internal term of art; much of our internal WIKISPEAK has meanings that are considerably different from or even directly opposed to normal English usage. But I understand reasonable minds may differ. 74.73.224.126 (talk) 03:46, 9 February 2023 (UTC)[reply]
  • Oppose. Bots and human editors update pages all the time to modernize their syntax. This syntax is often properly formed but out of date for some reason, sometimes because MediaWiki is removing support for otherwise valid tags. The proposed restriction would unnecessarily limit such useful edits. – Jonesey95 (talk) 14:48, 7 February 2023 (UTC)[reply]
  • Oppose: Nothing in the RfC statement provides a good reason to update the policy. And there is an argument to be made that an unsupported HTML tag isn't a properly formed HTML tag anyway, so it is somewhat redundant. Terasail[✉️] 16:32, 7 February 2023 (UTC)[reply]
  • Oppose. Deprecated HTML is not proper HTML. Also, the focus of this RfC, font tags, is not deprecated HTML; it is obsolete HTML, which by its technical HTML definition is stronger than deprecated. See Redrose's comment in the section above. All of the Lint fixing bots are replacing obsolete HTML. ಮಲ್ನಾಡಾಚ್ ಕೊಂಕ್ಣೊ (talk) 18:45, 7 February 2023 (UTC)[reply]
    Although the HTML standard indeed uses the term "obsolete" where it used to use "deprecated", there is no technical HTML definition. Instead it says that you "should not" use obsolete features. You may infer that "obsolete" is stronger than "deprecated", but it is not defined as such. [10] Hawkeye7 (discuss) 00:59, 8 February 2023 (UTC)[reply]
    The HTML5 specification says: "16.2 Non-conforming features Elements in the following list are entirely obsolete, and must not be used by authors: [...] font [...]". So it is no longer part of current "HTML" as a standard. However, the font element works as expected in every single browser from the past ~25 years, and will continue to be supported at a technical level probably forever. It costs very little for browsers to maintain support for this element, and would be disruptive enough to remove that such removal is unlikely to ever occur. I would personally put money on the 'font' element still working in web browsers at the end of this century.
    It seems plenty reasonable to me to hunt down and fix any font tags that occur in main or template namespaces (ideally for semantic markup styled by CSS rather than presentational element-specific styles baked into the markup). What I object to is trying to modify the majority of all talk pages everywhere on this website because old talk signatures used to often use the 'font' element. –jacobolus (t) 01:19, 8 February 2023 (UTC)[reply]

Discussion (deprecated HTML)

Is this not mountains out of molehills? Arguing that it is cosmetic is a bit misleading, considering a browser could just drop support for a tag and then you would have a problem. I don't have any strong opinions on allowing / disallowing but from a scan of the bots noticeboard I am in the mindset that it is a lot of hot air from users who don't want to filter their watchlist or create separate saved filters on their watchlist and that they can "Stay mad". Terasail[✉️] 11:32, 7 February 2023 (UTC)[reply]

Also, the RfC is confusing and longwinded. Specific call-outs to MalnadachBot should be minimal and only for adding context to the task, since this RfC has less to do with the specific bot and more to do with the task in general. Many of these discussion pages have not been edited in 10+ years, and many of them receive nearly zero annual pageviews. – This has nothing to do with anything and is just filler ("We shouldn't edit pages with low pageviews???"). More text should add context on the task being discussed and information on how stopping this / future tasks would be an improvement. Terasail[✉️] 11:49, 7 February 2023 (UTC)[reply]
then you would have a problem – no you would not have a problem. If some hypothetical future browser (say, Mozilla Firebat 1.0 for holographic projections in the year 2050) decides to not support the font tag, then at that point some of the user signatures on Wikipedia talk page discussions from ~2005–2010 would render as standard colored text in the standard size, instead of the user-intended tiny purple or rainbow gradient or whatever. We would have "graceful degradation", and the grad student writing their PhD thesis about the evolution of Wikipedia anti-sockpuppet countermeasures would be able to continue their research just fine, just marginally less colorfully. –jacobolus (t) 12:04, 7 February 2023 (UTC)[reply]
And If some hypothetical Wikipedian who has watched talk page discussions from 2005-2010 would like to stop bot edits from appearing in their watchlists, they have many different ways to do so. You can still edit Wikipedia even with MalnadachBot's edits in your watchlist, y'know. I don't see how the bot's existence would be a problem, so I don't know what problems this RfC intends to address. 0xDeadbeef→∞ (talk to me) 12:10, 7 February 2023 (UTC)[reply]
We have policies for a reason. Just because you can ignore a bot that is violating policy doesn't mean the bot should continue doing what it's doing. This RfC is to get a community consensus on the interpretation of that policy, to determine if the bot is violating it. —⁠ScottyWong⁠— 16:14, 7 February 2023 (UTC)[reply]
I'm going to be honest. You can't just say no you would not have a problem when it is a fact that a piece of text which is intended to be coloured but isn't coloured is a problem. Terasail[✉️] 12:11, 7 February 2023 (UTC)[reply]
Could you describe a hypothetical "problem" that would be caused by a user's signature that is displayed in normal black text instead of being colored as intended? —⁠ScottyWong⁠— 16:14, 7 February 2023 (UTC)[reply]
@Scottywong Signature? Where in the two RfC questions did you mention signatures. If you wanted to target signatures then you should have specified that in the RfC question. Don't start being overly specific when the RfC is overly broad. Terasail[✉️] 16:25, 7 February 2023 (UTC)[reply]
Literally every one (of at least dozens) of this bot’s edits that I have examined was modifying a talk page signature from 2005–2012. –jacobolus (t) 00:24, 8 February 2023 (UTC)[reply]
So as a concrete example, if Terasail[✉️] had been created using a font tag instead of a span, and a hypothetical browser years in the future removed support, it would render instead as Terasail[✉️], switching from teal to the default link color (in my browser, blue). This is the “worst case harm” other editors are talking about elsewhere in this discussion. –jacobolus (t) 00:27, 8 February 2023 (UTC)[reply]
And to respond to signatures specifically. Read my previous comment which applies to all text no matter the context. a piece of text which is intended to be coloured which isn't coloured is a problem Terasail[✉️] 16:26, 7 February 2023 (UTC)[reply]
I can see that you're not discussing this in good faith, so I'll end my discussion with you here. But, if you took the time to inspect MalnadachBot's edits, you'd see that the vast majority of them are fixing HTML tags on user signatures. —⁠ScottyWong⁠— 16:40, 7 February 2023 (UTC)[reply]
I am only saying it how I see it. I have no vested interest in the outcome of a bot which I didn't know existed until today. Terasail[✉️] 16:52, 7 February 2023 (UTC)[reply]
The following is mostly a side note, since it doesn't affect the main argument that browsers are unlikely to remove support for the font element. One reason why editors specify a font colour is because they've modified the background colour (and from an accessibility perspective, it's desirable to do both at the same time to ensure sufficient contrast). Thus not using the intended font colour can affect visibility. isaacl (talk) 16:43, 7 February 2023 (UTC)[reply]
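The contrast point above can be quantified with the WCAG 2.x formulas for relative luminance and contrast ratio. The sketch below uses those standard definitions; the specific colours are hypothetical, chosen only to show how dropping just the foreground colour of a signature (while a styled background remains) can take a passing combination to a failing one:

```python
# WCAG 2.x relative luminance and contrast ratio (sRGB, 8-bit channels).
def _linear(channel):
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4

def luminance(rgb):
    r, g, b = (_linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast(fg, bg):
    lighter, darker = sorted((luminance(fg), luminance(bg)), reverse=True)
    return (lighter + 0.05) / (darker + 0.05)

white, black, dark_blue = (255, 255, 255), (0, 0, 0), (0, 0, 139)

# As authored: white text on a dark blue background passes AAA (>= 7:1).
print(contrast(white, dark_blue) > 7.0)   # True
# If the font colour is lost and the text falls back to black on the
# same dark background, it fails even the AA threshold (4.5:1).
print(contrast(black, dark_blue) < 3.0)   # True
```

This is why stripping or breaking only one half of a colour pair is the real accessibility risk, rather than losing decoration as such.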

There are a number of technical inaccuracies in the opening statement; probably it would make sense to have an opposing (supporting I guess?) view as well. E.g. edits like [11] do in fact change what the wikitext looks like, to reflect how it used to render, pre-RemexHtml. I've also written User:Legoktm/Fixing lint errors, which may be useful to copy from. Legoktm (talk) 16:03, 7 February 2023 (UTC)[reply]

I'm sure the bot is making many different types of edits. If it's making edits that change the way content is rendered, then those edits obviously wouldn't be affected by anything that is decided here, because they wouldn't be in violation of COSMETICBOT. To be clear, this RfC isn't intended to shut down MalnadachBot completely and prevent it from making a single edit in the future. It's only about the blanket authorization that it was given to make millions of edits that are seemingly in violation of bot policy. —⁠ScottyWong⁠— 16:16, 7 February 2023 (UTC)[reply]
  • Would you consider making the RFC question and opening statement more concise? I think best practice is to be extremely neutral and concise (like 1-2 sentences / a single question) at the top of the RFC, and any detailed commentary by the RFC writer can be moved to the RFC writer's !vote, which is usually the first !vote so is still at the top. An example of a single question for this RFC might be something like "MalnadachBot Task X is a bot task that replaces older HTML tags with newer HTML tags in the wikicode of millions of pages. Should MalnadachBot Task X continue running?" The "First, a bit of history behind this situation:" section is what stands out to me as being too long. –Novem Linguae (talk) 18:00, 7 February 2023 (UTC)[reply]
I'm open to that, but feel that this RfC has already been open long enough that making changes to its fundamental structure would be disruptive. —⁠ScottyWong⁠— 04:58, 8 February 2023 (UTC)[reply]
Fair enough. No worries, thanks for replying. –Novem Linguae (talk) 06:02, 8 February 2023 (UTC)[reply]
It's been what, one day? You could easily withdraw it and begin collaboratively drafting a new one with a different format (e.g. "View by X" with endorsements) that allows capturing nuance instead of two binary options. I suspect there is a lot of low hanging fruit we'd both agree on that would make things better, but it's never going to come out like this. Legoktm (talk) 07:58, 8 February 2023 (UTC)[reply]

I think it might be better to evaluate this on a case-by-case basis for each type of HTML element. Support for the font element is unlikely to be removed from browsers in the foreseeable future, in spite of its obsolete status, due to widespread use. This might be different for other obsolete elements, though. isaacl (talk) 18:49, 7 February 2023 (UTC)[reply]

Note that if the MediaWiki developers ever decide to stop generating the <font> element in their HTML output, they could still support the element in wikitext and just generate a corresponding <span> element with appropriate style properties set. Mapping the size attribute might cause some change to the display, but to be fair, since different browsers can handle it differently and (once upon a time) it was configurable within the browser, it's already not a reliable way to set a specific relative size. The relevant considerations may be different for other elements, and thus I feel looking at them individually is preferable. isaacl (talk) 16:43, 13 February 2023 (UTC)[reply]
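The font-to-span mapping described above can be sketched as follows. This is an illustrative Python sketch, not MediaWiki's actual implementation (nor MalnadachBot's); the size mapping in particular is an assumption, since legacy size values never rendered consistently across browsers.

```python
import re

# Match a simple, non-nested <font ...>...</font> element and its attributes.
# Nested font tags are deliberately not handled in this sketch.
FONT_TAG = re.compile(r'<font\s+([^>]*?)>(.*?)</font>', re.IGNORECASE | re.DOTALL)
ATTR = re.compile(r'(\w+)\s*=\s*"?([^"\s>]+)"?')

# Rough mapping of legacy size=1..7 values to CSS keywords; real-world
# rendering varied by browser, so any such mapping is an approximation.
SIZE_MAP = {'1': 'x-small', '2': 'small', '3': 'medium', '4': 'large',
            '5': 'x-large', '6': 'xx-large', '7': 'xxx-large'}

def font_to_span(html: str) -> str:
    """Rewrite obsolete <font> elements as <span> with equivalent CSS."""
    def convert(match: re.Match) -> str:
        styles = []
        for name, value in ATTR.findall(match.group(1)):
            name = name.lower()
            if name == 'color':
                styles.append(f'color: {value}')
            elif name == 'face':
                styles.append(f'font-family: {value}')
            elif name == 'size':
                styles.append(f'font-size: {SIZE_MAP.get(value, "medium")}')
        return f'<span style="{"; ".join(styles)}">{match.group(2)}</span>'
    return FONT_TAG.sub(convert, html)
```

The key point is that the wikitext-level element could keep working indefinitely, with only the generated HTML changing.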

I notified WT:BOTPOL; WT:Linter was already done. Legoktm (talk) 08:02, 8 February 2023 (UTC)[reply]

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

RFC: Occupation in infobox for localities affected by the ongoing military conflict

From the beginning of the 2022 Russian invasion of Ukraine, when the Russian troops occupied some Ukrainian territory, some users started to add the information on the occupation (in various ways) in the infobox. Other users remove this information. I have some of these localities on my watchlist, and these additions and removals are way too frequent. See just recently and just for Kreminna, which is just one locality: one two three four. This bullshit needs to stop. I do not particularly care which way it would stop but we need a clear consensus that either mentioning or not mentioning occupation in infobox is ok, and the opposite is not ok.

In principle, this RfC should apply to any ongoing conflict, but my primary interest is Ukraine. I would also exclude Crimea because it is really administered by Russia, and the arguments could be slightly different; we can return to it another time. This is really about clear-cut cases. Ymblanter (talk) 23:32, 7 February 2023 (UTC)[reply]

Occupation can be mentioned in the infobox

Please suggest how it should be mentioned.--Ymblanter (talk) 23:33, 7 February 2023 (UTC)[reply]

We have existing long-standing examples for how such information can be inputted, e.g. Katzrin, Laayoune, Afrin, Syria. CMD (talk) 04:28, 8 February 2023 (UTC)[reply]
Do the people who programmed hCard format into the infobox template know about this? It looks to me like the embedded data would get parsed out as “Afrin is divided into multiple ‘control,’ and one of the control of Afrin is ‘Turkey Hayat Tahrir al-Sham.’”  —Michael Z. 04:58, 8 February 2023 (UTC)[reply]
I posted a request for input at Template talk:Infobox settlement#Representing occupied territories.  —Michael Z. 05:02, 8 February 2023 (UTC)[reply]
It also looks to me like those examples are not consistent, judging by the three distinct grammatical schemes in the field labels of the three infoboxes. But I can’t really tell because they are hard to make sense of.  —Michael Z. 05:06, 8 February 2023 (UTC)[reply]
  • Without looking for a blanket ruling, infoboxes should contain information about occupying/administering bodies. Infoboxes are meant to provide "key facts"; the administration of any administrative entity is self-evidently a key fact. Pretty much all sources covering the occupied areas clearly state up front which areas are occupied, because the occupation is a key fact (magnitudes more so than, say, Area code or Vehicle registration). Not including occupying administrations is a disservice to readers in the best case scenario. The invocation of WP:NOTNEWS is nonsensical; editing an infobox does not make an article original reporting, a news report, a who's who, or a celebrity gossip and diary. Infoboxes are updated all the time, completely routinely. An interpretation of Notnews suggesting they shouldn't would necessitate removing information like political leaders from infoboxes, which change quite frequently and are often updated almost immediately (or even before they should be!). NOTNEWS actually advises editors to "include current and up-to-date information". The arguments about stability are similarly flawed, and unsupported by policy and practice. Wikipedia articles are not frozen when real-life disputes are, and neither are their infoboxes. (Template:2022 Russian invasion of Ukraine infobox has information much more recent than much of the Russian occupations.) As for how, a number of models already exist, such as the examples given above. We could probably forever WP:BIKESHED between them, but what is important is that one is used to ensure infoboxes meet their purpose of providing key facts to readers at a glance. CMD (talk) 04:12, 16 February 2023 (UTC)[reply]

Occupation should not be mentioned in the infobox

Put your vote here if you think that if it is mentioned in the body, and this is enough.--Ymblanter (talk) 23:38, 7 February 2023 (UTC)[reply]

Not  The infobox is for defining characteristics of the subject, fundamental statistics, and other stable information. Per WP:NOTNEWS, it is not a news ticker, and should not include ephemeral and unpredictable wartime data like who has driven into a place this morning that could well leave it by this evening (Russian sources, for example, have been notorious for announcing the occupation of a locality several times over weeks without actually occupying it).  —Michael Z. 03:00, 8 February 2023 (UTC)[reply]
Another problem with this is that the infobox is not designed to accommodate foreign intervention. The fields for country subdivisions assume a legal hierarchy, and embed a microformat encapsulating that hierarchy, which yields nonsense if the hierarchical data doesn't make sense. I think it can also automatically generate short summaries, which would be nonsensical too if there is no Wikidata description or manually entered short summary.  —Michael Z. 03:17, 8 February 2023 (UTC)[reply]
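The hierarchy problem described above can be illustrated with a small sketch. The field labels below are invented for illustration and are not the template's actual parameters or hCard class names; the point is that a consumer of the embedded data reads the subdivision rows as a strict nesting chain, which turns into nonsense as soon as an occupying power is shoehorned in as just another row.

```python
def containment_chain(rows):
    """Naively read infobox subdivision rows as a legal containment
    hierarchy, the way a microformat consumer might."""
    return ' > '.join(f'{value} ({label})' for label, value in rows)

# A well-formed hierarchy reads sensibly:
legal = [('Country', 'Ukraine'), ('Oblast', 'Luhansk Oblast'),
         ('Raion', 'Sievierodonetsk Raion')]
print(containment_chain(legal))

# Appending "Control: Russia" as one more subdivision row asserts,
# structurally, that the raion contains a subdivision called "Russia":
print(containment_chain(legal + [('Control', 'Russia')]))
```

This is why a dedicated, separately-marked field for occupation or control, rather than reuse of the subdivision fields, would keep the embedded data coherent.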
Not. (Summoned by bot) The infobox should only be used for stable information; it should summarise the information contained in the body; and should represent the WP:CONSENSUS among the RS as well as the involved editors. Trying to capture information that may go back and forth in the course of a conflict should be avoided in the infobox. -- Kautilya3 (talk) 18:39, 9 February 2023 (UTC)[reply]

Discussion (occupation in infobox)

This is not a new thing. The infobox for the city of Donetsk was “occupied” by a single-edit anon over six years ago at the latest, for example.[12] There is a clear long-term consensus to remove these sporadic interventions, as we can see by the current state of Donetsk. —Michael Z. 03:05, 8 February 2023 (UTC)[reply]

This is good, but let us complete this RfC to make the consensus explicit. Ymblanter (talk) 03:28, 8 February 2023 (UTC)[reply]
No argument from me.  —Michael Z. 03:51, 8 February 2023 (UTC)[reply]
  • So, I think there is some nuance to the matter, and I don't want to vote definitively either way. I think that this is a case-by-case basis; which doesn't mean I don't think there need to be some guiding principles, but I DON'T think a one-size-fits-all approach is the way to go here. For some kinds of occupied territory, which is under a sort of long-term dispute, where one country claims as its territory an integral part, but which another has been administering in a de facto way for a long period of time, it may be useful to describe it as such. The Golan Heights for example is widely recognized as Syrian territory that is under Israeli occupation (without wading into the debate over whether that is true or not, just that it is commonly described as such); but that's still a stable state. The two countries are not actively fighting over the land, it's basically been that way for decades, and it isn't an ephemeral thing. It's basically a standard state of affairs. If we have an active warzone, where two armies are fighting back and forth to control some bit of territory, then no, we should not be updating the infobox every few days to reflect the changing battle lines. The infobox is poorly suited for that sort of thing. Leave it out in those cases. --Jayron32 17:26, 9 February 2023 (UTC)[reply]
    This makes sense to me. It's impractical to update infoboxes to reflect shifting battle lines in a hot conflict, but when it's a long-established dispute and WP:RS describe the situation as occupied/disputed territory, then it would make sense to include it in the infobox. How that's handled with the hCard format is something that would need some thought and attention to handle the sorts of issues Michael Z pointed out. —Carter (Tcr25) (talk) 17:49, 9 February 2023 (UTC)[reply]
    If we are to document stable occupations or land claims in the infobox, it would probably make sense to agree on what it should look like and how it is to be used consistently, and then add a separate row for this to the infobox template.
    None of the example solutions are clear to me, and I can’t imagine they could possibly be to someone new to the subject and glancing at the infobox. They all fail basic communications principles and do not serve WP:INFOBOXPURPOSE IMO.  —Michael Z. 18:28, 9 February 2023 (UTC)[reply]
    Look at Navassa Island. It's an "easier" case since it's not inhabited and can be easily labeled a "disputed island," but it's still using subdivision_type3 and subdivision_type4 for Haiti's claim, which is problematic. {{Infobox island}} has a disputed option that might be a way to handle things, but I'm not sure that it has an effect on the hCard data. —Carter (Tcr25) (talk) 23:04, 9 February 2023 (UTC)[reply]
    Regarding hCard, I'm not much concerned that occupied status is absent from it, but we shouldn't abuse the template in such a way that the hCard data is bad. This is important because the effect is not visible, so most editors will have no way of knowing, and because the data is meant for re-use, we have no way to predict what applications someone else has built that we would be breaking, or in what way.
    So there should probably be zero tolerance for shoehorning stuff like this into those fields. If that can’t be accomplished, then fields specifically for this must be added to make sure it’s done right. Anyway, probably a discussion for the template’s talk.
    At least equally important, of course, is that the infobox is perfectly clear to readers. Even in that simple example, there's no visual hierarchy (I guess there are two hierarchies with no divider), and the headings are different kinds of things. It's effectively nonsense that someone has to guess at what it means. It would be easier to decipher a long sentence, but that doesn't belong in an infobox either.  —Michael Z. 02:07, 10 February 2023 (UTC)[reply]

Make Wikipedia:Requested moves/Closing instructions a guideline


Should WP:RMCI be formally elevated to the official status of a WP:GUIDELINE? 23:15, 22 February 2023 (UTC)

Initial statement, collapsed 18:05, 26 February 2023 (UTC) to comply with WP:RFCNEUTRAL
The following discussion has been closed. Please do not modify it.

Greetings,

There has existed for eighteen years a set of instructions for how to formally close a requested move. It was initially titled Wikipedia:Moving guidelines for administrators, later changed to Wikipedia:Requested moves/Closing instructions (hereafter WP:RMCI). Over the years, it has grown from a simple set of how-to steps to a comprehensive list of guidelines for determining consensus (which began as early as 2006 and has continued to expand through the years). It has also included instructions on who should close requests--first in 2009, when the authority to move pages was explicitly granted to non-administrators, then later in 2019 (following lots of discussion), when rules were introduced barring involved editors from closing requests.

The instructional page wasn't classified as anything (guideline, policy, or essay) through 2021, even though it had been originally written as a guideline. Following a post on the talk page that saw no response, it was listed as an explanatory essay in 2021.

Recently, at least one editor has asserted that, as WP:RMCI is "only" an essay, its procedures do not need to be followed; therefore, as one concrete example, editors may feel free to close move requests that they are fully involved in, so long as they think it's the right thing to do. WP:RMCI has been called "purely advisory", "not vetted", etc., leading to questions regarding its authority--this in spite of the fact that it was written as a guideline and has guided literally thousands of move closures. This has unquestionably led to no small amount of chaos at Wikipedia:Move review following an editor who closed the same move request twice in a row (which is against the closing instructions), both with the same result, and who claimed that the closing instructions that forbid such a close by an involved user were merely "advice".

Should WP:RMCI be formally elevated to the official status of a WP:GUIDELINE? Red Slash 23:15, 22 February 2023 (UTC)[reply]

Survey (RM closing instructions)

  • Support as proposer. The idea that WP:RMCI is not vetted by the community is laughably inaccurate; yes, it has been. It's guided requested move closures for nearly two decades, its talk page (merged with the overall WP:RM talk page) spans thirty-four archives, and drawn-out discussions are common (I arbitrarily picked Wikipedia_talk:Requested_moves/Archive_25 and found... a LOT of discussion on the precise wording of WP:RMCI). It is a fully vetted part of Wikipedia. It's not purely advisory--it's literally the only instructions we have on closing move requests. It's not an essay, it has never been an essay, it bears none of the hallmarks of an essay, and its currently listed status as an essay is a mistake. It's not an inconsequential mistake, either; people feel like they can ignore it freely, and that's leading to chaos--you have a 50/50 shot of the person who closes your move request being someone who takes our well-vetted closing procedures (up to and including don't close a move request twice) seriously, or who views them as merely "good advice" that can be freely ignored. Red Slash 23:15, 22 February 2023 (UTC)[reply]
  • Support per nom. * Pppery * it has begun... 23:16, 22 February 2023 (UTC)[reply]
  • Support per nom. The principles and concepts are tried and tested. -Kj cheetham (talk) 23:30, 22 February 2023 (UTC)[reply]
  • Support per nom. – robertsky (talk) 00:45, 23 February 2023 (UTC)[reply]
  • Support — if it looks like a guideline, quacks like a guideline, and is functionally considered a guideline, it should be formally treated as such. WhinyTheYounger (WtY)(talk, contribs) 00:53, 23 February 2023 (UTC)[reply]
  • Support. Established and widely followed de facto guideline. {{ping|ClydeFranklin}} (t/c) 01:14, 23 February 2023 (UTC)[reply]
  • Support to stop wikilawyering about whether something is "just an essay" or not. Best practices should be followed, regardless of where they are documented. But for the sake of clarity, let's make it official that this has the support of the community. HouseBlastertalk 01:33, 23 February 2023 (UTC)[reply]
  • Support per nom, I see no reason to oppose. Aaron Liu (talk) 02:41, 23 February 2023 (UTC)[reply]
  • Support per nom. Tim O'Doherty (talk) 07:31, 23 February 2023 (UTC)[reply]
  • Oppose. Most of RMCI consists of basic instructions. These exist all over Wikipedia, but none of them should be turned into guidelines because there is no benefit from doing so, and because doing so causes WP:CREEP issues. For example, there is no benefit to turning Wikipedia:Requested moves/Closing instructions#Bot considerations into a guideline.
    Other parts of RMCI may not have a consensus if considered individually; for example, the instruction to explicitly declare non-admin closures as such. This requirement does not exist for other discussions, even more impactful ones like RfCs, and discussions on the instruction have shown significant opposition to it on the grounds of WP:NOBIGDEAL.
    Finally, this change won't address the specific issue discussed - editors closing the same move request twice in a row. Closing a discussion is an administrative role, and the current wording of WP:INVOLVED states that this does not make an editor involved: One important caveat is that an administrator who has interacted with an editor or topic area purely in an administrative role is not involved and is not prevented from acting in an administrative capacity in relation to that editor or topic area. Changing this essay to a guideline won't make the editor involved, because when a policy and guideline conflict editors should assume that the policy takes priority.
    I also don't believe it makes sense to make closers involved for RM's but not other discussions - if this is an issue then it is an issue in all areas, not just requested moves. BilledMammal (talk) 10:19, 23 February 2023 (UTC)[reply]
    But WP:INVOLVED is dealing with administrators doing things in an administrative capacity (which does not include requested moves), and WP:RMCI is not. Closing move requests is neither restricted to administrators nor is it an administrative task. There's no overlap at all. See below for my more detailed take on why and how this is different. Red Slash 17:19, 23 February 2023 (UTC)[reply]
    Both of us cannot close this discussion because WP:INVOLVED forbids us from doing so, despite neither of us being admins nor the close requiring admin tools. INVOLVED applies to all editors who are acting in an admin capacity, and that includes closing discussions - if it didn't, either of us could close this discussion now. BilledMammal (talk) 23:19, 23 February 2023 (UTC)[reply]
    That's an interesting perspective I hadn't considered. I think, perhaps, an explicit (though brief) delineation of why WP:RMCI is more expansive than INVOLVED would be appropriate. Thank you. Red Slash 23:43, 23 February 2023 (UTC)[reply]
    I don't believe that would solve the issue; the explanation won't remove the conflict, and it won't change the fact that when a policy and guideline conflict editors should assume the policy takes precedence. BilledMammal (talk) 00:00, 24 February 2023 (UTC)[reply]
    @BilledMammal: Is closing an RM an "admin capacity" or is it a "non-admin capacity"? - is there anywhere in which "admin capacity" is defined? - I had always assumed it to mean activities that only an admin is capable of, so, for instance, if an admin is closing an RM they are simply performing a non-admin activity as an admin, just as any non-admin closure is equally a "non-admin" activity. Unless we are using two different registers for 'admin' here on Wikipedia - one to mean "admin", as in the things formally restricted to someone appointed as an administrator, and then 'admin' diminutive form to refer to any activity that can broadly be construed as generally 'administrative' in nature even if it is not performed by an administrator. Iskandar323 (talk) 10:56, 27 February 2023 (UTC)[reply]
    Formal closures are an "admin capacity" or, to use the wording from WP:INVOLVED, "acting as an administrator". From a practical point of view any other interpretation is problematic as it would mean there is no restriction on involved closures. However, it might be worth clarifying that section of WP:ADMIN as I have had debates where editors have claimed that it is appropriate for them to formally close a discussion they were involved in because they were not an administrator. BilledMammal (talk) 11:13, 27 February 2023 (UTC)[reply]
    It is an essay also, but WP:NACINV says about non-admin closures: Closing editors must abide by the standard of being uninvolved as described at Wikipedia:Administrators#Involved admins. Vpab15 (talk) 11:14, 27 February 2023 (UTC)[reply]
    @BilledMammal: As Vpab15 notes, WP:NACINV specifically requires closing editors to abide by the standards of administrators as described at WP:INVOLVED, but none of this actually means that the original wording of WP:INVOLVED was necessarily meant to apply to anyone other than administrators in any other context. So I see this clash of policy/guidelines as a false one, with WP:INVOLVED not applying to non-admins or non-admins activities except in the very limited way outlined in WP:NACINV, which specifically applies the principles of WP:INVOLVED to closes. Iskandar323 (talk) 07:11, 28 February 2023 (UTC)[reply]
    By using the phrasing editors should not act as administrators in disputes, INVOLVED makes it clear that one isn't required to use the tools to engage in admin activity ("act as administrators"), and makes it clear that it applies to all editors, not just admins ("editors", not "administrators").
    This can be seen in how the community responds to involved closures of discussions; the closure is overturned, and sanctions imposed if the editor continues to insist that because they are not admins INVOLVED doesn't apply to them. Whether this was intended to be the original meaning is not relevant; it is now the meaning accepted by the community, and in my opinion the correct meaning. BilledMammal (talk) 09:10, 28 February 2023 (UTC)[reply]
    @BilledMammal: Fair enough. I guess that makes sense. Although the next paragraph (about the exceptions) is even more ambiguous, mentioning only administrators, not editors, and in fact contrasts administrators with editors, noting: "...acting in an administrative capacity in relation to that editor...", suggesting that it is worded more with a view to blocks etc. This, if anything, is a good example for the need for further clarity with respect to move closures. WP:RMCI is a more natural extension of the part in the third paragraph, where it states: "Although there are exceptions to the prohibition on involved editors taking administrative action, it is still the best practice, in cases where an administrator may be seen to be involved, to pass the matter to another administrator..." Iskandar323 (talk) 12:12, 28 February 2023 (UTC)[reply]
  • Support per nom. · · · Peter Southwood (talk): 17:13, 23 February 2023 (UTC)[reply]
    Withdrawn until the conflict with WP:INVOLVED is fixed. I consider closing to be an administrative action, as it theoretically does not involve the opinions of the closer on the content of the article, but does involve the interpretation of policy and guidance and their application to a discussion, as well as assessing consensus, and requires competence. There are many experienced non-admin editors with the required competence in general, and fewer people with the required competence for closing any given discussion, as each topic has its own special circumstances, and we do not need closers who are ignorant of those special circumstances. · · · Peter Southwood (talk): 05:03, 9 March 2023 (UTC)[reply]
  • Support This is a high stakes fundamental process. When any process is as established, accepted, and in-use as this one, then I am in favor of labeling it as a guideline. The guideline label carries weight both for users and to legitimize the activities of the Wikimedia editorial community and its governance process. Bluerasberry (talk) 17:25, 23 February 2023 (UTC)[reply]
  • I'd rather have this become an information page. Also, this sentence: NACs are not discouraged for requested moves, as long as the non-admin is highly experienced with RMs and our closing procedures sounds pretty discouraging to me. How is someone supposed to become "highly experienced", if you're not allowed to do it? And is this actually different from what we hope for with admins? IMO thoughtful NACs should be strongly encouraged to handle RMs, and admins (who have no similar requirements about knowing how RM works?) should only be involved (in their admin role) if specific buttons are needed for technical moves. Page titles are basically content decisions, and admins get no special rights in content decisions. WhatamIdoing (talk) 00:22, 24 February 2023 (UTC)[reply]
    That could definitely be rewritten. Over the years, we've gone from prohibiting NACs, to discouraging them, to restricting them, to permitting them, and it might now be time to start explicitly encouraging them. I love it. I don't think that has much to do with the fate of this page as a guideline or not, but I love it. (BTW, having it be just an informational page, in my opinion, wouldn't solve the problem of people not taking a WP page seriously if it doesn't have a little checkmark next to it.) Red Slash 01:25, 24 February 2023 (UTC)[reply]
    I agree, if RMCI needs fixing, clarification and whatnot we can do that just as for any other page, should not affect the decision to make it a guideline imo. Selfstudier (talk) 06:27, 24 February 2023 (UTC)[reply]
    I dunno. Once a policy or guideline label gets slapped on a page, editors frequently tell me that it's more difficult to get substantive changes made. Also, it's customary to solve the more obvious problems before proposing guideline status.
    Let me pick on a few people: @Pppery, @Kj cheetham, @Robertsky, @WhinyTheYounger, @ClydeFranklin, @HouseBlaster: You were the first folks to support the proposal. The OP's primary concern is the tag that says it hasn't been vetted. How many minutes did it take you to read the page before you voted to support it? WhatamIdoing (talk) 03:34, 9 March 2023 (UTC)[reply]
    It is hard to quantify the time spent going through the page line by line, familiarising and refamiliarising myself with the process over the last few months (sometime in the last 4-6 months) when I decided to dive into RMs, so it may not have been obvious to me. What's actually humorous is that those you have pinged are mostly non-admins ourselves, albeit some with page mover rights, and we might certainly have been censured by others if we had started out closing under this guideline on the strength of that line alone. Just recently, there has been an editor, green in doing NACs, going through RMs, and no one is stopping them; rather, experienced closers have been encouraging them and guiding them on their talk page. We can always tweak this to something more palatable, i.e. NACs are acceptable for requested moves provided that the non-admin editor closing the request is familiar with the closing instructions. – robertsky (talk) 05:00, 9 March 2023 (UTC)[reply]
    I think your wording would be an improvement. Personally, I'd also discourage editors from handling RMs on pages that they are unable to move themselves. IMO some RMs should be left to admins and page movers. WhatamIdoing (talk) 03:51, 10 March 2023 (UTC)[reply]
    Good question - I have no idea really. I have been involved with page moves for a while (mostly technical requests), and have looked over it multiple times in the past year or two. It's not something I read for the first time after seeing this RFC. I can't say I've fully analysed every part of it, or that it is perfect. Also, "discouraged" is not the same as "not allowed". -Kj cheetham (talk) 17:17, 9 March 2023 (UTC)[reply]
    Whatareyoudoing, I agree with Kj cheetham on a lot of this. I have no idea in terms of an exact number, but I have gone over it multiple times every so often, and I have by no means fully analyzed it. {{ping|ClydeFranklin}} (t/c) 23:22, 9 March 2023 (UTC)[reply]
    Well, at the moment, I seem to be confirming my long-standing belief that Wikipedia:Nobody reads the directions. We guess what the WP:UPPERCASE might mean, and we look up little specific sections when we need to quote The Rules™ at someone or remember the name of a specific template (or similar detail), but we don't really read them.
    This RFC started because someone didn't want this page to have a tag at the top that says "it has not been thoroughly vetted by the community." It sounds from the responses here that even if it were to pass, that's still not happening. If your responses are typical, then the people who are voting for it haven't really read it (recently, or ever), and I suspect that the people opposing only read far enough along to find something they disagree with, and then stop. On a more general level, I wonder if WP:PROPOSAL should advise would-be promoters to tell editors that they really need to read the entire current version of the page, before voting to support the promotion. We don't get many such proposals in a year, and actually reading the page that you want to have marked as having been thoroughly vetted does not seem to me like an unreasonable expectation.
    BTW, if you have average reading speed for a native English speaker, it'll take you 20 minutes to read that proposed guideline from top to bottom. WhatamIdoing (talk) 03:48, 10 March 2023 (UTC)[reply]
    I don’t really understand why this should be an information page, IMO it’s not factual or technical information but the community accepted procedure. Aaron Liu (talk) 14:16, 26 February 2023 (UTC)[reply]
    @Aaron Liu, information pages provide information. This includes information about what the community's accepted procedures are. Consider, for example, the information page on the accepted procedure if you want to Wikipedia:Contact VRT. Or for Wikipedia:Contributing to Wikipedia. Or if you need Wikipedia:Access to nonpublic information. Or if you want to follow the community-accepted procedure for Wikipedia:Requests for comment. Or, very relevantly, for Wikipedia:Closing discussions. Why should Wikipedia:Closing discussions be an information page, but not the move-specific version of that page? WhatamIdoing (talk) 03:11, 9 March 2023 (UTC)[reply]
  • Oppose for now. The issues are (a) inconsistency in the way we use the terms (or apply the templates for) guidelines, information pages, explanatory supplements, essays, etc., and (b) letting time-wasting by wikilawyers determine those applications. We should be more consistent. Why would this need to be a guideline, but WP:AFD, Wikipedia:Bureaucrats, WP:MFD, WP:RFA, WP:AC, WP:RFC, and WP:DRV are content to simply be process pages, even though they also have a lot of rules/guidance? Why is Wikipedia:Reviewing good articles a guideline, but there's no such guideline for WP:FAC or WP:DYK? Why is WP:PROD a guideline, but WP:AFD and WP:DRV aren't? For all the wikilawyering over essays, our page on WP:ESSAYS is an essay, and WP:PAG lumps the various types of essays together despite their being treated very differently in practice by the community. Maybe one step is to remove "information pages" from the "essay" header and combine them with "administration" (or "process") pages to produce something that assumes guideline-like adherence, or maybe such is the state we're in that we should actually slap guideline templates on all of these pages just to stop people from extending "just an essay" to "just a page that documents a process for which there's broad support"... — Rhododendrites talk \\ 02:14, 24 February 2023 (UTC)[reply]
    I appreciate your perspective. "we should actually slap guideline templates on all of these pages just to stop people from extending "just an essay" to "just a page that documents a process for which there's broad support"..." is actually probably the best solution, because otherwise, in practice, you get people who say "but it's just an essay!" In the meantime, though, I think the proposal is the best solution for the reasons I explained above. Red Slash 07:45, 24 February 2023 (UTC)[reply]
    Some of the pages you linked are discussion venues, not pages. AfD and DRV are discussion venues that have their process page at WP:DPR which is a guideline. Aaron Liu (talk) 14:21, 26 February 2023 (UTC)[reply]
    @Rhododendrites: Apologies, I had to slightly modify your vote as to correct for the loss of context when the support and oppose sections were merged. BilledMammal (talk) 18:19, 26 February 2023 (UTC)[reply]
  • Support Seems like a good idea to give a bit of formality to existing practice. If this or that wording needs sprucing up, that can be done in the usual way and shouldn't affect the decision whether to make it a guideline imo. Selfstudier (talk) 06:32, 24 February 2023 (UTC)[reply]
  • Oppose. I think there is one problematic sentence at WP:RMCI that needs to be fixed. It defines involved as: You have ever closed such a move request. That contradicts the wording at WP:INVOLVED and places an unreasonable burden on RM closers, who will have to remember all the previous RM discussions they have closed. Per discussion below, it is not even clear if "such a move request" only includes the same article, or any article in the same topic. I have no objection to the rest of the essay. Vpab15 (talk) 10:23, 24 February 2023 (UTC)[reply]
  • Support this proposal, but clarity as discussed above needs to be added to the guideline page soon, and re-organizing non-policy pages via template as discussed below needs to happen also. GenQuest "scribble" 16:14, 24 February 2023 (UTC)[reply]
  • Support, no-brainer.—S Marshall T/C 18:24, 26 February 2023 (UTC)[reply]
  • Support per nom The Night Watch (talk) 02:41, 27 February 2023 (UTC)[reply]
  • Support per nominator. Additionally, I support robertsky's proposal below. echidnaLives - talk - edits 02:55, 27 February 2023 (UTC)[reply]
  • Oppose as written, at least until contentious points with "involvement" and declaring NAC closes are vetted by the community, since they contradict broader policies. Some parts of RMCI have obviously been written on someone's whim at one time or another, and do not even enjoy local WP:CONLEVEL among the closers' "community". The case that apparently triggered this whole RfC was Palestinian exodus MR, and the "involvement" RMCI clause being used to overturn the close procedurally. I've closed hundreds of RMs and MRs, I'm fairly well-versed with RMCI, but that clause has slipped under my radar and I'm frankly baffled by its existence. No such user (talk) 08:37, 27 February 2023 (UTC)[reply]
    "do not even enjoy local WP:CONLEVEL among the closers' 'community'" - can you be specific as to what you're referring to? Wikipedia is a wiki, there's nothing in WP:RMCI which has "snuck through" or exists without the community's support. The INVOLVED part, in particular, was extensively debated, even as small a point as whether or not an involved editor can "relist" a discussion. Everything has been vetted. Red Slash 23:00, 27 February 2023 (UTC)[reply]
    I missed that 2019 discussion, which did not have a terribly broad input. Even if I can agree there was a local consensus at the time, WP:CONLEVEL is clear that Consensus among a limited group of editors, at one place and time, cannot override community consensus on a wider scale. As for NACs, I've been mildly annoyed at RMCI's insistence on declaring NACs (a thing that has been out of fashion elsewhere on Wikipedia for quite some time) and I'm on record stating that I'm not going to do that for my closes, ever. For the definition of "involvement", you can witness the community backlash right here. See also BilledMammal's objections above, which I fully endorse. No such user (talk) 08:59, 1 March 2023 (UTC)[reply]
  • Support: The 'explanatory essay' has already been discussed more than sufficiently to qualify as a guideline, which obviously would still not mean that it was as hard and fast and as ironclad as a policy, but would cease the confusion whereby some editors seem to think it is optional - a confusion that needs clearing up. There will still be exceptional cases, admins may still need to weigh in, and there will still be need and scope for a human touch at Move Review, but there will no longer be cause for actual confusion as to what the guidelines are, and what the basic expectations of the process are. Iskandar323 (talk) 11:03, 27 February 2023 (UTC)[reply]
  • Oppose mostly per No such user. I, as well, have serious problems that the NAC closure section is in clear contradiction to Wikipedia principles. Also, Rhododendrites makes some good points above. --Jayron32 13:34, 27 February 2023 (UTC)[reply]
  • Oppose for now. I'm not opposed to a guideline about moving pages but RMCI as currently written is not it. I don't agree with its instructions, such as the language about involved, as pointed out by others, and the page is too long as currently written. I think there is maybe an opportunity to either rewrite RMCI into a guideline, or rewrite it and promote it to an info page, or maybe split it into a guideline page and an instructions page. There is some possibility for something here, but I oppose putting the guideline template on RMCI as currently written. Levivich (talk) 14:20, 27 February 2023 (UTC)[reply]
    Reads like a "support with some copyediting" to me.—S Marshall T/C 14:26, 27 February 2023 (UTC)[reply]
    Copyediting doesn't change the meaning of the text; that's editing. But yeah, I might support if we changed the meaning of the text to something else. Levivich (talk) 19:35, 4 March 2023 (UTC)[reply]
  • Oppose currently. I agree that the provision "You have ever closed such a move request" is contrary to WP:INVOLVED, which explains that administrative actions do not constitute involvement. Sandstein 16:24, 27 February 2023 (UTC)[reply]
  • Temporary oppose. I think WP:RMCI should eventually become a guideline, but I don't think we're at that point quite yet. As other participants have stated, the "You have ever closed such a move request" provision should be removed from WP:RMCI before it rises to guideline status. Additionally, WP:NOTCURRENTTITLE has been added to the page very recently, and – while I agree with its inclusion myself – I'd like to see it receive more affirmative support from the community before letting it rise to guideline status. ModernDayTrilobite (talkcontribs) 17:15, 27 February 2023 (UTC)[reply]
    On further thought, switching to weak support. The involvement issues can be hashed out whether RMCI is a guideline or not, so there's no need to hold up an ultimately productive goal on those grounds alone. ModernDayTrilobite (talkcontribs) 17:35, 5 March 2023 (UTC)[reply]
  • Support per nom --- Tbf69 P • T 17:22, 27 February 2023 (UTC)[reply]
  • Oppose as needing more work, per points raised above (e.g. "You have ever closed such a move request"). Also, the NAC instructions differ substantially from our other procedures: to bring them in line with the spirit of NAC, I think admins and page movers should be allowed to close any RM, while other users should only close uncontroversial discussions. -- King of ♥ 23:03, 27 February 2023 (UTC)[reply]
  • Support, the involved bit isn't prohibitive - for the general reasoning, the nom's rationale covers it nicely. On the INVOLVED bit, my interpretation is that that sets minimum standards. It doesn't invalidate any guideline to have stricter rules. Now, whether the guideline should have stricter rules is really rather more dubious, but fundamentally, it's not so egregious as to undermine the guidelineification (to my concern, that isn't being flagged as a typo). Nosebagbear (talk) 23:24, 1 March 2023 (UTC)[reply]
    INVOLVED also sets a maximum standard, by telling us when someone is not involved. The standard at RMCI exceeds this. BilledMammal (talk) 23:55, 8 March 2023 (UTC)[reply]
    An actual or perceived conflict of interest is what needs to be avoided, how that is achieved is the question. Selfstudier (talk) 00:09, 9 March 2023 (UTC)[reply]
    If closing a discussion creates a perceived conflict of interest then that can only be addressed at WP:INVOLVED, as any editor who disagrees with RMCI can accurately cite WP:INVOLVED and WP:POLCON, and because if it is an issue at RMs it is also an issue at AFDs and other formal discussions. BilledMammal (talk) 00:13, 9 March 2023 (UTC)[reply]
    Which you did and is why this RFC exists. Many don't agree with that position though. It is of some interest that you only brought it up in order to support a !vote at an RM when most people pointed to an obvious conflict of interest. Selfstudier (talk) 00:21, 9 March 2023 (UTC)[reply]
    If many don’t agree with that position then you should have no issue changing the policy that supports that position, and I encourage you to open an RFC proposing to do so.
    I brought it up to push back against the notion that engaging in administrative actions made an editor involved and have not considered the rest of the RM as most of the support for overturning was solely on that basis; how is it of some interest? BilledMammal (talk) 00:26, 9 March 2023 (UTC)[reply]
    No need, as I said, discussion should start at RMCI talk per usual practice. Selfstudier (talk) 00:32, 9 March 2023 (UTC)[reply]
    Since this RFC is already open and at CENT, you can of course start a subRFC within it for anything that concerns you specifically (as per "Discussion on changing the wording" below). Selfstudier (talk) 00:34, 9 March 2023 (UTC)[reply]
  • Support. As per Bluerasberry. This process is well-tested, it should be as streamlined as possible and such a formal tag contributes to this as well as provides legitimacy. Any contentious points (involvement, etc.) should be resolved before promoting this page to a guideline though. Any user in good standing should be allowed to close any RMs where they are not directly involved. We could set up some general conditions though that are not dependent on public voting, such as it has been done at WP:AFC. --TadejM my talk 06:22, 2 March 2023 (UTC)[reply]
    Any contentious points (involvement, etc.) should be resolved before promoting this page to a guideline though. I’m planning to open an RFC asking questions about individual contentious points should this RFC pass, which should help resolve them. BilledMammal (talk) 00:00, 9 March 2023 (UTC)[reply]
    First discuss them on the talk page as already suggested by the RMCI guidance so that there is a proper RFCBEFORE should it come to that. Selfstudier (talk) 00:07, 9 March 2023 (UTC)[reply]
    I was planning to discuss at WP:VPI for broader input. BilledMammal (talk) 00:09, 9 March 2023 (UTC)[reply]
    RMCI talk first, just like we begin all other discussions regardless of the article. Selfstudier (talk) 00:23, 9 March 2023 (UTC)[reply]
  • Oppose per CREEP, among other arguments presented above. Lots of WP processes have instructions/rules and we don't make them guidelines. Guidelines are for content editing and for user behavior, not for internal bureaucracy.  — SMcCandlish ¢ 😼  02:26, 7 March 2023 (UTC)[reply]
  • Oppose as BilledMammal, No such user, and others have argued.
    To comment in particular on the "involved" clause, the suggestion that RMs need a special rule above WP:INVOLVED is unpersuasive. Being "urgent" or not doesn't somehow uniquely distinguish RMs from other processes with formal closures. A typical RFC, AFD, RFD, etc. is no more urgent than a typical RM, with no acute consequences to closing it a couple days or weeks later. Similarly, seeking to overturn a closure at move review is not materially different in difficulty from seeking to overturn another type of closure at deletion review. I see the impetus of the RFC is the case of the 1948 Palestinian exodus RM closure and MR; it features a closer whose previous RM closure had already been taken to move review and itself was so controversial that it got no consensus to endorse or overturn. That extraordinary set of circumstances does not justify the proposed broad rule of "You have ever closed...", nor have I seen any other case that does. ("Hard cases make bad law".) There should just be one uniform policy about what "involved" means, and fragmenting the definition is CREEP. Adumbrativus (talk) 06:46, 7 March 2023 (UTC)[reply]
  • Oppose. If following these over-long "instructions" leads to the kind of close the OP did here and the following extended drama at User talk:Red Slash#Mpox, then they should be burned with fire rather than elevated to a guideline. There seems to be a move afoot to make using RM a mandatory step as it's the place where "scores of editors who know Wikipedia very well" hang out. This is just pointless processology. WP:CREEP, WP:NOTBURO. Bon courage (talk) 04:06, 9 March 2023 (UTC)[reply]
    A little ad hominem no? Addressing the OP more than the topic, with a dash of conspiracy. Iskandar323 (talk) 05:59, 9 March 2023 (UTC)[reply]
    Well, the evidence is that the apparent application of these "instructions" is an issue. The editor who wrote most of them[13] wants their work elevated to have WP:PAG force so that they can be used to 'forbid' things (as if guidelines do that). Bon courage (talk) 07:25, 9 March 2023 (UTC)[reply]
    Quite. A guideline doesn't enforce anything. It just elevates something away from being a mere essay, and, in this instance, dismissible as a mere nicety, even if it is established by consensus and decades of practical use. Iskandar323 (talk) 10:36, 9 March 2023 (UTC)[reply]
  • Oppose I fail to see why this can't be an Information page, and there seem to be too many contentious aspects currently. Suggest this idea is parked until the page is better. People will wikilawyer if they are unhappy with any process even if given guideline status. As others note, it is way too long. Well that's a problem with many guidelines and other pages too, but I think the step before attempting to elevate this page's status should be that you first request help to shorten it, trim the cruft, and remove contentious aspects until those have been thrashed out on talk. I get the need for "official guideline" for article content advice, but as a wiki, I think our processes and procedures should generally be described in as lightweight a way as we can. Also, be careful what you wish for... making this "official" may make it harder for you to change the bits you want changed in future. -- Colin°Talk 09:53, 9 March 2023 (UTC)[reply]
  • Oppose guideline, support information page if issues are dealt with. RMCI is useful, but I do have concerns with the page as written. As the closer of the Palestinian exodus RMs, I was somewhat nonplussed at the "closing an RM makes you permanently INVOLVED" part being brought up at MRV as a gospel truth; it runs counter to how discussion closures go elsewhere on the encyclopedia, where INVOLVED is basically formalising the common-sense "don't close discussions you've participated and opined in" rule. And this RfC… I'll be honest, it feels less like codifying RMCI, and more like trying to redefine INVOLVED without proper scrutiny at WT:ADMIN, where discussions about INVOLVED should be. RMCI should not be too prescriptive; it should just detail agreed best practice. If that's the case, I have no problem with it being an information page. Sceptre (talk) 18:48, 12 March 2023 (UTC)[reply]
  • Reluctant Support We really need another category of vetted, approved "core essays"/"information pages": the best 10% of essays like these, which do a good job of explaining how Wikipedia works or give good guidance, but don't go to the extent of authoritatively saying "this is what to do / not do", which in practice guidelines sort of do. The latter category shouldn't creep larger because, first, it is creep, and second, it's hard to achieve the carefulness / perfection needed to give a page that amount of clout. But I did support anyway because I think this is an edge case where it could pass as a guideline given its history of scrutiny, tweaking, acceptance, usage, and being in good enough shape to be a guideline. Also it should be acknowledged that a "yes" here means only to make it a guideline, not an individual endorsement of every sentence in it. And knowing that it may need a few tweaks and changes to fully meet its new status. Sincerely, North8000 (talk) 19:14, 12 March 2023 (UTC)[reply]
  • Oppose in practice. Although I am sympathetic to the idea that this could be a guideline, I disagree that it has been clearly vetted by the community. In fact, for 15 or so years I have often avoided trying to sort out small changes to that page precisely because it was never necessary to do so; I could just continue closing as usual without needing to refer to instructions for new closers. In theory, I think we could start with a rewrite of that page, build up consensus, and then bring up the issue again, but I do not believe that the page as written reflects the best practices that have been developed at WP:RM over time, as reflected in my ongoing comments at WT:RM#Involved. Dekimasuよ! 17:19, 15 March 2023 (UTC)[reply]

Involved

Is the difference in wording between WP:RMCI and WP:INVOLVED a problem that should be fixed? If so, why and how? Red Slash 16:34, 28 February 2023 (UTC)[reply]

Discussion on changing the wording

  • Oppose- unlike admin activities, RM closes are never urgent, so there's time to make sure that we get someone who's never closed a request on it before. Also unlike admin activities, it's excruciatingly hard to overturn a page move close if the closer won't self-revert. Also unlike admin activities, if the closer does recognize "oops, shouldn't have done that", it's extremely easy to just self-revert--nobody needs to live in fear that they're too "involved", since a self-revert of a move (e.g., "You just closed the request on Kyiv, maybe don't close the one on Odesa" -> "oops, my bad, self-reverting") is very easy for a good-faith closer to do. Also, closing move requests isn't really an administrative task, and is more an editorial one. Without these guidelines as written, the same editor could literally just close the same exact move request over and over and over and over again, stonewalling "progress" in violation of WP:OWNership. It's one thing to watch your beloved perennial proposal fail; it's quite another when it's one single editor over and over again who keeps closing the request to shut it down. Red Slash 16:34, 28 February 2023 (UTC)[reply]
    the same editor could literally just close the same exact move request over and over and over and over again, stonewalling "progress". The flipside of that is an editor who raises the same request with same arguments over and over again until a sympathetic closer gets their turn. We would be basically encouraging WP:FORUMSHOPing. Vpab15 (talk) 17:56, 28 February 2023 (UTC)[reply]
    How is that the flipside? The problem is conflict of interest in the first case and disruption in the second case. Are you saying that the wording should be changed to allow the first case or that the wording already allows it? Selfstudier (talk) 18:42, 28 February 2023 (UTC)[reply]
    Personally, I don't see the necessity of waiting for someone who's never closed a move request on the topic. Closing RMs is administrative in the same capacity that closing any discussion is: the closer's task is to identify and implement the consensus of other users, rather than acting independently to make content changes. A closer can analyze an individual discussion and its consensus without inherently being influenced by unrelated discussions on similar topics; there's no reason for them to recuse themselves from future closures in the topic area unless they've formed an opinion that would preclude them from being able to close impartially.
    You mention the hypothetical example of an editor who closes the same RM repeatedly, but there are two possible scenarios this encompasses. First, that editor closes each discussion in line with the relevant protocols (fairly assessing consensus, waiting the full week, etc.); in this scenario, the closures are good, so there's no reason to prevent the editor from making them. The other scenario is that the editor flouts the protocols in their closures (closing against consensus, making inappropriate SNOW closes, etc.); this is disruptive for reasons beyond "they've closed RMs in this area before", so there's no need to maintain such a proviso in order to sanction people who perform these disruptive closures. ModernDayTrilobite (talkcontribs) 15:40, 1 March 2023 (UTC)[reply]
    While that may be theoretically true in general, I think that in the specific case (referred to in the discussion as a "kerfuffle" but which actually need not concern us, as it will be resolved at MR and not here) there is the appearance of a conflict of interest, and as we all know, the appearance of a conflict can be just as bad as an actual conflict. So even if it may be theoretically true, I still think this is something that ought to be avoided, particularly in CT areas. Selfstudier (talk) 16:12, 1 March 2023 (UTC)[reply]
    That's certainly true for controversial proposals, and I can definitely understand where you're coming from in that regard, but I think a bright-line "you have ever closed such a move request" restriction would also shut down a number of productive and low-risk closures. It's not uncommon for someone to propose a series of RMs on a shared topic - RMs which could easily be bundled, but for one reason or another, are not - and forcing those RMs to all be closed by different people puts a lot of additional strain on that practice.
    I wonder if "you have ever closed a move request on this article" would be a good alternative restriction? In my view, it'd be nearly as effective at reducing the appearance of conflicts of interest, while also minimizing the added administrative burden on RM closers. ModernDayTrilobite (talkcontribs) 16:48, 1 March 2023 (UTC)[reply]
    That seems like a good proposal, I would support that. Selfstudier (talk) 16:58, 1 March 2023 (UTC)[reply]
    This is actually what it means, which I think a lot of people seem to be overlooking. The third (and second) points refer back to the first. They say "such a move request", referring to "a move request about the article in question" - but a lot of people seem to be looking at the third point in isolation and coming up with their own sense, out of context, of what "such a move request" means. So, "a move request about the article in question" would best be repeated thrice. Iskandar323 (talk) 10:44, 9 March 2023 (UTC)[reply]
    I don't think this is true: Also unlike admin activities, it's excruciatingly hard to overturn a page move close if the closer won't self-revert.
    Even if it were true, I think it's not an important thing to worry about because it's not a problem we experience. People don't post RMs over and over and over again; therefore, there aren't identical RMs for someone to close over and over and over again. If that becomes a problem, folks can ask at WP:AN for help. Most admins are very willing to let someone else take over, if they're told that their actions aren't resulting in editors feeling like they can trust the outcome.
    And frankly, if you've got someone who knows something about the correct nomenclature for some niche subject, you want that person involved in all RMs for that area. RMs, like any other discussion, would ideally be closed by people who can identify which arguments are stronger or weaker, and sometimes that means having domain-specific knowledge. Like any other discussion, an RM can appear to be evenly divided by superficial vote counting, but if one says X because Crackpots'R'Us uses that name, and the other says Y because reputable scholars use that name, then you should be going with the good sources – and that can only happen if the closer knows that the International Journal of Science is highly reputable, and the very similarly named Journal of International Science is not. WhatamIdoing (talk) 03:51, 9 March 2023 (UTC)[reply]
  • Wording should be changed to make it clear that the restriction on multiple closures applies to the same article (regardless of what titles it may have gone through) and not to other articles within the same topic area. A statute of limitations could also be applied, maybe a year. Contentious and persistent incompetent closing is a behavioural problem and can be managed by a topic ban if it becomes necessary.
    Also agree with what WhatamIdoing says. · · · Peter Southwood (talk): 05:24, 9 March 2023 (UTC)[reply]
  • Oppose Any change that allows an obvious conflict of interest. Selfstudier (talk) 12:20, 1 March 2023 (UTC)[reply]
    Well, obviously... · · · Peter Southwood (talk): 05:24, 9 March 2023 (UTC)[reply]
    Apparently not. Selfstudier (talk) 08:07, 9 March 2023 (UTC)[reply]
    @Selfstudier, are you feeling like that if Alice closes a discussion on iPhone 5C, that she has a Wikipedia:Conflict of interest for closing a discussion on, say, iPhone 6SE? WhatamIdoing (talk) 04:18, 14 March 2023 (UTC)[reply]
  • A much more detailed conversation about some aspects of this is ongoing at WT:RM#Involved. Dekimasuよ! 17:21, 15 March 2023 (UTC)[reply]

Discussion (RM closing instructions)

  • Comment. The objection to WP:RMCI was that it defines involved as You have ever closed such a move request, whereas the current policy at WP:INVOLVED says: an administrator who has interacted with an editor or topic area purely in an administrative role, or whose prior involvements are minor or obvious edits that do not show bias, is not involved and is not prevented from acting in an administrative capacity in relation to that editor or topic area. It does seem they contradict each other. Should the wording at WP:INVOLVED be changed if WP:RMCI becomes a guideline? Vpab15 (talk) 23:48, 22 February 2023 (UTC)[reply]
    INVOLVED is policy, so I think fixing this is better resolved the other way around. Updating RMCI so that it more closely matches INVOLVED seems like the solution, as ultimately that set of instructions is based on the policy and not the other way around. Sideswipe9th (talk) 00:30, 23 February 2023 (UTC)[reply]
    So, they're definitely different issues, and the RMCI "involved" stuff is different on purpose and for a reason. The TL;DR is that WP:INVOLVED prohibits involvement with the same user (or topic area), while WP:RMCI just prohibits closing when you have a certain pre-expressed position on the topic. Great question though.
    Why WP:INVOLVED is as loose as it is:
    1. Administrators interact with a LOT of users. If we prohibited any administrator from ever interacting with the same user twice even in an administrative sense, we'd have a complete clusterbomb of suckiness on our hands.
    2. Oftentimes, admin action is urgent; we have way too few administrators and way too many things to be getting done, and sometimes if you waited for someone who has never been involved, the issue would take too long to get resolved (say, Admin Joe finds a rogue user who's wildly vandalizing tons of articles, but Admin Joe has already sanctioned the guy once before; it would hurt the encyclopedia to say that Admin Joe needs to wait to find someone else to ban the dude).
    3. Administrators are very well-vetted in our rules and policies. They shouldn't need someone to hold their hand and tell them when they're too involved to be impartial. It's the same reason Supreme Court Justices in the United States (at least) don't need to be told to recuse themselves.
    On the other hand, move closers:
    1. Most move closers do not interact with most articles. I've been an editor for a looooooooooooooooooooooong time and have a loooooooooooooooooooooot of edits and have been involved in a loooooooooooooooooot of move requests, and I just looked through WP:RM and I could not find a single article listed there which I have ever interacted with. I could close literally any of those requested moves. I'd bet you that, of the fifty or so editors who commonly (once or more per month) close requested moves, probably forty-five to forty-eight of us could serve as impartial movers on any given requested move.
    2. There is zero urgency with requested moves. If Page Mover Tom sees a request on Talk:Foo that he could move, but, shoot, he's been involved in similar discussions at Talk:Foobar... well, it's not going to cause any problem for that move request just to sit tight for another few days until Page Mover Sally happens across it. Requested Moves are important, certainly... but definitely not urgent.
    3. The people who close move requests are almost never administrators. There's about four admins who regularly close move requests and about ten others who do it once in a blue moon, and I'd ballpark-estimate that 90% of moves closed are closed by non-admins. I highly respect my fellow non-admin page movers, of course, but let's be real: as a group, we obviously aren't as vetted by the community. In fact, this is why WP:RMCI's "involved" criteria is as explicit as it is. I pushed hard in the discussion for it to be painfully obvious where the bright lines were, because as a group, we needed those bright lines.
    So, no, I would absolutely not suggest following WP:INVOLVED's wording or vice-versa. Different rules for different reasons, just like how categories have different rules than articles. Red Slash 05:04, 23 February 2023 (UTC)[reply]
    Right now there is a move review about an article related to the Arab–Israeli conflict and another about a New Zealand term. Both areas have RM discussions relatively regularly. If we are saying that a closer can only close one discussion in either area, that goes against the letter and the spirit of WP:INVOLVED. I don't think that would be feasible, we would run out of RM closers in no time. Vpab15 (talk) 11:28, 23 February 2023 (UTC)[reply]
    With love, no, I'm very confident there are plenty of well-informed RM closers who haven't committed themselves to either of those issues yet. I could list several of them who aren't involved at all. There's not THAT many of us, but there's not too few, either.
    Again, these are two separate issues. WP:INVOLVED deals with administrators performing administrative actions. WP:RMCI's "involved" section deals with neither. Red Slash 17:22, 23 February 2023 (UTC)[reply]
    Are you saying the same closer shouldn't close two RM discussions that belong to the same topic? For example Slava Ukraini and Odesa, both related to Ukraine? Because that's exactly what you did. Applying that rule would disqualify regular RM closers from closing most discussions. As a closer I would have to go through the hundreds of discussions I have closed to make sure I don't meet the new "involved" criteria. That is a huge burden and I can't see that working in practice. Vpab15 (talk) 18:20, 23 February 2023 (UTC)[reply]
    It already does work in practice! (FYI, these criteria are several years old and have worked for us for several years; they're not new.)
    The restrictions aren't "you've ever edited the article or anything like it". The restrictions are: "You have ever commented on any talk page in such a way as to make clear your position on the move request" or "Your editing on the page in question or about the page in question makes clear your position on the move request".
    For mine, the Slava Ukraini request hinged on WP:UE and WP:COMMONNAME about whether the slogan should remain translated into English or not; the way in which I closed it does not in any way implicate how I would later close the Odesa request, which hinged on whether the one-S version or the two-S version was more common in English when referring to the Ukrainian city. The proof is in the pudding; I closed the Odesa request back in July, just three months after the Slava Ukraini one, and no editor even suggested that I had been too "involved" to close. (I would've painlessly self-reverted if someone would've brought a legitimate claim of violating those rules; again, there is no shortage of people who could've done a great job closing Odesa.) On the other hand, if I had been the one who closed the Kyiv/Kiev one, or if I had !voted on it, maybe I would've shied away from closing the Odesa one; if there's another Ukrainian city with a Russian name that's proposed to be moved to the Ukrainian name, I will avoid closing that one. I consider this common sense, which is why the discussion ultimately ended up where it did and the page reads how it reads.
    In short (too late!), the list of requirements on RMCI with regard to involvement was heavily discussed and meticulously crafted to avoid the sort of wacky prohibitions that you are correctly wary of. No worries, homie; it's not as scary as you think! Red Slash 23:04, 23 February 2023 (UTC)[reply]
    Even on a per article level, are you sure that this is current practice? When looking at the CoI part, a closer is considered involved if You have ever closed such a move request. By a plain reading this means that if:
    • Some user proposes moving Foo to Bar
    • Bunch of editors support and oppose
    • 7 days pass, I make a determination of the consensus based on what editors have said. Article is not moved.
    • Significant amount of time passes, eg 6 months to several years, some other user proposes moving Foo to baz
    • 7 days pass, I cannot make determination of the consensus based on what the editors have said, because I closed another RM on the same page at some point in the past
    When we have 6.6 million articles, but only 130,000 active editors, at some point we will run out of editors who can close move requests on articles that have had multiple move requests. Sideswipe9th (talk) 00:33, 24 February 2023 (UTC)[reply]
    That is correct, that is current practice; you could not make that determination. We have 6.6 million articles, but I assure you, we don't have that many requested moves. No worries, we've never even begun to come close to a situation where we've run out of possible closers. Red Slash 01:37, 24 February 2023 (UTC)[reply]
    Sorry to interrupt the thread here, but I do not believe there has ever been consensus to make that "common practice". Until 2019, the directive was to avoid the appearance of a conflict of interest (which is good!), but there was no prohibition on closing a second move on the same page, and no indication that the very existence of a previous close itself would be considered indicative of a conflict of interest. Since that change (which I didn't know about, since I don't monitor that page, although I would have objected to it), the page has not had guideline status. I have closed subsequent moves on the same pages many times, sometimes so far apart that I couldn't even remember having closed a move request there previously. When the reading of consensus is clear and accurate, there is no appearance of a conflict of interest. When the reading of consensus is not clear or accurate, there is a problem regardless of whether there has been a previous close on the same page. Dekimasuよ! 17:34, 15 March 2023 (UTC)[reply]
    Why is RMCI so much stricter in this regard when compared to how we handle INVOLVED at the admin level, or non-admin closures everywhere else on enwiki? The purpose of the closer of any discussion is to summarise the consensus that already exists, and doing so is usually not considered involvement by any other standard. What's special about move requests that make it so that we need this restriction? Sideswipe9th (talk) 02:11, 24 February 2023 (UTC)[reply]
    I explained it in much more detail above, but it boils down to A) administrative actions are far more urgent than RM moves, so we can afford to be choosier when it comes to involvement, and B) move request outcomes are much more difficult to overturn. Again, the part that you're concerned about--which I get, again, I understand your concern--was exhaustively discussed a few years ago when it was added to RMCI. Red Slash 07:49, 24 February 2023 (UTC)[reply]
    @Red Slash: I'm really confused here. Doesn't WP:INVOLVED only apply to admins anyway? So WP:INVOLVED does not really apply to non-admin editors making closures; in fact, the only guideline for that is WP:RMCI. It's unclear to me if RM closures even count as activities conducted in 'an administrative capacity', since this seems to be a contradiction in terms for any activities that are specifically allowed to be performed by non-admins. I sense muddled logic. Iskandar323 (talk) 11:12, 27 February 2023 (UTC)[reply]
    And then WP:NACINV just redirects back to WP:INVOLVED as the parallel standard being applied, but without implying that RM closure is administrative. Iskandar323 (talk) 11:15, 27 February 2023 (UTC)[reply]
    It seems really important to clear up whether RM closure is an "admin capacity" covered by WP:INVOLVED that makes the exception of allowing non-admins to perform it, or if it is a "non-admin capacity" that WP:INVOLVED does not fundamentally apply to, but which the WP:INVOLVED principles are overlaid back onto as applicable to participating admins and non-admins by virtue of the standards laid out at WP:NACINV. Iskandar323 (talk) 07:25, 28 February 2023 (UTC)[reply]
    Admin actions are not more urgent than RMs. There is no urgency to Wikipedia:Articles for deletion, for example. Some admin actions are urgent, but most aren't. WhatamIdoing (talk) 03:57, 9 March 2023 (UTC)[reply]
  • Question. Should we include the bit about WP:Move review that's found at the top of WP:RM as a post closure section? This will lay out (or affirm) the current dispute resolution path if someone wishes to contest the closure (before the interested party takes the case up to ANI). Duplicating the paragraph here for everyone's convenience:

    Wikipedia:Move review can be used to contest the outcome of a move request as long as all steps are followed. If a discussion on the closer's talk page does not resolve an issue, then a move review will evaluate the close of the move discussion to determine whether or not the contested close was reasonable and consistent with the spirit and intent of common practice, policies, and guidelines.

    – robertsky (talk) 00:52, 23 February 2023 (UTC)[reply]
    Not a bad idea. Red Slash 05:04, 23 February 2023 (UTC)[reply]
  • Comment I don't believe the RfC statement or this notification meet the neutrality requirements. Red Slash, can you update them so as to not bias the RfC? BilledMammal (talk) 12:31, 23 February 2023 (UTC)[reply]
    Can you point to anything in particular? Red Slash 17:23, 23 February 2023 (UTC)[reply]
    Almost all of it; it is advocating for this change, and trying to present RMCI as more than an essay. I suggest you remove the second sentence from your notification, and reduce your statement down to the concise and neutral statement Should WP:RMCI be formally elevated to the official status of a WP:GUIDELINE? BilledMammal (talk) 23:22, 23 February 2023 (UTC)[reply]
    First paragraph:

    There has existed for eighteen years a set of instructions for how to formally close a requested move. It was initially titled as Wikipedia:Moving guidelines for administrators, later changed to Wikipedia:Requested moves/Closing instructions (hereafter WP:RMCI). Over the years, it has grown from a simple set of how-to steps to a comprehensive list of guidelines for determining consensus (which began as early as 2006 and has continued to expand through the years). It also has included instructions on who should close requests--first in 2009 when the authority to move pages was explicitly granted to non-administrators, then later in 2019 (following lots of discussion) we introduced rules on how editors who are involved need to not close requests.

    is completely factual and non-biased. The instructions have existed for 18 years. It was titled that way initially. Etc.
    Second:

    The instructional page wasn't classified as anything (guideline, policy, or essay) through 2021, even though it had been originally written as a guideline. Following a post on the talk page that saw no response, it was listed as an explanatory essay in 2021.

    Also completely factual.
    First part of third paragraph:

    Recently, at least one editor has asserted that, as WP:RMCI is "only" an essay, its procedures do not need to be followed; therefore, as one concrete example, editors may feel free to close move requests that they are fully involved in, so long as they think it's the right thing to do.

    A quick look at WP:MRV will show you that at least one editor has asserted that.

    WP:RMCI has been called "purely advisory", "not vetted", etc., leading to questions as regarding its authority--this in spite of the fact that it was written as a guideline and has guided literally thousands of move closures.

    The first part of this sentence is strictly factual. The second part is also true; are you perhaps upset at the fact that I said "in spite of"? I suppose that could be considered biased language, perhaps. How would you rewrite this without omitting any of the factual details contained in this sentence?

    This has unquestionably led to no small amount of chaos at Wikipedia:Move review following an editor who closed the same move request twice in a row (which is against the closing instructions), both with the same result, and who claimed that the closing instructions that forbid such a close by an involved user were merely "advice".

    "No small amount of chaos" is certainly an opinion, but I don't see how it's biased. IDK, you go look at MRV and tell me if you see chaos or not. Note: me saying that there is unquestionable chaos is not telling people how they should !vote. It is saying why they should vote: because there is currently confusion. If there were no confusion, there would not need to be an RfC.
    Everything from "an editor who" onward is completely factual.

    Should WP:RMCI be formally elevated to the official status of a WP:GUIDELINE?

    Gotta have the question we're !voting on listed in the RfC, don't we?
    All told, I can't see where you're seeing bias, or that I advocated for change in a specific direction in the RfC itself. All I did was provide factual context (though admittedly, perhaps that phrase "in spite of" is too loaded and pushes the reader in a certain direction) for why it is good for the RfC to be brought up. Trust me, this RfC being decisively rejected would have been much, much better than the status quo, where something that "looks, talks, and quacks" like a guideline is being treated by many editors as a guideline while others treat it as an essay. Red Slash 23:54, 23 February 2023 (UTC)[reply]
    Much of this is evidence for why it should be a guideline. It's factual, but that doesn't make it unbiased, because it doesn't include the evidence for why it shouldn't be a guideline. BilledMammal (talk) 23:57, 23 February 2023 (UTC)[reply]
    BilledMammal, I wouldn't worry about this. Sometimes an RFC needs an explanation, and sometimes a fair explanation is going to tip editors towards one side or the other. RFC questions, especially procedural ones, shouldn't necessarily try to give equal validity to all options. We can safely assume that Wikipedia editors are smart enough to figure it out. WhatamIdoing (talk) 00:14, 24 February 2023 (UTC)[reply]
    I don't believe an explanation is needed here; the question is one that Wikipedia editors are smart enough to understand without explanation, and while the justification is more extensive, the justification should be part of a !vote, not the statement. BilledMammal (talk) 00:19, 24 February 2023 (UTC)[reply]
    It might be unneeded, but I doubt that it's done any harm, even though you and I are the only editors who have voted against it so far. WhatamIdoing (talk) 00:25, 24 February 2023 (UTC)[reply]
    My advice for the future… when writing an RFC: At the top, just ask the question. Don’t add any explanation. If you think background or explanation is needed, add it to the comments section, or as a separate section entitled “Background” (or something) - By physically separating the question from the background of why you are asking it, you limit accusations of non-neutrality. Blueboar (talk) 02:44, 24 February 2023 (UTC)[reply]
    Yep, write anything else as part of a !vote (or a comment) after the RFC body so it doesn't clog up the RFC listing pages. As for neutrality, I don't think it is crucial for this particular discussion as this is not a contentious topic. Selfstudier (talk) 06:22, 24 February 2023 (UTC)[reply]
    I've written maybe six RfCs in my entire career, so this is something I'm definitely not super experienced in. Your advice--all y'all's advice--is well taken, thank you. Red Slash 07:50, 24 February 2023 (UTC)[reply]
  • Procedural close for the non-neutral RfC statement and lack of RFCBEFORE (including poor format choice with separate supp/opp sections, not previously discussed AFAICT), first choice. Second choice: oppose per Rhodo. I do not believe it's really been vetted by the community, MR and RMs are a walled garden, and RMCI is its Bible. It differs in some major respects from standard practice elsewhere (e.g. NACs, interpretation of involved), and I think that's more due to the walled garden nature of it than any kind of community vetting. We need fewer policies and guidelines, not more. Also, it's noticeable that the impetus for this is a recent kerfuffle and I don't think it's a good idea to promote RMCI to a guideline in an effort to "win" an argument over RM/MR procedure. Levivich (talk) 14:37, 25 February 2023 (UTC)[reply]
    To some extent, the "kerfuffle" is the RFCbefore. If there are problems with RMCI then that needs to be addressed someplace, like, er...here? Two of the three or four editors opining in a move review that RMCI has problems are here opposing, so any issues are getting addressed in the proper place and not in a move review. Selfstudier (talk) 14:44, 26 February 2023 (UTC)[reply]
    A contentious RM followed by a contentious MR (aka, a kerfuffle) makes for a lousy RFCBEFORE, and this very obviously non-neutral RfC statement demonstrates why. And contrary to some surprising advice below, RFCNEUTRAL is for all RFCs, not just contentious topics. Also, this RfC doesn't comply with WP:PGCHANGE as it's not widely advertised, it's at the wrong pump (should be policy for a new PG), and not listed at WP:CENT. This ought to be shut down and restarted the right way: an RFCBEFORE at the PG page, followed by a neutral RfC at the policy pump listed on CENT. Otherwise, it's not really global consensus. Levivich (talk) 16:11, 26 February 2023 (UTC)[reply]
    Nah. PGCHANGE is for changes to policies and guidelines, this isn't that. Policy pump is for existing and proposed policies, isn't that either. No need to list this at WP:CENT afaics. The only procedural screw up was people not raising their disagreement with RMCI at move review talk instead of during a move review. I can do the same thing and just assert that WP:RFCNEUTRAL is only an information page and doesn't apply:) Selfstudier (talk) 16:24, 26 February 2023 (UTC)[reply]
    Oh come on man. WP:PROPOSAL is the policy for making new guidelines and it wasn't followed here at all. (And it has more requirements than PGCHANGE, and it explicitly talks about RFCBEFORE, and VPP, etc.). Please don't argue that a proposal to promote RMCI to a guideline doesn't have to follow the WP:PG policy. Similarly there can be no argument that this RfC question isn't RFCNEUTRAL. If you like the idea, that's fine, but the failure to follow our consensus procedure here is indisputable. Levivich (talk) 16:42, 26 February 2023 (UTC)[reply]
    The proposal, apart from some suggested wording tweaks, seems actionable to me. RMCI has been treated as a guideline in practice and the only issues that have been raised against it are the ones raised in an ongoing move review (by objectors to it). Selfstudier (talk) 16:59, 26 February 2023 (UTC)[reply]
    I've just double-checked, and WP:PROPOSAL, WP:VPP, and WP:VPR are all unanimous in saying that proposals for new PGs or to change PGs should be at the VPP and widely-advertised, so I've moved this from VPR to VPP and listed it at CENT. Levivich (talk) 17:58, 26 February 2023 (UTC)[reply]
    I've collapsed the initial statement and replaced it with a neutral one; if we are going to try and fix this RfC, rather than making a procedural close and opening a new one, then that issue also needs to be fixed - although I hope that the closer considers that prior to 18:05, 26 February 2023 (UTC) the RfC was led by a non-neutral statement that might have biased the result.
    I am also wondering if we should merge the support and oppose sections, per WP:NOTVOTE? BilledMammal (talk) 18:09, 26 February 2023 (UTC)[reply]
    I've struck my !vote and moved it down to discussion, since it's all based on procedure anyway and the procedure is getting fixed. Yes, I think we should combine the support/oppose into one survey section as well. Levivich (talk) 18:14, 26 February 2023 (UTC)[reply]
    Done. BilledMammal (talk) 18:21, 26 February 2023 (UTC)[reply]

Some editors are apparently objecting to the wording rather than the principle. So how to deal with this? Do we fix it now? (subRFCs?) Do we send this back to where it was to start with (proposals)? Selfstudier (talk) 17:28, 27 February 2023 (UTC)[reply]

RfC: Should WP:REFUND allow for articles deleted through the CSD criteria of A7, A9, A11, G5 to be userfied or restored as drafts?

Currently, WP:UDP says:

Requests for undeletion should be used to appeal most instances of proposed deletion and some speedy deletions.

However, there is some debate due to the vagueness of what some speedy deletions actually includes. Currently, WP:REFUND says:

Please do not request that pages deleted under speedy deletion criteria F7, F9, F11, U5, A7, A9, A11, G3, G4, G10, G11 or G12 be undeleted here.

However, as Uanfala has mentioned on the talk page, this provision was boldly added by a now-banned user, and the archives are too divided to show any hard consensus.

Notably, G5 is not currently listed, but several admins have echoed that they refuse to restore material by banned or blocked editors per WP:BMB as a means to enforce bans and dissuade the user from returning. However, other editors advocate that WP:BANREVERT states there is no hard rule to delete potentially useful content, although you are allowed to. This means there is no settled precedent for either supporting or opposing such requests, and it is entirely up to the reviewing administrator to refuse such a request purely based on their outlook on how material added by banned users should be treated. Thus, I think a centralized discussion regarding this would be the best idea going forward.

So, as a repeat from the top: can good faith editors appeal to have material deleted under these CSD criteria moved to a draft or their userspace to work on? (This assumes that there are no other issues with the pages and, in the case of G5, the requesting editor is not affiliated with the banned page creator unless they satisfy WP:PROXYING). And if so, do administrators use their personal best judgment or concede (in most cases) to allow for community-consensus discussion such as WP:AFD and WP:DRV? Why? I Ask (talk) 22:21, 2 March 2023 (UTC)[reply]

Discussion (WP:REFUND)

As the proposer, I will not be advocating one way or another. Throughout past discussions, I have already made my position clear elsewhere and do not want to risk any further WP:BLUDGEONING, though I will respond to any follow-up question directed at this proposal. However, I do want to preemptively state that any attempt to misuse this proposal (by paid editors or returning sock puppets) should be treated as with any other disruptive editing, and thus a supposed risk of people gaming the system should not be the sole reason to oppose, as there are means to prevent abuse. (If we were worried about gaming the system for everything applicable on Wikipedia, and restricted it, then it would not be an encyclopedia anyone can edit. So on that notion, I disagree.) Why? I Ask (talk) 22:21, 2 March 2023 (UTC)[reply]

  • This RfC seems a bit malformed. A7 and A9 seem reasonable to restore, as something with no indication of importance might actually be notable and could have useful material; G5 is a whole different issue, and shouldn't be conflated here. Galobtter (pingó mió) 00:47, 3 March 2023 (UTC)[reply]
    Can you explain that? G5 has the exact same reasoning as what you put for A7 and A9. Perhaps even moreso, as such articles are likely to have been notable and proper articles outright with no problems inherent to them. SilverserenC 02:04, 3 March 2023 (UTC)[reply]
    The reasons for WP:G5 have to do with WP:BMB and not the content of the article - A7 and A9 don't have the same issue of restoration possibly conflicting with the banning policy. Galobtter (pingó mió) 05:48, 3 March 2023 (UTC)[reply]
    The point also being that no one really cares about A7, A9, or A11 restores, but G5 restores as policy would be a definite shift, and is clearly what the RfC is primarily about. Galobtter (pingó mió) 05:54, 3 March 2023 (UTC)[reply]
    For deletions under G5, just create the article again. You could ask for a list of references from the deleted article. I would probably be willing to supply it if anyone requests. · · · Peter Southwood (talk): 06:01, 9 March 2023 (UTC)[reply]
  • Codifying the allowance of this seems like a good idea just to get the subjectivity of admin responsibility out of the mix. Because I've seen plenty of admins claim they don't want to be responsible for restoring the content if it turns out to be bad for whatever reason. Thus, codifying the use of WP:REFUND (and obviously requiring the requester to be an experienced editor in good standing) for this purpose would separate any admin responsibility, as honestly the sole responsibility for the requested articles should be on the person requesting the undeletion. On their head be it, as the saying goes. SilverserenC 02:07, 3 March 2023 (UTC)[reply]
  • Support. Editors who want to reinstate an edit by a sock puppet are permitted to do so, on the condition that they take responsibility for the edit. There is no reason this shouldn't apply to entire articles; I previously did this for Tekla Åberg, but it required jumping through far too many hoops. BilledMammal (talk) 02:42, 3 March 2023 (UTC)[reply]
    There may be attribution problems with this procedure, unlike for reinstating individual edits. You would have to take the article off-wiki and completely rewrite everything added by the banned party. How many editors would we trust to get this right? Any admin can do this already, and there are some editors I would trust, but I would not do this for someone whose work I do not know quite well, because I would be ethically obliged to do the work of checking the new article myself to ensure there were no issues of plagiarism. It would usually be easier to start from scratch, possibly with a list of references from the deleted article, which I would normally be happy to provide. Cheers · · · Peter Southwood (talk): 06:01, 9 March 2023 (UTC)[reply]
    I am a bit confused on what you mean about attribution. This is saying that the articles would be restored with the page history, which would include the edits by the blocked or banned users, so attribution would not be an issue. But if you are arguing that edits by banned or blocked users should never be restored, I am not going to attempt to sway you there. Why? I Ask (talk) 06:15, 9 March 2023 (UTC)[reply]
  • Support, as long as everyone applies common sense about this, and as long as G5-deleted drafts are under the "stewardship" of someone who isn't blocked. (Summoned by bot) I dream of horses (Contribs) (Talk) 03:11, 3 March 2023 (UTC)[reply]
  • Support for A7 and A9. Such text could be useful for related articles, even if the topic itself is not notable. For A11 I'm not sure what the use case would be, but at the same time not sure what the downsides would be. Oppose encouragement of G5 undeletions, admin/editor good faith is abused as it is. CMD (talk) 03:19, 3 March 2023 (UTC)[reply]
    What does "encouragement" mean? Such undeletions are already allowed. How is requiring experienced editors in good standing to be the ones making the request an abuse of good faith? SilverserenC 03:22, 3 March 2023 (UTC)[reply]
    Given that the current REFUND text does not mention G5, I took raising it here to be a proposal to make such a situation more common (encouragement). G5 articles can have deep problems that are not immediately apparent, and the good standing and experience of an editor requires assessment. That is multiple areas where an admin might (should?) apply good faith, and no admin can be expected to be across all G5 history. CMD (talk) 03:34, 3 March 2023 (UTC)[reply]
    Why would G5'ed articles have any more potential issues than any other article? The reason for them being G5'ed is inherently unrelated to the content of the article or even the article existing itself. Since it's about a banned user having made it, not about whether there was anything wrong with the article. It's not like this is about articles that were deleted for being copyvios, blatantly non-neutral/attack articles, or having hoax content. SilverserenC 03:40, 3 March 2023 (UTC)[reply]
    Users can be banned for reasons involving hoax content and copyvios. Once they are, later creations are generally handled under G5 for ease. This does not mean the prior problems are no longer present, just that an admin doesn't need to waste their time finding out. CMD (talk) 03:44, 3 March 2023 (UTC)[reply]
    That would be the only legitimate reason I can see for refusing a G5 undeletion request, but how many banned users who sock were banned for that? It seems vastly more common for someone to be banned for community interaction issues and otherwise long-term bad will, not for issues with article creation. The only exception I'm aware of is serial POV pushers, but those are less involved in article creation and more involved in pushing text into existing articles. Also, it only takes a moment to check on why someone was banned. I've had a number of G5'ed articles restored and their content was fine, such as Ennan Alimov. SilverserenC 03:55, 3 March 2023 (UTC)[reply]
    Creating new articles is just one tool in the serial POV pusher toolbox! As to the question, I do not have any statistics on proportions, I am responding to this RfC from experience of it happening in cases I have seen. Community interaction issues often coincide with content issues, whichever is ultimately the straw that brings about a ban, and simply checking a ban notice misses this history. I have also worked with some sock pages/content, and restored sock content without change. I don't think this means admins should be encouraged to undelete G5 pages they would not currently want to undelete, which the RfC prompts discussion on. CMD (talk) 04:08, 3 March 2023 (UTC)[reply]
    Isn't the easiest way to deal with that to just add it to the REFUND requirements? Any requester of a G5 undeletion must include in the request the reason why the banned editor was banned and how this doesn't conflict with the undeletion request and article requirements. Make the requester do the work. SilverserenC 05:01, 3 March 2023 (UTC)[reply]
    To clarify, in the opening statement of the RfC, I stated that: this assumes that there are no other issues with the pages. If there is a chance the article can be speedily deleted under G3 and G12, then per my opening statement, it should not be restored. Why? I Ask (talk) 03:51, 3 March 2023 (UTC)[reply]
    For clarification about A11, I decided to include it as it is one of the few CSD criteria where it is not actively harmful (such as G3, which are pure hoaxes) or completely futile (such as A3, an article that is empty) to restore. I believe these are the four CSD criteria where there exists some form of potentially useable content. For example, some scientist (for some reason) summarizes their findings here before they are later published in a peer-reviewed journal. This is purely hypothetical, of course, but (as you said) I cannot find a particular downside to including A11 alongside A7 and A9. Why? I Ask (talk) 03:37, 3 March 2023 (UTC)[reply]
  • Editors can already ask for restoration to userspace of content deleted under these criteria, so I'm not sure what the policy question is here. I would oppose anything that encourages people to challenge G5 deletions (and an undeleted G5 is still a G5, no matter in which space it sits; it requires substantial input by others to stop being deleteable). —Kusma (talk) 10:05, 3 March 2023 (UTC)[reply]
    Technically, the policy is WP:UDP, which is vague in what some of the applicable speedy deletion criteria are. Currently, the page WP:REFUND (while not itself a policy) says that A7, A9, and A11 should not be requested to be undeleted, so it does not seem like editors currently are able to request restoration through this venue. This RfC attempts to amend that wording (and potentially some at either WP:UDP or WP:CSD) to formally establish that editors are allowed to request these articles back if they feel they could help in their editing. Would you support A7, A9, and A11 in that case? Why? I Ask (talk) 10:21, 3 March 2023 (UTC)[reply]
    I do not support this, as I believe it will be generally pointless, but I won't argue against it. —Kusma (talk) 10:39, 3 March 2023 (UTC)[reply]
  • Support. I don't see any downsides to allowing (not requiring) refunds of potentially useful content if someone wants it. If the requester abuses this, then that's a behavioural issue that can and should be dealt with as per any other behavioural issue. Thryduulf (talk) 10:07, 3 March 2023 (UTC)[reply]
  • Support for A7/A9; oppose for A11: There is no need to maintain unnecessary administrative hurdles to restoring material that could form the basis of something useful in the hands of the right editor willing to take custodianship of the content - especially so if the restoration is restricted to draft space or user space. Also support for G5 G4 in line with the rationale provided by Filelakeshoe below. Iskandar323 (talk) 10:50, 3 March 2023 (UTC)[reply]
    Do you mean G4, not G5? CMD (talk) 01:30, 4 March 2023 (UTC)[reply]
Yes, thanks. (I'm indifferent to G5 since it's not already listed, so it's already flexible.) Iskandar323 (talk) 06:17, 4 March 2023 (UTC)[reply]
  • Comment (Have !voted above) I was unaware when I made my comment above, but this RfC appears to be a new attempt to open the G5 question following opposition to this idea at Wikipedia talk:Criteria for speedy deletion#Amend G5 to say that uninvolved editors may request undeletion, which itself followed opposition to the idea at Wikipedia:Deletion review/Log/2023 February 27. CMD (talk) 10:58, 3 March 2023 (UTC)[reply]
    Yes, I would suggest to remove the G5 part from the proposal as it has been rejected elsewhere and shouldn't just be forum shopped here. —Kusma (talk) 11:02, 3 March 2023 (UTC)[reply]
    That has to do with the wording of G5, which seems to be opposed (and it never specified user or draftspace as opposed to mainspace). There were also differences in opinion on why it should not be added, with some basing it on the fact that they oppose G5 restoration and some opposing because they simply did not like how it singled out this specific CSD. And there was also support at the deletion review, which is why a centralized discussion to formally decide this for future articles is better. You cannot just try to invalidate this discussion, which stands on its own merits, because you personally oppose it. The fact that half of the votes above support the proposal shows there is a discussion to be had. Why? I Ask (talk) 11:03, 3 March 2023 (UTC)[reply]
    I don't oppose the discussion, but I would oppose the discussion being WP:FORUMSHOPPED. As mentioned, I did not participate in the previous two discussions, and upon finding them I see that my time was taken up by something which seems to be already trending towards consensus against in two existing locations. If you want to bring more attention to ongoing discussions, the correct course of action is to leave a neutral note here pointing towards that discussion. CMD (talk) 11:20, 3 March 2023 (UTC)[reply]
    That you have supporters for your cause here does not change the fact that this is textbook WP:FORUMSHOPping, especially as you have failed to mention the other discussions that CMD has now linked to, which should be taken into account by the closer of this discussion. —Kusma (talk) 11:21, 3 March 2023 (UTC)[reply]
    CMD and Kusma, these two (I guess three) discussions have different scopes. The DRV determines if G5 was applied correctly (and it was). The discussion at CSD determines if text is added to G5 saying that editors may request undeletion (and there is no consensus to add the text there). This is a separate issue involving four CSD criteria, asking whether there is consensus to support the general restoration of deleted content to non-mainspace venues to work on. Why? I Ask (talk) 11:30, 3 March 2023 (UTC)[reply]
    This proposal, if I can call it that, seems to be about WP:REFUND, and removing "A7, A9, A11" from the list which says "Please do not request that pages deleted under speedy deletion criteria F7, F9, F11, U5, A7, A9, A11, G3, G4, G10, G11 or G12 be undeleted here". Although it's obvious you want G5 involved somehow, I'm unclear what changes you actually want. The page already says, or at least implies through its current structure, that you can already request a G5 undeletion. What changes are you looking for? -- zzuuzz (talk) 11:28, 3 March 2023 (UTC)[reply]
    Through the other discussion you have been a part of, some editors have opposed the notion that you should be able to request a G5 undeletion, or held that such content should never be restored no matter what (rendering the ability to request moot). This RfC attempts to formally establish whether or not that is the case. Why? I Ask (talk) 11:31, 3 March 2023 (UTC)[reply]
    I find it painfully clear through the whole of policy that a G5 restoration can be both requested and fulfilled. Though I also understand there is no obligation. Thus I still don't understand what's being proposed. -- zzuuzz (talk) 11:37, 3 March 2023 (UTC)[reply]
    So you support users being able to appeal a G5 deletion and have the content userfied or draftified? Aside from that, it also asks whether or not administrators have the full right to deny such a request purely and systematically on the basis of it being deleted under the four criteria. (This is how most G5 requests for undeletion end up, as some administrators believe that no material by banned editors should ever be restored, without necessarily weighing in on the content of the deleted article in question.) Why? I Ask (talk) 11:51, 3 March 2023 (UTC)[reply]
    Admins can say no to pretty much whatever they want, and rightfully so. They can't always speak for everyone. -- zzuuzz (talk) 12:02, 3 March 2023 (UTC)[reply]
    I find that attitude highly problematic. Why? I Ask (talk) 12:08, 3 March 2023 (UTC)[reply]
    You can tell me all about that one day. We're all volunteers and (pretty much) don't have to do anything. We have more than one admin for many reasons and many occasions. This is one of them. -- zzuuzz (talk) 12:16, 3 March 2023 (UTC)[reply]
    You said that we have more than one admin for a reason, but could you clarify what you mean by that? Why? I Ask (talk) 12:31, 3 March 2023 (UTC)[reply]
    As I said above, there's nothing prohibiting a request for G5'd content. If you find an admin who says no they won't do something, find another admin who will, if you can. There's a lot of stuff I won't do but it doesn't mean it can't be done. -- zzuuzz (talk) 12:52, 3 March 2023 (UTC)[reply]
    If I go admin to admin, is that not forum shopping? Why? I Ask (talk) 14:47, 3 March 2023 (UTC)[reply]
    It is fine, and we even have categories to facilitate finding an admin who might help you with a specific issue: Category:Wikipedia administrators by inclination. —Kusma (talk) 15:48, 3 March 2023 (UTC)[reply]
    It seems strange to be able to get userfication of articles deleted under G5 by raising the issue at multiple admin talk pages to see which one is willing to do so. Why? I Ask (talk) 16:01, 3 March 2023 (UTC)[reply]
  • Not a clear RFC proposal, but as a frequent refunder I will comment. When someone requests a G5 overturn, there is a high chance that they are a sock of the banned user. I like to refer the request to the deleter or blocker of the banned user to make a more informed decision about whether to block the requestor. When it comes to things that were deleted for invalid reasons, I may restore (eg if the file was deleted as a copyvio, but the copy was of a Wikipedia mirror). Even a G5 may be inappropriate if it was edited significantly by someone else. If the deleter is still active, then notifying them of the erroneous delete is a good idea so they can rectify it. Other situations where the circumstances have changed would require a good reason to be given by the requestor. Eg if an A7ed topic is now notable. Graeme Bartlett (talk) 12:07, 3 March 2023 (UTC)[reply]
    This proposal specifies user and draft spaces where notability does not apply. Do you still think a reason should have to always be given for A7, A9, and A11 requests? Why? I Ask (talk) 12:14, 3 March 2023 (UTC)[reply]
  • Support for A7, A9, and A11. But Oppose/conditional support for G5. Conditional in the sense that we should have some guidance about how to handle those requests. It depends on what the user was blocked for. Someone blocked for chronic copyright violations shouldn't have anything restored to public view without a thorough check first. That probably means emailing a draft to the requesting user with a warning rather than a userfication/draftification. Someone blocked for egregious harassment and personal attacks shouldn't have their material restored under any circumstance, because we need to prioritize the health of our community. What's unclear are the other cases. If someone's a chronic edit warrior, did some stupid sock puppetry, couldn't stop making cosmetic edits, or was blocked for any of the other more IMO trivial offenses, I don't see a reason why a refund shouldn't be granted. In other words, a refund can be requested, but if we're going down that road we should have some documentation for admins outlining the conditions when it is or isn't ok. — Rhododendrites talk \\ 12:18, 3 March 2023 (UTC)[reply]
    So you would support an edit or proposal that documents when G5 restorations should be allowed? I think there is room to explore that idea, but that may be a different RfC. Why? I Ask (talk) 12:28, 3 March 2023 (UTC)[reply]
    I've struck "conditional support" for now, as I'd prefer to err on the side of not refunding banned users' content, but think that there are exceptions to that rule that could conceivably be documented in a way that would add it to the list. — Rhododendrites talk \\ 13:57, 3 March 2023 (UTC)[reply]
    Nothing in this proposal would require anybody to refund anything; it would simply explicitly permit them to do so if they wished. Obviously some content should not be restored (copyvios, attacks, etc.), some content should be cleaned before being put in mainspace (very promotional, highly POV, etc.), some content can be cleaned in mainspace (occasional POV problems, suboptimal formatting, globalisation needed, etc.), and some content was deleted solely because some admins believe that "banned means banned" is of greater importance than providing (even occasionally GA- or FA-standard) encyclopaedic content to readers (I vehemently disagree with this, but recognise that it is a widely held view). There are examples of every type deleted per G5. Thryduulf (talk) 13:12, 3 March 2023 (UTC)[reply]
    Nothing in this proposal would require anybody to refund anything - But, as with some admins believe that "banned means banned", there's disagreement about when it should apply. G5 isn't an area I'd feel comfortable just leaving open to individual discretion because some feel that the immediate availability of decent content trumps anything else. While there are absolutely exceptions, I'd want us to err on the side of excluding the content. believe that "banned means banned" is of greater importance than providing (even occasionally GA or FA-standard) encyclopaedic content to readers - You've provided one extreme interpretation of the dispute you're referencing. Here's the other extreme: some admins believe there's no abuse, harassment, or sockpuppetry that's too severe to salvage content because "it's all about the readers", as though encouraging sockpuppetry by toxic personalities doesn't do immeasurable damage to specific individuals/groups or the greater community, doesn't discourage people from contributing, or doesn't dissuade people from participating in the first place. Some believe that damage is just too abstract, so why bother with uncountable hypothetical articles and other contributions from uncountable discouraged/put off users in the future when we can have one more decent article "for the readers" today. A bit stronger than I'd typically use it, but it seemed like a fitting counterpart to a reduction to "solely believe that banned means banned". — Rhododendrites talk \\ 13:57, 3 March 2023 (UTC)[reply]
  • Support A7/A9/A11, Oppose G5 as worded, per Rhododendrites and Galobtter - particularly, I think that G5 merits a separate discussion. casualdejekyll 14:15, 3 March 2023 (UTC)[reply]
  • Support A7/A9/A11. I'm less enthusiastic about A11, but whatever. Oppose G5. Recreating content previously created by a known sock is itself prima-facie evidence of socking. We're already losing the sock battle, no need to give them additional assistance. -- RoySmith (talk) 14:32, 3 March 2023 (UTC)[reply]
    I'm honestly surprised you would make this argument, RoySmith, considering your block of User:Lettler as a sockpuppet when they weren't one. And it appears it likely drove them from the project. A bunch of their articles were also deleted under G5 during the period of your block and you appear to have made no effort to rectify that (I got around to fixing it myself today). SilverserenC 21:52, 4 March 2023 (UTC)[reply]
    @Silver seren there's very little I can say publicly about that case other than to note that it was an extraordinary situation and policy should not be driven by extraordinary events. If there were pages created by Lettler which were deleted under G5, I have no objection to those specific pages being refunded. Looking over Deleted user contributions for Lettler right now, I don't see any such entries. -- RoySmith (talk) 23:08, 4 March 2023 (UTC)[reply]
    Because I had them undeleted just now. I think policy discussions should, however, include cases where prior policy and methods failed and were harmful to editors on the project. In this case, why isn't there a system for someone's articles to be undeleted if they were deleted under false pretenses? We can argue all we want about deletion review or anything else, but the fact of the matter is that it's been over a month and no action was taken to fix this issue until I noticed it and happened to make the effort to correct it, when it's something that should have been done immediately after Lettler was unblocked. SilverserenC 23:11, 4 March 2023 (UTC)[reply]
  • Support A7/A9/A11. As with the above, I Oppose G5, but I'm willing to be convinced in a separate discussion; G5 carries too many other issues to be bundled with the others. --Jayron32 15:04, 3 March 2023 (UTC)[reply]
  • Support removing A7, A9, and G4 (I gave an explanation for the latter on the REFUND talk page). – filelakeshoe (t / c) 🐱 15:31, 3 March 2023 (UTC)[reply]
    @Filelakeshoe: I assume you meant G5 and not G4? Why? I Ask (talk) 15:39, 3 March 2023 (UTC)[reply]
    No I mean G4 - see my comments on Wikipedia talk:Requests for undeletion#Which CSD shouldn't get refunded here?filelakeshoe (t / c) 🐱 15:44, 3 March 2023 (UTC)[reply]
    I see! I believe most administrators are usually willing to userfy such content. I've never seen a request for that get opposed. I would be willing to add G4 to the RfC, as well. Why? I Ask (talk) 15:47, 3 March 2023 (UTC)[reply]
  • Support for the A-series criteria: these are meant to apply to articles; the deletion concern is addressed by moving the page out of article space. Oppose for G5 per WP:BMB; as I said in the parallel discussion happening at WT:CSD (*cough* WP:FORUMSHOPPING *cough*), I'll gladly provide email copies of G5-deleted articles to any editor in good standing who asks so that they can write an article in their own words, but I won't restore them. I intend to continue this practice regardless of the outcome of these two discussions. Ivanvector (Talk/Edits) 15:52, 3 March 2023 (UTC)[reply]
  • Just a small request: the undeletion policy should provide enough flexibility to supply a list of the citations and references used in a deleted article, so that someone who wishes to take stock of those earlier references while writing afresh can do so in most circumstances. There should also be some flexibility when someone wants a deleted article restored just to retain the article history for attribution, but will be rewriting the article almost from scratch.
    Sorry for not being able to review the RfC proposal in detail and only commenting on my minimal expectations. I hope my comment is relevant enough and makes some sense. Bookku (talk) 12:33, 4 March 2023 (UTC)[reply]
    Yes, the main reason this is so important is because of WP:RUD. If a user wishes to use deleted content, it must be restored for attribution per the legal policy. Why? I Ask (talk) 12:38, 4 March 2023 (UTC)[reply]
  • Oppose allowing restoration of G5ed articles at REFUND, per WP:BMB, and because REFUND is only intended for uncontroversial cases (which does not include G5). I don't think we actually need this language at all, it's only really there to stop new users making requests which are very unlikely to be granted, and if you request that an article deleted under the A criteria be moved to draftspace for improvement then it will very likely be granted already unless there's something else wrong with the content. This is only being proposed here because the OP's attempt at WT:CSD to add a right to get G5ed pages restored didn't go well. Hut 8.5 19:38, 6 March 2023 (UTC)[reply]
    Can you point out the policy that says G5 is considered "controversial"? From my understanding, everything deleted without a discussion (e.g., PRODs) is considered uncontroversial. If what you are saying is true, then the text at WP:REFUND needs to be updated to clarify what is considered "controversial". Also, looking through the archives, there are multiple instances of experienced editors having their requests to restore pages deleted under the A7–A11 criteria denied. The text may need to be more explicit about how such cases can be restored to non-mainspace venues if that is already possible. Why? I Ask (talk) 21:24, 6 March 2023 (UTC)[reply]
  • Question Is there a question to be decided here, or is this just a general discussion? I am seeing supports. · · · Peter Southwood (talk): 05:40, 9 March 2023 (UTC)[reply]
    The question is literally the name of the topic of the discussion. It starts with a capital letter and ends with a question mark. --Jayron32 18:25, 9 March 2023 (UTC)[reply]
  • Support. Yes, they can even be restored straight to mainspace if requested. Contributions are under the CC BY-SA 3.0 License, so even if the creator is banned, anyone uninvolved and in good standing should be free to request undeletion, and in most cases it should be accepted. We shouldn't ignore the fact that this may encourage users to evade bans, though, so maybe not in all cases; but I'd say if the requester is uninvolved and in good standing, it's generally fine. Crouch, Swale (talk) 19:42, 10 March 2023 (UTC)[reply]
  • Oppose detailed restrictive/prescriptive instructions at WP:REFUND. It is the right place for a newcomer's first request. The responsibility for refunding lies with the admin. If an admin doesn't know when it's appropriate to refund a G5, either they shouldn't be an admin, or they shouldn't be undeleting G5 deletions. —SmokeyJoe (talk) 20:49, 10 March 2023 (UTC)[reply]
  • As others have said, I don't think it makes sense to bundle these questions together. Support A7 and A9; they should clearly be restorable because they are about simple, straightforward flaws that may in fact be easy to correct, sometimes with nothing more than a single sentence cited to sources already in the article or with the addition of an easily-found source or two. A11, definitely no, not ever - unless someone completely screwed up there is nothing of value to the project in an A11 undeletion; and if someone screwed up we have proper channels for that. When people post made-up stuff in article space we want them to stop it and either work on not-made-up stuff or go away, we don't want them to move it to userspace; and unlike A7/A9 the flaw that got the article deleted is not fixable. G5 is more complex in that we have to weigh the desire to discourage block-evasion against the potential to recover stuff of value, but I'd say no to G5 because making it too easy for blocked users to leave meaningful impacts on Wikipedia encourages block-evasion. This is especially true for editors who were blocked for tendentious / POV editing; assuming their POV is a widespread one and therefore inevitably well-represented among editors, they could almost always count on an editor who shares their beliefs recovering their deleted articles, which would defeat the whole purpose of blocking them in the first place. --Aquillion (talk) 21:20, 10 March 2023 (UTC)[reply]
Support as an option for A7, A9, and G5; making it clear that "yes an admin can do this", not "yes someone must do this". I think there should also be an option in these cases of "restore / provide the list of sources" (or provide confirmation that there were no sources), leaving the article text deleted. DavidLeeLambert (talk) 19:23, 18 March 2023 (UTC)[reply]

RfC about turning Wikipedia:Naming conventions (conflicts and protests) into an official naming convention policy

Please see the RfC at Wikipedia talk:Naming conventions (conflicts and protests)#RfC about turning Wikipedia:Naming conventions (conflicts and protests) into an official naming convention policy. --- Tbf69 P • T 17:27, 4 March 2023 (UTC)[reply]

RfC proposing a ban on most misconduct/disruption filings at ANI for issues arising in the WP:GENSEX topic area, automatically routing such complaints to WP:AE

Hi folks, there's currently a discussion under way at the AN proposing that complaints about conduct and disruption which arise out of editing that touches upon gender or sexuality (a WP:CTOP subject matter) be generally closed on sight and deferred instead to WP:Arbitration Enforcement, for handling by an admin, rather than by the community at large at ANI. Presently the issue is being debated mostly by those admins who happen to be present at the noticeboard just now, and a handful of other editors who were involved with or commented upon the divisive ANI thread that gave rise to this proposal. I believe the implications of this discussion are quite significant with regard to how the community handles disruption in this and other contentious topic areas, and that the proposal should accordingly be subject to and benefit from broad community scrutiny, so if you have a moment, please consider stopping in to review and possibly share your perspective. SnowRise let's rap 22:08, 5 March 2023 (UTC)[reply]

Discussions between wiki editors and corporations

Seeking comment -

General issue is that a paid editor has come to the talk page of the Wikipedia article for their employer's organization, and I asked them to ask the communications department to develop Wikipedia articles in their field of expertise. To what extent are Wikipedia article talk pages suitable for such requests?

Bluerasberry (talk) 16:10, 6 March 2023 (UTC)[reply]

You had a paid editor making edit requests which failed verification, so you asked them to develop other articles? That seems counterproductive, if they can't even provide correct references about their own organisation why would you ask them to write about other stuff? Fram (talk) 16:20, 6 March 2023 (UTC)[reply]
@Fram: It is one of the best-funded organizations in the world. They have decades of expertise in humanitarian development. Their one paid editor may have stumbled, but they have excellent content holdings in humanitarian relief, child development in lower- and middle-income countries, and regional crises. They have expertise and media which are high quality but unfortunately difficult to access; Wikipedia has very popular articles in their field which need quality content like theirs. To the extent that they have communication goals, the relationship seems mutually beneficial.
I would say the same for many other organizations, but this one seems like a fair place to start, as their revenue is more than US$1 billion a year, and for that reason, I think they can handle being asked in public. If we can sort out a process, then perhaps we can present requests to smaller organizations like museums, universities, government agencies, and research institutes when they send their paid editors to the wiki. Bluerasberry (talk) 17:03, 7 March 2023 (UTC)[reply]
I think that the COI editor's talk page or one of the forums at WP:Village Pump might be a better place than the article talk page to hold conversations unrelated to the improvement of the article. For example, Wikipedia:Village pump (proposals) could be a good venue. Thinker78 (talk) 19:17, 8 March 2023 (UTC)[reply]

A couple of thoughts:

  • If by "failed fact checking" you mean something that could comply with real-world fact checking but not fulfill wp:ver, I would not hold that against them. There's a big difference between the two.
  • I don't think that article talk page discussions, as normally implemented, would be so strict as to say it would be improper to discuss it there. However, the question is where is the best / most appropriate place to have those discussions. If they are about related articles that relate to the article-subject organization, then perhaps the article talk page is best. Otherwise the editor's talk page might be better.

North8000 (talk) 18:15, 6 March 2023 (UTC)[reply]

I marked failed verification because I felt the claim was "X does Y" when X did not appear in the cited source. It was not debatable. Bluerasberry (talk) 17:03, 7 March 2023 (UTC)[reply]
I didn't question any finding regarding compliance with WP:Ver. North8000 (talk) 19:58, 7 March 2023 (UTC)[reply]

I agree with North8000: if the conversation is going to affect how others edit the article in question, then the article talk page is best, since all interested editors can be involved. If it's a conversation between you and the other editor on what else they can do (even if it's within the same field), the audience is different, and it would be better to discuss it somewhere else, such as the editor's talk page or a related active WikiProject talk page.

Note there are potential conflict of interest issues with the communications department of a company editing pages in a related field. If Wikipedia's Animation page had a notice, "This page has been edited by the Walt Disney Company's communication department," even though they are indeed experts in the field, it could cast doubts in the minds of readers. isaacl (talk) 18:32, 6 March 2023 (UTC)[reply]

@Isaacl: I think all of this is new territory not previously discussed. If you see a place to take this discussion then I would post more wherever you suggest.
To take your example, I think there would be a difference between a public relations representative from Disney editing about the corporation and their professional animators editing about their craft. I do not think any corporation has ever approached Wikipedia offering to edit content in their field of expertise. So far as I know, 100% of the hundreds of thousands of WP:COI issues are about branding and marketing, not about purported attempts to benevolently share expertise.
Some differences with the case I am presenting: World Vision is nonprofit, they are recognized as an expert organization on humanitarian issues, and they produce data, research, and media which is the basis of a lot of other research. I would be keen on getting their input on, for example, child development in India. I think it is fine to be wary, and perhaps we should develop rules, but development of that article seems less risky to me than their own development of the article about their corporation. Bluerasberry (talk) 16:20, 9 March 2023 (UTC)[reply]
Since you presented a general case without specifics, I discussed the matter in general. I have in the past discussed hypothetical scenarios where an organization such as a historical association might want to improve coverage in Wikipedia of history. We need to carefully consider, though, the effect of a paid-editing notice on the article in question with respect to readers' trust in the article. In some cases, it may enhance trust; in others, it might be better for source material to be provided and for unassociated editors to use it to devise appropriate changes. I think the best approach will differ based on the specific circumstances. isaacl (talk) 17:25, 9 March 2023 (UTC)[reply]
@Isaacl: So far as I know, the notice you mention - {{COI}} - has no documented precedent for use in tagging articles with contributions from subject matter experts, regardless of their employers. Its use as reported in Special:WhatLinksHere/Template:COI shows almost all brands, not topics of general interest. Where it is on topics of general interest the problem often seems to be ideological bias, like a religious or social issue. Why regulate expert editing when the activity has not been tried? I am unaware of anyone ever identifying a problem with subject matter experts editing in their field of expertise. Why start from the presumption that it is comparable to marketing edits? Bluerasberry (talk) 20:03, 9 March 2023 (UTC)[reply]
If I understand correctly, you're asking an organization to assign someone as part of their work duties to create Wikipedia content. That makes them a paid editor and thus a paid-contribution disclosure is required. This is different than an individual choosing to edit a topic about which they are knowledgeable and is related to their employer. (There have been discussions in the past regarding potential conflicts for this second case. Previously I've commented that I think the community shouldn't treat this as always being an actual conflict of interest, precisely because I think this would hamper subject matter experts from contributing.) isaacl (talk) 02:02, 10 March 2023 (UTC)[reply]
@Isaacl: I think there are two ways of thinking about this
  • paid versus not paid
  • marketing versus general knowledge
In my view, the conflict of interest problems arise from marketing. I do not think we have a history of COI problems from subject matter experts editing on topics unrelated to a brand. I think a paid editing disclosure would be fine for user pages, because we have a policy for that, but I do not see a reason to preemptively ask for them on a wiki page when a subject matter expert, paid or not, edits in their field of expertise. If there is a problem then we could tag them, but among the 100,000+ COI cases, I do not think we have a single model case around experts editing general knowledge. Bluerasberry (talk) 12:12, 11 March 2023 (UTC)[reply]
A declaration by the paid editor on their user page meets the terms of use requirements. However, this doesn't preclude additional disclosures being placed by others on the article talk page. The paid editor might also choose to disclose on the article talk page as well for transparency (and there is an ongoing discussion as part of the terms of use update about making disclosure mandatory in three locations instead of the current requirement for just one of them: the user's page, the article talk page, and the edit summary). My guess is that editor consensus on this matter is going to depend on the specific circumstances. isaacl (talk) 17:47, 11 March 2023 (UTC)[reply]

As an aside, I've done a fair amount of putting in or deferring declared-COI requested edits. IMO declared ones usually don't do much bad stuff... they know they are under a magnifying glass and are pretty cautious. North8000 (talk) 18:46, 6 March 2023 (UTC)[reply]

Contesting a Blank and Redirect

What is the proper forum for resolving a dispute over whether an article should be blanked and redirected, cut down to a redirect to an article that is considered a parent article? A plausible-sounding answer that is completely useless is that this decision, since it does not involve deletion, should be done by normal editorial discussion. That is true but useless, because it doesn't answer what should be done when discussion is inconclusive. What binding consensus process is used to decide whether to cut an article down to a redirect or to let it stand? One area in which I have seen edit wars is music disputes, typically over whether a song should have an article, or be redirected to the album, or sometimes whether an album should be redirected to the artist. Many content disputes can be resolved by RFC, but is there another forum? Can Redirects for Discussion be used to ask that a redirect be replaced by an article? Robert McClenon (talk) 02:23, 8 March 2023 (UTC)[reply]

In this situation, isn't it the article rather than the redirect which is at issue? If so, in the case of an article<->redirect controversy, WP:AFD would be the venue to establish resolution. AllyD (talk) 06:57, 8 March 2023 (UTC)[reply]
@Robert McClenon and AllyD: per WP:ATD-R the appropriate forum is AFD (see also the pertinent RFC). There's some additional nuance in practice when long-standing redirects that have not previously been articles are converted into them, but ultimately AFD is still the best way to resolve these disputes. 74.73.224.126 (talk) 07:03, 8 March 2023 (UTC)[reply]
User:AllyD - What if the user who is requesting dispute resolution is the originator of the article which is being redirected? Can that editor file a procedural Articles for Deletion nomination in order to argue that the article should be Kept, when another editor has been redirecting it? Robert McClenon (talk) 08:20, 8 March 2023 (UTC)[reply]
It would be better form for the "redirector", in recognition that their attempted action is controversial, to themselves bring the article to AfD with a rationale about the topic's non-notability; however, failing that, I see no reason why the article originator should not bring it to AfD explaining that they consider the topic notable but that this is disputed. I feel I have seen some such AfD cases recently - possibly in the controversial area of changing sport notability criteria? AllyD (talk) 09:03, 8 March 2023 (UTC)[reply]
That's more or less my thought too. In theory such a situation should not arise, since there should be consensus before a BLAR is repeated (analogous to how "no consensus" outcomes at AFD result in retention); in practice it can be complicated. I can't speak all that well to recent events, but procedural nominations are an accepted part of AFD and quite common [14]. In sum, NOTBURO. 74.73.224.126 (talk) 14:59, 8 March 2023 (UTC)[reply]
If you choose the "talk page" route suggested as one of the multiple suitable venues, then treating it as a case of a contentious Wikipedia:Proposed article mergers would probably be functional, too. WhatamIdoing (talk) 04:16, 9 March 2023 (UTC)[reply]
  • Note: Redirecting is a normal editing action; it doesn't require any special tools, not even autoconfirmation. As such, it is subject to the same principles as any edit. Prior approval is not required, though if someone objects or undoes it, it is best to discuss it and reach consensus before doing it again. While redirecting is sometimes the result of an AFD (in which case it should be considered standing consensus), it doesn't need to be. AFD should probably not be used unless deletion is earnestly sought; if you want to just redirect the article, and there's opposition to doing so, the article talk page is sufficient for handling such discussion. WP:RFD is available, but that's mainly for when redirects need to be deleted, or moved to a new target, or perhaps the redirect is protected from editing and that needs to be remedied. --Jayron32 15:51, 10 March 2023 (UTC)[reply]

Vandalism by BoyTheKingCanDance

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I usually add new content and regularly update it, which can take some time. However, BoyTheKingCanDance, who is a Pending Changes Reviewer & New Page Reviewer, often comes and vandalizes the entire article. For example, he recently deleted almost everything, including general information, from my new article "Deobandi hadith studies," which I had created just an hour ago and was still working on. Another example of his vandalism is Idarat al-Mabahith al-Fiqhiya.–MinisterOfReligion (Talk) 05:15, 8 March 2023 (UTC)[reply]

@Owais Al Qarni, you should put an {{In use}} tag in the article whilst you are working. This discourages others from editing an article that someone else is actively working on. ─ The Aafī (talk) 05:19, 8 March 2023 (UTC)[reply]
Also, this venue is not a very good place to report such cases. You should have discussed the issue with @BoyTheKingCanDance on their talk page, or on the article's talk page, and then perhaps gone through WP:ANI. Please make sure you use an "under construction/in use" tag while you are working on an article. This helps to avoid unexpected edit conflicts. ─ The Aafī (talk) 05:23, 8 March 2023 (UTC)[reply]
The advice above is good advice. It prevents this happening. Best wishes, BoyTheKingCanDance (talk) 05:24, 8 March 2023 (UTC)[reply]
I posted this on Owais's talk page: "Hi my friend, the new page review is one of the last approval stages, so when I see a page with huge amounts of unreferenced information, and no {{In use}} tag indicating that the page is undergoing a major build, I tend to err on the side of caution and remove the unreferenced information. The easiest solution is to add the tag to indicate the page is under construction. This prevents editors from editing until the process is complete. Please assume good faith of me, as I do of you. Very best wishes, BoyTheKingCanDance". It is sincerely meant. BoyTheKingCanDance (talk) 05:48, 8 March 2023 (UTC)[reply]
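As a concrete illustration of the advice in this thread (a minimal sketch; the article text shown is only a placeholder), the tag goes at the very top of the article's wikitext and is removed when the editing session is finished:

```wikitext
<!-- At the top of the article's wikitext while actively editing: -->
{{In use}}

'''Deobandi hadith studies''' is ...

<!-- Remove the tag when the editing session is finished. -->
```

While the tag is present, other editors are asked to hold off on edits, which avoids the edit conflicts and premature cleanup described above.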
User:Owais Al Qarni - Do you actually think that User:BoyTheKingCanDance is engaging in vandalism, or are you yelling vandalism? If you really think that their conduct is vandalism, report it at the vandalism noticeboard. If you have been editing Wikipedia long enough to know what is vandalism, you have been editing long enough to know what is not vandalism. Robert McClenon (talk) 08:16, 8 March 2023 (UTC)[reply]
@Robert McClenon: I regularly contribute to Bengali Wikipedia. However, there is a policy that prohibits adding any tags to newly created articles for a period of 48 hours. I am unsure if a similar policy exists on English Wikipedia. Crafting a comprehensive article usually takes time and effort, and it can be discouraging and extremely frustrating to see everything deleted immediately after publication, especially when I am constantly trying to improve.–MinisterOfReligion (Talk) 09:03, 8 March 2023 (UTC)[reply]
Consider creating the article first in draftspace or your user sandbox. Curbon7 (talk) 12:44, 8 March 2023 (UTC)[reply]
User:Owais Al Qarni - You didn't answer my question about whether the edits by the other editor were intended to harm the encyclopedia. If they are, you can report them at the vandalism noticeboard. Since you didn't answer that question, and based on what else you have said, it appears that you were yelling vandalism. That is a personal attack. Do not yell vandalism if the edits are not vandalism. Robert McClenon (talk) 16:27, 8 March 2023 (UTC)[reply]

The problem isn't vandalism by BoyTheKing, but promotional editing by Owais Al Qarni. However, that should be taken up on their user talk page or, if it persists, at a dedicated noticeboard, not on this village pump, which has nothing to do either with users vandalizing or with users creating promo articles. Fram (talk) 13:32, 10 March 2023 (UTC)[reply]

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Complaint about article assessment

It is my experience that many article raters routinely rate newly-created articles as "start." With a few exceptions, I don't consider articles I create as "start" and I am irritated by a superficial rating of an article.

According to Wikipedia:Content assessment a start article is "incomplete" and "most readers will need more." Plus, a start article "needs substantial improvement in content and organization." My most recent creation, Golden Triangle of Meat-packing, was rated start. The article consists of 1,246 words (16,966 bytes), 29 references, numerous links, a map, and a photo. I didn't add an infobox or a bibliography to the article as neither seemed necessary given the content. To my mind this article as it now exists exceeds the "start" rating. It tells the average reader all they need or want to know about the subject, and the content, drafting, organization, etc. are at least adequate. So, my question is: what do I do to remedy what I consider an error: complain to the rater, rate the article myself, or delete the rating and wait until another rater comes along?

I also believe it would be a good idea for an appropriate authority to remind article raters to refresh their memories as to what the rating guidelines say and tell them to undertake their chosen task of rating articles with more due diligence. Smallchief (talk) 12:28, 8 March 2023 (UTC)[reply]

When I start a new article or expand an existing article, if I create a talk page for it with project banners (I don't bother very often these days), I always rate the article as "start", no matter how long or well sourced it is (I try to avoid creating stubs). I leave it up to other interested editors to decide if it rates higher. Frankly, I don't look at quality ratings, and make my own judgments on whether, and how, an article should be improved. Donald Albury 14:00, 8 March 2023 (UTC)[reply]
Rather than always rate the article as "start", no matter how long or well sourced it is, if you leave it unrated it will populate the relevant maintenance categories, which the relevant wikiprojects tend to keep an eye on, allowing someone else to quickly identify it as needing a rating. Just throwing in a start rating, on the other hand, categorises the article among all other start-class articles, meaning it's less likely to catch the eyes of wikiprojects. Ljleppan (talk) 06:53, 10 March 2023 (UTC)[reply]
I'm one of those who uses the Wikipedia 1.0 Server's list of unassessed articles to target, so that I can review them, and so that wikiprojects have an overview of the progress and work remaining in their area. By putting in a rating without reviewing the article, you're giving a false statement to other editors, disrupting the work of those editors, and lowering the quality of that information. And for what? Are you using a tool that forces you to enter a quality rating? If so, can you use a better tool? I appreciate that you personally don't look at quality ratings and for the most part they shouldn't be taken very seriously, but how do you get from that to it being okay to actively degrade that system for everyone else? MartinPoulter (talk) 20:44, 13 March 2023 (UTC)[reply]
I leave it blank, and have a feeling that C-class is the most common judgement bestowed upon me. A nice middle ranking perhaps. However, as Donald notes, the lower quality rankings are essentially meaningless, so I wouldn't pay them too much mind. All sorts of ranked articles are submitted to WP:GAN for example, following which they suddenly transition directly to GA-class. CMD (talk) 14:23, 8 March 2023 (UTC)[reply]
Yeah, I ignore the ratings on my articles. The rating system is pretty much broken. There's way too many possible ratings, the criteria are too vague, and ratings are applied haphazardly by inexperienced and/or unskilled editors. -- RoySmith (talk) 14:47, 8 March 2023 (UTC)[reply]

Thanks for the advice, guys. I guess I'll accept the "start" rating as trivial and not worth my time or yours. Smallchief (talk) 00:19, 9 March 2023 (UTC)[reply]

@Smallchief: It's actually been updated to C-class. Outside of some weird RfA opposes along the lines of has mostly created start-class articles, I've never seen anyone overly bothered over whether someone's creations are rated as stub, start, or C. In the future, if you believe an article you created is a C then feel free to rate it as such on your own, as I can't imagine anyone giving you trouble for it. For B-class it's probably better to make a friendly request for independent review of the article against the criteria on the user talk page of someone who's active in the area, or at an appropriate WikiProject if any are active; however, there's no requirement to do so and I doubt anyone would be too bothered unless you made a habit of self-aggrandizing bad ratings.
Thanks for writing up a nice article. 74.73.224.126 (talk) 01:38, 9 March 2023 (UTC)[reply]
Thanks. Smallchief (talk) 01:47, 9 March 2023 (UTC)[reply]
Ratings except for GA, A, FA, and FL do not need to be granted by an uninvolved authority. Editors who know how to write decent articles have always been assumed to be capable of reading and following the directions so that they can assess their own articles. If someone gets it wrong, just fix it and don't give it another thought. If you want an "objective" reason, then the ORES ratings are available through User:Evad37/rater. It says that C-class is the most likely rating for your article on meat packing (with a substantial chance of either Start- or B-class being preferred – in other words, I'd choose C, but any of them could be right). WhatamIdoing (talk) 05:07, 9 March 2023 (UTC)[reply]
Yeah. Just rate your articles as C or B yourself, if you really believe they meet the criteria. (However, B needs to be assessed by a non-creator in the MilHist area, which is done in a routinized manner on a dedicated requests page). —Alalch E. 19:48, 13 March 2023 (UTC)[reply]

Survey on replacing templates with WP:LST

In the last few weeks there has been a flood of deletions of sports templates at Wikipedia:Templates for discussion, with several thousand being deleted. Typically they only get one or two votes from the same handful of TfD regulars who vote delete for everything. Many times users subst a template and then send it to TfD as unused, which seems like WP:GAMING. I believe a larger community review is needed to root out such dubious practices among TfD regulars, who have been conducting large-scale deletions virtually unchecked. A lot of templates are being deleted by replacing them with WP:LST. LST is a recent technology and not widely understood by most users yet, while templates have been around for a long time and are easy to use. There should be explicit community consensus if we are going to do a large-scale replacement of templates with LST. So I would like to hear what the larger community outside TfD regulars has to say about this. 2409:408C:AD8D:273D:0:0:43C8:AC0B (talk) 17:39, 10 March 2023 (UTC)[reply]
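For readers unfamiliar with the mechanism described above: substituting copies a template's expanded wikitext into the article and removes the transclusion, after which the template registers as unused. A minimal sketch (the template name here is hypothetical):

```wikitext
<!-- Replacing the transclusion -->
{{2016 Example national roster}}
<!-- with a substitution -->
{{subst:2016 Example national roster}}
<!-- leaves a copy of the template's wikitext in the article when the
     page is saved; the template then shows up as "unused". -->
```

This is why orphaning a template and then nominating it as unused is seen by some as presenting a fait accompli.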

  • I am an old fart who finds even the use of templates confusing. That said, I took a look at the LST page, and I am lost. It seems to be written for coders by coders - and not for the average user. I have no clue what WP:LST is saying or what I would use it for. If I needed to edit a page that used LST - I simply couldn’t.
    I am increasingly thinking that WP is no longer “an encyclopedia that anyone can edit”… since the amount of technical knowledge required to do so is growing beyond my abilities. Sad. Blueboar (talk) 18:03, 10 March 2023 (UTC)[reply]
  • Labeled section transclusion introduces a dependency between the source article and the articles transcluding the sections in question. The source article needs to preserve the labels being used to identify the transcluded text, whether those are the section headers or special transclusion labels. You didn't specify how they were being used for the content you were referring to, but I'm guessing for sports templates, it was tabular data. Tables are generally maintainable as standalone entities and so are well-suited for being placed on a separate template page. Personally I feel even if the table is only being used in a small number of places (or even one), it can be desirable to keep a frequently updated table on a separate page. In theory if the updates end (say, at the end of a season), the template could be substituted onto the one page it was used, but I don't see much benefit at that point. (It would shorten any applicable lists of template categories, but I don't think that will have much practical effect.)
  • Labeled section transclusion in mainspace is an interesting possibility for reusing a section of text between related articles. Text is generally not as easy to maintain in a standalone template, though, as the surrounding context plays a role crafting the text appropriately. Labeled section transclusion allows context for one location to be immediately visible, but not the context for the other places where it is used. Comments can be used to point to the other places (which would also help avoid the labels being changed inadvertently). Given the additional technical complexity, though, it might be preferable to just use comments to point to other places that should be updated, and rely on them being updated manually instead of automatically through transclusion. Labeled section transclusion for text sections is, I feel, a better fit for some content in the Wikipedia namespace, in specific cases where dealing with possible breakage is more manageable.
  • On a side note, labeled section transclusion isn't new; it's been around for a decade (I'm not sure how long the feature to include text based on the headings has been around, but based on the help page history, at least nine years). isaacl (talk) 18:34, 10 March 2023 (UTC)[reply]
    It sounds interesting and possibly useful, but I would agree with Blueboar that WP:LST is nearly incomprehensible to someone not already well versed in the nuances. Even looking at the few pages listed there as examples wasn't helpful in trying to understand how the tables/text interact, where to update information, or generally how/when to use LST. I'd agree with the original poster that more consensus (and probably explanation/education) is needed before LST is used as a replacement for templates. —Carter (Tcr25) (talk) 19:32, 10 March 2023 (UTC)[reply]
    @Blueboar and Tcr25: The WP:LST page is complicated, because it's trying to describe all of the possible syntactical forms. Basically, if you have two articles, Foo and Bar, and you want some content to appear identically in both, you would first choose which one is to actually have this as editable content. Let's say that it's Foo - in this article, you would have:
    ... content that is only relevant to Foo. <section begin=MySection />This content is relevant to both articles.<section end=MySection /> More content that is only relevant to Foo ...
    
    and this displays as
    ... content that is only relevant to Foo. This content is relevant to both articles. More content that is only relevant to Foo ...
    Then in the article Bar, you would have:
    ... content that is only relevant to Bar. {{#lst:Foo|MySection}} More content that is only relevant to Bar ...
    
    and this would display as
    ... content that is only relevant to Bar. This content is relevant to both articles. More content that is only relevant to Bar ...
    Then when at some future point you edit that part of Foo that is between the two <section /> tags, the change will propagate to Bar exactly like an edit to a template would. --Redrose64 🌹 (talk) 23:36, 10 March 2023 (UTC)[reply]
  • It looks like there are several things that need to be made more comprehensible. WP:LST needs to be improved, TfD discussions that invoke it need to be improved, and the post here needs to be improved to give us some examples of what the problem is. Phil Bridger (talk) 20:49, 10 March 2023 (UTC)[reply]
    @Phil Bridger: in Wikipedia:Templates for discussion/Log/2023 March 10, User:Frietjes has nominated over 120 templates for deletion after replacing them with LST. She and others at TFD never follow the approach of "I propose deleting these templates after replacing them with LST for this reason"; it is always "I have replaced these templates with LST and they should be deleted". This is presenting a WP:FAITACCOMPLI to voters. There ought to be clear consensus from the larger community about the necessity of this before doing such large-scale deletion for questionable benefit. Many of these templates have been around for over a decade and are present in hundreds of old revisions of articles. This kind of deletion of harmless templates leaves a mass of red links in old article versions. I have not seen any clear argument as to what advantage LST has over templates. In fact, in my experience LST content is harder to edit than templates, harder even than Wikidata. At least for templates you have "V. T. E" buttons, and a pencil icon for Wikidata items, that take you to the page that needs to be edited. This is not always present in LST: you have to view the page source to try and figure out where the content is coming from, which is not easy. Many times I have just given up instead of going through such hassle for updating cricket stats. Tables from LST cannot be edited using the visual editor, with no indication that they are coming from some other page. All of these are barriers to editing Wikipedia. 2409:408C:AE10:FC75:0:0:4348:8811 (talk) 06:12, 11 March 2023 (UTC)[reply]
    I'm sure there was an explicit rule against orphaning templates before sending them to TfD as "unused". All I can find for now is WP:TFD#Discussion which does say Templates are rarely orphaned—that is, removed from pages that transclude them—before the discussion is closed. I unfortunately didn't supply a link in this post or this post eight years ago. --Redrose64 🌹 (talk) 10:21, 11 March 2023 (UTC)[reply]
    I think those who are maintaining the content should be able to decide for themselves the mechanism that works best for them. Thus I disagree with the example of changing a list of athletes from a template to labeled section transclusion if those who are interested in maintaining the content prefer using a template. I appreciate that the content may be complete in terms of membership, but in addition to monitoring for vandalism, it could undergo future revisions in formatting or associated information. Let those interested in doing the work manage how they want to do it. isaacl (talk) 17:26, 11 March 2023 (UTC)[reply]
    I point you to WP:OWN. Gonnym (talk) 20:13, 11 March 2023 (UTC)[reply]
    A collaborative project shouldn't decide to hand an ongoing bill to a group of people working on a task and say, here, keep paying this bill for something I bought. That's one group of people taking ownership of a decision but not paying for it. isaacl (talk) 21:40, 11 March 2023 (UTC)[reply]
    I agree with you, but on the other side, do you remember the jokes about the coder who couldn't be fired, because he made the payroll system such a mess of spaghetti code, and if he left, nobody else could get paid? We don't want to interfere with the people who are doing the work, but we also don't want to be left with a system that nobody else can use. WhatamIdoing (talk) 21:42, 17 March 2023 (UTC)[reply]
    Yes, as I mentioned in another comment, I agree there is a longer-term issue with generational turnover of page watchers. However it's not clear to me that deleting these particular nominated templates will make much difference. isaacl (talk) 20:04, 18 March 2023 (UTC)[reply]
  • I am familiar with LST. I've used it recently for selective table row transclusion. Templates should not consist of article content that needs verification, even if the content is in tabular form. I support pivoting from templates to LST in this set of cases, despite the fact that some maintainers will have difficulty adapting for a certain period. It's worth it. I support Frietjes's nominations and the actions taken at TFD. Whenever transcluding, it is important to point editors to the source page where they can edit the content. Sometimes, custom hatnotes are needed for this. Editing transcluded content is not more complicated than editing a templated table with no edit button. —Alalch E. 19:48, 11 March 2023 (UTC)[reply]
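A minimal sketch of the selective table row transclusion mentioned above (the article name and section label are hypothetical): the source article wraps a row in section tags, and the target article supplies its own table header and transcludes just that row.

```wikitext
<!-- In the source article "List of champions", label one row: -->
{| class="wikitable"
! Year !! Champion
<section begin=row2023 />|-
| 2023 || Example FC
<section end=row2023 />|-
| 2022 || Other FC
|}

<!-- In the target article, reproduce the table header and
     transclude only the labeled row: -->
{| class="wikitable"
! Year !! Champion
{{#lst:List of champions|row2023}}
|}
```

Because the section tags are stripped from the rendered output, the source table displays normally, while edits to the labeled row propagate to the target the same way a template edit would.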
    Is the crux of the issue that LST content may contain sourcing information when a template doesn't? I'm trying to understand the benefit of LST over templates and while some editors may prefer LST over templates for a table (for example) no usability or reliability benefits are readily evident to me. —Carter (Tcr25) (talk) 20:29, 11 March 2023 (UTC)[reply]
    Templates with such content get vandalized and subjected to disruptive editing (changes that fail WP:V), but this often goes unnoticed because these changes may be subtle and editors don't put such templates in their watchlist. —Alalch E. 20:33, 11 March 2023 (UTC)[reply]
    Same for an article transcluding parts of a less-watched article. "Templates might get vandalized" is a rather weak argument for making it harder to learn to edit articles. —Kusma (talk) 22:07, 11 March 2023 (UTC)[reply]
    Are there examples of this happening that can be pointed to? How widespread is it? Vandalism and bad-faith edits can happen in all sorts of places. —Carter (Tcr25) (talk) 14:32, 12 March 2023 (UTC)[reply]
  • The example given above by OP does seem like a very explicit case of WP:GAMING. It's similar to someone removing a category from all its relevant articles and then nominating the category for deletion with the argument that it's empty and un-used. A blatant gaming situation that appears to be conducted by an echo chamber of editors trying to push LST into usage on their own, without actually gathering input from the community as a whole. SilverserenC 22:30, 11 March 2023 (UTC)[reply]
  • To me, LSTs do look like they will offer improvements over at least some uses of templates. But I don't feel they should be used to replace templates until there's both (1) community awareness of LSTs and (2) documentation that's intelligible to less technical people.—S Marshall T/C 23:48, 11 March 2023 (UTC)[reply]
  • LST (labeled section transclusion) is a can of worms. It requires incomprehensible wikitext in the source article (where the content comes from) and the target (where the copy is displayed). No amount of all-caps hidden comments will make the system resist the ravages of time when gnomes nibble away either at the source or target articles. Also, what is the point of duplicating article text? Phrases that work in the source article should probably be altered and abbreviated in the target. Johnuniq (talk) 00:01, 12 March 2023 (UTC)[reply]
  • A couple years ago, when LST usage was new even to more technical editors, I was considering different ways to make the LST system more robust and comprehensible. The main thing was introducing a bot that kept track of LST being broken, fixing it or notifying the editor that broke it depending on context, but that would be a lot of work. Making the wikitext more comprehensible was also a part of it: introducing templates like {{Section transclusion}} to better track LST usage, possibly highlighting LST portions in preview mode, and avoiding parser function syntax, which most users aren't familiar with. I also feel that LST as a term should be abandoned in favor of just "section transclusion", which would be a lot less intimidating, since "labeled section" isn't a term used anywhere else and people probably wouldn't use an obscure abbreviation as much. And there should of course be a rewrite of Help:LST. All of this is still relevant today, but I really don't feel excited about working on it. Currently I feel the practice of deploying more LST is slightly positive, like most template work, but isn't near its full potential. As a TfD regular I will also say that I highly doubt this is intended as gaming at all, but rather as making it more convenient to implement after having seen similar nominations go through hundreds of times. If the LST is already implemented, the template can just be deleted after the discussion is over, instead of being placed in the holding cell to await someone returning to all these templates to implement it. It is a bit of a gray zone, but I have a hard time believing this isn't done in good faith, and I feel the WP:BURO argument is quite strong here given the number of previous nominations. --Trialpears (talk) 02:48, 12 March 2023 (UTC)[reply]
    If the LST is already implemented the template can just be deleted after the discussion is over
    But is the implementation even something being discussed or are the editors involved unilaterally implementing it without discussion with anyone, let alone the community at large? Because I can certainly say that I oppose any implementation of LST as is currently being done. The actual usefulness for Wikipedia and especially for editors trying to improve articles having to deal with LST coding is severely lacking in evidence right now. SilverserenC 03:06, 12 March 2023 (UTC)[reply]
    TfD regularly discusses implementation details and is generally quite good at soliciting input from a lot of non-template editors for large discussions through the TfD notices displayed everywhere the template is used. That can result in a TfD being advertised on literally millions of pages at times. It appears, however, that LST nominations are somewhat of a perfect storm for forming a local consensus anyway, since each individual nomination is only advertised on a handful of pages, which always leads to TfD regulars being in a majority even though there is occasional opposition. Since lots of nominations have occurred, the nominators and closers feel like there is significant precedent and large-scale consensus even though that may not be true. Nothing malicious is going on; rather, the structure of the discussion has led to this type of nomination being under-advertised. After seeing the page count at Category:Pages transcluding nonexistent sections just now I'm also feeling that something has to be done to improve the situation. --Trialpears (talk) 04:07, 12 March 2023 (UTC)[reply]
    Based on Category:Pages transcluding nonexistent sections and some searches I estimate that 15-20% of LST uses are broken, which is unacceptable in my eyes. Something really should be done here. Bot plus fixing existing ones is what I would suggest. --Trialpears (talk) 04:29, 12 March 2023 (UTC)[reply]
    (edit conflict) Category:Pages transcluding nonexistent sections was created only a few weeks ago, so there's a large pre-existing backlog. There was at one point in the past a similarly huge backlog at Special:WantedTemplates, but lots of people are working to keep it under control. I'm still not convinced that this is actually a problem. * Pppery * it has begun... 04:31, 12 March 2023 (UTC)[reply]
    Trialpears and Pppery, I think Category:Pages transcluding nonexistent sections has false positives. For example, 1911–12 in Swedish football is in the category just because there are no quote marks around "table" in 1911–12 Svenska Serien. LST is clearly working, so, not sure why that page is in the category. Plastikspork ―Œ(talk) 15:58, 12 March 2023 (UTC)[reply]
    There was an old bug that caused some pages which shouldn't be in the category to be in it, which was fixed a few weeks ago, but it appears some pages are still in the category due to bad caching. I WP:NULLEDITed the page, and it's no longer in the category. * Pppery * it has begun... 16:01, 12 March 2023 (UTC)[reply]
    introducing a bot that kept track of LST being broken - we already have Category:Pages transcluding nonexistent sections * Pppery * it has begun... 03:17, 12 March 2023 (UTC)[reply]
    Oh god, that's even worse than I thought. I still think a bot should be used as well since it can notify editors who accidentally broke it or in some cases fix it automatically. --Trialpears (talk) 03:51, 12 March 2023 (UTC)[reply]
  • I've created User:Trialpears/Section Transclusion now, which is a vision of roughly how WP:LST could read in the future if the improvements suggested above are implemented. If someone who thought the current help page was overly complicated wants to read this and see if it makes the feature seem more manageable, that would be appreciated! Everything but the bot should be manageable for me to get done in a week if this is something we want. --Trialpears (talk) 06:51, 12 March 2023 (UTC)[reply]
    I think it's more intuitive than WP:LST. However, I would also add an example similar to the one of @Redrose64 above. Irecorsan (talk) 14:22, 12 March 2023 (UTC)[reply]
    Replacing templated tables with LST seems to have no upsides that I can see, only considerable downsides (in some TfDs a year or so ago, I have seen many non-arguments such as "templates should have more than X transclusions"). If an editorial decision is made to replace the table in the "source" article with something else, all uses in other articles will break. There should be no pressure to revert such an edit: editors should use their best judgement to make the article they are working on as good as possible. If that breaks other articles, fix it at the target (it is the target article editors' responsibility to watch everything they transclude and to make sure it is what they want). Editing is hard enough as it is without the kind of interdependency introduced by LST. Unlike section anchors (another interdependence), it does not fail gracefully. —Kusma (talk) 07:40, 12 March 2023 (UTC)[reply]
    This section has the wrong title. It sounds as if someone is trying to abolish templates and replace them with LST across Wikipedia. No, the starter of this discussion talks specifically about certain sports templates which bear article content (so they are non-standard templates) and are usually used in two articles (a comprehensive list and an article that benefits from the inclusion of a portion of that list), and which were, and some still are, nominated at TfD, and have generally been deleted there. In this relevant context, it's inconceivable that there could be a reasonable editorial decision to replace, say, a country's men's team roster in Basketball at the 2016 Summer Olympics – Men's team rosters with something else. Just inconceivable. If such a change were made, editors who have this page on their watchlist would revert it. The same cannot be said for individual templates, which have undergone undesirable and undetected changes historically. LST is sometimes a bad idea for the reasons you state, but not always. Editorial judgement is required, and here it was properly exercised. —Alalch E. 14:55, 12 March 2023 (UTC)[reply]
    Here is an upside you missed - templates always have fewer watchers than articles. Moving the tables into the parent article means that there are many more eyes watching for changes. Gonnym (talk) 15:19, 12 March 2023 (UTC)[reply]
    I am not convinced that this makes a substantial difference in practice for our less popular articles. Five barely active watchers of an article isn't better than one person actively looking out for the template. —Kusma (talk) 18:07, 12 March 2023 (UTC)[reply]
  • I do not have a clear enough understanding of LST and its downsides at the moment. IMO, sports "data" (win-loss records, scores, etc.), especially after an event is complete, should preferably be in articles, while any "styling" should be templated out. But that aside, I do think @Frietjes: should have first gotten a consensus, preferably a full RFC, before starting this work. There are 1000+ articles and templates affected, probably more, and changing/deleting that many pages ought to involve some larger discussions. So far I can find no such RFCs, and so would recommend Frietjes stop this conversion until consensus is obtained. Soni (talk) 18:39, 12 March 2023 (UTC)[reply]
    Support, per WP:CONLEVEL. Mathglot (talk) 09:12, 15 March 2023 (UTC)[reply]
  • The part that concerns me is presenting templates to TfD as empty. That circumvents the widespread notice the system is meant to provide. There should either be community-wide consensus to implement LST, or the template should be nominated at TfD before any attempt to use LST, with a rationale that includes using LST as the replacement. There is often more than one way to do something, and it is rare that the community has mandated a specific way (Engvar, citation styles, etc.) Slywriter (talk) 21:19, 13 March 2023 (UTC)[reply]
    The actual text used on TfD pages is unused after being merged with the parent article with attribution and transcluding articles update to use WP:LST. That makes it clear what happened and that it wasn't unused before. That the LST conversion happened before the page is nominated for deletion isn't necessarily a notification problem, since it causes an edit that shows up in watchlists rather than only being visible if you visit the article. The problem I see is that edits like this have a summary like update template syntax rather than something like convert template to LST; template nominated for deletion at TfD. It also appears there will now be a bit of a pause for these nominations, and Category:Pages transcluding nonexistent sections is being actively dealt with by @Frietjes and Plastikspork: who have been involved with implementing LST. --Trialpears (talk) 21:36, 13 March 2023 (UTC)[reply]
  • I have used both LST as well as Templates, and have authored both. But to my mind, no one has elucidated a common thread that I hear running through many of the comments above, which I believe is key, and that is this: editors are becoming enslaved to the technology, instead of the technology serving our editors (and through them, our readers). That is completely backwards. In my mind, it is the height of absurdity to argue that we must use LST instead of Templates, because LST text exists in namespace zero (articles) where sourcing is required, and Templates exist in namespace ten, where sourcing is not required, or is excluded. If we made these rules, we can change them. First, we must see what user needs are, and then we must mold the technology to serve them/us, not vice versa. Is there consensus to make Template content sourceable and subject to WP:V under some conditions, perhaps via a naming convention? "Make it so." Is there agreement to create namespace 16, "Content", intended for template-style transclusion into one or more articles, and falling under the requirements of WP:Verifiability and other P&G, in order to provide a (conceptually) simple resolution to this type of problem? Make it so. Finally, if you see someone changing transclusion of a template in an article you have a stake in to LST and it's getting in your way, just revert it and change it right back again to using the template. There's nothing to stop you from doing so, and perhaps you'll end up living in the last house standing that is blocking the bulldozers from flattening everything in order to build that supermall parking lot, but I think you'll find a lot of people tend to support that when they see it, and may even start pushing back and helping you. 
I live on both sides of this, as I do truly enjoy and appreciate the technology and use it to the max, delighting in every new useful-but-maybe-obscure techie gadget that I can find, and I also love writing content and finding sourcing for it, and doing both at once is double the pleasure; however, and this is a *big* however: it is very clear to me where the priority lies, and that is that the technology must serve our needs, full stop, no exceptions. Anything else is completely wrong-headed. If the technology isn't doing what it should, or is getting in your way, discuss it here, or raise a ticket on Phabricator. Mathglot (talk) 09:01, 15 March 2023 (UTC)[reply]
    No one's made an argument that content in templates doesn't have to be sourced. If it appears in mainspace, whether through transclusion or not, then it must be verifiable. Many editors have commented on letting those creating the content use the wikitext features that suit their workflow best (including both creation and later maintenance), rather than have someone else decide on their behalf that they must use a specific feature and watch a certain page. In the long run, as generations of editors turn over, we will need a way to ensure pages remain monitored. Given the existing amount of reuse of content through templates, though, it's not clear to me that eliminating the ones being proposed for deletion will have a significant effect on this problem. isaacl (talk) 18:01, 15 March 2023 (UTC)[reply]
    I suspect that @Mathglot is responding to comments like the one above that says Templates should not consist of article content that needs verification, even if the content is in tabular form.
    A quick check indicates that there are already 70K pages in the Template: namespace with ref tags, so that's not a problem; it's already possible to source anything that you want to transclude from the Template: namespace. WhatamIdoing (talk) 21:49, 17 March 2023 (UTC)[reply]
    Yes, that is what I meant, thanks; and my apologies if my meaning was less than clear. Mathglot (talk) 10:25, 18 March 2023 (UTC)[reply]

Automatically restrict the editing on the pages of political parties

I recently had the chance to see some controversies develop on the talk pages of a few political parties in my country, where some users complain that the articles are heavily patrolled by sympathizers, if not outright party members, in effect turning what are supposed to be factual encyclopedic articles into propaganda pages, since nothing that might be even remotely inconvenient usually stays for very long, no matter how factual or how well sourced it is. And suggestions like "just discuss with them bro!" strike me as completely inadequate.

I don't know what debate, if any, there is around this issue, but increasing the minimum requirements to edit articles about political parties might limit the flow of users whose sole reason to edit is to peddle their ideologies and nothing else, with no regard for facts, even if it doesn't solve the problem completely. Wareno (talk) 19:32, 14 March 2023 (UTC)[reply]

Well, we've got our Five pillars; it seems to me that especially pillars 2, 3, and 4 already cover this, supported by WP:V, WP:RS and WP:NPOV. You haven't said how you suggest we "restrict the editing", so I won't throw WP:NO-PREEMPT at you, too (I've already acronymized you to a bloody pulp anyway—sorry!). — JohnFromPinckney (talk / edits) 20:53, 14 March 2023 (UTC)[reply]
@Wareno there exists a list of contentious topics already. Elections in the USA would already be covered by WP:ARBAP2. I haven't fully fleshed out my opinion on this, that said. ~ 🦝 Shushugah (he/him • talk) 21:08, 14 March 2023 (UTC)[reply]
Policy already addresses this sort of thing extensively. WP:Dispute Resolution goes into detail about your options. If you've made a serious attempt to discuss an issue on the article's talk page and no consensus develops, then you can make a post to the neutral point of view noticeboard asking for additional input or you can seek resolution at the dispute resolution noticeboard. If there's a specific editor that's attempting to control everything in an article or consistently challenges anything they disagree with politically, then you can make a post at WP:ANI. But this is a last resort, so make sure you can prove that they're being disruptive before you do this. Thebiguglyalien (talk) 04:03, 15 March 2023 (UTC)[reply]

Russian minority names in English Wikipedia

When it comes to articles (about individual people) in the English Wikipedia, should one make the article name from the Russian transliteration or the given minority-language version? Especially with Tatar names, since there also exists a Latin-alphabet version of their language, albeit in limited use. I can think of arguments for both: the version in the minority language is sort of more "authentic" since that is the native language of the given people, but at the same time, the Russian version is the official one that also appears in passports. It also usually tends to be more approachable to English speakers. Xәkim (talk) 15:02, 16 March 2023 (UTC)[reply]

There is no “one-size-fits-all” rule for this. Per WP:COMMONNAME the article title should use whatever name is most commonly used in English language sources that discuss the subject (as that will be the most recognizable to our readers).
In the article text, we have a bit more flexibility. We would normally default to the COMMONNAME as well… however, it is appropriate to (briefly) mention any other variants. This is often done as a parenthetical in the opening sentences or paragraph. Blueboar (talk) 15:33, 16 March 2023 (UTC)[reply]
To expand on the above, while it's a case-by-case assessment, variants should typically be included if there is a reasonable chance that an English-speaking reader could encounter the variants somewhere off-Wikipedia, but we should not be inventing plausible variants if we do not previously have evidence of them being used. signed, Rosguill talk 15:43, 16 March 2023 (UTC)[reply]

MOS:PREFIXDASH

Just a heads up: Wikipedia_talk:Manual_of_Style#MOS:PREFIXDASH, opinions are welcome. Brandmeistertalk 13:31, 17 March 2023 (UTC)[reply]

RFC on using maps and charts in Wikipedia articles

Maps are used as references in 32,000+ articles. From time to time, questions about their use are raised in venues such as WP:GAC, WP:AFD and WT:OR. Policy and guidelines about sourcing and verifiability do not directly address nontextual sources. This RFC was started to answer some of those questions. I feel the Wikipedia community would benefit from some codified guidelines about their use, to avoid having to continually revisit these topics. Dave (talk) 05:29, 19 March 2023 (UTC)[reply]

Proposal 1: original research

Should the following text adapted from WP:CALC be moved into the policy Wikipedia:No original research#What is not original research?

Source information does not need to be in text form—any form of information, such as maps, charts, graphs, and tables may be used to provide source information. Routine interpretation of such media is not original research provided that there is consensus among editors that the techniques used are correctly applied and a meaningful reflection of the sources.

Proposal 1: comments

  1. Support as nom Dave (talk) 05:29, 19 March 2023 (UTC)[reply]

Proposal 2: What can be cited to a map?

Conditional on Proposal 1 passing, should the following sections be added to Wikipedia:No original research#What is not original research?

Proposal 2a: reliable sourcing

Maps cited in articles should follow the Wikipedia:Reliable sources guidelines. When dynamic map applications (such as Google Maps or an ArcGIS website) are referenced, and when supported by the application, the URL used in the citation should link directly to an overview of the relevant object(s) rather than to the main page of the application.

Maps should only be used to cite statements that can be verified using the map. For example, contour lines on a topographic map can be used to reference statements about topography, but a statement such as "Washington D.C. is the most populous city on the Potomac" cannot be fully verified using a map that does not contain any information about population.
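To illustrate the kind of citation this proposal would cover, a statement about topography might be cited with the {{cite map}} template. The following is a hypothetical sketch, not a real citation: the URL is a placeholder and the map details are invented for illustration.

```wikitext
<ref>{{cite map |title=Washington West quadrangle
  |publisher=United States Geological Survey
  |scale=1:24,000
  |url=https://example.org/washington-west-quad
  |access-date=19 March 2023}}</ref>
```

As the proposal describes, the URL links directly to the relevant map view rather than to the application's front page, so the citation stays verifiable.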

Proposal 2a: comments

Proposal 2b: image layers

Satellite layers (e.g. on Google Maps) can be used to reference statements about elements such as ground cover that can easily be verified.

Proposal 2b: comments

Proposal 3: history

Conditional on Proposal 1 passing, should the following be added to Wikipedia:No original research#What is not original research?

It is allowable to cite historical maps to refer to how artificial geographical features appeared on them at a specific point in time.

Proposal 3: comments

General comments

  • Having seen a "Good Article" that was exclusively referenced by about 30 maps, with no non-map references, it would be good to have a discussion about what maps mean for notability, with a dip into the extent to which maps make quality references. Gusfriend (talk) 06:11, 19 March 2023 (UTC)[reply]
    There is also the concern that sources such as Google Maps do not have a version and may be updated at any time, which may limit what it is appropriate to reference. Gusfriend (talk) 06:19, 19 March 2023 (UTC)[reply]