
Wikipedia talk:Large language model policy

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by JPxG (talk | contribs) at 12:54, 24 May 2024 (OneClickArchived "RFC" to Wikipedia talk:Large language model policy/Archive 1). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

This page is within the scope of WikiProject AI Cleanup, a collaborative effort to clean up artificial intelligence-generated content on Wikipedia. If you would like to participate, please visit the project page, where you can join the discussion and see a list of open tasks.

 You are invited to join the discussion at Wikipedia:Templates for discussion/Log/2023 December 13 § Template:AI-generated notification. –Novem Linguae (talk) 08:26, 14 December 2023 (UTC)[reply]

 You are invited to join the discussion at Wikipedia:Templates for discussion/Log/2023 December 13 § Template:OpenAI. –Novem Linguae (talk) 08:26, 14 December 2023 (UTC)[reply]

Notes

Future directions

I think it may be appropriate to note here my intentions for after the RfC, assuming it is successful.

When writing the proposal, I did my best to prevent it from being a "pro-LLM" or "anti-LLM" policy as written. My hope is that, rather than a meandering general referendum on the whole field of artificial intelligence, we could establish some simple and non-intrusive rule to cut down on the bottom 10% of slop without presenting too much of an obstacle to people who are interested in using the tools productively. And we are getting a rather consistent flow of slop (see WP:WikiProject AI Cleanup), from people who are either using these models improperly, using them for tasks to which they're not suited, or being insufficiently careful in verifying their output. This puts a rather large (and unnecessary) strain on new page patrollers, AfC reviewers, and editors in general.

For what it's worth, I am myself a great fan of transformer models, and have followed them with great interest for several years (I created the articles for GPT-2 and DALL-E; my first interaction with them was with GPT-2-124M in summer 2019, and I had access to the GPT-3 API in 2020). Last August I used the GPT-3 API to assist in writing several Signpost columns; I guess you will have to take my word for it that I didn't write this as a stalking-horse for a project-wide LLM ban.

Some people think that these things are just plain crap, and there is a lot of very lively debate on what utility they really have, and whether it is worth the effort, et cetera. Well, I think it is, but the consensus of the editing community isn't mine to decide, and if everyone thinks that they are junk, then I guess we will have to live with that.

I will note that the number of people who want to ban LLMs entirely increases every time a gigantic bucket of GPT slop is poured into the NPP queue, so if there's some very low-effort solution we can implement to slow down the flow, I think it is worth it even if you are an LLM maximalist who resents any sort of restriction.

Anyway, it is hard to predict the trajectory of a technology like this. They may get better, they may level off, or they may improve a lot at some things and very little at other things in a disjunct way that makes no sense. So maybe we are right on the precipice of a tsunami of crap, or maybe it already passed over, or maybe we're on the precipice of a tsunami of happiness. What I do think is important is that we have policies that address existing issues without prematurely committing to things in the future being good or bad. If it turns out that this cuts down on 90% of the slop and we never have an ANI thread about GPT again, then maybe there does not need to be any further discourse on the issue. If it turns out that this short sentence isn't enough, then maybe we can write more of them. jp×g🗯️ 09:37, 15 December 2023 (UTC)[reply]

Then:
  • Old problem: We had a bunch of badly written articles posted.
  • Old action: We wrote a bunch of rules against undisclosed paid editing.
  • Old result: A few folks changed their behavior, and the rest kept doing the same thing anyway, because we had no good way to identify them.
Now:
  • New problem: We have a bunch of badly written articles being posted.
  • New action: We write some rules against a set of tools that might be used to make them.
  • New result: A few folks changed their behavior, and the rest kept doing the same thing anyway, because we had no good way to identify them?
WhatamIdoing (talk) 04:04, 17 December 2023 (UTC)[reply]
Even if there is no good way to identify them, that does not mean it is a bad idea to institute it as policy. Is there an easy way to, for example, identify bot-like or semi-automated editing? Unless there are tags to identify the script that made the edit, a semi-automated edit could have any edit summary, or no summary at all, and no one would really know that it was semi-automated. The whole point is not banning LLMs from mainspace poses a significant risk of disruption, and encouraging it would just be encouraging more disruption. And DE is one thing that, regardless of the means or intent, results in a block if it is prolonged. Awesome Aasim 22:13, 17 December 2023 (UTC)[reply]
The thing is that everything about LLM use that disrupts Wikipedia is already prohibited by existing policies. Nobody in any discussion so far has provided any evidence of anything produced by an LLM that is both permitted by current policy and harmful to Wikipedia. Thryduulf (talk) 10:27, 18 December 2023 (UTC)[reply]
Because the issue the policy is trying to address is more about larger editing patterns than individual diffs. It's not illogical if the scope of policies overlaps—in fact, it's arguably a feature, since it reinforces the points that the community finds most important. Remsense 14:11, 31 December 2023 (UTC)[reply]
While there is inevitably some overlap in policies, I disagree that it's a feature per se. Generally speaking, it is easier for editors to keep track of fewer policies than more; thus, having a small number of central policies, with supporting guidance that expands on details, provides an organizing structure that makes guidance simpler to remember and follow. Avoiding redundancy supports this principle and helps prevent guidance from getting out of sync, and thus becoming contradictory. It can also forestall complaints about there being too much guidance, as the basic shape of the guidance can be understood from the central policies, and the details can be learned gradually, without having to jump between overlapping pages. isaacl (talk) 17:04, 31 December 2023 (UTC)[reply]
I don't think that The whole point is not banning LLMs from mainspace poses a significant risk. I think there's some good old human emotions at play here, but the problem is that we already know the ban will be ineffective. Most people won't know the rule, you won't be able to catch them (and we will wrongly accuse innocent people), and most of the few people who are using LLMs and actually know the rule won't follow it, either, because a good proportion of them don't know that you decided that their grammar checker is an LLM, and the rest don't think it's really any of your business.
This is King Canute and the tide all over again: We declare that people who are secretly using LLMs must stop doing it secretly, so that we know what they're doing (and can revert them more often). You're standing on the beach and saying "You, there! Tide! Stop coming in, by orders of the king!" We can't achieve any of the goals merely by issuing orders.
And your plan for "And what if they don't follow your edict?" is what exactly? To harrumph about how they are violating the policies? To not even know that they didn't follow your orders? WhatamIdoing (talk) 07:06, 11 January 2024 (UTC)[reply]
A good summary of our WP:COI guidelines, but it doesn't seem a reason to scrap them. CMD (talk) 07:28, 11 January 2024 (UTC)[reply]
I am also concerned that it will add an unnecessary burden on those of us who will follow the policy, for no apparent reason. MarioGom (talk) 12:04, 11 January 2024 (UTC)[reply]

Request for close

I'm going to make a request for closure, because the bot just removed the RFC template since it's been a month (I obviously am not going to close it myself). jp×g🗯️ 10:18, 13 January 2024 (UTC)[reply]