User talk:SandyGeorgia

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Gimmetrow (talk | contribs) at 16:48, 7 February 2023 (r). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

You may want to increment {{Archive basics}} to |counter= 119 as User talk:SandyGeorgia/arch118 is larger than the recommended 150Kb.

I usually respond on my talk page, so watch the page for my reply.
Please provide a link to the article or page you want me to look at; that will increase the likelihood of me getting to it sooner rather than later.
I lose track of those pingie-thingies; because I don't get along with them, I have converted all notifications to email only. A post here on my talk page is the best way to get my attention. Besides that, we used to actually talk to each other in here, and get to know each other. REJECT the pingie-thingie!
iPad typing: I am unable to sit at a real computer with a keyboard for extended periods of time because of a back injury. When I am typing from my iPad, my posts are brief and full of typos. Please be patient; I will come back later to correct the typos :) I'm all thumbs, and sometimes the blooming iPad just won't let me backspace to correct a typo.


TPS alert/rant: CCI work

Because of the User:Doug Coldwell situation that I happened across by pure chance, I am back trying to help out at CCI. After giving up in dejection the last time I tried that. And the time before. And so on.

This is demoralizing work; not a hobby, not fun, not relaxing, should not be done by anyone for free. Most of my TPS are used to working with fine content. When working CCI, you deal with pure crap; while trying to sort out if something is a copyvio, you have to do that with content that is poorly written and often not even based on reliable sources, and it's nothing but miserable unpaid grunt work and drudgery. And when working on one CCI, I discovered a whole 'nother serial copyviolator! Just makes one want to quit.

I have checked WP:PEREN and don't know where else to look, but I can't understand why the WMF doesn't hire people to clean copyvio. Why should any volunteer be doing such crap work for free? How does "hire people to clean up copyvio" not make it on to those wish lists thingies the WMF puts out? WhatamIdoing? Hats off to any CCI worker and copyvio admin who deals with this demoralizing content day in and day out.

And I just reread through all of the 2010 Grace Sherwood debacle, where FAC did do something about it, but why has nothing changed in two decades with DYK feeding the copyvio pile, and GA promoting more up the line? Who's checking besides Nikkimaria? SandyGeorgia (Talk) 06:20, 25 January 2023 (UTC)[reply]

I believe that MER-C and Diannaa do a lot of copyvio-related work. If memory serves, Wizardman used to, but I don't know if he's still active in that area.
I'd like to see law schools, especially those that pride themselves on intellectual property, start summer internships (or similar programs) to evaluate copyright questions. Just reading the Commons discussions can be an education. WhatamIdoing (talk) 21:28, 25 January 2023 (UTC)[reply]
We Need More Help. SandyGeorgia (Talk) 21:40, 25 January 2023 (UTC)[reply]
I've got a couple less-busy days due to a winter storm, so I'll try to take a look at some of the Appomattox ones. Hog Farm Talk 22:02, 25 January 2023 (UTC)[reply]
Hog Farm, I think you went through Battle of Ridgefield once? While looking at Ludington/Coldwell stuff, I happened across another unrelated big mess there, and waiting for the experts to tell me, what next. I hate how often I have to ping them when I don't know what to do next. It is such specialized work ... and they must be so sick of pings. SandyGeorgia (Talk) 22:06, 25 January 2023 (UTC)[reply]
When I looked through Ridgefield, I was mainly looking for patently unreliable sourcing. Hog Farm Talk 00:04, 26 January 2023 (UTC)[reply]
Oh, yes, I know Hog Farm. I hope that didn't come across as me saying you missed something. I only happened upon it because of trying to sort the Coldwell stuff. Best, SandyGeorgia (Talk) 02:27, 26 January 2023 (UTC)[reply]

Copyright problems: break

100% right, sadly. That's why I tend to go in spurts on CCI, sometimes I can close out 2 or 3 relatively quickly and other times I don't even want to look at it. Honestly the only thing that helps me get through it sometimes is spite; these serial violators wasted enough of the site's time so torching said content helps a little bit. Perhaps once I'm in another spurt I can get Hathorn resolved once and for all (yes, that's still being addressed a decade later...) Wizardman 23:22, 25 January 2023 (UTC)[reply]
Hathorn came to my mind recently, when I was contemplating how many times we've been down this road ... but I held back, thinking it wiser not to start naming them all and all of the various debacles. But. What have we changed in content review processes to get it to stop, and what are we doing that encourages it ? SandyGeorgia (Talk) 23:27, 25 January 2023 (UTC)[reply]
That anyone can edit doesn't encourage it, but it does to a degree enable it. A lot of people just flat out don't understand copyright and/or how to determine if something is public domain or not. And for those that do, there's a strong possibility they don't live in the US, so are only familiar with sometimes radically different copyright laws in their country/region.
I know I struggle particularly with images, partially because of the differing legal systems between the UK and US, and partially because the US system seems so counter-intuitive with respect to who owns the copyright of derivative works. I'm better with text based stuff, as there are similarities between the two jurisdictions, so I try to keep an eye out for copyvios on my watchlist, as getting them early (I hope) prevents long term problems arising.
Alas short of running every edit through a service like Turnitin, which is not without its own host of problems, I don't know of a way that we could solve it without fundamentally changing how the site operates as a whole. If there's a lot of copyvios coming from article creation, then having some sort of copyright detection training for new page patrollers might help, but it would still require editors to engage with a rather thankless task. However for editors who maybe just add a paragraph here to one article, and a paragraph there to another, that goes undetected and adds up over time, I dunno if there is a way we could handle that beyond what we already do, short of the Foundation hiring dedicated CCI people. Sideswipe9th (talk) 23:50, 25 January 2023 (UTC)[reply]
NPP/AfC editors actually look for copyvio although they are always in danger of missing more subtle cases involving foreign language, offline sources, or a paywall. The highest risk of copyvio is in content added to existing pages, because it is often not checked at all. (t · c) buidhe 00:35, 26 January 2023 (UTC)[reply]
Buidhe, a fairly well known NPPer had an article deleted this week for copyvio. And I was told today by an experienced editor that "the tools" at DYK and GAN pick up copyvio. No understanding that Earwig is useless when all sources are offline, and even when they are online, not very good at picking up too-close-paraphrasing. Look at the date on Wikipedia:Wikipedia Signpost/2009-04-13/Dispatches. And then re-read the Grace Sherwood debacle. What reform has there been, outside of FAC?
Sideswipe9th, I was hoping someone would pop up here to explain to me why the WMF is not paying for this to be done. SandyGeorgia (Talk) 02:32, 26 January 2023 (UTC)[reply]
Alas the Foundation's actions are as much a mystery to me as to most other editors I'm afraid. It might be worth starting a discussion at one of the Village Pumps though? Like if enough people recognise this is a problem, then we can at least as a community ask them to pull their purse out for some actual support on this. Sideswipe9th (talk) 02:48, 26 January 2023 (UTC)[reply]
This is all very surprising to me. I first checked WP:PEREN, expecting to find it there. How is it possible it hasn't already been raised and repeatedly? SandyGeorgia (Talk) 02:49, 26 January 2023 (UTC)[reply]
How many editors actively think about copyvios? I suspect the number is quite small. Sideswipe9th (talk) 02:51, 26 January 2023 (UTC)[reply]
Sideswipe9th maybe we should list those who should be thinking about it!
  1. New page patrollers (one who had been through NPP school had an article copyvio-deleted this week).
  2. All DYK reviewers and admins who promote DYK queues. (The speed to get an article to a certain size is a driving factor for some whose motivation is the reward culture.)
  3. All FAC, FAR and GAN reviewers. Never mind whether you must spotcheck sources; if you're supporting an article, you should.
  4. Anyone doing WikiProject assessments. I just saw a B-class assessment assigned to a brand new article with a four-sentence lead, two of which contained copyright issues.
What else ? Awareness needs to be raised about the miserable extent of this problem. Someone should rewrite and update the old Plagiarism update and ask the Signpost to run it. SandyGeorgia (Talk) 00:19, 27 January 2023 (UTC)[reply]
Commenting on yours in order:
  1. NPP and AfC reviewers definitely. I know checking for copyvios is on the NPP flowchart, but I do have to query how many editors actually do it, especially when we have a backlog drive on and reviewers are reviewing articles pretty quickly.
  2. I don't know enough about how DYK works to comment, but definitely seems sensible.
  3. Yeah absolutely. Any editor doing a FA or GA review should be checking for copyvios as part of that process. I wonder if this could be more formalised into the structure of the review process, like some sort of requirement for the reviewer to say "I checked/I've not checked for copyvios", and for the FA/GA confirmation to be held until someone has done it.
  4. This is a tough one, occasionally I'll use WP:RATER when sticking WikiProject banners onto talk pages, and it uses some sort of prediction when adding the banners. I wonder how many editors are just using that versus actually assessing it? For the latter, actual assessments, yeah that should have a copyvio check done as part of it.
And possible additions:
  1. Any editor actively cleaning up the recent contributions of a blocked or banned editor should do a copyvio check on those contributions as part of determining whether or not they should be reverted. Like if the content is obviously disruptive, just revert it, but if it looks plausibly good, run a copyvio check on it.
  2. Editors doing recent change patrolling should probably be checking for copyvios when reviewing the diffs, at least for the new contributions.
  3. I'm tempted to say that WikiProjects should have dedicated members/teams for this as well, on a per project basis, who are active beyond the assessment level. When dealing with specialist content, it helps to be familiar with the topic when determining if something is likely a copyvio. This would also fit in nicely with your #4, as there could/should be some overlap there.
The biggest blind spot though, at least with this sort of active encouragement, is the low traffic/low watchlisted article. The sort of article that someone creates, and then no-one really pays attention to until there's some sort of problem, either vandalism or with the original author of the article. While AfC/NPP should catch some of that, if the author has the autopatrolled flag and is inserting copyvios, who is checking their edits before they get hauled up to ANI and CCI? Sideswipe9th (talk) 00:47, 27 January 2023 (UTC)[reply]
I actually think the biggest blind spot is DYK, because that is the specific area where almost every long-term serial copyvio abuser was born. That would be the place to initiate reform, and catch stuff early on. Most of the historical serial offenders are not the first-time editors or the non-English speaking, rather those seeking icons and rewards-- working too fast, not getting seriously reviewed, racking up rewards. Re #3 (FA or GA reviewers), I started pushing on this problem at WT:FAC several years ago, and got so far as to get 1f added to WP:WIAFA, but my proposals for more active source work were rejected. In theory, anyone entering a Support at FAC should be stating whether the article meets 1f. The weakness at NPP seems to be in the area of detecting too-close-paraphrasing, which is why I think the old Signpost dispatch pushed by the then-FAC regulars via the WP:FCDW should be updated and published broadly. Wikipedia:Wikipedia Signpost/2009-04-13/Dispatches.
5. Oops, one I forgot; add copyvio spotchecks to anything claiming WP:WIKICUP points. SandyGeorgia (Talk) 00:56, 27 January 2023 (UTC)[reply]
The only other thing I can think of right now (tis late, my brain is derping and words are hard) would be to maybe have a conversation at Wikipedia talk:Contributor copyright investigations asking all of the regulars who investigate and clean up cases where they think the majority of problematic editors are coming from, and what process changes in those areas could catch this sort of thing early before multi-year long cleanup cases are needed. Sideswipe9th (talk) 01:14, 27 January 2023 (UTC)[reply]
Maybe after more general brainstorming here ? SandyGeorgia (Talk) 02:02, 27 January 2023 (UTC)[reply]
I don't think that WikiProject assessments should be especially concerned about copyvios. Every editor should, but this group only at the usual level.
CSD, especially for copyvios, is the primary purpose of NPP. I am concerned that every time we add some extra "little" thing to NPP's workload, their primary purpose gets more and more obscured. The NPP folks are talking with the WMF's Growth team about fixing up Special:NewPagesFeed. I'm not involved, but it's not unusual for this sort of thing to be a round of "give me more bells and whistles" instead of "strip this workflow to the most efficient, effective minimum". I gently suggest that we need less NPP attention on things like adding maintenance tags and tagging for WikiProjects, or even trying to determine notability, so that we can have them focused on speedy deletion. WhatamIdoing (talk) 22:04, 28 January 2023 (UTC)[reply]
My sense is that Barkeep is saying more or less the same ... WhatamIdoing have you looked at MER-C's suggestions below? SandyGeorgia (Talk) 22:09, 28 January 2023 (UTC)[reply]
EranBot and CopyPatrol do something similar to the "running every edit through Turnitin" thing, although they don't catch copyvios that are short, minimally paraphrased, translated, and so on. It's fundamentally easier to copy and paste random stuff from the internet than it is to detect and remove it, which means that the CCI backlog is unlikely to ever be resolved. I have to wonder about some of our current approach to copyright and how it would be viewed outside of Wikipedia. I've seen a few complaints on VRT from writers who alleged that Wikipedia had plagiarized their books. In these cases the content was appropriately paraphrased and no informed editor would conceivably argue that it constituted a copyright violation. But the authors were concerned that the Wikipedia page summarized every important point of their book, meaning that no one would have any need to purchase it anymore. That sort of thing strikes me as posing more risk to an author's livelihood than, say, someone copying and pasting a plot summary from IMDB. I'm not proposing that we rework our copyright policies because of this - I just think it's an interesting perspective. Spicy (talk) 00:56, 26 January 2023 (UTC)[reply]
Interesting ... a whole 'nother problem. And then there's the guy out there on the lecture circuit profiting unscrupulously by using a page written 90+% by me, and nothing the WMF can offer in the way of tools to help me deal with it. So it works both ways ... SandyGeorgia (Talk) 02:34, 26 January 2023 (UTC)[reply]
@Spicy 45.6.2.15 (talk) 14:15, 1 February 2023 (UTC)[reply]

Copyright problems: brainstorming

Ok, have a clearer head now. What have we changed in content review processes to get it to stop, and what are we doing that encourages it? I think this is close to the right question we should be asking. While investing in and attracting more people would solve the current workload problems with CCI, it doesn't tackle the root cause.
For me right now, the question is What aren't we doing to catch this problem early? So there's two examples that spring to mind here that I'm surface level familiar with; Doug Coldwell, and Martinevans. Both are users with very high edit counts (70,556 and 206,311 respectively). Checking and cleaning up each of these editors will take a substantial time and editorial energy investment. While that needs to be done, the pertinent question from a prevention perspective is why didn't we catch this sooner?
So yeah, what is causing us to be unable to detect this sort of problem until we have editors with tens or hundreds of thousands of edits? What can we do to catch this earlier, so that a CCI case only has to check say hundreds of edits, instead of thousands? Sideswipe9th (talk) 16:40, 27 January 2023 (UTC)[reply]
Great start. Some partial answers/ideas.
The GA process at least has a major drive underway right now for GA reform: Wikipedia:Good Article proposal drive 2023. I don't know if they're doing enough, but at least they're trying, and the new data that Mike Christie is working on should help uncover what reviewers may be pushing faulty GAs up the line without rigorous review.
To my knowledge, nothing has changed at DYK, it has been promoting copyvio for as long as I've been editing, and that isn't going to change unless the community takes a strong stand.
It might be worthwhile to ask Barkeep49 what might be helpful to get NPP or AFC more on board with too-close paraphrasing.
FAC has never been a source of extreme instances of copyvio as have GAN and DYK. I'd like to see stronger sourcing checks there, as in my proposals of a year or so ago, but it's just not a place where this problem needs more focus. The Rlevse/PumpkinSky situation was an oddity that was obscured because of a competent copyedit by another editor.
WikiCup has been at times a problem, but that can be solved by fixing whatever ails GAN and DYK (although it still would be nice if they contemplated adding copyright spot checks).
In summary, change needs to happen at DYK. Looking beyond that at individual cases (which I've been doing lately):
  1. When trying to address the Coldwell CCI, one gets unpleasant pushback from DC associates. I'll be bringing forward some proposals when I get a freer moment. Of interest there is that an experienced editor told me that content review processes vetted for copyright, so someone somewhere needs to write up a good description of all the things that Earwig etc cannot detect. I continue to believe we should update and expand the Plagiarism dispatch written by our best IP people in 2009.
  2. We should be catching new editor mistakes sooner. I'm up to my eyeballs right now on a situation like that (stop them early) and getting No Help From Anyone, and I'm sure that editor is beginning to feel hounded by me. Do we need a mechanism for getting more eyes on new editors sooner and helping them out? I am to the point of contemplating an ANI post just so I can back out and let someone else take that one on, as it's exhausting.
SandyGeorgia (Talk) 17:23, 27 January 2023 (UTC)[reply]
I think close paraphrasing is always going to be some level of difficulty to uncover. I think some forms of close paraphrasing are reasonable for a NPP/AfC reviewer to uncover. However, truthfully I think a lot of the kinds of issues we saw with Doug Coldwell require a more thorough version of a review than is reasonable to expect from an NPP/AfC reviewer. I do think it reasonable to expect a GA reviewer to be able to uncover such issues and for the DYK process (whether at the reviewer or at the prep builder level) to uncover. Best, Barkeep49 (talk) 17:24, 27 January 2023 (UTC)[reply]
But Barkeep, the main wall that GA reviewers hit with editors like Coldwell is the one of WP:AGF on offline sources. Perhaps the review should require them to ask to be sent some offline sources, but I don't think stronger sourcing checks is passing their proposal drive. SandyGeorgia (Talk) 17:26, 27 January 2023 (UTC)[reply]
If the Coldwell paraphrasing would require a more thorough review than is reasonable from NPP/AfC then how do we detect it early? Not every article is going to be nominated for DYK, GA, or FA, and so that leaves a huge area for that sort of content to be left unnoticed until we have a ten/hundred thousand edit count CCI, which is a different kind of unreasonable. Sideswipe9th (talk) 17:39, 27 January 2023 (UTC)[reply]
There are trade-offs to be made between the time a patroller spends on an article and the number of articles they are able to patrol. Copyright investigating and cleanup is a specialized skill for a reason. I think the NPP tutorial discusses the expectations in a reasonable manner. Best, Barkeep49 (talk) 17:54, 27 January 2023 (UTC)[reply]
Would revisiting WP:AGFC be useful? How long must we AGF once copyright issues have surfaced ? Why do we have to have an open CCI before WP:PDEL can kick in ? SandyGeorgia (Talk) 17:58, 27 January 2023 (UTC)[reply]
What is the page name of that place where one can request access to sources (Nikkimaria)? Who are the regulars there; that is, are there editors who can be enlisted to help spotcheck sources in the other processes we're discussing above (DYK, GAN, AFC, NPP etc)? The reason I ask is that I just saw Ucucha popping back in to address an article at URFA/2020, and if Ucucha were still actively editing, I'd have someone I could enlist to help with the editor I'm now frustrated with. That Is. We've lost too many top content editors who have the ability and resources to deal with a growing problem. Another example is the loss of Geometry guy, a sorta kinda de facto GA process Coordinator in the older days, who would have put forward some sort of proposal to deal with this. SandyGeorgia (Talk) 17:54, 27 January 2023 (UTC)[reply]
WP:REREQ is the place where you can request access to sources. There's a list of editors at WP:REREQ#Reference resources, but I dunno how up to date it is. Sideswipe9th (talk) 18:03, 27 January 2023 (UTC)[reply]
WP:RX (it feels wrong to me to call it REREQ) is a great place, and people are jumping over each other trying to fulfill resource requests. That said, I don't think it's common to request that someone do spotchecks, rather than just cough up the source for the requester to then put in the work. Firefangledfeathers (talk / contribs) 22:40, 27 January 2023 (UTC)[reply]
So looking at WP:DYK, it seems like there's two strands to it; new articles, and significantly expanded articles. It'd be useful to find out if the copyvios coming out of DYK are predominantly from one of those two strands, or an even split from both strands. For example, if the copyvios are predominantly from new DYK articles, then that could imply a problem with the copyvio detection at the NPP and AFC level, because new DYKs should also have gone through that review and clearly they missed something, unless the copyvio content was added after the NPP/AFC review but before the DYK review. However if it's predominantly from expanded articles, which are generally already NPP/AFC reviewed, then clearly that's where we should put more focus on the prevention side.
Looking at the DYK requirements, 4c states that Articles should be free of copyright violations, including close paraphrasing and image copyright violations. So at the very least, it is formally part of their workflow. The DYK checklist that gets attached to every nomination has a yes/no/? field for copyvios and plagiarism. Looking at WP:DYKN, there are definitely some editors there running the articles through Earwig. For example this nomination has been held pending since November due to some plagiarism issues with public domain text/block quotes.
So I think I'd need some more data from approved copyvio DYKs before I could speculate more. Is there a specific DYK strand where copyvios are more or less likely? For DYKs that contained copyvios and were approved, was there a copyvio detected and handled during the nomination process? Or was the copyvio undetected for some reason? Or was a copyvio check said to have been done, but no check was actually done? Or is this not a DYK review problem, but instead a DYK effect? Is the review clear, and the copyvio text only being inserted after the DYK hook appears on the main page?
Do we need a mechanism for getting more eyes on new editors sooner and helping them out? Good question. I'd say yes on the principles alone. It might be worth looping Diannaa into this conversation? I know she does a lot of copyright cleanup, and issues a great many {{uw-copyright}} warnings every day. At the very least she may also be able to help you handle the hounding feeling. Sideswipe9th (talk) 17:32, 27 January 2023 (UTC)[reply]
I am hesitant to ping Diannaa because she (and all of the CCI people) are so overburdened already. Not sure if we should ping them or not. A few of them have already been pinged in this discussion, so they may be following anyway. I'll leave it to someone to decide whether to ping Diannaa only because I hesitate to wear out my welcome with the blooming pingie thingie. Maybe instead a post at the copyright talk page?
An educational writeup of the shortcomings of Earwig could help.
I'm not sure it matters if a DYK is new or expanded, because what drives the problems that come out of DYK is the reward culture -- the quick and easy "get my work on the mainpage" gratification. Efforts might be better placed to get the throughput at DYK to slow down. Featuring new content on the mainpage made sense in the early days, when growing the 'pedia was a goal. Does it still make sense to have so many editors working to populate DYK, and then so many more editors having to engage the problems at WP:ERRORS? Why are we still doing this? How many of those seeking rewards would stop committing copyvio if they couldn't get that gratification? One interesting bit of data I'd like to see is how many DYKs move on to FAs ... those are the editors who are adding substantial value. SandyGeorgia (Talk) 17:42, 27 January 2023 (UTC)[reply]
At some point in this, I think we'll need to get Diannaa, and all of the major CCI people involved, at the very least to hear where they think the major problem area(s) are for undetected copyvios. They may agree with you that it's DYK or some other reward driven process, or they may be seeing it from somewhere else that we've not considered. I wonder if brainstorming a brief set of questions to be asked on the CCI and/or CP talk pages would be a worthwhile exercise here?
I dunno if reward culture is just a DYK and WikiCup problem. I know when we run a NPP backlog, there's a similar leaderboard + rewards for contributing setup that if mishandled could encourage speed over accuracy.
As for the new versus expansion thing, I think it would be helpful to at least quantify where the problematic articles are coming from. Those that are new should have had at least two reviews (NPP + DYK), so two sets of eyes looking at the same or similar content. If both of those sets of editors are missing something, beyond the close paraphrasing of offline sources problem, then that might help us track down why two different groups of editors are missing this. If it's primarily the expansion side, then that limits the pool of reviewers to just those involved in DYK, which might help us figure out if this is a process, tooling, or training problem specific to the DYK expansion side.
At the moment there's too many questions like "is DYK too speedy?", "is there a lack of training for DYK reviewers?", "does DYK's process encourage hooks over accuracy?", "is there template or process blindness behind the DYK review causing editors to skip over this step?", or "is this something else entirely?" to try and workshop possible solutions. More data from the underlying DYK process would help us pre-filter out some of these questions when figuring out solutions. Sideswipe9th (talk) 17:59, 27 January 2023 (UTC)[reply]
SS, I just re-read and saw that Diannaa has already been pinged to this discussion. SandyGeorgia (Talk) 18:02, 27 January 2023 (UTC)[reply]
I am impressed with the format of Wikipedia:Good Article proposal drive 2023. If we were to start a list as you suggest in a new section below, would we head that direction? Or too soon? Need data first ? SandyGeorgia (Talk) 18:04, 27 January 2023 (UTC)[reply]
Sorry, too many conversation tangents here. Is this a list of questions for the CCI/CP talk pages? Or data gathering questions to more thoroughly figure out the problem spots in the DYK process? Or both? Sideswipe9th (talk) 18:08, 27 January 2023 (UTC)[reply]
I'm asking you :) :) Do you think we can put together questions before we have data, or data comes first? And where would we get the DYK data ? Or should we not even be assuming that DYK is a big driver of the problem, as the CCI people may disagree? My sample could be biased, as I tend to notice the big CCIs that come from frequent DYKers (including some too frequent close paraphrasing that never resulted in a CCI on one frequent DYKer who basically closely paraphrased NYT obits into DYKs years ago). SandyGeorgia (Talk) 18:13, 27 January 2023 (UTC)[reply]
Aaah, following now. For the CCI/CP talk pages, I think the questions are pretty generic, we don't need data to ask questions like "Where do you find the most problematic copyvio edits coming from?". A brainstorm for this would be to figure out the 3-6 important questions whose answers would help direct us for further investigations. Ideally this would be a short set of questions that would only take maybe 5 minutes to answer.
For the DYK data, we'd be gathering it ourselves. I'd recommend workshopping a series of investigative questions that we could then apply to both the recent known historical problem editors (the "big CCIs that come from frequent DYKers" as you put it from 2022 or a 3/6/9 month period of 2022 if that's too many editors), as well as a snapshot of all DYK nominations over a short fixed period (eg 7 days). We should be looking at things like when was the copyvio detected in relation to the article being drafted/DYK nominated & reviewed/DYK live/post-DYK, were any red flags raised during the DYK review and if so how were these handled at the time, when was the offending text inserted into the article (pre-nom during article drafting, post-nom but pre-hook, during the hook, post hook), were there any DYK process steps skipped or glanced over because the editor in question was a regular, who was involved in the review (is there a specific subset of DYK reviewers that are operating in good faith but are just bad at copyvio detection?). Anything relevant that we can structure into something that we can then use comparatively across the dataset to figure out what (if any) patterns there are. Sideswipe9th (talk) 18:35, 27 January 2023 (UTC)[reply]
Great. On the first, how about a new section below on this page to begin gathering samples which we can whittle down before going to the next step? On the second, I hesitate to over-involve myself in the DYK data gathering, as I have been closely involved in past efforts at DYK reform, and feathers could be ruffled. Leaving that to others :) And separately, I was seriously exposed to active COVID a day and a half ago, so I might fall ill any day now ... just saying ! Gonna go get a ton of work in another area done right now as in making hay while the sun shines. SandyGeorgia (Talk) 18:51, 27 January 2023 (UTC)[reply]
I think feathers have the chance to be ruffled regardless of how we handle this. But as long as we're clear and open about how we gathered the data, and the process used to analyse it, then I think we can keep that at a minimum.
Yeah sections below to work on the questions would be ideal. Or we could move this off to a subpage if you want to stop getting emails/notification pings every time someone replies or edits here. I've got other off-wiki stuff to do now though so won't be able to look at this for a while.
Oh no! Here's hoping that you get lucky and didn't get infected, or that if you did it passes swiftly and mildly. Sideswipe9th (talk) 19:12, 27 January 2023 (UTC)[reply]
In terms of badly-needed DYK reform, I think efforts will be more productive if I am less involved.
For now, I think enough knowledgeable editors are following here that we might get the beginnings of a list here. I fear if we move off to a subpage already, we may lose a few.
Thanks, not so worried about me with COVID, as my 94-year-old dear friend who exposed me :( :( SandyGeorgia (Talk) 19:33, 27 January 2023 (UTC)[reply]
One of the challenges with this work is that one person's "close paraphrasing" is another person's WP:STICKTOSOURCES. There are editors who think that if a sentence can't be credibly accused of a copyvio, then it should be banned as original research.
One of the general areas that I wish we were stronger in is briefly summarizing long passages. I'd love to see more editors summarizing whole book chapters into a single short paragraph. Doing that eliminates all concerns about copyright violations. But some RecentChanges patrollers and watchlist inhabitants, when/if they check an addition, have been known to object to anything that requires them to read more than a paragraph, and if it's the least bit contentious, they want to see close paraphrasing, and their actions put pressure on editors to engage in close paraphrasing. WhatamIdoing (talk) 01:23, 29 January 2023 (UTC)[reply]
Mmmm. It's funny, I think there's definitely a subset of editors who see the ALLCAPS shortcut for that, and use it almost as a thought-terminating cliché, conveniently ignoring the start of the second sentence that tells us to summarise in our own words. Sideswipe9th (talk) 01:28, 29 January 2023 (UTC)[reply]
Sideswipe9th, you'd think someone might write some essays about that kind of WP:UPPERCASE mistake leading to myths that we can't use WP:OUROWNWORDS. (I know WAID is perfectly aware of this mistake and was illustrating flawed thinking with a common example of flawed policy citation).
I agree with WAID's comment about books, and wish it was easier for us to get hold of (and encourage using) professional textbooks like it is for some editors to get hold of papers. The worst example of plagiarism citing a single sentence in a single source came when I looked at student assignments many years ago. The students, who were taking a first-year university course (and so therefore knew nothing) were asked to find a research paper and insert its findings into Wikipedia. The lack of subject knowledge, the lack of variety of sources and authors, and the inability to summarise what is already just a sentence, meant it was nearly impossible for them to paraphrase, and those who tried often importantly mischaracterised their source. -- Colin°Talk 20:39, 29 January 2023 (UTC)[reply]

Copyright problems: time spent

Ugh. I just spent two hours of my life rewriting Appomattox Court House National Historical Park, although I couldn't fix all the failed verification and have listed it at GAR for that and comprehensiveness/weighting issues. That one's at least partially my fault, because I performed a bad GA review back in 2020 when I was still newer to the process. The fact that those two hours will constitute most of my wiki time for this week is fairly frustrating, too. Between burnout from complex Yellow Book audits at work, some RL mental health stuff, and the knowledge that I'm at least partially responsible for the Coldwell situation, I feel heavily discouraged. Will probably return to my normal level of activity in mid-February, but at this point I can make no guarantees. Hog Farm Talk 02:36, 26 January 2023 (UTC)[reply]

Hog Farm Stop That (stern finger wagging). By the time you came along, the Coldwell Phenom (which is a culture) was already very well established. Hundreds of DYKs and a slew of GAs and people assume the editor is sourcing soundly. Not just you. More than a handful of very good editors. I have no use for blaming individual editors when there is an entire culture built around counting notches in belts. It's the culture that needs to be addressed. And WMF needs to pay people to deal with copyvio. Talk:Battle of Ridgefield-- editor rams through boatloads of cut-and-paste on 20 to 22 May, and it's on the mainpage at DYK in less than a week (28 May 2008). I don't see that anything has changed since Wikipedia:Administrators' noticeboard/Incidents/Plagiarism and copyright concerns on the main page. SandyGeorgia (Talk) 02:40, 26 January 2023 (UTC)[reply]
PS, and remember, DC used offline sources, so policy forced reviewers to AGF. (That's why a stern FAC copyvio check asks the nominator to supply random bits from offline sources.) SandyGeorgia (Talk) 02:42, 26 January 2023 (UTC)[reply]
@Hog Farm: Please don't beat yourself up over the Doug situation. You aren't responsible for his actions, or his choice to plagiarise offline sources. Sideswipe9th (talk) 02:46, 26 January 2023 (UTC)[reply]
And by the way, I mentioned above that I hesitated to name all the past exact situations, but one old-time DYK serial problem is very much still active. That's a rub. SandyGeorgia (Talk) 02:52, 26 January 2023 (UTC)[reply]

I spent my entire morning on yet another one that I came across by happenstance. This is 13 years after we published the Plagiarism dispatch, pulling together all of our best IP people to "get serious". The culture needs to change and something needs to be done. This (no one looking closely) is how the DCs and Billy Hathorns (and over a half a dozen more I can name but won't) come to leave behind big messes that we don't have enough resources to clean up. SandyGeorgia (Talk) 19:31, 26 January 2023 (UTC)[reply]

Copyright problems: WMF

Could it be that the reason WMF won't employ someone to find and remove copyright violations is that that would break the claim that they are not responsible for it. They handle formal takedown requests and nothing more. Doing more could be a trap? -- Colin°Talk 10:10, 26 January 2023 (UTC)[reply]
I was wondering if the logic was something along those lines (and I notice that WAID didn't answer my query :) Of course, assuming there is some logic may be a stretch here. There must be info out there somewhere on this that we're just not aware of. SandyGeorgia (Talk) 10:22, 26 January 2023 (UTC)[reply]
(talk page stalker) This would be my assumption too. WMF has never been responsible for the content on the servers aside from their legal liabilities under the DMCA. Any more moderation, and they'd run into additional responsibilities under Section 230. Given that Section 230 is being litigated in front of the Supreme Court this term, and WMF has filed an amicus brief in the case, I would assume that they won't comment any further until the litigation has concluded. In short, this is something that the community will have to resolve. Imzadi 1979  20:39, 26 January 2023 (UTC)[reply]
Ah ha ... very interesting info ... thx, Imzadi ... now it all makes more sense.
There must be some sort of workaround involving grants or some funding to editors, not limited only to copyright, and as long as WMF isn't in a position to control edits ... ???? SandyGeorgia (Talk) 23:57, 26 January 2023 (UTC)[reply]

Hi Sandy, sorry I've been a bit slow on the DC related stuff, I've been pretty busy recently but should be able to get back to helping with that soon. I just finished reading this mega thread, I'm very happy that you're bringing these issues up and that there's an ongoing conversation about this here. Tomorrow I'll try and answer some more questions, but on the topic of the WMF, I think Colin has it right--employing people to take care of copyvios might make it more of an "issue" for them. Grants and funding are the way to go. But that has me thinking, it'd be nice if we had an advisor or community liaison for copyvio related issues, I don't think that's asking for too much.... Moneytrees🏝️(Talk) 08:09, 28 January 2023 (UTC)[reply]

Now there's an idea we can run with. Thanks for popping in, Moneytrees. SandyGeorgia (Talk) 09:48, 28 January 2023 (UTC)[reply]

More copyvio burnout.

A little good news: I had these software tweaks done: [1][2]. Hopefully temporary blocks for copyvios are less frequent. But that's only one problem fixed. There are the amateurs that just point and click at Earwig and say everything is OK. Our tools currently have too many false negatives, and this creates CCIs that can really only be dealt with using PDEL. There are also the plot summary copyvios, the subcontinental copyvios, and worst of all - the persistent copyvio sockpuppeteers, like Dante8. MER-C 19:22, 28 January 2023 (UTC)[reply]

That is truly bad news about Diannaa, but most understandable. I don't know how you all do it. I spent a couple hours this morning on one article only. It is not only amateur editors who misunderstand Earwig; a very experienced editor pointed me to Earwig on a Coldwell article containing copyright issues. It takes hours and hours to go back and locate these very old sources, which are hard to search in various formats used, and PDEL is the only answer when serial issues are found. SandyGeorgia (Talk) 19:39, 28 January 2023 (UTC)[reply]
Even then PDEL isn't enough. I can deal with the easy PDELs in a few minutes each (the gadget Who Wrote That makes it easy) but (1) the sheer number means I hesitate to push more than five a day through WP:CP and (2) there are still an overwhelming number of complex cases. From experience, PDEL only halves the work at best. MER-C 11:03, 29 January 2023 (UTC)[reply]
So what can be done? Should we all convene in a sandbox somewhere for brainstorming? I have been working for days trying to nip another new one in the bud. And failing. It's exhausting and demoralizing and I'm too tired to write up the ANI now. There aren't enough of us. SandyGeorgia (Talk) 11:11, 29 January 2023 (UTC)[reply]
Hooking Earwig up to machine translation would help reduce false negatives and tackle one broad swathe of difficult to detect copyvios. I don't see it being added to Copypatrol - it's another batch of API calls to some external service that will require money to access. MER-C 19:38, 30 January 2023 (UTC)[reply]
By the way, halving the work on Coldwell is a tonna work! And, it's the pushback that I find frustrating, which is why we need a consensus. SandyGeorgia (Talk) 12:11, 29 January 2023 (UTC)[reply]

Copyright problems: DYK datagathering questions

Ok, starting this section to brainstorm and hopefully format a set of questions that we can apply to known bad DYKs, and a snapshot of DYK nominations over a fixed period. Will fill in more momentarily. Sideswipe9th (talk) 01:31, 29 January 2023 (UTC)[reply]

Some basic structure first. This is mostly for convenience to quickly get to the relevant article revisions, DYK review, etc.
  • Name of article: [wikilink to article name here]
  • Date of creation: [link to first revision of the article here]
  • Date of copyvio or close paraphrase detection: [link to revision where cv-revdel requested, or close paraphrase first removed]
  • DYK nomination status: [approved/rejected]
  • DYK Nomination Review: [link to archived completed DYK review of article]
  • State of article at nomination: [link to diff of the article at or just prior to it being DYK nominated]
  • State of the article after DYK review: [link to diff of the article immediately after DYK review completed]
  • State of the article after DYK hook ended: [link to the diff of the article immediately after it left the main page]
Now some questions. Comments/explainers are in italics.
  • Did the copyvio or close paraphrase exist prior to the DYK review?
    • This will let us quickly filter out articles where the offending text was inserted after the review
  • Was the copyvio or close paraphrase inserted as part of the DYK review?
    • This is a very controversial question, and one I hope we maybe don't have to ask. But if we do ask it, it will give us more info on how the offending text was inserted into the article.
  • Was there a copyvio or close paraphrase detected during the DYK review?
    • If yes, was the revision deleted?
    • If yes, was every copyvio or close paraphrase detected during the review?
    • If no, was there mention of a potential copyvio or paraphrase in the review outside of the DYK review template?
      • Note, the three above questions at the level 2 list are optional and dependent on the answer to the question at level 1
  • How long after article creation was an issue confirmed and actioned?
  • How long after DYK nomination was an issue confirmed and actioned?
  • How long after DYK review completed was an issue confirmed and actioned?
  • Was the copyvio or close paraphrase from an online or offline source?
    • This one might be difficult to ascertain. In theory it should be determinable from edit summaries that removed the content or the Special:Log entries that actually hid the offending revisions. Where an article had been tagged with {{cv-revdel}} prior to revision deletion, the output of the template should state the source.
  • If known, how was the copyvio or close paraphrase detected?
    • Again this might be difficult to ascertain if revisions have been hidden. Checking the DYK review, article talk page, and edit summaries may help. Does copypatrol keep any relevant records here that would help?
That's all I can think of right now. Obviously formatting and phrasing is pretty far from final. And there's at least one question that I hope we don't have to ask, but might give us more insight into how copyvios are getting through DYK. Sideswipe9th (talk) 02:06, 29 January 2023 (UTC)[reply]
That's a bit overwhelming :) Who would gather this data?
I was thinking more along the lines of "how many of Wikipedia's serial copyvio offenders were spawned by the pursuit of rewards via DYK and GA"? I'd rather eat nails than have to look at DYK every day to answer these questions. I just took a look at one queue and it has a DYK hook for a recent GA that contains prose that is gibberish. And the DYK hook is not only incomprehensible, it's probably untrue and it's probably a copyvio from Spanish sources.
At one time, I tracked DYK daily, because every queue had at least one (often more) instances of failed verification, copyvio, or incomprehensible prose. That remains true, 15 years later. I don't want to have to get down to the level of analyzing DYKs to try to figure out how we can stem the copyvio problem. Anyone who doesn't know it's a problem and needs data hasn't been following the main page. What we need to know is whether the DYK process is furthering the problem, or helping teach editors to prepare better articles.
In the article I just looked at, neither the DYK nor the GA review amounted to ... anything. Passed 'em up the line with scant review. Are NPP and AFC doing a better job of vetting articles? What process does a better job of educating editors on policies and guidelines and best practices? How can we reallocate more resources to what works? DYK doesn't; we have 549 Coldwell DYKs as one example. (Those of us who have been around long enough know of quite a few more.) He just kept on doing what he did, and DYK kept on passing them. How can we better focus resources so that we don't have gobs of editors promoting DYK queues so that another gob of editors can file ERRORS reports? And still not catch copyvio, 'cuz no one's looking.
So, I'm confused about who would gather this data as you outline, and what we'd do with it. SandyGeorgia (Talk) 06:11, 29 January 2023 (UTC)[reply]
For data gathering, whomever volunteers. But at the very least I'm happy to do it, I'd just need feedback on what it is I should be looking for.
So the purpose of this set of questions is to figure out where in the DYK process copyvios and CLOP issues are being missed. If we were to go to them and say something like "Hey, you all have a problem with letting copyvios and CLOP through the nomination and review process. Can you fix that please?" we'd not get much traction, and maybe some heat. However if we can go to them and say "Hey, there's a problem with the DYK process resulting in copyvios and CLOP being undetected. It's coming from [this part of the DYK review process], here's the data that shows how this is happening and how you can replicate our findings. Can this be fixed please?" I think, or I hope we'll have a much more positive response.
I'm not suggesting we look at DYK every day for a set period. What I'm suggesting is that we take a set of known bad DYKs from editors who have been subject to CCI, say around a dozen articles, and use a set of questions like this to determine where the copyvio/CLOP issue originated, and how it was missed at the DYK review. Then we compare that against a sample of recent nominations that have recently fully gone through the DYK process, for example all DYK nominations from 1 January 2023, using the same questions, to see if the same problems exist.
It is my hope that from the two sets of data, we can figure out what it is in the DYK process that is missing these issues. Is it because as you say "no one's looking"? Is it because DYK nominations are getting non-rigorous reviews? Are there DYK reviewers who are AGFing a little too hard on supposedly good/well known editors (eg, "oh that's a Doug Coldwell nomination? Not much for me to check here. Approve.")? Do some DYK reviewers just not have the competency to handle copyvios/CLOP issues when Earwig comes up clean? At the moment we don't know why this issue is arising from that process. Analysing data should help us determine that. Sideswipe9th (talk) 20:05, 30 January 2023 (UTC)[reply]
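The structure and questions proposed in this section could be captured as one record per audited article, so that the known-bad sample and the snapshot sample are scored the same way. What follows is only a minimal sketch under stated assumptions: the class name, every field name, and the "missed at review" measure are hypothetical illustrations and not an agreed format, and the dates would have to be filled in by hand from the linked revisions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DykAuditRecord:
    """One audited article. All field names are hypothetical."""
    article: str
    created: date                  # date of the first revision
    nominated: date                # date of the DYK nomination
    review_completed: date         # date the DYK review was closed
    detected: date                 # date the copyvio/CLOP was confirmed and actioned
    existed_before_review: bool    # was the offending text present before the review?
    flagged_during_review: bool    # was any copyvio/CLOP concern raised in the review?
    source_online: Optional[bool] = None  # None means it could not be determined

    def days_to_detection(self) -> dict:
        """Answer the three 'how long after ...' questions, in days."""
        return {
            "from_creation": (self.detected - self.created).days,
            "from_nomination": (self.detected - self.nominated).days,
            "from_review": (self.detected - self.review_completed).days,
        }


def missed_at_review_rate(records):
    """Share of records where the problem predated the review but was not flagged.

    Returns None when no record had a pre-existing problem.
    """
    eligible = [r for r in records if r.existed_before_review]
    if not eligible:
        return None
    missed = [r for r in eligible if not r.flagged_during_review]
    return len(missed) / len(eligible)
```

Computing `missed_at_review_rate` once for the known-bad sample and once for the snapshot sample would give two comparable numbers; any difference between them is a prompt for closer reading of the reviews, not a finding in itself.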

Copyright: Ideas so far

  1. Summer internships for law students.
  2. A WMF Community liaison for copyright issues.
  3. Grants funding editors who work on copyvio.
  4. Policy changes (WP:PDEL earlier and easier once a copyvio is found, things like that ... I have spent days trying to rein in a new editor)
    What policy changes might allow us to nip more in the bud ... sooner, easier?
    Are user right limits too lax?
    Reform AGF? How much copyvio before we suspend AGF and shoot on sight content cited to offline sources?
  5. More CCI admins. Barkeep49, get the RFA nomination machine moving on copyvio types. SandyGeorgia (Talk) 12:17, 29 January 2023 (UTC)[reply]
    I've got a candidate with CCI experience who will hopefully run this spring. I'm also always open for recommendations. Additionally I know Moneytrees is active in CCI and is currently trying to do more admin finding. That said, as Money's RfA showed, I think CCI editors going for admin face the challenge that it's easy to be focused on keeping the negative out versus nurturing the positive. On the whole editors who have a story to tell about building, rather than defending, the wiki tend to have an easier go. Best, Barkeep49 (talk) 18:50, 29 January 2023 (UTC)[reply]
    Money, a fine admin and fine person who got a thoughtful neutral from me, went to 'crat chat for the same reason a few others did recently: a nominator statement. Presentations which feel less than forthcoming are always a big concern (one wonders what else they don't know). That doesn't happen with your candidates. But I agree that building is the way to go! SandyGeorgia (Talk) 19:04, 29 January 2023 (UTC)[reply]
    Just a thought, and probably a very radical and wild one. What is it exactly that CCI admins need the tools for? Is it primarily for revdeling and blocking? Has there ever been any thought towards unbundling the revdel part of that to a new permission? I dunno what you'd call it, but in scope I'd consider it something like "CCI clerk", a trusted user who can handle some of the burden of actually suppressing copyvios from articles.
    I had this thought when I was looking at the edit filter helper and edit filter manager perms, which allow for trusted non-admins to see (EFH and EFM) and edit (EFM) private edit filters, both of which are actions that are otherwise restricted to admins. Obviously there'd need to be some checks put in place to ensure that the trusted editors who gain that permission don't abuse it in any way, but could this lighten the load on the current set of CCI admins in any appreciable way? Sideswipe9th (talk) 20:18, 30 January 2023 (UTC)[reply]
  6. GA/DYK reform: are they pushing more volume than they can handle?
  7. WMF funding to develop a better tool for detecting WP:CLOP? Tedious manual work ...
  8. Re-write, update, publish in Signpost Wikipedia:Wikipedia Signpost/2009-04-13/Dispatches (the copyright pages are too dense for a new editor)
  9. Noticeboard reform. Consider this comment (three lost years), these (untrue) claims, and the complexity of using Wikipedia:Copyright problems. (Aside: wow. Just wow. On the three years.) One can drop a problem at the COI noticeboard or the BLP noticeboard without a lot of work, but just figuring out how to lodge a copyright concern stumps me every time. If XOR'easter could have made a simple, "could someone look into this" post at a noticeboard three years ago ... yes, the CCI folks already have too much work, but would not an easier-to-use noticeboard encourage more of us non-admins to help out ?? The COINoticeboard has saved my sanity more than once. SandyGeorgia (Talk) 19:18, 29 January 2023 (UTC)[reply]
    Noticeboards are only useful if someone's there to respond to the plea for help. WhatamIdoing (talk) 21:13, 29 January 2023 (UTC)[reply]
    Hence, my point ... if it were easier, more of us would participate. I engage CCI reluctantly as I'm so afraid to make a mistake and the instructions are so complicated. SandyGeorgia (Talk) 21:17, 29 January 2023 (UTC)[reply]
  10. Data for GA and DYK to identify QPQ problems. Editor interaction shows clearly which editors were pushing DC's articles through DYK and GA. Mike Christie's data will be helpful as well. When a nominator puts up a GAN or DYK, how to add Mike's data showing frequent collaborators. The trends with DC are apparent via editor interaction, so having this info incorporated into the review might discourage unhealthy QPQ. Some of DC's collaborators have CLOP issues themselves, and should not be reviewing at DYK or GAN. SandyGeorgia (Talk) 20:46, 29 January 2023 (UTC)[reply]
    I posted this at ANI back in September: "Not being familiar with CCI discussions I don't want to pontificate but I would have thought PDEL should be the default. If breaking copyright rules doesn't get you a scarlet letter, doesn't require you to fix your own messes, doesn't stop you from editing, and leaves your bad edits in place (since we don't have the manpower to clean most of it up), what is the incentive not to break those rules?" By "scarlet letter" I meant that the CCI page names are anonymized so nobody knows you're to blame. I think at least one of those four things should change. Has there ever been a case where someone unwilling to cooperate by fixing their own messes has continued to edit and been productive? Mike Christie (talk - contribs - library) 21:12, 29 January 2023 (UTC)[reply]
    By the time it reaches that point, the editor may already be blocked. Once blocked, they're usually faced with a choice between "volunteering" to clean up the mess, or staying blocked. WhatamIdoing (talk) 21:15, 29 January 2023 (UTC)[reply]
    That sounds more than plausible, but do you think that happens because we don't bring down the hammer quickly enough? In other words, as soon as a CCI is opened the editor is expected to contribute significantly to the clean up, and if they don't they're blocked? They can edit elsewhere too at the same time, I'm thinking. If my kid were to take a stick and run around the garden lopping the heads off flowers, I'd make them help replant as necessary, and I wouldn't hide it from the rest of the family, give them the free run of the garden, and leave the damaged flowers on the lawn. Mike Christie (talk - contribs - library) 21:25, 29 January 2023 (UTC)[reply]
    When given a chance, they often demonstrate that they aren't able to paraphrase and summarize sources in their own words. Now, in the good news dept, I just investigated the editor I mention below. Arb sanctioned for other behavioral issues, failed RFA where I gave a copyvio example no one else picked up on (which was happening daily at DYK, but there was never a CCI), came to my talk page, I gave them a stern talking to, and current editing of the same type of articles from the same types of sources reveals ... no problem! There you go ... a success story ... not that the old stuff has been cleaned up, though. The problem with most of the editors who end up blocked, and same with DC, is that their friends defend them, and the stern talking to doesn't sink in. SandyGeorgia (Talk) 21:45, 29 January 2023 (UTC)[reply]
    "When given a chance, they often demonstrate that they aren't able to paraphrase and summarize sources in their own words." Then I'd say they have no place editing here. If after Doug's first CCI we'd required him to fix his own work and found that he couldn't, and he'd been blocked as a result, there would be a lot less to clean up. Mike Christie (talk - contribs - library) 21:51, 29 January 2023 (UTC)[reply]
    That's a good question. The most prolific serial DYK offender I used to follow (a decade ago) is still editing, but I don't think there was ever a CCI. Need to do more homework to see if there were any sanctions and if the copyvio continues. People at DYK wanted my head then (there was daily copyvio on the main page, and then Rlevse happened, and the rest of what happened to FAC is history), so I solved my copyvio angst by trying to never again look at DYK. It would be nice if we could get a list of the serial offenders. SandyGeorgia (Talk) 21:16, 29 January 2023 (UTC)[reply]
  11. Machine translation for Earwig. Probably on a single source basis at first to control costs. MER-C 17:31, 31 January 2023 (UTC)[reply]
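Item 10's idea of showing reviewers which nominator and reviewer pairings recur could be sketched as a simple count. This is an illustration only: the input format (a list of nominator/reviewer name pairs), the function name, and the threshold are all hypothetical, and it does not reproduce Mike Christie's data or the output of the editor interaction tool.

```python
from collections import Counter


def frequent_pairs(reviews, min_count=3):
    """Return (nominator, reviewer) pairs seen at least min_count times.

    `reviews` is an iterable of (nominator, reviewer) tuples; this input
    format is an assumption made for the sketch.
    """
    counts = Counter((nominator, reviewer) for nominator, reviewer in reviews)
    # most_common() sorts by descending count, so the heaviest pairings come first
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]
```

A high count for a pairing is not evidence of a problem on its own; it would only mark where a reviewer might want a second pair of eyes.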

I am somewhat aware of Doug C's GA/copyright issues. And I missed the recent ANI at Wikipedia:Administrators' noticeboard#User:Doug Coldwell
A couple of questions:

So, if I understand the consensus at ANI, all of the DC articles that received a GA designation are to be stripped of their GA status and ...possibly...maybe...someday... get a new GA Review. Is that correct? Had to be done, it's just that the fallout-damage is so enormous... Shearonink (talk) 03:31, 30 January 2023 (UTC)[reply]

(talk page stalker) - I didn't follow the AN discussion closely, but I think there was a method to claim a limited number of articles for a closer look and a GAR of its own. From a quick look at Ludington family, I think it needs some work. Dickson is self-published, I don't know that being published by a local printing company makes Dunathan 1963 much better than self-published, per the copyright pages Miller is self-classified as "juvenile fiction", and I believe modern sources such as Hunt have raised serious concerns about the reliability of Johnson 1907. Hog Farm Talk 03:43, 30 January 2023 (UTC)[reply]
Ah ok...thanks for your thoughts on those Ludingtons...appreciate it. I figured any article with that family name was probably on the metaphorical chopping block - no problems if it gets axed, just wanted to know people's opinions and how to move forward. Shearonink (talk) 03:50, 30 January 2023 (UTC)[reply]
Hi, Shearonink (and thanks HF). I'll put together today a sub-page at GA, as you won't be the only one with these questions.
HF explained it right, but there remains some confusion. At one DC GA I've been working on as a sample, one editor is quite convinced the issues are fixed, but three-fourths of the content is cited to offline sources that newspapers.com doesn't have, and the few I have been able to check don't even verify the claims made. The editor thinks that rewriting the content (without accessing the sources) means too close paraphrasing is addressed and the article is fixed: it's not. That could be an example of an individual GAR where community consensus may be needed, and I hope we don't encounter a lot of that kind of digging in, considering the evidence already presented about the dodgy sourcing. The AN consensus is that text from all offline sources is to be stripped from the articles (Z1720 suggested leaving them as Further reading, but the talk page might be better, as some of the sources are so archaic as to be useless). Something that emerged during the AN commentary is that there is probably COI-based POV lurking in all the Ludington-related articles. So, what is going on now at the list started by Iazyges (people indicating "fixed") isn't very useful. I know we will have some individual GARS that will be challenged, because any offline source is suspect. All we need is the list to be separated into those that claim an individual GAR ... but that can be done after I start a subpage laying out all of this and with Novem's stuff in the section below this.
So, with all that said, from the list (see post below this one), we only need to know which ones will claim an individual GAR (not included in the mass delisting, rather re-assessed individually) as editors believe they are "fixed" (in some cases, doubtful, which will lead to discussion at the GAR). In your case, all things Ludington are a mess, so I'm guessing you won't want to claim a GAR, and yes, those will go to the mass delisting. When we declare we're ready to proceed at GAR, then we can begin the process of stubbing articles (removing all content cited to offline sources and parking those in Further reading or on talk), checking what's left, and then Novem (see below) does the automated part.
It will take me the better part of today to lay all of this out in a GA subpage ... today's project. SandyGeorgia (Talk) 12:04, 30 January 2023 (UTC)[reply]
PS, here's where DC pinged you (and the world): User talk:Doug Coldwell#October 2022. (Permalink) Enjoy :) SandyGeorgia (Talk) 12:11, 30 January 2023 (UTC)[reply]
Thanks for the link. Yeah, I tried to find it but gave up after a cursory search. What an awful time with all this mess and the ongoing/subsequent cleanup...so much time and effort that now has to be devoted to this Wiki-disaster... Shearonink (talk) 15:42, 30 January 2023 (UTC)[reply]
And it continues. SandyGeorgia (Talk) 16:26, 30 January 2023 (UTC)[reply]
Another thought in the back of my mind, but for AFTER I get the explanatory subpage going. Buidhe you have mass sender. Is it permissible/advisable to use mass sender to send an explanatory post to every GA talk page on the list? If so, once we get the text developed, might you be willing to do that? Is 233 pages excessive for mass sender and likely to cause problems? We can explore this on the subpage after I get it going today. Should also make sure Premeditated Chaos is reading this thread, although I'll get a subpage set up today for better centralized coordination. SandyGeorgia (Talk) 12:22, 30 January 2023 (UTC)[reply]
I'm aware, we're working on the GAR process now so hopefully that will be settled soon. ♠PMC(talk) 15:51, 30 January 2023 (UTC)[reply]
I'm willing to help, but also any admin can send mass messages. (t · c) buidhe 17:13, 30 January 2023 (UTC)[reply]
Buidhe could you point me to where I can learn if this would be an appropriate use of mass sender? SandyGeorgia (Talk) 17:19, 30 January 2023 (UTC)[reply]
The only one I know of is Wikipedia:Mass message senders#Guidance for use. I expect you could get the perm yourself since you have use for it. It seems like an appropriate use of MMS to me. (t · c) buidhe 20:18, 30 January 2023 (UTC)[reply]
See also: Tony Ballioni talk. SandyGeorgia (Talk) 12:50, 30 January 2023 (UTC)[reply]

GARCloser and mass delisting

See user:SandyGeorgia/sandbox9. SandyGeorgia (Talk) 14:38, 30 January 2023 (UTC)[reply]

Hey Sandy. Long time no chat. I hope you've been well. Let me know if you need technical assistance with mass delisting GAs. GARCloser has the code to do this individually. With some programming and maybe a BRFA, I could probably feed it a list of GAs to mass delist, perhaps creating pro forma individual reassessment pages with a quick explanation of what's going on and a link to the ANI. I'm very busy with work this month so I may be a bit slow, but figured I'd offer. Also, as of Oct 31, 2022, GARCloser now includes the oldid. Hope this helps. –Novem Linguae (talk) 04:26, 30 January 2023 (UTC)[reply]

Awesome ... great to hear from you! You were going to be my next order of business today.
There is no hurry on this, as people at GA have asked that we hold off until the GAR community v. Individual process is merged, in case there are some individual GARs. Tomorrow (oops, just looked at my clock-- it is tomorrow!) I'll start a sub-page at GA where we can begin to sort through the technicalities. Depending on the technical issues with GARcloser, it might not be necessary to hold off, but we can discuss all that over there. For now ... in terms of getting you thinking about the coding ...
I'm thrilled to hear you can feed it a list, as that will be the way to go. Also glad we've now got oldids, but one other thing: most of Coldwell's GAs also have DYKs, and when I checked Henry Ludington, GARcloser had not rolled in the DYK. Is that something you might add meanwhile? In that case, I had to add the oldid and merge the DYK.
The scheme you propose is what I had in mind. User:Iazyges/Doug GA Rewrite Claims needs to be converted to number points (Iazyges), so we can be sure we have all 233 in alpha order, the questions like Shearonink's above need to be sorted (there seems to be some variation in understanding of what "fixed" means, so that's not a helpful grouping, I'll work on that next), then an automated GAR page is created for each one left on the list (after a period where editors can claim an individual GAR) with a link to the ANI and explanatory text we will work on, and then GAR closes it. That's the general scheme to be working on. I'll point you at a DC GA subpage once I get this rolling ... SandyGeorgia (Talk) 11:47, 30 January 2023 (UTC)[reply]
PS, very old DYKs don't have a nompage. SandyGeorgia (Talk) 12:30, 30 January 2023 (UTC)[reply]
Writing code to fold in DYK stuff adds complexity. I'll look into it, but might not have time. There's a whole class of templates that could/should be folded into article history ({{DYK talk}}, {{Old peer review}}, {{OnThisDay}}, {{FailedGA}}, {{Old XfD multi}}), but it's quite a task to do that bug-free. –Novem Linguae (talk) 18:44, 30 January 2023 (UTC)[reply]
Thanks for checking that; I wonder if you could steal the code from FACbot ? Hawkeye7? Except that's a bot rather than a script, and I don't know if they have a shared programming language. In the Coldwell case, it is almost always only DYK. But on the other hand, if there are some loose templates on the talk page that belong in articlehistory, it is not worth taking everyone's time on that ... unless it's an easy fix ... SandyGeorgia (Talk) 18:47, 30 January 2023 (UTC)[reply]
Bot paperwork filed. While that's marinating, may want to start thinking about things like 1) what text do we want to put on the pro forma GAR pages, 2) do we want to do individual or community reassessment GARs? (They'd be open and closed immediately, this is just deciding where to put the pro forma pages.) I'm thinking individual would be best, to avoid spamming the community reassessment page and log. –Novem Linguae (talk) 19:42, 30 January 2023 (UTC)[reply]
Well aren't you speedy!
  1. Working on that in User:SandyGeorgia/sandbox9. The most significant hold up right now is how to name the GA subpage; see WT:GAN discussion re sensitivity that Coldwell/Caldwell is a living person. Wikipedia talk:Good article nominations#Coldwell GAs: Implementation. I want to let that percolate for a bit before moving my sandbox content and continuing to work. We have time, per ... see next point ...
  2. The question of individual or community has already been answered for us and is contemplated in the AN discussion. Those processes are being merged to one, which is why we have to hold off until that merger is completed. Premeditated Chaos is on top of that one, and is aware of our work here. See the discussion at WT:GAN re this approved proposal. SandyGeorgia (Talk) 19:47, 30 January 2023 (UTC)[reply]
Novem Linguae, will Wikipedia:Good article reassessment/Douglas Coldwell GA list work in that form, or should it be converted to a list of talk pages? See also query at VPT. SandyGeorgia (Talk) 17:42, 30 January 2023 (UTC)[reply]
That form works fine. I can easily convert it if I end up using a different format. Thanks for checking. –Novem Linguae (talk) 17:51, 30 January 2023 (UTC)[reply]
Buidhe does that list have to be in the form of a talk page for mass sender to work for article page notices? SandyGeorgia (Talk) 18:00, 30 January 2023 (UTC)[reply]
Like User:Buidhe/test, or can the mass sender convert article pages to talk pages before sending? SandyGeorgia (Talk) 18:01, 30 January 2023 (UTC)[reply]
I believe it has to be in the same format as User:Buidhe/test—a MassMessage delivery list. (t · c) buidhe 18:52, 30 January 2023 (UTC)[reply]
I found the page and inquired there. SandyGeorgia (Talk) 19:11, 30 January 2023 (UTC)[reply]
Thank you Guerillero! Now I'm a mass murderer ... or something :) SandyGeorgia (Talk) 19:38, 30 January 2023 (UTC)[reply]
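
A minimal sketch of the article-to-talk-page conversion asked about above, for anyone preparing the delivery list by hand. It assumes the master list is a plain list of mainspace article titles, and it assumes a delivery list takes one `{{#target:...}}` line per page; both are assumptions to check against the MassMessage documentation, and the function names are hypothetical.

```python
def to_talk_title(article_title: str) -> str:
    """Map a mainspace article title to its talk page title."""
    return f"Talk:{article_title}"


def delivery_list(article_titles):
    """Build delivery-list text with one target line per talk page.

    Assumption: each target is written as {{#target:Page name}}.
    Duplicate titles are dropped while keeping the original order.
    """
    seen = set()
    lines = []
    for title in article_titles:
        talk = to_talk_title(title.strip())
        if talk not in seen:
            seen.add(talk)
            lines.append("{{#target:" + talk + "}}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative titles only; the real input would be the full GA list.
    print(delivery_list(["Henry Ludington", "Sybil Ludington", "Henry Ludington"]))
```

This only handles titles in the main namespace; anything already in a talk namespace would need separate handling.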

PLEASE review my sandbox

I have not had a breather all morning ... could others please review User:SandyGeorgia/sandbox9 so I can keep working on it later today? I don't want to move it to a GA subpage until it has fresh eyes. Please use the talk page at the sandbox, as I'm gettin' spread pretty thin now ... got mass message sender, got the master list done, got a procedure for checking the master list, worked on an AN and issues at two DC articles, got the sandbox advanced, answered questions at WT:GAN, and lost track of everything on my own talk page ... SandyGeorgia (Talk) 20:59, 30 January 2023 (UTC)[reply]

I believe it's ready, but could really use more eyes before moving to GA space. SandyGeorgia (Talk) 01:45, 31 January 2023 (UTC)[reply]
Sandy, looks good to me. A couple of very minor comments:
  • Perhaps "All Good articles by Doug Coldwell are delisted" -> "All Good articles by Doug Coldwell are to be delisted" to indicate this was a statement of intention, not a completed task.
    Done (trying to hew to admin closing statement). SandyGeorgia (Talk) 02:57, 31 January 2023 (UTC)[reply]
  • This is only a GAR page, not a general CCI, but I wonder if it would be worth including links to the CCIs for Doug in case anyone wants to help there.
    It's there ... I bolded it ??? SandyGeorgia (Talk) 02:57, 31 January 2023 (UTC)[reply]
    My bad, but bolding it is not a bad idea anyway. Mike Christie (talk - contribs - library) 03:06, 31 January 2023 (UTC)[reply]
  • I would remove mention of the cross-reference to the list of reviewers; I'll provide it when I can but I don't know when that will be. For implementation #1 perhaps say that the notification to reviewers will be done when the list is available.
    Got it, SandyGeorgia (Talk) 02:57, 31 January 2023 (UTC)[reply]
  • In FAQ 5, perhaps suggest AN rather than ANI -- that would not be an urgent issue, I think.
    Yep, got it, SandyGeorgia (Talk) 02:57, 31 January 2023 (UTC)[reply]
That's everything. I think I was the reviewer for at least eight or ten of his GAs. I remember running into him at GA years ago and thinking he was a terrible writer, but of course GA only requires prose to be grammatical, not well-written. I did a burst of GA reviewing last summer, and reviewed several of his, and it was in the middle of that burst that I realized (or was told, rather) that I couldn't assume sourcing quality, as we do at FAC for experienced editors. Before that I had been only checking sources where they looked suspicious to me. After the first ANI thread I went back and checked sources for a hundred or more of my reviews, which turned out to be an educational experience and an unintentional way to get some statistics about sourcing accuracy in random GANs. I checked 103 articles I'd passed (not including 6 of Doug's) and 76 were completely fine with no issues at all, of which a handful had nothing I could check online. The rest had issues that I posted to the article talk pages. All but one were fixed; I ended up delisting one. I did fail one of Doug's that was in an area I am expert in (magazine history) and perhaps that should have alerted me sooner -- he fought hard to persuade me I was wrong, but it's an area I know a lot about. Mike Christie (talk - contribs - library) 02:30, 31 January 2023 (UTC)[reply]
What a horrible experience for everyone involved. I really appreciate the time you took, Mike Christie, as I know you've got your head into generating data and are very busy. Thanks again, SandyGeorgia (Talk) 02:57, 31 January 2023 (UTC)[reply]
I don't see any major issues. There will be a number that I reviewed on the list - I reviewed his regularly (although I did fail a few) before I started to catch on that some of these had comprehensiveness issues. When things first blew up, I tried to reach out to him to suggest some sort of mentoring (he'd been willing to make major changes when I'd requested it a few times), but the response I got at User talk:Hog Farm/Archive 13#Flexible barge was profoundly disappointing and sort of a second wake-up call for me. So I guess I'm fairly involved here. Hog Farm Talk 02:51, 31 January 2023 (UTC)[reply]
What an awful conversation. It is amazing this has gone on for as many years as it has. And to think of how I stumbled into all of it (a ridiculous post about Sybil Ludington on Facebook for the Fourth of July) ... and here we are, thank you Facebook, on the brink of two community bans <eeeeeek> Thanks for your help reviewing it, HF ... I know how swamped you are as well ! SandyGeorgia (Talk) 02:57, 31 January 2023 (UTC)[reply]

DC GA reviewers

Wikipedia:Good article reassessment/February 2023 1/reviewer list
Wikipedia:Good article reassessment/February 2023 1/GA reviewer MMStargets (mailing list)

OK, I whipped up something but it'll need review. This is the result of looking up his GA article titles in the list I pulled of creators of pages with names ending in "/GAn". So the first one in the list is 1836 U.S. Patent Office fire; it was promoted on 2020-09-15 and there is one review in the list for that article, by The Most Comfortable Chair, with a review page started on 2020-09-12. The "age" column is just the difference of the two dates. However, there are 27 articles for which two records came up. The first is Albert Kahn (architect), which was reviewed by Sahaib on 2021-08-07 and also by Dugan Murphy on 2021-10-22. The "promoted date" is 2021-10-25 for both; this doesn't mean both promoted it -- the date is repeated on both lines because I don't know who promoted it. Two records have no valid values for reviewer because my search didn't find them; I'll have to look into why. I hope this is helpful. Mike Christie (talk - contribs - library) 03:00, 31 January 2023 (UTC)[reply]
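
A rough sketch of the "age" column and the duplicate-record check described above, assuming the table can be exported as rows of (article, reviewer, review start date, promoted date). The three rows use the dates quoted in the post; the row layout and function names are hypothetical and this is not the query actually used to build the table.

```python
from collections import defaultdict
from datetime import date

# (article, reviewer, review page started, promoted date)
records = [
    ("1836 U.S. Patent Office fire", "The Most Comfortable Chair",
     date(2020, 9, 12), date(2020, 9, 15)),
    ("Albert Kahn (architect)", "Sahaib",
     date(2021, 8, 7), date(2021, 10, 25)),
    ("Albert Kahn (architect)", "Dugan Murphy",
     date(2021, 10, 22), date(2021, 10, 25)),
]


def age_days(started: date, promoted: date) -> int:
    """Age is just the difference between the two dates, in days."""
    return (promoted - started).days


def duplicates(rows):
    """Return articles with more than one review record.

    A large age on one of the records suggests an earlier review that did
    not promote the article, but that still has to be confirmed by reading
    the review pages.
    """
    by_article = defaultdict(list)
    for article, reviewer, started, promoted in rows:
        by_article[article].append((reviewer, age_days(started, promoted)))
    return {a: revs for a, revs in by_article.items() if len(revs) > 1}


if __name__ == "__main__":
    for article, revs in duplicates(records).items():
        print(article, revs)
```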

Sheesh, Mike ... helpful ? You are amazing !!! (Hog Farm, hide your eyes :). Others will be hiding their eyes more than you, though :) All we need is a list of names for the MMS, so I'll condense this down to just a list. Do you consider it good enough for those purposes ?? SandyGeorgia (Talk) 03:06, 31 January 2023 (UTC)[reply]
I'll stand by what it says (i.e. there really is a GA review created by that editor if the table says so) but you should probably scan for the duplicate records and delete the ones that did not promote. Some are obvious (hundreds of days old) but some you may have to go read the GA reviews. And the two nulls need to be looked up. If you do that, yes, I think this is good enough. And by the way, thanks for organizing all this. It needed to be done. Mike Christie (talk - contribs - library) 03:10, 31 January 2023 (UTC)[reply]
Will do ... thanks, Mike, you're awesome! SandyGeorgia (Talk) 03:11, 31 January 2023 (UTC)[reply]
Last post for tonight; just realized the titles should be links. I can easily make that happen and will try to do so tomorrow morning before I go to work, if you like. Mike Christie (talk - contribs - library) 03:25, 31 January 2023 (UTC)[reply]
No need, Mike. I haven't stopped all day, and tomorrow I have a Dr. Appt ... but some time tomorrow, I will check the few items you indicated need checking, and then reduce it to just a straight list of DC GA reviewers. Don't think it's my job to point out how many or which ones, as that could cause red faces ... I don't need the articles linked, as I'll only be listing editors. Then the MMS will have some wording like ... "you may have reviewed one or more of the articles on this list" ... we'll leave it to them. Those in the know will probably pop over here to inquire, and in that case, we can point them to this section. SandyGeorgia (Talk) 03:31, 31 January 2023 (UTC)[reply]

Cleaning up the table. Mike, I'm not removing the duplicates, as they show early fails with later passes. I have to check the talk pages to figure out which was the pass vs fails.

One of the NULLs (Robert Grace) was a dab issue that I had already discovered as faulty in the bot report. I can't decipher why Electric fire engine glitched, but what a solid review (not).

Sorting out the duplicates is slow going because BlueMoonset had been removing {{FailedGA}} templates from talk pages, so I have to dig back and find them-- they are needed for building articlehistory. What sad shape talk pages are with the absence of GimmeBot, who kept all this stuff in order when converting ALL process templates to article history. Part of the time consuming issue here would not be a problem if we still had Gimme. Still working. SandyGeorgia (Talk) 04:52, 31 January 2023 (UTC)[reply]

Note: Wikipedia:Good article reassessment/James L. Buie/1 SandyGeorgia (Talk) 05:08, 31 January 2023 (UTC)[reply]

SUMMARY: There are 10 editors on the list who had only fails (never passed a DC GA). If we notify those who passed even one, and leave out all who only failed one or more, we are potentially biasing the sample of who shows up to GARs. We need to notify everyone who ever reviewed. SandyGeorgia (Talk) 06:14, 31 January 2023 (UTC)[reply]
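
To illustrate the point above, a small sketch comparing a "passers only" notification list with an "everyone who ever reviewed" list. The reviewer names are placeholders and the (reviewer, outcome) row format is an assumption about how the trimmed table might be exported.

```python
# (reviewer, outcome) pairs; the names are placeholders, not real editors.
reviews = [
    ("Reviewer A", "pass"),
    ("Reviewer A", "fail"),
    ("Reviewer B", "fail"),
    ("Reviewer C", "pass"),
]


def passed_only(rows):
    """Reviewers who passed at least one nomination."""
    return sorted({reviewer for reviewer, outcome in rows if outcome == "pass"})


def everyone(rows):
    """Every reviewer on the list, whatever the outcome."""
    return sorted({reviewer for reviewer, _ in rows})


def fail_only(rows):
    """Reviewers who would be missed if only passers were notified."""
    return sorted(set(everyone(rows)) - set(passed_only(rows)))


if __name__ == "__main__":
    print("Notify:", everyone(reviews))
    print("Missed by a passers-only list:", fail_only(reviews))
```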

Too bad GimmeBot doesn't appear to have posted its source code. I don't see a link to the source code at Wikipedia:Bots/Requests for approval/GimmeBot 2. I agree that the scale of that task could really use a dedicated bot. –Novem Linguae (talk) 10:59, 31 January 2023 (UTC)[reply]
I'm pretty sure BlueMoonset removed the FailedGA templates because of a bug in Legobot that caused it to report a passed GA as a fail if a FailedGA template for an earlier nomination was still present on the page. That should no longer be necessary. Some numbers, FYI: there are 11,479 article history templates with a GAN, GAR, or DGA. There are at least 51,510 pages with a name of the form ".../GAn". There are also 3055 pages with FailedGA and 323 pages with DelistedGA. Formatting and layout errors are so frequent that the only way I could see a bot working would be to add a category named something like ArticleHistoryConversionError to everything it found errors with, and leave those for gnomes. Mike Christie (talk - contribs - library) 11:13, 31 January 2023 (UTC)[reply]
When we originally set up Articlehistory (which was initially FAs only, "we" being Gimmetrow, Maralia, and me), that is exactly how we worked. Maralia and I were Gimmetrow's gnomes. And I still follow Category:Article history templates with errors daily. No one has yet replaced what GimmeBot did, and talk pages are again a mess. See Taming talk page clutter. I still attempt to keep FA talk pages in order, and regularly see that significant editors don't tend "their" talk pages ... but the Doug Coldwell talk pages are a level I am not used to seeing. I've said many times: no one yet has done for any individual process what Gimme rolled into article history for every process.
Having read through some of these reviews, I am coming around to the idea that this chart should probably be posted to the subpage of the GA page that my sandbox will become. The AN/ANI/GARs reveal there has been some intimidation and battleground behavior, and we have this data now ... why expect people to ask one of us for it or to have to come to my talk page? Thoughts ? SandyGeorgia (Talk) 13:37, 31 January 2023 (UTC)[reply]
Another interesting tidbit: Once templates are converted to articlehistory, page moves don't cause errors. Gimmetrow designed articlehistory to work that way. The faulty bot report at Robert Grace (manufacturer) is coming from unconverted templates after a move from Robert Grace. Apparently leaving unconverted GA templates on talk pages causes problems. SandyGeorgia (Talk) 13:45, 31 January 2023 (UTC)[reply]
A belated confirmation that I only removed FailedGAs from article talk pages to prevent that Legobot error, but I made a policy of only doing so when hand-entering (and sometimes hand-creating) the information into an {{Article history}} template, either pre-existing, or newly created. If you're looking to reconstruct history, every piece of data in a FailedGA would have been added to Article history before the former was deleted. This shouldn't be unusual; FailedGA and GA templates alike typically disappear into a newly created Article history template. BlueMoonset (talk) 23:57, 4 February 2023 (UTC)[reply]
Thx for letting us know, BlueMoonset ... I actually have found many FailedGAs on DC article talk pages that just went away and did not go into Articlehistory ... I recovered any that I found, but I'm pretty sure (based on how many I found) that there are still lots out there. A lot of people were just removing them and not building AH ... Bst, SandyGeorgia (Talk) 03:29, 5 February 2023 (UTC)[reply]
SandyGeorgia, I've seen it too often myself. Some new GA nominators would delete any previous FailedGA templates—I'm not sure whether they saw it as a badge of shame or just didn't bother to learn the proper way to nominate, but I'd always be sure to restore such deletions, since article history is important, and typically moved the information from them into an Article history template so it was less likely to be removed again. BlueMoonset (talk) 05:21, 5 February 2023 (UTC)[reply]
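
A very rough sketch of the "flag errors and leave them for gnomes" idea raised earlier in this thread: try to read a FailedGA template from talk page wikitext, and if the fields needed for an Article history entry are missing, add a tracking category instead of guessing. The required parameter names (`date`, `page`) and the category name are assumptions for illustration, and nothing here reflects how GimmeBot or any existing bot works.

```python
import re

# Hypothetical tracking category, following the name floated above.
ERROR_CATEGORY = "[[Category:ArticleHistoryConversionError]]"
# Assumed minimum fields needed to build an Article history entry.
REQUIRED = {"date", "page"}
FAILED_GA = re.compile(r"\{\{\s*FailedGA\s*\|([^{}]*)\}\}", re.IGNORECASE)


def triage(wikitext: str):
    """Return (wikitext, fields).

    fields is a dict of named parameters when the template looks
    convertible, otherwise None. Pages with a malformed template get the
    error category appended once, so a human can fix them by hand.
    """
    match = FAILED_GA.search(wikitext)
    if match is None:
        return wikitext, None  # nothing to convert
    fields = {}
    for part in match.group(1).split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            fields[key.strip()] = value.strip()
    if not REQUIRED.issubset(fields):
        if ERROR_CATEGORY not in wikitext:
            wikitext = wikitext.rstrip("\n") + "\n" + ERROR_CATEGORY + "\n"
        return wikitext, None
    return wikitext, fields


if __name__ == "__main__":
    print(triage("{{FailedGA|date=2021-08-07|page=1}}"))
    print(triage("{{FailedGA|page=1}}"))
```

The design choice is that the script never rewrites a template it cannot fully read; it only marks the page for review.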
Thanks for confirming my recollection that I did one GAN for DC and no others. I didn't find it a particularly pleasant experience, so I decided to just not do any more. There wasn't anything that stood out, it just wasn't very much fun, and I got a decidedly un-warm feeling from DC, so I moved on to other editors' works and to other subjects. I suppose I should go back and check that over, but I'm still swamped with weather here (it was actual temp of −23 °F (−31 °C) BEFORE wind chills) and just not sure I care enough to keep that article from just being delisted....Ealdgyth (talk) 14:00, 31 January 2023 (UTC)[reply]
Ealdgyth, while simply looking for duplicates and errors last night, I had to read many of these reviews. The picture that is emerging, between this effort and what has been posted to the ANs and ANIs, is one of intimidation, battleground, and QPQ defense from "friends". And reviewers either were afraid to speak up or didn't know where to speak up or didn't care to speak up. And when they did speak up, DC's friends swarmed. Reading through a sampling of the fails before passes reveals a considerable amount of copyvio, poor sourcing, and poor writing. A very ugly picture is starting to emerge, and I suspect many GA reviewers were actually chased off by DC. It's rather shocking this went on for so long. I have NO doubt this would not have happened under Geometry guy's watch, and that the GAN process desperately needs a Coordinator (Gguy acted as a de facto Coord). The solution at DYK is not so easy ... SandyGeorgia (Talk) 14:06, 31 January 2023 (UTC)[reply]
Rereading the single GAN I did of his, I mostly just remember his badgering. In isolation, it's annoying, and the article certainly had referencing issues (though a spot-check didn't reveal stuff to the extent that I felt going through the entire thing was necessary or it should be quick failed) but in aggregate the troubling conduct is much more apparent. Probably this is something that should have gotten brought up at GAN earlier so reviewers could pool their impressions and have caught on to it earlier. More communication would have saved us a lot of issues. I'll definitely go back through Merkel Landis this coming month and vet every statement. Der Wohltemperierte Fuchs talk 16:06, 31 January 2023 (UTC)[reply]

Sandy, Electric fire engine was missed because the reviewer picked it up and passed it between Legobot runs, so the reviewer's name was never added to GA bot's stats list. That means I had no way to find them and search for their reviews. I can fix that over the next day or so so I'm glad it came up.

Nominations of Doug's that were not subsequently passed are not in the table above. I'll think about ways to produce that list and will let you know if I can come up with something. One example is Talk:Four-Track News/GA1. I failed it, and Doug reluctantly merged it into Travel Holiday, as I had suggested, and that led to this conversation on my talk page. After my first reply, saying I had more sources, Doug nominated Travel Holiday for GAN without waiting for the additional material; BlueMoonset removed it twice before Doug gave up. BlueMoonset also posted this to Doug's talk page. In the conversation on my talk page I said I'd look at the article again but that was when the ANI blew up so I gave up on it. You can also see on Doug's talk, a couple of sections below BlueMoonset's comment, the section I started asking him to verify the sources in his GAs before I checked them again (this was during the ANI). This was when I decided it was a CIR issue more than malfeasance -- how could anyone read my comments and think that work like Doug's could pass muster unless they truly did not understand copyright and paraphrasing? Mike Christie (talk - contribs - library) 22:22, 31 January 2023 (UTC)[reply]

Very sad. Mike Christie I have decided to trim the DC GA reviewer table and use it; we just can't expect that info to be hidden on my talk page. I'll ping you to the page after I set it up. SandyGeorgia (Talk) 22:27, 31 January 2023 (UTC)[reply]
I can understand that. FYI, in case you're not aware and don't use VE, VE makes editing tables far easier if you want to do things like delete or move columns or rows. Mike Christie (talk - contribs - library) 22:31, 31 January 2023 (UTC)[reply]

Interim Coldwell GARs

Will it affect the list-sending out if one is delisted between now and when the list is sent? I opened Wikipedia:Good article reassessment/Appomattox Court House National Historical Park/1 on January 26, and the content has already been verified, removed, or replaced by me, so the PDEL won't be a factor there. I have more than half a mind to go ahead and close the Appomattox Court House NHP GAR today. Hog Farm Talk 18:21, 31 January 2023 (UTC)[reply]
I'm afraid the answer is yes. If a GAR is already open, we need to remove that from the bot processing list. I'm glad to have an example of that, as I'm trying to figure out how I will know if there are others ... what you should do depends on what you want to have happen. If you would rather see it in the mass delist, yes, close the GAR now ... if you want it to be saved, we'll remove it from the mass list. But how can I find other instances of same ??? SandyGeorgia (Talk) 18:25, 31 January 2023 (UTC)[reply]
The only way to check the list that I can think of would be to compare against CAT:GAR, which the last few weeks has seemed to mainly consist of a purge of old geography GAs. I'm going to go ahead and delist Appomattox NHP now - with the failed verification and copyvio gone, it still has major comprehensiveness and weighting issues that would disqualify it from GA without a lot of work, and I'd rather the delisting reflect the special issues there rather than the more generic ones. Hog Farm Talk 18:33, 31 January 2023 (UTC)[reply]
Sounds good ... when I get caught up (at Dr appt now), I'll remove it from the bot list. SandyGeorgia (Talk) 18:35, 31 January 2023 (UTC)[reply]
Hog Farm Did you mark it cleaned at the CCI? I'd check that myself if I were home ... SandyGeorgia (Talk) 18:38, 31 January 2023 (UTC)[reply]
I just did the delisting process, so it can be removed. Maybe I'll get to re-working it someday, but the national park history is horribly incomplete as it is. I did mark it on the CCI as cleaned - I only caught one sentence of copyvio, although there were several points that I suspected a violation, but couldn't find the source to confirm. Anything I couldn't verify to the source I removed or rewrote. Hog Farm Talk 18:42, 31 January 2023 (UTC)[reply]
I just looked at it ... heading for delisting ??? If that is so, we can just remove it from the bot list. SandyGeorgia (Talk) 18:34, 31 January 2023 (UTC)[reply]

Novem Linguae will your bot be able to detect whether there is an already open individual GAR on the talk page, and if so, skip that article? Or do I need an automated means of checking for that before we do the mass delisting run? What's our plan for situations like the one above ? SandyGeorgia (Talk) 18:43, 31 January 2023 (UTC)[reply]

PS this and this picked up Hog Farm's delisting of the Court House, so I can use that to doublecheck the list just before your bot run, so it's only already-open GARs that we would need to detect. SandyGeorgia (Talk) 18:58, 31 January 2023 (UTC)[reply]
Sure, I can add that to the algorithm. I wrote up User:Novem Linguae/Work instructions/DougColdwellGARs real quick. Feel free to check it/make edits/make suggestions. –Novem Linguae (talk) 10:20, 1 February 2023 (UTC)[reply]
Responded there (and am working on a template to consolidate all pages I've developed so far). SandyGeorgia (Talk) 10:45, 1 February 2023 (UTC)[reply]

@Novem Linguae and Hog Farm: for now, I am leaving Appomattox Court House National Historical Park on the Wikipedia:Good article reassessment/Doug Coldwell GA list, as it gives me a sample to make sure that the Petscan query picks up any delists after we started the process. When Novem is ready to process the list, then we will remove those along with the ones indicated at Wikipedia talk:Good article reassessment/February 2023 1#Intent to open an independent GAR. Since we don't know how long the delay will be until the GAR merger, I think it safest to keep the original 223 article list intact, to avoid getting crosswise, and only edit down the final bot processing list when we are ready to go. SandyGeorgia (Talk) 10:45, 1 February 2023 (UTC)[reply]
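
A minimal sketch of the "keep the master list intact, trim only the final processing list" approach described above. It assumes the three exclusion lists (GARs already open, articles already delisted, and articles claimed for an independent GAR) are available as plain title lists; how they are gathered (CAT:GAR, a Petscan query, the talk page section) is outside the sketch, and the titles and function name are illustrative.

```python
def final_processing_list(master, open_gars, already_delisted, claimed):
    """Return master-list titles that should still go to the mass delisting.

    The master list itself is never modified; the result is a new list in
    the same order, minus anything in one of the exclusion lists.
    """
    skip = set(open_gars) | set(already_delisted) | set(claimed)
    return [title for title in master if title not in skip]


if __name__ == "__main__":
    # Illustrative inputs only.
    master = [
        "Appomattox Court House National Historical Park",
        "Henry Ludington",
        "James L. Buie",
    ]
    to_process = final_processing_list(
        master,
        open_gars=["James L. Buie"],
        already_delisted=["Appomattox Court House National Historical Park"],
        claimed=[],
    )
    print(to_process)
```

Running the filter again just before the bot run would pick up anything opened or delisted in the interim without touching the original list.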

Talk page entries

I am working at User:SandyGeorgia/sandbox9. SandyGeorgia (Talk)

Just wanted to say thank you Sandy and to everyone pitching in on cleaning up this lingering mess of sourcing/COI/copyright problems. What a massive workload...smdfh. Just an idea...I think that someone has already mentioned this but if a link or links could be provided within the Talk page header/Article Milestones of all of the affected articles that would be helpful for editors & readers who happen upon these articles in the future. Linkage to someplace central that won't get archived with an explanatory intro. People come and go around WP all the time, signposts left behind on what happened are useful and also needful in terms of attribution. Shearonink (talk) 16:00, 30 January 2023 (UTC)[reply]
Working on that in sandbox9. The idea is to first mass send a message to talk (which would get eventually lost in archives), but we will make sure that the GAR delistings contain a permalink to a subpage where all is documented. Everything useful I learned on Wikipedia ... SandyGeorgia (Talk) 16:28, 30 January 2023 (UTC)[reply]

Shearonink here is one piece just pointed out to me. Once the articles have been cleared, we should be putting

  • {{subst:CCI|name=20210315}}

on the talk pages. See Talk:Willis Fletcher Johnson SandyGeorgia (Talk) 20:15, 31 January 2023 (UTC)[reply]

Query

Is 7&6=thirteen actually Doug Caldwell? Doug's Flickr has very recent images of recent articles edited heavily by 7&6=thirteen and I saw other similarities on their userpage setup when I went there looking for some other reason for them to have taken recent images of the pages. — Preceding unsigned comment added by 64.39.156.254 (talk) 20:42, 30 January 2023 (UTC)[reply]

I don't even have time or space in my head to contemplate that, but I imagine that the CUs would have picked that up. SandyGeorgia (Talk) 20:44, 30 January 2023 (UTC)[reply]
Oh no....just the possibility... Shearonink (talk) 21:05, 31 January 2023 (UTC)[reply]
Looking at the archived SPI from earlier this month, there is mention of notes on cuwiki. Whether or not those blocks were purely behavioural, or a mixture of behavioural and technical, is not clear from the archive. Sideswipe9th (talk) 20:47, 30 January 2023 (UTC)[reply]
Ummm, kind of hate to ask, but what is cuwiki? Shearonink (talk) 21:25, 31 January 2023 (UTC)[reply]
It's a private wiki that only the Checkusers (CUs) can access. Sharing everything one knows about a sockmaster is not advised, because if they know how they are detected, they can better evade detection. So information about sockmasters is kept where only CUs can see it. SandyGeorgia (Talk) 21:33, 31 January 2023 (UTC)[reply]
Ah, ok...I kind of thought so but WP is such a sprawling place I was thinking maybe I had missed something I was supposed to know... Shearonink (talk) 04:27, 1 February 2023 (UTC)[reply]
Nope :) WP:LTA pages kept on Wikipedia (like Wikipedia:Long-term abuse/Mattisse) turned out to be a mistake because we used to give too much information, that only helped the sockmaster by telling them everything we knew about them so they could adapt and evade. We need to give the others just info to know how to recognize the sock, while holding back that which would help them evade. Bst, SandyGeorgia (Talk) 04:31, 1 February 2023 (UTC)[reply]
If you can make a convincing case, that would be at Wikipedia:Sockpuppet investigations/Doug Coldwell. SandyGeorgia (Talk) 20:46, 30 January 2023 (UTC)[reply]
And then there's proxying for a banned user ... to be watching for across the board. SandyGeorgia (Talk) 20:49, 30 January 2023 (UTC)[reply]

Sandy

Sandy, you are an incredible editor. What you and so many others do in regards to copyright investigations and cleanups is nothing short of miraculous. Your efforts are most appreciated. I don't want you to conflate anything I have said as supporting 7&6's actions in interfering with Doug's copyvio cleanup nor should you apologize for protecting the encyclopedia by bringing forward the discussion at AN the way you did. I want to encourage you to keep doing your work. It is very impactful in keeping Wikipedia, an encyclopedia and community we both love, safe and secure from being damaged by copyrighted content. I know that's not all you do but it is a huge task with grave consequences and serious responsibility and I appreciate that you step forward to take action. I support you 100% and I respect you even more. --ARoseWolf 19:05, 2 February 2023 (UTC)[reply]

Not to worry. I have greatly appreciated your measured tone and obvious intent to help at 7/6. And compassion is an attribute I admire. I have avoided posting to 7/6's talk, as that would probably only irritate them, but there are two things that would sway me towards taking a position against the CBAN, in case you are able to have any influence there:
  1. I don't appreciate the reference to "shame" considering the sole IP that posted here to my talk, inquiring whether 7/6 was a Coldwell sock. I can't control who posts to my talk. But it's still unclear whether 7/6 is heaping shame on my head or the IP. And if he's heaping that at me, it's not at all funny, considering I have a gratuitous and undeserved block log wrt sock puppets, and I indeed take great care in that area.
  2. Considering the considerable Wikifriendship between Coldwell and 7/6-- in conjunction with the socking relationships to the library-- the possibility that Coldwell and 7/6 are in communication, and 7/6 is proxying for a banned user, is real. It's clear why the socks went after Willis Fletcher Johnson. What brought 7/6 to Ludington Public Library, after the block of Thomas Trahey? If there were some alternate explanation for what brought 7/6 to that particular article, I would re-evaluate whether to take a position on the Cban proposal.
I also want to encourage you to keep up the compassion, and respect you for the way in which you have reached across a difficult divide to 7/6. I would have structured the AN much more carefully had I realized there was more background here; I worked on the Doug Coldwell AN in sandbox for days before launching it at AN, but in the case of 7/6, I thought it was an isolated problem, and just wanted the WP:DCGAR interferences to end. SandyGeorgia (Talk) 19:29, 2 February 2023 (UTC)[reply]
  • Sandy, you are an incredible editor. What you and so many others do in regards to copyright investigations and cleanups is nothing short of miraculous. Your efforts are most appreciated *raises whisky glass in support* ... Hope you're doing well Sandy. I know we haven't talked much since the Hamlet fire article and I've only tangentially been involved in the Coldwell cleanup, but do know that your work is appreciated (along with the work of User:Hog Farm, User:Mike Christie, and others). I'd like to keep the GA process around (and with a few dozen GAs myself you could say I have a vested interest) and the cleanup and integrity checking is vital to making it a functional, meaningful process. Also happy we're going to merge community with individual GARs, don't know why we ever had two separate things for that. -Indy beetle (talk) 09:03, 3 February 2023 (UTC)[reply]
    Thanks for stopping by with such kind words, Indy beetle! The whisky will be most appreciated by dear hubby, who quips that he is "drinking for two" since my doctor declared me acutely allergic to all forms of alcohol, along with aspirin, NSAIDs and antibiotics. He does enjoy a nice whisky, while I drink my flavored fizzy water :/.
    If everyone checked one diff a day at Wikipedia:Contributor copyright investigations/20210315, we could clean up the Coldwell mess sooner. Disappointingly, I just discovered that spot checking on new nominators has fallen by the wayside at FAC, so I hope it doesn't go the way of GAN.
    It's always a pleasure to "see" you, Best, SandyGeorgia (Talk) 13:15, 3 February 2023 (UTC)[reply]

Wikipedia:Discord

Hello. I was wondering if you'd like to join the Wikipedia:Discord. There's even a CCI channel there where you can find me, Moneytrees and other users who work in CCI. If you're interested, feel free to stop by :) MrLinkinPark333 (talk) 19:46, 4 February 2023 (UTC)[reply]

Old dog, new tricks ... I never even did IRC, and I don't know what a "channel" is. Must I learn? I just like to write content. SandyGeorgia (Talk) 20:01, 4 February 2023 (UTC)[reply]
A channel is a specific section of Discord. If you're not interested, no worries! MrLinkinPark333 (talk) 23:30, 4 February 2023 (UTC)[reply]
I like the social element of Discord. Makes Wikipedia feel a little less like a business and more like a group of friends. Might be worth the learning curve. To each their own though :) –Novem Linguae (talk) 10:51, 5 February 2023 (UTC)[reply]
Novem Linguae that's precisely the part that worries me ... I've got a block log as explanation for why, and even though the block didn't stand long at all (was corrected while I was out merrily shopping), it's still a painful reminder of what sorts of things got cooked up on IRC. I never did figure out what IRC was-- just knew it was where bad non-transparency reigned. SandyGeorgia (Talk) 02:23, 7 February 2023 (UTC)[reply]

Hi

May I ask whether a ban from “gender-related disputes” relates to sexual violence or anything concerning sex – for instance, information on a trafficker. I thought not as the first is about culture wars whereas the second is not, but just thought to check… Scientelensia (talk) 20:51, 4 February 2023 (UTC)[reply]

Hi, Scientelensia; I am not an admin, so you should not take my word for it on this; until you get an answer, you should err on the side of caution. It would be helpful if you were to put an example here of a specific article you are thinking of working on; then I can ping some admins who might be able to help. SandyGeorgia (Talk) 22:00, 4 February 2023 (UTC)[reply]
Thank you, I will. The article in question was about someone not so good (to say the least), Andrew Tate. I added some details about recent news concerning his trafficking but it was reverted to err on the side of caution. The information has been added by someone else now but I thought to add it as it was not to do with a gender-related dispute.
Also, to buidhe, I’m asking SandyGeorgia because they seem to be very experienced and have helped me before on the lines of this matter. Scientelensia (talk) 10:04, 5 February 2023 (UTC)[reply]
Scientelensia you could post to Callanecc's talk page; as Buidhe points out, they will know better than I do. But remember that admins are quite busy and can't remember all editors, so when you post to them, you should mention the article and link to it, and also include a link to your talk page from the topic they enacted, or remind them exactly of what topic ban you have (gender-related). They'll answer you more quickly if they don't have to go look all that up. Good luck, SandyGeorgia (Talk) 02:20, 7 February 2023 (UTC)[reply]
Alright, thank you! Scientelensia (talk) 07:43, 7 February 2023 (UTC)[reply]
Don't ask some random editor, ask the admin who put on the ban. (t · c) buidhe 06:18, 5 February 2023 (UTC)[reply]

Question about IP editor

Hello. Apologies for the random message. An IP editor has been putting in uncited information into the 3 of Hearts (album) article. I have reverted them three times already (although I fully admit the third time was an accident as I know reverting three times or more is inappropriate) and I have left a message on the IP's talk page. Any advice for what I should do in this situation? Sorry again and thank you. Hope you are having a great week so far. Aoba47 (talk) 22:59, 6 February 2023 (UTC)[reply]

I saw this and blocked them, let me know if they get back to it. ♠PMC(talk) 23:10, 6 February 2023 (UTC)[reply]
Thank you and apologies for the third revert. Aoba47 (talk) 01:27, 7 February 2023 (UTC)[reply]
You're okay, it's only after #3 that they start bringing out the dogs ;) ♠PMC(talk) 01:52, 7 February 2023 (UTC)[reply]
Premeditated Chaos thank you SO much for helping Aoba while I fretted over finishing those blooming Lunar display CCIs. Almost 18 hours in, and I only have the US articles to go ... another six hours probably ... so 24 hours to clean up one DYK that was an abuse of all things good faith. Now to catch up on my talk page ... thx again. SandyGeorgia (Talk) 02:16, 7 February 2023 (UTC)[reply]
No problem. Sandy, with my experience in CCI I recommend taking it easy and not stressing about it too hard. We have dozens of open CCIs. Many are enormous, bigger than Coldwell's, and carry the same load of problematic content. We work on it the best we can but we can't drive ourselves crazy trying to fix it immediately when that isn't realistically possible. ♠PMC(talk) 02:25, 7 February 2023 (UTC)[reply]
Premeditated Chaos the issue is that one DYK had 30 articles replete with copyright issues, but all similar and using the same sources and needing the same templates, yada yada, so it's easier to complete them all at once ... once I got a system, I had to keep it going ... while keeping track of where I was in the list and what was left to do and all that ... almost home, and I hope not to hit another one like that. SandyGeorgia (Talk) 02:35, 7 February 2023 (UTC)[reply]
That's fair, just don't want you to stress yourself out trying to carry the whole thing :) ♠PMC(talk) 02:37, 7 February 2023 (UTC)[reply]
I'm committed to the Ludington-related (little by little-- same thing, now I know the sources, so can do all of them), finishing up the Lunar displays, and getting the WP:DCGAR launched. After that, it's whatever I have free time for ... but I have my kiddos flying in for a visit this weekend through next week, so at least want to get the loony lunars out of the way before they arrive. Thx for the thoughts :) :) SandyGeorgia (Talk) 02:41, 7 February 2023 (UTC)[reply]

I don't know how you do it

I don't understand how it's possible for you to put as much work in copyright clearing on the DC situation as you do. I just applied nuking (and revision deleting a sizable chunk of the article history) to Sweeney Prizery, and that pretty much ended my desire for Wikipedia work this evening. It just sucks the life out of me to take a hatchet to content, knowing that I'll probably never get the time to flesh things back out with needed levels of detail. But it has to happen, and I guess it's better that I try to save a couple paragraphs than it end up at WP:CP down the road and have nothing remaining. Hog Farm Talk 03:50, 7 February 2023 (UTC)[reply]

Hog Farm you are always too hard on yourself: I don't know how YOU do it !!! Remember, I don't also have a job (although I am going back to volunteer Treasurer at church starting next week, so that'll take away half of my Wikitime).
That 30-article DYK was such an abuse of the process that I really wanted to get through all of them to find out just how bad it was ... and it's really bad in every way possible. I needed to see that, to feel OK about nuking where needed going forward, and since all articles are repetitive and use the same sources, it made sense to just do them all (with eight windows open). And some were Spanish sources so I wanted to see if those had cut-and-paste google translates (yep). It's not only the copyvio; it's the misrepresentation of sources, spreading wrong info throughout the internet, putting personal email correspondence into mainspace, but worse, the complicity of DYK reviewers in contributing to padding up the article with repetitive and meaningless text just to meet the DYK limits. The Who Wrote That? tool shows that DYK reviewers helped pad up the articles, making already bad writing worse. That kind of DYK processing is a) not encouraging source-to-text integrity, and b) not teaching good writing. The exercise in getting through all 30 in the lunar display series has really shown me how DC became the poor editor he was; DYK furthered and worsened his editing. And then there's the GA process, with zero accountability. Gguy kept an eye on everything; no such thing now. It's not at all surprising that DC never showed his face around FAC.
So, now that I've seen how bad it is, I won't mind at all nuking DC content going forward. I did all that work on the lunar series on articles that never should have been created (whole thing coulda been a list), and each gets maybe 5 pageviews a day.
And the kids are coming to visit, so I've got to make hay while the sun shines! Take care, SandyGeorgia (Talk) 10:17, 7 February 2023 (UTC)[reply]
There is a list, and at present the separate articles generally have a paragraph or so of history. Think there would be opposition to merging that into the list and redirecting/deleting the separate articles? Gimmetrow 16:48, 7 February 2023 (UTC)[reply]