In the last several days [as of 1 May 2007], I conducted a small survey of vandalism to featured articles on Wikipedia. This page discusses my methods and results. The shocking result of my study: on average, it takes the Wikipedia community 10 hours to remove serious (as opposed to childish) vandalism from Featured Articles.
For my project, I studied only featured articles. Featured articles should be the very best Wikipedia has to offer, as they have been extensively checked and vetted. At the outset, I assumed that the individuals who went to the work of creating featured articles would probably watch them closely, and that most featured articles would be on a number of watchlists. For these reasons, I assumed that featured articles would display the ideal response time for vandalism.
Of course, we all know that vandalism that includes profanity and childish insults is caught rather quickly, thanks to the efforts of our hundreds of RC patrollers and the multitude of software devices created to help find easy-to-spot vandalism. They are quite adept at finding and reverting "Your mom..." or "PENIS!" So, for this study, I engaged in slightly more complex vandalism in three categories:
- Grave Factual Inaccuracy: For these articles, I changed or inserted material that any average reader or editor of Wikipedia would immediately know to be untrue. For example, in the hydrochloric acid article, I wrote that Martin Sheen discovered the acid by mixing potatoes with salt and that Martin Sheen also invented agent orange for dissolving gold. Finally, I inserted the sentence: "Most of Sheen's research received funding from the United States military due to their interest in new weapons technology". If a reader/editor didn't immediately notice that this information was false (and wholly inconsistent with the remainder of the article), clicking on any of the wikilinks I inserted would've revealed the truth.
- Complete Nonsense: In this category, I inserted a passage of completely irrelevant prose into an article. For example, I inserted the opening of This Side of Paradise into the middle of the article Island Fox.
- Factual Inaccuracy: In this category, I changed articles more subtly, so that a user with knowledge of the topic would be needed to spot the incorrect information. For example, in the article on Norman Borlaug, I changed "Between 1965 and 1970, wheat yields nearly doubled in Pakistan and India" to "Between 1968 and 1975, wheat yields nearly tripled in Pakistan and India" in the article's lead section.
In the complete nonsense category, I modified a total of five articles. The average response time was 691.8 minutes or 11.5 hours!
|Article||Vandalized||Reverted||Time to revert||Notes|
|Island Fox||23:43, 27 April 2007||03:04, 30 April 2007||51 hours 47 minutes||This one took longer to revert than any other article in the study (over two days).|
|Data Encryption Standard||23:44, 27 April 2007||00:57, 28 April 2007||1 hour 17 minutes|
|Cornell University||23:48, 27 April 2007||04:07, 28 April 2007||4 hours 19 minutes||This one was reverted by an anon, as were many of my changes.|
|Technetium||02:57, 28 April 2007||03:12, 28 April 2007||15 minutes||One of the quicker reverts in the study, but the editor still assumed good faith.|
|Fin Whale||03:03, 28 April 2007||03:04, 28 April 2007||1 minute||Now that was a fast revert: exactly what we're going for here!|
All in all, the response in this category was shockingly slow, especially for Island Fox. Only two of the articles, Technetium and Fin Whale, were reverted in what I would deem an acceptable time frame for vandalism this easy to spot. Of course, the average revert time above is skewed by the extreme values, so if we take the average of the middle three, we end up with 117 minutes or nearly two hours, which is better, but still terrible.
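The two averages discussed for this category can be reproduced with a short calculation. The following is a minimal sketch, not part of the original survey: it assumes the durations exactly as listed in the table's duration column (converted to minutes), and the names `nonsense_minutes` and `trimmed_mean` are illustrative.

```python
from statistics import mean

# Revert times for the complete nonsense category, in minutes,
# taken from the duration column of the table above.
nonsense_minutes = {
    "Island Fox": 51 * 60 + 47,
    "Data Encryption Standard": 1 * 60 + 17,
    "Cornell University": 4 * 60 + 19,
    "Technetium": 15,
    "Fin Whale": 1,
}


def trimmed_mean(values, drop=1):
    """Mean after discarding the `drop` smallest and `drop` largest values."""
    ordered = sorted(values)
    return mean(ordered[drop:len(ordered) - drop])


overall = mean(nonsense_minutes.values())
middle_three = trimmed_mean(nonsense_minutes.values())

print(f"mean of all five articles:  {overall:.1f} minutes")
print(f"mean of the middle three:   {middle_three:.1f} minutes")
```

Dropping the fastest and slowest revert before averaging is one simple way to keep a single extreme value such as Island Fox from dominating the result.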
Grave Factual Inaccuracy
I concentrated the most on this category, modifying a total of 9 articles. The response time here was still quite bad. On average, it took 555.7 minutes or 9.25 hours to revert modifications to articles in this category. Unlike the previous category, no single value threw off the data as there were two values of nearly 14 hours and one of over 31 hours.
|Article||Vandalized||Reverted||Time to revert||Notes|
|Medal of Honor||22:57, 27 April 2007||22:58, 27 April 2007||1 minute||Great revert time, congratulations to ERcheck!|
|Dime (United States coin)||23:01, 27 April 2007||02:55, 28 April 2007||3 hours 54 minutes|
|Butter||23:06, 27 April 2007||13:25, 28 April 2007||14 hours 19 minutes||This one wasn't ever exactly reverted. An anon removed the entire paragraph that I changed and as such the article is simply missing a paragraph. To be honest, it wasn't an essential paragraph, so I didn't put it back in. Be bold and decide if you want to revert to an earlier version!|
|Hydrochloric Acid||23:10, 27 April 2007||13:26, 28 April 2007||14 hours 16 minutes||This one was edited by a bot in between my edit and the revert.|
|James Joyce||23:15, 27 April 2007||02:58, 28 April 2007||3 hours 43 minutes|
|New Orleans Mint||20:51, 28 April 2007||01:04, 29 April 2007||4 hours 13 minutes||I don't know why I kept on modifying articles to include Vietnam information, but this was one of those.|
|Great Lakes Storm of 1913||21:15, 28 April 2007||21:53, 28 April 2007||38 minutes||A good revert time, apparently because the article is heavily edited by brian0918.|
|Second Crusade||21:21, 28 April 2007||15:59, 30 April 2007||42 hours 38 minutes||This one suffered another incident of vandalism and was reverted to my version before my modifications were corrected. Honestly, how long does it take to figure out that Gregory Peck, Bill Cosby, and Harry Potter didn't lead the Second Crusade and that Paul Revere wasn't involved?|
|Francis Petre||19:54, 29 April 2007||20:02, 29 April 2007||8 minutes||I'd call that a good response time!|
Once again, we are faced with an appalling response time. Every article in this category had obviously been vandalized, yet many of them went uncorrected for several hours. I shudder a little when I think about people reading these articles in the meantime. What kind of impression would a casual reader form of an encyclopedia that reported that Bill Cosby led the Second Crusade along with Harry Potter and Gregory Peck?
Interestingly, the factual inaccuracy category provided the single glimmer of hope in my study, with an average revert time of 57.4 minutes for five articles. Of course, prior to this survey I would've looked at a revert time of nearly an hour as atrocious, but you take what you can get. Unfortunately, the quick revert time in this category wasn't the result of the actions of a group of vigilant users. Instead, Morven, a user with checkuser permissions, connected the five articles through the fact that I had made all of my edits up to that point through the same proxy. So, I feel that this category is essentially invalid. Nonetheless, here is the data.
|Article||Vandalized||Reverted||Time to revert||Notes|
|Eifel Aqueduct||22:26, 27 April 2007||23:41, 27 April 2007||1 hour 15 minutes||Revert by Morven|
|The Scout Association of Hong Kong||22:29, 27 April 2007||23:37, 27 April 2007||1 hour 8 minutes||Revert by Morven|
|Canon T90||22:33, 27 April 2007||23:52, 27 April 2007||1 hour 19 minutes||Revert by Morven|
|Norman Borlaug||22:36, 27 April 2007||23:36, 27 April 2007||1 hour||Revert by Morven|
|Order of the Thistle||22:38, 27 April 2007||22:43, 27 April 2007||5 minutes||Quick revert and not by Morven (Doops reverted it)|
Once again, I don't think this category was especially notable given the influence of Morven and his checkuser permissions. Nonetheless, it is a glimmer of hope when it comes to a possible organized vandal attack.
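Each response time in the table above is simply the gap between the time of the vandalism and the time of the revert. As a rough sketch (not part of the original survey), the following Python parses timestamps in the format used in the tables and averages the gaps. The names `FMT`, `edits`, and `minutes_between` are illustrative assumptions, and the `%B` month name assumes an English locale.

```python
from datetime import datetime

# Timestamp format used in the tables, e.g. "22:26, 27 April 2007".
# %B matches the full month name and assumes an English locale.
FMT = "%H:%M, %d %B %Y"

# (article, time of vandalism, time of revert), copied from the table above.
edits = [
    ("Eifel Aqueduct", "22:26, 27 April 2007", "23:41, 27 April 2007"),
    ("The Scout Association of Hong Kong", "22:29, 27 April 2007", "23:37, 27 April 2007"),
    ("Canon T90", "22:33, 27 April 2007", "23:52, 27 April 2007"),
    ("Norman Borlaug", "22:36, 27 April 2007", "23:36, 27 April 2007"),
    ("Order of the Thistle", "22:38, 27 April 2007", "22:43, 27 April 2007"),
]


def minutes_between(start, end):
    """Elapsed minutes between two timestamps written in FMT."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds() / 60


response_minutes = {
    article: minutes_between(start, end) for article, start, end in edits
}
average = sum(response_minutes.values()) / len(response_minutes)

for article, minutes in response_minutes.items():
    print(f"{article}: {minutes:.0f} minutes")
print(f"average: {average:.1f} minutes")
```

Parsing full dates rather than clock times alone means that reverts which land on a later day are still measured correctly.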
For the categories of grave factual inaccuracy and complete nonsense, the average overall response time was about 10 hours. I would categorize the edits in these categories as serious vandalism. Personally, I think that a revert time of 10 minutes would be more appropriate for featured articles than 10 hours. In other words, our anti-vandal measures have failed. While they are highly successful at catching childish vandalism, our vandal fighters don't even notice serious content vandalism that is equally, if not more, damaging.
Also, because this study involved changing facts, it reveals that the general community response to a bad fact in a featured article is far too slow.
What do I suggest?
- Discussion on this topic - people need to talk about it
- Put more articles on your watchlist! I've already watched another half-dozen featured articles.
- Stable versions - so people don't have to deal with bad facts
- Divert some of the extensive resources we devote to vandal-fighting to fact-checking. It may be less glamorous, but fact-checking is essential to catch bad facts inserted not just as vandalism but also in good faith.
- Create some sort of bot to monitor edits by new editors. Ideally this would apply to those with a low edit count, but recently created accounts could be monitored instead. I think such a bot is technically feasible and a good use of resources. Just look at the numerous variations on detecting "PENIS!" in RSS feeds or monitoring recently reverted users.
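To make the bot suggestion a little more concrete, here is a minimal sketch of the filtering rule such a bot might apply. It is only an illustration under stated assumptions: it does not talk to Wikipedia at all, the `Edit` record and its fields are hypothetical, and the thresholds (50 edits, 30 days) are arbitrary placeholders rather than recommended values.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Edit:
    """Hypothetical record of a single edit; a real bot would have to
    fill these fields from whatever feed of recent changes it uses."""
    article: str
    editor: str
    editor_edit_count: int
    editor_registered: datetime
    timestamp: datetime


def needs_review(edit, featured_articles,
                 max_edit_count=50,
                 min_account_age=timedelta(days=30)):
    """Flag edits to featured articles made by new or low-edit-count accounts.

    The thresholds are illustrative assumptions, not tested values.
    """
    if edit.article not in featured_articles:
        return False
    account_age = edit.timestamp - edit.editor_registered
    return (edit.editor_edit_count < max_edit_count
            or account_age < min_account_age)


# Usage example with made-up data.
featured = {"Second Crusade", "Island Fox"}
suspicious = Edit(
    article="Second Crusade",
    editor="ExampleNewUser",
    editor_edit_count=3,
    editor_registered=datetime(2007, 4, 28, 21, 0),
    timestamp=datetime(2007, 4, 28, 21, 21),
)
print(needs_review(suspicious, featured))  # True
```

A rule like this would only produce a list of edits for a human to check; deciding whether a changed fact is actually wrong still requires someone who knows the topic.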
Answers to Questions
Below, I have answered some questions I anticipate receiving.
- 1. Isn't this a massive violation of WP:POINT?
- In a word, yes. But someone needed to look at how long it actually takes to revert vandalism and this seemed like the best way.
- 2. Who are you?
- I am a frequent editor with over 2500 mainspace edits. I am not an administrator. For obvious reasons, I have concealed my identity for this survey.
- 3. Why did you use registered accounts?
- For several reasons, primarily so that I could operate behind the same IP without having my edits linked by other editors (except one with checkuser). Additionally, people screen anonymous edits more thoroughly, so I felt being registered would give better data.
- 4. What did you expect to be the results of your study?
- I would've guessed about an hour to revert minor factual inaccuracies, 15 minutes for grave factual inaccuracies, and 5-10 minutes for nonsense. Yes, I was wrong.
- 5. Why shouldn't we ban you?
- I'm not a threat. Go ahead and block the accounts that I used to do my modifications, but I am not a vandal and I don't plan to use this account to do anything wrong. This account is just for sparking discussion.
- 6. Why did you choose Featured Articles?
- Because they are supposed to be Wikipedia's best. I think we all know that vandalism to stubs can go unnoticed for a long time. But Featured Articles, simply put, matter a lot.