Spell checker bug

Your spell checker flagged Herosé, but it showed up in the misspellings list as simply "hero".

Thanks for the bug report. The problem seems to be that it doesn't handle special characters (like the acute accent) properly. I have removed these false positives from the list and will fix it with the next version of the spellchecker. Sietse 14:23, 27 Nov 2004 (UTC)
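For illustration only, here is a minimal Python sketch of how this kind of bug can arise and be avoided. It is not the spellchecker's actual code: the function names and the ASCII-only pattern are assumptions, and the exact truncation it produces may differ from what the real tool did.

```python
import re
import unicodedata

# Hypothetical ASCII-only pattern: any accented letter ends the word early.
ASCII_WORD = re.compile(r"[A-Za-z]+")
# Unicode-aware pattern: [^\W\d_] matches any letter, accented or not.
UNICODE_WORD = re.compile(r"[^\W\d_]+")


def tokenize_ascii(text):
    """Split text into words, treating only A-Z and a-z as letters."""
    return ASCII_WORD.findall(text)


def tokenize_unicode(text):
    """Split text into words, keeping accented letters inside the word."""
    # Compose "e" + combining accent into a single letter before matching.
    composed = unicodedata.normalize("NFC", text)
    return UNICODE_WORD.findall(composed)


print(tokenize_ascii("Heros\u00e9"))    # ['Heros']
print(tokenize_unicode("Heros\u00e9"))  # ['Herosé']
```

The ASCII-only version cuts the word at the accent, so a correctly spelled name is checked as a different, shorter word.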

Extra \

Page titles containing an apostrophe had extra backslash characters in the links. I removed them, and you should correct this in the future. Eric119 03:48, 29 Nov 2004 (UTC)

Thanks, I didn't notice those entries; I have a greyscale monitor, so red links don't really stand out on my screen. I'll take care of this in the next listing. Sietse 08:52, 29 Nov 2004 (UTC)
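A possible way to guard against this, sketched in Python. It assumes the backslashes come from quote-escaping in the input (for example SQL-style escaping), which this discussion does not confirm. The helper names and the sample title are hypothetical.

```python
def unescape_title(title):
    """Remove backslashes that escape quote characters in a page title."""
    # Assumption: only \' and \" were added by the escaping step.
    return title.replace("\\'", "'").replace('\\"', '"')


def wiki_link(title):
    """Wrap the cleaned title in wiki link brackets."""
    return "[[" + unescape_title(title) + "]]"


# Hypothetical escaped title, as it might arrive from the input data.
print(wiki_link("Rock \\'n\\' Roll"))
```

Only backslashes directly before a quote are removed, so a title that legitimately contains a backslash elsewhere is left alone.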

Aside - repetitive 'and'

"In this scene, the King, and, I think, the Queen are present". As shown, there should be a comma between the King and and, and and and I.

-- Solipsist 20:34, 29 Nov 2004 (UTC)

Shouldn't that be either
  • There should be a comma between the King and and, and and and I think.
  • There should be a comma between King and and, and and and I.
...? :-) -- Smjg 11:18, 30 Nov 2004 (UTC)

(Doradus) No, it should be this:

As shown, there should be a comma between "the King" and "and", and "and" and "I".

Checking miscapitalized adjectives

Very often, names of languages and ethnic groups are not capitalized when they should be, i.e., french, english, instead of French, English. That could be checked too. Danny 14:26, 4 Dec 2004 (UTC)

It turns out that there is indeed a very large number of such errors in the article space. I have put a selection of the articles that contain such words on the project page. Thanks for the idea! Sietse 17:55, 4 Dec 2004 (UTC)
I broke the list into manageable pieces. I THINK it's less daunting, but I KNOW it lessens the burden on the server. I apologize if this causes a problem, but I was attempting to be bold! Make sure that if it is a problem, you take a moment to slap me around a little! Brian Sayrs 01:22, 2004 Dec 5 (UTC)
No need to slap anyone around a little :) Good idea! It certainly looks less overwhelming this way. I'll keep this in mind when putting the next list on-line.
Wasn't that list a bit small in fact? I am doing the same fixes, sorted by article creation date, and the last 300 articles I fixed were not in your list (Zazou, Saramaka, Rubem Fonseca are the latest), despite having the exact same miscapitalised words. Any idea why they are missing? Sam Hocevar 08:58, 6 Dec 2004 (UTC)
I deliberately kept the list incomplete to make it look less daunting. I thought it would discourage potential contributors if they were faced with a list of a few thousand articles that contain mistakes. My intention was to split the list of problem articles into batches of about 1000 entries and try to fix those over a period of a few weeks. Btw, 300 articles? Great job! -- Sietse 09:15, 6 Dec 2004 (UTC)

I think you should make it clear when these should be capitalized and when not, because I believe that in many, possibly most, cases it is correct not to capitalize. [comment by anonymous user]

Which cases do you mean? As far as I know, adjectives that are derived from proper nouns should always be capitalized in English. I have learned it this way; the capitalization article and the grammar guide in my dictionary say this too. But please tell me if I'm wrong or if this is not correct in all variants of English. Sietse 17:12, 5 Dec 2004 (UTC)
"french fries" should be left uncapitalised, because the "french" there comes from the verb "to french". Also, "cousin-german", "brother-german". Sam Hocevar 17:54, 5 Dec 2004 (UTC)
Sam makes a great point...especially how often "french fries" shows up in the 'pedia. It makes you wonder a little! Brian Sayrs 18:50, 2004 Dec 5 (UTC)
Thanks guys, I forgot to filter for those words. Anyway, it seems like we'll have to include 'French fries' occurrences that are not at the beginning of a sentence in the next run :) Sietse 19:10, 5 Dec 2004 (UTC)
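As a rough illustration of the filtering discussed above, here is a minimal Python sketch. It is not the script used for the lists: the word list is only a sample (the fourteen adjectives actually checked are not given here), and all names are assumptions.

```python
import re

# Sample adjectives only; not the real list used for the project page.
ADJECTIVES = {"french", "english", "german", "italian", "peruvian"}
# Phrases mentioned in this discussion as correctly lowercase.
EXCEPTIONS = ("french fries", "cousin-german", "brother-german")

# Case-sensitive on purpose: only lowercase forms are candidates.
ADJECTIVE_RE = re.compile(r"\b(?:" + "|".join(sorted(ADJECTIVES)) + r")\b")
EXCEPTION_RE = re.compile("|".join(re.escape(p) for p in EXCEPTIONS))


def find_miscapitalized(text):
    """Return (offset, word) pairs for lowercase adjectives outside exceptions."""
    skip = [m.span() for m in EXCEPTION_RE.finditer(text)]
    hits = []
    for m in ADJECTIVE_RE.finditer(text):
        # Ignore a match that lies entirely inside an exception phrase.
        if any(start <= m.start() and m.end() <= end for start, end in skip):
            continue
        hits.append((m.start(), m.group()))
    return hits


sample = "He ate french fries with a french chef and an English guest."
print(find_miscapitalized(sample))  # [(27, 'french')]
```

Words that are correct in either case depending on meaning cannot be decided this way and would still need a human check.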


  • italian
  • peruvian
  • estonian
  • algerian
  • wikipedian ;)
Yes, I've only checked for fourteen such adjectives. The next run will contain words which I thought would be less common (italian, peruvian, estonian, algerian, ...) Sietse 10:32, 5 Dec 2004 (UTC)
And fourteen made quite a list in itself, didn't it!
Don't forget soviet... --Dryazan 22:49, 5 Dec 2004 (UTC)
Beware this one: when meaning pertaining to the Soviet Union, it's Soviet; however, when talking about the council, it's soviet. Sam Hocevar 11:01, 8 Dec 2004 (UTC)

I noticed a link in entheogen was changed from Aborigine to Aborigine. This seems pedantic to me ... is there really any good reason why we should care about the capitalization of text the reader will never see? (If there is, it should be Aborigine.) (For those who are wondering what I'm on about, edit this text to see the differences) Rkundalini 00:39, 16 Dec 2004 (UTC)

The idea is that it avoids going through the same false positives again and again, while being completely harmless whatsoever. Also, I dream of a day when Wikipedia article names are fully case sensitive :-) Sam Hocevar 01:47, 16 Dec 2004 (UTC)

Bug in pages with slashes?

In miscapitalized words, List of people by name was listed a number of times, yet none of the listed errors were present. I suspect the actual errors are on "List of people by name/something" but the page name got truncated at the slash. However, there are so many sub-pages to that particular page that I can't confirm this theory. --Doradus 00:01, Dec 7, 2004 (UTC)

Thanks for the report. I've looked into it, and it turns out that the miscapitalization-finding script actually does what it should do. The culprit is the program that adds wiki-syntax to the output of that script: it doesn't handle colons in titles properly. I'll fix it with the next run of the script. Sietse 11:23, 8 Dec 2004 (UTC)
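To illustrate the general idea, here is a minimal Python sketch of a formatting step that keeps the full title whatever punctuation it contains. Everything in it is an assumption: the tab-separated record format, the function name, and the sample sub-page title are hypothetical, and the real program may work quite differently.

```python
def to_wiki_line(record):
    """Turn one '<page title>\\t<flagged word>' record into a wiki list item."""
    # Split once, from the right, on the tab only, so slashes and colons
    # inside the title are never treated as separators.
    title, word = record.rsplit("\t", 1)
    # The leading colon makes a plain link even for category or file titles.
    return "* [[:" + title + "]]: " + word


# Hypothetical record for a sub-page title that contains a slash.
print(to_wiki_line("List of people by name/Aa\tfrench"))
```

Using a separator that cannot occur in a page title is what keeps the whole title together.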


Why not make this a proper WikiProject so that someone else can upload the lists? Brianjd | Why restrict HTML? | 03:11, 2005 Mar 30 (UTC)

I have stopped working on this project for at least a while. I'm not working on Wikipedia stuff, except for occasional minor fixes, until my master's thesis is finished. If anyone else wants to add lists to the page in the meantime, or move/copy the page, or convert it to a WikiProject, that's fine with me (of course). Sietse 12:47, 31 Mar 2005 (UTC)

List entries fixed from the 20170120 dump subsequently re-appearing on the 20170401 dump

The only anomaly I could find that may have caused this is that when Becky Sayles cleared the final list of entries on February 23, she had not in fact altered the entries themselves; many of them weren't checked and changed until the next day, February 24. I don't know if this played a part, but almost all the entries that were cleared from the list on Feb 23 but not changed in the articles until Feb 24 were present again in the 20170401 dump. As of September 21 they are still on the list, even though all of them were corrected on Feb 24. Why do these corrected entries keep appearing in the data dumps about seven months after being corrected? SpintendoTalk 19:08, 21 September 2017 (UTC)

  • I'm looking at the version of the page dated 06:52 30 January 2017 (1472 entries) vs. 03:45 23 April 2017 (1117 entries) and I'm not seeing the issue you describe. If an entry reappears on the list, it may be a false positive, and there is currently nothing in the database dumps that excludes those. Additionally, in the past I have been working a list only to find that someone has already done the entries without clearing them off the individual dump list. The dumps do not run automatically or monthly; I have to request them from Hazard-SJ. I'm not sure if those are archived or not, but you may want to ask them for further info. Thank you for your work on WP:FIX. Sct72 (talk) 20:23, 21 October 2017 (UTC)
    • Are there any examples of this situation? Also, I don't archive the results separately, so the best way to find previous dump results is probably the on-wiki history pages or an edit summary search. Hazard SJ 17:20, 22 October 2017 (UTC)

More Frequent Dump Updates

Hi everyone, I've just made a few changes that will hopefully update the dumps a lot more frequently (anticipated twice per month, since that's how frequently dumps are made) and without needing to be manually started by me. Hopefully everything works out with that, but let me know if you see any issues (e.g. it doesn't seem to be running). Hazard SJ 17:16, 22 October 2017 (UTC)

Related project to find misspellings

Watchers of this page may also be interested in Wikipedia:Typo Team/moss, a project to find misspellings and typos (including some that occur in dozens of articles). -sche (talk) 19:37, 30 September 2018 (UTC)