Wikipedia:Typo Team/moss

The moss project seeks to find and remove the furry green typos that have been growing on Wikipedia articles. It uses software written by User:Beland to automatically find misspellings, mistakes in English grammar, violations of the Wikipedia:Manual of Style, and confusing or broken wiki markup.

Dearth to tyops!

QUICK LINK TO THE BEST PAGE FOR NEW PARTICIPANTS

About misspellings

How the lists are made

The moss spell checker is run against a recent set of database dumps, which are generated on the 1st and 20th of every month (but take a few days to process). All the articles in the English Wikipedia are examined. The following are ignored:

  • Text inside references, templates, tables, quotation marks, sections like "External links" and "Works", and some other weird places.
  • Capitalized words (which are presumed to be correctly-spelled proper nouns)
  • Words that appear in titles in the English Wiktionary (which has definitions of all words in all languages, excluding proper nouns and systematic words like chemical names and large numbers)
  • Words that appear in titles in the English Wikipedia (which explains some things that don't appear in the dictionary)
  • Words that appear in titles in the Wikispecies (which has many technical words that don't appear in the dictionary or encyclopedia)
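
The filtering rules above can be sketched roughly as follows. This is a minimal illustration and not moss's actual code: the function name `candidate_typos` and the `known_titles` set (standing in for the combined Wiktionary, Wikipedia, and Wikispecies title lists) are assumptions, and text inside references, templates, tables, and quotation marks is assumed to have been stripped beforehand.

```python
import re

# Runs of letters, optionally joined by hyphens or apostrophes.
WORD = re.compile(r"[^\W\d_]+(?:[-'][^\W\d_]+)*")


def candidate_typos(text, known_titles):
    """Return words that are neither capitalized nor in known_titles.

    known_titles is a hypothetical set standing in for the titles found
    in the English Wiktionary, English Wikipedia, and Wikispecies.
    """
    candidates = []
    for word in WORD.findall(text):
        if word[0].isupper():
            continue  # presumed to be a correctly spelled proper noun
        if word in known_titles:
            continue  # appears as a title somewhere, so treated as known
        candidates.append(word)
    return candidates
```

For example, `candidate_typos("The quick brwon fox visited Paris", {"quick", "fox", "visited"})` reports only `brwon`, since "The" and "Paris" are skipped as capitalized words.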

Many mistakes are not (yet) caught:

  • Improper addition of 's (possessives are not added to Wiktionary, so these are excluded systematically)
  • Incorrect capitalization
  • Incorrect multi-word phrases
  • Wrong word used in context
  • Non-English language words not tagged with {{lang}} or where an English misspelling happens to be the same as a word in another language. (These are counted as correct spellings if they are in the English Wiktionary, which lists words in all languages – only the definitions are restricted to English.)
  • Other situations listed in #False negatives below

New statistics

From 2018-09-20 to 2019-03-01, the number of typos classified as T1 (edit distance 1 from an English word, the most likely to be actual misspellings) dropped by 35,488, or 32%, and this appears to be due to the hard work of editors participating in the moss project fixing typos on the T1 lists. Amazing progress! The numbers for categories we aren't fixing have remained relatively stable, though for all categories there is some bouncing around as new typos are created and fixed in the normal course of writing and editing articles.

While processing the 2019-03-01 dump, I made a major change to how typos are classified. (You can see the old method in the archived statistics.) I've dropped categories with an edit distance greater than 3 from an English word (T4 through T16), since these are quite unlikely to be misspellings. Most of the reported typos that are not likely English misspellings are either compound words or non-English words. (Some of the non-English words are also misspelled.) Some English compounds end up as TS, if they are caught by a conventional spell checker; the rest are now classified as ME. (There are various other categories for compounds, all starting with M, and these will all need to be refined later because a fair number of words end up there that don't belong.) In an effort to exclude as many non-English words as possible, I've started looking at non-English Wiktionaries; any words found there but not in the English Wiktionary are classified as W. Romanizations are not eligible for Wiktionary; words native to non-Latin writing systems are entered under those other systems. I've written some code that attempts to perform transliteration from any given writing system. It's starting to catch a few thousand words (classified as L) but is obviously missing a lot and so will need to be further refined. I've also added some categories for bad HTML tags and similar problems.

Since the classification changes make the new numbers incomparable with the old numbers, I've started a new table below. I've started posting some TS typos as well as T1s, so expect both of those numbers to improve significantly in the coming months. -- Beland (talk) 07:30, 23 March 2019 (UTC)

Reporting symbol | Explanation | Instances, 2019-03-01 dump (692642d) | Instances, 2019-03-20 dump (802b6c0)
TS | Missing whitespace or dash (or new compound) | 183795 | 182018 (-1777/.97%)
T1 | Edit distance 1 from common English word | 75941 | 73600 (-2341/3.1%)
T2 | Edit distance 2 from common English word | 72093 | 71615 (-478/.66%)
T3 | Edit distance 3 from common English word | 79609 | 78925 (-684/.86%)
R | Regular word (A-Z only) not near a common English word | 101178 | 100067 (-1111/1.1%)
I | Definitely not English (International) due to accents or mixed with punctuation (other than hyphen) | 93902 | 90875 (-3027/3.2%)
W | Not in English Wiktionary, in non-English Wiktionary | 82548 | 82519 (-29/.04%)
L | Probable Romanization (transLiteration) | 4294 | 4306 (+12/.28%)
ME | Probable coMpound, English (with and without dash) | 51279 | 51052 (-227/.44%)
MI | Probable coMpound, non-English (International) in English Wiktionary (both A-Z and non-ASCII characters, with and without dash) | 194949 | 192743 (-2206/1.1%)
MW | Probable coMpound, found in non-English Wiktionary | 51656 | 51240 (-416/.81%)
ML | Probable coMpound, transLiteration | 4010 | 3964 (-46/1.1%)
C | Chemistry words | 1853 | 1855 (+2/.11%)
D | DNA sequences (a, c, g, t) | 0 | 0 (-)
N | A-Z plus numbers and hyphens | 26620 | 25854 (-766/2.8%)
P | Patterns (e.g. rhyme schemes) | 47 | 50 (+3/6.4%)
H | HTML/XML/SGML tag | 3519 | 3459 (-60/1.7%)
HB | Known bad HTML tag, like <font> | 15366 | 14837 (-529/3.4%)
HL | Bad HTML-like linking, like <http://...> | 516 | 510 (-6/1.2%)
U | URL | - | 1284
Total | | 1043175 instances | 1030773 instances (-12402/1.2%)
Parse failure | Mismatched punctuation | 199130 articles | 200032 articles (+902/.45%)
  • red = Probably need to fix
  • yellow = Unsorted
  • blue = Probably OK (but may need to verify)
  • bold = actively working on fixing
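
The T1, T2, and T3 categories in the table above are defined by edit distance from a common English word. The sketch below shows one way such a classification could work, assuming plain Levenshtein distance (insertions, deletions, and substitutions each cost 1); the metric moss really uses is not stated here, and the `classify` helper and its `dictionary` argument are hypothetical simplifications that ignore all the other categories.

```python
def edit_distance(a, b):
    """Levenshtein distance, computed one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def classify(word, dictionary):
    """Return None for known words, else 'T1', 'T2', 'T3', or 'R'.

    Hypothetical simplification: dictionary is any iterable of common
    English words, and only the distance-based categories are modelled.
    """
    if word in dictionary:
        return None
    nearest = min((edit_distance(word, known) for known in dictionary),
                  default=None)
    if nearest is not None and nearest <= 3:
        return f"T{nearest}"
    return "R"
```

Comparing every candidate against every dictionary word is slow on a full dump; the loop is written this way only to keep the idea visible.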

Instructions for editors

Just like a regular spell checker, sometimes a word that's highlighted is really a misspelling and should be changed, but sometimes it is a correct spelling that needs to be added to the spell checker's dictionary (which in this case is the English Wiktionary and Wikispecies). For the below lists, here's how you can help:

  • For spelling mistakes: Click on the links to the individual Wikipedia articles, and edit them to correct the misspelling.
  • For non-English words (including words from Old English and Middle English, since they are pronounced differently): Edit the article and use the {{lang}} or {{transl}} templates to mark all non-English passages. Template contents are ignored, so they will not show up in the next report. If you can define the word, it would still be helpful to add the non-English word to the English Wiktionary or the same-language Wiktionary if you speak that language. As of the March 20, 2019 dump, only words not found in any Wiktionary are reported by moss as misspellings. (The "home" Wiktionary for Old and Middle English words is the modern English one.) NEW: If you don't know which language is being used, you can tag it with {{which lang}}. If you add a "reason=" parameter, that will change the pop-up tooltip text readers will see when they hover over "what language is this?". If you have a guess as to which language it might be, or any other question or comment, you can leave that here to help future editors. If you use this tag, you can delete the article from the moss listing; the article will be added to Category:Articles with unidentified words instead, and ignored by future runs of moss until the mystery is solved.
  • For incorrect spellings in direct quotes:
    • These shouldn't be picked up by the spell checker, as text in double quotes "" is ignored. The article probably has incorrect punctuation.
    • Regardless of punctuation problems, you can add {{sic}} around the word or phrase. See Wikipedia:Manual of Style#Quotations for guidance.
  • For correct spellings that belong in the dictionary: Click on the word to add it to the English Wiktionary. Remember the word might not be English (though the definition must be), and be sure to check capitalization!
  • For correct spellings already in the dictionary: Delete from the list or strike through; these have been added in the meantime since the database dump by other editors. They do not automatically turn red as internal Wikipedia links do.
  • For correct spellings not appropriate for Wiktionary:
  • Correct or incorrect, when finished delete or strike out the entry for the word from the lists on this page (or subpages), so work won't be duplicated. It is preferred to delete the entry for sections that rotate through specific letters, and strikethrough for sections where the whole thing gets updated (to prevent duplicating work done while the dumps were being processed, which can take more than a week).
  • If an article or section has generally bad grammar, and you don't have time to fix the whole thing, just add {{copyedit}} at the top of the article or {{copyedit|section}} at the top of the affected section. If it's just a sentence or two, {{copy edit inline}} or {{incomprehensible inline}} can go at the end of the problem passage.
  • If you see errors being reported from footnotes or bibliographies, check to make sure the section is titled with a standard name following MOS:APPENDIX conventions. Standard end-matter sections like "References" and "Further reading" and "Works" are ignored.
  • If it helps to leave a message on the article's talk page asking if the word is correct or incorrect, you can use Template:Typo help like this when editing the bottom of the talk page (leave the section header blank; it will automatically be added):
{{subst:typo help|PUT WORD HERE}} -- ~~~~
  • NEW: If you are uncertain whether a word is spelled correctly or not, you can add {{typo help inline}} immediately after it. If you add a "reason=" parameter, that will change the pop-up tooltip text readers will see when they hover over "check spelling". You can add a specific question or comment that may help identification. If you use this tag, you can delete the article from the moss listing; the article will be added to Category:Articles with unidentified words instead, and ignored by future runs of moss until the mystery is solved.

Don't worry if you miss something; it will reappear in a future report if there are still mistakes.

Suggested edit summaries

If you want to help publicize this project, you can copy-and-paste these into your edit summary, if appropriate.

For Wikipedia edits:

Fix misspelling found by [[Wikipedia:Typo Team/moss]] – you can help!
Tag non-English text found by [[Wikipedia:Typo Team/moss]] – you can help!
Tag correct text as {{not a typo}} for automated spell checkers (including [[Wikipedia:Typo Team/moss]])
Fix mismatched quote marks found by [[Wikipedia:Typo Team/moss]] – you can help!

For Wiktionary edits:

Add word identified by [[w:Wikipedia:Typo Team/moss]] – you can help!

Wiktionary cheat sheet

Need to add a word to Wiktionary? The Wiktionary cheat sheet has copy-and-paste templates that make it easy for the types of words commonly encountered here, even if you've never done it before.

Misspellings - lists of things to fix

Likely misspellings by article (main listing)

The most efficient list to work on if all you want to do is fix misspellings. All typos from a given article are shown, but only typos that are very close to known words are shown. The algorithm is not perfect, so some of these may still be words that need to be added to Wiktionary. A different part of the alphabet is posted on each run to avoid duplicate work, and because the whole list is too long to post all at once.

See subpages due to length:

Cases that require investigation are being moved to Category:Articles with unidentified words. -- Beland (talk) 21:33, 9 March 2019 (UTC)

Likely misspellings by frequency (n-z)

The best list to work on if you want to eliminate all instances of a specific typo. Only typos that are very close to known words are shown. The algorithm is not perfect, so some of these may still be words that need to be added to Wiktionary. For each run, only words from half of the alphabet are shown, to avoid duplicate work from when new dumps are being processed.

Legitimate misspellings are candidates for Wikipedia:Lists of common misspellings. If there is an obvious correction, adding that to Wikipedia:Lists of common misspellings/For machines will help editors who use automated tools to fix cases faster.

Likely new compounds by frequency (n-z)

The best list to work on if you want to add variations of known words to Wiktionary, mostly compound words. The algorithm is not perfect, so some of these might be common mistakes that need to be corrected. For each run, only words from half of the alphabet are shown, to avoid duplicate work from when new dumps are being processed.

Likely new words by frequency (n-z)

The best list to work on if you want to add completely new words to Wiktionary. The algorithm is not perfect, so some of these might be common mistakes that need to be corrected. For each run, only words from half of the alphabet are shown, to avoid duplicate work from when new dumps are being processed.

Some of the words might not be from English. To get these words off this list, you can either add an entry to the English Wiktionary (which provides English definitions for words in all languages) or tag all instances of the word on the English Wikipedia with {{lang}}. Wiktionary does not accept Romanizations for some languages, so those cases must be tagged as {{transl}} or {{lang}}.

Likely new words by frequency (non-English)

These are good candidates to add to the English Wiktionary (which provides English definitions for words in all languages), as it seems English Wikipedia readers will frequently encounter them. This is a special manually generated report.

From 2019-02-01 dump:

From 2019-02-01 dump, but clearly not foreign words (need to figure out what to do with them):

I am redirecting, defining or editing these. Graeme Bartlett (talk) 12:06, 9 April 2019 (UTC)

Cases with notes from 2018-09-20 dump:

Likely misspellings by frequency (a-m)

(Only cases with notes are listed; waiting for next dump to refresh.)

Likely new compounds by frequency (a-m)

(Only cases with manual notes currently listed; waiting for next dump.)

Manual notes from 2018-09-01 dump:

Notes from 2018-10-20 dump:

Should be all done, but search doesn't appear to be updating, so I can't verify. Darylgolden (talk) Ping when replying 06:20, 10 February 2019 (UTC)
Verified using insource: regex. -- Beland (talk) 07:02, 23 February 2019 (UTC)

Likely new words by frequency (a-m)

(Only cases with manual notes currently shown; waiting for next dump.)

For Wiktionary

This is a special section; putting a Wiktionary link here will cause a word to be ignored by the spell checker everywhere it appears (on the assumption that it will soon be added to Wiktionary).

I think it just needs an entry in Wiktionary, then. -- Beland (talk) 19:34, 20 September 2018 (UTC)

Needs Wikipedia article instead?

Archived notes

See Wikipedia:Typo Team/moss/Archive.

Articles with the most possibly misspelled words

These are likely to be lists using non-English-language or technical words.

  • For articles that are just lists of species names, please link to the article from Wikispecies:Wikispecies:Requested articles#From_Wikipedia and delete the entry here. Those are now automatically suppressed.
  • For non-English-language words, add {{lang}} around the foreign passages and delete the row. Articles that don't do this often have formatting of non-English words that is inconsistent either internally or with the Manual of Style, so this is an easy way to fix that at the same time as helping the spell checker and screen readers do the right things.

100+ words

60-99 words

Probable mineral misspellings

(manually identified from old dump, need an expert)

Possible typos by length

Longest or shortest in certain categories are shown, sometimes just for fun and sometimes because they form a useful group. Please use strikethrough (or leave a note) for this section rather than removing lines, to avoid repeating work done while the dumps were being processed. Thanks!

Likely chemistry words

Missing articles on single characters

(All done; waiting for next dump!)

Every character should either have a Wikipedia article, redirect to a Wikipedia article, or Wiktionary entry.


Probable DNA sequences

If you're sure this is a DNA or RNA sequence, tag it {{DNA sequence}}.
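
As a rough illustration of how a word might be flagged as a probable DNA sequence, the sketch below accepts only strings made entirely of the letters a, c, g, and t. The function name and the `min_length` cutoff are assumptions added here so that short ordinary words such as "cat" or "tag" are not flagged; moss's own rule may differ.

```python
import re

# Lowercase only, since capitalized words are ignored by the spell checker.
DNA_LETTERS = re.compile(r"[acgt]+")


def looks_like_dna(word, min_length=8):
    """Return True if word is at least min_length letters of a, c, g, t.

    min_length is a hypothetical threshold, not a documented moss setting.
    """
    return len(word) >= min_length and DNA_LETTERS.fullmatch(word) is not None
```

A match is only a hint; the passage still needs a human to confirm it before it is tagged {{DNA sequence}}.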

(All fixed from 2019-01-20; waiting for next dump!)

Repeating patterns - easy fixes

For rhyme schemes, they probably need to be re-styled to follow Wikipedia:WikiProject Poetry#Style for rhyme schemes. If this ends up making them all-caps, they won't show up here on the next run. For mixed-case rhyme scheme notations, use {{not a typo}} after making sure dashes, commas, and spaces follow the recommended style.
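
One plausible way to spot lowercase rhyme-scheme notations such as "abab" or "abba" is to check that letters are introduced in alphabetical order: the first letter must be "a", and any letter not seen before must be the next unused one. This is a sketch under that assumption, not the detection rule moss uses, and `min_length` is a hypothetical cutoff.

```python
def looks_like_rhyme_scheme(word, min_length=4):
    """Return True if word introduces letters in order a, b, c, ...

    Accepts "abab", "aabb", "abcb"; rejects "baby" (starts with b)
    and "abbe" (skips from b to e).
    """
    if len(word) < min_length or not word.isalpha() or not word.islower():
        return False
    next_new = ord("a")  # the only letter allowed to appear for the first time
    for char in word:
        code = ord(char)
        if code == next_new:
            next_new += 1      # a new letter, introduced in order
        elif code > next_new:
            return False       # a letter was skipped
        # otherwise: a repeat of an earlier letter, which is fine
    return True
```

Real words can pass this test too ("abaca" is one), so matches still need a manual look before they are restyled or wrapped in {{not a typo}}.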

Alphabetical

Poem rhyme schemes just need to be capitalized as explained in the article Rhyme scheme. All the bird noises can be put inside {{not a typo}}. -- Beland (talk) 05:42, 21 October 2018 (UTC)

From 2019-03-01 dump:

Notes for editors

To be tagged

Patterny words possibly for Wiktionary

  • 33 hexipentisteriruncicantitruncated - a nest of specialized geometrical form names; what to do? → since this is a part of several compound names, it may need a set index or disambig page. If it has use in books, it could go in Wiktionary, but Wikipedia seems to be the source of these geometric terms.

For Beland todo

  • Ignore lines beginning with spaces, or do these need actual markup tags if they are code or something?
  • Rhyme scheme hunting:
    • Sync style for articles in Category:Stanzaic form and Category:Rhyme and add to rhyme scheme list if appropriate.
    • Sync annotation style for articles that mark up poems line-by-line (use tables, not column divs or parens)
    • Manually search for patterns like:
      • a-b-a-b-a-b-c-c
      • AB,CD,AB (internal rhyme)
      • "aa", "ab", "aaa", "aab", "aba", "abb", "abc", "aaaa", "aaba", "aabb", "aabc", "abaa", "abab", "abba", "abca", "abcb", "abcc", "abcd" - probable rhyme sequences where there's an article present so it's not detected as a misspelling
  • Hmm, looks like the chemistry word detector could use some enhancement. -- Beland (talk) 16:27, 15 August 2018 (UTC)

False positives

Is there a word that is correctly used in an article, but which shouldn't be added to Wiktionary? List it here, and Beland will fix the problem.

Archived solutions: Wikipedia:Typo Team/moss/Archive

False negatives

Is there a misspelled word in an article mentioned here that was not reported? Feel free to list it below and Beland will try to improve the code if appropriate.

These are currently over-ignored, but could be used to suggest correct spellings:

  • Wikipedia articles with {{R from misspelling}}, {{R from incorrect name}}, {{R from miscapitalisation}}, and redirects to these templates
  • Wiktionary entries that are known misspellings (e.g. wikt:anticiliary)
  • In cases where there are variant spellings of the same word or phrase, Wikipedia should probably pick one and stick to it except to mention the variants. This happens with:
    • Compound words - whether to use a space, dash, or nothing, as in "junebug" vs. "june bug" or "email" vs. "e-mail".
    • Words with multiple transliterations from another language (often there are multiple systems, no particular system, or a modern system different from historical systems).
    • Redirects with {{R from alternate spelling}} and redirects to that template.
  • Article Ana Recio Harvey | detected misspelling: appoinment | additional, undetected misspelling: enterpreneur
    • Looks like this was because of redirects with "enterpreneur" in the title. I have tagged them all {{R from misspelling}}, but I'll have to change the code to ignore those, as noted above. Thanks for catching that! -- Beland (talk) 23:52, 18 October 2018 (UTC)

  • Kenya, as of April 2019: "1963.Their". Probably the NLTK tokenizer found the word boundary despite the missing whitespace? -- Beland (talk) 03:52, 14 April 2019 (UTC)
  • Missing spaces after commas. Probably the NLTK tokenizer is finding the word boundary anyway? -- Beland (talk) 04:02, 14 April 2019 (UTC)
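
Until the tokenizer behaviour is confirmed, cases like "1963.Their" and missing spaces after commas could be hunted for separately with a regular expression. The sketch below is one hedged possibility and is independent of NLTK: it flags a period followed directly by a capitalized word, or a comma followed directly by a letter, when the preceding character is a lowercase letter or digit. The names `MISSING_SPACE` and `find_missing_spaces` are illustrative.

```python
import re

# A period directly before a capitalized word, or a comma directly before a
# letter, where the character before the punctuation is a lowercase letter
# or a digit. Abbreviations such as "U.S." or "e.g." and numbers such as
# "1,000" do not match.
MISSING_SPACE = re.compile(r"(?<=[a-z0-9])(?:\.(?=[A-Z][a-z])|,(?=[A-Za-z]))")


def find_missing_spaces(text, context=10):
    """Return short snippets of text around each suspected missing space."""
    snippets = []
    for match in MISSING_SPACE.finditer(text):
        start = max(0, match.start() - context)
        snippets.append(text[start:match.end() + context])
    return snippets
```

The pattern is deliberately conservative and will miss some cases (for example, a period followed by a lowercase word), so it complements rather than replaces the spell checker.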

Mismatched markup and punctuation

Errors in punctuation (mostly quotation marks) and wiki markup generally cause confusion for readers, and also prevent the spell checker from running on these articles.

Inches and feet should not use " and ', per Wikipedia:Manual of Style/Dates and numbers#Specific units; use letters instead. (See MOS:UNITS for general guidance.) Where conversions are needed, use {{convert}}, for example: 2 feet 3 inches (69 cm)
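
A simple way to picture the "Unmatched" reports is a per-paragraph parity check: a paragraph with an odd number of straight double quotes has at least one quote without a partner, which is also why an inch mark like 7" trips the check. The sketch below works under that assumption; the function name, the per-paragraph granularity, and the `width` of the snippet are all illustrative and not taken from moss.

```python
def unmatched_quote_snippets(paragraphs, width=20):
    """Return a snippet around the last quote of each odd-quoted paragraph."""
    reports = []
    for paragraph in paragraphs:
        if paragraph.count('"') % 2 == 1:
            position = paragraph.rfind('"')
            start = max(0, position - width)
            reports.append(paragraph[start:position + width + 1])
    return reports
```

The check cannot tell which quote is the stray one, so the snippet is only a starting point for reading the passage.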

0 (2019-02-01 dump)

  • 005.1999.06 - Unmatched " near: f Munhwa Ilbo said, "Uhm now has an indis ... her songs, , , and ." However, Park Eun-j
  • 007 (Will Pan album) - Unmatched " near: tible edition) 46'34"
  • 009-1 - Unmatched " near: nte described it as "James Bond with wome
  • 00 - Unmatched " near: ual to 8.452 mm (.33") in diameter
  • 01011001 - Unmatched " near: entury scientist on "E=mc 2 " ... entury scientist on "E=mc 2 " ... X) – guitar solo on "E=mc 2 "
  • 01 Gallery - Unmatched " near: y Warhol's Factory. "Its eccentric mix of ... were more than pop."
  • 0 + 2 = 1 ½ - Unmatched " near: acles, two on the 7" EP on Allied Record
  • 0 + 2 = 1 - Unmatched " near: were issued as a 7" on Allied Recording
  • 02 (Urban Zakapa album) - Unmatched " near: ad single is . and "All The Same(Hangul:
  • 0304 - Unmatched " near: ting that she looks "desperate and on th
  • '03 Bonnie & Clyde - Unmatched " near: ver the sampling of "Me and My Girlfriend ... (Explicit) – 3:26 # "U Don't Know (Remix) ... adio Edit) – 3:28 # "U Don't Know (Remix)
  • 05 Fuck Em - Unmatched " near: the sexual lyrics," which he thought th ... – 3:34 # – 4:23 # "Strip You – 3:06 # ... – 6:15 # "Liii – 3:08 # "Twurk ... – 2:54 # – 4:01 # "Prayin 4 A Brick – 4 ... :24 # "Free Da Wurld – 4:38 ... # "I Am The Rawest Rapp ... er – 2:46 # "Rip Kennedy – 4:23 # ... "Smack – 2:23 # "Licks and Ducktape – ... 2:44 # "Strong Arm – 3:51 # ... "Love B – 3:06 # "Blow – 3:07 # "Lanlo ... rd – 4:01 # "Switch Lanes – 3:24 ... # "Rock Up 4sho – 3:21 ... # "New York Anthem – 3: ... 44 # "People Like Me – 4:1 ... 4 # "Rob The Jewler – 4:2 ... 6 # "Gutta Work – 3:23 # ... "Free Bandz – 3:09 # ... "Insurance – 2:32 # "BGYCFMB – 3:20 # "Li ... ck A Shot – 3:36 # "Bitch KT – 3:31 # – ... 4:05 # "10k Summa – 2:38 # " ... In Florida – 3:16 # "Fmbn – 3:30 # "Im th ... e Rap God – 3:35 # "Ruff Ryder – 1:33 # ... "Stright Up – 3:30 # ... "Amis Scur – 1:06 # "Mount Up – 1:57 # "A ... ct Right – 3:47 # "Built To Survive – 3 ... :47 # "Dear Mama – 2:51 # ... "I'm Gunna Be a Docto ... r – 2:59 # "Bloggers Anthem – 2: ... 55 # "Gutta Goin' Platinum ... – 5:01 # "Beat The Ho Up – 3:0 ... 0 # "Stop Selling Dope – ... 3:02 # "Motivation (Remix) – ... 2:41 # "Painful Intermission ... – 3:43 # "1 Mo – 2:57 # – 4:1 ... # ft. Cashout Clete" – 3:41 # – 3:02 ... he Adopted Tabby Cat" – 4:28 # – 4:32
  • 0-8-2 - Unmatched " near: he oldest working 15" gauge locomotive in
  • 0-8-4 - Unmatched " near: d the episode to be "smoother [than ], al ... hough more formulaic". He criticized the
  • 0898 Beautiful South - Unmatched " near: er Press, the album "features even more u ... just about anyone.)" Marie Lamie, writin ... of Sputnikmusic as "a distant relative o ... beat than that song," refers to the ser ... . Kessler said that "Hemingway and Corrig ... an action replay of " on the song, but T ... and is a song of "blessed with a groov ... rement. , he spits. " ... n the United States." They said it was du ... the album , saying "even more obscure st ... y could be a threat." ... view Play said that "the whole album is f ... ue a revisit or two." ... from the 12" single and CDEP ... from the 12" single and CDEP
  • 0.999... - Unmatched " near: lity, he found that "students continued t
  • 0 to 100 / The Catch Up - Unmatched " near: of the saying that "months after releasi ... ted by boasts like ." The magazine also l

After Z (2019-03-01 dump)

  • Μ-recursive function - Unmatched " near: nction: Kleene uses " C q n (x) = q " and ... se the abbreviation " const n ( x) = n ": ... Kleene (1952) uses " U i n " to indicate ... he m th of function "f m ", whereas the s ... o the n th variable "x n ": ... ene uses the symbol " R n (base step, ind ... uction step) " where n indicates t ... ariables, B-B-J use " Pr(base step, induc ... tion step)(x)". Given:
  • Π-calculus - Unmatched " near: ngs – for instance, "✂ is bound to term "

cquote

MOS:BLOCKQUOTE says {{cquote}} should be replaced by {{quote}} in articles.

Find all current instances.

HTML tags

Updated from 2019-03-01 dump.

You can do one of two things for these articles:

  • Remove, repair, or convert the HTML markup to wiki markup yourself.
  • Tag the article {{cleanup HTML}} and it will show up under Category:Articles with HTML markup but not on this list. Include an HTML comment indicating which tags are present on the page; many editors find it hard to locate the offending HTML. For example:

How to clean up

See Category:Articles with HTML markup for instructions on how to find the offending tags and what to do about them.

Find all articles by tag

Can't wait for the next database dump? Want to look for or fix all instances of a specific tag? Use the links below!

Known bad HTML tags

NOTE: Waiting for Apr 20 dump to update, but feel free to delete fixed items until then. (Report on the tags listed in the above section.)

Bad link formatting

Angle brackets are not used for external links (per Wikipedia:Manual of Style/Computing § Exposed_URLs); "tags" like <https> and <www> are actually bad formatting. See Wikipedia:External links#How to link for external link syntax; use {{cite web}} for footnotes.
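
To find this kind of pseudo-tag in a page's wikitext, a regular expression along the following lines may help. It is a sketch only: the names are illustrative, and the pattern assumes the link starts with http://, https://, or www. and contains no spaces.

```python
import re

# An address wrapped in angle brackets, e.g. <http://example.org/page>.
ANGLE_LINK = re.compile(r"<((?:https?://|www\.)[^<>\s]+)>")


def find_angle_bracket_links(wikitext):
    """Return the addresses that appear inside angle brackets."""
    return ANGLE_LINK.findall(wikitext)
```

Each address found this way still needs to be rewritten by hand using normal external link syntax or {{cite web}}, depending on how it is used in the article.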

Not HTML

Sometimes editors use angle brackets (< and >) for other purposes. Though these are not HTML markup, they often need to be fixed.

<<...>> (find all) can indicate:

  • French quotation marks rendered as <<quoted text>>. These should be normalized to "quoted text" or 'quoted text', even in quotations, per MOS:CONFORM.
  • A broken citation that should be converted to {{cite web}}
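
For the first case, the normalization can be sketched as a single substitution. This is an illustrative helper rather than an existing tool, and it should only be applied after checking that the <<...>> really is a quotation and not a broken citation.

```python
import re

# Text wrapped in doubled angle brackets, with optional inner padding.
DOUBLE_ANGLE = re.compile(r"<<\s*(.+?)\s*>>")


def normalize_double_angle_quotes(text):
    """Replace <<quoted text>> with "quoted text"."""
    return DOUBLE_ANGLE.sub(r'"\1"', text)
```

The pattern does not span line breaks, so a quotation split across lines is left untouched for manual review.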

Other weirdness:

  • <the> - find all - More French quoting style, bad linking, bad citation style, etc.

TBD

Unsorted

Notification of new dumps

"Most likely misspellings by articles" should always have work to do (if not, ping Beland to add more from the current dump). Some of the other sections are occasionally waiting for a new dump to get a useful list, either because they are ranked by frequency or a code change has been made to clean up noise in the next run. New runs are generally posted twice a month. The database snapshot from the first day of the month generally takes about 9-13 days to process, and the snapshot from the twentieth day of the month might take 4-6 days until it can be posted.

All that said, if you want to get a ping when results from a new dump are posted, you can add your name to the list below. If you are only interested in a particular section, include a note to that effect.

moss source code

moss is written in Python, and is available on GitHub at: https://github.com/cdbeland/moss

Data is obtained from XML database backup dumps.
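
For readers curious what reading such a dump looks like, here is a minimal sketch that streams (title, wikitext) pairs out of a MediaWiki XML export using only the Python standard library. It is not taken from the moss repository; the function names are illustrative, and it assumes the usual export layout of page elements containing a title and a revision with a text element.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace prefix, e.g. '{http://...}page' -> 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_articles(source):
    """Yield (title, wikitext) pairs from a filename or binary file object."""
    title, text = None, None
    for _event, element in ET.iterparse(source, events=("end",)):
        name = local_name(element.tag)
        if name == "title":
            title = element.text
        elif name == "text":
            text = element.text or ""
        elif name == "page":
            yield title, text
            title, text = None, None
            element.clear()  # free the finished page to keep memory use low
```

Because `iterparse` reads incrementally, the same function can be given a file object opened with `bz2.open(path, "rb")` so that a compressed dump does not have to be unpacked first.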