Wikipedia:Typo Team/moss

The moss project seeks to find and remove the furry green typos that have been growing on Wikipedia articles. This is an experiment by User:Beland to automatically find misspellings, mistakes in English grammar, and violations of the Wikipedia:Manual of Style.

Dearth to typos!

Misspellings

How the lists are made

Algorithm, using recent database dumps of each project:

  • Look at all words in all articles in the English Wikipedia, ignoring text inside references, templates, tables, quotation marks, and some other weird places.
  • Eliminate all words that appear in titles in the English Wikipedia.
  • Eliminate all words that appear in titles in the English Wiktionary.
  • Eliminate all words that appear in titles in Wikispecies.
  • For remaining words, count how many articles they have appeared in, and rank accordingly
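The steps above can be sketched roughly as follows. This is only an illustration, not the actual moss code: the function name `rank_unknown_words` and its inputs are hypothetical, the article text is assumed to have already had references, templates, tables, and quotations stripped, and matching is assumed to be case-insensitive (which would be consistent with capitalization mistakes not being caught).

```python
import re
from collections import Counter

# Only A-Z letters count as word characters, per the limitations listed below.
WORD_RE = re.compile(r"[A-Za-z]+")


def rank_unknown_words(articles, known_titles):
    """Rank words that appear in no known title by number of articles.

    articles: dict mapping article title -> already-cleaned article text
              (hypothetical input; cleaning is assumed to happen earlier).
    known_titles: iterable of page titles from Wikipedia, Wiktionary,
                  and Wikispecies combined.
    """
    # Every word that occurs in any title is treated as correctly spelled.
    known = set()
    for title in known_titles:
        known.update(word.lower() for word in WORD_RE.findall(title))

    # Count each unknown word once per article it appears in.
    article_counts = Counter()
    for text in articles.values():
        unknown = {w.lower() for w in WORD_RE.findall(text)} - known
        article_counts.update(unknown)

    # Most widespread unknown words first.
    return article_counts.most_common()
```

For example, given three tiny articles and two titles, a word missing from every title that appears in two articles would rank above one that appears in a single article.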

Many mistakes are not (yet) caught:

  • Misspellings where there is a redirect tagged {{R from misspelling}}
  • Improper addition of 's (possessives are not added to Wiktionary, so these are excluded systematically)
  • Incorrect capitalization
  • Incorrect multi-word phrases
  • Wrong word used in context
  • Mistakes involving any characters that are not A-Z
  • Foreign language words not tagged with {{lang}} or where an English misspelling happens to be the same as a word in another language. (These are counted as correct spellings if they are in the English Wiktionary, which lists words in all languages – only the definitions are restricted to English.)

Statistics

Misspellings   2018-04-01 dump   2018-07-01 dump   2018-07-20 dump   2018-08-01 dump
per article    moss 4933ad4      moss 4933ad4      moss 5e6b2ce      moss 0f7ddbf
0              4839889           4910541           4948698           4956727
1              319509            319315            315926            312871
2              104405            104591            90630             89861
3              40270             40099             38430             37891
4              22793             22739             21069             20900
5              13355             13331             12561             12392
6              9398              9422              8700              8620
7              6599              6614              6150              6095
8              5314              5312              4854              4832
9              3992              3985              3723              3643
10-19          16753             16879             15508             15437
20-29          4997              4992              4597              4594
30-39          2169              2211              1976              1962
40-49          1177              1205              1061              1061
50-59          674               695               619               618
60-69          453               476               420               419
70-79          299               326               243               241
80-89          213               218               179               181
90-99          140               153               131               126
100-199        456               521               434               435
200-299        90                113               93                95
300-399        44                45                41                42
400-499        19                26                21                22
500-599        12                13                13                13
600-699        8                 9                 9                 8
700-799        2                 3                 3                 5
800-899        2                 3                 3                 3
900-999        6                 7                 6                 6
1000-1999      25                27                27                26
2000-2999      3                 5                 5                 5
4000-4999      0                 2                 2                 2
Parse failed   193671            194777            191813            195147
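
The row labels in these statistics group articles into buckets of increasing width. A minimal sketch of how such labels could be produced is below; the names `bucket_label` and `histogram` are hypothetical, and the bucket widths are simply read off the rows shown here rather than taken from the moss source.

```python
from collections import Counter


def bucket_label(n):
    """Map a per-article misspelling count to a row label like those above."""
    if n < 10:
        # Counts 0-9 each get their own row.
        return str(n)
    # Tens below 100, hundreds below 1000, thousands after that.
    width = 10 if n < 100 else 100 if n < 1000 else 1000
    low = (n // width) * width
    return f"{low}-{low + width - 1}"


def histogram(counts_per_article):
    """Count how many articles fall into each bucket."""
    return Counter(bucket_label(n) for n in counts_per_article)
```

Articles that could not be parsed would need to be tallied separately, since they have no count to bucket.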

Instructions for editors

Just like a regular spell checker, sometimes a word that's highlighted is really a misspelling and should be changed, but sometimes it is a correct spelling that needs to be added to the spell checker's dictionary (which in this case is the English Wiktionary and Wikispecies). For the lists below, here's how you can help:

  • For spelling mistakes: Click on the links to the individual Wikipedia articles, and edit them to correct the misspelling.
  • For incorrect spellings in direct quotes: Add {{sic}} around the word or phrase. See Wikipedia:Manual of Style#Quotations for guidance.
  • For correct spellings that belong in the dictionary: Click on the word to add it to the English Wiktionary. You can add species names to Wikispecies instead, but there's no handy link to bring you to the right place. Remember the word might not be English (though the definition must be), and be sure to check capitalization!
  • For correct spellings already in the dictionary: Delete from the list or strike through; other editors have added these since the database dump was made. They do not automatically turn red as internal Wikipedia links do.
  • For DNA sequences (not appropriate for Wiktionary), add {{DNA sequence}} around it.
  • For correct spellings not appropriate for Wiktionary, add {{proper name}} around it if it's a proper noun; otherwise add {{not a typo}} around it (for example, nonsense series of letters used as examples in puzzles or computer code).
  • For foreign (non-English) language words: Edit the article and use the {{lang}} template to mark all foreign-language passages. Template contents are ignored, so they will not show up in the next report. It would still be helpful to add the foreign-language word to the English Wiktionary if you can give an English definition, but this is not required since the word is probably already in its native-language Wiktionary. If you don't know which language is being used, either skip it or try translate.google.com or Wikipedia:Language recognition chart.
  • Correct or incorrect, when finished delete or strike out the entry for the word from the lists on this page, so work won't be duplicated. It is preferred to delete the entry for sections that rotate through specific letters, and strikethrough for sections where the whole thing gets updated (to prevent duplicating work done while the dumps were being processed, which can take more than a week).
  • If an article or section has generally bad grammar, and you don't have time to fix the whole thing, just add {{copyedit}} at the top of the article or {{copyedit|section}} at the top of the affected section.
  • If you see errors being reported from footnotes or bibliographies, check to make sure the section is titled with a standard name following MOS:APPENDIX conventions. Standard end-matter sections like "References" and "Further reading" are ignored.

Don't worry if you miss something; it will reappear in a future report if there are still mistakes.

Suggested edit summaries

If you want to help publicize this project, you can copy-and-paste these into your edit summary, if appropriate.

For Wikipedia edits:

Fix misspelling found by [[Wikipedia:Typo Team/moss]] – you can help!
Tag foreign-language text found by [[Wikipedia:Typo Team/moss]] – you can help!
Tag correct text as {{not a typo}} for automated spell checkers (including [[Wikipedia:Typo Team/moss]])

For Wiktionary edits:

Add word identified by [[w:Wikipedia:Typo Team/moss]] – you can help!

Wiktionary cheat sheet

Need to add a word to Wiktionary? The Wiktionary cheat sheet has copy-and-paste templates that make it easy for the types of words commonly encountered here, even if you've never done it before.

Articles with a single possible misspelling

These are most likely to be typos, and are the easiest to clean up.

Useless but maybe fun trivia: Because the whole list would be too long to post all at once, this is a whimsical sampling from the alphabetical listing of articles. First letter frequency varies a lot, so some letters can be posted in their entirety, but others have just a small portion. Letters recently worked on will disappear for one dump cycle to allow the dumps to catch up with recent fixes.

Easy fixes from 2018-07-20 dump

BB-BE

all done

BG-BO
BP-BZ
X
À-Å
Él
Ém-Í
Ö-Ü
Ć-İ
Ł-Š

2018-08-01 dump

2006 (space)
2006–
3-3,
3-30
31-35
36-3C
3D-3d
3r-4,
4-40
41-45
46-4m
4t-4°
4′-55
56-5α
Aa-Ab
Aba
Abba
Abbe-Abbs
Abby-Abda
Ba-Bab
Baba-Babc
Babe-Babr
Babs-Babu
Q (space)-Q-
Q.-Qad
Qaf-Qas
Qat-Qh
Qi-Quab
Quac
Quad-Quak
Qual-Quam
Quan
Quar-Quas
Quat-Queeg
Queen
Queer-Quen
Quer
Ques-Quic
Quid-Quinc
Quind-Quino
Quint
Quiny-Qun
Quo-Qz
Zd-Zeb
Zec-Zeh
Zei-Zek
Zel-Zem
Zen
Zeo-Zep
Zer-Zes
Zet
Zeu-Zg
Zha-Zhi
Zho-Zid
Zie-Zih
Zij-Zil
Zim
Zin-Zio
Zip-Ziz
Ziç-Zj
Zl-Zoe
Zof-Zok
Zol-Zom
Zon
Zoo-Zop
Zor-Zot
Zou-Zug
Zui-Zun
Zuo-Zw

Notes from 2018-07-20 dump

Notes from 2018-07-01 dump

  • 1 - Aansoo Ban Gaye Phool - wikt:cellspacing. → false alarm on table markup
    • OK, any article that still has "cellspacing", "colspan" or "rowspan" leaking out of table markup (normally tables are just suppressed so we can spell-check the remaining text) will be shunted to the "parse failure" pile, starting with the next run (August 1 or later). -- Beland (talk) 02:15, 27 July 2018 (UTC)
  • 1 - Aardwolf - wikt:bettonianus → species name - should have article written
  • 1 - Aasu, American Samoa - wikt:oloaufou → false positive after apostrophe - Polynesian names often contain an apostrophe
    • Well, I created A’oloaufou as a redirect. Based on what is documented at ʻokina I think this is supposed to be an ’eta because this is Tahitian, but it doesn't have a separate Unicode point from ’. I definitely don't want to parse quote marks as letters but maybe I can combine things on either side for spell-check dictionary lookup purposes. -- Beland (talk) 00:54, 26 July 2018 (UTC)
  • 1 - Yacambú National Park - wikt:caramerudo - this is the common name used in Venezuela for Odocoileus virginianus deer (see here) - I'm not sure what to do with it. DferDaisy (talk) 15:55, 4 August 2018 (UTC)
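
The idea floated in the A’oloaufou note above, combining the text on either side of an apostrophe for dictionary lookup, might look something like this. It is a sketch under assumptions, not what moss does: `unknown_parts` is a hypothetical helper, the dictionary is modelled as a plain set of titles, and the set of apostrophe-like characters is a guess.

```python
import re

# Apostrophe-like characters (an assumed set): ASCII apostrophe,
# right single quotation mark (U+2019), and the okina letter (U+02BB).
APOSTROPHE_RE = re.compile("['\u2019\u02bb]")


def unknown_parts(token, dictionary):
    """Return the pieces of a token that the dictionary does not know.

    The whole token, apostrophe included, is looked up first, so a name
    like A’oloaufou passes if it exists as a title or redirect. Only when
    that fails are the pieces on either side checked individually.
    """
    if token in dictionary:
        return []
    parts = [part for part in APOSTROPHE_RE.split(token) if part]
    return [part for part in parts if part not in dictionary]
```

With the redirect in place, the whole-token lookup succeeds and "oloaufou" is never reported on its own, while quote marks still never count as letters during ordinary word splitting.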

Archived notes

See Wikipedia:Typo Team/moss/Archive.

Possible misspellings appearing most frequently

These are the most likely to be missing from the dictionary (or bugs in the algorithm). Some may be common misspellings.

Excellent progress being made here; we used to have misspellings with 1000+ occurrences! -- Beland (talk) 00:49, 19 July 2018 (UTC)

100–200 occurrences

90-99 occurrences

80-89 occurrences

70-79 occurrences

60-69 occurrences

50-75 occurrences

45-49 occurrences

40-44 occurrences

300-2200, non-English

These are a special bonus for adventurous Wiktionary folk who can add translations for words from other languages to the English Wiktionary. For some reason, English Wikipedia readers will see them a lot, so they seem like the most useful to have translated. (These have not been checked against non-English Wikipedias.)

Articles with the most possibly misspelled words

These are likely to be lists using non-English-language or technical words.

  • For articles that are just lists of species names, please link to the article from Wikispecies:Wikispecies:Requested articles#From_Wikipedia and delete the entry here. Those are now automatically suppressed.
  • For non-English-language words, add {{lang}} around the foreign passages and delete the row. Articles that don't do this often have formatting of non-English words that is inconsistent either internally or with the Manual of Style, so this is an easy way to fix that at the same time as helping the spell checker and screen readers do the right things.

200-299 words

  • 280 - Iban people - Fixed with lots of {{lang}}. -- Beland (talk) 13:58, 10 August 2018 (UTC)
  • 228 - Rama language - psaarik tkua nkiikna mlingu alkwsi salpka nkiikna kiikna ikwsu mlingu malngu aakar aakar aakit tkii itraali tausung tausung saiming puksak ngarak tamaaski kaulingdut tausung kiiknadut kumaalut tiiskamalut salpka yausa nsusuluk tiiskama tairung kaulingdut nsulaing aakar apaakut airung aakar anut tabiu nsul nitangu suulikaas niaukut mtaaku ansiiku tkii itraali alkwsi mtaaku tkii nangalbiu kiikna kwisu psutki ngurii psutki aakar kruubu suulikaas baalpi nsiiki nsiiku nsiikut nsiikbang nsiikka nsiikkata aakari aakar kiikna kwisu atkul kiikna kwisatkulu atkul yakaangatkulu akaang ansungatkulu atkar itkr aakar baakar baakiri aakar nitanangbang alkwisbang alkwis ibatingi batingi baakar baakiri alngu aakar aakari uing uting taakkama apaakut alkwsi ngalbi yuansiiku yutaaku nsukuaakari suulikaas yunsuaukkama kiikna paalpa baanalpi traali baanalpi traali suuli aakwaala suuk mlingkama aakar aakar mliika aakar ipang aakar ngarak ipang aakar kuaakar kwaakar puksak kuaakar nsukuaakari kwiik ngarak ikuaakari tkari baakiri tiiskama almlingi kiikna kwisu alkwsi tausung saiming kuaakar tkii itraali naayarnguli ansiiku ngarak ipang aakar alkwis yaabra aapunu tamaaski airung aakar ngulkang malngi twiis apaakut ngabang yuisiiku yuisiikka yupsi maktungu psutki tiiskama yuitaaki suulikaas niaukut nipiabang suulikaas baalpi nipayakama anaakar paalpa analkuka ipang aakar alkwsi kuaakar apaakut yutaaku ngarangki aakar aakar ngulkang aakar aakar nsulaing alkwsi mtaaku kaulingdut ipang aakar alkwsi tausung baanalpiu aakar tairung aakar tausung saiming kuaakar puksak kuaakar aakwaala anaungi nipaayau nitangu almaling aark alaark alauk alkwis baalp uung suulikaas suuli upulis onomatopoeics tahtah ngaukngauk tkwustkwus nuknuknga ngarngaringba siksiknga kingkingma tiskitiski

150-199 words

  • 176 - Periyanayaki Shrine, Thiruvithancode - Needs translation from Tamil. -- Beland (talk) 01:07, 23 July 2018 (UTC)
  • 154 - Oromo language - eessa eettii dureessa hiyyeettii aduu manoota hiriyaa hiriyoota barsiisaa barsiiso waggaa waggaawwan laggeen ittii karicha haroo harittii qaalluu qaallicha qallittii dhuma dhuma keessa barsiisaa barsiisaa konkolaataa konkolaataa namni namoota namootni namoonni ibsi namicha namichi maqaa maqaan nyachuu nyachuun haati lafti obboleetti namicha obboleetti namichaa hojii hojii barumsa barumsa afaanii obboleetti namicha namicha namichaa intalaaf sareef baruu baruuf bishaaniif sareedhaa sareedhaaf harkaan halkaniin yeroodhaan bawuu bawuudhaan harkatti guyyaa guyyaatti jalatti biyya biyyaa keessa keessaa gabaadhaa bunaa bunaatii irraa irraa gabaarraa isaani kaleessa dhufne dhufne kaleessa dhufne kootti isheen laalti isheen ofiif makiinaa jaalatu kennaa walii sanatti dhufne deemne deemna deemnu deemnu deemnu kolfite kofalte duuta duuna beelofta beelofna dhaga dhageessa dhageenya feeta feena feetu koottaa beenu beenaa autobenefactive beekam beekamani jedham jedhama beeksis beeksifne galch galchiti barsiis barsiisa autobenefactive bitadh bitadhu qabanna autobenefactives hojjadh hojii kaasis autobenefactive deebi deebis deebifam deebifadh bubbul caccab dhiib dhiddhiib autobenefactive dhug dhuguu jechu fedhuu fechuu dhug dhugam dhugamuu dhugamuudhaan

100-149 words

  • 147 - Pangasinan language - kumadua katlo kakatlo pidua pinlima aminsan amidua mamitlo sansakey sanderua pamidua sikato sikami sikatayo sikata sikayo sikara kapigan panonto amayamay dakel pigara daiset apatira andokey maawang malapar ambelat melag melanting tingot daiset melag melanting tingot antikey mainget mabeng bolog kaamong kaamong alumbayar kakiewan katakelan bislak obak lubir iknol saklor ikol bikking pueg beklew beneg pagew ketket sepsep lutda sibok dongap linawa nengneng nonot angob ogip onpatey manpatey managnop manpana manerel tegteg pisag paldua doyok tikyab onsabi dokol yorong alagey pelag itdan poyok goyor dait ibagam letaw kigtel bitewen angalakan dagem linew pakigtel asiwek asewek palandey ambalanga ampasiseng andeket ampetang ambetel napsel napesel napno napano abig maoges abolok maringot marutak dutak malimpek limpek tibokel matdem tarem kawanan kawigi ngiriyet masanting marakep ambanget masamit mananam dagem linaew kogip tampol pusok amamayoen sikalay nanengneng akbibiten nodnonoten ogalim nalingwanan kaayos dagem linaew tampol limgas amamayoen akbibiten nanonotan ugalim nalingwanan kauyos ispiritu
  • 144 - English words first attested in Chaucer - begster attourne feminie gigge louke prenticehood begeth anoyful chincher chinchery counterwait customance custumance humblehede laureole rackleness clotheless mistrest nigromancian sustenant disfigurate messagery communably jacounce jagounce mendience miscoveting overgilt outwine outsling papelardy ravisable recreandise ribanding rideled roinous suckeny timbester villainsly wyndre minstrelly sweynt adjoust annoyously arbitry asperness astoning celebrable coetern definish delye disincrease distempre emprent enbaissing ensampler entach entech entalent eschaufe festivally foleye forline formly fortunel fortunous governail habitacule hustlement necess overwhelve plungy portionable presentary previdence purveyable rhetorian scorkle senatory slead troublabla unbetide undoubtous uneschewable unleeful unmovablety unparegal unplite unweened vengeress weeply witnessfully whaped whaped advocary amphilbology asfast avaunter calculing circumscrive defeit defet desespeir desesperaunce disblame enterpart estately executrice forbysen forlose grufe howne inhelde kankedort ounded palaceward palaceward palaestrial refreid reheting resport saluing scrivenliche tempestous unbroided untroth yfled agrote bedote betraising browd countryward radevore renownee tidive tuteler toteler almury embelif solsticion forloin solein uncorven ungrubbed
    • These are almost certainly wanted by Wiktionary. -- Beland (talk) 14:59, 21 July 2018 (UTC)
  • 141 - Bangalore Karaga - veerakumaras veerakumaras veerakumaras sevae veerakumaras chaakrigars veerakumaras sevae veerae chakrastapane veerakumararu vastru veerkumaras veerakumara veerakumaras veerakumaras veerakumaras veerakumara vahnikula kshytriyas poojaris pothraja ganacharyas punyaha dwajarohana yelkunga aarathis aarathees dheeksha shudhi prathisthapana kainkarya veerakumaras deekshapower purohitha bankadasayya veerakumaras veerakumaras shakthipeeta meteriolists sanchara shakthishala veerakumaras shudda maneyavaru sanchara chathra sevakarthas purohitha dravyas purohitha sumanthra karathas shanthikainkarya manthras karpuras kuladevatha pradhakshina veerakumaras veerakumaras kulagowda avishkaras manthras chathra veerkumaras karapura pradhakshina peetashala gantanadha shanchalana sufees churnas gantanaadha manthras mangalaarthi pradhakshina poojaris veerakumaras potharaja uthsava potharaja manthras avishkara potharaja poojaris potharaja poojaries gantepoojari manthras potharaja manthras poojaris shanthithanthra shakthism potharaja poojaries veerakumaras shakthyotsava vasanthotsava vuthsava uthsava dwajadanda veerakumaras veerakumars chakridaras uthsavas kulagowda ganachari ganachari shakthyotsava shakthyotsava gantepoojari shakthyotsava potharaja potharaja potharaja bhankadasayya sanchara kolkara shakthyotsava shakthyotsava mantravadies thantras poojaries sanchara uthsavas sulthans sulthana hasthas kshathriyas veerakumaras peytey proncounced thigalarapete journing ppoje cubbonpete kalyanis mathsta mathastha miscoating
  • 121 - Malaysian National Projects - This just had a huge chunk of Malaysian text that needs to be translated; tagged. -- Beland (talk) 14:59, 21 July 2018 (UTC)
  • 112 - List of minerals R (complete) - heptamagnesium octahydroxy hexacarbonate octadecahydrate tetrauranyl tetraoxo octaqua dioxyhydroxy trimercury dihydrotetraoxotellurate pentaoxoditellurate octaicosaoxodecavanadate pentadecahydrate hexasulfa hexauranyl hexaoxytetrahydroxy pentadecacopper docosahydroxy heptaoxodisilicate hydroxophosphate decasulfa pentarsenide decalead hexasulfate ditetraoxosilicate tetrauranyl tetraoxo tetraqua decahydroxy docosahydrate hexatitanium hexadecaoxide pentadeca nonacarbonate fluorsulfate boroctaoxotrisilicate hexanickel hexadecahydroxy pentacopper tetraoxosilicate diarsenite pentacalcium ditetraoxosilicate tetrauranyl chevkinite tetrastrontium tetratitanium octaoxo diheptaoxodisilicate pentamanganese pentaoxochromate tetraberyllium tetralumino undecaborate octaicosaoxide trioxosilicate octasulfa tristannide trirhodium pentamanganese tetraoxosilicate trisulfarsenide pentacopper tetrarsenate heptalead tetraoxo hydroxytrichloride pentamagnesium tetraoxosilicate tetrairon tetradecaoxide heptaoxohexaborate pentacalcium hexadecaoxohexasilicate dioxotriphosphate tetrairon tetratellurate tricopper undecapotassium pentaicosachloride tetraoxosilicate aluminotetrasilicate undecaoxyhydroxy dodecaoxotetravanadate tricerium pentacarbonate pentamanganese tetraberyllium aluminotrisilicate decaoxydihydroxy octaoxotrisilicate trialuminium decawater trixovanadate hydroxoarsenate mosandrite hexasulfa heptaoxotetraborate tetrayttrium heptaoxodisilicate aluminoctaoxotrisilicate dodecaoxotrisilicate tetratelluride heptaoxovanadate nonahydroxy decacalcium heptaoxodisilicate hexaoxotungstate hexaoxotetrarsenate triplatinum decacalcium heptaoxodisilicate
  • 111 - Insect morphology - unsclerotised frontogenal frontoclypeal epistomal frontoclypeal clypeogenal postgena postgena postgenal postocciput postgena postgenal postgenal sensitiity stemmatal tormae lacinea lacinea lacineal lacinea superlinguae superlingua libium pseudotracheae pseudotracheae pseudotracheae dilineated pterothoracic alinotum antecostal phragmata alinotum subcoxal metepisternum epimiron mesepimiron metepimiron mesopleoron macrotrichia macrotrichia archediction precosta precosta vannal vannal vannal vannal vannal vannal vannal mediocubital vannal vannal mediocubital vannal vannal vannal vannal vannal vannal vannal vannal vannal vannal vannal vannal vannal vannal vannal neala neala mediocubital calypteres vannal pteralia pteralia vannal mediocubital pteralia vannal vannal mediocubital retinalucum vannal vannal waythat vannal basalar basicostal coxomarginale basicoxite trochantin trochantinal basicoxite postarticular basicostal trochantero coxotrochanteral protibia protibiae laterotergite laterosternites paraprocts phallosoma extratory exoporian dytresian exoporian jeuvanile prothorocic pharyngial
  • 110 - South Indian cuisine - pulihaara pachhallu maaghaya pandumirapakayala dosakaya dosavakaya chintakaya kooralu vankaya dondakaya chukkakoora menthikura palakura dosakaya beerakaya sorakaya palakoora sorakaya thotakoora anapakaya miriyala potlakaya sorakaya chekkalu murukulu jantikalu chakkilalu pootarekulu sunnundalu thokkudu nuvvula dosakaya vanakaya usirikaya yellu tambli gojju happala appemidi amateykai kadambattu gojju uppinakai tovve bassaaru uppusaaru masoppu masekai hitakida puliyogre karjikai sajapa kadubus tambittu paramanna sweetmaking menthya shavige kadubu gojju kosambri mosaru kodabale chakkali nippatu paalpradaman nendarangai erucherri upperis podimeen pollichathu kalllumekka vindallu ularthiyadhu vevichathu khyma upperi ethaykkappam ullivada avalosunda neeyyappam unnaykka churuttu modhakam vazhaykka vattayappam irattymadhuram velayappam vaththal kozhambu payarru varuval thokku oorukaai vadaam vaththal thirukannamidu sarkarai idiyappams karuppatti kozhakattai adikoozh kandharappam seeyam seeyam kavuni athirasam kevar pachidi karuvattu
  • 109 - List of minerals B (complete) - tricopper trisulfa dibismuthide pentaaluminium triantimony tetradecaoxide pentamagnesium hexabismuth pentasilver hexaoxy tetrahydroxyl heptasodium dibicarbonate dodecairon nonadecaoxide tetraluminium tetrairon tetrahydroxyl tetracopper tetralead icosairon heptacosasulfide dodecalead hexatricontasulfa hexadecarsenide triantimonide tricopper octadecahydrate tetramanganese heptacopper hexamercury tricopper hexaiodate tristrontium tristrontium hydroxophosphate tetralead tetradecasulf hexanantimonide dodecasulfa heptabismuthide octasilver hexabarium hexacalcium tridecacarbonate pentairon pentahydro octasulfa pentarsenide tetrasulfa diantimonide tetraluminium tetraberyllium hexaluminium nonahydroxy docosahydrate heptasilver pentarsenic triniobium octadecaoxide diphyllosilicate nonanickel dialumino tritellurite pyrovanadate decahydroxyl octadecahydrate pentamanganese pentacopper octahydroxyl tetraluminium icosafluoride dioxymonoborate tetraselenide pentacopper tetratelluride tetrapalladium tricopper tetralead heptantimonide octadecasulfide trihydroxyl heptahydrte dodecahydroxide pentalead tetrantimonide undecasulfide tetracopper dodecahydroxide hexamanganese octaoxy trialuminium heptacalcium tetranesosilicate tetracopper diodate tricopper tetraoxogermanate octasodium hexasulfate tetracopper tetraselenide trihydroxyl pentaaluminium labuntsovite hexahydroxide tetradecamanganese heptacosaoxide henicosahydrate
  • 105 - List of minerals G (complete) - triarsenide decapentasodium pentasulfate tetrairon trigallium heptacalcium triiridium octatelluride decaoxytetrasilicate tetramanganese octaoxy pentamanganese tetrahydroxyl gatelite heptauranyl pentyoxy heptahydroxyl pentairon hexacalcium dinesosilicate oxydisulfate octalead oxyhexachloride soroalumosilicate pentamanganese dihydrogenarsenate tetrahydrogenphosphate tetraplatinum triantimonide tetrazinc tritectosilicate pentacopper hexaiodate tetralead pentaniobium pentahydroxyl trihydroxyl tridecacopper tridecacopper octaantimonide tridecasulfide tetrahydroxyl diantimonide tetramercury tricopper tridecasulfide tricopper trihydroxyl tetrairon labuntsovite labuntsovite labuntsovite labuntsovite tetrairon undecahydroxy tetraberyllium tetrahydroxyl trinesosilicate trialuminium trialuminium hydrooxophosphate tetrazinc dialumino pentacalcium hexasulfate tetraaluminium octadecacopper hexamercury hexacalcium disorosilicate trialuminium hydrooxophosphate tetrairon trititanium tridecaoxide nonalead tetraarsenide pentadecasulfide trioxyphosphate trizirconium disorosilicate tricopper hexavanadate tetraaluminium inoalumosilicate trinesosilicate oxyhydroxyl chalcostibite trimercury tetraantimonide dodecasulfide tetrahydroxyl hexazinc decahydroxyl pentacalcium tetrahydroxyl pentazinc tetraberyllium tetrahydroxyl tribarium hexasulfa tribismuthide labuntsovite oxyhydroxyl neodium
  • 103 - Balagtasan - bukanegan lakandiwa pinagpipitaganan ipagdiwang ginagamit mahalaga ipinakikilala batikang pangangatuwiran hahanga palakpakan pasalubungan mangangatuwiran makakalaban nagpupuri minimithi hangarin sumisidhi ginagamit kilalang ginagamit magpunta magsisilbing panglahat pagsalita katunggaling nasasaktan ginagamit asignaturang ililiwat malirip maisip napapansin tuwid magiging paniwala katunggali patuloy kolonyal dayuhan pasimuno naghihirap maraming magkakaintindihan minamaliit kinagisnan alagaan itaguyod lumago nararapat panlahat kalakalan pagbabalita nakikita matutunganga kanilang panturo upang wastong upang mabubuo magiging panlahat gumagamit dayalekto sinasabi nagturo isinasalin dayalekto paggawa katuto masasalin dayalekto tutunganga panlahat unibersal ginagamit pakikipagtalastasan dayuhan magkaunawaan katunggali kalimutan isapuso kinagisnan nalilimot nababatid magtungo mahuhusay pagtatalo palawigin madlang nanonood magpasiya mahalaga piniling mahalaga gammit sinisinta naririt nagpapasalamat bumabati masigabong palakpakan
  • 102 - Sotho nouns - enumeratives (done except for this last word)

Possible misspellings by word length and type

Longest and shortest are shown, since they are perhaps the most interesting. Strong candidates for {{not a typo}} and {{lang}}. Please use strikethrough (or leave a note) for this section rather than removing lines, to avoid repeating work done while the dumps were being processed. Thanks!

Probable DNA sequences

If you're sure this is a DNA sequence, tag it {{DNA sequence}}.

90-99 bytes, non-English

80-89 bytes, non-English

Probable chemistry words, 30+ chars

Mostly general English, 30+ chars

Hmm, looks like the chemistry word detector could use some enhancement. -- Beland (talk) 16:27, 15 August 2018 (UTC)

If it's proper German, I'd go for {{lang}}. If it's not proper German, it'll get flagged as a spelling error in the future maybe, in which case I'd go for {{not a typo}}. -- Beland (talk) 02:00, 19 August 2018 (UTC)
tagged as Proper name, as it is an exhibition and book name

Repeating patterns

For rhyme schemes, they probably need to be re-styled to follow Wikipedia:WikiProject Poetry#Style for rhyme schemes. If this ends up making them all-caps, they won't show up here on the next run. For mixed-case rhyme scheme notations, use {{not a typo}} after making sure dashes, commas, and spaces follow the recommended style.

Beland is working on fixing notation in a lot of articles with rhyme schemes, so some of these may already be done.

Notes

Rhyme scheme hunting for Beland to do:

From most common misspellings (not detected here):

To cross-check (from alphabetical listing of misspellings, decommissioned):

Will also need to search for patterns like:

  • a-b-a-b-a-b-c-c
  • AB,CD,AB (internal rhyme)

Detected

Detected as longest repeating patterns:

Repeated instances of an infobox with "| status = UC | construction_began = 2011 " being expanded to "Status Under construction Construction began 2011". I think this is then 'seen' as "constructionconstruction" (see Silvan Dam). The same thing happens with the dam/bridge/stadium/power_station infoboxes whenever the text and the following line's label both contain 'construction'; out of 181 I found only one real typo!
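
For reference, a word like "constructionconstruction" can be recognized as a repeating pattern with a backreference. This is a minimal sketch using a hypothetical `repeated_unit` helper; how moss actually picks its "longest repeating patterns" is not described on this page.

```python
import re

# A word made entirely of one unit repeated two or more times.
# The lazy group finds the shortest such unit.
REPEAT_RE = re.compile(r"^(.+?)\1+$")


def repeated_unit(word):
    """Return the shortest repeating unit of a word, or None if there is none."""
    match = REPEAT_RE.match(word)
    return match.group(1) if match else None
```

A check like this flags both genuine doubled words and artifacts such as the infobox case above, so it cannot tell the two apart on its own.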

False positives

Is there a word that is correctly used in an article, but which shouldn't be added to Wiktionary? List it here, and Beland will fix the problem.

Archived solutions: Wikipedia:Typo Team/moss/Archive

To fix

{{snd}} is not noticed - so "numbered{{snd}}the" is parsed as "numberedthe" despite showing as "numbered – the" LittlePuppers (talk) 15:27, 1 August 2018 (UTC)
ndash is stylistically wrong, so I changed it to mdash —. Graeme Bartlett (talk) 08:30, 7 August 2018 (UTC)
here are some more examples of the same Sherlotte (talk) 11:15, 9 August 2018 (UTC)
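
One possible way to handle the {{snd}} problem is to expand known spacing templates to their visible text before words are extracted, instead of dropping them. The sketch below assumes a hypothetical lookup table `TEMPLATE_TEXT` with a single entry; it is not taken from the moss source, and any other template is left untouched for later processing.

```python
import re

# Visible text for templates that only produce spacing and punctuation.
# Assumed mapping for illustration; only {{snd}} is covered here.
TEMPLATE_TEXT = {"snd": " \u2013 "}  # spaced en dash

# Matches simple parameterless templates such as {{snd}}.
TEMPLATE_RE = re.compile(r"\{\{\s*([^{}|]+?)\s*\}\}")


def expand_spacing_templates(wikitext):
    """Replace known spacing templates with their rendered text."""
    def replace(match):
        name = match.group(1).lower()
        # Unknown templates are returned unchanged.
        return TEMPLATE_TEXT.get(name, match.group(0))

    return TEMPLATE_RE.sub(replace, wikitext)
```

After expansion, "numbered{{snd}}the" becomes "numbered – the", so splitting on non-letters yields "numbered" and "the" rather than "numberedthe".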

False negatives

Is there a misspelled word in an article mentioned here that was not reported? Feel free to list it below and Beland will try to improve the code if appropriate.

  • (none reported yet)

HTML tags

You can do one of two things for these articles:

Documentation helpful for making corrections yourself:

Note that "find all" links may find some false positives for some tags. For example, <ol> is used legitimately with the "start" attribute to control numbering. The articles listed and counted exclude these allowed cases where known.

Angle brackets (< and >) are not used for external links (per Wikipedia:Manual of Style/Computing § Exposed_URLs).
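
As a rough illustration of the exclusion described above, the sketch below counts opening HTML tags in wikitext while skipping <ol> when it carries a "start" attribute. The function name `flagged_tags` and the `watched` parameter are hypothetical, and the real list of tags moss reports is not given here. Close tags are not matched at all, in line with close tags being ignored as redundant.

```python
import re
from collections import Counter

# Opening tags only: a "/" right after "<" prevents a match.
TAG_RE = re.compile(r"<\s*([a-zA-Z][a-zA-Z0-9]*)\b([^<>]*)>")


def flagged_tags(wikitext, watched):
    """Count opening tags whose names are in the watched set.

    watched: set of lowercase tag names to report (caller-supplied;
             which tags belong here is an assumption of this sketch).
    """
    counts = Counter()
    for name, attrs in TAG_RE.findall(wikitext):
        name = name.lower()
        if name not in watched:
            continue
        # <ol start="..."> is a legitimate use, so it is not reported.
        if name == "ol" and re.search(r"\bstart\s*=", attrs):
            continue
        counts[name] += 1
    return counts
```

Other allowed cases would need their own rules added in the same way as the "start" check.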

2000-11000 articles with tag

1000-1999 articles with tag

100-999 articles with tag

Special

These won't be found automatically in future runs because close tags are ignored as redundant. But these are dubious as close tags.

moss source code

moss is written in Python, and is available on GitHub at: https://github.com/cdbeland/moss