Wikipedia talk:AutoWikiBrowser/Typos

"award winning" → "award-winning"

This seems to be a somewhat common misspelling, using a space instead of a hyphen. However, I'm wondering whether this word's use in puffery means we shouldn't correct it. Any thoughts? Stevie is the man! TalkWork 20:21, 31 December 2016 (UTC)

There are many instances where the expression seems to be substantiated.
Wavelength (talk) 20:49, 31 December 2016 (UTC)
Thanks. I can see that in my testing, although there does seem to be some puffery uses too. I went ahead and added it. I guess we don't have to be puffery police when we're just correcting typos.  :) Stevie is the man! TalkWork 15:07, 1 January 2017 (UTC)

Halfway vs half way

What on earth is going on with AWB enforcing "halfway" over "half way"? Both spellings are usually considered acceptable, and the choice is stylistic. It is a bit much having AWB enforce an arbitrarily preferred spelling where no error exists. Simon Burchell (talk) 11:40, 14 January 2017 (UTC)

@Simon Burchell: A more appropriate arena for this question is WT:AWB/T. But this specific change was added here by Chris the speller. --Edgars2007 (talk/contribs) 12:27, 14 January 2017 (UTC)
@Simon Burchell:@Edgars2007: Dictionaries determine what is acceptable, and I found none that accept "half way". While digging through them again, I see that Collins says "also half-way", so I have changed the rule accordingly. It will still change "half way" to "halfway". Chris the speller yack 14:46, 14 January 2017 (UTC)
A Google search for "half way" site:theguardian.com finds about 13,700 results, from that one source alone. Please remove this rule ASAP. Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 18:50, 14 January 2017 (UTC)
General question: Should Wikipedia be endorsing the misspelling of a word, even if by a mass media outlet? Stevie is the man! TalkWork 18:59, 14 January 2017 (UTC)
From the Grauniad's own style guide, halfway, halfwit. Mr Stephen (talk) 23:20, 14 January 2017 (UTC)
The 13,700 hits is a crock. Some are movie titles, and the first one I examined, "Which plays should you leave halfway through?", has it spelled correctly, and nowhere does it have the wrong spelling. Maybe Google search equates "halfway" and "half way" in some cases. A search for "halfway through" site:theguardian.com gives twice as many hits as "half way through" site:theguardian.com, so apparently most of the contributors to that site get it right. The AWB rule requires "half way" to be followed by a qualifying word such as "across" or "through", and the Google search does not; this is definitely apples and oranges. At any rate, the Gaurnida doesn't outrank a good dictionary. If the rule were removed, it would do nothing to help the many WP articles that have both "halfway" and "half way". Chris the speller yack 18:22, 15 January 2017 (UTC)
Chaucer used "half way" (but spelt it "half wey"); Shakespeare used "half way" in The Taming of the Shrew (but spelt it "halfe way"); California had a place called Half Way; more recently Harold Pinter used "half way" in The Caretaker (1960); various books use "half way" such as Half Way House (Maurice Hewlett), Half Way Home (C. W. Gill - 2011); Half Way to 1983: Governor Tatari's 24 Months in Office : Oct. 1979 - 1981; Half Way Between Everything (Marilyn Dennes, Susan Gilchrist, 2005); "Many inns like this were called “Half Way House” because they were half way between one town or village and the next." (Dolls In Canada by Marion E. Hislop - 1997 - Page 14). I could go on and on ... I agree that we should manually check consistency within one article, but half way, half-way and halfway are all acceptable. Why are we even considering this hypercorrection by bot? Dbfirs 21:11, 15 January 2017 (UTC)
Our concern is with contemporary English. Also, there is creative license for how language is used in titles. Our concern is what up-to-date reliable English dictionaries recommend. Halfway isn't spelled "half way" in any of them, apparently. Unless anyone can find "half way" as a legitimate spelling, I support the correction. Stevie is the man! TalkWork 21:22, 15 January 2017 (UTC)
Both half and way are found in all dictionaries (two-word terms are not listed of course, but they are cited), along with half-way and halfway. We had this argument in 2012. I cited Harold Pinter and a few recent books. I could find lots more. Could we please also remove midway from the bot? I'd support the correction if it could be limited to articles written in American English where concatenation is the current fashion. Dbfirs 21:28, 15 January 2017 (UTC)
Oxford and Cambridge (U.S. and British) dictionaries don't agree with your position. If they did, they would show "half way" as an alternative. As for arguments, I'm only willing to see cites from dictionaries as arguments. Stevie is the man! TalkWork 21:50, 15 January 2017 (UTC)
The OED under the entry half- says: "The two elements are often written separately when the adj. is in the predicate (see half adv. 1); the use of the hyphen mostly implies a feeling of closer unity of notion in the compound attribute, as in half-blind, half-dressed, half-raw, viewed as definite states; but it is often merely for greater syntactical perspicuity, on which ground it is regularly used when the adjective is attributive, thus I am half dead (or half-dead) with cold; a half-dead dog.". I quoted three examples where the OED cites half way (Chaucer, Shakespeare and Pinter; I missed the fourth one by Goldsmith) though I agree that only Pinter is modern. The OED puts the hyphenated form first with the concatenated form as an alternative. Dbfirs 22:04, 15 January 2017 (UTC)
As you have shown, it's either "half-way" or "halfway" per a dictionary entry (what AWB tolerates at this point). I don't think we should be concerned about a particular writer's (external to Wikipedia) usage. Stevie is the man! TalkWork 22:21, 15 January 2017 (UTC)
You are, of course, entitled to your opinion, but I have provided numerous examples of half way as two words in common usage, and you have not established that this usage is proscribed. Dbfirs 00:02, 16 January 2017 (UTC)
My position is based on dictionary entries, not my feelings. "Common usage" does not bound English writing in an encyclopedia, as there are all kinds of incorrect but common usages. We are going for the correct here. As is usual, please feel free to start an RfC if you would like to establish a consensus decision on this matter. Stevie is the man! TalkWork 00:19, 16 January 2017 (UTC)
No dictionary proscribes the adjacent use of two words that appear in the dictionary. Dictionaries do not make rules about style. The Oxford English Dictionary has several cites for half way. I can see that we are not going to agree, so may we compromise by keeping halfway for all articles in American English, and half-way for all articles in British English (since the OED puts the hyphenated form first)? Dbfirs 00:30, 16 January 2017 (UTC)
I agree with the above. Looking at my previous sentence, I doubt any dictionary contains most of the two-word combinations there, yet they are correct as written. Hyphenated "half-way" is an acceptable compromise, but honestly, "half way" is not wrong. Simon Burchell (talk) 10:16, 16 January 2017 (UTC)
I am not in agreement per my earlier statements. Dictionaries ordinarily say if separated word usages ("half way") are an alternative. In this case, no dictionaries appear to do that. I side with correct uses only. As for leaving "half-way" as is (btw, two-word term), the current typo rule does this. Also, technically speaking, there is no way to split typo corrections for U.S./British English. If "common use" is going to be forced here, IMHO, this requires a community consensus to go against the dictionaries. Stevie is the man! TalkWork 15:12, 16 January 2017 (UTC)
Also I will reiterate that two UK-based dictionaries (Cambridge/Oxford), in their entries for this word, say it's 'halfway' with no alternatives. Stevie is the man! TalkWork 15:20, 16 January 2017 (UTC)
Sorry to butt in, but the above is just not true. The OED specifically gives "half-way" as the primary form, with "halfway" relegated to "accepted variant" and numerous examples of the unhyphenated "half way" form in the usage examples. ‑ Iridescent 15:28, 16 January 2017 (UTC)

@Iridescent:, I don't have access to the link you used, but I will take your word for it. The current typo fix doesn't correct half-way or halfway. I don't know why examples of "half way" are provided but not shown as an alternative in the entry itself. Note that my position is not controlling on this typo fix, as others add/update these things, and this typo fix was not created by me, but I frankly don't think there's a strong reason to change my position. If the dictionary thought that "half way" was common enough, it would have noted that as an alternative. I would prefer a community consensus to decide this. Stevie is the man! TalkWork 15:38, 16 January 2017 (UTC)

Also, it might help to see the OED examples of "half way". That might illuminate things. Stevie is the man! TalkWork 15:40, 16 January 2017 (UTC)

These are the examples the OED uses of the unhyphenated form. Bear in mind that the OED focuses primarily on earliest usage of a form, so most of them are 17th-century, but there's no suggestion that "half way" isn't still an acceptable usage and it isn't marked as archaic:
Adv
  • c1405 (▸c1390) Chaucer Reeve's Tale (Hengwrt) (2003) Prol. l. 52 Lo Depeford and it is half wey pryme.
  • 1530 J. Palsgrave Lesclarcissement 861/2 Halfe waye, au milieu du chemyn, or a my chemyn.
  • a1616 Shakespeare Taming of Shrew (1623) i. i. 62 I-wis it is not halfe way to her heart.
  • 1717 tr. A. F. Frézier Voy. South-Sea 106 A little above half way up a high mountain.
  • 1757 G. Shelvocke, Jr. Shelvocke's Voy. round World (ed. 2) vi. 198 Before I had got half way off.
  • 1766 O. Goldsmith Vicar of Wakefield I. x. 96 About half way home.
  • 1960 H. Pinter Caretaker iii. 77 He's nutty, he's half way gone.
Noun
  • 1634 T. Herbert Relation Some Yeares Trauaile 13 Cape of good Hope..being the halfe way into India.
  • c1665 L. Hutchinson Mem. Col. Hutchinson (1973) 20 In the halfe way betweene Owthorpe and Nottingham.
Prep
  • 1613 S. Purchas Pilgrimage 488 A cloth..which reacheth halfe way the thigh.
  • 1706 I. Watts Devotion & Muse in Horæ Lyricæ i. iii, Faint devotion panting lies Half way th' ethereal hill.
 ‑ Iridescent 15:56, 16 January 2017 (UTC)
Thanks. It should be noted that we don't fix typos in quotes of any text or in titles. Typo fixing is strictly done in encyclopedic prose. Also, I assume that the English Wikipedia is effectively based on Modern English (assumed further to be based on contemporary English dictionary entries), but I don't know where this is stated (we might have to get a ruling on that by itself). In any of the Modern English examples, I don't think the current rule would change them (per the regex, a space followed by across/around/round/between/down/from/into/line/out/point/through/up must follow). Stevie is the man! TalkWork 16:08, 16 January 2017 (UTC)
In addition to those cites from the OED, I can find dozens of examples of half way followed by "across/around/round/between/down/from/into/line/out/point/through/up" etc. in modern English. I suggested the compromise because I discovered that the use of half way as two words, though very common in British English when I learnt to read, is becoming less common in the twenty-first century, and is rare in American English. I do admire and approve the excellent work of Chris and Stevie in correcting typos and spellings, but the theory that the use of two words instead of a concatenated form is an incorrect spelling, just because some dictionaries omit to mention the alternative, seems like WP:OR to me. Dbfirs 09:42, 17 January 2017 (UTC)
I would say the reverse, that I (and I'm only speaking for myself) am going by WP:RS, that is, dictionary entries by widely agreed official sources. Doing research to find examples of different uses or misuses would seem to be leaning to WP:OR. Stevie is the man! TalkWork 12:53, 17 January 2017 (UTC)
By your own criterion, the Oxford English Dictionary would not include recent citations (such as Pinter) that contained spelling errors. The interpretation of lack of entries in some dictionaries is what I consider original research. Anyway, if we go with the compromise, there is no original research or loss of honour (or even honor) on either side. Dbfirs 13:22, 17 January 2017 (UTC)
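
Editorial sketch: the rule shape described in this thread (join "half way" only when a qualifying word follows) could look roughly like the Python below. This is an illustration, not the actual AWB rule, which is not quoted here. The pattern, the name `fix_half_way`, and the exact handling of capitalisation are assumptions; the qualifier list is taken from the comment above.

```python
import re

# Qualifying words as listed in the discussion; the real rule may differ.
QUALIFIERS = ("across|around|round|between|down|from|into|line|out|point|"
              "through|up")

# Hypothetical pattern: "half way" followed by a space and a qualifier.
# The lookahead checks the qualifier without consuming it.
HALF_WAY = re.compile(r"\b([Hh])alf way(?= (?:" + QUALIFIERS + r")\b)")

def fix_half_way(text: str) -> str:
    """Join 'half way' into 'halfway' only before a qualifying word."""
    return HALF_WAY.sub(r"\1alfway", text)
```

Under these assumptions, "half way through" would be joined, while "half way home", "half way gone" and the hyphenated "half-way through" would be left alone.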

"instant grat" → "instant great" incorrect

There's a rule which replaces grat with great (named "_Great" in the typo list). However, instant grat (short for instant gratification) is a real concept, referring to songs released on iTunes during the pre-order phase of an album. Could the rule be modified to put in instant grat as an exclusion? Harryboyles 13:36, 14 January 2017 (UTC)

Done -- John of Reading (talk) 14:07, 14 January 2017 (UTC)
A very quick test shows that the exception for instant grat works. Thanks! Harryboyles 16:04, 14 January 2017 (UTC)
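
Editorial sketch: one common way to express an exclusion like this is a negative lookbehind. The Python below is a hypothetical stand-in for the "_Great" rule, whose real pattern is not shown on this page; `GRAT` and `fix_grat` are illustrative names, and only the lower-case word is handled.

```python
import re

# Hypothetical stand-in for the "_Great" rule: change the standalone word
# "grat" to "great", unless it directly follows "instant " or "Instant ".
GRAT = re.compile(r"(?<!\b[Ii]nstant )\bgrat\b")

def fix_grat(text: str) -> str:
    """Replace 'grat' with 'great' except in the phrase 'instant grat'."""
    return GRAT.sub("great", text)
```

The lookbehind keeps the exception local to the one phrase, so other occurrences of "grat" would still be corrected.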

Opostegidae

There is a family of moths called Opostegidae. AWB regex typo fixing has been changing it automatically to Oppostegidae (note the second p), but it shouldn't. If someone could fix that I would appreciate it. Thank you,  SchreiberBike | ⌨  01:07, 18 January 2017 (UTC)

@SchreiberBike: Done. This is another rule that matches [a-z]+. I'm not keen on those, as they so often end up damaging unusual or foreign words that the rule-writer never thought of. -- John of Reading (talk) 07:09, 18 January 2017 (UTC)
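
Editorial sketch: the risk with an open-ended `[a-z]+` tail can be shown with two hypothetical patterns. Neither is the actual AWB rule (its text is not quoted here); `BROAD`, `NARROW` and `apply_rule` are illustrative names, and the list of endings in `NARROW` is an assumption about what a rule-writer might intend.

```python
import re

# Hypothetical broad rule: any word starting "opos" gets a second "p".
BROAD = re.compile(r"\b([Oo])pos([a-z]+)\b")

# Hypothetical narrower rule: only a fixed set of intended endings.
NARROW = re.compile(r"\b([Oo])pos(e[ds]?|ing|ite(?:s|ly)?|ition(?:s|al)?)\b")

def apply_rule(rule: re.Pattern, text: str) -> str:
    """Insert the missing 'p' wherever the given rule matches."""
    return rule.sub(r"\1ppos\2", text)
```

With these patterns, the broad rule would rewrite the family name Opostegidae, while the narrow rule would leave it alone and still catch "oposite" or "oposition". The trade-off is that the narrow rule misses any ending nobody listed.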

"Comercial"

Can someone add this to the typo list? I'm not really confident in doing it. Comercial → Commercial. Appreciated! --Jennica / talk 01:41, 19 January 2017 (UTC)

It's probably not feasible to have an AWB correction for this. Comercial is a proper noun in several names, so we would be safe to only look for lower-case uses. 'comercial' is the Spanish/Portuguese word for 'commercial', even though that shouldn't ordinarily be seen in prose. However, it still shows up in spots that AWB will try to "correct", such as lists of titles that aren't properly formatted. Another thing is that this typo appears very infrequently, as I found only 6 articles in the whole of Wikipedia with this typo that could be legitimately corrected. I just corrected most of those. Stevie is the man! TalkWork 13:16, 19 January 2017 (UTC)
Oops, I assumed we didn't have this already, but the typo fix is there that avoids it when capitalized. It will still (apparently) fix the lower-case word, and that may be iffy per my previous review. Stevie is the man! TalkWork 13:32, 19 January 2017 (UTC)
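
Editorial sketch: a case-sensitive rule of the kind described above (skip the capitalised proper noun, fix the lower-case word) might look like the following Python. The pattern and the name `fix_comercial` are assumptions for illustration, not the rule in the typo list.

```python
import re

# Hypothetical rule: only the lower-case form is treated as a typo, because
# capitalised "Comercial" occurs in proper names.
COMERCIAL = re.compile(r"\bcomercial(ly|s)?\b")

def fix_comercial(text: str) -> str:
    """Correct lower-case 'comercial' (and -ly/-s forms) to 'commercial'."""
    return COMERCIAL.sub(lambda m: "commercial" + (m.group(1) or ""), text)
```

A rule like this would still alter a lower-case Spanish or Portuguese "comercial" in an unformatted title, which is the "iffy" case mentioned above.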

A small "need-fix" report

During some typo scanning I found a couple of non-typos which may be easy to fix, in order to avoid problems:

  • niger → Niger creates a problem with scientific binomial names, where niger is rather common as a species name, as in Black duiker Cephalophus Niger; see diff.
  • Sark based → Sark-based creates an error when changing the Sark based publishing company into the Sark-based publishing company; see diff.
  • team mate → teammate: according to User:MilborneOne, common usage is still team mate in British English; see diff.
  • Ganes → Games became an error for Ganes Creek, named after Thomas Gane. This is a very uncommon error, I guess, and should be easy to see and avoid; I should have identified that one. See diff.

Dan Koehl (talk) 19:19, 13 February 2017 (UTC)

I created the teammate typo fix. All the dictionaries I saw, including British ones Cambridge and Oxford, show it without the space. I need more than a user disagreeing with it to change it. I need reliable sources (dictionaries) that show "team mate" is an alternative spelling. I will review the others. Stevie is the man! TalkWork 19:37, 13 February 2017 (UTC)
This is what I thought was interesting, because in that case I trusted AWB. Dan Koehl (talk) 19:43, 13 February 2017 (UTC)
My review so far:
  1. niger→Niger looks like a tough one. Not sure where to go with that yet.
  2. Sark based→Sark-based in the diff given actually looks correct. The publishing company is based on Sark (island), therefore it is Sark-based.
  3. Ganes→Games with 'Creek' following it affects four articles, so I added regex code to avoid it.

Stevie is the man! TalkWork 20:05, 13 February 2017 (UTC)
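
Editorial sketch: an exclusion for a following word, as in item 3 above, is typically written as a negative lookahead. The regex code actually added to the rule is not quoted here, so the Python below is only a hypothetical equivalent; `GANES` and `fix_ganes` are illustrative names.

```python
import re

# Hypothetical rule: "Ganes" is treated as a typo for "Games" unless the
# next word is "Creek" (Ganes Creek is a real place name).
GANES = re.compile(r"\bGanes\b(?! Creek\b)")

def fix_ganes(text: str) -> str:
    """Replace 'Ganes' with 'Games' except in 'Ganes Creek'."""
    return GANES.sub("Games", text)
```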

@Stevietheman:
  1. I'd say niger in a scientific name in most cases comes like (Somegenus niger), which means italic and within parentheses.
  2. Regarding Sark-based, please see discussion on my talk page.
Thanks so far for your kind assistance. Dan Koehl (talk) 20:35, 13 February 2017 (UTC)
I have responded to #2 in your user talk. #1 is not quite as simple as it seems. False positive testing for text being inside of italics is not very simple using regex. It can be done, and I've done it in my own Find&Replace's, but it's not reliable enough to replicate for AWB Typos, which needs to avoid an editor's second-guessing as much as possible. Stevie is the man! TalkWork 20:47, 13 February 2017 (UTC)
(Edit conflict) I was responsible for "Sark based"; and I'm definitely not versatile enough to be able to decide when to use proper nouns as noun adjuncts or not. (I now noticed that OED lists ten "London-based" to one "London based"; I did not look for "Sark-based".)
I suppose that "niger" should invariably occur as the second part of a species binomen; and these should always be italicised. The first part of the binomen should be a capitalised word or an abbreviation "capital"+"full stop" (like Tyrannosaurus rex or T. rex). Is it possible to make the bot avoid a combination like [''][A-Z][[a-z]+|.][ ][niger''] (where the ' and the space perhaps should be quoted; I do not know your regexp conventions)? JoergenB (talk) 20:57, 13 February 2017 (UTC)
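
Editorial sketch: the combination suggested in the comment above can be written in conventional regex syntax roughly as below. This Python version is an illustration only. It is not the AWB rule or AWB's regex flavour, and `BINOMIAL`, `NIGER` and `fix_niger` are hypothetical names. It assumes the binomial is wrapped in its own pair of '' italic markers.

```python
import re

# Italicised binomial ending in "niger": ''Genus niger'' or ''G. niger''.
BINOMIAL = re.compile(r"''[A-Z](?:[a-z]+|\.) niger''")

# Hypothetical typo rule: lower-case "niger" as a standalone word.
NIGER = re.compile(r"\bniger\b")

def fix_niger(text: str) -> str:
    """Capitalise 'niger' except inside an italicised binomial."""
    protected = [m.span() for m in BINOMIAL.finditer(text)]

    def repl(m: re.Match) -> str:
        # Leave the word alone if it lies within a protected binomial span.
        inside = any(s <= m.start() and m.end() <= e for s, e in protected)
        return m.group(0) if inside else "Niger"

    return NIGER.sub(repl, text)
```

An unitalicised "Cephalophus niger" would still be changed under this sketch, which matches the concern raised earlier that testing for italics is not fully reliable.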
Thanks for your suggestion @JoergenB:. @Stevietheman:, if JoergenB's suggestion doesn't work, maybe an alternative could be to run a search, and put {{Not a typo|niger}} on the instances that can be found on enwiki? Would such operations in general make sense, or is it meant just for a couple of handpicked cases? Dan Koehl (talk) 21:02, 13 February 2017 (UTC)
The niger→Niger "fix" is odd, because we already have regex that avoids two single quotes after 'niger', and when I run this in my Find&Replace typo test, it doesn't do the "fix" on List of mammals of Ghana. So, the regex works technically, but somehow fails when run as a typo fix. Stevie is the man! TalkWork 12:45, 15 February 2017 (UTC)
I have created a phab ticket for this issue. Stevie is the man! TalkWork 13:39, 15 February 2017 (UTC)
Per the ticket, it turns out another entry on List of mammals of Ghana had its italics off-balance, and this affected how typos were processed. Stevie is the man! TalkWork 21:22, 15 February 2017 (UTC)
Same problem at Wikispecies; I removed it while we are waiting for a result from the Phab ticket. Dan Koehl (talk) 18:06, 20 February 2017 (UTC)
@Dan Koehl: the phab ticket was closed as invalid. As I said above, the italics used in the article were off-balance. Apparently, the typo fixing software needs italics (two single quotes, or '') to be in total balance for the typo fixes to work properly. I think this has to do with words inside italics being off-limits to such fixes. So, to fix any false-positive you run into, you will need to inspect the article and balance the italics. This shouldn't happen very often. Stevie is the man! TalkWork 21:35, 20 February 2017 (UTC)
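
Editorial sketch: before chasing a false positive, an editor could scan an article for lines whose italic markers look off-balance. The Python below is a rough heuristic, not the logic used by AWB or by MediaWiki's parser. It assumes italics are meant to be opened and closed on the same line, and `unbalanced_italic_lines` is a hypothetical helper name.

```python
import re

# Any run of two or more apostrophes is wiki quote markup.
APOSTROPHE_RUN = re.compile(r"'{2,}")

def unbalanced_italic_lines(wikitext: str) -> list[int]:
    """Return 1-based numbers of lines with an odd count of italic toggles."""
    bad = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        toggles = 0
        for run in APOSTROPHE_RUN.finditer(line):
            length = len(run.group(0))
            # '' toggles italics; ''''' (or longer) toggles bold and italics.
            # Runs of three or four apostrophes are treated as bold only.
            if length == 2 or length >= 5:
                toggles += 1
        if toggles % 2:
            bad.append(number)
    return bad
```

Any line number returned would be a candidate for manual inspection; the heuristic can flag legitimate text such as a possessive directly after italics, so it is a starting point rather than a verdict.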