Wikipedia talk:WikiProject Check Wikipedia


#46

  • #46 seems to detect a few internal links inside image descriptions: there were 5 articles on frwiki in the list with that problem. --NicoV (Talk on frwiki) 16:31, 30 August 2013 (UTC)
    Can I see the examples? Bgwhite (talk) 17:13, 30 August 2013 (UTC)
    Forgot to put the link... The 5 articles done in the list: fr:Billet de banque (: au premier plan le [[5000 francs Flameng]]), fr:Lyricon (à vent [[Computone]]), fr:Moteur avec cylindres en ligne (en ligne de la [[Honda CBX|Honda CBX 1000]]), fr:Multi One Design (Environnement (MOD 70)|Veolia Environnement]]), and fr:Pollution sonore ([[circulation automobile]]). --NicoV (Talk on frwiki) 17:20, 30 August 2013 (UTC)
    Also in itwiki. Both when the "true" link is at the bottom it:Ampelio eremita and when it isn't it:Ascari del Cielo. Thanks! --AlessioMela (talk) 21:08, 4 November 2013 (UTC)
    Both cases follow the pattern. There is an image tag at the very beginning. There is a bracket error in the articles, but checkwiki shows the error in the image tag. I know why it is happening in the code, but I haven't found a way around it. Bgwhite (talk) 21:26, 4 November 2013 (UTC)

It seems to be happening again on frwiki (fr:Antihéros, fr:Insulte, ...) but I don't find anything wrong in the articles, even somewhere else. --NicoV (Talk on frwiki) 09:21, 5 April 2014 (UTC)

#10

  • Hmmm, this shouldn't be happening. Looks like it is counting ]]] as having two ]] possibilities. Will look into it. Bgwhite (talk) 20:25, 21 September 2013 (UTC)
    • Matěj Suchánek, interesting case. The problem going on is: [[metr za sekundu|[m/s]]] was followed by a statement with a broken bracket, [ran/[[minuta|min]]. If it wasn't followed by a broken bracket, checkwiki would not say this was an error. Normally I'd say this is a rare case and checkwiki correctly said there is an error on the page, thus this is a real low priority. However, the problem in the code is similar to the problem in the code for #46 error report above this report. So, a solution in one probably fixes the other error. Problem is, I've yet to figure out the #46 error after many hours. Bgwhite (talk) 08:52, 29 October 2013 (UTC)
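
To illustrate why a run like ]]] is awkward to count, here is a minimal Python sketch. It is not checkwiki's code (checkwiki is written in Perl), and the helper name count_overlapping is made up for the illustration. It only shows that a scan which allows overlaps sees two ]] candidates in ]]] where a non-overlapping count sees one.

```python
def count_overlapping(text: str, token: str) -> int:
    """Count occurrences of token in text, allowing overlapping hits."""
    count = 0
    start = text.find(token)
    while start != -1:
        count += 1
        # Advance by one character only, so overlapping hits are counted too.
        start = text.find(token, start + 1)
    return count


link = "[[metr za sekundu|[m/s]]]"

# str.count() never overlaps: it finds one "[[" and one "]]".
non_overlapping = (link.count("[["), link.count("]]"))

# An overlapping scan finds "]]" twice inside "]]]".
overlapping = (count_overlapping(link, "[["), count_overlapping(link, "]]"))
```

With the overlapping scan the link looks as if it had one opener and two closers, which is the kind of mismatch described above.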

Error #39

NicoV Magioladitis After looking at some of the articles in a list of #39 errors not fixed by a bot, I've noticed some "false positives". I use quotation marks because it is actually errors in MediaWiki that are causing the problem.

Newlines don't function in <blockquote>, {{quote}}, {{cquote}} and {{quotation}}. I have the checkwiki code skip these for error #39. After looking at the new list of articles, <ref>, [[Image: and {{bq}} also don't work.

<skip several hours>

I have the bug bookmarked and brought it up. Lo and behold, the patch that was submitted in December 2011 was finally accepted. Final changes were made today on enwiki. Turns out Visual Editor was assuming newlines worked the same everywhere... silly VE. So, VE started the move to finally fix the problem. Hey, who knew, VE was actually helpful for the first time ever. According to the log, it only took 8 1/2 years to fix.

I've verified that {{quote}}, {{cquote}}, {{quotation}}, <blockquote> and {{bq}} now treat newlines correctly. I've verified that <ref> and [[Image: still barf on newlines.

I need to add the ref and various image tags to #39's code and remove the currently skipped templates in #39's code. Bgwhite (talk) 05:22, 16 October 2013 (UTC)

Bgwhite this means that now AWB can replace p tags inside blockquote with newlines? -- Magioladitis (talk) 06:12, 16 October 2013 (UTC)
Magioladitis. I'm confused. It doesn't work on Aristotle#Geology, but it does work below.

a

b

c
Bgwhite (talk) 06:42, 16 October 2013 (UTC)
Asked a question at User talk:Kaldari#bug 6200 and quote templates. Bgwhite (talk) 06:55, 16 October 2013 (UTC)
Comment was made at bugzilla bug 6200 about the problem. Also bug 55674 for newlines in ref tags. Bgwhite (talk) 09:05, 16 October 2013 (UTC)
6200 marked as fixed. -- Magioladitis (talk) 14:45, 28 October 2013 (UTC)

Statistics

Hi, I know that you're always looking for more work since it's so easy to use Labs ;-)

I'd like to suggest adding some statistics for Check Wiki to give us some information on how errors evolve on each wiki. Would it be possible to add a table with the following information?

  • One line for each error
  • Several columns for each day of a month: the number of articles detected for the error after the daily scan, the number of articles marked as fixed for the error during the day, and possibly the number of articles marked as fixed during the day that were then detected again

--NicoV (Talk on frwiki) 10:21, 6 November 2013 (UTC)

Great idea. Though I am not sure if Bgwhite has enough time to implement it. --Meno25 (talk) 16:25, 9 November 2013 (UTC)
I know, it's just wishful thinking, no emergency and no problem if it's not implemented. --NicoV (Talk on frwiki) 12:08, 14 November 2013 (UTC)

Fixing ISBN errors

Note:

Hi, I've made a lot of improvements in WPCleaner to help fix ISBN errors #69, #70, #71, #72 and #73 (which account for about 10k errors for enwiki). Some of these improvements require configuration in the WPCleaner configuration file or the Check Wiki configuration file.

  • #72, #73: possibility to search the provided ISBN number or the ISBN number modified with the computed check value in several web sites. Web sites are configurable in general_isbn_search_engines, with 3 default web sites (WorldCat, OttoBib, Copyright Clearance Center). If you know other interesting web sites, let me know, I can add them by default.
  • #70, #71, #72, #73: when the ISBN is provided as a template parameter (isbn=), possibility to search in several web sites using another parameter of the template (for example the title). This is configurable in general_isbn_search_engines_templates, with no default configuration as it depends on the templates of the wiki. Example available in frwiki configuration.
  • #70: when the ISBN provided contains 8 characters, possibility to search if this is an ISSN number in several web sites. Web sites are configurable in general_issn_search_engines, with 1 default web site (WorldCat). If you know other interesting web sites, let me know, I can add them by default.
  • all: possibility to request help on fixing the ISBN. It's configurable through general_isbn_help_needed_comment, general_isbn_help_needed_templates, error_070_reason_yywiki and so on.
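
For readers unfamiliar with the "computed check value" mentioned for #72 and #73, here is a minimal Python sketch of the standard ISBN-10 and ISBN-13 check digit arithmetic. It is not WPCleaner's code (WPCleaner is a Java tool), and the function names are made up for the illustration.

```python
def isbn10_check_char(first9: str) -> str:
    """Check character for an ISBN-10, given its first 9 digits."""
    # Weights run from 10 down to 2; the check value makes the total divisible by 11.
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def isbn13_check_char(first12: str) -> str:
    """Check digit for an ISBN-13, given its first 12 digits."""
    # Weights alternate 1, 3, 1, 3, ...; the check digit makes the total divisible by 10.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def with_computed_check(isbn: str) -> str:
    """Return the bare ISBN characters with the last one recomputed."""
    chars = [c for c in isbn.upper() if c in "0123456789X"]
    body = "".join(chars[:-1])
    if "X" in body:
        raise ValueError("X is only valid as the last character of an ISBN-10")
    if len(chars) == 10:
        return body + isbn10_check_char(body)
    if len(chars) == 13:
        return body + isbn13_check_char(body)
    raise ValueError(f"expected 10 or 13 characters, got {len(chars)}")
```

For example, with_computed_check("0-306-40615-7") gives "0306406152", so a search could be tried with both the ISBN as written and the corrected one.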

If you have other ideas on how to help fix those errors, I'm quite interested. --NicoV (Talk on frwiki) 23:21, 19 November 2013 (UTC)

Include pages in namespace 104 on arwiki

Please include pages in namespace "ملحق" (NS:104) on Arabic Wikipedia (arwiki) in the lists generated by Checkwiki script. This namespace contains lists and years pages. Pages in that namespace are counted in the number of articles (magic word: {{NUMBEROFARTICLES}}) and AWB's Auto-Tagger already tags articles in that namespace. --Meno25 (talk) 12:11, 23 November 2013 (UTC)

Meno25, I'm going to wait on this for a bit. I've held off on 104, commonswiki and File namespace. I'm using code optimized to grab only the Article namespace from the dump file. Changing that will cause a severe decrease in speed. I'll have to do some other changing around to insert the code, but everything else is set up for it. For example, there are if statements that say only Article and 104 namespace can check certain errors. Bgwhite (talk) 08:29, 24 November 2013 (UTC)
@Bgwhite: Thank you for the explanation. Take your time. We are not in a hurry. --Meno25 (talk) 12:21, 24 November 2013 (UTC)

Ideas for new errors

Time to start thinking about what new errors should be added to Checkwiki.

Ping: Magioladitis, NicoV, Meno25, Crazy1880, LindsayH, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri, Graham87. I think that is everybody. If not, add them to the list.

What should or should not be added will be determined by several factors:

  1. How easy is it to code up?
  2. Is it something that AWB or WPCleaner can already find?
  3. Is it something that AWB, WPCleaner or a bot can currently fix?
  4. Is it an accessibility issue?
  5. Is it a serious issue? Are the errors on the high or medium lists?
       • High priority: error corrupts or distorts the content posted in Article
       • Medium priority: improving the encyclopedic content or readability of the article
       • Low priority: improving maintenance or MOS fixes

Some examples:

  1. Replacing <strike> with <s>. It would take a copy/paste to code up. WPCleaner finds and fixes the problem. It would be Low priority.
  2. Finding cases of url=http://http:// This is a common error I see. It would be High priority. It is fixed by AWB.
  3. Blank lines in bulleted vertical lists. This is an accessibility issue per Wikipedia:Accessibility#Blocked elements. This causes problems for screen readers.
  4. No blank space after the comma in DEFAULTSORT values. An example would be: Bush,George. The article would be sorted first for all surnames beginning with Bush. Currently not fixed by AWB or WPCleaner. Probably medium priority.
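
As a rough illustration of examples 2 and 4, here is a minimal Python sketch of how such cases might be spotted in wikitext. It is not checkwiki's code (checkwiki is written in Perl); the patterns and function names are assumptions made for the illustration and ignore localized DEFAULTSORT aliases.

```python
import re

# Captures the sort key of a {{DEFAULTSORT:...}} magic word.
DEFAULTSORT = re.compile(r"\{\{\s*DEFAULTSORT\s*:([^{}]*)\}\}")

# A URL scheme immediately followed by a second scheme.
DOUBLE_HTTP = re.compile(r"https?://https?://", re.IGNORECASE)


def defaultsort_missing_space(wikitext: str) -> list:
    """Return DEFAULTSORT tags whose key has a comma not followed by whitespace."""
    return [
        m.group(0)
        for m in DEFAULTSORT.finditer(wikitext)
        if re.search(r",\S", m.group(1))
    ]


def has_double_http(wikitext: str) -> bool:
    """True if the text contains a doubled scheme such as http://http://."""
    return DOUBLE_HTTP.search(wikitext) is not None
```

For example, defaultsort_missing_space("{{DEFAULTSORT:Bush,George}}") returns the tag, while the same call on "{{DEFAULTSORT:Bush, George}}" returns an empty list.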

Bgwhite (talk) 01:34, 26 November 2013 (UTC)

How about putting the TOC in the standard position in the wiki-markup, which is also an accessibility issue? Not sure about automating this though. Graham87 01:39, 26 November 2013 (UTC)
Seems that there are lots of new citation style errors, some of which appear in red text in the references section. Those might be something worth exploring. GoingBatty (talk) 02:08, 26 November 2013 (UTC)

A few suggestions:

  1. An error to detect non-existent files (red linked files). We have a bot on Arabic Wikipedia that removes such links. However, the bot works on all pages of the Main namespace. Generating a list of pages for the bot to work on would be a good idea. See Wikipedia:Database reports/Articles containing deleted files.
  2. Detecting user signatures in articles (articles containing links to user pages). To be worked on manually. See Wikipedia:Database reports/Articles containing links to the user space.
  3. Detecting fat redirects (redirects obscuring page content). To be checked manually. See Wikipedia:Database reports/Redirects obscuring page content. --Meno25 (talk) 06:43, 26 November 2013 (UTC)

The errors I suggested are covered by the Database reports on English Wikipedia. Database reports are updated regularly only on enwiki, Commons and Meta. Moving the errors to checkwiki means that the reports would get generated for other wikis too. So, maybe disable those errors for enwiki and enable them for other wikis. --Meno25 (talk) 06:43, 26 November 2013 (UTC)

Meno25 you can request similar databases for other wikis. -- Magioladitis (talk) 09:19, 26 November 2013 (UTC)

CHECKWIKI is more about common syntax errors. We need to focus on that. If lists are already generated by other bots/projects we do not need to duplicate the job. Bgwhite's idea of unspaced DEFAULTSORT is a great example of what we are after. WPC's extended list is another good example. I have some minor suggestions:

I don't know if this is an error or maybe already monitored but:

  1. {{cite web}} without access dates.
  2. When only <ref>http://exemple.com/</ref> is used without title/description. This is to prevent link rot.
  3. When two (or more) refs with the same information have different ref names.
  4. When the time (e.g. 08:45 or 8 am) or the day (e.g. Monday or Saturday) is used inside |accessdate=.

(tJosve05a (c) 11:52, 26 November 2013 (UTC)


Hi, I think new errors should be generic enough to work on most wikis, so avoid very specific errors (for example: {{cite web}} without access dates should be dealt by the template itself: put the page in a maintenance category if access dates are missing). Otherwise, some of WPCleaner errors in the #5xx numbers:

  • #502: Useless "Template" in {{Template:...}} (low)
  • #508: Non-existent templates (medium ?)
  • #511: Internal link written as an external link (medium ?)
  • #512: Interwiki link written as an external link (low ?)
  • #513: Internal link inside an external link (medium ?)
  • #517: <strike>...</strike>
  • #519: <a>...</a>
  • I like some of the previous proposals: missing space after a comma in a DEFAULTSORT, doubled http, blank lines in bulleted lists, non-existent files, ...

Some of them are probably hard to develop or require access to a lot more information, so they will be difficult to add (non-existent templates / files, ...) --NicoV (Talk on frwiki) 12:57, 26 November 2013 (UTC)

NicoV I agree with you. My first suggestion is not good either. I think the most suitable new additions are the WPCleaner errors in the #5xx numbers. For non-existent files etc., I disagree that we should do something about them. There are databases for those already. -- Magioladitis (talk) 13:39, 26 November 2013 (UTC)
NicoV: #508 is already listed, see Special:WantedTemplates, the files are in Category:Pages with missing files.
I have once suggested a link to a year which has another description ([[2012|2013]], medium or high).
Some inspiration: de:Benutzer:Stefan Kühn/Check Wikipedia#Next features. Matt S. (talk | cont. | cs) 15:16, 26 November 2013 (UTC)

A few more:

  1. More than one blue link per * on a disambig-page. (Per WP:MOSDAB)
  2. Refs and reflist on a disambig-page. (per WP:MOSDAB)
  3. When an article does not have "nbsp" between a number and its unit (e.g. 15 km, 2,5 miles and 3 cm)

-(tJosve05a (c) 16:12, 26 November 2013 (UTC)

Moin, how about a stray space in a category, as medium priority? Example: right: "[[category:xyz]]" and wrong "[[categorie: xyz]]". Stefan Kühn had code for this in the persondata script. Regards --Crazy1880 (talk) 09:09, 29 November 2013 (UTC)
Crazy1880, Error 22 should be picking those up. Bgwhite (talk) 02:20, 3 December 2013 (UTC)
Bgwhite, oh, yes it did. On the German Wikipedia the question came up whether ID 69 will check for "ISBN:", because the linked site only uses ISBN. (example: ISBN: 978-3-7657-2781-8 > ISBN 978-3-7657-2781-8) Regards --Crazy1880 (talk) 20:19, 4 January 2014 (UTC)
Crazy1880, #69 checks for ISBN: and ISBN- Bgwhite (talk) 22:21, 4 January 2014 (UTC)

Round 2

Ping: Magioladitis, NicoV, Meno25, Crazy1880, LindsayH, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri, Graham87.

Following is a list of errors that I think could be added. Some notes:

  • The English database reports that Meno25 mentioned are not being ported to other languages unless somebody is willing to take on the task. Very few have been ported. So, if a report meets the "standard", I see no reason not to add it to checkwiki.
  • Most citation style errors would be a pain in the butt to code, too many articles that take too long to correct and are not really syntax errors. The one exception that I can think of is missing "url=" when the web address is given.
  • NicoV and Magioladitis, could you add WPCleaner or AWB to the appropriate errors and columns?
  • Any errors not in the list that you think should be added? Any other comments?
Description Priority Coding Tools to detect Tools to fix Other
Useless "Template" in {{Template:...}} low Done WPC, AWB WPC, AWB #1 (#502)
Internal link written as an external link medium Done WPC WPC & Frescobot #90 (#511)
Interwiki link written as an external link low Done WPC WPC #91 (#512)
Internal link inside an external link medium WPC (#513) WPC
<strike>...</strike> low Done WPC, AWB WPC, AWB* #42 (#517). Obsolete in HTML5. Use <s>...</s> instead
<a>...</a> low Done WPC WPC #4 (#519)
URL without http:// high Done WPC, AWB WPC, AWB #62
Finding cases of url=http://http:// medium Done WPC, AWB WPC, AWB #93
Blank lines in bulleted vertical lists medium Accessibility issue per Wikipedia:Accessibility#Blocked elements
Putting the TOC in the standard position medium Done WPC #96 and #97. Accessibility issue per MOS Elements of the lead
No blank space after the comma in DEFAULTSORT low Done WPC, AWB WPC, AWB #89
Unbalanced ref tags medium Done WPC, AWB WPC, AWB #94
Detecting user signatures in articles low Done WPC, AWB WPC, AWB #95
Detecting fat redirects (redirects obscuring page content) low
<span class="plainlinks"> in articles low
Pipe in external link [http:/www.wikipedia.org|Wikipedia] low
Link to a year which has another description ([[2012|2013]]) low This error is often caused by VE.
Cases of {{cite web|http://www.wikipedia.org| title= medium
Move anchor in front title in heading
Detect non-existent files (red linked files)
Detect non-existent templates WPC (#508)
Detect refs <ref name=> low easy often detected as #56
Category with double colon easy AWB
More same parameters in template medium medium
Good :-). I've added the information about what WPCleaner can currently detect and fix (automatic or bot, at least partially). For errors I've already coded with a #5xx number, feel free to use an error number following what CW currently manages or keep the #5xx number. For other errors, I don't see any problem for implementing them in WPCleaner, but it will probably have to wait 2 months, as I will be almost completely unavailable for several weeks. --NicoV (Talk on frwiki) 20:05, 3 December 2013 (UTC)

Errors added

Magioladitis, NicoV, Meno25, GoingBatty, Matěj Suchánek, Josve05a, ChrisGualtieri

  • #01 - Template with the useless word "template"
  • #04 - HTML text style element <a>
  • #42 - HTML text style element <strike>
  • #62 - URL containing no http://
  • #89 - DEFAULTSORT with no space after the comma
  • #90 - Internal link written as an external link
  • #91 - Interwiki link written as an external link
  • #93 - External link with double http://
  • #94 - Reference tags with no correct match
  • #95 - Editor's signature or link to user space
  • #96 - TOC after first headline
  • #97 - Material between TOC and first headline

Notes

  • Only turned on for enwiki for right now. Will start to expand after NicoV's return.
  • Just added #90 and #91. So, there will probably be some problems.
  • For #90 and #91, it will only search for articles written as an external link. Talk pages or special pages will not be searched. History of Wikipedia has examples of why it is done this way.
The description of #91 must be changed to: The script found an external link that should be replaced with an interwiki link. An example would be on enwiki [http://fr.wikipedia.org/wiki/Larry Wall] should be written as [[:fr:Larry Wall]], so it says fr.wikipedia.org in the external link and not en.wikipedia.org. -(tJosve05a (c) 21:07, 24 December 2013 (UTC)
And #90 must be changed to en.wikipedia.org. -(tJosve05a (c) 21:11, 24 December 2013 (UTC)
Another thing is that it should not say [...]/wiki/Larry Wall]. It should say [...]/wiki/Larry_Wall Larry Wall].(tJosve05a (c) 21:14, 24 December 2013 (UTC)
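
To make the #90 and #91 rewrites concrete, here is a minimal Python sketch. It is neither checkwiki's nor WPCleaner's code; the regex, the function name fix_wikipedia_links and the local_lang parameter are assumptions made for the illustration, and it only handles plain xx.wikipedia.org/wiki/ URLs.

```python
import re
from urllib.parse import unquote

# [http://xx.wikipedia.org/wiki/Title optional label]
EXTERNAL_WIKI_LINK = re.compile(
    r"\[https?://(?!www\.)([a-z\-]+)\.wikipedia\.org/wiki/([^\s\]]+)(?:\s+([^\]]*))?\]"
)


def fix_wikipedia_links(wikitext: str, local_lang: str = "en") -> str:
    """Rewrite external links to Wikipedia as internal (#90) or interwiki (#91) links."""

    def replace(match):
        lang, title, label = match.group(1), match.group(2), match.group(3)
        # URL titles use underscores and percent-encoding; wikilinks use plain text.
        title = unquote(title).replace("_", " ")
        target = title if lang == local_lang else f":{lang}:{title}"
        if label and label != title:
            return f"[[{target}|{label}]]"
        return f"[[{target}]]"

    return EXTERNAL_WIKI_LINK.sub(replace, wikitext)
```

On enwiki, [http://fr.wikipedia.org/wiki/Larry_Wall Larry Wall] would become [[:fr:Larry Wall]], and the same link pointing at en.wikipedia.org would become [[Larry Wall]].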

Errors modified

  • #22 - Finds more cases of a space in a category
  • #19 - Finds headlines that start with one "=" anywhere in the article instead of only at the start of the article.
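
As an illustration of the #19 change, here is a minimal Python sketch that looks for level-one headlines (a single "=" on each side) on any line of an article. It is not checkwiki's Perl code; the pattern and the function name are assumptions made for the illustration.

```python
import re

# A line that starts with exactly one "=" and ends with exactly one "=".
LEVEL_ONE_HEADLINE = re.compile(r"^=(?!=)(.+?)(?<!=)=[ \t]*$", re.MULTILINE)


def find_level_one_headlines(wikitext: str) -> list:
    """Return (line number, line) pairs for every level-one headline."""
    results = []
    for m in LEVEL_ONE_HEADLINE.finditer(wikitext)  :
        # Line numbers are 1-based: count the newlines before the match.
        line_number = wikitext.count("\n", 0, m.start()) + 1
        results.append((line_number, m.group(0)))
    return results
```

Because the pattern is anchored per line rather than at the start of the text, a headline such as "= B =" on the third line of an article is reported as well.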

WPCleaner

Bgwhite I've updated WPCleaner (version 1.31) for the following errors for all wikis: #1 (previously #502), #4 (previously #519), #42 (previously #517), #90 (previously #511), #91 (previously #512). Still have to do: #62, #89, #93, #94. Old #62 and #89 have been disabled. --NicoV (Talk on frwiki) 21:51, 22 January 2014 (UTC)

Thank you Nico. Do you want me to turn on the new errors for frwiki or wait? I'm sure Josve05a will have found a bug before I write this. :). New error #95 will be an editor's signature found in an article. Bgwhite (talk) 22:49, 22 January 2014 (UTC)
Bgwhite, will 95 include UTC, CET, CEST etc.? Since this error might not only be used on enwp? (tJosve05a (c) 22:55, 22 January 2014 (UTC)
(BTW Bgwhite I'm 16 in 5...4...3...2...1...HAPPY BIRTHDAY TO ME!) (tJosve05a (c) 23:00, 22 January 2014 (UTC)
Hey, I already wished you a happy birthday, which you already complained about. Now you want another... pfffft.  :) Time is irrelevant for #95 as I'm only looking for a signature. Bgwhite (talk) 23:09, 22 January 2014 (UTC)
@Bgwhite Yes, I think you can turn the new errors on for frwiki, I'll check what has to be changed in the translation file. --NicoV (Talk on frwiki) 08:47, 23 January 2014 (UTC)
Bgwhite, NicoV, (#91) WPCleaner changes [http://www.imdb.com/name/nm0403424/ Hurley on the [[Internet Movie Database]]] to [[:imdbname:0403424|Hurley on the]][[Internet Movie Database]]]. I see multiple issues with this: it removes the blank space, and it leaves 3 brackets at the end (without WPCleaner reporting it). (Found on Colin Hurley.) (tJosve05a (c) 10:52, 23 January 2014 (UTC)
@Josve05a: It will be fixed in a next version. It's due to the incorrect syntax of having an internal link inside an external link. It can be reported by WPC if #513 is activated. --NicoV (Talk on frwiki) 19:54, 23 January 2014 (UTC)
@Bgwhite: If possible, please start the new checks for cswiki, too. I will modify the configuration file. Within a week, you can also enable skwiki.
@Josve05a: Happy birthday! You are now the same age as I am (for the next 10 months). Matt S. (talk | cont. | cs) 18:28, 23 January 2014 (UTC)

NicoV and Matt S., in theory frwiki and cswiki should start seeing the new errors at the next 0z run.... if the database is up. Today's outage was caused by a disc getting full. Bgwhite (talk) 07:38, 24 January 2014 (UTC)

Hi! It's strange: I've modified the translation file in frwiki 4 days ago, but the old description is still displayed in WMFLabs for #1, #4, #42, #62. No errors are found. --NicoV (Talk on frwiki) 12:15, 27 January 2014 (UTC)
@Bgwhite For frwiki, I've changed the translation file a few days ago: descriptions for new error numbers (#93, ...) have been taken into account on WMF Labs, but not the modified descriptions for old error numbers that have been recycled (#1, #4, #42, #62, #89, #90, #91). Is it a problem to have kept the old descriptions as comments? --NicoV (Talk on frwiki) 09:12, 29 January 2014 (UTC)
NicoV, hmmm, I didn't see the message above this one. Sorry for that. The translation file and every other program has been bombing lately, so that was probably why you didn't see it right away. Between database problems and mounting problems, I'm ready to go screaming into the night. The frwiki dump processing is still running. Which is very amazing that it hasn't died yet.
Do you mean as comments in the translation file as you have done for the French one? I see no problems.
For #96 and #97 I've thrown in a little regex in the English translation file to account for templates being used with a space and no space.
For #95, I only have English "User" and "User talk". I'll get individual wiki's words in a bit. I'll get them thru the API. Bgwhite (talk) 09:31, 29 January 2014 (UTC)
Bgwhite For example, on WMFLabs, #1 is still displayed as "Pas de texte en gras" (the old description, which is commented out in the translation file) when the translation file has been changed 6 days ago; whereas the translations for the new errors (#93 and so on) are correctly used even if they have been changed later (only 2 days ago). --NicoV (Talk on frwiki) 16:47, 29 January 2014 (UTC)
NicoV, looking at the code, it grabs the first variable, commented or not. So, putting the commented lines second does the trick. Bgwhite (talk) 22:32, 29 January 2014 (UTC)
Ok, thanks a lot! --NicoV (Talk on frwiki) 11:08, 30 January 2014 (UTC)

#16

When I fix error 16 on arwiki, it only fixes about 5% of the list. I tried with WPC and AWB. Where is the problem? --Zaher talk 13:42, 28 November 2013 (UTC)

Apparently, there are situations where removing the control character changes the text and it seems to be a problem. I know this is usually happening with some characters (arabic, hebrew, ...). Nobody has been able to explain to me how to know if it's a special situation and how to fix it, so I've coded WPCleaner so that #16 is fixed automatically only if the characters around the control characters are part of a limited list (mainly ASCII, some diacritics, punctuation, ...). That's why it doesn't do much on arwiki. If you're able to guide me to know when it is safe to remove the control characters, I can update WPCleaner. --NicoV (Talk on frwiki) 22:12, 28 November 2013 (UTC)
This is best answered by Magioladitis as he is the resident expert on this. If I remember right, most false-positives do come when dealing with left-to-right languages. Bgwhite (talk) 06:32, 29 November 2013 (UTC)
I was never able to determine when we are in the case where the text order changes. This is a very rare situation in the English Wikipedia (less than 0.1% by my experience). I can't tell the same for Arabic Wikipedia. Are we sure arwiki wants invisible left-to-right characters to be removed? Meno25? -- Magioladitis (talk) 12:20, 6 December 2013 (UTC)
@Magioladitis: Zaher and I want the characters to be removed. I can start a discussion on Arabic Wikipedia Village Pump about this issue if this is needed. --Meno25 (talk) 12:25, 6 December 2013 (UTC)
@Meno25: I am OK either way, but I don't know the statistics for arwiki. AWB removes the characters using simple Find & Replace method. Check instructions at User:Magioladitis/AWB_and_CHECKWIKI#cite_note-4. Recall that 16 can not be fixed in bot mode. -- Magioladitis (talk) 12:29, 6 December 2013 (UTC)
@Magioladitis: @NicoV: Checkwiki error 16 is fixed automatically (not manually) by WPCleaner for English texts. But this fix is disabled for Arabic texts. What Zaher is trying to say above is that he wants fixing this error to be enabled for Arabic texts too. I have been using AWB to fix this error manually using the same regex you provided for months in Arabic Wikipedia without complaints from other users, so, I guess we can safely enable fixing this error for Arabic texts. Of course, bot operators on arwiki can disable fixing error 16 in WPCleaner preferences if a problem arises. --Meno25 (talk) 12:41, 6 December 2013 (UTC)
In WPCleaner, I decided to restrict automatic fixing after some reports of problems. See this discussion for example; someone also reported that fixing fr:Alâ ud-Dîn Khaljî resulted in character inversion (it may be the same for the few pages left with error #16 on frwiki). Having a discussion about this issue with people who know how it works would be better before letting WPCleaner automatically fix every control character again. --NicoV (Talk on frwiki) 15:07, 6 December 2013 (UTC)
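
The restriction discussed in this thread (only remove a control character when the characters around it are on a limited "safe" list) can be sketched as follows in Python. This is not WPCleaner's code. The set of characters treated as control characters and the rule "plain ASCII is safe" are both assumptions made for the illustration; the real list also includes some diacritics and punctuation.

```python
# Assumption: an illustrative set of invisible characters, not checkwiki's exact list.
CONTROL_CHARS = frozenset("\u200b\u200e\u200f\u202a\u202b\u202c\u202d\u202e\ufeff")


def is_safe_neighbour(ch: str) -> bool:
    """Assumption: only plain ASCII (or the text boundary) counts as safe."""
    return ch == "" or ch.isascii()


def remove_safe_control_chars(text: str) -> str:
    """Drop control characters whose two neighbours are both 'safe'."""
    kept = []
    for i, ch in enumerate(text):
        if ch in CONTROL_CHARS:
            before = text[i - 1] if i > 0 else ""
            after = text[i + 1] if i + 1 < len(text) else ""
            if is_safe_neighbour(before) and is_safe_neighbour(after):
                continue  # surrounded by safe characters: remove it
        kept.append(ch)
    return "".join(kept)
```

Under these assumptions a left-to-right mark between two Latin letters is removed, while the same mark next to an Arabic letter is left alone, which matches the observation that the automatic fix does little on arwiki.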

ID 73 - ISBN errors

Can you write the errors on the talk page of the appropriate article? Because in many cases the author of an article watches it and then can correct the ISBN. --Tsor (talk) 19:38, 14 January 2014 (UTC)

Tsor, unfortunately it cannot write to articles. This requires bot approval which Checkwiki would not get. There was a bot that was tagging articles and the articles were ending up in Category:Articles with invalid ISBNs. The owner of the bot is no longer active, thus the bot is also no longer active. Bgwhite (talk) 00:31, 15 January 2014 (UTC)

Since yesterday I cannot mark articles as "Done". Leads to an error message. --Tsor (talk) 10:45, 16 January 2014 (UTC)

Tsor, could you give me some examples. What language and what error number? Bgwhite (talk) 11:10, 16 January 2014 (UTC)
Go to ISBN-13, click on any "Done". After a few minutes you get the following error message:

Check Wikipedia
Aggregat 4
Software error:
Cannot execute: Lock wait timeout exceeded; try restarting transaction
--Tsor (talk) 12:36, 16 January 2014 (UTC)
When I go to https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=dewiki&view=only&id=73 I get this following error message:

Could not connect to database: Can't connect to MySQL server on 'tools-db' (111). (tJosve05a (c) 14:48, 16 January 2014 (UTC)

Josve05a, the error you saw is most likely WMFLabs having trouble. When you see that, try again a bit later. Labs are aware of problems with their database machines, but are not going to fix them for who knows how long. The latest excuse is that they will once all the machines have been physically moved to their new location.
Tsor, I still cannot duplicate and I haven't seen that error before. The error message usually means another process has a "lock" or total control over the database and all other database connections are locked out. Why is the error showing up now? Could you tell me the exact time you tried and what article you pressed "done" on. That way I can look at logs and hopefully they will tell me something. Bgwhite (talk) 21:10, 17 January 2014 (UTC)

Error #62

If a website is called "www.news.de" for example something like this is valid in the German Wikipedia:

<ref>www.news.de: [http://www.news.de/article Article].</ref>
<ref>www.news.de: ''[http://www.news.de/article Article]''.</ref>

This shouldn't be reported as an error. Would be nice to have this excluded somehow. Disabling the check would also disable the check for url= which would be a shame. Here is an idea for an extended regex (not tested).

/(?:<ref\b[^<>]*>|url\s*=)\s*www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i

--TMg 17:10, 19 January 2014 (UTC)

I had to drop checking for cases with |url=. There were infoboxes which required external links not have http://. That should make the regex a little easier. I currently have:
/(<ref>\s*\[?www\.)/
I'm not yet catching named refs, which you do. Bgwhite (talk) 06:10, 20 January 2014 (UTC)
Unfortunately this will cause the same false positives. Here is my regex again without the url= option.
/<ref\b[^<>]*>\s*\[?www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i
--TMg 09:51, 20 January 2014 (UTC)
Yes, I know it will cause the same false positives. I was only giving the reasons why for the current status of the regex, including dropping url. Bgwhite (talk) 22:17, 20 January 2014 (UTC)
It does work, but it has a hitch. For example, it does find an error in Central Philippine University, Ciclosporin and Gravity Rush. However, it reports the error at the end of the article. The hitch happens with the entire regex or just /(<ref\b[^<>]*>\s*\[?www\.)/. I'm off to bed. Bgwhite (talk) 09:10, 21 January 2014 (UTC)
Not sure what you mean with "hitch". Maybe it's because I removed the brackets but you are relying on them? Let's re-add them:
/(<ref\b[^<>]*>\s*\[?www\w*\.)(?![^<>[\]{|}]*\[\w*:?\/\/)/i
This matches:
But it does not match my two examples above. I'm happy. :-) --TMg 21:24, 21 January 2014 (UTC)
The "hitch"... for some articles, the regex does not tell where the error is found. It just reports the last bracket in the article. See [1] and look at the notice column. Bgwhite (talk) 21:52, 21 January 2014 (UTC)
I see. That's an upper/lowercase problem. The index() call is case-sensitive but gets $1 lowercased. --TMg 22:09, 21 January 2014 (UTC)
Current:

my $test_text = $lc_text;

# $1 is taken from the lowercased copy, but index() searches the original text
if ( $test_text =~ /(<ref\b[^<>]*>\s*\[?www\.)/ ) {
    my $pos = index( $text, $1 );
    error_register( $error_code, substr( $text, $pos, 40 ) );
}

Suggestion:

# Match the original text case-insensitively and use the match offset directly
if ( $text =~ /<ref\b[^<>]*>\s*\[?www\w*\.(?![^<>[\]{|}]*\[\w*:?\/\/)/i ) {
    my $pos = $-[0];
    error_register( $error_code, substr( $text, $pos, 40 ) );
}
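
The upper/lowercase problem described above can be reproduced outside checkwiki. The following is a minimal Python sketch, not the Perl code itself; the function names are made up for the illustration. Python's str.find() returns -1 when nothing is found, much like Perl's index(), and in Perl a substr() with offset -1 then starts at the last character of the text, which fits the "last bracket" notices.

```python
import re

# The "current" pattern, applied to a lowercased copy of the text.
LOWERCASED = re.compile(r"(<ref\b[^<>]*>\s*\[?www\.)")

# The suggested pattern, applied case-insensitively to the original text.
SUGGESTED = re.compile(
    r"<ref\b[^<>]*>\s*\[?www\w*\.(?![^<>\[\]{|}]*\[\w*:?//)", re.IGNORECASE
)


def position_current(text: str):
    """Match on a lowercased copy, then look the match up in the original text."""
    m = LOWERCASED.search(text.lower())
    if m is None:
        return None
    return text.find(m.group(1))  # -1 when the original differs in case


def position_suggested(text: str):
    """Match the original text and take the offset from the match itself."""
    m = SUGGESTED.search(text)
    return None if m is None else m.start()
```

For "Intro <ref>WWW.Example.com</ref>" the first function returns -1 while the second returns 6, and the two www.news.de examples from the start of this section are not matched by the suggested pattern.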

Error #37

It was suggested to exclude all pages where adding DEFAULTSORT doesn't make a difference. Redirects are an example. If a page neither

  • contains a template (templates may set categories and therefore may require DEFAULTSORT) nor
  • contains a category with no sort key (e.g. [[Category:Ä]] requires DEFAULTSORT but [[Category:Ä|A]] does not)

it can be skipped. The following line of code should do that (again, not tested). --TMg 20:24, 20 January 2014 (UTC)

if ( index( $text, '{{' ) >= 0 or $text =~ /\[\[($cat_regex):[^[|\]]+\]\]/i ) {
    # Do the check
}
Suggested by whom and where?
For #37, articles and redirects are already skipped if there are no categories. Bgwhite (talk) 22:11, 20 January 2014 (UTC)
Discussed here. This is an example for a page where all categories already contain a sort key. Adding DEFAULTSORT does not change anything. Currently error #37 reports about 14,000 pages in the German Wikipedia. It would help if we could remove such cases that aren't actual errors. Just for now. We could re-add this later. --TMg 00:48, 21 January 2014 (UTC)
Ok, it now makes sense what you are asking. Short answer... No. Long answer... This has been asked several times before. Ideally, defaultsort should be added and any identical sorts in the categories removed. AWB does do this already. Magioladitis recently finished up all 90,000 missing defaultsorts in enwiki via a bot using AWB. In the long run, this would be the best solution. Bgwhite (talk) 07:29, 21 January 2014 (UTC)
I understand and I agree that all pages should use DEFAULTSORT in the long run. But this is not how things work in the German Wikipedia right now. There is no consensus to use bots for such trivial tasks in dewiki. As I said: It would help the German Checkwiki users a lot to be able to focus on actual errors first. You can add the additional check above for dewiki only. If the current 14,000 reported errors are down to 100 (or something like that) we can remove the check. By the way, I spent several hours updating the German localization. Just to let you know. --TMg 21:47, 21 January 2014 (UTC)

Error #39 (again)[edit]

Hi all. In Demons (novel) the section headed "Characters" employs paragraphs within a bulleted list. This has been coded per the advice given here, but Yobot (and, I think, other AWB-based robots) persists in making "corrections": [2] [3] [4] [5] [6] [7] and so on. Aside from destroying the logical structure of the section, this is also contrary to accessibility guidelines.

I note that the detection of error #39 has already been modified to accept the use of <p>s within certain tags, such as <blockquote>. Can this tolerance be extended to include <p>s within lists?

(I was uncertain whether to raise this concern here, with Yobot, or with AWB. If I've chosen the wrong place, could you please let me know, and I'll try again.) In the meantime, thanks for your collective good work with checkwiki: fighting the good fight, and at scale! — Simon the Likable (talk) 13:49, 10 February 2014 (UTC)

Simon the Likable hi. Thanks for starting the discussion. I was not aware of this problem. Bots tend to revisit a page unless something is changed. -- Magioladitis (talk) 13:54, 10 February 2014 (UTC)
Simon the Likable can you please check if you like my version? -- Magioladitis (talk) 13:57, 10 February 2014 (UTC)
This is both a Checkwiki and AWB issue, so having a discussion at either spot is just fine.
@Graham87: As this is also an accessibility issue, Graham is the one to ask. Current version of Demons (novel) uses * and : to create paragraphs inside lists. This version uses * and standard html paragraph tag. Is the current version acceptable or should the older version be used? Bgwhite (talk) 18:24, 10 February 2014 (UTC)
@Simon the Likable, Magioladitis, Bgwhite: The older version is better, but even there, the gaps between the list items would need to be removed. In the newer version, the HTML lists finish at the end of each paragraph (as can be seen by checking the HTML source). It might be easier to use HTML rather than wiki-markup to create the lists. Graham87 01:13, 11 February 2014 (UTC)
Sorry guys but on my laptop, both versions have the same visual result. I must be semi-blind or something. This happens to me after working on my laptop for several hours. Can someone explain to me what the visual differences are? Thanks, Magioladitis (talk) 06:59, 11 February 2014 (UTC)
Visually they are the same. On a screen reader, the version using : breaks up the list. The first item (the one that has the <p> tags in the other version) appears as a one-item list to a screen reader. Bgwhite (talk) 07:12, 11 February 2014 (UTC)
Thanks Magioladitis. As Bgwhite has outlined, your solution is impeccable visually, but will not allow visually impaired readers good access using a screen reader. I have therefore reverted your change (reinstating the <p>s), but have also taken on board Graham87's point and removed the blank lines between list items. Thus, I think the current version covers both visual and accessibility requirements, and follows recommended coding practices in Help:List#Paragraphs_in_lists and now WP:LISTGAP.
This leaves open my original issue: checkwiki and AWB both regard this recommended markup as an error. Can checks for error #39 be modified to accept the use of <p>s within lists? (Or perhaps there is some other solution?) — Simon the Likable (talk) 13:59, 11 February 2014 (UTC)
Thanks Simon; sounds good here now! Graham87 14:03, 11 February 2014 (UTC)
Hey guys. Any chance that this is a Mediawiki bug and we should report it? -- Magioladitis (talk) 14:06, 11 February 2014 (UTC)
I looked at the source code for the latest version and Magioladitis' version. It does not appear to be a bug. In the latest version of Demons (novel), it is one long list made up of <li> tags. If a blank line happens, the list ends. In Magioladitis' version, it starts as a list. When the first : happens, the list is ended. The HTML tags to produce the layout for the : consist of <dl> and <dd> tags. The use of the dl and dd tags is standard HTML practice when text needs varying indentation. The source for this talk page is full of dl and dd tags. Bgwhite (talk) 06:23, 12 February 2014 (UTC)
Yes, both Checkwiki and AWB should not call this an error. Finding a solution is another matter. My brain isn't coming up with an answer. For the time being, I've added the article to a whitelist, so Checkwiki will not find a <p> error in the article. Bgwhite (talk) 06:23, 12 February 2014 (UTC)
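
One possible direction, sketched here in Python and not taken from Checkwiki or AWB: report a <p> tag only when the line it sits on is not a list line. The function name and the rule "a list line starts with *, #, : or ;" are assumptions made for illustration, and the sketch ignores the tolerance for <blockquote> and similar tags that the real check already has.

```python
import re

P_TAG = re.compile(r"<p\b[^<>]*>", re.IGNORECASE)
# Assumption: a wikitext list line starts with one of these markers.
LIST_LINE = re.compile(r"^[*#:;]")


def paragraph_tag_lines(text):
    """Return 1-based numbers of lines with a <p> tag outside list lines."""
    return [
        number
        for number, line in enumerate(text.splitlines(), 1)
        if P_TAG.search(line) and not LIST_LINE.match(line)
    ]


sample = "* Item one<p>Second paragraph</p>\nPlain <p>text</p>\n"
print(paragraph_tag_lines(sample))  # [2]
```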

ISBN-check[edit]

It would be very helpful if the check could recognize and ignore

  • ISBNistFormalFalsch=J
Example: de:Erich Burgener - {{Literatur | Autor=Bertrand Zimmermann | Titel=Erich Burgener | Verlag= Editions de la Thèle| Ort=Yverdon-les-Bains | Jahr=1987 | ISBN=2-8283-0024 | ISBNistFormalFalsch=J }}
  • http://xxxxx/isbn/282830024

--Tsor (talk) 09:09, 2 March 2014 (UTC)

Tsor, as usual, I'm confused. Why give a bad ISBN in the first place? I did a Google search and only two non-Wikipedia derived websites give this number and one of them is Wikipedia. Bgwhite (talk) 23:43, 2 March 2014 (UTC)
Hello Bgwhite, this is just a (bad) example. Sometimes we find in a book an ISBN which is formally wrong. Some guys use the template Vorlage:Literatur where they can mark such invalid ISBNs by "ISBNistFormalFalsch=J". There is another template Vorlage:Falsche ISBN which can mark such invalid ISBNs: {{Falsche ISBN|3-123-45678-9}} leads to "ISBN 3-123-45678-9 (formal falsche ISBN)". This template is used very often: https://de.wikipedia.org/wiki/Spezial:Linkliste/Vorlage:Falsche_ISBN
I will look for a better example for an invalid ISBN. --Tsor (talk) 10:10, 3 March 2014 (UTC)
PS: An additional column in the error-list "marked as invalid" would help. --Tsor (talk) 10:18, 3 March 2014 (UTC)
Tsor, I'm slow, but I still fail to see what is wrong. It would be best to use a correct ISBN? A better example would help me understand. TMg, could you help me out.
There are whitelists to which articles can be added so they won't be raised as an error again. Too many things can go wrong with a "marked as invalid" button... There is already a problem of vandalism by people clicking done when they have no intention of fixing errors. Bgwhite (talk) 3 March 2014 (UTC)
Here are 349 examples. --Tsor (talk) 11:10, 3 March 2014 (UTC)
I just looked at the first one in the list, de:Charles de Melun and I don't understand why the ISBN is qualified as bad: the checksum is correct. Is it normal to have "ISBNistFormalFalsch=J" with an ISBN that seems correct? Edit: idem for second example de:Bussard (Einheit). --NicoV (Talk on frwiki) 12:26, 3 March 2014 (UTC)
Hmm, you are right, in de:Charles de Melun the ISBN is marked as bad but it is ok. Same for your second example. I will have a closer look. --Tsor (talk) 13:26, 3 March 2014 (UTC)
Please repeat your calculation. The checksum digit is false: if the first 9 digits are correct, the checksum digit at the end should be a 1, so the ISBN should be 2902091311 and not 2902091312. --Cepheiden (talk) 19:15, 5 March 2014 (UTC)
Well, you're just not looking at the version I was looking at: the page was modified since my comment and the ISBN was changed completely: an ISBN-13 with a coherent checksum was replaced by an ISBN-10 with a non-coherent checksum. --NicoV (Talk on frwiki) 20:21, 5 March 2014 (UTC)
I'm sorry, you are right i didn't notice the edit. --Cepheiden (talk) 17:48, 8 March 2014 (UTC)
I also looked at others; a lot seem to be in the same situation. There are also cases where the ISBN does indeed have a wrong checksum, but the book can be found with the correct ISBN on the internet: de:Mare Imbrium and the corresponding book on google. I've spent quite some time on frwiki fixing ISBNs reported by CW (still quite some work to do), but I've found very few situations where the ISBN with the incorrect checksum was confirmed as being the ISBN (it's usually fixed at some point). --NicoV (Talk on frwiki) 15:51, 3 March 2014 (UTC)
Yes, there are cases of ISBNs with false checksum digits used as the original ISBN (printed in the book and listed in databases of libraries etc.). If someone cites this book with this ISBN we mark them as "formally false" like some libraries do. So what's the point here? --Cepheiden (talk) 19:15, 5 March 2014 (UTC)
My point was that I was surprised by the size of the list (349 pages), because as I said, I fixed a lot of ISBNs on frwiki, and didn't find so many situations where the ISBN with the non-coherent checksum had to be kept. Given that the first hits in the search seemed to be mistakes, I was wondering if it was normal that you have so many pages with ISBNs tagged as formally false. --NicoV (Talk on frwiki) 20:26, 5 March 2014 (UTC)
This was more a reply to Bgwhite (like Tsor already did). --Cepheiden (talk) 17:48, 8 March 2014 (UTC)
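
The checksum calculation referred to above can be checked with a short script. This is a generic Python sketch of the standard ISBN-10 and ISBN-13 check digit rules, not the Checkwiki implementation, and the function names are made up. For the first nine digits 290209131 the weighted sum with weights 10 down to 2 is 175, the remainder modulo 11 is 10, and so the check digit is 11 - 10 = 1.

```python
def clean(isbn):
    """Drop hyphens and spaces, and normalise a trailing x to X."""
    return isbn.replace("-", "").replace(" ", "").upper()


def isbn10_check_digit(first9):
    """Weights 10..2 over the first nine digits, modulo 11; 10 is written X."""
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    value = (11 - total % 11) % 11
    return "X" if value == 10 else str(value)


def isbn13_check_digit(first12):
    """Alternating weights 1 and 3 over the first twelve digits, modulo 10."""
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def is_valid_isbn(isbn):
    s = clean(isbn)
    if len(s) == 10 and s[:9].isdigit():
        return s[9] == isbn10_check_digit(s[:9])
    if len(s) == 13 and s.isdigit():
        return s[12] == isbn13_check_digit(s[:12])
    return False  # wrong length, e.g. the nine-digit 2-8283-0024


print(isbn10_check_digit("290209131"))  # 1
print(is_valid_isbn("2902091312"))      # False
```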

Just an example for the second point: http://www.randomhouse.ca/catalog/display.pperl?isbn=9780676978223 found in de:28 Stories über Aids in Afrika. --Tsor (talk) 22:08, 3 March 2014 (UTC)

It links to "Page not found", the correct link seems to be at http://www.randomhouse.ca/catalog/display.pperl?isbn=9780676978230 (different last 2 digits ISBN). --NicoV (Talk on frwiki) 22:29, 3 March 2014 (UTC)

Wondering about ID#84[edit]

Done

Hi, I saw that, at least for the German WP, there's a huge list of ID#84. But on virtually all pages this is because of captions that are commented out with <!-- and -->. The problem is that often the author did not put the opening comment tag on the same line as the caption, or commented out multiple captions, so the second and later ones are missing "their" opening tag. Any chance of a workaround for that? --StreifiGreif (talk) 17:37, 7 March 2014 (UTC)

StreifiGreif Known problem. I did have a fix for it and was in the code. The fix ended up causing a problem on a few sites. It caused the checkwiki program to crash. I'll look at it again in a few weeks. Bgwhite (talk) 21:52, 7 March 2014 (UTC)
StreifiGreif, this should be fixed now. Bgwhite (talk) 07:39, 26 March 2014 (UTC)
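
For illustration only, and not necessarily how the fix was done in Checkwiki: one way to avoid the problem is to remove comments across line boundaries before looking at headings at all. The Python sketch below uses made-up names and a simplified heading pattern.

```python
import re

# DOTALL lets a comment span several lines, so headings that sit between
# <!-- and --> on other lines are removed together with it.
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
HEADING = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def visible_headings(text):
    """Return heading titles that are not inside an HTML comment."""
    stripped = COMMENT.sub("", text)
    return [m.group(2) for m in HEADING.finditer(stripped)]


sample = (
    "== Intro ==\nText.\n"
    "<!--\n== Hidden one ==\n== Hidden two ==\n-->\n"
    "== End ==\nMore text.\n"
)
print(visible_headings(sample))  # ['Intro', 'End']
```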

2 servers, 2 scripts[edit]

Hi! It seems that now CheckWiki works parallel on 2 servers: toolserver.org and tools.wmflabs.org, and they are using:

Different language communities use different servers, but they translate the same descriptions, which do not always fit to the logic. It seems to be a problem.

So, e.g., error 042 searches for incorrect <small> tags on the one server and <strike> tags on the other. But they take the description of the error from the same page, which should be translated from the enwiki translation page. Another example is error 089, etc.

(I am from eowiki.) Yurij Karcev (talk) 06:38, 14 March 2014 (UTC)

Yurij Karcev, toolserver is dead and WMFLabs is its replacement. People have been given time to move their programs over to WMFLabs, which is why both are running. Toolserver will be turned off in about 3 months. I don't have access to toolserver, so I can't place any messages there.
WMFLabs is adding new errors and turning off some old ones. WMFlabs' checkwiki processes dump files every two weeks when available. Toolserver hasn't run on a dump in over a year. The translation page for eowiki has not been updated in a long time. Should the translation page be in English? If not, could you translate it into Esperanto? Bgwhite (talk) 08:08, 14 March 2014 (UTC)
Ok. This transition wasn't described clearly anywhere, and some CheckWiki's are still mentioning toolserver – for example, Russian, Spanish and some others. I'm just working on Esperanto CheckWiki, so have found this inconsistency.
Other problem is – when you change error number meaning in the script logic (see above 042), other language projects must synchronously change their translation pages. Now they don't. Maybe at least not to reuse numbers? Yurij Karcev (talk) 09:48, 14 March 2014 (UTC)
Translation pages have to be changed no matter what, so it is a moot point. I only speak English. The French, German, Greek and Swedish pages have been changed. Czech might have. I already got into a brouhaha in trying to change some stuff on the German page, was reverted and told Germans only, so I'm hesitant about changing other pages. If you know any other languages your help would be much appreciated. Bgwhite (talk) 17:57, 14 March 2014 (UTC)
@Bgwhite: Czech pages are changed ASAP. I understand Slovak, so I will change something on the Slovak pages. I had asked one user and she said she would translate the rest. Matěj Suchánek (talk | cont.) 08:12, 15 March 2014 (UTC)
@Bgwhite: Esperanto: updated project page and error descriptions. Russian: updated project page, working on error descriptions. Suggestions:
  • Please add characters ĈĜĤĴŜŬĉĝĥĵŝŭ as correct for eowiki in errors 007 and 036;
  • Could you check error 055 – it finds too many strange errors in eowiki, dewiki, eswiki etc. Yurij Karcev (talk) 12:56, 21 March 2014 (UTC)
The characters have been added. Thank you for updating eowiki and ruwiki. Yes, those are strange 55 errors. I ran some articles thru checkwiki and it didn't produce 55 errors. What's stranger is checkwiki is not detecting any strange 55 errors during the daily runs. I just blanked 55 on dewiki and will see if the daily runs produce new errors. Bgwhite (talk) 18:36, 21 March 2014 (UTC)
So, on dewiki the daily run produces only real 055 errors. Also on eswiki. But on eowiki the daily run isn't adding anything at all. Is it turned off? Yurij Karcev (talk) 05:21, 25 March 2014 (UTC)
Yurij Karcev Daily runs are only done for enwiki, frwiki, eswiki and dewiki. They are the largest ones and most prone to a lot of changes. Upon request, arwiki was added. I can add eowiki if you like.
In theory, eowiki has two dumps per month created. Checkwiki runs on those two dumps. A listing of all dumps and schedules is located here. Bgwhite (talk) 05:43, 25 March 2014 (UTC)

Parsoid-based online-detection of broken wikitext[edit]

Greeting, wiki checkers!!

I plan to propose a GSOC project through Wikimedia this year, based around the idea of Parsoid-based online-detection of broken wikitext. The original idea of the project is defined here: to develop a tool that uses Parsoid to fix broken wikitext found while parsing wiki pages, and then to develop a user interface for editors to fix broken wikitext. But after a few discussions about the project with the Parsoid team, we found out that a tool already exists, Check Wikipedia. It lacks the fixup information that Parsoid generates while parsing wiki pages, so through my GSOC project we plan to integrate this information with your tool.

After having discussions with parsoid devs, I have written an application draft under my username GSOC Application 2014. I would be really thankful, if I get some feedback and we can have some discussion on the same. Hardik95 (talk) 21:30, 14 March 2014 (UTC)

Sounds good. Using parsoid to find all pages with broken wikitext would be a good first step.--Salix alba (talk): 08:34, 15 March 2014 (UTC)
Sorry for being late, I've been out sick for the past few days. Your idea does sound like a good idea. Anything I can do to help, just ask. The Checkwiki code is found here. Checkwiki.pl is the main detection script. It runs at http://tools.wmflabs.org/ and uses wmflabs' MySQL as the database. Both AWB and WPCleaner can retrieve specific Checkwiki errors to fix. Many errors can be corrected in bot mode while the rest have to be fixed manually. The List of errors page contains a listing of the Checkwiki errors and what program can correct each error. Bgwhite (talk) 20:43, 18 March 2014 (UTC)

Notice for #94 ?[edit]

Hi, it would be nice to have the "notice" column filled for #94 (like the text just before the isolated closing ref tag). I'm trying to fix them on frwiki, and when WPCleaner doesn't find the problem I don't know if it has been fixed since it has been detected or if there's a discrepancy between WPCleaner and CheckWiki script. --NicoV (Talk on frwiki) 21:51, 2 April 2014 (UTC)

Magioladitis, I'm working on Nico's request. 2010–11 Morecambe F.C. season was a bugger. AWB does not recognize a stray </ref> tag. It's in the "League table" section right at the end:
‡Hereford United deducted 3 points for fielding an unregistered player.</ref>[1]
Bgwhite (talk) 22:34, 14 April 2014 (UTC)
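
A rough idea of how such a notice could be produced, written in Python as an illustration and not taken from Checkwiki.pl: walk through the ref tags in order, keep a count of open ones, and when a closing tag arrives with nothing open, return the text just before it. The function name and the 40-character width are assumptions.

```python
import re

# Matches <ref ...>, <ref ... /> and </ref>; group 1 is the closing slash,
# group 2 the self-closing slash.
REF_TAG = re.compile(r"<(/?)ref\b[^<>]*?(/?)>", re.IGNORECASE)


def stray_closing_ref_notice(text, width=40):
    """Return the text before the first unmatched </ref>, or None."""
    depth = 0
    for m in REF_TAG.finditer(text):
        closing, self_closing = m.group(1), m.group(2)
        if self_closing:
            continue  # <ref name="x" /> neither opens nor closes anything
        if closing:
            if depth == 0:
                start = max(0, m.start() - width)
                return text[start:m.end()]
            depth -= 1
        else:
            depth += 1
    return None


sample = (
    "Fact.<ref>Source</ref> ‡Hereford United deducted 3 points "
    "for fielding an unregistered player.</ref>"
)
print(stray_closing_ref_notice(sample))
```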

<includeonly>...</includeonly> and #48[edit]

Hi, should we detect #48 (internal links to the title) when they are inside <includeonly>...</includeonly> tags ? On frwiki, all articles in fr:Catégorie:Effectif actuel de franchise de la LNH are included in other articles, so they have a link to themselves inside a <includeonly>...</includeonly>. --NicoV (Talk on frwiki) 08:43, 13 April 2014 (UTC)

Magioladitis, do you have answer? Bgwhite (talk) 21:11, 14 April 2014 (UTC)
NicoV My answer is that we should not fix them. AWB right now won't fix 48 in a page that has noinclude/includeonly even when the 48 error is outside the area. I would like to fix 48 errors when they are outside the includeonly tags because many pages contain empty includeonly tags, or sometimes the tags are the result of a copy-pasted navbox/infobox. -- Magioladitis (talk) 05:02, 15 April 2014 (UTC)
Bgwhite,Magioladitis I agree about not fixing them, so maybe we should not detect them also ;-) ? I've modified WPCleaner so that it still detects them everywhere (to be coherent with Labs), but it doesn't suggest to fix them when they are inside includeonly tags (I don't check if there are noinclude/includeonly tags somewhere else). --NicoV (Talk on frwiki) 08:27, 15 April 2014 (UTC)
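
The behaviour described for WPCleaner (detect everywhere, but know whether the link is inside includeonly) could look roughly like the Python sketch below. It is not WPCleaner's or Checkwiki's code. The function name is hypothetical and the title match is deliberately simplified: it ignores first-letter case and underscores, which MediaWiki treats as equivalent.

```python
import re

INCLUDEONLY = re.compile(
    r"<includeonly>.*?</includeonly>", re.IGNORECASE | re.DOTALL
)


def self_links(text, title):
    """Return (link, inside_includeonly) for each link to the page itself."""
    link = re.compile(
        r"\[\[\s*" + re.escape(title) + r"\s*(?:\|[^\[\]]*)?\]\]"
    )
    protected = [m.span() for m in INCLUDEONLY.finditer(text)]
    results = []
    for m in link.finditer(text):
        inside = any(s <= m.start() and m.end() <= e for s, e in protected)
        results.append((m.group(0), inside))
    return results


sample = (
    "<includeonly>[[Boston Bruins]]</includeonly> "
    "The [[Boston Bruins|Bruins]] play."
)
print(self_links(sample, "Boston Bruins"))
```

A tool could then report both hits but only offer a fix for the ones where the second value is False.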

Question about #11[edit]

Hi, what HTML named characters are excluded from the search in #11? I figure dagger, emdash and endash are excluded because they got their own error. But, are there other characters excluded? (like nbsp, emsp, ...). --NicoV (Talk on frwiki) 11:22, 13 April 2014 (UTC)

And I would like to know which ones are included just to make sure AWB fixes all of them. :) -- Magioladitis (talk) 11:29, 13 April 2014 (UTC)
Included would be the correct term. From the code:
# See http://turner.faculty.swau.edu/webstuff/htmlsymbols.html
our @HTML_NAMED_ENTITIES = qw( aacute acirc aelig agrave aring auml bull ccedil cent copy dagger euro hellip iexcl iquest lsquo middot minus ntilde oline ouml pound quot reg rsquo sect sup2 sup3 szlig trade uuml crarr darr harr larr rarr uarr );
Bgwhite (talk) 20:26, 13 April 2014 (UTC)
Thanks! I will have to exclude a few from my current list. Question: you don't have the uppercase accented letters ? (like Aacute ?) --NicoV (Talk on frwiki) 20:37, 13 April 2014 (UTC)
AWB check on Bgwhite's list: [8]. -- Magioladitis (talk) 20:42, 13 April 2014 (UTC)
Should I add more from WPCleaner's list or any others? Bgwhite (talk) 20:46, 13 April 2014 (UTC)
Bgwhite, NicoV AWB has a white list of html entities that should not be replaced because they "look bad if changed". These are "ndash|mdash|minus|times|lt|gt|nbsp|thinsp|zwnj|shy|lrm|rlm|[Pp]rime|ensp|emsp|#x2011|#820[13]|#8239". There are some more exceptions for other reasons, found in Parsers.cs line ~60. You might want to have a look. -- Magioladitis (talk) 21:01, 13 April 2014 (UTC)

@Bgwhite: @Magioladitis: I tried to go through the list of existing HTML named entities to see which ones should be reported. What do you think of this list ? (I took the current list, added what seemed reasonable, and then removed the ones that are excluded by AWB.) --NicoV (Talk on frwiki) 23:00, 14 April 2014 (UTC)

NicoV, sounds good to me. After Magioladitis looks at the list, I'll add them. Bgwhite (talk) 23:49, 14 April 2014 (UTC)
Bgwhite I agree. -- Magioladitis (talk) 05:05, 15 April 2014 (UTC)
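
To make the "reported list minus AWB's exclusions" idea concrete, here is a small Python sketch. Both sets are shortened samples taken from the lists quoted in this section, not the final agreed list, and the function name is made up. Entity names are matched case-sensitively, so &Aacute; is not reported unless it is added explicitly.

```python
import re

# Sample of the names Checkwiki reports (subset of the list quoted above).
REPORTED = {"aacute", "agrave", "bull", "copy", "euro", "hellip",
            "minus", "ouml", "uuml", "rarr"}
# Sample of AWB's "look bad if changed" exclusions quoted above.
KEPT = {"ndash", "mdash", "minus", "times", "lt", "gt", "nbsp",
        "thinsp", "zwnj", "shy", "lrm", "rlm", "ensp", "emsp"}

ENTITY = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def reportable_entities(text):
    """Return named entities that are reported and not excluded by AWB."""
    return [
        m.group(0)
        for m in ENTITY.finditer(text)
        if m.group(1) in REPORTED and m.group(1) not in KEPT
    ]


sample = "&uuml;ber&nbsp;&mdash; &copy; 2014 &Aacute; 5 &minus; 3"
print(reportable_entities(sample))  # ['&uuml;', '&copy;']
```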

CHECKWIKI #81[edit]

Why is #81 off for enwp, has there been a discussion in the past which I was not a part of or...why? (tJosve05a (c) 00:01, 15 April 2014 (UTC)

From what I can find at this latest discussion here I can not see there being consensus for turning off #81
The "latest discussion" was about removing errors. #81 was never removed, it was turned off on enwiki. It was turned off 4-6 months ago. I can't remember the number, but there were over 20,000 articles with no hope of them being taken care of. It's also technically not an error. Bgwhite (talk) 04:57, 15 April 2014 (UTC)
Bgwhite what does this error exactly mean? I thought it was about having a reference list twice. -- Magioladitis (talk) 05:07, 15 April 2014 (UTC)
Magioladitis, no, that is error #78. #81 was if there were two identical references in an article. AWB would only fix a small subset of the errors. Bgwhite (talk) 05:12, 15 April 2014 (UTC)
Bgwhite true. AWB will only fix pages that already have a multiple reference once. -- Magioladitis (talk) 05:15, 15 April 2014 (UTC)
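
As an illustration of what "two identical references" means in practice, the Python sketch below collects the bodies of all <ref>...</ref> pairs and returns those that occur more than once. It is a guess at the general approach, not the Checkwiki code: the function name is hypothetical, and collapsing whitespace before comparing is an assumption.

```python
import re
from collections import Counter

# Opening tag (not self-closing), body, closing tag.
REF = re.compile(r"<ref\b[^<>]*(?<!/)>(.*?)</ref>", re.IGNORECASE | re.DOTALL)


def duplicate_references(text):
    """Return reference bodies that appear more than once in the text."""
    counts = Counter(" ".join(body.split()) for body in REF.findall(text))
    return [body for body, n in counts.items() if n > 1]


sample = (
    "A<ref>Smith 2010, p. 5</ref> B<ref name=\"x\">Jones</ref> "
    "C<ref>Smith 2010,  p. 5</ref> D<ref name=\"x\" />"
)
print(duplicate_references(sample))  # ['Smith 2010, p. 5']
```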