Wikipedia talk:WikiProject Check Wikipedia

Adjacent references ?

Hi, what do you think of adding a detection for adjacent references, like <ref>...</ref><ref>...</ref><ref>...</ref> ? This error probably won't be of any interest for enwiki because reference numbers are put between square brackets [1][2][3]. But on frwiki reference numbers are displayed without any decoration so adjacent references may look like only one reference 123, so we're generally using a template {{,}} between references. --NicoV (Talk on frwiki) 22:14, 27 May 2014 (UTC)

NicoV, could you get me some articles with the problem as test subjects? <maniacal laugh> Test Subjects </maniacal laugh> I take it I need to look for cases of: </ref><ref> and <ref name=ack /><ref ? I also saw your message above about adding to the done pages. Bgwhite (talk) 05:29, 28 May 2014 (UTC)
Ok, will try to find some... The subject was brought on WPCleaner's talk page for this modification, but the page is fixed now. --NicoV (Talk on frwiki) 07:17, 28 May 2014 (UTC)
Bgwhite, I checked a lot of articles but I haven't found another example yet... --NicoV (Talk on frwiki) 12:16, 28 May 2014 (UTC)
fr:Utilisateur:Zetud/Pb Ref should have a list. --NicoV
Bgwhite, fr:Leetchi, with at least 2 problems in the introduction. --NicoV (Talk on frwiki) 07:34, 2 July 2014 (UTC)
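
A minimal sketch of the check discussed in this thread, written in Python rather than the Perl of checkwiki.pl. The pattern and the function name `find_adjacent_refs` are illustrative assumptions, not the project's code. It flags a closing `</ref>` or a self-closing `<ref ... />` that is directly followed by another `<ref`, and leaves references separated by `{{,}}` alone.

```python
import re

# Assumed pattern: "</ref>" or a self-closing "<ref ... />", optionally followed
# by whitespace, with another "<ref" tag coming right after it.
# "\b" keeps "<references />" from being treated as a "<ref" tag.
ADJACENT_REFS = re.compile(
    r"(?:</ref\s*>|<ref\b[^<>]*/\s*>)\s*(?=<ref\b)",
    re.IGNORECASE,
)


def find_adjacent_refs(wikitext):
    """Return the start offset of every boundary between two touching references."""
    return [m.start() for m in ADJACENT_REFS.finditer(wikitext)]


sample = "A<ref>x</ref><ref>y</ref><ref name=ack /><ref>z</ref> B<ref>q</ref>{{,}}<ref>r</ref>"
print(len(find_adjacent_refs(sample)))
```

On a wiki that brackets its reference numbers this would mostly produce noise, so such a check would presumably be enabled per project.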

#14 false positives

Resolved

Two false positives at plwiki are reported. To remove such cases, you might check only for "<source ", not "<source", and skip code which is in a <source> by itself. ToSter (talk) 19:20, 21 October 2014 (UTC)

The solution is again "<source[^a-z]". -- Magioladitis (talk) 06:13, 22 October 2014 (UTC)

Magioladitis, Error #14 doesn't use a regex. It uses the same subroutine used for checking imbalanced nowiki, pre, comment, syntaxhighlight, code, math, hiero, and score. The regex also doesn't solve the problem with the articles ToSter mentioned. The problem with the articles... there are valid, unbalanced source tags inside source tags.
The following scenario is in ToSter's articles, where the second source is not an HTML source tag.
<source> [text] <source> [text] </source>
Problem is... how does one differentiate between ToSter's scenario and a scenario where the first <source> tag is actually missing a closing tag, especially when editors don't always put extra parameters inside source tags. Bgwhite (talk) 07:26, 22 October 2014 (UTC)
We also have false positives on frwiki, which don't seem to fall into the above category:
  • fr:Apache Ant: a <sourcePath> tag is detected as being a <source> tag
  • fr:Vidéo HTML5: there are 3 self-closing <source /> tags inside a <syntaxhighlight> tag. The third one is reported.
--NicoV (Talk on frwiki) 13:41, 25 October 2014 (UTC)

Bgwhite is this fixed somehow? I haven't seen any false positives for a long time. -- Magioladitis (talk) 08:33, 24 January 2015 (UTC)
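
For illustration only, here is a small Python sketch of a stricter tag-name match. As Bgwhite notes, error #14 does not use a regex, so this is not how Checkwiki works; the pattern and `count_source_opens` are assumptions. It shows why `<source[^a-z]` alone would not help with fr:Apache Ant: without a case-insensitive flag, the capital `P` in `<sourcePath>` satisfies `[^a-z]`. Requiring whitespace, `/` or `>` after the tag name avoids that.

```python
import re

# Assumed boundary rule: the tag name must be followed by whitespace, "/" or ">".
SOURCE_OPEN = re.compile(r"<source(?=[\s/>])", re.IGNORECASE)


def count_source_opens(text):
    """Count opening (or self-closing) <source> tags, ignoring longer tag names."""
    return len(SOURCE_OPEN.findall(text))


print(count_source_opens("<sourcePath>"))                    # longer tag name
print(count_source_opens('<source lang="xml">x</source>'))   # real source tag
print(bool(re.search(r"<source[^a-z]", "<sourcePath>")))     # the simpler regex
```

This does nothing for the nested or self-closing `<source />` cases described above; those need knowledge of the enclosing block, not a better tag pattern.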

Stripping pre tags

Hi,

<pre> tags are stripped only if they have no additional attributes. In pl:dmesg there's a pre block:

<pre style="height:20em; overflow-y:scroll">...</pre>

It's not getting stripped by checkwiki.pl (get_pre() function) so false positives are reported (like #56 in that case). ToSter (talk) 18:51, 12 November 2014 (UTC)

ToSter, personally, I'd remove the entire pre text. I don't see the benefit of a boot screen from a 6-year old version of Linux. Bgwhite (talk) 23:35, 12 November 2014 (UTC)
Bgwhite, that's right :) but still the problem can occur in another place. ToSter (talk) 06:13, 13 November 2014 (UTC)
ToSter, it can, but it is not. Also, this is what the whitelist is for. Bgwhite (talk) 21:54, 14 November 2014 (UTC)
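
A hedged sketch of what attribute-tolerant stripping could look like, in Python. This is not the get_pre() function from checkwiki.pl; `PRE_BLOCK` and `strip_pre` are hypothetical names, and the pattern assumes `<pre>` blocks are not nested.

```python
import re

# Assumed pattern: "<pre", a word boundary, any attributes up to ">", then the
# shortest possible body up to "</pre>". DOTALL lets the body span several lines.
PRE_BLOCK = re.compile(r"<pre\b[^>]*>.*?</pre\s*>", re.IGNORECASE | re.DOTALL)


def strip_pre(wikitext):
    """Remove <pre>...</pre> blocks, with or without attributes on the opening tag."""
    return PRE_BLOCK.sub("", wikitext)


print(strip_pre('a<pre style="height:20em; overflow-y:scroll">[[x\n</pre>b'))
```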

Add field with user edit

Hi, sorry for writing in Spanish. Could you add one more field showing the user name or IP that made the edit containing the detected error? Thanks, good work. (Translated from Spanish.) Sergio Andres Segovia (talk) 16:59, 30 November 2014 (UTC)

Seems possible but would require a lot of processing to find the particular edit. Frietjes (talk) 17:58, 30 November 2014 (UTC)
It's a shame that it requires a lot of processing, because if that field were added we could go straight to the contributions of the user or IP, and anyone with the rollback flag could revert the edits from there. On Spanish Wikipedia we tried to detect this with an edit filter, but it produced many false positives [1]. Regards. (Translated from Spanish.) Sergio Andres Segovia (talk) 19:05, 30 November 2014 (UTC)
I agree that it would be useful. You might be able to get a bot to do this for you? for example, I know that some bots like 'BracketBot' will warn you when you have introduced unbalanced brackets. of course, there is a difference between warning a user about 'breaking an article' and warning a user about using deprecated syntax. maybe you can ask the operator of BracketBot (A930913)? Frietjes (talk) 17:19, 1 December 2014 (UTC)
There is also Bracketbot's brother, ReferenceBot. Both are done by A930913. The two main differences between BracketBot and CheckWiki are: 1) Bracketbot checks articles in near real-time 2) Bracketbot informs the editor of the problem they created instead of reporting the error to a master database. In theory, CheckWiki can also be run in near real-time on individual articles. I would need help from A930913. His bot code would run normally except call CheckWiki to test an article instead of using the bot's checks. Bgwhite (talk) 19:53, 1 December 2014 (UTC)
@Bgwhite: Make a (web)script that I can ping with a pageid/title/diffid/oldid/user? ##930913 connect? 930913 {{ping}} 07:33, 2 December 2014 (UTC)

New error : empty titles ?

Hi, I was thinking about a new error for detecting empty titles, like the ones VE is creating on a regular basis (== <nowiki /> ==). --NicoV (Talk on frwiki) 18:10, 13 August 2014 (UTC)

NicoV, I did a scan for enwiki and came up with 83 articles. The VE edits all appear old. I wonder if they have fixed the problem in new VE builds? Bgwhite (talk) 22:31, 22 August 2014 (UTC)
Bgwhite, apparently it's still not fixed, the last VE edit I found with this problem is from last night. --NicoV (Talk on frwiki) 09:10, 23 August 2014 (UTC)
Thanks for the list Bgwhite, I've added error #522 to detect empty titles and fixed all the occurrences. --NicoV (Talk on frwiki) 12:21, 24 August 2014 (UTC)

And also, of the same kind, a new error for empty internal links, like in this edit ([[Boom Fm|<nowiki/>]] and [[Roger Blackburn|<nowiki/>]]). --NicoV (Talk on frwiki) 10:11, 23 August 2014 (UTC)

This was fixed in Visual Editor. No new cases have been found over the past few months. Bgwhite (talk) 23:51, 28 January 2015 (UTC)
Bgwhite, not at all. A few examples just in the last 24h (nowiki tags):
And maybe another problem with things like that: [[XX|YY ]]<nowiki/>ZZ which could be easily replaced by [[XX|YY]] ZZ
--NicoV (Talk on frwiki) 13:02, 29 January 2015 (UTC)

@NicoV and Magioladitis: According to Tech News: 2015-14, the problem of nowiki in titles has been fixed. Of course, what new untold problems have arisen due to their fix has yet to be seen. Bgwhite (talk) 20:28, 30 March 2015 (UTC)

Well, when you read in the same announcement that "VisualEditor is now the main editing tool on 53 more Wikipedias", you can't take it really seriously as even on wikis where it has been enabled by default for almost 2 years, it's still far away from being the "main editing tool"... ;-) --NicoV (Talk on frwiki) 20:56, 30 March 2015 (UTC)
NicoV When I read that sentence for the first time, thoughts of dread and pity for those 53 sites went thru my mind. I also wondered what "phase 5" meant. mw:VisualEditor/Rollouts explains what each phase means. They have enwiki as a phase 0, which is, "... wikipedias that have been closed or deprecated". Ahhh, Visual Editor... always good for a laugh and a cry. Bgwhite (talk) 21:25, 30 March 2015 (UTC)
You didn't know ? When enwiki made its push to make VE opt-in, they closed enwiki ;-) Currently, we're not editing enwiki, it's a decoy... In the rollouts, I also liked very much the sentence that wikis in phases 1 to 4 "are relatively easy for VE to support"... --NicoV (Talk on frwiki) 21:35, 30 March 2015 (UTC)
VE is supporting phases 1 thru 4 very well. It's rather obvious. From day one, VE has supported goofs, foul-ups, mistakes and barfs. Bgwhite (talk) 21:45, 30 March 2015 (UTC)
I don't know if they deployed it, but empty titles are still created like here (without nowiki tag). --NicoV (Talk on frwiki) 16:48, 1 April 2015 (UTC)
still nowiki in titles..., and frwiki is running 1.25wmf23, the version identified as fixing the problem in all bug reports... --NicoV (Talk on frwiki) 15:54, 2 April 2015 (UTC)
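
The two patterns raised in this thread (headings that contain only `<nowiki />`, and links whose label is only `<nowiki/>`) can be sketched in Python as below. This is an illustration under assumptions, not the detection actually added as error #522; the names `EMPTY_HEADING`, `EMPTY_LINK` and `find_empty_markup` are hypothetical.

```python
import re

# Heading whose content is only spaces, tabs and <nowiki /> tags.
# At least one separator is required so a bare "====" line is not flagged.
EMPTY_HEADING = re.compile(
    r"^(={1,6})(?:[ \t]|<nowiki[ \t]*/>)+\1[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)

# Piped link whose label consists only of <nowiki/> tags.
EMPTY_LINK = re.compile(
    r"\[\[[^\[\]|]+\|\s*(?:<nowiki\s*/>\s*)+\]\]",
    re.IGNORECASE,
)


def find_empty_markup(wikitext):
    """Return the empty headings and empty-label links found in the text."""
    return {
        "headings": [m.group(0) for m in EMPTY_HEADING.finditer(wikitext)],
        "links": [m.group(0) for m in EMPTY_LINK.finditer(wikitext)],
    }


sample = (
    "== <nowiki /> ==\n"
    "== Real title ==\n"
    "[[Boom Fm|<nowiki/>]] and [[Roger Blackburn|label]]"
)
print(find_empty_markup(sample))
```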

False positive in Error #85

In ca:Brainfuck, there are <code> tags between <center> tags. However, the tool is flagging it as if it were empty.

--Joutbis (talk) 11:20, 22 February 2015 (UTC)

Same kind of false positive for frwiki: fr:Messiah with <score> tags between <center> tags, and fr:Tiret with <code> tags between <center> tags. --NicoV (Talk on frwiki) 22:27, 22 February 2015 (UTC)
Joutbis NicoV See discussion two above this one... Anything between comment, math, nowiki, code, pre, source, hiero and score tags gets removed before checks take place.
All right, will do.--Joutbis (talk) 19:50, 28 February 2015 (UTC)
In ca:Brainfuck's case, <center> tags are not to be used like that in tables. This is a case of doing center properly. I did edit Brainfuck to do tables properly. fr:Tiret has the same problem, well actually, it is full of fail (scope="col" is redundant, <font> and <tt> are obsolete).
In the case of fr:Messiah, if it was on enwiki, I'd use the {{center}} template. That does the proper thing anyway instead of using the obsolete <center> tag. Bgwhite (talk) 06:38, 23 February 2015 (UTC)

Whitelist (sv.wikipedia.org) for error #34

Can I get instances such as {{#expr:{{Stat/Finland/Kommuner/Befolkning|Föglö}}/{{Stat/Finland/Kommuner/Areal land|Föglö}} round 2}} whitelisted on svwp, since this is used to automatically update population numbers in articles such as sv:Föglö. I don't know how to do it, or what to do... (tJosve05a (c) 18:57, 23 February 2015 (UTC)

Josve05a This can be handled two ways and it depends on how many articles you are talking about. If it is not "a lot", then add the articles to a whitelist. If there are a lot, then I can add it to the code.
If you are using a whitelist, look at Wikipedia:WikiProject Check Wikipedia/Translation and see how it is done for enwiki (search for "whitelist"). #34 on enwiki does have a whitelist. Frwiki also has whitelists set up. Bgwhite (talk) 19:26, 23 February 2015 (UTC)
@Bgwhite: This should be used a lot, at least for all populated places in Sweden with population numbers at Statistiska centralbyrån, since a bot updates those automatically. Not all articles are using this system yet, but more and more are. (tJosve05a (c) 19:29, 23 February 2015 (UTC)
I've also seen constructions like that on frwiki, but I don't like having calculations in articles. Another solution would be to use a template to do the computation instead of putting the #expr directly in the article. --NicoV (Talk on frwiki) 19:47, 23 February 2015 (UTC)

False positive on error #37

de:Bělá (Divoká Orlice) is stated as not bearing a sort key, but in fact this key is given with the parameter SORTNAME in template de:Vorlage:Infobox Fluss. Adding a defaultsort parameter to the article itself results in a warning message that the previous sort key has been overwritten. So the sort key seems to be valid. I don’t know if there are other templates affected which are listed in de:Kategorie:Vorlage:mit Kategorisierung. --Hadibe (talk) 17:29, 28 February 2015 (UTC)

@Magioladitis: Hadibe, grrrr.... sort values shouldn't be in Infoboxes. Magioladitis is the one to ask about this. Bgwhite (talk) 05:35, 3 March 2015 (UTC)

List with empty title

Hi Bgwhite, at least on frwiki list for error #25, there was only one line, and it contains an empty title and time found 0000-00-00 00:00:00. The "Done" button does nothing. The "Set all articles as done" works, and the empty title appears now in the list of done articles. --NicoV (Talk on frwiki) 19:41, 10 March 2015 (UTC)

Same for error 59, but I left it as it is. --NicoV (Talk on frwiki) 19:43, 10 March 2015 (UTC)
Same for error 85. --NicoV (Talk on frwiki) 20:10, 10 March 2015 (UTC)

False positive for #31

Hi Bgwhite, on frwiki there are several false positives with things like <trl>, <trois>, <trk>, <transformers, <transmission, <traduction, <track>, ... Would it be possible to limit the detection ? For example, detect only <tr when followed by a space, a "/", a ">", ... but not by a letter ? --NicoV (Talk on frwiki) 09:12, 11 March 2015 (UTC)

NicoV, I'll take a look, but it will be a couple of weeks till I can get to it. Bgwhite (talk) 05:10, 16 March 2015 (UTC)
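
NicoV's suggestion (accept `<tr` only when followed by a space, a `/` or a `>`) could look like this in Python. The pattern and `has_tr_tag` are assumptions for illustration; the real check in Checkwiki may be structured differently.

```python
import re

# "<tr" counts as a table-row tag only if whitespace, "/" or ">" comes next.
TR_TAG = re.compile(r"<tr(?=[\s/>])", re.IGNORECASE)


def has_tr_tag(text):
    """True if the text contains an HTML <tr> tag rather than a longer tag name."""
    return TR_TAG.search(text) is not None


for candidate in ["<trl>", "<trois>", "<trk>", "<transformers", "<track>", "<tr>", '<tr style="x">']:
    print(candidate, has_tr_tag(candidate))
```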

False positives for error #103

Hi, I think that the script is not doing what user NicoV requested:

it should not detect articles where {{!}} is used in the displayed text of the link.

There are a few examples in cawiki: many train stations, like ca:Estació de Bogatell or ca:Llista de cançons del DJ Hero 2 (this one took several tries to fix error #32, and now it's back!).

I'm not sure if this would cover all the false positives, but I think that, if a | is already there, it should allow several {{!}}'s.

--Joutbis (talk) 09:35, 13 March 2015 (UTC)

Joutbis Yes, those are false positives. The DJ Hero 2 article contains M|A|R|R|S, which is also in several English articles too. I've seen errors that also had | inside a wikilink. I'd say add it to the whitelist for now and I'll take a look at it. I've been gone for the best part of 2 weeks, so I need to catch up on things first before diving into the code. Bgwhite (talk) 05:07, 16 March 2015 (UTC)
Bgwhite I fixed M|A|R|R|S using {{pipe}}. -- Magioladitis (talk) 20:08, 16 March 2015 (UTC)
Good idea, thanks! --Joutbis (talk) 19:54, 17 March 2015 (UTC)

Joutbis I created the template in Catalan Wikipedia! 10 -- Magioladitis (talk) 20:10, 17 March 2015 (UTC)

Magioladitis, thanks! However, it doesn't work 100% of the time. It's OK for the M|A|R|R|S thing, and for train stations, but not in some (brain-damaged, granted) templates, which wrap square brackets around some of the parameters. See ca:Papa Bonifaci II, at the end.--Joutbis (talk) 16:27, 11 April 2015 (UTC)

Joutbis Hm... I can't fight with that.. -- Magioladitis (talk) 16:30, 1 August 2015 (UTC)

Error n°54 false positive

Yuri (genre) is a false positive. The break is in a reference. Jerodlycett (talk) 13:36, 30 March 2015 (UTC)

The mistake in de:Hilfsfrist. Is there any way else to avoid collecting these articles in WPSK than to separate the ref group entries? --Hadibe (talk) 10:47, 28 October 2015 (UTC)

False positive for #60 ?

Hi, I don't understand why fr:Liste des commandes et des livraisons de l'Airbus A380 keeps getting reported again and again. It seems that the following part is reported: {{#tag:ref|Singapore Airlines a commandé l'A380 en trois versions différentes, dont deux sont opérées : * 01 = 471 places{{#tag:ref|{{Lien web|url=http://www.seatguru.com/airlines/Singapore_Air/Singapore_Air_Airbus_A380.php|titre= Singapore Airlines Seat Maps (V1)|éditeur=http://www.seatguru.com}}|name=SIA_A}}, * 02 = 411 places{{#tag:ref|{{Lien web|url=http://www.seatguru.com/airlines/Singapore_Air/Singapore_Air_Airbus_A380_B.php|titre= Singapore Airlines Seat Maps (V2)|éditeur=http://www.seatguru.com}}|name=SIA_B}}, * 03 : ''configuration non encore connue'' |group=Note|name=SIAVersions}}

But in the notice, you have #tag:ref, Singapore Airlines a commandé l'A380 en trois versions différentes, dont deux sont opérées: note the comma after tag:ref instead of the actual pipe.

There is a similar construct before (Emirates) but it's not reported. --NicoV (Talk on frwiki) 11:36, 12 April 2015 (UTC)

A few questions, and silly requests

Hi from it.wiki; a few random things:

  1. Request: in the web interface the "more" link should be sortable, displaying how many errors are there ("1 more", "2 more" and so on); or at least don't display "more" if there isn't any other error; I'd love this so much :D
  2. Question: I'm testing two whitelists on it.wiki; I'm supposed to wait the new dump to see those articles removed from the web interface?
  3. Request: it should be possible to whitelist a single ISBN instead of articles; in it.wiki we have a parameter |ignoraisbn= inside the citation templates (doc here, it's part of the Lua module); article it:Jordan 195 has a wrong ISBN and it's not on our error lists[3]; but I don't know the details about this: is this "ignore" parameter working on every wiki due to the Lua module? Can we always use this instead of whitelists?
  4. Gadget proposal: when logged in Wikipedia, on Special:Watchlist there should be something like "Show errors in my watchlisted articles", redirecting to our interface with a list of errors found; it should work like clicking on "more" for every article. I can do it already using the url, one article at a time. Similar gadgets can be done for a category, etc.
  5. Error 39, en.wiki translation page: " Due to a Wikimedia bug</a> ": is there a missing url?
  6. Minor bug: in the interface, after clicking on "more", the "list" link is broken.
  7. Suggestion: in the translation, it'd be better to use &nbsp; for spaces inside the examples proposed, at least where spaces are the problem.

Sorry if I'm wasting your time, and thanks for maintaining this wonderful project! --Vittorioo (talk) 23:08, 23 April 2015 (UTC)

Vittorioo There are no silly requests, but I may give silly answers. :)
  1. Good question. I'll look into it.
  2. The whitelist is updated at 0z everyday. Unfortunately, itwiki is only updated twice a month. From what you've already done, it looks good. I'll check it (my) tomorrow to see if the whitelists work just fine.
  3. It's not possible. enwiki has a similar parameter to ignoraisbn. Checkwiki is not checking ISBNs inside any cite template. The Lua module already checks for bad ISBNs. On enwiki, the errors are located at Category:Pages with ISBN errors. Checkwiki is only checking ISBNs that are not inside a cite type template.
  4. I haven't a clue when it comes to Gadgets. Gadgets are written in Javascript, a language I've never dealt with.
  5. I've removed the <a> tag. There was another Mediawiki bug that prevented newlines from being used in <blockquote> and several quote templates, thus <p> had to be used there. Those were fixed and the <a> tag was related to that.
  6. Will fix.
  7. Could you give me an example?
You are not wasting my time. Any suggestions or questions are always welcome. Bgwhite (talk) 23:49, 23 April 2015 (UTC)
7) For example on error 22, the [[Category : ABC]] and the like; but it's just me splitting windows; I've put non-breaking spaces everywhere :D Thanks again, will report on it.wiki --Vittorioo (talk) 00:05, 24 April 2015 (UTC)
6) Fixed. -- Magioladitis (talk) 07:35, 8 May 2015 (UTC)
7) I've added a few myself [4]; regarding 1) and 4): I've found a way to use WPCleaner to find articles with multiple errors or to scan my watchlisted articles, it's much the same as what I've asked, so don't waste time on them. Even that bug in 6), it's really not essential, just archive all of this. Thanks. --Vittorioo (talk) 12:43, 28 May 2015 (UTC)

Error 82 confusion and new error 104

Hi from it.wiki again. I've problems with error 82 "Link to other wikiproject"; it's active in es.wiki too.

  • A) Please exclude the "Wikipedia:" namespace from being detected by the script when it's checking a xx.wiki.
  • B) Redirects to en.wiki articles written like [[w:en:Article]] or [[:w:Article]] etc.:
    • from xx.wiki point of view they belong to error #68 "Link to other language";
    • from en.wiki point of view they are internal links badly written, together with [[:en:Article]] and the like: I propose to transfer them to a new error #104;
  • C) There are redirects to en.wiki articles using Meta or Mediawiki mixed syntax like [[m:en:Article]] or [[:en:mw:w:Article]] or [[meta:w:Article]] etc., and the script is not handling them correctly:
    • from xx.wiki point of view they belong to error #68 "Link to other language";
    • from en.wiki point of view they belong to the error #104 I've proposed in point B above;
    • they belong to error #82 only when the script is checking Commons or other sister projects.

A simpler fix to the script would be renaming error 82 to something like "Links with mixed MediaWiki syntax" and heavily expanding its description. But in this case you have to be sure that all the above cases and variants are checked.
Sorry for the headache and thanks again. --Vittorioo (talk) 20:02, 1 May 2015 (UTC) Edit: added "at least" in point C) + some minor fixes --Vittorioo (talk) 10:07, 2 May 2015 (UTC) PS: I've rewritten and simplified my proposal. --Vittorioo (talk) 20:49, 26 May 2015 (UTC)

Vittorioo Sorry for ignoring you. I've been sick this past week. When I do edit, I'm trying just to keep up with fixing enwiki checkwiki errors. I'll get back to answering you next week. I've got an in-law gathering this weekend... so I'll probably be really nauseated for awhile. Bgwhite (talk) 21:44, 1 May 2015 (UTC)
Magioladitis Last month, I did the 2nd fewest edits in over four years and March was the 3rd fewest. Besides being sick the past two months, I wonder what else happened..... Bgwhite (talk) 00:01, 2 May 2015 (UTC)
Real life first of course! We have hundreds of years ahead to fix wiki. :D Take care. --Vittorioo (talk) 10:07, 2 May 2015 (UTC)
Pull request with a partial fix, basically an update for the list of projects: we are missing "species" because it's written "speciesi" in the script; also missing "voy" and many others. This is the list I've proposed when the script is checking a xx.wiki (that is, not Commons or other projects): b: c: d: n: q: s: species: v: voy: wikt: m: mw: meta: metawiki: metawikipedia: mediawikiwiki: commons: wikibooks: wikidata: wikinews: wikiquote: wikisource: wikispecies: wiktionary: wikivoyage: wikiversity: phabricator: wikitech: toollabs: testwiki: test2wiki: testwikidata: wmf: foundation: wikimedia: wmania: incubator: outreach:. There are more, but those are less used: see Help:Interwikimedia links. I've also proposed to add zh and bn language codes instead of fl (which doesn't exist) and gv (too small wiki). --Vittorioo (talk) 20:49, 26 May 2015 (UTC)

Vittorioo I've updated the program with your changes. Bgwhite (talk) 18:01, 27 May 2015 (UTC)

Slightly better. I've read your commit and you are still listing "fl" language code: as I said, it doesn't exist; also still listing "meta-wiki:" and labs:: they don't exist; re-read the list above please for the updated interwikimedia links: still missing c: for Commons and d: for Wikidata, etc. Also, is it really impossible to consider "Wikipedia:" a namespace, removing it from the list of projects? As I said in the pull request, error #82 is not active in our sister projects, so it's quite safe to remove "Wikipedia:" for now. I'd leave "w:" for a future fix, since that one is more complex to handle. I really appreciate your work, keep going. --Vittorioo (talk) 19:21, 27 May 2015 (UTC) Edited for grammar. --Vittorioo (talk) 12:43, 28 May 2015 (UTC)
With last week's commit it looks much better now. Thank you! --Vittorioo (talk) 10:03, 2 July 2015 (UTC)

Error 4 false positive

Error 4 (HTML-tag <a>): matches <a throne dais>; it's wrong. Fix: add href= to regexp 92.242.90.153 (talk) 23:17, 24 July 2015 (UTC)
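
A sketch of the fix proposed here, in Python: require an `href=` attribute before treating `<a ...>` as an HTML link. The pattern and `has_html_link` are assumptions, not the regexp Checkwiki uses.

```python
import re

# "<a", whitespace, then an "href=" attribute somewhere before the tag closes.
A_TAG = re.compile(r"<a\s[^<>]*\bhref\s*=", re.IGNORECASE)


def has_html_link(text):
    """True if the text contains an <a> tag carrying an href attribute."""
    return A_TAG.search(text) is not None


print(has_html_link("<a throne dais>"))
print(has_html_link('<a href="http://example.org">x</a>'))
```

The trade-off is that anchors without `href` (for example `<a name="x">`) would no longer be reported.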

Invalid color tracking

might not be feasible, but may be interesting to (1) parse an article, (2) grab any CSS style statements, (3) parse the background/foreground colors and compute the contrast ratio, (4) flag articles with really bad ratios. for the parsing part of the style statement, we have code in module:color contrast . for related discussion see Template talk:Episode list#Invalid color tracking category. of course, it would be pointless if there is no one interested in fixing them, but an idea for helping those of us with (partial) colour blindness. Frietjes (talk) 16:16, 1 August 2015 (UTC)
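
The contrast computation in step (3) can be sketched with the WCAG 2.x relative luminance and contrast ratio formulas. This Python version is an illustration only: it is not the code in module:color contrast, it handles only `#rgb` and `#rrggbb` hex values (no named colors or `rgb()`), and the function names are hypothetical.

```python
def _channel(c8):
    """Linearize one 8-bit sRGB channel as defined by WCAG 2.x."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    """Relative luminance of a '#rgb' or '#rrggbb' color, from 0.0 to 1.0."""
    h = hex_color.lstrip("#")
    if len(h) == 3:
        h = "".join(ch * 2 for ch in h)  # expand the short form, e.g. "fff"
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _channel(r) + 0.7152 * _channel(g) + 0.0722 * _channel(b)


def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two colors, from 1.0 up to 21.0."""
    l1, l2 = relative_luminance(fg), relative_luminance(bg)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


print(round(contrast_ratio("#000000", "#ffffff"), 2))
```

WCAG level AA asks for at least 4.5:1 for normal-size text, so a flagging step could compare against that threshold, or a lower one to catch only the "really bad" cases.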

Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia/Translation

There is a dead link to the Toolserver ([[tools:~sk/checkwiki/enwiki/enwiki_translation.txt|toolserver]]) in Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia/Translation. Please update the link or remove it entirely if it is no longer needed. --Meno25 (talk) 14:24, 2 September 2015 (UTC)

At the same time, it should be updated to take into account the new elements that are managed by CW: whitelispage, ... --NicoV (Talk on frwiki) 14:39, 2 September 2015 (UTC)
Meno25 I know nothing about this. What is this for and how is it used? Bgwhite (talk) 17:32, 3 September 2015 (UTC)
Removed. -- Magioladitis (talk) 17:38, 3 September 2015 (UTC)
Update this too: Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia. --79.52.67.85 (talk) 18:27, 3 September 2015 (UTC)
I think Template:Editnotices/Page/Wikipedia:WikiProject Check Wikipedia can be deleted as it doesn't seem to match the current situation and is probably useless now. @Bgwhite and Magioladitis: What do you think? --NicoV (Talk on frwiki) 10:53, 25 February 2016 (UTC)

NicoV link was updated. Feel free to perform any further action. If Bgwhite agrees we can delete it. -- Magioladitis (talk) 23:23, 19 September 2016 (UTC)

Magioladitis, I think that:
-- NicoV (Talk on frwiki) 05:48, 20 September 2016 (UTC)

NicoV I deleted the latter. -- Magioladitis (talk) 07:29, 20 September 2016 (UTC)

Proposed error detection

I noticed some file-delinker bots (or even users) removing an image name leaves an incorrect syntax like [[File:|thumb|]]. Also, in image galleries I noticed that image title was removed, but caption remained (after pipe). It might be useful to detect these errors also. --XXN, 00:34, 20 November 2015 (UTC)

The example you provided looks like Double pipe in a link. Matěj Suchánek (talk) 14:34, 25 November 2015 (UTC)
Only in this particular example. But it can also be like: [[File:|thumb]] or [[File:|caption here]] or [[File:|some_size_px]] etc. --XXN, 13:33, 26 November 2015 (UTC)
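
A possible detection for the variants listed above, sketched in Python. The helper `make_empty_file_pattern` and its default prefixes are assumptions; each wiki would need its own namespace names (for example `Fichier` on frwiki), and the gallery case mentioned in the first message is not covered.

```python
import re


def make_empty_file_pattern(prefixes=("File", "Image")):
    """Build a pattern for file links with nothing between the colon and '|' or ']]'."""
    names = "|".join(re.escape(p) for p in prefixes)
    return re.compile(r"\[\[\s*(?:%s)\s*:\s*(?=\||\]\])" % names, re.IGNORECASE)


pattern = make_empty_file_pattern()
for link in ["[[File:|thumb|]]", "[[File:|caption here]]", "[[File:Flock of sheep.jpg|thumb|x]]"]:
    print(link, bool(pattern.search(link)))
```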

Wrong quotes

See this edit. Don't know how wide the problem is, but maybe it's worth including in Checkwiki? --Edgars2007 (talk/contribs) 09:58, 26 November 2015 (UTC)

For example, such wikisearch insource:/(style|class|colspan|class|rowspan|align)\s?\=\s?[”“]/i gives 200+ results at enwiki. Regex of course could be improved, as there may be cases when opening quotes are correct, but closing ones are not. --Edgars2007 (talk/contribs) 11:20, 12 December 2015 (UTC)
And if I already started... Articles are using also style="text-align:centre;", which, of course, doesn't work. --Edgars2007 (talk/contribs) 11:36, 12 December 2015 (UTC)
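
The search above translates fairly directly to Python. The sketch below is an assumption-laden illustration: it adds a word boundary that the original insource: regex does not have, only looks at the quote right after the `=`, and the function name and message strings are made up.

```python
import re

# Attribute name, "=", then a curly double quote (U+201C or U+201D).
CURLY_ATTR = re.compile(
    r"\b(style|class|colspan|rowspan|align)\s?=\s?[\u201c\u201d]",
    re.IGNORECASE,
)
# British spelling, which is not a valid CSS text-align value.
CENTRE = re.compile(r"text-align\s*:\s*centre", re.IGNORECASE)


def find_attr_problems(wikitext):
    """Return a list of short labels for the problems found in the text."""
    problems = []
    if CURLY_ATTR.search(wikitext):
        problems.append("curly quote after attribute")
    if CENTRE.search(wikitext):
        problems.append("text-align:centre")
    return problems


print(find_attr_problems("| style=\u201dwidth:10em\u201d | x"))
print(find_attr_problems('style="text-align:centre;"'))
```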

Category namespace

When the migration started, I asked about including some more namespaces. Detecting stuff like this [5] would be awesome. Matěj Suchánek (talk) 11:02, 12 December 2015 (UTC)

No title displayed

Today I noticed an error report which has no title displayed (it's in the first row, using search I found it should be Bisabolol in Czech Wikipedia). The timestamp there is also strange. Matěj Suchánek (talk) 12:15, 13 December 2015 (UTC)

Matěj Suchánek, I've seen that on enwiki and fixed one issue that caused most of the problems. It was related to dump files. I haven't been able to find the cause for the remaining problem. Bgwhite (talk) 05:36, 14 December 2015 (UTC)
I'm seeing this from time to time on frwiki, reported a few sections above. If you need, I can report when I see it. --NicoV (Talk on frwiki) 06:18, 14 December 2015 (UTC)
That would be good if you and Matěj could report them. It would help if the article was found via the dump or daily scan. Bgwhite (talk) 06:37, 14 December 2015 (UTC)
Existing ones on frwiki: 67, 91. I don't know for how long they are there, those errors have never been completely cleaned for some time, and it's only possible to remove the empty title when it's the only one left. --NicoV (Talk on frwiki) 05:25, 15 December 2015 (UTC)
Bgwhite On frwiki, the problem is visible for #105 and it's probably very recent. I haven't done anything to remove it if it can help you understand where the problem comes from. --NicoV (Talk on frwiki) 17:13, 15 December 2015 (UTC)
Also on #60 and #43 but I don't know for how long. --NicoV (Talk on frwiki) 17:16, 15 December 2015 (UTC)
Bgwhite Can I remove the ones that can be removed, and then warn you if they appear again ? --NicoV (Talk on frwiki) 17:01, 18 December 2015 (UTC)
NicoV Yes, you can remove them. They are showing up via dump processing. What's weird is they don't show in the log file. Bgwhite (talk) 22:17, 21 December 2015 (UTC)
Ok, I've removed the ones I can. I will notify you when I see some more. --NicoV (Talk on frwiki) 07:23, 22 December 2015 (UTC)

Bgwhite, I think there was a full scan yesterday on frwiki, I see empty titles for #26 (same notice as fr:Emphase (typographie)), #38 (same as #26), #45, #51 (similar to #45), #53 (similar to #45), #67 (maybe an old one). --NicoV (Talk on frwiki) 11:45, 31 December 2015 (UTC)

File URLs

For some reason people cite files from their own hard drive. At least 58 articles when searching for file://c:/Users. Can we get these flagged? — Dispenser 21:42, 13 December 2015 (UTC)

Dispenser I can't remember when, maybe ~10-12 months ago, I did a scan for this problem and |image = http:// in infoboxes. If I remember right, there were different combinations of the problem. I'll look into it. I need to see what the other language Wikipedias are like. For example, do they use file or another word. Bgwhite (talk) 05:47, 14 December 2015 (UTC)
Its standardized, see file URI scheme. — Dispenser 13:03, 14 December 2015 (UTC)
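
Since the scheme name is the same on every wiki, a flagging rule can key on `file://` itself. A minimal Python sketch, with an assumed pattern and a hypothetical `find_file_urls` helper; the characters that end the URL are a guess suited to template parameters and external links.

```python
import re

# "file://" followed by anything up to whitespace or a wikitext delimiter.
# "\b" avoids matching inside longer words; "[[File:...]]" links have no "//".
FILE_URL = re.compile(r"\bfile://[^\s|\]<>}]*", re.IGNORECASE)


def find_file_urls(wikitext):
    """Return every local file:// URL found in the text."""
    return FILE_URL.findall(wikitext)


print(find_file_urls("{{cite web|url=file://c:/Users/me/doc.pdf|title=x}}"))
print(find_file_urls("[[File:Example.jpg|thumb]]"))
```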

ISBN with invalid syntax missed by #69

Bgwhite Apparently, #69 doesn't catch the invalid syntax in Donald Strachey, like (isbn = 1-55583-387-X). Before fixing them, I tried checkarticle.cgi and nothing was detected. --NicoV (Talk on frwiki) 16:25, 18 December 2015 (UTC)

NicoV Correct, Checkwiki doesn't detect these. The main reason is the use of isbn= inside cite and infobox templates. Bgwhite (talk) 22:15, 21 December 2015 (UTC)
Bgwhite Could it be modified so that they are reported when they are not inside a template ? --NicoV (Talk on frwiki) 07:23, 22 December 2015 (UTC)

False positive for #60 ?

Hi, fr:Aldébaran keeps being reported for #60 with the notice Palette VizieR, V*. The template VizieR does have a "V*" parameter, but it seems to be detected as an error. Same for fr:Wolf 1061. --NicoV (Talk on frwiki) 17:09, 2 January 2016 (UTC)

NicoV You are correct. At least on enwiki, one can't have a parameter with * in its name. Probably true for dewiki as this was originally added by Stefan. Bgwhite (talk) 22:39, 5 January 2016 (UTC)

Error 104 unbalanced quotes with special characters and curly quotes[edit]

With regard to the display problem and the ref names rules, I've created a test page for error 104 (NicoV: WPCleaner wants to put the quote close to the slash in line 11). I've also searched for the opening and closing curly quotes, and some are mixed up with the regular ones. I think that Check Wiki should warn the user to search carefully for every occurrence of a ref name found by error 104. If a ref name is "LuisBuñuel-59", the user needs to search for at least "LuisBu" in order to find all of them. I hope it's clear enough. --CX42 (talk) 07:34, 10 January 2016 (UTC)

I once looked through Anomie's list of fixes and collected something (that I understand) for Latvian Wikipedia scanning. Yes, some are bracket-unrelated, but most of them are (in section "Kļūdainās atsauces (# Other issues)"). --Edgars2007 (talk/contribs) 08:20, 10 January 2016 (UTC)

False positive for #46[edit]

Hi Bgwhite, someone reported on frwiki that lately there have been false positives for #46 when an image legend contains a link. Today's examples:

  • fr:AN/APG-76, radar Norden [[AN/APG-76#AN/APQ-148|AN/APQ-148]]: seems fine in the article [[File:AN-APQ-148 Radar, Norden, 1972 - National Electronics Museum - DSC00068.JPG|thumb|280px|Un radar Norden [[AN/APG-76#AN/APQ-148|AN/APQ-148]].]]
  • fr:Bétail, of sheep.jpg|thumb|Troupeau de [[mouton]]: seems fine in the article [[Fichier:Flock of sheep.jpg|thumb|Troupeau de [[mouton]]s]]

--NicoV (Talk on frwiki) 08:56, 9 February 2016 (UTC)

NicoV It's not a false positive, but it is giving the wrong location for the error. Both articles had the broken bracket fixed on the 9th. Bgwhite (talk) 22:58, 11 February 2016 (UTC)
Bgwhite Yes, but sometimes it seems to be the opposite error that should be reported : currently, fr:Gandhara is reported in #46 with the notice Gandhara Guimet 181171.jpg|thumb|[[Bodhisattva]] while the actual problem is a #10 for [[shivaïsme]. That was also the case for fr:AN/APG-76. --NicoV (Talk on frwiki) 07:44, 12 February 2016 (UTC)

False positives for #3[edit]

@Bgwhite: On frwiki, there are some false positives for #3 due to:

  • a whitespace in the <references>...</references> tag, like here or here
  • a carriage return in the template Références like here

Could this be prevented from being detected ? --NicoV (Talk on frwiki) 20:35, 9 March 2016 (UTC)

NicoV
  • Whitespace: The regex is <references[ ]?\/?>. Change it to <references(\s*\/)?>?
  • Carriage return: I only slap {{ onto the front of the regex. You'll need to add a carriage return to your regex.
Bgwhite (talk) 21:33, 9 March 2016 (UTC)
@Bgwhite: For the whitespace, yes maybe. For the carriage return, I didn't know it was also a regex for #3, I thought it was only for #78: are you sure? --NicoV (Talk on frwiki) 08:40, 10 March 2016 (UTC)
NicoV Never mind. I was thinking 78. It's amazing I think at all. Will look more tomorrow. Bgwhite (talk) 08:45, 10 March 2016 (UTC)
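
To compare the two patterns quoted above, the following Python sketch runs both against a few spellings of the tag. The sample strings are assumptions chosen for illustration; the two regexes are the ones from the discussion (the `\/` escape is not needed in Python, so a plain `/` is used).

```python
import re

# Current pattern: at most one literal space, then an optional slash.
OLD = re.compile(r"<references[ ]?/?>")
# Suggested pattern: any whitespace, but only when it is followed by a slash.
NEW = re.compile(r"<references(\s*/)?>")

samples = [
    "<references>",
    "<references/>",
    "<references />",
    "<references  />",   # two spaces
    "<references\n/>",   # line break inside the tag
    "<references >",     # space but no slash
]

for s in samples:
    print(repr(s), bool(OLD.fullmatch(s)), bool(NEW.fullmatch(s)))
```

Note the trade-off: the suggested pattern accepts extra whitespace before the slash, but it no longer accepts `<references >` (space without a slash), which the current pattern does accept.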

Id 85 bug[edit]

Hello. Id 85 returns false positive on empty tags (as in "<center> </center>") if there is a code inside: "<center> <syntaxhighlight ... </syntaxhighlight> </center>" IKhitron (talk) 12:19, 13 April 2016 (UTC)

IKhitron The first thing CheckWiki does is to remove various tags and their content, ie <syntaxhighlight>, <nowiki>, <pre>... These tags often have bad wikicode or wikicode symbols that aren't wikicode. There's nothing that can be done with the false-positive blank center tags. However, as <center> is obsolete HTML, it's best to replace the tag. Bgwhite (talk) 19:47, 18 April 2016 (UTC)
Thank you, Bgwhite. But this is a special case: this ID checks for empty text. Can't you replace the tags with something neutral, such as a "qwerty" string, instead of removing them, so that the check works properly? IKhitron (talk) 19:57, 18 April 2016 (UTC)
Well, Bgwhite, I rephrased the template, and the new run did not catch it. But I still do not know what the problem was. IKhitron (talk) 18:40, 26 July 2016 (UTC)
IKhitron Wrong discussion. Do you mean #3 down below? Bgwhite (talk) 23:37, 26 July 2016 (UTC)
Sorry, Bgwhite. I meant the "#60 possible false positive" discussion.
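
The difference between removing the stripped tags and replacing them with a neutral placeholder can be sketched in Python as follows. This is an illustration only, not Checkwiki's implementation; the tag lists and the name `has_empty_tag` are assumptions.

```python
import re

# Tags whose content is discarded before any check runs (assumed subset).
STRIP = re.compile(
    r"<(syntaxhighlight|nowiki|pre)\b[^>]*>.*?</\1>",
    re.DOTALL | re.IGNORECASE,
)
# A "tag without content": opening tag, optional whitespace, closing tag.
EMPTY_TAG = re.compile(r"<(center|div|span)\b[^>]*>\s*</\1>", re.IGNORECASE)


def has_empty_tag(wikitext, placeholder=""):
    """Strip code-like tags (or swap in a placeholder), then look for empty tags."""
    cleaned = STRIP.sub(placeholder, wikitext)
    return EMPTY_TAG.search(cleaned) is not None


text = '<center> <syntaxhighlight lang="c">int x;</syntaxhighlight> </center>'
print(has_empty_tag(text))             # plain removal leaves "<center>  </center>"
print(has_empty_tag(text, "qwerty"))   # placeholder keeps the tag non-empty
```

With a placeholder, a genuinely empty `<center> </center>` is still detected, because nothing is substituted into it.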

Self-closing div and span tags to be deprecated[edit]

The latest Tech News (dated today) has this notice:

Future changes
  • Using self-closing tags like <div/> and <span/> to mean <div></div> and <span></span> will not work in the future. Templates and pages that use these tags should be fixed. When Phabricator ticket T134423 is fixed these tags will parse as <div> and <span> instead. This is normal in HTML5. [6]

Should a check for these tags be added to Checkwiki? – Jonesey95 (talk) 21:13, 16 May 2016 (UTC)

Jonesey95 I've already run a list for them. There's a total of 72 in articles. There are <span /> tags in template space and I left a message on Frietjes' talk page about these. I'd rather not touch templates. I'll be adding this to error #2. Bgwhite (talk) 21:22, 16 May 2016 (UTC)
Thanks. I don't mind editing templates, even if it means the occasional run-in with editors who either can't read or refuse to read and then blame me for their shortcomings. I know that you know what that's like. I'll head over to F's talk page for the list. – Jonesey95 (talk) 21:34, 16 May 2016 (UTC)
Sadly, I've turned off the second error, because there's no consensus in ruwiki on replacing <br clear="all" /> with a template. Error #2 is becoming more and more sophisticated; maybe it's time to split it into several errors? Or could you please disable finding br tags with the "clear" attribute in ruwiki, if it's not very difficult? Facenapalm (talk) 07:57, 18 May 2016 (UTC)

It appears that the check for error #2 is not catching some cases of errors that cause pages to be placed in the new Category:Pages using invalid self-closed HTML tags. Examples:

Is error #2 supposed to find these? Can it be modified to do so? – Jonesey95 (talk) 01:45, 17 July 2016 (UTC)

Jonesey95 @NicoV: That's a lot of articles in that category. One of the articles I looked at should be caught, but isn't.
  1. I'm currently only catching cases that don't have other attributes, such as id=.
  2. I'm not looking for any cases of some others, such as <p>.
I'll work on adding them. I'm behind on coding things up due to trying to fix articles on the daily CheckWiki scans. Bgwhite (talk) 05:03, 18 July 2016 (UTC)
The category is new, and it is filling slowly as the job queue runs through the whole population of pages. Some gnomes have been busy cleaning out the category, including fixing templates that have zillions of transclusions, but the category population has stayed relatively constant at a few thousand as new pages are null-edited by the job queue. At this writing, it seems likely that there are 5,000 to 10,000 individual pages left with these errors, not including pages transcluding pages that have errors in them.
In addition to the above, I have seen <small/>, <center/>, <p "with text" />, and maybe one or two others, as well as all of those tags with both leading and closing slashes in the same tag. – Jonesey95 (talk) 05:47, 18 July 2016 (UTC)

@Bgwhite and Jonesey95: I've started updating WPCleaner to handle some of the tags that trigger the categorization. It's not finished, but you can help me by listing cases I'm currently missing (not a lot of free time to analyze what's missing). --NicoV (Talk on frwiki) 17:10, 18 July 2016 (UTC)

Is there a list somewhere? In addition to the above tags, I have seen <big/>, <s/>, <del/>, <tr/>, <td/>. – Jonesey95 (talk) 17:15, 18 July 2016 (UTC)
@Jonesey95: List available in the code. --NicoV (Talk on frwiki) 22:14, 18 July 2016 (UTC)
I see a list of tags, but interpreting the code is beyond me. It looks like del, td, and tr are missing. Will it find tags formatted like </blockquote/>, with a leading and trailing slash? There are a surprising number of those. – Jonesey95 (talk) 22:51, 18 July 2016 (UTC)
The link was just for the list of tags, not to analyze the code ;-) The code will find both regular self-closing tags and also incorrect tags with a leading and trailing slash. I've added del, td and tr. If you see other cases, tell me. --NicoV (Talk on frwiki) 06:14, 19 July 2016 (UTC)
I just found and fixed <code/> on one page. There may be more pages with this tag. – Jonesey95 (talk) 17:13, 19 July 2016 (UTC)

@Bgwhite and Jonesey95: If you're interested, I ran a dump analysis yesterday, the result for #2 is at Wikipedia:CHECKWIKI/WPC 002 dump. --NicoV (Talk on frwiki) 08:29, 21 July 2016 (UTC)

Excellent. It looks like there might be a couple of false positives on that list, but they are not worth worrying about until the hundreds of real errors are fixed. Good work. – Jonesey95 (talk) 12:58, 21 July 2016 (UTC)
This one doesn't look like a tag syntax error to me. As far as I know, any amount of white space is valid within a tag:
Does WP have its own rules about tags like this? – Jonesey95 (talk) 14:33, 21 July 2016 (UTC)
I don't know if I should keep detecting this or not: for the moment, carriage returns are considered invalid characters in a tag in WPC. --NicoV (Talk on frwiki) 16:29, 21 July 2016 (UTC)

Here are a few more tags to add to the check: <sup/>, <em/>, <i/>, <th/>, and <rb/> (typo for "br"). – Jonesey95 (talk) 21:28, 25 July 2016 (UTC)

@Magioladitis, Jonesey95, and NicoV: In theory, tomorrow CheckWiki will start to catch the br tags in NicoV's report and all the self-closing tags. It's also catching br tags with carriage returns. Bgwhite (talk) 00:32, 28 July 2016 (UTC)

Do I look at Wikipedia:CHECKWIKI/WPC 002 dump or somewhere else for the updated list? I fixed a few hundred errors on that page and am looking forward to a refresh of it. I was unable to persuade my computer to run the Java command at the top of the page, so I was unable to refresh it myself. – Jonesey95 (talk) 03:38, 28 July 2016 (UTC)
Jonesey95 In theory, August's dump will come out in a week or so. Might want to wait till then to see all the new and wonderful errors. I reran Nico's list via Checkwiki. The only errors listed were ones with the <br> tag... assuming I coded it right. Not sure if you or Nico have access to WMFLabs. Java and the dump files are available there. Bgwhite (talk) 04:55, 28 July 2016 (UTC)
Jonesey95 I have been trying to rerun the dump analysis for the last 2 days, but I'm only spending an hour or so home once a day (it failed the first time due to an out of memory error, and I don't know what's the status of the second run...). If you want to try it by yourself, the command on fr:Projet:Correction syntaxique/Analyse 002 is probably more explicit than the one displayed on enwiki... I won't be able to handle the August dump analysis, at least not until the 15th.
Bgwhite I think WMFLabs severely limits the amount of memory a process can have, so it's probably a no go for WPC for the dump analysis. --NicoV (Talk on frwiki) 08:47, 28 July 2016 (UTC)
NicoV, No, they don't severely limit the amount of memory. One does have to specify the max amount of memory one needs. The default is 256MB. I've gone up to 3GB. Bgwhite (talk) 20:53, 28 July 2016 (UTC)
Jonesey95 I updated the description of the command line to run the dump analysis for enwiki. --NicoV (Talk on frwiki) 12:30, 28 July 2016 (UTC)
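
A detection along the lines discussed in this thread could look like the Python sketch below. It is not the Checkwiki or WPCleaner code; the tag list is simply collected from the comments above, and `find_self_closed` is a hypothetical name. It matches tags that end in `/>`, with or without attributes, including the malformed form with both a leading and a trailing slash.

```python
import re

# Tags mentioned in this thread as appearing in invalid self-closed form.
TAGS = (
    "div", "span", "small", "center", "p", "big", "s", "del",
    "tr", "td", "th", "code", "sup", "em", "i", "blockquote",
)

# Optional leading slash, a listed tag name, any attributes, then "/>".
# [^<>]* also spans line breaks, so tags split across lines are matched.
SELF_CLOSED = re.compile(
    r"</?(?:%s)\b[^<>]*/>" % "|".join(TAGS),
    re.IGNORECASE,
)


def find_self_closed(wikitext):
    """Return every invalid self-closed tag found in the wikitext."""
    return [m.group(0) for m in SELF_CLOSED.finditer(wikitext)]


sample = 'a<div/>b<span id="x" />c</blockquote/>d<br />e<div>f</div>'
print(find_self_closed(sample))
```

`<br />` is deliberately absent from the list, since the thread treats it as a separate case.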

Commons[edit]

@Bgwhite: Could you help set up CHECKWIKI for Commons, so that it lists errors there? It doesn't seem to be working. (t) Josve05a (c) 07:42, 24 May 2016 (UTC)

"Tags without content" screws up a format hack[edit]

I wanted to link a name that I inserted in brackets because it was simply "she" in the original quote, i.e. [[[Tammy Baldwin]]]. That displays without any formatting, so I put a blank span in the middle. Your bot just took it out. [7]

What worries me is that I've done this a LOT over time - not always for this reason, but it's amazing how often Wiki syntax fouls up some text with a single quote mark or some other feature for which this has been a workaround.

PLEASE stop removing empty tags and review the bot's edits. Wnt (talk) 10:44, 26 June 2016 (UTC)

It's logical to use nowiki instead of span. It's really not obvious what your span means. I think you should write your code like this: <nowiki>[</nowiki>[[Tammy Baldwin]]<nowiki>]</nowiki>. In ruwiki, we usually use self-closing nowiki tags for purposes like this one, but it seems like they're going to be deprecated. :( Facenapalm (talk) 11:30, 26 June 2016 (UTC)
UPD: "but seems like they're going to be deprecated" - hm, seems like not. Then I would write this: [<nowiki />[[Tammy Baldwin]]<nowiki />]. But template is even better, yes. Facenapalm (talk) 11:42, 26 June 2016 (UTC)

Wnt I used Bracket and fixed it for you. -- Magioladitis (talk) 11:37, 26 June 2016 (UTC)

@Facenapalm and Magioladitis: Sometimes I've used nowiki tags, but I didn't care that much one way or the other and I wasn't sure the bot wouldn't come after those. The Bracket template adds &#91; to the text (NOTE: I just tried that with nowiki and it didn't work! It just displays [! And [ html comments also do not work for this sequence!) - I'd actually prefer to do that than to add the confusion of a template which you don't know what it is. I think an HTML comment would work also.
But none of this really matters. My concern isn't trying to write this one sentence - my concern is that the bot is out there churning away, screwing up format kludges (good or bad) that will be very confusing for editors who don't know Wiki/HTML to figure out. It's the changes you don't know about that you need to be concerned about. Some of this stuff could be buried deep in tables and other arcane syntax. If the bot is going to take out empty spans, it should replace them with whatever you would tolerate like nowiki or HTML comments or whatever so that the text displays the same way. Wnt (talk) 11:48, 26 June 2016 (UTC)
IMHO, using an empty span is a dirty hack to trick the parser. I'm not sure I'd understand what it means even if I were editing the code manually. So it's ok that the bot broke this rare case. Usually empty spans are just empty spans, and they shouldn't be replaced by something like <nowiki />. Facenapalm (talk) 11:57, 26 June 2016 (UTC)

This is the reason that the templates were created. They make wikicode clearer and no hacks are needed. -- Magioladitis (talk) 12:14, 26 June 2016 (UTC)

So what is the template for writing &#91; without it coming out as a bracket? How do I look it up? (Or them up ... I have a feeling there are probably dozens, each used by one or two editors and unknown to the rest of us) Wnt (talk) 14:16, 26 June 2016 (UTC)
A few things...
Facenapalm and others... Self-closing HTML tags are being deprecated because they aren't in the HTML5 spec and they are being removed from the MediaWiki parser. <nowiki /> is not HTML, so it is not being deprecated. <br /> is still in HTML5, but is not mentioned in 5.1 that I could find. They are so common, who knows when it will die.
Wnt <span></span> is bad HTML and should never be used, period.
That leaves three options:
  1. <nowiki /> option that Facenapalm mentioned.
  2. {{bracket}}/{{brackets}} templates
  3. &#91; route.
Of the three options, #3 is probably the worst for editors. Not many people know what that means, but it is in common use. Templates are nice because people can look up the doc page for them. I personally use nowiki tags and it is the most common in use. Use whatever option you want. Bgwhite (talk) 05:16, 27 June 2016 (UTC)
The span element is still valid in HTML5.1, but "doesn't mean anything on its own" (cit.), and it's generally "used to color a part of a text" (cit.); so I agree that using the nowiki tag or the brackets templates in the above problem is preferable. Regarding the br element, nothing changed between HTML5 and 5.1, except that "Content model" has been renamed "Nothing" instead of "Empty". The only correct way to write it is <br>. The fact that the old XHTML <br /> is still in use is because Tidy is outdated; fortunately they are working on it (they mention the Sanitizer in the comments). --79.18.67.110 (talk) 14:31, 27 June 2016 (UTC) PS: I've run a little test and the W3C Validator doesn't see an empty span as an error; also, it has been used as a hack for some other reason (Fahrner Image Replacement#Implementations); so, it's just ugly, but harmless. --79.18.67.110 (talk) 15:43, 27 June 2016 (UTC)

feature list request[edit]

On fa.wikipedia we have a page and a cleaning bot which lists problems and does some cleaning tasks. I will list some useful items for your tool:

  1. Category pages which have {{Category redirect}} and interwiki (local or wikidata)
  2. Categories which look like articles (huge size), for example page_len>1000. Some newbies add article text to category pages.
  3. Redirect pages which have interwiki
  4. Pages which have old_interwiki (not wikidata)
  5. Pages which have duplicated coordinates
  6. Redirect pages whose talk page redirects to another page query
  7. Redirect pages whose talk page is not a redirect query
  8. Redirect talk pages whose main page is not a redirect query
  9. Redirect pages with (disambiguation) that link to non-disambiguation pages query
  10. Similar pages with different hidden characters query
Cleaning content
  1. Pages which have : after == (for example == foo ==\n:the text)
  2. Pages which have several <br/> in a row (for example foo<br/><br/><br/><br/><br/>bar)
  3. Pages which have [•●⚫⬤] instead of * (for example • foo \n• bar)
  4. Pages whose lines start with numbers instead of #
  5. Pages which have a non-standard title for the sources or external links subsection (for example == our sources == or == the sources == ,...)
  6. Pages linking to (wiki(pedia|media|data|source|news|oyage|quote)|wiktionary)\.org without using their template
  7. Pages/articles which have several ['math', 'code', 'nowiki', 'pre', 'source', 's', 'su[bp]', 'noinclude', 'includeonly', 'big', 'small','gallery'] tags in a row
  8. Pages which have [\u0085\u00A0\u1680\u180E\u2000-\u200A\u2028\u2029\u202F\u205F\u3000] characters instead of a normal space
  9. Pages which have LRM، RLM characters like (\u202A|\u202B|\u202C|\u202D|\u202E|\u200F)
  10. Pages which have ... instead of …
  11. Pages which have ---- for a horizontal line
  12. Pages which have a space between = signs (for example = =)
  13. Pages which have more than 5 = in a subsection heading (for example ========= foo ===========)
  14. Pages which have too many empty lines in their content (for example \n\n\n\n\n\n\n or \n\n \n\n)
  15. Pages which have a tab \t at the start of a line (for example \n\t)
Yamaha5 (talk) 01:25, 29 June 2016 (UTC)
I believe that many of these features can be handled by queries or some PetScan lists. IMO CW should be aimed at things which are not accessible from the database, such as wikitext or HTML markup errors. Matěj Suchánek (talk) 18:41, 29 June 2016 (UTC)
The "Cleaning content" part can't be done by query. The database's text table is closed, so it is not possible to get these by query. Yamaha5 (talk) 20:25, 29 June 2016 (UTC)
Yamaha5 Egads. I hate to have been your mom. Yamaha, what do you want for dinner. Mom, I'll have chicken, steak, carrots, peas, mashed potatoes, cauliflower, spaghetti ...
  • A quick look... some can't be implemented, for example interwikilinks and ---- are valid.
  • For #8 and #9 on the cleanup list, on enwiki the following are being checked: \x{007F}, \x{200B}, \x{2028}, \x{202A}, \x{202C}, \x{202D}, \x{202E}, \x{00A0}, \x{00AD}, \x{202B}, \x{200F}, \x{2004}, \x{2005}, \x{2006}, \x{2007}, \x{2008}
  • To implement this is easy. Are there any on the enwiki list you don't want? Can you and Magioladitis (he is the expert, not me) look at the rest and see if they are ok to be added. I can't remember exactly but I think \x{202B} and \x{202F} caused problems if they were removed on enwiki.
Bgwhite (talk) 22:18, 29 June 2016 (UTC)

#4 will be a disaster, and I can show why. -- Magioladitis (talk) 22:23, 29 June 2016 (UTC)

User:Bgwhite :))) For the characters, we can omit LRM، RLM and ZWNJ, as they are used in foreign languages.
User:Magioladitis: By 4, do you mean #4? Yamaha5 (talk) 22:39, 29 June 2016 (UTC)
Yamaha5 yes, I mean #4. -- Magioladitis (talk) 22:42, 29 June 2016 (UTC)
Yamaha5 I've gotten these mixed up in the past. Do you want me to add enwiki's list for fawiki? Bgwhite (talk) 00:20, 30 June 2016 (UTC)
Are the lists different for each project? I thought the lists were the same for all projects. In fawiki, we have an active bot for the query part, but we don't have an active bot for the content part, which is related to checkwiki. Is it possible to add them to the whole project for all languages? If you want, I can help you add them.
If we have these lists at checkwiki we can clean them regularly. Yamaha5 (talk) 05:44, 30 June 2016 (UTC)
Yamaha5 We are concerned that some of the Unicode characters are needed in other languages, especially in right-to-left ones. I'd rather take this one slow and push any new Unicode characters only to those projects that want them. For example, two of the LRM، RLM characters on your wanted list do cause problems on enwiki if they are removed. I get confused about which acronym belongs to which Unicode character... I've got dyslexia. I can read ok, it's processing in the head and also writing that causes me problems; LRM and RLM get jumbled, for example. So, which of the Unicode characters I listed above do you want or not want? These can easily be added for the next run, then we can test the others you mentioned in fawiki and enwiki for August's run. Bgwhite (talk) 05:14, 1 July 2016 (UTC)
Bgwhite "what Unicode characters I listed above do you want or not want?" If you mean for fa.wiki: we now have a cleaning tool which converts #8 to a space and #9 to \u200c and does the conversion for #10. We tested it and it was fine. If you mean which characters may cause problems for other languages like English: in my opinion we should get the list checked one by one by the local users, and they can tell us which ones should be removed for them. So for fawiki we need #8, #9 and #10 as I mentioned above; for other languages we can remove whatever they want. Yamaha5 (talk) 07:48, 1 July 2016 (UTC)
#8: I removed the characters duplicated between my list and yours, so these are the characters that should be added to checkwiki for all languages: U+0020, U+2000, U+2001, U+2002, U+2003, U+2009, U+200A, U+007F, U+200B, U+2028, U+202A, U+202C, U+202D, U+202E, U+00A0, U+00AD, U+202B, U+200F, --convert to--> space
#9: For fa.wikipedia we need to list all the characters mentioned in #9; for other languages I don't know.
#10: For fa.wiki we need it.
Finally, please take a look at this. We can add them to checkwiki (new request :) ). Yamaha5 (talk) 08:34, 1 July 2016 (UTC)
Yamaha5 I've added fawiki to the same ones enwiki currently find. AWB can convert or remove these via the find and replace. For example, add "\u200E|\u200F|\uFEFF|\u200B|\u2028|\u202A|\u202C|\u202D|\u202E|\u00AD" to the find column and a space in the replace column. Bgwhite (talk) 01:04, 3 July 2016 (UTC)

Coming back to this section, I support coding up #3 and #4 (and maybe #13 and #15) from the second list. Matěj Suchánek (talk) 09:25, 26 November 2016 (UTC)
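
For reference, the find-and-replace described above for AWB can be sketched in Python as follows. The character set is copied from the AWB pattern quoted in this section; the function name `replace_invisible` is an assumption. ZWNJ (U+200C) is not in the set, and wikis that rely on LRM/RLM may want to drop U+200E and U+200F from it.

```python
import re

# Same code points as the quoted AWB "find" pattern, written as one character class.
INVISIBLE = re.compile(
    "[\u200E\u200F\uFEFF\u200B\u2028\u202A\u202C\u202D\u202E\u00AD]"
)


def replace_invisible(text, replacement=" "):
    """Replace each listed invisible character with the replacement string."""
    return INVISIBLE.sub(replacement, text)


def list_invisible(text):
    """Return the code points (as U+XXXX) of listed characters present in text."""
    return sorted({"U+%04X" % ord(ch) for ch in INVISIBLE.findall(text)})


sample = "foo\u200Bbar\u00ADbaz"
print(replace_invisible(sample))
print(list_invisible(sample))
```

Listing the code points before replacing them makes it easier to review, wiki by wiki, which characters are safe to convert.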

#90 and #91 for fa.wiki[edit]

Is it possible to deactivate #90 and #91 for fa.wiki (the part which reports an error for using another wiki as a reference)? Because of the lack of reliable online Farsi sources, at fa.wikipedia we have a consensus to use en.wikipedia and other big wikis as sources for minor articles, so most of the #90 and #91 reports for fa.wiki shouldn't be fixed. Yamaha5 (talk) 19:45, 3 July 2016 (UTC)

Yamaha5 Unfortunately, no. Keeping #90 on should be fine, but you will have to turn #91 off. Bgwhite (talk) 06:45, 4 July 2016 (UTC)
How can I turn off #91? Can we control the lists? Or do you mean we should fix the articles on fa.wiki? Yamaha5 (talk) 08:02, 4 July 2016 (UTC)
Yamaha5 You can turn off #91. You've edited the list before. I generally leave the lists to be maintained by whoever wants to. You know Farsi, I don't, so edit it to your heart's content.— Preceding unsigned comment added by Bgwhite (talkcontribs)
I believe the stat page doesn't use that page, because, as you can see, we translated many of the labels but we can't see them here. For example, top_priority_script was translated at fa:ویکی‌پدیا:ویکی‌پروژه_تصحیح_ویکی‌پدیا/ترجمه but the fawiki_checkwiki page still shows "high priority". Also, how can I disable #91? Show me on the English page (the line which I should remove). (I found it.) Yamaha5 (talk) 08:43, 4 July 2016 (UTC)
Yamaha5 I think no _script variables are taken into account, you should use _fawiki variables. --NicoV (Talk on frwiki) 16:59, 8 July 2016 (UTC)
NicoV thanks.Yamaha5 (talk) 19:34, 8 July 2016 (UTC)
I added two patches here. Please merge them to use fawiki's translation and have better support. Yamaha5 (talk) 12:43, 4 July 2016 (UTC)

#54[edit]

Would you please add {{Break}} and these redirects to list #54?Yamaha5 (talk) 08:48, 7 July 2016 (UTC)

False positives for #105[edit]

Hi Bgwhite, CW reports 2 false positives for #105 on frwiki, fr:Tournoi des candidats de Zurich 1953 and fr:Championnat du monde d'échecs 1963, both for the same reason, a table cell filled with several equal signs. Could you ignore those cases as I did with WPC : if the line starts with a pipe, then do not report it as an error as it is most probably a table cell. --NicoV (Talk on frwiki) 15:31, 7 July 2016 (UTC)
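
The suggested rule can be sketched in Python as follows. This is an illustration only and assumes that #105 flags lines which end with a run of equal signs but do not begin with one; the function name `suspicious_heading_lines` is hypothetical.

```python
import re

# A line that ends with two or more "=" (optionally followed by whitespace).
ENDS_WITH_EQUALS = re.compile(r"={2,}\s*$")


def suspicious_heading_lines(wikitext):
    """Return lines that look like headings missing their leading '='."""
    hits = []
    for line in wikitext.splitlines():
        stripped = line.strip()
        # Lines starting with a pipe are most probably table cells: skip them.
        if stripped.startswith("|"):
            continue
        if ENDS_WITH_EQUALS.search(stripped) and not stripped.startswith("="):
            hits.append(line)
    return hits


text = "History ==\n| 1 || 0 || ====\n== Fine =="
print(suspicious_heading_lines(text))
```

The table row ending in `====` is ignored, while the heading with a missing opening `==` is still reported.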

New false positives for #22[edit]

Hi Bgwhite, new false positives are appearing on frwiki when the category name itself contains a colon with whitespace characters around it, like [[Catégorie:Acteur de Lost : Les Disparus]] in fr:Terry O'Quinn. --NicoV (Talk on frwiki) 19:21, 28 July 2016 (UTC)

NicoV Should be fixed for the run that starts in an hour. enwiki doesn't have two colons in a cat. No good #*$(@ nothing &(*! French. Problem was caused by the update that catches the #22s WPC found. Bgwhite (talk) 23:11, 28 July 2016 (UTC)
Bgwhite Most of them are fixed, except fr:Lost : Les Disparus where [[Catégorie:Lost : Les Disparus|Lost : Les Disparus]] is still detected by CW. --NicoV (Talk on frwiki) 19:57, 17 August 2016 (UTC)
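
A check that only looks at whitespace around the namespace colon, and ignores colons inside the category name or sort key, could be sketched like this in Python. It assumes that #22 is meant to report stray whitespace between the brackets, the namespace and the category name; the namespace list and the function name are assumptions for this example.

```python
import re

# Whitespace is captured before the namespace, before the first colon and
# after it. Everything after that colon belongs to the name or sort key.
CATEGORY_LINK = re.compile(
    r"\[\[(\s*)(?:Category|Catégorie)(\s*):(\s*)[^\]]*\]\]",
    re.IGNORECASE,
)


def category_has_stray_space(link):
    """True if whitespace surrounds the namespace or its colon."""
    m = CATEGORY_LINK.match(link)
    if m is None:
        return False
    return any(group != "" for group in m.groups())


for link in (
    "[[Catégorie:Acteur de Lost : Les Disparus]]",
    "[[Catégorie:Lost : Les Disparus|Lost : Les Disparus]]",
    "[[Catégorie: Acteur]]",
    "[[ Category:Actors]]",
):
    print(link, category_has_stray_space(link))
```

Because only the first colon is treated as the namespace separator, " : " inside the title or the sort key is never inspected.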

#6 and #37 mostly obsolete.[edit]

@NicoV, Magioladitis, Yamaha5, Josve05a, Edgars2007, and Facenapalm: MediaWiki is moving to a new collation scheme called Unicode collation algorithm (UCA). Letters with diacritics will be sorted the same as with the non-diacritic version. I still don't know the timetable, but I did find the phab ticket (T136150) on moving enwiki to UCA. They have already moved several other wikis to UCA, including Russian, French, Latvian, Farsi and Swedish wikis. The listing of wikis can be found here; I'm thinking, #6 and #37 will only check for punctuation at some point for all wikis. I'll work on getting the wikis already on UCA to only check punctuation. Bgwhite (talk) 02:14, 29 July 2016 (UTC)

@Bgwhite: Keep in mind that we have the T139110 bug. Does it cause problems for #6 and #37? Yamaha5 (talk) 03:49, 29 July 2016 (UTC)
lvwiki has disabled those ones, so I'm fine. --Edgars2007 (talk/contribs) 06:44, 29 July 2016 (UTC)
Same on ruwiki. In ruwiki, the only allowed letter with diacritic in titles is ё, but it's sorted correctly. Facenapalm (talk) 10:29, 29 July 2016 (UTC)
Czech Wikipedia doesn't use these errors, so you can remove the hardcoded stuff for cswiki from the code. Matěj Suchánek (talk) 08:13, 22 October 2016 (UTC)

Reference localization[edit]

Hello. Is there a possibility to recognize a template as a footnote? Thank you. IKhitron (talk) 15:35, 29 July 2016 (UTC)

  • You're talking about this?
 error_003_templates_ruwiki=
   Примечания
   Список примечаний
   Reflist
   Reflist+ END
# ...
 error_078_templates_ruwiki=
   (Примечания|Список примечаний|Reflist\+?)(?![^}]*group) END
Facenapalm (talk) 16:16, 29 July 2016 (UTC)
Not at all, Facenapalm, thank you, I'm talking about a footnote (ref), not references. IKhitron (talk) 16:28, 29 July 2016 (UTC)
Facenapalm I'm also unclear what you are asking. Remember, I'm slow. Could you put what you're asking in different words?
Is there any possibility that you wanted to ask me this question, Bgwhite? IKhitron (talk) 23:59, 29 July 2016 (UTC)
IKhitron Yes. Like I said, I'm slow. Bgwhite (talk) 00:39, 30 July 2016 (UTC)
Well, Bgwhite, when you want to add a footnote you use <ref name=somename...>some text</ref>. I can't do this in rtl, so I use {{reftemplate|name=somename|...|some text}}, which is transcluded to the previous form. I asked if there is a possibility to add local name of footnote template, that will be recognized as ref tag. IKhitron (talk) 00:47, 30 July 2016 (UTC)
@Bgwhite: IKhitron (talk) 10:33, 9 August 2016 (UTC)
@Bgwhite:? IKhitron (talk) 20:34, 30 August 2016 (UTC)
IKhitron Ok, I've got some time this week. I'm still not understanding. Which error is this for? What would be an error case? Bgwhite (talk) 07:19, 31 August 2016 (UTC)
Thank you, Bgwhite. There are several, especially 78 and 81, but also 61 and 67. For 78, if the article has no ref and no references but has a ref template, it's not recognized. For 81, it would be splendid if the case where the ref text and the template text are the same could be recognized, for example. IKhitron (talk) 11:34, 31 August 2016 (UTC)
IKhitron Ok, I'm understanding. For #61 and #78, you can add reftemplate to your translation file. For #61, add at the end of its config:
error_061_templates_enwiki=
  reftemplate END
Then do the same for 78. For #67... either #61 is on, or #67 is on, but not both. #81 is a different story and it's a bugger. Not sure how to do that one. Do you have some examples so I can do some testing? Bgwhite (talk) 18:22, 31 August 2016 (UTC)
Thank you very much, Bgwhite. It's already a lot for me. About example: You have he:Template:הערה and all transcluded pages. The base is: template named "הערה", which has some parameters, when the reference text is the first unnamed parameter. As in {{הערה|שם=refname1|reftext|קבוצה=refgroup5}}. If you'll decide it's possible I'll thank you even more. IKhitron (talk) 19:36, 31 August 2016 (UTC)
By the way, Bgwhite, is there a possibility to do this for #3? I mean more ref templates, as in #61, not more references templates as in #78. Thank you, IKhitron (talk) 15:35, 4 September 2016 (UTC)
And one more btw: the #78 references templates setting does not recognize different groups. Is there some parameter for this? Thank you. IKhitron (talk) 15:48, 4 September 2016 (UTC)
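
To see how the ruwiki #78 setting quoted earlier in this section behaves with groups, here is a small Python sketch. It assumes, as mentioned elsewhere on this page, that `{{` is simply prepended to the configured regex; the function name `count_reference_templates` is hypothetical and this is not the Checkwiki code itself.

```python
import re

# The value of error_078_templates_ruwiki as quoted above (without "END").
CONFIG_REGEX = r"(Примечания|Список примечаний|Reflist\+?)(?![^}]*group)"

# Assumption: the checker prepends "{{" to the configured regex.
PATTERN = re.compile(r"\{\{" + CONFIG_REGEX)


def count_reference_templates(wikitext):
    """Count reference-list templates that do not carry a group parameter."""
    return sum(1 for _ in PATTERN.finditer(wikitext))


print(count_reference_templates("{{Reflist}}"))
print(count_reference_templates("{{Reflist|group=n}}"))
print(count_reference_templates("{{Примечания}}\n{{Reflist+}}"))
```

The negative lookahead `(?![^}]*group)` is what excludes templates that have a group parameter before the closing braces, so grouped lists are not counted here.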

More errors / more bots[edit]

If we manage to have more bots running daily, we can drastically reduce the time required to fix errors. This means we have more free time to detect more errors and add them to our list. What could these errors be? In an ideal world, we could check all of WP:GENFIXES and see what is worth doing even as a sole task. -- Magioladitis (talk) 09:25, 30 July 2016 (UTC)

Help with translation page[edit]

Resolved

Hello. I hope somebody who reads this can find 5 minutes to help me. I'll be very glad if it's possible, even though I know it's not your "duty". I made a lot of changes in our translation page, because most of it dated from the time when checkwiki was a beta on dewiki. But it doesn't work any more! I tried to find some variable without END or some other syntax error, but could not. What could be the problem? Thank you very very much in advance, IKhitron (talk) 11:55, 31 July 2016 (UTC)

Isn't "description_text_hewiki" the one that screws up everything? --Edgars2007 (talk/contribs) 13:54, 31 July 2016 (UTC)
Everything is possible. Why do you think it's that one? Is there some problem in the description? Thank you very much, IKhitron (talk) 15:09, 31 July 2016 (UTC)
As I don't know how those translation files are parsed by the Checkwiki system, I'm just guessing. </syntaxhighlight> looked suspicious (and other non-HTML stuff), but I may be wrong. --Edgars2007 (talk/contribs) 16:05, 31 July 2016 (UTC)
I see. I created this part as in frwiki, and it works there. IKhitron (talk) 21:04, 31 July 2016 (UTC)

Article that doesn't exist appears in the database and in maintenance categories[edit]

The page USA:S inrikessäkerhetsdepartement has appeared on sv.wp's list of #2-errors for ~1 year now (or longer), at least when processing with WPCleaner. That page does not exist (the page USA:s inrikessäkerhetsdepartement, however, does exist). Yet this page appears on the CHECKWIKI list, and in the automated maintenance category Pages using invalid self-closed HTML tags on sv.wp. Why is this? (tJosve05a (c) 10:02, 1 August 2016 (UTC)

It looks like parsers think USA is a namespace and automatically uppercase the first letter of the rest. IKhitron (talk) 15:07, 1 August 2016 (UTC)

#88 has false positive[edit]

Here most of the reported items are false positives. The {{DEFAULTSORT:}} on fa.wikipedia is {{ترتیب‌پیش‌فرض:}}. Checkwiki reports any text that starts with ترتیب: and doesn't check that it is preceded by {{. For example, fa:آرایه‌های ادبی doesn't have a blank at the first position. Yamaha5 (talk) 11:48, 9 August 2016 (UTC)

In other words: the report should only check cases which have {{ directly before the MediaWiki magic word. For example, for English, text like this is reported incorrectly:
* some text DEFAULTSORT: foo some text...

for Persian

* some text ترتیب: foo some text...

This is wrong. It should report DEFAULTSORT: only if it is preceded by {{, as in the text below:

* some text {{DEFAULTSORT: foo some text...

for Persian

* persian text {{ترتیب: foo some text...

Yamaha5 (talk) 07:44, 12 August 2016 (UTC)

At #88 the code should be changed from
                my $sortkey = $test_text;
                $sortkey =~ s/^([ ]+)?$current_magicword//;
                $sortkey =~ s/^([ ]+)?://;

to

                my $sortkey = $test_text;
                $sortkey =~ s/^\{\{([ ]+)?$current_magicword//;   # require the opening braces
                $sortkey =~ s/^([ ]+)?://;                        # braces are already stripped here

Yamaha5 (talk) 07:48, 12 August 2016 (UTC)
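
The idea behind this request can be sketched in Python as follows. This is only an illustration, not CheckWiki's actual Perl code: the function name `sortkey_if_magicword` and its parameters are made up for the example. It returns a sort key only when the magic word directly follows `{{`, so plain text that merely contains "DEFAULTSORT:" or "ترتیب:" is ignored.

```python
import re

def sortkey_if_magicword(text, magicword="DEFAULTSORT"):
    """Return the raw sort key only when the magic word follows '{{'.

    Hypothetical helper for illustration; CheckWiki itself is written in Perl.
    """
    pattern = r"\{\{\s*" + re.escape(magicword) + r"\s*:([^}]*)"
    match = re.search(pattern, text)
    return match.group(1) if match else None

def has_leading_blank(text, magicword="DEFAULTSORT"):
    """Error #88 candidate: the sort key starts with a blank."""
    key = sortkey_if_magicword(text, magicword)
    return key is not None and key.startswith(" ")

print(has_leading_blank("some text DEFAULTSORT: foo some text"))   # no braces, ignored
print(has_leading_blank("some text {{DEFAULTSORT: foo}} more"))    # braces present
```

The same helper works for a localized magic word by passing it as the second argument.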

Request: Report for wrong dictation[edit]

There are some pages on Wikipedias, like the ones below, which list common misspellings. Please add this to the reports to show which pages have these words.

The first word before || is the wrong one. Yamaha5 (talk) 09:30, 11 August 2016 (UTC)
Yamaha5 On the enwiki side, you are talking about Wikipedia:Lists of common misspellings and Wikipedia:Lists of common_misspellings/For machines? If so, then this would be outside of CheckWiki's scope. In theory, CheckWiki finds syntax errors and other errors in the source code. Spelling and other kinds of word errors wouldn't be in CheckWiki's scope. One can do a Google or a Wikipedia search to find these. Bgwhite (talk) 22:14, 11 August 2016 (UTC)
Bgwhite I know we can search at google. I wanted monthly lists which can be solved by bots or AWB by users Yamaha5 (talk) 07:53, 12 August 2016 (UTC)

AWB provides Typo fixing but this is the outside the scope of this project. - Magioladitis (talk) 07:56, 12 August 2016 (UTC)

Yamaha5, you can generate such lists using WPC, with error #501 (spelling) and the dump analysis feature, but it may require a few modifications of the configuration and some tweaks. --NicoV (Talk on frwiki) 07:46, 25 September 2016 (UTC)

#28 possible false positives[edit]

Hi. I started to fix #28, and found he:(Miss)understood and he:Anastacia at start of the list. It doesn't look like there are problems there. Maybe there are some more, didn't check yet. Thank you, IKhitron (talk) 18:06, 11 August 2016 (UTC)

IKhitron It was fixed a few days ago. The problem happens when a table is the very last thing in an article... no categories, defaultsort or other templates. I made a change to catch more cases of #28. It was thinking |}} was a table ending when it's most likely a template ending. As a result of the change, #28 will pick up cases of {{|, such as {{|url=http... , where "cite web" is missing. This is an error, but not related to tables. Bgwhite (talk) 21:47, 11 August 2016 (UTC)
Thank you, Bgwhite. It means, these articles will not be in the list in the next run? IKhitron (talk) 22:10, 11 August 2016 (UTC)
IKhitron Correct. These should not be in next month's run. Bgwhite (talk) 22:16, 11 August 2016 (UTC)
Thank you very much for your help. IKhitron (talk) 22:58, 11 August 2016 (UTC)

New id suggestion[edit]

Hi. What do you think about such an id:

  1. Read the article.
  2. Find all strings [^']''[^'] and count them as I.
  3. Find all strings [^']'''[^'] and count them as B.
  4. Find all strings [^']'''''[^'] and count them as IB.
  5. At the end, mark the article as new id if I count or B count (or both) are odd.

Thank you. IKhitron (talk) 14:26, 23 August 2016 (UTC)

There will be false positives from constructions such as "Billboard's", which usually renders correctly wherever I have seen it. I suppose someone might think it valuable to replace that construction with Billboard's, but it's not really an error, since it renders correctly. – Jonesey95 (talk) 14:34, 23 August 2016 (UTC)
Yes, you are right. But it's better than ignoring this problem. One can whitelist this article. IKhitron (talk) 14:37, 23 August 2016 (UTC)
I find 1,213 articles with this search for ]]'''s. It might be a fun little AWB project for someone to clean them all. – Jonesey95 (talk) 15:36, 23 August 2016 (UTC)
I'm afraid this whitelist will be really big. The other problem is that the article can be wrong even if the counts are correct, for example, here: a<ref>'''b</ref> '''c d. Facenapalm (talk) 15:39, 23 August 2016 (UTC)
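
For reference, the proposed check could be sketched roughly like this in Python. It is an approximation under stated assumptions: it counts runs of exactly 2, 3 and 5 apostrophes anywhere in the text (the `[^']…[^']` patterns in the proposal would not match a run at the very start or end of the page), and the function names are invented for the example.

```python
import re
from collections import Counter

def quote_run_counts(wikitext):
    """Count runs of exactly 2, 3 and 5 apostrophes (I, B and IB in the proposal)."""
    lengths = Counter(len(run) for run in re.findall(r"'+", wikitext))
    return lengths[2], lengths[3], lengths[5]

def looks_unbalanced(wikitext):
    """Step 5 of the proposal: flag the page if the I count or the B count is odd."""
    italic, bold, _both = quote_run_counts(wikitext)
    return italic % 2 == 1 or bold % 2 == 1

print(looks_unbalanced("''a'' and '''b'''"))        # balanced markup
print(looks_unbalanced("''a and b"))                # unclosed italics
print(looks_unbalanced("a<ref>'''b</ref> '''c d"))  # the example above: counts are even
```

The last call shows the limitation raised in this thread: the bold count is even, so the page is not flagged although the markup is wrong.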

A nice template is {{'}}. -- Magioladitis (talk) 15:44, 23 August 2016 (UTC)

Yes, 1,213 is a lot indeed. It's in enwiki (and other en* wikis) only, but you can't write one id for enwiki and other for rest wikis. So, what about the smaller project - mark if I+B is odd? IKhitron (talk) 16:29, 23 August 2016 (UTC)
Well, Bgwhite, what's the decision? IKhitron (talk) 20:35, 30 August 2016 (UTC)
IKhitron I think there are too many false positives. Looks like more false positives than actual errors. So, I don't think it would be a good idea. Bgwhite (talk) 21:16, 30 August 2016 (UTC)
Thank you. IKhitron (talk) 21:19, 30 August 2016 (UTC)

I would find this very helpful as this is what usually blocks pywikibot's library mwparserfromhell from successfully parsing a template. Matěj Suchánek (talk) 18:01, 25 November 2016 (UTC)

False positive for #94[edit]

Hi, fr:Nicotinamide adénine dinucléotide is detected as having an isolated ref tag, with the notice </ref>| cl50 = | logp = | dja = | od but I don't understand what's wrong because the reported closing ref tag has an opening tag <ref name="ChemIDplus">{{ChemID|53-84-9|Nadide}}, consulté le 16 août 2009</ref> | CL50 = | LogP = | DJA = . --NicoV (Talk on frwiki) 22:49, 18 September 2016 (UTC)

NicoV It's not giving me an error. I haven't changed that part of the code this month. Article hasn't been changed this month. I don't know. Bgwhite (talk) 04:57, 19 September 2016 (UTC)
Bgwhite checkarticle.cgi gives the following answer:
  • - 94 3695 </ref>| cl50 = | logp = | dja = | od
so it's still reported as an error on wmflabs... --NicoV (Talk on frwiki) 05:21, 19 September 2016 (UTC)

Invalid link to article[edit]

In dewiki the link to 1% of one shows only 400 Bad Request. The problem is the missing URL-encoding for the "%".

This is the correct link with encoding: https://de.wikipedia.org/wiki/1%25_of_one

This encoding should be done automatically.--GünniX (talk) 03:39, 25 September 2016 (UTC)
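
A minimal sketch of the encoding step, using Python's standard `urllib.parse.quote`. The helper name `article_url` is hypothetical, and keeping "/" and ":" unescaped is a choice made for this example only; it is not taken from the CheckWiki code.

```python
from urllib.parse import quote

def article_url(lang, title):
    """Build an article URL with the title percent-encoded (UTF-8)."""
    # Spaces become underscores as in MediaWiki URLs; '/' and ':' are left
    # readable for subpages and namespaces (an assumption of this sketch).
    path = quote(title.replace(" ", "_"), safe="/:")
    return f"https://{lang}.wikipedia.org/wiki/{path}"

print(article_url("de", "1% of one"))
# https://de.wikipedia.org/wiki/1%25_of_one
```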

Hi. We have the same problem. The name shown as "Ss", and I still do not know what is the right one. IKhitron (talk) 09:13, 25 September 2016 (UTC)
GünniX In theory, this should now be fixed. IKhitron, could you give me an example when one shows up? Bgwhite (talk) 00:26, 30 September 2016 (UTC)
Thanx, I'll test it at the next occurrence. --GünniX (talk) 06:19, 30 September 2016 (UTC)
Sure, Bgwhite, here you are: [8]. IKhitron (talk) 09:20, 30 September 2016 (UTC)
The Ss, Bgwhite, was he:ß. IKhitron (talk) 18:50, 13 October 2016 (UTC)

Line break tags[edit]

As part of mw:Parsing/Replacing Tidy, all the wikis need to be checked for invalid </br> codes. They should ideally be replaced with just plain <br>. In the regular search box, you can find these by typing insource:/\<\/br\>/. There are (currently) only a few of these in articles at the English Wikipedia, but there are 900+ pages in the Template: namespace that contain this error, and there are potentially thousands of affected pages at other wikis. Whatamidoing (WMF) (talk) 17:27, 4 October 2016 (UTC)

These are easy to fix (I just fixed 90 of them), but editors will continue to add them. We probably need a maintenance category to track this tag and similar tags that can cause HTML errors. Since all wikis will need the category, it should probably be created at the MediaWiki level. – Jonesey95 (talk) 20:59, 4 October 2016 (UTC)
Whatamidoing (WMF) I think you meant </br> and not </br>/. CheckWiki does find cases of </br> in article space along with </hr>. It also finds invalid self-closed tags such as <span /> and <small />. For some languages (e.g. enwiki), CheckWiki does a daily scan. CheckWiki also scans the monthly dump to catch any that were missed.
@NicoV, Meno25, Edgars2007, Facenapalm, Josve05a, Matěj Suchánek, and Magioladitis: Alerting the normal crew. The first link Whatamidoing gave talks about the issues replacing Tidy will cause. Many are fixed by CheckWiki, others are not. One of the maintenance categories already set up is Category:Pages using invalid self-closed HTML tags. This should be available on all Wikis. Bgwhite (talk) 21:12, 4 October 2016 (UTC)
(I fixed Whatamidoing's apparent typo above, so Bgwhite's first sentence may be confusing to readers here.) – Jonesey95 (talk) 21:26, 4 October 2016 (UTC)

I fixed all articles and templates. -- Magioladitis (talk) 23:26, 4 October 2016 (UTC)

Wow, that was fast! I find 75,835 pages when I search all namespaces for the insource string above. I think we need a maintenance category and a bot. – Jonesey95 (talk) 03:50, 5 October 2016 (UTC)
Whatamidoing (WMF), is there a Phab task to add a tracking category to MediaWiki for these errant tags? – Jonesey95 (talk) 14:02, 7 October 2016 (UTC)
I don't know. Whatamidoing (WMF) (talk) 20:52, 7 October 2016 (UTC)
I added a note to T145530 with a link to this discussion. – Jonesey95 (talk) 22:54, 7 October 2016 (UTC)

@Bgwhite: I searched for and fixed all instances of </br> on Arabic (ar) Wikipedia (373 pages) and Egyptian Arabic (arz) Wikipedia (31 pages). However, since users are likely to add the invalid tag again, I will have to run the bot regularly to fix it. --Meno25 (talk) 12:25, 8 October 2016 (UTC)
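
The replacement discussed in this thread is simple enough to sketch in a few lines of Python. This is a naive illustration, not the code of any of the bots mentioned: it does not skip nowiki, pre or source blocks, where a literal </br> may be intentional, and the name `fix_closing_br` is made up.

```python
import re

# Matches the invalid closing form only; <br>, <br/> and <br /> are left alone.
INVALID_BR = re.compile(r"</br\s*>", re.IGNORECASE)

def fix_closing_br(wikitext):
    """Replace every </br> with a plain <br>."""
    return INVALID_BR.sub("<br>", wikitext)

print(fix_closing_br("line one</br>line two<br />line three"))
# line one<br>line two<br />line three
```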

#67 configuration does not work[edit]

Hi. I opened #61 a month ago in hewiki and it gave me 100,000 answers. Then I closed it and opened #67, and it gave only 100 answers. Both have the same template configurations, but in the last one it does not work and returns pure ref only. Thank you. IKhitron (talk) 12:44, 5 October 2016 (UTC)

IKhitron Can't turn on both #61 and #67 at the same time. It depends on Hebrew Wiki's rules on which one to choose. On English Wiki, references come after punctuation marks, so #61 is turned on. On French Wiki, references come before punctuation marks, so #67 is turned on. With 100,000 errors on #61, I'd guess that Hebrew Wiki uses #67, before punctuation mark error. #67 doesn't return just the pure refs, the first thing is a punctuation mark. Bgwhite (talk) 19:29, 5 October 2016 (UTC)
You did not understand me, Bgwhite. I did not turn on both. I wanted the list for 61 on one run and for 67 on another. And I did not mean pure refs without punctuation, I meant pure refs errors, and nothing with templates as references. And the hewiki rule is: it does not matter whether it's before or after the punctuation mark, but it should be consistent within every article. So I need both lists to get the common articles in AWB list comparer. IKhitron (talk) 19:38, 5 October 2016 (UTC)
IKhitron I'm still not understanding. Could you give me an example of what's happening and what the desired result is? Bgwhite (talk) 15:26, 19 October 2016 (UTC)
Sure, Bgwhite. #61: <ref>text</ref>. works, {{הערה|text}}. works. #67: .<ref>text</ref> works, .{{הערה|text}} does not work. Thank you, IKhitron (talk) 15:34, 19 October 2016 (UTC)
IKhitron Ok, I understand... I'm slow. #67 doesn't check for templates. I'll need to add it. Could you give me a couple of articles to test on? Bgwhite (talk) 15:56, 19 October 2016 (UTC)
w:he:ויקישיתוף, w:he:הולנד, w:he:מים (commons:, Netherlands and water), Bgwhite. Thank you. IKhitron (talk) 16:03, 19 October 2016 (UTC)
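
To make the requested behaviour concrete, here is a rough Python sketch of a #67-style check that also accepts reference templates. Everything in it is an assumption made for illustration: the punctuation set, the function name and the way template names are passed in are not taken from CheckWiki.

```python
import re

def punctuation_before_ref(wikitext, ref_templates=("הערה",)):
    """Return the spots where a punctuation mark directly precedes a reference.

    A reference is either a <ref> tag or one of the configured templates.
    """
    alternatives = [r"<ref[\s>]"]
    if ref_templates:
        names = "|".join(re.escape(name) for name in ref_templates)
        alternatives.append(r"\{\{\s*(?:" + names + r")\s*\|")
    pattern = re.compile(r"[.,;:]\s*(?:" + "|".join(alternatives) + ")")
    return [m.group(0) for m in pattern.finditer(wikitext)]

print(len(punctuation_before_ref(".<ref>text</ref>")))   # tag after punctuation
print(len(punctuation_before_ref(".{{הערה|text}}")))     # template after punctuation
print(len(punctuation_before_ref("{{הערה|text}}.")))     # template before punctuation
```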

False positives for #24[edit]

Hi Bgwhite, I've found 2 false positives for #24 (pre tags) on frwiki (fr:XML Schema and fr:XQuery), that shouldn't be detected for several reasons:

  • They're not <pre> tags, but <prenom> tags (and they're properly closed)
  • They're inside a <source> tag

--NicoV (Talk on frwiki) 08:27, 8 October 2016 (UTC)
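
Both points can be illustrated with a small Python sketch. It is not how CheckWiki's shared subroutine works; it only shows that a word boundary after "pre" keeps <prenom> out, and that stripping <source> blocks first hides tags that are merely quoted as code. The names are invented, and unbalanced or nested source tags are not handled.

```python
import re

SOURCE_BLOCK = re.compile(r"<source\b[^>]*>.*?</source>", re.IGNORECASE | re.DOTALL)
# \b stops the match from accepting longer tag names such as <prenom>.
PRE_OPEN = re.compile(r"<pre\b[^>]*>", re.IGNORECASE)

def count_pre_open(wikitext):
    """Count opening <pre> tags that are outside of <source> blocks."""
    outside_source = SOURCE_BLOCK.sub("", wikitext)
    return len(PRE_OPEN.findall(outside_source))

print(count_pre_open("<prenom>Jean</prenom>"))                # not a pre tag
print(count_pre_open('<source lang="xml"><pre>x</source>'))   # inside source
print(count_pre_open("<pre>real preformatted text"))          # a real pre tag
```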

Interestingly, a similar issue at cs:JavaServer Pages. Matěj Suchánek (talk) 13:37, 8 October 2016 (UTC)
And more interestingly, already discussed above. Matěj Suchánek (talk) 13:39, 8 October 2016 (UTC)

New id needed for new category sorting algorithm[edit]

Hi. There is a new algorithm for category sorting in Wikipedias. It's much better than the previous one, but there is a new problem: 232,456,743 is sorted between 230 and 235. To fix this, such an article needs a defaultsort:232456743. Could you please create an id for articles whose name includes a comma-separated number and which either have no default sort, or have one in which the commas were not removed? Thank you. IKhitron (talk) 13:03, 20 October 2016 (UTC)

Agree Maybe together with DEFAULTSORT itself having comma separated digits. Matěj Suchánek (talk) 07:43, 22 October 2016 (UTC)
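
A possible shape for such a check, sketched in Python. The pattern for a "comma-separated number" (groups of three digits) and the function name are assumptions of this example; the request above does not define them precisely.

```python
import re

# One to three digits followed by one or more ",ddd" groups, e.g. 232,456,743.
GROUPED_NUMBER = re.compile(r"(?<!\d)\d{1,3}(?:,\d{3})+(?!\d)")

def suggested_sortkey(title):
    """Return the title with grouping commas removed, or None if there are none."""
    if not GROUPED_NUMBER.search(title):
        return None
    return GROUPED_NUMBER.sub(lambda m: m.group(0).replace(",", ""), title)

print(suggested_sortkey("232,456,743"))   # 232456743
print(suggested_sortkey("Paris"))         # None
```

A page would then be reported when `suggested_sortkey` returns a value that differs from the page's existing DEFAULTSORT key, or when no DEFAULTSORT is present.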

Are people really supposed to be doing WCW edits on user talk pages?[edit]

Resolved

It's unimportant and annoying. Can I opt out at least? --Floquenbeam (talk) 21:28, 9 November 2016 (UTC)

Floquenbeam CheckWiki does not scan any talk pages for errors, only articles. CheckWiki only finds errors; it does not correct them. There are tools and scripts out there that also detect and/or fix CheckWiki errors. Most likely, an editor saw an "error" on your talk page that happened to be a CheckWiki error. They then used a tool or script to "fix" it. Bgwhite (talk) 22:54, 9 November 2016 (UTC)

#4 expansion[edit]

Resolved

CHECKWIKI now catches unbalanced closing a tags. -- Magioladitis (talk) 22:30, 12 November 2016 (UTC)

Same for WPC. --NicoV (Talk on frwiki) 09:20, 29 November 2016 (UTC)

#3 expansion[edit]

Resolved

CHECKWIKI now is case insensitive. -- Magioladitis (talk) 22:30, 12 November 2016 (UTC)

Same for WPC. --NicoV (Talk on frwiki) 09:20, 29 November 2016 (UTC)

Deprecation of magic links (T145604)[edit]

@Bgwhite and Magioladitis: and others: I was thinking of adding features in WPC to help replace magic links like RFC, PMID and ISBN with templates, as the magic links will stop working when T145604 is activated (maybe in a year). The easiest way for me would be to create new error numbers (one for each magic link is probably better). What do you think? Should we also add them for CW (yes: we should use error numbers like #112 to #114; no: I will use error numbers like #528 to #530)? I don't think it's useful, since dedicated categories are already filled up automatically by MW (Pages using PMID magic links, Pages using ISBN magic links and Pages using RFC magic links). --NicoV (Talk on frwiki) 14:20, 16 November 2016 (UTC)

I think we need a centralized discussion about what en.WP wants to do with these before we take any action to flag them. – Jonesey95 (talk) 16:21, 16 November 2016 (UTC)
Both CheckWiki and WPCleaner are used across multiple wikis, not just en. The issue is at every language Wikipedia. This could get messy. mw:Requests for comment/Future of magic links contains what already has been done and the status of other tasks. I'd rather not add new CW errors when system-wide categories have already been set up. I wouldn't add anything to CW yet. It looks like there will be a parser function and templates. Parser function isn't ready. Bgwhite (talk) 20:48, 16 November 2016 (UTC)
Ok. I think I will add new errors to WPC as #528 to #530 if I find some free time, and they will be activated only on wikis that decide to activate them. I know that on frwiki I can at least replace all the PMID by a template call and probably ISBN also, less clear for RFC. --NicoV (Talk on frwiki) 13:52, 17 November 2016 (UTC)

I've added error #528 to WPC to detect PMID magic links and suggest replacing them with a template call. It requires modifications both in the CW configuration page and the WPC configuration page to be fully functional. It's already operational for frwiki as the PMID template already existed and works like the magic link. --NicoV (Talk on frwiki) 14:04, 22 November 2016 (UTC)

I've added error #529 to WPC to detect ISBN magic links and suggest to replace them with a template call. Configuration is also required as for #528. --NicoV (Talk on frwiki) 11:39, 26 November 2016 (UTC)

Wishes for exclusions on #34[edit]

Would it be possible to exclude some cases with {{{ or }}} from detection?

  1. all these {{{Zeige...-expressions like in de:Liste der EU-Vogelschutzgebiete in Berlin
  2. { in front of de:template:overline/ de:template:Oberstrich e. g. de:Ernstit

--Hadibe (talk) 19:19, 16 November 2016 (UTC)

Hadibe I don't know what those Zeige expressions are or do, but I hate them. Some pages have hundreds of them. I can exclude all {{{ from being checked. {{{ is already excluded from ruwiki and ukwiki. Those wikis use {{{|} a lot. This would also solve the overline template issue. Bgwhite (talk) 21:06, 16 November 2016 (UTC)
Please don't skip them all. That would avoid detection of typos. Then better leave the status quo and hope that there won't be too many new uses and also hope that Zeige... can be wiped out some day. Anyway, thanks for your fast response. --Hadibe (talk) 21:42, 16 November 2016 (UTC)

CX attributes[edit]

Another type of crap produced by CX: tags like <center>...</center> with CX internal attributes (data-cx-weight="356" data-source="184" class="" id="cx184" contenteditable="true"), like this example. Should we detect them in another error? --NicoV (Talk on frwiki) 17:49, 17 November 2016 (UTC)

NicoV A search reveals 26 articles on enwiki with "<center " and 134 for frwiki. My favourite is <center align = center>. I don't see one on enwiki that should be kept. On frwiki, they should be deleted or moved to <div tags. Probably detect them, but don't automatically fix them? Bgwhite (talk) 20:37, 17 November 2016 (UTC)
Thanks Bgwhite. It's not only center tags, but other tags, also tables... (CX is creating crap in almost every part of wikitext syntax...). For example, 116 pages when searching for data-cx-weight. --NicoV (Talk on frwiki) 11:58, 18 November 2016 (UTC)
Other example 386 results when searching for contenteditable on frwiki... --NicoV (Talk on frwiki) 12:03, 18 November 2016 (UTC)
NicoV I'd never seen or heard of contenteditable before. After looking it up, that is a worthless element on a site where anybody can edit. There's also non-transcrapulator elements such as moz-border-radius. That's a firefox specific element and the general border-radius should be used. Add "<center " to #2? As much as this makes me cry, add "invalid css attributes" as a new error? Bgwhite (talk) 22:03, 18 November 2016 (UTC)
Bgwhite I think I prefer something like the "invalid css attribute" which is more general (contenteditable is crap left by CX and maybe VE, but in many tags). But also adding the center tags with attributes to #2 may be interesting: are there any valid cases for having attributes on center tags? --NicoV (Talk on frwiki) 09:59, 19 November 2016 (UTC)
NicoV Not one of the "weird" <center> is valid on enwiki. Most common one is <center class="">. Bgwhite (talk) 23:38, 19 November 2016 (UTC)

@Bgwhite: Thanks for the new error #112. I see there are a few false positives that could be avoided:

--NicoV (Talk on frwiki) 12:34, 15 December 2016 (UTC)

NicoV I originally had it just looking for "-moz-" and "-webkit-", but there were false positives with urls. Now there has to be either a space or ; first ... ";-moz-".
I just added "-o-" and yesterday was the first run. I think I'll turn it off. I'll add "-ms-" today and see how it goes from there. Bgwhite (talk) 22:00, 15 December 2016 (UTC)
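
The rule described here (a vendor prefix only counts when a space or ";" comes right before it) could look roughly like this in Python. The prefix list and the pattern are guesses for illustration, not the actual #112 code; "-o-" is left out because it was turned off above.

```python
import re

# A space or ';' must come directly before the prefix, which keeps URLs such
# as .../a-moz-b from matching. The prefix list is an assumption.
VENDOR_PREFIX = re.compile(r"[ ;]-(?:moz|webkit|ms)-")

def has_vendor_prefix(wikitext):
    return VENDOR_PREFIX.search(wikitext) is not None

print(has_vendor_prefix('<div style="color:red;-moz-border-radius:5px">'))
print(has_vendor_prefix("http://example.org/a-moz-b"))
```

One consequence of this rule is that a prefix placed directly after the opening quote, as in style="-moz-...", is not caught.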

For #85, ignore <div style="height: ...; width: "> ?[edit]

@Bgwhite: Should we also ignore div tags with a given width or height, like in Phtalocyanine ? For example, <div style="height:150px; width:150px; background-color:#000f89; border-bottom:solid 1px #000000;"></div> gives

--NicoV (Talk on frwiki) 19:11, 17 November 2016 (UTC)

NicoV I ran into one of these the other day. In that case, an infobox should have been used instead of <div>. On enwiki, there is {{Color swatch}} that does the same thing. There are 16 interwiki links listed, but not one to the French equivalent. Bgwhite (talk) 21:02, 17 November 2016 (UTC)

New id suggestion[edit]

Unpleasantly, I suddenly found a lot of categorizations using {{category:...}} instead of [[category:...]]. IKhitron (talk) 12:50, 25 November 2016 (UTC)

IKhitron In the "Search Wikipedia" box, I did a search on enwiki, frwiki and dewiki. Search on enwiki was insource:/\{\{Category\:/. On enwiki, there were 9 articles. French and German wikis didn't have any articles. I don't think this is a widespread problem. Do a search on your wiki and see what comes up. Bgwhite (talk) 22:26, 1 December 2016 (UTC)
Thank you. I fixed 28 last week. IKhitron (talk) 11:18, 2 December 2016 (UTC)
I have a lot of sympathy for this, because I made similar mistakes when I was new. These diffs of the incorrect syntax being added seem to be representative: [9][10][11][12]. (This one may be less representative, but is interesting in its own way.) Whatamidoing (WMF) (talk) 20:53, 6 January 2017 (UTC)

ISBN error check - potential enhancement[edit]

Take a look at this version of Ahmad Shah Durrani, specifically the last entry in the Bibliography. The ISBN contains both hyphens and spaces, preventing it from becoming a magic link. Is it possible and/or advisable for WCW's ISBN error check to look for articles containing this error? I don't know whether it would result in a lot of false positives. – Jonesey95 (talk) 18:00, 25 November 2016 (UTC)

From my experience (we have been running bots to replace magic links with templates for about two weeks), there are a lot of problems with this. Once it even converted a cite template parameter name to a template. IKhitron (talk) 18:04, 25 November 2016 (UTC)
I do not understand this answer. I'm not talking about doing anything with templates. I am talking about detecting and reporting a problem with ISBNs that breaks magic links. – Jonesey95 (talk) 18:18, 25 November 2016 (UTC)
Me too. Sorry, my English is not good. I meant that there are a lot of cases when you are sure you have right regexp, and then recognize another problem. IKhitron (talk) 18:20, 25 November 2016 (UTC)
Magic links are going away soonish anyway. I'd just suggest fixing errors if you find them. Jerod Lycett (talk) 00:55, 26 November 2016 (UTC)
We can't fix them if we can't find them, which is why I suggest enhancing this particular error check. We also can't make these particular magic links "go away", because they are not detected as such by the Mediawiki software. – Jonesey95 (talk) 00:58, 26 November 2016 (UTC)

@Jonesey95: Interesting... Do you know what exactly breaks the magic link ? It's not only mixing hyphens and spaces as none of the following work: ISBN 978- 1-4907 - 1441-7 ; ISBN 978- 1-4907-1441-7 ; ISBN 978 14907 14417 ; ISBN 97814907 14417. I wonder if the problem is not when you have two consecutive filling characters (2 consecutive spaces seem to break the magic links). It rather seems to be a bug in the magic links that we should report to the developers. I put a comment on phabricator T145604. I've just also modified WPCleaner to report ISBN inside nowiki tags as errors #69 (list of results for frwiki), as it seems to be mostly crap produced by CX or VE. --NicoV (Talk on frwiki) 11:24, 26 November 2016 (UTC)

#111 question[edit]

Hello. Thank you for #111, but I have a question: how can I define a single ref template? Thank you. IKhitron (talk) 13:59, 27 November 2016 (UTC)

34th error now detects {! in ruwiki. Again[edit]

66597 matches, in previous dump there were less than 1000. I think the problem is here:

if ( $project ne 'ukwiki' or $project ne 'ruwiki' or $project ne 'bewiki' ) {

Correct code should contain "and", not "or". Facenapalm (talk) 10:30, 28 November 2016 (UTC)
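
The logic error is easy to see when the condition is written out. The Python sketch below mirrors the Perl condition quoted above (the function names are invented): with "or", at least one of the three comparisons is always true, so the branch runs for every project, including the three that were meant to be excluded.

```python
EXCLUDED = {"ukwiki", "ruwiki", "bewiki"}

def buggy_check(project):
    # A project can equal at most one of the names, so this is always True.
    return project != "ukwiki" or project != "ruwiki" or project != "bewiki"

def fixed_check(project):
    # Same as chaining the three comparisons with "and".
    return project not in EXCLUDED

for project in ("enwiki", "ruwiki"):
    print(project, buggy_check(project), fixed_check(project))
```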

Facenapalm This should be fixed now. Main problem was $project wasn't defined yet. Bgwhite (talk) 23:11, 28 November 2016 (UTC)
Thanks! Are there some static analysis tools for Perl that you can use? I'm not sure if they can catch "or" instead of "and" (but some static analyzers for other languages can), but they definitely can catch the use of undefined variables. It's impossible to avoid all stupid errors, but static analyzers can help you to detect them immediately. Facenapalm (talk) 11:29, 29 November 2016 (UTC)
Facenapalm The problem is I bring all types of stupid. The variable was declared, otherwise, an error would have been thrown out. It's very useful to have undefined variables, so Perl doesn't check for it. Perl has strict and warnings pragma that catches a lot of things. I also use Perl::Critic and NYTProf. Bgwhite (talk) 00:04, 30 November 2016 (UTC)
  • @Bgwhite: the problem is still there. There are also some false positives in the 43rd error: for example, I can't see any errors here. I'm not sure what algorithm you use for hiding "{{{!}}", so I suggest this: if the project is ruwiki (ukwiki/bewiki?), replace all {{{!}} with {{(!}} (there is such a template with the same meaning in Russian Wikipedia, so the notice in the checkwiki table will be understandable), and all {{!}}}((?:\}\})*[^\}]) with {{!)}}\1 (see next message). This will allow scanning dumps with the typical algorithm without checking whether the current project is ruwiki in different errors. Facenapalm (talk) 11:50, 6 December 2016 (UTC)
    • @Bgwhite: sorry for constantly troubling you, I just want to be sure that this topic isn't forgotten. I tested some parser features, and it seems there is no sense in making replacements like {{!}}}}} -> {{!)}}}}: they're wrong (the last bracket will be processed as text, not the one that is after {{!}}), so the second replacement becomes even easier: {{!}}}([^}]) -> {{!)}}\1. Facenapalm (talk) 01:23, 19 December 2016 (UTC)
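
The two replacements suggested in this thread can be written down as a short Python sketch. It follows the simplified second pattern from the last message; the function name is made up, and whether CheckWiki applies anything like this is not known from this discussion.

```python
import re

def mask_table_braces(wikitext):
    """Rewrite {{{!}} and {{!}}} so that later brace checks do not trip on them."""
    # First replacement: {{{!}}  ->  {{(!}}
    masked = wikitext.replace("{{{!}}", "{{(!}}")
    # Second replacement: {{!}}} followed by a non-brace  ->  {{!)}} plus that character
    return re.sub(r"\{\{!\}\}\}([^}])", r"{{!)}}\1", masked)

print(mask_table_braces("{{{!}} class=x\n{{!}}} end"))
```

As in the pattern from the thread, a {{!}}} at the very end of the text is left unchanged because the regexp requires one following character.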

Linebreak inside internal links[edit]

CW doesn't catch stuff like this. Matěj Suchánek (talk) 19:15, 1 December 2016 (UTC)

Matěj Suchánek This will be added to the new error #113. #113 will also include some <br> in wikilinks, for example [[Foo<br>]]. Bgwhite (talk) 09:42, 7 December 2016 (UTC)
@Bgwhite: #113 seems to be available for a few days now, but there are no errors reported in enwiki or frwiki. Does it work ? --NicoV (Talk on frwiki) 12:36, 15 December 2016 (UTC)
NicoV It's been added to the database, which is why it is showing up. It's not turned on yet. I'm waiting for #104 and #112 to calm down first. Bgwhite (talk) 22:12, 15 December 2016 (UTC)

#104 hasn't been detected, but CheckWiki still reports it[edit]

Since today dewiki has a high number of #104 (e.g. Altena is listed with notice <ref name="Kalonymos 1/1999" />). WPCleaner didn't find most of them and shows the error message: The error n*104 hasn't been detected in page <pagename>, but CheckWiki still reports it.

It seems to be a problem with special characters in the ref name. It is shown for names with "/", "?", "'", "#". --GünniX (talk) 09:03, 7 December 2016 (UTC)

GünniX, CheckWiki was updated yesterday to report pages with ref names that contain these characters. I do not know if this works correctly. -- Magioladitis (talk) 09:33, 7 December 2016 (UTC)

GünniX Update to #104 makes sure the ref name is compliant with WP:REFNAME. Forbidden characters are # " ' / = > ? \ . WPCleaner is also being updated. A beta version does catch these, see Wikipedia:CHECKWIKI/WPC 104 dump for the enwiki errors caught by WPCleaner. Nico is working out of a hotel room in his spare time, so no word on when a new version of WPCleaner will be released. Bgwhite (talk) 09:40, 7 December 2016 (UTC)
  • By the way, checkwiki now checks for non-latin symbols in refs according to WP:REFNAME. Is it a local enwiki rule, created by enwiki community consensus, or a technical limitation? There are thousands of refs in ruwiki which are named in Russian, and it seems that all of them work correctly. I can understand the wish to delete all Russian-named refs (for example) in English Wikipedia, since few enwiki users are able to type Russian characters, but I'm afraid such detections in Russian Wikipedia are close to false positives. If this is not a technical limitation, I also can't understand why the 104th error was updated instead of adding a new one: unbalanced quotes are a syntax error, so it's a high priority error, while unwanted ref names (as I understand it) are just a violation of an agreement, so that's middle or even low priority level. Am I wrong? Facenapalm (talk) 09:45, 7 December 2016 (UTC)
Yes, WP:REFNAME seems to be a local enwiki rule. It is not mentioned on the dewiki help page, too. Today dewiki shows 179 times #104. It seems thousands of pages have to be changed in dewiki (and much more in other wikis). I don't like to do it, I will stop fixing #104 on dewiki. --GünniX (talk) 10:20, 7 December 2016 (UTC)
So, Bgwhite, can you please at least not check for illegal characters in all wiki projects except enwiki (checking for unbalanced quotes is really useful, I don't want to turn it off)? Another possible solution is creating a new error for this task, to be able to turn it off directly, or giving an opportunity to describe a local array of allowed characters via regexp: it can be [a-zA-Z0-9!$%&()*,\-.:;<@\[\]^_`{|}~ ] for enwiki and just . for projects where there is no consensus about ref name limitations. Thanks! Facenapalm (talk) 10:52, 7 December 2016 (UTC)
@Facenapalm and GünniX: Checkwiki is not looking for non-latin symbols. I only said characters. The MediaWiki recommendation is for all wiki's to use ASCII only, but it is not a rule. Also, a MediaWiki rule is if punctuation or spaces are used, it must be in quotes. Illegal # " ' / = > ? \ . characters duplicate characters used by <ref> tags. For example, <ref name="foo"foo"> and <ref name="foo'foo"> are clearly bad. These are not enwiki only, but dealing with the Mediawiki software. I'll double check with the WMF if anything has changed in the last four years Bgwhite (talk) 10:57, 7 December 2016 (UTC)
I'm sorry then, I misunderstood this error. Tell me one more time, punctuation like "." and "?" will be reported only if it is not in quotes? Or can you just update your github repository so I'll be able to find all answers without constantly troubling you? :) Facenapalm (talk) 11:07, 7 December 2016 (UTC)
@Facenapalm, GünniX, and NicoV: From WMF, the characters "is a problem because it interferes with regexps used to match tags." and " don't see any real reason to use " ' < > / \ in a ref name." Thoughts? Bgwhite (talk) 20:58, 7 December 2016 (UTC)
Seems to be a sensitive issue for some users, so I wouldn't report too many things. With the list I'm using (see below), I currently didn't have any complaints on frwiki for fixing things that are ok. --NicoV (Talk on frwiki) 00:46, 8 December 2016 (UTC)
@Bgwhite: I didn't understand that you were also reporting such characters like "/" inside a quoted ref name. Please don't... Reasons for using "/" in a ref name: for example, CW is reporting several articles with <ref name="95/2/CE">, where "95/2/CE" is the reference of a European Community standard, seems very logical to use this as the ref name, or things like <ref name="Le Point 30/06/2011"> for an article in a newspaper (newspaper + date). --NicoV (Talk on frwiki) 01:35, 8 December 2016 (UTC)
I agree with NicoV. In dewiki I often see <ref name="Magazine mm/yyyy" />, <ref name="DOI10.1056/xxx">, <ref name="EG1523/2007"> (EG is German for European Union, same style is used for German government decisions), <ref name="Saison 2015/16" /> and even <ref name="O'Connor2000">. I think you should allow these characters in a quoted name. In my opinion, the only error should be a " in a name. --GünniX (talk) 07:42, 8 December 2016 (UTC)
@Bgwhite: Be aware of the group keyword. Yesterday CheckWiki reported even <ref name="a" group="A" />! --GünniX (talk) 08:16, 8 December 2016 (UTC)

Hi guys. Not much time, so I didn't read everything... WPCleaner should be already up to date regarding #104, but I doubt that it's doing the same as CW. I tried a first version where I was accepting only characters in WP:REFNAME but after doing a dump analysis with that, I decided it was really too much, so the version that I released is accepting all letters and digits even if they are not ASCII characters. --NicoV (Talk on frwiki) 11:53, 7 December 2016 (UTC)

And accepting also all characters that are listed at line 101. --NicoV (Talk on frwiki) 11:57, 7 December 2016 (UTC)
  • @GünniX, NicoV, Facenapalm, and Magioladitis: Ok, it's turned on again. For the time being, I'm using the definition at mw:Help:Cite, The quotes are optional unless the name includes a space, punctuation or other mark. It's got a problem with one word names and "group" being set. Bgwhite (talk) 00:25, 9 December 2016 (UTC)
    • @Bgwhite: That sounds good. But this new definition still results in many errors. Most of them are names with a dash like <ref name=jpl-close/>. If you want to keep this definition (which is OK in my opinion), a bot should fix this error soon. And you should fix the known problems with one word names and "group" before the next run.
    Now I'm waiting for the new releases of WPCleaner and AWB. --GünniX (talk) 07:37, 9 December 2016 (UTC)
      • @GünniX, NicoV, Facenapalm, and Magioladitis: Any known problems should be fixed. If you see any issues, please give a yell. The vast majority of recent code commits for AWB has been related to #104. One can download the latest SVN version of AWB and compile it themselves (instructions).

@GünniX, NicoV, and Bgwhite: Right now neither Yobot nor Dexbot fixes ref names with dashes. Should they? -- Magioladitis (talk) 13:00, 10 December 2016 (UTC)

I don't know, I think dashes are rather ok in ref names. --NicoV (Talk on frwiki) 15:24, 10 December 2016 (UTC)
@Magioladitis: I think underlines and dashes are OK when used within a word, but a dash with a number could become a negative number. My recommendation: Don't accept it as the first character of a name (or keep the definition simple and put it always in quotes).
But it is important that bots fix all the errors that CheckWiki finds, because we have seen that there are thousands of errors with ref names. After you have decided what syntax CheckWiki accepts, a bot should fix them (and should even fix names like name=:1 and name='Article about "Harry Miller"'). --GünniX (talk) 18:24, 10 December 2016 (UTC)

@Bgwhite: Today I'm surprised by your long list for #104. The list contains many pages with simple names like <ref name=dt>. Have you decided to put all these names in quotes? Or is something wrong with your check? Be aware that other wikis have no bot to fix it. --GünniX (talk) 01:09, 15 December 2016 (UTC)

GünniX It was an Ooops. Magioladitis' bot was blocked. People weren't happy with a reference containing any punctuation being an error. I changed things so the only punctuation to find is # " ' / = > ? \ .. However, I ooopsed by saying a reference without those characters is an error and not with those characters. Bgwhite (talk) 08:56, 15 December 2016 (UTC)
Bgwhite checkarticle still reports them as being errors, for example Alan Turing returns + 104 10148 <ref name=CNRS>{{article|titre=L'héritag
NicoV See my message just above yours. Bgwhite (talk) 21:45, 15 December 2016 (UTC)
Yes, I saw it, but I thought you had fixed it, that's why I expected checkarticle to stop reporting them as errors. --NicoV (Talk on frwiki) 10:23, 16 December 2016 (UTC)
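
For illustration only, the narrower rule described above (report a ref name only if it contains one of # " ' / = > ? \) could be sketched in Python roughly as follows. This is not CheckWiki's actual code: the regex, the function name find_flagged_ref_names and the exact character set are assumptions made for this example. In particular, it is not clear from the thread whether "." belongs to the set, so it is left out here, and malformed tags such as <ref name="foo"foo"> are not handled.

```python
import re

# Characters named in the discussion as clashing with <ref> syntax.
# Assumption: "." is left out because the thread is ambiguous about it.
ILLEGAL_CHARS = set('#"\'/=>?\\')

# Hypothetical extractor: takes the name= value (double-quoted, single-quoted
# or unquoted) and ignores other attributes such as group=.
REF_NAME = re.compile(
    r"""<ref\s+[^>]*?\bname\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s/>]+))""",
    re.IGNORECASE,
)


def find_flagged_ref_names(wikitext):
    """Return (name, illegal characters) pairs for names that contain any."""
    flagged = []
    for match in REF_NAME.finditer(wikitext):
        # Exactly one of the three groups is set, depending on the quoting style.
        name = next(g for g in match.groups() if g is not None)
        bad = sorted(c for c in set(name) if c in ILLEGAL_CHARS)
        if bad:
            flagged.append((name, bad))
    return flagged
```

With this set, names such as "95/2/CE" or "O'Connor2000" would still be reported, which is exactly the behaviour NicoV and GünniX argue against above; narrowing ILLEGAL_CHARS to just the double quote would match GünniX's proposal.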

@Bgwhite: I think it's now obvious that you have an evil plan to leave Yobot blocked for ever. GünniX revealed this plan. -- Magioladitis (talk) 09:10, 15 December 2016 (UTC)

maniacal laugh Bgwhite (talk) 21:45, 15 December 2016 (UTC)
It's not funny any more, Bgwhite and NicoV: [13] IKhitron (talk) 11:02, 19 December 2016 (UTC)
IKhitron Magioladitis and his bot were blocked for reasons that are not CheckWiki related. However, if the blocking admin and the rest of the lynch mob have their way, Checkwiki is essentially dead on enwiki. Bgwhite (talk) 23:11, 19 December 2016 (UTC)

Exclude signatures from #63[edit]

Would it please be possible to check if #63 (small in sup) appears inside of a user's signature? These findings are listed anyway on #95, so the articles don't have to be mentioned twice. On dewiki you don't see anything else. Eventually it's the user's choice how tiny they want to show parts of their signature. --Hadibe (talk) 13:58, 9 December 2016 (UTC)

Edit: I exchanged the list number from 85 to 63. Sorry, bad mistake. --Hadibe (talk) 19:30, 9 December 2016 (UTC)

Hadibe It's the user's choice up to a point. For example, no images in signatures on dewiki. It can't cause a big gap between lines on enwiki. Signatures still have to be accessible. This means signatures have to be colour-blind accessible, such as no red text on black background. It also means fonts can't get too small. The <small> tag reduces font size to 85%. I'm not sure how much the <sup> tag reduces text size. Around 80%-85% of normal size is the cutoff point below which text becomes too small. So, having both the <small> and <sup> together brings text well below 80%. Bgwhite (talk) 21:37, 9 December 2016 (UTC)

Headlines errors mixed[edit]

Error #105 (completely missing =) now includes cases of error #8 (different amount of =), so #8 is now (almost) empty. I believe this was caused by mistake. Matěj Suchánek (talk) 09:22, 17 December 2016 (UTC)

Matěj Suchánek #8 has not changed. For #8, a heading needs one or zero ending = to cause an error. #105 has changed: besides the usual check that there are one or zero beginning =, it now also checks whether there are fewer beginning = than ending ones, so "== heading ===" is an error. Preceding unsigned comment added by Bgwhite (talk • contribs) 00:26, 19 December 2016 (UTC)
I believe that previously one error was only for missing "=" (at all) and the second one for unbalanced headings (ie. on both sides but different amount)... Actually, I may be confused as well. Matěj Suchánek (talk) 14:04, 19 December 2016 (UTC)
Matěj Suchánek Join the crowd. My middle name is confused. Bgwhite (talk) 23:03, 19 December 2016 (UTC)
@Bgwhite: So shouldn't the titles be changed? #8 looks correct but #105 should be 'Heading with unbalanced "="'. Matěj Suchánek (talk) 09:13, 29 January 2017 (UTC)
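
One possible reading of the two rules as described in this thread, written as a small Python sketch. The function name classify_heading and the exact thresholds are assumptions drawn from the wording above, not from the CheckWiki source, so the real checks may differ.

```python
def classify_heading(line):
    """Return 8, 105 or None for a heading-like line, per the rules as read above."""
    s = line.strip()
    leading = len(s) - len(s.lstrip("="))   # number of "=" at the start
    trailing = len(s) - len(s.rstrip("="))  # number of "=" at the end

    # Reading of #8: the heading starts properly but has one or zero ending "=".
    if leading >= 2 and trailing <= 1:
        return 8
    # Reading of #105: one or zero beginning "=", or fewer beginning than ending.
    if trailing >= 2 and (leading <= 1 or leading < trailing):
        return 105
    return None
```

Under this reading, "== heading ===" is classified as #105 and "== heading =" as #8, while "=== heading ==" is caught by neither check, which may be part of the confusion discussed above.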

False positives on links to userspace[edit]

#95 seems to have been updated, so that links to talk pages are included as well. Now there are false positives on links to pages starting with "Diskus" and similar. For instance,
Kyberšikana "[[Diskuzní fórum|diskuze]] provokují vkl". Matěj Suchánek (talk) 13:07, 17 December 2016 (UTC)

Matěj Suchánek Problem was already fixed around the 13th. Bgwhite (talk) 00:15, 19 December 2016 (UTC)
Okay, I marked them as done, let's see. Matěj Suchánek (talk) 14:05, 19 December 2016 (UTC)

@Bgwhite: Now it looks like only talk page links are detected. Could you please check that both [[Wikipedista: and [[User: are also detected on cswiki? Matěj Suchánek (talk) 09:03, 29 January 2017 (UTC)

Matěj Suchánek The Wikipedia API keeps reporting bad results. I've had to hardwire it in. With Magioladitis out of action, I'm swamped doing both of our daily routines. I'll get to it on (my) Monday. If you don't hear from me, ping me. Bgwhite (talk) 09:36, 29 January 2017 (UTC)
Matěj Suchánek I updated it just before today's processing ran. I don't see any new #95 for today. Previously, the API did not include [[Wikipedista: and [[User: as results. It still doesn't contain [[User:. Here are the ones I've hardcoded in:
user:, diskuse s wikipedistou:, wikipedista:, redaktor:, uživatel:, wikipedistka:, diskuse s uživatelem:, diskuse s wikipedistkou:, diskusia s redaktorom:, komentár k redaktorovi:, uživatel diskuse:, uživatelka diskuse:, wikipedista diskuse:, wikipedistka diskuse:
Bgwhite (talk) 09:12, 31 January 2017 (UTC)
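
A minimal Python sketch of how a hardcoded prefix list like the one above could be matched against link targets. The regex and the function name userspace_links are hypothetical, and the real #95 check may work differently. Matching on the full prefix including the colon is what keeps ordinary articles such as "Diskuzní fórum" from being reported.

```python
import re

# Prefixes copied from the list above (lower case, including the colon).
USER_PREFIXES = (
    "user:", "diskuse s wikipedistou:", "wikipedista:", "redaktor:",
    "uživatel:", "wikipedistka:", "diskuse s uživatelem:",
    "diskuse s wikipedistkou:", "diskusia s redaktorom:",
    "komentár k redaktorovi:", "uživatel diskuse:", "uživatelka diskuse:",
    "wikipedista diskuse:", "wikipedistka diskuse:",
)

# Hypothetical pattern: captures the target of [[target|label]] or [[target]].
LINK_TARGET = re.compile(r"\[\[\s*:?\s*([^\]|]+)")


def userspace_links(wikitext):
    """Return link targets that begin with one of the user-namespace prefixes."""
    targets = (t.strip() for t in LINK_TARGET.findall(wikitext))
    return [t for t in targets if t.lower().startswith(USER_PREFIXES)]
```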

#69 false positives[edit]

Hello. Please see here. There are some very strange false positives there, of the kind "ISBN <some text>". Thank you. IKhitron (talk) 21:45, 21 December 2016 (UTC)

IKhitron I've been fiddling with the code for #69. It will find cases of ISBN 0123456789 and ISBNs inside external links. I did the same for ISSNs. I had an oops with ISBN, which caused the false positives you see. Bgwhite (talk) 22:31, 21 December 2016 (UTC)
Thank you, Bgwhite. So, it will be fine next run? IKhitron (talk) 22:33, 21 December 2016 (UTC)
IKhitron The daily runs have been running fine and I haven't seen any problems. I haven't fiddled with the code for a bit. Today, the latest dump file CheckWiki uses has been produced and Checkwiki should be running, but the files haven't transferred to labs yet. Bgwhite (talk) 22:40, 21 December 2016 (UTC)
I can translate it to "You hope it will be OK, and you've got a reason for it". Very well, thank you. IKhitron (talk) 22:43, 21 December 2016 (UTC)

Article names with escape character[edit]

Article names with a ' are shown with escape character as \' (e.g. St Mary\'s Christian Brothers\' Grammar School, Belfast in list #80). When using the list in AWB, the article is not found. Please remove the escape character. Best regards --GünniX (talk) 04:05, 31 December 2016 (UTC)

GünniX, I can see an empty list. IKhitron (talk) 14:21, 31 December 2016 (UTC)
Today you can see it in list #104 for St Andrew\'s Church, Aysgarth, if it is done you will find it in done Articles. --GünniX (talk) 08:27, 1 January 2017 (UTC)
Thanks' indeed. IKhitron (talk) 12:00, 1 January 2017 (UTC)
IKhitron, GünniX I haven't changed anything for awhile. I keep seeing the morph into something new... one time tables are totally messed up, then next time it is a \'. Not sure what to say. Bgwhite (talk) 00:16, 5 January 2017 (UTC)

#4[edit]

Report number 4 shouldn't show the case below as an error (there should be an exception for the source tag):

<source lang=html>
 <a href="http://example.com/">foo</a>
</source>

Yamaha5 (talk) 06:23, 4 January 2017 (UTC)

Yamaha5 Can you provide a page for this? -- Magioladitis (talk) 08:04, 4 January 2017 (UTC)

User:Magioladitis yes! fa:ابرپیوند Yamaha5 (talk) 08:07, 4 January 2017 (UTC)
At that page it is used as below
<source lang=html>
 <a href="http://example.com/">مثال</a>
</source>

Yamaha5 (talk) 08:08, 4 January 2017 (UTC)

Yamaha5 I see <source/>. Maybe it's my screen? -- Magioladitis (talk) 08:11, 4 January 2017 (UTC)

No. You should switch the text to LTR and you will see </source>. Yamaha5 (talk) 08:14, 4 January 2017 (UTC)
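
The exception requested in this section could be sketched in Python as follows: blank out <source> (and, as an extra assumption, <syntaxhighlight>) blocks first, and only then look for an HTML <a> tag. The names CODE_BLOCK, A_TAG and has_html_link are invented for this example and do not reflect how CheckWiki implements #4.

```python
import re

# Assumption: code shown inside these blocks is never reported by #4.
CODE_BLOCK = re.compile(
    r"<(source|syntaxhighlight)\b[^>]*>.*?</\1\s*>",
    re.IGNORECASE | re.DOTALL,
)
A_TAG = re.compile(r"<a\s", re.IGNORECASE)


def has_html_link(wikitext):
    """True if an <a ...> tag appears outside <source>/<syntaxhighlight> blocks."""
    outside_code = CODE_BLOCK.sub("", wikitext)
    return A_TAG.search(outside_code) is not None
```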

#87[edit]

Report number 87 shouldn't show the case below as an error (there should be an exception for image names):

| image = Dieter Frowein Lyasso&Sigmar Polke.jpeg
| caption=
The article: fa:سیگمار پولک

Yamaha5 (talk) 08:02, 4 January 2017 (UTC)

Yamaha5. I think the problem is that there's not really a good way to know that it's an image name. --NicoV (Talk on frwiki) 10:47, 4 January 2017 (UTC)
NicoV It is not difficult. We can do it in two ways:
1-remove everything between a file alias and an image extension in a temporary text variable, then check the remaining text for #87
2-build an exception list with a regex and have the bot check the rest of the text. Yamaha5 (talk) 12:54, 4 January 2017 (UTC)

These regexes can find image names; you can test them here

regex1="\[\[ *(?:[Ff]ile|"+fileLocalAlias+") *:([^\.]+)\.(?:tiff|tif|png|gif|jpg|jpeg|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm) *[\|\]]"
regex2="\|[^\=]+\= *([^\.]+)\.(?:tiff|tif|png|gif|jpg|jpeg|xcf|pdf|mid|ogg|ogv|svg|djvu|oga|flac|opus|wav|webm) *[\|\]]"

Yamaha5 (talk) 13:52, 4 January 2017 (UTC)

Yamaha5 I already bypass images that are [[File:, but not in infoboxes where they are just a plain name. Will update. Bgwhite (talk) 00:30, 5 January 2017 (UTC)
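
A rough Python sketch of the first approach for plain infobox parameters such as "| image = name.jpeg": remove the file name from a temporary copy of the text before running the check. It assumes that #87 reports entity-like text such as "&Sigma" with no semicolon. The names PARAM_IMAGE, ENTITY_LIKE and mask_image_names are invented here, and ENTITY_LIKE is only a crude stand-in for the real #87 test.

```python
import re

EXT = r"(?:tiff?|png|gif|jpe?g|xcf|pdf|mid|og[gva]|svg|djvu|flac|opus|wav|webm)"

# "| param = file name.ext" up to the end of the value; group 1 keeps "| param = ".
PARAM_IMAGE = re.compile(
    r"(\|[^=|\n]*=[ \t]*)[^|\n\]]*?\." + EXT + r"(?=[ \t]*(?:[|\]\n}]|$))",
    re.IGNORECASE,
)

# Crude stand-in for #87: "&word" that is not closed by a semicolon.
ENTITY_LIKE = re.compile(r"&[A-Za-z]+(?![A-Za-z;])")


def mask_image_names(wikitext):
    """Return a copy of the text with plain image file names removed."""
    return PARAM_IMAGE.sub(r"\1", wikitext)
```

Running the check on mask_image_names(text) instead of text leaves the article untouched, since only the temporary copy is modified.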

Lists still generated for deactivated errors[edit]

@Bgwhite: it seems that CW is generating lists for errors that have been deactivated since the 28th of December. See for example on frwiki, lists for #37 and #110. --NicoV (Talk on frwiki) 10:46, 4 January 2017 (UTC)

NicoV Should be fixed Bgwhite (talk) 00:30, 5 January 2017 (UTC)
Thanks ! --NicoV (Talk on frwiki) 06:09, 5 January 2017 (UTC)
@Bgwhite: it's still happening, like for #37 with 4 pages reported between the 13th and the 16th of January. --NicoV (Talk on frwiki) 09:45, 18 January 2017 (UTC)
I can confirm this. Yesterday, I marked all #37 as done in cswiki while the error is off. Matěj Suchánek (talk) 09:02, 29 January 2017 (UTC)

Some wikis not updated anymore[edit]

@Bgwhite: It seems that frwiki is not updated anymore : last update is 2017-01-02 while many other wikis have 2017-01-05. Some other wikis have even older dates : fawiki, fiwiki=2016-12-26, many for 2016-12-25 or 2016-12-05. --NicoV (Talk on frwiki) 12:16, 5 January 2017 (UTC)

We were updated at Dec 5, I thought the bots are celebrating Christmas. IKhitron (talk) 13:23, 5 January 2017 (UTC)
IKhitron There was a hung job on the queue for frwiki from the 2nd. I've killed it and things should run again at 0z. There are also issues with transferring the dumps over to labs again. 20161220 wasn't transferred until after the 20170101 dumps started. I've already sent an email, but haven't heard back yet. Bgwhite (talk) 18:19, 5 January 2017 (UTC)

Backward Bug for RTL languages[edit]

On the CheckWiki web page, when the Notice column starts with non-Arabic characters, it is shown backward. For example, at #15 the notice for the first row should be <cod>همچنین می‌توان چندین ربات را هم but now shows <cod>مه ار تابر نیدنچ ناوت‌یم نینچمه (in other words, if the notice is "foo", it shows "oof"). It is a CSS bug and not a text bug (copying the text to another place gives the correct text). If

<td class="table" style="background-color:#D0F5A9;"><span style="unicode-bidi: bidi-override;">

is

<td class="table" style="background-color:#D0F5A9;"><span style="unicode-bidi: embed;">

it will be OK (I tested with both Farsi and English, and both display correctly). Yamaha5 (talk) 15:05, 5 January 2017 (UTC)

Hi, Yamaha5, it's not a bug. I asked for this on purpose about a month ago; see Wikipedia talk:WikiProject Check Wikipedia/Archive 8#RTL bidi. IKhitron (talk) 15:21, 5 January 2017 (UTC)
Please revert it at least for rtl languages' web page Yamaha5 (talk) 15:24, 5 January 2017 (UTC)
But why? It exists for rtl pages, because it's the only possibility to use notice with rtl. IKhitron (talk) 15:26, 5 January 2017 (UTC)
the notice for the first row should be <cod>همچنین می‌توان چندین ربات را هم now shows <cod>مه ار تابر نیدنچ ناوت‌یم نینچمه check #15 Yamaha5 (talk) 15:27, 5 January 2017 (UTC)
What do you mean by "should"? You do not read notices; you copy and paste them into the browser search, for example. Who cares how it looks? What matters is being able to copy part of the notice (without the [[, for example) with no direction problems. If I knew of a way to get bidi-override with automatic direction (rtl when there are no English letters), I would ask for it, of course. As far as I know, there isn't one. Do you know of one? IKhitron (talk) 15:31, 5 January 2017 (UTC)
The interface is important; sometimes reading the table reveals false positives, for example #88 has a false positive. What is wrong with <span style="unicode-bidi: embed;">? Yamaha5 (talk) 15:34, 5 January 2017 (UTC)
Because it will not work. For me, it was enough. For you, I searched Google just now and found such a command. What do you think about dir=auto? It's fine for me. IKhitron (talk) 15:43, 5 January 2017 (UTC)
dir=auto is not supported in IE and Edge but is supported in Chrome. I tested this case in Chrome and it doesn't work for me :( Yamaha5 (talk) 15:48, 5 January 2017 (UTC)
Please try adding this code
<!--[if lt IE 9]>
  <script src="https://oss.maxcdn.com/libs/html5shiv/3.7.0/html5shiv.js"></script>
<![endif]-->
(with comments, as is) just before </head>. It should work, at least in internet explorer. IKhitron (talk) 15:54, 5 January 2017 (UTC)

Yamaha5 IKhitron You two have me at a disadvantage as I don't understand RTL languages, so I have to go by what you two think is best. I also would like to avoid any browser specific code. What is the difference between "bidi-override" and "embed"? What is best for the average RTL user? Yamaha, I'm not understanding what you are saying... what way does the notice text show up correctly for RTL? In the address bar, change "checkwiki.cgi" with "checkwikin.cgi" and "embed" will be used. Bgwhite (talk) 18:44, 5 January 2017 (UTC)

Bgwhite Thanks now "checkwikin.cgi" works for me \O/. IKhitron thank you for your time Yamaha5 (talk) 19:02, 5 January 2017 (UTC)
Bgwhite would you please make it switch automatically to "checkwikin.cgi" for fa.wiki? We have already spread the older link on the wiki. Yamaha5 (talk) 19:05, 5 January 2017 (UTC)
Yamaha5 "checkwikin.cgi" is an update I've been working on to the regular cgi program. Nothing big, mostly code cleanup and darker text color. I put "embed" in there as an easy way to try it out. I'd still like to know what is best for the average RTL user. Bgwhite (talk) 19:32, 5 January 2017 (UTC)
comparing checkwiki.cgi and checkwikin.cgi
please check the screenshot
I link two pages
Bgwhite For Farsi, checkwiki.cgi is wrong and the interface is backward, but checkwikin.cgi is correct: if you copy the notice into the browser search box, the original text is the same as what you see on the page.
For Hebrew, checkwiki.cgi displays the text wrongly, and if you copy the notice into the browser search box the original text is backward; with checkwikin.cgi the copied text and the interface are the same. I don't know why IKhitron says it is not OK.
Arabic and Farsi (Persian) are similar and share the same characters (except for 6 characters), so their situation is the same.

Yamaha5 (talk) 20:03, 5 January 2017 (UTC)

Hi, Bgwhite. Briefly, this is the issue: I asked you to override bidi because otherwise it is impossible to read any text that mixes ltr and rtl, and, more than that, it can't be partially selected with the mouse for use in the browser search box. Yamaha5 does not like this, because it makes rtl-only text display in the opposite direction, from end to beginning. I do not mind, because the advantage outweighs that for me. But of course, if we do not find a solution for Yamaha5, I'll ask you to remove the override and put up with it, because I don't want anybody else to suffer because of me. I found a solution, but it turns out it does not work in IE 8 and below, because of the HTML5 code. I'm trying to think of something else.
Here is another idea, but I do not know if it's possible for you, Bgwhite. I'm almost sure that Yamaha5 will like it. Can you write code that will show the notice field in ltr direction for all ltr wikis and in rtl direction for all rtl wikis? Left vs. right alignment would be a good bonus, but it's less important than the direction. Can you do this, Bgwhite? Please, please, please. IKhitron (talk) 21:48, 5 January 2017 (UTC)

──────────────────────────────────────────────────────────────────────────────────────────────────── @Meno25: If I just say I'm going with option A, then one of you will be happy and one not so happy. But that isn't the issue. I'm more interested in what is best for the average RTL user. There are at least six RTL wikis that CheckWiki handles. I can see each of your points on why one way should be used. So...

  1. What should the default be? You three (Hi Meno), tell me what is best.
  2. How to get it so the other way is an option. Not working on IE8 and below is just fine; those browsers are not supported by Wikimedia. I'd rather not use a JavaScript method, as some people hate that. As for "show the notice field in ltr direction for all ltr wikis and in rtl direction for all rtl wikis": I can do this. I can also make an alternative CSS stylesheet, but it requires JavaScript.

Bgwhite (talk) 06:49, 6 January 2017 (UTC)

@Bgwhite: Hi, Bgwhite. The problem is that the text in the "Notice" field is currently reversed (like an image in a mirror) which makes it unreadable for RTL (e.g. Arabic) users. For example, the word "مصر" (Arabic for "Egypt") would be shown as "رصم". (I didn't notice this earlier myself because I was inactive in the past few weeks.) This link shows the problem. Compare using "bidi-override" and "bidi:embed" and notice that the text when using "bidi-override" is reversed. I believe that the current situation cannot continue. So, I would vote to revert this change as Yamaha5 suggested. --Meno25 (talk) 08:07, 6 January 2017 (UTC)

part2[edit]

@Meno25, IKhitron, and Yamaha5: More questions to go along with the two above

  1. Should I set the direction for the entire page to ltr? checkwikin.cgi currently has that set.
  2. If not, should I set the "article" and "notice" columns to ltr?
  3. Anything else to help out?

Bgwhite (talk) 09:28, 6 January 2017 (UTC)

Thank you for the time you are spending on this, Bgwhite! If you can do rtl for rtl wikis only, that is much, much better than the current behaviour that I asked you for. But I meant this together with bidi-override; otherwise it does not help at all. Could you please add the override to the notice field, not including the column caption, and ask our two other friends for their opinion? One more thing: if you could make only the table rtl, not the whole page, that would be much better, because the interface above and below is in English. Thank you very, very much. IKhitron (talk) 09:56, 6 January 2017 (UTC)

@Bgwhite: for rtl langs by changing

<table class="table">

with

<table class="table" dir="rtl">

it will be OK. You can test it here: original checkwiki page. Yamaha5 (talk) 10:27, 6 January 2017 (UTC)

Thank you for the example, Yamaha5, it's exactly what I was talking about. IKhitron (talk) 11:36, 6 January 2017 (UTC)
Yes, thank you Yamaha5 from me as well. --Meno25 (talk) 12:23, 6 January 2017 (UTC)
Thank you both. So, Bgwhite, all of us agree. Could you change this, please? Thanks, IKhitron (talk) 07:00, 9 January 2017 (UTC)
IKhitron I changed it a few days ago. It should be live on the regular page. Bgwhite (talk) 08:20, 9 January 2017 (UTC)
Not at all, Bgwhite, I checked today and now again, it's still on embed. IKhitron (talk) 10:40, 9 January 2017 (UTC)
IKhitron Didn't know I should do that. I've changed it to override. Bgwhite (talk) 19:46, 9 January 2017 (UTC)
Thanks, Bgwhite. And it should start working? IKhitron (talk) 00:20, 10 January 2017 (UTC)
IKhitron I'm confused again. I did change it to override and left it there for awhile. I noticed it reversed the LTR text, which was not wanted via above. I've changed it back to embed. The table does have "dir=rtl" listed in it.
Bgwhite, it's ok. We all agreed that the english text will be reversed. It's not there for reading, but for fixing mistakes, and this reversing helps to do this ten times quicker. Could you make this, please? Thank you. IKhitron (talk) 21:47, 10 January 2017 (UTC)
@Meno25, Yamaha5, and IKhitron: Just to make sure all is in agreement. Bgwhite (talk) 18:38, 11 January 2017 (UTC)
@Bgwhite: Current situation (Using "class="table" dir="rtl"" and "style="unicode-bidi:embed;"") is correct. Both Arabic and English texts are displayed correctly for me. I disagree if Ikhitron wants to change it. --Meno25 (talk) 18:52, 11 January 2017 (UTC)
So, Meno25 said "Yes, thank you Yamaha5 from me as well" without checking Yamaha5's example. Bgwhite, can you do this for one wiki only, please? IKhitron (talk) 18:55, 11 January 2017 (UTC)
I said yes and thank you on changing "table class="table"" to "table class="table" dir="rtl"" suggested by Yamaha5 above which is correct. If you feel you need custom formatting for hewiki, that's fine by me but, please, don't mess with the arwiki formatting. Thank you. --Meno25 (talk) 19:01, 11 January 2017 (UTC)
I never wanted to. You said yes to Yamaha5's example, so I thought you agreed. If I had known that yes means no, I would never have asked for that, of course. I do not want a majority, 2 out of three. I just do not want to disturb anyone. IKhitron (talk) 19:05, 11 January 2017 (UTC)
Bgwhite? Thank you. IKhitron (talk) 12:34, 15 January 2017 (UTC)
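
To make the two options in this thread concrete, here is a hedged Python sketch of how the table and notice markup could be generated per wiki. The RTL_WIKIS set contains only the three projects named in the discussion and is an assumption, as are the function names; the real checkwiki.cgi is not written this way as far as this page shows.

```python
import html

# Assumption: only the RTL projects mentioned in this thread are listed.
RTL_WIKIS = {"arwiki", "fawiki", "hewiki"}


def table_open_tag(project):
    """Opening tag of the results table, with dir= chosen per project."""
    direction = "rtl" if project in RTL_WIKIS else "ltr"
    return f'<table class="table" dir="{direction}">'


def notice_cell(notice, bidi="embed"):
    """Notice cell; bidi is "embed" or "bidi-override", the two options debated."""
    return (
        '<td class="table" style="background-color:#D0F5A9;">'
        f'<span style="unicode-bidi: {bidi};">{html.escape(notice)}</span></td>'
    )
```

Passing bidi as a parameter would allow a per-wiki choice between the two behaviours, which is what the last comments above ask for.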

New errors: No Space after </ref>-Tag and Point before </ref>-end[edit]

Moin Moin @Bgwhite:, @all, first of all, I'd like to wish you a happy new year. I have a question about possible new errors. Priority low. There are many cases where there is no space after the end of a ref.

Example (No Space after </ref>-Tag):

<ref>Here is the ref-text</ref>And here is normal text.

But right is:

<ref>Here is the ref-text</ref> And here is normal text.

Example (Point before </ref>-end):

Normal text<ref>Here is the ref-text</ref>. And here is normal text.

But right is:

Normal text.<ref>Here is the ref-text</ref> And here is normal text.

What do you think? mfg --Crazy1880 (talk) 18:20, 5 January 2017 (UTC)

Crazy1880 Probably not. CheckWiki is getting criticism on enwiki for doing "minor" things. I also don't think this is a syntax fix. One can still find these in the search box. Type insource:/\/ref\>[a-zA-Z]/ into the search box and it should return you articles with the problem. We've already asked and were denied output from the search box to be more list-friendly. Bgwhite (talk) 19:23, 5 January 2017 (UTC)

The second point is already implemented as #61. Matěj Suchánek (talk) 17:43, 6 January 2017 (UTC)

Moin Moin @Bgwhite:, my intention was only to see all the mistakes in an article more easily and then fix them in one step. Thanks for the search tip.
Moin Moin @Matěj Suchánek:, thanks for the note; this ID was switched off on German Wikipedia. I have now activated it as a test, to see how much comes up. Thanks --Crazy1880 (talk) 14:54, 7 January 2017 (UTC)
Moin Moin @Bgwhite:, @Matěj Suchánek:, I watched it for a while and found something. A comma, colon or semicolon following a ref may be correct, at least in my opinion, but a period is not. That concerns ID #61. What do you think? Regards --Crazy1880 (talk) 18:55, 12 January 2017 (UTC)
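
For anyone who wants to check a single page offline rather than through the search box, the two patterns from this section can be approximated in Python. The first regex mirrors the insource pattern quoted above. The second is only a guess at the period case and is not the definition of #61. The function name ref_spacing_issues is invented for this example.

```python
import re

# Same idea as insource:/\/ref\>[a-zA-Z]/ : a letter directly after </ref>.
LETTER_AFTER_REF = re.compile(r"</ref>(?=[A-Za-z])")
# Assumption for the second case: a period following </ref>.
PERIOD_AFTER_REF = re.compile(r"</ref>\s*\.")


def ref_spacing_issues(wikitext):
    """Count both patterns in a piece of wikitext."""
    return {
        "letter_after_ref": len(LETTER_AFTER_REF.findall(wikitext)),
        "period_after_ref": len(PERIOD_AFTER_REF.findall(wikitext)),
    }
```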

Replacing Tidy[edit]

I know that several of the regulars here are preparing for mw:Parsing/Replacing Tidy. I'm guessing that late January will be the next opportunity to discuss when the first changes will be made (NB "discuss", not "implement without warning"). So with that possible discussion in mind, I wanted to check in with you. How are things going? What's going well? What's being difficult? Do you need specific kinds of help or information? What can I do to make this less painful? Whatamidoing (WMF) (talk) 20:50, 6 January 2017 (UTC)

As I can remember, not all cases of changed behavior can be tracked by categories. If the rest can be found by CW, it will help very much. IKhitron (talk) 20:52, 6 January 2017 (UTC)
If T106685 can be addressed, that will make our insource searches for regex patterns more reliable. I wonder if P3012 produces accurate output, given this insource bug.
It looks like there are some not-very-nice bugs on this phab list. For any that are not going to be worked around, we'll need a list or category of pages to work on.
Have the errors and requests discussed here, like </br>, been addressed via a workaround, maintenance category, or CheckWiki report? As of that discussion, CheckWiki was scanning only article space, which means that tens of thousands of pages will have errors that have not been looked at yet. Also see Wikipedia:CHECKWIKI/WPC 100 dump, which may be a barrier to replacing Tidy (NB that report also covers only article space). – Jonesey95 (talk) 22:01, 6 January 2017 (UTC)
@Whatamidoing (WMF) and Jonesey95: I was going to bug you about this after I pushed out the latest CheckWiki update. It's still a few days away, which I've been saying for two weeks.
  1. CheckWiki can search thru template space if need be. There's just an if statement saying if namespace=0, then check errors.
  2. For me, insource searches aren't that useful. I'm not aware of any way to get a listing to put in a tool, such as AWB. That would speed up fixing things. AWB's built-in insource search is also broken. Magioladitis did a phab ticket so a listing could be done, but it was denied.
  3. Majority of Wikipedia:CHECKWIKI/WPC 100 dump are not technically errors. These are permitted per mw:Help:Lists and Help:list.
  4. It would be nice to have a list of things to be checked. I can gather some things from mw:Parsing/Replacing Tidy, but not all. We then know exactly what to look for.
a) <br/> and other self-closing tags.
b) <small>...<small> and <big>...<big> Need to find way to not search in templates and tables to minimize the vast number.
c) Unclosed tables. Already in Checkwiki.
d) ????
Bgwhite (talk) 11:12, 7 January 2017 (UTC)
Yeah, list of things, that will break would be nice. --Edgars2007 (talk/contribs) 15:13, 7 January 2017 (UTC)

On a side note: is there any timing on when the magic links for ISBNs, PMIDs and RFCs are going away? Magioladitis was going to do a bot run on this, but now it looks like it's me. I mentioned the change in a discussion two weeks back. Sorry Jonesey, but you and I were specifically called "not real editors". Bgwhite (talk) 11:12, 7 January 2017 (UTC)

The sound you hear is me clutching my pearls at such a slight. I may need to lie down. – Jonesey95 (talk) 13:03, 7 January 2017 (UTC)
I've got no recent word on the de-magicking of the magic links. The annual Dev Summit is underway as I type, so there's a chance that decisions have been made this week. I don't believe that it's blocking anything important, so I hope that it's a project that could be postponed until the larger wikis have settled on whether and how they want to update their pages (e.g., with direct interwiki-style links vs by using local templates). User:Legoktm is probably the best-informed person for that project.
On the main point, I would be very happy to have you bug me about this whenever it's convenient for you. I really appreciate the way you're keeping track of this large and mushy project. (Now let me see if I can find someone with a few more details for that list – any replies will likely be either today or delayed by a week, because schedules.) Whatamidoing (WMF) (talk) 20:58, 11 January 2017 (UTC)
Whatamidoing (WMF) What I know about the current situation. WPCleaner is able to detect obsolete magic links (if errors 528, 529 and 530 are activated), and to suggest replacements if replacement templates have been configured (for PMID, ISBN and RFC). On frwiki:
  • PMIDs: I have replaced all of them with the template PMID
  • RFCs: It's a bit more tricky, because the template RFC doesn't work in references (around 300 pages still need to be handled)
  • ISBNs: I have run a first bot to use the template ISBN in some cases, number of pages went down from around 46000 to 34000 but still a lot of work to go on.
--NicoV (Talk on frwiki) 10:22, 12 January 2017 (UTC)
Can (should?) that RFC template be updated so that it works within ref tags? Could a different template be used?
The English Wikipedia has a different problem: {{RFC}} is for the internal RFC process. I don't know if we have a link-to-the-IETF kind of RFC template. Whatamidoing (WMF) (talk) 19:34, 16 January 2017 (UTC)
The RFC template was designed on frwiki to move links to the IETF website inside references (outside of article text, as MOS says external links shouldn't be in the article text), so I don't think it will be modified for that (but if templates were able to know that they are called inside a reference, it would be easy to do). I'm currently modifying WPC to also suggest replacements with interwiki links because rfc (rfc:1234), pmid (pmid:1234) and issn (issn:1234-5678) seem to be defined in all WMF wikis. --NicoV (Talk on frwiki) 22:12, 16 January 2017 (UTC)
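
As an illustration of the interwiki-style replacement mentioned above, a minimal Python sketch for RFC and PMID magic links might look like this. It is not WPCleaner's implementation. The function name to_interwiki is invented, ISBN and ISSN are left out, and the sketch does not skip text that is already inside links, templates or nowiki, so running it twice on the same text would wrap the links again.

```python
import re

# Plain "RFC 1234" / "PMID 1234" as written for magic links.
MAGIC_LINK = re.compile(r"\b(RFC|PMID)\s+(\d+)\b")


def to_interwiki(text):
    """Rewrite magic links as [[rfc:1234|RFC 1234]] style interwiki links."""
    def repl(match):
        kind, number = match.group(1), match.group(2)
        return f"[[{kind.lower()}:{number}|{kind} {number}]]"
    return MAGIC_LINK.sub(repl, text)
```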

Whatamidoing (WMF): I just noticed that Wikipedia:Sockpuppet investigations/Garrysmith10/Archive appeared in Category:Pages using invalid self-closed HTML tags in the past few days. I have checked that category a couple of times a week since we emptied it. Since that page just showed up, it means that the job queue has not finished running through all of the pages on en.WP, even though the update to MW that created the error category was implemented six months ago. That, in turn, means that there are pages on en.WP that still have errors that we do not know about. Ultimately, that means that T132467, or something like it, is blocking Tidy migration, since the job queue does not run through all pages in a timely fashion to populate maintenance categories. And since insource searches are also not reliable, we are stuck with no reliable way to determine if we have fixed all of the outstanding problems.

At least that's how I interpret the evidence. I could be wrong. – Jonesey95 (talk) 17:49, 8 January 2017 (UTC)

I don't know, but I assume that your conclusion is correct. I'll ask around, and let you know if our thinking is wrong. Whatamidoing (WMF) (talk) 20:58, 11 January 2017 (UTC)
Whatamidoing (WMF) I just had the same problem with fr:MediaWiki:Gadget-C_helper_util.js, which just appeared in the maintenance category. The problem had been there since March last year, and the page only appeared in the category because it was modified yesterday. So it seems to confirm that the category is not updated in a timely fashion for pages that are not edited, and that there are probably still pages with the problem that are not visible in the category. --NicoV (Talk on frwiki) 09:23, 12 January 2017 (UTC)
Whatamidoing (WMF) For example, fr:MediaWiki:Gadget-MonobookToolbarLang.js has the same problem since forever (page not modified since 2015), but is not yet visible in the maintenance category... --NicoV (Talk on frwiki) 10:05, 12 January 2017 (UTC)
I have been cleaning out Category:User pages using invalid self-closed HTML tags in the last couple of days, and I found two or three more pages that had errors but had not been put in the category yet. A null edit put them into the category. This confirms that the bug linked above, or something like it, is still valid. Pages are not being null-edited in a timely fashion, rendering MediaWiki changes that create tracking categories less useful and not entirely reliable. – Jonesey95 (talk) 17:18, 13 January 2017 (UTC)
Do we really need to be relying upon those cats? Is there anything in those cats that couldn't be found via CheckWiki or via a regex search instead? Whatamidoing (WMF) (talk) 19:30, 16 January 2017 (UTC)
Regex searches are unreliable, as I noted above. See T106685.
CheckWiki could probably find them, if it were modified to operate outside of article space; I don't know if we have a full list of what to check for yet (also see above, to which you responded).
I think tracking categories are better (as long as they are updated in a reasonable amount of time), since they are visible to anyone who has hidden categories turned on, not just the very very few of us who check CheckWiki pages. – Jonesey95 (talk) 19:51, 16 January 2017 (UTC)

Whatamidoing (WMF), has the late January discussion mentioned at the top of this section happened yet? Was there a discussion of the above requests? Resolution of one or more of the above requests is likely to be necessary for the gnomes to fix all of the problems prior to Tidy being replaced. At a minimum, we need a well defined list of the problematic strings to search for. Thanks. – Jonesey95 (talk) 23:12, 30 January 2017 (UTC)

I have a suspicion that this will be a no-no-no-you-are-crazy, but: phab:T156581. --Edgars2007 (talk/contribs) 14:34, 31 January 2017 (UTC)
I've not seen any proposed dates. I think you can safely assume that it will be several months away. The next step is getting a tool that will show individual articles, so that you can see any problems visually.
I'd like anyone who works on other wikis to take a look at https://tools.wmflabs.org/wikitext-deprecation/ to check for this set of problems. I've asked for this new tool to be put into Tech News soon. Whatamidoing (WMF) (talk) 18:40, 2 February 2017 (UTC)

Suggestion: birth year and age

Resolved

Well-meaning editors have added ages to birth years, e.g. replacing 1966 by "1966 (age 50-51)". Some of them may forget to come back and update all the articles every January. Would it be possible/sensible to check for a pattern such as \d{4}\s*\(age\s+\d+[-–]\d+\)? A kind editor can then replace it by {{birth year and age}} or similar. This occurs mainly within the birth_date parameter of {{infobox person}} and its descendants, so could be limited to that case if helpful. Thanks, Certes (talk) 09:08, 7 January 2017 (UTC)

You can use the search, e.g. hastemplate:"Infobox person" insource:/[0-9]{4} *\(age +[0-9]+[-–][0-9]+\)/. Matěj Suchánek (talk) 10:45, 7 January 2017 (UTC)
Thank you Matěj, I didn't know about hastemplate. All of the search engines I've found dumb my query down to finding pages about what Age means. Certes (talk) 12:20, 7 January 2017 (UTC)
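For anyone who wants to try the pattern offline, here is a minimal Python sketch that applies Certes's regular expression to a piece of wikitext. The function name find_hardcoded_ages is made up for illustration; the pattern itself is the one proposed above.

```python
import re

# Certes's pattern from the suggestion above, compiled as-is.
AGE_PATTERN = re.compile(r'\d{4}\s*\(age\s+\d+[-–]\d+\)')

def find_hardcoded_ages(wikitext):
    """Return every hard-coded 'YYYY (age N-M)' string found in wikitext."""
    return AGE_PATTERN.findall(wikitext)
```

The character class accepts both an ASCII hyphen and an en dash between the two ages, as in the original suggestion.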

Detection on #1, 2 and 34

Resolved

Why has the detection on these lists for de.wiki been stopped? Perhaps there are even some more numbers that have been suspended.--Hadibe (talk) 08:56, 8 January 2017 (UTC)

Hadibe They shouldn't be stopped. None of those errors have been turned off in the translation file. I see #2 and #34 errors today. #1 does seem iffy. I don't see it showing up in other wikis. I'll look into that. Bgwhite (talk) 06:36, 9 January 2017 (UTC)
It was strange to see #2 and #34 totally empty yesterday when every other day more than 50 articles would have been added. And there were articles containing these. #1 might be empty for nearly 2 weeks now. --Hadibe (talk) 07:02, 9 January 2017 (UTC)
Hadibe The January 8, 0z CheckWiki daily update did not run for any wiki. The latest daily update contains 48 hours worth of errors (8th and 9th) minus however long the cron was down on Labs. If you think seeing errors empty was strange, then you obviously haven't seen me. :) Bgwhite (talk) 07:11, 9 January 2017 (UTC)
Hadibe #1 should be fixed. Bgwhite (talk) 19:51, 9 January 2017 (UTC)
Thank you. --Hadibe (talk) 21:04, 9 January 2017 (UTC)
Could someone please check if the detection on de.wiki #16 has been stopped? There weren't any entries for some weeks, but every now and then I find some articles which should have been detected like this one --Hadibe (talk) 18:17, 16 March 2017 (UTC)

Bug or not

Resolved

Moin Moin @Bgwhite:, I don't know what the problem is, or whether it is a bug, but for a few days I have seen the following behavior. My setup is Opera 42 and IE 11 on Windows 7 and 10. I choose any ID and click "more"; if I then click "Done" and the article has a space in its name, the complete entry disappears. On the ID list I can click "Done" and it's done. Neither my browser nor my system has changed. Can you tell me if something has changed in the code? mfg --Crazy1880 (talk) 18:25, 9 January 2017 (UTC)

Crazy1880 Code changed. I was working on a cleanup of the code and it wasn't 100% ready. I pushed it out to fix some bugs that have been talked out above. Will fix. Bgwhite (talk) 19:56, 9 January 2017 (UTC)
Crazy1880 Should be fixed. Bgwhite (talk) 21:01, 10 January 2017 (UTC)
Moin Moin @Bgwhite:, it looks good. Thanks. Regards --Crazy1880 (talk) 18:35, 11 January 2017 (UTC)

Red Link Recovery Live

I hope this isn't too far off topic but I've already tried elsewhere with no reply. The helpful Red Link Recovery Live [14] has been broken for a few weeks now, returning a header but no data. (A little hacking suggests that it may no longer be authorised to log in to the database it queries.) Topbanana usually looks after these things but has been inactive for a month - I do hope they're OK. Is there anyone else who could take a look at the problem please? Certes (talk) 01:11, 10 January 2017 (UTC)

Certes I poked around the tb-dev directory on labs. There are no logs being kept. There's nothing else around that gives any indication on how things are running. The only thing I can think of is to email wikitech-l@lists.wikimedia.org This is the main email list for everybody on Labs. The Wiki-Tech meetings are going on this week, so I don't know if that will slow down response time from the mailing list. Bgwhite (talk) 21:13, 10 January 2017 (UTC)
Thank you Bgwhite! I'll give it another week if the techs are busy, then pester them by e-mail if I can't find a less disruptive way to get things mended. Certes (talk) 23:57, 10 January 2017 (UTC)

History column

Sometimes the errors are introduced by new users and the edit should be reverted. Please add a history column. Yamaha5 (talk) 11:34, 15 January 2017 (UTC)

Error #43 has false positive

When an article has {{{}}} inside it, error #43 gives a false positive, for example here and at fa:آزاده (شاهنامه). Yamaha5 (talk) 09:07, 18 January 2017 (UTC)

For example, {{{foo|{{{Foo|}}}}}} has 2 {{ and 3 }}, and because of that the report has a false positive. Yamaha5 (talk) 19:11, 18 January 2017 (UTC)
Not only that: "{{{" is marked as an error too, in order to detect articles with unbalanced brackets, for example {{{reflist}}. So... why do you use "{{{foo|}}}" in articles? That is a template element; {{{foo|something}}} is always equivalent to something, since no parameters are passed to an article, including foo. Facenapalm (talk) 13:30, 19 January 2017 (UTC)
I didn't use "{{{foo|}}}" :) It is the new users' fault: they use the original template code inside the article. It should be listed in another error list, not mixed into this one.
I solved this bug in Python with this code:
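import re
text2 = text
# Strip innermost {{{...}}} placeholders repeatedly, so nested ones are removed too.
pattern = re.compile(r'\{\{\{[^{}]*\}\}\}')
while True:
    text2, count = pattern.subn('', text2)
    if count == 0:
        break  # nothing left to strip; also avoids looping forever on a stray {{{
# After this cleaning, the text can be checked for {{ and }}.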

Yamaha5 (talk) 12:47, 26 January 2017 (UTC)
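The counting argument in this thread can be reproduced with a small sketch. It assumes a deliberately naive check that compares non-overlapping counts of "{{" and "}}"; that is an assumption made for illustration, not a description of how CheckWiki detects #43. The names strip_parameters and brace_pairs_balanced are hypothetical.

```python
import re

# {{{...}}} with no braces inside, i.e. an innermost parameter placeholder.
PARAM = re.compile(r'\{\{\{[^{}]*\}\}\}')

def strip_parameters(text):
    """Remove {{{...}}} placeholders, innermost first, until none match."""
    while True:
        text, count = PARAM.subn('', text)
        if count == 0:
            return text

def brace_pairs_balanced(text):
    """Naive check: same number of '{{' and '}}' (non-overlapping counts)."""
    return text.count('{{') == text.count('}}')
```

With the example from above, "{{{foo|{{{Foo|}}}}}}" counts as 2 openings against 3 closings, so the naive check fails until the placeholders have been stripped.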

Is there a limit (number of items) for the whitelist?

Under the plan for jawiki, it will have about 3000 items. Is that possible? Thanks.--Momijiro (talk) 02:47, 29 January 2017 (UTC)

Momijiro There is no limit. Bgwhite (talk) 09:26, 29 January 2017 (UTC)
Bgwhite Understood. Thanks.--Momijiro (talk) 09:59, 29 January 2017 (UTC)

Error #46

Resolved

There are several false positives in cswiki. They have in common that the reported bad syntax is inside an image which is at the very beginning of an article and which has an internal link inside its caption. Matěj Suchánek (talk) 08:57, 29 January 2017 (UTC)

Matěj Suchánek It's not a false positive. CheckWiki has identified the wrong spot. Search for [[ or ]] in the article to find the correct spot. Bgwhite (talk) 09:33, 29 January 2017 (UTC)
I can see them now. Thanks for making me take a second look. Matěj Suchánek (talk) 09:41, 29 January 2017 (UTC)

<mapframe> tag

Done

@Bgwhite: Apparently, the <mapframe> tag has been activated, at least on frwiki. It seems to be containing JSON data, leading to false positives for errors related to curly braces (like #47 for fr:Le Malesherbois#Composition). Could we ignore contents inside <mapframe> tag ? --NicoV (Talk on frwiki) 09:46, 3 February 2017 (UTC)

NicoV This was just a simple copy/paste to add this. Bgwhite (talk) 20:01, 3 February 2017 (UTC)
Thanks ! Also done in WPC for #43 and #47. --NicoV (Talk on frwiki) 17:39, 7 February 2017 (UTC)
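One possible way to ignore the JSON inside the tag before running brace checks is sketched below in Python. It is not the code used by CheckWiki or WPC; the name blank_mapframe is hypothetical, and the approach (overwriting the block with spaces so that positions in the rest of the page stay the same) is just one option.

```python
import re

# <mapframe ...>JSON</mapframe>, case-insensitive, possibly spanning lines.
# The lookbehind skips self-closing <mapframe ... /> tags, which have no body.
MAPFRAME = re.compile(r'<mapframe\b[^>]*(?<!/)>.*?</mapframe\s*>',
                      re.IGNORECASE | re.DOTALL)

def blank_mapframe(text):
    """Overwrite each <mapframe> block with spaces of the same length."""
    return MAPFRAME.sub(lambda m: ' ' * len(m.group(0)), text)
```

Because the replacement has the same length as the original block, any error position reported afterwards still points at the right place in the unmodified page.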

Proposal in the Magioladitis arbcom case

I've made a proposal on the ARBCOM workshop page concerning how to deal with the Magioladitis situation. Input by WP:BAG, bot owners, and the community at large is welcomed. Even if the proposal doesn't pass, some other ideas can be of interest, especially to WP:AWB/WP:CHECKWIKI people. Headbomb {talk / contribs / physics / books} 13:23, 3 February 2017 (UTC)

Better notice for #43 ?

Hi, on frwiki, fr:Bimbogami ga! gets reported for #43 with the notice {{TomeBD | couleur_ligne = | langage_u, and checkarticle gives a location of "-1". I could find where the error is... Would it be possible to have at least the position ? --NicoV (Talk on frwiki) 17:36, 7 February 2017 (UTC)

False positive for #102

Here CheckWiki reports PubMed URL links. I checked the links and they are fine, such as PMID 19184489, which CheckWiki flags as a problem. Yamaha5 (talk) 21:03, 10 February 2017 (UTC)

It reached the maximum, maybe? IKhitron (talk) 21:11, 10 February 2017 (UTC)

Double prefix

Image description with full small

Please modify code so that it also detects small templates in addition to small tags. -- Magioladitis (talk) 13:21, 11 February 2017 (UTC)

Tnavbar-header and Navbar-header

these need to be fixed. either (1) someone has substituted a template which should not have been substituted, or (2) substituted a template but didn't clean up after substituting the template. in either case, the edit link is not pointing to where you edit the content. the solution is either (1) unsubstitute the template, or (2) replace style="..." | {{T?navbar-header|title text|...fontcolor=fcolor}} with style="...; text-align:center; color: fcolor;" | title text. for example, like this. not sure if this can be safely done by bot. Frietjes (talk) 21:14, 14 February 2017 (UTC)

#48 whitelist does not work in hewiki

Resolved

Hi. Could you check what's the problem, please? Thank you. IKhitron (talk) 19:34, 22 February 2017 (UTC)

Found the problem. IKhitron (talk) 15:36, 10 March 2017 (UTC)

#48 templates

Hi. Is there a possibility to ignore #48 in some imagemap templates? Thank you. IKhitron (talk) 15:17, 24 February 2017 (UTC)

update fawiki's local setting

Hi, please update fawiki's local settings. I changed the settings and whitelists some months ago, but CheckWiki's reports still haven't changed. Yamaha5 (talk) 11:57, 5 March 2017 (UTC)

Suggestion: Find cases of [[Link target|Link tar]]get and [[Link target|Link tar]]<nowiki/>get

I'm talking about cases where the link display text with the link trail is identical to the link target text.

"[[Link target|Link tar]]get" is functionally identical to [[Link target]].

"[[Link target|Link tar]]<nowiki/>get" is not functionally identical, but it's likely that the intention was to write [[Link target]]. I fixed many such cases in the Hebrew Wikipedia, and I cannot recall even one case where something different was needed.

This should also check for "[[Link target|link tar]]get", i.e. the capitalization of the first letter doesn't matter. --Amir E. Aharoni (talk) 12:09, 6 March 2017 (UTC)

"[[Link target|Link tar]]get" is functionally identical to [[Link target]]. - only if "get" consists of lowercase symbols. See: tarGet, tar2et, tar-et. tarгет doesn't work here, but works in ruwiki ([а-яё] is a regexp for Russian lowercase symbols, in case this gets implemented). Facenapalm (talk) 19:26, 6 March 2017 (UTC)
Right, but it doesn't change the suggestion. Most likely, the editor's intention was still to include the trail inside the link. --Amir E. Aharoni (talk) 20:27, 6 March 2017 (UTC)
By the way, this replacement is also possible: [[Link tar|Link target]] -> [[Link tar]]get. Again, only if the symbols after the link in the result are lowercase. Both of those suggestions would be a nice addition to error #64, especially considering that many tools (AWB, ruwiki's Wikificator) can already do this. Facenapalm (talk) 02:55, 7 March 2017 (UTC)
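The first suggestion in this thread might look roughly like the Python sketch below. It is an illustration only: merge_link_trail and same_title are hypothetical names, the [a-z]+ trail is the English Wikipedia rule (as noted above, other wikis such as ruwiki need a different character set), and title normalisation such as underscores versus spaces is ignored.

```python
import re

# [[target|label]] followed by an optional <nowiki/> and a lowercase ASCII trail.
LINK = re.compile(r'\[\[([^\[\]|]+)\|([^\[\]|]+)\]\](?:<nowiki\s*/>)?([a-z]+)')

def same_title(a, b):
    """Compare two titles, ignoring the case of the first letter only."""
    return a[:1].lower() + a[1:] == b[:1].lower() + b[1:]

def merge_link_trail(text):
    """Rewrite [[Link target|Link tar]]get as [[Link target]]."""
    def repl(m):
        target, label, trail = m.groups()
        shown = label + trail
        if same_title(target, shown):
            # Keep the capitalisation the reader currently sees.
            return '[[' + shown + ']]'
        return m.group(0)  # not the pattern we are looking for; leave as is
    return LINK.sub(repl, text)
```

For the <nowiki/> variant the rewrite changes the rendered output (the trail becomes part of the link), so those cases are probably better reported for human review than fixed blindly.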

Suggestion: Find cases of [[Link target|<nowiki/>]]

Parsoid occasionally created "links" like "[[Link target|<nowiki/>]]", and possibly still does. This outputs nothing, and most likely has to be simply deleted. Occasionally, this must be fixed to [[Link target]] or something else, but [[Link target|<nowiki/>]] is in any case wrong. --Amir E. Aharoni (talk) 12:09, 6 March 2017 (UTC)

This is a good one. -- Magioladitis (talk) 18:07, 8 March 2017 (UTC)
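A detection for this could be as small as the following Python sketch. The name find_empty_links is hypothetical, and the sketch only reports the link targets; deciding whether to delete the link or turn it into [[Link target]] is left to the editor, as described above.

```python
import re

# [[target|<nowiki/>]] : a piped link whose only label is an empty nowiki tag.
EMPTY_LINK = re.compile(r'\[\[([^\[\]|]+)\|<nowiki\s*/>\]\]')

def find_empty_links(text):
    """Return the targets of links whose only label is <nowiki/>."""
    return [m.group(1) for m in EMPTY_LINK.finditer(text)]
```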

Bad CSS

Resolved

someone thought it would be a good idea to mangle the css with hyphens instead of a standard ASCII minus sign. all of these are broken and should be fixed. Frietjes (talk) 16:46, 8 March 2017 (UTC)

Frietjes I fixed all. -- Magioladitis (talk) 17:52, 8 March 2017 (UTC)

Actually, there are five more [20] IKhitron (talk) 17:54, 8 March 2017 (UTC)
Magioladitis, and same for font-size. Frietjes (talk) 17:56, 8 March 2017 (UTC)


Frietjes and IKhitron I think now I fixed everything even in user pages. -- Magioladitis (talk) 18:06, 8 March 2017 (UTC)
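For future clean-ups of this kind, a sketch in Python of how such style attributes could be normalised. The thread does not say which characters were involved, so the set below (U+2010 HYPHEN, U+2011 NON-BREAKING HYPHEN, U+2212 MINUS SIGN) is an assumption, and fix_css_hyphens is a hypothetical name; only double-quoted style="..." attributes are touched.

```python
import re

# Assumed look-alike characters; the thread does not name the exact ones.
LOOKALIKES = '\u2010\u2011\u2212'
TABLE = str.maketrans({c: '-' for c in LOOKALIKES})

# A double-quoted style attribute, e.g. style="text-align:center".
STYLE_ATTR = re.compile(r'style\s*=\s*"[^"]*"')

def fix_css_hyphens(text):
    """Replace look-alike hyphens with ASCII '-' inside style attributes only."""
    return STYLE_ATTR.sub(lambda m: m.group(0).translate(TABLE), text)
```

Text outside style attributes is left alone, since a typographic hyphen in prose may be intentional.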

Access-Control-Allow-Origin

Hello! I tried to write a script that automatically downloads the list of errors for the current page, displays them in the edit form and offers a choice of which ones should be automatically marked as "Done". However, I ran into a problem: an HTTP request that works correctly in an empty tab can't load anything when launched on the Wikipedia domain. Chrome writes something like this:
XMLHttpRequest cannot load https://tools.wmflabs.org/checkwiki/cgi-bin/checkwiki.cgi?project=ruwiki&view=detail&title=.mil. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'https://ru.wikipedia.org' is therefore not allowed access.
Is it possible to configure the server so that such requests work? (If it matters, the requests must be made via https, because a request via http from an https domain doesn't work either.) Facenapalm (talk) 14:53, 9 March 2017 (UTC)
You are trying to fetch it in table view, not the bot one? Why? IKhitron (talk) 14:55, 9 March 2017 (UTC)
Because I need to get the list of errors for the current page, not the list of pages for the current error. And I need to get the "Notice" field as well: I want to show information to people, not to get information for bots. Anyway, requests for the bot view don't work either if their origin is Wikipedia. Facenapalm (talk) 15:04, 9 March 2017 (UTC)
I see. But AWB can fetch them, somehow. IKhitron (talk) 15:05, 9 March 2017 (UTC)
AWB runs on your PC and can fetch them just like a browser does. My Python bot can also download the list of errors and mark some of them as "done" without any problem. But that's not how cross-domain AJAX works. Facenapalm (talk) 15:08, 9 March 2017 (UTC)
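What the browser is asking for is an extra response header from the tool. Whatever language the CGI script is written in, the logic would be along the lines of this Python sketch. The allowlist pattern and the name cors_headers are assumptions made for illustration; the actual policy would be up to the tool's maintainer.

```python
import re

# Hypothetical allowlist: any https://<lang>.wikipedia.org origin.
ALLOWED_ORIGIN = re.compile(r'https://[a-z0-9-]+\.wikipedia\.org')

def cors_headers(origin):
    """Return the extra response headers for a request with this Origin value."""
    if origin and ALLOWED_ORIGIN.fullmatch(origin):
        return [
            # Echo the origin back so the browser lets the page read the response.
            ("Access-Control-Allow-Origin", origin),
            # Tell caches that the response depends on the Origin header.
            ("Vary", "Origin"),
        ]
    return []
```

Echoing back a checked origin rather than sending "*" keeps the response readable only from the listed wikis.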

ID 3 (references responsive)

Moin Moin @Bgwhite:, since yesterday we have the feature "<references responsive />" to display references in multiple columns. ID 3, however, means that there is no references tag. Could it be that this new syntax is not yet understood by the script? Regards --Crazy1880 (talk) 18:29, 17 March 2017 (UTC)
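A tolerant check for the tag, with or without attributes, could look like this Python sketch. It is not the CheckWiki code; has_references_tag is a hypothetical name, and the sketch only covers the tag itself, not any templates that a wiki may accept as a substitute.

```python
import re

# Matches <references/>, <references />, <references responsive />,
# <references group="n"> and similar; \b rejects names like <referencesfoo>.
REFERENCES_TAG = re.compile(r'<references\b[^>]*>', re.IGNORECASE)

def has_references_tag(text):
    """Return True if the wikitext contains any form of the references tag."""
    return REFERENCES_TAG.search(text) is not None
```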

ID 16 (Unicode)

Moin Moin @Bgwhite: the second one today. For some time now, no new entries have been added for ID 16, but I have found some articles that should have been registered. Could it be that something isn't running? Regards --Crazy1880 (talk) 18:32, 17 March 2017 (UTC)

Linter tool out on small wikis

Quick note: mw:Extension:Linter (which helps editors find wikitext errors) is available on some small wikis. Have a look at mw:Special:LintErrors to see what it does. (I don't know what all of the categories are, but I see that lots of translations of mw:Help:VisualEditor/User guide appear in a couple of the lists.) Whatamidoing (WMF) (talk) 20:14, 21 March 2017 (UTC)