Wikipedia talk:WikiProject Check Wikipedia


ISBN-check

It would be very helpful if the check could recognize and ignore

  • ISBNistFormalFalsch=J
Example: de:Erich Burgener - {{Literatur | Autor=Bertrand Zimmermann | Titel=Erich Burgener | Verlag= Editions de la Thèle| Ort=Yverdon-les-Bains | Jahr=1987 | ISBN=2-8283-0024 | ISBNistFormalFalsch=J }}
  • http://xxxxx/isbn/282830024

--Tsor (talk) 09:09, 2 March 2014 (UTC)

Tsor, as usual, I'm confused. Why give a bad ISBN in the first place? I did a Google search and only two non-Wikipedia derived websites give this number and one of them is Wikipedia. Bgwhite (talk) 23:43, 2 March 2014 (UTC)
Hello Bgwhite, this is just a (bad) example. Sometimes a book carries an ISBN that is formally wrong. Some editors use the template Vorlage:Literatur, where they can mark such invalid ISBNs with "ISBNistFormalFalsch=J". Another template, Vorlage:Falsche ISBN, can also mark them: {{Falsche ISBN|3-123-45678-9}} renders as "ISBN 3-123-45678-9 (formal falsche ISBN)". This template is used very often: https://de.wikipedia.org/wiki/Spezial:Linkliste/Vorlage:Falsche_ISBN
I will look for a better example for an invalid ISBN. --Tsor (talk) 10:10, 3 March 2014 (UTC)
PS: An additional column in the error-list "marked as invalid" would help. --Tsor (talk) 10:18, 3 March 2014 (UTC)
Tsor, I'm slow, but I still fail to see what is wrong. Wouldn't it be best to use a correct ISBN? A better example would help me understand. TMg, could you help me out?
There are whitelists to which articles can be added so they won't be raised as an error again. Too many things can go wrong with a "marked as invalid" button... There is already a problem of vandalism by people clicking done when they have no intention of fixing errors. Bgwhite (talk) 3 March 2014 (UTC)
Here are 349 examples. --Tsor (talk) 11:10, 3 March 2014 (UTC)
I just looked at the first one in the list, de:Charles de Melun and I don't understand why the ISBN is qualified as bad: the checksum is correct. Is it normal to have "ISBNistFormalFalsch=J" with an ISBN that seems correct? Edit: idem for second example de:Bussard (Einheit). --NicoV (Talk on frwiki) 12:26, 3 March 2014 (UTC)
Hmm, you are right, in de:Charles de Melun the ISBN is marked as bad but it is OK. Same for your second example. I will have a closer look. --Tsor (talk) 13:26, 3 March 2014 (UTC)
Please repeat your calculation. The checksum digit is wrong: if the first 9 digits are correct, the checksum digit at the end should be a 1, so the ISBN should be 2902091311 and not 2902091312. --Cepheiden (talk) 19:15, 5 March 2014 (UTC)
Well, you're not looking at the version I was looking at. The page was modified after my comment and the ISBN was changed completely: an ISBN-13 with a coherent checksum was replaced by an ISBN-10 with a non-coherent checksum. --NicoV (Talk on frwiki) 20:21, 5 March 2014 (UTC)
I'm sorry, you are right, I didn't notice the edit. --Cepheiden (talk) 17:48, 8 March 2014 (UTC)
I also looked at others, and a lot seem to be in the same situation. There are also cases where the ISBN does have a wrong checksum, but the book can be found with the correct ISBN on the internet: de:Mare Imbrium and the corresponding book on google. I've spent quite some time on frwiki fixing ISBNs reported by CW (still quite some work to do), but I've found very few situations where the ISBN with the incorrect checksum was confirmed as being the ISBN (it's usually fixed at some point). --NicoV (Talk on frwiki) 15:51, 3 March 2014 (UTC)
Yes, there are cases of ISBNs with wrong checksum digits used as the original ISBN (printed in the book and listed in library databases etc.). If someone cites this book with this ISBN we mark it as "formally false" like some libraries do. So what's the point here? --Cepheiden (talk) 19:15, 5 March 2014 (UTC)
My point was that I was surprised by the size of the list (349 pages), because as I said, I fixed a lot of ISBNs on frwiki, and didn't find so many situations where the ISBN with the non-coherent checksum had to be kept. Given that the first hits in the search seemed to be mistakes, I was wondering if it was normal that you have so many pages with ISBNs tagged as formally false. --NicoV (Talk on frwiki) 20:26, 5 March 2014 (UTC)
This was more a reply to Bgwhite (like Tsor already did). --Cepheiden (talk) 17:48, 8 March 2014 (UTC)
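The checksum arithmetic argued over above can be reproduced with a few lines. This is a minimal sketch of the standard ISBN-10 and ISBN-13 check-digit rules, not the code used by Check Wikipedia or WPCleaner; the function names are made up for illustration.

```python
def isbn10_check_digit(first9):
    """Check character for the first nine digits of an ISBN-10."""
    # Weights run 10, 9, ..., 2 over the nine digits.
    total = sum(int(d) * w for d, w in zip(first9, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return "X" if check == 10 else str(check)


def isbn13_check_digit(first12):
    """Check digit for the first twelve digits of an ISBN-13."""
    # Weights alternate 1, 3, 1, 3, ...
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return str((10 - total % 10) % 10)


def is_valid_isbn(raw):
    """True if the length is 10 or 13 and the check digit is coherent."""
    digits = raw.replace("-", "").replace(" ", "").upper()
    if len(digits) == 10 and digits[:9].isdigit():
        return isbn10_check_digit(digits[:9]) == digits[9]
    if len(digits) == 13 and digits.isdigit():
        return isbn13_check_digit(digits[:12]) == digits[12]
    return False
```

With these rules, the nine digits 290209131 give a check digit of 1, which matches Cepheiden's calculation, and 2-8283-0024 fails simply because it has only nine digits.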

Just an example for the second point: http://www.randomhouse.ca/catalog/display.pperl?isbn=9780676978223 found in de:28 Stories über Aids in Afrika. --Tsor (talk) 22:08, 3 March 2014 (UTC)

It links to "Page not found"; the correct link seems to be http://www.randomhouse.ca/catalog/display.pperl?isbn=9780676978230 (the last two digits of the ISBN differ). --NicoV (Talk on frwiki) 22:29, 3 March 2014 (UTC)

Adjacent references?

Hi, what do you think of adding a detection for adjacent references, like <ref>...</ref><ref>...</ref><ref>...</ref> ? This error probably won't be of any interest for enwiki because reference numbers are put between square brackets [1][2][3]. But on frwiki reference numbers are displayed without any decoration so adjacent references may look like only one reference 123, so we're generally using a template {{,}} between references. --NicoV (Talk on frwiki) 22:14, 27 May 2014 (UTC)

NicoV, could you get me some articles with the problem as test subjects. <maniacal laugh> Test Subjects </maniacal laugh> I take it I need to look for cases of: </ref><ref> and <ref name=ack /><ref ? I also saw your message above about adding to the done pages. Bgwhite (talk) 05:29, 28 May 2014 (UTC)
Ok, will try to find some... The subject was brought on WPCleaner's talk page for this modification, but the page is fixed now. --NicoV (Talk on frwiki) 07:17, 28 May 2014 (UTC)
Bgwhite, I checked a lot of articles but I haven't found another example yet... --NicoV (Talk on frwiki) 12:16, 28 May 2014 (UTC)
fr:Utilisateur:Zetud/Pb Ref should have a list. --NicoV
Bgwhite, fr:Leetchi, with at least 2 problems in the introduction. --NicoV (Talk on frwiki) 07:34, 2 July 2014 (UTC)
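A rough idea of what such a detection could look for, covering the two cases Bgwhite mentions (</ref><ref> and a self-closing named reference followed by <ref). This is only a sketch: the pattern and the function name are assumptions for illustration, not taken from checkwiki.pl or WPCleaner.

```python
import re

# The end of one reference (a closing tag, or a self-closing named ref)
# immediately followed by the start of another, with nothing in between.
ADJACENT_REFS = re.compile(
    r"(?:</ref\s*>|<ref\b[^<>]*/>)(?=<ref\b)",
    re.IGNORECASE,
)


def count_adjacent_refs(wikitext):
    """Number of places where two references touch without a separator."""
    return len(ADJACENT_REFS.findall(wikitext))
```

References separated by {{,}} are not counted, and <references /> is not mistaken for a <ref> tag because of the word boundary after "ref".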

Error 3 on elwiki

I think WPCleaner catches the list found at el:Βικιπαίδεια:WikiProject_Check_Wikipedia/Μετάφραση while CHECKWIKI script does not. -- Magioladitis (talk) 16:48, 27 June 2014 (UTC)

I think the problem is only with the last line. Now that I updated the code, I noticed that all errors shown are connected to the last line. -- Magioladitis (talk) 05:26, 8 July 2014 (UTC)

Bgwhite. Nope. The problem still occurs. -- Magioladitis (talk) 08:26, 24 January 2015 (UTC)

#14 false positives

Two false positives at plwiki are reported. To remove such cases, you might check only for "<source ", not "<source", and skip code which is in a <source> by itself. ToSter (talk) 19:20, 21 October 2014 (UTC)

The solution is again "<source[^a-z]". -- Magioladitis (talk) 06:13, 22 October 2014 (UTC)

Magioladitis, Error #14 doesn't use a regex. It uses the same subroutine used for checking imbalanced nowiki, pre, comment, syntaxhighlight, code, math, hiero, and score. The regex also doesn't solve the problem with the articles ToSter mentioned. The problem with the articles... there are valid, unbalanced source tags inside source tags.
Following scenario is in ToSter's articles, where the second source is not an html source tag.
<source> [text] <source> [text] </source>
Problem is... how does one differentiate between ToSter's scenario and a scenario where the first <source> tag is actually missing a closing tag, especially when editors don't always put extra parameters inside source tags. Bgwhite (talk) 07:26, 22 October 2014 (UTC)
We also have false positives on frwiki, which doesn't seem to fall into the above category:
  • fr:Apache Ant: a <sourcePath> tag is detected as being a <source> tag
  • fr:Vidéo HTML5: there are 3 self-closing <source /> tags inside a <syntaxhighlight> tag. The third one is reported.
--NicoV (Talk on frwiki) 13:41, 25 October 2014 (UTC)
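For the two frwiki cases, the tag-name boundary matters. Note that Bgwhite says error #14 does not use a regex, so the following is only an illustration of the boundary idea, with hypothetical names; it does nothing for the nested <source> scenario described above.

```python
import re

# A case-sensitive "<source[^a-z]" still matches "<sourcePath>",
# because the uppercase "P" is not in a-z.
LOOSE = re.compile(r"<source[^a-z]")

# Stricter: the tag name must be followed by whitespace, "/" or ">".
STRICT = re.compile(r"<source(?=[\s/>])", re.IGNORECASE)

# Self-closing tags such as <source src="..." /> never need a closing tag.
SELF_CLOSING = re.compile(r"<source\b[^<>]*/>", re.IGNORECASE)


def open_source_tags(text):
    """Count opening <source> tags, ignoring self-closing ones."""
    without_self_closing = SELF_CLOSING.sub("", text)
    return len(STRICT.findall(without_self_closing))
```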

Bgwhite is this fixed somehow? I haven't seen any false positives for a long time. -- Magioladitis (talk) 08:33, 24 January 2015 (UTC)

Stripping pre tags

Hi,

<pre> tags are stripped only if they have no additional attributes. In pl:dmesg there's a pre block:

<pre style="height:20em; overflow-y:scroll">...</pre>

It's not getting stripped by checkwiki.pl (get_pre() function) so false positives are reported (like #56 in that case). ToSter (talk) 18:51, 12 November 2014 (UTC)

ToSter, personally, I'd remove the entire pre text. I don't see the benefit of a boot screen from a 6-year old version of Linux. Bgwhite (talk) 23:35, 12 November 2014 (UTC)
Bgwhite, that's right :) but still the problem can occur in another place. ToSter (talk) 06:13, 13 November 2014 (UTC)
ToSter, it can, but it is not. Also, this is what the whitelist is for. Bgwhite (talk) 21:54, 14 November 2014 (UTC)
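For reference, stripping pre blocks regardless of attributes could look roughly like this. It is a sketch in Python, not the actual get_pre() function from checkwiki.pl, and the names are made up.

```python
import re

# Opening tag with any attributes, shortest possible body, closing tag.
PRE_BLOCK = re.compile(r"<pre\b[^<>]*>.*?</pre\s*>", re.IGNORECASE | re.DOTALL)


def strip_pre(text):
    """Remove every <pre ...>...</pre> block, attributes or not."""
    return PRE_BLOCK.sub("", text)
```

The word boundary after "pre" keeps unrelated tag names that merely start with "pre" untouched. An unclosed or self-closing <pre> is not handled by this sketch.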

AWB fixes/detects more of some errors

Add field with user edit

Hi, I apologize for writing in Spanish. Could an additional field be added that shows the user name or IP address that made the edit containing the detected error? Thanks, good work. Sergio Andres Segovia (talk) 16:59, 30 November 2014 (UTC)

Seems possible but would require a lot of processing to find the particular edit. Frietjes (talk) 17:58, 30 November 2014 (UTC)
It's a shame that it requires a lot of processing, because with that field we could go straight to the contributions of the user or IP, and anyone with the rollback flag could revert the edits from there. On Spanish Wikipedia we tried to detect this with an edit filter, but it produced many false positives [1]. Regards, Sergio Andres Segovia (talk) 19:05, 30 November 2014 (UTC)
I agree that it would be useful. You might be able to get a bot to do this for you? for example, I know that some bots like 'BracketBot' will warn you when you have introduced unbalanced brackets. of course, there is a difference between warning a user about 'breaking an article' and warning a user about using deprecated syntax. maybe you can ask the operator of BracketBot (A930913)? Frietjes (talk) 17:19, 1 December 2014 (UTC)
There is also Bracketbot's brother, ReferenceBot. Both are done by A930913. The two main differences between BracketBot and CheckWiki are: 1) Bracketbot checks articles in near real-time 2) Bracketbot informs the editor of the problem they created instead of reporting the error to a master database. In theory, CheckWiki can also be run in near real-time on individual articles. I would need help from A930913. His bot code would run normally except call CheckWiki to test an article instead of using the bot's checks. Bgwhite (talk) 19:53, 1 December 2014 (UTC)
@Bgwhite: Make a (web)script that I can ping with a pageid/title/diffid/oldid/user? ##930913 connect? 930913 {{ping}} 07:33, 2 December 2014 (UTC)

Error #84

Remember to exclude headers consisting of a single letter from checking. -- Magioladitis (talk) 08:41, 14 January 2015 (UTC)

Magioladitis Code already does this. Wondering if another section header, say == 0-9 ==, was flagged in the article. The last dump did not flag any one character section header. Bgwhite (talk) 21:41, 22 January 2015 (UTC)

Bgwhite I can't remember anymore which page caused the problem. -- Magioladitis (talk) 09:11, 24 January 2015 (UTC)

Instances of 'subst:' in articles

Done

do we have a scan for 'subst:' in articles? for example, for cleaning up this type of issue? Frietjes (talk) 20:59, 28 January 2015 (UTC)

@Frietjes, Magioladitis, NicoV: A scan of the latest dump for 'subst:' can be found at User:Bgwhite/Sandbox. The listing will contain examples where 'subst:' is inside comment tags. I've added 'subst:' to the checkwiki program and results will show up under error #34.
Currently #34 scans for:
  • {{{
  • #if:
  • #ifeq:
  • #switch:
  • #ifexist:
  • {{fullpagename}}
  • {{sitename}}
  • {{namespace}}
  • {{basepagename}}
  • {{pagename}}
  • {{pagesize}}
  • {{protectionlevel}}
  • {{subpagename}}
  • {{subst:
Bgwhite (talk) 22:43, 28 January 2015 (UTC)
I've updated WPC so that it detects subst: and safesubst: as error #34. WPC detects a lot of functions as #34 (see variable functionMagicWords around line 247), but I don't think they should be added to the CW detection. --NicoV (Talk on frwiki) 11:08, 29 January 2015 (UTC)
now fixed all the ones that should be fixed in User:Bgwhite/Sandbox, so feel free to rescan, or overwrite the list with something else. Frietjes (talk) 16:40, 30 January 2015 (UTC)
@Magioladitis, Bgwhite: It seems that it also detects NUMBEROFARTICLES (Encyclopédie is detected on frwiki). Is it useful? There's nothing much we can do about this value, since it's a dynamic one. Same would apply to PAGESIZE and PROTECTIONLEVEL. --NicoV (Talk on frwiki) 08:44, 2 February 2015 (UTC)
NicoV NUMBEROFARTICLES was added for a couple of days and then removed. Magioladitis will have to answer the other question. Bgwhite (talk) 09:41, 2 February 2015 (UTC)
@NicoV, Bgwhite: let's remove both PAGESIZE and PROTECTIONLEVEL. -- Magioladitis (talk) 09:59, 2 February 2015 (UTC)
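The list Bgwhite gives for #34 amounts to a plain substring scan. A minimal sketch follows, assuming a case-insensitive match (the thread does not say how checkwiki handles case) and using the list exactly as posted, before the proposed removal of PAGESIZE and PROTECTIONLEVEL. The names are hypothetical.

```python
# Strings listed in the thread for error #34, lowercased for comparison.
PATTERNS_34 = (
    "{{{", "#if:", "#ifeq:", "#switch:", "#ifexist:",
    "{{fullpagename}}", "{{sitename}}", "{{namespace}}",
    "{{basepagename}}", "{{pagename}}", "{{pagesize}}",
    "{{protectionlevel}}", "{{subpagename}}", "{{subst:",
)


def find_error_34(wikitext, patterns=PATTERNS_34):
    """Return the listed patterns that occur in the text (assumed case-insensitive)."""
    lowered = wikitext.lower()
    return [p for p in patterns if p in lowered]
```

Passing a shorter tuple as patterns would model the list after PAGESIZE and PROTECTIONLEVEL are dropped.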

#69 additions

@Magioladitis, NicoV: Checkwiki *should* find cases of ISBN Pound-sign.... ISBN # Bgwhite (talk) 23:26, 28 January 2015 (UTC)

@Magioladitis, NicoV: {{Infobox comics character and title}} contains ISBN# as a parameter name. I need to put a fix in to avoid this, don't know about your programs. Bgwhite (talk) 05:18, 30 January 2015 (UTC)
ISBN# is not a valid parameter name. ISBN1, ISBN2, etc. are. ISBN# needs to be replaced with ISBN1 if it is not empty, otherwise removed. -- Magioladitis (talk)

@Bgwhite, NicoV: I removed any instances of ISBN# from the Infobox and all other similar infoboxes. -- Magioladitis (talk) 23:12, 30 January 2015 (UTC)

@Magioladitis, NicoV: Checkwiki *should* find cases of [[ISBN]] now.... [[ISBN]] 978-3948-3838-33, [[ISBN]]: 978-3949-3838-33, etc... Bgwhite (talk) 21:10, 2 February 2015 (UTC)

Not all of the cases above will be fixed by AWB. I am afraid of false positives. I do not know whether Rjwilmsi could help us here. -- Magioladitis (talk) 09:07, 3 February 2015 (UTC)
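To make the two new #69 cases concrete, here is a sketch of patterns for "ISBN #" and "[[ISBN]]", including the parameter-name exception Bgwhite mentions. These patterns are assumptions for illustration and are not the ones used in Checkwiki, AWB or WPCleaner.

```python
import re

# "ISBN #978..." or "ISBN# 978...", but not a template parameter "ISBN# = ...".
ISBN_POUND = re.compile(r"\bISBN\s*#(?!\s*=)")

# "[[ISBN]] 978..." or "[[ISBN]]: 978...": a linked ISBN followed by a number.
ISBN_WIKILINK = re.compile(r"\[\[ISBN\]\]:?\s*[0-9]")


def has_isbn_syntax_problem(wikitext):
    """True if either of the two forms discussed above is present."""
    return bool(ISBN_POUND.search(wikitext) or ISBN_WIKILINK.search(wikitext))
```

A plain wikilink to the ISBN article that is not followed by a number is left alone.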

New error: empty titles?

Hi, I was thinking about a new error for detecting empty titles, like the ones VE is creating on a regular basis (== <nowiki /> ==). --NicoV (Talk on frwiki) 18:10, 13 August 2014 (UTC)

NicoV, I did a scan for enwiki and came up with 83 articles. The VE edits all appear old. I wonder if they have fixed the problem in new VE builds? Bgwhite (talk) 22:31, 22 August 2014 (UTC)
Bgwhite, apparently it's still not fixed, the last VE edit I found with this problem is from last night. --NicoV (Talk on frwiki) 09:10, 23 August 2014 (UTC)
Thanks for the list Bgwhite, I've added error #522 to detect empty titles and fixed all the occurrences. --NicoV (Talk on frwiki) 12:21, 24 August 2014 (UTC)

And also, of the same kind, a new error for empty internal links, like in this edit ([[Boom Fm|<nowiki/>]] and [[Roger Blackburn|<nowiki/>]]). --NicoV (Talk on frwiki) 10:11, 23 August 2014 (UTC)

This was fixed in Visual Editor. No new cases have been found over the past few months. Bgwhite (talk) 23:51, 28 January 2015 (UTC)
Bgwhite, not at all. A few examples just in the last 24h (nowiki tags):
And maybe another problem with things like that: [[XX|YY ]]<nowiki/>ZZ which could be easily replaced by [[XX|YY]] ZZ
--NicoV (Talk on frwiki) 13:02, 29 January 2015 (UTC)
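The three nowiki patterns mentioned in this thread can be sketched as follows. The regexes and names are illustrative assumptions, not the detection used for error #522 or anything in checkwiki.

```python
import re

NOWIKI = r"(?:<nowiki\s*/>\s*)+"

# A title whose only content is one or more <nowiki /> tags.
EMPTY_TITLE = re.compile(r"^(={1,6})\s*" + NOWIKI + r"\1\s*$", re.MULTILINE)

# An internal link whose displayed text is only <nowiki/>.
EMPTY_LINK = re.compile(r"\[\[[^\[\]|]+\|\s*" + NOWIKI + r"\]\]")

# [[XX|YY ]]<nowiki/>ZZ, which can be written as [[XX|YY]] ZZ.
TRAILING_SPACE = re.compile(r"\[\[([^\[\]|]+)\|([^\[\]|]*?)\s+\]\]<nowiki\s*/>")


def fix_trailing_space(wikitext):
    """Move the trailing space out of the link and drop the nowiki tag."""
    return TRAILING_SPACE.sub(r"[[\1|\2]] ", wikitext)
```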

@NicoV, Magioladitis: According to Tech News: 2015-14, the problem of nowiki in titles has been fixed. Of course, what new untold problems have arisen due to their fix has yet to be seen. Bgwhite (talk) 20:28, 30 March 2015 (UTC)

Well, when you read in the same announcement that "VisualEditor is now the main editing tool on 53 more Wikipedias", you can't really take it seriously, as even on wikis where it has been enabled by default for almost 2 years, it's still far from being the "main editing tool"... ;-) --NicoV (Talk on frwiki) 20:56, 30 March 2015 (UTC)
NicoV When I read that sentence for the first time, thoughts of dread and pity for those 53 sites went thru my mind. I also wondered what "phase 5" meant. mw:VisualEditor/Rollouts explains what each phase means. They have enwiki as a phase 0, which is, "... wikipedias that have been closed or deprecated". Ahhh, Visual Editor... always good for a laugh and a cry. Bgwhite (talk) 21:25, 30 March 2015 (UTC)
You didn't know? When enwiki made its push to make VE opt-in, they closed enwiki ;-) Currently, we're not editing enwiki, it's a decoy... In the rollouts, I also liked very much the sentence that wikis in phases 1 to 4 "are relatively easy for VE to support"... --NicoV (Talk on frwiki) 21:35, 30 March 2015 (UTC)
VE is supporting phases 1 thru 4 very well. It's rather obvious. From day one, VE has supported goofs, foul-ups, mistakes and barfs. Bgwhite (talk) 21:45, 30 March 2015 (UTC)

Category:Articles with links needing disambiguation from June 2011 is getting close

Category:Articles with links needing disambiguation from June 2011, containing the oldest dated links tagged as needing disambiguation, is now under a thousand. I am sure that with some teamwork, we can wipe it out this month. Cheers! bd2412 T 00:31, 1 February 2015 (UTC)

Error #2 additions

Resolved

@Bgwhite, NicoV: CHECKWIKI now detects cases such as those described in rev 10811 (i.e. <br clear=both /> and similar). -- Magioladitis (talk) 12:33, 1 February 2015 (UTC)

In Checkwiki, I just use /<\s*br\s*clear/ to find cases
WPC now detects them also, and suggests a replacement after some configuration (see example on frwiki configuration). I've only configured frwiki, other projects should do the configuration depending on the replacement they want to use. --NicoV (Talk on frwiki) 17:31, 15 February 2015 (UTC)
For enwiki, the configuration should probably be:
error_002_clear_all_enwiki={{Clear}} END
error_002_clear_left_enwiki={{Clear|left}} END
error_002_clear_right_enwiki={{Clear|right}} END
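As an illustration of how such replacements could be applied, the sketch below maps the clear attribute to the templates in the configuration above. Treating "both" the same as "all" is an assumption, and the regex is stricter than the /<\s*br\s*clear/ detection quoted earlier; none of this is WPCleaner's or Checkwiki's actual code.

```python
import re

BR_CLEAR = re.compile(
    r"<\s*br\s*clear\s*=\s*[\"']?(all|both|left|right)[\"']?\s*/?\s*>",
    re.IGNORECASE,
)

# Assumed mapping, following the enwiki configuration suggested above.
REPLACEMENTS = {
    "all": "{{Clear}}",
    "both": "{{Clear}}",
    "left": "{{Clear|left}}",
    "right": "{{Clear|right}}",
}


def replace_br_clear(wikitext):
    """Replace <br clear=...> with the matching {{Clear}} template."""
    return BR_CLEAR.sub(lambda m: REPLACEMENTS[m.group(1).lower()], wikitext)
```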

New error: 102 - Check PMID

Resolved

@Magioladitis, NicoV, Meno25, Josve05a, Matěj Suchánek, ToSter: Error 102 will check for proper syntax of PMID. This is similar to error #69. See WP:PMID for more info. It "should" start checking in the next run. Bgwhite (talk) 20:26, 2 February 2015 (UTC)

Added to WPCleaner in the next release. --NicoV (Talk on frwiki) 18:24, 23 February 2015 (UTC)

Summary of Changes made recently

Summary of Changes made recently:

  • Error 69: Now finds cases of ISBN in a wikilink ( [[ISBN]] 978-12345-6789-0) and # symbol (ISBN #978-12345-6789-0)
  • Error 2: Checks for <center/>, <small/> and <br clear
  • Error 85: Checks for <center></center> and <gallery></gallery>
  • Error 34: Catches more cases. See Instances of 'subst:' in articles

Bgwhite (talk) 06:01, 3 February 2015 (UTC)

Bgwhite, I don't understand the rationale of grouping detection for center and small with #2. The br tag is a special tag in HTML5 (not necessarily XML compliant now), while center and small tags are more conventional tags (XML compliant). Wouldn't it be better to put them in a separate detection? --NicoV (Talk on frwiki) 10:49, 16 February 2015 (UTC)
NicoV #2 is looking for bad or malformed tags. br and small are both elements, one just has a void end tag... just like hr, img, source, meta and a host of other tags. Wikipedia is no longer XML compliant, nor does it try to be. I really don't want to go into the intricacies of HTML tags... just what is good or bad. Bgwhite (talk) 06:23, 18 February 2015 (UTC)
I understand, it's just that </br> is invalid while </center> is not... ;-) --NicoV (Talk on frwiki) 16:19, 24 February 2015 (UTC)

New error: unneeded headline

Not done

Hi! I was wondering whether a headline as the first stuff of an article (excluding templates) is an error which could be also checked by CheckWiki. See example (cswiki). Matěj Suchánek (talk) 18:38, 5 February 2015 (UTC)

Matěj Unfortunately, it would create too many false positives. There are many articles that do not have a lede. Articles with tables are the most common culprits. In those cases, a headline can't be removed and the article would be a false positive. An example is List of Liberty Bowl broadcasters. Bgwhite (talk) 07:16, 6 February 2015 (UTC)
I also think there would be too many false positives. Maybe only if the headline is the same as the article title (to catch mistakes made by newbies who add a title)? --NicoV (Talk on frwiki) 20:06, 6 February 2015 (UTC)
I support at least Nico's idea. Matěj Suchánek (talk) 09:43, 7 February 2015 (UTC)
English dump is about ready to be processed. I'll add something to checkwiki to check whether the first headline is the same as the article's title. After looking at the results, I will decide what to do. Bgwhite (talk) 10:20, 7 February 2015 (UTC)
NicoV Matěj Magioladitis. The following articles were found in the first 1% of the dump that had the first headline the same as the article's title
IMO, La Espero, List of Pokémon, Philosophy of education, Quantum information, Religion and mythology and Hake.
I'm not too sure about this. IMO looks to be a valid case for removal, but the rest don't. AWB will remove the first section headings in all these articles. Bgwhite (talk) 20:53, 11 February 2015 (UTC)
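The check tried on the dump (first headline equal to the article title) is easy to sketch. The version below is a guess at the idea, not Bgwhite's code; comparing case-insensitively is an assumption.

```python
import re

# A headline line such as "== Title ==", with matching levels on both sides.
HEADLINE = re.compile(r"^(={1,6})\s*(.*?)\s*\1\s*$", re.MULTILINE)


def first_headline_matches_title(title, wikitext):
    """True if the first headline in the text repeats the article title."""
    match = HEADLINE.search(wikitext)
    if match is None:
        return False
    return match.group(2).strip().casefold() == title.strip().casefold()
```

As the results listed above suggest, a match is not enough to say that the headline should be removed.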

AWB already fixes things like these. -- Magioladitis (talk) 10:37, 7 February 2015 (UTC)

We better avoid adding this error. -- Magioladitis (talk) 21:44, 11 February 2015 (UTC)

Many false positives. Big troubles. -- Magioladitis (talk) 11:25, 22 February 2015 (UTC)

Unnecessary pipe template

Resolved

could be interesting to find these, basically, a {{!}} inside of [[ ]]. Frietjes (talk) 17:59, 11 February 2015 (UTC)

Frietjes Magioladitis A database scan can be found at User:Bgwhite/Sandbox. Magioladitis could do an AWB bot run. F&R could be something like \[\[(.*)\{\{\!\}\}(.*)\]\] -> [[$1|$2]] This did catch some false positives, so the F&R needs to be refined. See "Weird Al" Yankovic discography.
What does the rest of the motley crew think about adding this? @Matěj Suchánek, NicoV, Meno25, Josve05a, ToSter: Bgwhite (talk) 22:43, 11 February 2015 (UTC)

I agree. -- Magioladitis (talk) 22:53, 11 February 2015 (UTC)

Or we could fix the "Cite" thing in the toolbar, that keeps adding it everywhere, except on enwp....
  • enwp = <ref>{{cite web|title=foo|url=foo|website=[[Foo|test]]}}</ref>
  • svwp = <ref>{{webbref|titel=foo|url=foo|websida=[[Foo{{!}}test]]}}</ref>

(tJosve05a (c) 06:20, 12 February 2015 (UTC)

I also agree. Thank you for notifying me. --Meno25 (talk) 07:42, 12 February 2015 (UTC)
Yes, another nice case how to clean Wikipedia. Matěj Suchánek (talk) 14:34, 12 February 2015 (UTC)
Bgwhite and Magioladitis, the replacement should be \[\[([^\[\]]*)\{\{!\}\}([^\[\]]*)\]\] -> [[$1|$2]] which wouldn't do anything in your false-positive case. Frietjes (talk) 15:04, 12 February 2015 (UTC)
Magioladitis I've updated lists at User:Bgwhite/Sandbox. It was generated using Frietjes' regex. Bgwhite (talk) 21:26, 12 February 2015 (UTC)

@Bgwhite, Frietjes: I ran the bot with the regex and I am done. -- Magioladitis (talk) 13:26, 13 February 2015 (UTC)

@Bgwhite, Frietjes: Found exception at Toreador Song. -- Magioladitis (talk) 07:57, 15 February 2015 (UTC)

That would be an exception. However, in that article, the repeat symbol should be removed. A musical note shouldn't be there. Bgwhite (talk) 09:02, 15 February 2015 (UTC)
could update the regexp to \[\[([^\[\]\|]*)\{\{!\}\}([^\[\]]*)\]\] -> [[$1|$2]] which would exclude that case (basically no actual pipe before the escaped pipe). Frietjes (talk) 14:57, 15 February 2015 (UTC)


@Matěj Suchánek, NicoV, Meno25, Josve05a, ToSter, Frietjes: It has been added as error #102. It should show up today in the daily scans. There is a problem with dumps: no dumps are being produced for a lot of languages. Many languages haven't been dumped since December. Bgwhite (talk) 01:03, 20 February 2015 (UTC)

rev 10835 now in AWB general fixes. -- Magioladitis (talk) 10:03, 20 February 2015 (UTC)

Now this is error #103. -- Magioladitis (talk) 11:25, 22 February 2015 (UTC)

@Bgwhite, Magioladitis:, I think this should be restricted to the target of the link, it should not detect articles where {{!}} is used in the displayed text of the link. For example, fr:Idéal de l'anneau des entiers d'un corps quadratique is detected for [[Valeur absolue|{{!}}''f''{{!}}]] where the goal is to display pipes around "f" (absolute value of "f" in maths). --NicoV (Talk on frwiki) 14:37, 23 February 2015 (UTC)

Added to WPCleaner. --NicoV (Talk on frwiki) 18:00, 26 February 2015 (UTC)
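The three find-and-replace patterns proposed in this thread can be compared side by side. The sketch below ports them to Python (with "!" left unescaped, which is equivalent); the sample inputs other than NicoV's are made up to show the behaviour and are not taken from the articles mentioned.

```python
import re

# First proposal: greedy, can span several links.
GREEDY = re.compile(r"\[\[(.*)\{\{!\}\}(.*)\]\]")
# Second proposal: no square brackets allowed inside the link.
NO_BRACKETS = re.compile(r"\[\[([^\[\]]*)\{\{!\}\}([^\[\]]*)\]\]")
# Third proposal: additionally, no real pipe before the escaped pipe.
NO_EARLIER_PIPE = re.compile(r"\[\[([^\[\]\|]*)\{\{!\}\}([^\[\]]*)\]\]")


def fix_pipe_template(wikitext, pattern=NO_EARLIER_PIPE):
    """Replace {{!}} inside an internal link with a real pipe."""
    return pattern.sub(r"[[\1|\2]]", wikitext)
```

On NicoV's example, [[Valeur absolue|{{!}}''f''{{!}}]], the third pattern leaves the link alone because a real pipe already precedes the first {{!}}, while the second pattern would rewrite it.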

Add Galician Wikipedia?

Hi, would it be possible to add Galician Wikipedia to this tool? Thanks!, Elisardojm (talk) 09:58, 13 February 2015 (UTC)

Hi Elisardojm. Each wiki requires a configuration page, based on Wikipedia:WikiProject Check Wikipedia/Translation. Could you create a similar page on Galician Wikipedia? --NicoV (Talk on frwiki) 12:39, 13 February 2015 (UTC)
Hi Elisardojm, I saw that you've started creating the configuration. If you're interested, I've modified WPCleaner for glwiki, it can help you check the configuration. --NicoV (Talk on frwiki) 20:35, 21 February 2015 (UTC)
Yes :) NicoV, but I'm translating it too slowly; I intended to announce it here when I had finished :). How can I try WPCleaner? --Elisardojm (talk) 17:00, 22 February 2015 (UTC)
Hi Elisardojm, see Wikipedia:WPCleaner (general info), Wikipedia:WPCleaner/Installation for installation and Wikipedia:WPCleaner/Check wiki for usage with CW: the menus in the Check Wiki window will help you check what you have configured (error labels, error activation, ...). --NicoV (Talk on frwiki) 22:24, 22 February 2015 (UTC)
Thanks for starting the Galician translation :-) I've included the current translation. --NicoV (Talk on frwiki) 17:43, 23 February 2015 (UTC)

Possible false positives in Error #47

I think that the template errors (#47 and the like) ignore the characters between math tags. This is good. I think, though, that the formulas between math tags don't get filtered out if:

  • Math tags are capitalized (like <Math>)
  • The tag has some attributes, like <math display="inline"> in ca:Gas ideal

Could you please check it out? --Joutbis (talk) 10:59, 22 February 2015 (UTC)

Joutbis Anything between comment, math, nowiki, code, pre, source, hiero and score tags get removed before checks take place. Bgwhite (talk) 05:39, 23 February 2015 (UTC)
Yes, that's fine, but I'm afraid that if the format is <Math> or <math display="inline">, then they are not removed, and the brace counter goes wild. Is this possible? Otherwise, I can't see what's wrong in ca:Gas ideal --Joutbis (talk) 19:50, 28 February 2015 (UTC)
Joutbis Ok, two things going on here.
  1. {{equació|1=<math display="block">P = \frac{N \cdot m \cdot \overline{v^2}}{3 \cdot V}</math>|2=3}} is one of the lines causing a #47 error. Checkwiki thinks there is an error because there is only one {{, while there are two }}. Math equations are a common false positive. On enwiki, we have whitelisted multiple articles with the majority being math related.
  2. The code is supposed to remove anything between the math tags, thus the above line shouldn't be causing a #47 error. It does remove cases including <Math> and <math display="inline">. The lower/upper case does not matter and any parameter inside the math tag does not matter. However, in order to speed up the code, I check to see if there is a math tag in the article first. I was not checking cases of <math display>. As the article only contained <math display>, the checkwiki program "saw" no math tags, thus didn't remove anything between the math tags. Therefore, #47 showed up when it shouldn't have.
In theory, there shouldn't be cases of <math display> in any article, only <math alt> and <math style>. This is especially true when used inside the {{equació}} template, as display=inline is redundant and display=block can be handled by the template. I did edit ca:Gas ideal to remove 'display'. I also edited the CheckWiki program to check for more cases of <math, so it won't matter what is inside the math tags when looking for math tags in the article. Bgwhite (talk) 06:43, 3 March 2015 (UTC)
Wow, thanks! --Joutbis (talk) 17:39, 3 March 2015 (UTC)
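The effect described above can be reproduced on the line from ca:Gas ideal. This is a sketch of the idea only (strip math blocks, then count braces), with made-up names; it is not the checkwiki implementation of #47.

```python
import re

# Any <math ...>...</math> block, whatever the case or attributes.
MATH_BLOCK = re.compile(r"<math\b[^<>]*>.*?</math\s*>", re.IGNORECASE | re.DOTALL)


def brace_counts(wikitext, strip_math=True):
    """Return (number of '{{', number of '}}'), optionally after removing math."""
    text = MATH_BLOCK.sub("", wikitext) if strip_math else wikitext
    return text.count("{{"), text.count("}}")
```

On the {{equació}} line quoted above, the unstripped text has one "{{" and two "}}" (the extra one comes from the LaTeX braces), while the stripped text is balanced.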

When to start the check process

Won't fix

I have noticed that the Check Wikipedia process runs the day after the monthly dump is complete. But the dump takes a very long time because of the "All pages with complete page edit history" dumps. Take cawiki, for instance: the dump started on February 12 and the "All pages, current versions only" was done at 0:45 the next day. The whole process, though, didn't end until February 14, 05:21. Check Wikipedia ran on the 16th. This delay means the fixes made during those four days will get flagged again. Could you run your process when the "current version" dump is done, without waiting for the historic one?

--Joutbis (talk) 11:11, 22 February 2015 (UTC)

Joutbis The dump files are not available at WMFLabs until the entire dump gets finished. I have complained about this to various people and been told to stop complaining. This is assuming that dumps are actually being made (they have been very spotty recently) and WMFLabs dump directory is actually working (past year only 50% of the time). I no longer get any responses from anybody about dumps not working or directories not working. Essentially, nobody at WMFLabs or WMF cares.
I do copy enwiki's (15 day gap) and frwiki's (10 day gap) dump files, but leave the rest to when the dump actually finishes. Bgwhite (talk) 06:02, 23 February 2015 (UTC)
Joutbis Given how the dumps are managed by WMFLabs, it's not possible to run the check just after the current version dump is done. After a full scan, you can use WPCleaner to check all articles and mark as done the ones where the problem is already fixed: Bot tools, select all the error numbers below 500 and click on "Mark errors already fixed". It's not very fast, but it's fully automatic and won't do any modification on Wikipedia, I use it from time to time on frwiki. --NicoV (Talk on frwiki) 10:13, 26 February 2015 (UTC)

OK, thanks --Joutbis (talk) 19:50, 28 February 2015 (UTC)

False positive in Error #85

In ca:Brainfuck, there are <code> tags between <center> tags. However, the tool is flagging it as if it were empty.

--Joutbis (talk) 11:20, 22 February 2015 (UTC)

Same kind of false positive for frwiki: fr:Messiah with <score> tags between <center> tags, and fr:Tiret with <code> tags between <center> tags. --NicoV (Talk on frwiki) 22:27, 22 February 2015 (UTC)
Joutbis NicoV See discussion two above this one... Anything between comment, math, nowiki, code, pre, source, hiero and score tags gets removed before checks take place.
All right, will do.--Joutbis (talk) 19:50, 28 February 2015 (UTC)
In ca:Brainfuck's case, <center> tags are not to be used like that in tables. This is a case of doing center properly. I did edit Brainfuck to do tables properly. fr:Tiret has the same problem, well actually, it is full of fail (scope="col" is redundant, <font> and <tt> are obsolete).
In the case of fr:Messiah, if it was on enwiki, I'd use the {{center}} template. That does the proper thing anyway instead of using the obsolete <center> tag. Bgwhite (talk) 06:38, 23 February 2015 (UTC)
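The mechanism behind these reports can be shown in a few lines: once the content of code or score tags is removed, a center element that contained only such a tag looks empty. The sketch below assumes a simple regex-based stripping step and an "empty center" pattern; both are illustrative, not the checkwiki code.

```python
import re

# Tags whose content the thread says is removed before checks run.
STRIPPED_TAGS = ("math", "nowiki", "code", "pre", "source", "hiero", "score")
EMPTY_CENTER = re.compile(r"<center\s*>\s*</center\s*>", re.IGNORECASE)


def strip_protected(text):
    """Remove comments and the content of the tags listed above."""
    text = re.sub(r"<!--.*?-->", "", text, flags=re.DOTALL)
    for tag in STRIPPED_TAGS:
        pattern = rf"<{tag}\b[^<>]*>.*?</{tag}\s*>"
        text = re.sub(pattern, "", text, flags=re.IGNORECASE | re.DOTALL)
    return text


def has_empty_center(text, strip_first=True):
    """True if a <center></center> pair with nothing inside is found."""
    candidate = strip_protected(text) if strip_first else text
    return EMPTY_CENTER.search(candidate) is not None
```

With strip_first=True, a center element holding only a code tag is reported as empty; checking the original text instead does not report it.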

Typo

Resolved

The main page currently reads " ... facilitates the correction of detected by the program ... ".

This seems to be a typo? Trafford09 (talk) 00:37, 23 February 2015 (UTC)

Thanks Trafford09, I've added "problems". --NicoV (Talk on frwiki) 10:37, 26 February 2015 (UTC)

Whitelist (sv.wikipedia.org) for error #34

Can I get instances such as {{#expr:{{Stat/Finland/Kommuner/Befolkning|Föglö}}/{{Stat/Finland/Kommuner/Areal land|Föglö}} round 2}} whitelisted on svwp? This is used to automatically update population numbers in articles such as sv:Föglö. I don't know how to do it, or what to do... (tJosve05a (c) 18:57, 23 February 2015 (UTC)

Josve05a This can be handled two ways, and it depends on how many articles you are talking about. If it is not a lot, then add the articles to a whitelist. If there are a lot, then I can add it to the code.
If you are using a whitelist, look at Wikipedia:WikiProject Check Wikipedia/Translation and see how it is done for enwiki (search for "whitelist"). #34 on enwiki does have a whitelist. Frwiki also has whitelists set up. Bgwhite (talk) 19:26, 23 February 2015 (UTC)
@Bgwhite: This should be used a lot, at least for all populated places in Sweden with population numbers at Statistiska centralbyrån, since a bot updates those automatically. Not all articles are using this system yet, but more and more are. (tJosve05a (c) 19:29, 23 February 2015 (UTC)
I've also seen constructions like that on frwiki, but I don't like having calculations in articles. Another solution would be to use a template to do the computation instead of putting the #expr directly in the article. --NicoV (Talk on frwiki) 19:47, 23 February 2015 (UTC)

False positive on error #37

de:Bělá (Divoká Orlice) is stated as not bearing a sort key, but in fact this key is given with the parameter SORTNAME in template de:Vorlage:Infobox Fluss. Adding a defaultsort parameter to the article itself results in a warning message that the previous sort key has been overwritten. So the sort key seems to be valid. I don’t know if there are other templates affected which are listed in de:Kategorie:Vorlage:mit Kategorisierung. --Hadibe (talk) 17:29, 28 February 2015 (UTC)

@Magioladitis: Hadibe, grrrr.... sort values shouldn't be in Infoboxes. Magioladitis is the one to ask about this. Bgwhite (talk) 05:35, 3 March 2015 (UTC)

CHECKWIKI #81

I know that #81 was turned off on enwp due to the vast amount of these errors, but is it possible to turn it on, even if only for one database scan or something, for me? (tJosve05a (c) 04:55, 3 March 2015 (UTC)

Josve05a Yup, I can run it. The next enwiki dump should be out by the end of the week. I'll run it, which is when I run the regular dump scan. The big problem will be me remembering. Bgwhite (talk) 05:32, 3 March 2015 (UTC)
:-D (tJosve05a (c) 05:34, 3 March 2015 (UTC)
Josve05a The list is at User:Bgwhite/Sandbox. It only contains the first 49,000 articles. The entire list (89,000) was too big to save. Bgwhite (talk) 17:13, 18 March 2015 (UTC)

List with empty title

Hi Bgwhite, at least on frwiki list for error #25, there was only one line, and it contains an empty title and time found 0000-00-00 00:00:00. The "Done" button does nothing. The "Set all articles as done" works, and the empty title appears now in the list of done articles. --NicoV (Talk on frwiki) 19:41, 10 March 2015 (UTC)

Same for error 59, but I left it as it is. --NicoV (Talk on frwiki) 19:43, 10 March 2015 (UTC)
Same for error 85. --NicoV (Talk on frwiki) 20:10, 10 March 2015 (UTC)

Template calls with duplicate arguments

Resolved

Hi, I've added #524 to WPCleaner to detect template calls with duplicate arguments (several arguments with the same name). On frwiki, this makes it possible to work on the pages in fr:Catégorie:Page utilisant des arguments dupliqués dans les appels de modèle. --NicoV (Talk on frwiki) 23:07, 10 March 2015 (UTC)

Equivalent category on enwiki is Category:Pages using duplicate arguments in template calls, currently 44k pages in it if someone is looking for work ;-) --NicoV (Talk on frwiki) 10:23, 11 March 2015 (UTC)
NicoV, the person who would enjoy this is Frietjes. The category is already one that she watches. I wonder if there are many cases where the values of the duplicate arguments are identical? If so, WPCleaner could fix this automatically in bot mode. Bgwhite (talk) 05:14, 16 March 2015 (UTC)
Done, when values are identical WPCleaner will remove the first argument automatically. --NicoV (Talk on frwiki) 17:31, 16 March 2015 (UTC)
I wonder also if I should remove the first argument automatically when it's empty... --NicoV (Talk on frwiki) 18:04, 17 March 2015 (UTC)
NicoV, on this wiki, we have Wikipedia:Bots/Requests for approval/SporkBot 5 which does both automatically, with some exceptions (see discussion concerning scores and other parameters with numeric suffixes). Frietjes (talk) 18:10, 17 March 2015 (UTC)
Frietjes, thanks for the link, very interesting :-) I modified WPCleaner to do automatic replacements only if the argument name doesn't end with a digit (should cover the exceptions), both for equal arguments or first argument empty. --NicoV (Talk on frwiki) 21:10, 17 March 2015 (UTC)

Update:

  • Frietjes Automatic replacement is also prevented if the argument name contains a digit (even if it's in the middle of the argument name) to avoid this (correct modification)
  • I'm starting to make errors > 500 behave more like regular CW errors. For example, defining the parameter category for #524 allows the bot tools to work on the list of pages for #524 as if it was available on labs like regular errors. I'm going to extend this behavior to other errors and also outside of the bot tools.

--NicoV (Talk on frwiki) 08:45, 26 March 2015 (UTC)
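The rule discussed in this thread can be summed up in a few lines. The following Python sketch is only an illustration under stated assumptions, not WPCleaner's actual code (WPCleaner is a separate tool with its own implementation): the function name is hypothetical, and trimming whitespace before comparing values is my assumption.

```python
import re


def can_remove_first(name: str, first_value: str, second_value: str) -> bool:
    """Decide whether the first of two duplicate arguments may be dropped automatically."""
    # Names containing a digit anywhere (score1, team2goals, ...) are left for manual review.
    if re.search(r"\d", name):
        return False
    first = first_value.strip()    # assumption: surrounding whitespace is not significant
    second = second_value.strip()
    # Automatic removal only when both values are equal or the first one is empty.
    return first == second or first == ""
```

For example, a duplicated `title` argument with an empty first value qualifies, while a duplicated `score1` never does, whatever its values are.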

That's great! Would it be a problem to make it easier by checking Special:TrackingCategories (in this case MediaWiki:Duplicate-args-category)? Matěj Suchánek (talk) 17:30, 26 March 2015 (UTC)
Thanks Matěj Suchánek, I will try to use it. --NicoV (Talk on frwiki) 06:35, 27 March 2015 (UTC)
Matěj Suchánek, WPCleaner is now checking MediaWiki:Duplicate-args-category if no configuration is provided. --NicoV (Talk on frwiki) 00:37, 28 March 2015 (UTC)

False positive for #31

Hi Bgwhite, on frwiki there are several false positives with things like <trl>, <trois>, <trk>, <transformers, <transmission, <traduction, <track>, ... Would it be possible to limit the detection? For example, detect only <tr when followed by a space, a "/", a ">", ... but not by a letter? --NicoV (Talk on frwiki) 09:12, 11 March 2015 (UTC)

NicoV, I'll take a look, but it will be a couple of weeks till I can get to it. Bgwhite (talk) 05:10, 16 March 2015 (UTC)
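One way to express the restriction suggested above is a negative lookahead after `<tr`. This is a minimal Python sketch of the idea, not the Checkwiki implementation; the constant and function names are hypothetical, and rejecting any word character (letters, digits, underscore) is a slightly broader assumption than "not a letter".

```python
import re

# "<tr" not immediately followed by a word character, so <tr>, <tr/> and
# <tr style="..."> match while <trois> or <track> do not.
TR_TAG = re.compile(r"<tr(?!\w)", re.IGNORECASE)


def contains_tr_tag(text: str) -> bool:
    """Return True when the text contains something that looks like a real <tr tag."""
    return TR_TAG.search(text) is not None
```

The lookahead consumes nothing, so the character after `<tr` can still be a space, a slash, a closing bracket or the end of the text.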

False positives for error #103

Hi, I think that the script is not doing what user NicoV requested:

it should not detect articles where {{!}} is used in the displayed text of the link.

There are a few examples in cawiki: many train stations, like ca:Estació de Bogatell or ca:Llista de cançons del DJ Hero 2 (this one took several tries to fix error #32, and now it's back!).

I'm not sure if this would cover all the false positives, but I think that, if a | is already there, it should allow several {{!}}'s.

--Joutbis (talk) 09:35, 13 March 2015 (UTC)

Joutbis Yes, those are false positives. The DJ Hero 2 article contains M|A|R|R|S, which is also in several English articles too. I've seen errors that also had | inside a wikilink. I'd say add it to the whitelist for now and I'll take a look at it. I've been gone for the best part of 2 weeks, so I need to catch up on things first before diving into the code. Bgwhite (talk) 05:07, 16 March 2015 (UTC)
Bgwhite I fixed M|A|R|S using {{pipe}}. -- Magioladitis (talk) 20:08, 16 March 2015 (UTC)
Good idea, thanks! --Joutbis (talk) 19:54, 17 March 2015 (UTC)

Joutbis I created the template in Catalan Wikipedia! -- Magioladitis (talk) 20:10, 17 March 2015 (UTC)
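The behaviour requested in this thread (accept {{!}} in the displayed text once a literal | is already present) could look roughly like the Python sketch below. It is an assumption about how such a filter might be written, not the Checkwiki code; the names are hypothetical, and the simple regular expression does not handle nested links or templates that contain "]]".

```python
import re

# Non-greedy match of the text between [[ and ]].
WIKILINK = re.compile(r"\[\[(.*?)\]\]", re.DOTALL)


def links_to_flag(text: str) -> list:
    """Return wikilinks where {{!}} appears before any literal pipe."""
    flagged = []
    for match in WIKILINK.finditer(text):
        # Everything before the first literal "|" is treated as the link target.
        target, _pipe, _label = match.group(1).partition("|")
        if "{{!}}" in target:
            flagged.append(match.group(0))
    return flagged
```

A link such as `[[Band|M{{!}}A{{!}}R]]` is left alone because the {{!}} occurrences sit in the label, while `[[Foo{{!}}bar]]` is still reported.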

Template programming element

I don't understand why this is classified as an error – I have never seen any rule that parser functions are restricted to templates, not, for instance, in Help:Magic words. Is that (another) unwritten Law of Wiki? --Unbuttered parsnip (talk) mytime= Mon 08:56, wikitime= 00:56, 16 March 2015 (UTC)

Unbuttered Parsnip, an example would be good. Bgwhite (talk) 04:59, 16 March 2015 (UTC)

Extending #85 for empty unnamed ref tags?

Done

Hi, I wonder if we should extend #85 to look also for empty unnamed ref tags (like in this buggy VE edit). --NicoV (Talk on frwiki) 10:13, 19 March 2015 (UTC)

NicoV, a listing for frwiki can be found at User:Bgwhite/Sandbox2. It was generated with an AWB database scanner search, so these contain cases where the empty ref tags could be in nowiki tags. I'm currently doing a checkwiki scan for enwiki. If everything looks ok, will add it. Bgwhite (talk) 20:22, 20 March 2015 (UTC)
For some reason there are always several examples (without nowiki) on the Category:Pages with incorrect ref formatting. I guess there is a problem with the placement of the option on the wiki-insert panel. Doesn't really explain why people commit a change like it. Unbuttered parsnip (talk) mytime= Sat 05:35, wikitime= 21:35, 20 March 2015 (UTC)
Thank you Unbuttered Parsnip for telling me about the category. I got some test cases from the category. I was wondering why CheckWiki couldn't find any examples because people are fixing them in that category.
NicoV It has been added. Bgwhite (talk) 18:11, 21 March 2015 (UTC)
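For anyone wanting to search a page by hand, a rough Python sketch of such a check follows. It is not the code that was added to Checkwiki; the names are hypothetical, and removing `<nowiki>` blocks first is an assumption based on the caveat about the AWB listing above.

```python
import re

NOWIKI = re.compile(r"<nowiki\b[^>]*>.*?</nowiki\s*>", re.DOTALL | re.IGNORECASE)
# An opening <ref> without attributes, optional whitespace, then the closing tag.
EMPTY_REF = re.compile(r"<ref\s*>\s*</ref\s*>", re.IGNORECASE)


def count_empty_unnamed_refs(text: str) -> int:
    """Count empty, unnamed <ref></ref> pairs outside <nowiki> blocks."""
    return len(EMPTY_REF.findall(NOWIKI.sub("", text)))
```

Named tags such as `<ref name="x"></ref>` are deliberately not matched, since this thread is only about unnamed ones.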
"people fixing them" = me mostly! -- Unbuttered parsnip (talk) mytime= Sun 05:22, wikitime= 21:22, 21 March 2015 (UTC)

Errors #51 and #53

  • Error #51 Interwiki before last heading
  • Error #53 Interwiki before last category

Today, I first corrected all of the #53 errors. Then, I went to correct the #51 errors. To my initial surprise, I saw some of the same errors that I had corrected for #53 listed under #51 errors.

On further thought, I question why the Checkwiki software scans for two different "misplaced interwiki" errors. If an interwiki link does not appear after the last category, it's an error. No need to distinguish between "before last heading" and "before last category".

Knife-in-the-drawer (talk) 04:09, 26 March 2015 (UTC)

Simple, see {{Uncategorized}} Jerodlycett (talk) 13:32, 30 March 2015 (UTC)

Error n°54 false positive

Yuri (genre) is a false positive. The break is in a reference. Jerodlycett (talk) 13:36, 30 March 2015 (UTC)

List of errors > #500

As you probably know, WPCleaner can detect some errors that are not listed by Check Wiki, using error numbers > #500, without any link to a list of pages to fix.

I've modified WPCleaner to be able to manage a list for some of these errors:

If you know some way of getting a list of pages for other errors > #500, I can add it to WPCleaner. --NicoV (Talk on frwiki) 19:22, 30 March 2015 (UTC)

The abuse filter extraneous markup is one. Checkwiki will catch some of these, such as this, but not others. Bgwhite (talk) 07:05, 31 March 2015 (UTC)

Error n° 95 false positive

Robert Clark Young has a user link that may be a false positive. Jerodlycett (talk) 20:59, 31 March 2015 (UTC)