Wikipedia talk:WikiProject Check Wikipedia/Archive 1

Small style element

What are the options or recommendations for replacing the <small> style element (see Wikipedia:WikiProject Check Wikipedia#HTML_text_style_element_.3Csmall.3E)? In particular, is there an easy solution for Wikipedia:Taxobox usage#Synonyms? (Personally, I would prefer not to write the authority in small letters at all.) --Snek01 (talk) 12:46, 19 December 2008 (UTC)

I'm not sure I understand the objection to the usage of the small tags. Is it just that the html is superfluous or difficult to edit for noobs? The reason for the usage, often, is that authorities can sometimes be quite lengthy and instead of wrapping the text in the taxobox, nbsp and the small tags are used to keep everything on one line. See Utricularia stellaris for example. I've used this html tag throughout my edits in the Utricularia species. I also find that it separates the authority from the species visually; many people aren't used to seeing authorities, so with a link and a different text style, it gives them an idea that it means something different from a species or variety name. I'd be open to suggested changes that would maintain some sort of font difference, though. --Rkitko (talk) 13:11, 19 December 2008 (UTC)
Yes, the authority can still be written in smaller letters (as long as the Taxobox documentation recommends it). Authorities can always be easily distinguished, because the scientific name is in italics. As for length: there is no reason to change the content for a better view; rather, the view should accommodate its content. (Sorry for my English, I hope I used appropriate words.) --Snek01 (talk) 17:07, 19 December 2008 (UTC)

This project, WikiProject Check Wikipedia, recommends not using the style element <small>. So the question is whether we really should replace the style element <small> with something else (which can have the same appearance), and what the possibilities are. Maybe there is no reason to change it, or maybe a better option does not exist. I do not know. --Snek01 (talk) 17:07, 19 December 2008 (UTC)

<span style="font-size:80%">make me small</span> maybe? --213.168.121.43 (talk) 02:45, 21 December 2008 (UTC)
Normally, we don't need <small>. In Wikipedia we only write text; the stylesheet (CSS) turns that text into the right output. I hope we can eliminate all <small> tags from the text and put all formatting in the stylesheet. In XHTML, small is not allowed; there you can only use <span style="font-size:80%">make me small</span>. -- sk (talk) 16:04, 21 December 2008 (UTC)
Hello Stefan. In XHTML a <small/> element is allowed. Cascading Style Sheets (CSS) are used only to describe the presentation (the look and formatting) of a document. They do not describe the "meaning" of elements (that is, they are non-semantic markup). If you replace, e.g., <small>testo</small> with <span style="font-size:90%;">testo</span>, web-based screen readers will fail to interpret the correct meaning of the element's content. You can also use CSS to describe the look of the <small/> element (instead of a <span/> element). --DaBler (talk) 15:10, 28 February 2009 (UTC)
The "Small" element is completely valid in some HTML/XHTML; however, wikitext should be used on Wikipedia whenever possible... after all, we use "==" instead of "<h2></h2>", so we should also use {{small}} to make text smaller. -Drilnoth (talk) 16:37, 28 February 2009 (UTC)
Hello Drilnoth. A <small> element is valid in all versions of HTML 4 ([1]), XHTML 1.0 ([2]), and XHTML 1.1 (Presentation Module, [3]). Wikipedia (MediaWiki) declares the XHTML 1.0 Transitional document type. However, the syntax == causes the creation of an <h2> element (which is semantically correct markup). The template {{small}} generates a <span> element, not a <small> element. A <span> element is a generic language/style container. Using a <span> element instead of <small> is valid; however, the semantic interpretation is incorrect. --DaBler (talk) 13:01, 2 March 2009 (UTC)
I was under the impression that, generally, WikiSyntax (including templates which simply add CSS style ids and classes to the page) was preferred over actual HTML and XHTML. However, I don't really care one way or the other (I, personally, have started using {{small}} because it's quicker to type than both the start and end tags), and I think that you should probably talk to the creator of the script if you think it should be removed. I don't really worry about changing that in articles unless I'm editing around it anyway, but Stefan might have some reasons that I have either overlooked or forgotten. -Drilnoth (talk) 03:18, 3 March 2009 (UTC)
There is already a template available at Template:Small. --Snek01 (talk) 22:19, 1 January 2009 (UTC)
not correct                                         | correct
<b>testo</b>                                        | '''testo'''
<i>testo</i>                                        | ''testo''
<u>testo</u>                                        | <span style="text-decoration:underline;">testo</span>
<s>testo</s>                                        | <span style="text-decoration:line-through;">testo</span>
<strike>testo</strike>                              | <span style="text-decoration:line-through;">testo</span>
<tt>testo</tt>                                      | <span style="font-family:monospace;">testo</span>
<big>testo</big>                                    | <span style="font-size:120%;">testo</span>
<b><big>testo</big></b>                             | <span style="font-size:larger; font-weight:bold;">testo</span>
<small>testo</small>                                | <span style="font-size:90%;">testo</span>
<p>                                                 | <p></p>
<br>                                                | <br /> or <br></br>
<center>testo</center>                              | <p style="text-align:center;">testo</p>
<p align="center">testo</p>                         | <p style="text-align:center;">testo</p>
<font color="#224466">testo</font>                  | <span style="color:#224466;">testo</span>
<font style="text-decoration:overline">testo</font> | <span style="text-decoration:overline;">testo</span>
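For anyone who wants to try the mapping above on a piece of wikitext, here is a minimal Python sketch. It is not the Check Wikipedia script; the function name replace_style_tags is made up for illustration, it only handles simple tags without attributes, and whether <small> should be replaced at all is disputed in the discussion above.

```python
import re

# Replacements copied from the "not correct / correct" table above.
# Only attribute-free tags are handled; <font>, <center> and <p align>
# are left out to keep the sketch short.
REPLACEMENTS = {
    "b": ("'''", "'''"),
    "i": ("''", "''"),
    "u": ('<span style="text-decoration:underline;">', "</span>"),
    "s": ('<span style="text-decoration:line-through;">', "</span>"),
    "strike": ('<span style="text-decoration:line-through;">', "</span>"),
    "tt": ('<span style="font-family:monospace;">', "</span>"),
    "big": ('<span style="font-size:120%;">', "</span>"),
    "small": ('<span style="font-size:90%;">', "</span>"),
}


def replace_style_tags(wikitext: str) -> str:
    """Rewrite simple presentational HTML tags as wikitext or styled spans."""
    for tag, (opening, closing) in REPLACEMENTS.items():
        pattern = re.compile(rf"<{tag}>(.*?)</{tag}>", re.IGNORECASE | re.DOTALL)
        # A function is used as the replacement so that backslashes in the
        # article text are never interpreted as regex escapes.
        wikitext = pattern.sub(
            lambda m, o=opening, c=closing: o + m.group(1) + c, wikitext
        )
    return wikitext


print(replace_style_tags("<b>testo</b> and <small>testo</small>"))
```

Because each tag is matched exactly, <b> does not accidentally match <big> or <br>, and <s> does not match <small> or <strike>.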

Wow

This page is awesome. Great work. :-) --MZMcBride (talk) 05:31, 24 December 2008 (UTC)

Thanks. -- sk (talk) 18:05, 1 January 2009 (UTC)

Parenthesis and punctuation spacing errors, if possible.

I often see errors of a missing space before or after a parenthesis, or after a comma. E.g., "Smith traveled from New York,where he had studied, to North Carolina(accounts of the dates vary)". Can we get a list of those? bd2412 T 20:52, 1 January 2009 (UTC)

I will test this. Thanks for this info. -- sk (talk) 17:10, 2 January 2009 (UTC)
This is not easy: "Antimon(III,V)-oxid", "(Anti-)Atomkraft" and so on. I have tried it with different regular expressions, but I got no good results. -- sk (talk) 21:16, 3 January 2009 (UTC)
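To make the difficulty concrete, here is a small Python sketch of the kind of regular expression one might try. The pattern and the name find_spacing_candidates are assumptions for illustration, not what the script actually uses. It flags a comma or closing parenthesis directly followed by a letter, and an opening parenthesis directly preceded by a letter.

```python
import re

# [^\W\d_] matches any Unicode letter.
CANDIDATE = re.compile(
    r",(?=[^\W\d_])"        # comma directly followed by a letter
    r"|(?<=[^\W\d_])\("     # "(" directly preceded by a letter
    r"|\)(?=[^\W\d_])"      # ")" directly followed by a letter
)


def find_spacing_candidates(text: str) -> list[str]:
    """Return each suspicious character with one character of context."""
    return [
        text[max(0, m.start() - 1): m.end() + 1]
        for m in CANDIDATE.finditer(text)
    ]


sentence = ("Smith traveled from New York,where he had studied, "
            "to North Carolina(accounts of the dates vary)")
print(find_spacing_candidates(sentence))               # the two real errors
print(find_spacing_candidates("Antimon(III,V)-oxid"))  # false positives
print(find_spacing_candidates("(Anti-)Atomkraft"))     # false positive
```

The sentence from the request is caught, but so are the chemical name and the "(Anti-)" prefix, which is the problem described above: a pattern this simple cannot tell a typo from legitimate notation.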

Duration

Duration: 255 minutes 19 secounds

Wait a minute! Secounds? Simply south not SS, sorry 17:06, 5 January 2009 (UTC)

Table not correct end

Great work!

{{End}} and its redirects are used in some cases of incorrect table ends... Therefore these are semi-false positives... Out of laziness I'm just doing a list comparison between the current 200 and what transcludes that (and {{End box}}, which seems to be the most common)... Reedy 23:37, 7 January 2009 (UTC)

I have added "end", "end box" and "End box". Are there more? -- sk (talk) 20:44, 4 February 2009 (UTC)

Article with <ref> and no <references/>

I think there are many false positives for this as some redirect templates are missed on en: {{ref-list}}, {{reflink}} are a couple. Thanks Rjwilmsi 21:11, 14 January 2009 (UTC)

No more ref-list.. Reedy 23:31, 31 January 2009 (UTC)
Done, I have added "reflink" to my script. -- sk (talk) 20:40, 4 February 2009 (UTC)
[4] would be them all Reedy 20:43, 5 February 2009 (UTC)

Table Formatting

A few of the tables have 2 columns, where the 2nd is empty. Would be nice if you could tidy that up

Reedy 22:16, 14 January 2009 (UTC)

People editing the page...

Is it me, or is it just a bit pointless? You're creating nearly 0.5 megabyte revisions to remove a few lines which would be removed anyway at a later update?

Reedy 20:56, 20 January 2009 (UTC)

Good point, I've been editing just to avoid overlap. Any suggestion how to get round that without new revs? Cubathy (talk) 17:01, 17 February 2009 (UTC)
I think that it's kind of silly to remove five or six lines at a time, but if you finish a section or need to take a break and want to mark off the twenty+ that are done, that makes sense to me. -Drilnoth (talk) 02:06, 21 February 2009 (UTC)

Double Categories

What is the policy on double categories like Alpha Centauri, where two items in the category (two HD objects/HIP objects in this case) point to the same page (because it's a binary star system)? Seems to make some sense the way it is, but should we be creating two new pages for this? —Preceding unsigned comment added by Cubathy (talk • contribs) 09:51, 17 February 2009 (UTC)


Mismatched square bracket

Just went through the list for error 10. There were a few articles in the table which did not appear to have issues. For example:

There were about 10 in total. Seems like the script has difficulty with more complex structure (images with double bracket links in the description) and also ignores the links which Wikipedia doesn't process (i.e. the code blocks on the REBOL page). Any chance of getting these issues looked at? Cubathy (talk) 14:55, 18 February 2009 (UTC)

A similar error for the table without end tag error:
In REBOL it is code, so you should use the <source> or <code> tag. -- sk (talk) 21:16, 18 February 2009 (UTC)
In "Black Site" it is the complex image description. But if this occurs only in this article, then it should be changed there. Normally an image and its description stand on a single line. -- sk (talk) 21:20, 18 February 2009 (UTC)
Also in "Wikipedia" it is a complex description with references. Maybe I can fix this in the future. -- sk (talk) 21:21, 18 February 2009 (UTC)
In "Borel functional calculus" my script found "{|". In Wikipedia this is the beginning of a table. If you want to write it in this article, then use the math tag. -- sk (talk) 21:25, 18 February 2009 (UTC)
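As a rough illustration of the two cases discussed here (nested links inside image descriptions, and brackets inside code), the following Python sketch counts "[[" and "]]" pairs after removing regions that Wikipedia does not parse as links. It is only an assumption about how such a check could be written, not the actual script; the list of ignored tags and the function name are hypothetical.

```python
import re

# Regions whose contents should not be treated as wiki links (assumed list).
IGNORED = re.compile(
    r"<(source|code|nowiki|math|pre)\b[^>]*>.*?</\1>",
    re.IGNORECASE | re.DOTALL,
)


def has_unbalanced_link_brackets(wikitext: str) -> bool:
    """Return True if "[[" and "]]" do not pair up, allowing nesting."""
    stripped = IGNORED.sub("", wikitext)
    depth = 0
    i = 0
    while i < len(stripped) - 1:
        pair = stripped[i:i + 2]
        if pair == "[[":
            depth += 1
            i += 2
        elif pair == "]]":
            depth -= 1
            if depth < 0:       # a closing pair with nothing open
                return True
            i += 2
        else:
            i += 1
    return depth != 0


caption = "[[Image:Patnahighcourt.jpg|thumb|Patna high court, [[Patna]]]]"
print(has_unbalanced_link_brackets(caption))              # nested but balanced
print(has_unbalanced_link_brackets("see [[Foo for more"))  # missing "]]"
```

Keeping a nesting depth rather than matching one link at a time is what lets an image caption containing its own link pass the check.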

automated mistakes?

Please check edits like http://en.wikipedia.org/w/index.php?title=2008_South_Ossetia_war&diff=next&oldid=272623918. Instead of correcting something, it removes valid and needed brackets. No idea how this project works, but something is going wrong here. --Xeeron (talk) 16:04, 23 February 2009 (UTC)

This is related to the square bracket errors in the section above. The article contains 'complicated' bracketing which the script reports as an error (but fix is not automated so this one shouldn't have been updated). Cubathy (talk) 16:32, 23 February 2009 (UTC)
Ok, no big deal, but please be more careful when checking (not directed at you Cubathy), at pages with less traffic, such edits might go unreverted for a while. --Xeeron (talk) 16:50, 23 February 2009 (UTC)
Oops! My bad on that one. I guess I just got messed up because there aren't typically references within wikilinks, and I just didn't notice that it was part of an image. I'll be on the lookout for that in the future. Thanks for mentioning it! -Drilnoth (talk) 00:22, 24 February 2009 (UTC)

Gallery Without Description

This is one of the projects, but mw:Help:Images#Gallery of images states that captions are all optional. Why is this project present? Yellowweasel (talk) 00:12, 26 February 2009 (UTC)

I think that the captions are optional, but are highly recommended... it just doesn't specifically say that. Tools such as screen readers need the captions to "read" the images, so it's really an accessibility issue. -Drilnoth (talk) 00:21, 26 February 2009 (UTC)
The problem is that in many cases, captions for each image would be redundant, such as for American Beaver. Yellowweasel (talk) 00:53, 26 February 2009 (UTC)
Ah... good point. In that case I'd recommend posting something at de:Benutzer Diskussion:Stefan Kühn/Check Wikipedia, so that the script designer can take a look. -Drilnoth (talk) 00:55, 26 February 2009 (UTC)
If the description is redundant, then the image is redundant too. I think we can then put these images on Commons and not in the article. In American Beaver these images are redundant:
They add no new information about this animal. I think it is a good idea to describe every other image in this gallery. -- sk (talk) 16:52, 26 February 2009 (UTC)
Good point... when images are that redundant, you generally only need one. -Drilnoth (talk) 16:55, 26 February 2009 (UTC)

Arrows... It's showing an up arrow when the numbers have stayed the same...

As above Reedy 10:22, 8 March 2009 (UTC)

No, it is normal. Yesterday I played with enwiki and crashed the statistics. After this I fixed the list of articles, and today my script found many errors again. -- sk (talk) 17:24, 8 March 2009 (UTC)

Possible AWB Plugin

de:Benutzer_Diskussion:Stefan_Kühn#Re:WikiProject_Check_Wikipedia.E2.80.8E

Asked if there is some way we can get an XML format or similar for AWB to use.. As AWB can easily be used to fix numerous of the errors.. And possibly more automated Reedy 15:25, 8 March 2009 (UTC)

See here. I write you an email. -- sk (talk) 17:04, 8 March 2009 (UTC)

Translation page

Hi, I have now added the translation page in English. So you can write a better description, or activate and deactivate the errors, by yourself. -- sk (talk) 19:06, 16 March 2009 (UTC)

Awesome! Thanks. –Drilnoth (TC) 12:41, 17 March 2009 (UTC)

Reformatting

Okay, so for the past few days there's been a Wikimedia Foundation error whenever I've tried to update the list with the content from the toolserver... apparently because of page size. Therefore, I have split the page into three subpages, with each one transcluded onto the main page, thereby fixing the length issue. Use of the project shouldn't change, but it will take longer to update each time... in fact, I'd recommend just doing it every other day to save the effort. Here's a breakdown of the three pages:

I will continue to try to update this every other day, and will occasionally check to see if the updates can be done normally again. –Drilnoth (TC) 18:03, 20 March 2009 (UTC)

Maybe it is better to reduce the number of articles per error from 100 to 25 or so. I can change this in my script, and then you will not have so many problems. -- sk (talk) 18:06, 20 March 2009 (UTC)
That could work for some of the problems... I think that 25 would be good for all of them other than "Headlines start with three "="", which I try to go through whenever I have the time and I'd probably pass 25 pretty quickly each day, so maybe keep that at 100. Sorry about being WP:BOLD and just doing this... it was just starting to get a little frustrating. However, if you can make that change, it would probably work. Thanks! –Drilnoth (TC) 18:16, 20 March 2009 (UTC)
Please retain 100 errors at least for the highest priority ones -- in general there are much less than that, though. -- Laddo 66.131.214.76 (talk) 23:29, 20 March 2009 (UTC)
I think that the script can be configured for each wiki separately, so the French output would be unchanged. –Drilnoth (TC) 23:41, 20 March 2009 (UTC)
At the moment I am redesigning many things in this script, and I think distinguishing between high and low priority is a good idea. -- sk (talk) 12:32, 21 March 2009 (UTC)
Hmm... seems to work now. Thanks for updating it! –Drilnoth (TC) 21:52, 23 March 2009 (UTC)

Automation

This shouldn't require editing the pages manually. Can't a script be set up to sync the wiki pages with the text files every 12 hours? Seems trivial to do. --MZMcBride (talk) 20:38, 20 March 2009 (UTC)

Good question (and maybe it could be set up to just change every time that the toolserver updates, as the timeframe really varies). –Drilnoth (TC) 20:44, 20 March 2009 (UTC)
Well, if it tries to edit and the content is the same, it just won't save the revision. MediaWiki doesn't allow duplicate revisions to be saved. :-) Would you like me to look into automating this? --MZMcBride (talk) 21:17, 20 March 2009 (UTC)
That would be great, thanks! –Drilnoth (TC) 21:24, 20 March 2009 (UTC)
OMGZ BOT Reedy 01:38, 21 March 2009 (UTC)
...
I guess you mean a bot request is probably the best thing to do? –Drilnoth (TC) 01:44, 21 March 2009 (UTC)
Yeah. It's non-controversial and can be done pretty easily. Reedy 23:36, 21 March 2009 (UTC)
Done. Thanks for the tip! –Drilnoth (TC) 23:41, 21 March 2009 (UTC)

punctuation after references

Can this be done: like cleaning up '<ref>abc</ref>.' to '.<ref>abc</ref>' and similar other mistakes? Thanks.--GDibyendu (talk) 12:44, 21 March 2009 (UTC)

It might make more sense to use Special:Random in conjunction with WP:FORMATTER, which I think fixes punctuation after refs automatically, but this could be a good idea. –Drilnoth (TC) 21:09, 21 March 2009 (UTC)
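As a rough sketch of the clean-up requested above, the following Python function moves a punctuation mark that directly follows one or more references to just before them. This is an illustration only, not how WP:FORMATTER or the Check Wikipedia script works; the name move_punctuation_before_refs and the exact pattern are assumptions.

```python
import re

# One reference: either <ref ...>...</ref> (without another ref inside)
# or a self-closing <ref ... />. "\b" keeps <references/> from matching.
REF = r"<ref\b[^>/]*>(?:(?!<ref\b|</ref>).)*</ref>|<ref\b[^>]*/>"

# One or more adjacent references followed directly by punctuation.
REFS_THEN_PUNCT = re.compile(rf"((?:{REF})+)([.,;:])", re.DOTALL)


def move_punctuation_before_refs(wikitext: str) -> str:
    """Turn 'text<ref>abc</ref>.' into 'text.<ref>abc</ref>'."""
    return REFS_THEN_PUNCT.sub(r"\2\1", wikitext)


print(move_punctuation_before_refs("A claim<ref>abc</ref>. Next sentence."))
print(move_punctuation_before_refs('Two<ref>a</ref><ref name="b" />, then more.'))
```

The inner part of the pattern refuses to run past a closing </ref>, so a reference in the middle of a sentence is not paired with punctuation that belongs to a later reference. A real fix would still need a human check, for example where the punctuation is part of a quotation.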

Dashes

I have often corrected hyphens to dashes in the situations described at WP:DASH, so this got my attention: "en dash or em dash The article had a dash. Write for –; better "–" or —; better "—"." If "The article had a dash" is an error, then the Manual of Style guideline WP:DASH is a bigger error. You then contradict that thought with the ungrammatical "Write for –; better "–" or —; better "—".", which seems to be telling me to use dashes after all, even though the previous sentence considers "The article had a dash" to be an error. So is this consistent with WP:DASH or isn't it? Art LaPella (talk) 04:22, 24 March 2009 (UTC)

Probably it's a case that the original German explanation hasn't been translated correctly – the Check Wikipedia originates from de-wiki. I agree the wording needs to be clarified. Rjwilmsi 08:02, 24 March 2009 (UTC)
My understanding was: &ndash; should be replaced with –, and the &mdash; with —. Other interpretations would not make sense in light of what MoS says. Should be clarified. GregorB (talk) 10:10, 24 March 2009 (UTC)
What GregorB said. I'll see if I can alter the translation page to clarify this. –Drilnoth (TC) 13:22, 24 March 2009 (UTC)

underline

What's the wiki alternative to <u>? It's listed as a syntax error but I don't know of an alternative. OrangeDog (talkedits) 22:49, 25 March 2009 (UTC)

Template:Underline should do the trick. –Drilnoth (TC) 16:22, 26 March 2009 (UTC)

Title in text

This isn't an error. Wiki software will automatically render it as Title. Leaving it as a link is easier for future splits and does no harm. OrangeDog (talkedits) 23:05, 25 March 2009 (UTC)

I think that all of those errors are for when the article links to exactly its own title, not to a redirect to it (although I could be wrong). Having bold text in the article without having a good reason is generally discouraged, as random bolding of words could be distracting. –Drilnoth (TC) 16:23, 26 March 2009 (UTC)
Isn't it in WP:MOS or just recommended way of doing it? Reedy 16:50, 26 March 2009 (UTC)
WP:BOLDTITLE says that the article name in the lead section should be bold, but does not prescribe a "correct" way of doing it. If the article name is "Foo", then in the same article '''Foo''' and [[Foo]] work exactly the same, and that's why some people use the second technique in the article intro. So, if I understand correctly, title link in text is not an error in the introductory sentence, but is an error anywhere else in the text. GregorB (talk) 17:47, 26 March 2009 (UTC)
At least in the first sentence, it seems to me that it has always been '''Title'''. -- User:Docu

I was thinking of substituted nav templates (they do exist) and similar. In the main text it's probably wrong. OrangeDog (talkedits) 22:21, 2 April 2009 (UTC)

I am not able to update the page..

I am getting the following error: Request: POST http://en.wikipedia.org/w/index.php?title=Wikipedia:WikiProject_Check_Wikipedia&action=submit, from 71.231.176.196 via sq16.wikimedia.org (squid/2.7.STABLE6) to 208.80.152.43 (208.80.152.43) Error: ERR_READ_TIMEOUT, errno [No Error] at Sun, 29 Mar 2009 10:01:03 GMT --Anshuk (talk) 10:03, 29 March 2009 (UTC)

Probably related to your connection and the size of the page. Reedy 12:32, 29 March 2009 (UTC)
its the page-size. updated omitting lowest priority --AwOc 12:57, 29 March 2009 (UTC)

maybe there should be a sub-page for each priority. this would also reduce the revision sizes on small edits. updating would then mean to edit four pages, which is quite annoying. maybe a bot could solve this. --AwOc 13:18, 29 March 2009 (UTC)

I am able to update. I only tried to edit a section, rather than the whole page. BTW, in any case, it will be updated in a few hours, so removing entries is probably not important. Also, depending on when data collection for today started, some of the things that you have fixed today may appear again on today's list.--GDibyendu (talk) 13:49, 29 March 2009 (UTC)
The low priority couldn't be added today because of size, either? Weird. It had been working for a while. –Drilnoth (TC) 15:36, 30 March 2009 (UTC)

Errors suitable to be fixed with AWB

The descriptions can be edited at WikiProject Check Wikipedia/Translation.

To edit a description, copy the text from the default description (desc_script) to desc_enwiki, e.g. for error 1:

 error_001_desc_script=This article has no bold title like '''Title'''. END
 error_001_desc_enwiki=This article has no bold title like '''title'''. END
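For tools that want to read these entries, here is a minimal Python sketch of a parser for the two lines shown above. It assumes that every entry has the form error_NNN_desc_WIKI=text END and that the enwiki text, when present, takes precedence over the script default; both are assumptions drawn from the example, and the names parse_descriptions and description_for are made up.

```python
import re

# Assumed entry shape: error_<number>_desc_<wiki>=<text> END
ENTRY = re.compile(r"error_(\d+)_desc_(\w+)=(.*?)\s+END\b", re.DOTALL)

SAMPLE = """
 error_001_desc_script=This article has no bold title like '''Title'''. END
 error_001_desc_enwiki=This article has no bold title like '''title'''. END
"""


def parse_descriptions(text: str) -> dict[tuple[int, str], str]:
    """Map (error number, wiki name) to the description text."""
    return {
        (int(m.group(1)), m.group(2)): m.group(3).strip()
        for m in ENTRY.finditer(text)
    }


def description_for(entries, number: int, wiki: str = "enwiki"):
    """Prefer the wiki-specific text; fall back to the script default."""
    return entries.get((number, wiki), entries.get((number, "script")))


entries = parse_descriptions(SAMPLE)
print(description_for(entries, 1))            # enwiki wording
print(description_for(entries, 1, "dewiki"))  # falls back to script default
```

A description that itself contains the standalone word END would be cut short by this pattern, so a real parser would need to know the page's exact rules.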

I think it would be helpful if any element that can be fixed with AWB be marked as such. -- User:Docu

That would probably be a good idea... I'd add it myself except that I don't know what AWB fixes! Anyone? –Drilnoth (TC) 00:52, 30 March 2009 (UTC)
I'll have a go at this later today. Rjwilmsi 11:05, 30 March 2009 (UTC)
Hmm, I got sidetracked. but will work on this over the next few days. Rjwilmsi 22:37, 30 March 2009 (UTC)

I marked some of them. They seem to be the type of changes after which my usual AWB settings skip saving. -- User:Docu

For error 7, I made a feature request at WT:AutoWikiBrowser/Feature requests#Section header level (WikiProject Check Wikipedia #7). It might as well be fixed in any article. -- User:Docu

A clever trick - or not?

From Bihar:

{{#switch: {{#expr: {{CURRENTHOUR}} mod 1}}
|0 = [[Image:Secretariat Building patna.JPG|left|170px|thumb|Vidhansabha Building, [[Patna]]]] 
|1 = [[Image:Patnahighcourt.jpg|left|180px|thumb|Patna high court, [[Patna]]]]
}}

Opinions? GregorB (talk) 19:20, 30 March 2009 (UTC)

I think that it is an interesting concept, but doesn't really make sense. The article changes based on the time of day? What?! It just doesn't really seem quite right. Something could be brought up at one of the village pumps about this, perhaps a template being created to standardize such pseudo-randomization across a number of articles. –Drilnoth (TC) 21:10, 30 March 2009 (UTC)
A clever trick yes, but per standards, no. If there are multiple relevant images for the article and not enough space to show them all full size then surely a gallery should be used to make them all available to readers. Rjwilmsi 22:36, 30 March 2009 (UTC)
I tend to agree... Although we all accept the notion of article changing through revisions, the idea that an article changes in a somewhat random way, without the underlying content being revised, is a bit unsettling to me. Of course, for the purposes of Check Wikipedia, this should be either done through a template, or not be done at all. GregorB (talk) 11:35, 31 March 2009 (UTC)
I have removed the code and just left the images in the article. –Drilnoth (TC) 12:17, 31 March 2009 (UTC)
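A side note on the arithmetic in the quoted snippet, sketched in Python rather than wikitext: any whole hour taken mod 1 is 0, so as written the switch would always pick branch 0 and the second image would never appear. Rotating between two images would need mod 2. The function pick_image below is hypothetical and only mirrors the #switch logic.

```python
def pick_image(hour: int, images: list[str], modulus: int) -> str:
    """Mirror the #switch: choose the image whose index equals hour mod modulus."""
    return images[hour % modulus]


images = ["Secretariat Building patna.JPG", "Patnahighcourt.jpg"]

# With "mod 1", as in the article, every hour selects the first image.
print({pick_image(hour, images, 1) for hour in range(24)})

# With "mod 2", the two images alternate from hour to hour.
print([pick_image(hour, images, 2) for hour in range(4)])
```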

Nice changes

Nice updates, Stefan! Do you know if it is yet possible to include the "lowest-priority" pages in the list? –Drilnoth (TC) 21:10, 30 March 2009 (UTC)

Double pipe in one link -- sometimes the second pipe enhances formatting

Like in the following examples:

Johnny Valentine [[World Class Championship Wrestling|Southwest Sports, Inc. | NWA Big Time Wrestling]]''
Mick Foley [[Extreme Championship Wrestling|Eastern Championship Wrestling | Extreme Championship
Roddy Piper [[World Wrestling Entertainment|World Wrestling Federation|World Wrestling Entertainment]]''
Scott Levy [[World Wrestling Entertainment|World Wrestling Federation '''|''' World Wrestling

What do you think we should do about this?--Anshuk (talk) 08:26, 31 March 2009 (UTC)

It displays as "Southwest Sports, Inc. | NWA Big Time Wrestling". I'd remove it. Reading "World_Class_Championship_Wrestling#Big_Time_Wrestling:_1966-1981", I think "NWA Big Time Wrestling" should do, but one could replace it with a colon. -- User:Docu

working on bot - question.

I'm working (slowly) on a semi-automated bot to fix some of the simpler problems on this list. I'm curious, though: is there any way to access the full list of found problems (i.e., get around the 'output was limited to 50 articles' issue)? --Ludwigs2 19:05, 1 April 2009 (UTC)

You'd need to ask Stefan... what kinds of problems do you think your bot can fix? A lot of these need human attention. –Drilnoth (TC) 19:06, 1 April 2009 (UTC)
Check http://toolserver.org/~sk/checkwiki/enwiki/
I think AWB can do quite a few of them already, Rjwilmsi might annotate those later (see #Errors_suitable_to_be_fixed_with_AWB). I fixed error 16 by bot and just got criticized at WP:ANI for having done so. -- User:Docu
Yeah, AWB will be able to fix a number of the errors around wikilinks, square brackets etc. I am going to specify in the translation exactly what can be done, and in the longer term work on increasing the range of AWB fixes to handle what's here. There's still going to be plenty of stuff that's manual, like image descriptions, though perhaps AWB could be used to make the process faster. Rjwilmsi 22:07, 1 April 2009 (UTC)
Most template fixes will probably need to be done manually. Anyways, maybe we could just set everything that can be done by AWB to same priority level. Not sure if we could create an additional one (4). Those that are mostly AWB, but need some checks (2) and those that need to be manual to (1). Everything else would be (3). -- User:Docu
oh, there's a lot here that can be done with a manually assisted bot - I was looking particularly at the sections on breaks after list items, regularizing headers and fixing bad <br /> constructions, but with artful use of regex and a little human guidance most of the things here can be streamlined significantly. mostly the bot would automate the boring details. for instance, to regularize headings (now), I need to see what needs to be changed, copy the wikitext into a text editor, run regular expressions or other edits to fix the headings, copy the revised text back into the browser, and save the results. with a semi-auto bot, I could do all of that in one step (just specify the changes and click 'go!'). some things (like list-breaks) can be fully automated, which is why I was wondering about getting larger segments.
with respect to AWB - I'm a mac user, no joy. :-(
well, let me get the thing working, and then I'll talk to Stefan about expanding it. no sense putting the cart before the horse. --Ludwigs2 01:23, 2 April 2009 (UTC)
It's not real grand, but my CodeFixer user script can fix some of the errors automatically. –Drilnoth (TC) 01:31, 2 April 2009 (UTC)
Oh, that's useful. --Ludwigs2 02:25, 2 April 2009 (UTC)
Glad to hear you like it. I'm still working on adding some more things to it, but it does a fair bit now (mainly converting those pesky XML and HTML character encodings to be actual symbols... it's — not &mdash;). Anyway, I hope to add some more things to it soon, but please let me know if you have any ideas. –Drilnoth (TC) 02:35, 2 April 2009 (UTC)
well, the only thing that strikes me immediately is the auto-submit aspect: that's great if you're code-fixing some random page, but not so good if you just want to clean up the code in a section you're working on and check it over. personally, I'd rather take the extra step of submitting it myself. maybe change it so there are options - submit, preview, or do nothing on run. I'll give it some thought for more stuff, though. --Ludwigs2 03:15, 2 April 2009 (UTC)
It just clicks show changes; it doesn't actually save the page until you do so manually. If wanted, though, I could add in some configuration to allow you to choose how it should act (diff, preview, continue edit without looking at changes, or save). –Drilnoth (TC) 13:30, 2 April 2009 (UTC)

What the heck?

Where is everything? Surely there are still errors. –Drilnoth (TC) 01:00, 2 April 2009 (UTC)

sorry, no. all errors on wikipedia have been fixed, both technically and content-wise. in fact, there's really no reason to edit the encyclopedia anymore. :-) --Ludwigs2 02:22, 2 April 2009 (UTC)
Oh. Duh. –Drilnoth (TC) 02:32, 2 April 2009 (UTC)
That's good. All this talk above about bots had made me think "Wow... did they do something already?" –Drilnoth (TC) 02:33, 2 April 2009 (UTC)

:-) it's back. -- User:Docu

So did Stefan do that, or was that one of you? Regardless, excellent work. –Drilnoth (TC) 13:30, 2 April 2009 (UTC)
It was a bug in some other languages and April 1 here ;-) It was from a wiki where they do get to zero. - User:Docu
A wiki where they do get to zero. That would be nice. –Drilnoth (TC) 15:39, 2 April 2009 (UTC)
Sorry, my fault. A wiki with zero errors is pdcwiki. They have only daily errors (today only 2). When will en reach this level? You must work harder and faster! :-) -- sk (talk) 08:25, 3 April 2009 (UTC)
EN's never had 0. :( –Drilnoth (TC) 13:00, 3 April 2009 (UTC)

I'm confused :(

I really haven't been staying fully up to date here but... what's going on? There's only a handful of pages listed as having HTML italics/. "Headlines start with three "="" has only 52 results... come on, there's a lot more than that! (or has someone really been going through them?). Headline hierarchy is down to 14. I know that there have been a lot of changes recently, so I'm just wondering... is this on purpose or is there a bug? If it's on purpose, why? Thanks. –Drilnoth (TC) 17:45, 2 April 2009 (UTC)

Check de:User talk:Stefan_Kühn/Check_Wikipedia#6000_issues_solved??.
Which is why I left yesterday's entries there. I think it gives a good idea how much piles up in one day. -- User:Docu
Okay; thanks for the link. –Drilnoth (TC) 20:30, 2 April 2009 (UTC)


Statistics

I was just wondering how much we covered. Which percentage of the last dump is scanned?

It looks like many of the checks with mid-sized results got down into the 50s range thanks to much work. Others keep increasing even though we also work on these, probably because the underlying sample changes.

Obviously, some of the checks dig up pages with really odd formatting that take quite some time to improve.-- User:Docu

I have no idea... I think that this project is being much more productive than it was, say, a few months ago, thanks to the use of AWB, but with all the new errors there's no easy way to tell what progress is being made. –Drilnoth (TC) 21:05, 6 April 2009 (UTC)
My understanding is that a full English dump was generated March 13 that would have been fully scanned on March 14 (the analysis took a bit longer on that day). For all detection types that existed at the time, their numbers increase only due to: a) new articles - b) new errors introduced by modifications to existing articles - c) improvements to rules of that detection. For all detection types created since the last full scan, the full dump was never analyzed; error counts of those detections increase daily due to a) errors in new articles - b) modified articles that get scanned for the first time with that detection - c) modifications to articles, introducing new occurrences of that error - d) improvements to rules of that detection. Check the "News" section immediately below the summary table to see what detections got enhanced or changed recently. -- Laddo 66.131.214.76 (talk) 03:44, 7 April 2009 (UTC)
Thanks Laddo. What he wrote is correct. I also made a mistake on 1 April/2 April, when many errors were deleted. But at the moment I think there is no reason to make a new scan of the old dump. You have enough errors in enwiki to work on. :-) We will wait for the next dump. -- sk (talk) 06:16, 7 April 2009 (UTC)
The counts on many errors seem to confirm this (e.g. 5, 40, 45, 49, 65, 51, 60, 3, 8, 19, 32, 55, 58, 52, 66 all showing mainly new issues). This is good news.
Others, such as the result for check #30 (#Image without description), seem to increase on a daily basis. It's a check that was already there on March 14. If the above is correct, that means that either it's frequent in new modifications or its detection rules changed much. I concede that I don't necessarily add a description when it could be equal to "general view of pagename". The result for check #7 seems to increase in a similar way.
BTW, I'm not concerned that we don't have enough to work on ;), just wondering about the size of the iceberg -- User:Docu

Summary from user talk: The total for one check (e.g. 2000) is not updated, unless:

  • the dump is completely (re-)scanned
  • the fixed items are within the first fifty being scanned on a daily basis

-- User:Docu

AWB and List of all articles with error X[edit]

Is there any way to have AWB properly load the contents of the pages referenced at "List of all articles with error X"?Naraht (talk) 05:17, 12 April 2009 (UTC)

Try this. Note that items in the lists beyond #50 don't get updated on a daily basis (as per #Statistics). -- User:Docu
That usually works for me. –Drilnoth (TC) 12:58, 12 April 2009 (UTC)
Quick question about AWB, I'm trying to work on the ISBN mistakes, but I can't figure out how to search in the text for the string ISBN. ctrl-F doesn't work and it doesn't seem like the massive Find and Replace concept is the way to go.Naraht (talk) 15:12, 12 April 2009 (UTC)
If you're just searching for the string, isn't there a box in the "start" menu for AWB where you can plug that in? –Drilnoth (TC) 16:07, 12 April 2009 (UTC)
You want the little box, at the bottom, just to the left of the edit window. I continually try to use CTRL-F to no avail, but that works quite well most of the time (press it a few times if it doesn't work straight away). - Jarry1250 (t, c) 16:43, 16 April 2009 (UTC)

Image Description with Small[edit]

Some of these are deliberate, where it isn't the entire description, but rather a part of it that is within angle-small-angle, and Wiki will make that even smaller than the 94% that the rest is in. Is this still an error if Wikipedia is handling it correctly and it is doing what they intended?Naraht (talk) 17:37, 15 April 2009 (UTC)

Well, it makes the text far too small (IMHO) when there are alternatives. - Jarry1250 (t, c) 16:49, 16 April 2009 (UTC)
A difference of opinion over style is not a syntax error. OrangeDog (talkedits) 20:32, 17 April 2009 (UTC)
I'm not sure that the concerns are so much based in style arguments as usability and accessibility ones. - Jarry1250 (t, c) 20:58, 17 April 2009 (UTC)
That still doesn't make them syntax errors, and there's no policy saying you can't use different text sizes in an image caption. Even if there were, people might want to ignore such a rule in special circumstances. OrangeDog (talkedits) 01:40, 18 April 2009 (UTC)
They are welcome to IAR it if they want. Who said it was a syntax error anyway? I certainly didn't. - Jarry1250 (t, c) 10:30, 18 April 2009 (UTC)

Forcing a section update[edit]

I think the Headlines start with three "=" is quite a bit out of date (D6 did a load a fortnight ago). What's the proper method for forcing an updated list to be produced (if possible)? - Jarry1250 (t, c) 12:30, 18 April 2009 (UTC)

To my knowledge there isn't one... you'd need to ask SK. –Drilnoth (TCL) 12:32, 18 April 2009 (UTC)
Sorry for fixing them. ;)
Currently, the full lists are only updated if a new dump is available (#Statistics). If you ask sk, maybe he will slip the 3000 pages of bug #7 into the daily scan for changes.
-- User:Docu
I asked him at de:User talk:Stefan_Kühn/Check_Wikipedia#Check_7_on_en.wp. -- User:Docu
The "List of all articles with error X" is updated daily, but not completely. With every daily scan this list is updated, but the script does not scan all articles in the list each day, only the first articles, until it has found 50 errors. I think the list is OK at the moment. Maybe my script found more than 50 errors in the new articles, so the list will not go down. But I don't think a complete scan of the old dump will help. I will wait for the next new dump. -- sk (talk) 18:59, 18 April 2009 (UTC)
Kein Problem. Now I understand how it works, I won't be surprised in future. - Jarry1250 (t, c) 19:03, 18 April 2009 (UTC)
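
For anyone else puzzled by the totals, here is a minimal sketch of the daily pass as sk describes it above: walk the stored list in order and stop once 50 pages still show the error. The function name, the `has_error` callback and the page titles are made up for illustration; this is not sk's actual script.

```python
from typing import Callable, Iterable, List


def daily_scan(stored_list: Iterable[str],
               has_error: Callable[[str], bool],
               limit: int = 50) -> List[str]:
    """Re-check stored titles in order; stop once `limit` errors are found.

    `has_error` is a hypothetical callback that re-scans one page.
    Titles after the stopping point are never looked at, which is why
    the full list can stay stale until the next complete dump scan.
    """
    still_broken = []
    for title in stored_list:
        if has_error(title):
            still_broken.append(title)
            if len(still_broken) >= limit:
                break
    return still_broken


# Hypothetical data: 200 listed pages, the first 30 already fixed.
stored = [f"Page {i}" for i in range(200)]
fixed = set(stored[:30])
todays_list = daily_scan(stored, lambda title: title not in fixed)
```

With this reading, fixing pages near the top of the list is what moves the daily report forward; fixes further down are only noticed once the scan reaches them.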

<unindent>

The result from the old dump would be the same, no? Anyways, for this check, it might be better if new pages were listed once old results are dealt with, e.g. the version of April 12 lists two pages deleted in the meantime. -- User:Docu

Standard sortkeys[edit]

The reports lists the usual "*" sortkey as error (Wikipedia:WikiProject_Check_Wikipedia#DEFAULTSORT_with_special_letters). As it's the standard way to sort defining articles before others in the category, it shouldn't be included. -- User:Docu

Editnotice[edit]

When editing the project page, it now displays an edit notice. I made an initial version. It can be edited at Editnotice-4-WikiProject Check Wikipedia or by suggesting an update below. -- User:Docu

It's now at Wikipedia:WikiProject Check Wikipedia/notice 1 and another one at Wikipedia:WikiProject Check Wikipedia/notice 2. -- User:Docu

Missing opening or closing brackets, table and template markup[edit]

At Wikipedia:WikiProject Check Wikipedia/AWB, there is a series of samples from checks 46, 10, 28, 47, 43.

AWB could repair two of #46 (Square brackets not correct begin) by removing an additional bracket from an external link [5][6]. All others were done manually. Fixes consisted of removing or adding tags. In most cases this was within the highlighted section. In one case, I had to restore from a previous version.

The general steps seem to be:

  • 1. open the page in edit mode
  • 2. search for the extracted section
  • 3. fix it manually
  • 4. save it
  • 5. go on to the next page.

The question is: which is the best tool to do this? Personally, I haven't managed to do this efficiently in AWB. Possibly it could be modified to do steps 1, 2, 4, 5 more or less automatically. -- User:Docu

Have a go with the SVN snapshot of AWB (link from AWB page) which has more fixing logic I added for template and link brackets etc. Rjwilmsi 11:15, 9 April 2009 (UTC)
I'm testing the SVN version. It seems to be dealing with a few of 46 and 10 only, possibly the same as before (BTW quite scary the new redundant reference removal). Probably it's in the nature of the checks that they can't be fixed easily.
For 47, 43, quite a few of the broken templates seem to be cite templates. This is probably why they are not noticed. Table markup seems to be able to deal with missing closing tags (28). -- User:Docu
Table markup is still problematic without closings, because it can cause some bugs even if they can't readily be seen. And I agree that most of the errors 47 and 43 are citation templates... I cleaned up about 50 a few days ago, and was amazed by just how many had similar errors. –Drilnoth (TC) 13:01, 10 April 2009 (UTC)

A series of new features are being implemented to help with some of these. I'm looking forward to a new build allowing to test them. -- User:Docu

Awesome; can't wait. –Drilnoth (TC) 12:41, 13 April 2009 (UTC)
They are live in build SVN 4218. I went through the list of check #43 (Template not correct end). Cool! -- User:Docu
If you come across any more common bracket errors that AWB could reliably fix automatically, let me know and I can add them to AWB. I haven't looked at any of the table-related errors yet. Rjwilmsi 18:42, 17 April 2009 (UTC)
Looks like you already fixed quite a few cite templates testing the new feature. I did a few of #47 -- not too many at once, it's a bit consuming  ;).
I'm not sure if more fixes could be automated. Besides that, two minor points come to my mind:
  1. Where there are too many closing braces, it might be worth placing the cursor on the unmatched one or to color them.
  2. If the feature is to work mainly for curly brackets, I'd name them "braces" or "curly brackets".
Tables seem easier to do by hand. In general, they can't be overlooked as the broken (cite) templates. -- User:Docu
Are you using 'Options -> highlight first unbalanced bracket if found'? I'm investigating using colour. Rjwilmsi 20:57, 17 April 2009 (UTC)
I do. The other options I currently use are "Apply changes automatically", "On load: show changes". Watching closely, it looks like the edit box is losing focus immediately after the cursor highlights the position. -- User:Docu 21:29, 17 April 2009 (UTC)
Finally I used it on some of checks 10 and 46 (square brackets): it works. I updated the description accordingly. Another point I came across: the "Alert box" doesn't update when one uses "re-parse". If one fixes a set of brackets and wants to make sure that all problems are fixed, one needs to save and reload the page several times. It's rare that there are several, but I didn't manage to get through (the markup in) Tibetan sovereignty debate and its many {{quote}}. -- User:Docu 10:14, 18 April 2009 (UTC)
For #47, one error that might be fixed automatically by AWB would be this one. -- User:Docu
Also, ((cite web -> with curly brackets. I'm liking the improvements, however. - Jarry1250 (t, c) 18:40, 18 April 2009 (UTC)
rev 4229 Both done. Rjwilmsi 21:38, 18 April 2009 (UTC)
Good, I'm thankful for each I don't have to fix myself.
If I remember right, for this one, AWB suggested adding curly brackets instead of removing the ones (SVN 4218).
BTW my edit box still keeps losing focus. -- User:Docu
Yep, no bleedin' idea why that loss of focus is happening. It's on my TODO list but I have higher priorities. I'll have a look at braces behaviour on that revision of Vistula Veneti later. Rjwilmsi 17:53, 19 April 2009 (UTC)
I thought it might have been limited to mine. Anyways, it still works. We did get to the bottom of the 43+47 (some nasty ones remain) and, even better, some of the new entries get fixed directly. -- User:Docu
rev 4233 Resolves Vistula Veneti issue - AWB now just highlights as unbalanced. Rjwilmsi 22:11, 19 April 2009 (UTC)
At Sri Lanka Indo-Portuguese language (this version), AWB should probably skip the sequence as the page now includes <code></code> [7] (SVN 4218). I added {{nobots}} to Comparison of programming languages (object-oriented programming) as it doesn't have such tags yet. Besides that, the fixes for check #10 went well. -- User:Docu
I had allowed for 'nowiki' and 'math' tags but not 'code'. rev 4242 for that. Rjwilmsi 18:06, 21 April 2009 (UTC)
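
To make the "highlight first unbalanced bracket" idea concrete, here is a rough sketch that finds the first unmatched `{{` or `}}` while skipping the contents of 'nowiki', 'math' and 'code' tags, as discussed above. It is an illustration only, not AWB's implementation; the function name is invented, and it deliberately ignores table markup and treats triple braces as a pair plus a stray character.

```python
import re
from typing import Optional

# Regions whose contents should not be inspected (tag names from the thread).
SKIP_TAGS = re.compile(
    r"<(nowiki|math|code)\b[^>]*>.*?</\1\s*>",
    re.DOTALL | re.IGNORECASE,
)


def first_unbalanced_brace(wikitext: str) -> Optional[int]:
    """Return the index of the first unmatched '{{' or '}}', else None."""
    # Blank out skipped regions with spaces so indices stay aligned.
    masked = SKIP_TAGS.sub(lambda m: " " * len(m.group(0)), wikitext)
    open_positions = []
    i = 0
    while i < len(masked) - 1:
        pair = masked[i:i + 2]
        if pair == "{{":
            open_positions.append(i)
            i += 2
        elif pair == "}}":
            if not open_positions:
                return i  # closing braces with nothing to close
            open_positions.pop()
            i += 2
        else:
            i += 1
    # Any opener left over is unmatched; report the earliest one.
    return open_positions[0] if open_positions else None
```

The returned index is where a cursor or colour highlight could be placed.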

Check 59: Template value ends with break[edit]

Shall we try to fix these (1), keep the report running in case someone is interested (2), or deactivate it (3)? I'm a bit hesitant.

Today (April 19) the report lists 2714 occurrences. As the check was added after the last dump (mid-March), I assume that there should be more to come.

The displayed value doesn't change in MediaWiki, but, e.g. for templatetiger, the available data would be cleaner without. -- User:Docu

I'd go with (2). It doesn't seem like its as pressing as some of the other errors, but it is still an error which someone can fix when the more important problems are resolved. –Drilnoth (TCL) 12:37, 19 April 2009 (UTC)
To start to fix them, the current result will still be found in the page's history. We could also save the full list from toolserver. In the meantime, if it isn't handled, we could just de-activate it, possibly re-activate it just before the next dump. With some work, it should be possible to use pywikipediabot to do most of them. -- User:Docu
Okay; I don't really care either way. –Drilnoth (TCL) 14:01, 19 April 2009 (UTC)
Let's see what others say. If there is no demand, we could turn it off. -- User:Docu
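
If someone does want to try this by bot, the core of the fix could look roughly like the sketch below: drop any `<br>` or `<br />` that sits directly before the next `|` or the closing `}}`. This is plain Python with an invented function name, not pywikipediabot code, and it assumes it is handed the text of a single template call; applied to a whole page it would also touch piped links and table cells.

```python
import re

# One or more breaks, optional trailing whitespace, then '|' or '}}'.
TRAILING_BREAK = re.compile(r"(?:\s*<br\s*/?>)+(\s*)(?=\||\}\})", re.IGNORECASE)


def strip_trailing_breaks(template_text: str) -> str:
    """Remove breaks at the end of template parameter values.

    Whitespace after the break (for example the newline before the
    next parameter) is kept so the layout of the call is unchanged.
    """
    return TRAILING_BREAK.sub(r"\1", template_text)


example = "{{Infobox|name=Foo<br />\n|born=1900<br>}}"
cleaned = strip_trailing_breaks(example)
```

Breaks in the middle of a value are left alone, since they do change what is displayed.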

Headline hierarchy[edit]

Hi! Are you really sure the headline hierarchy is a problem? I mean: does it generate any real problem? In my opinion, due the very small difference in font size between the level 1 and level 2 headlines, many users sometimes just choose to use the level 3 in place of 2 in order to obtain a better layout and a clearer structure within the article. Is there enough consensus about this level gap ban? -- Basilicofresco (msg) 07:46, 10 March 2009 (UTC)

IMO, I think that when a headline jumps from level 2 to level 4, it looks pretty ugly on the screen. Also, the MOS states (at WP:MOSHEAD) that "primary headings are then ==H2==, followed by ===H3===, ====H4====, and so on." So, no, it doesn't generate any real problem (it's not going to destroy the wiki or anything!), but there is community consensus and (I believe) it's been that way for a long time. –Drilnoth (TC) 17:27, 10 March 2009 (UTC)
See Organizing a page using headings at the Web Content Accessibility Guidelines 2.0 (11 December 2008). --Red Power (talk) 15:09, 2 April 2009 (UTC)

See also: User_talk:WWGB#Headline_levels_on_Deaths_in_January_2009_etc -- User:Docu

*sigh* What reason can there possibly be for opposition? It's in the MOS. Although the MOS is not always correct and is a guideline, I see no reason why this should be any different from other articles. –Drilnoth (TC) 13:29, 4 April 2009 (UTC)
  • lol... a bit of insight for you: the phrase 'there's an exception to every rule' really means that every rule has someone who takes exception to it. --Ludwigs2 19:29, 4 April 2009 (UTC)
Heh... –Drilnoth (TC) 19:31, 4 April 2009 (UTC)
Well the whole set of monthly pages is formatted that way (at least since 2006). I understand their explanation about how they are "growing" them. Inserting additional headers is probably a good way to keep them more or less the same.
My bot did several hundred of #7 (WP:WikiProject Check Wikipedia#Headlines start with three_"=") on articles that didn't have a "==" level, but this didn't quite reduce the numbers (the stats aren't simple to read). If we want the others to be fixed, maybe we need to have the TOC feature "changed" as it currently adjusts automatically for some of the headers on the wrong level. -- User:Docu

I have fixed manually more than 50 of these recently but the number just keeps growing. Maybe the script could be improved to create a new list about articles with new headline hierarchy issues, i.e. articles that previously did not have an issue but that have the issue now. Then the users who create these issues could be tracked and educated. This would be applicable to the other checks, too. —ZeroOne (talk / @) 08:29, 22 April 2009 (UTC)

The ones displayed are generally the ones in articles recently changed (not only to add a specific error though).
The total doesn't necessarily change from one day to the other even if you fix most of them (see #Statistics). On the other side, e.g. today check #7 dropped by 134 even though we probably didn't fix more than 50 yesterday, but the ones the script had stored to rescan next were already done in the days before.
At least for check 7, some of these are easily fixed, as sometimes all levels are just one off. I don't think it's helpful to lecture contributors that created their first article about a somewhat minor point while there is much more to be done. Once it's fixed in an article, eventually they will figure it out. If it can be fixed by bot, it wasn't much effort for me either. There are articles where the structure is due to the growth of the article, possibly by numerous contributors, and someone has to try to adjust the structure at some point. More tricky are articles that use a predefined structure made for longer articles. At least for these, one has to find an intermediary solution of some sort and we wouldn't want to stop their growth just to reach a perfect TOC. BTW a numbered TOC adjusts for most cases. -- User:Docu
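
For reference, the kind of test being discussed in this section can be sketched in a few lines: treat the page title as level 1 and flag any heading that is more than one level deeper than the heading before it. This is only an illustration of the rule from WP:MOSHEAD; the function names are invented and the real script's rules may differ.

```python
import re
from typing import List, Tuple

# A heading line: leading '=' signs, a title, trailing '=' signs.
HEADING = re.compile(r"^(=+)\s*(.*?)\s*(=+)\s*$", re.MULTILINE)


def heading_levels(wikitext: str) -> List[Tuple[int, str]]:
    """Return (level, title) for each heading line, in page order."""
    found = []
    for match in HEADING.finditer(wikitext):
        # With unequal markers, take the shorter side as the level.
        level = min(len(match.group(1)), len(match.group(3)))
        found.append((level, match.group(2)))
    return found


def level_jumps(wikitext: str) -> List[Tuple[int, str]]:
    """Headings that skip a level relative to the heading before them."""
    problems = []
    previous = 1  # the article title acts as the level-1 heading
    for level, title in heading_levels(wikitext):
        if level > previous + 1:
            problems.append((level, title))
        previous = level
    return problems
```

A page whose first heading is `===` is flagged by the same rule, which corresponds to the "Headlines start with three =" case.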


Check 30: Image without description[edit]

Shall we keep this running or turn it off? -- User:Docu

I can go either way with this one... on one hand it kind of is an error, but on the other it isn't really a "syntax" error, but a "content"/"style" error. Whatever we do, the "image gallery without description" check should be the same. –Drilnoth (T • C • L) 22:32, 21 April 2009 (UTC)
Image gallery is just 394, but this one is 11,260. It seems to rise fast since the last dump (/old, not sure if the script changed). If it's being used, I don't mind having it. -- User:Docu
I have massively changed the script for error 30. My script will now detect many more errors. I think this is one of the important errors. No bot can fix this; here we need manpower. A bot can only leave a note on the discussion page that there is an image without description. We do this in dewiki. -- sk (talk) 19:17, 22 April 2009 (UTC)
If it doesn't slow down the scan for the other errors, let's leave it running. I added the "new" tag to the results. BTW one could try to import the descriptions from Commons .. -- User:Docu
I saved the current list at /030, in case someone wants to work on it. -- User:Docu

Extraneous links in hatnotes[edit]

Hatnotes should only contain links to the desired possible other target. See Wikipedia:HATNOTE#Extraneous links. Would need human review probably, as there may be exceptions. –xeno talk 16:31, 22 April 2009 (UTC)

Ideally, yes, templatetiger can give an overview of things. First, it might be worth trying to convert notes that don't use {{dablink}}, {{otheruses}} and the like to one of the templates. -- User:Docu

Reference[edit]

Some articles don't use <references /> but {{Reference}} which is the same as {{reflist}}. Kwiki (talk) 06:59, 10 May 2009 (UTC)

It looks like this makes Cynthia L. Bauerly appear on the list for check #3 - Wikipedia:WikiProject Check Wikipedia#Reference tag .3Creferences .2F.3E missing (partial AWB) - where it shouldn't. There are several other redirects at Special:WhatLinksHere/Template:Reflist. -- User:Docu
I left a note for Stefan (de:User talk:Stefan_Kühn/Check_Wikipedia#Check_3_at_en.wp). They should be gone after the next update. -- User:Docu
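
A sketch of how the check could avoid this kind of false positive: accept either a `<references />` tag or any template from a configurable set of names. Only `reflist` and `Reference` come from this thread; the other redirects listed at WhatLinksHere would have to be added by hand, and the function names here are invented.

```python
import re
from typing import AbstractSet

# Lower-cased template names that render a reference list.
# Only these two are named in the thread; extend as needed.
REFERENCE_TEMPLATES = frozenset({"reflist", "reference"})

TEMPLATE_NAME = re.compile(r"\{\{\s*([^|{}]+?)\s*(?:\||\}\})")


def has_reference_list(wikitext: str,
                       template_names: AbstractSet[str] = REFERENCE_TEMPLATES) -> bool:
    """True if the page has <references /> or a known reference template."""
    if re.search(r"<references\b", wikitext, re.IGNORECASE):
        return True
    return any(match.group(1).lower() in template_names
               for match in TEMPLATE_NAME.finditer(wikitext))


def missing_reference_list(wikitext: str,
                           template_names: AbstractSet[str] = REFERENCE_TEMPLATES) -> bool:
    """True if the page uses <ref> tags but never outputs the list."""
    uses_refs = re.search(r"<ref\b", wikitext, re.IGNORECASE) is not None
    return uses_refs and not has_reference_list(wikitext, template_names)
```

Keeping the names in one set means a newly discovered redirect is a one-line change.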

New dump?[edit]

I take it the +179,000 bytes equates to the new dump having been scanned through? - Jarry1250 (t, c) 18:25, 15 May 2009 (UTC)

Looks like it:
http://download.wikimedia.org/enwiki/20090512/
2009-05-14 23:57:24 done Articles, templates, image descriptions, and primary meta-pages.
2009-05-14 23:57:23: enwiki 8521847 pages (99.613/sec), 8521847 revs (99.613/sec), 77.1% prefetched, ETA 2009-05-16 15:45:17 [max 22793793]
  • This contains current versions of article content, and is the archive most mirror sites will probably want.
  • pages-articles.xml.bz2 4.8 GB
I was wondering how much was still to come, but I'm still surprised. -- User:Docu
If we edit one article per minute, we will be done in 4 months .. -- User:Docu
Heh...
I have DrilBot on "headlines end with colon". –Drilnoth (T • C • L) 19:07, 15 May 2009 (UTC)
Starting from the back of the list since I see D6 got some of them already. –Drilnoth (T • C • L) 19:12, 15 May 2009 (UTC)

No, this is the old dump from 2009-03-13 01:27:21. I started the scan of the old dump 3 days ago. -- sk (talk) 19:17, 15 May 2009 (UTC)

Oh wonderful. We have that to look forward to. - Jarry1250 (t, c) 19:20, 15 May 2009 (UTC)
Wow. The new scan will register changes since this one, right? –Drilnoth (T • C • L) 19:23, 15 May 2009 (UTC)
I'm guessing, but I think that the jump is because this is the first dump that a whole bunch of new / updated checks are being run on (as opposed to just edited). So probably not quite so many more. - Jarry1250 (t, c) 19:29, 15 May 2009 (UTC)
I think that's how it works. It's a great script regardless; thanks sk! –Drilnoth (T • C • L) 19:48, 15 May 2009 (UTC)
It would be just another two or three weeks of errors (dump is mid March, most checks were in place at the beginning of April). Luckily it's the weekend, traffic is low [8] and the server doesn't lag [9] (as of 02:49, 16 May 2009 (UTC)). -- User:Docu
Drilnoth, do you think we could talk Jarry into running another bot to crunch through the lists? -- User:Docu
I could manage an AWB bot, certainly. There's a python one going through the approvals process now, but I guess, DrilBot's in the best position for expandability. Sorry, I don't really feel like coding much more than merely setting AWB to stun mode. - Jarry1250 (t, c) 08:13, 16 May 2009 (UTC)
Given the amount of pages to process, I think it would be worth it. Ideally, we would try to process most pages before the next dump (in June probably). Given the way the script works, it's unlikely that we will get updated full lists before.
BTW the unicodify function on python (converts many) should probably be harmonized with the one in AWB (has a few exceptions). Ideally, the selection in the script sk is using should be similar (could be shorter though). -- User:Docu
I hope to plug in some improved unicodification manually, although I agree that having it be default in AWB (maybe something like Wikipedia:AutoEd/unicodify.js's changes?) would be better. –Drilnoth (T • C • L) 10:52, 16 May 2009 (UTC)

Check 19 (Headlines start with one "=")[edit]

Usually there were just a few new articles listed. Now there are 2535. It's probably worth fixing these by bot, lowering all headers by one level. -- User:Docu

When I go through these it seems like quite often there will be a page where the headers are one level high for about half the article or something and then be accurate... fixing those by bot would create a page just as incorrect as the previous one. –Drilnoth (T • C • L) 10:51, 16 May 2009 (UTC)
Some are broken in odd ways, others are just all one level off. I suppose one would have to check first if there is more than one header with level "=". The good thing is that some have already been fixed since March ;) -- User:Docu
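
One possible reading of that pre-check, as a sketch only: lower every heading by one level when the page has more than one "=" heading (so it looks consistently one level off), and otherwise leave the page for manual review. The function name and the exact guard are assumptions, and this would not catch the half-and-half pages Drilnoth describes, so the output would still need a human look.

```python
import re
from typing import Optional

HEADING_LINE = re.compile(r"^(=+)(.*?)(=+)[ \t]*$", re.MULTILINE)


def demote_all_headings(wikitext: str) -> Optional[str]:
    """Lower every heading by one level, or return None to skip the page.

    Skips pages with fewer than two level-1 headings, and pages that
    already use level 6 (there is no level 7 to move to).
    """
    levels = [min(len(m.group(1)), len(m.group(3)))
              for m in HEADING_LINE.finditer(wikitext)]
    if levels.count(1) < 2 or max(levels) >= 6:
        return None
    # Wrap each heading line in one extra pair of '=' signs.
    return HEADING_LINE.sub(lambda m: "=" + m.group(0).rstrip() + "=", wikitext)
```

Returning None rather than the unchanged text makes it easy for a bot run to log the skipped pages separately.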

Page update[edit]

18:52, 17 May 2009 (UTC) I'm trying to update the page, but it keeps timing out. -- User:Docu

Seems like you got it. :) –Drilnoth (T • C • L) 20:06, 17 May 2009 (UTC) Oops, I'm blind. –Drilnoth (T • C • L) 20:08, 17 May 2009 (UTC)
Finally. I wonder if it's some new abuse filter that slows it down so much -- User:Docu

Title linked in text[edit]

Can ↑ be done reliably by bot? I'm asking because it was brought up at Wikipedia talk:AutoWikiBrowser/Bugs#Bold names and it seems that making this change on image maps could be problematic, as discussed here. Are there any other times that this could be a bad edit? –Drilnoth (T • C • L) 12:53, 18 May 2009 (UTC)

The one in the image map looks more or less what it should be doing (though one could add an exclusion for <imagemap>).
He is complaining that the self link on the image at 50000_Quaoar#Size is being removed. Personally, I think it is even more important to remove these confusing ones than to fix the usual ones. Anyways, in general, there is always a trade-off between fixing 1000 and possibly breaking a few. -- User:Docu (15:32)
Hmm .. the missing link on http://en.wikipedia.org/w/index.php?title=50000_Quaoar&oldid=290470688#Size does make the image disappear completely. Too bad the extension has no maintenance category associated with it. -- User:Docu

It was fixed at 15:32. -- User:Docu

Excellent. –Drilnoth (T • C • L) 15:54, 18 May 2009 (UTC)

Double pipe in one link[edit]

I've been through a few of these with AWB. A lot of them are of the form [[article || text]] or [[article|text | ]]. This kind of error looks eminently like bot-work to me; is there one active, or could one be modified to suit? Mr Stephen (talk) 18:05, 18 May 2009 (UTC)

I might be able to have DrilBot fix the type that you mentioned, although a lot would need to be done by hand. I'm not entirely sure though; I'm still not good enough with RegExp. –Drilnoth (T • C • L) 20:20, 18 May 2009 (UTC)
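
For the two shapes Mr Stephen mentions, a pair of regular expressions along these lines might be a starting point. This is a sketch with invented names, not DrilBot's rules; it only handles links with a single target and a single label, and it has not been checked against File/Image links, where an empty parameter can be intentional.

```python
import re

# [[article || text]]  ->  [[article|text]]
DOUBLE_PIPE = re.compile(
    r"\[\[([^\[\]|]+?)\s*\|\s*\|\s*([^\[\]|]+?)\s*\]\]"
)
# [[article|text | ]]  ->  [[article|text]]
TRAILING_PIPE = re.compile(
    r"\[\[([^\[\]|]+)\|([^\[\]|]+?)\s*\|\s*\]\]"
)


def fix_double_pipes(wikitext: str) -> str:
    """Collapse the two simple double-pipe patterns into a normal link."""
    text = DOUBLE_PIPE.sub(r"[[\1|\2]]", wikitext)
    return TRAILING_PIPE.sub(r"[[\1|\2]]", text)


sample = "See [[article || text]] and [[article|text | ]]."
fixed_sample = fix_double_pipes(sample)
```

Anything with three or more real segments is left untouched and would still need to be done by hand.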

Ignore articles tagged for deletion[edit]

Is this possible? Would it slow down the scan? More importantly, do we want it? Would it be beneficial to not "waste" time fixing articles that are later deleted, or is this an important service - letting people see the content of the article as it was intended to be displayed. Discuss. - Jarry1250 (t, c) 13:30, 10 May 2009 (UTC)

I guess it's probably possible. It's a good question. For some reports it was a bit annoying to get flooded with new articles likely to be deleted while there were many old ones that needed fixing. There are some I skipped, others I reformatted - to avoid seeing them for another week in the reports. -- User:Docu
What Docu said for the most part. When I'm using a tool like AWB or AutoEd I usually just fix the error, but if something is being done by hand it seems kind of pointless. –Drilnoth (T • C • L) 14:53, 10 May 2009 (UTC)
Not all the articles tagged will end up deleted, though, right? Even so, I guess it's reasonable to wait until they're untagged, just in case. --Auntof6 (talk) 03:39, 26 May 2009 (UTC)

44: Headlines with bold[edit]

Treating this as an "error" is brain-damaged. There are plenty of valid reasons why bold could appear in a headline, for example mathematical notation often relies on fixed typefaces. Automatic removal of bold tags as in this edit is dead wrong. — Emil J. 10:50, 19 May 2009 (UTC)

Hmm... well, it still would only matter in level 2 headlines since headlines of level 3 and below aren't visually affected by having the bold text. –Drilnoth (T • C • L) 13:46, 19 May 2009 (UTC)
Frequently, it looks a bit like bold text within text that is already bold, or italics and underscores combined.
In general, italics seem a viable option for additional emphasis within headline. In the sample above, <math></math> seems a good option as well. -- User:Docu

Numbers are low[edit]

I just signed up for this page and I will start going through them as well but I do have a couple of comments. First, I think some of the numbers are low. For example, I know that there are more pages than listed here with incorrect breaks (e.g. <br>, <BR>, <br., etc.). I also know that there are more pages with incorrect characters or invalid formatting in the defaultsort. Not criticizing because I am glad that someone created this list but wanted to let you know. My next comment is based on the rather minimal impact of some of these edits such as the breaks. I personally follow the belief that if you watch the pennies the dollars will mind themselves (even the small edits are important over the long term) but some would argue that some of these edits are a waste of resources and fill up editors' watchlists (also not a problem for me personally). Since AWB specifically requests that some of these edits such as the breaks not be done with AWB as standalone changes, are we ok to go ahead and do them?--Kumioko (talk) 20:30, 19 May 2009 (UTC)

(de-capitalized header) I think that the list of breaks is about correct... things like <br> and <BR> are correct; the list here only has those which have an error like <.br>, <<br>, or <\br>. My bot ran through the defaultsort list a couple of days ago and fixed a lot of them, although it looks like the list might not have taken that into account yet... weird. Anyway, my feeling is that the things that don't really change much (e.g., the location of categories) and which can be done by bot should be done by bot... then it doesn't waste human resources and the edits don't show up on watchlists. –Drilnoth (T • C • L) 20:47, 19 May 2009 (UTC)
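
To illustrate the distinction: a pattern like the one below would match only the three malformed shapes quoted above (`<.br>`, `<<br>`, `<\br>`) and leave `<br>` and `<BR>` alone. It is a guess at the kind of rule involved, with an invented function name, not the actual check used by the script.

```python
import re

# '<' followed by a stray '.', '<' or backslash, then 'br' and '>'.
MALFORMED_BREAK = re.compile(r"<[.<\\]\s*br\s*/?\s*>", re.IGNORECASE)


def normalize_breaks(wikitext: str) -> str:
    """Rewrite the malformed break variants as '<br />'."""
    return MALFORMED_BREAK.sub("<br />", wikitext)
```

Other typos, such as a break that never closes, would need their own patterns.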
Kumioko, would you have samples of pages that were missed? The reports keep getting improved, but they are not meant to be exhaustive (at least for now ;) ). -- User:Docu
Drilnoth, when you have a moment, would you run your bot through 43/47 (broken templates)? It's easier to look manually at the remaining ones once these are done. For an up-to-date list of what remains, we might have to wait for the next dump though. -- User:Docu 11:00, 20 May 2009 (UTC)
Sure; I'll start it running right after the next update. –Drilnoth (T • C • L) 12:38, 20 May 2009 (UTC)
Before the next dump (in June supposedly) would be sufficient (hopefully in June we won't get April data ;) ). -- User:Docu
For 43, thanks for doing a first pass by bot. I just finished today's 50. Luckily one page filled half the list ;) -- User:Docu
You're welcome; thanks for mentioning it. I don't think that DrilBot can really do much with #47... AWB can't pick up very many of them to fix automatically. –Drilnoth (T • C • L) 15:18, 22 May 2009 (UTC)
As I have to stop at each page to check it manually, it helps if the automatic ones are gone. Besides, too many of these at once give me a headache. -- User:Docu
I read your previous note too quickly. You mentioned the other report. It does also work for #47 (missing opening brackets), see rev 4229 mentioned in /Archive#Missing opening or closing brackets, table and template markup. When editing with a new release, I have actually seen it being fixed! -- User:Docu
I know that it does work automatically some of the time... the problem is that there aren't enough that AWB can auto fix, and when I'd had my bot going through the list it was making a lot of edits that didn't fix that error. –Drilnoth (T • C • L) 14:11, 23 May 2009 (UTC)
If I'm not sure which type of problem AWB will fix, I'm just using "clean-up" as edit summary. It happens sometimes that I forget to switch it from a more specific one. For check#47, you could run it with a summary "Clean-up, general fixes (batch #47)" this might be sufficiently descriptive for the type of operation. If it's done just once for each dump, I think it's acceptable. One could also link general fixes to WP:GENFIXES, this way interested editors can easily find the full set of possible fixes. -- User:Docu 06:09, 24 May 2009 (UTC)
Eh... I'd do that except for the AN/I report about DrilBot, with the consensus being that more descriptive edit summaries are needed. –Drilnoth (T • C • L) 16:32, 24 May 2009 (UTC)
It's not a coincidence that I wrote the above. Anyways, WP:GENFIXES is very descriptive (IMHO), maybe you could even copy it to a separate page. The problem is that if the edit summary is too descriptive and the edit doesn't make the change that is in the summary, it's more problematic. I suppose it would be possible to set AWB to have changes that trigger edits (with a specific summary) and add all other gen fixes behind. Whatever solution you choose, after each 20k of edits, you will get a new thread on ANI ;) -- User:Docu
35k edits. :) I'm working on User:DrilBot/Summaries to create a more descriptive guide on these edits. –Drilnoth (T • C • L) 17:00, 24 May 2009 (UTC)

Non-editable and unreliable (check 7)[edit]

Heed this edit! I will be back to do more of these later. Can someone supply the secret contact information mentioned in the edit summary? Michael Hardy (talk) 11:39, 24 May 2009 (UTC)

The list on the server isn't completely up to date, from the introduction "# The number of items on this page is limited. For a longer list see tools:~sk/checkwiki/enwiki/. These aren't updated daily though. When working on the toolserver lists, it is suggested to start from A. The next day, the script will use these items to re-generate this page omitting already fixed articles.". An estimated one third is already fixed. I'm not quite sure when he will be doing it, but the list will be split into two separate ones (the ones without level "==" headings and the ones with, according to (de:User talk:Stefan Kühn/Check Wikipedia#Check 7 at en.wp)). -- User:Docu

I wasn't worried about missing items from the list, but about items on the list that shouldn't be.

Why is Bell polynomials on the list? Someone edited it to change some subsections to first-tier sections. I hit the "rollback" button. They were intended to be subsections. Michael Hardy (talk) 12:11, 24 May 2009 (UTC)
In fact, I fixed the error on the page more than a month ago myself diff. Items on the toolserver lists date from the scan done 10 days ago of the March 2009 dump. Items are rescanned every day to generate a list of 50 current items. That means the items here are still not done as of yesterday. Obviously this system works better for lists where we don't have that many open. -- User:Docu
Well, someone "fixed" it again today and I reverted to the "unfixed" version and I'll have to do that as many times as someone does that same "fix".
So is it strictly forbidden to have a subsection in the initial section that has no main header? Michael Hardy (talk) 17:53, 24 May 2009 (UTC)
Per WP:MOSHEAD, headlines should start with "==" with subsections being "===", "====", and so on... so consensus is against having a subsection in the lead of the articles. –Drilnoth (T • C • L) 18:21, 24 May 2009 (UTC)
Michael Hardy, I saw the re-"fixing", someone was a bit in a hurry, I'm glad you repaired it.
The article title is a level <h1> heading, thus the next lower level and first level in article text, would be a <h2> level ("=="). The peculiar thing about Wikipedia is that the lead section doesn't have a header which is somewhat asymmetric. Interestingly even on the Main page, they managed starting with h1 and going to h2! -- User:Docu 18:25, 24 May 2009 (UTC)

This check is also broken for special pages like disambiguation pages, where smaller headers are appropriate. This check needs to be eliminated or seriously refined. —Centrxtalk • 18:12, 24 May 2009 (UTC)

Why should it be different for dabs? –Drilnoth (T • C • L) 18:21, 24 May 2009 (UTC)
Disambiguation pages are often short, and there may only be a couple of entries in each section. The ==-level header, which also adds an underline, is far too prominent for disambiguation pages. The section header should not be the same size, or half the size, as the entire section. The standard promulgated by this Check may be theoretically best for articles, but not for many different types of other pages. —Centrxtalk • 18:28, 24 May 2009 (UTC)
Also, the script or bot or logic that was automatically promulgating this Check, is additionally broken in at least three ways. —Centrxtalk • 18:33, 24 May 2009 (UTC)
Which are they? -- User:Docu

Start copy from User talk:Centrx.

Inspection reveals two major classes of page where this Check was automatically implemented: a) disambiguation pages, where a lesser section header was specifically intended; b) blatantly non-wikified pages, that need far more help than tweaking section headers.
Also, without exception, the bot did not even implement the Check correctly. It does not normalize section headers, it simply chops off one =. For example, ==== is reduced to === even if it is supposed to be == at the top level.
This analysis does not even enter the situation of general bugginess, as evidenced by the fact that it eliminated correct sub-sections in [10]; and other problems and objections on User talk:PigFlu Oink. —Centrxtalk • 19:11, 24 May 2009 (UTC)
Bell polynomials is clearly wrong as I had to do it manually myself. It would be interesting to know what caused it. a) isn't incorrect as MoS does warrant level "==" headers. I don't see a problem with b) as such articles never get fixed in one step. "====" to "===" happens with AWB too. -- User:Docu
Looks like it chopped off a level from the first level "===" and below headers it found. Not good. -- User:Docu
  • General MoS does not apply to special cases in special pages. Even if the disambiguation page headers are incorrect, the proper correction is to change them to mere Bolded title as used in Wikipedia:Manual of Style (disambiguation pages), not to change them to grand section headers.
  • Depending on the page, tweaking a broken page either means 1) actual errors are obscured by making the page look superficially correct, but still be wrong header-wise (e.g. [11]); or 2) little is lost by incidentally reverting a page that needs to be majorly fixed by the first editor to come along anyway.
  • Unless someone will manually inspect 15000+ edits to save a handful of sound corrections, reverting all is the only way to correct the mistakes caused by an unauthorized, blatantly buggy bot. —Centrxtalk • 19:38, 24 May 2009 (UTC)

End copy from User talk:Centrx.
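
The difference between "chopping off one =" and normalizing, as raised in the copied thread, can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, "normalize" is taken to mean "shift every heading so the shallowest one becomes ==", and it is not the code of any bot or of AWB.

```python
import re

# A heading line: the same run of "=" on both sides of a non-empty title.
HEADING_RE = re.compile(r"^(={1,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)


def chop_one_level(text: str) -> str:
    """Hypothetical model of the criticized behaviour: drop one "=" per heading."""
    def repl(m):
        marks = "=" * max(len(m.group(1)) - 1, 1)
        return f"{marks} {m.group(2)} {marks}"
    return HEADING_RE.sub(repl, text)


def normalize_headings(text: str, top_level: int = 2) -> str:
    """Shift all headings up so that the shallowest one sits at top_level."""
    levels = [len(m.group(1)) for m in HEADING_RE.finditer(text)]
    if not levels:
        return text
    shift = min(levels) - top_level
    if shift <= 0:
        # Already starts at (or above) the top level: leave the page alone.
        return text

    def repl(m):
        marks = "=" * (len(m.group(1)) - shift)
        return f"{marks} {m.group(2)} {marks}"
    return HEADING_RE.sub(repl, text)
```

On a page whose shallowest heading is ====, chop_one_level leaves it at ===, while normalize_headings brings it to == and keeps the relative depth of deeper headings. Neither function decides whether a disambiguation page should use section headings at all, which is the separate objection above.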

See Also => See also[edit]

For consistency, "== See Also ==" should be capitalised "== See also ==" Tabletop (talk) 02:42, 26 May 2009 (UTC)

Category duplication (check 17) due to template that includes a category[edit]

A lot of the asteroid articles have duplicate categories because they're stubs, and the stub template includes a category that's also hard-coded. To me, it's not obvious that the stub templates include the categories (I had to research it a little). Therefore, I'm inclined not to resolve these so that future removal of the stub templates won't remove the category altogether. What say ye? --Auntof6 (talk) 05:13, 26 May 2009 (UTC)

The reports don't cover this, as far as I know. The category needs to be defined twice in the article itself, e.g. Category:Main Belt asteroids in 10515 Old Joe [12]. -- User:Docu
OK. Maybe the ones I spot-checked already had the duplicate removed, then. I just went through all the ones that start with numbers and did find some with the category specified twice (and fixed them, of course!). I'll wait and see if they're still there if/when the list gets regenerated. --Auntof6 (talk) 08:27, 26 May 2009 (UTC)
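
A minimal sketch of the kind of duplicate Docu describes, where the same category is written twice in the article text itself. The function name and the regex are assumptions for illustration, not the check 17 code; templates are not expanded, so a category that only comes from a stub template is not counted.

```python
import re
from collections import Counter

# [[Category:Name]] or [[Category:Name|sort key]], case-insensitive namespace.
CATEGORY_RE = re.compile(
    r"\[\[\s*Category\s*:\s*([^\]|]+?)\s*(?:\|[^\]]*)?\]\]", re.IGNORECASE
)


def duplicate_categories(wikitext: str) -> list:
    """Return the categories that are hard-coded more than once."""
    names = []
    for m in CATEGORY_RE.finditer(wikitext):
        # Underscores and spaces are equivalent; the first letter is case-insensitive.
        name = " ".join(m.group(1).replace("_", " ").split())
        names.append(name[:1].upper() + name[1:])
    counts = Counter(names)
    return sorted(name for name, count in counts.items() if count > 1)
```

For example, an article containing both [[Category:Main Belt asteroids]] and [[category:main Belt asteroids|Old Joe]] would be reported once, under "Main Belt asteroids".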

Suggestion: stub templates on articles that are not stubs[edit]

AWB seems to remove stub templates from articles that are over a certain size (I don't know what size that is, though). Maybe this project could generate a list of non-stub articles that have stub templates. --Auntof6 (talk) 09:32, 26 May 2009 (UTC)

Database_reports has a weekly Long_stubs report. -- User:Docu

Error 2 incorrect entry[edit]

Article with false
(AutoEd)
...deleted...
article info
Comparison of layout engines (Non-standard HTML) * <tt id="trident_wbr"><wbr> Naraht (talk) 15:13, 28 May 2009 (UTC)

I cleaned that up so that it shouldn't be listed in the future. It should have been using &lt; and &gt; entities originally, anyway. Thanks! –Drilnoth (T • C • L) 15:12, 28 May 2009 (UTC)

May 28 update[edit]

It seems like the toolserver problems canceled the update. -- User:Docu 00:15, 29 May 2009 (UTC)

Appears so. Ah, well; maybe tomorrow. –Drilnoth (T • C • L) 01:54, 29 May 2009 (UTC)
Apparently, after 9 hours, it did finally get through. BTW we are down from "342,337 ideas for improvement in 304,586 articles" (May 15) to "257,460 ideas for improvement in 233,446 articles" (May 28).
Even if many of the articles of the March dump scanned on May 15 were already fixed by May, I think we made substantial progress. -- User:Docu
Woah; that's a nice statistic. Although wasn't the "headlines start with the '='" check split into two? I'm guessing that that reset those counters until the next database dump. –Drilnoth (T • C • L) 13:12, 29 May 2009 (UTC)
I just hope the new dump won't be as old as the last one. BTW 7 and 83 combined should still equal the old 7. The current distribution between the two isn't "accurate" yet, as the server lists are checked to see which pages still fail the new check 7 or go under 83. Anyways, I think it's quite encouraging. -- User:Docu
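
For scale, the two figures quoted in this thread work out to 84,877 fewer listed ideas and 71,140 fewer listed articles, roughly a quarter in each case. A small sketch of the arithmetic (the helper name is hypothetical; the caveats raised above about the age of the dump and the split of check 7 still apply to the underlying numbers):

```python
def reduction(before: int, after: int) -> tuple:
    """Return the absolute drop and the drop as a percentage of the starting value."""
    diff = before - after
    return diff, 100.0 * diff / before


# Figures quoted for May 15 and May 28.
ideas_diff, ideas_pct = reduction(342_337, 257_460)
articles_diff, articles_pct = reduction(304_586, 233_446)

print(f"ideas:    -{ideas_diff} ({ideas_pct:.1f}%)")
print(f"articles: -{articles_diff} ({articles_pct:.1f}%)")
```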

Automated edits to heading hierarchies[edit]

I've opened an RfC with regards to whether the community stands by the MoS, but mainly about whether it should be enforced using automated edits. Cheers, - Jarry1250 (t, c) 14:53, 30 May 2009 (UTC)

Grand![edit]

Another huge database dump! :) I'm guessing that this is a more recent one? –Drilnoth (T • C • L) 19:36, 1 June 2009 (UTC)

Or... did it just rescan the same dump with the new errors? DrilBot is finding a lot of already-fixed articles with #53. –Drilnoth (T • C • L) 20:45, 1 June 2009 (UTC)
WTF? Locos ~ epraix Beaste~praix 21:00, 1 June 2009 (UTC)
#16 was mostly ok. It might be the dump from 2009-05-24 21:39:17. -- User:Docu
That sounds like it could be accurate... I think that DrilBot ran through this specific list after that. –Drilnoth (T • C • L) 21:55, 1 June 2009 (UTC)
Soon will be up2date, nothing left to do! -- User:Docu
But that's a good thing, isn't it? :) –Drilnoth (T • C • L) 22:15, 1 June 2009 (UTC)
I'm seeing something similar with #17. You sure it didn't scan an old dump? I've been working through that list, and the areas I've already worked on (numbers through J) suddenly have a lot of articles again. --Auntof6 (talk) 22:14, 1 June 2009 (UTC)
It was the new dump. I changed my script after the new server started making a new dump of a language (de, fr, ...) every 4-5 days. This was too fast for my script. Now the script starts on the 1st, 8th, 15th and 22nd of each month, searches for new dumps and scans them. So we may get a new dump about every 7 days. The new en dump is from 2009-05-24 21:39:17; my script found it on 2009-06-01 and scanned it. -- sk (talk) 07:47, 3 June 2009 (UTC)
Okay; thank you for clarifying. –Drilnoth (T • C • L) 12:30, 3 June 2009 (UTC)
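
The schedule sk describes (a run on the 1st, 8th, 15th and 22nd of each month) can be sketched like this. The function name is hypothetical and this is not the actual script; it only shows when the next run would fall for a given date.

```python
from datetime import date

# Days of the month on which the scan is said to start.
SCAN_DAYS = (1, 8, 15, 22)


def next_scan_date(today: date) -> date:
    """Return the first scheduled run on or after the given date."""
    for day in SCAN_DAYS:
        if day >= today.day:
            return today.replace(day=day)
    # Past the 22nd: the next run is the 1st of the following month.
    if today.month == 12:
        return date(today.year + 1, 1, 1)
    return date(today.year, today.month + 1, 1)
```

With this schedule a dump finished on 2009-05-24 is first picked up on 2009-06-01, which matches the dates given above. The gap between the run on the 22nd and the next run on the 1st is 7 to 10 days depending on the month, so "every 7 days" is approximate.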

More contributors[edit]

Today I had the idea of recruiting more contributors. Under Wikipedia:WikiProject_Wiki_Syntax#Thank_you_to_contributors there are many Wikipedians who helped with that older project. Maybe they don't know about the new "WikiProject Check Wikipedia". Some of you could inform these people on their user talk pages. What do you think about this? -- sk (talk) 08:05, 3 June 2009 (UTC)

Hmm... neat. I might be able to use AWB to send all of them messages about this. –Drilnoth (T • C • L) 12:32, 3 June 2009 (UTC)

Add logic to AWB to include more of these errors.[edit]

I have been looking at the errors and have noticed that there are several that I think could or should be added to AWB as general fixes or at least as a custom module. Before I go adding a bunch of feature requests though does anyone have any thoughts about which ones they feel could or should be added to AWB?--Kumioko (talk) 18:57, 3 June 2009 (UTC)

I'd say, just go ahead and request them. Note that "reference duplication" has already been requested, as has "template with Unicode control characters". –Drilnoth (T • C • L) 15:40, 4 June 2009 (UTC)
OK and I think they added a few others along with the change from using a text box to a richtextbox (to make and display the edits). Maybe I'll wait till the next version comes out before I make the suggestions.--Kumioko (talk) 17:42, 4 June 2009 (UTC)
You could just download the most recent SVN build; that's what I do. –Drilnoth (T • C • L) 17:44, 4 June 2009 (UTC)
Handy link (that's unless you want to hook into the SVN directly). - Jarry1250 (t, c) 17:48, 4 June 2009 (UTC)

Reactivated errors[edit]

I have reactivated the following errors:

  • #30: Image without description and #35: Image gallery without description. These are both errors because they cause accessibility issues, e.g., for people using screen readers. I can't really see any reason why they should be deactivated, as the vast majority should have descriptions per W3C standards.
  • #36: Redirect not correct. Many of the redirects listed here work, but they use improper syntax (e.g., a carriage return between #REDIRECT and the target page's name). I believe that AWB can be configured to fix these, so a bot can repair them easily and those won't require human attention. The redirects on the list that are malfunctioning can then be fixed manually with ease.
  • #79: External link without description. "Bare links" should almost never be used... they should have some description to help indicate where they lead to. I'm not sure why this was turned off.

Reactivating these may increase the amount of time that the scan takes, but I don't think that that really matters... it will still be daily, just at a different time of day. If you think that any of these should be deactivated again, please don't hesitate to post here. –Drilnoth (T • C • L) 16:39, 6 June 2009 (UTC)
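
As an illustration of the #36 case mentioned above (a line break between #REDIRECT and the target), here is a minimal sketch of a repair. The function name and regex are assumptions; this is not an AWB setting or the checkwiki code, and it only handles redirects whose target link is present.

```python
import re

# "#REDIRECT", optional whitespace (including line breaks) and colon, then the target link.
REDIRECT_RE = re.compile(r"^\s*#REDIRECT\s*:?\s*(\[\[[^\]\n]+\]\])", re.IGNORECASE)


def fix_redirect(text: str) -> str:
    """Rewrite the start of a redirect page as '#REDIRECT [[Target]]'."""
    m = REDIRECT_RE.match(text)
    if m is None:
        # Not a redirect this sketch recognizes: return the text unchanged.
        return text
    return "#REDIRECT " + m.group(1) + text[m.end():]
```

Anything after the target link, such as redirect category templates, is kept as it was. Redirects that are actually malfunctioning (for example with no target) would still need manual attention, as noted above.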

For 2, I agree with you completely. As for 1 and 3, the question here is about whether they are worth listing for now. They invariably involve large numbers of articles which need to be fixed by hand. Not all images need descriptions by the way. We're never going to get through them all, so why bother listing them? I mean, that's a pessimistic view I know, but seriously, 30,000 images? That's a hell of a job. - Jarry1250 (t, c) 16:57, 6 June 2009 (UTC)
I'm kind of torn myself on point 1... the way I see it, saying "we're never going to finish them, so why bother?" doesn't really make sense... the question is more like "How many of these don't need descriptions? And is this a good job to have on CHECKWIKI, which primarily lists cosmetic and code changes, not content problems?" Ditto with the external links. I thought that they should maybe be reactivated and we can see what they come up with before making a final decision. I guess that I'm not really opposed to having them deactivated (just kind of neutral on it) for the reasons that I outlined (is it a good job for CHECKWIKI?). Feel free to re-deactivate them if you'd like; I won't oppose it. –Drilnoth (T • C • L) 17:02, 6 June 2009 (UTC)
One idea for #30 and #35: many of these images without a description are flags. I think it would be no problem to create a bot that checks these images, finds flags like Flag of Germany.svg and creates a description like Flag of Germany. On dewiki we have 40000 articles with at least one image without a description, and 5000 of these 40000 are flags. I have tried to get a bot, but at the moment nobody has had time. Maybe someone on enwiki can create this bot. Using all the flags in commons:Category:SVG_flags would help. If #30 is activated, then they can also use this file. -- sk (talk) 19:46, 6 June 2009 (UTC)
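
The core of sk's flag idea, deriving a description from the file name, can be sketched as below. The function name is hypothetical, only names of the form "Flag of X.svg" are handled, and a real bot would additionally need the list of flag files from Commons and an approved way to edit articles.

```python
import re


def flag_description(filename: str):
    """Return 'Flag of X' for a file named 'Flag of X.svg', otherwise None."""
    # File names may be written with underscores instead of spaces.
    name = filename.replace("_", " ").strip()
    m = re.fullmatch(r"(Flag of .+)\.svg", name, flags=re.IGNORECASE)
    return m.group(1) if m else None
```

For the example in the comment, flag_description("Flag of Germany.svg") gives "Flag of Germany"; files that do not follow the pattern give None and would be left for a human editor.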

Sharing regexes[edit]

I have been building a new AWB plugin-system thing, designed at making it possible to subscribe to blocks of regexes and to collaboratively work to improve them. I'm calling it FRONDS at the moment (a working title) and you can read all about it at Wikipedia:AutoWikiBrowser/Fronds. I'm designing it (in the broader sense of the term) with CheckWiki in mind. See what you think: expand that page, or put questions on the associated talk page. Cheers, - Jarry1250 (t, c) 19:23, 9 June 2009 (UTC)

(I edited the above to avoid WP:TLDR.) A beta's almost ready now, and it'd be nice to get some regular expressions in the system. I'll be adding mine today. - Jarry1250 (t, c) 10:07, 12 June 2009 (UTC)

No new updates for a few days[edit]

See [13]. I hope that that can be fixed easily enough. Anyway, there's quite a lot to work through here already. –Drilnoth (T • C • L) 15:56, 12 June 2009 (UTC)

# 81 Reference duplication[edit]

I went through this category and I did most of this one but there seemed to be a lot of false positives so the number still isn't at zero. You might want to rerun the list using your script and see what's left.--Kumioko (talk) 18:43, 3 June 2009 (UTC)

If you find a false positive, then please write the article title here so I can check it. -- sk (talk) 10:13, 4 June 2009 (UTC)
Ok I will do that next time I go through.--Kumioko (talk) 12:42, 4 June 2009 (UTC)
I couldn't find one for 81 but I just found one for 19. European grid is showing on #19 as having only 1 = for a section but the only rogue = I could find was related to a math calculation.--Kumioko (talk) 13:13, 4 June 2009 (UTC)
See this change. If my script finds a "=" at the first position of a new line, then it thinks it is a headline. Normally a "=" should not be at the first position. -- sk (talk) 20:26, 4 June 2009 (UTC)
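
The false positive reported for European grid follows directly from the rule sk states. A sketch of that rule next to a stricter one that also requires a matching closing run of "=" (both function names are hypothetical; the stricter rule is only a suggestion, not what the script does):

```python
import re


def naive_is_headline(line: str) -> bool:
    """The rule as described: any line starting with '=' counts as a headline."""
    return line.startswith("=")


# Stricter: the same number of "=" must also close the line around a title.
STRICT_HEADLINE_RE = re.compile(r"^(=+)[ \t]*[^=\s].*?[ \t]*\1[ \t]*$")


def strict_is_headline(line: str) -> bool:
    return STRICT_HEADLINE_RE.match(line) is not None
```

A continuation line of a calculation such as "= 2 * 50 Hz" is a headline under the first rule but not under the second, while "= Title =" and "== See also ==" are headlines under both. The stricter rule would miss a genuine headline whose closing "=" is missing, so it trades one kind of false report for another.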

Well it may be the convention to avoid 'reference duplication', but it's a great advantage for readers not to have to scroll down and back up again, especially where there are relevant quotations within the refs. Readers like me, that is, who might want to get an immediate idea of the field of reference that's being used in the article. If I wanted to buck the convention a) would you allow it and b) how would it be done manually? I.e., is the question about reference duplication simply a matter of duplicating the numbers, and can I place the references where I think is sensible but using one continuous number sequence? I hope this is the right place to put this message: if not, please redirect to the right place! Dungur (talk) 09:15, 16 June 2009 (UTC)

Add details to the different error sections[edit]

As this project grows and we add more and more users and errors I think it would be good if in each of the error sections we give a link (where possible or practical) to the reference in WP that identifies the format or error. I know what some of them are but before I start making a large change like that I wanted to mention it here first.--Kumioko (talk) 18:54, 3 June 2009 (UTC)

Agreed; I'll try to do this when I have the time. –Drilnoth (T • C • L) 15:39, 4 June 2009 (UTC)
Can we especially get the link for error 78 (reference list duplication)? After I took care of those, an editor said he'd put <references/> into individual sections of Shamanic music on purpose, to prevent readers from having to scroll to the bottom of the article to see the references. The version with multiple <references/> tags is here. I couldn't find anything that specifically says only one references tag is allowed. Most of the other cases were really errors (written by editors who didn't understand how it worked), but the editor may have a point about this one. What do y'all think? --Auntof6 (talk) 06:53, 16 June 2009 (UTC)

Adjacent references ?[edit]

Hi, what do you think of adding a detection for adjacent references, like <ref>...</ref><ref>...</ref><ref>...</ref> ? This error probably won't be of any interest for enwiki because reference numbers are put between square brackets [1][2][3]. But on frwiki reference numbers are displayed without any decoration so adjacent references may look like only one reference 123, so we're generally using a template {{,}} between references. --NicoV (Talk on frwiki) 22:14, 27 May 2014 (UTC)

NicoV, could you get me some articles with the problem as test subjects. <maniacal laugh> Test Subjects </maniacal laugh> I take it I need to look for cases of: </ref><ref> and <ref name=ack /><ref ? I also saw your message above about adding to the done pages. Bgwhite (talk) 05:29, 28 May 2014 (UTC)
Ok, will try to find some... The subject was brought on WPCleaner's talk page for this modification, but the page is fixed now. --NicoV (Talk on frwiki) 07:17, 28 May 2014 (UTC)
Bgwhite, I checked a lot of articles but I haven't found another example yet... --NicoV (Talk on frwiki) 12:16, 28 May 2014 (UTC)
fr:Utilisateur:Zetud/Pb Ref should have a list. --NicoV
Bgwhite, fr:Leetchi, with at least 2 problems in the introduction. --NicoV (Talk on frwiki) 07:34, 2 July 2014 (UTC)
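
The two patterns Bgwhite asks about (</ref><ref and a self-closing <ref name=... /> directly followed by <ref) can be covered by one expression. A minimal sketch with hypothetical names, not the checkwiki or WPCleaner implementation:

```python
import re

# End of one reference (closing tag or self-closing tag) immediately followed by
# the start of another. The \b keeps <references/> from being treated as a <ref>.
ADJACENT_REFS_RE = re.compile(r"(?:</ref>|<ref\b[^<>]*/>)(?=<ref\b)", re.IGNORECASE)


def count_adjacent_refs(wikitext: str) -> int:
    """Count the places where two references touch with nothing in between."""
    return len(ADJACENT_REFS_RE.findall(wikitext))
```

References separated by a template such as {{,}} are not counted, since the lookahead requires the next <ref to follow immediately. Whether white space between two references should also count is left open here.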

Stripping pre tags[edit]

Hi,

<pre> tags are stripped only if they have no additional attributes. In pl:dmesg there's a pre block:

<pre style="height:20em; overflow-y:scroll">...</pre>

It's not getting stripped by checkwiki.pl (get_pre() function) so false positives are reported (like #56 in that case). ToSter (talk) 18:51, 12 November 2014 (UTC)

ToSter, personally, I'd remove the entire pre text. I don't see the benefit of a boot screen from a 6-year old version of Linux. Bgwhite (talk) 23:35, 12 November 2014 (UTC)
Bgwhite, that's right :) but still the problem can occur in another place. ToSter (talk) 06:13, 13 November 2014 (UTC)
ToSter, it can, but it is not. Also, this is what the whitelist is for. Bgwhite (talk) 21:54, 14 November 2014 (UTC)
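
To illustrate ToSter's report: a pattern that only knows the bare <pre> tag leaves a block with attributes in place, while a pattern that allows attributes removes it. The names below are hypothetical and the code is a Python sketch of the idea, not a transcription of get_pre() in checkwiki.pl.

```python
import re

# Only the bare tag, as in the behaviour reported above.
PRE_PLAIN_RE = re.compile(r"<pre>.*?</pre>", re.DOTALL | re.IGNORECASE)

# Bare tag or tag with attributes; "<preview>" and similar are not matched.
PRE_WITH_ATTRS_RE = re.compile(
    r"<pre(?:\s[^<>]*)?>.*?</pre\s*>", re.DOTALL | re.IGNORECASE
)


def strip_pre(text: str, allow_attributes: bool = True) -> str:
    """Remove pre blocks so that their content is ignored by later checks."""
    pattern = PRE_WITH_ATTRS_RE if allow_attributes else PRE_PLAIN_RE
    return pattern.sub("", text)
```

With allow_attributes=False the block from pl:dmesg, which opens with <pre style="height:20em; overflow-y:scroll">, stays in the text and can trigger other checks; with the default it is removed like any other pre block.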

Add field with user edit[edit]

Hi, apologies for writing in Spanish. Could an additional field be added that shows the user name or IP address that made the edit in which the error was detected? Thanks, good work. Sergio Andres Segovia (talk) 16:59, 30 November 2014 (UTC)

Seems possible but would require a lot of processing to find the particular edit. Frietjes (talk) 17:58, 30 November 2014 (UTC)
It's a shame that it would require a lot of processing, because if that field were added we could go directly to the contributions of the user or IP, and anyone with the rollback flag could revert the edits from there. On Spanish Wikipedia we tried to detect these with an edit filter, but it produced many false positives [14]. Regards, Sergio Andres Segovia (talk) 19:05, 30 November 2014 (UTC)
I agree that it would be useful. You might be able to get a bot to do this for you? for example, I know that some bots like 'BracketBot' will warn you when you have introduced unbalanced brackets. of course, there is a difference between warning a user about 'breaking an article' and warning a user about using deprecated syntax. maybe you can ask the operator of BracketBot (A930913)? Frietjes (talk) 17:19, 1 December 2014 (UTC)
There is also Bracketbot's brother, ReferenceBot. Both are done by A930913. The two main differences between BracketBot and CheckWiki are: 1) Bracketbot checks articles in near real-time 2) Bracketbot informs the editor of the problem they created instead of reporting the error to a master database. In theory, CheckWiki can also be run in near real-time on individual articles. I would need help from A930913. His bot code would run normally except call CheckWiki to test an article instead of using the bot's checks. Bgwhite (talk) 19:53, 1 December 2014 (UTC)
@Bgwhite: Make a (web)script that I can ping with a pageid/title/diffid/oldid/user? ##930913 connect? 930913 {{ping}} 07:33, 2 December 2014 (UTC)

#14 false positives[edit]

Resolved

Two false positives at plwiki are reported. To remove such cases, you might check only for "<source ", not "<source", and skip code which is in a <source> by itself. ToSter (talk) 19:20, 21 October 2014 (UTC)

The solution is again "<source[^a-z]". -- Magioladitis (talk) 06:13, 22 October 2014 (UTC)

Magioladitis, Error #14 doesn't use a regex. It uses the same subroutine used for checking imbalanced nowiki, pre, comment, syntaxhighlight, code, math, hiero, and score. The regex also doesn't solve the problem with the articles ToSter mentioned. The problem with the articles... there are valid, unbalanced source tags inside source tags.
The following scenario occurs in ToSter's articles, where the second source is not an HTML source tag.
<source> [text] <source> [text] </source>
Problem is... how does one differentiate between ToSter's scenario and a scenario where the first <source> tag is actually missing a closing tag, especially when editors don't always put extra parameters inside source tags. Bgwhite (talk) 07:26, 22 October 2014 (UTC)
We also have false positives on frwiki, which doesn't seem to fall into the above category:
  • fr:Apache Ant: a <sourcePath> tag is detected as being a <source> tag
  • fr:Vidéo HTML5: there are 3 self-closing <source /> tags inside a <syntaxhighlight> tag. The third one is reported.
--NicoV (Talk on frwiki) 13:41, 25 October 2014 (UTC)
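
On the <sourcePath> case: Bgwhite notes that #14 does not use a regex, so the following is only a sketch of the matching rule, with hypothetical names. One observation that can be checked directly is that "<source[^a-z]", applied case-sensitively, still matches "<sourcePath>", because the capital P is not in a-z. Requiring white space, "/" or ">" after the tag name avoids that.

```python
import re

# The tag name must be followed by white space, "/" or ">".
SOURCE_OPEN_RE = re.compile(r"<source(?=[\s/>])", re.IGNORECASE)


def source_open_positions(text: str) -> list:
    """Return the offsets at which a real <source ...> opening tag starts."""
    return [m.start() for m in SOURCE_OPEN_RE.finditer(text)]
```

This does nothing for the other two situations in this thread: an unbalanced <source> that is itself quoted inside a source block, and self-closing <source /> tags inside <syntaxhighlight>. Those depend on the order in which blocks are stripped, not on how the tag name is matched.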

Bgwhite is this fixed somehow? I haven't seen any false positives for a long time. -- Magioladitis (talk) 08:33, 24 January 2015 (UTC)

Whitelist (sv.wikipedia.org) for error #34[edit]

Can I get instances such as {{#expr:{{Stat/Finland/Kommuner/Befolkning|Föglö}}/{{Stat/Finland/Kommuner/Areal land|Föglö}} round 2}} whitelisted on svwp? This is used to automatically update population numbers in articles such as sv:Föglö. I don't know how to do it, or what to do... (tJosve05a (c) 18:57, 23 February 2015 (UTC)

Josve05a This can be handled two ways, depending on how many articles you are talking about. If it is not "a lot", then add the articles to a whitelist. If there are a lot, then I can add it to the code.
If you are using a whitelist, look at Wikipedia:WikiProject Check Wikipedia/Translation and see how it is done for enwiki (search for "whitelist"). #34 on enwiki does have a whitelist. Frwiki also has whitelists set up. Bgwhite (talk) 19:26, 23 February 2015 (UTC)
@Bgwhite: This should be used a lot, at least for all populated places in Sweden with population numbers at Statistiska centralbyrån, since a bot updates those automatically. Not all articles are using this system yet, but more and more are. (tJosve05a (c) 19:29, 23 February 2015 (UTC)
I've also seen constructions like that on frwiki, but I don't like having calculations in articles. Another solution would be to use a template to do the computation instead of putting the #expr directly in the article. --NicoV (Talk on frwiki) 19:47, 23 February 2015 (UTC)

False positive in Error #85[edit]

In ca:Brainfuck, there are <code> tags between <center> tags. However, the tool is flagging it as if it were empty.

--Joutbis (talk) 11:20, 22 February 2015 (UTC)

Same kind of false positive for frwiki: fr:Messiah with <score> tags between <center> tags, and fr:Tiret with <code> tags between <center> tags. --NicoV (Talk on frwiki) 22:27, 22 February 2015 (UTC)
Joutbis NicoV See discussion two above this one... Anything between comment, math, nowiki, code, pre, source, hiero and score tags gets removed before checks take place.
All right, will do.--Joutbis (talk) 19:50, 28 February 2015 (UTC)
In ca:Brainfuck's case, <center> tags are not to be used like that in tables. This is a case of doing center properly. I did edit Brainfuck to do tables properly. fr:Tiret has the same problem, well actually, it is full of fail (scope="col" is redundant, <font> and <tt> are obsolete).
In the case of fr:Messiah, if it was on enwiki, I'd use the {{center}} template. That does the proper thing anyway instead of using the obsolete <center> tag. Bgwhite (talk) 06:38, 23 February 2015 (UTC)

False positive on error #37[edit]

de:Bělá (Divoká Orlice) is stated as not bearing a sort key, but in fact this key is given with the parameter SORTNAME in template de:Vorlage:Infobox Fluss. Adding a defaultsort parameter to the article itself results in a warning message that the previous sort key has been overwritten. So the sort key seems to be valid. I don’t know if there are other templates affected which are listed in de:Kategorie:Vorlage:mit Kategorisierung. --Hadibe (talk) 17:29, 28 February 2015 (UTC)

@Magioladitis: Hadibe, grrrr.... sort values shouldn't be in Infoboxes. Magioladitis is the one to ask about this. Bgwhite (talk) 05:35, 3 March 2015 (UTC)

List with empty title[edit]

Hi Bgwhite, at least on frwiki list for error #25, there was only one line, and it contains an empty title and time found 0000-00-00 00:00:00. The "Done" button does nothing. The "Set all articles as done" works, and the empty title appears now in the list of done articles. --NicoV (Talk on frwiki) 19:41, 10 March 2015 (UTC)

Same for error 59, but I left it as it is. --NicoV (Talk on frwiki) 19:43, 10 March 2015 (UTC)
Same for error 85. --NicoV (Talk on frwiki) 20:10, 10 March 2015 (UTC)

False positive for #31[edit]

Hi Bgwhite, on frwiki there are several false positives with things like <trl>, <trois>, <trk>, <transformers, <transmission, <traduction, <track>, ... Would it be possible to limit the detection ? For example, detect only <tr when followed by a space, a "/", a ">", ... but not by a letter ? --NicoV (Talk on frwiki) 09:12, 11 March 2015 (UTC)

NicoV, I'll take a look, but it will be a couple of weeks till I can get to it. Bgwhite (talk) 05:10, 16 March 2015 (UTC)
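
NicoV's proposal (accept <tr only when it is followed by a space, "/" or ">", never by a letter) can be written as a small helper that takes the tag name as a parameter. The helper name is hypothetical and this is a sketch of the proposed rule, not the existing #31 check.

```python
import re


def has_html_tag(text: str, tag: str) -> bool:
    """True if text contains '<tag' followed by white space, '/' or '>'."""
    pattern = r"<" + re.escape(tag) + r"(?=[\s/>])"
    return re.search(pattern, text, re.IGNORECASE) is not None


# The frwiki examples listed above, none of which is a table row.
FALSE_POSITIVES = ["<trl>", "<trois>", "<trk>", "<transformers",
                   "<transmission", "<traduction", "<track>"]

for sample in FALSE_POSITIVES:
    print(sample, has_html_tag(sample, "tr"))
```

Each of the listed examples is rejected, while <tr>, <tr /> and <tr class="x"> are still detected.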

New error : empty titles ?[edit]

Hi, I was thinking about a new error for detecting empty titles, like the ones VE is creating on a regular basis (== <nowiki /> ==). --NicoV (Talk on frwiki) 18:10, 13 August 2014 (UTC)

NicoV, I did a scan for enwiki and came up with 83 articles. The VE edits all appear old. I wonder if they have fixed the problem in new VE builds? Bgwhite (talk) 22:31, 22 August 2014 (UTC)
Bgwhite, apparently it's still not fixed, the last VE edit I found with this problem is from last night. --NicoV (Talk on frwiki) 09:10, 23 August 2014 (UTC)
Thanks for the list Bgwhite, I've added error #522 to detect empty titles and fixed all the occurrences. --NicoV (Talk on frwiki) 12:21, 24 August 2014 (UTC)
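
A sketch of how such empty titles could be found in wikitext. The regex and function name are assumptions for illustration; this is not the implementation of error #522. A title counts as empty here when nothing but blanks and <nowiki /> tags sits between the "=" markers.

```python
import re

# Same run of "=" on both sides, with only blanks and <nowiki/> tags in between.
EMPTY_TITLE_RE = re.compile(
    r"^(={1,6})[ \t]*(?:<nowiki\s*/>[ \t]*)*\1[ \t]*$",
    re.MULTILINE | re.IGNORECASE,
)


def empty_titles(wikitext: str) -> list:
    """Return the heading lines whose title is empty."""
    return [m.group(0) for m in EMPTY_TITLE_RE.finditer(wikitext)]
```

On a page containing "== <nowiki /> ==" and "== History ==", only the first line is returned.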

And also, of the same kind, a new error for empty internal links, like in this edit ([[Boom Fm|<nowiki/>]] and [[Roger Blackburn|<nowiki/>]]). --NicoV (Talk on frwiki) 10:11, 23 August 2014 (UTC)

This was fixed in Visual Editor. No new cases have been found over the past few months. Bgwhite (talk) 23:51, 28 January 2015 (UTC)
Bgwhite, not at all. A few examples just in the last 24h (nowiki tags):
And maybe another problem with things like that: [[XX|YY ]]<nowiki/>ZZ which could be easily replaced by [[XX|YY]] ZZ
--NicoV (Talk on frwiki) 13:02, 29 January 2015 (UTC)
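
The replacement NicoV suggests, [[XX|YY ]]<nowiki/>ZZ to [[XX|YY]] ZZ, could look like this as a regular expression. The names are hypothetical and the pattern is deliberately narrow: it only touches piped links whose label ends in spaces and that are directly followed by a <nowiki/> tag.

```python
import re

# [[target|label<spaces>]]<nowiki/>  ->  [[target|label]]<space>
TRAILING_SPACE_LINK_RE = re.compile(
    r"\[\[([^\[\]|]+)\|([^\[\]|]*?\S) +\]\]<nowiki\s*/>"
)


def fix_trailing_space_links(wikitext: str) -> str:
    """Move the trailing space out of the link label and drop the nowiki tag."""
    return TRAILING_SPACE_LINK_RE.sub(r"[[\1|\2]] ", wikitext)
```

Links without the trailing space, or without the <nowiki/> tag, are left unchanged.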

@NicoV and Magioladitis: According to Tech News: 2015-14, the problem of nowiki in titles has been fixed. Of course, what new untold problems have arisen due to their fix has yet to be seen. Bgwhite (talk) 20:28, 30 March 2015 (UTC)

Well, when you read in the same announcement that "VisualEditor is now the main editing tool on 53 more Wikipedias", you can't take it really seriously as even on wikis where it has been enabled by default for almost 2 years, it's still far away from being the "main editing tool"... ;) --NicoV (Talk on frwiki) 20:56, 30 March 2015 (UTC)
NicoV When I read that sentence for the first time, thoughts of dread and pity for those 53 sites went thru my mind. I also wondered what "phase 5" meant. mw:VisualEditor/Rollouts explains what each phase means. They have enwiki as a phase 0, which is, "... wikipedias that have been closed or deprecated". Ahhh, Visual Editor... always good for a laugh and a cry. Bgwhite (talk) 21:25, 30 March 2015 (UTC)
You didn't know ? When enwiki made its push to make VE opt-in, they closed enwiki ;) Currently, we're not editing enwiki, it's a decoy... In the rollouts, I also liked very much the sentence that wikis in phases 1 to 4 "are relatively easy for VE to support"... --NicoV (Talk on frwiki) 21:35, 30 March 2015 (UTC)
VE is supporting phases 1 thru 4 very well. It's rather obvious. From day one, VE has supported goofs, foul-ups, mistakes and barfs. Bgwhite (talk) 21:45, 30 March 2015 (UTC)
I don't know if they deployed it, but empty titles are still created like here (without nowiki tag). --NicoV (Talk on frwiki) 16:48, 1 April 2015 (UTC)
still nowiki in titles..., and frwiki is running 1.25wmf23, the version identified as fixing the problem in all bug reports... --NicoV (Talk on frwiki) 15:54, 2 April 2015 (UTC)

False positive for #60 ?[edit]

Hi, I don't understand why fr:Liste des commandes et des livraisons de l'Airbus A380 keeps getting reported again and again. It seems that the following part is reported: {{#tag:ref|Singapore Airlines a commandé l'A380 en trois versions différentes, dont deux sont opérées : * 01 = 471 places{{#tag:ref|{{Lien web|url=http://www.seatguru.com/airlines/Singapore_Air/Singapore_Air_Airbus_A380.php|titre= Singapore Airlines Seat Maps (V1)|éditeur=http://www.seatguru.com}}|name=SIA_A}}, * 02 = 411 places{{#tag:ref|{{Lien web|url=http://www.seatguru.com/airlines/Singapore_Air/Singapore_Air_Airbus_A380_B.php|titre= Singapore Airlines Seat Maps (V2)|éditeur=http://www.seatguru.com}}|name=SIA_B}}, * 03 : ''configuration non encore connue'' |group=Note|name=SIAVersions}}

But in the notice, you have #tag:ref, Singapore Airlines a commandé l'A380 en trois versions différentes, dont deux sont opérées: note the comma after tag:ref instead of the actual pipe.

There is a similar construct before (Emirates) but it's not reported. --NicoV (Talk on frwiki) 11:36, 12 April 2015 (UTC)

A few questions, and silly requests[edit]

Hi from it.wiki; a few random things:

  1. Request: in the web interface the "more" link should be sortable, displaying how many errors there are ("1 more", "2 more" and so on); or at least don't display "more" if there are no other errors; I'd love this so much :D
  2. Question: I'm testing two whitelists on it.wiki; do I have to wait for the new dump to see those articles removed from the web interface?
  3. Request: it should be possible to whitelist a single ISBN instead of articles; in it.wiki we have a parameter |ignoraisbn= inside the citation templates (doc here, it's part of the Lua module); article it:Jordan 195 has a wrong ISBN and it's not on our error lists[16]; but I don't know the details about this: is this "ignore" parameter working on every wiki due to the Lua module? Can we always use this instead of whitelists?
  4. Gadget proposal: when logged in to Wikipedia, on Special:Watchlist there should be something like "Show errors in my watchlisted articles", redirecting to our interface with a list of errors found; it should work like clicking on "more" for every article. I can do it already using the url, one article at a time. Similar gadgets can be done for a category, etc.
  5. Error 39, en.wiki translation page: " Due to a Wikimedia bug</a> ": is there a missing url?
  6. Minor bug: in the interface, after clicking on "more", the "list" link is broken.
  7. Suggestion: in the translation, it'd be better to use &nbsp; for spaces inside the examples proposed, at least where spaces are the problem.

Sorry if I'm wasting your time, and thanks for maintaining this wonderful project! --Vittorioo (talk) 23:08, 23 April 2015 (UTC)

Vittorioo There are no silly requests, but I may give silly answers. :)
  1. Good question. I'll look into it.
  2. The whitelist is updated at 0z every day. Unfortunately, itwiki is only updated twice a month. From what you've already done, it looks good. I'll check it (my) tomorrow to see if the whitelists work just fine.
  3. It's not possible. enwiki has a similar parameter to ignoraisbn. Checkwiki is not checking ISBNs inside any cite template. The Lua module already checks for bad ISBNs. On enwiki, the errors are located at Category:Pages with ISBN errors. Checkwiki is only checking ISBNs that are not inside a cite type template.
  4. I haven't a clue when it comes to Gadgets. Gadgets are written in Javascript, a language I've never dealt with.
  5. I've removed the <a> tag. There was another Mediawiki bug that prevented newlines from being used in <blockquote> and several quote templates, thus <p> had to be used there. Those were fixed and the <a> tag was related to that.
  6. Will fix.
  7. Could you give me an example?
You are not wasting my time. Any suggestions or questions are always welcome. Bgwhite (talk) 23:49, 23 April 2015 (UTC)
7) For example on error 22, the [[Category : ABC]] and the like; but it's just me splitting windows; I've put non-breaking spaces everywhere :D Thanks again, will report on it.wiki --Vittorioo (talk) 00:05, 24 April 2015 (UTC)
6) Fixed. -- Magioladitis (talk) 07:35, 8 May 2015 (UTC)
7) I've added a few myself [17]; regarding 1) and 4): I've found a way to use WPCleaner to find articles with multiple errors or to scan my watchlisted articles, which is much the same as what I asked for, so don't waste time on them. Even that bug in 6) is really not essential; just archive all of this. Thanks. --Vittorioo (talk) 12:43, 28 May 2015 (UTC)

Error 82 confusion and new error 104

Hi from it.wiki again. I have problems with error 82 "Link to other wikiproject"; it's active in es.wiki too.

  • A) Please exclude the "Wikipedia:" namespace from being detected by the script when it's checking a xx.wiki.
  • B) Redirects to en.wiki articles written like [[w:en:Article]] or [[:w:Article]] etc.:
    • from xx.wiki point of view they belong to error #68 "Link to other language";
    • from en.wiki point of view they are internal links badly written, together with [[:en:Article]] and the like: I propose to transfer them to a new error #104;
  • C) There are redirects to en.wiki articles using Meta or Mediawiki mixed syntax like [[m:en:Article]] or [[:en:mw:w:Article]] or [[meta:w:Article]] etc., and the script is not handling them correctly:
    • from xx.wiki point of view they belong to error #68 "Link to other language";
    • from en.wiki point of view they belong to the error #104 I've proposed in point B above;
    • they belong to error #82 only when the script is checking Commons or other sister projects.

A simpler fix to the script would be to rename error 82 to something like "Links with mixed MediaWiki syntax" and heavily expand its description. But in this case you have to be sure that all the above cases and variants are checked.
Sorry for the headache and thanks again. --Vittorioo (talk) 20:02, 1 May 2015 (UTC) Edit: added "at least" in point C) + some minor fixes --Vittorioo (talk) 10:07, 2 May 2015 (UTC) PS: I've rewritten and simplified my proposal. --Vittorioo (talk) 20:49, 26 May 2015 (UTC)

Vittorioo Sorry for ignoring you. I've been sick this past week. When I do edit, I'm trying just to keep up with fixing enwiki checkwiki errors. I'll get back to answering you next week. I've got an in-law gathering this weekend... so I'll probably be really nauseated for awhile. Bgwhite (talk) 21:44, 1 May 2015 (UTC)
Magioladitis Last month, I did the 2nd fewest edits in over four years and March was the 3rd fewest. Besides being sick the past two months, I wonder what else happened..... Bgwhite (talk) 00:01, 2 May 2015 (UTC)
Real life first of course! We have hundreds of years ahead to fix wiki. :D Take care. --Vittorioo (talk) 10:07, 2 May 2015 (UTC)
Pull request with a partial fix, basically an update for the list of projects: we are missing "species" because it's written "speciesi" in the script; also missing "voy" and many others. This is the list I've proposed when the script is checking a xx.wiki (that is, not Commons or other projects): b: c: d: n: q: s: species: v: voy: wikt: m: mw: meta: metawiki: metawikipedia: mediawikiwiki: commons: wikibooks: wikidata: wikinews: wikiquote: wikisource: wikispecies: wiktionary: wikivoyage: wikiversity: phabricator: wikitech: toollabs: testwiki: test2wiki: testwikidata: wmf: foundation: wikimedia: wmania: incubator: outreach:. There are more, but those are less used: see Help:Interwikimedia links. I've also proposed to add zh and bn language codes instead of fl (which doesn't exist) and gv (too small wiki). --Vittorioo (talk) 20:49, 26 May 2015 (UTC)

Vittorioo I've updated the program with your changes. Bgwhite (talk) 18:01, 27 May 2015 (UTC)

Slightly better. I've read your commit and you are still listing the "fl" language code: as I said, it doesn't exist; also still listing "meta-wiki:" and "labs:": they don't exist; re-read the list above please for the updated interwikimedia links: still missing c: for Commons and d: for Wikidata, etc. Also, is it really impossible to consider "Wikipedia:" a namespace, removing it from the list of projects? As I said in the pull request, error #82 is not active in our sister projects, so it's quite safe to remove "Wikipedia:" for now. I'd leave "w:" for a future fix, since that one is more complex to handle. I really appreciate your work, keep going. --Vittorioo (talk) 19:21, 27 May 2015 (UTC) Edited for grammar. --Vittorioo (talk) 12:43, 28 May 2015 (UTC)
With last week's commit it looks much better now. Thank you! --Vittorioo (talk) 10:03, 2 July 2015 (UTC)
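
For readers following the prefix discussion, here is a minimal sketch of how a prefix list like the one proposed above could be applied. It is written in Python purely for illustration and is not the Checkwiki code; the names PROJECT_PREFIXES and links_to_other_projects are hypothetical, and the set holds only a subset of the prefixes listed in this thread. "Wikipedia" and "w" are left out on purpose, following points A and B.

```python
import re

# Illustrative subset of the interwikimedia prefixes listed in the thread.
# "wikipedia" and "w" are deliberately absent (treated as namespace / future fix).
PROJECT_PREFIXES = {
    "b", "c", "d", "n", "q", "s", "species", "v", "voy", "wikt",
    "m", "mw", "meta", "commons", "wikibooks", "wikidata", "wikinews",
    "wikiquote", "wikisource", "wikispecies", "wiktionary", "wikivoyage",
    "wikiversity",
}

# Captures the first prefix of a wikilink, allowing an optional leading colon.
LINK_PREFIX = re.compile(r"\[\[:?\s*([^\[\]|:]+):")


def links_to_other_projects(wikitext):
    """Return the sister-project prefixes used by wikilinks in the text."""
    hits = []
    for match in LINK_PREFIX.finditer(wikitext):
        prefix = match.group(1).strip().lower()
        if prefix in PROJECT_PREFIXES:
            hits.append(prefix)
    return hits
```

Because the lookup is an exact match, a misspelled entry such as "speciesi" in the list would silently stop "species:" links from being reported, which is the kind of miss described above.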

Error 4 false positive

Error 4 (HTML-tag <a>) matches <a throne dais>, which is wrong. Fix: add href= to the regexp. 92.242.90.153 (talk) 23:17, 24 July 2015 (UTC)
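
A minimal sketch of the suggested fix, in Python for illustration only. Both patterns are assumptions, not the regexps Checkwiki actually uses: the loose one fires on any "<a" followed by whitespace, while the strict one also requires an href attribute inside the same tag.

```python
import re

# Hypothetical patterns for illustration; not Checkwiki's real regexps.
LOOSE_A_TAG = re.compile(r"<a\s", re.IGNORECASE)
STRICT_A_TAG = re.compile(r"<a\s[^<>]*\bhref\s*=", re.IGNORECASE)


def has_html_anchor(text, strict=True):
    """Report whether the text contains something that looks like an <a> tag."""
    pattern = STRICT_A_TAG if strict else LOOSE_A_TAG
    return pattern.search(text) is not None
```

With strict=False the text "<a throne dais>" is reported; with the href requirement it is not, while a real anchor tag with an href is still caught.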

Invalid color tracking

might not be feasible, but may be interesting to (1) parse an article, (2) grab any css style statements, (3) parse the background/foreground colors and compute the contrast ratio, (4) flag articles with really bad ratios. for the parsing part of the style statement, we have code in Module:Color contrast. for related discussion see Template talk:Episode list#Invalid color tracking category. of course, it would be pointless if there is no one interested in fixing them, but an idea for helping those of us with (partial) colour blindness. Frietjes (talk) 16:16, 1 August 2015 (UTC)
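
The contrast-ratio step can be sketched as follows. This is an independent Python illustration using the WCAG 2.x relative-luminance and contrast-ratio formulas, not the code in Module:Color contrast. It assumes colours have already been extracted as six-digit hex values; the function names and the default threshold of 4.5 (the WCAG AA level for normal text) are choices made for the example.

```python
import re

HEX_COLOR = re.compile(r"^#([0-9a-fA-F]{6})$")


def relative_luminance(hex_color):
    """WCAG 2.x relative luminance of a '#rrggbb' colour."""
    match = HEX_COLOR.match(hex_color.strip())
    if match is None:
        raise ValueError("expected a colour like '#rrggbb': %r" % hex_color)
    digits = match.group(1)
    channels = [int(digits[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding, channel by channel.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    red, green, blue = linear
    return 0.2126 * red + 0.7152 * green + 0.0722 * blue


def contrast_ratio(foreground, background):
    """Contrast ratio between two colours, from 1.0 up to about 21.0."""
    first = relative_luminance(foreground)
    second = relative_luminance(background)
    lighter, darker = max(first, second), min(first, second)
    return (lighter + 0.05) / (darker + 0.05)


def has_poor_contrast(foreground, background, threshold=4.5):
    """Flag a colour pair whose ratio falls below the chosen threshold."""
    return contrast_ratio(foreground, background) < threshold
```

Named colours, three-digit hex values and rgb() notation are not handled here; those would need the style-parsing step mentioned above.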

False positives for error #103

Hi, I think that the script is not doing what user NicoV requested:

it should not detect articles where {{!}} is used in the displayed text of the link.

There are a few examples in cawiki: many train stations, like ca:Estació de Bogatell or ca:Llista de cançons del DJ Hero 2 (this one took several tries to fix error #32, and now it's back!).

I'm not sure if this would cover all the false positives, but I think that, if a | is already there, it should allow several {{!}}'s.

--Joutbis (talk) 09:35, 13 March 2015 (UTC)

Joutbis Yes, those are false positives. The DJ Hero 2 article contains M|A|R|R|S, which is also in several English articles too. I've seen errors that also had | inside a wikilink. I'd say add it to the whitelist for now and I'll take a look at it. I've been gone for the best part of 2 weeks, so I need to catch up on things first before diving into the code. Bgwhite (talk) 05:07, 16 March 2015 (UTC)
Bgwhite I fixed M|A|R|R|S using {{pipe}}. -- Magioladitis (talk) 20:08, 16 March 2015 (UTC)
Good idea, thanks! --Joutbis (talk) 19:54, 17 March 2015 (UTC)

Joutbis I created the template in Catalan Wikipedia! 10 -- Magioladitis (talk) 20:10, 17 March 2015 (UTC)

Magioladitis, thanks! However, it doesn't work 100% of the time. It's OK for the M|A|R|R|S thing, and for train stations, but not in some (brain-damaged, granted) templates, which wrap square brackets around some of the parameters. See ca:Papa Bonifaci II, at the end.--Joutbis (talk) 16:27, 11 April 2015 (UTC)

Joutbis Hm... I can't fight with that.. -- Magioladitis (talk) 16:30, 1 August 2015 (UTC)
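
The rule proposed in this thread (allow {{!}} once a plain | is already present in the link) could look roughly like this. It is a Python sketch for illustration, not the script's actual logic; the function name is hypothetical, and nested links or templates that add their own square brackets are not handled.

```python
import re

# Matches a simple, non-nested wikilink and captures its inner text.
WIKILINK = re.compile(r"\[\[([^\[\]]*)\]\]")
PIPE_TEMPLATE = "{{!}}"


def misplaced_pipe_templates(wikitext):
    """Return links where {{!}} appears before any plain pipe."""
    flagged = []
    for match in WIKILINK.finditer(wikitext):
        inner = match.group(1)
        template_pos = inner.find(PIPE_TEMPLATE)
        if template_pos == -1:
            continue
        plain_pipe_pos = inner.find("|")
        # {{!}} in the displayed text (after a real pipe) is accepted.
        if plain_pipe_pos == -1 or template_pos < plain_pipe_pos:
            flagged.append(match.group(0))
    return flagged
```

Under this rule a link whose label is M{{!}}A{{!}}R{{!}}R{{!}}S after a normal pipe is not reported, while a link that uses {{!}} in place of the separator still is.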

Error n°54 false positive

Yuri (genre) is a false positive. The break is in a reference. Jerodlycett (talk) 13:36, 30 March 2015 (UTC)

Same mistake in de:Hilfsfrist. Is there any other way to avoid collecting these articles in WPSK than to separate the ref group entries? --Hadibe (talk) 10:47, 28 October 2015 (UTC)

Proposed error detection

I noticed that some file-delinker bots (or even users), when removing an image name, leave incorrect syntax like [[File:|thumb|]]. Also, in image galleries I noticed that the image title was removed, but the caption remained (after the pipe). It might be useful to detect these errors also. --XXN, 00:34, 20 November 2015 (UTC)

The example you provided looks like Double pipe in a link. Matěj Suchánek (talk) 14:34, 25 November 2015 (UTC)
Only in this particular example. But it can also be like: [[File:|thumb]] or [[File:|caption here]] or [[File:|some_size_px]] etc. --XXN, 13:33, 26 November 2015 (UTC)
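
A minimal sketch of the proposed detection, in Python for illustration. The pattern and the function name are assumptions; only the English "File" and "Image" namespace names are covered, so localized aliases would have to be added per wiki, and the gallery case (caption left without a file name) is not handled.

```python
import re

# A file link whose name is empty: the colon is followed directly
# (apart from whitespace) by a pipe or by the closing brackets.
EMPTY_FILE_LINK = re.compile(
    r"\[\[\s*(?:File|Image)\s*:\s*(?:\||\]\])", re.IGNORECASE
)


def find_empty_file_links(wikitext):
    """Return the start offsets of file links that have no file name."""
    return [match.start() for match in EMPTY_FILE_LINK.finditer(wikitext)]
```

This covers all the variants quoted above, whatever follows the first pipe.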

Category namespace

When the migration started, I asked about including some more namespaces. Detecting stuff like this [18] would be awesome. Matěj Suchánek (talk) 11:02, 12 December 2015 (UTC)

Wrong quotes

See this edit. Don't know how wide the problem is, but maybe it's worth including in Checkwiki? --Edgars2007 (talk/contribs) 09:58, 26 November 2015 (UTC)

For example, such wikisearch insource:/(style|class|colspan|class|rowspan|align)\s?\=\s?[”“]/i gives 200+ results at enwiki. Regex of course could be improved, as there may be cases, when opening brackets are correct, but closing ones are not. --Edgars2007 (talk/contribs) 11:20, 12 December 2015 (UTC)
And if I already started... Articles are using also style="text-align:centre;", which, of course, doesn't work. --Edgars2007 (talk/contribs) 11:36, 12 December 2015 (UTC)
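
The two checks mentioned in this thread can be sketched in Python as below. The attribute list follows the search expression quoted above; the function names are hypothetical, and this is an illustration rather than a proposed Checkwiki implementation. As noted, it only catches a curly quote directly after the equals sign, not a correct opening quote paired with a curly closing one.

```python
import re

ATTRIBUTES = ("style", "class", "colspan", "rowspan", "align")

# An attribute name followed by "=" and a curly double quote.
CURLY_QUOTED_ATTR = re.compile(
    r"\b(?:" + "|".join(ATTRIBUTES) + r")\s?=\s?[“”]", re.IGNORECASE
)

# British spelling that CSS does not accept as a text-align value.
CENTRE_ALIGNMENT = re.compile(r"text-align\s*:\s*centre", re.IGNORECASE)


def has_curly_quoted_attribute(wikitext):
    """True if an attribute value starts with a curly quote."""
    return CURLY_QUOTED_ATTR.search(wikitext) is not None


def has_centre_alignment(wikitext):
    """True if the text uses the invalid value text-align:centre."""
    return CENTRE_ALIGNMENT.search(wikitext) is not None
```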

File URLs

For some reason people cite files from their own hard drive. At least 58 articles when searching for file://c:/Users. Can we get these flagged? — Dispenser 21:42, 13 December 2015 (UTC)

Dispenser I can't remember when, maybe ~10-12 months ago, I did a scan for this problem and |image = http:// in infoboxes. If I remember right, there were different combinations of the problem. I'll look into it. I need to see what the other language Wikipedias are like. For example, do they use file or another word. Bgwhite (talk) 05:47, 14 December 2015 (UTC)
It's standardized, see file URI scheme. — Dispenser 13:03, 14 December 2015 (UTC)
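
Since the file: scheme is the same in every language, a single pattern could flag these. The following Python sketch is illustrative only; the pattern and function name are assumptions, and the match simply stops at whitespace or at characters that usually end a URL in wikitext.

```python
import re

# "file://" followed by anything up to whitespace or a wikitext delimiter.
LOCAL_FILE_URL = re.compile(r"\bfile://[^\s|\]}<>]*", re.IGNORECASE)


def find_local_file_urls(wikitext):
    """Return every file:// URL found in the text."""
    return [match.group(0) for match in LOCAL_FILE_URL.finditer(wikitext)]
```

Ordinary [[File:...]] links are not affected, because the namespace prefix is not followed by two slashes.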

ISBN with invalid syntax missed by #69

Bgwhite Apparently, #69 doesn't catch the invalid syntax in Donald Strachey, like (isbn = 1-55583-387-X). Before fixing them, I tried checkarticle.cgi and nothing was detected. --NicoV (Talk on frwiki) 16:25, 18 December 2015 (UTC)

NicoV Correct, Checkwiki doesn't detect these. The main reason is the use of isbn= inside cite and infobox templates. Bgwhite (talk) 22:15, 21 December 2015 (UTC)
Bgwhite Could it be modified so that they are reported when they are not inside a template ? --NicoV (Talk on frwiki) 07:23, 22 December 2015 (UTC)
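
One possible way to report isbn= only outside templates is to track the nesting depth of double braces while scanning. This is a Python sketch under simplifying assumptions, not how Checkwiki works: the function name is hypothetical, and comments, nowiki sections and triple-brace parameters are ignored.

```python
import re

# Either a template delimiter or an "isbn = <digits>" assignment.
TOKEN = re.compile(r"\{\{|\}\}|\bisbn\s*=\s*[0-9Xx-]+", re.IGNORECASE)


def isbn_assignments_outside_templates(wikitext):
    """Return 'isbn = ...' fragments that are not inside any {{...}}."""
    depth = 0
    found = []
    for match in TOKEN.finditer(wikitext):
        token = match.group(0)
        if token == "{{":
            depth += 1
        elif token == "}}":
            depth = max(depth - 1, 0)
        elif depth == 0:
            found.append(token)
    return found
```

On the Donald Strachey example, "(isbn = 1-55583-387-X)" in running text is reported, while the same parameter inside a cite or infobox template is skipped.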

No title displayed

Today I noticed an error report which has no title displayed (it's in the first row, using search I found it should be Bisabolol in Czech Wikipedia). The timestamp there is also strange. Matěj Suchánek (talk) 12:15, 13 December 2015 (UTC)

Matěj Suchánek, I've seen that on enwiki and fixed one issue that caused most of the problems. It was related to dump files. I haven't been able to find the cause for the remaining problem. Bgwhite (talk) 05:36, 14 December 2015 (UTC)
I'm seeing this from time to time on frwiki, reported a few sections above. If you need, I can report when I see it. --NicoV (Talk on frwiki) 06:18, 14 December 2015 (UTC)
That would be good if you and Matěj could report them. It would help if the article was found via the dump or daily scan. Bgwhite (talk) 06:37, 14 December 2015 (UTC)
Existing ones on frwiki: 67, 91. I don't know how long they have been there; those errors have not been completely cleaned for some time, and it's only possible to remove the empty title when it's the only one left. --NicoV (Talk on frwiki) 05:25, 15 December 2015 (UTC)
Bgwhite On frwiki, the problem is visible for #105 and it's probably very recent. I haven't done anything to remove it if it can help you understand where the problem comes from. --NicoV (Talk on frwiki) 17:13, 15 December 2015 (UTC)
Also on #60 and #43 but I don't know for how long. --NicoV (Talk on frwiki) 17:16, 15 December 2015 (UTC)
Bgwhite Can I remove the ones that can be removed, and then warn you if they appear again ? --NicoV (Talk on frwiki) 17:01, 18 December 2015 (UTC)
NicoV Yes, you can remove them. They are showing up via dump processing. What's weird is they don't show in the log file. Bgwhite (talk) 22:17, 21 December 2015 (UTC)
Ok, I've removed the ones I can. I will notify you when I see some more. --NicoV (Talk on frwiki) 07:23, 22 December 2015 (UTC)

Bgwhite, I think there was a full scan yesterday on frwiki, I see empty titles for #26 (same notice as fr:Emphase (typographie)), #38 (same as #26), #45, #51 (similar to #45), #53 (similar to #45), #67 (maybe an old one). --NicoV (Talk on frwiki) 11:45, 31 December 2015 (UTC)

False positive for #60 ?

Hi, fr:Aldébaran keeps being reported for #60 with the notice Palette VizieR, V*. The template VizieR does have a "V*" parameter, but it seems to be detected as an error. Same for fr:Wolf 1061. --NicoV (Talk on frwiki) 17:09, 2 January 2016 (UTC)

NicoV You are correct. At least on enwiki, one can't have a parameter with * in its name. Probably true for dewiki as this was originally added by Stefan. Bgwhite (talk) 22:39, 5 January 2016 (UTC)

Error 104 unbalanced quotes with special characters and curly quotes

With regard to the display problem and the ref name rules, I've created a test page for error 104 (NicoV: WPCleaner wants to put the quote close to the slash in line 11). I've also searched for the opening and closing curly quotes, and some are mixed up with the regular ones. I think that Check Wiki should warn the user to search carefully for every occurrence of a ref name found by error 104. If a ref name is "LuisBuñuel-59" the user needs to search for at least "LuisBu" in order to find all of them. I hope it's clear enough. --CX42 (talk) 07:34, 10 January 2016 (UTC)

I once looked through Anomie's list of fixes and collected something (that I understand) for Latvian Wikipedia scanning. Yes, some are bracket-unrelated, but most of them are (in section "Kļūdainās atsauces (# Other issues)"). --Edgars2007 (talk/contribs) 08:20, 10 January 2016 (UTC)