
Wikipedia:Bots/Requests for approval/DYKReviewBot

:{{BotExtendedTrial|edit=100}} I think we need to be moving this to approved - there are a couple of questions higher up, but don't want you to have to shut down so here are another 100 edits while this is in closing. — [[User:Xaosflux|<span style="color:#FF9933; font-weight:bold; font-family:monotype;">xaosflux</span>]] <sup>[[User talk:Xaosflux|<span style="color:#009933;">Talk</span>]]</sup> 13:52, 25 July 2016 (UTC)


*Comment: I have initiated a discussion regarding the layout of DYKReviewBot on the nominations page, located at [[Wikipedia talk:Did you know#DYKReviewBot on the nominations page|DYKReviewBot on the nominations page]]. <span class="smallcaps" style="font-variant:small-caps;">[[User:Northamerica1000|North America]]<sup>[[User talk:Northamerica1000|<font size="-2">1000</font>]]</sup></span> 22:48, 25 July 2016 (UTC)

Revision as of 22:49, 25 July 2016

Operator: Intelligentsium (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 12:20, Friday, June 17, 2016 (UTC)

Automatic, Supervised, or Manual: automatic

Programming language(s): Python

Source code available: Available

Function overview: To aid in new WP:DYK nominations by checking for basic criteria such as sufficient length, newness, and citations.

Links to relevant discussions (where appropriate): Wikipedia talk:Did you know#RFC: A bot to review objective criteria

Edit period(s): Fixed intervals (~once per hour)

Estimated number of pages affected: Subpages of Template:Did you know nominations and author talk pages

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: The DYK nominations page is perennially backlogged. Nominations typically take several days to a week to be reviewed. This bot will ease the backlog by checking basic objective criteria immediately after nomination, so the author is made aware of any issues right away.

Specific criteria which will be checked are:

  • Readable prose
  • Article newness or recent 5x expansion
  • Citations in every paragraph
  • No maintenance templates
  • Link to Earwig's copyvio report
  • Hook is <200 chars
  • Whether the article is a BLP
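As a rough illustration (not the bot's actual code), the checklist above might be driven by a set of per-criterion check functions; all names and the nomination dict here are assumptions, apart from the 200-character hook limit and the 1500-character readable-prose minimum:

```python
# Hypothetical sketch of a DYK checklist runner; function names and the
# nomination structure are illustrative, not DYKReviewBot's source.

MAX_HOOK_CHARS = 200    # hook must fit within 200 characters
MIN_PROSE_CHARS = 1500  # DYK minimum readable prose length

def check_hook_length(hook):
    """Hook text must fit within the 200-character limit."""
    return len(hook) <= MAX_HOOK_CHARS

def check_prose_length(prose_chars):
    """Article must have at least 1500 characters of readable prose."""
    return prose_chars >= MIN_PROSE_CHARS

def run_checks(nomination):
    """Map each criterion name to a pass/fail flag."""
    return {
        "hook_length": check_hook_length(nomination["hook"]),
        "prose_length": check_prose_length(nomination["prose_chars"]),
    }

results = run_checks({"hook": "... that bots can pre-check DYK noms?",
                      "prose_chars": 2100})
```

A real runner would cover all seven criteria and collect failure messages for the note left on the nomination page.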

If there are issues, the bot will leave a note on the nomination page and on the nominator's talk page.

This bot is intended to supplement, not substitute for, human review.

Discussion

  • information Administrator note Flagged "confirmed" as known alt account of User:Intelligentsium. — xaosflux Talk 13:31, 17 June 2016 (UTC)[reply]
  • How will the copyvio check work? Will above a certain percentage receive a "possible copyright violation" note? Since quotes from a source are typical in good articles (which can come to DYK), there's going to be a high false positive rate there. What are your thoughts on leaving a note even when all criteria it checks for are met so reviewers know which clear-cut objective criteria (readable prose, hook length, and newness, probably) they don't need to check? ~ RobTalk 13:43, 17 June 2016 (UTC)[reply]
    • Hi. It will link to Earwig's page with a note about the percentage. The relevant message makes it clear there is low confidence in the automated copyvio detection, and reviewers should still manually ensure there is no violation if the bot reports no violation (or vice versa). I agree that a standard notice is a good idea - the bot will edit any DYK page that has not already been reviewed with these comments. Intelligentsium 13:54, 17 June 2016 (UTC)[reply]
      • Alright, good responses. Please check directly with Earwig that this use complies with our license to use search results on his tool. There was some hubbub about APIs and licenses recently, and I know fully automated tools had some issues. Other than that, this is an obviously helpful bot. ~ RobTalk 14:56, 17 June 2016 (UTC)[reply]
        • Pinging @The Earwig: to keep all discussion in one place. Is automatically linking to the results page for your copyvio tool in compliance with the Google TOS? Intelligentsium 15:10, 17 June 2016 (UTC)[reply]
          • Yes, that's fine. — Earwig talk 18:14, 17 June 2016 (UTC)[reply]
            • The link isn't the potential issue. It's the actual running of the tool to get a percentage, which you said you'd be placing in the review note. Certain search sites require that their API is only used by an actual person and that the search results are displayed in a search-like experience (i.e. not just summarizing a percentage). I assume Earwig saw that bit too when looking this over, though, so you should be good on that front. ~ RobTalk 19:37, 17 June 2016 (UTC)[reply]
              • As long as you're making any automated requests through my tool's API, it's fine. We don't have the same restriction with Google's API as we had with Bing/etc. — Earwig talk 04:24, 18 June 2016 (UTC)[reply]
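For illustration, an automated query against the tool's API might be built like this; the endpoint URL and parameter names are assumptions based on the tool's public interface, not DYKReviewBot's actual code:

```python
from urllib.parse import urlencode

# Assumed endpoint; the tool lived on tools.wmflabs.org in 2016 and is on
# Toolforge today.
API_BASE = "https://copyvios.toolforge.org/api.json"

def build_copyvio_url(title, lang="en", project="wikipedia"):
    """Build a request URL for an automated copyvio search on one article."""
    params = {"version": 1, "action": "search",
              "lang": lang, "project": project, "title": title}
    return API_BASE + "?" + urlencode(params)

url = build_copyvio_url("Ellen F. Golden")
```

The bot would then compare the similarity score in the JSON response against its threshold and link reviewers to the tool's results page rather than reproducing the search output.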

FYI I'm going to run a short test in my userspace to ensure the code to save pages is working correctly. Intelligentsium 17:45, 17 June 2016 (UTC)[reply]

Here is an example run, which you can see below. Any feedback is welcome.
The source code is also posted here. I'm not a professional programmer and much of this was written yesterday, so please excuse any sloppiness. Intelligentsium 19:43, 17 June 2016 (UTC)[reply]
For full disclosure, there are a few known issues:
  • Unable to handle multi-article nominations. I'm not sure how best to implement that, as sometimes single articles have commas, sometimes multinoms are made under only one article, and sometimes the link is a redirect.
  • Maintenance template grepping is a hack because I was lazy - it looks for dated templates, as content templates usually are not dated (this does introduce false positives, for example {{use mdy dates}}).
  • The char count is not exactly the same as Shubinator's tool, as his tool parses the HTML while mine uses wikitext. Let me know if there is a significant (>5%) discrepancy.
  • Sometimes the paragraph division is off, possibly because a single return in the editor doesn't break the paragraph in display.
  • I mostly ignore exceptions, since there are many, many ways a nomination can be malformed.
Intelligentsium 19:57, 17 June 2016 (UTC)[reply]
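The dated-template heuristic from the second bullet above might look roughly like this; the regex is an illustrative stand-in, not the bot's actual expression, and it reproduces the {{use mdy dates}} false positive mentioned:

```python
import re

# Any template carrying a date= parameter is assumed to be a maintenance tag.
DATED_TEMPLATE = re.compile(r"\{\{[^{}]*\|\s*date\s*=\s*[^{}|]+[^{}]*\}\}",
                            re.IGNORECASE)

def find_dated_templates(wikitext):
    return DATED_TEMPLATE.findall(wikitext)

hits = find_dated_templates(
    "Intro. {{refimprove|date=June 2016}} Text. {{use mdy dates|date=May 2016}}"
)
# The second hit is a style template, not maintenance - a false positive.
```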
You wrote in the discussion that reviewers need to manually use Shubinator's tool and Earwig's tool to perform these standard checks. These issues could be pointed out easily by a bot for nominators to work on, rather than having to wait several days/weeks until a human reviewer gets around to raising them. What if pasting the output of Shubinator's tool and Earwig's tool was made standard in DYK submissions? Not to say that I have any issues—I fully support this bot—I'm just a bit surprised that you actually went to the trouble of this BRFA before what I saw as the most obvious solution.
I also recommend mwparserfromhell to parse wikitext instead of those nasty regular expressions. You may find that using ceterach on Python 3 makes handling Unicode much smoother, as well. Σσς(Sigma) 03:39, 18 June 2016 (UTC)[reply]
Seconded that mwparserfromhell is a wonderful library to use. I, too, once used regex to parse wikitext, but one of the many problems with doing so is that the expressions constantly have to be updated as editors find new and exciting ways to write malformed wikitext. Regex-based wikitext parsing is really technical debt, and once you switch over, it'll be so much easier. Enterprisey (talk!(formerly APerson) 03:44, 18 June 2016 (UTC)[reply]
Thanks, I'll look into the mwparser. @Sigma: I'm not sure I understand your comment. Using Shubinator's and Earwig's tools is standard review practice but because there are hundreds of submissions and as many of the users who participate at DYK are new users, the reviewer ends up having to perform the check. Intelligentsium 04:04, 18 June 2016 (UTC)[reply]
What I meant was, what if using Shubinator's and Earwig's tools, or gathering equivalent data through some other means, was required to submit a DYK?
many of the users who participate at DYK are new users I was not aware of this. Thank you for your response. Σσς(Sigma) 04:20, 18 June 2016 (UTC)[reply]
Here are some updated results

Intelligentsium 00:59, 19 June 2016 (UTC)[reply]

Thanks, I'm not that familiar with the DYK mechanics - that doesn't seem like the most appropriate use of the Template namespace - but that is not being introduced by your bot, so it is outside the scope of this review. — xaosflux Talk 20:46, 23 June 2016 (UTC)[reply]
Approved for trial (50 edits or 15 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete.xaosflux Talk 20:46, 23 June 2016 (UTC)[reply]
Intelligentsium Don't know if this issue was covered at DYK or not. But here goes. Is the bot configured to determine if each and every hook on a nomination is sourced? It would help a lot. — Maile (talk) 21:21, 23 June 2016 (UTC)[reply]
That's almost certainly not possible unless the hook was taken word-for-word from the article. ~ RobTalk 21:27, 23 June 2016 (UTC)[reply]
Thanks for the quick answer. Nevertheless, this bot is going to be a good addition to DYK. — Maile (talk) 21:48, 23 June 2016 (UTC)[reply]
Thanks Xaosflux. I've wondered myself why we use Template: pages for nominations rather than Wikipedia: subpages. It's probably an artefact of never moving away from the talk page of Template:Did you know for nominations, unlike ITN or TFA which have their own Wikipedia: pages.
@Maile66: Unfortunately, Rob is correct; that would be exponentially more difficult than anything the bot currently does. I don't know if you follow xkcd but this xkcd comes to mind... Intelligentsium 22:41, 23 June 2016 (UTC)[reply]
Funny stuff! — Maile (talk) 22:48, 23 June 2016 (UTC)[reply]
Approved for extended trial (250 edits or 30 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. This trial is approaching the original edit limit - there has been a tremendous amount of community discussion below and concerns are being continually addressed - extending the trial period to allow this to continue. — xaosflux Talk 15:09, 30 June 2016 (UTC)[reply]

Trial comments

Agree that we should eschew overall icons. (The tiny check and X are okay.) The character count difference in Ellen F. Golden was 13: 4003 for the bot, 3990 for DYKcheck. I was wondering, though, why the bot says "No issues found" when there was a potential issue (see the small red X) with copyvio. I also think it might make sense to add an extra line to the bottom of the review, after what's there, that starts with the "review again" icon and would perhaps say something like "Full human review needed", perhaps also in bold. Otherwise, I think people may believe that a regular review has begun, and go on to another nomination. BlueMoonset (talk) 00:38, 24 June 2016 (UTC)[reply]
Excellent work on the bot so far, and agreeing with comments above. It did occur to me when I read through the ones checked by the bot, that visually speaking, potential reviewers might think, "Oh...that one's already been done...I don't need to bother with it." So, yeah, maybe something eye-catching to let the reviewer know a human is still needed. — Maile (talk) 00:55, 24 June 2016 (UTC)[reply]
  • Thanks for the suggestion. @Yoninah: I wanted something which would complement for reviews with issues. I deliberately chose as it is not one of the icons we usually use to indicate that the nomination is not yet approved, but I see how this could be confusing to new reviewers. @BlueMoonset: I'm still debating the best way to handle possible copyvio. The bot just sees the percentage and compares it to a threshold value of 20% (which I can change if people would like me to do so). The articles I've seen with close paraphrasing are usually at least 15-25%, which is why the threshold is set where it is. However, this also catches some articles which use titles and quotes extensively, and because the metric has low confidence, I don't know if this should be flagged as an "issue" per se, but rather something a human should look further into (which they should always do anyway, as the note says). If this is flagged as an issue then the nominator will automatically be informed, for what could just be a waste of time. @Maile66: (also relevant to Yoninah and BlueMoonset's comments) This may also be a matter of people getting used to the bot; some other areas of the wiki have bot-endorsed/bot-issues-found icons which are distinct from the regular icons, and once people are aware of the bot and understand what the bot icons mean, there should be less confusion. Intelligentsium 03:22, 25 June 2016 (UTC)[reply]
  • That symbol is used for Good Articles, and its use here is confusing. I'd suggest using the red "re-review" roundel, as the intention is for a human to confirm the bot-generated review. If we really want a new symbol, perhaps the blue plus roundel could be used. Antony–22 (talkcontribs) 03:48, 25 June 2016 (UTC)[reply]
  • Template:Did you know nominations/2016 OFC Nations Cup Final; I do not see how the error the bot reported corresponds to the DYK criteria. - Yellow Dingo (talk) 01:54, 24 June 2016 (UTC)[reply]
  • The bot is great, but each auto-review is a lot of text. The nominations page is already consistently browser-stuttering and unwieldy. (The HTML is 1.3M at the moment.) Can we revive the idea of automatically moving human-reviewed and approved DYK nominations to a different subpage? Opabinia regalis (talk) 05:34, 24 June 2016 (UTC)[reply]
    • Yes, the page does sometimes lag for me as well (though the idea of the bot's review is that the user won't have to review the same criteria, so theoretically no net change). However, speaking as a regular DYK reviewer rather than "The Bot-op", approved nominations are often double-checked by other reviewers, and moving them to a subpage may reduce the additional scrutiny "approved" nominations receive. Moreover, this double-checking frequently leads to de-approvals, and moving the pages back and forth would be a pain (would probably call for another bot...) Intelligentsium 03:22, 25 June 2016 (UTC)[reply]
  • Template:Did you know nominations/PSLV-C34 "This article is too short at 1487 characters" The DYK Check shows it as 1536 characters as of the time and date the bot ran. I have noted that on the nomination's template. — Maile (talk) 12:19, 24 June 2016 (UTC)[reply]
    • Unfortunately 50 chars is within margin of error. I've examined the readable text extracted by the bot, and the discrepancy was coming from the converted units, which the bot wasn't counting. I can reduce the bot's pass threshold to 1450, though articles that are that close to the boundary often benefit from a bit more prose anyway. Intelligentsium 03:22, 25 June 2016 (UTC)[reply]
      • Intelligentsium, how practical would it be to incorporate existing prose-checking code, such as User:Dr pda/prosesize.js, into your bot? If not very, perhaps a rewording of the message could say that as your result is borderline, the number should be checked against DYKcheck or prosesize to get an exact number. (Maybe a yellow question mark or something like that could indicate this?) BlueMoonset (talk) 23:00, 7 July 2016 (UTC)[reply]
  • I'm not sure if this has been covered already, but does the bot ignore lead sections when checking for paragraph cites? Lead sections are not required to have citations. Also, does the bot deal appropriately with articles moved from userspace? Gatoclass (talk) 20:40, 24 June 2016 (UTC)[reply]
    • @Gatoclass: The current version of the bot considers everything up to the first header the "lead" and doesn't require it to have citations. This will not handle an article without sections correctly (though in almost all cases the article should have sections anyway). An alternative from an older commit is only to give the first paragraph a pass, which of course will handle multi-paragraph leads incorrectly. In your opinion, which behaviour is more desirable? The bot does correctly handle articles moved from user- or draft- space, as well as articles which are created from redirects and disambiguations. In fact the bot ignores the user-provided classification and performs its own classification (created, 5x, GA, 2xBLP) because the user classification is frequently inaccurate. As long as it qualifies under some DYK criterion, the bot will find it. Intelligentsium 03:22, 25 June 2016 (UTC)[reply]
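The lead-handling rule just described (everything before the first heading is exempt from the citation check) can be sketched as follows; illustrative only, and including the sectionless case noted above as unhandled:

```python
import re

# A wikitext heading is a line like "== History ==".
HEADING = re.compile(r"^=+[^=].*=+\s*$", re.MULTILINE)

def split_lead(wikitext):
    """Return (lead, body); the lead is exempt from the citation check."""
    m = HEADING.search(wikitext)
    if m is None:
        # No sections at all: the whole article counts as "lead", which is
        # the case discussed above that the current rule does not handle.
        return wikitext, ""
    return wikitext[:m.start()], wikitext[m.start():]

lead, body = split_lead("Lead text.\n\n== History ==\nBody.<ref>src</ref>")
```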
Hi, sorry I was unexpectedly called out for the whole day today so I haven't had a chance to respond to these comments til now. I will respond to each comment individually above. Intelligentsium 02:24, 25 June 2016 (UTC)[reply]
  • A question was raised earlier by Opabinia regalis about misidentifying the nominator. This could be a problem in cases where an article has several authors and the nomination is made by an editor in their first five nominations (even if other authors have more than 5). As I understand it, QPQ is not required from that nominee in such a case, but if the bot picks a different author then a need for QPQ could be reported. Note also the QPQ issue I mentioned on your talk page. Thanks. EdChem (talk) 08:25, 25 June 2016 (UTC)[reply]
    • This is a "corner case" which it is also possible to address, though doing so would increase the complexity of the code and thus increase the number of possible failure points. Is this a common occurrence? This case would only be encountered when an article worked on by multiple authors is nominated by the least experienced with DYK (who would thus have fewer than 5 nominations). Usually when multiple authors work on an article, who the nominator "technically is" doesn't make a huge difference as the nompage only supports one nominator but all of the authors are responsive. Intelligentsium 12:13, 25 June 2016 (UTC)[reply]
  • Some hook length comments. The bot is counting the entire hook, but according to the hook format rules and Supplementary rule C8, the italicized (pictured) or similar text should not be counted, nor should the initial ellipsis and space. Also, for Template:Did you know nominations/Nuclear blackout the bot got 135 characters but I counted 134, which could be an off-by-one error. (Also, the last couple of bullets are at a different indentation level than the first bunch.) Antony–22 (talkcontribs) 18:46, 25 June 2016 (UTC)[reply]
    • I have raised an issue with this nomination as the article claimed for QPQ credit has been claimed on a previous nomination. Intelligentsium and I briefly discussed this at his user talk page, where I asked whether the bot checks for a previous claiming of QPQ credit. Now that we have an actual case of this issue arising, I think we need input on whether the bot could / should do such a check, or make clear it has not been done, or report differently on the QPQ topic. Thoughts? EdChem (talk) 05:35, 26 June 2016 (UTC)[reply]
  • Template:Did you know nominations/2016 OFC Nations Cup Final - bot only counted the hook length for ALT0 but didn't do it for ALT1 - Yellow Dingo (talk) 12:09, 29 June 2016 (UTC)[reply]
  • I've made it so the bot will look harder for alt hooks. However, because users tend to format alts in a variety of ways, it is still possible that a few will be missed. I'm doing some testing of hook format checking (checking for the space, link, bold, and correct use of (pictured) - it won't immediately show up in the reviews. I'm looking into the counting issue. Intelligentsium 12:47, 30 June 2016 (UTC)[reply]
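As a sketch of the counting rules discussed above (the leading ellipsis and space, and an italicized (pictured), excluded per Supplementary rule C8), something like this would do; the patterns are illustrative, not the bot's:

```python
import re

def hook_length(hook):
    """Count hook characters, excluding '... ' and ''(pictured)''."""
    h = re.sub(r"^\.\.\.\s*", "", hook)        # initial ellipsis and space
    h = re.sub(r"''\(pictured\)''\s*", "", h)  # italicized (pictured)
    return len(h)

n = hook_length("... that the ''(pictured)'' tower is tall?")
```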
  • Could you include text with the QPQ check that indicates the reviewer should check if the QPQ is a full review? DYK has always had issues with nominators "completing" their QPQ by providing a review that just says "Good to go" or omits several of the DYK criteria, and these "reviews" don't count toward QPQ because they must be re-reviewed by another editor to check all the criteria. I'm somewhat worried that a bot "verifying" the QPQ will make it less likely for reviewers to double-check that an appropriate review was performed. ~ RobTalk 15:16, 30 June 2016 (UTC)[reply]
    • Actually, I have a more radical suggestion to prevent editors from ignoring certain criteria in their reviews as a result of the bot's automated review. I apologize for this suggestion coming so late in the game, since it's going to be a pain to implement now if this is the way we decide to go. Why don't we just have the bot make notes where it does find errors rather than noting all the things that weren't incorrect? If reviewers are meant to double-check anything the bot does anyway, then noting all the non-errors doesn't serve any purpose. Noting an actual error could expedite correcting the error before a reviewer even touches the DYK nomination, though. That's where this bot's benefit is. Thoughts on this? ~ RobTalk 15:18, 30 June 2016 (UTC)[reply]
      • Not everything needs to be double-checked - for instance, human reviewers shouldn't have to double check the length or date or expansion if the bot verifies it. I think the purpose of the bot is to shift the focus of a human review away from ticking off the criteria towards a content-based review: do the sources say what the article claims they say, are the sources reliable, is there close paraphrasing, etc. I think someone commented early in the discussion that it's useful to have the bot explicitly state the criteria even if it doesn't find a problem so human reviewers can be reminded of what the criteria are. I can reword the QPQ bullet to note that a human should confirm the review was performed properly. Intelligentsium 16:43, 30 June 2016 (UTC)[reply]
  • Hi, the bot is reporting the same number of DYKs (39) for the nominator in four reviews on July 5, 2016, one minute apart. They are: this edit at 19:22, this edit at 19:22, this edit at 19:23, and this edit at 19:23. Yoninah (talk) 12:10, 6 July 2016 (UTC)[reply]
  • Intelligentsium, one of the DYK requirements is that the article not be a stub. The DYKcheck tool looks for this but I find that sometimes there is no talk page and no rating has occurred. Would it be appropriate for your bot to check for a stub tag on the talk page, or a stub category on the article page, and report these as a stub, and if no stub category and no talk page, report it as unassessed? EdChem (talk) 05:24, 7 July 2016 (UTC)[reply]
    • While checking for stub templates and tags is appropriate—DYKcheck does this today—I can think of no DYK reason why the bot should check for unassessed WikiProject classes or empty talk pages. Both are irrelevant to DYK and its criteria. BlueMoonset (talk) 22:52, 7 July 2016 (UTC)[reply]
      • My thinking is that if there is no talk page then there has likely never been an assessment and hence it could be a stub. Is it unreasonable to suggest that a reviewer check if there has been an assessment and if there is none then tag as stub or start or ...? EdChem (talk) 00:10, 8 July 2016 (UTC)[reply]
        • I don't think this is within the purview of the bot. The bot currently does check for stub templates on the article page, but Wikiproject categorization should be up to the discretion of the reviewer (indeed it may create more work if editors who don't know what they're doing start tagging articles because they think it's required). Intelligentsium 01:06, 8 July 2016 (UTC)[reply]
  • The bot reviewed my nomination and tells everybody my number of credits. The number is not correct (I guess those before templated DYK was introduced were not counted), but I don't like it there anyway. The human reviewer might get some bias such as "she has many credits, no need to check carefully". How about the bot checking qpq first, - if done there's no need to even check the credits. If not done, checking and then a line "... has more than 5 credits" would be enough. - I also think the always same caution about paraphrasing should not appear in every review, but be linked. --Gerda Arendt (talk) 05:55, 8 July 2016 (UTC)[reply]
    • Gerda, I think it's actually the cases where the credits are done manually rather than by the bot; my list of credits is missing a recent one. I think your suggestion is a sensible one, though - what matters is whether QPQ is needed, not how many credits there are.

      Intelligentsium, maybe have the bot recognise "tbd" or something similar as "Nominator recognises that a QPQ is required and will post here once it has been done"? EdChem (talk) 04:12, 9 July 2016 (UTC)[reply]

      • I've changed the phrasing so that it's only a binary more than 5 credits vs less than 5. However I think there is too much variation in how nominators say tbd for automated detection - at any rate it's useful to have the reminder there as it's easy to forget about it. Intelligentsium 09:24, 9 July 2016 (UTC)[reply]
          • I think the bot could first check if a qpq was done, if yes no need to run anything further. If no, a check if more than 5 DYK credits, - only speak up if yes, - keep the bot a bit more silent, - in general. (If saying anything, it should probably be "five or fewer" but seems clumsy.) As EdChem noted, the figure can be incorrect but never too high, so would err on the lenient side. --Gerda Arendt (talk) 12:27, 9 July 2016 (UTC)[reply]
          • @Gerda Arendt: actually, it is possible for it to be too high. In this case I found that there were 5 credits recorded but two of them were the same (1 and 2) though in different formats. Some sort of bot issue, presumably, but I was glad I checked it before asking for a QPQ. (PS: Intelligentsium, I would not suggest that your bot should catch things like this, it needs an aware reviewer or a nominator who raises the issue.) EdChem (talk) 12:40, 9 July 2016 (UTC)[reply]
  • I have a suggestion: to add <!--hidden comment codes--> on either side of the review to separate the bot's review from comments below it and the nomination above. It'd be really useful with wikEd, etc. Raymie (tc) 03:59, 9 July 2016 (UTC)[reply]
  • At the moment, the bot inserts its review after the ":* <!-- REPLACE THIS LINE TO WRITE FIRST COMMENT, KEEPING  :* -->" line in the template (immediately above the bottom, template-ending "Please do not write below this line"). The idea behind the "replace" comment line is that the (human) reviewer should start their review here, but that no longer makes sense if the bot's review is placed below it. I'd like to propose that the bot inserts its review above this comment line, so that the line appears directly below the bot's review (with a blank line between), indicating where the human review should begin. The blank line between the comment line and bottom line of the template should also be retained. Thanks. BlueMoonset (talk) 16:56, 9 July 2016 (UTC)[reply]
  • The bot is counting "ALT1", and even my signature on that alt, to the character count for ALT1. See [1]. Yoninah (talk) 18:55, 9 July 2016 (UTC)[reply]
    • I believe it only counted your signature. Unfortunately there is less confidence in the parsing of any alt hooks as there is usually greater variation in their formatting. I didn't mention this before but now is probably a good time to do so - the bot can look for hooks using a permissive regex, tolerant of incorrect formatting (missing ..., "that", or question mark at the end) or a "strict" regex, which requires correct formatting (unless no correctly formatted hooks are found, in which case it falls back to the permissive regex). The strict manner is likely to result in a slightly more accurate count but is also likely to miss alts unless they are perfectly formatted. I guess the question is, which is more important - getting all the alts or getting the count exactly right? Personally I don't think it makes a huge difference either way though I'm open to thoughts. Intelligentsium 22:23, 9 July 2016 (UTC)[reply]
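The strict-regex-with-permissive-fallback approach described here might be sketched like so; both patterns are simplified stand-ins for whatever the bot actually uses:

```python
import re

# Strict: well-formed "... that ...?" hooks only.
STRICT = re.compile(r"^\.\.\.\s*that .+\?\s*$", re.MULTILINE)
# Permissive: anything hook-like, optionally behind an ALTn label.
PERMISSIVE = re.compile(r"^(?:ALT\d+\s*:?\s*)?\.\.\..+$", re.MULTILINE)

def find_hooks(nomination_text):
    hooks = STRICT.findall(nomination_text)
    if not hooks:
        # Nothing well-formed found: fall back to the permissive pattern.
        hooks = PERMISSIVE.findall(nomination_text)
    return hooks

well_formed = find_hooks("... that hooks are short?")
malformed = find_hooks("ALT1: ... hooks can be malformed")
```

The trade-off is as stated above: the strict pass gives a more accurate count, while the permissive fallback catches badly formatted alts at the cost of sometimes counting stray markup.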

Bot tags formal names

Bot just reviewed my DYK nomination…said risk was ~25%. However, virtually everything it tagged as close paraphrasing was a formal name combined with simple grammar words (e.g. “in the National Register of Historic Places”, “with the Brooks-Scanlon Lumber Company”, “for the Pilot Butte Development Company”, “and the Central Oregon Irrigation Company”, “mayor of the City of Bend, Oregon”, etc). “National Register of Historic Places” and other formal names can’t be avoided yet the bot tagged them multiple times causing a high risk score. Is there any way you can modify the bot to avoid tagging formal name in the review process?--Orygun (talk) 21:09, 10 July 2016 (UTC)[reply]

Do you mean for the copyvios tool? 25% is quite low for that tool, and even a 100% rating doesn't mean there's a copyright violation because it could have just caught a properly attributed quote. Editor review is required to determine if a copyright violation or close paraphrasing has taken place. The copyvios tool is only a shortcut for checking for that. ~ Rob13Talk 21:18, 10 July 2016 (UTC)[reply]
  • Yes, the comment is related to the copyright tool. At 25% the tool marks the copyright section with a red X (vice a green checkmark), so the human reviewer is given the impression that there is a copyright problem. In the case of my article, I think a human reviewer would quickly see that there wasn't a copyright violation, but if formal names hadn't been tagged, the risk percentage would have been in the low single digits and could have been marked with a green check.--Orygun (talk) 21:38, 10 July 2016 (UTC)[reply]
  • Orygun, the bot is not doing the check, it is just reporting the output from using Earwig's tool, which every reviewer should check. It is not uncommon for the tool to flag a possible copyvio issue which a reviewer can see is not an issue (e.g. I recently saw a 98% case where the text had been copied from WP without proper attribution). Close paraphrasing and copyvios get missed at DYK too often, so the bot reminding everyone to check is a good thing. I doubt anyone who knows what they are doing will see a high percentage as a mark against you without investigating, because the tool finds similarities which might be problematic and flags them for attention; it doesn't conclude whether or not a problem actually exists. EdChem (talk) 02:11, 11 July 2016 (UTC)[reply]
  • Hi, Orygun, EdChem and BU Rob13 are correct: the copyright checker is Earwig's tool and there is a note that titles and cited quotes may trigger a false positive. However, only a human check can verify whether a copyright violation exists; the bot merely alerts the human reviewer to pay more attention when Earwig's tool reports a violation greater than 20%. It would be possible to raise this threshold if there is consensus to do so, but in my (manual) reviews I have found that >20% is almost always a reason for taking a closer look at the very least, and violations can exist even below that. Intelligentsium 23:32, 13 July 2016 (UTC)[reply]
  • What is the language used for >20%? There might be a case for using softer language for 20–50% (possible close paraphrasing, for instance) and stronger language for >50% (possible copyright violation, for instance). ~ Rob13Talk 23:38, 13 July 2016 (UTC)[reply]
  • I don't think there's necessarily a greater possibility of copyvio for >50% than >20% (usually >50% just means there's a mirror somewhere); close paraphrasing generally falls on the lower end, in the form of snippets and phrases rather than entire sentences or paragraphs. I have changed it so the notice will now be a purple question mark (?) and the bot will not automatically notify the nominator to avoid spamming; the human reviewer will have to review the comparison and confirm if a violation indeed exists. Intelligentsium 14:20, 14 July 2016 (UTC)[reply]
  • I've found copying when the Earwig-reported number was less than 10%; copyvio/plagiarism/close paraphrasing is something that should be always checked by a human reviewer. I would disagree with any request to set the number higher than 20%, and think the idea of a purple question mark is a good one. Given the number of false positives generated by Earwig, it's probably a good idea not to notify the user if Earwig numbers are the only issues found. BlueMoonset (talk) 15:14, 14 July 2016 (UTC)[reply]
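The icon behaviour discussed in this thread could be sketched roughly as follows. This is only an illustration, not the bot's actual code: the 20% threshold, the purple question mark, and the decision not to auto-notify come from the comments above, while the function and constant names are hypothetical.

```python
# Hypothetical sketch of the notice logic discussed above: any Earwig
# similarity above the 20% threshold gets a purple "?" prompting a manual
# check, rather than a red "X" implying a confirmed violation.
COPYVIO_THRESHOLD = 20.0  # percent similarity that triggers a closer look


def copyvio_icon(similarity_percent):
    """Return the review symbol for a given Earwig similarity percentage."""
    if similarity_percent > COPYVIO_THRESHOLD:
        # Purple question mark: a human reviewer must confirm whether a
        # violation actually exists; the nominator is not auto-notified.
        return "?"
    # Green checkmark: no automated flag (a human check is still advised,
    # since copying has been found even below 10%).
    return "\u2713"
```

Because copying has turned up at percentages well below the threshold, the checkmark here signals only "not flagged", never "verified clean".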

DYK bot

At Template:Did you know nominations/Samiun dan Dasima, the new review bot tagged the article as lacking a citation for the plot section. As the film is still extant, and the plot is implicitly cited to the film (and no citation is required, per WP:DYKSG #D2), can we please add an exception to the bot's code so that sections titled Plot or Summary aren't checked? If we have a swath of film articles nominated, not having an exception coded might lead to more work for reviewers (or mislead new reviewers into thinking plot summaries need a citation). — Chris Woodrich (talk) 02:08, 10 July 2016 (UTC)[reply]

@Intelligentsium: copying this over from WT:DYK. Hope you see this here Chris Woodrich — Maile (talk) 21:22, 10 July 2016 (UTC)[reply]
I saw this, thanks (though oddly I didn't get this ping...). I've had to do a bit of unexpected travelling over the past few days, nothing too major but I might not be able to respond in-depth until this weekend. However I will look into this issue. Intelligentsium 23:26, 13 July 2016 (UTC)[reply]
Hi again. I haven't read through this whole page, but I'm wondering if someone mentioned the length of the text that the DYK review bot is adding to each nomination. It takes me much longer now to scroll down T:TDYK to find suitable hooks to promote to prep. I'm wondering if the bot's review could be placed in a collapsed box so prep promoters can easily scroll through and select suitable hooks? Thanks, Yoninah (talk) 15:29, 15 July 2016 (UTC)[reply]
Intelligentsium any feedback on this? Another option may be to wrap the review in <noinclude> tags. — xaosflux Talk 04:52, 25 July 2016 (UTC)[reply]
I agree that this would be helpful in terms of length and appearance—the nomination templates get very long, and some reviewers shy away from the ones that look busy. I'm not sure whether it would be better to have a line indicating that the automated review has been done but is collapsed, or just noinclude the whole section. BlueMoonset (talk) 05:28, 25 July 2016 (UTC)[reply]

Copyvio language misleading

Hi folks, EEng just brought this to my attention on my talk page. It's misleading to call what the copyvio tool returns a probability of violation. It's really a measurement of how much text in the article is in common with the suspected source, but fuzzed a bit. There's a big difference between saying "the probability of a violation is 15%" and "~15% of the article was found elsewhere on the internet". Now, here's my suggestion. Don't try to interpret the significance of the percentage yourself; the tool tells you how to interpret it. If the tool's API indicates that no violation is present (where resp["best"]["violation"] == "none" in the returned JSON), then the bot should say "No copyright violation suspected. (review)", with a green checkmark, and you can eschew the note that follows. Otherwise resp["best"]["violation"] contains a descriptive string, either "possible" or "suspected". If the former, say "A copyright violation may be possible, according to an automated tool with X% confidence. (confirm)"; otherwise, "A copyright violation is suspected by an automated tool, with X% confidence. (confirm)" with the ? and the clarifying note that reads "Please manually verify that there is no copyright infringement...". This should reduce confusion unless a real match is found. What do you think? — Earwig talk 00:23, 18 July 2016 (UTC)[reply]
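The branching Earwig describes might look something like this. The field names (resp["best"]["violation"], with values "none", "possible", or "suspected") are taken from the comment above; the assumption that resp["best"]["confidence"] holds a fraction in [0, 1], and the exact message strings, are illustrative rather than taken from the bot's actual code.

```python
def copyvio_notice(resp):
    """Map the copyvio tool's parsed JSON response to a review notice.

    Assumes resp is the decoded JSON from the copyvio detector's API,
    with resp["best"]["violation"] one of "none", "possible", or
    "suspected", and resp["best"]["confidence"] a fraction in [0, 1].
    """
    violation = resp["best"]["violation"]
    confidence = resp["best"]["confidence"] * 100
    if violation == "none":
        # Green checkmark case: no clarifying note needed.
        return "No copyright violation suspected. (review)"
    if violation == "possible":
        return ("A copyright violation may be possible, according to an "
                "automated tool with %.1f%% confidence. (confirm)" % confidence)
    # Remaining value is "suspected": strongest wording, with the "?"
    # icon and the "Please manually verify..." note attached elsewhere.
    return ("A copyright violation is suspected by an automated tool, "
            "with %.1f%% confidence. (confirm)" % confidence)
```

The point of deferring to the "violation" field is that the tool's own thresholds, not the bot's, decide how the percentage is characterised.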

I like this idea. Even though I'm familiar with the tool and what the percentages mean, it's clear from this thread that the existing language is confusing and this could help. ~ Rob13Talk 00:25, 18 July 2016 (UTC)[reply]
The Earwig, at what level will the tool's API tell you that the chance of a copyvio is "none"? Do you have a set percentage within the tool? Also, some sources, such as a book at Google books, may appear to have been checked, but what's actually checked is the metadata page, not the actual contents (or specific page) of the book that has been cited. BlueMoonset (talk) 03:38, 19 July 2016 (UTC)[reply]
It's 40% at the moment. Also, I can only check what's in the HTML or PDF at the URL returned by the search engine. Google Books is not friendly to scrapers. — Earwig talk 03:40, 19 July 2016 (UTC)[reply]
Done. Thanks for the suggestion! Intelligentsium 20:14, 21 July 2016 (UTC)[reply]
Given that I've found copyvio down as low as 9.8%, saying that there's "no copyright violation suspected" at 40% or lower seems very misleading to me, and indeed could give the human reviewer a false sense that there's no need to check further. We've had reviewers in the past citing the Earwig percentage as sufficient evidence of a lack of copyvio/close paraphrasing/plagiarism. It's not, of course, but we have to be very careful in what is said here. Further, not all sources are (or can be) checked; sometimes a slow response from a website will leave its pages unchecked by the bot, whereas a human reviewer would check them and possibly find duplicated material. BlueMoonset (talk) 05:21, 25 July 2016 (UTC)[reply]

Moving towards stable operations

@Intelligentsium:, just checking in, the discussion and responses above have been great! Once live, where would you want editor feedback to go (e.g. your talk, the bot's talk, some other page)? Are there any outstanding technical or operational issues (not including enhancement requests)? — xaosflux Talk 00:46, 17 July 2016 (UTC)[reply]

{{OperatorAssistanceNeeded}}xaosflux Talk 18:34, 21 July 2016 (UTC)[reply]
Oops, sorry for the belated response! They can go on my talk page; I'll add a note to my userpage to this effect. Intelligentsium 20:11, 21 July 2016 (UTC)[reply]

Trial complete. I'm at 249 now. There are no glaring issues remaining. In the most recent run I note there was an anomaly relating to an unusually large nomination. The bot currently does not handle the case where a reviewer reviews a multi-article hook with N articles, then proceeds to claim QPQ credits for N of their own articles (which the reviewer is entitled to do). I will look into implementing a check for this as an additional feature, but this should not be a common occurrence. Intelligentsium 23:58, 24 July 2016 (UTC)[reply]

Approved for extended trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. I think we need to be moving this to approved - there are a couple of questions higher up, but don't want you to have to shut down so here are another 100 edits while this is in closing. — xaosflux Talk 13:52, 25 July 2016 (UTC)[reply]