
User talk:ערן



For questions about my bot, User:EranBot, please read the documentation there first. Thanks!

Hi. I'm wondering how I would be able to test the script you suggested at the link above. I recently stumbled upon the gadget proposal page and saw your suggestion. I tried to import it into my personal js, but was unable to get it to work. [1] Any ideas? Killiondude (talk) 08:31, 20 January 2012 (UTC)[reply]

Hi, you should try to replace it with the following code:
mw.loader.load('http://bits.wikimedia.org/he.wikipedia.org/load.php?debug=false&lang=he&modules=ext.gadget.autocomplete');
(see example in User:ערן/common.js).
By the way, importScript is deprecated and mw.loader is the new convention. The old way should still work, but since this is an external script, use importScriptURI instead of importScript. Eran (talk) 10:33, 20 January 2012 (UTC)[reply]
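For completeness, the equivalent call using the older importScriptURI helper (a sketch based on the same URL as above) would be:
importScriptURI( 'http://bits.wikimedia.org/he.wikipedia.org/load.php?debug=false&lang=he&modules=ext.gadget.autocomplete' );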
Oh, thanks! I'll try it now. I've used importScriptURI before, but I didn't think to use it on this occasion. As you can see, I'm not particularly adept at coding. :-) Killiondude (talk) 17:51, 20 January 2012 (UTC)[reply]
Works great! Definitely useful. Thanks again! Killiondude (talk) 17:56, 20 January 2012 (UTC)[reply]
Thank you! User:Ijon taught us to install this at a Wikidata workshop in Pune, India. Nikhilsheth (talk) 10:19, 19 September 2017 (UTC)[reply]

Invitation to events in June and July: bot, script, template, and Gadget makers wanted

I invite you to the yearly Berlin hackathon. It's 1-3 June and registration is now open. If you need financial assistance or help with visa or hotel, just mention it in the registration form.

This is the premier event for the MediaWiki and Wikimedia technical community. We'll be hacking, designing, and socialising, primarily talking about ResourceLoader and Gadgets (extending functionality with JavaScript), the switch to Lua for templates, Wikidata, and Wikimedia Labs.

Our goals for the event are to bring 100-150 people together, including lots of people who have not attended such events before. User scripts, gadgets, API use, Toolserver, Wikimedia Labs, mobile, structured data, templates -- if you are into any of these things, we want you to come!

I also thought you might want to know about other upcoming events where you can learn more about MediaWiki customization and development, how to best use the web API for bots, and various upcoming features and changes. We'd love to have power users, bot maintainers and writers, and template makers at these events so we can all learn from each other and chat about what needs doing.

Check out the developers' days preceding Wikimania in July in Washington, DC and our other events.

Best wishes! - Sumana Harihareswara, Wikimedia Foundation's Volunteer Development Coordinator. Please reply on my talk page, here or at mediawiki.org. Sumana Harihareswara, Wikimedia Foundation Volunteer Development Coordinator 14:51, 2 April 2012 (UTC)[reply]

autocomplete.js needs protocol-relative URL

Copied from User talk:ערן/autocomplete.js by me, the original poster, as recommended by Rjd0060 (talk · contribs)

The URL [in this script's mw.loader.load call] should be made protocol-relative, that is: mw.loader.load('//bits.wikimedia.org... in order to avoid mixed content in the event that someone is using the secure server without HTTPS Everywhere. Your common.js appears already to contain the proper protocol-relative usage. --SoledadKabocha (talk) 21:17, 11 October 2012 (UTC)[reply]

 Done Eran (talk) 07:03, 12 October 2012 (UTC)[reply]
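For reference, combining the gadget URL from the thread above with the protocol-relative scheme gives:
mw.loader.load( '//bits.wikimedia.org/he.wikipedia.org/load.php?debug=false&lang=he&modules=ext.gadget.autocomplete' );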

How would I configure autocomplete to work for other namespaces?

I use your Autocomplete script as a gadget on the All The Tropes wiki on the Orain wikifarm service and am quite pleased with it, but I was wondering how it could be configured to work in namespaces other than the main namespace, like our "Forum:" namespace?

Any help in fixing this would be appreciated. GethN7 (talk) 03:21, 20 May 2014 (UTC)[reply]

Small update, it seems that it works fine in the namespace itself, but does not work on LQT when in the edit window. If you can provide a patch for this, it would be appreciated. GethN7 (talk) 04:44, 20 May 2014 (UTC)[reply]
Hi GethN7, the script is designed mainly for editing articles and less for "discussion" pages, though they are very similar. I am not familiar with LQT, so I can't say why it doesn't work there (maybe the opensearch API isn't configured to search in that namespace?) Eran (talk) 21:23, 23 May 2014 (UTC)[reply]

User:Eran/refToolbarVe.js

Hi, I've been trying to get refToolbarVe working as an example of making a dialog. (I realize it's been supplanted by in-house methods). I thought maybe it had name conflicts with in-house cite dialog but couldn't get it working on localhost after a refactor either; can you confirm that it is still working? Also any tips on how to debug "dialog just doesn't show up" problems would be helpful as I've had the same problem with ones I've tried to write from scratch as well... :). Thanks for all your work on the VE gadget tutorials! Mvolz (talk) 09:10, 29 June 2014 (UTC)[reply]

Mvolz, thank you, and I'm glad the tutorials are helpful.
I think there was a change in the VE API, and we didn't fix it for that specific gadget, as there is already built-in support. To fix it:
  • change dialogFactory=>windowFactory
  • change 'dialog' => 'window'.
You can also take a look in User:ערן/veReplace.js which should work.
Another possible problem with dialogs not opening correctly may be the wrong order of the inheritance declarations (it is important to put the inheritClass call before the actual implementation of the declarations; see the sketch below). Eran (talk) 18:11, 29 June 2014 (UTC)[reply]
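A minimal sketch of that ordering point, assuming OOjs/OOUI conventions (MyDialog is a hypothetical class name, not part of any gadget here):
function MyDialog( config ) {
    MyDialog.super.call( this, config );
}
OO.inheritClass( MyDialog, OO.ui.Dialog ); // declare the inheritance first
MyDialog.static.name = 'myDialog';
MyDialog.prototype.initialize = function () {
    // ...only then implement the prototype members
    MyDialog.super.prototype.initialize.call( this );
};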
Hi Eran,
I removed that script from my common.js file just now, and a lot of problems with opening VisualEditor just cleared up, too. WhatamIdoing (talk) 22:52, 30 June 2014 (UTC)[reply]
There was some API break in VE (changes to method names) which we weren't aware of either, and it seems to break VE when it loads plugins that weren't updated according to these changes. I changed it now so it won't cause exceptions. Eran (talk) 04:29, 1 July 2014 (UTC)[reply]

Core ve changes that will affect veReplace.js

Just a heads up in case you didn't know, but there were some ve core changes that will affect veReplace (notably, the class ve.ui.Dialog has been replaced - try FragmentDialog - and the dialog constructor will need to take an additional argument, manager). The changes are getting deployed this Thursday at 11am PST on en-wiki; I don't know when on he-wiki! Mvolz (talk) 17:07, 21 July 2014 (UTC)[reply]

Mvolz, thanks for the notice :) Eran (talk) 17:25, 21 July 2014 (UTC)[reply]

Wikipedia:Manual of Style/Words to watch/Config

can I help develop it?--Gabrielchihonglee (talk) 10:52, 7 August 2014 (UTC)[reply]

Hi Gabrielchihonglee, I would like to have more people involved. Are you at the hackathon? I'm sitting in the corner (near the coffee table ;) ). Eran (talk) 14:32, 7 August 2014 (UTC)[reply]
I am sorry, I am in Hong Kong; I am just an online volunteer of Wikimedia 2014. --Gabrielchihonglee (talk) 00:45, 8 August 2014 (UTC)[reply]
Hi Gabrielchihonglee, I would really love to have the list of common words to avoid in languages other than English (such as Chinese), so it would be possible to use this tool in other languages. Another important improvement would be to expand the list with more words based on the manual of style. I already created a similar list for Hebrew (I don't edit English Wikipedia very much, so I didn't add many words in English).
Once such a list is created in another language, the tool is easy to adopt there with the following code:
/*
Load clippy for VE editing
*/
mw.config.set('WEASLE_WORD_PAGE', 'PAGENAME'); //replace the PAGENAME with the name in the local language
$.getScript('//en.wikipedia.org/w/index.php?title=User:%D7%A2%D7%A8%D7%9F/WeaselWords.js&action=raw&ctype=text/javascript');
I have a similar but different tool for wikitext editing that warns users about badly styled sentences. Once such a list is available in other languages, it will be possible to create more tools for improving user editing, not only the clippy ;) Thanks, Eran (talk) 19:29, 8 August 2014 (UTC)[reply]
1) I think I can help with translating the list of common words to avoid to zh-hant, zh-hans and zh-yue.
2) Can you talk a little bit more about creating other tools? Thanks!--Gabrielchihonglee (talk) 00:59, 9 August 2014 (UTC)[reply]
  1. Great.
  2. In hewiki there is a tool with various options to improve an article which works in wikitext editing: it includes warnings about usage of words to avoid and also helps to locate them (and other suggestions, such as replacing fair-use images with free ones, replacing a disambiguation link with the correct link, etc.). This tool, and other similar tools in other languages, use a hard-coded "dictionary" - and extracting such a dictionary out of these tools should make them work in other languages. I'll give a link to these other tools here later on. Eran (talk) 06:32, 9 August 2014 (UTC)[reply]
  1. Where is the list? And where should I put my translation?
  2. I will study it after I get the link.--Gabrielchihonglee (talk) 07:59, 9 August 2014 (UTC)[reply]
  1. The list can be created in the Wikipedia namespace, with some intro, then "-----", and then the list itself in the format "*WORD TO WATCH//DESCRIPTION", similar to the English list (see the parsing sketch after this list).
  2. Other similar tools are he:MediaWiki:Gadget-Checkty and ru:MediaWiki:Gadget-wfTypos - both work in wikitext. Both the Russian and the Hebrew gadgets use automatic replacement for "safe" common typos, and the Hebrew one also uses warnings/suggestions for the "words to watch" that aren't safe or possible to fix automatically. Eran (talk) 09:24, 9 August 2014 (UTC)[reply]
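A rough sketch of how a list page in that format could be parsed (hypothetical function, assuming the intro / "-----" / list layout described in point 1):
function parseWordsToWatch( pageText ) {
    // Everything after the first "-----" separator is the word list itself.
    var body = pageText.split( '-----' ).slice( 1 ).join( '-----' );
    return body.split( '\n' )
        .filter( function ( line ) { return line.charAt( 0 ) === '*'; } )
        .map( function ( line ) {
            var parts = line.slice( 1 ).split( '//' );
            return { word: parts[ 0 ].trim(), description: ( parts[ 1 ] || '' ).trim() };
        } );
}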
  1. Just did the translation into Chinese. See Wikipedia:避免使用的字詞/配置.
  2. Are you planning to develop other tools? If yes, just tell me and I may help. :)--14.198.2.193 (talk) 13:06, 9 August 2014 (UTC)[reply]
  1. Awesome!! Did you write about this in the Village pump, so users will know it exists and can use it?
  2. Sure. I'm going to refactor the hewiki tool to be more generic and use this external list instead of a hard-coded list, and then users will be able to use this list of "words to watch" in the classical editor as well.
Many thanks, Eran (talk) 13:17, 9 August 2014 (UTC)[reply]
I am sorry, but I don't know he. Are there any new tools in en? --Gabrielchihonglee (talk) 14:18, 10 August 2014 (UTC)[reply]

A barnstar for you!

The Original Barnstar
For the great work you have done on the copy and paste detection bot. Doc James (talk · contribs · email) (if I write on your page reply on mine) 01:39, 22 August 2014 (UTC)[reply]

BAGBot: Your bot request EranBot

Someone has marked Wikipedia:Bots/Requests for approval/EranBot as needing your input. Please visit that page to reply to the requests. Thanks! AnomieBOT 03:59, 22 August 2014 (UTC) To opt out of these notifications, place {{bots|optout=operatorassistanceneeded}} anywhere on this page.[reply]

EranBot

moved from User talk:EranBot 12:02, 22 August 2014 (UTC)[reply]

Jmh649 is this your bot account? If not, who is the operator? Do you plan to seek bot approval? — xaosflux Talk 01:07, 22 August 2014 (UTC)[reply]

Yes, I plan to seek bot approval, and no, this is not my bot (I have no idea how to write bots). Just got back from Wikimania and still catching up. Doc James (talk · contribs · email) (if I write on your page reply on mine) 01:13, 22 August 2014 (UTC)[reply]
This account says "There is NO plan for this bot to made edits to mainspace.", I've removed autopatrolled as there should be no impact to NPP. — xaosflux Talk 02:00, 22 August 2014 (UTC)[reply]
Hi Xaosflux, I requested the autopatrolled right to avoid CAPTCHA in edits, which prevents the bot from running automatically. Autopatrolled is the weakest right possible for a new account according to Special:UserGroupRights (skipcaptcha right). Anyway, once the bot finishes its trial period and gets added to the bot group, there will be no need for it. Thanks, Eran (talk) 07:56, 22 August 2014 (UTC)[reply]
Have given it back. Doc James (talk · contribs · email) (if I write on your page reply on mine) 09:00, 22 August 2014 (UTC)[reply]
OK, of course the 'bot' flag should take care of this long term. — xaosflux Talk 12:33, 22 August 2014 (UTC)[reply]

Attention Bot Operator @ערן:: Please seek your bot trial approval at WP:RFBOT. — xaosflux Talk 02:07, 22 August 2014 (UTC)[reply]

Have added it here Wikipedia:Bots/Requests_for_approval#Current_requests_for_approval. Doc James (talk · contribs · email) (if I write on your page reply on mine) 02:47, 22 August 2014 (UTC)[reply]
See bot request, trial in userspace is good. — xaosflux Talk 12:37, 22 August 2014 (UTC)[reply]
Please see the thread at User_talk:Jmh649#User:EranBot_account_flags, if you have discovered a bug I am interested. — xaosflux Talk 04:27, 24 August 2014 (UTC)[reply]

Discussion of improvements

Hey Eran. Have started going over the diffs here User:EranBot/Copyright. Wondering about a few adjustments as explained. The bot did pick up one positive, and I have had a chance to educate a user.  :-) Doc James (talk · contribs · email) (if I write on your page reply on mine) 04:45, 23 August 2014 (UTC)[reply]

Also wrote some comments here Wikipedia:MED/Copyright Doc James (talk · contribs · email) (if I write on your page reply on mine) 06:30, 23 August 2014 (UTC)[reply]
Hi Doc James, I started to go over the first few edits (from the bottom). It did find a possible copyvio in Temporomandibular joint dysfunction, and I reverted this edit and notified the user. In any case, I added a new column to the table, "status", so editors who go over the list can record the status of each edit (TP/FP), and then we can do fine-tuning to improve the precision. cheers, Eran (talk) 07:39, 23 August 2014 (UTC)[reply]
Excellent. It should be fairly easy to exclude edits that are reverts, correct, as they are tagged as such in the edit summary? Doc James (talk · contribs · email) (if I write on your page reply on mine) 08:02, 23 August 2014 (UTC)[reply]
Looks like you have already fixed this :-) Doc James (talk · contribs · email) (if I write on your page reply on mine) 08:02, 23 August 2014 (UTC)[reply]

These all appear to be exceedingly poor-quality sources. Can we leave them out? Doc James (talk · contribs · email) (if I write on your page reply on mine) 22:04, 23 August 2014 (UTC)[reply]

For listing sites to skip, does this format work: User:EranBot/Copyright/Blacklist? Doc James (talk · contribs · email) (if I write on your page reply on mine) 22:15, 23 August 2014 (UTC)[reply]
For now I added it manually. Eran (talk) 04:12, 24 August 2014 (UTC)[reply]

Hidden text

Wondering if we could add the text of the line that goes with the status, such as I did here [2]

Once again many thanks for your excellent work. Doc James (talk · contribs · email) (if I write on your page reply on mine) 01:22, 24 August 2014 (UTC)[reply]

Doc James, yes, that is OK. Using <ref name="XX"/> to link to a follow-up comment (adding a <references><ref name="XX">COMMENT</ref></references> later) should be OK too. But it should be consistent, in the sense that the first word should be TP/FP. Eran (talk) 04:18, 24 August 2014 (UTC)[reply]

Bot runs

How many times is the bot run per day? Doc James (talk · contribs · email) (if I write on your page reply on mine) 01:06, 25 August 2014 (UTC)[reply]

Doc James, Every 3 hours. Eran (talk) 04:13, 25 August 2014 (UTC)[reply]
Thanks. Doc James (talk · contribs · email) (if I write on your page reply on mine) 05:29, 26 August 2014 (UTC)[reply]

"connection error"

The sources with the label "connection error" here are useless. Can we leave them out? Doc James (talk · contribs · email) (if I write on your page reply on mine) 05:29, 26 August 2014 (UTC)[reply]

Wikipedia forks and mirrors

Can the bot check suspected positives against listings in Wikipedia:Mirrors_and_forks subpages? That might improve the FP rate. Many of the forks there have page names that systematically derive from the WP article title. Might as well tilt the Whack-A-Mole game in our favour. LeadSongDog come howl! 17:10, 26 August 2014 (UTC)[reply]

It also seems to be ignoring entries in the blacklist, e.g.
Some guidance on how those entries should be formatted would help. LeadSongDog come howl! 15:23, 28 August 2014 (UTC)[reply]
Hi LeadSongDog (I'm also responding to @Jmh649), it seems that Wikipedia:Mirrors_and_forks is for humans, not for machines ;) There is no consistency in the format of the mirrors & forks list: sometimes it is with/without nowiki, sometimes the link is to the main page ([3]) or another page ([4]), and sometimes it includes the prefix of the site ([5]). It isn't possible to cut the suffix, as sometimes only part of the site is a mirror.
Regarding the blacklist, I haven't yet written a parser for it (I just added some entries manually). I guess we would like to have something similar to User:CorenSearchBot/exclude so we can reuse it.
In any case, I made some fixes to the bot so it can rank low-quality sites and hint at mirrors based on their content. If we find these hints reliable enough, we can remove those sites automatically. Eran (talk) 08:02, 30 August 2014 (UTC)[reply]
Great. If you want us to start creating a list like that for CorenSearchBot/exclude, we can. Would the bot automatically follow that then? Doc James (talk · contribs · email) (if I write on your page reply on mine) 08:35, 30 August 2014 (UTC)[reply]
We haven't yet written a parser for the blacklist, but once we do... :) (tagging @Ladsgroup). Eran (talk) 08:52, 30 August 2014 (UTC)[reply]
A few ideas to consider, though I'm not sure how to implement them... The parser should ignore the http/https/ftp distinction, as many sites serve the same paths over multiple protocols. It could also ignore the top-level domains, for that matter. It should derive or share entries from User:CorenSearchBot/exclude and Wikipedia:Mirrors_and_forks/All. It should be language independent (via Wikidata?) LeadSongDog come howl! 20:40, 30 August 2014 (UTC)[reply]
OK, now the blacklist should work.
  • The blacklist is based on regular expressions, so there are no problems with the http/ftp protocols (unless the protocol is explicitly part of a regex in the blacklist).
  • User:CorenSearchBot/exclude - the blacklist has almost the same format, so we can fork it. Unfortunately, Wikipedia:Mirrors_and_forks/All isn't machine readable in its current state, as there is no consistency in the way URLs and alternative URLs are written.
Eran (talk) 21:21, 30 August 2014 (UTC)[reply]
Well, the vast majority of the URLs are preceded by either "URL" or "website". Most of the other info on that list is of limited use for this bot. LeadSongDog come howl! 22:47, 30 August 2014 (UTC)[reply]
I forked User:CorenSearchBot/exclude into the blacklist and reformatted the entries. One way or the other, so far we seem to be catching the bulk (if not all) of the mirrors. Would it be possible for the report entries to include a two- or three-word string that matched? That would make it much quicker for reviewing humans to localize the troublesome part of the text. LeadSongDog come howl! 17:22, 3 September 2014 (UTC)[reply]

Still picking up reverts

Such as this edit [6] Doc James (talk · contribs · email) (if I write on your page reply on mine) 01:47, 27 August 2014 (UTC)[reply]

Similarly, editors who reorganize text by cut-save-paste-save (in separate edits) are getting picked up. Perhaps use a diff from the latest version by a different editor vice just the latest version? LeadSongDog come howl! 13:22, 10 September 2014 (UTC)[reply]
That would be more difficult technically. We need to be able to splice out blocks of text. Moving text does not count as a revert. Doc James (talk · contribs · email) (if I write on your page reply on mine) 13:25, 10 September 2014 (UTC)[reply]
Isn't it just a question of which version to diff from? Even just comparing against yesterday's version would seem better than arbitrarily using the "last" one. LeadSongDog come howl! 02:10, 11 September 2014 (UTC)[reply]
We just sorted out an issue where a copyvio by an indef'd copypaster that had been removed was restored by a good faith edit by Formerly98. The existence of mirrors could make this kind of thing **very** confusing. I'm thinking that it would be useful to (eventually) have the bot apply something like wikiblame to help determine the first introduction of the suspect text into the article. This might, of course, be too resource-intensive to be practical. LeadSongDog come howl! 17:10, 18 September 2014 (UTC)[reply]
In some sense it is feasible - the bot can add a link to search in wikiblame (but only for a small fraction of the added text) :) Eran (talk) 17:27, 18 September 2014 (UTC)[reply]

EranBot suggestion

Also pinging User:Jmh649. Would it be useful to have a bot (perhaps written by someone else) that builds a list of Wikipedia mirrors, to remove them from EranBot's false positives? I was thinking this might be worth requesting at WP:BOTREQ, though it's not really an internal bot function at all. I think it could be done by taking some characteristic text from multiple pages and looking for domains that come up with positive matches on a high percentage of them. Mike Christie (talk - contribs - library) 01:58, 12 September 2014 (UTC)[reply]

We have a list of 2000 mirrors. They just need to be converted into machine readable format. User:Ocaasi has details. Doc James (talk · contribs · email) (if I write on your page reply on mine) 07:10, 12 September 2014 (UTC)[reply]
The bot looks for characteristic text of mirrors, or standard attribution to Wikipedia, and if it does find it, it adds a "Mirror" suggestion for the blacklist (this is just a suggestion and isn't added to the blacklist automatically). However, there are many sites which copy Wikipedia content without attribution (which is, BTW, a copyright violation on their side). Eran (talk) 07:15, 12 September 2014 (UTC)[reply]
I understand. I was just thinking that the process of identifying both full-fledged mirrors and sites with substantial copyvios is something that might usefully be subcontracted, so to speak, to another bot. It could generate lists of mirrors in machine readable format, and keep the list up to date as new ones are found. Mike Christie (talk - contribs - library) 14:24, 12 September 2014 (UTC)[reply]

I compiled a list of mirrors from the WMF plagiarism study last year, which may have some additional sites that aren't on the bigger list. These are the ones that were coming up frequently in Grammarly's plagiarism matching system (already in the form of regex, too): User:Sage (Wiki Ed)/mirrors.--Sage (Wiki Ed) (talk) 19:55, 12 September 2014 (UTC)[reply]

Great - we can fork it into the blacklist and avoid checking them. I think some mirrors are actually "grey list" mirrors - sometimes they have their own content and are not always mirroring Wikipedia. I'm not sure how we should handle them. Eran (talk) 17:59, 15 September 2015 (UTC)[reply]
I went ahead and incorporated them; if the FPs don't come down, we're nowhere. I changed the regex form to match. Thanks, Sage! LeadSongDog come howl! 18:58, 16 September 2014 (UTC)[reply]
I've added the Wikipedia:Mirrors_and_forks entries too, for a total of 1563. Let's hope the grey list mirrors are not too much of an issue. I think it's better than rejecting good faith edits for the wrong reason. LeadSongDog come howl! 18:57, 17 September 2014 (UTC)[reply]

Still trapping on added refs

This edit only added a ref. Why was that caught, if the ref tagged content is supposedly removed? LeadSongDog come howl! 15:51, 18 September 2014 (UTC)[reply]

A little tougher case at [7], where the refs are not wrapped in tags. LeadSongDog come howl! 03:06, 22 September 2014 (UTC)[reply]
  • Antimicrobial - the change itself isn't a copyvio, but since it changes a word and adds a reference, it is considered a change to the paragraph, and the whole paragraph was sent for checking. Maybe it would be possible to check the diff against the content again to prevent such changes from appearing here, as the size of the actual change is just a word.
  • In general, copyright also protects collections of facts, so copying a full (long) list of bibliographic references may be considered a copyright violation, though of course this is not the case here. I'm not sure how we should handle such edits. Maybe the diff size threshold should be higher? Eran (talk) 17:48, 22 September 2014 (UTC)[reply]
Not sure I get your point. The diff should be the only part going for checking, at least for now. The balance of the text has been in the article for years. Wasting human effort on chasing mirrors is rather pointless, as there will always be more of them. Especially wasteful as at least half of the mirrors are readily identifiable by the string "Wikipedia" which they contain. LeadSongDog come howl! 22:30, 22 September 2014 (UTC)[reply]

LeadSongDog, Doc James: Notice the 2 following changes in the bot:

  1. report link - added links to iThenticate for comparing the diffs (this already appears in the last runs)
  2. the diff is now at sequence resolution instead of line resolution (for the next bot runs): i.e., instead of comparing line by line, the diff is compared char by char. For example, the diff in [8] is the addition of a single sentence ("Voluntary counseling... test positive"), instead of the replacement of the whole paragraph "Programs encouraging..." with a new paragraph containing similar text plus that sentence.

This is definitely the case in the example, and I think in general it should be better, but please let me know if you see weird diffs or wrong behavior. This change also affects the number of diffs being sent to the iThenticate service: the example diff above will not be sent to iThenticate, since it is too small a diff (only 163 characters). Thanks, Eran (talk) 13:10, 25 September 2014 (UTC)[reply]

Simpler idea

Given that many edits actually do provide a URL or citation to the source used (imagine that!!!), it would seem prudent to first check the edit diff against such cited sources before going further afield to look at everything else. Is that an available option? LeadSongDog come howl! 16:56, 25 September 2014 (UTC)[reply]

It may be possible. Can you please provide 1-3 example diffs for which it should say the content is similar to the referenced URL? Eran (talk) 17:27, 25 September 2014 (UTC)[reply]
Recently, [9] cited [10], [11] cited "<ref><ref>Stanford University microbiologist Nathan Wolfe quoted in National Geographic article on Microbes - Jan 2013 pg 141</ref></ref>" and [12] cited "<ref>{{cite web | url=http://www.womenshealth.gov/publications/our-publications/fact-sheet/hashimoto-disease.html | title=Hashimoto's disease fact sheet | publisher=Office on Women's Health, U.S. Department of Health and Human Services, womenshealth.gov (or girlshealth.gov) | date=July 16, 2012 | accessdate=23 November 2014 | reviewed by=Cooper MD, DS |}}</ref>". Each of these NCP violations was caught, but they were just that, not failures to cite. LeadSongDog come howl! 18:19, 25 September 2014 (UTC)[reply]

Same text added and removed in diff

The quality of results is really coming along. Nice work so far! Checking out the recent false positives, it seems like aside from mirrors, the edits that cause problems are mainly ones like this one (report). Here, the block of text with a match was modified, but the actual matching text was not the part that was added. It seems like in a case like this where there's an <ins class="diffchange diffchange-inline"> tag inside the ins tag, only the content inside the tags should be checked.--Sage (Wiki Ed) (talk) 20:28, 20 October 2014 (UTC)[reply]

Thank you. It already removes parts of the text that already exist in the previous revision, but this works at the paragraph level. I just added a small tweak to do so also for smaller chunks of the text. Eran (talk) 22:03, 20 October 2014 (UTC)[reply]

VE gadget

I've left some comments at User talk:ערן/veReplace.js about an upcoming breaking change. ESanders (WMF) (talk) 13:38, 28 November 2014 (UTC)[reply]

Have these changes been made? Doc James (talk · contribs · email) 05:22, 11 December 2014 (UTC)[reply]

A barnstar for you!

The Defender of the Wiki Barnstar
For creating Wikipedia's most important bot, one that has kept hundreds if not thousands of pieces of copyright violation out of Wikipedia. Doc James (talk · contribs · email) 01:55, 11 December 2014 (UTC)[reply]

BAGBot: Your bot request EranBot 2

Someone has marked Wikipedia:Bots/Requests for approval/EranBot 2 as needing your input. Please visit that page to reply to the requests. Thanks! AnomieBOT 23:38, 18 December 2014 (UTC) To opt out of these notifications, place {{bots|optout=operatorassistanceneeded}} anywhere on this page.[reply]

Getting duplicate reports

The bot reported here and then on the next run again here on the same two infractions. LeadSongDog come howl! 16:27, 16 January 2015 (UTC)[reply]

New buttons

I love these new buttons here [13]. Can we begin using the new setup for the medical articles?

Can we switch the buttons such that "green" is no copyvio/FP and "red" is copyvio/TP?

Doc James (talk · contribs · email) 04:20, 1 March 2015 (UTC)[reply]

Concur, bravo. Le Prof. 71.201.62.200 (talk) 17:40, 30 June 2015 (UTC)[reply]

Problem getting reports

I can't get reports any more. It looks like something's gone off with the tool server? Please have a look. LeadSongDog come howl! 13:04, 15 April 2015 (UTC)[reply]

LeadSongDog, it should be fixed. Thanks! Eran (talk) 19:14, 15 April 2015 (UTC)[reply]
That's got it, thank you. LeadSongDog come howl! 21:33, 15 April 2015 (UTC)[reply]

No updates to /rc lately

Hi! I noticed that EranBot hasn't updated the global recent changes copyvio page in a while. Is that on hold for the time being?--Sage (Wiki Ed) (talk) 16:50, 8 June 2015 (UTC)[reply]

Hi Sage (Wiki Ed), I just restarted it (and moved all the old reports to User:EranBot/Copyright/rc/archive). Eran (talk) 19:22, 8 June 2015 (UTC)[reply]
Can we have the bot break down the edits by week, maybe? When the lists get too large they are hard to load. Doc James (talk · contribs · email) 21:28, 8 June 2015 (UTC)[reply]
Ideally we could set it up such that more loads the farther one scrolls down, like the "new page patrol". How hard is that to do? Doc James (talk · contribs · email) 21:29, 8 June 2015 (UTC)[reply]

Okay, I have split the nearly one-million-byte page into 7 subpages. Eran, how do we get the "javascript" to work on those 7 subpages? Doc James (talk · contribs · email) 21:55, 8 June 2015 (UTC)[reply]

By the way how often does the bot die and therefore how often does it need restarting? Might be good to have a few of us able to restart the bot. Doc James (talk · contribs · email) 21:56, 8 June 2015 (UTC)[reply]

Eranbot, positive found in ref tag

Hi. This edit was recently tagged by Eranbot. I'm not too familiar with the bot, but it appears that it is supposed to skip checking citations, but it caught this addition of a quote inside a ref tag. If the bot is not intended to skip citations, it might be worth ignoring content added to the "quote" parameter in citation templates (for example: {{cite web|url=...|quote=Ignore this}}) Such content is always going to be picked up by the bot, but probably should not be. I skimmed the source, and it seems to run a few regexes before submitting to Turnitin. It should be possible to remove the quote with something simple like... /quote=[^|<]*/i Let me know if I can be of any help. Thanks.   — Jess· Δ 17:39, 21 June 2015 (UTC)[reply]

Hi Mann_jess, thank you for noticing it. The bot intentionally doesn't skip citations, as copying a large bulk of text may be considered a copyright violation. The bot handles citations/quotes as follows:
  • Quotes with a small number of words are skipped (see the specific threshold in GitHub; it is called "WORDS_QUOTE", and it may be possible to think of a better threshold)
  • If the source is mentioned within the text, there is a green "citation" indication that tells watchers it's probably OK (hint - but it is still good to take a quick look at it)
The bot doesn't handle specific enwiki templates (e.g. cite web), to allow compatibility with other wikis and languages, though it may be possible to think of improvements here and use configuration to specify the citation templates for each wiki. If you want to improve the bot or add such handling, you are more than welcome to submit a patch to the bot on GitHub :) Eran (talk) 17:59, 21 June 2015 (UTC)[reply]
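To illustrate Jess's suggestion above, a pre-filter along these lines could strip quoted material from citation templates before a diff is checked (a sketch only, not the bot's actual code; the regex is an assumption generalized from the one proposed above):
function stripQuoteParams( wikitext ) {
    // Drop "|quote=..." up to the next pipe, closing brace, or tag.
    return wikitext.replace( /\|\s*quote\s*=\s*[^|}<]*/gi, '' );
}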
Yes, that is an interesting question. How big a quote should we allow in "quote=" before it contravenes Wikipedia:Non-free_content_criteria? Doc James (talk · contribs · email) 13:14, 22 June 2015 (UTC)[reply]
There's no simple answer in copyright issues. It often hinges on how much the quote reproduces the central idea of the work and on whether the quote devalues the original, rather than word counts, but at least attribution addresses the plagiarism problem. Usually, though, I get pretty nervous after about 50 words. LeadSongDog come howl! 13:46, 22 June 2015 (UTC)[reply]
Agree, we must make sure we are not decreasing the value of the original work through large quotes. Doc James (talk · contribs · email) 13:55, 22 June 2015 (UTC)[reply]

Possible Eranbot issue, clarifications requested

Doc James, Eran, et al., in a recent set of edits to a historical article (edits that were large in toto, but added no new material that was not sourced), large blocks of pre-existing material were moved about within the article. This prompted a response to me from another editor, citing Eranbot as the source of the plagiarism charge, which leads me to ask these questions:

  • Is an Eranbot plagiarism finding triggered when large blocks of pre-existing plagiarised material are moved about within an article (as opposed to added afresh)?
  • While the stated emphasis of this tool is medical pages, how is it that, in this case, a biographical/political article was caught up?
  • Relatedly, can Eranbot be used manually by editors, and if so how?
  • Finally, can the Turnitin / iThenticate algorithm be run against articles, generally (versus large edits)? There is near constant need to determine when existing, past-generated blocks of text are cribbed. The mirror site issue makes manual performance of this on existing text in articles a significant problem.

RSVP here, cheers, User:Leprof 7272. Le Prof 71.201.62.200 (talk) 18:01, 30 June 2015 (UTC)[reply]

Leprof 7272:
  • In the first phase, the bot checks for any large bulk of text added to an article, and isn't aware of whether it is moved content or fresh content. It is hard to identify old plagiarism, as old content may have been either copied from Wikipedia or copied to Wikipedia. The bot handles this by trying to avoid (not always with 100% success) moved content, and searching for the text in older revisions.
  • Eranbot also scans recent changes - see User:EranBot/Copyright/rc
  • For manually validating copyright issues there is Earwig's great tool: https://tools.wmflabs.org/copyvios/
  • The bot can run on articles rather than edits, but we don't use it that way on a regular basis, as there are many sites that copy from Wikipedia (even when excluding mirrors).
Eran (talk) 20:46, 30 June 2015 (UTC)[reply]
Yes, the person doing the follow-up should be checking to see if the text in question was pre-existing in the article. If one states in the edit summary that they have moved text around, that helps those doing the follow-up.
Agree with Eran. If the text has been long-standing, it will have been mirrored across the Internet, and thus this tool is no good at looking at old text, as there are too many false positives. Doc James (talk · contribs · email) 22:17, 30 June 2015 (UTC)[reply]
Thanks all for replies, am reviewing now, but expect little further comment. Cheers. Le Prof Leprof 7272 (talk) 22:22, 2 July 2015 (UTC)[reply]

Copyvio - textbook not online

A reader reported a potential copyvio to the Wikimedia Foundation. It hasn't been tagged, and my guess is that this is because the source material is a college textbook and not online. I believe the current search bot will find things online but not sources that are printed. My hope is that Turnitin has access to a broader base of material, including college textbooks. If this is true, could you try, or better yet tell me how to try, running it against Cyclic decomposition theorem?--S Philbrick(Talk) 15:16, 3 July 2015 (UTC)[reply]

Yes, this bot has access to a much larger range of published sources.
This bot ONLY runs on new edits to Wikipedia. It does not run on content from years ago such as 2013.[14]
It is more to prevent further problems going forward rather than to fix problems that occurred in the past.
The problem is that once content has been on Wikipedia for some time, it is mirrored across 1000s of other websites. Thus it takes a lot of work to determine who copied from whom. Doc James (talk · contribs · email) 17:19, 3 July 2015 (UTC)[reply]
I am familiar with the challenges involved in running comparison software against old articles. I'm not suggesting that it be unleashed on all of Wikipedia. However CSB had an option for running it against a single article. I was hoping that this software could be run against this article to see if it uncovers the underlying text which I hope is in a database. I may be wrong on that assumption in which case it would be a waste of time. When I read about the software a few years ago I thought one of its strengths was that it had a database of documents as opposed to simply comparing to online material. I thought the document database included prior essays submitted by students, which wouldn't help in this situation but I had hoped that it might also include college textbooks. Again if that is incorrect this won't help and will have to try alternatives.--S Philbrick(Talk) 18:56, 3 July 2015 (UTC)[reply]
S Philbrick, the bot supports this capability of checking specific articles; e.g. I ran:
python plagiabot.py -page:"Cyclic decomposition theorem" -lang:en
The result for Cyclic decomposition theorem can be found here. (In the report, press the arrow that appears on mouse-over of the 100% in the right panel, and look for the specific sources (Publications and Crosscheck).) Eran (talk) 20:14, 3 July 2015 (UTC)[reply]
Thanks for running that. No hit, but that may simply mean the source is not in the database; I'll proceed in other ways.--S Philbrick(Talk) 13:45, 4 July 2015 (UTC)[reply]

Make EranBot /rc pages filterable

Currently we have User:EranBot/Copyright/rc as a huuuuge list that includes items already handled in addition to ones still pending. That makes it hard to find ones to do if I find a few minutes of free time (especially moving forward as more and more of the pending ones get done). Also, there are wikiprojects listed and notices posted to their various talkpages. It would be nice if this could all be integrated into the dynamic notice-boards that some projects have, where only notices relevant to that project (such as AFD, GA nom, etc.) are displayed directly on their talkpages. Both of these could be implemented by a filtering mechanism, where {{plagiabot row2}} could take a {{{status_filter}}} and {{{tags_filter}}} that, if a value is passed, not display unless the value matched the actual {{{status}}} or were listed in the {{{tags}}}. The /rc page would propagate them. Then we could use the /rc page as a template, and pass the _filter fields to limit which entries are displayed for various purposes. DMacks (talk) 08:55, 5 July 2015 (UTC)[reply]

The page is currently filterable by wikiproject. You need to add the importScript bit, and then you can click on expand and select which wikiprojects you wish to see.
We are going to work on auto archiving at Wikimania
This is a great idea "integrated into the dynamic notice-boards" thanks
Ideally we will one day move over to something like the NewPagesFeed formatting. We could then have separate feeds for each wikiproject and autoarchiving. Doc James (talk · contribs · email) 16:18, 5 July 2015 (UTC)[reply]
That split by projects should see some urgency. The current approach is pretty much unworkable without a massive increase in the workforce. LeadSongDog come howl! 18:40, 25 August 2015 (UTC)[reply]
Eran, where are we at with this? User:Harej, now that we have the data in a database, can we create separate modules for each Wikiproject? Doc James (talk · contribs · email) 20:46, 25 August 2015 (UTC)[reply]
  • There is a database and API for integrating the tool into WikiProject modules; this task is currently waiting for User:Harej.
  • Kaldari asked me to expose the sources themselves rather than the output of the bot, so he will later be able to use the sources in the Copyvio Detector tool as well; I haven't done this yet. Eran (talk) 04:00, 31 August 2015 (UTC)[reply]
You might want to subscribe to/comment at phabricator:T109318. LeadSongDog come howl! 22:03, 16 December 2015 (UTC)[reply]

EranBot for sv.wp

@Eran, Doc James, and Ocaasi: Can EranBot be activated on sv.wp? It would be great for fighting copyvio there. If there is a problem with bot flags or something, perhaps the results could be listed on enwp? I don't know; all I know is that I would very much like it :D (tJosve05a (c) 20:39, 21 July 2015 (UTC)[reply]

Josve05a: Based on the iThenticate FAQ, Swedish is supported by the copyright detection service. I can run it for a test run on the recent changes and create a report page. If it works, we can run it periodically (X times each day) or in online mode (in that case I will nominate the account for bot rights). Eran (talk) 15:04, 24 July 2015 (UTC)[reply]
Great! Doing a test run on the RC-feed sounds great! Feel free to ping me when it is ready to be "reviewed". (tJosve05a (c) 16:47, 24 July 2015 (UTC)[reply]
Josve05a, I ran it on the last 12 hours of svwiki and couldn't find even one copyright concern (~68 diffs with large enough content). I'm not sure whether svwiki recent changes is fully clean of copyright issues (in the last 12 hours) or there is simply no rich repository of sv texts in iThenticate. I can try to re-run it at a different time and see if this observation is reproduced. Eran (talk) 13:13, 25 July 2015 (UTC)[reply]
Yeah, svwp doesn't get that much copyvio/spam. But it would be great to know if this was an abnormality, or just "normal" sv.wp-behaviour. (tJosve05a (c) 14:19, 25 July 2015 (UTC)[reply]
Have you had time to run the scans again to see if there has been any different results? (tJosve05a (c) 15:43, 3 August 2015 (UTC)[reply]
Running now. I'm not adding it yet to an automatic scheduled job/cron, since I'm not sure there are relevant sv sources indexed in the service we use. If we do find suspected edits, I'll enable it. Eran (talk) 19:54, 3 August 2015 (UTC)[reply]
Can you run it against a specific page or edit on command? I know of two articles (in sv:Kategori:Wikipedia:Plagiat) which have copyvios on sv.wp; could you check one of those articles and see if it "finds a match"? (tJosve05a (c) 11:01, 4 August 2015 (UTC)[reply]
Josve05a, I ran it only on one of them: https://tools.wmflabs.org/eranbot/ithenticate.py?rid=18905816 . So it does find the sources, but note the maximum similarity is only 33%, which is lower than the current threshold of the bot. Eran (talk) 20:09, 4 August 2015 (UTC)[reply]

Abraham Isaac Kook

Is there an experienced editor fluent in English and Hebrew, who knows how to properly add the critically important supporting document and picture links presented on the talk page, to the article page? Ksavyadkodesh (talk) 18:15, 25 August 2015 (UTC)[reply]

Add notification to article talkpages?

Any views on having the bot also post a templated message to talkspace? Something along the lines of

A bot has detected similarities between text in this article and other published sources. Please examine recent edits for possible copyright violations.

This would seem more likely to get attention than burying the notice on the bot subpage. Your thoughts? LeadSongDog come howl! 20:22, 21 September 2015 (UTC)[reply]

WikEd support

Plugin does not work with Wikipedia:WikEd... — Preceding unsigned comment added by Gizmocorot (talkcontribs) 17:25, 15 October 2015 (UTC)[reply]

Bot hung?

Nothing new seen for weeks. Is the bot down? LeadSongDog come howl! 20:54, 23 October 2015 (UTC)[reply]

EranBot for en.wv

I saw a write-up on your efforts at http://www.eschoolnews.com/2015/11/02/turnitin-wikipedia-copyright-068/ . Great work! Would it be possible to run EranBot on English Wikiversity? -- Dave Braunschweig (talk) 21:22, 3 November 2015 (UTC)[reply]

@Dave Braunschweig: - Yes, it is possible to run the bot on other wikis. For this we have to import the relevant templates and pages to Wikiversity (Template:Plagiabot row2). I'm going to run it once just as a proof of concept (see wikiversity:User:EranBot/Copyright). If you would like it to run on a regular basis, please discuss it with the community and reach community consensus for this. This is important not only for the consensus itself, but also to be sure there are people who are interested in monitoring this (e.g. adding the page to their watchlist and handling the reported copyvios). EranBot (talk) 21:43, 3 November 2015 (UTC)[reply]

ithenticate

So I don't know what ithenticate is (I thought it was a typo); the text mentions it twice with no links or anything. -- GreenC 01:46, 25 April 2016 (UTC)[reply]

Green: fixed, thank you for the comment. Eran (talk) 04:20, 25 April 2016 (UTC)[reply]

Bot has stopped

Hi Eran. The bot seems to have stopped sometime yesterday. If you could have a look whan you have time, that would be perfect. Thanks, — Diannaa (talk) 23:34, 4 May 2016 (UTC)[reply]

Diannaa, the bot session expired (i.e. it got logged out). It just logged in again, and its next scheduled run should work soon. Eran (talk) 04:28, 5 May 2016 (UTC)[reply]
The bot seems to have stopped again. Thanks, — Diannaa (talk) 14:49, 29 May 2016 (UTC)[reply]
Thanks. I restarted it. Eran (talk) 17:31, 30 May 2016 (UTC)[reply]
Seems to have stopped again; nothing for an hour and a half, when we normally get five or ten per hour. Thanks, — Diannaa (talk) 22:03, 30 May 2016 (UTC)[reply]
Seems to have stopped again; nothing for two hours. — Diannaa (talk) 20:17, 1 June 2016 (UTC)[reply]
It is working now. I think there was a bug causing it to crash. Hopefully I fixed it. Eran (talk) 21:19, 1 June 2016 (UTC)[reply]
Thank you! I will report back if there's any further issues. — Diannaa (talk) 21:32, 1 June 2016 (UTC)[reply]
Bad news, it's stopped again. — Diannaa (talk) 03:25, 2 June 2016 (UTC)[reply]
Seems to have stopped about 10 hours ago. Thanks, — Diannaa (talk) 18:51, 4 June 2016 (UTC)[reply]
Thank you for the report. I restarted it. Eran (talk) 20:50, 4 June 2016 (UTC)[reply]
Hi Eran. The bot seems to have stopped again; no new reports for about 5 hours. If you could have a look when you have time, that would be great. Thanks, — Ninja Diannaa (Talk) 03:40, 17 June 2016 (UTC)[reply]
Rolling again now; thanks. — Ninja Diannaa (Talk) 04:52, 17 June 2016 (UTC)[reply]
The bot has stopped again. Please have a look when you have time. Thanks, — Diannaa (talk) 23:55, 5 July 2016 (UTC)[reply]
The bot seems to have stopped about 5 hours ago. If you could have a look when you have time, that would be great. Thanks, — Diannaa (talk) 01:41, 11 August 2016 (UTC)[reply]
Bot has stopped again. Thanks, — Diannaa (talk) 04:07, 19 August 2016 (UTC)[reply]
It looks like the bot has stopped again. Thanks, — Diannaa 🍁 (talk) 05:15, 6 September 2016 (UTC)[reply]
Diannaa, thank you. It seems that the bot has returned to work without any further action from my side. Eran (talk) 20:44, 7 September 2016 (UTC)[reply]
Maybe people stopped adding copyright violations! One can only hope, — Diannaa 🍁 (talk) 22:05, 7 September 2016 (UTC)[reply]
The bot appears to be down; no reports for ten hours. Thank you, — Diannaa 🍁 (talk) 17:18, 11 September 2016 (UTC)[reply]
It may be down again; no reports for 3.5 hours. Thank you, — Diannaa 🍁 (talk) 19:26, 15 September 2016 (UTC)[reply]

I fixed the death date. Does Hebrew Wiki add the Hebrew date to biographical articles? If so, you can find it on the article's talk page. Geewhiz (talk) 04:50, 11 September 2016 (UTC)[reply]

Error message

I am getting an error message when I try to load https://tools.wmflabs.org/copypatrol. "Slim Application Error" — Diannaa 🍁 (talk) 13:37, 18 September 2016 (UTC)[reply]

Diannaa, thank you for the report. The problem should be fixed by now[15]. I'm also pinging Niharika and MusikAnimal to review my fix since I'm not familiar with the web interface code of CopyPatrol. Eran (talk) 18:30, 18 September 2016 (UTC)[reply]
Yes, I am examining some reports right now, trying to get things caught up as I will be away next week, with less time to edit. Thank you, — Diannaa 🍁 (talk) 18:31, 18 September 2016 (UTC)[reply]
It seems that the reason for the failure is Special:Diff/739950162, in which a username was hidden, and the code didn't take such cases into account. Eran (talk) 18:33, 18 September 2016 (UTC)[reply]

Autocomplete plug-in not working

I have been using this plugin for a long time and had no problems until a few days ago. What is the problem? Thank you for your help. Charlotte Allison (Morriswa) (talk) 01:48, 22 September 2016 (UTC)[reply]

Charlotte Allison (Morriswa): I added a fix. It should work now (you may need to force reload with Ctrl+F5 / Ctrl+R). Eran (talk) 17:20, 22 September 2016 (UTC)[reply]
Thank you for your help. I will have to test it out. Charlotte Allison (Morriswa) (talk) 18:26, 22 September 2016 (UTC)[reply]

Autocomplete in the Portuguese Wikipedia?

Hi Friend!

Autocomplete is awesome; I use it every day. Good job!

My question is: is there a way to implement the autocomplete in the Portuguese Wikipedia? My life would be so much easier if I could use this feature there.

Waiting for your response, thanks!!

Guilherme (talk) 20:33, 1 November 2016 (UTC)[reply]

Guilherme: Thank you for the kind words :) To have it in the Portuguese Wikipedia, you can add to your pt:User:Guilherme/common.js page the following code:
mw.loader.load('//he.wikipedia.org/w/load.php?modules=ext.gadget.autocomplete');
Or, if you think there are other users that would like it, you can also ask He7d3r to import it from hewiki by copying he:MediaWiki:Gadget-autocomplete.js => pt:MediaWiki:Gadget-autocomplete.js, creating a description in pt:MediaWiki:Gadget-autocomplete and registering it in pt:MediaWiki:Gadgets-definition - this way other users will find it easy to install from preferences. At least in hewiki, it is a very popular gadget. Eran (talk) 21:13, 1 November 2016 (UTC)[reply]
Thanks for responding to me so fast, man; I really appreciate it.
It works very well, thanks one more time.
I will ask him, for sure! Guilherme (talk) 03:18, 2 November 2016 (UTC)[reply]

Add wiki-template-name support

There are templates that refer to other templates. Also, how difficult would it be to implement support for a predefined array of values as another template parameter type? The array (along with the template name and parameter name/position) would of course be provided by the user who imports autocomplete.js. Have I made myself clear? --Dixtosa (talk) 20:20, 5 November 2016 (UTC)[reply]

Dixtosa: autocomplete takes advantage of TemplateData to enhance autocompletion for template parameters. This way the gadget can work on any wiki, and the suggested parameters are consistent with the VisualEditor template dialog. So to make this work, we should make sure TemplateData supports it.
  • Enums - do you suggest having a new parameter type in TemplateData for "predefined values"/enum? I would love to support it in the autocomplete gadget once it is supported in TemplateData syntax. Please comment in phab:T53375 to promote this feature request (Krenair has already started to work on it).
  • wiki-template-name - Thank you for asking; you reminded me that I asked for this myself - phab:T88900 ... I just updated the gadget to support this type (see the example fragment below).[16]
Eran (talk) 21:40, 5 November 2016 (UTC)[reply]
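For illustration, a hypothetical <templatedata> fragment (shown here as a JavaScript object) declaring a parameter of that type, which the gadget could then use for suggestions:
var exampleTemplateData = {
    params: {
        template: {
            label: 'Template',
            description: 'Name of a template to link to',
            type: 'wiki-template-name'
        }
    }
};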
Yup, Enum type seems to be what I was trying to verbalize above. I guess, I am just gonna have to wait )). --Dixtosa (talk) 21:55, 5 November 2016 (UTC)[reply]

Bot may have stopped

Hi Eran. I wonder if the bot has stopped, as there's been no copyvio reports for 4.5 hours, which is unusual for a weekday. If you could please have a look when you have the time, that would be great. Thanks, — Diannaa 🍁 (talk) 13:51, 8 November 2016 (UTC)[reply]

The bot may have stopped; no new copyvio reports for 4 or 5 hours. If you could check that would be perfect. Thanks, — Diannaa 🍁 (talk) 00:30, 2 December 2016 (UTC)[reply]
And again the bot may have stopped; no new copyvio reports for about 10 hours. Thanks so much, — Diannaa 🍁 (talk) 21:26, 13 December 2016 (UTC)[reply]
Bot seems to have stopped again; no new reports in 22 hours. Thanks, — Diannaa 🍁 (talk) 20:43, 15 December 2016 (UTC)[reply]

Plagiarism bot in Portuguese?

Hi Eran! I hope this message finds you well. I have read the report on the plagiarism bot --congrats!--, and I was wondering what it would take to make it work in Wikipedia in Portuguese. I am an education program leader, and as I work with my students I face a real challenge to check and prevent plagiarism, and an automated system would be a tremendous support. Do you think this would be possible? Thanks in advance, --Joalpe (talk) 17:59, 14 November 2016 (UTC)[reply]

Hi Joalpe, I just checked iThenticate (the service which we use for plagiarism detection), and it should support pt[17]. I can run the bot for a test run on pt (for all the edits in the last day or so) to see if it works OK there. If the results are good and the bot's output is useful for pt, we can then ask for community consensus in the pt village pump to have it run on a regular basis. Are you familiar with git/GitHub? If yes, could you please fork the bot code[18] and add translations to pt? (see lines 70-80 for the English strings) If not, you can just write the pt translations for those strings here and I will update the code there. Thanks, 18:12, 14 November 2016 (UTC)
Hello, ערן! Sorry for not replying earlier. For some reason, I missed your reply. I would really appreciate it if you could run the bot (I have no idea how this works and am not familiar with GitHub). Would it make sense to run it on specific entries I could submit to you? Students of mine are posting their contributions this week (over 150 entries!), and it'd be really helpful to me, and we could use this as a pilot so I could present it at the village pump. Below is the translation:

'pt-br': {
    'table-title': 'Título',
    'table-editor': 'Editor',
    'table-diff': 'Diff',
    'table-status': 'Status',
    'template-diff': u'Diff',
    'table-source': 'Fonte',
    'update-summary': 'Atualização',
    'ignore_summary': '\[*(Revertido|Revisão desfeita|rv$)',
    'rollback_of_summary': 'Revertidas .*?edições? de (\[\[Usuário(a):)?{0}|Revisão desfeita {1}|Revertendo possível vandalismo de (\[\[Usuário(a):)?{0}'
}

Thanks, you rock! --Joalpe (talk) 02:32, 23 November 2016 (UTC)[reply]

Joalpe, the bot usually runs on the recent changes, but it can work on any list: a page with a list of links to the user entries (format: "* [[Page name]]" on each line), or, if you prefer, a category. Note that in this mode of work (i.e. working on a list of pages rather than on recent changes), it is more likely to produce false positives from other websites copying from Wikipedia rather than the other way round. If these are totally new pages it is not an issue, but if they are heavily based on existing articles, it may be. Eran (talk) 05:22, 23 November 2016 (UTC)[reply]
Hello, Eran. Thanks. They are absolutely new content. Some pages already existed, but they were completely rewritten by my students. I will prepare the list of entries, which is related to a very big education program I have led. --Joalpe (talk) 10:15, 23 November 2016 (UTC)[reply]
Since students are still working on their entries, I could have them all set for plagiarism check on Friday afternoon. Would that be OK? --Joalpe (talk) 11:33, 23 November 2016 (UTC)[reply]
Sure :) Eran (talk) 09:33, 24 November 2016 (UTC)[reply]
Hi, Eran. Could you please run the anti-plagiarism bot on the Portuguese Wikipedia on w:pt:Categoria:Patrimônio histórico de São Paulo? This category has many pages --I am not sure whether this is an issue. Thank you so much! As soon as we get the results: (1) Could you please forward them to me? (2) Depending on the results, I will start a discussion thread on the village pump about having this bot in our community. Thanks again! --Joalpe (talk) 15:56, 25 November 2016 (UTC)[reply]
Joalpe: please see pt:Usuário(a):EranBot/Copyright. This is a report of all pages that have possible copyright violations, but someone must go over it manually to validate them. Actually this type of report is a bit old; Niharika wrote a better interface for these reports, called CopyPatrol, but it doesn't yet support languages other than English and French. Eran (talk) 23:28, 25 November 2016 (UTC)[reply]
Eran, thanks so much. I have just looked at two cases so far. I have maybe a silly question: the bot has flagged some cases of plagiarism in edits that were made a long time ago. Wouldn't it be possible that the bot is identifying cases in which Wikipedia was actually the one being plagiarized? --Joalpe (talk) 00:01, 26 November 2016 (UTC)[reply]
@Joalpe: If the pages contain old content, that is very possible. I assumed that the content on those pages is new, but if this is not the case, you can skip reports that match mirrors of Wikipedia. BTW, this is why the bot's regular mode of operation is to run on recent changes. Eran (talk) 07:29, 26 November 2016 (UTC)[reply]
Eran, thanks for the info. 170 out of the 200 pages in the category were done by my students. Half of the remaining pages were flagged for plagiarism. --Joalpe (talk) 20:43, 26 November 2016 (UTC)[reply]

Stop

Hi, please stop the process: EranBot/MusikBot is looping endlessly: User:MusikBot/CopyPatrol/Error_log. Thanks --Framawiki (please notify) (talk) 17:35, 9 December 2016 (UTC)[reply]

Framawiki, these errors are not coming from EranBot. I'm pinging User:MusikAnimal. Eran (talk) 17:40, 9 December 2016 (UTC)[reply]
Hey! It looks like user s52615's access to the copyright database was revoked. I see production CopyPatrol is using a different user now, too. Not sure what happened, but I've updated the credentials to match production, so it should be OK. MusikAnimal talk 17:55, 9 December 2016 (UTC)[reply]
Thanks to you two. Have a good day --Framawiki (please notify) (talk) 20:21, 9 December 2016 (UTC)[reply]

Module:PropertyLink, a page which you created or substantially contributed to (or which is in your userspace), has been nominated for deletion. Your opinions on the matter are welcome; you may participate in the discussion by adding your comments at Wikipedia:Miscellany for deletion/Module:PropertyLink and please be sure to sign your comments with four tildes (~~~~). You are free to edit the content of Module:PropertyLink during the discussion but should not remove the miscellany for deletion template from the top of the page; such a removal will not end the deletion discussion. Thank you. JohnBlackburnewordsdeeds 00:32, 20 December 2016 (UTC)[reply]

Copypatrol not working

https://tools.wmflabs.org/copypatrol/en is not working today; I am getting a "Slim Application Error". If you could have a look that would be great. Thanks, — Diannaa 🍁 (talk) 12:57, 13 April 2017 (UTC)[reply]

Today I am getting an error message when reviewing cases: "tools.wmflabs.org says: There was an error in connecting to database." Attempting to access the leaderboard results in the "Slim Application Error". Thanks, — Diannaa 🍁 (talk) 14:12, 14 April 2017 (UTC)[reply]

Diannaa: Thank you for reporting. This is being tracked in phab:T162932. It is due to some DB corruption. Eran (talk) 15:00, 14 April 2017 (UTC)[reply]
It seems to be functioning properly again. Thanks, — Diannaa 🍁 (talk) 15:41, 14 April 2017 (UTC)[reply]

Copypatrol not working - Slim application error

I am getting a Slim application error when I try to access the copypatrol today. Any help getting this tool running again would be appreciated. — Diannaa 🍁 (talk) 11:43, 23 June 2017 (UTC)[reply]

Seems like an issue with ORES (Halfak, Ladsgroup): "Cannot process your request because the server is overloaded. Try again in a few minutes."[19] Eran (talk) 12:47, 23 June 2017 (UTC)[reply]
ORES is down at the moment; we are working on bringing it back online. Ladsgroupoverleg 12:55, 23 June 2017 (UTC)[reply]
Diannaa: CopyPatrol is back (it now degrades gracefully when there are issues with ORES). Eran (talk) 13:11, 23 June 2017 (UTC)[reply]
Thank you! — Diannaa 🍁 (talk) 13:19, 23 June 2017 (UTC)[reply]

Hi Eran. I am unable to get the iThenticate links to load today at https://tools.wmflabs.org/copypatrol/en; they return an unhappy face ;-( I don't know if you're the right person to alert, but if not, perhaps you know who I should contact? Thanks, — Diannaa 🍁 (talk) 17:25, 5 August 2017 (UTC)[reply]

Copypatrol is working only intermittently

Hi Eran. Copypatrol is working only intermittently. Right now the page refuses to load. This problem has been occurring on and off for the last few days. If you could see if there's anything you can do to help that would be perfect. Thank you! — Diannaa 🍁 (talk) 00:27, 11 August 2017 (UTC)[reply]

Copypatrol is down

Hi Eran. The copypatrol bot appears to be down, as there's been no new reports since 18:59 on August 28 (over 24 hours ago). If you could have a look that would be appreciated. Thanks, — Diannaa 🍁 (talk) 20:02, 29 August 2017 (UTC)[reply]

Diannaa: phab:T174517 filed. Feel free to subscribe to that task.--Framawiki (please notify) (talk) 23:11, 29 August 2017 (UTC)[reply]
Is this still a problem? I'm seeing newer reports MusikAnimal talk 23:26, 29 August 2017 (UTC)[reply]
@Diannaa: To make sure you see this MusikAnimal talk 23:28, 29 August 2017 (UTC)[reply]
Hi Diannaa. In future, you could report CopyPatrol issues on my talk page instead. I'm not sure how actively Eran checks his talk page. -- NKohli (WMF) (talk) 23:55, 29 August 2017 (UTC)[reply]
Hi Diannaa, right now it looks to be up, with hits on suspected edits from the last few minutes. I'm currently behind firewalls and can't access the logs, but I'll check them later today to see what could explain the lack of updates. Thanks, Eran (talk) 05:19, 30 August 2017 (UTC)[reply]
Thanks Eran, Thanks NKohli (WMF) for offering to help. I will bookmark your talk page for future use. New reports are being generated so it looks like all is functioning properly again. — Diannaa 🍁 (talk) 12:05, 30 August 2017 (UTC)[reply]

Bot has stopped 2

(Cross posting) The copypatrol page has no new listings for more than eight hours. I suspect the bot is down. Could you take a look please when you get a chance? Thank you. — Diannaa 🍁 (talk) 00:44, 28 December 2017 (UTC)[reply]

Diannaa: I see there are newer edits there, so it seems to be resolved. Eran (talk) 17:27, 28 December 2017 (UTC)[reply]
It was working again. I don't know if it re-started on its own or if someone fixed it. But now it appears to have stopped again, as there's been no new reports for 4.5 hours. I have also alerted User:NKohli (WMF) at meta. Thanks, — Diannaa 🍁 (talk) 13:27, 29 December 2017 (UTC)[reply]
Update: Things seem to be working normally again. — Diannaa 🍁 (talk) 21:23, 29 December 2017 (UTC)[reply]
Seems to have stopped again; no new reports for 5 hours. I have also alerted User:NKohli (WMF) at meta. — Diannaa 🍁 (talk) 21:33, 30 December 2017 (UTC)[reply]
The bot has been behaving erratically since December 27. It seems to go down for a few hours or up to 10 hours at a time, and then resumes, filing a few reports, and then stopping again. For example on January 2 there were only 8 reports and the usual is 60 to 100 per day. There's a Phabricator ticket open https://phabricator.wikimedia.org/T183913. Any assistance would be appreciated. Thanks, — Diannaa 🍁 (talk) 11:39, 3 January 2018 (UTC)[reply]
Hi again. The bot seems to have stopped again. There's been no new reports for twelve hours. Any assistance you can offer would be appreciated. Thanks, — Diannaa 🍁 (talk) 13:37, 17 January 2018 (UTC)[reply]
Hi Diannaa, it seems that the bot is out of credits, and the issue is discussed in phab:T185163. Eran (talk) 22:48, 18 January 2018 (UTC)[reply]
Thank you very much for the update and the link. — Diannaa 🍁 (talk) 23:58, 18 January 2018 (UTC)[reply]
Diannaa, just updating that the issue was resolved. Thanks, Eran (talk) 07:49, 19 January 2018 (UTC)[reply]
Thanks Eran! that's good news. — Diannaa 🍁 (talk) 11:50, 19 January 2018 (UTC)[reply]

The bot has stopped

Hello Eran. The CopyPatrol bot appears to have stopped; there's been no new reports for about four hours, and the iThenticate links are failing to load. (I have also posted at meta:User talk:NKohli (WMF)#The bot has stopped). Thanks, — Diannaa 🍁 (talk) 18:50, 17 March 2018 (UTC)[reply]

CopyPatrol stopped again: 502 Bad Gateway

Hi Eran. The bot stopped a while ago, likely due to issues on Labs. "502 Bad Gateway" for a while; the page loads now, but there's been no new reports for around 6 hours. I have also notified User:NKohli (WMF) at Meta (meta:User talk:NKohli (WMF)#CopyPatrol: 502 Bad Gateway). If you have time to take a look that would be great. Thanks, — Diannaa 🍁 (talk) 00:43, 7 June 2018 (UTC)[reply]

Bot appears to have stopped

Hi Eran. The bot appears to have stopped. There's been no new reports for 5.5 hours now. Please see meta:User talk:NKohli (WMF)#Problems with Copypatrol for a description of what's been happening. Any assistance would be appreciated. Thanks, — Diannaa 🍁 (talk) 13:55, 28 August 2018 (UTC)[reply]

Diannaa: I restarted the bot now. Eran (talk) 03:16, 29 August 2018 (UTC)[reply]
I see some unusual slowness in the backend service provider of ithenticate. Eran (talk) 03:42, 29 August 2018 (UTC)[reply]
I've been experiencing issues for a couple days with iThenticate. I will ask NKohli (WMF) to contact them in the morning (North America time) if the problem still persists at that time. There's still no new reports at CopyPatrol yet; it's late here, perhaps there's nothing to report at this time. I have to log off now. — Diannaa 🍁 (talk)

CopyPatrol has stopped

Hi Eran. The CopyPatrol bot is not filing new reports (there's been no new reports filed since 02:28 UTC October 1), and the iThenticate reports are failing to load. I wonder if the iThenticate website is down, or if there's something happening at our end that we can fix. Thanks for any assistance you can offer. I have also posted at m:User talk:NKohli (WMF). — Diannaa 🍁 (talk) 18:25, 1 October 2018 (UTC)[reply]

Hi Diannaa, there is an issue with the Wikimedia account at iThenticate, and we need to ask them to re-enable it to get it working again. Sorry for the inconvenience, but for now EranBot is down and isn't reporting new violations to CopyPatrol. Eran (talk) 18:56, 1 October 2018 (UTC)[reply]
Thank you for the information. Is someone taking care of this problem? — Diannaa 🍁 (talk) 19:07, 1 October 2018 (UTC)[reply]
Diannaa: Thanks to a quick response from iThenticate, the account was re-activated and I just re-enabled the bot. Eran (talk) 19:40, 1 October 2018 (UTC)[reply]
Normal activity has resumed. Thank you very much for your prompt assistance. — Diannaa 🍁 (talk) 19:48, 1 October 2018 (UTC)[reply]

Hi Eran. The bot appears to have stopped again. There's been no new reports for nearly 7 hours. Thanks for any assistance. — Diannaa 🍁 (talk) 03:43, 5 October 2018 (UTC)[reply]

Diannaa: I restarted it, should be working now. Eran (talk) 06:44, 5 October 2018 (UTC)[reply]
Yes it is! Thank you, — Diannaa 🍁 (talk) 10:18, 5 October 2018 (UTC)[reply]

BAGBot: Your bot request EranBot 3

Someone has marked Wikipedia:Bots/Requests for approval/EranBot 3 as needing your input. Please visit that page to reply to the requests. Thanks! AnomieBOT 13:51, 15 October 2018 (UTC) To opt out of these notifications, place {{bots|optout=operatorassistanceneeded}} anywhere on this page.[reply]

Copypatrol bot has stopped

Hi Eran. The CopyPatrol bot is not filing new reports (there's been no new reports filed since 2019-01-05 15:47), and the iThenticate reports are failing to load. The iThenticate website was down for maintenance for a while, but it's back up now. Perhaps there's something happening at our end that we can fix? Thanks for any assistance you can offer. I have also posted at m:User talk:NKohli (WMF). — Diannaa 🍁 (talk) 21:05, 5 January 2019 (UTC)[reply]

Normal activity has resumed. Thank you :) — Diannaa 🍁 (talk) 01:34, 6 January 2019 (UTC)[reply]

Hi Eran, the bot seems to have stopped again, with no new reports filed for 4 hours. If you could check and see what's happening, that would be perfect. Thanks, — Diannaa 🍁 (talk) 23:08, 1 February 2019 (UTC)[reply]

Seems to be working again! Thank you, — Diannaa 🍁 (talk) 23:15, 1 February 2019 (UTC)[reply]

Hello again Eran, the iThenticate site was down for a while about an hour ago, and there's been no new reports since. You may need to re-start the bot? If you could check that would be great. Thanks. — Diannaa 🍁 (talk) 21:27, 13 February 2019 (UTC)[reply]

Activity has resumed. Thank you, — Diannaa 🍁 (talk) 21:55, 13 February 2019 (UTC)[reply]

CopyPatrol page will not load

Hello Eran. The CopyPatrol page is failing to load; the last time I was able to use the page properly was at around 03:02 UTC. If you could have a look that would be perfect. Thanks, — Diannaa 🍁 (talk) 03:48, 15 February 2019 (UTC)[reply]

If I leave it spin long enough it times out to "502 Bad Gateway". — Diannaa 🍁 (talk) 04:06, 15 February 2019 (UTC)[reply]

Hi again Eran. I wonder if you could try re-starting the bot? Things may be working again, at least temporarily. Thank you, — Diannaa 🍁 (talk) 20:37, 16 February 2019 (UTC)[reply]

Actually, it looks like EranBot is running, but the page where the results are reported (https://tools.wmflabs.org/copypatrol/en) is still producing a "500 - Internal Server Error". — Diannaa 🍁 (talk) 21:34, 16 February 2019 (UTC)[reply]

Fixed by MusikAnimal. Eran (talk) 23:13, 17 February 2019 (UTC)[reply]
Everything has been working properly for quite a few hours so hopefully this incident is over. Thank you, — Diannaa 🍁 (talk) 23:19, 17 February 2019 (UTC)[reply]

Bot stopped

Hi again Eran, the bot appears to have stopped, as there's been no new reports for 3 hours and only 2 reports in the last 5 hours. If you could have a look that would be great. Thanks, — Diannaa 🍁 (talk) 04:09, 19 March 2019 (UTC)[reply]

Update: there's some information on this at at meta:User talk:NKohli (WMF)#Bot stopped. — Diannaa 🍁 (talk) 13:25, 19 March 2019 (UTC)[reply]

After running successfully for a couple days, the bot has stopped again. No new reports since 13:17 on March 21 (8 hours ago). Any assistance would be appreciated. I have also posted at meta:User talk:NKohli (WMF)#Bot stopped. Thank you, — Diannaa 🍁 (talk) 21:46, 21 March 2019 (UTC)[reply]

Copypatrol stopped

Hi Eran. The CopyPatrol bot is not filing new reports (there's been no new reports filed since 14:37 UTC July 13), and the iThenticate reports are failing to load. I wonder if the iThenticate website is down, or if there's something happening at our end that we can fix. Thanks for any assistance you can offer. This is the same as the issue I reported in October 2018. I have also posted on Niharika's talk at meta. Thanks, — Diannaa 🍁 (talk) 17:09, 13 July 2019 (UTC)[reply]

Update:Normal activity has resumed. I'll let you know if there's any further issues! Thanks, — Diannaa 🍁 (talk) 20:16, 13 July 2019 (UTC)[reply]

Thanks for the update. Eran (talk) 20:17, 13 July 2019 (UTC)[reply]
Hi again Eran. The bot has stopped again; no new reports since 13:25 on July 21 (8 hours ago). Cross posted to Niharika's talk at meta. Thanks, — Diannaa 🍁 (talk) 21:54, 21 July 2019 (UTC)[reply]
The bot is still down. Is there anybody that can take a look? user:MusikAnimal sometimes helps. Thanks, — Diannaa 🍁 (talk) 18:39, 22 July 2019 (UTC) or user:MusikAnimal (WMF)Diannaa 🍁 (talk) 19:27, 22 July 2019 (UTC)[reply]
Sorry for the delayed response. It ran out of credits, and after getting additional credits it seems to work. Eran (talk) 05:07, 26 July 2019 (UTC)[reply]

Hello again. The bot seems to have stopped again, as there's been no new reports since 20:53, 2019-08-15 (over two hours ago). If you could have a look that would be great. I will cross-post at Niharika's talk at meta. Thanks, — Diannaa 🍁 (talk) 23:15, 15 August 2019 (UTC)[reply]

Reports are coming in again, so it looks like we're okay for now. Thanks, — Diannaa 🍁 (talk) 11:20, 16 August 2019 (UTC)[reply]

Hello again. The bot seems to have stopped again, as there's been no new reports since 19:52, 2019-08-28 (around 3 hours ago). I have also posted at Niharika's talk page at meta. — Diannaa 🍁 (talk) 22:57, 28 August 2019 (UTC)[reply]

Updating: I don't know if it started again on its own or if someone gave it a re-start. But reporting has resumed! Thanks, — Diannaa 🍁 (talk) 00:33, 29 August 2019 (UTC)[reply]

Substantiation for Log/pagetriage-copyvio?

Please, either explain the issue about Draft:Group 1 element (edit | talk | history | links | watch | logs) or get this empty-worded gossip out. Incnis Mrsi (talk) 07:59, 9 September 2019 (UTC)[reply]

Hello Incnis Mrsi. The report is a false positive. Because content you added from Alkali metals has been copied to other locations online, the bot filed a report indicating a possible violation of the copyright policy. Here is a link to the bot report. The matching site is a Wikipedia mirror. Incnis Mrsi, a false positive bot report does not imply any wrongdoing on your part (in fact you did everything correctly from an attribution point of view), and bots are unable to engage in gossip, empty-worded or otherwise, as they are only tools. — Diannaa 🍁 (talk) 15:01, 9 September 2019 (UTC)[reply]
Incnis Mrsi: Sorry for the threatening warning. As Diannaa explained, this is a false positive. The terminology we use is "potential copyright violation" to indicate that this doesn't necessarily mean a real issue. If you have a suggestion for better phrasing, we can consider it (an admin just has to rephrase it here: MediaWiki:Logentry-pagetriage-copyvio-insert).
Diannaa, thanks for the explanation. Some internal implementation details of the bot:
If the summary contains internal links, the bot goes to those pages and looks for copied content, to avoid exactly this kind of false positive. So the bot shouldn't have marked this as a potential copyvio - I suspect there is a bug in the bot[20], and I made a minor change to the code to try to avoid it in future cases.
Eran (talk) 17:40, 9 September 2019 (UTC)[reply]
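For illustration, a minimal sketch (plain Python, not EranBot's actual code) of the heuristic described above: pulling wiki-linked titles out of an edit summary so that text copied from those pages can be excluded before flagging a potential copyvio.

import re

def linked_titles(summary):
    # Return page titles wiki-linked in an edit summary, e.g. from "[[Alkali metal]]"
    return [t.split('|')[0].strip() for t in re.findall(r'\[\[([^\]]+)\]\]', summary)]

print(linked_titles('Split content from [[Alkali metal]] per [[WP:SPLIT]]'))
# prints: ['Alkali metal', 'WP:SPLIT']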
Thanks Diannaa, Eran – no worries about the false positive per se, buggy software is a widespread problem. But it is not very polite to make bizarre alerts visible to anybody in Special:Log without links to evidence. Yes, it’s gossip. Incnis Mrsi (talk) 18:05, 9 September 2019 (UTC)[reply]
The page user:EranBot should not only link to toollabs:copypatrol, but state prominently and clearly that everything in Special:Log/pagetriage-copyvio/EranBot should be assessed using this tool. Incnis Mrsi (talk) 18:11, 9 September 2019 (UTC)[reply]
Updated the bot page documentation (Special:Diff/914989564). Eran (talk) 15:48, 10 September 2019 (UTC)[reply]

By the way, redirecting user_talk:EranBot here seems comfortable for the botmaster, but isn't comfortable for everyone else. Indeed, a separate page with a thick edit notice about toollabs:copypatrol could avert arguments on an insignificant pretext, like this one. Incnis Mrsi (talk) 07:53, 10 September 2019 (UTC)[reply]

Incnis Mrsi, yes, it is comfortable for the botmaster; I understand the issue, but otherwise I would probably not notice problems quickly. I will try to make it more accessible by linking from my talk page and the user page to the bot page. Eran (talk) 15:48, 10 September 2019 (UTC)[reply]

Speaker of the British House of Commons election

I'm puzzled as to why EranBot has just marked this disambiguation page I created as a potential copyright violation. Any light you can shed on this would be welcome. This is Paul (talk) 20:57, 22 September 2019 (UTC)[reply]

Hi Paul. It's a false positive. Here is a link to the bot report. Click on the iThenticate report to view the overlap. This occasionally happens with lists and list-like content. — Diannaa 🍁 (talk) 02:10, 23 September 2019 (UTC)[reply]
ok thanks for the explanation Diannaa, I'm glad it's only a glitch as it had me a bit worried, I'll take a look at the report. This is Paul (talk) 16:42, 24 September 2019 (UTC)[reply]

CopyPatrol stopped again

Hi Eran. The CopyPatrol bot seems to have stopped. There's been no new reports since 09-22 at 21:00 (about 5 hours ago). I am also emailing User:IFried (WMF), who is our new contact person at meta. Thanks, — Diannaa 🍁 (talk) 02:22, 23 September 2019 (UTC)[reply]

Any idea what's going on? I'd appreciate an update. Thank you, — Diannaa 🍁 (talk) 20:28, 23 September 2019 (UTC)[reply]

From the error logs, it seems EranBot is having trouble authenticating to English Wikipedia. This is strange because it doesn't appear to have the same problem for the other wikis. We were hoping you could help us, ערן, but we're happy to give this a try if you don't immediately know what the problem is. MusikAnimal talk 16:57, 24 September 2019 (UTC)[reply]
I logged in successfully with the bot account from Toolforge. Eran (talk) 18:14, 24 September 2019 (UTC)[reply]
Thank you!— Diannaa 🍁 (talk) 19:15, 24 September 2019 (UTC)[reply]

Hello Eran @MusikAnimal: The bot seems to have stopped again; no new reports since 15:41 (4 hours ago). Thank you, — Diannaa 🍁 (talk) 19:35, 1 October 2019 (UTC)[reply]

Last report was from 20 minutes ago, so I assume all is fine. I don't see any recent errors in the logs. Perhaps this was truly just a dry spell? MusikAnimal talk 20:25, 1 October 2019 (UTC)[reply]
That only happens during the Superbowl, lol :) Thanks. — Diannaa 🍁 (talk) 21:31, 1 October 2019 (UTC)[reply]

Hello Eran @MusikAnimal: https://tools.wmflabs.org was down for about an hour and is now back up, but https://tools.wmflabs.org/copypatrol/ is still getting a "500 - Internal Server Error". Maybe it needs a restart? Thanks, — Diannaa 🍁 (talk) 16:51, 7 October 2019 (UTC)[reply]

@Diannaa: Fixed! This was due to phab:T234834 (where all Toolforge went down). We just needed to restart the webservice. MusikAnimal talk 17:34, 7 October 2019 (UTC)[reply]
@MusikAnimal: The interface is available again, but there's been no new reports posted since 15:16 UTC, which was the last report filed before Toolforge went down. So I'm pretty sure there's still an issue of some kind. Thanks, — Diannaa 🍁 (talk) 18:27, 7 October 2019 (UTC)[reply]
@Diannaa: I restarted EranBot and that seemed to fix it. Best, MusikAnimal talk 19:11, 7 October 2019 (UTC)[reply]
Now there's a new report, so it looks like we're rolling once again. Thank you! — Diannaa 🍁 (talk) 18:59, 7 October 2019 (UTC)[reply]

Hello again Eran and @MusikAnimal: The iThenticate site was down for about an hour and a half and has now been back up for a while, but so far there's been no new reports posted to CopyPatrol. If you could check and see if the bot needs a re-start that would be great. Thank you, — Diannaa 🍁 (talk) 14:35, 12 October 2019 (UTC) Update: Activity has resumed. Thanks! — Diannaa 🍁 (talk) 20:41, 12 October 2019 (UTC)[reply]

Copyvio tag on this page

Hello. I noticed that my recent edits to this page were tagged as a potential copyright violation. I used content from previous seasons and references about Season 4 for that article, and although it is stylistically similar to the pages for previous seasons, it is different enough that I don't quite understand why those changes might constitute a copyright violation. I have edited various Wikipedia articles for the last 12 years or so, and am highly respected for my work here, so if there was any problem with my edits to the page in question, I don't know exactly what that would be. If you could enlighten me on this issue, I'd greatly appreciate it, as I never intended to violate a copyright of any kind. In fact, I studiously try to avoid questionable conduct on Wikipedia, which my contribution history demonstrates. Thanks. --Jgstokes (talk) 00:12, 25 September 2019 (UTC)[reply]

Jgstokes: Sorry for the delayed response. The bot indicates that large chunks of content appear in other sources, as you can review in [21] - this doesn't mean a copyright violation, only a potential risk of copyvio. As you wrote, the most reasonable explanation for the similarity is the previous seasons' articles, and probably other websites copying from Wikipedia as well. One way to avoid such false alarms in the future is to mention in the edit summary the other pages used as the base for the work. For example, "Creating a new page, based on The Good Place (season 3)" or "Expanding Cast section, partially based on The Good Place (season 3)" - the important part is to link to the other pages in the summary, so the bot can exclude content that is similar to those pages, and this also provides credit to those pages. Thanks, Eran (talk) 15:34, 1 October 2019 (UTC)[reply]

A humble Request!

Editor, can you please help me identify the material that violates copyright law on the draft Zack Onisko? I will remove it and it will be a learning experience. I am just adding info that I find on the internet. Thanks (ScorpionShark (talk) 11:50, 26 October 2019 (UTC)) Never mind, I received some guidance from your previous comments and removed the material taken from https://www.geckoboard.com/assets/Food-for-Growth-by-Geckoboard-Mention-120515.pdf. Thanks, you saved my efforts! ♥ (ScorpionShark (talk) 11:56, 26 October 2019 (UTC))[reply]

Hi Eran, I am surprised to see the aria article was marked with "Potential copyright violation". The libretto must be written as is. I have written many aria articles; this is the first time I have received the violation message. Please clarify. You can refer to the list of arias to see how it was written according to the Opera Project. Refer to en:Category:Arias by composer. Thanks - Jay (talk) 09:34, 12 November 2019 (UTC)[reply]

Hi Jay, your edit is not a copyright violation. The edit was marked as "Potential copyright violation" because there is a large chunk of text that already appears in other sources, such as [22]. Having the lyrics of a poem in an article may be a copyvio issue, but not in this case, as the libretto is a classic work. Thanks, Eran (talk) 20:37, 12 November 2019 (UTC)[reply]

Autocompleter stopped working

Hi Eran, I used your autocompleter on several private wikis for a while, but it stopped working near the end of October. I posted my question originally on User talk:ערן/autocomplete but there is no activity there, so thought I would post here as well. Would you rather I post there or discuss here? Thanks! Tenbergen (talk) 19:59, 24 November 2019 (UTC)[reply]

Hi, some gadgets and extensions, such as CodeMirror, may not be compatible with this. Another thing to check is missing resources - you can verify that jquery.ui and the relevant components are being loaded. Eran (talk) 17:31, 26 November 2019 (UTC)[reply]
Hi Eran, I am not running CodeMirror, and don't have gadgets enabled. I do use a bunch of extensions; I just disabled all but the following (was worried that would cause other problems...): SMW, Cargo, WikiEditor. Problem persists. I had to figure out how to check if jquery.ui is working. I tried the following snippet of code in mediawiki:common.js, above the code that sets up your autocomplete:

if (jQuery.ui) {
    console.log("jquery.ui is loaded");
} else {
    console.log("jquery.ui is NOT loaded");
}

It tells me that jquery.ui is NOT loaded, but I have no idea what I should do with that info...? Tenbergen (talk) 01:23, 9 December 2019 (UTC)[reply]
You can get jquery.ui loaded with:
mw.loader.using('jquery.ui').done(() => { /* CALL TO CODE DEPENDENT ON JQUERY UI */ });
Eran (talk) 21:58, 14 December 2019 (UTC)[reply]
Do you mean I need to use this to call your code in common.js, i.e.
mw.loader.using('jquery.ui').done(() => { mw.loader.load('//he.wikipedia.org/w/load.php?modules=ext.gadget.autocomplete') });
I tried that, but the dropdown still doesn't work. It actually really surprised me by briefly working once while I was editing a complicated template with a query from the Cargo extension. Suddenly, in the middle of it, your dropdown showed up, for the first time in months. I had to get that fix done, and when I came back later to try to replicate this, I couldn't figure out the exact scenario again. :-( Makes me wonder, though, whether it is actually jquery not getting loaded, or whether something breaks it after it gets loaded. Tenbergen (talk) 05:09, 23 December 2019 (UTC)[reply]

Copypatrol loading slow, and bad gateway 504

Hi Eran and MusikAnimal. Copypatrol is loading really slow, and I had a 504 Bad Gateway error happening for 10-15 minutes starting at about 22:00 UTC. Just a heads up in case you have a minute to see if there's anything that needs fixing. Thank you, — Diannaa 🍁 (talk) 22:25, 11 December 2019 (UTC)[reply]

I noticed things going slow elsewhere, so I'm guessing there was an ops-level connectivity issue. CopyPatrol is now loading in the usual amount of time for me. Let me know if it's still hiccuping for you. MusikAnimal talk 22:36, 11 December 2019 (UTC)[reply]
@MusikAnimal: It's really peppy for a few minutes but still intermittently having issues. — Diannaa 🍁 (talk) 23:24, 11 December 2019 (UTC)[reply]
@MusikAnimal: Sorry to keep bugging you, but I think the bot may have stopped. There's been no new reports for 2 hours, after a steady stream of 10-20 per hour in the preceding time period. Thanks, — Diannaa 🍁 (talk) 02:43, 12 December 2019 (UTC)[reply]
Now it's not loading at all. "502 Bad gateway"— Diannaa 🍁 (talk) 12:03, 12 December 2019 (UTC)[reply]
Trying to load in iThenticate link results in "503 service unavailable". — Diannaa 🍁 (talk) 12:07, 12 December 2019 (UTC)[reply]
Update: everything has been working properly for a couple hours, so it looks like this event is over. Thanks, — Diannaa 🍁 (talk) 15:14, 12 December 2019 (UTC)[reply]
@Diannaa: There has been maintenance going on with Toolforge. Everything went down for about 2 hours. I attribute all of the problems above to that. Indeed it seems resolved now. Sorry for the downtime, MusikAnimal talk 16:27, 12 December 2019 (UTC)[reply]

@MusikAnimal: The bot may have stopped again; there's been no new reports for nearly 4 hours. I would appreciate it if you or Eran could have a look. Thanks, — Diannaa 🍁 (talk)

The bot resumed shortly after the above post. Just thought I'd update since the post has no timestamp.— Diannaa 🍁 (talk) 12:05, 14 December 2019 (UTC)[reply]

Copyvio autopatrol

Is there any mechanism for autopatrolling edits flagged as copyvios by this bot? My edits are going to be flagged every time I use this tool (click "Toggle stub code") to create an article, because it pulls public domain content from the corresponding NCBI gene pages (e.g., [23]), which are used on many other sites. I intend to use this tool a lot in the future since there are still thousands of missing articles on human protein-coding genes. Seppi333 (Insert ) 23:49, 31 December 2019 (UTC)[reply]

These are just the 2 recent examples I remember:

Seppi333 (Insert ) 23:52, 31 December 2019 (UTC)[reply]

Hi Seppi333: thank you for the report. We don't currently have a good way to avoid this, but I can change the bot to accept citations from public-domain sources. Before doing that I would like to verify:
  • Do you think it would be possible to indicate the source in the summary itself (such as "New article based on https://www.ncbi.nlm.nih.gov/gene/81035")? This comes with the assumption that the text content is based on only a single PD source.
  • How can we know automatically that NCBI content is PD ([24])? Keep in mind that NCBI also provides abstracts of papers, and they don't own the copyright of those papers.

Eran (talk) 07:32, 3 January 2020 (UTC)[reply]

The text is pulled from the gene summary element within the summary (topmost) pane on NCBI gene, e.g., for [25] that script would use "This gene encodes a ryanodine receptor found in cardiac muscle sarcoplasmic reticulum. The encoded protein is one of the components of a calcium channel, composed of a tetramer of the ryanodine receptor proteins and a tetramer of FK506 binding protein 1B proteins, that supplies calcium to cardiac muscle. Mutations in this gene are associated with stress-induced polymorphic ventricular tachycardia and arrhythmogenic right ventricular dysplasia. [provided by RefSeq, Jul 2008]". From what I've seen, that data is provided by OMIM, NCBI staff, and/or RefSeq, but this script has been in use in one form or another (originally via User:ProteinBoxBot, which created several thousand gene articles using it when it was active) for a number of years. I can't find an explicit statement about it being PD on NCBI gene, but given that it only appears to be supplied by US government entities (i.e., OMIM, RefSeq, NCBI gene curators), I think it's a safe assumption that it's always public domain. Editors at WT:MCB would probably know more than me about this though.
Sure, I can use that as my edit summary. Seppi333 (Insert ) 08:59, 3 January 2020 (UTC)[reply]
Seppi333: I believe the gene summary in NCBI (and RefSeq) is a good resource and you should keep up the good work expanding Wikipedia based on it. However, note that not all NCBI content is PD[26] (for example, OMIM is copyrighted by Johns Hopkins University[27], and PubMed copyrights belong to the publishers). That makes it hard to automatically approve any content from NCBI instead of manually reviewing it. Anyway, the copyright system doesn't check bot edits (I assume that legal issues are also reviewed as part of requesting bot rights). So if ProteinBoxBot or any similar bot does it, the edits will not be flagged for copy issues. Eran (talk) 08:32, 4 January 2020 (UTC)[reply]

Bot has stopped 3

Hello Eran, MusikAnimal (talk · contribs). The bot appears to have stopped; no new reports for ~4 hours. iThenticate reports are also failing to load. Perhaps we are out of credits? or some other issue. If you could spare a minute to have a look that would be great. Thank you, — Diannaa 🍁 (talk) 19:55, 4 January 2020 (UTC)[reply]

Hi Diannaa, I missed it earlier, but Turnitin notified us in advance that there is maintenance on their side. Please see: https://twitter.com/turnitinstatus Eran (talk) 20:13, 4 January 2020 (UTC)[reply]
Thank you for the update. I will bookmark that Twitter feed for future reference. They have now completed the maintenance; I don't know if you have to re-start the bot or if that happens automatically. — Diannaa 🍁 (talk) 22:56, 4 January 2020 (UTC)[reply]
Still no new reports— Diannaa 🍁 (talk) 02:40, 5 January 2020 (UTC)[reply]
Still no new reports; it's been 4 days now and no response to our pings, messages, and emails. In addition to Eran, the following people are listed as maintainers of this tool: User:Catrope; User:Kaldari; User:KHarlan (WMF); user:MusikAnimal; User:NKohli (WMF); User:Samwilson. Any help would be appreciated. Thank you, — Diannaa 🍁 (talk) 19:53, 8 January 2020 (UTC)[reply]
@Diannaa: I am on holiday until January 11 but I've notified other members of my team about this issue. MusikAnimal talk 21:07, 8 January 2020 (UTC)[reply]
Thank you for taking time out from your vacation to do that. Appreciated — Diannaa 🍁 (talk) 22:53, 8 January 2020 (UTC)[reply]
The tool was working for a short while, but has once again stopped. phab:T242291Diannaa 🍁 (talk) 23:54, 9 January 2020 (UTC)[reply]

What next

Great tool !!! Eranbot flagged a copyvio here, which I noticed after I had cleaned up that edit. I reverted the edit, rewrote some of the text, and notified the new (copyvio) editor. The Bot instructions don't explain to me what I should do next. Do I need to have someone scrub the interim edits that I made to cleanup the addition before I saw the copyvio? Where do I flag that? SandyGeorgia (Talk) 22:30, 30 January 2020 (UTC)[reply]

Hi SandyGeorgia. The source webpage is released under a Creative Commons Attribution 2.0 Generic (CC BY 2.0) license, so the overlapping material is okay to keep, as long as the source is properly attributed. The attribution can be added by using a template {{CC-notice}} or attribution can be added manually, like I did here. (This won't be necessary for this case, since you've already amended the content.) Normally cases found by EranBot are listed at CopyPatrol, but for some reason it's not appearing there, even though it is present in the log. Revision deletion is not required in this instance, since failure to include attribution is a violation of the CC-by license terms but not a copyright violation per se. For cases where you do want revision deletion to be done, you can use the template {{Copyvio-revdel}} or if you find it awkward to use the template let me know on my talk page and I will do it, or ask any of the folks listed at Category:Wikipedia administrators willing to handle RevisionDelete requests. — Diannaa (talk) 01:52, 31 January 2020 (UTC)[reply]
@Diannaa:, got it, thanks so much ! SandyGeorgia (Talk) 02:09, 31 January 2020 (UTC)[reply]
Oops, no Diannaa I don't get it. How do I tell that article is released under CC BY 2.0? I can't find anything on that page. SandyGeorgia (Talk) 02:11, 31 January 2020 (UTC)[reply]
For this particular article, click on the blue link that says "Copyright and License information". This reveals all. For some articles, I've had to download the PDF and hunt for a license. PDF for this particular article has the license at the bottom of the first page. — Diannaa (talk) 02:15, 31 January 2020 (UTC)[reply]
Wow, this is awesome news. Thanks again, SandyGeorgia (Talk) 02:19, 31 January 2020 (UTC)[reply]