
Wikipedia:Bots/Requests for approval/VoxelBot: Difference between revisions

→‎VoxelBot: Changing once again
→‎Arbitrary break: Trial complete
::Nevermind, just noticed you're not a crat. I'll ask MBisanz. [[User:Vacation9|<span style="color:#008B8B">Vacation</span>]]<sup>[[User talk:Vacation9|<span style="color:#FF8C00">nine</span>]]</sup> 02:11, 8 January 2013 (UTC)
:::{{done}} Query-continue and 30 minute edit time implemented and committed. We need a bot flag to run it with our current code though. Otherwise as stated above it would cause massive server load. [[User:Vacation9|<span style="color:#008B8B">Vacation</span>]]<sup>[[User talk:Vacation9|<span style="color:#FF8C00">nine</span>]]</sup> 02:56, 8 January 2013 (UTC)
{{Bot trial complete}} We have run VoxelBot for all of today and there have been no problems. Yesterday I did run [http://en.wikipedia.org/w/index.php?title=Template:Vandalism_information&diff=531888810&oldid=531832993 some] [http://en.wikipedia.org/w/index.php?title=Template:Vandalism_information&diff=531889723&oldid=531888810 tests] on the template, which I then reverted, while trying to implement query-continue. I fixed it this morning (all the commits are on GitHub) and we then ran it. Starting at 13:30 UTC, the bot ran on the cron job. I did one manual test of query-continue where I changed the max edit retrieval amount from 5000 to 500 so it had to use query-continue. It [http://en.wikipedia.org/w/index.php?title=Template:Vandalism_information&diff=531969364&oldid=531968253 worked perfectly], and a run without query-continue gave the same results as one with it (thus there was no additional edit). After this, the bot ran completely automated. There was downtime from 16:07 to 20:00 because the server "crashed" (someone unplugged it). We stopped it at 23:40 UTC. The edit interval seems to average about one hour. This means 24 edits per day, or around 720 edits a month, which seems much more reasonable. Comments are welcome. [[User:Vacation9 Public|<span style="color:#008B8B">Vacation</span>]]<sup>[[User talk:Vacation9 Public|<span style="color:#FF8C00">nine</span>]]</sup> <span style="color:#008B8B">Public</span> 23:44, 8 January 2013 (UTC)

Revision as of 23:44, 8 January 2013

Operators: Fox Wilson and Vacation9

Time filed: 19:17, Wednesday December 26, 2012 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): Python

Source code available: Yes, at GitHub.

Function overview: Updates Template:Vandalism information automatically

Links to relevant discussions (where appropriate): User talk:Addshore#Huggle_Edit_Count, User_talk:Diannaa#Bot_Proposal, User talk:Vacation9/Archives/2012/December#RE:_Bot_Proposal

Edit period(s): Once every 30 minutes

Estimated number of pages affected: One

Exclusion compliant (Yes/No): Yes

Already has a bot flag (Yes/No): No

Function details: Uses the MediaWiki API to update Template:Vandalism information. Every thirty minutes (if the stats are different), VoxelBot automatically takes the previous five minutes of edits, searches for the text Revert in the edit summaries, then updates the template with the current information. A working version running in VoxelBot’s userspace is available here. The bot is manually overrideable and is exclusion compliant.

Discussion

Without a Bot flag, we cannot look at more than 500 previous edits, which the number of edits in the previous 5 minutes may exceed. We can increase this number if this request is approved. Will also require confirmed status in order to edit the semi-protected template. Vacationnine 19:25, 26 December 2012 (UTC)[reply]
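The per-request cap mentioned here is what the continuation mechanism ("query-continue") works around: when one response cannot hold every edit in the window, the API hands back a token for the next page. A minimal sketch of that paging loop follows. It assumes the response carries a `continue` object, as current MediaWiki versions do (the 2013-era `query-continue` format was laid out differently), and `fetch_page` is a hypothetical callable standing in for whatever HTTP code the bot actually uses.

```python
def fetch_all_changes(fetch_page, base_params):
    """Collect recent-changes entries across continuation pages.

    fetch_page is a hypothetical callable: it performs one API request
    with the given parameters and returns the decoded JSON as a dict.
    """
    params = dict(base_params)
    changes = []
    while True:
        result = fetch_page(params)
        changes.extend(result.get("query", {}).get("recentchanges", []))
        continuation = result.get("continue")
        if not continuation:
            # No continuation token: this was the last page.
            return changes
        # Merge the continuation values into a fresh copy of the request.
        params = dict(base_params)
        params.update(continuation)
```

Because the fetch function is passed in, the loop can be exercised with canned responses before it is pointed at the live API.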

Few comments. Sounds like a fantastic idea. However, please stop the bot from making useless edits (like [1] or the countless others visible in that page's history). Should be relatively easy to check (maybe store a variable of the previous run results and the current results; if equal, abort). Also--don't you feel that *every 5 minutes* is a bit too frequent? I'd be more in favor of a once every thirty minutes check, at minimum. Regardless, sounds like a great project. —Theopolisme 22:28, 26 December 2012 (UTC)[reply]
Sure, we could certainly add a check like that. We didn't feel every 5 minutes was too frequent as an accurate depiction of vandalism is the whole reason for the bot. This way, we don't have to parse massive loads of API output and request thousands of revisions either, and since we average the last five minutes' reverts to get the reverts a minute, more time would be less accurate. IMO, I don't see a problem with the current editing speed. Anybody else? Vacationnine 22:32, 26 December 2012 (UTC)[reply]
Just an FYI, the bot runs on a cron job, but it should still be easy to check result differences. I'll implement that in a bit. Also, the bot is already having some difficulty going above 100 EPM (no bot flag yet, so it can only check the past 500 edits, 500 edits/5min = 100 EPM). I don't think the interval should be anywhere above, say, around 20-30 minutes. I'll work on implementing the check for uselessness :) It's a Fox! (Talk to me?) 22:37, 26 December 2012 (UTC)[reply]
(edit conflict) I guess I was comparing this to the revision history of Template:Vandalism information--just a few edits each day. However, now that you've made your case, I guess there really isn't any reason not to go whole hog--as long as you're able to create the "null edit check" (as that in and of itself would greatly reduce the number of edits, anyhow). —Theopolisme 22:38, 26 December 2012 (UTC)[reply]
 Doing... Vacationnine 22:40, 26 December 2012 (UTC)[reply]
Implemented null edit check. It's a Fox! (Talk to me?) 22:44, 26 December 2012 (UTC)[reply]
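The null edit check discussed in this thread can be sketched as follows. Since the bot runs from cron, each run is a fresh process, so the previous result has to live somewhere persistent; this sketch assumes a small state file. The file path and the `publish` callback are hypothetical, and the real implementation is in the bot's repository.

```python
import os


def load_previous(path):
    """Return the count saved by the last run, or None on the first run."""
    if not os.path.exists(path):
        return None
    with open(path) as handle:
        text = handle.read().strip()
    return int(text) if text else None


def update_if_changed(path, current_count, publish):
    """Call publish(current_count) only when the count differs from last run."""
    if load_previous(path) == current_count:
        return False  # nothing changed, so skip the edit entirely
    publish(current_count)
    with open(path, "w") as handle:
        handle.write(str(current_count))
    return True
```

The state file is written only after `publish` returns, so a failed edit is retried on the next run.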
Love the idea of the bot, have always thought that this should be automated :) ·Add§hore· Talk To Me! 00:56, 27 December 2012 (UTC)[reply]
Addshore, if you have the time, it would be nice to have some feedback on our current detection of reverts. Thanks! It's a Fox! (Talk to me?) 01:03, 27 December 2012 (UTC)[reply]
Already done here. Vacationnine 01:07, 27 December 2012 (UTC)[reply]
Hmm, sounds good, but I have one question. Will the vandalism information be taken from Huggle, or extracted right from Wikipedia's database? Kevin12xd (talk) (contribs) 03:01, 27 December 2012 (UTC)[reply]
Grabs it directly from the MediaWiki API. However, we've asked Addshore (Huggle dev) about it and he said what we're doing should work just as well. You can look at the information it's outputting and compare it to Huggle if you want. The only thing is that we currently can't go over 100 EPM because of API limitations, which will be resolved when we have a bot flag. It's a Fox! (Talk to me?) 03:06, 27 December 2012 (UTC)[reply]

If you want to know the exact details, the code (shown below, rest at github) gets the edits in the last five minutes, and counts the amount of "revert" in edit summaries, then subtracts good faith edits and duplicate revert links in Rollbacks. It then averages the count so it is per minute and adds one so it rounds up.

count = int(round(((data.count("revert")-data.count("reverted good faith")-data.count("reverting good faith")-data.count("help:reverting"))/5.0)+1))

Vacationnine 03:32, 27 December 2012 (UTC)[reply]
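For readers who want to try the quoted line on its own, here is a minimal, self-contained sketch of the same arithmetic. It assumes `data` is a single lowercased string containing every edit summary from the window; the function name and the `window_minutes` parameter are illustrative and not taken from the bot's source.

```python
def reverts_per_minute(data, window_minutes=5.0):
    """Estimate reverts per minute from concatenated, lowercased summaries."""
    # Phrases that contain "revert" but should not count as vandalism reverts.
    excluded = ("reverted good faith", "reverting good faith", "help:reverting")
    raw = data.count("revert") - sum(data.count(phrase) for phrase in excluded)
    # Average over the window, add one, then round to the nearest integer.
    return int(round(raw / window_minutes + 1))


summaries = "\n".join(["reverted edits by 192.0.2.1 (talk)"] * 10)
print(reverts_per_minute(summaries))  # 3
```

Adding one before rounding is not quite the same as rounding up: ten matches in five minutes yield 3 rather than 2, so exact multiples of the window length come out one higher than a ceiling would give.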

That kind of raises another concern from me. The vandalism information would be slightly inaccurate, because the output is based on edit summaries, as shown by the code above. I myself often put "I reverted an edit by" rather than "Reverted edits by", the default edit summary for reverts. But overall, the bot sounds good, efficient, and I would approve of it ;). Kevin12xd (talk) (contribs) 16:09, 27 December 2012 (UTC)[reply]
That's the good thing about the bot - the detection algorithm should still match your revert. It matches partial words as well. It's a Fox! (Talk to me?) 16:22, 27 December 2012 (UTC)[reply]
(edit conflict) We think that the edit summary method is accurate, and as we said User:Addshore supports our current method. Also, in your example, both would be recognized as vandalism as the code searches for "revert" (after converting the summary to lowercase). Even if the method isn't quite accurate, it doesn't matter anyway because this is only used to give a general idea about vandalism. This is also an improvement over Huggle because everything is rendered client side in Huggle, and the stats are based on cumulative figures. Eventually, revert count would even out. Vacationnine 16:24, 27 December 2012 (UTC)[reply]
Make sure it matches common shorthands like "rvv" or "rv vand" and such. —  HELLKNOWZ  ▎TALK 16:29, 27 December 2012 (UTC)[reply]
 Doing... Vacationnine 16:31, 27 December 2012 (UTC)[reply]
I'm curious, what about all the cases where "revert" is used as a generic word? I've used that a lot myself, something like "will revert to older values" or "adding colored heading, revert if wrong". —  HELLKNOWZ  ▎TALK 16:34, 27 December 2012 (UTC)[reply]
(edit conflict) Done and deployed. As to your other concern, these "edge cases" so to speak, probably won't make a difference in the final value. Because VoxelBot divides the revert count by five and rounds up, one or two false positives shouldn't make a difference. Without getting into artificial intelligence, there isn't really a way (correct me if I'm wrong) to detect these. Vacationnine 16:39, 27 December 2012 (UTC)[reply]
Also, as with other detection bots like User:ClueBot NG, nothing can always be accurate. We're confident that these edge cases happen rarely enough to still give an accurate depiction of current vandalism counts. Vacationnine 16:43, 27 December 2012 (UTC)[reply]
Really, because I counted 9/40 edits using "revert" that are not related to vandalism. And of the remaining ones a bunch were reverting "unsourced content" and not vandalism. So that's not exactly rare. Personally, I'd be very selective as to which phrases are used. I'd really like to see a list of bot generated diffs and edit summaries so we can actually see what the false positive rate is. —  HELLKNOWZ  ▎TALK 16:56, 27 December 2012 (UTC)[reply]

Alright. We changed the parsing system to allow this, and we're implementing an output of summaries. Should have results soon. Vacationnine 17:02, 27 December 2012 (UTC)[reply]

Which namespace(s) does this work on? —  HELLKNOWZ  ▎TALK 16:35, 27 December 2012 (UTC)[reply]

VoxelBot will only edit Template:Vandalism information but it looks at recent changes from all namespaces. Vacationnine 16:40, 27 December 2012 (UTC)[reply]

Here's some sample output from recent changes detected as reverts:

reverting possible vandalism by 72.73.113.14 to version by waacstats. false positive? report it. thanks, cluebot ng. (1418018) (bot)


reverting possible vandalism by 62.24.111.248 to version by kashmiri. false positive? report it. thanks, cluebot ng. (1418019) (bot)


reverted edits by 86.147.7.195 (talk) to last revision by chuunen baka (hg)


reverting possible vandalism by 109.255.201.22 to version by 1.34.51.114. false positive? report it. thanks, cluebot ng. (1418021) (bot)


undid revision 530003889 by 174.28.97.31 (talk) rv unexplained deletion of content


reverted to revision 529983834 by pleasant1623: non constructive. (tw)


reverted edits by 124.253.2.153 (talk) to last version by cantaloupe2


rv 5 edits - unreferenced


reverting possible vandalism by 69.231.29.98 to version by 70.100.131.172. false positive? report it. thanks, cluebot ng. (1418022) (bot)


All of them seem to be vandalism, the only one that's a bit iffy is "unreferenced." Should we detect this as vandalism reversion or not? It's a Fox! (Talk to me?) 17:15, 27 December 2012 (UTC)[reply]

I don't think "unreferenced" or "unsourced" should be counted as vandalism. I removed them in a commit. Vacationnine 17:19, 27 December 2012 (UTC)[reply]
9 edits is really not a good sample size. Even a trial for this kind of task would be many hundreds of edits. I meant you should leave the bot running for like a night or something. P.S. without diff links, those edits are impossible to check. —  HELLKNOWZ  ▎TALK 17:22, 27 December 2012 (UTC)[reply]
Yes, we're currently implementing an output system. We have had the actual bot running here. Vacationnine 17:26, 27 December 2012 (UTC)[reply]
The bot is now running and outputting summaries flagged as vandalism to a report file. When we have some significant numbers, we can post results. Vacationnine 17:28, 27 December 2012 (UTC)[reply]
You can find the report page here: http://fcwnet.com/~pibot/report.txt that is a page which updates after every run of the bot. Edits further down the page are later in time. After a bit it should give you quite a lot of edit summaries. It's a Fox! (Talk to me?) 17:33, 27 December 2012 (UTC)[reply]
Also, what do you mean "to check?" I'll work on diff links, but could you clarify that? Thanks! It's a Fox! (Talk to me?) 17:35, 27 December 2012 (UTC)[reply]
I mean not just the edit summary, but actual diff links to the change (like [2]), so it's possible to see what the edit was. —  HELLKNOWZ  ▎TALK 17:40, 27 December 2012 (UTC)[reply]
Alright, we'll try to add diffs. From the reports so far (by no means conclusive) we've found an original research revert and a self revert, which shouldn't be classified. We fixed these and deployed them here and here. Vacationnine 17:52, 27 December 2012 (UTC)[reply]
Diff links added. Current vandalism revert blacklist: "revert", "rv " (purposeful space so it doesn't match rv as characters in a word) Current vandalism revert whitelist: "good faith", "agf", "unsourced", "unreferenced", "self", "speculat", "original research", "rv tag", "reverting a close template (not " (last one is whitelisting a current discussion going on with that as the title, working on a title whitelist) Vacationnine 19:02, 27 December 2012 (UTC)[reply]
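The two lists in this comment amount to a small classifier: a summary counts as a vandalism revert if it matches a revert marker and none of the exclusion phrases. A minimal sketch under that reading is below. The constant and function names are hypothetical, the temporary discussion-title entry is left out, and the stripping of `/* section */` titles reflects the section-title false positive reported further down rather than anything stated in this comment.

```python
import re

# Markers that suggest a revert ("rv " keeps its trailing space on purpose).
REVERT_MARKERS = ("revert", "rv ")
# Phrases that mark a revert as something other than vandalism cleanup.
NOT_VANDALISM = ("good faith", "agf", "unsourced", "unreferenced", "self",
                 "speculat", "original research", "rv tag")
# Automatic section titles appear in summaries as /* Title */.
SECTION_TITLE = re.compile(r"/\*.*?\*/")


def is_vandalism_revert(summary):
    """Return True if the summary looks like a revert of vandalism."""
    text = SECTION_TITLE.sub("", summary.lower())
    if not any(marker in text for marker in REVERT_MARKERS):
        return False
    return not any(phrase in text for phrase in NOT_VANDALISM)
```

Substring matching keeps the check cheap but blunt: a phrase such as "self" also matches inside "myself", which errs on the side of undercounting.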

Alright, we've let it run for quite a bit now. I went through and checked all the flags that existed at [3] as of 3:00 Eastern Time and here are the false positive results. F means the issue was fixed in a commit and PF means partially fixed:

  • 312 vandalism flags

False Positives and Reasons:

  • typo revert F- 1
  • original research revert F- 1
  • self revert F- 1
  • incorrect changes revert F- 2
  • "rv" used to mean remove PF- 2
  • revert a good faith edit but didn't explicitly say so (unfixable) - 14
  • "revert" in section title (which is enclosed in /* */ in the summary) F- 2

Which adds up to 23 false positives, or 16 not counting the fixed issues. This gives a fp percentage of ~7.37% counting the fixed issues and ~5.13% without the fixed issues. We feel this is small enough since the template is intended to give a general idea about current vandalism; there is no need for pinpoint accuracy here. What do you think about this? The exact list used (I took out any duplicates due to manual runs of the bot) is available here for public view. Vacationnine 20:35, 27 December 2012 (UTC)[reply]
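The percentages above can be re-derived from the listed counts. The short sketch below does that, treating only the entries marked F as fixed (the partially fixed "rv" entry stays in the remaining total, which is how the figure of 16 comes out). The dictionary keys are abbreviated labels, not text from the report.

```python
# (count, status) per category: "F" = fixed, "PF" = partially fixed, "" = unfixable
FALSE_POSITIVES = {
    "typo revert": (1, "F"),
    "original research revert": (1, "F"),
    "self revert": (1, "F"),
    "incorrect changes revert": (2, "F"),
    "rv used to mean remove": (2, "PF"),
    "unstated good faith revert": (14, ""),
    "revert in section title": (2, "F"),
}
TOTAL_FLAGS = 312


def percentage(part, whole):
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


total = sum(count for count, _ in FALSE_POSITIVES.values())
remaining = sum(count for count, status in FALSE_POSITIVES.values()
                if status != "F")

print(total, percentage(total, TOTAL_FLAGS))          # 23 7.37
print(remaining, percentage(remaining, TOTAL_FLAGS))  # 16 5.13
```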

That's 3 PM Eastern time :) 69.255.179.102 (talk) 20:50, 27 December 2012 (UTC)[reply]
Yes, thank you. I added it in there. Vacationnine 21:13, 27 December 2012 (UTC)[reply]

Just an idea: why not add some tag filters as another indication? Like [4][5][6][7][8][9] ? (and to detect if such an edit was also later reverted or not double counting) mabdul 18:04, 29 December 2012 (UTC)[reply]

Thanks for the input! However, there are some problems with this approach - first of all, as you mentioned, we would need to detect if the edit was later reverted, adding another layer of complexity and possible errors. Also, double counting would almost always be an issue, which would need some mechanism, probably not always accurate, to detect double counting. Third, it isn't really necessary, as non-false positives tagged under this filter are almost always reverted by ClueBot anyway, or some other editor. Fourth, the filters are not always accurate anyway; see [10], [11], [12], [13], [14], [15] (iffy) - all examples of good faith edits flagged under one of the filters. Fifth, the previous information (displayed by Huggle) was based on revert count, and we want the information given to be as close as Huggle's as possible. IMO, this is redundant and wouldn't improve the accuracy. Thanks for the idea though! Vacationnine 20:31, 29 December 2012 (UTC)[reply]
Approved for trial (5 days). Please provide a link to the relevant contributions and/or diffs when the trial is complete. MBisanz talk 01:22, 31 December 2012 (UTC)[reply]
Thank you! Before beginning trial however, we do need a bot flag as stated above. Would this be possible? Vacationnine 01:25, 31 December 2012 (UTC)[reply]
I've given a provisional grant of the bot flag for the trial. This will be removed at the end of the trial before final approval to permit review by an independent bureaucrat. Thanks. MBisanz talk 01:30, 31 December 2012 (UTC)[reply]

A few comments:

  • I'm concerned about the fact that the bot can/will over-write what a human says the level is. For whatever reason, if a human says the level is 2, the bot shouldn't reset it to 5 until a human does (or a certain amount of time passes?). Personally I would like to see something like {{defcon|bot=yes}} which would allow users to opt-in to seeing the bot having it update (this could be done by having a Template:defcon/bot that is updated or something. Would probably require some template magic.)
  • Will this detect move page vandalism? I haven't had the chance to look through the code, but I didn't see it mentioned above.
  • More descriptive edit summaries would be nice, something like "Bot: Changing defcon level to X, message."
  • I'm also concerned about the update time. 5 minutes seems way too often for something like this. Based on past human updates ([16]), once every 6 hours seems like a good update time. In IRC channels where updates to this template are broadcast, it gets rather annoying to keep seeing updates.
  • Should these edits be marked as a bot? I would think we want these edits to show up in users' watchlists.

Legoktm (talk) 06:30, 31 December 2012 (UTC)[reply]

First of all, let me state the key idea behind the bot. We want VoxelBot to always be an accurate depiction of current vandalism levels. With this, I can answer your questions. Your first comment doesn't make much sense: this would completely defeat the purpose of the bot, would it not? I'd also like to point out that our bot is more accurate than Huggle's count. Huggle's count is a cumulative count which is of course based on the amount of time Huggle has been open. This is different with VoxelBot: it only uses the last five minutes' edits in consideration of revert count. Which brings me to your point about the time: if VoxelBot is to be an accurate depiction of current vandalism levels, it must update them frequently. This is for the same reason I never really paid attention to previous levels either: outdated data means absolutely nothing. Averaging the last six hours' edits wouldn't make sense, and obviously wouldn't technically be possible. Even with the Bot flag, we can only look through 5000 edits at a time, much less the tens of thousands which would occur in six hours. This would put massive load on the server. We're trying to improve human update time. See User_talk:Vacation9/Archives/2012/December#RE:_Bot_Proposal - this is the whole reason people supported the bot in the first place: up to date information. Also, if the move undo has "revert" in the summary, then yes. As coded now, the bot pays attention to any and all edits. I'll add more descriptive summaries now. Also, for the same reason stated above, these edits shouldn't show up in recent changes / watchlists because the bot is editing frequently. Vacationnine 06:50, 31 December 2012 (UTC)[reply]
Thanks for the response, and the updates. Fwiw, I haven't used huggle in a few years, so I have no idea how the count feature works, but if it helps people, then great. I'm a fan of the general idea (bots are usually more reliable than humans :P) but I think it can do with some improvements.
  • Is having the bot update every 5 minutes crucial though? I can understand going from 4-->2, but does going from 5<-->4 need an update every 5 minutes? I don't think so.
  • Why not track the IRC feed, store the matches, and calculate it every 6 hours? That would be an improvement, since you wouldn't lose any edits if 5000+ edits were made in 5 minutes, and if edits were revdel'd you would still be able to count them.
Legoktm (talk) 07:00, 31 December 2012 (UTC)[reply]
Changing from 5<-->4 actually makes a big difference because it changes the WikiDefcon level that Wikipedia is currently at from normal to low. IMO, updates like 6<-->7 are important though. They still represent a change in vandalism, and that's our goal. There's no downside to not updating from 6<-->7 and other small changes. About your 6 hours idea, that would be possible, but it would both be a major programming change and, for the reasons already explained, would render the bot basically useless. Vacationnine 07:08, 31 December 2012 (UTC)[reply]
Sorry I wasn't clear, I meant the 1,2,3,4,5 that are the images that show up. (Apparently these are now an inverse of the revert count, which means the bot might need to do some simple math.)
Maybe a system where depending on the level the notification time changes? So for a level 4-->5 might be 6 hours, but a 4-->1 would be 5 minutes.
I understand it might be a major coding change, but the goal at the end of the day is to have the best bot, which might require a few changes. That being said, if you want a quick/dirty IRC bot you can modify to easily count edits, I recommend taking a look at snitchbot to do that. Legoktm (talk) 07:52, 31 December 2012 (UTC)[reply]
Alright, I understand. But I still feel our principle applies: we want to have an accurate representation of edits, and a system like yours would eliminate accuracy to make it slower. Anyway, if the level is 4 and needs to be 5, how could you wait six hours to change it? Would it not already have changed by then? Your system sounds complicated and you still haven't provided a solid reason why something like this is needed instead of the current static five minutes. Vacationnine 08:05, 31 December 2012 (UTC)[reply]
Right, my idea would be sacrificing a bit of accuracy. Simply put, updating every 5 minutes is annoying. In IRC channels where updates to the template are broadcast, it gets rather irritating to have it update every 5 minutes, for not much change. Also the fact that the bot will overwrite any human updates to the template, which may not always be desirable. (And since the bot isn't exclusion compliant, there would be no way to prevent the bot from updating other than actually blocking it.)
The real question is, does having extremely up to date information actually make a difference? Does a level of 4 vs 5 actually mean anything major? I personally don't think so.
As a side note, you might find User:WdefconBot and its approval discussion interesting as another implementation on how to update the template. Legoktm (talk) 06:39, 2 January 2013 (UTC)[reply]
I know almost nothing about IRC, so please forgive me if I ask stupid questions. Is it possible to not give IRC updates to the template? Or to exclude bot edits to this template?
About overwriting human updates, I can't think of a case where anyone wouldn't want the bot to update the template. Nobody else has expressed this concern.
It would be nice if we could have some input from other users on the update time issue. I'll ask some of the users that have given input already here. Vacationnine 18:16, 2 January 2013 (UTC)[reply]
User:Theopolisme has already expressed his opinion above that there is no reason not to edit every five minutes. Vacationnine 18:19, 2 January 2013 (UTC)[reply]

The trial has been going along well. Since being approved for trial, we implemented a couple of changes.

  • We added a check for 0 revert counts, which outputs 0 instead of rounding up to 1
  • We added a check for "format" in the notVandalism list (reverts of incorrect formats shouldn't be classified as vandalism)
  • We added more descriptive summaries, which display the edit count and revert count in the summary for easier review
  • Changed the template output per Theopolisme: instead of "according to VoxelBot. VoxelBot (talk) timestamp" it is now "according to VoxelBot (talk) timestamp".
  • Changed the code to only change the stats if the revert count has changed. Edit count is no longer taken into consideration.

Vacationnine 19:01, 31 December 2012 (UTC)[reply]

There have been lots of mentions about the edit frequency of 5 mins being too much. Realistically, looking through the test edits so far, it doesn't edit every 5 minutes; sometimes it is 10, 20 or more. Also changing from 5 to 4 and 4 to 3 etc. does make a difference, hence why the grades on the template go from 1 to 5. If we were not to change between each increment of the template we may as well alter the template to have fewer increments. ·Add§hore· Talk To Me! 19:49, 2 January 2013 (UTC)[reply]
It hasn't been a week since the trial began, and the bot has already made almost 400 edits. Is the accuracy of the template so important to anti-vandalism efforts to warrant 4000 edits per month? Σσς(Sigma) 20:40, 2 January 2013 (UTC)[reply]
Note that the bot was editing here for quite a while before trial. Vacationnine 21:39, 2 January 2013 (UTC)[reply]
I know. The bot has made 400 edits in the template namespace, to {{Vandalism information}}. Σσς(Sigma) 22:07, 2 January 2013 (UTC)[reply]

Arbitrary break

Instead of using the API, have you considered using the IRC recent changes feed -- irc.wikimedia.org #en.wikipedia? That should make it easier to change how often it is updated. Also, as has been mentioned above, there should be a way for users to override the bot. --Chris 03:10, 3 January 2013 (UTC)[reply]

IRC feed is a great idea and shouldn't be that hard to implement. A way for users to override the bot? How so? And in theory if the bot is programmed correctly what would be the need?
Looking at the way the template is set up and updated currently, it is essentially pointless if it doesn't get updated. You could make the bot only change the template if the status changes by 2, but then again, would it not make more sense to change the template and get rid of one of the statuses, as we are essentially saying there is no point in keeping it up to date as that would mean we make too many edits?
·Add§hore· Talk To Me! 03:33, 3 January 2013 (UTC)[reply]
No, I think the point is more 5 minutes isn't long enough to get a good sample size. --Chris 03:40, 3 January 2013 (UTC)[reply]
I kind of agree with that. If IRC was used the sample size could be 'endless', the bot could then look at the trend as well as the calculation at any given time! :) ·Add§hore· Talk To Me! 03:42, 3 January 2013 (UTC)[reply]
Fox and I will discuss this. Expect to hear an answer soon. Vacationnine 07:23, 3 January 2013 (UTC)[reply]
We can safely receive the last ten minutes' edits using only the API if that would be better. Then we would have a better sample size and a shorter update time, solving both problems. What does everyone think of this? Vacationnine 01:19, 4 January 2013 (UTC)[reply]
Actually as I pointed out earlier, you won't get all of the last ten minutes' of edits. If a bunch of edits are reverted, and then revdel'd (because of an offensive username, etc) you can't see them since this account isn't an adminbot. However the edits will still show up in the IRC feed, which would give you a more accurate count. Legoktm (talk) 01:36, 4 January 2013 (UTC)[reply]
I am not experienced in IRC (especially with programming and scraping messages) and don't want to take this leap. IMO, don't mess with something that already works. While I agree about making the interval longer to increase accuracy, I don't think we should make a massive leap with very small advantages. Also, since when do edits by users with offensive usernames get RevDel'd? I thought it depended on the content of the message. Anyway, the bot tracks reverts, and the summary of an undo of RevDel'd content is almost never deleted, especially if it is a simple revert message. Vacationnine 03:01, 4 January 2013 (UTC)[reply]
No, I disagree. Using the API is a broken way of doing things; IRC is a much better and more accurate solution. You're using Python, surely Python has some IRC libraries that are easy to use? Can any Python devs reading this please help? --Chris 03:10, 4 January 2013 (UTC)[reply]
Yep. Σσς(Sigma) 18:35, 5 January 2013 (UTC)[reply]

The API is broken? Could you elaborate? And how is IRC more "accurate" per se? There is oyoyo which we could use, but as I said why mess with something that works? Vacationnine 03:21, 4 January 2013 (UTC)[reply]

With the API you are always relying on the data set received from the API to be within the timespan since your last request. It is inherently vulnerable to changes (mainly increases) in edit rates. As the rate of edits per minute increases, your bot will become less and less accurate. It may not have an effect now, but give it a year or so. As well as this, you are limited to sampling data within the specific time period that the API returns. What if there is extremely high vandalism (say a 4chan raid) at defcon level 2, and the page they are attacking is then protected? The edits stop for a few minutes, and suddenly the defcon is down to 5 again, even though the raid is still on and we're just waiting for them to pick another target. If you use IRC, you could have a much better sampling of data. So say there is a short lull: defcon can be lowered by 1 (or not lowered at all), so as not to mislead a person reading the template. Furthermore, I would stress that updating every five minutes is still a bad idea; the template should not be that sensitive to fluctuation. I've always thought of it as giving an overview of vandalism levels in the last half hour to hour. --Chris 03:35, 4 January 2013 (UTC)[reply]
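The longer sampling period argued for here is usually built as a sliding window over a live feed: every detected revert is recorded with its timestamp, and the rate is computed over however much history is kept. A minimal sketch follows. The class name and the 30-minute default are assumptions for illustration, timestamps are assumed to arrive in non-decreasing order, and nothing here connects to IRC.

```python
from collections import deque


class RevertWindow:
    """Track revert timestamps and report a per-minute rate over a window."""

    def __init__(self, window_seconds=1800):
        self.window_seconds = window_seconds
        self.timestamps = deque()

    def record(self, timestamp):
        """Note one revert seen at the given time (seconds since any epoch)."""
        self.timestamps.append(timestamp)

    def rate_per_minute(self, now):
        """Drop entries older than the window, then average what is left."""
        cutoff = now - self.window_seconds
        while self.timestamps and self.timestamps[0] < cutoff:
            self.timestamps.popleft()
        return len(self.timestamps) / (self.window_seconds / 60.0)
```

With a window this long, a lull of a few minutes after a page is protected lowers the rate only gradually, which is the smoothing effect described above.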
Oh, and there must be a way for a user to override the bot, if they so wish. --Chris 03:37, 4 January 2013 (UTC)[reply]
Alright, we'll see what we can do with IRC. Your reasoning does make sense. Should the override be a special keyword on the Template:Vandalism information page or on a page in the bot's userspace? Vacationnine 03:44, 4 January 2013 (UTC)[reply]
As I had linked earlier, snitchbot provides a quick connecting framework built on Twisted that you can modify. If you want, I have a stripped down version that just connects to irc.wikimedia and doesn't require SQLite.
IMO the easiest way to have an override is just add in support for {{nobots}}.
Edit summaries are revdel'd if they have offensive usernames, like "Adminname is a faggot". Legoktm (talk) 03:56, 4 January 2013 (UTC)[reply]
 Done Implemented manual override. Instructions are at the Template page in a comment. Will work on IRC tomorrow. Vacationnine 04:13, 4 January 2013 (UTC)[reply]
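For comparison, the {{nobots}} route suggested above could look roughly like the sketch below. This is not the override that was implemented (that one is described only as instructions in a comment on the template page); it is a simplified reading of the exclusion convention, handling a bare {{nobots}} plus the `allow=` and `deny=` parameters of {{bots}}. The function and pattern names are hypothetical.

```python
import re

# Matches {{nobots}} and {{bots|...}}, capturing the name and any parameters.
BOTS_TEMPLATE = re.compile(r"\{\{\s*(nobots|bots)\s*(?:\|([^}]*))?\}\}",
                           re.IGNORECASE)


def allowed_to_edit(page_text, bot_name):
    """Return False if the page text opts this bot out of editing."""
    bot = bot_name.lower()
    for match in BOTS_TEMPLATE.finditer(page_text):
        if match.group(1).lower() == "nobots":
            return False  # simplification: any {{nobots}} blocks every bot
        for part in (match.group(2) or "").split("|"):
            key, _, value = part.partition("=")
            names = [name.strip().lower() for name in value.split(",")]
            key = key.strip().lower()
            if key == "deny" and ("all" in names or bot in names):
                return False
            if key == "allow" and "all" not in names and bot not in names:
                return False
    return True
```

A bot would run this check against the fetched wikitext just before saving, and skip the edit when it returns False.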

Trial complete. Alright, the bot has run for the allotted five days. Over that period, we have made several adjustments to the bot. The main changes (not just minor bug fixes) were to:

  • Add more descriptive edit summaries which include edit count and revert count
  • Improve signatures from "according to VoxelBot. VoxelBot (talk) timestamp" to "according to Voxelbot timestamp"
  • Add the possibility for manual override of the bot by users updating the vandalism information template
  • Change the update time from five minutes to ten minutes, which gives us a larger sample of edits and reduces server load

There have been no problems we have observed with the functioning of the bot. If you see times where the bot edited less than five minutes apart, that was because of a manual run to test a new feature, not a problem with the server/script. There has been occasionally some downtime for the bot due to the server it's currently being hosted on going down, but we’re hoping to solve this problem by transferring to the Toolserver if approved.

There has been quite a discussion about using the API vs. the IRC feed. Fox and I have decided that the best solution is to stick with the API, for now. The reason is that currently IRC provides us with little to no benefit, while it provides many challenges. First of all, both Fox and I are inexperienced to say the least with IRC chat, and the code suggested to us by Legoktm contains almost no documentation. As said above, IRC would provide negligible benefit. The only benefit we would receive right now is maybe a RevDel'd revert. However, IMO this would rarely happen. Maybe the vandalism edit's summary would be RevDel'd but there would be no reason to RevDel the revert's summary, which is what VoxelBot pays attention to anyways. The only reason I can think of is if the summary contains an extremely offensive username or something of the likes. In the future however, the previous 10 minutes' edits may exceed 5000. With our current capacity however, we can handle up to 500/minute. Current edits are hovering around 200.

Now onto the disadvantages. There are two ways I can think of where we could implement IRC. The first one would be constantly monitoring the IRC feed and dumping the output into a file. This would get very massive very quickly. We would also have to store the time stamps and, when the bot is run, account for time differences. The clock on the server the bot runs on might not be exactly accurate, meaning the bot could match more or fewer edits than it's supposed to. This isn’t a problem with the API since it's using server time. It would also cause large amounts of server load on our part. Time for the second option: constantly scanning the IRC feed for summaries with "revert" in them. This retains some of the same problems as stated previously: the bot has no idea exactly what time the Wikipedia server is currently at, and since it checks the timestamps it could overlap. We would also still need to store the summaries of the reverts (since we output them to a public file). This would cause even larger server load than the previous option.

What I'm trying to say is that IRC at the moment doesn’t have any noticeable advantages but it does have a few disadvantages. IRC implementation is not urgent and is not necessary for approval, IMO. We may work on implementing it in our devel branch, mainly because IRC has no edit cap, but we can be approved without it. We addressed the sample size issue by raising the edit interval from 5 to 10 minutes (you won't see this because it was implemented after we stopped the bot, but it is a straightforward change with no possible consequences), and there were no other concerns that weren’t fixed. I think that we're ready for approval. Approval would also mean we can apply for an account on the Toolserver. Thank you, Fox Wilson (talk) and Vacationnine 18:23, 5 January 2013 (UTC)[reply]

Sorry what? Your bot doesn't know what time the server is at?
I don't understand why you need to create a dumpfile, which fwiw, would not be massive. It's just plain text. Just keep the counts in a dict or just as variables.
If you need help with snitch/snatch, you could ask for help? I think MZMcBride would be more than willing to explain how it works.
Related to using the API, is there a reason you can't use the query-continue to fetch all of the last 10 minutes?
Also, the point that Sigma raised earlier about the bot making 4000 edits/month wasn't answered.
Legoktm (talk) 20:10, 5 January 2013 (UTC)[reply]
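The "keep the counts in a dict" suggestion above could be sketched along these lines. This is an assumption-laden illustration, not snitch/snatch code: the class name `RevertWindow` is hypothetical, a revert is detected by the word "revert" in the summary (as described earlier in this discussion), and timestamps are seconds from whatever single clock the caller uses, so only differences between them matter.

```python
from collections import deque

class RevertWindow:
    """Track edits and reverts seen in the last `window` seconds, in memory."""

    def __init__(self, window=1800):
        self.window = window
        self.events = deque()  # (timestamp, is_revert) pairs, oldest first
        self.counts = {"edits": 0, "reverts": 0}

    def add(self, timestamp, summary):
        """Record one edit from the feed and drop anything now too old."""
        is_revert = "revert" in summary.lower()
        self.events.append((timestamp, is_revert))
        self.counts["edits"] += 1
        self.counts["reverts"] += int(is_revert)
        self._expire(timestamp)

    def _expire(self, now):
        # Remove events that have fallen out of the window and undo their counts.
        while self.events and self.events[0][0] <= now - self.window:
            _, was_revert = self.events.popleft()
            self.counts["edits"] -= 1
            self.counts["reverts"] -= int(was_revert)
```

Memory use is bounded by the number of edits in one window, and nothing needs to be written to disk unless the revert summaries are wanted for a public log.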
Sigma - we have changed the bot to edit once every ten minutes and will try with this. Legoktm: query-continue? I wasn't aware that the API implemented this, is there a specific call we can use? Or is it just check the last timestamp and work from that? Thanks, It's a Fox! (Talk to me?) 20:13, 5 January 2013 (UTC)[reply]
The API docs explain how it works: mw:API:Query#Continuing_queries. Legoktm (talk) 20:18, 5 January 2013 (UTC)[reply]
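As a rough illustration of how continuing a query works, the loop below keeps requesting until the response carries no continuation marker. It is a sketch under assumptions: `fetch` is a hypothetical caller-supplied function that sends the parameters to api.php and returns the decoded JSON as a dict, and the code accepts both the legacy "query-continue" block and the newer flat "continue" block.

```python
def fetch_all_changes(fetch, params):
    """Collect every recentchanges entry, following continuation markers.

    `fetch` (hypothetical) takes a dict of API parameters and returns
    the decoded JSON response as a dict.
    """
    params = dict(params)  # copy so the caller's dict is not modified
    changes = []
    while True:
        response = fetch(params)
        changes.extend(response.get("query", {}).get("recentchanges", []))
        if "query-continue" in response:
            # Legacy format: continuation parameters are grouped per module.
            for module_params in response["query-continue"].values():
                params.update(module_params)
        elif "continue" in response:
            # Newer format: a flat dict to merge into the next request.
            params.update(response["continue"])
        else:
            return changes
```

A call might pass parameters such as `{"action": "query", "list": "recentchanges", "rclimit": "max"}` together with the time bounds of the sampling window, so the number of requests grows with the edit rate instead of the sample being cut off at the per-request limit.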
Also what we mean about server time is that the server time can be different than the official time. Two servers might not have the exact same time, which can cause problems. About IRC, even if we had help there would still be the problems above. With the dumpfile, you're thinking of the second option, where we store the counts. We still need to export the reverts though, since we have a public log of the summaries flagged as reverts. We will look into continuing queries and implement it, thanks! Vacationnine 20:33, 5 January 2013 (UTC)[reply]
{{BAGAssistanceNeeded}} - Comments would be appreciated. We have fixed all problems and can implement continuing queries or IRC in the future, they aren't needed now. Vacationnine 16:34, 6 January 2013 (UTC)[reply]

I'm hesitant about approving this. I prefer to do things right the first time round. That said, approved in its current state, while not perfect, the bot should operate reasonably. I'm also still slightly concerned about the edit rate. Vacation9, could you please clarify how often the template will be updated? What do other BAGers (and everyone else) think? --Chris 07:54, 7 January 2013 (UTC)[reply]

What do you mean by "do things right the first time round"? Did VoxelBot do things wrong during its trial? If you're saying that all trials need to be perfect, that usually doesn't happen, but we have shown we have fixed all the problems that have at least been pointed out to us that are possible to fix. After the trial however, we did change the edit frequency, but any Python developer can look at that and see that it couldn't possibly break anything. The edit frequency varies, but the bot checks if an edit is needed every ten minutes. This means the bot should average an edit around every fifteen minutes; sometimes it would be every ten minutes and sometimes every twenty. At this rate, that would be 96 edits a day or around 2800 edits a month. It could be even less, but it varies based of course on the number of reverts at that time. We have already addressed all user comments as well. Vacationnine 12:43, 7 January 2013 (UTC)[reply]
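The edit-rate estimate above is simple arithmetic, sketched here for checking other intervals. The helper name `edits_per_period` is hypothetical and a 30-day month is assumed.

```python
def edits_per_period(minutes_between_edits, days=30):
    """Return (edits per day, edits per `days` days) for an average interval."""
    per_day = 24 * 60 / minutes_between_edits
    return per_day, per_day * days
```

For the fifteen-minute average this gives 96 edits a day and 2880 over 30 days; the ten- and twenty-minute extremes give 144 and 72 edits a day respectively.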
No. I mean in terms of the code. I have already explained why I feel that the current code is the "wrong" way to do things, I am merely asking for a second opinion from someone else. I am aware that I am perhaps being a bit harsh; that is why I would like a second opinion. I am very aware this must be frustrating for you, but please be patient. Also "that it couldn't possibly break anything." -- famous last words ;) . --Chris 14:17, 7 January 2013 (UTC)[reply]
Ah, ok. I understand and will wait for a second opinion. You are aware though that query-continue (which will be implemented in the future, just not now) addresses your timespan concerns? If 5000 edits isn't enough, the bot can keep querying until it receives all edits. Vacationnine 14:26, 7 January 2013 (UTC)[reply]

I would say anything less than 20-30 minutes and 2000+ edits is not a large enough sample size. So much can change in a few minutes. I definitely don't think you can judge a site-wide vandalism from just the last few minutes. Same goes for edit frequency. I look at edits like this and I seriously doubt the accuracy. I recognize the bot only counts the vandalism reverted, this isn't an accurate vandalism/disruptive editing detection/measure (not to mention false positives and missed negatives), so no one is expecting perfect prediction. But this is why averaging over a large sample size is very important. —  HELLKNOWZ  ▎TALK 15:12, 7 January 2013 (UTC)[reply]

I was asked on my talk page to give my opinion here related to the edit rate.
The whole template doesn't make any sense (in my eyes), but since some editors find it useful: I like the general idea of shrinking the template options to only three modes: low, mid, high. The other modes are more or less incomprehensible (or rather, the difference from the next higher/lower level is meaningless).
By shrinking the possibilities the bot will likely make fewer edits. mabdul 15:32, 7 January 2013 (UTC)[reply]
Mabdul, I agree with your points. I might draft a userspace template sometime... Chris G: "famous last words" :D yep. I'm just going to pretend it says "shouldn't do anything wrong." I happen to run the server it's on, and the thing is ancient. Vacation9: Assume that the computer will become artificially intelligent and rewrite VoxelBot :) It's a Fox! (Talk to me?) 17:13, 7 January 2013 (UTC)[reply]
I'm slightly concerned with pulling every edit from the API from a load basis, but I'm not an expert there. I'm also concerned with the 2800 edits per month. Something more in line with Hell's 20-30 minute lag would be better, even if it means switching over to a different codebase. But I'll defer to Chris if he thinks it should work ok as-is. MBisanz talk 00:23, 8 January 2013 (UTC)[reply]
We will implement a 30 minute interval tonight/tomorrow using query-continue. Vacationnine 01:35, 8 January 2013 (UTC)[reply]

Approved for extended trial. Please provide a link to the relevant contributions and/or diffs when the trial is complete. This is so that operation with "query-continue" can be implemented. --Chris 01:55, 8 January 2013 (UTC)[reply]

Sounds great, I'll try to get that done tonight. We will need our bot flag back temporarily however, since we don't want to be making 10 query-continue requests to the API and creating unnecessary load. MBisanz did grant temporary bot status for our first trial. Vacationnine 02:10, 8 January 2013 (UTC)[reply]
Nevermind, just noticed you're not a crat. I'll ask MBisanz. Vacationnine 02:11, 8 January 2013 (UTC)[reply]
 Done Query-continue and 30 minute edit time implemented and committed. We need a bot flag to run it with our current code though. Otherwise as stated above it would cause massive server load. Vacationnine 02:56, 8 January 2013 (UTC)[reply]

Trial complete. We have run VoxelBot for all of today and there have been no problems. Yesterday I did run some tests on the template, which I then reverted, while trying to implement query-continue. I fixed it this morning (all the commits are on GitHub) and we then ran it. Starting at 13:30 UTC, the bot ran on the cron job. I did one manual test of query-continue where I changed the max edit retrieval amount to 500 from 5000 so it had to use query-continue. It worked perfectly, and a run without query-continue gave the same results as one with it (thus there was no additional edit). After this, the bot ran completely automated. There was downtime from 16:07 to 20:00 because the server "crashed" (someone unplugged it). We stopped it at 23:40 UTC. The edit interval seems to be on average every hour. This means of course 24 edits per day for around 720 edits a month, which seems much more reasonable. Comments are welcome. Vacationnine Public 23:44, 8 January 2013 (UTC)[reply]