User talk:Coren/Bot policy

From Wikipedia, the free encyclopedia

Refactor

Following a review of the draft, edits performed by all users since initial draft:

  1. [1] Coren - clarifying various uncontroversial matters, adding draft sections on appeals/re-examinations, and BAG membership process
  2. [2] ST47 - fine tuning and adding detail to the above
  3. [3] FT2 - split intro into a brief intro and an "overview" section with some background (as common for policies). Also covers unapproved bots/tasks.
  4. Reorganize and group sections together
  5. Add wikilinks
  6. [4] Communication
  7. High speed/high volume/more automated processes are more likely to be treated as bots
  8. User scripts are usually enhancements or accessibility related and these do not usually need BAG approval
  9. Bots which might be operated by more than one person (see below for diff)
  10. Bot flag (and flag vs. approval)
  11. Tools with misuse ability
  12. Bot account naming, and linking to owner.

Although most edits to this page are straightforward, I have comments on one of these.

1. Bots operated by multiple users

The issue of concern here is role accounts, generally strongly discouraged or forbidden on English Wikipedia.

In the case of bots there are reasons why it may be sensible to relax this a bit, centering on two differences between "bot" accounts and "human" accounts: bot accounts typically have much less flexibility (many bots can only do very specific types of edit and would have very limited access), and in many cases the person operating the bot 1/ has limited ability to modify the designated work it performs or to perform inappropriate tasks, and 2/ cannot operate the bot without positive authorization and user verification.

It is clear that in some cases there will be benefit to allowing more than one user to operate the bot for its owner; obviously some criteria should apply but on the whole complete prohibition seems unnecessary.

I've asked round arbcom and whilst the answer is informal, the consensus seems to be that this is a matter for BAG to decide how it feels, i.e., whether in some cases it may be advantageous to provide that specific bots could be approved to be operated by more than one person, and which cases those are, which can then be discussed within any given bot's BRFA.

This would also tie in with existing wording, "Providing some mechanism which allows contributors other than the bot's operator to control the bot's operation is useful in some circumstances".

I've added what seem the most important criteria, but thought it was important to reassure on this point.

Diff [5]


FT2 (Talk | email) 04:47, 8 April 2008 (UTC)[reply]

I also changed the Restrictions section to the older version that had consensus to only prohibit bots from populating people categories. MBisanz talk 17:07, 8 April 2008 (UTC)[reply]

The 'bot' flag

"[...] not all approved bots need (or should have) that property." - While not all bots need the flag, I can see no possible reason why they shouldn't have the flag. As bots have the ability to mark their edits as bot (hidden) edits, or not, there is no reason not to have the flag. There are, also, benefits to having the flag other than hiding edits. One is the ability for editors to identify (approved) bots from Special:Listusers/bot. -- Cobi(t|c|b) 07:19, 8 April 2008 (UTC)[reply]

True; but there are bots extant which do not have the flag and which are approved. I think it's important to note that approved bots might not have the flag even if we do, in practice, give it as a matter of fact. — Coren (talk) 12:37, 8 April 2008 (UTC)[reply]
It seems (see below) that the granting of the flag as a "seal of approval" has sufficient support to be a serious reason to reword. I did, accordingly, with a nod to historical approvals that may not have resulted in a bot flag. — Coren (talk) 16:45, 8 April 2008 (UTC)[reply]

BAG

I for one would like to see the BAG abolished and allow bots to be commented on by everyone. It's simply another status thing for some people. On Commons, the process is part of the RfA page, and allows anybody to discuss it, and the final decision is made by a bureaucrat, not by a member of a little group. Bot approval should be more open to the community, and we need to get rid of silly little groups that only appear to be there for the sake of being there. Majorly (talk) 12:42, 8 April 2008 (UTC)[reply]

Although not involved in the bot world or BAG, this is what the intro states: "[the Bot Approval Group] supervises and approves all bot-related activity from a technical and quality-control perspective". Bots aren't the same as many other decisions; they can have wide repercussions if subtly wrong or substandard somehow, or if technical loopholes exist or rigorous requirements are not complied with. The community can decide on social acceptability/desirability, but I think when it comes to bots, you may still need a group of users with those functions, and arbitrary community members will not necessarily be skilled judges of those technical tasks without some structure to work within. So one way or the other you'll need to ensure that those who are have a significant venue to discuss and make technical observations, assessments and requirements, and have a say on those areas. Just a thought. FT2 (Talk | email) 13:23, 8 April 2008 (UTC)[reply]
That said in practice, if a bot was up for approval and a user known to be skilled with bots opposed for a technical reason, others would probably oppose too, and if a bot had a problem other editors would have a problem too. The function is what matters; the question whether a focused group will encourage more productive and higher standards and a focus for involvement is what's being asked. And indeed we do actively encourage task- and interest-focussed groups to exist in many other areas, for exactly those reasons. Thoughts... FT2 (Talk | email) 13:33, 8 April 2008 (UTC)[reply]
You are correct. For some people, BAG is a "status thing". So is adminship for others. I'm not convinced this means we should let anyone delete articles. The whole point of that modification to policy is to increase transparency, and give a better outlet for everyone to express concerns they might have with a bot (post facto because, let's be frank, people have absolutely no interest in the day to day operations of bots and couldn't be bothered to give a damn unless they have something to complain about). — Coren (talk) 13:47, 8 April 2008 (UTC)[reply]
For the most part I like this rewrite, it clears up a lot of ambiguity. However I will say that I agree with Majorly that it mostly codifies how things are now as far as the BAG goes. The Commons scheme seems to work for a project the size of Commons. Whether it would scale to en:wp is unclear, but maybe it would. In any case, I do have a concern about "all noms need to be made by a current or former BAG member"... that seems to perpetuate self reference. If someone completely incompetent to adjudge bots is nominated or self nominated, I would think that a few current BAG members politely pointing that out (with something to back up their assessment) would be sufficient to take care of any such candidacy. ++Lar: t/c 15:48, 8 April 2008 (UTC)[reply]
I'll grant that the requirement for a BAG nomination is mostly borne out of a desire to compromise between the (not unreasonable) worries from current BAG members that technical competence will be ignored and (also not unreasonable) worries from non-BAG members that it will remain an in-group.

Do you have a good alternative to suggest? — Coren (talk) 16:43, 8 April 2008 (UTC)[reply]

No idea. Heavier weighting on opposes from BAG members? That seems way too mechanical. Proficiency test? Gameable. Trust the community to do the right thing if the issues are raised, would be my approach, instead of additional trappings. ++Lar: t/c 16:53, 8 April 2008 (UTC)[reply]
PS I think that "worries from current BAG members that technical competence will be ignored" are actually "unreasonable" rather than "not unreasonable" as you say. I highly doubt that if the process were open more widely that raising technical issues would be an ineffective way of derailing bot approvals. Raising technical issues derails Commons bot approvals all the time... My concern rather is that (especially with one bot/one task level approvals) that there is a scaling problem, not that we'd get bad decisions. ++Lar: t/c 17:04, 8 April 2008 (UTC)[reply]
I fear that optimism isn't borne out by past experience; we have had editors express concerns that "bots would be taking over", and that "automation is evil". As the community size increases, the probability that there is a small but vocal minority of, forgive the candor, complete nutcases able to derail most processes approaches one. At any rate, I have no particular attachment to the requirement and wouldn't be opposed to replacing it with a note that a candidacy that does not have support from a BAG member is simply unlikely to pass. — Coren (talk) 18:10, 8 April 2008 (UTC)[reply]

FT2 wrote:

Bots aren't the same as many other decisions; they can have wide repercussions if subtly wrong or substandard somehow

You can say the exact same thing for administrators. Why don't we make an AAG, to approve admins in the same way? Majorly (talk) 16:48, 8 April 2008 (UTC)[reply]

"I for one would like to see the BAG abolished and allow bots to be commented on by everyone." - Do you have a drama fetish? They can be commented on by anyone. You're not on BAG. Get over it. Or, participate in discussion on BAG, where the idea has been raised that we can have "lay" members on BAG, responsible for representing the community view. Martinp23 18:49, 8 April 2008 (UTC)[reply]

I'm going to carry on now, because I don't care anymore. Majorly: you're looking at the BAG through some old bias which you have against it. Above, I insinuate that it's because this is a pie that, at the moment, it's unlikely you'll be able to get your finger into. I may be wrong there. In any case, you need to take another look at BAG before making comments like this, and like the blog post I just read. You are good at inciting drama, it must be said, however often I don't think it's the best thing to do. Bestest wishes, Martinp23 18:52, 8 April 2008 (UTC)[reply]

Why am I so fired up about this? Because I literally spent hours, a lot of it with others, working out the best ways to make the bot approvals process, and BAG, as open as possible. To have an individual come along and start kicking up a fuss, using arguments that are months out of date by now is, shall I say, infuriating. Actually, it's worse than that. It's downright mind-numbingly uncourteous, unfair, and absolutely annoying. MAJORLY: CHECK YOUR FACTS. Martinp23 18:56, 8 April 2008 (UTC)[reply]

Please calm down. Shouting at me will get nowhere. I made my comments after viewing relevant pages. I honestly don't see much change. It may not be the BAG's fault no one else comments, but as I said, it's probably the perceived idea that only BAG members can have anything to do with bots, and the general fear of the idea of bots. Majorly (talk) 19:30, 8 April 2008 (UTC)[reply]
To answer the question above, if you consider judging whether a user has acted reasonably as an editor, applied policy correctly, understands WP:BLOCK and so on... and then ask about specific bot loopholes that could be exploited, good and bad coding, design and interface practices, awareness of toolserver issues that influence bot design and management, factors that improve or reduce efficiency...
  • "Bots aren't the same as many other decisions; they can have wide repercussions if subtly wrong or substandard somehow" - yes.
  • "You can say the exact same thing for administrators" - no.
Like it or not, wiki-bots do need significant technical assessment and quality control in a way that humans don't, and Wiki-groups often do encourage more involvement, group learning, and a "drive" to higher aims and standards. The argument above is fairly outdated. FT2 (Talk | email) 21:41, 8 April 2008 (UTC)[reply]

I agree with Majorly. The current bot approval process is a pain in the arse. The last time, I wanted an approval to remove a single line from a template in 90,000 articles; a simple procedure, and there was a clear need for it. The complete process took almost a month, and none of it was of any use to me; all feedback I got was after I started running for real. I believe this is because my proposal was so uncontroversial, that none of the BAG members deemed it necessary to comment, and so my proposal languished. The previous system was much better: if no-one objected after some days, you were free to run the bot. The current process is just putting hurdles in place without any benefit to the bot operator or Wikipedia.

In future, I'm going to follow Cyde's example, and skip the approval phase. I've shown that I have enough technical competence to run a bot, and the necessity of the tasks that I propose, and the details of the edits that my bot will make, are better discussed at appropriate wikiprojects. I will of course consult them, but I don't see any need for a (content-less) BAG review. -- Eugène van der Pijll (talk) 21:44, 8 April 2008 (UTC)[reply]

This is a direct contravention of bot policy. Any bot that is discovered to be editing without the requisite approval will be blocked on sight, regardless of whether the operator considers themselves skilled enough to be above the law in that regard. — Werdna talk 06:11, 9 April 2008 (UTC)[reply]
A couple years ago, after people agreed that some sort of bot review/approval was needed, the system involved proposing a task on a talk page and checking for any objections (on WT:BOT, I think). Some time ago I suggested using something like that old approval system for tasks after the original bot approval. The first BRFA is the real test. After that, the operator has experience and (presumably) a good reputation. Later tasks could be more straightforward than they are currently, which would encourage more bot ops to document what they're doing. Gimmetrow 07:46, 9 April 2008 (UTC)[reply]
Werdna, being blocked would be fine; that would lead to a discussion of the usefulness of the bot, and to approval. When I follow the current process, my proposals are ignored. If BAG members do not want to perform the job they've chosen to do, that's fine with me, but don't expect me to take any notice of their bureaucratic requirements. -- Eugène van der Pijll (talk) 08:25, 9 April 2008 (UTC)[reply]
I have said for some time that it's important to keep the bot approval process as lightweight as possible, to avoid the issues Eugene is raising. We should also remember the main point of bot approval is to screen out unsuitable operators and screen out unsuitable tasks. It also gives a forum to discuss possibly controversial tasks. A helpful, uncontroversial task run by an experienced operator isn't problematic.
It's only bureaucracy to block a bot that is not broken and is doing useful work just because it doesn't have a seal of approval. When someone blocked Cyde's bot a while back for that reason, I think the main result was that the BRFA process went to MfD. — Carl (CBM · talk) 10:51, 9 April 2008 (UTC)[reply]

We are not overly bureaucratic. A task takes just a few days to be approved. We ask a few questions to check for sanity, allow you to go through a trial period, and give sufficient time for a bit of community input. It probably takes about as long as the approval process on Commons (possibly less time), and the main hold-up is always bot operators.

On the other hand, blocking unapproved bots is part of maintaining a credible approval process. If bots that were useful were allowed to continue without approval, nobody would bother with it. — Werdna talk 05:53, 10 April 2008 (UTC)[reply]

Your experience differs from mine. And my aim on Wikipedia is to be useful; not to maintain the credibility of an approval process. -- Eugène van der Pijll (talk) 08:11, 10 April 2008 (UTC)[reply]
It is critical that a credible approval process is maintained, because otherwise, badly-written bots are likely to cause significant damage, which is avertable by a few days' to a week's approval process. — Werdna talk 13:07, 10 April 2008 (UTC)[reply]
I completely agree that some sort of advance notice is worthwhile, to screen ill-conceived bot ideas. But I don't see that there is any real damage a bot can do. The bot flag gives the bot no more ability to edit than an ordinary editor (unlike an admin flag, for example). Of course it may be burdensome to revert a thousand edits by hand, but that is an inconvenience rather than real damage.
I wrote a longish comment at Wikipedia_talk:Bot_policy#My_thoughts_about_bots with my reflections on the bot approval system. — Carl (CBM · talk) 13:18, 10 April 2008 (UTC)[reply]
I do not consider a process which includes blocking useful and harmless edits to be credible. -- Eugène van der Pijll (talk) 13:28, 10 April 2008 (UTC)[reply]
I think I see where Werdna is coming from in saying "If bots that were useful were allowed to continue without approval, nobody would bother with it." It is true that we want to prevent novice bot operators from making well-intentioned mistakes, and discourage experienced operators from taking on good-faith tasks for which there is community opposition (e.g. a welcome bot).
But the current system does permit plenty of bots to run without task approvals; they only tend to get blocked if they are too conspicuous. — Carl (CBM · talk) 13:24, 10 April 2008 (UTC)[reply]

You're missing the point, Eugene. The point of a bot approval process is to determine whether a bot process is useful and harmless, and therefore ought to be allowed. It is not unreasonable to require operators to go through with a week or so's approval process to make sure that we don't have bots running rampant and breaking things. — Werdna talk 00:29, 12 April 2008 (UTC)[reply]

Apparently you and I have a different definition of the word "unreasonable". I would agree with you when it comes to bots run by people who have never run a bot before, but when a bot owner with multiple approved bot tasks wants to add a task, one which clearly has consensus to be done and is not particularly complicated to implement, a week's worth of approval process is utterly unreasonable, insulting, and a complete waste of time for everyone involved. Why does an experienced bot operator need to be checked for sanity again, checked to make sure they still know how to write code again, checked to make sure they know enough to test their own code before making thousands of crap edits again?--Dycedarg ж 21:58, 12 April 2008 (UTC)[reply]

I recently reactivated Werdnabot, and went through the appropriate approvals process. I insisted on doing so as there had been substantial changes to both my code, and the code of MediaWiki, despite the opinions of many that I needn't have bothered. In the process of my trial run, about four errors were found and fixed by astute users and members of the approvals group. I hold that this demonstrates that even experienced developers and bot operators make mistakes, and it is not a waste of time to force them to suffer what you consider to be an insulting indignity, in which the main hold-up is users. — Werdna talk 03:08, 13 April 2008 (UTC)[reply]

I do not consider a trial run to be optional, an insulting indignity, or anything else. Every single piece of code anyone ever writes should be thoroughly tested before it is put into action, that is only common sense. What I consider to be an insulting indignity is a forced trial run, with people looking over your shoulder, that you have to wait days to even get approval to do. As I've said several times in several places, if an experienced bot operator does not know enough to trial run his code ahead of time, forced or not, then he should not have a bot account, period. A bot owner should be expected to trial his own code, check the results for errors, ask other people to look over the results if he's not sure, and then run his bot. Not, wait days for trial permission, run trial, wait days for someone to get around to looking at the trial results, then get permission to run his bot. Don't even bother trying to convince me that the main hold-up is users. I just went through a BRFA that took a little less than five days; 2 days after I filed I got trial permission, 12 hours went by before someone put my bot on the AWB approval page and I was able to run the trial, and then 2 days went by after that before I got full approval. The only reason I didn't have to wait even longer for trial approval and then full approval was because I got tired of waiting and went after a BAG member myself. And this was for a bot that utilizes a widely used plugin with hundreds of thousands of edits proving that it works. So yes, I am quite fed up with your system.--Dycedarg ж 06:53, 13 April 2008 (UTC)[reply]

Excuse the sarcasm, but two whole days? However did you manage? It takes me much longer than that to get code live on Wikimedia, and I've written hundreds of features. — Werdna talk 00:29, 14 April 2008 (UTC)[reply]

Perhaps I'm misunderstanding what you're saying, but are you seriously comparing the fairly easily reversible damage a bot can do with what would happen if you broke the live version of MediaWiki that Wikimedia runs on? Anyway, I don't like waiting for no reason, I don't care how long it is. In any case it's obvious we're not going to agree on this, so I can't see any particular reason to continue this conversation.--Dycedarg ж 02:58, 14 April 2008 (UTC)[reply]

Comment by CBM

Coren asked for comments, so here are mine. Overall I think this is pretty clear and generally matches the current practice. One requirement that seems to have been dropped is that bots have to indicate in their edit summaries that they are a bot.

The part about limiting edit rates is reasonable, but someone should check with a developer about what their preferred stance is. My memory is that maxlag is the preferred method of rate limiting, with a maxlag of say 4 or 5 seconds for nonaggressive bots. The current text seems very neutral about maxlag versus a simple delay between edits.
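To make the maxlag option concrete, here is a minimal sketch of how a bot might apply it. It assumes the MediaWiki API behaviour of answering an over-lagged request with an error whose code is "maxlag", together with a Retry-After header; that should be confirmed with a developer before relying on it. The names with_maxlag, lag_delay, call_api and the send callable are hypothetical and not part of any existing framework.

```python
import time


def with_maxlag(params, maxlag=5):
    """Return a copy of the request parameters with maxlag and JSON output set."""
    merged = dict(params)
    merged["maxlag"] = str(maxlag)
    merged["format"] = "json"
    return merged


def lag_delay(body, headers, fallback=5.0):
    """Return the seconds to wait if the reply is a maxlag error, else None."""
    error = body.get("error") if isinstance(body, dict) else None
    if not error or error.get("code") != "maxlag":
        return None
    try:
        # Assumption: the server suggests a wait time in Retry-After.
        return float(headers.get("Retry-After", fallback))
    except (TypeError, ValueError):
        return fallback


def call_api(send, params, maxlag=5, max_tries=5, sleep=time.sleep):
    """Send a request, backing off whenever the server reports too much lag.

    `send` is a hypothetical callable that takes the parameter dict and
    returns a (decoded_json_body, response_headers) pair.
    """
    request = with_maxlag(params, maxlag)
    for _ in range(max_tries):
        body, headers = send(request)
        delay = lag_delay(body, headers)
        if delay is None:
            return body
        sleep(delay)
    raise RuntimeError("server still lagged after %d tries" % max_tries)
```

Unlike a fixed delay between edits, this only slows the bot down when the servers report that they are lagged, which is the difference the policy text could spell out.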

I rephrased one of the requirements to say "does not consume resources unnecessarily". I think this would be better as "does not consume resources inappropriately"; necessary is a high standard.

A frequently asked question about bots is how much activity a semi-automatic bot can have before it needs to be flagged. Have you considered adding a conservative rule of thumb here? I usually use the arbitrary rule that if you routinely have 5 edits/min sustained for 20 minutes or so, you should think about getting a bot flag. The main point of that is to reassure people who are only making 50 total edits that they don't need a flag.
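As a rough illustration of that rule of thumb, the check could be written as below. This is only one possible reading: it treats "5 edits/min sustained for 20 minutes" as at least 100 edits falling inside some 20-minute window, and it does not try to capture "routinely". The function name and the input format (edit times in seconds) are assumptions made for the example.

```python
def sustained_bot_like(timestamps, edits_per_min=5, window_min=20):
    """Return True if some window of `window_min` minutes contains at least
    `edits_per_min * window_min` edits. Timestamps are in seconds."""
    needed = edits_per_min * window_min
    span = window_min * 60
    times = sorted(timestamps)
    start = 0
    for end, t in enumerate(times):
        # Shrink the window from the left until it fits inside `span` seconds.
        while t - times[start] > span:
            start += 1
        if end - start + 1 >= needed:
            return True
    return False


# 50 edits in total, however fast, stay under the threshold.
print(sustained_bot_like([i * 5 for i in range(50)]))
# 100 edits spaced 12 seconds apart fit within one 20-minute window.
print(sustained_bot_like([i * 12 for i in range(100)]))
```

Under this reading, someone making only 50 edits never trips the check, which is the reassurance the rule of thumb is meant to give.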

I would be interested to see a mockup of the BAG confirmation page; I don't understand yet exactly what is in mind there. — Carl (CBM · talk) 12:54, 8 April 2008 (UTC)[reply]

Well, we haven't done mockups since we don't yet know what the general approval for the changes is. But we are almost certainly going to be unimaginatively ripping off the closest analogs in existing processes; there is no point in reinventing the wheel. — Coren (talk) 13:54, 8 April 2008 (UTC)[reply]

Comment by AKAF

This proposal does not yet address most of the recent problems. My particular concerns are as follows:

  1. Unflagged bots: While it is technically possible that some bots need to run unflagged, the presentation of unflagged bots as typical on this page is absolutely unacceptable. If there are historical bots which are an exception to the rule they can be listed explicitly at the bottom of the page. I suspect that there are no historical unflagged bots which cannot be flagged.
  2. Bot definition: There needs to be some work done here. Historically the definition of a bot has been speed, but currently the main problem with bots generally is the reduced amount of checking which goes into a bot edit as opposed to a manual edit. See for instance the recent tagging for speedy deletion of around 700 raster images by betacommand [6], using his normal user account, where the level of checking was significantly less than for a group of manual edits. It needs to be clear what exactly the limit is on this kind of edit. I would suggest changing the introduction "Because bots are potentially capable of editing far faster than humans can" to "Because bots are potentially capable of editing far faster than humans can, and have a lower level of verification on each edit than a human editor"
  3. Bot accounts: It appears to be the community consensus that each new task added to a bot should have a new account. Splitting bots into multiple user accounts is resisted by bot operators as a complicating factor, but I think it is clear that there are a number of bots which have circumvented the bot approval process by adding wildly different tasks to an approved bot. In particular the gathering of statistics on bots and the rollback of malfunctions is aided by having many bot accounts. I suggest changing "Contributors should create a separate user account in order to operate a bot" to "Contributors should create a separate user account for each separate task approved for a bot, except in the case that a task produces less than 100 edits"
  4. Unapproved scripts run from user accounts: I think this problem needs to be explicitly addressed. The whole point of the bot policy is to make it clear that this kind of activity is absolutely unacceptable. I invite any members of the BAG to look at the complaints stemming from betacommand's continual use of this practice to make it clear how strong the community consensus against it is.

In short, although I can see that this page is an improvement over the current bot policy, I think that there is still a lot of work to be done, particularly in the direction of defining for the reader what an archetypal approved bot account should look like. AKAF (talk) 15:16, 8 April 2008 (UTC)[reply]

Let's see if I can address those points in order:
  1. I can see there is support for flagging bots as a matter of authentication (i.e., the bot flag serves as "proof" of approval) beyond the technical constraints which made their implementation originally necessary. Given that there is no technical reason to not give the flag to an otherwise approved bot even if it should not mark its edits as bot edits, I'll change the wording accordingly.
  2. Actually, the definition of a bot has always hinged on human-supervision vs automated (with the speed of editing viewed more as a side effect). Your suggested wording does make that more apparent, and I'll put it in presently.
  3. How about requiring that any recurring or continuous task be done from a separate bot account?
  4. I don't think we'd have the authority to decide one way or the other on that. We can certainly mandate that unapproved bot accounts be blocked on sight (and the policy states as much explicitly), but bot-like edits done from a human editor's account would normally fall in the "user behavior" spectrum and beyond our scope. Or do you have an idea how to deal with this?
    (As an aside, I would say that the proposed policy would certainly make that case unauthorized operation of a bot since a human editor's account could not be authorized to run a bot; it would be quite reasonable for the community to handle such unauthorized operation in any way it deemed fit).
What do you think? — Coren (talk) 15:50, 8 April 2008 (UTC)[reply]
I don't really like requiring that different tasks use different accounts. In some cases it is good to split them off, but for example my bot does a bunch of very small tasks, so there is little reason to split them apart. — Carl (CBM · talk) 15:56, 8 April 2008 (UTC)[reply]
Yes, but were those tasks subject of different approvals or aspect of a single authorized task? — Coren (talk) 16:19, 8 April 2008 (UTC)[reply]
Another question is, are those all "one-off" tasks? I.e.: it starts, then ends, and doesn't recur? Because I think what the community is worried about is that two different continuous tasks are tied to each other. — Coren (talk) 16:19, 8 April 2008 (UTC)[reply]
The list is at User:VeblenBot. There are 5 continuous tasks, but I can't see any real benefit to splitting them into 5 different usernames. — Carl (CBM · talk) 16:27, 8 April 2008 (UTC)[reply]
Frankly, I don't see any harm either. Personally, I would make splitting tasks between accounts a recommendation rather than a requirement, but I can understand why many would prefer it be required and would not oppose that change for new requests. — Coren (talk) 16:39, 8 April 2008 (UTC)[reply]
My concern in this area is with the camel's nose effect of new tasks being carried out without approval. One bot, one task seems a good default policy. Allow for exceptions, sure, with justification, but only if there is some given in advance. ++Lar: t/c 16:55, 8 April 2008 (UTC)[reply]
Adding more requirements to new requests would only discourage experienced operators from filing new requests. One goal of the BAG process should be to minimize the number of hoops that experienced people have to jump through, because otherwise they will simply stop bothering to file task requests.
If there is a solid argument why particular tasks need to be split off, it can be made before the task is approved. Mandatory separate accounts for every task seems like a solution in search of a problem. — Carl (CBM · talk) 17:00, 8 April 2008 (UTC)[reply]
Obviously I completely disagree. As to why, see the current RfAr, where this issue is a factor. It takes very little effort to create a new bot account, and it's a great way to keep things segregated, and it makes it harder to "accidentally" execute tasks that there isn't approval for. As for the hoops argument, any operator who simply stops bothering to file task approval requests needs to have all their flags turned off, I'd say... ++Lar: t/c 17:07, 8 April 2008 (UTC)[reply]
There may be some circumstances in which it's helpful to segregate accounts, but just like we don't mandate use of the {{bots}} template, I don't see any argument why it should be made mandatory in every case. We have traditionally given experienced bot operators who avoid controversy a large amount of freedom in running tasks without approval, and I don't think any change to the BAG policy is going to change that practice. — Carl (CBM · talk) 17:46, 8 April 2008 (UTC)[reply]
Sounds like a bigger change is needed than just a BAG rewrite, then. Running tasks far outside the scope of the mandate given is not something I approve of, and I suspect many others would not either were they actually aware of it. It seems silly to detail out the tasks contemplated, if once approval is given, any task at all can be undertaken. ++Lar: t/c 18:26, 8 April 2008 (UTC)[reply]
It's certainly less bureaucratic that way. The mandate of everyone here is to improve the encyclopedia; if the bot approval process makes someone nervous or depressed, they can certainly ignore the process if what they are doing is uncontroversial and clearly beneficial.
I do think that it's better for bot operators to put in task requests, so that it's clear what's going on. My argument is that we should keep the bureaucracy to a minimum to encourage them to enter those task requests. — Carl (CBM · talk) 19:22, 8 April 2008 (UTC)[reply]
One of the reasons separate accounts are valuable is in case of fuck up. If a bot goes br0ken and starts scribbling over hundreds of articles, mass-reverting the bot back to date X is a fairly easy thing to do iff the bad edits aren't interspersed with hundreds of unrelated good edits from another task. It also allows blocking one misbehaving function without affecting the other, properly working ones. — Coren (talk) 19:43, 8 April 2008 (UTC)[reply]
That only makes sense for bots that are making edits on a scale large enough that they can't be reverted by hand. This is the type of thing that should be considered on a case by case basis, not written into the bot policy. — Carl (CBM · talk) 20:00, 8 April 2008 (UTC)
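To make the revert argument in this exchange concrete, here is a minimal sketch, not a description of any real revert tool. It assumes a simplified edit record with a hypothetical `task` label (real edit histories carry no such field) and shows why a blanket "revert this account back to date X" also undoes good edits when two tasks share one account.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edit:
    rev_id: int
    account: str
    task: str       # hypothetical label, used only for this illustration
    timestamp: int  # simplified to an integer clock


def edits_to_revert(history, account, since):
    """Select every edit by `account` made at or after `since`."""
    return [e for e in history if e.account == account and e.timestamp >= since]


def collateral(selected, broken_task):
    """Edits a blanket revert would undo although their task worked fine."""
    return [e for e in selected if e.task != broken_task]


# One shared account running two interleaved tasks (made-up data).
shared = [
    Edit(1, "ExampleBot", "archiving", 100),
    Edit(2, "ExampleBot", "tagging", 110),
    Edit(3, "ExampleBot", "archiving", 120),
    Edit(4, "ExampleBot", "tagging", 130),
]

# The same edits, with each task on its own (hypothetical) account.
separate = [
    Edit(1, "ExampleBot-archiving", "archiving", 100),
    Edit(2, "ExampleBot-tagging", "tagging", 110),
    Edit(3, "ExampleBot-archiving", "archiving", 120),
    Edit(4, "ExampleBot-tagging", "tagging", 130),
]

shared_hit = collateral(edits_to_revert(shared, "ExampleBot", 105), "tagging")
separate_hit = collateral(
    edits_to_revert(separate, "ExampleBot-tagging", 105), "tagging"
)
```

With the shared account, the blanket revert of the broken "tagging" task also catches one good "archiving" edit; with separate accounts it catches none. At the scale of a handful of edits this is trivial to sort out by hand, which is the counterpoint made in the reply above.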

(outdent) I disagree with an absolute one task per bot account rule in the strongest possible terms. As in all things of this nature, it is a result of one or two accounts doing things that annoy a large number of people, and now suddenly everyone has to follow new overly stringent rules that serve in most circumstances utterly no purpose. For example, I presently run three different tasks through one account, have another that's completed but I might want to do again later, am in the process of adding another, and also do high speed semiautomated editing. Why should I be expected to open five different accounts? Only two of the tasks involve high speed unsupervised editing that might require blocking, but that has never happened to me and I sincerely doubt that it ever will as one of the tasks is actively maintained by an extremely experienced operator and the other I rarely run anymore anyway, and involves a narrowly defined task that has no changing parameters. And then I would like to be able to add tasks now and again without wasting too much of my time. I would imagine, also, that the people arguing in favor of this have no idea how annoying it would be to code for this. I'd either have to have four or five different distributions of Pywikipedia wasting space on my hard drive which I would have to maintain individually or use some symbolic links nonsense which it doesn't even have instructions how to make except in Linux which I am having difficulty using at the moment. It's bureaucratic nonsense at its worst; if someone like the instigator of this whole mess is abusing his bot account, then make him break his tasks down into different accounts by having a proviso declaring that BAG may require this of certain bot owners at their own discretion. 
Don't punish the good guys for the actions of a few; all you'll do is drive away your already sparse coding resources, not to mention add even more work for the bureaucrats who have to flag all of these redundant bot accounts.--Dycedarg ж 00:31, 9 April 2008 (UTC)

Dycedarg, I can fully appreciate your point here, but I respectfully disagree. I think that there are classes of bots where it is not really necessary to have multiple accounts, and perhaps we should enumerate them. For me they are:
  1. Bots with recurring jobs with low numbers of relatively uncontroversial edits
  2. Bots which do not edit, or which edit only their own userspace
for example newsletter bots or statistics bots (where the server load limits are observed). Perhaps you can add some others. I think though that any bot job which edits mainspace or imagespace needs to be split off into a different account. Different users would have different ideas about the number of edits at which this should happen, so perhaps you can make your own suggestion, but I would think that we could find general agreement that any bot which will make 10,000 edits during its lifetime should be under its own username. I would personally hope that any bot which will make over 1,000 edits during its lifetime or 100 edits per day would have its own username.
This ties in with the definition of a bot. For me a script running on a user account with 500 automatic edits interspersed with user edits is a real problem if reversion is needed. This is the reason why I have suggested a 100 edit limit. It's twice the usual bot test amount, and leaves a bot operator relatively free to test out a new function, or to discuss it with others, without allowing carte blanche to have infinite edits. I would suggest a guideline for rules of thumb about what is needed for a bot. For example:
  1. Bot creates <50 lifetime edits --> no approval necessary
  2. Bot creates >50 lifetime edits --> BAG approval is necessary, but could be a new task of an existing bot
  3. Bot creates >1,000 lifetime edits or >100 edits per day --> BAG approval (new bot) + separate account necessary
  4. Bot creates >10,000 lifetime edits or >500 edits per day --> BAG approval (new bot) + separate account + bot flag necessary
Perhaps you have a suggestion about what limits you think should apply? AKAF (talk) 07:20, 9 April 2008 (UTC)
To Coren: I've described here what I think should be the rule of thumb. For me it rests mainly on the difficulty of manually correcting the edits of a bot, which is a gradient. I have tried to integrate Carl's concerns, which I think are very valid, since there are plenty of small jobs which are not a problem. Even very controversial tasks are likely to be unproblematic if <50 edits are produced. I would be happy with the words "recurring or continuous" also.
As to point 4, I'm afraid I see this as extremely important, since editors who regularly make unsupervised edits to mainspace are very corrosive. I think WP:DUCK applies here, especially if a large number of insufficiently checked edits of a specific type, with a single (or automated) edit summary can be seen from a single account. AKAF (talk) 07:20, 9 April 2008 (UTC)

I mind a "general rule of thumb", and discretion given as to the following of said rule, far less than I do a blanket "Each clearly delimited function of a bot should use a single, distinct account." I think it should be encouraged, mandated in some cases even, but that there should be no absolutes involved. That is, in my opinion, what BAG and the bot approval process is there for in the first place. They should be given the right to make approval contingent on some prerequisites, such as following nobots if it makes sense, or splitting the task away from a prior held bot account if it makes sense. I think your guidelines would make it easier to give bot operators an idea of what BAG will expect, but that they should be able to make a case for running their bot the way they see fit and barring any major objections then or later BAG should just let them. For me, I'm not going to split the bot as I use it at the moment into multiple accounts. Two of the tasks it does are very low in edit count per day and are utterly uncontroversial, to the extent that I never bothered with a BRFA for them. The other one runs very rarely, and when it does run it does all its work in chunks of a few hundred, maybe a thousand at most, and these chunks could hardly be considered to be interspersed with the edits of the other tasks. I might consider getting another account for the new task I'm getting approval for, as it is another one involving chunks of hundreds/thousands of edits, but I don't want to. Again, it will be run periodically, and its edits will be in contiguous chunks that aren't interspersed with anything, and could be easily reverted en masse if necessary. Really, my primary objection to this whole thing is that I dislike in the extreme unnecessary complexity.--Dycedarg ж 09:35, 9 April 2008 (UTC)

I quite agree with you, but I think that it would be better to write the actual text of the bot policy from the point of view of a bot which would fall under case 4 above, and have other bots as the exception, rather than the other way around. This will mean that the vast majority of bots will fall under the exception, but at least it will be clear from what exactly they are being excepted. Since there are plenty of people who think like you, I think it would be best to at least clarify what the expectations are. The problem in the past has been that the bot policy was so indefinite as to completely defy application to any concrete test case.
I think though, that there would not be anyone who would agree with your adding jobs to a bot completely without a BRFA. Would it help if it were clarified that an active and running bot under my suggested case 1 above could have a BRFA while running if it turns out to be a good idea to keep? I would hope that this would be a case where speedy approval would be applicable.
I see the point of this policy as being a guideline as to when an administrator uninvolved with the BAG should block a user or bot. The BAG has shown itself chronically unable to deal with problem users or bots, and there does not seem to be any kind of community within bot users which cares either. At least part of the problem is due to wildly varying ideas about what the rules/guidelines actually are, and a chronic unwillingness on the part of some bot users to behave in a responsible manner.
Perhaps a technical question for the developers: Would it be possible for accounts with a bot flag to have linked sub-accounts and to use a switch to run different jobs as those accounts, but with the code running through the main account? This would avoid the code-splitting which Dycedarg appears to find so problematic. AKAF (talk) 12:17, 9 April 2008 (UTC)
The tasks I added without a BRFA are two simple maintenance tasks for project space pages requiring 8 edits a day or less, which are made at one or two edits per minute. The maintainers of said project pages are the ones who requested it, and in one case I was simply duplicating functionality for a bot whose owner was on a wikibreak of indeterminate length. I decided that wasting my time and BAG's time with something so completely uncontroversial in every sense of the word was utterly nonsensical. What possible reason could there be for requiring it? You might as well template someone for vandalizing their own sandbox. After all, it is part of Wikipedia. If someone's doing unauthorized bot work that could conceivably be considered controversial by someone at some point, then yes, yell at them. Block them if you must. But if no one is going to object for any reason at any time, the obsessive need for hoop jumping and process and bureaucracy is utterly unhelpful. Oh, and if you think no one would agree with me, check the section above. As for your third point: Not to my understanding, and the developers would be unlikely to instantiate such a major change for such a nonissue.--Dycedarg ж 12:48, 9 April 2008 (UTC)
Dycedarg, I'm not attacking your bot; it appears to me to be completely beneficial (although I have added a question at your current BRFA). The point is that every bot operator feels that they are trustworthy, and that extra tasks shouldn't require them jumping through hoops, and that their tasks don't require approval because they are trustworthy. For the most part that's true too, but not everyone can be above average. The trouble is that your idea is not what is reflected in the bot policy. If you feel that some tasks should be exempted from bot policy then write something to clarify which tasks you think they are and get it put into the bot policy. This idea that the bot policy doesn't apply to experienced bot operators is a huge problem. If you don't agree with the bot policy as written, then collaborate and get it changed. AKAF (talk) 13:12, 9 April 2008 (UTC)
Its edit rate is ridiculously low. I can edit faster than that with no effort at all, manually. It's only technically a bot because I don't choose to review its edits as they're made. No one has complained, and if I hadn't mentioned it to you I doubt anyone would have ever so much as noticed. Quite frankly, if you insist on pursuing this until I have a BRFA you will be creating work for multiple people and solving a problem that could not possibly be said to exist. Oh, and by the way, my stance on this matter has nothing to do with my considering myself more trustworthy, or more experienced, or anything of the kind. I can name a bunch of people off the top of my head who are more experienced than I am. It has everything to do with the fact that the tasks are ridiculously low impact, and no more than a couple dozen people are so much as aware that they get done at all. Bots are potentially problematic because of their ability to do mass amounts of damage, and the difficulty associated with overseeing what they're doing. These tasks are neither particularly edit heavy, nor would a single major mistake go unnoticed by the maintainers of the pages it would destroy. Additionally, I can guarantee you that I personally checked every edit it made on either task for weeks. Thus none of the reasons these standards exist apply to the bot, and thus I see no reason for a BRFA. In the vast majority of cases I would agree with you; however, I'm going to IAR on this one until someone comes up with an actual reason as to why I shouldn't.--Dycedarg ж 13:37, 9 April 2008 (UTC)
Dycedarg, let me make this very clear: I don't want you to have a BRFA and I don't care either way about your script/bot. But I don't want you to have to ignore bot policy either. My position is that the official bot policy should allow considerably more freedom for bot operators to do what they want, but that bot operators should stop ignoring it. What I want is for you to change the bot policy to reflect what you think is correct, so that it will be a realistic guide for future bot operators. As I have said multiple times above, I think that there should be an official leeway for low impact jobs, so that operators don't have to jump through hoops. Just tell us what you think it should be, so that your bot could jump through. What about the following guideline?
  1. Bot creates <50 lifetime edits or <10 edits/day outside its own userspace --> no approval necessary to be run on a user or bot account
  2. Bot creates >50 lifetime edits or >10 edits/day outside its own userspace --> BAG approval is necessary, but could be a new task of an existing bot
  3. Bot creates >1,000 lifetime edits or >100 edits per day --> BAG approval (new bot) + separate account necessary
  4. Bot creates >10,000 lifetime edits or >500 edits per day --> BAG approval (new bot) + separate account + bot flag necessary
AKAF (talk) 14:20, 9 April 2008 (UTC)
I think we could just use common sense in deciding which bots are "high impact" and which are not, rather than making up arbitrary numbers that (in the end) some misguided editor will try to enforce strictly. Wikipedia has historically avoided detailed, legalistic policies in favor of general statements of principles. — Carl (CBM · talk) 14:35, 9 April 2008 (UTC)
Although I agree in principle, in practice this approach has led to a proliferation of unapproved bot tasks, some of which are problematic, some of which are not, and a general feeling amongst bot operators that the BAG policies need to be circumvented in order to operate in the real world. I think that it would help bot operators like you and Dycedarg if you could point to a guideline and say that you're within it. Although you say that it's common sense, Betacommand didn't think that the deletion of 700 raster images in a day by an unapproved bot was a problem either. I would push for a guideline like this so that there can be general agreement that while 10 edits/day is okay, 1000 edits/day is not. I would hope that labelling it a guideline would mean that a bot which mainly produces 11 edits/day would not need to be approved.
The second problem is that there does not actually appear to be any consensus of what constitutes "high impact". Why don't you write down a definition for us? AKAF (talk) 14:58, 9 April 2008 (UTC)
Here's one common sense rule: If a project decides that it wants its content updated by a bot, and the bot only makes that type of edit to the page or series of pages under the purview of the project, then it should be automatically approved. The only people such edits affect are the maintainers of the pages, and if said maintainers approve of the bot then nothing else matters. Such tasks could be what I'm doing for WP:DEP, or a specific type of archiving, or anything similar so long as the edit rate is relatively low. As for numbers, the tasks my bot is doing will undoubtedly surpass any arbitrary limits set for lifetime edits because, unless the needs of the project pages change or I leave the project, it will continue doing them forever. I don't think that the lifetime edit limits are enforceable in general because, with exceptions depending on the type of task, it is rather impossible to judge when one first runs a task how long it will end up running, or the possible need for recurrence. I would like it if some definition for high impact could be determined that isn't solely based off of arbitrary edit rates.--Dycedarg ж 15:05, 9 April 2008 (UTC)
I think it will require a more systemic change for BAG to deal with issues like the deletion of raster images (Betacommand is a BAG member after all). I think that changing the policy to try to eliminate situations like that is counterproductive; individual situations should be dealt with individually, and tough cases make bad law. What we need for BAG is not more policies, but more clue overall.
I think it would be enough to simply say "high impact" in the policy, and remind operators to discuss first if they are unsure whether their plans would be considered high impact. That will prove more robust in practice than an arbitrary metric based on counting edits. — Carl (CBM · talk) 15:09, 9 April 2008 (UTC)
(Edit conflict) Perhaps:
"Bots which edit only a single page, or a closely linked set of pages (such as a single user page or project page with its closely associated subpages), do not need to be approved, except in the (rare) case that they generate high server load in the gathering of data"
I had hoped that my proposed rule one above would exclude nearly all of your bots, by not requiring any tasks with fewer than 10 edits/day to be registered. Maybe rather than lifetime edits, we could count edits in one run, for example:
  1. Bot creates <50 edits in each run or <10 edits/day outside its own userspace --> no approval necessary to be run on a user or bot account
  2. Bot creates >50 edits in each run or >10 edits/day outside its own userspace --> BAG approval is necessary, but could be a new task of an existing bot
  3. Bot creates >1,000 edits in each run or >100 edits per day --> BAG approval (new bot) + separate account necessary
  4. Bot creates >10,000 edits in each run or >500 edits per day --> BAG approval (new bot) + separate account + bot flag necessary
How's that? AKAF (talk) 15:22, 9 April 2008 (UTC)
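As an illustration only, the four per-run tiers proposed above can be sketched as a small classifier. Two points are assumptions of this sketch, since the proposal does not settle them: a bot lands in the highest tier for which either of its two numbers exceeds the threshold, and values exactly on a boundary (for example 50 edits in a run) stay in the lower tier. The names `Tier` and `classify` are hypothetical.

```python
from enum import IntEnum


class Tier(IntEnum):
    NO_APPROVAL = 1                         # may run on a user or bot account
    APPROVAL_EXISTING_ACCOUNT = 2           # BAG approval, may be a new task of an existing bot
    APPROVAL_SEPARATE_ACCOUNT = 3           # BAG approval (new bot) + separate account
    APPROVAL_SEPARATE_ACCOUNT_AND_FLAG = 4  # as tier 3, plus bot flag


# (tier, edits-per-run threshold, edits-per-day threshold), highest tier first.
_THRESHOLDS = [
    (Tier.APPROVAL_SEPARATE_ACCOUNT_AND_FLAG, 10_000, 500),
    (Tier.APPROVAL_SEPARATE_ACCOUNT, 1_000, 100),
    (Tier.APPROVAL_EXISTING_ACCOUNT, 50, 10),
]


def classify(edits_per_run, edits_per_day):
    """Return the proposed tier for a bot task.

    Both arguments are assumed to count only edits made outside the
    bot's own userspace.
    """
    for tier, run_limit, day_limit in _THRESHOLDS:
        # Assumption: exceeding either threshold is enough for the tier.
        if edits_per_run > run_limit or edits_per_day > day_limit:
            return tier
    return Tier.NO_APPROVAL
```

Under these assumptions a task making 8 edits a day, like the project-page maintenance described earlier in the thread, gives `classify(8, 8)` in the no-approval tier, while a task run in chunks of more than a thousand edits falls into tier 3 however rarely it runs.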
CBM proposed an alternative method of bot approval with which I wholly agree on WT:BOT, which would solve this problem entirely. So consider that my new position on this whole issue.--Dycedarg ж 09:24, 10 April 2008 (UTC)

Rein in the bots

Wikipedia has become plastered with tags that are not really decorative. This is based on a theory of editing as realistic as the Marxist view of the common man. People don't start contributing knowing everything about these repurposed typographical symbols. About as many people read the manual before fiddling with their gadget as read all the Wiki edit guides before editing. (I've written manuals for a couple of decades and have so far only encountered one lonely example for this special category!!) I doubt there are many articles that are uploaded as a whole in pristine and compliant form (has there been one yet??). Most people will probably do things in bits and pieces and read a guide when they feel they need to. And then you get these bots sticking all manner of stickers on your evolving work. (E.g. I had the misfortune of using an underscore to make lines and a bot felt free to revert several hours' worth of work without so much as a word as to why!) Such things discourage new editors, and if this is not to end up stewing in its own juice it needs frequent new blood. So here are a couple of suggestions.
  - Bots that revert edits have to leave a line as to why on the "talk" page, along with a way to get the deletion reverted.
  - Bot tags have to be revisited once a month to check whether they still apply.
  - Bots that complain about style or use of "code" symbols report to a page where programming jockeys can pick up tasks and help out.
  - People who run bots have to clean up a minimum number of complaints themselves or show a pool of other people that do. (That works at our translators' site, where you can only ask if you answer.) If not, their bot gets the boot.
Lisa4edit —Preceding unsigned comment added by 71.236.23.111 (talk) 12:38, 12 April 2008 (UTC)