Wikipedia:Bot policy: Difference between revisions

From Wikipedia, the free encyclopedia
Betacommand (talk | contribs)
m Revert to revision 369720341 by GoEThe.
m Bot usage: "since" is not synonymous with "because"

Revision as of 12:03, 11 July 2010

Bot policy covers the operation of all bots and automated scripts used to provide automation of Wikipedia edits, whether completely automated, higher speed, or simply assisting human editors in their own work.

It also covers the work of the Bot Approvals Group, which supervises and approves all bot-related activity from a technical and quality-control perspective on behalf of the English Wikipedia community.

Definitions

  • Bots (short for "robots") are generally programs or scripts that make automated edits without the necessity of human decision-making.
  • Assisted editing covers specifically lower-speed tools and scripts that can assist users to make decisions but leave the actual decision up to the user (see Assisted editing guidelines below). Any program or tool which does not allow the user to view each edit and give an instruction to make that edit (that is, one which can edit without the operator looking at and approving the change) is considered to be a bot.
  • Scripts are personalized scripts (typically, but not exclusively, written in JavaScript) that may automate processes, or may merely improve and enhance the existing MediaWiki interface.
  • The Bot Approvals Group ("BAG") is a group of users with appropriate technical skills and wiki-experience, whose members are approved by the community to oversee and make decisions on bot activity and operation on-wiki for the community. In ambiguous cases, the Bot Approvals Group also determines whether a process is classified as a bot or as assisted editing. Formal work by MediaWiki Developers is outside the scope of this policy.

Bot usage

Because bots:

  1. Are potentially capable of editing far faster than humans can
  2. Have a lower level of scrutiny on each edit than a human editor
  3. May cause severe disruption if they malfunction or are misused
  4. Are held to a high standard by the community,

high standards are expected before a bot is approved for use on designated tasks. Operation of unapproved bots, or use of approved bots in unapproved ways outside their conditions of operation, is prohibited and may in some cases lead to blocking of the bot account and possible sanctions for the operator.

Note that higher speed or semi-automated processes may effectively be considered bots in some cases, even if performed by an account used by a human editor. If in doubt, check.

Bot accounts

Contributors should create a separate user account in order to operate a bot. The account's name should identify the operator or bot function. Additionally, it should be immediately clear that its edits are made by an automated account; this is usually accomplished by including the word "Bot" at the beginning or end of the username. (Bots active on other wikis may need to use other means to indicate this.) Bots must only edit while logged into their account; bots which often attempt to edit while logged out should use AssertEdit or a similar function. Tools not considered to be bots do not require a separate account, but some users do choose to make separate accounts for non-bot but high-speed editing.
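As a rough illustration of the naming convention and the logged-in check described above, the following Python sketch shows one way a bot might guard its edits. The function names are hypothetical and not part of any framework. The sketch assumes the bot talks to the MediaWiki web API, where the `assert` request parameter (`user` or `bot`) offers the kind of check that AssertEdit provided; no network request is made here.

```python
def looks_like_bot_name(username):
    """Heuristic only: True if "bot" starts or ends the username.

    This mirrors the usual convention and will also match names such as
    "Bottle", so it is a sanity check rather than proof of anything.
    """
    name = username.strip().lower()
    return name.startswith("bot") or name.endswith("bot")


def build_edit_params(title, text, summary, token, flagged=True):
    """Build the parameters for an edit request (hypothetical helper).

    With assert=bot (or assert=user for an unflagged bot) the server is
    expected to reject the edit instead of saving it while logged out.
    """
    return {
        "action": "edit",
        "title": title,
        "text": text,
        "summary": summary,
        "token": token,
        "format": "json",
        "assert": "bot" if flagged else "user",
    }
```

A bot using a framework would normally rely on that framework's own login handling; the point of the sketch is only that every edit request carries the assertion.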

The contributions of a bot account remain the responsibility of its operator, who must be prominently identifiable on its user page. In particular, the bot operator is responsible for the repair of any damage caused by a bot which operates incorrectly. All policies apply to a bot account in the same way as to any other user account. Bot accounts are considered alternative accounts of their operator for the purposes of the user account policy.

Bot accounts should not be used for contributions that do not fall within the scope of the bot's designated tasks. In particular, bot operators should not use a bot account to respond to messages related to the bot. Bot operators may wish to redirect a bot account's discussion page to their own.

The 'bot' flag

Bot accounts will be marked by a bureaucrat upon Bot Approvals Group request as being in the "bot" user-group within MediaWiki. This is a flag on their account that indicates the account is used by a bot, and reduces some of the technical limits usually imposed by the MediaWiki software. Edits by such accounts are hidden by default within recent changes.

Historically, being flagged as a bot account was distinct from the approval process; not all approved bots had that property. This stemmed from the fact that all bot edits were hidden from recent changes, and that was not universally desirable. Now that bot edits can be allowed to show up on recent changes, this is no longer necessary.

Bot requirements

In order for a bot to be approved, its operator should demonstrate that it:

  • is harmless
  • is useful
  • does not consume resources unnecessarily
  • performs only tasks for which there is consensus
  • carefully adheres to relevant policies and guidelines
  • uses informative messages, appropriately worded, in any edit summaries or messages left for users

The bot account's user page should identify the bot as such using the {{bot}} tag. The following information should be provided on, or linked from, both the bot account's userpage and the approval request:

  • Details of the bot's task (or tasks)
  • Whether the bot is manually assisted or runs automatically
  • When it operates (continuously, intermittently, or at specified intervals), and at what rate
  • The language and/or program that it is running

While performance is not generally an issue, bot operators should recognize that a bot making many requests or editing at a high speed has a much greater effect than the average contributor. Operators should be careful not to make unnecessary Web requests, and be conservative in their editing speed. Developers will inform the community if performance issues of any significance do arise, and in such situations, their directives must be followed.

  • Bots in trial periods, and approved bots performing all but the most trivial or urgent tasks, should be run at a rate that permits review of their edits when necessary.
  • Unflagged bots should edit more slowly than flagged bots, as their edits are visible in user watchlists.
  • The urgency of a task should always be considered; tasks that do not need to be completed quickly (for example, renaming categories) can and should be accomplished at a slower rate than those that do (for example, reverting vandalism).
  • Bots' editing speed should be regulated in some way; subject to approval, bots doing non-urgent tasks may edit approximately once every ten seconds, while bots doing more urgent tasks may edit approximately once every five seconds.
  • Bots editing at a high speed should operate more slowly during peak hours (1200–0400 UTC) and peak days (the middle of the week, especially Wednesdays and Thursdays) than during the quietest times (weekends). Traffic statistics are available.
  • Bots' editing speed may also be adjusted based on slave database server lag; this allows bots to edit more quickly during quiet periods while slowing down considerably when server load is high. This can be achieved by appending an extra parameter to the query string of each requested URL; see mw:Manual:Maxlag parameter for more details.
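The speed guidance above can be sketched in Python as follows. This is a minimal illustration, not a prescribed implementation: the function names and the peak-hour slowdown factor are assumptions, while the five- and ten-second intervals, the 1200–0400 UTC peak window, and the `maxlag` query parameter come from the text. The value `maxlag=5` is a commonly used setting, not a requirement of this policy.

```python
from datetime import datetime, timezone
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

NON_URGENT_INTERVAL = 10.0  # seconds between edits for non-urgent tasks
URGENT_INTERVAL = 5.0       # seconds between edits for urgent tasks
PEAK_SLOWDOWN = 2.0         # assumed factor; the policy gives no number


def is_peak_hour(when):
    """True if `when` (a timezone-aware datetime) is in 1200-0400 UTC."""
    hour = when.astimezone(timezone.utc).hour
    return hour >= 12 or hour < 4  # the window wraps past midnight


def edit_interval(urgent, when):
    """Seconds to wait between edits, slowed down during peak hours."""
    base = URGENT_INTERVAL if urgent else NON_URGENT_INTERVAL
    return base * PEAK_SLOWDOWN if is_peak_hour(when) else base


def with_maxlag(url, maxlag=5):
    """Return `url` with a maxlag parameter set in its query string."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "maxlag"]
    query.append(("maxlag", str(maxlag)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

A bot would sleep for `edit_interval(...)` seconds between edits and, when the server reports that lag exceeds the `maxlag` value, wait before retrying rather than repeating the request immediately.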

Bots that download substantial portions of Wikipedia's content by requesting many individual pages are not permitted. When such content is required, download database dumps instead. Bots that require access to run queries on Wikipedia databases may be run on the toolserver; such processes are outside the scope of this policy.

Good communication

Users who read messages or edit summaries from bots will generally expect a high standard of cordiality and information, backed up by prompt and civil help from the bot's operator if queries arise. Bot operators should take care in the design of communications, and ensure that they will be able to meet any inquiries resulting from the bot's operation cordially, promptly, and appropriately. This is a condition of operation of bots in general. At a minimum, the operator should ensure that other users will be willing and able to address any messages left in this way if they cannot be sure to do so themselves.

Configuration tips

Bot operators may wish to implement the following features, depending on the nature of the bot's tasks:

  • Bots which deliver user talk messages are encouraged to provide a method of opting out of non-critical messages, and advertise that method on the bot user page.
  • Bots which edit many pages, but may need to be prevented from editing particular pages, can do so by interpreting {{Bots}}; see the template page for an explanation of how this works.
  • Bots which "clean up" in response to non-vandalism user edits may honor {{inuse}} to help avoid edit conflicts, either by checking for the presence of that template (and redirects) or the category Category:Pages actively undergoing a major edit. As suggested at Template:inuse/doc, a bot that honors {{inuse}} may ignore the template if it has been more than 2 hours since the last edit.
  • Providing some mechanism which allows contributors other than the bot's operator to control the bot's operation is useful in some circumstances – the bot can be enabled or disabled without resorting to blocks, and could also be configured in other ways. For example, the bot could check the contents of a particular page and act upon the value it finds there. If desired, such a page could then be protected or semi-protected to prevent abuse. Bot operators doing this should bear in mind that they retain all responsibility for their bot account's edits.
  • To avoid unnecessary blocks, the bot may detect whether its account is logged in, and cease editing if not. This can be done using the Assert Edit Extension.
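Several of the tips above amount to small checks a bot can run before editing a page. The Python sketch below is a simplified illustration under stated assumptions: it handles only the basic `{{nobots}}`, `{{bots}}`, `{{bots|allow=...}}` and `{{bots|deny=...}}` forms (the template page remains the authoritative description), and the "enabled" run-page convention and all function names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

# Simplified pattern: matches {{bots}}, {{nobots}} and one level of parameters.
_BOTS_TEMPLATE = re.compile(
    r"\{\{\s*(nobots|bots)\s*(?:\|([^{}]*))?\}\}", re.IGNORECASE
)
INUSE_GRACE = timedelta(hours=2)  # from the Template:inuse/doc suggestion


def bot_may_edit(wikitext, bot_name):
    """Return False if the page text opts out of edits by `bot_name`."""
    match = _BOTS_TEMPLATE.search(wikitext)
    if match is None:
        return True
    if match.group(1).lower() == "nobots":
        return False
    bot = bot_name.strip().lower()
    for part in (match.group(2) or "").split("|"):
        key, _, value = part.partition("=")
        names = {n.strip().lower() for n in value.split(",") if n.strip()}
        if key.strip().lower() == "allow":
            return "all" in names or bot in names
        if key.strip().lower() == "deny":
            if "none" in names:
                return True
            return not ("all" in names or bot in names)
    return True


def inuse_is_stale(last_edit, now):
    """True if more than two hours have passed since the last edit."""
    return now - last_edit > INUSE_GRACE


def run_page_enabled(run_page_text):
    """Assumed convention: the control page contains the word 'enabled'."""
    return run_page_text.strip().lower() == "enabled"
```

A bot might fetch its control page at the start of each run and stop if `run_page_enabled` returns False, then call `bot_may_edit` on each page it is about to change.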

Authors of bot processes are encouraged, but not required, to publish the source code of their bot.

Restrictions on specific tasks

Categorization of people

Assignment of person categories should not be made using a bot. Before adding sensitive categories to articles by bot, the input should be manually checked article by article, rather than uploaded from an existing list in Wikipedia. (See Wikipedia:Categorization of people)

Spell-checking

Bot processes may not fix spelling mistakes in an unattended fashion, as accounting for all possible false positives is unfeasible. Assisted spell-checking is acceptable, and may or may not be considered a bot process depending on the editing rate. Such processes must not convert words from one regional variation of English to another.

Interwiki links

Operators of interwiki bots creating new links to articles that do not already link back must be familiar with the languages to which they are linking. Bots running standard tools such as the pywikipedia framework should be updated to the latest version daily. Globally approved interwiki bots are permitted to operate on English Wikipedia, subject to local requirements. Interwiki bots should not run unsupervised in the Template namespace unless specifically designed to run on templates. Operators must make sure that interwiki links added to templates are not transcluded on all pages using the template, by placing them in the appropriate section of the documentation subpage, or in the non-included portion of the template if no documentation subpage exists. (As of May 2009, the standard interwiki module in pywikipedia does not meet these requirements.)

Cosmetic changes

Scripts that apply cosmetic changes, such as cosmetic_changes.py, should be used with caution. The pywiki functions standardizeCategories, validXhtml, translateAndCapitalizeNamespaces, removeNonBreakingSpaceBeforePercent, or equivalent functionality, should not be used (as of May 2009), as they do not function correctly or there is no consensus for such changes. The functions removeUselessSpaces and cleanUpSectionHeaders are also not recommended, as they mainly move around whitespace.

Approval process

Approval

All bots that make any logged actions (such as editing a page, uploading files or creating accounts) must be approved before they may operate. Operators may carry out limited testing of bot processes without approval, provided that test edits are very low in number and frequency, and are restricted to test pages such as the sandbox. Such test edits may be made from any user account. In addition, any bot or automated editing process that affects only the operator's or the bot's own user and talk pages (or subpages thereof), and that is not otherwise disruptive, may be run without prior approval.

Bot approval requests should be made at Wikipedia:Bots/Requests for approval. Requests should state precisely what the bot will do, as well as any other information that may be relevant to its operation. The request will then be open for some time during which the community or members of the Bot Approvals Group may comment or ask questions. The decision to approve a request should take into account the requirements above, relevant policies and guidelines, and discussion of the request. The need for an account to be added to the "bot" user group may also be determined by the approvals group; should this be required, it may be carried out by any bureaucrat.

During the request for approval, a member of the Bot Approvals Group may approve a short trial during which the bot is monitored to ensure that it operates correctly. The terms and extent of such a trial period may be determined by the approvals group. Automated processes should be supervised during trial periods so that any problems may be addressed quickly.

In addition, prospective bot operators should be editors in good standing, with demonstrated experience in the tasks the bot is proposed to perform.

Should a bot operator wish to modify or extend the operation of a bot, they should ensure that they do so in compliance with this policy. Small changes, for example to fix problems or improve the operation of a particular task, are unlikely to be an issue, but larger changes should not be implemented without some discussion. Completely new tasks usually require a separate approval request. Bot operators may wish to create a separate bot account for each task.

Accounts performing automated tasks without prior approval may be summarily blocked by any administrator.

Appeals and reexamination of approvals

Requests for reexamination should be discussed at Wikipedia talk:Bots/Requests for approval. This may include either appeal of denied bot requests, or reexamination of approved bots. In some cases, Wikipedia:Requests for comment may be warranted.

Such an examination can result in:

  • Granting or revoking approval for a bot task;
  • Removing or placing the account into the bot user group; or
  • Imposing further operational conditions on the bot to maintain approval status.

BAG has no authority on operator behavior, or on the operators themselves. Dispute resolution is the proper venue for that.

Bots with administrative rights

Bots with administrator rights (a.k.a. "adminbots") are also approved through the general process. The bot operator must already be an administrator. As with any bot, the approval discussion is conducted on two levels:

  1. Community approval for the bot's task, i.e. whether there is consensus for the task to be done by an automated program. This discussion takes place in a dedicated subsection of the BRFA proper, on the Village Pump, or in any other forum, provided it receives significant publicity.
  2. The technical assessment of the bot's implementation, i.e. whether the bot will do what it's supposed to do. The technical assessment process is open to all users, though those leaving comments are generally expected to possess some relevant technical expertise. It is recommended that the source code for adminbots be open, but should the operator elect not to do so, they must present it for review upon request from any BAG member or administrator.

After a suitable consensus that the task is both useful and technically sound, a member of the Bot Approvals Group will review the request and approve a trial period, wherein the bot will either run "dry" without a 'sysop' bit (if practical), or be run on the operator's main account (with its edits clearly marked as such). When the Bot Approvals Group is satisfied that the bot is technically sound, they will approve the bot and recommend that it be given both 'bot' and 'sysop' rights. The bureaucrat who responds to the flag request acts as a final arbiter of the process and will ensure that an adequate level of community consensus (including publicity of approval discussion) underlies the approval.

If the bot operates upon additional rules (such as lists of regular expressions applied in a particular decision-making process) that are not publicly visible, the bot operator should make these available to any BAG member or administrator upon request. The operator should also exercise their best judgment when making alterations to these rules, even more so if the alterations can significantly change the bot's behavior.

Administrators running unapproved experimental administrative bots (for example during the development phase) should "babysit" the bots and terminate them at the first sign of incorrect behavior. Administrators will be responsible for the behavior of robots that are allowed to run wild.

Administrators are allowed to run semi-automated tools (assisted use of administrative tools) but will be held responsible if those tools go awry.

If an administrator responsible for one or more adminbots is desysopped, their bots should be immediately desysopped at the same time (except if the administrator voluntarily stepped down in uncontroversial circumstances).

Bot Approvals Group

Members of the group are experienced in writing and running bots, have programming experience, understand the role of the BAG in the BRFA process, and understand Wikipedia's bot policy. Those interested in joining the group should make a post to the talk page explaining why they would be a good member of the team and outlining past experience, and then should make posts to WP:AN, WT:RFA, WP:VPM, WT:BOT, and WP:BON. After seven days, an uninvolved bureaucrat will close the discussion.

Dealing with issues

If you have noticed a problem with a bot, or have a complaint or suggestion to make, you should contact the bot operator. If the bot is causing a significant problem, and you feel that more urgent discussion is necessary, you may also wish to leave a message at Wikipedia:Administrators' noticeboard and/or Wikipedia:Bot owners' noticeboard, indicating where you have notified the bot operator.

Administrators may block bot accounts that operate without approval, operate in a manner not specified in their approval request, or operate counter to the terms of their approval or bot usage policy. A block may also be issued if a bot process operates without being logged in to an account, or is logged in to an account other than its own. Bots which are known to edit while logged out should have AssertEdit, or a similar function, added to them.

Bots operated by multiple users

Accounts used for approved bots that can make edits of a specific designated type, at the direction of more than one person, are not likely to be a problem, provided:

  1. operator disclosure – the Wikipedia user directing any given edit will always be identifiable, typically by being linked in the edit summary;
  2. operator verification – users able to direct the bot to make edits must be positively identified to the bot at the time of edit, in some manner that is unique to that user and cannot readily be faked, bypassed, or avoided (e.g. non-trivial password, restricted IP, wiki login, IRC hostname), so that the user directing any given edit and identified above may be considered verified; and
  3. operator trust – if anyone other than the bot creator is likely to operate the bot, then measures must be outlined to reassure Bot Approvals Group members that those operators will have the requisite skill and knowledge to operate the bot to an appropriate standard.
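The operator disclosure condition above can be illustrated with a short Python sketch. The helper name and the summary wording are hypothetical; the only point taken from the text is that the directing user is linked in the edit summary. Verifying that user's identity is a separate step and is not shown.

```python
def disclosed_summary(task, directing_user):
    """Build an edit summary that links the user who directed the edit."""
    user = directing_user.strip()
    if not user:
        # Refuse to build a summary that would hide who requested the edit.
        raise ValueError("an edit must name the user who directed it")
    return f"{task} (requested by [[User:{user}|{user}]])"
```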

Assisted editing guidelines

"Assisted editing" covers the use of tools which assist with repetitive tasks, but do not alter Wikipedia's content without some human interaction. Examples of this include correcting typographical errors, fixing links to disambiguation pages, reverting vandalism, and stub sorting.

While such contributions are not usually considered to constitute use of a bot, if there is any doubt, you should make an approval request; see Approval above. In such cases, the Bot Approvals Group will determine whether the full approval process and a separate bot account are necessary. In general, processes that are operated at higher speeds, with a high volume of edits, or are more automated, may be more likely to be treated as bots for these purposes.

Contributors intending to make a large number of assisted edits are advised to first ensure that there is a clear consensus that such edits are desired. They may wish to create a separate user account in order to do so; such accounts should adhere to the policy on multiple accounts. Contributors using assisted editing tools may wish to indicate this, if it is not already clear, in edit summaries and/or on the user page or user discussion page of the account making the contributions.

Authors of assisted editing tools are permitted to create their own approval mechanism for that tool; if bot approval is required for use of the tool, this is in addition to, not instead of, the normal approval request process. AutoWikiBrowser is an example of a tool with such a mechanism. Release of the source code for assisted editing tools is, as with bots, encouraged but not required.

Note that any large-scale automated or semi-automated article creation task requires a BRFA.

User scripts

The majority of user scripts are intended to merely improve, enhance, or personalize the existing MediaWiki interface, or to simplify access to commonly used functions for editors. Scripts of this kind do not normally require BAG approval.

See also

  • Wikibot auto links 'wikitagged' words in Joomla! contents with Wikipedia.