
The bot

This bot runs on meta under the account m:User:COIBot and here under this account. Its main tasks are to detect and report issues related to conflicts of interest (COI):

  • Report link additions where the account name of the editor has a significant overlap with the domain of the added link. For link additions, the IP of the added link (obtained via a DNS lookup) is also compared with the IP of anonymous users (the IP of registered users is not known). If the IP of the anonymous user is close to the IP of the URL, the link addition is reported. Overlap is also calculated from text associated with a username and from the IP ranges that can be traced back to a certain link (both via a blacklist).
  • Report edits by users whose username is very similar to the name of the page edited.
  • Monitor some link additions which are currently being (COI-)spammed or pushed.
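
The IP comparison in the first item can be sketched as follows. This is a minimal illustration in Python, not the bot's Perl code. The page does not define "close", so the sketch assumes it means that both addresses share a network prefix; the function name `ips_are_close` and the default prefix length of 24 bits are hypothetical. The DNS lookup is left out, and the already resolved address of the link is passed in directly.

```python
import ipaddress


def ips_are_close(editor_ip: str, link_ip: str, prefix_len: int = 24) -> bool:
    """Return True if both addresses fall in the same /prefix_len network.

    Assumption: "close" means "same network prefix". The criterion the bot
    really uses is not documented on this page.
    """
    editor = ipaddress.ip_address(editor_ip)
    link = ipaddress.ip_address(link_ip)
    if editor.version != link.version:
        # An IPv4 address is never treated as close to an IPv6 address.
        return False
    # strict=False allows building the network from a host address.
    network = ipaddress.ip_network(f"{editor}/{prefix_len}", strict=False)
    return link in network
```

With the documentation addresses 192.0.2.10 and 192.0.2.200, the check succeeds for the default prefix, while 192.0.2.10 and 198.51.100.7 are not considered close.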

Accidental overlaps can be whitelisted (also see interpretation).

What is watched

COIBot currently listens and reports to the following IRC channels:

  • #wikipedia-en-spam - all COI and link addition reports (reads the English link addition feed from here)
  • #wikimedia-swmt - all non-English reports
  • #wikimedia-swmt-spam - all COI and link addition reports not specific to en.wikipedia (reads the non-English link addition feed from here)
  • #wikipedia-spam-t - main command channel, certain en-specific reports
  • #wikipedia-spam-stats - used for some statistics and commands

By default, COIBot listens on IRC to:

  • #en.wikipedia,
  • #de.wikipedia,
  • #fr.wikipedia,
  • #it.wikipedia,
  • #nl.wikipedia,
  • #pl.wikipedia,
  • #pt.wikipedia,
  • #es.wikipedia,
  • #no.wikipedia,
  • #ru.wikipedia,
  • #fi.wikipedia,
  • #sv.wikipedia.

In these channels COIBot watches for page edits. Channels can be added or removed while COIBot is running.

What is reported, and where

All edits pertaining to this Wikipedia are reported here; everything is reported to COIBot's account on the other wiki. Specific user reports and link reports are saved on both wikis and in both cases contain all reports.


Whitelist

Items on the whitelist make COIBot ignore the complete edit: if the pair 'example' <-> '' were on the whitelist, COIBot would not report user 'example' adding '' to a page (which would normally result in an overlap of 70%, well above the threshold). Users can also be whitelisted completely, in which case they are never reported. Links that are whitelisted completely are still reported, but such links are never automonitored (see monitor list, below).
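
The three kinds of whitelisting described above can be sketched as follows. This is a minimal illustration in Python, not the bot's own code: the class `Whitelist`, its fields, and its method names are hypothetical, and 'example.org' in the usage note merely stands in for a link, since the link in the original example is not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Whitelist:
    # All three containers are hypothetical stand-ins for the bot's storage.
    pairs: set = field(default_factory=set)  # (username, link) combinations
    users: set = field(default_factory=set)  # users whitelisted completely
    links: set = field(default_factory=set)  # links whitelisted completely

    def is_ignored(self, user: str, link: str) -> bool:
        """A whitelisted pair or a whitelisted user suppresses the report.

        A completely whitelisted link does not: it is still reported.
        """
        return (user, link) in self.pairs or user in self.users

    def may_automonitor(self, link: str) -> bool:
        """Completely whitelisted links are never automonitored."""
        return link not in self.links
```

For instance, with the pair ('example', 'example.org') whitelisted, `is_ignored("example", "example.org")` is true, while the same link added by a different user would still be considered for reporting.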

Please understand that whitelisting means that your username is whitelisted on all monitored wikis. While you may not have a conflict of interest on this wiki, another user with the same name on another wiki may have one. It may therefore be undesirable to whitelist certain usernames.

If you believe your name is wrongly on a report, please a) remove yourself from the reports (preferably using <s> and </s> tags and providing a clear edit summary), and b) notify Dirk Beetstra or a regular on Wikipedia talk:WikiProject Spam to request whitelisting. Please note that the link reports (under Wikipedia:WikiProject Spam/LinkReports) are generated automatically by COIBot, and may be regenerated before whitelisting.


Blacklist

COIBot has a table in which usernames are linked to keywords. This makes it possible to check whether certain accounts add, for example, a certain URL (when a suspected or known conflict of interest exists). For example, the blacklist rule 'COIBot' <-> 'example' would give the following two results when user COIBot adds the link '':

The second case has a ratio higher than the threshold, so COIBot would be reported.

The reverse is also checked, so '' can be linked to the keyword 'COI' or to IP-ranges, which makes it possible to find sock-puppets or check for additions by certain IP ranges.
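
One way to picture the rule table is as a two-way lookup, sketched below in Python. The class `KeywordRules` and its methods are hypothetical and not taken from the bot's code. The sketch assumes that each rule simply adds one more string to compare against an added link, which would be consistent with the "two results" mentioned above (one for the username, one for the keyword); the overlap scoring itself is not shown.

```python
from collections import defaultdict


class KeywordRules:
    """Two-way table of username <-> keyword rules (hypothetical structure)."""

    def __init__(self):
        self.by_user = defaultdict(set)
        self.by_keyword = defaultdict(set)

    def add(self, user: str, keyword: str) -> None:
        # Store the rule in both directions so the reverse can be checked too.
        self.by_user[user].add(keyword)
        self.by_keyword[keyword].add(user)

    def strings_to_compare(self, user: str) -> list:
        """The username itself, followed by every keyword linked to it."""
        return [user] + sorted(self.by_user.get(user, ()))

    def users_for(self, keyword: str) -> list:
        """Reverse lookup: all usernames linked to a keyword."""
        return sorted(self.by_keyword.get(keyword, ()))
```

After `rules.add("COIBot", "example")`, the strings compared for user COIBot are the username and the keyword 'example', and the reverse lookup for 'example' returns COIBot.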

Monitor list

COIBot records additions of URLs that are on the monitor list, except when the user is whitelisted or when the user is already reported via the blacklist or via overlap between username and domain name. This functionality is used to find IP ranges or sock-puppet accounts that add certain domains where the full scope of the involved accounts is not (yet) clear. It may produce numerous 'false positives' for domains which, besides being spammed or pushed by certain accounts, are also used legitimately, e.g. as references.

Adding a link that has a large overlap with the username of the user adding it results in the link being put on the monitor list automatically. COIBot also monitors WT:WPSPAM and WP:COIN for reported links, as well as the spam blacklists on the Wikipedias it monitors.

All items on the monitor list are interpreted as a regular expression.
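
Because entries are regular expressions, characters such as '.' carry their regex meaning. The following Python sketch illustrates the consequence; the entries in `MONITOR_LIST` and the function name are hypothetical, and whether the bot matches anywhere in the URL or is case-sensitive is not stated here, so `re.search` with default flags is an assumption.

```python
import re

# Hypothetical monitor-list entries; each is treated as a regular expression.
MONITOR_LIST = [r"example\.com", r"spam[0-9]+\.example\.org"]


def monitored_patterns(url: str, patterns=MONITOR_LIST) -> list:
    """Return the monitor-list entries whose regex matches anywhere in url."""
    return [p for p in patterns if re.search(p, url)]
```

An unescaped entry such as `example.com` would also match `examplexcom`, because '.' matches any character; escaping the dot, as in the entries above, avoids that.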

If your name appears on the reports for a monitored link, it does not mean that you have a conflict of interest or that you were spamming, only that there may be, or may have been, issues with that particular link. More information (monitoring or blacklisting reasons) can be found in the header of the specific reports on that link (see Wikipedia:WikiProject Spam/LinkReports for a list of generated link reports).



Interpretation

Care should be taken when interpreting the data provided by COIBot. The bot has a mechanism which matches the username against the domain added or the page edited, and reports significant overlap (its standard setting is to report all cases with more than 25% overlap). Currently, the reports show that more than 95% of the reported cases are 'correct' in the sense that the username does indeed have a large overlap with the page name or URL.

Some points of attention:

1. Editors with short usernames editing articles with short names easily exceed the 25% threshold since single characters have a high weight in short names:

<COIBot> TEST: en:User:zxv/en:Special:Contributions/zxv scores 90% (U->T) and 60% (T-U) (ratio 54%) on string zyxwv

2. An overlap does not necessarily mean that the editor has a conflict of interest. Example:

<COIBot> TEST: en:User:chocolatefan/en:Special:Contributions/chocolatefan scores 75% (U->T) and 47.36% (T-U) (ratio 35.52%) on string chocolate_chip_cookie
Of course a ChocolateFan does not have a conflict of interest when adding important information to chocolate chip cookies.

Therefore, all results should be, and will be, manually checked against the policies and guidelines. When wrong reports occur too often, these combinations can be whitelisted.
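
In both sample lines above, the reported ratio equals the product of the two directional scores (90% × 60% = 54%, and 75% × 47.36% = 35.52%). The Python sketch below encodes that observation together with the 25% threshold. It is inferred from the two examples only, not from the bot's code; the function names are hypothetical, and how the directional scores themselves are computed is not covered.

```python
THRESHOLD = 25.0  # percent; the standard setting quoted above


def overlap_ratio(user_to_title: float, title_to_user: float) -> float:
    """Combine two directional scores (in percent) into a ratio (in percent).

    Assumption: the ratio is the product of the two scores, as the two
    sample lines suggest.
    """
    return user_to_title * title_to_user / 100.0


def is_reported(user_to_title: float, title_to_user: float,
                threshold: float = THRESHOLD) -> bool:
    """Report only when the combined ratio exceeds the threshold."""
    return overlap_ratio(user_to_title, title_to_user) > threshold
```

Under this reading, a short username can reach a high ratio with only a few matching characters, which is why both examples exceed the threshold even though neither necessarily indicates a conflict of interest.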


The bot is written in Perl, originally based on the code of user:shadowbot (though the overlap is now only the basic IRC-read and mediawiki-edit mechanism). It uses perlwikipedia, a module to read/write mediawiki pages. A recent example of the code can be found on m:User:COIBot/COIBot.


The Technology Barnstar
This Barnstar is awarded to COIBot for identifying conflicts of interest on Wikipedia! --Hu12 12:26, 28 July 2007 (UTC)

The Spamstar of Glory
Is presented to COIBot for automatically creating comprehensive spam reports which help users deal with spam. --Otterathome (talk) 19:52, 18 March 2008 (UTC)