Wikipedia:WikiProject Spam

WikiProject Spam engaging spammers off the coast of Wikipedia.

As Wikipedia grows in popularity, the temptation to misuse its editability to bring attention to other websites becomes nearly unbearable. At one end of the spectrum are professional spammers seeking to drive traffic to commercial sites. At the other end are webmasters of simple community sites who want to get more attention for their sites. This potential for self-promotion on Wikipedia must be managed—Wikipedia is not a "repository for links" or a "vehicle for advertising". Wikipedia exists for the purpose of creating a collaboratively edited encyclopedia, not for any individual to promote a site in which they have an interest.

Those promoting sites by linking to them from Wikipedia formerly saw major search engine optimization (SEO) benefits, due to Wikipedia's popularity. The ability to promote a site's appearance in search engine results was considered too great an incentive for people to add extraneous links to articles. So in February 2007, the English Wikipedia instituted a policy that tags external links "NOFOLLOW."[1] This means that major search engines like Google no longer count these links when ranking the linked sites. Many web site operators still seek to use Wikipedia to increase the number of inbound links to their sites, some out of ignorance of SEO functionality or of this policy change, others because they simply hope to draw individual readers to their sites through direct Wikipedia traffic.

Currently link spammers enjoy many advantages from the lack of cohesion in the spam-fighting process. It is possible to sneak links into relatively unwatched articles successfully. Such links may lie unexamined for months, gaining the appearance of legitimacy from having remained in an article so long. When spam links are reverted, there is not much communication. Spammers can return and add links when different editors are watching who do not know their history of editing-with-an-agenda. Frequently spam contributors take advantage of Wikipedia's Assume good faith policy. They may engage in straw-man or special pleading arguments for inclusion of their links under the guise that they have only the welfare of Wikipedia at heart, usually in the presence of evidence to the contrary.

WikiProject Spam is a voluntary spam-fighting brigade. Our purpose is: to develop standards and processes for recognizing, hunting down, and eliminating link spam; to streamline communication between those who want to watch over articles to prevent it; and to send a message by our actions and effectiveness that link spammers are fighting a war they cannot win.

If you would like to participate, we encourage you to add your name to the sign-up list. We encourage you to join in editing this page so we can grow toward consensus about the best way to fight link spam. You are welcome to relate any of your own current ongoing efforts to fight link spam on the talk page so that in the immediate future we can be aware of users that are acting with an agenda to promote an external site.

Removal how-to

There are a variety of facets for dealing with inappropriate links. This guide breaks the process into a number of steps. Most editors will want to complete the first step. Editors interested in doing a more thorough job should follow through with additional measures.

  1. Revert and warn the user: when new spam appears on your watchlist, the easiest way to remove it is to select the diff link, select the edit link above the left-hand column (the older revision), enter an appropriate edit summary, preview the changes and finally save the page. If you come upon an article with spam, check the recent diffs (the last links) in the page history to see if it was added recently in a way that damaged the article; revert the changes to restore the article to its previous state. It is important to warn the user, which will likely stop the spamming or establish a history of problematic edits. To warn the user, go back to your watchlist or page history, select the Talk link associated with the editor and add {{subst:uw-spam1}} ~~~~ to that page. If the user already has a spam warning, add {{subst:uw-spam2}} ~~~~; if two warnings, {{subst:uw-spam3}} ~~~~; if three warnings, {{subst:uw-spam4}} ~~~~. At this point the task is done, but to see if the user added the same link to other articles, go on to step 2.
  2. Check the user's contributions: a user will often add the same link to multiple articles. This is often confirmation that the user is not editing in good faith. To check for this type of activity, select the contribs (or for anonymous users the IP address) link from your watchlist or an article's history. This shows all the other edits the user recently made, and selecting the diff link shows if the same link has been added to other articles. If inappropriate links are found, revert as in step one, but the user only needs to be warned once unless they have spammed since the last warning.
  3. Check for similar links: a crafty spammer hides spamming by using multiple accounts. This step involves finding all of the articles that contain a link to a particular site. If a link to www.example.com were discovered and removed in steps one and two, the next step is to use the External links command to find all articles that contain such links. One may enter www.example.com in the search box, but consider entering *.example.com because this will find not only www.example.com but also ads.example.com and any other domain that might have been used. (The External links command is found in the Special pages list, which has a link from the toolbox of every page.) An archived inventory of over 1000 problematic domains can be found at this project's LinkSearch page.
  4. Identifying the spammer: the process of finding links in step three reveals which articles they are in but does not indicate which editor added them. To find out, go to the article history and expand it to 500 entries. Check to see if the link is present in the last revision in the list. If so, select the previous 500 changes and check again, repeating the process to find the subset of changes where the link appeared. To find the exact edit where the link was added, check the version in the middle of the 500 entries. If the link is not present there, then it was added in the edits above; otherwise it was added in the edits below. Check the middle of the appropriate half in the same manner. By using this divide-and-conquer method the exact point of insertion can be quickly found. Often the edit summary includes the words External links, which can help pinpoint the edit. Once the edit is found, go back to step one and start cleaning up after this editor.
  5. Persistent spammers: if an active spammer continues adding links after a {{subst:uw-spam4}} warning, report the user to the administrators at the intervention against vandalism page. If blocks fail to stop the problem, or if the spammer is using multiple accounts and IPs, the case can be reported to the Spam blacklist.
  6. Document the target URL: post an example of the link you have been removing on this project's talk page. If the site is spammed again, any future spam fighters who try a link search will realize that it is a repeat offense, and further action may be needed.
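
The divide-and-conquer search described in step 4 is a binary search over the page history. The following is only a minimal sketch of that idea: it assumes the revision texts have already been fetched into a list ordered oldest to newest, and that the link, once added, stays in every later revision. The function name and the `contains_link` callback are hypothetical and are not part of any Wikipedia tool.

```python
from typing import Callable, Optional, Sequence


def first_revision_with_link(
    revisions: Sequence[str],
    contains_link: Callable[[str], bool],
) -> Optional[int]:
    """Return the index of the earliest revision that contains the link.

    `revisions` is ordered oldest to newest. Assumption: once the link
    was added it was never removed and re-added in between.
    """
    if not revisions or not contains_link(revisions[-1]):
        return None  # the newest revision does not contain the link
    lo, hi = 0, len(revisions) - 1  # invariant: revisions[hi] has the link
    while lo < hi:
        mid = (lo + hi) // 2
        if contains_link(revisions[mid]):
            hi = mid      # link was added at mid or earlier
        else:
            lo = mid + 1  # link was added after mid
    return lo


# Illustrative history, oldest revision first (made-up text).
history = [
    "Intro text.",
    "Intro text. More detail.",
    "Intro text. More detail. http://www.example.com",
    "Intro text. More detail. http://www.example.com Even more.",
]
print(first_revision_with_link(history, lambda text: "example.com" in text))
```

Each check halves the remaining range, so a 500-entry history needs at most about nine comparisons. If the link was removed and re-added at some point, the assumption breaks and the result should be confirmed by viewing the diff.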

See also the to-do list.

Standards

The number one rule for Project members is this code of honor: "I will never insert links to my own sites into Wikipedia's article space." Not only is Conflict of interest a guideline that is generally accepted among editors, but many of us who run websites are too committed to their success (however we define it) to judge impartially whether or not they belong in an article. Moreover, we are actively reverting self-promotion linking by other editors, some of whom view the addition of their links as sincere attempts to service various communities. It is easier to gain the respect of these people if we hold ourselves to the highest possible standard and avoid any appearance of double-standards or hypocrisy.

Tag 'em to stop 'em

Spam edits automatically deserve a {{subst:uw-spam1}} tag on the user's talk page. This is important: first, to drive home the message that spam is not welcome here, and second, to alert us to repeat offenders. Successive violations of the spam policy can be met with talk page additions of:

  • {{subst:uw-spam2}}
  • {{subst:uw-spam3}}
  • {{subst:uw-spam4}}

Note: When tagging spammers, it will help if you leave a "linksummary" template or URL with the warning, or formulate your own, for example:

==Additions of http://.spammydomain.com ==
{{subst:uw-spam1}}--~~~~
or
== Addition(s) ==
*{{LinkSummaryLive|spammydomain.com}}
{{subst:uw-spam1}}--~~~~

Leaving a link or template can help find IPs and accounts months later. If a violation occurs after the fourth warning, you should report the offending user at the Administrator intervention against vandalism page. If there is ongoing abuse over a period of time and you think a site should be blacklisted, report it. Placing the warning tag does not take much more effort than removing the spam itself, and can really help the effort to prevent the spam from returning.
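
The escalation from {{subst:uw-spam1}} through {{subst:uw-spam4}}, followed by a report, can be written as a small lookup. This is an illustrative sketch only: the function name is hypothetical, and counting the warnings already on a user's talk page is assumed to be done by hand.

```python
from typing import Optional


def next_spam_warning(previous_warnings: int) -> Optional[str]:
    """Return the wikitext to add to the user's talk page.

    Returns None once four warnings have already been given, meaning the
    user should be reported instead of warned again.
    """
    if previous_warnings < 0:
        raise ValueError("previous_warnings must not be negative")
    if previous_warnings >= 4:
        return None  # report at Administrator intervention against vandalism
    level = previous_warnings + 1
    # %-formatting keeps the literal {{ }} braces of the template intact.
    return "{{subst:uw-spam%d}} ~~~~" % level


for count in range(5):
    print(count, next_spam_warning(count))
```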

Further templates: Wikipedia:WikiProject_Spam#Templates

How to identify spam and spammers

  1. User is anonymous (an IP address)
  2. User:page and/or User_talk:page are red links
  3. No edit summary (other than, perhaps /* External links */)
  4. User has made only one edit, which consisted of inserting a link
  5. User has made multiple edits to related articles
  6. The majority of user's edits are to external links sections
  7. The link is a site that has Google/Yahoo ads (AdSense/SM).
  8. Edits are marked "minor"
  9. Link is trying to sell a product or service. You can use Microsoft's Detecting Online Commercial Intention Tool to help you with the determination.
  10. User adds links to the top of a section, above far more relevant sites
  11. User replaces an existing link or part of an existing link.
  12. The syntax of the added link does not match the syntax used in the rest of the list
  13. User adds links to inappropriate sections of articles ("References", "See also", "For more information")
  14. User adds links that have been previously removed, without discussing on the talk page.
  15. Following a link takes you to a site that does not mention the specific topic of the page containing the link.
  16. Link is unrelated, or only marginally related to the article. For example, link on a biography to a specific page on a genealogy site describing the person's genealogy, but not the person.
  17. User adds links to other Wikipedia articles where he/she has already placed spam links.
  18. User includes within the link description, "hosted on example.com" with a separate link to example.com.
  19. Link is mangled, or it took many edits to get the syntax right. The spammer may be new to Wikipedia and not be familiar with Wikipedia syntax for external links.
  20. Text of the link goes beyond describing the contents to actively encouraging you to read it. For example, including text such as, "Read more about [subject] in [this fascinating article]"
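
A few of the indicators above are mechanical enough to check automatically before a human looks at the edit. The sketch below covers only items 1, 2, 3, 4 and 8; the `EditInfo` fields and the function name are hypothetical, and no scoring threshold is implied, since the list gives none. The result is a list of matched indicators for an editor to review, not a verdict.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class EditInfo:
    """Hypothetical summary of one edit, filled in by whoever reviews it."""
    is_anonymous: bool
    user_page_is_red_link: bool
    edit_summary: str
    marked_minor: bool
    user_edit_count: int
    added_external_link: bool


def matched_indicators(edit: EditInfo) -> List[str]:
    """Return the checklist items (by description) that this edit matches."""
    hits = []
    if edit.is_anonymous:
        hits.append("user is anonymous (an IP address)")
    if edit.user_page_is_red_link:
        hits.append("user page is a red link")
    if edit.edit_summary.strip() in ("", "/* External links */"):
        hits.append("no edit summary")
    if edit.user_edit_count == 1 and edit.added_external_link:
        hits.append("only one edit, which inserted a link")
    if edit.marked_minor:
        hits.append("edit is marked minor")
    return hits


suspicious = EditInfo(True, True, "/* External links */", True, 1, True)
for reason in matched_indicators(suspicious):
    print(reason)
```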

Common spammer strawmen

The nature of Wikipedia means that one cannot make a convincing argument based on whether other links in articles do or do not exist, because anyone can add any link to any article. Plenty of links exist that probably should not, and many links do not exist that probably should. So pointing out that some other link exists is not relevant to whether the link in question should also exist.

Spammers will often offer arguments in which an irrelevant topic is presented in order to divert attention from the original spam issue. These are some characteristic strawman arguments:

"But you have links to commercial sites in the list."
Spamming is about promoting your own site or a site you love, not about commercial sites at all. Links to commercial sites are often appropriate. Links to sites for the purpose of using Wikipedia to promote your site are not.
"But you have links to other sites that people have added for self-promotion."
Those probably need to go, too. The fact that we have not gotten around to it, yet, does not mean that we have some obligation to have your site.
"But you have a link to site Y, and my site is just like that."
We do not need to link to every site in existence that meets a certain criterion. Sometimes we just need one site representative of a category. (See also the comments about linking to web directories instead, so that Wikipedia does not become a web directory.)
"But these links have been here for a long time."
There are no binding decisions on Wikipedia, especially when the decision was never discussed on the talk page. Just because nobody noticed your spam a long time ago does not mean you now have a "right" to keep it in.
"My link is useful."
It is more likely that the link added has no more useful information than the Wikipedia article itself.
"My site is non-commercial, so it is not spamming" (or, my site is nonprofit, or charitable, or opposes cruelty to puppies, etc).
It does not matter – being noncommercial (etc.) does not confer a license to spam even when it is true, and these sites are often trying to sell something even if the business is organized as a nonprofit.

Note also how many times "my site" (or "my link") appears in the above examples.

Assuming good faith

Assuming good faith is an important policy of Wikipedia, but does not require that you assume good intentions when there is evidence to the contrary. Link spamming behavior fits a definite profile. When editors meet this profile, they are engaging in activity which is detrimental to Wikipedia, no matter how sincere they may have been in their edits. We should develop responses to those who engage in this behavior which encourage them to reform into productive Wikipedians, but we should waste no time in protecting Wikipedia from the damaging behavior through reverts and blocks where necessary.

Regular clean-out of undiscussed links

In some articles, several editors go in every few days and remove any undiscussed external links. Call it quick and easy "house cleaning." To encourage sincere links, they leave this edit summary:

Regular clean-out of undiscussed links. Please come to Talk page if you want a link not to be cleaned out regularly.

One could easily start this strategy in any article by adding {{subst:Discuss links here}}~~~~ to its talk page. The plan is to discourage people whose sole intention is self-promotion.

Also, add commented-out warnings to the External links section of the articles themselves:

<!-- ATTENTION! Please do not add links without discussion and consensus on the talk page. Undiscussed links will be removed. -->

For this purpose the Template:NoMoreLinks has been created.

This strategy is used in the following articles:

This strategy is also helpful to deal with POV and conspiracy links:

What to do with linkfarms

"... consider a sidewalk. Some litter accumulates. Soon, more litter accumulates. Eventually, people even start leaving bags of trash from take-out restaurants there or breaking into cars."

Fixing Broken Windows by George L. Kelling and Catherine Coles 1996

A successful strategy for preventing linkfarms is to fix the problems when they are small;

  1. Open Directory Project ({{dmoz}}) → {{Dmoz|Path/To/Category|Category's name}}
  2. Yahoo! Directory ({{yahoo directory}}) → {{Yahoo directory|Path/To/Category/|Category Name}}
Sometimes sections of articles reach the Spam Event Horizon (WP:SPAMHOLE) where the number of links is so large that no realistic attempt can be made to assess relevance or the need for further inclusion. Excessively long "link lists", or an External links section broken down into 'subsections', are often a clear indication of a Spam Event Horizon.
Technology articles are common targets. Select "Related changes" to view recent edits within these specific categories:
  • The References section is for references. The References section of a Wikipedia article is not a list of related works or links; it is specifically the list of works that have been used as sources in the article. Reference sections in articles often become problem areas that can attract spam. Common forms of spam can be WP:LINKSPAM, WP:BOOKSPAM or WP:REFSPAM. Therefore, it is never correct for links or "faux-references" to reside in these sections if article content is not actually referring to them.
Note: Similarly, (===Notes===) sections, (===Further reading===) sections and (===See also===) sections, should be scrutinized for Book and link spam that have not been added for verifying article content.

Promotional sub-pages

"Promotion" does not always mean commercial promotion: anything can be promoted, including a person, a non-commercial organisation, a point of view, etc. User sub-pages and sandboxes are increasingly being used by promotional accounts as a dumping ground / web-host for spam articles, vanity pieces and other promotional content not suitable for encyclopedic inclusion. Many of these SPA promotional accounts are also using articles for creation and Article Incubator for the same purpose.

Speedy delete any advertising or promotion per WP:CSD#G11. Simply place the following tag on the offending page:
{{subst:db-spam}}
Spam blanking is undertaken on pages where the content is, for one reason or another, inappropriate advertising, promotional material or other spam; the content is replaced with {{subst:spam blanked}}. Spam blanking does not entail any revision deletion (see Wikipedia:Deletion policy and Wikipedia:Hiding revisions). This should help address this growing problem and assist in removing from view promotional content not always located in the article space:
{{subst:spam blanked}}

The Campaign

We would want a concerted viral marketing strategy involving

and a dash of mentions in help pages, FAQs and fixup templates.

Guidelines, policies, and essays

Policy
Guidelines
Essays

Templates

{{subst:spam}} and related

These templates should be substituted ({{subst:Uw-spam}}, etc) as per WP:SUBST.

{{Cleanup-spam}}

{{Cleanup-spam}} is for articles or sections with excessive spam. See Wikipedia:Spam for more details.

{{Spam blanked}}

{{subst:spam blanked}} is used on pages where the content is, for one reason or another, inappropriate advertising, promotional material or other spam, not always located in the article space.

{{subst:WPSPAM-invite}}

Saw someone revert or remove linkspam? Invite the comrade here with {{subst:WPSPAM-invite}} placed on their User talk page.

A souped-up alternative: {{subst:WPSPAM-invite-n}}.

Standardised edit summary

HorsePunchKid suggests a standardised edit summary to raise awareness both of the problem and this particular effort:

Removed link spam. Wikipedia is [[WP:NOT|NOT]] a link directory. Join [[Wikipedia:WikiProject Spam]] to help!

Perfecto uses the following:

Removed [[WP:EL|external link]] [[WP:SPAM|spam]]. ([[WP:WPSPAM|You can help!]])

--Aude suggests:

Removed [[WP:EL|external link]] added by [[User talk:69.159.82.252|69.159.82.252]]. Wikipedia is [[WP:NOT|NOT]] a link directory. ([[WP:WPSPAM|WikiProject Spam]])
Substitute the IP address/user name as appropriate.
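
Substituting the user name by hand is easy to get wrong when reverting many edits, so the summary can be generated instead. This is a small illustrative helper, not an existing tool; the function name is hypothetical, and the output uses a single pipe in each wikilink.

```python
def removal_summary(user: str) -> str:
    """Build the edit summary for a removed link, naming who added it.

    `user` is the IP address or user name exactly as shown in the history.
    """
    return (
        "Removed [[WP:EL|external link]] added by "
        f"[[User talk:{user}|{user}]]. "
        "Wikipedia is [[WP:NOT|NOT]] a link directory. "
        "([[WP:WPSPAM|WikiProject Spam]])"
    )


print(removal_summary("69.159.82.252"))
```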

TheJabberwʘck suggests (for users of popups):

Reverted [[WP:EL|external link]] addition by [[Special:Contributions/<user>|<user>]] to version %s, using [[:en:Wikipedia:Tools/Navigation_popups|popups]]. Wikipedia is [[WP:NOT|NOT]] a link directory. ([[WP:WPSPAM|you can help!]])

These edit summaries help drive a concerted viral marketing strategy.

Recognition

The Spamstar of Glory

The coveted Spamstar of Glory is awarded to those who show strong contributions to tracking down and stopping spammers as well as cleaning up their links. Introduced on November 8, 2006 by A. B., it originally consisted of a nicely Photoshopped can of Hormel spam superimposed on a barnstar. Later, due to concerns about infringing on Hormel's trademark, the award was changed to the current design, adapted from the RickK Anti-Vandalism Barnstar.

The Spamstar of Glory
Presented to {{{1}}} for diligence in fighting spam on Wikipedia


The Anti-Spam Barnstar

The Anti-Spam Barnstar
Presented to {{{1}}} for diligence in fighting spam on Wikipedia

Participants

See the list of participants. You can sign up and help us fight spam on Wikipedia!
As of May 2012 we have over 380 participants.

Userbox

Participants may add this to their userpage instead of signing up.

Code: {{User WikiProject Spam}}
Results in: This user is a member of WikiProject Spam.

If you prefer not to use userboxes, you may add yourself directly to Category:WikiProject Spam members by placing the following code on your Userpage: [[Category:WikiProject Spam members|{{PAGENAME}}]]

Tools

  • Search tools
  • To combat repeat offenders, you may request to have links added to the local spam blacklist (for links that have only been spammed on the English Wikipedia) or the Wikimedia Global spam blacklist (for links that have been spammed on more than one Wikimedia Foundation project).
  • For links that are generally used in an inappropriate way by unestablished users, but which do not qualify for the meta or local spam blacklists, ask User:XLinkBot to monitor it with a request at User talk:XLinkBot/RevertList.
  • Watch the link addition feed in #wikipedia-en-spam. There is a bot on there that reports all newly added links and keeps track of serial spammers. For more info see /IRC Channels.
  • If no one is around to add something to the spam blacklist, contact users at #wikipedia-spam-t on freenode.
  • Link spamming edit filter for repeated addition of external links by non-autoconfirmed users.