MediaWiki talk:Spam-blacklist

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by Beetstra (talk | contribs) at 06:00, 11 December 2013 (→‎bodybuilding.com: by the way, WP:AGF). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

    Mediawiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins

    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though it only supports whole domains and does not support whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages).
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regex — the disruption that can be caused is substantial.
    5. Close the request entry on here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number - 585551605 after you have closed the request. See here for more info on logging.
    snippet for logging: {{/request|585551605#section_name}}
    snippet for logging of WikiProject Spam items: {{WPSPAM|585551605#section_name}}
    A user-gadget for handling additions to and removals from the spam-blacklist is available at User:Beetstra/Gadget-Spam-blacklist-Handler


    Proposed additions


    marketsandmarkets.com

    marketsandmarkets.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com --Dennis Bratland (talk) 17:37, 21 November 2013 (UTC)[reply]

    directory.tradeford.com

    directory.tradeford.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com rybec 18:52, 19 September 2013 (UTC)[reply]

    riocodes.com

    riocodes.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com rybec 18:41, 19 September 2013 (UTC)[reply]

    suzukicycles.org

    suzukicycles.org: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com Frequently cited, frequently plagiarized. Trove of copyrighted photos, books and text violates WP:COPYLINK and WP:SPS --Dennis Bratland (talk) 16:09, 17 September 2013 (UTC)[reply]

    Support blacklisting. Werieth (talk) 19:37, 27 September 2013 (UTC)[reply]

    elitetraveler.com

    Spammers

    Three spamruns from this website that I have noticed. The Banner talk 21:25, 13 August 2013 (UTC)[reply]

    Four spamruns (a small one the last time). The Banner talk 18:36, 2 September 2013 (UTC)[reply]
    • I have no opinion on the editors above, but I do want to report that the magazine itself qualifies as a reliable source according to Wikipedia standards. WP:RS. Like all glossy magazines IMHO, they do pander to their advertisers to some degree, but it is a real print magazine with reporters, editors and fact checkers.--Nixie9 23:26, 3 September 2013 (UTC)[reply]
      • And this and [1] are how they promoted themselves on WP. How it survived multiple nominations for deletion because of SPAM/promo is a mystery to me. The Banner talk 00:46, 5 September 2013 (UTC)[reply]

    Morning277 subjects

    These sites are being promoted by a publicity agency, banned from Wikipedia, which has been posting articles about them. After an article is deleted and the poster blocked, a new article with similar contents is posted from a different account, almost always under a different title. Since they keep using new accounts and new article titles, account blocking and page protection haven't been entirely effective.

    newyorkstay.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    www.youtube.com/user/newyorkstaycom: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    www.justiceforall.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    www.kulaw.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    4cabling.com.au: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    www.aasted.eu: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    www.alsbridge.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    www.awaionline.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    www.bizible.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    rybec

    www.princeton.edu/~achaney/tmve/wiki100k

    princeton.edu: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    • The site is not reliable and should not be used in article space per WP:CIRCULAR. It is a clone of Wikipedia (every real page there says "The article content of this page came from Wikipedia and is governed by CC-BY-SA."). Some Wikipedia editors may think the site is acceptable as an RS (it ranks highly on Google and the domain is .edu), but it isn't, and there should be some way to indicate that the link should not be added to Wikipedia.
    • Recent example: diff
    • Currently there are 76 links to the site, some of them from article space [2]:
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/1924_Summer_Olympics.html is linked from Albert Séguin
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/2D_computer_graphics.html is linked from 2D computer graphics
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Abba_Eban.html is linked from Abba Eban
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Adair_County,_Missouri.html is linked from Grand River (Missouri)
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Area_rule.html is linked from Sears–Haack body
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Banba.html is linked from LÉ Banba (CM11)
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Ben_Bova.html is linked from Ben Bova
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Bhavani.html is linked from Bhavani Peth
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Biface.html is linked from Hand axe
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Brightness_temperature.html is linked from Brightness temperature
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Bushyhead,_Oklahoma.html is linked from Dennis Bushyhead
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Byng,_Oklahoma.html is linked from Julian Byng, 1st Viscount Byng of Vimy
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/CDC_6600.html is linked from CDC 6600
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Camel_(band).html is linked from Camel (band)
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Camel_(band).html is linked from The Snow Goose (album)
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Critical_theory.html is linked from Critical theory
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Ctesiphon.html is linked from Iwan
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Du_hast.html is linked from Burkenburg
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Francesco_Redi.html is linked from Francesco Redi
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Jean_le_Rond_d_Alembert.html is linked from Louis-Camus Destouches
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Lagari_Hasan_%C3%87elebi.html is linked from Lagâri Hasan Çelebi
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Language_game.html is linked from Language game
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Local_Government_Areas_of_Australia.html is linked from Local Government Area
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Lord_Peter_Wimsey.html is linked from Lord Peter Wimsey
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Mohammed_Deif.html is linked from Mohammed Deif
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Mystic_Records.html is linked from Mystic Records
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Noise_weighting.html is linked from Psophometric weighting
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Phonograph_cylinder.html is linked from Early classical guitar recordings
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Pimsleur_language_learning_system.html is linked from Pimsleur method
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Pope_John_XXI.html is linked from History of Roman Catholicism in Portugal
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/QuarkXPress.html is linked from QuarkXPress
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Reconquista.html is linked from History of Roman Catholicism in Portugal
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Record_producer.html is linked from Executive producer
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Sacraments_of_the_Catholic_Church.html is linked from Catholic Church
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Simpson_s_paradox.html is linked from Edward H. Simpson
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Smokey_Robinson.html is linked from North End, Detroit
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Sunk_costs.html is linked from Sunk costs
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/The_Chemical_Brothers.html is linked from Alleyn's School
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Transport_in_Barbados.htm is linked from Transport in Barbados
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Tsui_Hark.html/ is linked from List of University of Texas at Austin alumni
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Wall.html is linked from Wall
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Warren,_Arkansas.html is linked from Warren, Arkansas
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Wis%C5%82awa_Szymborska.html is linked from Ironic precision
      • www.princeton.edu/~achaney/tmve/wiki100k/docs/Yom_Kippur_War.html is linked from United Nations Security Council Resolution 338
    PS: Actually, there are other runs of "tmve/wiki100k" on different sites (google for "tmve/wiki100k" site:wikipedia.org), e.g. http://www.sccs.swarthmore.edu/users/08/ajb/tmve/wiki100k/docs/Bavarii.html and http://www.sccs.swarthmore.edu/users/08/ajb/tmve/wiki100k/docs/Potentiometer.html, and they are not only on en-wiki (move to the meta spam list or create some filters for wiki100k?) `a5b (talk) 00:52, 25 August 2013 (UTC)[reply]
    I don't know where the links came from but that site is benign -- it's an experiment being done by a grad student at Princeton. For more information, see these web pages:
    http://www.cs.princeton.edu/~achaney/papers/ChaneyBlei2012.pdf
    I suggest emailing her at http://www.cs.princeton.edu/~achaney/email.html before any blacklisting to give her a heads up.
    Her work could be very useful to Wikipedia and the Wikimedia Foundation in the long-term.
    That said, we don't need any of these links since they circle back to our own content.
    --A. B. (talkcontribsglobal count) 16:23, 29 August 2013 (UTC)[reply]
    I suggest she get in touch with WikiProject Research
    --A. B. (talkcontribsglobal count) 16:25, 29 August 2013 (UTC)[reply]
    That Wikiproject looks moribund when I look at it closer. It looks like there's very active support and discussion of various research projects on Meta-Wiki at meta:Research:Index. I'd hate to see a diligent researcher run afoul of what might look BITE-y to an outsider.
    --A. B. (talkcontribsglobal count) 16:34, 29 August 2013 (UTC)[reply]
    A. B., the main problem with this site is that it takes texts from Wikipedia and republishes them. Copying text from Wikipedia is allowed, but what is not allowed (per WP:CIRCULAR) is using Wikipedia texts as references (wikipedia is not Reliable source, so does any copy of wikipedia). E.g., if there is a link to http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Pope_John_XXI.html in some article, we should replace it with Pope_John_XXI; and if such a link is in a <ref>, we should replace it with {{fact}}. I propose adding this site to the spam list only to limit the effort of replacing links to the Princeton site with {{fact}}. With the site on the spam list, no new links to it would be added by good-faith users who may think that anything from `.edu` is always reliable..... Ok, there is actually no need to include her site in the spam list, but we should delete all links to her site and periodically recheck the Linksearch. `a5b (talk) 21:39, 9 September 2013 (UTC)[reply]
    As the creator of these problematic pages, I'm sorry--I just became aware that this is an issue. Please let me know what I can do to help fix it or prevent bad citations in the future. I won't be following the discussion here, but please email me if you'd like me involved. — Preceding unsigned comment added by Absonant (talkcontribs) 13:23, 19 September 2013 (UTC)[reply]
    To fix it, stop using these links for citations, period. As A5b noted, "wikipedia is not Reliable source, so does any copy of wikipedia." Better yet, clean up the mess by replacing all of those links with [citation needed], or even better, find a source that meets WP:RS to do so. OhNoitsJamie Talk 21:46, 16 November 2013 (UTC)[reply]
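    As a rough aid for the cleanup discussed in this thread, here is a minimal Python sketch that turns a wiki100k mirror URL into a candidate Wikipedia article title. The function name and the pattern are illustrative assumptions, not an existing tool. The result is only a candidate: the mirror drops apostrophes (e.g. Simpson_s_paradox), and the mirrored page is not always the article that links to it (e.g. Du_hast is linked from Burkenburg), so each replacement still needs a human check.

```python
import re
from urllib.parse import unquote

# Hypothetical pattern: matches the ".../tmve/wiki100k/docs/<Title>.html" shape
# seen in the list above, on any host (Princeton, Swarthmore, ...).
# It tolerates ".htm" and a trailing "/" because both occur in the listed links.
MIRROR_PAGE = re.compile(r"tmve/wiki100k/docs/([^/?#]+?)\.html?/?$")


def mirror_url_to_title(url):
    """Return a candidate article title for a wiki100k mirror URL, or None."""
    match = MIRROR_PAGE.search(url)
    if match is None:
        return None
    # Decode percent-escapes (UTF-8) and turn underscores back into spaces.
    return unquote(match.group(1)).replace("_", " ")


if __name__ == "__main__":
    for link in (
        "http://www.princeton.edu/~achaney/tmve/wiki100k/docs/Pope_John_XXI.html",
        "www.princeton.edu/~achaney/tmve/wiki100k/docs/Wis%C5%82awa_Szymborska.html",
        "http://www.sccs.swarthmore.edu/users/08/ajb/tmve/wiki100k/docs/Potentiometer.html",
    ):
        print(link, "->", mirror_url_to_title(link))
```

    Whether the link then becomes a wikilink or a {{fact}} tag depends on whether it sat inside a <ref>, as described above; the sketch deliberately leaves that decision to the editor.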

    sourcesecurity.com

    Spammers

    Long term, persistent spamming on many IPs and users - above is a partial list of IPs and accounts. Main spam URL is sourcesecurity.com, but thebigredguide and yogawizard show some overlap in accounts. - MrOllie (talk) 18:37, 30 August 2013 (UTC)[reply]


    ebscohost\.com(\.|.*(pdfviewer|EbscoContent)) - block unusable EBSCOHOST links

    Here's a specific suggestion:

    ebscohost\.com(\.|.*(pdfviewer|EbscoContent))     #Block 3 kinds of unusable EBSCOHOST links but allow permalinks: Match proxies: there's a literal "." after "com", and temporary session links, which contain pdfviewer or EbscoContent
    

    ( This is a consolidation of these two simpler regexes:

    ebscohost\.com.*pdfviewer          #Block unusable [[wp:EBSCOHOST]] links but allow permalinks
    ebscohost\.com\.                   #Match proxies, which is where it's not the end of the hostname - there's a literal "." after "com".
    

    )

    Wikipedia has many apparently dead-on-arrival links (like this) intended to point to PDFs, of the form ebscohost.com...pdfviewer...: all 7 of the 323 pages containing ebscohost and pdfviewer that I looked at had dead EBSCO links. These are NOT links that hit a paywall (like this). Rather, they bring up 404-like server error messages, and did from the day they were added; they're non-persistent URLs.

    A second problematic type of EBSCO link are proxied URLs, like the three added by a user's (sole ever) edit that are of the form hxxp://0-web.ebscohost.com.sculib.scu.edu/ehost/pdfviewer/pdfviewer?sid=[hex string]@sessionmgr13&vid=4&hid=13. (Note the bold portion!) These links work ONLY for subscribers that are ALSO at SCU. We shouldn't allow such links, and the blacklist (or a similarly functioning parallel system) would be a good solution.

    I've noticed that EBSCO staff have been heavily editing their own article. I solicited assistance, hoping they'd be available, willing, and able to help fix these links or suggest ways to deal with them systematically (note posted; no response). What EBSCOhost calls permalinks, like http://search.ebscohost.com/login.aspx?direct=true&db=ulh&AN=37698669&site=ehost-live&scope=site, are acceptable, and so I've designed a regex that allows the permalinks but forbids the non-persistent URLs.

    Research suggests it's not possible to convert the non-persistent URLs to persistent URLs using the data in the former. --Elvey (talk) 21:26, 9 September 2013 (UTC)[reply]
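    To make the intended behaviour of the proposed entry concrete, here is a small Python check of the regex against the three kinds of link described above. It is a sketch under assumptions: Python's re.search with re.IGNORECASE stands in for however the blacklist extension actually applies entries, the sid value is a made-up placeholder for the elided hex string, and the non-proxied session URL is a hypothetical example of the same shape.

```python
import re

# The consolidated entry proposed above, copied verbatim.
# Assumption: entries are searched anywhere in the URL, case-insensitively.
PROPOSED = re.compile(
    r"ebscohost\.com(\.|.*(pdfviewer|EbscoContent))", re.IGNORECASE
)


def is_blocked(url):
    """Return True if the proposed entry would match this URL."""
    return PROPOSED.search(url) is not None


# Proxied session link: "ebscohost.com" is followed by a literal "."
# because the proxy host name continues after it.
proxied = ("http://0-web.ebscohost.com.sculib.scu.edu/ehost/pdfviewer/"
           "pdfviewer?sid=abc123@sessionmgr13&vid=4&hid=13")

# Non-proxied temporary session link (hypothetical): contains "pdfviewer".
session = ("http://web.ebscohost.com/ehost/pdfviewer/"
           "pdfviewer?sid=abc123@sessionmgr13&vid=4&hid=13")

# Permalink quoted above: no "." after "com", no pdfviewer/EbscoContent.
permalink = ("http://search.ebscohost.com/login.aspx?direct=true&db=ulh"
             "&AN=37698669&site=ehost-live&scope=site")

if __name__ == "__main__":
    for name, url in (("proxied", proxied),
                      ("session", session),
                      ("permalink", permalink)):
        print(name, "blocked" if is_blocked(url) else "allowed")
```

    Under those assumptions the proxied and session links are matched and the permalink is not, which is the split the request is aiming for.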

    The second problem is the use of a proxied URL, i.e., the link points to an institution's proxy server such as sculib.scu.edu. This is not specific to ebscohost - it happens with links to other subscription databases too. A search for "ezproxy", for example, will bring up hundreds of such links. They are a bad thing. Nurg (talk) 08:39, 12 June 2013 (UTC) (reposted)[reply]
    I am tempted to see these sites as redirects, which work or not depending on location. I would consider that these should typically be converted to direct links to the object (within educational institutions, one can generally use a web-proxy to get to literature - a direct link would either be the link on the server where the literature resides, or the DOI). <snip> Links through proxy servers have no place whatsoever. I am somewhat tempted to say that these need blanket blacklisting on meta, as they could possibly be abused to circumvent other blacklistings (for a relatively open proxy), and serve no function whatsoever to most readers except for the (few) ones that have access through the proxy - I doubt the URL can even be understood well enough to figure out a real link from it. It is however going to be very obnoxious for users who in good faith insert the proxy URL they copy from their web browser and then can't save, and one could think of cases where it is appropriate (if information is only available to people who can pass the proxy and nowhere else in the world, it could still be a good reference for certain information - think of it as a book of which the single copy is in a nearly inaccessible library (the library in the Vatican); it is still verifiable by proxying through people who do have access to the library (ask the pope)).
    Note, that with creative regex rule-writing, we could blacklist the two 'bad' examples of Nurg (the non-persistent link and the institution proxies), still enabling good ones (the permalinks). --Dirk Beetstra T C 09:30, 12 June 2013 (UTC) (reposted, indented, and 1 sentence snipped)[reply]
    We use the blacklist to limit examiner.com links, because they generally fail RS, so I think it's appropriate that we add regexes for the impermanent URLs. (Arguably it would be better to have a similarly functioning parallel system with its own error messages handle sites like examiner.com and this ebsco problem, but in the meantime, I say let's put in regexes to handle them.) I also match the ebscohost proxy URLs, but not by matching on 'ezproxy', because some of the ebscohost proxy URLs don't contain 'ezproxy'. (It could be considered as part of a future proposed blacklist addition.) Beetstra (talk · contribs) suggested blanket blacklisting on meta be considered, but at meta, though I see these links on other sites - e.g 'fr.', I was told firmly, "Deal with it at the local wiki level." (Discussion at https://meta.wikimedia.org/w/index.php?title=Talk:Spam_blacklist&oldid=5798048#Unusable_EBSCOHOST_links.) --Elvey (talk) 21:26, 9 September 2013 (UTC)[reply]
    Well? Do we need to run a bot to remove all the extant links first, or is there more that is holding this addition back? --Elvey (talk) 01:58, 19 September 2013 (UTC)[reply]
    3000 links to improve in one go - seems like a good idea to me, yes!--Elvey (talk) 02:17, 26 September 2013 (UTC)[reply]
    Given the fiasco of the many editors pissed off by the actions of Cyberpower678 (talk · contribs)'s bot Cyberbot II (talk · contribs), a non-spam blacklist (see the big text above) is urgently needed. If one of Cyberpower678's bots is set up to handle entries on this list appropriately, it would be appropriate to add the EBSCOHOST regex to it, and move the examiner.com and petition regexes to it.--Elvey (talk) 09:13, 26 September 2013 (UTC)[reply]
    Many? Don't see that yet. Elvey, most of the links we block are blocked because they are/were spammed (examiner.com was spammed and is a spam-problem, for most of the petition sites, that is also true - it is a spam problem .. your remark regarding that is wrong), we do not block because we don't like links, or because they are unusable or because they are unreliable sources .. Nonetheless, your suggestion to have a second similar list might have merit, but that is a mediawiki developer problem that should be solved at the bugzilla level .. and I do not have much hope since we are waiting for several blacklist-related 'bugs' (improvements) for years already. --Dirk Beetstra T C 09:26, 26 September 2013 (UTC)[reply]
    Thanks, I see what you're saying. Here is a good example of the problem (of the list being used for reasons other than to block spam and bot Cyberbot II (talk · contribs) pissing off many users) : Luke (talk · contribs) is ADAMANT: "If something gets tagged as being on the spam blacklist, I will remove it, pure and simple." He's saying he's NOT going to examine the link, or attempt to repair or replace it. He's going to ASSUME the blacklist maintainers are making sure that the blacklist pretty much only blocks spam (like the spamhaus SBL and XBL maintainers do, if you're familiar with those lists).


    It's a problem that this blacklist is not a Spamhaus-caliber blacklist; it's more like some of the more aggressive blacklists that are willing to regularly include entries that can be expected to cause considerable collateral damage. And that wouldn't be so bad if this blacklist was not marketed/described as pretty much only blocking spam. I'm willing to bet that the typical editor who tries to add a legitimate reliable source that is blacklisted as collateral damage ends up not adding it, because we don't have multiple lists. The bot and the blacklist description pages are wrong to say the link Luke removed was spam; he was misled. The solution isn't to threaten to block everyone who does what he intends to do. It's to fix the list by splitting it. ASAP.


    We should still be blocking no-ip.com and examiner.com by default. Just not with this list. Roughly how many have I seen/do I think have expressed/do I think are upset because of the bot? Me? Around 6/?/60+, based on the 6. What about you? Is effecting a new list a developer problem at all? I would expect replicating the existing system and changing the names of a few things would be a relatively trivial task for the right person (an admin, not a developer), compared to a real development project, such as a significant enhancement. Many good examiner.com-type links were not blocked because the links were a spam problem, but rather because they might have been one. But yes, I see your point - the examiner.com domain was blocked because it is a spam problem; I stand corrected. (BTW, is there a working 'spam' definition for use here? I usually refer to the definitions like the ones spamhaus proffers, tweaked to apply to this medium. I guess I'll go look for that …) I remember the first time I tried to save an edit and couldn't, and for the longest time had no idea why Wikipedia wouldn't let me save the edits I'd made to an article, which included adding an examiner.com source to it. I'm knowledgeable about spam blacklists, but the error messages I got were pathetic and inscrutable, and they had me thoroughly confused. I have just gone through the same motions, which confirmed what I have seen others' comments suggest: the error messages shown to legit editors are still a pretty serious FAIL, though they are better than I remember. I recall they were worse than nothing, worse than useless. For no-ip.com links, the error message is still awfully misleading, as it describes my only options as:
    • If you feel the link is needed, you can:
      • Request that the entire website be allowed, that is, removed from the local or global spam blacklists (check both lists to see which one is affecting you).
      • Request that just the specific page be allowed, without unblocking the whole website, by asking on the spam whitelist talk page.
    This error message is unhelpful. The appropriate action is to request that the subdomain, rwservices.no-ip.info be whitelisted. It needs to be whitelisted. The error message is **misleading**.
    --Elvey (talk) 21:57, 26 September 2013 (UTC)[reply]
    PS: From current discussions: Surely you admit this listing wasn't because of a spam problem:

    \bjustjared\.buzznet\.com\b # Kanonkas # Gossip site/copyvio issues/speculation/not a reliable source used wrongly

    Perhaps Versageek (talk · contribs), creator of bot XLinkBot (talk · contribs) would be up to the task (of a non-spam blacklist system.) --Elvey (talk) 00:07, 23 October 2013 (UTC)[reply]

    sentuamsg.com

    These URLs are being spammed by 86.20.42.223 (talk • contribs • deleted contribs • blacklist hits • AbuseLog • what links to user page • COIBot • Spamcheck • count • block log • x-wiki • Edit filter search • WHOIS • RDNS • tracert • robtex.com • StopForumSpam • Google • AboutUs • Project HoneyPot) all over the place. Diffs: [3] [4] [5] [6] [7] [8] [9] [10] [11] [12] [13] [14] [15] [16] [17]. --benlisquareTCE 05:56, 15 September 2013 (UTC)[reply]

    pinkangelsmokes.com

    Repeated additions of this - apparently, as I can't check it from a work machine - porn site link by at least 5 IP users I have been able to ID (only other reference I could find in a search was at Talk:Smoking fetishism requesting it be removed from the article in May of this year). User(s) using formatting in a - poorly executed - attempt at deception to masquerade it as a link to a government survey. besiegedtalk 00:51, 21 September 2013 (UTC)[reply]

    raveguide.co.uk

    raveguide.co.uk: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com Being added by user of the same name. The user has been reported as a promotional user, not (yet) as a vandal. Fiddle Faddle 22:45, 26 September 2013 (UTC)[reply]

    Digitaldreamdoor.com

    digitaldreamdoor.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com This website accepts user contributions without editorial oversight which means they fail the WP:Reliable sources guideline. Anybody can make a list of their favorites and publish it; a lot of what we see on Wikipedia is in the form of "100 Greatest Fusion Artists" or similar. The website has been added as an external link or a reference to Wikipedia many, many times, often by multiple good-faith users rather than by a single spamming account. If we blacklist the website the unreliable references and links will stop. Examples of this website being used as a reference or an external link:[18][19][20][21][22][23][24]
    Back in 2007, Special:Contributions/65.2.112.232 added Digital Dream Door to seven articles in every one of his seven edits—an example of spamming.
    A discussion about this general issue can be seen at Wikipedia:Reliable_sources/Noticeboard#digitaldreamdoor.com. Thank you. Binksternet (talk) 16:03, 27 September 2013 (UTC)


    programarexcel.com

    programarexcel.com: Several links to this site were added to Microsoft Excel on Oct 9, 2013. The site is of dubious value and pops open full-screen ads.

    mobiles.sulekha.com

    mobiles.sulekha.com: A welter of spam links to this site today alone. It may be worth investigating the entire domain too. Fiddle Faddle 10:39, 12 October 2013 (UTC)

    Several "Gentaur" related websites added in past months

    bcl10.com

    gentaur-worldwide.com

    apoptosises.com

    Saw that IP 94.26.80.83 added apoptosises.com to Apoptosis, clicked it, and saw a long list of "Buy Now" buttons for pharmaceuticals on a shoddy website. Sites host barebones (almost certainly copy/paste jobs) medical articles, with tons of links to buy its products. Also reporting the IP as those are all it has ever added in the period since February 2013. Undid all of his/her past additions, but these should probably be blacklisted. --Rhododendrites (talk) 10:50, 17 October 2013 (UTC)[reply]


    soccerdatabase.eu

    Back in May 2013 this link was mass removed from Wikipedia (see Wikipedia:Bot requests/Archive 49#soccerdatabase.eu) because it was deemed to be a copyright-violating mirror of the defunct 'playerhistory.com' website. As I understand it, the owner of 'playerhistory.com' is Polarman (talk · contribs) and he has been taking legal action against the owners of 'soccerdatabase.eu' for violating copyright. This website has no place on Wikipedia and should therefore be blocked. Note that a previous attempt to blacklist 'soccerdatabase.eu' fizzled out with no real decision either way. GiantSnowman 12:32, 29 October 2013 (UTC)


    qtrax.com

    qtrax.com: Spammed by multiple accounts: Special:Contributions/Lilamey2013, Special:Contributions/79.179.194.98, Special:Contributions/Putting_an_end_to_tyrant_editors, Special:Contributions/Sivan_qtrax, Special:Contributions/109.64.177.36.

    Apparently using bots now too according to [25]. Яehevkor 10:28, 10 November 2013 (UTC)[reply]

    No, this is incorrect. If you look at DVdm (talk), you will see that there isn't any current usage of bots, only a declaration of the desire to use a bot. Currently, there is a bot approved by Wikipedia that adds links to lyrics of notable songs and redirects readers to a website owned by CBS Interactive, called MetroLyrics. For some reason, this website has been approved and is spreading links all over Wikipedia. There isn't much difference between Qtrax and MetroLyrics, besides the fact that Qtrax also offers free streaming and downloads of the song. Yes, free. The whole model of Qtrax is based on providing LEGAL music for FREE to end users. Yet, in return, the artists are getting paid by Qtrax. Hence it is the only service in the world which is both free & legal.

    Our quest is eventually to fight piracy over the web. We offer everything for free, because this is what pirate sites offer. So if we want to fight piracy, we must offer music for free. If we would like to keep being legal, we must then have licenses with the music labels, and pay the artists for every song our users play or download on Qtrax. Which we happily do. Please help us fight piracy, don't stop us. — Preceding unsigned comment added by 79.183.0.181 (talk) 14:10, 13 November 2013 (UTC)[reply]

    Note - According to About us:

    "In addition, the partnership between advertising and Qtrax delivers great potential for monetization by brokering branded deals with consumer advertisers around the world. These natural partnerships are sure to yield generous revenues for artists and labels alike."

    - DVdm (talk) 10:56, 10 November 2013 (UTC)[reply]
    I fully support this being blacklisted. The IP has clearly stated its intent is to promote the website and artists, not to better the Wikipedia project. Wikipedia is not here to use as your source of free advertisement. Sergecross73 msg me 14:16, 13 November 2013 (UTC)[reply]
    Please look at Requests_for_approval/LyricsBot. There's obviously a common interest in adding lyrics to song pages, which has been validated by the Wikipedia community and the administrators who approved the MetroLyrics bot. Just like Qtrax, MetroLyrics (owned by CBS) is a commercial entity. Since Wikipedia is unbiased, an equal approach should be taken towards both parties; meaning, since MetroLyrics was allowed to add external links, so should Qtrax. — Preceding unsigned comment added by Gil.qtrax (talk • contribs) 15:30, 13 November 2013 (UTC)
    Qtrax should go through whatever Metrolyrics did for approval then. Until then, you shouldn't be adding Qtrax links to articles, as you've already shown your interest is in your own website and promoting artists, not bettering the Encyclopedia. If you keep spamming the website, you'll be blocked. Sergecross73 msg me 15:38, 13 November 2013 (UTC)[reply]
    Wikipedia is not a free advertising platform for your new company. In the absence of documented community consensus for mass-adding of the links, if I see any other single purpose accounts adding the links, the account will be blocked and your site will be blacklisted. OhNoitsJamie Talk 15:44, 13 November 2013 (UTC)[reply]
    According to WP:Village_pump_(proposals)/Archive_97#Linking_lyrics_from_legal_providers there is a documented community consensus for mass-adding of exactly such links. There is no difference between a Qtrax link and a MetroLyrics link. Since this has already been discussed and supported in the community, and approved for automation (LyricsBot) - less than a year ago - there should be no problem with posting manual links until a bot is approved. — Preceding unsigned comment added by Gil.qtrax (talk • contribs) 15:55, 13 November 2013 (UTC)
    Qtrax is a separate organization, so it needs separate consensus. It's as simple as that. You don't get to make the terms here. Follow protocol and the rules or get blocked. Sergecross73 msg me 16:04, 13 November 2013 (UTC)

    tefl-online-courses.com

    Being spammed to multiple articles by multiple IPs. [26] [27] Jackmcbarn (talk) 22:22, 16 November 2013 (UTC)[reply]

    en.softonic.com

    Part of a plague of snowshoe spam on WP and many, many areas elsewhere, to drive traffic to this software download site. The en. version redirects to the second version. I suspect other prefixes are in use as well as en. Fiddle Faddle 12:04, 28 November 2013 (UTC)
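    Editorial illustration of the point about prefixes: a minimal sketch, assuming the blacklist matches URLs against regular-expression fragments. The two patterns and the sample URLs are hypothetical and are not taken from the actual blacklist. It only shows why an entry limited to the en. prefix would miss other prefixes, while a domain-wide entry would not.

```python
import re

# Hypothetical entries: one limited to the "en." prefix, one covering the whole domain.
narrow = re.compile(r"\ben\.softonic\.com\b", re.IGNORECASE)
broad = re.compile(r"\bsoftonic\.com\b", re.IGNORECASE)

# Made-up sample URLs, for illustration only.
urls = [
    "http://en.softonic.com/app",
    "http://de.softonic.com/app",
    "http://softonic.com/",
    "http://notsoftonic.com/",
]

def matches(pattern, candidates):
    """Return the candidate URLs that the given pattern would catch."""
    return [u for u in candidates if pattern.search(u)]

print(matches(narrow, urls))  # only the en. URL
print(matches(broad, urls))   # every softonic.com URL, whatever the prefix
```

    The word boundary before the domain keeps the broad pattern from catching unrelated domains that merely end in the same letters.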

    archive.is

    archive.is: In October there was an RFC in which it was decided that archive.is should be blacklisted. I don't see the site in the list, so I think the decision has not been implemented. The RFC said that "[o]ver 10,000 links to archive.is remain on Wikipedia" but when I checked just now I found 27,309 such links. —rybec 23:00, 3 December 2013 (UTC)

    • Sorry, new to the blacklist, so sorry if I state the obvious. Per the RFC, the blacklist shouldn't be implemented until most/all of those links are removed. Doing so would, as I understand it, make it nearly impossible to edit these articles. I've not been tracking the bot issue, but I think User:Kww is on it. Hobit (talk) 05:06, 4 December 2013 (UTC)
    • I'd encourage anyone who happens to be driving by to read that RFC linked above. Very interesting look at an issue where consensus, if there is one, seems thin. No easy answers for complex issues. Ultimately, monetary donations and volunteers are limited resources, and if those resources are deemed inadequate for the desired preservation of references, then some form of partnership with commercial interests may be desirable if it's done in an acceptable manner. Has WMF given an opinion on these issues? Has The Signpost written about this? If not, they should. These issues need wider exposure than just among the technicians who deal with blocking editors and blacklisting, and they need to be explained so that average non-technical editors understand them. Failure to do so properly risks upsetting regular editors in ways that could exacerbate retention issues. Wbm1058 (talk) 23:12, 4 December 2013 (UTC)

    For now, Not done. Although having a blacklisted link on a page does not disable any other editing to the page, in situations where a page is 'broken' by John Doe 1 (for example, but not necessarily through vandalism) resulting in (formal) removal of a link, and John Doe 2 comes by and repairs (but does not rollback, revert or undo!) or does an individual unrelated edit which does not re-instate the link, reverting to the original version is impossible as it would result in the re-addition of the blacklisted link (which would then be blocked by the blacklist). I personally handled such a situation not too long ago; it is quite annoying and needs administrative intervention. On a small scale of 5-10 remaining links on (then often) low-visibility pages, that is hardly ever an issue, but with 27,378 (!!) links on any possible subject (including highly visible ones like Glee (TV series), Miley Cyrus, and some which are 'vandalism prone': List of bisexual people (A–F), List of vegetarians ('oh, my teacher is bisexual/a vegetarian, let's add her/him, I'll even add a ref to her facebook, though I don't know how the referencing works so I may break the page ...')) that is likely going to aggravate many editors.

    First get a bot to replace, remove, or at least disable (comment them out?) ALL the links, and when that task is (nearly) finished, blacklist to avoid re-insertion (sigh, on that scale, that is going to annoy more people when the process of removal is performed, and people will use rollback/revert/undo to revert the bot because they don't agree with the established consensus). --Dirk Beetstra T C 07:00, 5 December 2013 (UTC)[reply]
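    Editorial illustration of the revert problem described above: a minimal sketch, assuming (as the comment implies) that only links newly added by an edit are checked against the blacklist. The pattern, the URL extraction and the page texts are hypothetical simplifications, not the extension's actual code.

```python
import re

# Hypothetical blacklist entry; the real list is maintained on-wiki.
BLACKLIST = [re.compile(r"\barchive\.is\b", re.IGNORECASE)]

# Simplified URL matcher, an assumption made for this sketch only.
URL_RE = re.compile(r"https?://[^\s\]<>|]+")

def extract_links(wikitext):
    """Collect the external URLs that appear in a page revision."""
    return set(URL_RE.findall(wikitext))

def blocked_links(old_text, new_text):
    """Return blacklisted URLs that the edit adds relative to the previous revision."""
    added = extract_links(new_text) - extract_links(old_text)
    return sorted(u for u in added if any(p.search(u) for p in BLACKLIST))

version_a = "Fact.<ref>http://archive.is/abc12</ref> More text."  # original page with the link
version_b = "Fact. More text."                                    # John Doe 1 breaks the page, link is lost
version_c = "Fact. More text. Extra."                             # John Doe 2 makes an unrelated edit

# Ordinary edit to a page that already holds the link: nothing new is added.
print(blocked_links(version_a, version_a + " Typo fix."))
# Restoring version A on top of version C re-adds the link, so the save is refused.
print(blocked_links(version_c, version_a))
```

    Under these assumptions the first check comes back empty and the second reports the archive.is URL, which is the situation that needs administrative intervention.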

    • Well, I would argue that the resulting close saying that the site should be blacklisted was a bit of a supervote, since there was no real consensus either way (a lot of non-policy based voting on either side as well), but given that I voted vehemently against that happening, I'm obviously not neutral on this regard. Lukeno94 (tell Luke off here) 07:13, 5 December 2013 (UTC)[reply]
      • Regarding that, I assumed here that the RfC was independently closed, and that the closing editor found that the consensus was to blacklist the link (and either way, I am not to decide on that, if the RfC was wrongly closed, get a community decision that that was the case). This is NOT the place to fight or oppose that part of the discussion or of the result, nor complain about how the RfC was closed. If that RfC was closed, independently, saying that the consensus is to blacklist the link, then that is the decision that would result in an admin carrying out that blacklisting. I have here now only decided that it is  Not done at this time, since there are too many links left over. As soon as those links have been removed (as per the closing the RfC, as that was the determined consensus), this link should be blacklisted (as per the closing of the RfC, as that was the determined consensus). If you want to fight the decision, either before (now) or after the decision has been applied (links have been removed and the site blacklisted), please use the appropriate paths in dispute resolution. --Dirk Beetstra T C 08:55, 5 December 2013 (UTC)[reply]
    Thank you, Dirk. Wbm1058 (talk) 14:43, 5 December 2013 (UTC)[reply]
    Kww, I admire how you took the lead in implementing community consensus on VisualEditor, and don't envy your decision on whether to proceed with removing these links. I've just come to understand the difference between using the Wayback Machine, which I've used often, and using WebCite, which I haven't done yet. It just seems obviously useful to just have a system where, whenever legal, reference links are just automatically archived the moment they are added to Wikipedia. We know the foundation can be aggressive with providing solutions like VisualEditor. This just seems like a problem that's crying out for the foundation to take the lead on providing a solution. Just give us an officially endorsed WebCite-style reference archive solution which makes link-rot a thing of the past. If it's not done in-house then negotiate a contract with a third party that includes terms of use which are mutually beneficial (limited advertising, perhaps, but no trojan horses)... the volunteer community can't do this on its own. Help! please – Wbm1058 (talk) 14:43, 5 December 2013 (UTC)
    See also: Wikipedia:Village pump (proposals)#WebCite possibly going down – Wbm1058 (talk) 14:49, 5 December 2013 (UTC)

    Kww just started a new job and is a little swamped. Writing the bot specs for a graceful removal is on his to-do list for the weekend.—Kww(talk) 00:09, 6 December 2013 (UTC)[reply]

    nationalforum.com

    nationalforum.com: Regular spam of this site, and also the author (and blacklist candidate string) "William Allan Kritsonis" across a range of often bizarrely unrelated articles, from a number of IPs. See [28] [29] and 72.48.212.34 (talk · contribs) for examples. Blocking is likely to be ineffective, as some are just commonplace AT&T IPs. Andy Dingley (talk) 00:37, 7 December 2013 (UTC)

    Completed Proposed additions

    Proposed removals

    Rentarasta.com

    rentarasta.com: I don't understand why this one in particular is on the list, or even where to find out on my own. There is a screenshot archive of an article here that I want to reference. Docucopter (talk) 18:04, 25 September 2013 (UTC)

    Question How would this link be useful for Wikipedia? OhNoitsJamie Talk 21:42, 16 November 2013 (UTC)[reply]

    cbronline.com

    Contains the archives of the Computer Business Review, an indispensable source of information about business computing of the 1980s and 90s. QVVERTYVS (hm?) 15:50, 29 September 2013 (UTC)[reply]

    Seconded. Computer Business Review is cited at Azul Systems and would seem to be a valuable source concerning a 2006 legal dispute between Azul and Sun Microsystems. As far as I can tell at [30] the domain has changed hands since the blacklist listing. Kingdon (talk) 14:43, 8 October 2013 (UTC)[reply]
    Agree, there are few online sources from this era, except a few in Google books or a few other narrowly-focused sites. Being able to cite them might help recentism. The cbronline pages do show up in Google searches near the top, perhaps because of spam-like behavior in the past, but also perhaps because they are useful. The cbronline story pages do not seem overly commercial themselves, unless I am missing another reason to keep them blacklisted. A bunch of pages on computer topics from that era still have the citations, but many others are uncited because of the black list. W Nowicki (talk) 22:25, 8 October 2013 (UTC)[reply]
    Seconded. I have used cbronline.com as a source for several articles about computer systems from the 1990s, especially Digital Equipment Corporation products. For a lot of facts and figures this was the only reasonable online source I could find at the time. Letdorf (talk) 22:02, 14 October 2013 (UTC).[reply]
    Please be aware that the usefulness of a source has no relevance to the blacklist. Past behavior and potential for future abuse are all that matters here. ~Amatulić (talk) 23:31, 20 October 2013 (UTC)[reply]
    OK, I'm a greenhorn when it comes to spam-blacklisting. I'm here because apparently a bot has recently begun tagging articles with {{Blacklisted-links}}. For example, Super video graphics array was tagged on September 24, 2013; that's just the first one I noticed. An external links search finds some 357 articles linking to this so-called "spam" site. The information box at the top of this section tells me to familiarize myself with the reasons why this site was blacklisted, so I look at MediaWiki talk:Spam-blacklist/log/2010#April 2010 to see who blacklisted the link and when, and the reason given for blacklisting. There I find that user:Tedder blacklisted this site 21:28, 16 April 2010. So, apparently many readers and editors have been blissfully unaware that hundreds of articles have been linking reliable-source references to a blacklisted site for the past 3½ years. As for the reason, we are permalinked to Wikipedia talk:WikiProject Spam (the discussion can also be found at Wikipedia talk:WikiProject Spam/2010 Archive Apr 1#cleantechnology-business-review.com). There, I find many links to Template:Link summary and Template:User summary that aren't expanded for some technical reason. Notice that those templates work fine up to a certain point on the page, and then they don't. Maybe the expansion limit was exceeded? So, I've taken the liberty to copy the relevant section of the archive to Wikipedia talk:WikiProject Spam/2010 Archive Apr 1/cleantechnology-business-review.com. There you will see that the blacklist decision was a local consensus between the two editors Tedder and user:Beetstra. Beetstra helpfully mentions that "this is used as a reference as well, and I see many 'regulars' using these links" – well I'd guess that most of the 350+ links are legitimate links created by us 'regulars'. I mean, Wikipedia has a "massive" number of links to The New York Times I'm sure, but their massiveness doesn't make them spam.
    Sorry, but after all the time and effort I've put into this, I still don't understand why this site was blacklisted. Can Tedder or Beetstra please explain, for the benefit of us spam-blacklist newbies? Thanks, Wbm1058 (talk) 18:47, 23 November 2013 (UTC)
    I just recalled that my first visit to the MediaWiki talk: namespace, two years ago, was over this same issue. As I've only half a dozen edits in this namespace, to me this particular blacklist item really stands out. I can't recall any encounters with any other site on the blacklist. Relevant past discussions in the archive:
    You just hit it, SimonThird is one of them: 78 edits, in most cases adding a reference to cbronline. We call that reference spamming. Sometimes he added his reference to already referenced material, or just added a sentence with this reference. How many of the currently available 350+ articles that contain the links are still there because the spam was not appropriately cleaned out? Yes, it is a massive amount of work to get those 350 through the whitelist, but I have seen quite a number of them already having been declined because they were not necessary or were replaceable (4 of the 5 I just went through were in fact replaceable, and only one was granted). As this was a massive campaign to spam Wikipedia, and we are NOT a vehicle for that, I am very reluctant to remove it, and I will ask editors to go the extra mile and go through whitelisting for the individual links, including showing that there are no other sources for their requests. Defer to Whitelist. --Dirk Beetstra T C 07:45, 24 November 2013 (UTC)
    To be able to see the full record, I split the archive, please see: Wikipedia_talk:WikiProject_Spam/2010_Archive_Apr_1B#cleantechnology-business-review.com. --Dirk Beetstra T C 08:49, 24 November 2013 (UTC)[reply]

    So let me be sure I understand this last post… a small number of users including (only?) SimonThird started spamming the Wiki. Right? So instead of blocking them we blocked a reference used in hundreds of other articles? And the reason for this is that it would be too much effort to fix the actual problem? Maury Markowitz (talk)

    I checked another one of SimonThird's edits. Here he added a new section Releases to IBM's article, referenced with a link to the CBR site. Clearly this gives undue weight to a single product release, one of perhaps thousands of products released over IBM's 100+ year history. This one lasted a couple months before it was reverted. So some unknown percentage of this editor's contributions may have dubious motivations. I maintain that the first example I cited, if looked at by itself, with no knowledge of SimonThird's other edits, should be considered as both good-faith and helpful, were the site not blacklisted. I don't have the power to check this user's IP address to see if it could be associated with this website. I'm not really familiar with the publication, but my perception is that Computer Business Review was a British printed trade journal back in the day, perhaps similar to InfoWorld. A lot, perhaps most of these publications have dropped their paper editions, and if they're still operating, are now online-only. Unfortunately, unlike InfoWorld which Google has helpfully scanned so that we may directly link images of the paper-printed magazines, our only option with Computer Business Review is to link to this site. I doubt this site makes much, if any, money selling subscriptions, so obviously they need to draw traffic that views on-site advertising to survive. Our legitimate reference links to this site may help in some small way in that regard, helping them stay online so that the site is available for us to research and find more references. Now The New York Times does still make real money from selling subscriptions, but even they are becoming more dependent on online ads. 
    So, what if, theoretically, some anonymous editors decided to help the Times out by focusing all their editing energies on clearing the Category:Articles with unsourced statements backlog by inserting mostly helpful citations to Times articles, but got somewhat over-enthusiastic about the project and let some dubious links like SimonThird's "Releases" link slip in as well. Would we then be forced to blacklist the Times?

    Dirk Beetstra, I see that you maintain a bot that generated a report: Wikipedia:WikiProject Spam/LinkReports/cbronline.com – can we get an updated report? Thanks, Wbm1058 (talk) 00:13, 25 November 2013 (UTC)[reply]

    I'll also point out that user:SimonThird has a clean block log, the only admonition on user talk:SimonThird was extremely vague and didn't indicate any specific edits or the nature of the alleged "promotional material", and by the time this site was blacklisted on 16 April 2010, SimonThird was long gone (last edit 11 December 2009). Wbm1058 (talk) 01:34, 25 November 2013 (UTC)[reply]

    Another relevant past discussion: Wikipedia:Reliable sources/Noticeboard/Archive 49#xxx-business-review.com as source – yes this site seems to have a lot of "articles" that are just regurgitated press releases, but the idea that these are unreliable sources is a bit ridiculous. You just need to be careful about what they reliably say. Press releases are primary sources, not the secondary sources preferred on Wikipedia. Primary sources are used to fact-check secondary sources. If you have a company Z press release dated March 1996 announcing the release of the product whiz-bang version 3.0 then that is indeed a reliable source for "Company Z announced the release of whiz-bang version 3.0 in March 1996". You might want to look for a reliable secondary source that confirmed that the product actually was released when they said it was, but the press release is a reliable source for "the company claimed the product was released." - Wbm1058 (talk) 03:23, 25 November 2013 (UTC)[reply]

    Nope, it was not only SimonThird, there were more. If it had been only SimonThird, it would, as you say, likely have been a block for the user (with some exceptions for some sites). Wbm1058, I said it was a campaign: spammers do not stay with just one account, they use multiple accounts to spam multiple domains. There were 5 or 6 listed, but there are many, many more (some with just one or two edits, but of the same pattern). We may indeed not have bothered to block editors here, but warned the different accounts and moved on to straight blacklisting. Why bother blocking accounts if other socks will pop up (sometimes even warning them is pointless, as they will not read the warning on the old sock account when they are already on the new one)?
    Also, you say a reference used in hundreds of other articles. If I see it correctly, there are at least 5 accounts (and if I go through a handful of other IPs I am worried about those edits as well), who added and (between each other) re-added links that were removed. Since it took 2 years before it was uncovered, the reports are congested with regular editors who, in some cases, may have added the links back while reverting unrelated vandalism. There may also have been regulars adding the link in the past - but that means that there should also be regulars who tried to add the link since. If it was significantly used, then among those there must be several regulars who know that when they run into a blacklist warning there are ways to discuss that problem (whitelist requests). Still, there have not been many discussions regarding it, suggesting that not many regulars have used the link. Moreover, most of the whitelisting requests I did see were declined as 'replaceable'. I don't think this site has been used by regulars a lot.
    The site was not blacklisted because it was an unreliable source (even a porn site is a reliable source if you use it correctly); it was blacklisted because it was spammed (and otherwise abused in a spammy way) by multiple accounts (likely an SEO company, judging by the IP edits) in a campaign (or multiple campaigns), and it is not massively reliable anyway.
    So if it shows anything, I think it shows that most of these links were not cleaned out after the blacklisting .. manpower is a continued problem here.
    Regarding the Times - that is the interesting thing - first, a journal like the Times does not need spam to get their links out (so that says something about companies that do spam); moreover, if a site like that would engage in a massive spamming campaign, we would indeed have a nice problem, which likely would be handled through the legal department of Wikimedia (we have had congressmen or their representatives spam Wikipedia - besides blocking, they have to be reported to the Foundation). I would however not exclude that if such a site would engage in such massive spamming, blacklisting (though more likely an edit filter) may be needed to mitigate the problem - and it has happened for sites like that.
    And yes, cbronline or the ..-review sites may be a reliable source for some information - and that is why we have a whitelist for those cases where the information is unique, reliable and notable enough.
    Sorry, COIBot is down at the moment, but the old reports should already give you an idea - I just went through some IPs, and there are more engaging in spamming than the ones that precipitated the blacklisting. --Dirk Beetstra T C 08:11, 25 November 2013 (UTC)[reply]
    OK, fair enough. This has been a good discussion, although I feel that perhaps the burden of proof has been unfairly placed on the defense. There does seem to be a problem here, but the extent of the problem and the manageability of it is just a matter of opinion. I feel that no matter how much effort I put into showing it's manageable, you will still reply that what I've found is just the tip of the iceberg and we just haven't identified the rest of it. So there's no point in further analyzing what happened over three years ago before the blacklisting. This recent addition of bot-generated {{Blacklisted-links}} has introduced an eyesore that currently transcludes onto 3,738 articles, apparently 357 (nearly 10%) of which are caused by this reference. If the goal of this exercise was to twist the arms of busy gnomes into diverting from other backlogs they've been trying to clear for months, then it's succeeded. I'll familiarize myself with the white-listing process, which is something I haven't needed to do until now, and get to work on "cleaning out" the links, though I can just get started for now before I need to take a break. Wbm1058 (talk) 14:20, 25 November 2013 (UTC)[reply]
    Hmm, I'm surprised at how short that white-list is, now that I'm looking at it for the first time. Just nine entries for cbronline. But the "helpful hint" section does not feel helpful at all. In fact, it strikes me as rather hostile. The attitude that you are guilty until proven innocent comes through loud-and-clear. That probably explains why the list is so short. If you make something enough of a bureaucratic bother, then volunteers just won't bother. I suppose that is the goal. Excuse me for grumbling. Wbm1058 (talk) 15:03, 25 November 2013 (UTC)[reply]
    @Wbm1058: Regarding manageability - I was just pointed to a similar case of spamming, where many domains from a company were blacklisted back in 2008. Two independent whitelisting requests led me to have a look, and it looks like the same spamming is still ongoing, with many single-purpose accounts creating and editing articles in the realm of the company - a clear case of paid advocacy (maybe SEO-spam). There, blocking accounts and blacklisting their domains certainly did not stop the spam, and I have no belief that here blocking the editors would have stopped it either. Spamming pays their bills. --Dirk Beetstra T C 07:55, 28 November 2013 (UTC)
    Regarding the helpful hint - unfortunately that is behaviour that we have to put up with on a regular basis. Please understand that blacklisting is not done to annoy good faith fellow editors, it is to stop spam - and even the blacklist does not do a good job at that. Paid advocacy is a serious continuous problem, Wikipedia is a massive spam target. I am sorry, but I get very nervous and non-cooperative when a regular is coming with an attitude of 'you asshole, you blocked the domain that I need and now I can not save my page' .. that approach should be reserved for editors who spam Wikipedia, but that aspect is often ignored.
    Unfortunately, the bureaucratic bother is needed, spam is not your run-of-the-mill vandalism. And the reverse is generally also the case - we are deemed guilty of blocking 'useful' sites (your post of 23 November, 18:47 suggests such assumptions as well - 'this so-called "spam" site'/'Wikipedia has a "massive" number of links to The New York Times I'm sure, but their massiveness doesn't make them spam.'), and we need to continuously defend that if a site is blacklisted, it was actually spammed. --Dirk Beetstra T C 08:10, 28 November 2013 (UTC)[reply]

    energy-business-review.com

    Why is this website blocked? It is used in articles like Project Hayes and Waitahora Wind Farm, and does not look like a spam website to me. --Pakaraki (talk) 17:34, 4 October 2013 (UTC)

    Looks like it was being spammed by this user and perhaps others. OhNoitsJamie Talk 18:14, 4 October 2013 (UTC)

    ccel.us

    This was added by User:Ckatz in the summer of 2011 [31], apparently in response to this spam taunt, but it's quite unlikely that this threat was honest since CCEL (now titled Evangelical Christian Library) is simply a repository site for well-known theological and religion-related texts, most of them PD. I would not be surprised to see links to its materials throughout religious topics on Wikipedia; the case which caught my eye involves a reference in J. Z. Knight to an on-line edition of a book by Russell Chandler, once a religion writer for the LA Times. This looks to be a perfectly reasonable reference, and an online copy is surely preferable for an online encyclopedia. Therefore I would like to ask that this entry be removed from the spam blacklist as unnecessary and inappropriate. Mangoe (talk) 02:37, 30 November 2013 (UTC)[reply]

    One correction: ccel.us and ccel.org are separate sites. However, the rest of my request remains the same: ccel.us has a number of references now, and as far as I can determine it was never actually spammed. Mangoe (talk) 22:19, 3 December 2013 (UTC)

    dyingscene.com

    I'd like to reinforce the view others have expressed in a recent discussion: this site has become a major news source in the punk scene during the past few years, and as they promised to discontinue spamming, they should be whitelisted. See previous discussion here: https://en.wikipedia.org/wiki/MediaWiki_talk:Spam-blacklist/archives/November_2013#dyingscene.com

    Strummer25 (talk) 14:06, 6 December 2013 (UTC)

    This is what, the fourth blacklist removal request (which is not the same as whitelisting)? Why are they so eager to get off the blacklist if they're not going to spam? Why can't selective whitelisting for individual links (which I'm not opposed to) suffice? OhNoitsJamie Talk 15:13, 6 December 2013 (UTC)
    Today they're a major punk news source, clearly No. 2 after punknews.org, and they are catching up; just check the trending on Alexa. It means that editors would use their articles as references for just about every punk band's Wikipedia page.

    Strummer25 (talk) 20:10, 6 December 2013 (UTC)

    I remain unconvinced that selective whitelisting is not more appropriate given that at least four different accounts were spamming this link, including a "role" account operated by the website. OhNoitsJamie Talk 21:55, 6 December 2013 (UTC)
    The selective whitelisting process is far too slow and cumbersome.

    Strummer25 (talk) 08:30, 8 December 2013 (UTC)

    Strummer25, there are no deadlines here, and hence that is never a reason to de-blacklist over other processes (also, it appears that you did not even try to get specific links whitelisted and see whether those documents pass the bar). I further agree with Ohnoitsjamie's assessment (and earlier assessments) - it is not always the only source, and it got spammed by IPs and role accounts, and the promise to not do it again .. I've seen that before, still I'd like to see whether the community, at large, finds individual links useful (and it appears that in the last years there is exactly one whitelisting request for one link (open at the moment), and no requests by regulars) -  Defer to Whitelist. --Dirk Beetstra T C 09:03, 8 December 2013 (UTC)[reply]
    OK, I've requested two links that I need as references to be whitelisted.

    Strummer25 (talk) 23:30, 8 December 2013 (UTC)

    eHow.com

    The blacklist shows behow.com, so it appears to me that it is not really intended to block eHow.com. Spalding (talk) 16:05, 8 December 2013 (UTC)

    The entry is \behow\.com\b, which means (word boundary) ehow.com (word boundary). It is intentionally catching eHow.com. Jackmcbarn (talk) 16:41, 8 December 2013 (UTC)
    Indeed. A backslash changes how the following character is treated: '\b' is a word boundary, and '\.' is a literal '.' (an unescaped '.' has a function of its own - it matches any character - and the same is true for '\?', '\/', '\$' ...). Please see Regular expression, which contains, or links through to, more information. --Dirk Beetstra T C 05:48, 9 December 2013 (UTC)
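    The reading of the entry given above can be checked with a short Python sketch. The function name and sample URLs are made up for illustration, and case-insensitive matching is an assumption made here so that "eHow.com" is caught as written; Python's regex engine is only a stand-in for whatever the extension actually uses.

```python
import re

# The entry quoted in the discussion. re.IGNORECASE is an assumption
# made for this sketch so that "eHow.com" matches as typed.
ENTRY = re.compile(r"\behow\.com\b", re.IGNORECASE)


def is_blocked(url: str) -> bool:
    """Return True if the entry matches anywhere in the URL."""
    return ENTRY.search(url) is not None


if __name__ == "__main__":
    samples = (
        "http://www.eHow.com/how_123.html",  # '.' before 'e' gives a word boundary
        "http://behow.com/",                 # 'b' then 'e': no boundary, no match
        "http://ehowXcom.example/",          # '\.' needs a literal dot, not 'X'
    )
    for url in samples:
        print(url, is_blocked(url))
```

    The second sample is the one that answers the original question: the leading "\b" is an escape sequence, not the letter "b", so a hypothetical behow.com is not what the entry targets.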

    bodybuilding.com

    bodybuilding.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:frSpamcheckMER-C X-wikigs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: searchmeta • Domain: domaintoolsAboutUs.com

    I just noticed this has been blacklisted on Tommi Thorvildsen and Geir Borgan Paulsen, which are on my watchlist; doubtless it is on many other WikiProject Bodybuilding articles also, apparently for having "six pack abs" and "secret" and other phrases typical of that sport/industry on it, but this is not part of the massive spam campaign by a certain someone that shows up all over YouTube and, well, nearly everywhere. Bodybuilding.com is a mainstream bodybuilding site and in this case its biographies are what the citation is about; it also has content listings. It should not have been blacklisted, even if some of its articles are about the secret to six pack abs (diet, diet, diet) or because it carries advertising; so do flexmagazine.com, musclemaginternational.com and other magazines that are central to the bodybuilding industry. I do not have the time to dig through the blacklisting log to find out who blacklisted this and why, but it should not have been blacklisted. Reliable sources for this sport that do not mention "six pack abs" and "training secrets" simply do not exist. Skookum1 (talk) 17:56, 9 December 2013 (UTC)

    Generally, sites here only get blacklisted when someone was spamming them to Wikipedia, so my first assumption would be that that is also the case here (I did not look into this specific case, which may be different, but unfortunately respectable organisations do spam ..). Anyway, this is blacklisted on meta, so delisting either needs to be requested there ( Defer to Global blacklist) or whitelisting should be requested locally ( Defer to Whitelist). --Dirk Beetstra T C 06:38, 10 December 2013 (UTC)
    Bodybuilding.com is well-known enough in the bodybuilding world to not spam itself to Wikipedia, which is scarcely a site full of bodybuilding wannabes who are its market. It's no doubt widespread within WP:Bodybuilding articles because it has good bios and contest-winner lists. It and similar sites generally aren't used anywhere like the steroid or training articles for reasons of WP:RS except when talking about what such magazines/sites have to say about whatever. So anyways, looks like there's more research for me to do; all I'm trying to do is protect two articles from deletion for not having sources, and encountering wiki-procedure out the ying yang. Remember that word "wiki"? Means "fast, quick" in Hawaiian? Yeah OK. I think I'm gonna look up the Hawaiian word for "complicated and time-consuming". Thanks for the heads-up. There may be other bb'ing sites also blacklisted that shouldn't be. What I'm smelling here is a bit of knee-jerkery and it looks like it was black-listed simply for containing phrases used by a well-known Chinese-American jock cum "scientist" who is definitely a spammer. Skookum1 (talk) 17:21, 10 December 2013 (UTC)
    • Gotta agree here. I have a link from the site in an article that is not spam and should pass RS. It was entered before the site was blacklisted. Bodybuilding doesn't have the same coverage as baseball or football and bodybuilding.com is a big player in the coverage. Not everything on the site passes RS of course (such as forums or user-generated content), but much of it would, and the blanket block makes sourcing a lightly covered sport even more difficult. Niteshift36 (talk) 20:25, 10 December 2013 (UTC)
    Thanks for the comment; I was looking at the global meta-blacklist procedure and it's torturing my brain, was going to see about a whitelisting, but it's not just about this site but others of the same kind e.g. elitefitness.com. Here's a copy paste of the tags that triggered off the blacklisting, I'm wondering how many other bodybuilding sites have been thrown out because of the seo-driven bathwater: "Triggered by \b(easy)?(hairgrowth|bodybuilding(?!-magazin)|weightloss?|mafiawar|sixpackabs)(secret)?\b on the global blacklist". Hairgrowth? Six pack abs? Weight loss? Why not just block particular sites that are KNOWN spammers like the whatsisface with the "secret" from China (diet, diet, diet)? Is "bodybuilding" a blacklisted term? I don't understand the syntax of that.... Skookum1 (talk) 21:41, 10 December 2013 (UTC)
    • "lightly-covered sport" is definitely true re Sports Illustrated and TSN/ESPN et al, but the truth is that the bodybuilding publishing industry is one of North America's largest markets; Muscle & Fitness, FLEX, Men's Health et al. are massive in earnings and circulation...... websites like bodybuilding.com break from the corporate norm ("Weider Inc.") so excluding them implicitly means giving a boost to the use of the bodybuilding mainstream media (Weider) and also its ties to the IFBB/NPC....much as banning political blogs means that the biases of the mainstream media are considered "reliable" while sites carrying correct and unbiased information are often not allowed...... this is trivial by comparison but I think you see the issue; bodybuilding.com and certain others are held in higher regard than the editorial content of the major print publications and their websites, which really are glorified catalogues for particular product lines.....many bodybuilding websites are independent of any product line....(many are definitely front-page for particular lines, however).Skookum1 (talk) 21:46, 10 December 2013 (UTC)[reply]

    Please. We have very established and notable companies who do however either push their links themselves, or who hire SEO companies to improve their Search Engine results who then choose to use also Wikipedia in their tactics. Do not assume that because it is such a good site, that such a site would not engage in those tactics. Spam is not just porn, viagra etc. It is not even that the site needs to sell something, it may boil down to 'getting known' or even down to 'need more traffic to our site' - SEO is a booming business.

    That being said, it does appear as if 'bodybuilding.com' got caught up in a wrong regex. I can not find any discussion regarding this domain. I'll have a look at the meta-blacklist and see whether we can adapt the rule or similar. However, maybe it is easier to just plainly whitelist it here (sigh, why do we not have a global whitelist ..). --Dirk Beetstra T C 05:51, 11 December 2013 (UTC)[reply]

    And Skookum1, please WP:AGF, your comments ('What I'm smelling here is bit of knee-jerkery and it looks like it was black-listed simply for containing phrases used by a well-known Chinese-American jock cum "scientist" who is definitely a spammer.') are insinuating a form of abuse of process for which you do not have any evidence - we are all just volunteers here, we do our best to keep Wikipedia free of spam, but do accept that occasional mistakes are made as well. Also Niteshift36 - a 'blanket block' is hardly ever the intention. --Dirk Beetstra T C 06:00, 11 December 2013 (UTC)[reply]
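    For anyone puzzled by the syntax of the global rule quoted above, here is a rough Python sketch of how it behaves. The rule is copied from the "Triggered by" message; the function name and the sample URLs are illustrative only, and case-insensitive matching is an assumption.

```python
import re

# Rule as quoted in the "Triggered by ..." message above.
# Case-insensitive matching is an assumption made for this sketch.
RULE = re.compile(
    r"\b(easy)?(hairgrowth|bodybuilding(?!-magazin)|weightloss?|mafiawar|sixpackabs)(secret)?\b",
    re.IGNORECASE,
)


def triggers(url: str) -> bool:
    """Return True if the quoted rule matches anywhere in the URL."""
    return RULE.search(url) is not None


if __name__ == "__main__":
    samples = (
        "http://www.bodybuilding.com/fun/bio.html",  # bare term between boundaries
        "http://bodybuilding-magazin.de/",           # excluded by (?!-magazin)
        "http://easyweightlosssecret.example/",      # optional prefix and suffix
        "http://mybodybuilding.example/",            # no word boundary before the term
        "http://www.elitefitness.com/",              # none of the listed terms
    )
    for url in samples:
        print(url, triggers(url))
```

    The optional "(easy)?" and "(secret)?" groups mean the bare term "bodybuilding" is enough on its own whenever it sits between word boundaries, which is consistent with bodybuilding.com being caught without ever having been listed by name.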

    Completed Proposed removals

    Troubleshooting and problems

    • There's a probable false positive reported here - the blacklist for exeter.co.uk is matching hms-exeter.co.uk. Andrew Gray (talk) 16:08, 8 December 2013 (UTC)
    • Another probable false positive here. The blacklist for dot.us is matching dot-dot-dot.us. Pburka (talk) 00:02, 9 December 2013 (UTC)
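    Both reports fit the way "\b" treats hyphens. The actual blacklist entries are not quoted in the reports, so the sketch below assumes they have the same \b...\b shape as the eHow entry discussed earlier; the "strict" variant is a hypothetical alternative for illustration, not a fix that anyone here has applied.

```python
import re


def loose_entry(domain: str) -> re.Pattern:
    """Assumed shape of the existing entries: domain between word boundaries."""
    return re.compile(r"\b" + re.escape(domain) + r"\b")


def strict_entry(domain: str) -> re.Pattern:
    """Hypothetical variant: also refuse a match right after a hyphen."""
    return re.compile(r"(?<![\w-])" + re.escape(domain) + r"\b")


if __name__ == "__main__":
    cases = (
        ("exeter.co.uk", "http://hms-exeter.co.uk/"),
        ("exeter.co.uk", "http://www.exeter.co.uk/"),
        ("dot.us", "http://dot-dot-dot.us/"),
        ("dot.us", "http://dot.us/"),
    )
    for domain, url in cases:
        loose = loose_entry(domain).search(url) is not None
        strict = strict_entry(domain).search(url) is not None
        print(f"{domain!r} vs {url}: loose={loose} strict={strict}")
```

    A hyphen is not a word character, so "\b" sees a boundary between "-" and the first letter of the domain; that is why the loose form fires on hms-exeter.co.uk and dot-dot-dot.us.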

    Logging / COIBot Instr

    Blacklist logging

    Full instructions for admins


    Quick reference

    For Spam reports or requests originating from this page, use template {{/request|0#section_name}}

    • {{/request|213416274#Section_name}}
    • Insert the oldid 213416274, a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.

    For Spam reports or requests originating from Wikipedia_talk:WikiProject_Spam use template {{WPSPAM|0#section_name}}

    • {{WPSPAM|182725895#Section_name}}
    • Insert the oldid 182725895, a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.
    Note: If you do not log your entries, they may be removed if someone appeals an entry and no valid reason for it can be found.
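    As a small illustration of the quick reference above, the log reference can be assembled from an oldid and a section title. The helper name and its parameters are hypothetical; only the two template forms and the underscore convention come from the instructions.

```python
def log_reference(oldid: int, section: str, source: str = "/request") -> str:
    """Build a log reference such as {{/request|213416274#Section_name}}.

    `source` is "/request" for reports from this page or "WPSPAM" for
    reports from WikiProject Spam. Spaces in the section title become
    underscores, as the quick reference asks.
    """
    anchor = section.strip().replace(" ", "_")
    return "{{" + f"{source}|{oldid}#{anchor}" + "}}"


if __name__ == "__main__":
    print(log_reference(213416274, "Section name"))
    print(log_reference(182725895, "Section name", source="WPSPAM"))
```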

    Addition to the COIBot reports

    The lower list in the COIBot reports now has four numbers between brackets after each link (e.g. "www.example.com (0, 0, 0, 0)"):

    1. first number: how many links this user added (the same after each link)
    2. second number: how many times this link was added to Wikipedia (as far back as the linkwatcher database goes)
    3. third number: how many times this user added this link
    4. fourth number: to how many different wikis this user added this link.

    If the third or the fourth number is high relative to the first or the second, the user has at least a preference for using that link. Be careful with other statistics drawn from these numbers (e.g. a good user who adds a lot of links). If there are more statistics that would be useful, please notify me, and I will see whether I can get the info out of the database and report it. This data is available in real time on IRC.
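    The comparison described above can be sketched in Python as follows. The entry format is taken from the "www.example.com (0, 0, 0, 0)" example; the field names, the helper functions and the sample figures are invented for illustration, and no threshold for "high" is implied.

```python
import re
from typing import NamedTuple


class LinkStats(NamedTuple):
    link: str
    user_total: int  # first number: links added by this user
    link_total: int  # second number: times this link was added overall
    user_link: int   # third number: times this user added this link
    wikis: int       # fourth number: wikis where this user added it


# Matches entries shaped like "www.example.com (0, 0, 0, 0)".
ENTRY = re.compile(r"^(\S+) \((\d+), (\d+), (\d+), (\d+)\)$")


def parse_entry(text: str) -> LinkStats:
    """Parse one report entry; raise ValueError if it has another shape."""
    match = ENTRY.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised entry: {text!r}")
    link, *numbers = match.groups()
    return LinkStats(link, *(int(n) for n in numbers))


def preference_ratios(stats: LinkStats) -> tuple[float, float]:
    """Share of the user's additions, and of the link's additions, that
    are this user adding this link (0.0 when the denominator is zero)."""
    by_user = stats.user_link / stats.user_total if stats.user_total else 0.0
    by_link = stats.user_link / stats.link_total if stats.link_total else 0.0
    return by_user, by_link


if __name__ == "__main__":
    stats = parse_entry("www.example.com (10, 4, 4, 2)")  # made-up figures
    print(stats, preference_ratios(stats))
```

    In the made-up entry, every recorded addition of the link came from this one user, which is the kind of pattern the third number is meant to expose; as noted above, a prolific good-faith editor can produce similar figures.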

    Poking COIBot

    When adding {{LinkSummary}}, {{UserSummary}} and/or {{IPSummary}} templates to WT:WPSPAM, WT:SBL, WT:SWL and User:COIBot/Poke (the latter for privileged editors) COIBot will generate linkreports for the domains, and userreports for users and IPs.


    Discussion