
MediaWiki talk:Spam-blacklist: Difference between revisions

From Wikipedia, the free encyclopedia
Mishandling of old links.
Line 341: Line 341:
My tests were based on a report at [[Wikipedia:Teahouse/Questions#I can't figure out what link is blacklisted?]] [[User:PrimeHunter|PrimeHunter]] ([[User talk:PrimeHunter|talk]]) 20:51, 17 December 2013 (UTC)
:the admin visible log for blacklist-matches has the same problem .. Especially annoying for cases where redirects are used - what did they try to avoid? --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> 19:03, 5 January 2014 (UTC)

===cbronline.com===
* {{Link summary|cbronline.com}}
As of late 2013, "cbronline.com" is blacklisted on the English Wikipedia. "Computer Business Review Online" used to be a reasonable news source, but at some point it transitioned to "Your Tech Social Network" and went downhill. All the old URLs (the ones of the form "?guid=" followed by a long hex string) stopped working, but they can be recovered from the Internet Archive. New URLs have a different syntax. I suggest updating the regular expression on the blacklist so that it no longer matches URLs of the old form. They were legitimate links in many articles. In general, blacklisting links from years ago is a bad idea. It damages the encyclopedia. I'm trying to fix the mess Cyberbot II created at [[RegisterFly]] now. [[User:Nagle|John Nagle]] ([[User talk:Nagle|talk]]) 22:07, 5 February 2014 (UTC)
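A rough sketch of what such an exclusion could look like, written in Python rather than in the blacklist's own syntax. The rule below is hypothetical: it blocks cbronline.com unless the URL carries an old-style "?guid=" query. The minimum hex length, the allowance for hyphens, and both sample URLs are assumptions made for illustration, not taken from real links.

```python
import re

# Hypothetical rule: match cbronline.com unless a "guid=<long hex string>"
# query follows. The {16,} minimum length and the hyphen are assumptions.
CBR_RULE = re.compile(
    r"\bcbronline\.com\b(?!\S*[?&]guid=[0-9a-f-]{16,})",
    re.IGNORECASE,
)


def is_blocked(url: str) -> bool:
    """Return True if the hypothetical rule would block this URL."""
    return CBR_RULE.search(url) is not None


# Made-up sample URLs: one in the old "?guid=" form, one in a newer form.
old_style = "http://www.cbronline.com/article_news.asp?guid=0123456789ABCDEF0123456789ABCDEF"
new_style = "http://www.cbronline.com/news/some-new-style-story"

for url in (old_style, new_style):
    print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

A negative lookahead keeps this to a single blacklist-style entry; an alternative would be to leave the blacklist entry alone and whitelist the old URL form instead.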


=Logging / COIBot Instr =

Revision as of 22:07, 5 February 2014

    MediaWiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow-up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though it supports only whole domains and does not support whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions — the disruption that can be caused is substantial.
    5. Close the request entry on here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 594109892 after you have closed the request. See here for more info on logging.


    Proposed additions

    suzukicycles.org

    suzukicycles.org: Frequently cited, frequently plagiarized. Trove of copyrighted photos, books and text violates WP:COPYLINK and WP:SPS --Dennis Bratland (talk) 16:09, 17 September 2013 (UTC)

    Support blacklisting. Werieth (talk) 19:37, 27 September 2013 (UTC)[reply]


    Morning277 subjects

    These sites are being promoted by a publicity agency, banned from Wikipedia, which has been posting articles about them. After an article is deleted and the poster blocked, a new article with similar contents is posted from a different account, almost always under a different title. Since they keep using new accounts and new article titles, account blocking and page protection haven't been entirely effective.

    newyorkstay.com
    youtube.com
    justiceforall.com
    kulaw.com
    4cabling.com.au
    aasted.eu
    alsbridge.com
    awaionline.com
    bizible.com

    rybec

    Rybec, you didn't date your signature. Hard to tell when you added this. Are these domains still a problem? ~Amatulić (talk) 23:05, 2 January 2014 (UTC)[reply]
    This request was filed on 18 August 2013. -- SMS Talk 20:06, 4 January 2014 (UTC)[reply]

    sourcesecurity.com

    Spammers

    Long term, persistent spamming on many IPs and users - above is a partial list of IPs and accounts. Main spam URL is sourcesecurity.com, but thebigredguide and yogawizard show some overlap in accounts. - MrOllie (talk) 18:37, 30 August 2013 (UTC)[reply]


    Here's a specific suggestion:

    ebscohost\.com(\.|.*(pdfviewer|EbscoContent))     #Block 3 kinds of unusable EBSCOHOST links but allow permalinks: Match proxies: there's a literal "." after "com", and temporary session links, which contain pdfviewer or EbscoContent
    

    ( This is a consolidation of these two simpler regexes:

    ebscohost\.com.*pdfviewer          #Block unusable [[wp:EBSCOHOST]] links but allow permalinks
    ebscohost\.com\.                   #Match proxies, which is where it's not the end of the hostname - there's a literal "." after "com".
    

    )
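The claim that the consolidated rule behaves like the two simpler ones can be checked with a quick Python sketch. Assumptions: the rules are applied case-insensitively as a plain regex search, "SID" stands in for the elided session string, and the "session" URL is a made-up example of the ebscohost.com...pdfviewer form.

```python
import re

# The consolidated rule and the two simpler rules quoted above.
CONSOLIDATED = re.compile(
    r"ebscohost\.com(\.|.*(pdfviewer|EbscoContent))", re.IGNORECASE
)
SIMPLE_RULES = [
    re.compile(r"ebscohost\.com.*pdfviewer", re.IGNORECASE),
    re.compile(r"ebscohost\.com\.", re.IGNORECASE),
]

# Sample URLs; "SID" is a placeholder, not a real session identifier.
URLS = {
    "proxy": "http://0-web.ebscohost.com.sculib.scu.edu/ehost/pdfviewer/pdfviewer?sid=SID@sessionmgr13&vid=4&hid=13",
    "session": "http://web.ebscohost.com/ehost/pdfviewer/pdfviewer?sid=SID@sessionmgr13&vid=4&hid=13",
    "permalink": "http://search.ebscohost.com/login.aspx?direct=true&db=ulh&AN=37698669&site=ehost-live&scope=site",
}


def blocked_by_consolidated(url: str) -> bool:
    """True if the single consolidated rule matches the URL."""
    return CONSOLIDATED.search(url) is not None


def blocked_by_simple(url: str) -> bool:
    """True if either of the two simpler rules matches the URL."""
    return any(rule.search(url) for rule in SIMPLE_RULES)


for name, url in URLS.items():
    print(name, blocked_by_consolidated(url), blocked_by_simple(url))
```

On these samples the proxy and session links are matched by both variants and the permalink by neither. The one intended difference is that the consolidated rule also catches links containing EbscoContent, which the two simpler rules do not.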

    Wikipedia has many apparently dead-on-arrival links (like this) intended to point to PDFs, of the form ebscohost.com...pdfviewer...: all 7 of the 323 pages containing ebscohost and pdfviewer that I looked at had dead EBSCO links. These are NOT links that hit a paywall (like this). Rather, they bring up 404-like server error messages, and did from the day they were added; they're non-persistent URLs.

    A second problematic type of EBSCO link are proxied URLs, like the three added by a user's (sole ever) edit that are of the form hxxp://0-web.ebscohost.com.sculib.scu.edu/ehost/pdfviewer/pdfviewer?sid=[hex string]@sessionmgr13&vid=4&hid=13. (Note the bold portion!) These links work ONLY for subscribers that are ALSO at SCU. We shouldn't allow such links, and the blacklist (or a similarly functioning parallel system) would be a good solution.

    I've noticed that EBSCO staff has been heavily editing their own article. I solicited assistance, hoping they'd be available, willing, and able to help fix these links or suggest ways to deal with them systematically. note posted; no response. What EBSCOhost calls permalinks, like http://search.ebscohost.com/login.aspx?direct=true&db=ulh&AN=37698669&site=ehost-live&scope=site are acceptable, and so I've designed a regex that allows the permalinks but forbids the non-persistent URLs.

    Research suggests it's not possible to convert the non-persistent URLs to persistent URLs using the data in the former. --Elvey (talk) 21:26, 9 September 2013 (UTC)[reply]

    The second problem is the use of a proxied URL, i.e., the link points to an institution's proxy server such as sculib.scu.edu. This is not specific to ebscohost - it happens with links to other subscription databases too. A search for "ezproxy", for example, will bring up hundreds of such links. They are a bad thing. Nurg (talk) 08:39, 12 June 2013 (UTC) (reposted)
    I am tempted to see these sites as redirects, which will work or not depending on location. I would consider that these should typically be converted to direct links to the object (within educational institutions, one can generally use a web-proxy to get to literature - a direct link would either be the link on the server where the literature resides, or the DOI). <snip> Links through proxy servers have no place whatsoever. I am somewhat tempted to say that these need blanket blacklisting on meta, as they could possibly be abused to circumvent other blacklistings (for a relatively open proxy), and serve no function whatsoever to most readers except for the (few) ones that have access through the proxy - I doubt even if the url can be understood well enough to be able to figure out a real link from it. It is however going to be very obnoxious for the users that in good faith insert the proxy url they copy from their web-browser and then they can't save, and one could think of cases where it is appropriate (if information is only available to people who can pass the proxy and nowhere else in the world, it could still be a good reference for certain information - think of it as a book of which the single copy is in a nearly inaccessible library (the library in the Vatican); it is still verifiable by proxying through people who do have access to the library (ask the pope)).
    Note, that with creative regex rule-writing, we could blacklist the two 'bad' examples of Nurg (the non-persistent link and the institution proxies), still enabling good ones (the permalinks). --Dirk Beetstra T C 09:30, 12 June 2013 (UTC) (reposted, indented, and 1 sentence snipped)[reply]
    We use the blacklist to limit examiner.com links, because they generally fail RS, so I think it's appropriate that we add regexes for the impermanent URLs. (Arguably it would be better to have a similarly functioning parallel system with its own error messages handle sites like examiner.com and this ebsco problem, but in the meantime, I say let's put in regexes to handle them.) I also match the ebscohost proxy URLs, but not by matching on 'ezproxy', because some of the ebscohost proxy URLs don't contain 'ezproxy'. (It could be considered as part of a future proposed blacklist addition.) Beetstra (talk · contribs) suggested blanket blacklisting on meta be considered, but at meta, though I see these links on other sites - e.g 'fr.', I was told firmly, "Deal with it at the local wiki level." (Discussion at https://meta.wikimedia.org/w/index.php?title=Talk:Spam_blacklist&oldid=5798048#Unusable_EBSCOHOST_links.) --Elvey (talk) 21:26, 9 September 2013 (UTC)[reply]
    Well? Do we need to run a bot to remove all the extant links first, or is there more that is holding this addition back? --Elvey (talk) 01:58, 19 September 2013 (UTC)[reply]
    3000 links to improve in one go - seems like a good idea to me, yes!--Elvey (talk) 02:17, 26 September 2013 (UTC)[reply]
    Given the fiasco of the many editors pissed off by the actions of Cyberpower678 (talk · contribs)'s bot Cyberbot II (talk · contribs), a non-spam blacklist (see the big text above) is urgently needed. If one of Cyberpower678's bots is set up to handle entries on this list appropriately, it would be appropriate to add the EBSCOHOST regex to it, and move the examiner.com and petition regexes to it. --Elvey (talk) 09:13, 26 September 2013 (UTC)
    Many? Don't see that yet. Elvey, most of the links we block are blocked because they are/were spammed (examiner.com was spammed and is a spam-problem, for most of the petition sites, that is also true - it is a spam problem .. your remark regarding that is wrong), we do not block because we don't like links, or because they are unusable or because they are unreliable sources .. Nonetheless, your suggestion to have a second similar list might have merit, but that is a mediawiki developer problem that should be solved at the bugzilla level .. and I do not have much hope since we are waiting for several blacklist-related 'bugs' (improvements) for years already. --Dirk Beetstra T C 09:26, 26 September 2013 (UTC)[reply]
    Thanks, I see what you're saying. Here is a good example of the problem (of the list being used for reasons other than to block spam and bot Cyberbot II (talk · contribs) pissing off many users) : Luke (talk · contribs) is ADAMANT: "If something gets tagged as being on the spam blacklist, I will remove it, pure and simple." He's saying he's NOT going to examine the link, or attempt to repair or replace it. He's going to ASSUME the blacklist maintainers are making sure that the blacklist pretty much only blocks spam (like the spamhaus SBL and XBL maintainers do, if you're familiar with those lists).


    It's a problem that this blacklist is not a Spamhaus-caliber blacklist; it's more like some of the more aggressive blacklists that are willing to regularly include entries that can be expected to cause considerable collateral damage. And that wouldn't be so bad if this blacklist were not marketed/described as pretty much only blocking spam. I'm willing to bet that the typical editor who tries to add a legitimate reliable source that is blacklisted as collateral damage ends up not adding it, because we don't have multiple lists. The bot and the blacklist description pages are wrong to say the link Luke removed was spam; he was misled. The solution isn't to threaten to block everyone who does what he intends to do. It's to fix the list by splitting it. ASAP.


    We should still be blocking no-ip.com and examiner.com by default. Just not with this list. Roughly how many have I seen/do I think have expressed/do I think are upset because of the bot? Me? Around 6/?/60+, based on the 6. What about you? Is effecting a new list a developer problem at all? I would expect replicating the existing system and changing the names of a few things would be a relatively trivial task for the right person (an admin, not a developer), compared to a real development project, such as a significant enhancement. Many good examiner.com-type links were not blocked because the links were a spam problem, but rather because they might have been one. But yes, I see your point - the examiner.com domain was blocked because it is a spam problem, I stand corrected. (BTW, is there a working 'spam' definition for use here? I usually refer to the definitions like the ones spamhaus proffers, tweaked to apply to this medium. I guess I'll go look for that …)

    I remember the first time I tried to save an edit and couldn't, and for the longest time had no idea why Wikipedia wouldn't let me save the edits I'd made to an article, which included adding an examiner.com source to it. I'm knowledgeable about spam blacklists, but the error messages were so pathetic and inscrutable that they had me thoroughly confused. I have just gone through the same motions, which confirmed what I have seen others' comments suggest: the error messages shown to legit editors are still a pretty serious FAIL, though they are better than I remember. I recall they were worse than nothing, worse than useless. For no-ip.com links, the error message is still awfully misleading, as it describes my only options as:
    • If you feel the link is needed, you can:
      • Request that the entire website be allowed, that is, removed from the local or global spam blacklists (check both lists to see which one is affecting you).
      • Request that just the specific page be allowed, without unblocking the whole website, by asking on the spam whitelist talk page.
    This error message is unhelpful. The appropriate action is to request that the subdomain, rwservices.no-ip.info be whitelisted. It needs to be whitelisted. The error message is **misleading**.
    --Elvey (talk) 21:57, 26 September 2013 (UTC)[reply]
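To illustrate the subdomain case, here is a minimal Python sketch of a whitelist entry exempting one subdomain while the parent domain stays blocked. It assumes that a whitelist hit simply exempts the link before the blacklist is consulted; both rule lists are hypothetical stand-ins, not the real blacklist or whitelist entries.

```python
import re

# Hypothetical entries; the real blacklist and whitelist lines may differ.
BLACKLIST = [r"\bno-ip\.(?:com|info)\b"]
WHITELIST = [r"\brwservices\.no-ip\.info\b"]


def compile_rules(fragments):
    """Join regex fragments into one case-insensitive alternation."""
    return re.compile("|".join(f"(?:{frag})" for frag in fragments), re.IGNORECASE)


def is_blocked(url, blacklist=BLACKLIST, whitelist=WHITELIST):
    """True if the URL hits the blacklist and no whitelist entry exempts it."""
    if whitelist and compile_rules(whitelist).search(url):
        return False
    return compile_rules(blacklist).search(url) is not None


# Made-up sample URLs for the three cases.
for url in (
    "http://rwservices.no-ip.info/page",  # whitelisted subdomain
    "http://example.no-ip.info/",         # other subdomain, still blocked
    "http://example.org/",                # unrelated domain
):
    print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

An error message that named the matching rule and offered "whitelist this subdomain" as an option would point editors at this route directly.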
    PS: From current discussions: Surely you admit this listing wasn't because of a spam problem:

    \bjustjared\.buzznet\.com\b # Kanonkas # Gossip site/copyvio issues/speculation/not a reliable source used wrongly

    Perhaps Versageek (talk · contribs), creator of bot XLinkBot (talk · contribs) would be up to the task (of a non-spam blacklist system.) --Elvey (talk) 00:07, 23 October 2013 (UTC)[reply]

    soccerdatabase.eu


    Back in May 2013 this link was mass removed from Wikipedia because it was deemed to be a copyright violating mirror website, of the defunct 'playerhistory.com' website. As I understand it, the owner of 'playerhistory.com' is Polarman (talk · contribs) and he has been taking legal action against the owners of 'soccerdatabase.eu' for violating copyright. This website has no place on Wikipedia and should therefore be blocked. Note that a previous attempt to blacklist 'soccerdatabase.eu' fizzled out with no real decision either way. GiantSnowman 12:32, 29 October 2013 (UTC)[reply]

    • Support, this should be blocked per WP:LINKVIO and all current links removed. Liamdavies (talk) 14:03, 12 December 2013 (UTC)[reply]
    • Support, and please can someone with the proper userright deal with this long-standing request? Lukeno94 (tell Luke off here) 14:08, 12 December 2013 (UTC)[reply]
    • It looks like 'soccerdatabase.eu' is not reachable anymore. I don't know whether this is related to the legal action mentioned above and I also don't know whether the announced relaunch of 'playerhistory.com' will preserve the former player IDs. But as long as this is still a possible option I would rather prefer to keep the 'soccerdatabase.eu' links and reuse them for 'playerhistory.com'. This won't be possible if the data is deleted. Please first talk to Polarman (talk · contribs) about the current status before taking any decisions. --RonaldH (talk) 00:42, 22 December 2013 (UTC)[reply]
      • Blacklisting would have as a result that no new links could be added. One could easily then remove the old links, replacing them with the comment '<!-- playerhistory - player ID -->'. --Dirk Beetstra T C 06:27, 22 December 2013 (UTC)[reply]
        • ...which would be the ideal outcome. We could, over time, replace the existing links with proper sources while preventing any new links being added. GiantSnowman 21:42, 26 December 2013 (UTC)[reply]
          • I would really urge you to first talk to Polarman (talk · contribs) about the current status and the outlook of playerhistory.com. I don't like to see this request being based on obsolete information. There is no imminent danger as far as the addition of new links is concerned as soccerdatabase.eu is not properly working at this stage anymore. --RonaldH (talk) 22:09, 28 December 2013 (UTC)[reply]

    Editors continue to add this website (most in ignorance/good faith, while others are known vandals) - can we get some movement on this please? GiantSnowman 19:34, 24 January 2014 (UTC)[reply]

    jatland.com

    I see quite a few being added by Diptanshu.D, who appears to be a regular. Did you discuss this issue with them? --Dirk Beetstra T C 06:28, 3 January 2014 (UTC)
    I've left a note with them about general issues but they are far from being the only person who introduces this source. In fact, my experience is that most introductions come from anons. - Sitush (talk) 06:30, 3 January 2014 (UTC)[reply]
    (disclaimer: the db just started up, I have no records yet of anons). Are these reference-additions by anons or plain links (makes a difference for XLinkBot). --Dirk Beetstra T C 07:00, 3 January 2014 (UTC)[reply]
    They appear both in external links and as citations; in both cases, they are quite often barelinks. Is that what you mean? Quite often, although seemingly not in the latest batch, someone just copies an entire article from jatland to create a new one here on WP & then adds the jatland link. - Sitush (talk) 07:29, 3 January 2014 (UTC)[reply]
    I've revertlisted it (so XLinkBot reverts external link additions, not references; of course only for not-autoconfirmed and IP editors). Let's see what happens with that. If it is unreliable, used for copyvio, and somewhat 'pushed', then blacklisting may be an option. However, I'd like to hear Diptanshu.D's take on it as they have used it quite extensively. 'On hold' (do we have a template for that?) for a couple of days to allow for that discussion (unless the spamming by IPs goes on too much). --Dirk Beetstra T C 08:07, 3 January 2014 (UTC)
    Thanks. I hope that @Diptanshu.D: can shed some light. (There is {{On hold}} but I cannot recall ever seeing it used - perhaps appears at DYK and on GA reviews?) - Sitush (talk) 00:23, 4 January 2014 (UTC)[reply]

    ratedsupplements.co.uk


    This link has been repeatedly added to multiple articles by multiple IPs and users. Deli nk (talk) 13:04, 30 December 2013 (UTC)

    www.ilawyermarketing.com

    The website above claims specialization in SEO, and produces promotional materials for legal firms, disguised as "infographics". At least one IP editor (likely more) adds these as citations in loosely related Wikipedia articles, occasionally adding text and occasionally replacing valid citations. The site above does not appear to be spamming directly, but the sites I am nominating are hosting materials that it produced. All of the links below were added by 76.88.84.30 (talk • contribs).

    Added to several bicycle-related articles: [1] [2] [3] [4] [5]
    Added to sport concussion related articles: [6] [7]
    Added to articles on holiday lighting and Christmas trees: [8] [9]
    Added to Labor Day: [10] - this "infographic" was created by a different SEO firm, but reasons for listing are the same.
    Added to articles on employment law: [11] [12]

    Ivanvector (talk) 21:53, 15 January 2014 (UTC)[reply]

    Also:
    That said, I cannot find anything apart from that one IP. I've blocked it for 2 months. MER-C 05:32, 18 January 2014 (UTC)[reply]

    lisakellysite.com


    I just added this. It may be a temporary measure, but the site or its DNS has been hacked and currently redirects to an MMF scam (see Ticket:2014020310001561). I blacklisted it to prevent good-faith users from trying to reinsert it, as this was, by all accounts, the correct official site of Lisa Kelly. Guy (Help!) 12:44, 3 February 2014 (UTC)

    r4rating.com


    is being spammed by

    Athul noble (talk · contribs)

    User has been injecting the r4rating.com website into a variety of articles. Site is not a reliable source, and appears to swallow content for regurgitation. For example on this page the site reprinted unattributed Wikipedia content about Indian actor/producer Mohanlal, then user Athul noble later went into the Wikipedia article on Mohanlal and added the plagiarist site as a reference. I attempted to make contact with the editor, but they ignored my first warning, and my invitation to comment at [[13]]. I've searched articles for instances of r4rating.com and reverted where appropriate. The spam is not overwhelming, but I figured I'd report it here since it is clearly spam, and less discriminating users might not notice this is not a legitimate content provider. Thanks. Cyphoidbomb (talk) 03:06, 5 February 2014 (UTC)[reply]

    Noted, but declined, seeing it's only that user (who is now blocked). MER-C 10:27, 5 February 2014 (UTC)
    @MER-C: Got it, thanks. I appreciate your time. :) Cyphoidbomb (talk) 16:34, 5 February 2014 (UTC)[reply]

    Completed Proposed additions

    sentuamsg.com

    These URLs are being spammed by 86.20.42.223 (talk • contribs) all over the place. Diffs: [14] [15] [16] [17] [18] [19] [20] [21] [22] [23] [24] [25] [26] [27] [28]. --benlisquareTCE 05:56, 15 September 2013 (UTC)

    qtrax.com

    qtrax.com: Spammed by multiple accounts: Special:Contributions/Lilamey2013, Special:Contributions/79.179.194.98, Special:Contributions/Putting_an_end_to_tyrant_editors, Special:Contributions/Sivan_qtrax, Special:Contributions/109.64.177.36.

    Apparently using bots now too according to [29]. Яehevkor 10:28, 10 November 2013 (UTC)[reply]

    No, this is incorrect. If you look at DVdm (talk), you will see that there isn't any current usage of bots, only a declaration of the desire to use a bot. Currently, there is a bot approved by Wikipedia that adds links to lyrics of notable songs and redirects readers to a website owned by CBS Interactive, called MetroLyrics. For some reason, this website has been approved and is spreading links all over Wikipedia. There isn't much difference between Qtrax and MetroLyrics, besides the fact that Qtrax also offers free streaming and downloads of the song. Yes, free. The whole model of Qtrax is based on providing LEGAL music for FREE to end users. In return, the artists get paid by Qtrax. Hence it is the only service in the world which is both free & legal.

    Our quest is eventually to fight piracy over the web. We offer everything for free, because this is what pirate sites offer. So if we want to fight piracy, we must offer music for free. If we would like to keep being legal, we must then have licenses with the music labels, and pay the artists for every song our users play or download on Qtrax. Which we happily do. Please help us fight piracy, don't stop us. — Preceding unsigned comment added by 79.183.0.181 (talk) 14:10, 13 November 2013 (UTC)[reply]

    Note - According to About us:

    "In addition, the partnership between advertising and Qtrax delivers great potential for monetization by brokering branded deals with consumer advertisers around the world. These natural partnerships are sure to yield generous revenues for artists and labels alike."

    - DVdm (talk) 10:56, 10 November 2013 (UTC)[reply]
    I fully support this being blacklisted. The IP has clearly stated its intent is to promote the website and artists, not to better the Wikipedia project. Wikipedia is not here to use as your source of free advertisement. Sergecross73 msg me 14:16, 13 November 2013 (UTC)[reply]
    Please look at Requests_for_approval/LyricsBot. There's obviously a common interest in adding lyrics to song pages, which has been validated by the Wikipedia community and the administrators who approved the MetroLyrics bot. Just like Qtrax, MetroLyrics (owned by CBS) is a commercial entity. Since Wikipedia is unbiased, an equal approach should be taken towards both parties; that is, since MetroLyrics was allowed to add external links, so should Qtrax. — Preceding unsigned comment added by Gil.qtrax (talkcontribs) 15:30, 13 November 2013 (UTC)
    Qtrax should go through whatever Metrolyrics did for approval then. Until then, you shouldn't be adding Qtrax links to articles, as you've already shown your interest is in your own website and promoting artists, not bettering the Encyclopedia. If you keep spamming the website, you'll be blocked. Sergecross73 msg me 15:38, 13 November 2013 (UTC)[reply]
    Wikipedia is not a free advertising platform for your new company. In the absence of documented community consensus for mass-adding of the links, if I see any other single purpose accounts adding the links, the account will be blocked and your site will be blacklisted. OhNoitsJamie Talk 15:44, 13 November 2013 (UTC)[reply]
    According to WP:Village_pump_(proposals)/Archive_97#Linking_lyrics_from_legal_providers there is a documented community consensus for mass-adding of exactly such links. There is no difference between a Qtrax link and a MetroLyrics link. Since this has already been discussed and supported in the community, and approved for automation (LyricsBot) - less than a year ago - there should be no problem with posting manual links until a Bot is approved. — Preceding unsigned comment added by Gil.qtrax (talkcontribs) 15:55, 13 November 2013 (UTC)[reply]
    Qtrax is a separate organization, so it needs separate consensus. It's as simple as that. You don't get to make the terms here. Follow protocol and the rules or get blocked. Sergecross73 msg me 16:04, 13 November 2013 (UTC)

    kagisotownship.co.za and westrand.org.za

    kagisotownship.co.za

    westrand.org.za

    Full details of extensive sockpuppetry related to the spamming (see recent examples at [30], [31], [32], [33], [34]) are contained in Wikipedia:Sockpuppet investigations/DjMlindos/Archive. Blacklisting the website addresses may be a more effective long-term solution. HelenOnline 09:37, 8 January 2014 (UTC)[reply]

    Added. Newest account: User:Likhwa Ndlovu. MER-C 05:50, 30 January 2014 (UTC)

    www.cardsagainsthumanity.us

    Multiple IPs have tried to change the official website on the Cards Against Humanity page from the correct .com official site, to a fraudulent third-party site.

    Caidh (talk) 04:12, 14 January 2014 (UTC)[reply]

    I've protected the page for a week. MER-C 12:21, 14 January 2014 (UTC)[reply]
    Nothing for a week. Declined; please report to WP:RFPP if spamming resumes only on that page, or here for multiple pages. MER-C 05:38, 30 January 2014 (UTC)

    subwaymetro.com

    Has been added as an external link to a large number of Wikipedia articles on subway systems (e.g. Osaka Municipal Subway added on 17 January here, Caracas Metro added on 26 January here to name just a couple), but the site's content is largely derived from Wikipedia articles and uses Wikimedia Commons images without attribution. I have removed links to the site from all (43) of the articles I could find. --DAJF (talk) 14:03, 26 January 2014 (UTC)[reply]

    Looks like it's just that one IP. Declined, warned spam4im. Bump me if the IP resumes spamming and I'll block it for some time. MER-C 07:35, 2 February 2014 (UTC)

    Proposed removals

    www.boarding-schools.findthebest.com

    I am editing some boarding school pages, and this site is very useful for comparisons of different metrics between different schools. (They even source their data). It seems the domain was blocked in June 2010 for spam from 96.56.136.42. However, that IP has not been active since then and I'd like to use the website as a reference now. Specifically the boarding school subdomain, but I see no reason why the entire site can't be unblocked, as it looks like it could be useful for a number of different categories. R0uge (talk) 21:39, 5 January 2014 (UTC)[reply]

    The IP could not spam anymore, so that likely stopped their contributions. However, that was three years ago, and I am considering this (it can always be re-added if the abuse did not stop). Are the subdomains maintained by the site owner itself, or by different groups of people? (I mean, maybe the boarding-schools subdomain is fine, but another may not be, in which case I would suggest selective whitelisting to see whether spamming is still an issue but also to keep the situation manageable.) How is the data maintained anyway? --Dirk Beetstra T C 04:25, 6 January 2014 (UTC)
    Well I don't know anything about how they work other than what's on the web, but it looks like their research team is monolithic rather than disparate contributors. No idea how the data is maintained, but it looks current - the page for Phillips Academy Andover (first link I clicked) says it was updated yesterday. (I can't link to these pages because they're on the blacklist, ironically enough. Might want to disable the blacklist for this page only.) R0uge (talk) 13:48, 6 January 2014 (UTC)[reply]
    If you leave off the http:// you can save here. One can then always paste the link to have a look if needed. Sometimes there is a reason not to follow the link, so indeed any form of linking is disabled.
    I'll have a look at the data, and the original origin of the spam if I have time later on (and no-one beats me to it). --Dirk Beetstra T C 14:00, 6 January 2014 (UTC)[reply]
    Any updates? R0uge (talk) 21:38, 13 January 2014 (UTC)[reply]

    www.cpu-galaxy.at

    While I don't prefer this one as a source for old microchip technical specifications, it was the only hit that got me certain numbers for the Intel 1103 memory chip. It does appear to be a genuine microchip museum site (contact info has email and phone number in an image, probably for spambot protection), so I suspect it got blacklisted because someone started posting a bunch of corrections/citations in a short period of time and got mistaken for spam or vandalism. Featherwinglove (talk) 06:01, 15 January 2014 (UTC)[reply]

    www.iwawaterwiki.org

    The site was apparently blacklisted in 2010 due to repeated external links by user Beddowve, who has since been banned indefinitely. The IWA Water Wiki is a useful external resource for many water related topics, and it would be useful to be able to link to it. --Tentotwo (talk) 13:48, 15 January 2014 (UTC)[reply]

    You do realize that wikis are by definition practically useless as a reference (except where it serves as a primary reference), and wikis are discouraged as external links (moreover, we are not writing a linkfarm here). Tempted to decline, and let this go through the whitelist for specific links for specific pages. I hope this explains. --Dirk Beetstra T C 13:54, 15 January 2014 (UTC)[reply]
    Thanks for the quick response. Yes, wikis are generally useless as references, but not as "further reading" resources providing more in-depth information on certain topics (for example, compare the Wikipedia section Water_purification#Coagulation_and_flocculation with the corresponding entry on the IWA Water Wiki, CoagulationandFlocculationinWaterandWastewaterTreatment). I guess going through the whitelisting process for specific pages would work as well, but there doesn't seem to be a reason to keep the site blacklisted, since the excessive linking was a one-off event and the offending user has been blocked. --Tentotwo (talk) 14:07, 15 January 2014 (UTC)

    kavkazcenter.com

    Kavkaz Center is the main media outlet for the insurgency operating in Russia's North Caucasus region. Although biased, it is a useful source of information for this obscure conflict, which has very limited English-language sources. Kavkaz Center is also used on numerous pages (including its own) from before the blacklisting occurred, and is a source for much of the English-language reporting that does occur on this conflict. Gazkthul (talk) 23:08, 19 January 2014 (UTC)

    thehamptons.com

    The site was put on the blacklist in 2009 because three editors continually added the site to the article, The Hamptons. Those editors (2 IPs and one registered user) have long since refrained from editing Wikipedia. I am requesting to remove this site from the blacklist because there are useful articles only present on the site. One example is an original film review of a 1998 Louis C.K. film (which I cannot even link to in this request). I cannot link to the article currently because of a conflict that was resolved long ago. -- Wikipedical (talk) 06:21, 29 January 2014 (UTC)[reply]

    Completed Proposed removals

    Troubleshooting and problems

    Incomplete message for petition url

    An attempt to save http://petition.com/example only gives me the message:

    • The following link has triggered a protection filter: petition

    Either that exact link, or a portion of it (typically the root domain name) is currently blocked.

    It appears MediaWiki:Spamprotectionmatch doesn't get the full url in $1. Maybe it has something to do with the petition entry not having a domain:
    \bpetition(?:online|s)?\b

    {{int:Spamprotectionmatch|petition}} produces the message I got:
    The following link has triggered a protection filter: petition
    Either that exact link, or a portion of it (typically the root domain name) is currently blocked.

    Solutions:

    • If the URL used is a URL shortener/redirect, please use the full URL in its place, for example, use youtube.com rather than youtu.be,
    • If the URL is a Google URL, please look to use the (full) original source, not the Google shortcut or its alternative.
    • Look to find an alternative URL that is considered authoritative.

    {{int:Spamprotectionmatch|http://petition.com/example}} produces what I expected to get: A message with "The following link has triggered a protection filter: http://petition.com/example". I can see it in preview but not save it without nowiki, because the produced interface message contains the blacklisted link.

    My tests were based on a report at Wikipedia:Teahouse/Questions#I can't figure out what link is blacklisted? PrimeHunter (talk) 20:51, 17 December 2013 (UTC)[reply]

    The admin-visible log for blacklist matches has the same problem. This is especially annoying in cases where redirects are used: what did they try to avoid? --Dirk Beetstra T C 19:03, 5 January 2014 (UTC)
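    To illustrate the report above, here is a minimal sketch in Python of what the quoted petition entry matches on its own. It assumes that only the text matched by the entry is passed to the message as $1; it does not reproduce the extension's actual code, and the helper name matched_fragment is made up for this example.

```python
import re
from typing import Optional

# Blacklist entry quoted in the report above. It names no domain,
# only a word bounded by \b on both sides.
ENTRY = r"\bpetition(?:online|s)?\b"


def matched_fragment(url: str) -> Optional[str]:
    """Return the part of the URL matched by the entry, or None (hypothetical helper)."""
    match = re.search(ENTRY, url, flags=re.IGNORECASE)
    return match.group(0) if match else None


# Only the matched word comes back, not the full URL.
print(matched_fragment("http://petition.com/example"))
# A word that merely contains "petition" is not matched, because of \b.
print(matched_fragment("http://competition.example/page"))
```

    If that assumption holds, it would explain why the message shows only "petition": the entry matches a fragment of the URL, and nothing in the pattern captures the rest of the address.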

    cbronline.com

    Currently, "cbronline.com" is blacklisted on the English Wikipedia as of late 2013. "Computer Business Review Online" used to be a reasonable news source, but at some point it transitioned to "Your Tech Social Network" and went downhill. All the old URLs stopped working (the ones with the form "?guid=" followed by a long hex string) but can be fixed from the Internet Archive. New URLs have a different syntax. I suggest updating the regular expression on the blacklist to exclude URLs of the old form. They were legitimate links in many articles. In general, blacklisting links from years ago is a bad idea. It damages the encyclopedia. I'm trying to fix the mess Cydebot II created at RegisterFly now. John Nagle (talk) 22:07, 5 February 2014 (UTC)[reply]
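    A rough sketch of the suggested change, in Python for illustration only. The actual blacklist entry for cbronline.com is not quoted in this thread, so the pattern below is hypothetical: it assumes old-style links carry "?guid=" followed by at least 16 hex digits (hyphens allowed), and it uses a negative lookahead to leave those links alone. The example URLs are invented.

```python
import re

# Hypothetical entry: match cbronline.com unless the rest of the URL
# carries an old-style "guid=" parameter with a long hex string.
PATTERN = re.compile(
    r"\bcbronline\.com\b(?!\S*[?&]guid=[0-9A-Fa-f-]{16,})",
    re.IGNORECASE,
)


def is_blocked(url: str) -> bool:
    """True if the hypothetical entry would match this URL."""
    return PATTERN.search(url) is not None


# Invented new-style URL: still blocked.
print(is_blocked("http://www.cbronline.com/news/some-new-story"))
# Invented old-style URL with ?guid=<hex>: exempted by the lookahead.
print(is_blocked(
    "http://www.cbronline.com/article_news.asp"
    "?guid=0123456789ABCDEF0123456789ABCDEF"
))
```

    Whether a lookahead like this is acceptable on the live blacklist, and what minimum length counts as a "long hex string", would need to be checked by an admin against real examples of the old links.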

    Logging / COIBot Instr

    Blacklist logging

    Full instructions for admins


    Quick reference

    For Spam reports or requests originating from this page, use template {{/request|0#section_name}}

    • {{/request|213416274#Section_name}}
    • Insert the oldid (213416274), a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.

    For Spam reports or requests originating from Wikipedia_talk:WikiProject_Spam use template {{WPSPAM|0#section_name}}

    • {{WPSPAM|182725895#Section_name}}
    • Insert the oldid (182725895), a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.
    Note: If you do not log your entries, they may be removed if someone appeals the entry and no valid reasons can be found.
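    The quick reference above can be sketched as a small Python helper that assembles the log tag from the template name, the oldid and the section name, replacing spaces with underscores. The function name log_tag is made up for this example; the oldid is the sample value from the quick reference.

```python
def log_tag(template: str, oldid: int, section: str) -> str:
    """Build a log tag such as {{/request|213416274#Section_name}} (hypothetical helper)."""
    # Spaces in the section name are written as underscores.
    anchor = section.strip().replace(" ", "_")
    return "{{" + f"{template}|{oldid}#{anchor}" + "}}"


# Request that originated on this page.
print(log_tag("/request", 213416274, "Section name"))
# Request that originated on Wikipedia_talk:WikiProject_Spam.
print(log_tag("WPSPAM", 182725895, "Section name"))
```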

    Addition to the COIBot reports

    The lower list in the COIBot reports now has four numbers in brackets after each link (e.g. "www.example.com (0, 0, 0, 0)"):

    1. first number: how many links this user added (the same after each link)
    2. second number: how many times this link was added to Wikipedia (as far as the linkwatcher database goes back)
    3. third number: how many times this user added this link
    4. fourth number: to how many different Wikipedias this user added this link.

    If the third or the fourth number is high relative to the first or the second, the user has at least a preference for using that link. Be careful about drawing other conclusions from these numbers (e.g. a good user who adds a lot of links). If more statistics would be useful, please notify me, and I will see whether I can get the info out of the database and report it. This data is available in real time on IRC.
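    As a reading aid, the four numbers can be parsed and compared with a short Python sketch. The field names and the two ratios are assumptions chosen for illustration; COIBot itself does not necessarily compute anything like this, and no threshold is implied.

```python
import re
from typing import NamedTuple, Optional, Tuple


class LinkStats(NamedTuple):
    link: str
    user_total: int  # 1st: links added by this user overall
    link_total: int  # 2nd: times this link was added by anyone
    user_link: int   # 3rd: times this user added this link
    wikis: int       # 4th: number of wikis where this user added it


# Matches entries shaped like "www.example.com (0, 0, 0, 0)".
_ENTRY = re.compile(r"^\s*(\S+)\s*\((\d+),\s*(\d+),\s*(\d+),\s*(\d+)\)\s*$")


def parse_entry(text: str) -> Optional[LinkStats]:
    """Parse one report entry, or return None if it has another shape."""
    m = _ENTRY.match(text)
    if m is None:
        return None
    return LinkStats(m.group(1), *(int(g) for g in m.groups()[1:]))


def preference_ratios(s: LinkStats) -> Tuple[float, float]:
    """Third number relative to the first and to the second (illustrative only)."""
    share_of_user = s.user_link / s.user_total if s.user_total else 0.0
    share_of_link = s.user_link / s.link_total if s.link_total else 0.0
    return share_of_user, share_of_link


stats = parse_entry("www.example.com (10, 4, 4, 2)")
if stats is not None:
    print(preference_ratios(stats))
```

    In this invented example the user accounts for every recorded addition of the link, which is the kind of pattern the paragraph above describes as a preference; as noted there, it is not proof of spamming by itself.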

    Poking COIBot

    When adding {{LinkSummary}}, {{UserSummary}} and/or {{IPSummary}} templates to WT:WPSPAM, WT:SBL, WT:SWL and User:COIBot/Poke (the latter for privileged editors) COIBot will generate linkreports for the domains, and userreports for users and IPs.


    Discussion

    COIBot / LiWa3

    I am busy slowly restarting COIBot and LiWa3 again. Both will operate from fresh tables (LiWa3 started yesterday, 29/12/2013; COIBot started today, 30/12/2013). As I am revamping some of the tables, and they need to be regenerated (e.g. the user auto-whitelist tables need to be filled, as does the blacklist data for all the monitored wikis), expect data to be off, and some functionality may not be operational yet. LiWa3 starts from an empty table, which also means that autodetection based on statistics will be skewed. I am unfortunately not able to resurrect the old data; that will need to be done by hand. Hopefully things will be normal again in a couple of days. --Dirk Beetstra T C 17:27, 30 December 2013 (UTC)