
MediaWiki talk:Spam-blacklist: Difference between revisions

From Wikipedia, the free encyclopedia
→‎Viejaiglesiacatolicaromanaritolatino Wordpress site: move section from bottom to here - rename section to more specific name, format a bit.
Line 756:
===Heads up===
I have asked for porting Erwin's tool (SBHandler) from meta to here: [[Wikipedia:Gadget/proposals#m:User:Erwin.2FSBHandler_-_Spam_blacklist_handler]]. --[[User:Beetstra|Dirk Beetstra]] <sup>[[User_Talk:Beetstra|<span style="color:#0000FF;">T</span>]] [[Special:Contributions/Beetstra|<span style="color:#0000FF;">C</span>]]</sup> 09:57, 25 April 2014 (UTC)

==Gayot.com==
*{{link summary|gayot.com}}

While editing [[VeeV]] I came across a link to a page on a site called novusvinum.com, which I traced and found now to redirect to a page at gayot.com. [[Gayot]] is a reputable food and drink review website, and the page verifies the naming of VeeV as a "top 10 spirit" in 2010. When I tried to update the link in the article, I found that gayot.com is on the blacklist. The reason given for its inclusion is that it was being abused by multiple SPAs back in March 2011. Can we try removing the site now, three years later, as it is a legitimate source, and the abusers may by now have gone on to other projects? [[User:Largoplazo|—Largo Plazo]] ([[User talk:Largoplazo|talk]]) 11:21, 25 April 2014 (UTC)

Revision as of 11:21, 25 April 2014

    MediaWiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow-up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though it only supports whole domains and does not support whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions — the disruption that can be caused is substantial.
    5. Close the request entry on here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 605734691 after you have closed the request. See here for more info on logging.
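
    The caution about regular expressions in step 4 can be illustrated with the entry quoted in the st-seraphim.com report further down this page. The sketch below is only an illustration in Python: it applies the pattern directly to a URL and does not reproduce how the blacklist extension actually wraps or evaluates entries. The STRICT pattern and the www.seraphim.com URL are hypothetical and are not taken from the real blacklist.

```python
import re

# Entry quoted in the st-seraphim.com report on this page.
BROAD = re.compile(r"\bseraphim\.com\b")

# Hypothetical stricter variant (an assumption, not an actual blacklist entry):
# the lookbehind rejects a match that is preceded by a word character or a hyphen.
STRICT = re.compile(r"(?<![\w-])seraphim\.com\b")


def is_blocked(url: str, pattern: re.Pattern) -> bool:
    """Return True if the pattern matches anywhere in the URL."""
    return pattern.search(url) is not None


if __name__ == "__main__":
    # The first URL is a made-up example of the intended target.
    for url in ("http://www.seraphim.com/", "http://www.st-seraphim.com/harry.htm"):
        print(url, is_blocked(url, BROAD), is_blocked(url, STRICT))
```

    Because a hyphen is not a word character, \b finds a word boundary between "st-" and "seraphim", so the broad entry also catches the hyphenated domain. Whether a stricter pattern or a whitelist entry is the better remedy is a judgment for the admin handling the request.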


    Proposed additions

    microsoft-cortana.com

    This website is repeatedly added to the Microsoft Cortana article but is not registered to Microsoft. It is registered to an ISP in Istanbul, Turkey. Curiously, one of the people who keep adding it, 92.44.220.141, is also from Istanbul, Turkey. Comodo Internet Security triggered a security alert while I was visiting this website.

    Conclusion: High possibility of malware website being advertised in Wikipedia.

    Best regards,
    Codename Lisa (talk) 10:33, 19 April 2014 (UTC)[reply]

    sourcesecurity.com

    Spammers

    Long term, persistent spamming on many IPs and users - above is a partial list of IPs and accounts. Main spam URL is sourcesecurity.com, but thebigredguide and yogawizard show some overlap in accounts. - MrOllie (talk) 18:37, 30 August 2013 (UTC)[reply]

    lisakellysite.com

    lisakellysite.com (link summary)

    I just added this, it may be a temporary measure, but the site or its DNS has been hacked and currently redirects to an MMF scam (see Ticket:2014020310001561). I blacklisted it to prevent good-faith users trying to reinsert it, as this was, by all accounts, the correct official site of Lisa Kelly. Guy (Help!) 12:44, 3 February 2014 (UTC)[reply]

    city-data.com

    Please provide evidence of spamming (i.e. a list of spammers). MER-C 13:01, 4 April 2014 (UTC)[reply]

    apptinder.com and tinderblackberry.org

    The first IP added apptinder to the Tinder(app) page, and when I removed it, re-added it. The second added tinderblackberry. Both are copycat sites that direct you to download something from somewhere that's not one of the legit tinder sites, from what I see. Starts at this revision: [1] And continued till here: [2] I'm not sure if I reported this right. I'm going to remove those links from there. 85.138.73.48 (talk) 22:49, 16 February 2014 (UTC)[reply]

    These continue to be added to the tinder article. Please block them. 85.138.73.48 (talk) 01:35, 23 February 2014 (UTC)[reply]

    ripoffreport.com

    I need to open by saying that Wikipedia is being used for search engine promotional link-spam for the article in question.

    As you may or may not be aware, the Ripoff Report (RoR) has (purportedly) degraded to an Internet extortion website whose currency is a high ranking on Google search. They also enjoy a loophole in the Communications Decency Act, 47 U.S.C. § 230(c), which allows unsubstantiated and fraudulent anonymous statements to remain and be perpetuated via search engines. According to court documents, RoR charges up to $12,000.00 for remediation of a single claim, regardless of whether the claim is fraudulent. IMHO the RoR is a WP:NOTRELIABLE anonymous blog. While it would be wonderful to get the entire domain blacklisted, I'm not here for that.

    The reference in question is an unsourced, anonymous (and likely self-published) written attack against the prosecutor and witnesses in the Tracey Ann Richter murder conviction and upheld appeal. In a bizarre twist of fate, one of the witnesses is the leading opponent of RoR and the CEO of Rexxfield Cyber Investigation Services: Roberts (Rexxfield CEO) was Tracey Ann Richter's second husband. The reference is a thinly veiled attempt to punish Roberts of Rexxfield for his activism regarding the loophole in the Communications Decency Act.

    If there is anything else you need from me, let me know. Thank you 009o9 (talk) 08:36, 18 February 2014 (UTC)[reply]

    I missed including the article where the reference is being persisted (Early, Iowa). If the editor in question was actually interested in Tracey Ann Richter's innocence, wouldn't they present some verbiage with the link? Instead of just pasting a raw link at the end of the article? (I also see that this entry has been improved with the Linksummary template - Thank you!)009o9 (talk) 06:55, 20 February 2014 (UTC)[reply]

    I've removed the offending links from the article Early, Iowa the archive can be found here: Archived Page 009o9 (talk) 18:40, 3 April 2014 (UTC)[reply]

    usmagazine.us

    All "source" the other's bullshit stories about celebrity deaths and the like. Caused some annoyance at Brian Bonsall and Wayne Knight today, pretty clear that's all they're good for. Trying to be sneaky by naming like actual rags. Internet people can and will be fooled. Best to preempt them. InedibleHulk (talk) 17:01, March 16, 2014 (UTC)

    data.cas-msds.com

    Spammers

    IP hoppers from China attempt to mass-add data.cas-msds.com links to articles on chemicals. Those pages (example) are very poorly composed and referenced (basically broken morphs of wikipedia and some chemistry database) and seem like gathering advert requests to be placed on top of the page. Materialscientist (talk) 06:25, 27 March 2014 (UTC)[reply]


    Persianfootball.com

    links
    users

    This editor, who is currently blocked for two weeks for other infractions, has been spamming this url on article talk pages.

    I asked the sysop who applied the most recent blocks what to do about this, and he suggested that one of the courses of action I might consider would be to ask for it to be added to the spam list.

    This editor has simply littered talk pages w/the url. He will not listen to others -- he reverted Walter Görlitz who pointed to NOTAFORUM in deleting this editor's addition. See, e.g., here and here and here and here and here and here and here and here and here.Epeefleche (talk) 04:55, 28 March 2014 (UTC)[reply]

    yourbrainonporn and such

    Spam links which do not pass w:WP:MEDRS and w:WP:MEDASSESS have been relentlessly added to articles about masturbation and pornography over the years. The matter has been debated at w:WP:RSN. They may seem to violate w:WP:SOAP. Tgeorgescu (talk) 00:20, 31 March 2014 (UTC)

    I would add another
    It is a blog written by a lawyer who pretends to be a sex expert, although she did not study medicine, nor psychology. She is the partner of the man who runs yourbrainonporn.com and a fellow anti-pornography crusader. Tgeorgescu (talk) 21:47, 1 April 2014 (UTC)[reply]
    The problem with these blogs is that to the wannabe they appear to contain hard science, and out of gullibility wannabes want to add such information to Wikipedia as if it were presented in a reliable source. We cannot do otherwise than assume good faith for those who add it for the first time and assume bad faith after lengthy explanations that the couple who writes these blogs is nowhere near expert opinion on any medical or psychological subject. You'd be amazed how many people come to believe the dopamine version of "masturbation makes you blind", i.e. they come to believe there is hard science behind sex/masturbation being an addiction similar to crack cocaine. The man who runs yourbrainonporn even got to speak at TEDx, but if you listen carefully to what he said there, he acknowledges it is all guessiology: there is no peer-reviewed evidence that he is correct (as DSM-5 explicitly states, and it has been published fairly recently). Tgeorgescu (talk) 12:34, 2 April 2014 (UTC)

    Viejaiglesiacatolicaromanaritolatino Wordpress site

    viejaiglesiacatolicaromanaritolatino.wordpress.com (link summary)

    Constantly being added to articles about Catholic churches and rites, either in the external links section or, more recently, at the very top of the article. Also added to category and category talk pages.

    For example, see contributions of:

    ... discospinster talk 21:38, 17 April 2014 (UTC)[reply]

    Sock-puppetry - Shelby sedan

    Links
    Users

    Sock-puppets talking about a nonexistent Shelby sedan, as well as Google search results about it, were adding those links. 212.103.189.106 (talk) 20:40, 19 April 2014 (UTC)[reply]

    alertwoo.com

    The IP hopping, all from Chetput, India, makes it difficult to communicate with the anon and it's clearly a SPAM link. I can't see this company ever offering a valuable RS either.

    I think that I have reverted them all for now. Walter Görlitz (talk) 20:38, 20 April 2014 (UTC)[reply]

    Completed Proposed additions

    Proposed removals

    pv-magazine.com

    This website is a secondary source which includes specific information on photovoltaic projects throughout the world. It was used legitimately in a number of photovoltaic power stations articles and is generally reliable. It was discussed in May 2011 for spamming.

    How can the site be useful: It is used as a reliable source. Often this website is the only English secondary source with specific information.

    Why it should not be blacklisted: It is useful as a source for many articles. Despite that the website was spammed, it is a valuable resource for myself and others who work with energy-related articles. When discussed in 2011, it was said that "If a non-COI editor makes a later request, it could be reconsidered". Accordingly I am making that request. Additional issues are that the blacklisting seems punitive, not preventive, and it was blacklisted without prior notification of the relevant WikiProject. Beagel (talk) 07:47, 6 April 2014 (UTC)

    The original thread regarding the spamming is Wikipedia_talk:WikiProject_Spam/2011_Archive_Mar_1#User:Paulzubrinich
    "Despite that the website was spammed .... the blacklisting seems punitive, not preventive". I respectfully disagree - it prevented the spamming (there were multiple accounts (some likely with a conflict of interest: User:Beckystuart - [3]; User:Paulzubrinich), they show an intention to spam ([4]), they mislead other editors ([5]), replace other sources with theirs, there is no reason to think that it should stop (in fact, User:MER-C noted the creation of new socks when old were blocked)), it does not punish anyone.
    Blacklisted without prior notifying relevant WikiProject - that is at best a good consideration, but is not, has never been, and should never be a compulsory part of blacklisting.
    Did you look whether the spamming actually is not still actively busy, so that we can safely say that blacklisting is not necessary anymore to prevent further spamming? Otherwise, I would consider to  Defer to Whitelist for the links that are needed. --Dirk Beetstra T C 17:37, 23 April 2014 (UTC)[reply]
    Given how long this site has been blacklisted, there are not so many websites left. However, it is very time-consuming to apply for whitelisting of each single link (as a rule, it takes weeks to get any reaction, and too often the reaction is advice to look for some other source). During the latest discussion about the different -technology.com sites there were several proposals on how to use bots and filters to make the process of blacklisting more transparent and to detect spammers more easily; however, it seems that there is no wish to change the current system. Beagel (talk) 18:02, 23 April 2014 (UTC)
    It is good that we do not have any deadlines, then.
    Beagel, WP:BRFA is right there for that task (I am not going to operate that bot, and I warn anyone who wants to take that task of familiarising themselves of what happened with Betacommand). I believe that it should not be compulsory, that is it - and you still seem to think that I am unwilling to notify wikiprojects of 'their' links being blacklisted, and that I don't (or didn't) make the analysis. I explained what and how I analysed it, and I still believe that most of these -technology.com links are not secondary sources, but simple re-reports of primary sources (in fact, the first addition of one of those sites that I encountered made by you (after the many by the spammers) was exactly that - a rewrite of the company report - in fact, I only believed it when I found the original, as they did not source where they actually got the information from).
    I don't think that this process is less transparent than WP:AN/I or WP:RS/N .. it is just that people don't care. --Dirk Beetstra T C 18:13, 23 April 2014 (UTC)[reply]
    So, no analysis of what the impact is on the work of other editors who are here to improve the Wikipedia and not for spamming. And no intention to see the situation from these editors' point of view. And it is a big difference whether the source is directly from the company (that is, a primary source) or re-written (not re-printed) by the webmedia source (a secondary source). Beagel (talk) 18:55, 23 April 2014 (UTC)
    Don't start that again, 'the work of other editors who are here to improve the Wikipedia and not for spamming' - you say you have all these pages on your watchlist (together with tens of others in the Wikiproject), still none cared about the spammers.
    No, it is not, it is re-written without pointing to the original source - it is only reliable when you see the original source and compare. This is not a reliable secondary source. There is nothing against primary sources, and this is a prime example why prime sources should sometimes be used over secondary sources. --Dirk Beetstra T C 19:21, 23 April 2014 (UTC)[reply]
    Please do not put words in my mouth. I never said I have all these pages on my watchlist. My watchlist includes only about 3,000 pages, including files, templates, projects etc. However, your accusation that "none cared about the spammers" is baseless. I can't speak on behalf of anybody else; however, if there has been spamming, as with vandalism etc., I have always dealt with it. For some reason, there has been no such large-scale spamming at the pages on my watchlist as you seem to imply. Therefore, my experience has been that blacklisting of certain websites has created more harm and extra work than any spammer I have dealt with. And this is said not only by me but has been said here by several other editors. So, maybe instead of denying the problem it would be better if we could together find a way to make the system less painful for ordinary editors without being less effective at fighting spammers. I personally suggested some potential solutions (I don't say that they are ready solutions or that there are no better solutions), but instead of dialogue and discussion you just rejected any change to the current system. Still, if the link is added not by a spammer but by an ordinary editor with a long edit history and no history of spamming, vandalizing, paid editing etc., it would be logical for the link to be whitelisted more or less automatically. Beagel (talk) 19:56, 23 April 2014 (UTC)
    Sorry, I did not mean it in that way, not as a personal complaint to you. These spammers were too smart for that, Beagel. Still, between the 7-or-so last accounts (some likely socks seen their timing!) there are about 250 link additions. Count that for all the sites there are a couple of hundreds of links (500? 700?), and the fact that this is known to be busy for 7 years (did I still miss accounts, I think I blacklisted after 4, found more later, and I did see a number of accounts with in total 4 or 5 edits where they add 3 or 4 links, is it a spammer or not? I also have a 'gap' where the spammers don't seem to have been active, that is too curious to be true) quite some of the links that are there were originally spammed (the one on Navy is spammed (and IMHO, inappropriate)). Even if you had ALL articles in one of the spammed subject areas on your watchlist (which no-one has, I do not have all 14.000 chemicals on my watchlist), you might have seen just a couple of edits over months - as I said, they are smart, they know we are looking (we caught them spamming earlier).
    I may have underestimated how often this link was used by regulars (the analysis if it should have been is elsewhere), but as I said - I see an (obviously incomplete) set of editors spamming, with hundreds of edits between them, and several hundreds of links there, and I found it easier to find spammers than regulars adding the site (it took me quite some time before I ran into the first case where you added the link, and as for the four you requested whitelisting for, I found that one replaceable as well (don't remember where, I left it)). Maybe the cleanup of at least the spammed links should have been more rigorous before blacklisting (and I still believe that quite some of the rest should go as well, there are better sources). Announcing it to the WikiProjects (which would probably be 30 or 40 in this case .. each for a 15-20 links on average) might have been an option, though I a) doubt much participation in cleaning (personal experience), and b) if you notify them on a regular basis of pending blacklisting that participation will even become lower - in the end I don't think it will have much of an effect, and c) the spamming would still go on and that would also need to be cleaned. The bot does a similar thing, it notifies people of 'questionable' links on pages on their watchlist. You say that you have 3000 pages on your watchlist, so a rough guess, that there would be 25 pages with now blacklisted links (tagged in one go, so all visible). Most of those 25 pages are likely watched by another 10 people, so you would on average have to evaluate 2 of them. I, as spam-fighter, would however first have to clean-out all the spammed links, and then evaluate (which is difficult for me, I am not a specialist in all these subjects) whether the others are replaceable, should go or should be whitelisted (in the meantime, I have to revert the ongoing spam).
    I am not rejecting any change - I am all for more participation. But having compulsory notifications to Wikiprojects is not a solution (but just for the compulsory part of it - WP:PHARM is going to kill me after the 3rd notification of a Taladafil spammer, do I REALLY have to notify them? - and some links are plain spam and should undoubtedly be blacklisted but do not 'belong' to one, or any WikiProject. As I said elsewhere, I know what happened with Betacommand, the idea is good but the practical application is running into problems which will make people yell at the bot operator that operates that bot). For me, the solution is to get more people aware of the page and get more people commenting, and helping. And I have asked for that on several occasions ...
    In most cases, requests for whitelisting for links added by regulars and requested by regulars go fairly automatic - though, and I have said that here before, besides that it was spammed, I have serious questions about the reliability (reliability is not the right word, it turns out that it is reliable, it is more that they are not independent determinations of the facts than really reliability - they are secondary, they appear therefore independent, but that is a wrong impression) and suitability of these links (and not only of the ones that were spammed). Examiner.com links are not automatically whitelisted if a regular requests it - we ask everybody to go the extra mile and show there is no better source for the same info. I think that that should happen here as well (but now it is not blacklisted no-one will care about the links that are there: they are fine because they are not blacklisted - and (likely after a bit of time) the spammers can carry on). --Dirk Beetstra T C 21:26, 23 April 2014 (UTC)[reply]

    energy-business-review.com

    This website is a secondary source which includes good, detailed information on different types of energy projects throughout the world. It was used legitimately in a number of energy-related articles and is generally reliable and, notwithstanding blacklisting, it is still in use in some articles. It was discussed in October 2009 with several other websites for spamming. I myself have never seen it spammed on Wikipedia, just used as a reference.

    How can the site be useful: It is used as a reliable source. Often this website is the only English secondary source with specific information.

    Why it should not be blacklisted: It may have been spammed with several other websites but it is most useful as a source for many articles. Despite that the website was spammed, it is a valuable resource for myself and others who work with energy-related articles. Additional issues are that the blacklisting seems punitive, not preventive, and it was blacklisted without prior notification of the relevant WikiProject. Beagel (talk) 07:47, 6 April 2014 (UTC)

    This site is part of the large scale CBROnline-spamming. This is mainly a site that re-reports company reports. There may be a few links left, most were removed, most are replaceable with the proper primary source (repeating the primary source does not make this a secondary source), whitelisting can handle the rest.
    "Despite that the website was spammed .... the blacklisting seems punitive, not preventive". I respectfully disagree - it prevented the spamming (as has been shown, it is still ongoing with numerous related accounts lately spamming sites of the same owner), it does not punish anyone.
    Blacklisted without prior notifying relevant WikiProject - that is at best a good consideration, but is not, has never been, and should never be a compulsory part of blacklisting.
    Did you look whether the spamming actually is not still actively busy, so that we can safely say that blacklisting is not necessary anymore to prevent further spamming? Otherwise, I would consider to  Defer to Whitelist for the links that are needed. --Dirk Beetstra T C 17:43, 23 April 2014 (UTC)[reply]
    Given how long this site has been blacklisted, there are not so many websites left. However, it is very time-consuming to apply for whitelisting of each single link (as a rule, it takes weeks to get any reaction, and too often the reaction is advice to look for some other source). As for CBROnline, it is a perfect example of a site too large and too important to blacklist so easily. And no, it is not only company reports. Also, if the site is blacklisted, that means you can't add this site, so how could one say it is still being spammed? During the latest discussion about the different -technology.com sites there were several proposals on how to use bots and filters to make the process of blacklisting more transparent and to detect spammers more easily; however, it seems that there is no wish to change the current system. Beagel (talk) 18:07, 23 April 2014 (UTC)
    Note, there is nothing against more people actually helping at the whitelist and blacklist .. they are after all community noticeboards and crosslinked from all of them. --Dirk Beetstra T C 18:15, 23 April 2014 (UTC)[reply]
    Regarding the question on how to see whether a site is still spammed - that is how I found the ongoing CBROnline.com-spam, because people are still trying to add and were still spamming links belonging to the company. I think that that is a compulsory analysis to be done before de-blacklisting is considered, as well as an analysis of the overall use of the link (we have 4 million pages, if we are talking thousands of pages in a subject-range, but only 10 which would be enhanced by a reference to this site, then whitelisting is a better solution). I believe still that whitelisting is a better solution for this site as well. --Dirk Beetstra T C 18:22, 23 April 2014 (UTC)[reply]
    Just curious - where's the logfile that shows the hits on the blacklist? Is it one of the edit filters? ~Amatulić (talk) 19:39, 23 April 2014 (UTC)[reply]
    User:Amatulic: see https://en.wikipedia.org/w/index.php?title=Special:Log/spamblacklist&limit=500&type=spamblacklist&user= <- admin only. I went through attempts to add cbronline.com, looking at the contributions of editors who hit the blacklist on it, found an IP that had such a hit, and as contributions only spam to -technology.com sites. COIBot helps you further. Digging further .... --Dirk Beetstra T C 19:47, 23 April 2014 (UTC)[reply]

    www.water-technology.net

    This website is a secondary source which includes good, detailed information on water supply, dam and hydropower projects throughout the world. It is used legitimately in 84 water, dam and hydropower plant articles and is generally reliable. It was blacklisted in March 2014 with several other websites for spamming. I myself have never seen it spammed on Wikipedia, just used as a reference.

    How can the site be useful: It is used as a reliable source. Often this website is the only English source with an article on a specific dam or hydropower project. Stubs get bigger and readers learn more.

    Why it should not be blacklisted: It may have been spammed with several other websites but it is most useful as a source for many articles. Despite that the website was spammed, it is a valuable resource for myself and others who start similar articles. I am dismayed that it was spammed but more so that it was blacklisted.--NortyNort (Holla) 20:08, 3 April 2014 (UTC)

    Frankly some discretion was called for. Banning a site which is legitimately linked by hundreds of wiki pages means imposing hundreds of hours of work on the Wikipedia community to either find alternate sources or to remove now-unsourced material. Requiring the community to waste that much time on busywork instead of spending it on more productive edits just because a site was spam promoted is unreasonable. Anti-spam work is supposed to reduce others' opportunity costs, not increase them. TheOtherEvilTwin (talk) 11:49, 6 April 2014 (UTC)[reply]

     On hold - not blacklisted anymore pending discussion. --Dirk Beetstra T C 18:24, 23 April 2014 (UTC)[reply]

    www.boarding-schools.findthebest.com

    I am editing some boarding school pages, and this site is very useful for comparisons of different metrics between different schools. (They even source their data). It seems the domain was blocked in June 2010 for spam from 96.56.136.42. However, that IP has not been active since then and I'd like to use the website as a reference now. Specifically the boarding school subdomain, but I see no reason why the entire site can't be unblocked, as it looks like it could be useful for a number of different categories. R0uge (talk) 21:39, 5 January 2014 (UTC)[reply]

    The IP could not spam anymore, so that likely stopped their contributions. However, that is three years ago, and I am considering this (it can always be re-added if the abuse did not stop ..). Are the subdomains maintained by the site owner itself, or by different groups of people (I mean, maybe boarding-schools subdomain is fine, but one other may not be - in which case I would suggest selective whitelisting to see whether spamming is still an issue but also to keep the situation manageable). How is the data maintained anyway? --Dirk Beetstra T C 04:25, 6 January 2014 (UTC)[reply]
    Well I don't know anything about how they work other than what's on the web, but it looks like their research team is monolithic rather than disparate contributors. No idea how the data is maintained, but it looks current - the page for Phillips Academy Andover (first link I clicked) says it was updated yesterday. (I can't link to these pages because they're on the blacklist, ironically enough. Might want to disable the blacklist for this page only.) R0uge (talk) 13:48, 6 January 2014 (UTC)[reply]
    If you leave off the http:// you can save here - one can then always paste the link to have a look if needed - sometimes there is a reason not to follow the link so indeed, any form of linking is disabled.
    I'll have a look at the data, and the original origin of the spam if I have time later on (and no-one beats me to it). --Dirk Beetstra T C 14:00, 6 January 2014 (UTC)[reply]
    Any updates? R0uge (talk) 21:38, 13 January 2014 (UTC)[reply]
    Any updates? R0uge (talk) 05:32, 17 February 2014 (UTC)[reply]
    Any updates? R0uge (talk) 08:58, 4 March 2014 (UTC)[reply]
    Reposting the same question multiple times is not going to hurry things up.
    You also haven't responded to Dirk Beetstra's suggestion to show us the links that interest you, without the http prefix.
    Finally, you haven't explained why you can't request whitelisting of one or two specific pages for the purpose of referencing in the article. I would oppose de-listing the root domain findthebest.com. ~Amatulić (talk) 20:30, 4 March 2014 (UTC)[reply]
    The particular page I am looking to reference is boarding-schools.findthebest.com (just that one link), specifically to use the list as sorted by Acceptance rate to show that Phillips Academy Andover is 'selective' as compared to its peers. I personally would only need that one page whitelisted, but it seems that the root domain contains much more useful information that could be useful for other editors to reference across wikipedia, and with the abuse from this domain occurring and ceasing so long ago, it seems that now would be a good time to open it up. R0uge (talk) 21:32, 30 March 2014 (UTC)[reply]
    I wouldn't be opposed to adding that subdomain to the whitelist. Even better would be to just whitelist the 'widget' version of that page, which is an actual URL path rather than a domain: boarding-schools.findthebest.com/w/kibHiSfKuAR ~Amatulić (talk) 17:01, 8 April 2014 (UTC)[reply]
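
    The difference between whitelisting a subdomain and whitelisting one URL path can be sketched as follows. This is a minimal illustration in Python, not the extension's own code: the two rule strings and the `is_blocked` helper are assumptions made for the example, and the real blacklist and whitelist entries may be written differently.

```python
import re

# Hypothetical entries; the real blacklist and whitelist lines may differ.
BLACKLIST = [r"\bfindthebest\.com\b"]
WHITELIST = [r"\bboarding-schools\.findthebest\.com/w/kibHiSfKuAR\b"]


def is_blocked(url: str) -> bool:
    """Return True if url hits a blacklist rule and no whitelist rule."""
    if any(re.search(p, url, re.IGNORECASE) for p in WHITELIST):
        return False
    return any(re.search(p, url, re.IGNORECASE) for p in BLACKLIST)
```

    With a path-level whitelist rule like this one, only the widget page would pass; every other page on the subdomain and on the root domain would stay blocked.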

    st-seraphim.com

    http://www.st-seraphim.com/harry.htm

    I presume that this is a false positive caused by \bseraphim\.com\b on the local blacklist. I have found the log listing, which is:

    \bseraphim\.com\b # # Ale_jrb # SGGH # see that needs to be blocked
    

    All the best, Rich Farmbrough, 17:21, 27 March 2014 (UTC).[reply]

    Does this one class as "too hard"? All the best, Rich Farmbrough, 01:12, 13 April 2014 (UTC).[reply]
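
    Why the logged rule would catch st-seraphim.com can be checked with a short sketch. It uses Python's re module rather than the PHP regex engine the extension runs on, so it only illustrates the word-boundary behaviour. The narrower rule is a hypothetical alternative, not a proposed blacklist entry.

```python
import re

url = "http://www.st-seraphim.com/harry.htm"

# The logged rule: "-" is a non-word character, so \b matches between "-" and "s".
logged_rule = re.compile(r"\bseraphim\.com\b", re.IGNORECASE)

# Hypothetical narrower rule: refuse a match when a letter, digit, "_" or "-"
# directly precedes "seraphim".
narrower_rule = re.compile(r"(?<![\w-])seraphim\.com\b", re.IGNORECASE)


def hits(rule, link):
    """Return True if the rule matches anywhere in the link."""
    return rule.search(link) is not None
```

    Under these assumptions the logged rule hits the st-seraphim.com link, while the narrower rule skips it and still hits www.seraphim.com.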

    infodriveindia.com

    Could this link please be removed from the blacklist? This website has trade data relating to India, and I would like to reference it in articles that mention multilateral trade involving India. I have not found any alternative online sources for this data via Google Search. The link used to be spammed in the external links section of articles until it was blacklisted in 2007. See also MediaWiki talk:Spam-blacklist/archives/November_2007#Infodriveindia.com (removal). I think that the possibility of re-listing on the blacklist is a sufficient disincentive to further spamming. --Joshua Issac (talk) 13:04, 31 March 2014 (UTC)[reply]

    hollywoodnorthreport.com

    This website no longer exists, but I would like to be able to use archived pages from it at the Wayback Machine.

    • "Explain how the link can be useful on Wikipedia."

    Looking at the archived version of the site, it seems to have qualified as a WP:RS during its existence. As for the specific link I'm currently dealing with: I came here from the article Atomic Betty, which has been tagged as having a link to this site in it. I would like to replace the link in question with one to the archived version of the page.

    • "Explain your reasoning why the blacklisting is not necessary anymore."

    It is extremely unlikely that there will be any reason to put the site on this list again as it no longer exists.

    Dogmaticeclectic (talk) 16:27, 2 April 2014 (UTC)[reply]

    Given the rationale above, and the fact that it comes from a trusted high-volume editor, I would not be opposed to removing the rule from the blacklist. Any objection from other administrators? ~Amatulić (talk) 22:56, 2 April 2014 (UTC)[reply]

    naval-technology.com

    • "Explain how the link can be useful on Wikipedia."

    naval-technology.com has considerable information on its namesake: naval technology. It currently has 490 links, the vast majority seem to be useful.

    • "Explain your reasoning why the blacklisting is not necessary anymore."

    There is no explanation as to why it was blacklisted: MediaWiki talk:Spam-blacklist/log#March 2014, so I don't understand why it was added. A considerable number of articles use this site for reference. Loss of access to this website would seem detrimental.

    Jim1138 (talk) 16:16, 3 April 2014 (UTC)[reply]

    I think you'll find that it's one of many sites discussed at #cbronline.com below. - David Biddulph (talk) 16:23, 3 April 2014 (UTC)[reply]

    Railway-Technology.com

    • "Explain how the link can be useful on Wikipedia."

    Railway-Technology.com, like naval-technology.com above, has considerable information on its namesake: railways (e.g. metros and subways and the like, as well as commuter rail), especially in regards to statistics relating to railways (e.g. route length, number of stations, etc.)

    I agree that this site is valuable to a number of articles with which I'm familiar and it does provide comprehensive and pertinent information to them, so I would gather that it has been used in other articles as well. Blacklisting would be detrimental to Wiki as a whole as well. Djflem (talk) 20:05, 3 April 2014 (UTC)[reply]
    Railway-Technology.com seems to me to be a perfectly legitimate online source of definitive information about railway technology. It is used by a large number of articles on the English Wikipedia, and including this site on the blacklist is causing a large number of articles to be defaced by a bot-created notice. Hallucegenia (talk) 23:33, 3 April 2014 (UTC)[reply]
    I would like to agree with what User:Hallucegenia and User:Djflem have said. This website as far as I can tell contains useful and relevant information and is used as a reference in many articles. I have added links to this website as references myself and I'm certainly not a spammer. I find it rather annoying that so many articles have been tagged by this spam bot! This has been raised at Wikipedia talk:WikiProject Trains#Blacklisted website. G-13114 (talk) 22:04, 4 April 2014 (UTC)[reply]

    I am puzzled by the events leading to this site being on the blacklist in the first place. Railway-technology.com and its sister, Future Rail (Railway-technology Dot com/Uploads/AboutUs/RailwayTechnology/mediapacksfr/about-us-fr.html), are digital magazines produced by Kable, an information research firm based in London with offices in Australia, Asia, Europe, North America and South America. This is a legitimate secondary source. If we think railway-technology.com should be on the blacklist, we should also put wired.com on the blacklist. They rank similarly in their respective industries. Z22 (talk) 02:21, 6 April 2014 (UTC)[reply]

    • "Explain your reasoning why the blacklisting is not necessary anymore."

    There is no explanation as to why it was blacklisted: MediaWiki talk:Spam-blacklist/log#March 2014, though I gather it was added in response to its use by some Wiki sockpuppets. A considerable number of articles also use this site for reference. Loss of access to this website would seem detrimental to Wiki as a whole as well.

    --IJBall (talk) 18:48, 3 April 2014 (UTC)[reply]

    On second thought, suspending this request, until the discussion downpage is hashed out. --IJBall (talk) 19:05, 3 April 2014 (UTC)[reply]

    Actually, it is necessary to remove this from the blacklist, and it should be removed as soon as we can. As mentioned, there was no good reason to put it on the blacklist in the first place. The site is used by many articles with real, legitimate content. Editors with good intentions may see the note left by the bot about the blacklist and may attempt to clean up articles by choosing less valuable or less reliable sources to replace those citations. This will be detrimental to many rail articles. If there have been problems with certain users who misused this site for something, please address those issues with those particular users. Don't just put this site on the blacklist and cause mayhem. Z22 (talk) 02:21, 6 April 2014 (UTC)[reply]

    I would second User:Z22. I think it's quite obvious that this website (and others similar) has been placed on the blacklist in error, and should be removed from it ASAP! Look at the reaction to this website's entry on the whitelist page. This is causing a lot of totally unnecessary disruption to hundreds of articles, and it should be resolved as soon as possible! G-13114 (talk) 02:37, 6 April 2014 (UTC)[reply]
    It is not added in error, it was spammed. --Dirk Beetstra T C 08:00, 6 April 2014 (UTC)[reply]

    aldservice.com

    Dear Editors, I wish for the following link to be delisted (removed from the blacklist): aldservice.com. I could not find a record of when and by whom it was blacklisted, but following an online chat with one of your representatives I found out that it was blacklisted in 2012 because it proved not useful for the pages it was linked to, one of which was FMECA. I hereby ask you to reconsider allowing this link to appear in Wikipedia. It is extremely useful for many individuals and companies in many different sectors who are in need of information about reliability and safety. This website is provided by a legitimate company that has a lot of highly valuable information that can benefit several important pages on Wikipedia. Please let me know if you have any questions. I will be happy to provide examples and additional information. Ayelet Saciuk Moved from MediaWiki talk:Spam-blacklist/archives/April 2014, originally posted by AyeletSaciuk (talk · contribs). -- SMS Talk 06:55, 20 April 2014 (UTC)[reply]

    This domain is not listed here, you will have to ask at m:Talk:Spam blacklist for delisting. It was added for this reason. MER-C 06:59, 20 April 2014 (UTC)[reply]

    eqi.org

    Dear Editors, I have no idea why this site is banned. Even if we don't add the links to the Emotional Intelligence main page, where the trouble initially started, then I think it could be useful on other pages, such as emotional abuse and emotional needs, and similar pages. The website is well written, although it may be idiosyncratic at times, it is mixed in with a lot of good advice and information.— Preceding unsigned comment added by 77.102.201.19 (talkcontribs)

     Not done We generally only consider requests from trusted, high-volume editors. Furthermore, I don't see how this would be of any use to the project, as it doesn't meet WP:RS. I see that it's also blacklisted on at least 6 other wikis, suggesting some serious canvassing. OhNoitsJamie Talk 16:52, 23 April 2014 (UTC)[reply]

    preqin.com

    This site appears wrongly banned. It is an established financial research institution with an entire Wikipedia page (https://en.wikipedia.org/wiki/Preqin) dedicated to it, and there appear to be no controversies or questions around the legitimacy of its financial research. I wanted to include a reference to one of its reports on investment funds and learned that the site is banned. It should be unbanned. Thanks for the consideration. Orthodox2014 (talk) 19:12, 24 April 2014 (UTC)[reply]

    I presume that you mean preqin.com (in the template) .. I have changed that. I'll have a look. --Dirk Beetstra T C 19:15, 24 April 2014 (UTC)[reply]
    No, the site was not wrongly banned, it was spammed (several editors who edited solely to get this linked, this is one example edit). I found quickly a likely COI editor operating in 2010 (the site was blacklisted in 2008).
    However, this is a long time ago, maybe it can be removed (with the hope that it does not continue). I do however note, that this is likely only going to be used on a couple of pages, and a couple of links, maybe whitelisting is a sufficient option. --Dirk Beetstra T C 19:23, 24 April 2014 (UTC)[reply]
    OI!
    I just noted that on Preqin, the external link is not spelled 'preqin.com', but 'prequin.com' - a plain redirect site.
    Looking further. --Dirk Beetstra T C 19:24, 24 April 2014 (UTC)[reply]
    The correct spelling (I just double checked) is preqin.com. They do financial analysis and research on investments. Orthodox2014 (talk) 20:05, 24 April 2014 (UTC)[reply]
    I have another suspect from 2011 using prequin.com. Nothing significant. I'll let another admin (or User:Hu12, who handled this in 2008) have a look and decide. --Dirk Beetstra T C 20:12, 24 April 2014 (UTC)[reply]
    The point that bugs me is the following. That editor of 2011 is pretty much an SPA, handful of edits (which makes me suspect that it is a spammer, but I am not naming because I don't know for sure, not enough edits). What my problem is, is that that editor is adding a piece of data with a reference using prequin.com. A random, good faith, editor (i.e., someone not involved with the site or being paid to use the site) who would find info on preqin.com (that is where you are when you read the data) would copy-paste the link from the address bar as a reference - 'preqin.com/blah', not 'prequin.com/blah' .. what is the chance that a random editor knows that they can use 'prequin.com' to not hit the blacklist? Or were they, 3 years after the blacklisting, still actively here to push their own stuff, knowingly using prequin.com so it would not hit the blacklist (or paying someone to do it for them)? If it is the latter, I am tempted to blacklist prequin.com as well, and ask people to go through the current links there to prune them, or whitelist the specific links that are needed, and also here suggest to  Defer to Whitelist for the specific link. --Dirk Beetstra T C 20:21, 24 April 2014 (UTC)[reply]
    Yes, I am going to advise whitelisting:
    Still active in 2013. Obvious COI. --Dirk Beetstra T C 20:30, 24 April 2014 (UTC)[reply]

    Completed Proposed removals

    Troubleshooting and problems

    Incomplete message for petition url

    An attempt to save http://petition.com/example only gives me the message:

    • The following link has triggered a protection filter: petition

    Either that exact link, or a portion of it (typically the root domain name) is currently blocked.

    It appears MediaWiki:Spamprotectionmatch doesn't get the full url in $1. Maybe it has something to do with the petition entry not having a domain:
    \bpetition(?:online|s)?\b

    {{int:Spamprotectionmatch|petition}} produces the message I got:
    The following link has triggered a protection filter: petition
    Either that exact link, or a portion of it (typically the root domain name) is currently blocked.

    Solutions:

    • If the URL used is a URL shortener/redirect, please use the full URL in its place, for example, use youtube.com rather than youtu.be,
    • If the URL is a Google URL, please look to use the (full) original source, not the Google shortcut or its alternative.
    • Look to find an alternative URL that is considered authoritative.

    {{int:Spamprotectionmatch|http://petition.com/example}} produces what I expected to get: A message with "The following link has triggered a protection filter: http://petition.com/example". I can see it in preview but not save it without nowiki, because the produced interface message contains the blacklisted link.

    My tests were based on a report at Wikipedia:Teahouse/Questions#I can't figure out what link is blacklisted? PrimeHunter (talk) 20:51, 17 December 2013 (UTC)[reply]
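
    One possible explanation for the short message, offered only as a guess, is that the interface message receives the text the rule matched rather than the whole URL. A minimal Python sketch of what the quoted rule matches in the example link (`matched_text` is a hypothetical helper, and Python's re module stands in for the extension's PHP regex engine):

```python
import re

# The rule quoted above; it names a word, not a domain.
rule = re.compile(r"\bpetition(?:online|s)?\b", re.IGNORECASE)


def matched_text(url):
    """Return the substring the rule matched, or None if nothing matched."""
    m = rule.search(url)
    return m.group(0) if m else None
```

    For http://petition.com/example the matched substring is just "petition", which is the same text the message showed.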

    the admin visible log for blacklist-matches has the same problem .. Especially annoying for cases where redirects are used - what did they try to avoid? --Dirk Beetstra T C 19:03, 5 January 2014 (UTC)[reply]
    I ran into a similar problem as well and I find this frankly very annoying. Why is the search string petition blocked at all? This seems to severely hinder any sourcing or discussion regarding petitions, which I find unacceptable.--Kmhkmh (talk) 09:57, 7 February 2014 (UTC)[reply]
    This should only block links with these terms in the domain name.
    Petitions are at best a primary source at the moment they are closed. That being said, if that information is notable, then a secondary source will have reported the same number. For the few that are needed, they can be whitelisted.
    For the rest, these are a prime example of violations of WP:SOAPBOX when they are still open. 'Click here to get Justin Bieber sent back to Canada' .. that is how petition links are often 'abused' (and some got regularly plainly spammed), and since they do not often serve a real use anyway, blacklisting those prevents this soapboxing. --Dirk Beetstra T C 17:58, 23 April 2014 (UTC)[reply]
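
    The idea of blocking only links with these terms in the domain name can be sketched by testing the rule against the host part of the URL alone. This is a Python illustration under that assumption, not a description of how the extension actually applies its rules. `host_is_blocked` is a hypothetical helper.

```python
import re
from urllib.parse import urlsplit

rule = re.compile(r"\bpetition(?:online|s)?\b", re.IGNORECASE)


def host_is_blocked(url):
    """Apply the rule to the host name only, ignoring path and query."""
    host = urlsplit(url).hostname or ""
    return rule.search(host) is not None
```

    In this sketch a link whose path merely contains the word "petition" is not blocked, while petition.com and petitiononline.com are.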

    cbronline.com

    Currently, "cbronline.com" is blacklisted on the English Wikipedia as of late 2013. "Computer Business Review Online" used to be a reasonable news source, but at some point it transitioned to "Your Tech Social Network" and went downhill. All the old URLs stopped working (the ones with the form "?guid=" followed by a long hex string) but can be fixed from the Internet Archive. New URLs have a different syntax. I suggest updating the regular expression on the blacklist to exclude URLs of the old form. They were legitimate links in many articles. In general, blacklisting links from years ago is a bad idea. It damages the encyclopedia. I'm trying to fix the mess Cyberbot II created at RegisterFly now. John Nagle (talk) 22:07, 5 February 2014 (UTC)[reply]
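
    A rule that skips the old-style links might look something like the sketch below. The exact shape of the old URLs is an assumption here ("?guid=" followed by at least eight hex digits, hyphens allowed), and the pattern is tested with Python's re module, not on the live blacklist.

```python
import re

# Assumption: old-style links carry "?guid=" followed by at least eight hex
# digits (hyphens allowed); the real format may differ.
rule = re.compile(
    r"\bcbronline\.com\b(?!\S*\?guid=[0-9a-f-]{8,})",
    re.IGNORECASE,
)


def is_blocked(url):
    """Return True unless the link is an old-style ?guid= URL."""
    return rule.search(url) is not None
```

    The negative lookahead lets any cbronline.com link through when a guid parameter follows, so it would also need checking that spammed links never used that form.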

    I just want to point out that Cyberbot II didn't create a mess; it did exactly what it was designed to do: flag links that are on the blacklist. This is something neither the bot nor the operator has control over. However, I totally agree this is something that needs to be sorted out to avoid old links being hit unnecessarily, especially when their content was entirely justifiable and useful at the time. Blethering Scot 23:56, 5 February 2014 (UTC)[reply]

    As a general reply to this. Look, it is not our fault that a company finds it necessary to optimize their search engine results, or to just generally make sure that their site gets promoted. I agree, sometimes sites are a reasonable source that is reasonably used, but if the amount of spam pushed by this company exceeds that level significantly (a whole long list of sites; a whole list of sock/meatpuppets, reports go back to 2009), then the spam blacklist is designed to just do what it should do: stop the spamming (it was not blacklisted in 2013, it was blacklisted in 2010).

    Per Blethering Scot - Cyberbot II has not created a mess - the mess is completely at the side of the spammers who were the editors responsible for getting the site blacklisted. --Dirk Beetstra T C 05:17, 11 February 2014 (UTC)[reply]

    • A couple of points I'd like to make here. Their guid URLs may be dead, but you can find the same article on their site if you just google its title. So you don't have to resort to the copyright-questionable copy on archive.org. CBR should have preserved redirects from the old links when they transitioned to the more google-friendly URLs, but apparently they did not care enough for old citations. (It's not the first publication to make this mistake, I can think of a few more, like The Register and so forth). I'm not sure how bad the spamming of CBR links was, but it may be worthwhile removing the blacklist entry and seeing if there are any issues presently. 2010 was a long time ago in wikitime. Anyway, the interim solution is to add specific (and fairly numerous) URLs to the whitelist. A second batch has been proposed by me (the first one was probably the one by User:Wbm1058/User:Qwertyus), and I have a 3rd in preparation given that User:Rilak wrote quite a few articles citing CBR at a time when Google Books didn't offer the full archives of Computerworld and Infoworld. Even with these better-known sources now available, occasionally CBR is useful for citing stuff not (easily) found in the other two. If and when EE Times puts all its archives online (via Google Books or themselves) we might have a better alternative for the chip-oriented stuff, but in the meantime, there isn't another online source for old chip topics that I know of. Someone not using his real name (talk) 19:03, 1 March 2014 (UTC)[reply]
    • By the way, I cannot find a discussion for when cbronline was added to the blacklist. The earliest discussion I can find about it is a complaint from 2011 (against its listing) MediaWiki_talk:Spam-blacklist/archives/September_2011#cbronline. Someone not using his real name (talk) 19:20, 1 March 2014 (UTC)[reply]
      • Ok, I found the original discussion for the addition: [6]. The number of *-business-review.com sites spammed was indeed impressive (all part of Progressive Media Group), but the actual number of links added to them was not that big and they were in conjunction with out-of-place content so the few spammer accounts were rather easy to spot. Only a small percentage of the links added by the spammer accounts was to cbronline (rather than the other sites). It's probably worth risking to de-blacklist just this one, although if you'd rather process all whitelist requests that Cyberbot II will trigger... Someone not using his real name (talk) 19:48, 1 March 2014 (UTC)[reply]
        • So what's the proposed solution? Expecting editors to manually fix articles from years ago won't work; many of those editors are long gone. (In fact, for a 'bot to complain about an editor action from the distant past is usually an indication of a bad bot.) John Nagle (talk) 21:39, 1 March 2014 (UTC)[reply]
          • The proposed solution is as usual, and as suggested in the template: evaluate and whitelist the links. You don't really need to fix the articles - if they are good references the articles should remain untouched. --Dirk Beetstra T C 05:01, 3 March 2014 (UTC)[reply]
          • Note, the bot is not 'complaining about an editor action from the distant past', the bot is signifying a problem that is an issue with the editing of the page: that there is a link that is caught by the blacklist on the page. It does not say that the addition was spam, just that having that blacklisted link there may result in problems with editing (as I have already found in personal experience - an issue with a blacklisted link which was not whitelisted prevented me from reverting vandalism). --Dirk Beetstra T C 10:58, 3 March 2014 (UTC)[reply]
    • I've removed the cbronline.com entry from the blacklist. After four years it's probably worth giving this another chance. If the spam problems resume, please feel free to add the site back to the blacklist. — Mr. Stradivarius ♪ talk ♪ 04:36, 3 March 2014 (UTC)[reply]
      Note: I've also removed a bunch of links to cbronline.com from the spam whitelist, so if you add cbronline.com back to the blacklist, please undo my edit to the whitelist too. — Mr. Stradivarius ♪ talk ♪ 04:43, 3 March 2014 (UTC)[reply]
    As a note, I strongly disagree with this removal. This was a link in a large, deliberate spam campaign of epic proportions. Experience has taught us (e.g. with a similar company: Wikipedia talk:WikiProject Spam/2013 Archive Nov 1#Agora Publishing spam on Wikipedia - 2, blacklisted >5 years ago) that they do not stop, they do continue, they often are still around. This is exactly what we have a whitelist for, and what the whitelist should resolve. I would suggest undoing these edits. Thanks. --Dirk Beetstra T C 04:58, 3 March 2014 (UTC)[reply]
    I've undone the de-whitelisting - having links on the whitelist that are not caught by a blacklist will not influence the loading of the page anyway, and there are quite some which have been 'properly vetted' for addition, we don't want to go through that again if the blacklisting is reverted. --Dirk Beetstra T C 05:04, 3 March 2014 (UTC)[reply]
    Still looking, but I just blocked a spammer who was spamming over the last >6 months '$$$$-technology.com'-website (that looks somewhat familiar, though it looks like it is from a different company than cbronline). However, this particular editor triggered the spam blacklist for an addition of cbronline.com .. --Dirk Beetstra T C 10:51, 3 March 2014 (UTC)[reply]

    Thank you, Mr. Stradivarius, in my opinion this removal is long overdue. I have put in hours and hours of my time working to support whitelisting efforts for reference-links to this site (which has been a significant distraction from other projects I've committed to work on). I am willing to put in similar time manually reverting spam-links to this site which might result from this action. Until the time I spend reverting spam-links greatly exceeds the time I've already spent working towards whitelisting, I won't be supporting a re-blacklist of this site. I am still unclear on the best methods for detecting spam-link additions that point to this site. Any advice on how to do that is appreciated. Can an edit filter be created that flags any edits adding the text "cbronline"? If anyone points me to unwanted spam from this site I will work to remove it. Wbm1058 (talk) 20:06, 3 March 2014 (UTC)[reply]
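
    What such a filter would need to detect can be sketched outside the wiki. The Python below is only an illustration of the idea (compare the links before and after an edit and report new ones containing "cbronline"). It is not edit filter syntax, and `URL_RE` and `added_links` are hypothetical names.

```python
import re

# Rough external-link pattern for wikitext; an assumption, not MediaWiki's own.
URL_RE = re.compile(r'https?://[^\s\]|<>"]+', re.IGNORECASE)


def added_links(old_text, new_text, needle="cbronline"):
    """Return links containing needle that are in new_text but not old_text."""
    old_links = set(URL_RE.findall(old_text))
    new_links = set(URL_RE.findall(new_text))
    return sorted(u for u in new_links - old_links if needle in u.lower())
```

    Links that were already on the page before the edit are ignored, so existing references would not be flagged again each time the page is saved.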

    OK, after consultation with another user, I am reverting this removal. They are still here. --Dirk Beetstra T C 07:40, 4 March 2014 (UTC)[reply]
    Undone de-blacklisting. Now for some more:
    This editor, over the last six months added:
    now (in collaboration with MER-C: see User_talk:MER-C:
    "airforce-technology.com is a product of Kable. Copyright 2014 Kable, a trading division of Kable Intelligence Limited"
    kable.co.uk:
    "©2013 Kable Business Intelligence Limited. John Carpenter House, 7 Carmelite Street, London, EC4Y 0BS"
    cbronline.com/about-us:
    "© CBR 2013 | Part of Progressive Digital Media Group Plc."
    progressivedigitalmedia.com:
    "Progressive Digital Media Group PLC © 2012 | John Carpenter House, 7 Carmelite Street, London, EC4Y 0BS"
    In other words, user:115.119.113.194 is spamming on behalf of kable.co.uk, a company that has exactly the same address as the company that owns CBR, and was trying to add cbronline.com as well. I am sorry, I am all for more help at the spam blacklist and spam whitelist - but removals (and additions) do need investigation into whether the situation did stop, and like with Agora, spamming pays their bills, they will continue as has once again been shown here. Currently, Wikimedia is discussing a change in the Terms and Conditions regarding Paid editing, and these two examples show blatantly why paid editing is an issue. --Dirk Beetstra T C 07:53, 4 March 2014 (UTC)[reply]
    @Beetstra: Thanks for adding the site back to the list. If the spammers are still active, then I of course agree that that's the right thing to do. I see that I have been guilty of Administrative Action While Clueless here - I didn't even know that we had a log of attempted link additions until about an hour ago. I was assuming that the situation was more or less equivalent to vandalism and page protection, i.e. that you have to unprotect for a little while to see if the vandalism continues, but now I see that I was wrong. And now I see that we have Wikipedia:Spam-blacklisting, which I somehow managed to miss when I read the top of the page last time. Sorry for acting out of process - I think I need to make a few requests before I attempt the admin side of things again. Speaking of which, there are some cbronline links at RS/6000 that will need to be whitelisted. I'll make that report when I have a spare moment, if no-one else gets to it first. — Mr. Stradivarius ♪ talk ♪ 09:38, 4 March 2014 (UTC)[reply]
    No worries - I am glad you are willing to participate and hopefully learn, there is an obvious and huge backlog on all sides. Note also that I am all for trying to remove old items and monitor them - unless the spamming is still known to be active. I am a bit worried that I did not see these earlier, I did look before whether CBRonline was still active in spamming, but totally missed these spammers (maybe they did not try cbronline.com itself earlier ..). Guess this will grow again into a huge list of to-be-blacklisted links. Hope to see you around. --Dirk Beetstra T C 13:36, 4 March 2014 (UTC)[reply]
    Well I find this all very frustrating. The cited IP has made all of 27 edits over a span of seven months, keeping such a low profile that even Dirk did not immediately notice their edits. This one made just 27 edits, but I know this editing activity is not likely limited to just a single IP. There may be dozens (hundreds?) (thousands?) of other IP editors with similar editing activity. So, 27 edits is not that many that we can't take a close look at all of them, or all of their edits that have not been deleted. As I'm not an admin, I don't know how many deleted edits that they may have made.
    • The first two "test" edits to Chandra Sekhar Yeleti, an Indian film director, changed a birth year and quickly reverted that change, which implies that the editor is likely Indian and perhaps based in India.
    • Next we get to their first "spam link" addition. An external link to Indianapolis International Airport. As external links go, this one doesn't seem that bad; it is on-topic, seems professionally written and potentially could be used as a reference. Nonetheless, it was reverted one minute later by User:XLinkBot for reasons that were posted to the editor's talk page. As I see that the code is maintained by User:Beetstra perhaps you can explain the logic that flagged this edit for reversion. The link seems to "contain neutral and accurate material that is relevant to an encyclopedic understanding of the subject", per WP:EL#YES 3.
    • The next edit was a similar external link addition, to Boeing X-51 (airforce-technology rather than airport-technology); again this was on-topic. For some unknown reason, this one was never reverted and is still on the article, along with other "spam" links to a couple of youtube videos, one from Fox News.
    • Next, a link on Fresno Yosemite International Airport, which, as with the Indy airport, was reverted a minute later by XLinkBot. Out of curiosity I added this link back to the article, to see if the bot would revert me, but, so far the link has been allowed to stay. It appears to me that this bot is implementing some sort of "back-door bot-enabled blacklisting of sites" and thus avoiding the scrutiny of a front-door blacklisting request.
    • The next edit, months later, was a good-faith, good edit to correct a link on Vizianagaram district—which supports my theory that this IP is India-based. Indeed, I just made a who-is lookup and found that the IP belongs to TATA Communications, based in Hyderabad. So their interest in editing articles about American airports is suspect.
    • Next we have an addition to List of countries by gold production, another on-topic link, this one to mining-technology; it's not been removed, and could serve as a reference for the article.
    • Then another mining-technology link on Mir mine. The link is still there, and this source directly contradicts the article, so the article should be scrutinized for accuracy. The article says "The Mir mine was permanently closed in 2011", while this external link says "The mine produced 497,000 tonnes of ore in 2012."
    • Next Wind farm links to power-technology. Again, an on-topic link which has not been reverted and seems to add value to the encyclopedia.
    OK, that's enough for now, I'll stop here. While {{unreferenced}} tags some 219,000 articles lacking sources and {{refimprove}} tags another 228,000 articles needing additional references, I wonder if it's counterproductive to discourage the addition of (potentially) useful links that other editors might cite. I disagree with the idea that these examples show blatantly why paid editing is an issue. I think we may have too broad a definition of spam, if these links are considered spam.
    Sorry, I don't see any log of attempted link additions. All I see is "Permission error". That's why I was asking about an edit filter for "cbronline". I think I would be allowed to see that. Why couldn't cbronline just get the "softer scrutiny" of XLinkBot, allowing good-faith editors to override the bot? Wbm1058 (talk) 19:30, 4 March 2014 (UTC)
    And here we have a proposal, if I'm reading it right, that would effectively remove "grandfathered" blacklisted links from Wikipedia very promptly, effectively sending them to Wikipedia's "gas chamber" long before the glacial whitelisting process ever saves them, to ensure the "cultural purity" of the encyclopedia is maintained. Wbm1058 (talk) 21:47, 4 March 2014 (UTC)
    I hope you realize that this was not the only editor spamming these sites, User:Wbm1058. I already caught a second one who is indefinitely blocked now, and the set of links is growing as well. --Dirk Beetstra T C 05:15, 6 March 2014 (UTC)
    Also note that it is in the spammers' interest to stay under the radar - it pays their bills. --Dirk Beetstra T C 05:18, 6 March 2014 (UTC)
    You may want to see this in the light of m:Talk:Terms of use/Paid contributions amendment - this is likely SEO spamming (as the contributions seem to originate from India), someone pays someone else to spam their sites to Wikipedia. That it is helpful does not make the principle of spamming right.
    To take your last example: the Wind farm article names 13 'largest onshore wind farms', and we have 2 articles listing wind farms. How does that link to power-technology about 'Biggest Wind Farms in the World', listing a mere 10 (!), ADD anything extra to Wikipedia (the lists largely overlap)? That link blatantly fails our external links guideline - and that is the whole problem here, these paid editors link because it fits their goals, not necessarily because it adds anything. If there is anything interesting there, go to the talkpage and discuss, as our conflict of interest guideline suggests. This is plainly spamming. --Dirk Beetstra T C 11:27, 6 March 2014 (UTC)
    I haven't waded through all of m:Talk:Terms of use/Paid contributions amendment—sorry, TLDR—but this quote I spotted there sums my view up nicely: A case of trying to swat a fly with a sledgehammer. Clearly the goal here is to put Progressive Digital Media Group out of business. We won't be satisfied until that organisation shuts down all its sites and turns off the lights. They are an evil organisation that needs to be banished from the face of the earth and we will do whatever it takes to deny them all sources of revenue. Yeah, we begrudgingly whitelist those old cbronline cites after making the requesters jump through lots of hoops and show a lot of patience, but we really want editors to just remove those links. Removing those links is easy and is the best and recommended way to get rid of that annoyingly helpful banner template our bot puts on those pages. Hey, I've identified another spammer. Google is spamming links to Wikipedia all over its search engine results. They need to stop that. Readers should just find Wikipedia articles by searching Wikipedia. We don't want or need Google's help to pay our bills, thank you. We need to blacklist Google until they stop spamming their search results with links to Wikipedia. Wbm1058 (talk) 14:38, 6 March 2014 (UTC)[reply]
    "Clearly the goal here is to put Progressive Digital Media Group out of business." What a ridiculous accusation, as are the rest of your remarks. --Dirk Beetstra T C 09:11, 9 March 2014 (UTC)

    Going on - cbronline.com

    links
    users
    (in progress) --Dirk Beetstra T C 08:11, 4 March 2014 (UTC)

    Gentlemen, I added this on another page, but it is more appropriate to add it here. Please see my findings below.

    • designbuild-network.com
    • roadtraffic-technology.com
    • airforce-technology.com
    • power-technology.com
    • aerospace-technology.com
    • foodprocessing-technology.com
    • airport-technology.com
    • army-technology.com
    • mining-technology.com
    • naval-technology.com
    • railway-technology.com
    • offshore-technology.com
    • ship-technology.com

    Less used, from the same group:

    • semiconductor-technology.com
    • mobilecomms-technology.com
    • hotelmanagement-network.com
    • water-technology.net


    Spammers
    Explanation

    I could identify only some spammers, as apparently they have been active since 2006 at least. All those domains appear to be part of the same farm and have about 2k links combined. They are all part of Kable.

    Some sites appear to have actual content, but many links are just dropped into the "external links" section.

    Legionarius (talk) 03:07, 31 March 2014 (UTC)

    All these are currently being added. --Dirk Beetstra T C 05:42, 31 March 2014 (UTC)
    I will admit that there are a non-trivial number of external link injection edits to these sites. They do have reliable info on them (they are essentially industrial news aggregator sites), and are legitimately used as sources in a lot of articles. IMO this was overkill. Dave (talk) 16:58, 3 April 2014 (UTC)
    There is a gray area between curbing behavior issues and causing damage to the project. These links do contain legitimate content. I added one of these links myself and am not a spammer. -- GreenC 17:28, 3 April 2014 (UTC)
    I appreciate you guys and bots are having fun, but the bot is tagging hundreds of articles related to the aviation project, including official sites of Airbus like www.airbusmilitary.com. I appreciate you are trying to stop spammers, but you need to appreciate the chaos out in the real world. I presume it will be left to others like the aviation project to sort this out. Oh, and why does the bot put a message on both the article and the talk page? Is it that bad that the article needs to be marked (and more stuff to tidy up)? MilborneOne (talk) 18:34, 3 April 2014 (UTC)
    Airbusmilitary is clearly a false positive - no idea how anyone came to the conclusion it's part of the spammed-for sites. It needs to be removed from the spamlist ASAP. I don't think the other sites offer anything of value, a quick check of Airbus aircraft pages there does not really reveal useful information. Parts of the information seem to be taken from multiple sources including Wiki. Could not even be used as reference for anything. --Denniss (talk) 18:57, 3 April 2014 (UTC)[reply]
    I came across this recent bot edit [7] to [Nakhoda Ragam-class corvette] effectively undermining some cited specification information added more than five years ago with [8]. It seems that a very large number of bot-flagged blacklisted links associated with this case were legitimate citations. --Rumping (talk) 19:15, 3 April 2014 (UTC)[reply]
    Links to Foo-Technology.com are not legitimate citations; they are either unreliable sources (and thus acceptable collateral damage, at worst) or spam masquerading as references (hiding a spam link as a legitimate citation happens distressingly often). Airbusmilitary does need to be removed from the blacklist though, that was a mistake. - The Bushranger One ping only 20:21, 3 April 2014 (UTC)[reply]
    Today, when I first saw my watchlist clogged with the bot flagging these for blacklisted links I was inclined to believe this was a gross overreaction. However, your point is valid and I'm re-thinking this. After reviewing some of the citations and these cites, the "about us" page for these does not inspire confidence, and while the text seems reliable, it appears to have been scraped from somewhere else. I did find this humorous instance http://www.roadtraffic-technology.com/projects/i-15-core-utah-county/ of an article about a major construction project in Utah, where the text seems reliable and matches what other reliable websites have said, but the pictures for the article are clearly taken in the San Diego, California area, not Utah. Sigh, collateral damage it is.Dave (talk) 21:51, 3 April 2014 (UTC)[reply]
    My watchlist notified me of the drive-by tagging of Brehon B. Somervell. The link in question was to ***** which I added myself some years ago. It's about a class of warships, and I took it from the eponymous warship's page. It is a useful resource and carries no advertising. I did not see any debate about its reliability, so we cannot declare it an unreliable source. I guess we have to have some form of censorship. Hawkeye7 (talk) 20:31, 3 April 2014 (UTC)[reply]
    Is it possible for this bot to place its notice on the talk page of articles but not in the articles themselves? There are a large number of articles which cite railway-technology.com as a reliable source, and in many cases there is no easy alternative source for the cited facts. At the moment, with so many legitimate links being flagged (in the articles and not just talk pages) this bot is damaging Wikipedia. Hallucegenia (talk) 00:09, 4 April 2014 (UTC)[reply]
    I share the same concerns with water-technology.net and requested a removal from the blacklist above. It is generally a reliable source and valuable to the project. There must be some other solution other than blacklisting the site.--NortyNort (Holla) 00:52, 4 April 2014 (UTC)[reply]
    Furthermore, I question whether User:Innomad is a spammer as described above. I may be wrong of course, but permanently blocking, without warning, a user who has only ever made 20 edits and whose last edit was in 2009 seems to me to be an abuse of administrators' privileges. Hallucegenia (talk) 07:18, 4 April 2014 (UTC)
    User Innomad ONLY added external links, just like the previous spammers related to CBROnline - at best a sock/meatpuppet of other users. This is obviously a spammer related to the case, and they, given the earlier cases, should have known better. I am not here to play a game of whack-a-mole - Spammers out of the same campaign get blocked without warning - they can consider themselves already to have been warned. --Dirk Beetstra T C 07:55, 6 April 2014 (UTC)
    Why use the "swat a fly with a sledgehammer" approach when only a few users have been identified as spammers? We want to penalize the spammers, but it seems like we are penalizing hard-working editors who want to keep articles clean. Please rethink this and weigh the benefits before making a massive change to tons of existing articles. If you think that ???-technology.com are not reliable sources, please provide evidence of that. I could just as well theorize that the spammers are Kable's competitors who know the loophole that if someone spams enough on Wikipedia, those web sites will be put on Wikipedia's blacklist. The way to deal with spammers is to deal directly with the spammers. If the web sites are obvious product advertisement, okay, fine, it may be legitimate to put them on the blacklist. But when the sites associated with suspected spammers are reliable sources providing good content, we should not address it this way. We could have been manipulated by the spammers into doing this, because we don't actually know the true intention of the spammers or the true connection between the spammers and the sites. Like I said, the spammers could be competitors, or some crazy people just having fun seeing us hammer ourselves. What is more, it appears that one of the alleged spammers who "is actively spamming" made a total of 27 edits in the last 6 months, and our action is to ban legitimate sites??? Z22 (talk) 03:53, 6 April 2014 (UTC)
    A few, CBROnline was on it with a large number of editors. The approach to deal with spammers is to make sure they stop - blocking does not make them stop, you already see a number of different Single Purpose Accounts here - blocking one will likely result in others coming up. --Dirk Beetstra T C 07:55, 6 April 2014 (UTC)
    Removed airbusmilitary.com. MER-C 13:05, 4 April 2014 (UTC)


    Further Review Requested

    I recently looked at the page for the Boeing 737 Next Generation (and, relatedly, the Boeing 737 root page, which also mentions the 737NG), and I saw that the site aerospace-technology.com is on the blacklist. As with other previously posted examples in this section, these links appear at first glance to be valid references. However, I haven't had the chance (and won't have the time) to scrutinize the site; all I'm saying is that I think some people need to go through all the sites linked by these posters to check their accuracy and, if necessary, change the links, whitelist the links, or even remove the sites from the blacklist. I'd do it myself, but I can't make the time commitment necessary for that massive task; I only wanted to bring the Boeing 737NG link issue to your attention.

    In my personal opinion, though it does come off as spammy in the way it was posted, and even if the person posting the links may be paid to do it (proof permitting; after all, it could be someone who -really- likes cbronline.com as a reference), if the sources are valid (unless it's against Wikipedia policy), as long as they are accurate, why not leave them as is and keep them off the blacklist? But again, that's just me. :)

    From what I've seen, all the External Links shown seem to relate directly to the content, so there's no question of whether they are on the wrong page, and they do help explain the content, much as a reference does. I see it in a similar vein to almost any movie's Wikipedia page, which almost always has a link to metacritic.com, rottentomatoes.com, and/or some other review site, even though these are all ultimately business sites, similar to cbronline.com. But again, I'm not educated on this particular website host, so if I'm mistaken, feel free to let me know. The Legacy (talk) 18:31, 4 April 2014 (UTC) (Edited The Legacy (talk) 18:39, 4 April 2014 (UTC))

    Railway-Technology.com

    I've just seen dozens of railway related articles tagged by the bot because they contain links from railway-technology. Most of these links contain legitimate information and are being used as citations in many articles. I added some of them myself. And I'm certainly not a spammer. This has been raised at Wikipedia talk:WikiProject_Trains#Blacklisted website. This bot is causing more havoc to the wikipedia than any spammer could, and I rather resent this taking up time which I could better put to something useful! G-13114 (talk) 21:55, 4 April 2014 (UTC)[reply]

    Power-Technology.com
    Offshore-Technology.com

    Everything that was said about Railway-Technology.com also applies to Power-Technology.com and Offshore-Technology.com. The first is included in more than 150 articles and the second in more than 130 articles. Most of these links contain legitimate information and are being used as citations. Like the previous editor, I can also say that a number of these links were added by me, and I am not a spammer. It is also unacceptable that this kind of mass listing is made without prior notification of the affected Wikiprojects (for these two sites that is mainly WP:Energy, but also WP:Geology, WP:Dams and some others). Beagel (talk) 16:43, 5 April 2014 (UTC)

    naval-technology.com
    army-technology.com

    The first is used in over 300 articles, the second in over 200. These links contain legitimate information and are being used as citations. It seems that the blacklist is a bigger menace to the integrity of Wikipedia than any spammer. Hawkeye7 (talk) 06:49, 6 April 2014 (UTC)

    I do see that someone else brought up that these are not reliable sources anyway, and I found this diff informative regarding that as well. It starts to seem that most of these are replaceable by better, reliable sources, others can plainly be deleted as the information is not notable enough to be mentioned, and then the rest can be handled by whitelisting. --Dirk Beetstra T C 07:55, 6 April 2014 (UTC)[reply]
    Please go ahead and find alternative sources for the hundreds upon hundreds of articles which use these then! Cause I'm bloody well not doing it!! G-13114 (talk) 08:19, 6 April 2014 (UTC)[reply]
    Nobody asks you to do it - and as argued below and above, this may not be the reliable source one takes it for (funny, that happened as well with a whole other set of CBROnline references - deemed unreliable and scraped), so it can just go without going through the effort of looking for the reliable sources. --Dirk Beetstra T C 08:24, 6 April 2014 (UTC)[reply]
    By the way, you might want to direct your anger at CBROnline/Kable for continuously violating our core policies - and by the looks of it the violation of what is soon going to be our new Terms of Use (though that is still under debate). --Dirk Beetstra T C 08:25, 6 April 2014 (UTC)[reply]
    No, I'd rather direct my anger at the black list. That's what is really in violation of our values. There's been no argument presented that the sites are replaceable or unreliable in any way. No editor should be allowed to place an article on the blacklist if they are not prepared to go and fix it personally. We block vandals for much less. Hawkeye7 (talk) 10:18, 6 April 2014 (UTC)[reply]
    No argument, but several people have commented that the info is better sourced from other sites in the several threads here (e.g. "Links to Foo-Technology.com are not legitimate citations; they are either unreliable sources (and thus acceptable collateral damage, at worst) or spam masquerading as references (hiding a spam link as a legitimate citation happens distressingly often)" by The Bushranger above; "given its just being used to cite basic facts those should be easily found in better sources." by Werieth below). --Dirk Beetstra T C 13:32, 6 April 2014 (UTC)[reply]

    Going on - part 2

    I have for now commented out the blacklisting, though I will encourage further discussion (and I will undo this if I find ongoing abuse or see that the scale is bigger than expected - we have a blacklist and whitelist for a reason). There is to me NO question that CBROnline and Kable are spamming Wikipedia using multiple Single Purpose Accounts, and that they have been doing this for many, many years now. Although regulars have been using this site, I know that spammers have engaged in 'reference spamming' as well as plain external link spamming. This is an issue that needs to be resolved, as this (the spamming) goes straight against our core policies and guidelines. I am also worried by several (knowledgeable) voices saying that either the information they provide is replaceable, or is used to support not-notable information. Editors may want to start and look into those issues. --Dirk Beetstra T C 08:19, 6 April 2014 (UTC)[reply]

    Just noting, that I do not have any problem if another admin disagrees with my (temporary) removal of the blacklisting and reverts that removal. --Dirk Beetstra T C 08:58, 8 April 2014 (UTC)[reply]

    Taking account the impact this listing has and the ongoing discussion, this blacklisting should be a community-based decision, not a decision of a single admin. Beagel (talk) 09:33, 9 April 2014 (UTC)[reply]
    I am sorry to say it, but you are just denying the problem. Nobody says that spamming is not a problem, but the real problem here is not spamming but the fact that thousands of valid references are being outlawed. Even if we assume that there are replacements for all these references (and there are not, at least not for all of them), it is a large workload for fellow editors to do that. We are not paid for editing here, so some respect for others' work and time would be appreciated. You say that these references may not be reliable, but a number of people active in different Wikiprojects say that these references are reliable (at least in most cases) and add value to the quality of articles. And knowing some of them by their excellent work in Wikipedia, I would say that they are also "knowledgeable voices". Also, recalling some earlier cases, it seems that in some cases the blacklisting really is not preventive but punitive. And the big problem is that nobody notifies the affected/relevant Wikiprojects prior to any action being taken. Getting to know that something is going on only when a bot messes up your watchlist, or when the project clean-up listing has hundreds of new entries, is not the way things should be done. Beagel (talk) 09:17, 6 April 2014 (UTC)
    No, I am not denying that - and that is why I commented out the blacklisting. More research on the scale of things needs to be done here. The links are not outlawed - we do have a whitelist for a reason.
    I am not sure if it is punitive - this really prevents editors from editing, this is not more punitive than blocking the individual spammers - it prevents spammers from adding, it does not punish them (having your links blacklisted does not significantly affect your search ranking, unlike the nofollow that Wikipedia implemented years ago).
    I am sorry, it is just impossible to notify wikiprojects affected - there is no way of detecting that, scale the necessity, etc. The notification of the bot that an article needs to be looked at is the closest one can get to notifying editors. --Dirk Beetstra T C 09:30, 6 April 2014 (UTC)[reply]
    Re: preventive v. punitive. I am not talking about spammers. I personally think that every spammer should be blocked. But the blacklisting of websites should be the last resort because of its side effects, mainly taking information resources away from editors. And it is punitive, mainly against our regular editors who have to deal with the mess created by blacklisting, as in this case (which is definitely not the only one).
    Re: notification. It is hard, I agree, but it is possible if there is goodwill for it. If you have a website you would like to blacklist, you have to check which and how many articles use that website. This is not rocket science. And if you know which articles use the website, it is easy to check which Wikiprojects are involved. It is even possible to create a special bot task for this, which would help deliver notifications immediately after the blacklisting proposal is made. And websites like aerospace-technology.com or power-technology.com give a clear indication of which Wikiprojects could be interested, so the argument of impossibility is not a valid one. But as I said, it needs some extra work. However, I think that some extra work by blacklisters is justified if it helps avoid even more extra work by our regular editors. It can't be accepted that blacklisters decrease their own workload by increasing the workload of other editors. A change of attitude in this respect would be useful for achieving the common goals of Wikipedia.
    Beagel (talk) 11:12, 6 April 2014 (UTC)
    Punitive against regular editors .. what is blacklisting punishing them for? --Dirk Beetstra T C 13:35, 6 April 2014 (UTC)
    It has been said several times by several editors, but let me repeat what TheOtherEvilTwin said: "Banning a site which is legitimately linked by hundreds of wiki pages means imposing hundreds of hours of work on the Wikipedia community to either find alternate sources or to remove now-unsourced material. Requiring the community to waste that much time on busywork instead of spending it on more productive edits just because a site was spam promoted is unreasonable. Anti-spam work is supposed to reduce others' opportunity costs, not increase them." If this additional workload for the community is not a punishment, then what is it? If you can't predict the consequences of blacklisting certain websites, it shows that something is wrong with the whole procedure. So, if the bot already looks for links in the articles (at the moment, after blacklisting), let's program it to do this immediately when the proposal to blacklist a website is submitted (that is, before any action is taken). The bot could make a list of affected articles, it could make a list of affected Wikiprojects based on the WP banners on the articles' talk pages, and it could notify the affected WPs (there may be some misplaced banners, but let's say that if a certain WP has more than 10 (or 20, or whatever we agree) hits, it should be notified). And if there are hundreds or thousands of articles linked to a certain website, it is a clear signal that the issue needs careful analysis. If we don't have that kind of analysis, we will repeat the mistakes that happened now.
    And also, putting every website related to a person who was stupid enough to spam on the indef. blacklist is punitive, not preventive. I think that a first-time blacklisting should not exceed one year (exceptions, of course, should be websites promoting hate, violence, child pornography etc). Beagel (talk) 14:18, 6 April 2014 (UTC)
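    For illustration only, the notification step proposed above (count how many affected articles carry each WikiProject banner, then notify the projects that exceed an agreed number of hits) could be sketched roughly as below. This is a minimal sketch, not an existing bot: the function name projects_to_notify, the shape of the input mapping and the default threshold of 10 are assumptions made for the example, and it presumes that the list of affected articles and the banners on their talk pages have already been gathered by some other means.

```python
from collections import Counter


def projects_to_notify(article_projects, threshold=10):
    """Return the WikiProjects with more than `threshold` affected articles.

    `article_projects` is a hypothetical mapping of article title to the
    list of WikiProject names found in the banners on its talk page.
    """
    hits = Counter()
    for projects in article_projects.values():
        # Count each project once per article, even if a banner is repeated.
        hits.update(set(projects))
    # "More than 10 (or 20 ...) hits" is read here as strictly greater than.
    return sorted(name for name, count in hits.items() if count > threshold)


# Made-up sample data: twelve rail articles and one mining article.
sample = {f"Article {i}": ["WikiProject Trains"] for i in range(12)}
sample["Mir mine"] = ["WikiProject Mining"]
print(projects_to_notify(sample))
```

    With the made-up sample, only the project with twelve hits passes the threshold of 10; the single mining article would be skipped, which is the intended handling of stray or misplaced banners.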
    Sure, so we don't blacklist and let the few editors who care work their hours to remove the spam - that is more important than the 100 editors who spend some time pruning (removing/whitelisting) the sites that are of interest. I totally agree.
    Blacklisting is not indefinite - if the threat has stopped and editors request removal, then sites get removed - however, experience shows that de-listing does result in the spammers returning (in fact, there were attempts to add cbronline.com even though the site has been blacklisted for years). The blacklist prevents addition of sites that are spammed in the same way that a block prevents an editor from vandalising Wikipedia. If the company stops spamming, the site can be removed; if the editor stops vandalising, the editor can be unblocked. CBROnline, obviously, did not stop spamming. --Dirk Beetstra T C 09:42, 7 April 2014 (UTC)[reply]
    Whether it is a justified method of stopping the spammers, I'm going to put that aside for now. My point is more about how the bot adds a mess to tons of good quality articles. Putting a site on the blacklist for future edits and forcing editors to look for alternatives or request that a particular page be put on the whitelist may not be as bad. But putting a hat tag on every single article with a blacklisted link is very counterproductive. If the bot has to do anything, another approach may need to be implemented. For example, the bot would not put a hat tag on the article. It would put a new section on the talk page of each article and list which links are from the blacklisted sites. Then be clear to editors that having a site on the blacklist does not automatically mean that the references and the associated contents are questionable. The message should be clearly conveyed that editors should use discretion to inspect each of those links and confirm whether they are legitimate. If legitimate, no action is needed. If not, find alternative sources. Z22 (talk) 15:43, 6 April 2014 (UTC)[reply]
    That boat has sailed a long time ago. If you have a problem with the bot, take it up with the bot operator. MER-C 04:18, 7 April 2014 (UTC)[reply]
    Stop giving these editors the run-around. The bot editor will just refer you back to here. The only justification for that bot was a bot request with minimal participation. Somehow things have been warped to where any source may be deemed an {{unreliable source}} if someone is thought to be creating "spam" links to that source. It's high time for a bot that has the widespread impact that this bot has to justify its existence with consensus for its operation at Wikipedia:Village pump (proposals). Don't be surprised if, as with this discussion about {{orphan}}s, consensus turns out to be that the bot's "big", "ugly", "defacing", "distracting" and "grotesque" message should be moved to talk pages. – Wbm1058 (talk) 01:03, 8 April 2014 (UTC)[reply]
    I agree, the bot is causing far more damage to wikipedia than any spammer could ever dream of! Like 99.9% of editors, I had absolutely no idea that this bot existed until I saw the havoc it was wreaking across hundreds of articles. So saying that it was approved when hardly anyone knew about it is a touch disingenuous! Had I known I might have participated. G-13114 (talk) 01:41, 8 April 2014 (UTC)[reply]
    The bot is not doing any more damage than {{unreferenced}}, which is also everywhere on top of pages (as are many of those maintenance templates). I still have NO clue how that template is causing damage to Wikipedia. And {{orphan}} is the only template for which that argument managed to get it moved (logically, as it does not signify a problem with the page, as {{unreferenced}} does) - you've tried that argument before and it does not sail. --Dirk Beetstra T C 08:43, 8 April 2014 (UTC)[reply]
    This is just giving us the run around again. As has already been explained to you. What is causing the damage, is that many hundreds of articles are having otherwise perfectly good references outlawed for no good reason. And necessitating enormous time wastage on behalf of the (volunteer) editors to find replacements, which may or may not be as good. The template is a nuisance though. G-13114 (talk) 09:38, 8 April 2014 (UTC)[reply]
    And that was explained as well - they are not outlawed - if they are perfectly fine, then they are suitable for whitelisting. And again, removing the spam also 'necessitates enormous time wastage' on behalf of the (volunteer <- yes, I am a volunteer as well) editors who remove it. The template is no more of a nuisance than {{unreferenced}}, {{cleanup}} or {{primarysources}}, and, as opposed to those templates, it in fact points to a problem with the page that should be solved (generally by whitelisting), as it may interfere with the editing process. --Dirk Beetstra T C 10:09, 8 April 2014 (UTC)[reply]
    Moreover, the solving of the problems on the hundreds of pages can be performed by the hundreds of editors who have each some of the pages on the talkpage, unlike the spam issue, which (just like that you did not know about the existence of the bot and the template) was completely missed by you and all those volunteers that have the pages in these subjects on their watchlists, and has to be solved by the few volunteers that are active there. I am actually wondering why I am wasting my time fighting spam, maybe we should remove those guidelines, and scrap WP:NOTSOAPBOX from WP:NOT, as it can be completely ignored. --Dirk Beetstra T C 10:13, 8 April 2014 (UTC)[reply]

    Are these sites on XLinkBot? MER-C 07:26, 9 April 2014 (UTC)[reply]

    Blanketed, but I am going to adapt that now. --Dirk Beetstra T C 07:31, 9 April 2014 (UTC)[reply]
    Rules adapted, sites spammed here are there (if there are other Kable links still missing, I would suggest to add those as well). There is still a lot of cleanup to do, the editor mentioned below from 2009 is hardly reverted, e.g. (and that is true for recent editors as well). I am questioning how many of the hundreds of links that are there were originally spammed, spammers here have sometimes more than 50 additions on their name. --Dirk Beetstra T C 07:43, 9 April 2014 (UTC)[reply]

    Going on - Part 3

    The last edit of this editor was made more than four years ago. How is it relevant in this context? And why was s/he not blocked in the first place? Beagel (talk) 09:28, 9 April 2014 (UTC)[reply]
    'how is it relevant': this is long term abuse of Wikipedia. --Dirk Beetstra T C 12:26, 9 April 2014 (UTC)[reply]
    This first user (Dee82) last edited in January 2010. The first edit by the second user (Veronicawilson235) was in March 2013 or more than three years later. There is no proof that they are even the same person. Blocking seems to be a right decision here; however, blacklisting was clearly an overkill and, based on this provided information, is not justified. Beagel (talk) 15:33, 9 April 2014 (UTC)[reply]
    Beagel, they are working for the same organisation, this is WP:DUCK; they may not be the same person, but meatpuppetry amounts to the same thing. Moreover, you forgot the other 5 or 6 accounts that were active, showing that blocking them does not solve the problem: there are 7 or 8 editors spamming (and their MO is practically always the same, in many cases just external links, sometimes a bit of low-relevance reference spamming). That there is a gap between Dee82 and Veronicawilson235 (or even, if we see all accounts) may just mean that there were more links being spammed. Let's turn it around - how many edits can you show me by regulars using this site? For me, it seems to be easier to find the spammers. The top-level article in the field of naval-technology.com would be Navy; the link was not added by a regular, but by a spammer. And naval-technology.com is not the officially recognised reporter on Navy, is it? Is that link appropriate? And that is my conclusion in many cases for this link: the vast majority should plainly go. Blacklisting is justified, this is spam. --Dirk Beetstra T C 03:54, 10 April 2014 (UTC)[reply]
    Re "how many edits can you show me by regulars using this site": I myself made at least 50 (I am really not able to recall all my 78,000 edits made over 8 years), mainly to offshore-technology.com and power-technology.com, but probably a few to other technology pages. I think that you would get similar figures from NortyNort or Dormskirk. I do not know the other editors who commented here so well, but they have certainly also used these websites. I would be very careful calling any of them spammers. Beagel (talk) 04:49, 10 April 2014 (UTC)[reply]
    'I would be very careful calling any of them spammers.' .... you really got a wrong impression of me. --Dirk Beetstra T C 05:45, 10 April 2014 (UTC)[reply]
    I had a quick glance at the additions of those two you mentioned - I did not see any additions of you over the last handful of months, I do find more possible spammers though. I will look further into the past. --Dirk Beetstra T C 06:15, 10 April 2014 (UTC)[reply]

    ...this is one for long-term abuse. --Dirk Beetstra T C 07:02, 9 April 2014 (UTC)[reply]

    S/he is blocked now. Did this not solve the problem? Beagel (talk) 09:28, 9 April 2014 (UTC)[reply]
    I'm not entirely sure I see what the problem is here. All of those links as far as I can see are relevant to the articles they have been put in, so I'm not sure how this is detrimental to the wikipedia. If they were putting in links to websites that were selling viagra or something then that would be different! G-13114 (talk) 11:09, 9 April 2014 (UTC)[reply]
    @User:Beagel: until another sockpuppet comes up. We have 5 who have been active in the last months, and 2 or 3 from before, and those are only the ones identified. Do you believe that this solved the problem? Did all the editors from the CBROnline spam of a couple of years ago who got blocked convey the message that it stops? Do you understand why people spam? It pays their bills. They do not stop (obviously) when just blocked, they will just make a new account.
    @User:G-13114: It is a community consensus that we are NOT writing a linkfarm here but an encyclopedia. These additions, simply, fail our external links guideline, others fail our reliable sources guideline (when used as a source), and the editors are, plainly, violating WP:SPAM. These editors are violating our core policies and guidelines, our pillars. --Dirk Beetstra T C 12:24, 9 April 2014 (UTC)[reply]
    I strongly agree with Beetstra here. Not blacklisting these links sets a dangerous precedent that would allow any site that could meet WP:RS to abusively spam Wikipedia. Yes, it's a pain to replace those links, but it's not as if one person has to do it all. I'd be happy to help. OhNoitsJamie Talk 14:46, 9 April 2014 (UTC)[reply]
    Until now, most of the links I encountered I simply removed, haven't found anything that needed replacement yet. --Dirk Beetstra T C 04:31, 10 April 2014 (UTC)[reply]
    Based on the provided information, my understanding is that blacklisting was overkill. However, I do not want to underestimate the spamming problem, and I think that the solution could be some kind of "greylist". Websites on this list are not blacklisted, but they have a history of spamming. We should designate a bot to check and list, every day, the articles where these "greylisted" links were added. With that kind of list (including the names of the users who added these links) it would be quite easy to discover any pattern of spamming and to catch these spammers. It would be a more editor-friendly solution than blacklisting websites that qualify as RS. Of course, we will still have a blacklist for special cases. Beagel (talk) 15:43, 9 April 2014 (UTC)[reply]
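The daily "greylist" report proposed above could look roughly like the following. This is only a sketch under stated assumptions: the input is taken to be a list of `(user, article, url)` tuples for one day's link additions, and every name here (`host_is_greylisted`, `daily_report`, `flag_users`, the sample users and domains) is hypothetical and not part of any existing bot.

```python
from collections import Counter, defaultdict
from urllib.parse import urlsplit


def host_is_greylisted(url, greylist):
    """True if the URL's host is a greylisted domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in greylist)


def daily_report(additions, greylist):
    """Group greylisted link additions by user, counting additions per article.

    `additions` is an iterable of (user, article, url) tuples (assumed format).
    """
    report = defaultdict(Counter)
    for user, article, url in additions:
        if host_is_greylisted(url, greylist):
            report[user][article] += 1
    return report


def flag_users(report, threshold):
    """Users with at least `threshold` greylisted additions, sorted by name."""
    return sorted(u for u, arts in report.items() if sum(arts.values()) >= threshold)


# Hypothetical sample data, made up purely for illustration.
greylist = {"example.com"}
additions = [
    ("UserA", "Article 1", "http://www.example.com/a"),
    ("UserA", "Article 2", "http://example.com/b"),
    ("UserB", "Article 3", "http://other.example.org/c"),
]
report = daily_report(additions, greylist)
print(flag_users(report, threshold=2))
```

Such a list only surfaces patterns for human review; it does not prevent any edit, which is the difference from blacklisting discussed in this thread.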
    re: "greylist" – see Wikipedia:Edit filter. Wbm1058 (talk) 15:51, 9 April 2014 (UTC)[reply]
    Edit Filter is, with some exceptions, not capable of handling this - overloads the server. Greylist here is more User:XLinkBot - the links (with thinking forward) are there. However, accounts (established accounts) are easy to get, most of these accounts have been active for several months and have >50 edits - way over the 'autoconfirmed' limit that is used by XLinkBot and MediaWiki. --Dirk Beetstra T C 03:59, 10 April 2014 (UTC)[reply]
    The proposal was not meant to prevent anybody from doing anything, but to get a better analysis so that potential spammers can be detected more easily and without damaging the work of regulars. Beagel (talk) 04:49, 10 April 2014 (UTC)[reply]
    I find the idea that Progressive Digital Media is not operating reliable sources to be dubious. Assume in good faith that they are trying to be reliable. Even major news outlets occasionally make errors and have to later publish corrections. Perhaps this organisation, which may be running on shoestring resources, makes more errors than larger, better endowed media, but they are trying to be reliable and usually are. Wbm1058 (talk) 15:51, 9 April 2014 (UTC)[reply]
    @Dirk Beetstra, would you like to explain why you say that the X-technology links are not reliable sources? I have to say that in my experience they are accurate and professionally written, I have put in a few railway-technology links myself as cites, I would not have done this if I did not believe they were accurate and reliable. I see a lot of people have said the same thing. G-13114 (talk) 16:27, 9 April 2014 (UTC)[reply]
    No, that is the wrong way of asking, G-13114. A site is hardly ever not an RS at all. But a lot of the material that Wikipedia is currently referencing to these sites is either not notable, or this is not the optimal source, or it is, for that fact, not a reliable source. Being reliable does not make you a reliable source either. Wikipedia is often reliable, it is not a reliable source. The fact, however, is that even plain spam sites, utterly useless for the majority of Wikipedia, where the majority of the site can be replaced with easy-to-find better sources, material that you would not hesitate to blacklist given the abuse, are sometimes reliable sources for something.
    I am not saying, and I have never said, that this site should never be used by regulars, that this site should be banished from the face of the earth. I said that the scale of the spamming of the editors is a problem that can not be solved by blocking the editors, that can not be solved by protecting the pages, and given the persistence over the years (this is a problem for many, many years) it is also not a problem that XLinkBot can solve - the spamming can only be stopped by blacklisting the site. The main argument that is thrown at me is that by blacklisting a site it outlaws it, that regulars are not allowed to use it, that the material on the site is bad, that I damage Wikipedia. That is a logical fallacy - it is not true. I do however say that a) many, many (if not all) of the links in external links sections currently there are inappropriate and can go without damaging Wikipedia; b) that many, many of the references there are either trivial information, some of the information should actually be primary sourced (and not from an aggregator), c) and that the links that are detrimental should be whitelisted. From what I have seen, there are hundreds and hundreds of additions by identified spammers, and I hardly ever encounter an addition by a regular (some IPs with 1 or 2 edits, not sure if they are spammers, hence not reported; some editors with 20 edits editing only in a very small subset; some vandalism reverts where the link was re-added after a vandal). I have yet to encounter the edits by you, G-13114 and Beagel and others. You all say 'I've used this site', but if I find the spammers adding hundreds, and you guys are talking about 'a few' - does that mean that all the regulars have added maybe up to 25 of the current links to all these sites (by now several hundreds), and the spammers the rest? To me it looks this way.
    These sites are being spammed on a large scale, SEO exists for a reason. This is likely not one editor, but a concerted effort. These sites should be blacklisted, and pruned. Material should be replaced where possible and then the rest should be whitelisted. Not blacklisting sites that are spammed on a large scale is setting a bad precedent, especially in the light of the drive of WMF against paid editing. --Dirk Beetstra T C 03:54, 10 April 2014 (UTC)[reply]
    WP:RS clearly says: The reliability of a source depends on context. That means the same source can be reliable in one context and unreliable in another. Every source (and that means the reference, not the website as a whole) should be considered individually. Saying that all these sites are unreliable is a non-starter. Concerning whitelisting, well, this is not suitable if we are talking about mass blacklisting. I have experience from some years ago, when I asked to whitelist one site. The first time the request was ignored; the second time the answer was quite arrogant, recommending that I look for another source. All in all, I spent more than a month for nothing. This is clearly not the way forward for regulars. Concerning outlawing, well, you can call it what you like, but de facto these sites are outlawed. Beagel (talk) 04:49, 10 April 2014 (UTC)[reply]
    That is what I said - there are however sites where the information is generally unreliable, and/or where they can be replaced with more reliable sites in general - aggregator type information is not 'reliable', they copy without scrutiny what others say, that does not make it reliable (it is probably true, but that is something else). It were however not my words (it was mentioned by others) and it has NOT been a large factor in my decision to blacklist. Somewhere else there is the suggestion to have more manpower - I would be all for having more manpower on the whitelist as well .. --Dirk Beetstra T C 05:42, 10 April 2014 (UTC)[reply]

    Getting at the heart of the issue with sites like cbronline

    This issue has been building for some time; see MediaWiki talk:Spam-blacklist/archives/December 2013#cbronline.com for an earlier thread. Extracting some comments from that:

    Q: Wikipedia has a "massive" number of links to The New York Times, but their massiveness doesn't make them spam. The Times does still make real money from selling subscriptions, but even they are becoming more dependent on online ads. What if some anonymous editors "help" the Times by focusing their editing on clearing the Category:Articles with unsourced statements backlog by inserting mostly helpful citations to Times articles, but get somewhat over-enthusiastic about the project and also add some dubious (external) Times links that are not strictly required for confirmation of article statements of fact. Would we then be forced to blacklist the Times?
    A: A journal like the Times does not need spam to get their links out (so that says something about companies that do spam), moreover, if a site like that would engage in a massive spamming campaign, we would indeed have a nice problem, which likely would be handled through the legal department of Wikimedia (we have had congressman or their representatives spam Wikipedia - besides blocking, they have to be reported to the Foundation). I would however not exclude that if such a site would engage in such massive spamming, that blacklisting (though more likely an edit filter) may be needed to mitigate the problem - and it has happened for sites like that.

    We have something of an ongoing crisis in journalism, as traditional print newspapers have become more and more scarce, and those that survive have shrinking resources and content. If all we have left are a handful of sources who can afford not spamming Wikipedia to build their audience, then the only remaining available media may be that provided by a handful of major corporations who will have a de facto oligopoly on the news. Why should we give the Times or any other major media special treatment? Shouldn't Progressive Digital Media be given equal treatment? Has Wikimedia's legal department been made aware of this situation? Why haven't we used an edit filter to deal with this problem, as would "more likely" be done to fight Times spam? A cynic (and make no mistake, in my earlier post that was dismissed as "ridiculous", I was in cynic or devil's advocate mode) might think that the problem was that Progressive Digital Media wasn't generous enough with contributions to the Foundation, and that the Foundation favors organisations that are generous towards it. Wbm1058 (talk) 15:34, 9 April 2014 (UTC)[reply]

    Giving grants does not make your links magically appear here. That would be completely against our core policies and guidelines, editors would never allow that.
    Point is, we are not giving one site special treatment. If a site gets spammed (which I have never observed for the Times) it is up for blacklisting, whether it is CNN or whether it is your regular Tadalafil site. Now, blacklisting ALWAYS has collateral damage - but I think that the collateral damage here is minor, really there are only a few additions by regulars in comparison to hundreds of additions by spammers, for CNN that would be different. If a huge site would massively spam Wikipedia with multiple accounts, all across, then the only way to mitigate it at some point may be a blacklist entry. Unless you want that company to take over the editing in Wikipedia and make sure that they proclaim what Wikipedia is reporting and finds important, because that is the consequence, that is what you are advocating here and in above threads: by not blacklisting and allowing this to continue you endorse those hundreds of spammed links to the -technology.com websites, letting them decide what is important, and letting them decide how things are being sourced. On top of that, there are a few links added outside of that effort, which may have been 'an easy to find source' (hey, easy to find: their spamming works?) and replaceable as well, or up for whitelisting. They are now being allowed to violate our pillars (WP:NOT, WP:NPOV), so you can follow yours (WP:V, WP:ENC). Who wins: the big company? --Dirk Beetstra T C 04:11, 10 April 2014 (UTC)[reply]

    exeter.co.uk and cardiff.co.uk

    These were added to the list in July 2010 with another link, but not logged; edits that led to the blacklisting are Special:Contributions/Philiporchard. As Exeter and Cardiff are UK cities these are likely to cause false positives and should probably be modified the same way as the guy.com entry was[9]. The \bstay[\w-]*\.co\.uk\b was added after it was claimed "collateral damage is unlikely"[10], unsurprisingly wrong (there's at least one other site containing "stay", which was an official site when added but now a dead link). Peter James (talk) 22:47, 3 March 2014 (UTC)[reply]
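To illustrate why the `\bstay[\w-]*\.co\.uk\b` entry can catch unrelated sites, here is a small sketch in Python. It rests on two assumptions that are not from the comment above: the `URL_PREFIX` is only a simplified stand-in for whatever the blacklist extension puts in front of each entry, and the sample URLs are invented for illustration.

```python
import re

# Entry quoted in the comment above.
STAY_ENTRY = r"\bstay[\w-]*\.co\.uk\b"

# Assumption: simplified stand-in for the scheme-and-host prefix applied
# in front of every blacklist entry; the real prefix may differ.
URL_PREFIX = r"https?://+[a-z0-9_\-.]*"


def is_blocked(entry, url):
    """Return True if `url` matches `entry` under the assumed prefix."""
    pattern = URL_PREFIX + "(?:" + entry + ")"
    return re.search(pattern, url, re.IGNORECASE) is not None


# Hypothetical URLs, made up purely for illustration.
samples = [
    "http://www.stay-in-exeter.co.uk/",
    "http://staysafe.co.uk/advice",
    "http://mainstay.co.uk/",
    "http://www.example.co.uk/stay.html",
]
for url in samples:
    print(url, is_blocked(STAY_ENTRY, url))
```

Under these assumptions the first two hosts match, because any `.co.uk` host label that begins with "stay" is covered, while the last two do not: `\b` requires "stay" to start a word, and the entry is only tested against the host part.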

    Logging / COIBot Instr

    Blacklist logging

    Full instructions for admins


    Quick reference

    For Spam reports or requests originating from this page, use template {{/request|0#section_name}}

    • {{/request|213416274#Section_name}}
    • Insert the oldid 213416274, a hash "#", and the Section_name (Underscoring_spaces_where_applicable):
    • Use within the entry log here.

    For Spam reports or requests originating from Wikipedia_talk:WikiProject_Spam use template {{WPSPAM|0#section_name}}

    • {{WPSPAM|182725895#Section_name}}
    • Insert the oldid 182725895, a hash "#", and the Section_name (Underscoring_spaces_where_applicable):
    • Use within the entry log here.
    Note: If you do not log your entries, they may be removed if someone appeals the entry and no valid reasons can be found.
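The two log formats above differ only in the template name, so the string can be assembled mechanically. A minimal sketch, assuming nothing beyond the formats shown in the quick reference; the helper name `log_reference` and its `source` argument are hypothetical:

```python
def log_reference(oldid, section, source="request"):
    """Build the log template for a report.

    source="request": report came from this page      -> {{/request|oldid#Section}}
    source="wpspam":  report came from WikiProject Spam -> {{WPSPAM|oldid#Section}}
    """
    templates = {"request": "/request", "wpspam": "WPSPAM"}
    if source not in templates:
        raise ValueError("source must be 'request' or 'wpspam'")
    # Spaces in the section name become underscores, as the instructions say.
    anchor = section.strip().replace(" ", "_")
    return "{{" + templates[source] + "|" + str(oldid) + "#" + anchor + "}}"


print(log_reference(213416274, "Section name"))
print(log_reference(182725895, "Section name", source="wpspam"))
```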

    Addition to the COIBot reports

    The lower list in the COIBot reports now has four numbers in brackets after each link (e.g. "www.example.com (0, 0, 0, 0)"):

    1. first number: how many links this user added (the same after each link)
    2. second number: how many times this link was added to Wikipedia (as far as the linkwatcher database goes back)
    3. third number: how many times this user added this link
    4. fourth number: to how many different Wikipedias this user added this link.

    If the third number or the fourth number are high with respect to the first or the second, then that means that the user has at least a preference for using that link. Be careful with other statistics from these numbers (e.g. good user who adds a lot of links). If there are more statistics that would be useful, please notify me, and I will have a look if I can get the info out of the database and report it. This data is available in real-time on IRC.
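The comparison described above (third number relative to the first and second) can be made explicit with a short sketch. It assumes the entry text has exactly the form shown in the example, "link (n1, n2, n3, n4)"; the class and function names are hypothetical, and the sample counts are invented.

```python
import re
from dataclasses import dataclass

# Assumed entry format: "<link> (<n1>, <n2>, <n3>, <n4>)"
ENTRY_RE = re.compile(r"^(?P<link>\S+)\s+\((?P<nums>\d+(?:,\s*\d+){3})\)$")


@dataclass
class LinkStats:
    link: str
    user_links: int      # 1: links added by this user in total
    link_additions: int  # 2: times this link was added by anyone
    user_this_link: int  # 3: times this user added this link
    wikis: int           # 4: number of wikis where this user added this link

    def share_of_user_links(self):
        """Fraction of the user's link additions that are this link."""
        return self.user_this_link / self.user_links if self.user_links else 0.0

    def share_of_link_additions(self):
        """Fraction of all additions of this link that came from this user."""
        return self.user_this_link / self.link_additions if self.link_additions else 0.0


def parse_entry(text):
    """Parse one report entry into a LinkStats record."""
    m = ENTRY_RE.match(text.strip())
    if m is None:
        raise ValueError("unrecognised entry: " + repr(text))
    nums = [int(n) for n in m.group("nums").split(",")]
    return LinkStats(m.group("link"), *nums)


# Hypothetical counts, for illustration only.
stats = parse_entry("www.example.com (10, 12, 9, 1)")
print(stats.share_of_user_links(), stats.share_of_link_additions())
```

High values for both shares suggest a preference for the link, but as noted above they say nothing by themselves about intent: a good user who adds many links can produce similar figures.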

    Poking COIBot

    When adding {{LinkSummary}}, {{UserSummary}} and/or {{IPSummary}} templates to WT:WPSPAM, WT:SBL, WT:SWL and User:COIBot/Poke (the latter for privileged editors) COIBot will generate linkreports for the domains, and userreports for users and IPs.


    Discussion

    Help?

    I've hit a wall editing the Icon Complex, and fear I can't get much more done without a solution to this problem. Here is an example:
    If you search for 171011_Supporting_Info.pdf on Google (Bing doesn't find this file), one of the results will lead you straight to the mother lode of information for said article, at the Hobart City Council website (I would just look there but find this site very hard to navigate). Unfortunately, Wikipedia won't let me cite these sources as they go through some funky Google redirecting process (I think?). Can someone please tell me a way to find the original link? The search result I have (and can't cite) is http://www.google.com.au/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja&ved=0CCgQFjAA&url=http%3A%2F%2Fwww.hobartcity.com.au%2Ffiles%2F06ef439a-450b-48dd-b33d-9f7b00f2017e%2F171011_Supporting_Info.pdf&ei=K9sBU5uDLsijkAX0zID4DA&usg=AFQjCNGy8xhtzY6MEgNPCjB8QQtbE7qQ8w&bvm=bv.61535280,d.dGI There must be a way to bypass Google and cite the source? Wiki ian 10:19, 17 February 2014 (UTC)[reply]

    You mean: http://www.hobartcity.com.au/files/06ef439a-450b-48dd-b33d-9f7b00f2017e/171011_Supporting_Info.pdf - pdfs are generally set to 'download', and it is then difficult to get the link itself (you are right, it is the redirect code). Hope this helps. --Dirk Beetstra T C 10:46, 17 February 2014 (UTC)[reply]
    thank you. Could you please give me some advice on how to find out the link for myself in future? Wiki ian 10:49, 17 February 2014 (UTC)[reply]
    Pages with parameters have a base part (here 'http://www.google.com.au/url'), followed by a '?', and then a list of parameters separated by '&' (the first one is the parameter 'sa', set to 't': 'sa=t', the second one is 'rct', set to 'j'). For google, the original link is in 'url', set to 'http%3A%2F%2Fwww.hobartcity.com.au%2Ffiles%2F06ef439a-450b-48dd-b33d-9f7b00f2017e%2F171011_Supporting_Info.pdf' - that link is 'percent encoded', '%3A' = ':'; '%2F' = '/' - you take that part and 'decode' the encodings in that list.
    For most, it is a matter of clicking the link and copy-pasting the result in the address bar, but for some a handler in your browser takes over (typically for pdf, xls, doc etc.). Asking here also helps :-). --Dirk Beetstra T C 13:00, 17 February 2014 (UTC)[reply]
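The manual steps described above (find the 'url' parameter, then undo the percent-encoding) can also be done with Python's standard `urllib.parse` module. This is just a sketch; the helper name `extract_redirect_target` is made up here, and it assumes the target sits in a query parameter called `url`, as in the Google link quoted in this thread.

```python
from urllib.parse import parse_qs, urlsplit


def extract_redirect_target(redirect_url, param="url"):
    """Return the percent-decoded value of `param` from the query string."""
    query = urlsplit(redirect_url).query
    values = parse_qs(query).get(param)  # parse_qs also decodes %3A, %2F, ...
    if not values:
        raise ValueError("no %r parameter found" % param)
    return values[0]


# The redirect link quoted earlier in this thread, split over lines for readability.
google_link = (
    "http://www.google.com.au/url?sa=t&rct=j&q=&esrc=s&source=web&cd=1&cad=rja"
    "&ved=0CCgQFjAA"
    "&url=http%3A%2F%2Fwww.hobartcity.com.au%2Ffiles%2F"
    "06ef439a-450b-48dd-b33d-9f7b00f2017e%2F171011_Supporting_Info.pdf"
    "&ei=K9sBU5uDLsijkAX0zID4DA&usg=AFQjCNGy8xhtzY6MEgNPCjB8QQtbE7qQ8w"
    "&bvm=bv.61535280,d.dGI"
)

print(extract_redirect_target(google_link))
```

Run on the link above, this prints the same hobartcity.com.au address given in the first reply.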

    COIBot / LiWa3

    I am busy slowly restarting COIBot and LiWa3 again - both will operate from fresh tables (LiWa3 started yesterday, 29/12/2013; COIBot started today, 30/12/2013). As I am revamping some of the tables, and they need to be regenerated (e.g. the user auto-whitelist-tables need to be filled, blacklist-data for all the monitored wikis), expect data to be off, and some functionality may not be operational yet. LiWa3 starts from an empty table, which also means that autodetection based on statistics will be skewed. I am unfortunately not able to resurrect the old data, that will need to be done by hand. Hopefully things will be normal again in a couple of days. --Dirk Beetstra T C 17:27, 30 December 2013 (UTC)[reply]

    Now what to do?

    I know the policy on short URLs. Once again, it gets in the way. http://archive.is/SSm7 is a short code for the webarchive. How much more legitimate a use can we find? They do not provide the longer code to reach this. So how do we get here? Trackinfo (talk) 00:14, 27 February 2014 (UTC)[reply]

    That's not a "short code for the webarchive." That's the full URL of a shady archiving site (not the real web.archive.org) that the community decided to disallow. Jackmcbarn (talk) 00:42, 27 February 2014 (UTC)[reply]
    As above, you should use The Internet Wayback Machine (Internet Archive) or WebCite. --///EuroCarGT 00:50, 27 February 2014 (UTC)[reply]
    It's a web archive but there are many others so I wouldn't say "the" webarchive. Which policy on short url's are you referring to? We don't want url shortening services like TinyURL which redirect to other websites but this is an archive and not a redirect so the issues are different. See previous discussions about archive.is at Wikipedia:Archive.is RFC and Wikipedia talk:Link rot. PrimeHunter (talk) 01:07, 27 February 2014 (UTC)[reply]
    Whatever the semantics, it's blacklisted. Trackinfo (talk) 02:11, 27 February 2014 (UTC)[reply]
    No it isn't. If it were blacklisted, you wouldn't be able to link to it above. However, consensus is that it should not be used on Wikipedia, so any links found to it should be removed. Jackmcbarn (talk) 02:25, 27 February 2014 (UTC)[reply]

    Note, that this is due to be blacklisted per outcome of an RfC; it is just awaiting removal of the plethora of links - an editfilter is in place to avoid additions with specific explanation. --Dirk Beetstra T C 11:06, 6 March 2014 (UTC)[reply]

    Change in functionality of spam blacklist

    Due to issues with determining the content of parsed pages ahead of time (see bugzilla:15582 for some examples), the way the spam blacklist works should probably be changed. Per bugzilla:16326, I plan to submit a patch for the spam blacklist extension that causes it to either delink or remove blacklisted links upon parsing, or replace them with a link to a special page explaining the blacklisting. This could be done either in addition to or instead of the current functionality. Are there any comments or suggestions on such a new implementation? Jackmcbarn (talk) 20:46, 3 March 2014 (UTC)[reply]

    I think that is a bad idea - sometimes links get blacklisted because of spamming or similar abuse, but older links should then be whitelisted if they do pass the bar. De-linking or even outright removal would be damaging to Wikipedia (one would remove legit references?). Such links should simply be whitelisted if they pass the merits of linking, as should be done for new links that one considers adding. As it is currently, blacklisted links that were there before blacklisting do not disable editing of a page, and there is now an effort going on to get those links whitelisted (to avoid the rarely occurring cases of 'accidental' removal). --Dirk Beetstra T C 11:06, 6 March 2014 (UTC)[reply]
    To clarify, the links would still be in the source of the page. They just won't be linked in the normal view of it. None of the links will be lost with this proposal. Jackmcbarn (talk) 16:30, 6 March 2014 (UTC)[reply]

    Can the blacklist handle article talk page spamming?

    When a blacklisted link to the personal website of some IP-shifting author's original research is repeatedly and disruptively added on various article talk pages, will a bot automatically undo new talk page edits containing that link and warn the user? - DVdm (talk) 16:39, 11 March 2014 (UTC)[reply]

    You have two separate questions implied:
    • Will the blacklist also affect talk pages? Yes, the blacklist prevents blacklisted links from being added to articles as well as talk pages. The blacklist doesn't undo edits. It blocks edits from happening.
    • Is there a bot that can undo spam edits? Yes, see User:XLinkBot. It has its own separate revert list and rules. It's useful for cases where a hard blacklisting of a site (like blogspot.com for example) isn't completely justified although most attempts to link to that site will be spam.
    Hope that helps. ~Amatulić (talk) 17:06, 11 March 2014 (UTC)[reply]
    Indeed that helps. See my next edit. - Cheers - DVdm (talk) 18:03, 11 March 2014 (UTC)[reply]
    Also editfilters might be useful for this. I thought they were used, but I am not certain. All the best, Rich Farmbrough, 18:05, 3 April 2014 (UTC).[reply]

    Full log

    after the heading "Old logs" please add
    * [[/Full_list/]] (Large but useful when you have no idea of the date.)

    All the best, Rich Farmbrough, 18:03, 3 April 2014 (UTC).

    Done --Redrose64 (talk) 19:00, 3 April 2014 (UTC)

    \bpower-technology\.com\b

    Does anyone know why power-technology.com is on the blacklist? A bot recently tagged Hazelwood Power Station. The link is arguably not at the best point in the article, but is this a mirror site or something? The link has been there since 2009. Yaris678 (talk) 12:33, 4 April 2014 (UTC)

    See #cbronline.com. Werieth (talk) 12:39, 4 April 2014 (UTC)
    OK. So links to the site have been added recently by a known spammer. The website appears to be associated with another website whose quality went down at some point. What do you think should be done in this instance? Yaris678 (talk) 12:55, 4 April 2014 (UTC)
    Block the spammer, I would say. Beagel (talk) 05:34, 5 April 2014 (UTC)
    Actually, this has been an ongoing issue. I would just find a better source; given it's just being used to cite basic facts, those should be easily found in better sources. Werieth (talk) 12:57, 4 April 2014 (UTC)
    More than 150 articles use that site for references, and most of them are valid references, not spam. Another site whose blacklisting heavily affects WP:Energy is offshore-technology.com (more than 130 articles; again, valid references, not spam). Do you do any consequence analysis before blacklisting sites? Anyway, I started a policy discussion here, as it is not acceptable that affected WikiProjects learn about blacklisting only post factum. Beagel (talk) 05:31, 5 April 2014 (UTC)

    Why the hell is the International Trade Union Confederation on the blacklist?

    http://www.ituc-csi.org/IMG/pdf/statement_by_global_unions_to_the_2013_annual_meetings_of_the_imf_and_world_bank.pdf.pdf

    That link doesn't work, for example. --Crossswords (talk) 00:03, 13 April 2014 (UTC)

    Eh, you just linked to it - so it is obviously not on the blacklist. --Dirk Beetstra T C 07:07, 13 April 2014 (UTC)

    Not true. I am allowed to post it on talk pages, but I am not allowed to edit articles using their URL as a source. --Crossswords (talk) 03:18, 18 April 2014 (UTC)

    That's not how the blacklist works. Can you provide more details? Werieth (talk) 03:20, 18 April 2014 (UTC)

    Heads up

    I have asked for Erwin's tool (SBHandler) to be ported from meta to here: Wikipedia:Gadget/proposals#m:User:Erwin.2FSBHandler_-_Spam_blacklist_handler. --Dirk Beetstra T C 09:57, 25 April 2014 (UTC)

    Gayot.com

    While editing VeeV I came across a link to a page on a site called novusvinum.com, which I traced and found now redirects to a page at gayot.com. Gayot is a reputable food and drink review website, and the page verifies the naming of VeeV as a "top 10 spirit" in 2010. When I tried to update the link in the article, I found that gayot.com is on the blacklist. The reason given for its inclusion is that it was being abused by multiple SPAs back in March 2011. Can we try removing the site now, three years later, as it is a legitimate source, and the abusers may by now have gone on to other projects? —Largo Plazo (talk) 11:21, 25 April 2014 (UTC)