
MediaWiki talk:Spam-blacklist: Difference between revisions

From Wikipedia, the free encyclopedia
Line 432: Line 432:


==Typemock.com==
==blog.typemock.com==
* {{LinkSummary|typemock.com}}
* {{LinkSummary|blog.typemock.com}}
Line 438: Line 437:
A terminated employee of Typemock - a global high tech company - violated the TOS, which resulted in Typemock.com and blog.typemock.com being added to the Wikipedia spam blacklist. That employee is no longer with Typemock.


Typemock is an international company that makes development tools for .NET and C++ developers. Customers include Microsoft, Intel, Morgan Stanley, Amdocs, and numerous companies in the US, Europe, and around the world. We have been featured in the software development media, such as SDTimes and EE Times. We are a Microsoft Development Partner as well. {{unsigned|Amechad}}

:I see that spamming of Typemock [http://en.wikipedia.org/w/index.php?title=List_of_mock_object_frameworks&diff=prev&oldid=330057676 continued with blacklist evasion]:

:*{{spamlink|typemock.org}}

:You have failed to argue why Wikipedia benefits from linking to this site. [[Adverse selection|We also do not remove entries from the spam blacklist in response to requests from the site owners]]. {{denied}}. [[User:MER-C|MER-C]] 03:14, 14 March 2011 (UTC)


=Completed Proposed removals=

Revision as of 03:14, 14 March 2011

    MediaWiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Addition of the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow-up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though only supports whole domains and not whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot, be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions — the disruption that can be caused is substantial.
    5. Close the request entry here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 418721276 after you have closed the request. See here for more info on logging.
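
    Item 4 above warns about the disruption a careless regular expression can cause. The following is a minimal sketch of two common pitfalls, an unescaped dot and a missing word boundary. It uses Python's re module rather than the blacklist extension itself, and the domain spam-example.com, both patterns, and the helper hits are hypothetical and not taken from the actual blacklist:

```python
import re

# Hypothetical entries for illustration only; neither comes from the real blacklist.
LOOSE = re.compile(r"spam-example.com", re.IGNORECASE)        # unescaped dot, no boundary
STRICT = re.compile(r"\bspam-example\.com\b", re.IGNORECASE)  # escaped dot, word boundaries

urls = [
    "http://spam-example.com/buy",     # intended target
    "http://www.spam-example.com/",    # intended target on a subdomain
    "http://notspam-example.com/",     # unrelated domain that merely ends the same way
    "http://spam-exampleXcom.org/",    # caught only because '.' means "any character"
]


def hits(pattern, candidates):
    """Return the candidate URLs that the pattern would block."""
    return [url for url in candidates if pattern.search(url)]


if __name__ == "__main__":
    print("loose :", hits(LOOSE, urls))
    print("strict:", hits(STRICT, urls))
```

    In this sketch the loose pattern also catches the two unrelated hosts, while the stricter one catches only the intended domain and its www subdomain.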


    Proposed additions

    sikhkaras.com

    Anonymous IPs have repeatedly added this link (which, although I do not understand the language used, plainly appears to be a commercial site selling various religious items) to Sikhism and, although my memory is not fresh on this, I believe to other Sikhism-related articles. --Nlu (talk) 15:38, 31 December 2010 (UTC)

    Not much activity with this link. I see it was added by 117.199.85.120 (talk · contribs) and 117.199.90.124 (talk · contribs) to Sikhism in December, and in late January it was added by 59.94.208.231 (talk · contribs) to Guru Gobind Singh, which I have just removed. There doesn't seem to be a push to spam this link, and the first two seem to be the same person. The last one, based on the edit summary, may be a good-faith attempt to include a quotation.
    COIBot, oddly, doesn't report any activity with this link. What's up with that? ~Amatulić (talk) 21:14, 8 February 2011 (UTC)
    COIBot was triggered by this;
    Also;
    Worth noting: Sikhkaras.net is unrelated to the .com of the same name; however, it was added by SikhKaras (talk · contribs). --Hu12 (talk) 16:30, 11 February 2011 (UTC)

    advocatekhoj.com

    Well over one hundred links to this site - apparently a legal portal that charges lawyers for referrals - have been added as "references" in recent weeks by a series of IPs in the 123.201.*.* range. The IPs have also been replacing existing ref links with this URL. It appears to be spam; I just thought I'd get a second opinion before removing them and blacklisting. --Ckatz chat spy 08:37, 14 February 2011 (UTC)

    Comment: this thread was blanked [1] by 123.201.77.78 (talk · contribs · WHOIS). --- Barek (talk • contribs) - 04:57, 16 February 2011 (UTC)
    Seems clear it's a spam issue, as I'm now seeing SPA accounts reverting/adding the links. I've added it to the blacklist and will continue cleaning it up later this evening. --Ckatz chat spy 06:47, 16 February 2011 (UTC)

    scribd.com

    I noticed Hu12 added a bunch of scribd.com links (one entry with some wildcards would have been sufficient; why all those?). I'm wondering, since scribd content consists (as far as I can tell) of original work posted by users or copies of copyrighted material, if anything on scribd would qualify as a WP:RS. If Hu12's additions are any indication, this blacklist could swell disproportionately with scribd links. ~Amatulić (talk) 19:42, 21 February 2011 (UTC)

    Seems scribd links are formatted with some form of unique document number (...scribd.com/doc/10935894/...), not by user name or ID. Those links are part of one persistent spammer's collection of spam links: a typical case of spamming, subverting the blacklist, vandalism, etc. The log has a link to the case. I would agree, Amatulić, as to scribd... it's a "honey pot" for WP:OR, WP:COPYRIGHT vios, and most things unreliable... perhaps this might be a candidate for a permanent block? --Hu12 (talk) 20:31, 21 February 2011 (UTC)
    Personally I'd support general blacklisting with support for whitelisting documents deemed acceptable. A lot of POV pushers have used scribd documents as a way to imply that a real scholarly paper has been published on something when in fact scribd has no editorial function. They have also been used to store copyright violations, as noted in the scribd article. Also, while not a reason for blacklisting, it's true that a lot of well-meaning editors have used scribd for sourcing simply because it looks like a reliable source, even though it generally isn't. Gavia immer (talk) 21:33, 21 February 2011 (UTC)
    One wouldn't need to blacklist the whole domain either, just \bscribd.com/doc/\b. ~Amatulić (talk) 21:53, 21 February 2011 (UTC)
    Found an old discussion supporting the same thing (note: the load time is long). Anyway, there are currently over 7000 scribd links on Wikipedia. Cleanup will need to be done first, otherwise we risk significant disruption. I think I recall someone was making a bot that could remove links; can't remember who. Seems there are quite a few sub-sections:
    • scribd.com/group/
    • scribd.com/share/
    • scribd.com/groups/
    • scribd.com/feeds/
    • scribd.com/explore/
    • scribd.com/community/
    • scribd.com/store/
    • scribd.com/webstuff/
    • scribd.com/upload/
    • scribd.com/partners
    • scribd.com/people/
    • scribd.com/mobile/
    • scribd.com/full/
    • blog.scribd.com/
    • scribd.com/collections/
    • scribd.com/press
    Author pages are located in the root, e.g. scribd.com/LauraNovak. --Hu12 (talk) 16:56, 22 February 2011 (UTC)
    The vast majority (5000+ links) are for scribd.com/doc/*. We could chip away at the most obvious ones first, such as scribd.com/(store|group|groups|community) and blog.scribd.com. I also see a few scribd links that match a familiar Wikipedia username.... looks like somebody trying to create their own article references. ~Amatulić (talk) 18:14, 22 February 2011 (UTC)
    I've revertlisted scribd.com on XLinkBot, which might help to keep mainspace a bit cleaner. Given this post, I would support blacklisting this. Note, we do not need to clean before blacklisting (pages with the link will still save), as long as they go ASAP afterwards. --Dirk Beetstra T C 13:32, 23 February 2011 (UTC)
    "We do not need to clean before blacklisting". We don't? How does that work? And would this explain why I'm able to save blacklisted links in the helium.com article? Just curious how this works; I'm fairly new to working on this list. ~Amatulić (talk) 01:44, 26 February 2011 (UTC)

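    The path-scoped approach discussed in the scribd.com thread above can be sketched as follows. This uses Python's re module purely as an illustration and does not reproduce how the blacklist extension assembles its entries. The patterns are variations on the fragments quoted in the thread (with the dots escaped), the sample strings are the ones mentioned there, and the helper name blocked_by is made up for this example:

```python
import re

# Variations on the fragments quoted in the thread; dots are escaped in this sketch.
PATTERNS = {
    "doc": re.compile(r"\bscribd\.com/doc/", re.IGNORECASE),
    "sections": re.compile(r"\bscribd\.com/(store|group|groups|community)\b", re.IGNORECASE),
    "blog": re.compile(r"\bblog\.scribd\.com\b", re.IGNORECASE),
}


def blocked_by(url):
    """Return the names of the sketch patterns that match the given link."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(url)]


if __name__ == "__main__":
    for link in (
        "scribd.com/doc/10935894/",   # document link format mentioned in the thread
        "scribd.com/groups/",         # one of the listed sub-sections
        "blog.scribd.com/",           # the blog subdomain
        "scribd.com/LauraNovak",      # author page located in the root
    ):
        print(link, "->", blocked_by(link))
```

    Under these assumptions, scoping the entries to paths leaves root-level author pages unaffected, which is the trade-off raised in the thread between blacklisting the whole domain and chipping away at specific sections.
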
    siggysoft.org

    This link was first added to the article email, but after an investigation the site seems to be a content mapper/rewriter of Wikipedia articles. mabdul 15:49, 23 February 2011 (UTC)

    classicalm.com


    This commercial site, which sells CDs and has no encyclopedic value whatsoever, has been repeatedly spammed by:

    82.117.252.87 (talk • contribs)

    Many more pages have been spammed than currently show, as editors have reverted the additions multiple times. The IP's talk page currently has multiple warnings, including a final warning concerning this, and the IP continued to spam, e.g. [2]. The IP is now temporarily blocked for 24 hours, but will no doubt continue to add this link once unblocked or once he shifts to another IP. This has been going on for months. – Voceditenore (talk) 13:24, 24 February 2011 (UTC)

    pregnancy.wisertogether.com

    Hmm ..

    Mainly the abovementioned subdomain:

    And the above IP. --Dirk Beetstra T C 08:58, 25 February 2011 (UTC)

    cn.zs.yahoo.com

    This link recently came up at the Help Desk[3] and was removed.[4] Also, in July 2007, a similar link was removed as "Dangerous link - Drive-by-drive download so we want to get rid of that link for security reasons?"[5] Linksummary above may have other examples. -- Uzma Gamal (talk) 17:14, 24 February 2011 (UTC)

    joshuaproject.net

    This website has been blacklisted at w:de since 2011; see w:de:SBL#joshuaproject.net (German). At the moment there are more than 700 links in w:en.
    I blacklisted that site at w:de because Joshua Project seems to be an aggressively proselytizing organisation. One of their main questions is "Which people groups still need an initial church-planting movement in their midst"[6]. Of course they have a great database on languages and peoples. But apart from that, every language page gives POV information about the "progress" (that is, the number of Christians) and, additionally, whether some Jesus film is available in that language.
    The interesting part of the website's content is collected from Ethnologue and LinguistList. Blacklisting joshuaproject.net would not delete or hide information, since it is available from those other projects. -- seth (talk) 21:56, 25 February 2011 (UTC), 23:26, 2 March 2011 (UTC)

    statsheet.com and associated sites

    I have no idea how this went on so long without being noticed, but this user has been spamming links to a network of sites (e.g. terpsball.com, hooreview.com) for two months. There are a whole slew of domain names.--B (talk) 19:54, 1 March 2011 (UTC)

    StatSheet network of online, fan-centric sports sites
    • Adsense google_ad_client = 1552698452868487
    Article Spam
    StatSheet (edit | talk | history | protect | delete | links | watch | logs | views)
    more Accounts
    RobbieStats (talk · contribs)
    Gfoster23 (talk · contribs)
    Aggregates all of its info and data from the Sports Network site. Claims to have over 300 sites... --Hu12 (talk) 20:24, 1 March 2011 (UTC)
    Whoa - why is statsheet.com being blacklisted? How come there was no invitation to discuss at Wikiproject:College Basketball? The site is a valid source of college statistics back to 1997. Rikster2 (talk) 20:56, 1 March 2011 (UTC)
    It's not blacklisted. Although there were concerns raised in previous discussions;
    I find that the use of sock accounts, repeatedly adding one's own site on top, canvassing for inclusion, and then attempting to add hundreds of one's own AdSense-related domains are not signs of good faith. --Hu12 (talk) 21:37, 1 March 2011 (UTC)
    So deal with the users, not the valid source of basketball stats. I have no connection with the site, but use it all the time. It is easily as valid as ESPN.com - used on 95% of sports articles. It's just a database with a user-friendly interface. I do think any discussion of forbidding the use of statsheet.com should invite in the WP:College Basketball Project which has a vested interest in valid historical statistics. Rikster2 (talk) 22:07, 1 March 2011 (UTC)
    If we had a technical means for legitimate users to use it while blocking spam, I'd be all for it. But if we let anyone spam their site to every team and player page, we'd have eleventy billion external links in every sports article. When someone repeatedly promotes their site, we take action to prevent disruption. If we reward spamming, then we only encourage it. Is it possible to get the statistics we need from the NCAA? I don't look for basketball all that much, but I know they have very detailed statistics pages for football. --B (talk) 23:13, 1 March 2011 (UTC)

    dvdrare.com

    Spammers

    See WikiProject Spam report MER-C 07:35, 11 March 2011 (UTC)

    potatoricer.org.uk

    Continued again today (see 188.135.28.137 (talk · contribs)), ongoing for over a year. See prior reports at:

    --- Barek (talk • contribs) - 17:13, 12 March 2011 (UTC)

    reflectionsindia.org

    Sock farm adding links across several articles. Also see Wikipedia:Sockpuppet_investigations/Hinduismispeace/Archive - MrOllie (talk) 21:51, 13 March 2011 (UTC)

    Completed Proposed additions

    sunbizar-technologies.com

    Vandalistic SEO spamming. See WikiProject Spam report MER-C 10:27, 17 February 2011 (UTC)

    Done --Hu12 (talk) 20:44, 17 February 2011 (UTC)
    Continued spamming of the related renovations.co.nz; this report was also vandalized
    Added --Hu12 (talk) 18:11, 9 March 2011 (UTC)

    kavkazcenter.com

    This is a "radical Islamic website" providing disinformation about the Caucasus region/world. I've also made a request here http://en.wikipedia.org/wiki/Wikipedia:Reliable_sources/Noticeboard#Kavkaz_Center_.28everyone_can_help.21.29 to get other opinions/experience with this page. Please take a look, thanks in advance! --84.168.101.210 (talk) 19:55, 28 February 2011 (UTC)

    Done. Please start cleaning up the mainspace articles containing this. See http://en.wikipedia.org/wiki/Special:LinkSearch/*.kavkazcenter.com ~Amatulić (talk) 20:36, 1 March 2011 (UTC)

    supportdock.com

    Svarya (talk · contribs) has been adding spam links to this site to various articles despite multiple requests to stop doing so. Socrates2008 (Talk) 09:14, 1 March 2011 (UTC)

    Spammed related,
    Article Spam
    IYogi (edit | talk | history | protect | delete | links | watch | logs | views)
    Accounts
    Svarya (talk · contribs)
    Alenaross07 (talk · contribs)
    Roskey44 (talk · contribs)
    Manish-iyogi (talk · contribs)
    Diyogi (talk · contribs)
    Devi.rathore (talk · contribs)
    Sudhir nyc (talk · contribs)
    125.19.48.38 (talk • contribs)
    supportdock.com/iyogi.html. Seems to be a multi-account/IP spam effort by this company. Done --Hu12 (talk) 19:52, 1 March 2011 (UTC)

    picdix.com


    Added by the following IPs:

    Report at Wikipedia_talk:WikiProject Spam/2011 Archive Feb 1 claims cross-wiki spamming although I see just one instance of that happening. See also "We are not spammers" message posted by the site owner to [7] by 213.150.228.38 in response to MER-C removing 25 links.

    The site consists of line drawings of various things (a "picture dictionary") some of which would serve Wikipedia better if they were uploaded to Commons and used directly in articles. ~Amatulić (talk) 21:54, 3 March 2011 (UTC)

    I think we can get away with not listing this one for now - the latest post on my talk page gives me the feel that they will add no more links for now, although the user did not answer my question. MER-C 08:31, 4 March 2011 (UTC)
    We'll hold off on this one for now; clearly, if the additions return, this should be added. Thanks. Not done --Hu12 (talk) 16:25, 10 March 2011 (UTC)

    healthsystemcanada.com

    Related domains
    Spammers

    See WikiProject Spam report MER-C 07:19, 4 March 2011 (UTC)

    Added --Hu12 (talk) 16:30, 10 March 2011 (UTC)

    xtremegapyear.co.uk

    Repeated addition of identical low-quality, otherwise unsourced, section to Gap year containing this spam link. The perpetrator uses changing anonymous IPs geolocated all over the place:

    While low intensity, no indication this is going to stop. Perps cannot be warned off and blocked by conventional means. --Lambiam 09:05, 5 March 2011 (UTC)

    Added --Dirk Beetstra T C 16:06, 6 March 2011 (UTC)

    All three already added. The spammers continued with the former of these three after the original one was blacklisted. Many IPs spamming the links. --Dirk Beetstra T C 12:00, 9 March 2011 (UTC)

    spectrumscandal.com

    Anonymous amateur POV blog about the ongoing 2G cellular spectrum auction investigation in India. Repeated addition to related articles by Mahalaxmanan (talk · contribs) and sock-puppet Spectrumraja (talk · contribs):

    Sockmaster has been 72-hour banned and puppet is indef-banned. I propose that the domain be blacklisted to avoid the need for ongoing supervision. - Pointillist (talk) 00:50, 7 March 2011 (UTC)

    Update: another account is adding the same link to the same articles. I've filed this new SPI case, but it would be better to blacklist the site for good rather than have to keep detecting and blocking sock accounts that seem to be created just for the purpose of adding links to spectrumscandal.com - Pointillist (talk) 10:09, 10 March 2011 (UTC)
    Mahalaxmanan (talk · contribs)
    Spectrumraja (talk · contribs)
    Mohanloganathan (talk · contribs)
    Niraradia (talk · contribs)
    Added. Thanks for reporting, Pointillist. --Hu12 (talk) 16:40, 10 March 2011 (UTC)

    marvelousessays.com

    MER-C 11:08, 9 March 2011 (UTC)

    See WikiProject Spam report. Added --Hu12 (talk) 18:00, 9 March 2011 (UTC)

    portablegeneratorinfo.com

    Acroterion (talk) 13:16, 10 March 2011 (UTC)

    also;
    Portable Generator Information (edit | talk | history | protect | delete | links | watch | logs | views)
    Done. Thanks Acroterion. --Hu12 (talk) 16:49, 10 March 2011 (UTC)

    Proposed removals

    KoolMuzone.com

    A quick run through the logs reveals that the domain was blocked due to the actions of a possible vandal. The site is otherwise one of the best and most respected blogs in Pakistan and won the Best Music Blog of Pakistan award last year. They regularly break news and launch artists by partnering with national radio stations here. I request that its inclusion in the blacklist be reconsidered. Thanks. UzEE 02:40, 5 March 2011 (UTC)

    Sorry, but this was plainly spammed, and it is blacklisted on 3 different wikis (I am actually wondering why it is not meta-blacklisted). If the blog is notable enough for its own article on en.wikipedia, I would suggest whitelisting an about.htm or the index.htm for use on that article. Blogs generally make bad external links and are often not really suitable as references either; given the multi-IP, multi-article, multi-wiki spamming, I would say: Defer to Whitelist for specific links which are suitable here or there. --Dirk Beetstra T C 08:25, 11 March 2011 (UTC)

    NingboGuide.com

    Requesting removal of spam status for the following reasons: Ningboguide.com is a directory/travel site rated #3 on Google and in the top 5 of most search engines irrespective of country. The site represents a city of approximately 5 million residents and has a mission statement of providing complete information, from medical care to nightlife. In addition, the site is the only link to the city's only English magazine. Site content is written by company staff and contributing authors, as well as contributed by other entities, such as press releases from local government, consulates, 5-star hotels, etc. I have read the previous logs pertaining to why the site was blacklisted. To my knowledge, as owner of the company, this was done between individuals not related to the company/site/magazine. The site is truly an informational tool used by many prior to and during their business, tourist, or residential stays in Ningbo, China, which matches the goals of Wikipedia in my opinion. Thanks for taking the time to read and consider. Thaneningbo (talk) 10:36, 11 March 2011 (UTC) Thane

    A bit perennial, this. This was requested earlier by editors clearly involved in the site. This site, and sites related to it, were aggressively added throughout Wikipedia, and also on other Wikipedias. In fact, this specific site is blacklisted on three different wikis (which makes it almost suitable for global blacklisting). There were many edits removing other links in favour of these, and many additions, mainly by IPs and editors who seem to have a big interest in this site.

    Thaneningbo - that this site is rated #3 on Google and top 5 on other search engines may say something about the efficiency of this site's SEO (that ranking would likely be higher if Wikipedia were also involved ..). We hardly ever, if ever, de-list at the request of someone involved in the site, so just as earlier: Declined - if uninvolved, high-volume editors come to request this, then de-listing may be considered (though given the history of additions of this domain and related domains, I would then suggest deferring this to the whitelist for a specific link to be used on one specific article).

    Some further statistics:

    users
    links

    --Dirk Beetstra T C 11:54, 11 March 2011 (UTC)

    Typemock.com

    A terminated employee of Typemock - a global high tech company - violated the TOS, which resulted in Typemock.com and blog.typemock.com being added to the Wikipedia spam blacklist. That employee is no longer with Typemock.

    Typemock is an international company that makes development tools for .NET and C++ developers. Customers include Microsoft, Intel, Morgan Stanley, Amdocs, and numerous companies in the US, Europe, and around the world. We have been featured in the software development media, such as SDTimes and EE Times. We are a Microsoft Development Partner as well. — Preceding unsigned comment added by Amechad (talk • contribs)

    I see that spamming of Typemock continued with blacklist evasion:
    You have failed to argue why Wikipedia benefits from linking to this site. We also do not remove entries from the spam blacklist in response to requests from the site owners. Denied. MER-C 03:14, 14 March 2011 (UTC)

    Completed Proposed removals

    ehow.com

    I am requesting that eHow be unblocked because I edited Monterey Bay Aquarium and came upon the message that it was blocked. I am working hard to promote it to GA status. Also, if information is added to an article that regards how to do a task, I would certainly use this link to cite such info. Thank you. Bulldog73 (talk) 05:16, 1 March 2011 (UTC)

    Ehow.com is a content farm to which anyone can contribute, which pays its writers based on page views ([8]), exercises little editorial oversight and whose content is essentially self-published. See also [9]. If you really want to use a specific link for the article, then Defer to Whitelist. MER-C 09:41, 1 March 2011 (UTC)

    google.com/cse?

    I do not know why these are blocked, so in all honesty I can't tell you why this one should be unblocked. It seems perfectly harmless to me. Speaking as a newbie to the Spam apparatus here, I can say with some authority that it is set up in an unnecessarily arcane and uninformative manner. The current blocked list is nowhere in sight, and yet users are supposed to check the list to see whether it is a Meta block or a local block. So I am putting this up on both. Surely software can determine this for users. Anarchangel (talk) 02:13, 19 February 2011 (UTC)

    Defer to Global blacklist: it's in the global blacklist. Please see meta:Talk:Spam blacklist#google.com/cse?. Jafeluv (talk) 13:44, 21 February 2011 (UTC)

    educationupdate.com

    "Education Update" is a 14-year-old, award-winning newspaper with 100,000 readers and 2 million monthly hits on the web. Our readership includes parents, teachers, students, guidance counselors in NY and NJ, principals, superintendents, librarians, college presidents, college deans, foundation heads, politicians, business leaders and medical school deans. Education Update is mailed to over 1600 public schools in NYC, 170 schools in NJ, 207 public libraries, 150 private schools. Qwerty200075 (talk) 05:18, 28 February 2011 (UTC)[reply]

    Not done. "Our readership"? We generally don't remove listings at the request of a site owner, editor, employee, or anyone otherwise connected to the site, especially for a site as heavily spammed on Wikipedia as this one. If a trusted, high-volume editor requests removal, we may consider it seriously. For now, you can request specific pages to be whitelisted at MediaWiki talk:Spam-whitelist. ~Amatulić (talk) 00:31, 1 March 2011 (UTC)

    No problem, I understand, but just to let you know, I'm not related to them at all; I just tried to use one page of their site as a source. Qwerty200075 (talk) 15:13, 1 March 2011 (UTC)

    Troubleshooting and problems

    Comment by blacklister A reason was given in the log: repeat spamming of an NN zine. It was spammed by more than one role account. I'm strongly against removing it from the blacklist based on this, and I highly doubt that a non-notable music blog gets any exclusive scoops not available on other sites. If such a situation did exist, it would be more appropriate to whitelist for individual cases. Furthermore, I can't help being suspicious of a freshly created account requesting a blacklisting removal so quickly. OhNoitsJamie Talk 14:41, 30 January 2011 (UTC)

    Hellwars.com

    Hello. I'm wondering why the website with the URL stated above was blacklisted. I'm a player of this MMORPG and I would like to create a Wiki article. It took me hours to write everything, and now that I want to save the page and ask for feedback, I can't, since it says this URL has been blacklisted. Could I get some help regarding this, because I don't really see how this site could break any rules :/ Thanks. ClammieR (talk) 23:36, 11 January 2011 (UTC)

    It was blacklisted globally at Meta, not here, which suggests that it was spammed heavily on multiple wikis. Before you bother petitioning there, ask yourself if the site would meet our WP:WEB policy (which I personally doubt it will) before you spend any more time on it. OhNoitsJamie Talk 00:36, 12 January 2011 (UTC)[reply]

    Hmm, I see. Didn't know that. But I'm wondering how torrent sites or games similar to this one can be on Wikipedia, while this one can't. I'll be reading more about your policy & Meta stuff tomorrow, but I don't get why you have double standards for similar or even worse websites. Could it be because maybe nobody reported the other sites, or they weren't checked for abusive behaviour or something similar? Thanks for the fast reply, by the way! — Preceding unsigned comment added by ClammieR (talkcontribs) 00:40, 12 January 2011 (UTC)

    Logging / COIBot Instr

    Blacklist logging

    Full instructions for admins


    Quick reference

    For Spam reports or requests originating from this page, use template {{/request|0#section_name}}

    • {{/request|213416274#Section_name}}
    • Insert the oldid 213416274, a hash "#", and the Section_name (Underscoring_spaces_where_applicable):
    • Use within the entry log here.

    For Spam reports or requests originating from Wikipedia_talk:WikiProject_Spam use template {{WPSPAM|0#section_name}}

    • {{WPSPAM|182725895#Section_name}}
    • Insert the oldid 182725895, a hash "#", and the Section_name (Underscoring_spaces_where_applicable):
    • Use within the entry log here.
    Note: If you do not log your entries, they may be removed if someone appeals an entry and no valid reasons can be found.

    Addition to the COIBot reports

    The lower list in the COIBot reports now has, after each link, four numbers between brackets (e.g. "www.example.com (0, 0, 0, 0)"):

    1. first number, how many links did this user add (is the same after each link)
    2. second number, how many times did this link get added to wikipedia (for as far as the linkwatcher database goes back)
    3. third number, how many times did this user add this link
    4. fourth number, to how many different wikipedias did this user add this link.

    If the third number or the fourth number are high with respect to the first or the second, then that means that the user has at least a preference for using that link. Be careful with other statistics from these numbers (e.g. good user who adds a lot of links). If there are more statistics that would be useful, please notify me, and I will have a look if I can get the info out of the database and report it. This data is available in real-time on IRC.
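The reading of the four numbers described above can be sketched in Python. This is only an illustration: the entry format is assumed from the example "www.example.com (0, 0, 0, 0)", and the names LinkStats, parse_entry and preference_ratios are hypothetical, not part of COIBot.

```python
import re
from typing import NamedTuple


class LinkStats(NamedTuple):
    """The four bracketed numbers, in the order the report lists them."""
    user_links: int       # first: links added by this user in total
    link_adds: int        # second: times this link was added by anyone
    user_link_adds: int   # third: times this user added this link
    wikis: int            # fourth: wikis to which this user added this link


# Assumed entry shape: "<link> (<n1>, <n2>, <n3>, <n4>)"
_REPORT_ENTRY = re.compile(r"^(\S+) \((\d+), (\d+), (\d+), (\d+)\)$")


def parse_entry(entry: str) -> tuple[str, LinkStats]:
    """Split one report entry into the link and its four numbers."""
    match = _REPORT_ENTRY.match(entry.strip())
    if match is None:
        raise ValueError(f"not a report entry: {entry!r}")
    return match.group(1), LinkStats(*map(int, match.groups()[1:]))


def preference_ratios(stats: LinkStats) -> tuple[float, float]:
    """Return (third / first, third / second), using 0.0 when a divisor is 0.

    A value near 1.0 in either position suggests the user prefers this link.
    """
    of_user = stats.user_link_adds / stats.user_links if stats.user_links else 0.0
    of_link = stats.user_link_adds / stats.link_adds if stats.link_adds else 0.0
    return of_user, of_link
```

As the paragraph above warns, a high ratio is only a hint: a good-faith editor who adds many links can produce the same numbers.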

    Poking COIBot

    When adding {{LinkSummary}}, {{UserSummary}} and/or {{IPSummary}} templates to WT:WPSPAM, WT:SBL, WT:SWL and User:COIBot/Poke (the latter for privileged editors) COIBot will generate linkreports for the domains, and userreports for users and IPs.


    Discussion

    ip address URLs

    I have often wondered why we allow pure IP addresses to be linked to. It's one of the easiest ways to bypass the SBL, and it also means that these links break often and cannot be corrected, because the previous host name is not known. Any random thoughts? ΔT The only constant 20:17, 8 October 2010 (UTC)

    Hello???? any one notice this section? ΔT The only constant 16:10, 2 November 2010 (UTC)[reply]
    I can't think of a good reason to allow IP address links off the top of my head. I don't think the regex to do it would be too difficult. Anyone else? OhNoitsJamie Talk 16:25, 2 November 2010 (UTC)[reply]
    I did notice, Δ, and I think the answer is that IP address links might be useful outside of mainspace. I agree that they are surely not wanted in mainspace, and outside of mainspace they need not be linked. However, blacklisting such a wide class of links probably needs broad consensus, or at least a willingness to turn the blacklisting off if there are complaints. Gavia immer (talk) 16:33, 2 November 2010 (UTC)[reply]
    For reference, the regex would be
    \b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b — although this will also match impossible address like 666.777.888.999. It's probably good enough, although if you really want to match the legal ranges 0-255.0-255.0-255.0-255 you'd need a complicated expression:
    \b(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b - yech! ~Amatulić (talk) 00:27, 5 January 2011 (UTC)[reply]
    A lot of governments have been seizing or blocking domains which point to content they deem inappropriate (I believe some governments blocked WikiLeaks recently.) Linking directly to the IP is often an effective way to get around censorship if the domains have been seized.Bpodgursky (talk) 23:29, 20 February 2011 (UTC)[reply]
    In the unlikely event that comes up, the individual addresses can be whitelisted on a case by case basis. Stifle (talk) 11:52, 25 February 2011 (UTC)[reply]
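For anyone who wants to try the two expressions quoted earlier in this thread, here is a small Python sketch. It uses Python's re module rather than the blacklist's own regex engine, so it only illustrates the difference between the two patterns; the names LOOSE_IP, OCTET and STRICT_IP are made up for this example.

```python
import re

# The short form: any four dot-separated groups of one to three digits.
LOOSE_IP = re.compile(r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b")

# The long form: each group restricted to 0-255.
OCTET = r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
STRICT_IP = re.compile(r"\b" + r"\.".join([OCTET] * 4) + r"\b")

for url in ("http://192.168.0.1/page", "http://666.777.888.999/page"):
    print(url, bool(LOOSE_IP.search(url)), bool(STRICT_IP.search(url)))
```

Both patterns accept the first URL, and only the short form accepts the impossible address in the second. Either pattern would also match other dotted number runs, such as a four-part version number, which may be worth checking before any such rule is considered.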

    Automatic archiving

    Due to the format of this page and how we archive, most archive bots cannot function here. However, I just took a few minutes and wrote a custom script that should do it for us. It makes one change, converting {{LinkSummaryLive}} to {{LinkSummary}}, in order to bypass any spam filter issues. (I may need to adjust it some more.) There are two variables that can be configured: the age for stale conversations, and the age for ones tagged with templates indicating defer/done/not done etc. Right now my thoughts would be to set stale conversations to 30 days, and those tagged to 15. Thoughts? ΔT The only constant 05:50, 25 January 2011 (UTC)

    To keep this page clear, I'd like to see automated archiving - though I also like the thing we do on the whitelist: we have the open requests, which get either granted or denied, they then get moved to an appropriate section (IMHO, that could be after 24 hours), and later archived (which would be nice after say, 1-2 weeks, bit depending on size). At least they are then quick out of the 'open' area, which makes it easier to focus on what needs 'quick' attention, while still having the posts handy for some time if the problem expands to other areas, or if there are quick de-listing requests.
    I would also suggest that both 'live' links get converted (and the {{LinkSummaryLive}} converted to {{LinkSummary}}) when moving the requests.
    All in all, yes, please! --Dirk Beetstra T C 12:32, 25 January 2011 (UTC)[reply]
    If you want to create the new sections I can tweak the code. I would request that each "section" retain the primary '=' section level, so that we are not mixing section levels, but it would be trivial to adjust my archive code. Just let me know the time periods, and I could have the code operational in less than 24 hours, and then would go ahead with the BRFA process. ΔT The only constant 18:10, 25 January 2011 (UTC)
    To the original question ... what is the bot name? Has it already been approved, or is it pending approval? For time duration, I think we can start it with 45 days stale, and tighten it up later if needed. I would prefer to have longer than needed as the starting point and adjust down, rather than too short and adjusting up. My only other concern is ensuring there's an easy to access emergency off switch (possibly linked from the header for this page). --- Barek (talk) - 18:30, 25 January 2011 (UTC)[reply]
    I have not filed for approval yet. I wanted to flesh the idea out, find issues, and get those addressed before ever going to the BRFA process. As for the shutoff, that should be trivial, just a matter of configuring a wiki page. ΔT The only constant 18:34, 25 January 2011 (UTC)
    Before proceeding any further, you may want to read Wikipedia talk:Blocked external links, which is proposing some changes to where these requests are submitted, as well as how the requests on the page are structured. --- Barek (talk) - 19:41, 25 February 2011 (UTC)[reply]
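The two thresholds proposed in this thread (30 days for stale conversations, 15 days for tagged ones) could be expressed roughly as follows. This is a sketch under stated assumptions, not the script described above: the function names are invented, the timestamp is assumed to look like the signatures on this page, and parsing the month name relies on an English locale.

```python
from datetime import datetime, timedelta

# Thresholds from the original proposal; a 45-day starting point was also suggested.
STALE_AFTER = timedelta(days=30)    # untagged threads
TAGGED_AFTER = timedelta(days=15)   # threads tagged defer/done/not done

# Signature timestamps on this page look like "05:50, 25 January 2011 (UTC)".
SIGNATURE_FORMAT = "%H:%M, %d %B %Y"


def parse_signature_time(stamp: str) -> datetime:
    """Parse a talk-page signature timestamp (UTC, returned as a naive datetime)."""
    return datetime.strptime(stamp.replace(" (UTC)", ""), SIGNATURE_FORMAT)


def should_archive(last_comment: datetime, now: datetime, tagged: bool) -> bool:
    """True when the newest comment is older than the applicable threshold."""
    limit = TAGGED_AFTER if tagged else STALE_AFTER
    return now - last_comment > limit
```

Keeping the two limits as named constants means tightening them later is a one-line change, and the emergency off switch could be a separate check made before any thread is examined.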

    Possible malware

    There's a question at RSN about a possible malware site. Could someone take a look at Wikipedia:Reliable_sources/Noticeboard#Please_check_the_source? WhatamIdoing (talk) 06:01, 12 February 2011 (UTC)[reply]

    I ran the URL through a few malware/threat detectors, and it seems OK.
    Here are a few scanner tools that could be useful.
    --Hu12 (talk) 19:53, 12 February 2011 (UTC)[reply]

    Moving

    It is proposed to relocate this process and MediaWiki talk:Spam-blacklist to Wikipedia:Blocked external links in order to reduce the "spam" connotations. Please comment at Wikipedia talk:Blocked external links. Stifle (talk) 11:51, 25 February 2011 (UTC)[reply]

    I've done just a basic check for dead links, assuming that \bname\.com\b is name.com, and have discovered that about 1,200 of the links listed here are dead. I'm also working to clean up the dupes and merge some regexes. I'll post the list of merged/dupes in a day or so, but should we remove the dead sites? ΔT The only constant 21:54, 2 March 2011 (UTC)

    Be careful merging regexes. "Clever" merged regexes may take fewer lines, but they are not necessarily faster (or slower). Rich Farmbrough, 02:58, 3 March 2011 (UTC).
    Trust me, anything too complex gives me a headache with regex. See below for the adjustments. Each list has all the old regexes, and the last item in each group is the new regex, except when the # comments state something else. ΔT The only constant 04:41, 3 March 2011 (UTC)
    Extended content
    \b123(baisakhi|chinesenewyear|christians|durgapuja|janmashtami|kazaa|movingcompany|newyear|pongal|refills)\.com\b
    \b123breakingnews\.com\b
    \b123(baisakhi|chinesenewyear|christians|durgapuja|janmashtami|kazaa|movingcompany|newyear|pongal|refills|breakingnews)\.com\b
    
    ncaa09rosters\b
    ncaa2009rosters\b
    ncaa(20)?09rosters\b
    
    \b3fatchicks\.com\b
    \b3fatchicks\.net\b
    \b3fatchicks\.(com|net)\b
    
    \b360bracketz\.com\b
    \b360elite4free\.com\b
    \b360(elite4free|bracketz)\.com\b
    
    \b1000daysofhell\.blogspot\.com\b
    \b1000daysreality\.blogspot\.com\b
    \b1000days(reality|ofhell)\.blogspot\.com\b
    
    \b1909svdbcent\.com\b
    \b1909vdbcent\.com\b
    \b1909s?vdbcent\.com\b
    
    \babbie-cornish\.com\b
    \babbie-cornish\.org\b
    \babbie-cornish\.(org|com)\b
    
    # on the list twice
    \babsolutechinatours\.com\b
    \babsolutechinatours\.com\b
    
    \bacne(busterpro|empire)\.com\b
    \bacneguide\.com\b
    \bacnevulgaris\.blogspot\.com\b
    \bacne(busterpro|empire|guide|vulgaris\.blogspot)\.com\b
    
    \badsenserevenue2u\.blogspot\.com\b
    \badsenserevenue\.blogspot\.com\b
    \badsenserevenue(2u)?\.blogspot\.com\b
    
    \baion-powerlevel\.com\b
    \baion-powerlevel\.net\b
    \baion-powerlevel\.(net|com)\b
    
    \balaska(-defensivedriving|cruisereview)\.com\b
    \balaskanorthernlights\.com\b
    \balaska(-defensivedriving|cruisereview|northernlights)\.com\b
    
    \ball-(about-seashells|voip-info)\.com\b
    \ball-business-advertising\.com\b
    \ball-media-converter\.com\b
    \ball-(about-seashells|voip-info|media-converter|business-advertising)\.com\b
    
    \ballbeerblog\.blogspot\.com\b
    \ballbeerblog\.com\b
    \ballbeerblog(\.blogspot)?\.com\b
    
    \bamber-heard\.net\b
    \bamber-heard\.org\b
    \bamber-heard\.(org|net)\b
    
    \bamericanidolfanclub\.blogspot\.com\b
    \bamericanidolnews2009\.blogspot\.com\b
    \bamericanidol(news2009|fanclub)\.blogspot\.com\b
    
    \banidb\.info\b
    \banidb\.net\b
    \banidb\.(net|info)\b
    
    \banimalopedia\.blogspot\.com\b
    \banimalopedia\.com\b
    \banimalopedia(\.blogspot)?\.com\b
    
    \banimalsinthecity2\.webs\.com\b
    \banimalsinthecity\.webs\.com\b
    \banimalsinthecity2?\.webs\.com\b
    
    \bapakistannews\.com\b
    \bapakistantimes\.com\b
    \bapakistan(times|news)\.com\b
    
    \basyncop\.com\b
    \basyncop\.net\b
    \basyncop\.(net|com)\b
    
    \bbackroads(blog|marketing)\.com\b
    \bbackroadsmarketing\.blogspot\.com\b
    \bbackroads(blog|marketing(\.blogspot)?)\.com\b
    
    \bbalajinadar\.com\b
    \bbalancivity\.com\b
    \bbala(ncivity|jinadar)\.com\b
    
    \bbaltimore(cruises|travel)\.com\b
    \bbaltimoremdjobs\.com\b
    \bbaltimore(cruises|travel|mdjobs)\.com\b
    
    \bbenovarghese\.blogspot\.com\b
    \bbenovarghese\.com\b
    \bbenovarghese(\.blogspot)?\.com\b
    
    \bbest-cialis-store\.com\b
    \bbest-levitra-store\.com\b
    \bbest-(cialis|levitra)-store\.com\b
    
    \bbestbuybirthcontrol\.com\b
    \bbestbuys-rx\.com\b
    \bbestbuy(birthcontrol|s-rx)\.com\b
    
    \bbloodpressuresafety\.com\b
    \bbloodpressuretruth\.com\b
    \bbloodpressure(safety|truth)\.com\b
    
    \bbuzzingstock\.in\b
    \bbuzzingstock\.net\b
    \bbuzzingstock\.(net|in)\b
    
    \bcaliforni(a-contracosta-trafficschool|a-trafficschool-online|abusinessimages|apaintings|aresort|askiresort|cationfan)\.com\b
    \bcalifornia-sunrooms\.com\b
    \bcaliforni(a-contracosta-trafficschool|a-trafficschool-online|abusinessimages|apaintings|aresort|askiresort|cationfan|a-sunrooms)\.com\b
    
    \bchristinahendricks\.info\b
    \bchristinahendricks\.org\b
    \bchristinahendricks\.(org|info)\b
    
    \bclassi(ccarbase|fiedswow)\.com\b
    \bclassicalviolinvideos\.com\b
    \bclassi(ccarbase|fiedswow|calviolinvideos)\.com\b
    
    \bclick2chicago\.com\b
    \bclick2detroit\.com\b
    \bclick2losangeles\.com\b
    \bclick2philadelphia\.com\b
    \bclick2phoenix\.com\b
    \bclick2sanfrancisco\.com\b
    \bclick2(chicago|detroit|losangeles|philadelphia|phoenix|sanfrancisco)\.com\b
    
    \bclientsfirst-us\.com\b
    \bclientsfirst\.biz\b
    \bclientsfirst(-us\.com|\.biz)\b
    
    \bclosingcostfax\.com\b
    \bclosingcostfaxwholesale\.com\b
    \bclosingcostfax(wholesale)?\.com\b
    
    \bcolombo77\.com\b
    \bcolombopro\.com\b
    \bcolombo(pro|77)\.com\b
    
    \bcottagecountry\.com\b
    \bcottagerental\.com\b
    \bcottage(rental|country)\.com\b
    
    \bcrackcbse\.blogspot\.com\b
    \bcrackcbse\.in\b
    \bcrackcbse(\.in|\.blogspot\.com)\b
    
    \bcustomeracquisitionmanagementservices\.com\b
    \bcustomeracquisitionservices\.com\b
    \bcustomeracquisition(managementservices|services)\.com\b
    
    \bddr3800\.com\b
    \bddr31066\.com\b
    \bddr31333\.com\b
    \bddr31600\.com\b
    \bddr31600\.com\b
    \bddr\d+\.com\b
    
    \bdeepcreekhotproperties\.com\b
    \bdeepcreeklakeproperty\.com\b
    \bdeepcreeklakevacay\.com\b
    \bdeepcreekvacations\.com\b
    \bdeepcreekvacay\.com\b
    \bdeepcreek(vacay|vacations|lakevacay|lakeproperty|hotproperties)\.com\b
    
    \bdiscount(-hotels-cheap\.biz|airfaresaustralia\.com|couponsguide\.com|cruisereservation\.com|edpelletstoves\.net|skivacations\.com)\b
    \bdiscount-fishing-tackle\.com\b
    \bdiscount(-hotels-cheap\.biz|(-fishing-tackle|airfaresaustralia|couponsguide|cruisereservation|skivacations)\.com|edpelletstoves\.net)\b
    
    \bdownunderendeavors\.com\b
    \bdownunderendeavours\.com\b
    \bdownunderendeavours\.net\b
    \bdownunderendeavours\.org\b
    \bdownunderendeavou?rs\.(org|com|net)\b
    
    \bdurgapurcity\.co\.in\b
    \bdurgapurcity\.com\b
    \bdurgapurcity\.(com|co\.in)\b
    
    \benjoydeals\.blogspot\.com\b
    \benjoydeals\.com\b
    \benjoydeals(\.blogspot)?\.com\b
    
    \bexperthub\.com\b
    \bexperthub\.net\b
    \bexperthub\.(net|com)\b
    
    \bf150online\.com\b
    \bf150online\.net\b
    \bf150online\.(net|com)\b
    
    \bfacebookloginhut\.com\b
    \bfacebookloginnow\.com\b
    \bfacebooklogin(now|hut)\.com\b
    
    \bfitfreak\.net\b
    \bfitfreaks\.net\b
    \bfitfreaks?\.net\b
    
    # On the list twice:
    \bfluoridealert\.org\b
    \bfluoridealert\.org\b
    
    \bforex-foreignexchange\.blogspot\.com\b
    \bforex-gold-invest\.blogspot\.com\b
    \bforex-iforex\.blogspot\.com\b
    \bforex-(iforex|gold-invest|foreignexchange)\.blogspot\.com\b
    
    \bgames2cool\.com\b
    \bgames2relax\.com\b
    \bgames2(relax|cool)\.com\b
    
    \bgeocities\.com\/4christ\.geo\b
    \bgeocities\.com/maskedriderthenext\b
    \bgeocities\.com/snuffbottle1\b
    \bgeocities\.com/(snuffbottle1|4christ\.geo|maskedriderthenext)\b
    
    \bget2dallas\.com\b
    \bget2houston\.com\b
    \bget2sandiego\.com\b
    \bget2(dallas|houston|sandiego)\.com\b
    
    \bgods-and-heroes-gold\.com\b
    \bgods-and-heroes-power-leveling\.com\b
    \bgods-and-heroes-powerleveling\.com\b
    \bgods-heroes-power-leveling\.com\b
    \bgods-heroes-gold\.com\b
    \bgods(-and)?-heroes-(gold|power-?leveling)\.com\b
    
    \bgraffiti\.net/allinfo\b
    \bgraffiti\.net/extranews
    \bgraffiti\.net/g2008\b
    \bgraffiti\.net/goodinfo\b
    \bgraffiti\.net/info4you\b
    \bgraffiti\.net/webfaqs\b
    \bgraffiti\.net/(webfaqs|info4you|goodinfo|g2008|extranews|allinfo)\b
    
    \bgreenday\.cc\b
    \bgreenday\.net\b
    \bgreenday\.(net|cc)\b
    
    \bgroups\.yahoo\.com\/group\/theincrediblerachellillis\/
    \bgroups\.yahoo\.com/group/shanediesel\b
    \bgroups\.yahoo\.com/group/(shanediesel|theincrediblerachellillis)\b
    
    \bhanoihotel\.net\b
    \bhanoihotelsbooking\.com\b
    \bhanoihotelslist\.com\b
    \bhanoihotel(\.net|s(list|booking)\.com)\b
    
    \bhotdesigirls\.blog\.com\b
    \bhotdesigirls\.blogsome\.com\b
    \bhotdesigirls\.blog(some)?\.com\b
    
    \bhotelbargains\.biz\b
    \bhotelbargains\.info\b
    \bhotelbargains\.(info|biz)\b
    
    \bhotgigs\.com\b
    \bhotgigs\.typepad\.com\b
    \bhotgigs(\.typepad)?\.com\b
    
    \bhowtogetridofwarts\.doodlekit\.com\b
    \bhowtogetridofwarts\.weebly\.com\b
    \bhowtogetridofwarts\.(weebly|doodlekit)\.com\b
    
    \bhumansfuture\.com\b
    \bhumansfuture\.org\b
    \bhumansfuture\.(org|com)\b
    
    \bhyderabadforums\.com\b
    \bhyderabadtoday\.info\b
    \bhyderabadwow\.com\b
    \bhyderabad((forums|wow)\.com|today\.info)\b
    
    \bidrivesafely\.com\b
    \bidrivesafelytest\.com\b
    \bidrivesafely(test)?\.com\b
    
    \binfibeam\.blog\.com\b
    \binfibeam\.com\b
    \binfibeam(\.blog)?\.com\b
    
    \binfophil.com\b
    \binfophil.net\b
    \binfophil\.(net|com)\b
    
    \binformatics-inc\.com\b
    \binformatics-ltd\.com\b
    \binformatics-(ltd|inc)\.com\b
    
    \bislafisher\.com\b
    \bislafisherweb\.com\b
    \bislafisher(web)?\.com\b
    
    \bjustjared(jr)?\.buzznet\.com\b
    \bjustjared(jr)?\.com\b
    \bjustjared(jr)?(\.buzznet)?\.com\b
    
    \bjwsuretybonds\.com\b
    \bjwsuretybonds\.net\b
    \bjwsuretybonds\.(net|com)\b
    
    \bkamenriderjapanhero\.com\b
    \bkamenriderjapanhero\.webs\.com\b
    \bkamenriderjapanhero(\.webs)?\.com\b
    
    \bkenyalyric\.com\b
    \bkenyalyrics\.com\b
    \bkenyalyrics?\.com\b
    
    \bkhwajagharibnawaz\.com\b
    \bkhwajagharibnawaz\.net\b
    \bkhwajagharibnawaz\.(net|com)\b
    
    \bknol.google\.com\/k\/illidan-gibran\b
    \bknol.google\.com\/k\/john-combalicer\b
    \bknol\.google\.com\/k\/onuora-amobi\b
    \bknol\.google\.com\/k\/(onuora-amobi|john-combalicer|illidan-gibran)\b
    
    \blapbandguide\.com\b
    \blapbandtalk\.com\b
    \blapband(talk|guide)\.com\b
    
    \blighthousesintile\.com\b
    \blighthousesites\.com\b
    \blighthousesi(tes|ntile)\.com\b
    
    \blinexlegal\.(com|it)\b
    \blinexlegal\.co\.(nz|uk)\b
    \blinexlegal\.(com|it|co\.(nz|uk))\b
    
    \blionball\.cn\b
    \blionball\.net\b
    \blionball\.(net|cn)\b
    
    \blitchfieldlakes\.com\b
    \blitchfieldlistings\.com\b
    \blitchfield(listings|lakes)\.com\b
    
    \blord-of-the-rings-gold\.com\b
    \blord-of-the-rings-online-gold\.com\b
    \blord-of-the-rings-online-power-leveling\.com\b
    \blord-of-the-rings-powerleveling\.com\b
    \blord-of-the-rings-(powerleveling|online-power-leveling|online-gold|gold)\.com\b
    
    \blovebugcentral\.com\b
    \blovebugfans\.com\b
    \blovebug(fans|central)\.com\b
    
    \bmahabaleshwar\.com\b
    \bmahabaleshwarhotels\.com\b
    \bmahabaleshwaronline\.com\b
    \bmahabaleshwar(online|hotels)?\.com\b
    
    \bmaterialprofits\.com\b
    \bmaterialprofitswildcatter\.com\b
    \bmaterialprofits(wildcatter)?\.com\b
    
    \bmaxpages\.com/ackuron\b
    \bmaxpages\.com/planetgnome\b
    \bmaxpages\.com/reenactor\b
    \bmaxpages\.com/tache\b
    \bmaxpages\.com/(tache|reenactor|planetgnome|ackuron)\b
    
    \bmicrostockforum\.com\b
    \bmicrostockgroup\.com\b
    \bmicrostock(group|forum)\.com\b
    
    \bmilakunisfan\.com\b
    \bmilakuniswow\.com\b
    \bmilakunis(fan|wow)\.com\b
    
    \bmobilefleetservices\.com\b
    \bmobilefleetservices\.net\b
    \bmobilefleetservices\.(net|com)\b
    
    \bmoola\.com:80/moopubs/b2b/exc/join\.jsp\b
    \bmoola\.com/moopubs/b2b/exc/join\.jsp\b
    \bmoola\.com(:80)?/moopubs/b2b/exc/join\.jsp\b
    
    \bmustangboards\.com\b
    \bmustangforums\.com\b
    \bmustang(forum|board)s\.com\b
    
    \bmyspace.com/djshellmax\b
    \bmyspace\.com\/fleshjack\b
    \bmyspace\.com\/grimestwins\b
    \bmyspace\.com\/prophecytrackz\b
    \bmyspace\.com\/vpipes\b
    \bmyspace\.com/hockeyquarterly\b
    \bmyspace\.com/megadry\b
    \bmyspace\.com/officialmatthewparker\b
    \bmyspace\.com/official_toronto_raptors\b
    \bmyspace\.com/paintballersauction\b
    \bmyspace\.com/thecocaineknights\b
    \bmyspace\.com/(djshellmax|fleshjack|grimestwins|hockeyquarterly|megadry|officialmatthewparker|official_toronto_raptors|paintballersauction|prophecytrackz|thecocaineknights|vpipes)\b
    
    \bnissanforum\.com\b
    \bnissanleafforum\.com\b
    \bnissan(leaf)?forum\.com\b
    
    \bohio-put-in-bay\.biz\b
    \bohio-put-in-bay\.com\b
    \bohio-put-in-bay\.info\b
    \bohio-put-in-bay\.org\b
    \bohio-put-in-bay\.(biz|com|info|org)\b
    
    \bonlinetrafficschoolguide\.com\b
    \bonlinetrafficschoolus\.com\b
    \bonlinetrafficschool(us|guide)\.com\b
    
    \boptionstradingpedia\.com\b
    \boptiontradingpedia\.com\b
    \boptions?tradingpedia\.com\b
    
    \bpackaging-labeling\.com\b
    \bpackaging-labelling\.com\b
    \bpackagingandlabeling\.com\b
    \bpackaging(-|and)labell?ing\.com\b
    
    \bpaudarco\.net\b
    \bpaudarco\.org\b
    \bpaudarco\.(org|net)\b
    
    \bpirates-of-the-burning-sea-gold\.com\b
    \bpirates-of-the-burning-sea-power-leveling\.com\b
    \bpirates-of-the-burning-sea-powerleveling\.com\b
    \bpirates-of-the-burning-sea-(gold|power-?leveling)\.com\b
    
    \bplayadelcarmenvacationpackage\.com\b
    \bplayadelcarmenvacationpackages\.com\b
    \bplayadelcarmenvacationpackages?\.com\b
    
    \bpotbs-gold\.com\b
    \bpotbs-power-leveling\.com\b
    \bpotbs-powerleveling\.com\b
    \bpotbs-(gold|power-?leveling)\.com\b
    
    \bpremier-business-centers\.(com|net|org)\b
    \bpremierbusiness-centers\.(com|net|org)\b
    \bpremier-?business-centers\.(com|net|org)\b
    
    \bpriory-of-sion\.com\b
    \bpriory-of-sion\.tripod\.com\b
    \bpriory-of-sion(\.tripod)?\.com\b
    
    \bpulsemusic\.proboards48\.com\b
    \bpulsemusic\.proboards\.com\b
    \bpulsemusic\.proboards(48)?\.com\b
    
    \bput-in-bay-hotel\.net\b
    \bput-in-bay-lodging\.com\b
    \bput-in-baygolfcarts\.com\b
    \bput-in-bay(-hotel\.net|(-lodging|golfcarts)\.com)\b
    
    \bputinbayattractions\.com\b
    \bputinbayfallball\.com\b
    \bputinbayhotel\.com\b
    \bputinbayhouse\.com\b
    \bputinbayonline\.com\b
    \bputinbayrentals\.com\b
    \bputinbayreservations\.com\b
    \bputinbayspringfling\.com\b
    \bputinbay(attractions|fallball|hotel|house|online|rentals|reservations|springfling)\.com\b
    
    \bqrcodegen\.com\b
    \bqrcodegen\.net\b
    \bqrcodegen\.(net|com)\b
    
    \brailay\.co\.uk\b
    \brailay\.us\b
    \brailay\.(us|co\.uk)\b
    
    \branchocapistranocountryestates\.com\b
    \branchocapistranoestates\.com\b
    \branchocapistrano(country)?estates\.com\b
    
    \brdujour\.blogspot\.com\b
    \brdujour\.com\b
    \brdujour(\.blogspot)?\.com\b
    
    \breenactor\.forumer\.com\b
    \breenactor\.zoomshare\.com\b
    \breenactor\.(forumer|zoomshare)\.com\b
    
    \brussian-services\.co\.uk\b
    \brussian-services\.com\b
    \brussian-services\.co(m|\.uk)\b
    
    \bsankarv1\.blogspot\.com\b
    \bsankarv2\.blogspot\.com\b
    \bsankarv3\.blogspot\.com\b
    \bsankarv\d\.blogspot\.com\b
    
    \bscribd\.com/doc/(48140751|11074145|30029387|10935894|14423948|14358694)\b
    \bscribd\.com/doc/21733512/Principles-101\b
    \bscribd\.com/doc/(10935894|11074145|14358694|14423948|21733512|30029387|48140751)\b
    
    \bseelincolnshire\.co\.uk\b
    \bseelincolnshire\.com\b
    \bseelincolnshire\.co(m|\.uk)\b
    
    \bsifl-and-olly\.com\b
    \bsifl-and-olly\.webs\.com\b
    \bsifl-and-olly(\.webs)?\.com\b
    
    \bsites\.google\.com/site/artbatiks\/home\b
    \bsites\.google\.com/site/dnapolice\/b
    \bsites\.google\.com/site/nswcnn\/b
    \bsites\.google\.com/site/(datemix|meodatz)
    \bsites\.google\.com/site/baglamukhimaa\b
    \bsites\.google\.com/site/healthofliver\b
    \bsites\.google\.com/site/skytaxi21\b
    \bsites\.google\.com/site/spywareanti\b
    \bsites\.google\.com/site/(artbatiks|baglamukhimaa|datemix|dnapolice|healthofliver|meodatz|nswcnn|skytaxi21|spywareanti)\b
    
    \bskivacationpackage\.com\b
    \bskivacationpackages\.com\b
    \bskivacationpackages?\.com\b
    
    \bspicyhits\.com\b
    \bspicyhitz\.com\b
    \bspicyhit(z|s)\.com\b
    
    \bssfree\.net\.tc\b
    \bssfree\.wikia\.com\b
    \bssfree\.(wikia\.com|net\.tc)\b
    
    \bstimul-cash\.com\b
    \bstimul-media\.com\b
    \bstimul-(media|cash)\.com\b
    
    \bsurfacehippy\.info\b
    \bsurfacehippyinfo\.com\b
    \bsurfacehippy(\.info|info\.com)\b
    
    \bsystemid\.com\b
    \bsystemsid\.com\b
    \bsystems?id\.com\b
    
    \btahoeski\.com\b
    \btahoeskipackages\.com\b
    \btahoeskiresort\.com\b
    \btahoeskiresorts\.com\b
    \btahoeski(packages|resorts?)?\.com\b
    
    \btaipandaily\.com\b
    \btaipandaily\.net\b
    \btaipandaily\.(net|com)\b
    
    \btestinside\.co\.uk\b
    \btestinside\.com\b
    \btestinside\.co(m|\.uk)\b
    
    \btexas-defensivedriving-courses\.com\b
    \btexas-defensivedriving-online\.com\b
    \btexas-defensivedriving-(online|courses)\.com\b
    
    \bticketairline\.com\b
    \bticketairlines\.com\b
    \bticketairlines?\.com\b
    
    \btitle24bid\.com\b
    \btitle24requirements\.com\b
    \btitle24service\.com\b
    \btitle24(bid|requirements|service)\.com\b
    
    \btjoos\.co\.uk\b
    \btjoos\.com\b
    \btjoos\.co(m|\.uk)\b
    
    \btollywoodnews\.info\b
    \btollywoodtalk\.info\b
    \btollywood(talk|news)\.info\b
    
    \btvdata\.ru\b
    \btvdata\.tv\b
    \btvdata\.(tv|ru)\b
    
    \bucoin\.info\b
    \bucoin\.net\b
    \bucoin\.(net|info)\b
    
    \bultimatecoupons\.com\b
    \bultimatecoupons\.net\b
    \bultimatecoupons\.(net|com)\b
    
    \bvampire-diaries\.superforum\.fr
    \bvampire-diaries\.xooit\.fr\b
    \bvampire-diaries\.(xooit|superforum)\.fr\b
    
    \bvanguard-power-leveling\.com\b
    \bvanguard-powerleveling\.com\b
    \bvanguard-saga-of-heroes-gold\.com\b
    \bvanguard-saga-of-heroes-power-leveling\.com\b
    \bvanguard-soh-gold\.com\b
    \bvanguard-(power-?leveling|saga-of-heroes-(gold|power-leveling)|soh-gold)\.com\b
    
    \bvasanthandco\.com\b
    \bvasanthandco\.net\b
    \bvasanthandco\.(com|net)\b
    
    \bvietnam-travelservice\.com\b
    \bvietnam-travelservices\.com\b
    \bvietnam-travelservices?\.com\b
    
    \bvoobly\.com\b
    \bvoobly\.net\b
    \bvoobly\.(com|net)\b
    
    \bwartremoval\.doodlekit\.com\b
    \bwartremoval\.weebly\.com\b
    \bwartremoval\.(weebly|doodlekit)\.com\b
    
    \bwealth-by-green\.(com|net|org)\b
    \bwealthbygreen\.(com|net|org)\b
    \bwealth-?by-?green\.(com|net|org)\b
    
    \bwebspawner\.com/users/arafahospitals\b
    \bwebspawner\.com/users/griersonorigins\b
    \bwebspawner\.com/users/(arafahospitals|griersonorigins)\b
    
    \bwikijob\.co\.uk\b 
    \bwikijobs\.co\.uk\b 
    \bwikijobs?\.co\.uk\b 
    
    \bwillsoncomputers\.myallblogs\.com\b
    \bwillsoncomputers\.weebly\.com\b
    \bwillsoncomputers\.(weebly|myallblogs)\.com\b
    
    \bwindows-vista-update\.com\b
    \bwindows-vista-videos\.com\b
    \bwindows-vista-(update|videos)\.com\b
    
    \bwwwvacationstogo\.com\b
    \bwwwvacationtogo\.com\b
    \bwwwvacations?togo\.com\b
    
    \byadavaalliance\.com\b
    \byadavaalliance\.org\b
    \byadavaalliance\.(com|org)\b
    
    \bzeemap\.com\b
    \bzeemaps\.com\b
    \bzeemaps?\.com\b
    
    \bzipityzap\.com\b
    \bzippygames\.com\b
    \bzippy(games|zap)\.com\b
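A merged entry can be checked against the entries it replaces before it goes onto the list. The Python sketch below is one way to do that, assuming the patterns behave the same under Python's re module as they do in the blacklist; the function name disagreements and the sample URLs are invented for this example.

```python
import re


def disagreements(originals, merged, samples):
    """Return the samples on which the merged regex and the originals differ.

    A sample counts as blocked by the originals if any one of them matches.
    An empty result means the merge is safe for every sample that was tried.
    """
    old = [re.compile(pattern) for pattern in originals]
    new = re.compile(merged)
    return [
        sample
        for sample in samples
        if any(p.search(sample) for p in old) != bool(new.search(sample))
    ]


# The final group in the list above, plus one name only the merge would match.
print(disagreements(
    [r"\bzipityzap\.com\b", r"\bzippygames\.com\b"],
    r"\bzippy(games|zap)\.com\b",
    ["zipityzap.com", "zippygames.com", "zippyzap.com"],
))
```

For that group the check reports two problems: zipityzap.com is no longer blocked, and zippyzap.com, which was never listed, would be. The result is only as good as the samples, so it helps to include a few near-miss names as well as the listed domains.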
    
    
    Wow. I wrote a script for detecting some redundant entries some months ago. But that script actually is not very intelligent and would not suggest such merging like you do. Did you write a script for generating your list or is it purely hand-made? At least the last merging proposal (\bzipityzap\.com\b + \bzippygames\.com\b -> \bzippy(games|zap)\.com\b) seems to be hand-made (buggy). ;-)
    If you wrote a script, we could try to merge yours and mine in case you are interested.
    @Rich Farmbrough: in the sbl merged regexps like /foo(?:bar|baz|quux)/ are faster than 3 separate regexps. However, /(?:foo|bar|baz)quux/ is probably not faster than the separate version.
    So I can't see disadvantages in Δ's suggestions. I could write a small script to transfer the suggestions to the sbl. Before that we have to correct replacements like the last one. -- seth (talk) 20:38, 7 March 2011 (UTC)[reply]
    It's done by hand. I have a program that downloads the list and sorts it, then I take a look. ΔT The only constant 03:10, 13 March 2011 (UTC)
    o.O, I guess, that took several hours. I'll write a small script to parse your output and delete/group the entries. -- seth (talk) 23:05, 13 March 2011 (UTC)[reply]
    Concerning the dead links: I don't see a good reason to let them be blocked. In my opinion we should remove all dead links that were added to the sbl more than 12 months ago. -- seth (talk) 20:38, 7 March 2011 (UTC)[reply]
    I'm running the list several times to confirm that the links are dead before I post that list. ΔT The only constant 03:10, 13 March 2011 (UTC)
    great! -- seth (talk) 23:05, 13 March 2011 (UTC)[reply]
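The dead-link check described at the start of this thread treats an entry of the form \bname\.com\b as the plain domain name.com. A minimal Python sketch of that conversion step follows. It is an assumption about how such a step might look, not the program actually used, and it does no network access; entry_to_domain is a hypothetical name.

```python
import re

# Matches only simple entries: \b, dot-escaped labels, a TLD, then \b.
_SIMPLE_ENTRY = re.compile(r"^\\b((?:[a-z0-9-]+\\\.)+[a-z]{2,})\\b$")


def entry_to_domain(entry):
    """Return the plain domain for a simple blacklist entry, else None.

    Entries with alternation, paths, or optional parts describe several
    hosts, so they are skipped rather than guessed at.
    """
    match = _SIMPLE_ENTRY.match(entry.strip())
    if match is None:
        return None
    return match.group(1).replace("\\.", ".")


print(entry_to_domain(r"\b3fatchicks\.com\b"))
print(entry_to_domain(r"\b3fatchicks\.(com|net)\b"))
```

The first call yields 3fatchicks.com and the second yields None. Whether a domain is really dead would still need repeated lookups over time, as mentioned above.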