MediaWiki talk:Spam-blacklist

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 201.20.235.47 (talk) at 13:55, 28 March 2008 (→‎Proposed removals). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

    Mediawiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow-up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though it supports only whole domains and does not support whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions — the disruption that can be caused is substantial.
    5. Close the request entry here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 201574655 after you have closed the request. See here for more info on logging.
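As a rough illustration of step 4, the sketch below checks that a proposed entry is at least a valid regular expression and shows how an overly broad entry blocks unrelated links. It is only a sketch: the way entries are joined into one pattern is an assumption about how the extension behaves, and `validate_entry`, `build_blacklist` and the example domains are hypothetical names made up for this illustration.

```python
import re


def validate_entry(entry):
    """Return True if the proposed entry compiles as a regular expression."""
    try:
        re.compile(entry)
        return True
    except re.error:
        return False


def build_blacklist(entries):
    """Join entries into one pattern (assumed behaviour, not the extension's actual code)."""
    body = "|".join(f"(?:{e})" for e in entries)
    # Assumption: entries are matched against the host and path that follow the scheme.
    return re.compile(r"https?://[a-z0-9_\-.]*(?:" + body + ")", re.IGNORECASE)


# A well-formed entry: dots escaped, word boundaries on both sides.
careful = build_blacklist([r"\bspam-example\.com\b"])

# A careless entry: a bare word that also occurs inside unrelated host names.
careless = build_blacklist(["blog"])

print(bool(careful.search("http://www.spam-example.com/page")))  # blocked
print(bool(careful.search("http://www.example.org/")))           # not blocked
print(bool(careless.search("http://myblog.example.org/")))       # blocked by accident
```

The last line is the kind of disruption step 4 warns about: a short unanchored entry matches far more than the site it was meant for.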

    Proposed additions

    thedanube.info

    A typical advertisement website, used as "The Official Danube Site" in Danube. 213.47.181.20 (talk) 13:49, 29 February 2008 (UTC)[reply]

     Stale Stifle (talk) 13:01, 26 March 2008 (UTC)[reply]

    learnerstv.com

    Continuing and unceasing spamming of two articles, promoting some pay-to-learn online site when the material is freely available. Forever changing IP addy, so blocks not effective. Bots don't seem to have a good strike rate, and the spamming has now been ongoing for over 2 months at [1] and [2]. Plenty of examples for diffs from those histories, for example:

    SFC9394 (talk) 16:55, 1 March 2008 (UTC)[reply]

    Since XLinkBot started watching for this site, it caught the first time it was added but not the second. Not sure if second was missed, intentionally ignored, or I screwed up the regex for the site when I added it. But regardless, I've never seen an acceptable use of this site, but it is indeed often spammed to various articles (the Lewin ones are the most common but not the only targets) by throw-away anon IP accounts. WP:WPSPAM reports:
    DMacks (talk) 18:41, 1 March 2008 (UTC)[reply]
     Defer to XLinkBot There are only two articles, there has been little disruption this month, and XLinkBot has reverted the last couple of issues. Recommend semi-protection in the case of it getting seriously out of hand. Stifle (talk) 12:58, 26 March 2008 (UTC)[reply]

    nationmultimedia.com/qvote

    There is an aggressive spammer, or perhaps more correctly someone abusing Wikipedia for his personal vendetta against Thailand and/or the Thai authorities, adding his soapbox texts into several articles, most commonly Thailand. One common thing about these entries is that he adds links to the forum of the newspaper The Nation, which have the common form of nationmultimedia.com/qvote/... - though this link is not always included. Maybe it can help to stop this person at least a bit when he cannot add that link anymore. But make sure it's only the qvote subpage which gets blocked; other URLs from the Nation have to work, as it is a common reference link. The IPs listed below are just the most recent ones; this has been going on for months already. Blocking the IPs for longer times is not possible, as these belong to a Thai ISP and thus might block out other users.

    andy (talk) 11:45, 7 March 2008 (UTC)[reply]

     Additional information needed Please can you provide some diffs or the titles of the articles affected? Stifle (talk) 12:59, 26 March 2008 (UTC)[reply]
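The request above to block only the qvote subpage while leaving the rest of the newspaper's site linkable could be expressed with a path-specific pattern. The following is a minimal sketch, not the entry actually used: the pattern `QVOTE_RULE` and the example URLs (including their paths) are assumptions made up for illustration.

```python
import re

# Hypothetical entry: matches the domain only when followed by the /qvote path.
QVOTE_RULE = re.compile(r"\bnationmultimedia\.com/qvote\b", re.IGNORECASE)


def is_blocked(url):
    """Return True if the URL points at the qvote forum section."""
    return QVOTE_RULE.search(url) is not None


# Forum link (made-up path): blocked.
print(is_blocked("http://www.nationmultimedia.com/qvote/list.php"))
# Ordinary news article (made-up path): still allowed as a reference.
print(is_blocked("http://www.nationmultimedia.com/2008/03/07/headlines.html"))
```

Because the path is part of the pattern, links elsewhere on the domain are unaffected.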

    systemid.com

    Domain
    Spam accounts
    Spam diffs

    [8] [9] [10] [11] [12]

    The shared IP (67.107.70.147) seems to be a continued source of SPAM. Warnings were posted to 67.107.70.147's talk page in July 2007 and September 2007. --Berkland (talk) 14:04, 25 March 2008 (UTC)[reply]

    spam.greenday.net

    heathledger.com is possibly unrelated but it was definitely spammed by multiple accounts. See WT:WPSPAM#Adsense pub-9204388014767859 again. MER-C 06:55, 26 March 2008 (UTC)[reply]

    Proposed removals

    http://home.teleport.com/~grladams/ 201.20.235.47 (talk) 13:55, 28 March 2008 (UTC)[reply]

    Troubleshooting and problems

    Discussion

    archive script

    Eagle 101 said he had one running on meta, is it possible to get it up and going here?--Hu12 10:27, 15 November 2007 (UTC)[reply]

    Would be good - Eagle hasn't been working on Meta for a while though & I've not seen anything (there was supposed to be a logging script too!) --Herby talk thyme 12:10, 15 November 2007 (UTC)[reply]
    • Great news, I've written a script that can archive this page given the templates that we use. I can create an approved archive along with a rejected archive if people are interested. βcommand 06:51, 4 January 2008 (UTC)[reply]
    "Interested" - bit of an understatement there :) Great news - please feel free to help/supply the script. I tend to leave stuff around a week in case anyone shouts or adds more (archives once done should be left alone). How would you handle the "discussion" type bits? Cheers --Herby talk thyme 09:40, 4 January 2008 (UTC)[reply]
    First question, do you want approved and rejected request in separate archives? as for the discussions we could get Misza bot over here for things older than 30 days. βcommand 17:13, 4 January 2008 (UTC)[reply]
    I would think one archive, separate sections, like it is currently[13], not sure if the script can do that, but if so, doubt there would be objections in implementation...--Hu12 (talk) 00:24, 10 January 2008 (UTC)[reply]
    There is no simple way of editing sections using the bot (section editing is evil). It would just be one large archive. βcommand 00:59, 10 January 2008 (UTC)[reply]

    blogspot.com

    I added countingcrowsnew.blogspot.com, freemodlife.blogspot.com, and googlepackdownload.blogspot.com to the blacklist. I made a previous report about the blogspot sites and they're being spammed by the same blocked sockpuppet who I filed a report about here. Spellcast (talk) 22:03, 28 November 2007 (UTC)[reply]

    Update: I've also added b5050-raffle.blogspot.com, gpd2008.blogspot.com, and itsleaked.blogspot.com. They were being spammed by the same blocked sock in that report. Spellcast (talk) 05:18, 29 November 2007 (UTC)[reply]

    I'm inclined to blacklist the domain then whitelist where needed but some heavy flak is likely to arrive? --Herby talk thyme 08:06, 29 November 2007 (UTC)[reply]
    From an en:Wikipedia mission perspective (though possibly not your personal perspective:) a bigger issue than the flak that will be generated is the disruption to editing. I believe a lot of pages, particularly biographies of living people, contain legitimate links to the subject's blog - many of which are hosted on blogspot. Simply blacklisting and then waiting for whitelisting requests will likely
    1. overwhelm the whitelist page here and on meta (which given you are one of the most active admins on both, may not be ideal for you!)
    2. be confusing and frustrating to a lot of editors especially newbies, but also any who are not familiar with the blacklist/whitelist set up
    3. lead to a loss of legitimate links and legitimate edits as people struggle to work out whether to keep their edit and lose the link or the other way round while any whitelist request is ongoing.
    I think a move like that will take some careful planning and preparation to avoid these issues (might also help cut down some of the heat). One way or another, I think we need human editors to assess the current blogspot links on article pages and enter appropriate ones on the whitelist before the blacklisting goes into effect. I don't think such a move will cut out most of the flak though, so we might want to ensure there are other admins involved to help spread the weight, and a nicely presented page of evidence of the issues the domain causes to point people to.
    Blogspot certainly gets spammed a lot more than most domains, and I support blacklisting. But it's still a domain that has a lot of good links and I think it's important to think through how a move like that will impact people, and to adjust to the situation. -- SiobhanHansa 13:54, 29 November 2007 (UTC)[reply]
    Briefly - needs quite a bit of thought but equally is worth that amount of thought --Herby talk thyme 13:55, 29 November 2007 (UTC)[reply]
    There are many, many legitimate links to the domain, not only to blogs belonging to article subjects but to blogs belonging to Wikipedia contributors. Better to blacklist individual blogs as needed. --bainer (talk) 16:23, 29 November 2007 (UTC)[reply]
    Not sure why Wikipedia contributors would be adding their own blogs? A very limited number of blogs actually meet WP:RS, and even fewer meet the requirements of WP:EL or are a blog that is the subject of the article or an official page of the article's subject. There are currently 32,916 blogspot.com blog links on Wikipedia; if whitelisting even a thousand "legitimate links", it's worth it.--Hu12 (talk) 17:03, 29 November 2007 (UTC)[reply]
    You've presented some convincing reasons to leave certain blog links out of Wikipedia, but not a reason to leave all blog links out. Wikipedia contributors might want to link to their blogs because, you know, it is possible for said contributors to frequent websites on the internet other than Wikipedia :P See WP:COMMUNITY. There is also a performance cost to whitelisting and blacklisting; as far as I can tell, 1000 whitelisted entries costs more computationally than 1000 blacklisted entries (instead of using one large regex, which is how the blacklist works, you're doing 1000 individual regex replacements). GracenotesT § 18:04, 29 November 2007 (UTC)[reply]
    I was under the impression server load was something we were supposed to leave up to the developers to worry about. If they see an issue and ask for a reassessment that would be one thing, but it's not a good argument against a tactic without their weight behind it.
    The suggestion isn't that all blogs should be banned. The suggestion is that this particular domain gets spammed so much it would be beneficial to the project to blacklist it and only whitelist the ones that are appropriate. -- SiobhanHansa 18:13, 29 November 2007 (UTC)[reply]
    Hu12 I think it's important not to overstate the case here. Not all of the ~32,000 links (assuming the 1K of good links estimate) that are not legitimate external links or citations will actually be harmful to Wikipedia. While editors' own blogs on their user pages aren't necessary to the project, in the vast majority of cases they do no harm and may help editors feel a bond that connects them to the project. Many more will be links from discussions and projects. While I don't think that's a reason for keeping a domain that is also being spammed so much - it's not the case that we do 32,000 links worth of "good" by removing them. For the most part we only really benefit from the spam and poorly placed article links that go. -- SiobhanHansa 18:08, 29 November 2007 (UTC)[reply]
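The performance point raised above (one large blacklist regex versus many individual whitelist replacements) can be pictured with a small sketch. It follows the description given in that comment only; it is an assumption about the mechanism, not the extension's actual code, and the function names and example blogs are hypothetical.

```python
import re


def matches_combined(text, entries):
    """Blacklist check as described above: one large alternation, one search."""
    if not entries:
        return False
    combined = re.compile("|".join(f"(?:{e})" for e in entries), re.IGNORECASE)
    return combined.search(text) is not None


def strip_whitelisted(text, whitelist):
    """Whitelist handling as described above: one regex replacement per entry."""
    for entry in whitelist:  # with 1000 entries this loop does 1000 separate passes
        text = re.sub(entry, "", text, flags=re.IGNORECASE)
    return text


def is_blocked(url, blacklist, whitelist):
    """Remove whitelisted links first, then test what remains against the blacklist."""
    return matches_combined(strip_whitelisted(url, whitelist), blacklist)


blacklist = [r"\bblogspot\.com"]
whitelist = [r"https?://goodblog\.blogspot\.com\S*"]  # hypothetical approved blog

print(is_blocked("http://goodblog.blogspot.com/", blacklist, whitelist))  # allowed
print(is_blocked("http://spamblog.blogspot.com/", blacklist, whitelist))  # blocked
```

Whether the extra passes matter in practice is the question left to the developers in the replies above; the sketch only shows where the per-entry cost would come from.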

    (unindent, crosspost my post from WT:WPSPAM)

    The rule \bblogspot\.com is (currently) not on COIBot's monitorlist. Some of the sub-domains have been added via WT:WPSPAM, or have been caught by the automonitoring of COIBot (mainly because the name of the editor is the same as the name of the subdomain on blogspot.com).

    Still, a linksearch on the resolved IP of blogspot.com (72.14.207.191) results in a mere 118 results (all COIBot linkreports)! Often the multiple use of the single subdomains is not a cause for blacklisting, as they may only have been used once or twice. Also, I suspect there are tens of thousands of blogspot sub-domains out there, but these are only the links that are caught because the wiki username overlaps with the domainname of the subdomain (or have been reported here). Would this cumulative behaviour warrant blacklisting of \bblogspot\.com .. here, or even on meta? --Dirk Beetstra T C 12:37, 30 November 2007 (UTC)[reply]
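To make concrete what the rule \bblogspot\.com would cover, here is a small sketch using Python's `re` module as a stand-in for the regex engine the blacklist and COIBot actually use. The host names are invented for illustration, and the stricter variant with a trailing \b is a suggestion of this sketch, not something proposed in the discussion.

```python
import re

RULE = re.compile(r"\bblogspot\.com", re.IGNORECASE)
STRICT_RULE = re.compile(r"\bblogspot\.com\b", re.IGNORECASE)  # hypothetical variant


def hit(rule, url):
    """Return True if the rule matches anywhere in the URL."""
    return rule.search(url) is not None


# Every subdomain is covered, which is why one rule would affect all blogs.
print(hit(RULE, "http://foo.blogspot.com/"))
print(hit(RULE, "http://www.blogspot.com"))

# The leading \b prevents matches inside a longer word.
print(hit(RULE, "http://myblogspot.com/"))

# Without a trailing \b the rule also matches longer names that start the same way.
print(hit(RULE, "http://blogspot.community.example/"))
print(hit(STRICT_RULE, "http://blogspot.community.example/"))
```

Blacklisting individual blogs, as suggested earlier in the thread, would instead mean one entry per subdomain.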

    Appropriate links may indeed be a problem, though the majority will fail some or many of the policies and guidelines here (or don't even have to be a notable fact, or do not need to be a working link while being mentioned; "Mr. X has a blog on Blogspot.<ref>primary reliable source stating that the blog is the official blog</ref>"; we are not a linkfarm), and I would argue that the spam/coi part of the problem becomes a bit difficult to control... --Dirk Beetstra T C 14:23, 30 November 2007 (UTC)[reply]
    Crosspost spamlink template for blogspot.com to link this discussion to the linkreports from COIBot. --Dirk Beetstra T C 10:31, 3 December 2007 (UTC)[reply]
    Please try to remember how frustrating generic, unexpected spam blocks can be for new and incautious editors. Last time I "checked", if you make an edit with Internet Explorer and you post it directly without preview (two things you should never do), then if the spam blacklist comes up your text is gone. Back arrow gets you the original text of the article. Edits that die that way may not get remade, and they may sour the editor on further contributions. I don't think there should be any blocks on top-level domains or large general purpose Internet sites. 70.15.116.59 23:46, 3 December 2007 (UTC)[reply]
    I have to disagree in this case - there's concern that the dynamic IP spamming it is using it to perpetrate scams or send out computer bugs. -Jéské (Blah v^_^v) 04:55, 4 December 2007 (UTC)[reply]
    There's no way we can realistically do this. blogspot has an Alexa traffic rank of 12 - it's higher than Amazon.com - and has well over 30,000 links on en.wp alone. Adding this would be incredibly disruptive to thousands of articles. Unless someone wants to go through all 32,000 links to find the ones that can be kept so we can whitelist them, there's no way we can do this. The ones that are spam should be removed and blacklisted, but WP:EL and WP:RS are not very good reasons to completely forbid links to a domain. Mr.Z-man 16:47, 8 December 2007 (UTC)[reply]
    Though I agree that Wikipedia has a big blogspam problem, I also have to concede that there are too many legit blogspot links (e.g., bio subjects own blog) as SiobhanHansa noted. OhNoitsJamie Talk 17:15, 3 February 2008 (UTC)[reply]

    (unindent)blogspot.com is currently on User:XLinkBot's revert list. XLinkBot is designed to revert only non-autoconfirmed users, and will only do so a limited number of times. Assuming we emerge from our trial period, I think this would be an effective way of stemming the influx of inappropriate blogspot links. Established editors would still be able to add blogspot.com links and only new or changed links would be reverted - so it wouldn't interfere with non-autoconfirmed users editing pages that already contained a link. --Versageek 18:33, 3 February 2008 (UTC)[reply]
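The behaviour described above (revert only non-autoconfirmed users, only a limited number of times, and only for new or changed links) can be summarised in a short decision function. This is a sketch of the description in that comment, not XLinkBot's actual code; `should_revert`, its parameters, and the `max_reverts` limit are hypothetical.

```python
from urllib.parse import urlparse


def newly_added_links(old_links, new_links):
    """Links present after the edit that were not on the page before it."""
    return set(new_links) - set(old_links)


def should_revert(is_autoconfirmed, old_links, new_links,
                  watched_domain, reverts_so_far, max_reverts):
    """Decide whether an edit would be reverted under the rules described above."""
    if is_autoconfirmed:
        return False  # established editors may add the link
    if reverts_so_far >= max_reverts:
        return False  # the bot only reverts a limited number of times
    for url in newly_added_links(old_links, new_links):
        host = urlparse(url).hostname or ""
        if host == watched_domain or host.endswith("." + watched_domain):
            return True  # a new link to the watched domain was added
    return False  # links that were already on the page do not count


existing = ["http://foo.blogspot.com/post"]

# New user edits a page that already has the link, without adding another one.
print(should_revert(False, existing, existing, "blogspot.com", 0, 2))
# New user adds a fresh blogspot link.
print(should_revert(False, existing, existing + ["http://bar.blogspot.com/"],
                    "blogspot.com", 0, 2))
```

The first call shows why editing a page that already contains a link would not be disturbed.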

    I do not support blacklisting the http://www.blogspot.com. It is too generic to be blindly blacklisted --Zache (talk) 20:06, 21 February 2008 (UTC)[reply]

    Blacklist logging

    {{WPSPAM|0#section_name}} → (replace '0' with the correct "oldid", i.e. the permalink ID; example shown here).

    For example:

    {{WPSPAM|182728001#Blacklist_logging}}

    results in:

    See WikiProject Spam report

    This should aid in requests originating from Wikipedia_talk:WikiProject_Spam and for use with the entry log here. I've added a snippet in the header --Hu12 (talk)


    freerepublic.com


    This is as a result of a discussion here[14] about the usage of FreeRepublic.com as a reprinting service for a primary source. I was curious to see what other articles linked to FreeRepublic and found a small handful on en and on other languages. In looking into the specific links in article space what I'm finding is that FreeRepublic is often being used in lieu of linking to the actual source [15][16], where it exists in a web archive [17], or just to link to it in the external links section[18]. I'm sure the articles were linked as references in good faith, but given that FreeRepublic is an unreliable source, should it be blacklisted and then whitelisted onto articles related to the site, added to one of the spambots, or periodically cleaned up by hand? --Bobblehead (rants) 23:51, 24 March 2008 (UTC)[reply]

    tinyURL

    Per User:Viridae's suggestion here, is there any reason not to block the entire tinyurl service? Equazcion /C 11:53, 25 Mar 2008 (UTC)

    And others like it. Ie snipurl etc. They serve no purpose that I am aware of and potentially expose our readers to malicious links that would otherwise be visible or be picked up by the spam blacklist. ViridaeTalk 11:56, 25 March 2008 (UTC)[reply]
    For those who are unfamiliar with this site, the intent is to post a really short URL which tinyurl then redirects to a much longer URL. It's useful in some forum settings to copy-paste a tiny link to some website or video or whatever. In the context of this project, though, the only possible purpose for using such a site would be to obscure a link that would otherwise trigger the blacklist, either for spam purposes or something more malicious, as noted above. I think a general policy of blocking such sites would be justifiable both under WP:EL and WP:V (as we can't verify what the link is without clicking). In the interim, though, tinyurl should definitely be blacklisted. UltraExactZZ Claims ~ Evidence 12:01, 25 March 2008 (UTC)[reply]
    Both tinyurl and snipurl are blacklisted at m:Spam blacklist. Any others should probably be proposed on the talk page there. -- zzuuzz (talk) 12:13, 25 March 2008 (UTC)[reply]
    Ok on further inspection the problem site in this case was actually azqq.com, a tinyurl-type service. Please blacklist that. Thanks. Equazcion /C 12:52, 25 Mar 2008 (UTC)
    And now that I actually took the time to read your comment, zzuuzz, I'll go over to meta :) Thanks. Equazcion /C 12:54, 25 Mar 2008 (UTC)
     Done on Meta by the way! --Herby talk thyme 14:34, 25 March 2008 (UTC)[reply]

    Addition to the COIBot reports

    The lower list in the COIBot reports now has, after each link, four numbers in brackets (e.g. "www.example.com (0, 0, 0, 0)"):

    1. first number: how many links this user added (the same after each link)
    2. second number: how many times this link was added to Wikipedia (as far as the linkwatcher database goes back)
    3. third number: how many times this user added this link
    4. fourth number: to how many different Wikipedias this user added this link.

    If the third number or the fourth number is high with respect to the first or the second, then the user has at least a preference for using that link. Be careful with other statistics from these numbers (e.g. good users do add a lot of links). If there are more statistics that would be useful, please notify me, and I will have a look at whether I can get the info out of the database and report it. The bots are running on a new database; Eagle 101 is working on transferring the old data into this database so it becomes more reliable.
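As a rough aid for reading these numbers, the sketch below parses one report entry and computes the two ratios mentioned above. The entry format is taken from the example "www.example.com (0, 0, 0, 0)"; the class, the function names and the sample figures are hypothetical and not part of COIBot.

```python
import re
from dataclasses import dataclass

ENTRY = re.compile(r"^(\S+) \((\d+), (\d+), (\d+), (\d+)\)$")


@dataclass
class LinkStats:
    link: str
    user_total: int   # first number: links added by this user
    link_total: int   # second number: times this link was added by anyone
    user_link: int    # third number: times this user added this link
    wikis: int        # fourth number: different Wikipedias this user added it to

    def share_of_user_additions(self):
        """Fraction of the user's link additions that were this link."""
        return self.user_link / self.user_total if self.user_total else 0.0

    def share_of_link_additions(self):
        """Fraction of this link's additions that came from this user."""
        return self.user_link / self.link_total if self.link_total else 0.0


def parse_entry(text):
    """Parse an entry such as 'www.example.com (10, 4, 4, 3)'."""
    m = ENTRY.match(text.strip())
    if m is None:
        raise ValueError(f"unrecognised report entry: {text!r}")
    return LinkStats(m.group(1), *(int(g) for g in m.groups()[1:]))


stats = parse_entry("www.example.com (10, 4, 4, 3)")  # invented figures
print(stats.share_of_user_additions())  # 4 of the user's 10 additions
print(stats.share_of_link_additions())  # all 4 additions of this link
```

A high value for either ratio only indicates a preference for the link, as noted above; it does not by itself show spamming.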

    For those with access to IRC, this data is available there in real time. --Dirk Beetstra T C 10:41, 26 March 2008 (UTC)[reply]