MediaWiki talk:Spam-blacklist

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 81.71.110.7 (talk) at 14:11, 9 June 2012. The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

    MediaWiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though it supports only whole domains and does not support whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot, be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions — the disruption that can be caused is substantial.
    5. Close the request entry on here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 496749425 after you have closed the request. See here for more info on logging.


    Proposed additions

    rashal.com

    successwithmustaphafoukara.com

    MER-C 13:00, 3 June 2012 (UTC)
    Added --Hu12 (talk) 00:19, 5 June 2012 (UTC)

    constructivecriticisms.wordpress.com

    Limited in scope... I'm inclined to wait. Level 3 spam warning given for now. Thanks for the report. Not done --Hu12 (talk) 00:31, 5 June 2012 (UTC)

    winarticles.net

    Spammers

    See WikiProject Spam report MER-C 09:24, 6 June 2012 (UTC)

    Done --Hu12 (talk) 01:58, 7 June 2012 (UTC)


    calculate-linux.org

    This is the homepage of the Calculate Linux project, and the link is very useful to its article (namely in the External Links section). The article currently has a link to calculate-linux.com, but that domain no longer works.

    It seems the link was added to the list automatically. The log entry can be found at [4]. I also think automatic spam detection in itself goes a bit far. It's called a blacklist, so we shouldn't have to whitelist links that some bot detected as malicious based on some algorithm.


    Completed Proposed additions

    Proposed removals


    mixcloud.com

    The ban relates only to spamming by people connected with the site several years ago: http://en.wikipedia.org/w/index.php?title=Wikipedia_talk:WikiProject_Spam&oldid=317486670#MIXCLOUD_LTD_Spam. I think that the site may now be notable enough to warrant an article on Wikipedia, and some of the mix pages could be valuable as references/external links. memphisto 11:18, 16 May 2012 (UTC)

    I think this makes more sense as a whitelisting request if an article about that company passes WP:WEB or WP:GNG. I don't see how it would be useful for references in general. OhNoitsJamie Talk 18:05, 16 May 2012 (UTC)
    In terms of popularity, the Alexa rank for mixcloud.com is currently 4,574 - http://www.alexa.com/siteinfo/mixcloud.com. I also mentioned that it could be valuable as a Wikipedia reference/external link, as it would let you show that a notable DJ had played a particular song, or even link to a certain mix if it contributed to an article. memphisto 10:57, 31 May 2012 (UTC)

    altafsir.com

    I am trying to add a link to the book (Love in the Holy Quran) on this page: http://en.wikipedia.org/wiki/Prince_Ghazi_bin_Muhammad The site is a reference tool containing many reference books in both English and Arabic. Could the site please be added to the whitelist? The site is run on two domains, altafsir.com and altafsir.org. - Preceding unsigned comment added by Shart000 (talk • contribs) 11:06, 17 May 2012

    Additional information needed: What link, specifically, are you interested in using? --Hu12 (talk) 19:43, 17 May 2012 (UTC)

    There are several places on Wikipedia.org that used to link to the site. Here is a list of some of the types of links:

    1. The author of "Love in the Holy Quran" lists details of the book here: http:// main.altafsir.com/LoveInQuranIntroEn.asp in English and here: http:// main.altafsir.com/LoveInQuranIntro.asp in Arabic.
    2. The site is a reference tool in both Arabic and English. The site owners have spent several million US dollars transcribing manuscripts of old Arabic Quranic exegesis into digital form. They have several translations. Users were linking to specific pages on altafsir.com as references. Each of the works was authenticated by scholars (most of whom are professors at universities around the world).

    We would like users to still have the option of using the site as a reference. (There are currently over 100 works transcribed on the site; each work is about 20 volumes.)

    I have noticed that they have 4 domain mirrors of the site; this was a mistake by the site admin. I have asked them to remove the mirrors and to forward the extra domains instead. They are currently working on this. - Preceding unsigned comment added by Shart000 (talk • contribs) 05:44, 28 May 2012 (UTC)

    Thank you. (I tried to add the link that we want, but the spam filter is still in effect, so we were unable to add it.) - Preceding unsigned comment added by Shart000 (talk • contribs) 05:35, 28 May 2012 (UTC)

    You refer to "we" in your comment above. Who is this "we"? Please know that we do not remove sites from the blacklist at the request of the site owner or anyone associated with the site. It seems that you were trying to add a link to altafsir.com in spite of our guideline Wikipedia:Conflict of interest. We can't permit that. If a trusted, high-volume editor feels that the material of your site is worthy of referencing, we would consider a request from such an editor. ~Amatulić (talk) 18:54, 30 May 2012 (UTC)

    Thank you for your response. Why was the site added to the blacklist? Who can we contact about removing the site from the blacklist? We are a think tank based in Amman, Jordan. We are not directly associated with altafsir.com, but we do help them resolve small issues like this occasionally. - Preceding unsigned comment added by Shart000 (talk • contribs) 05:42, 4 June 2012 (UTC)

    Declined. As I stated above, we don't consider requests for removal from parties with a conflict of interest. If a trusted, high-volume editor determines that links on altafsir.com are useful as references on Wikipedia instead of non-blacklisted alternatives, we would consider requests from such an editor.
    The reasons for blacklisting are given in the links at the top of this section, as well as here. Apparently altafsir.com is also blacklisted globally due to rather massive spamming; see m:User:COIBot/XWiki/altafsir.com for the evidence that led to it. Even if altafsir.com were removed from the local blacklist here, it would still be blacklisted globally with little chance of removal. You can Defer to Global blacklist to pursue it further, but it is likely you will get a similar response. ~Amatulić (talk) 18:24, 6 June 2012 (UTC)

    nepa.com.np

    This site was blocked in March 2008 and no reason has been given. I would like to request that it be delisted. Karrattul (talk) 18:38, 30 May 2012 (UTC)

    Defer to Whitelist to unblock specific pages on that site. I see no reason to de-list it entirely. The reason for blocking is here. ~Amatulić (talk) 18:47, 30 May 2012 (UTC)
    Thanks. The site is intended to be used as a reference for this article: http://en.wikipedia.org/wiki/Wikipedia_talk:Articles_for_creation/Sitala_Maju_(song) Karrattul (talk) 17:54, 2 June 2012 (UTC)
    Again, Defer to Whitelist once the article is accepted into main article space. ~Amatulić (talk) 21:27, 8 June 2012 (UTC)

    mokimobility.com

    MokiMobility is a new mobile device management provider. Prior to a page being created, several links were added to relevant articles linking directly to the home page, both by external users and company users. This action was viewed as spamming and the URL got blacklisted. A new article has been created for the company and added to relevant articles under vendor sections, but because of the blacklist a link to the homepage cannot be included on the page summarizing the company. A link to the homepage on a company listing is a valid and useful feature of the listing. --Bradem1976 (talk) 17:50, 1 June 2012 (UTC)

    Declined pending the outcome of the current proposal to speedy-delete the article.
    Provided the article is kept, Defer to Whitelist. There is no need to remove mokimobility.com from the blacklist. If you want just one link in the MokiMobility article, then www.mokimobility.com/about/ is the best one to request at the whitelisting page. A link to the home page won't work in this case because www.mokimobility.com is the actual domain and not just a page and therefore can't be whitelisted. Furthermore, www.mokimobility.com/index.html doesn't exist and www.mokimobility.com/index.php redirects to www.mokimobility.com, providing an avenue for further link spamming. ~Amatulić (talk) 18:13, 1 June 2012 (UTC)

    goo.gl

    This is Google's in-house link shortener, used only for links to Google searches or products. It has worked for months but is suddenly blocked. It does not appear in the log. There was no discussion of this block. This shortening service, which abides by our rules since non-Google domains can't be entered, helps reduce overly clunky and massive POST URLs, and should not be blocked without some community consultation. - ʄɭoʏɗiaɲ τ ¢ 01:09, 8 June 2012 (UTC)

    It appears to not just be for "in-google" domains. Go there and you can create them for everywhere; the search on wp:en shows one "in-google" and one "out-google". tedder (talk) 01:16, 8 June 2012 (UTC)
    BTW it's on the mediawiki blacklist, not ours. tedder (talk) 01:17, 8 June 2012 (UTC)
    Also, it seems to have been blocked at Meta since December 2009. So I'm not sure how it would have worked "for months". Anomie 01:42, 8 June 2012 (UTC)
    I've been using it for Google Maps links since they introduced it, and today is the first time it has been blocked. The search tool is broken, I believe, Tedder; I've got them in at least 40 articles myself. - ʄɭoʏɗiaɲ τ ¢ 10:31, 8 June 2012 (UTC)
    Do you have an example of someplace you used it that worked and is not found by the search tool? Anomie 10:38, 8 June 2012 (UTC)
    Looks like I've made a mistake... I was using g.co, which Google Maps seems to have suddenly stopped using. <blatant sarcasm ahead> It's a good thing Google is easy to contact about technical issues. - ʄɭoʏɗiaɲ τ ¢ 12:42, 8 June 2012 (UTC)
    Alright, after some investigation, it appears short URLs generated with Google Maps will always use goo.gl/maps/. Is it possible to whitelist those whilst still blocking other goo.gl links? - ʄɭoʏɗiaɲ τ ¢ 21:14, 8 June 2012 (UTC)
    Sure, that's possible. Defer to Whitelist for such requests. Although you may have to convince the admins there that there is a need to include a URL shortener in the whitelist when it's perfectly reasonable to include full URLs in articles. ~Amatulić (talk) 21:25, 8 June 2012 (UTC)
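
    As a rough illustration of the idea discussed above (allowing goo.gl/maps/ while still blocking other goo.gl links), the following shell sketch checks a URL against one whitelist pattern and one blacklist pattern. Both patterns and the check_url helper are hypothetical and are not the actual list entries; the sketch also assumes that a whitelist match takes precedence over a blacklist match.

```shell
#!/bin/sh
# Assumed example patterns (POSIX extended regular expressions), not real list entries.
BLACKLIST_RE='^https?://([a-z0-9-]+\.)*goo\.gl(/|$)'
WHITELIST_RE='^https?://([a-z0-9-]+\.)*goo\.gl/maps/'

# Hypothetical helper: print "allowed" or "blocked" for one URL.
# A whitelist match is checked first and wins over a blacklist match.
check_url() {
    if printf '%s\n' "$1" | grep -Eq "$WHITELIST_RE"; then
        echo allowed
    elif printf '%s\n' "$1" | grep -Eq "$BLACKLIST_RE"; then
        echo blocked
    else
        echo allowed
    fi
}

check_url 'http://goo.gl/maps/abcd'
check_url 'http://goo.gl/xyz'
```

    Under these assumptions only the /maps/ path is let through; any other goo.gl URL still hits the blacklist pattern.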

    Engineers-Excel.com

    Provides useful (and free) spreadsheet tools for engineers; a lot of the content is based on Wikipedia.

    Declined. No rationale has been given on how this would be a useful reference in any article. Furthermore, sites with content "based on Wikipedia" are generally inappropriate to include as links in Wikipedia, because such references are essentially circular. ~Amatulić (talk) 14:22, 8 June 2012 (UTC)

    Completed Proposed removals

    Troubleshooting and problems

    Logging / COIBot Instr

    Blacklist logging

    Full instructions for admins


    Quick reference

    For Spam reports or requests originating from this page, use template {{/request|0#section_name}}

    • {{/request|213416274#Section_name}}
    • Insert the oldid 213416274, a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.

    For Spam reports or requests originating from Wikipedia_talk:WikiProject_Spam use template {{WPSPAM|0#section_name}}

    • {{WPSPAM|182725895#Section_name}}
    • Insert the oldid 182725895, a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.
    Note: If you do not log your entries, they may be removed if someone appeals an entry and no valid reasons can be found.

    Addition to the COIBot reports

    The lower list in the COIBot reports now shows four numbers in brackets after each link (e.g. "www.example.com (0, 0, 0, 0)"):

    1. First number: how many links this user added (the same after each link).
    2. Second number: how many times this link was added to Wikipedia (for as far as the linkwatcher database goes back).
    3. Third number: how many times this user added this link.
    4. Fourth number: to how many different Wikipedias this user added this link.

    If the third or the fourth number is high with respect to the first or the second, the user has at least a preference for using that link. Be careful with other statistics from these numbers (e.g. a good user who adds a lot of links). If there are more statistics that would be useful, please notify me, and I will see whether I can get the info out of the database and report it. This data is available in real time on IRC.
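
    A small sketch of the comparison described above, assuming report lines in the "www.example.com (0, 0, 0, 0)" format. The link_share helper and the sample numbers are made up for illustration and are not part of COIBot. It prints the third number as a percentage of the first (how much of the user's link adding went to this link) and of the second (how much of the link's total additions came from this user).

```shell
#!/bin/sh
# Hypothetical helper: takes one report line such as "www.example.com (12, 36, 9, 3)"
# and prints "<third/first in %> <third/second in %>", truncated to whole numbers.
# Prints "n/a" when the first or second number is zero.
link_share() {
    printf '%s\n' "$1" | awk -F'[(), ]+' '
        $2 > 0 && $3 > 0 { printf "%d %d\n", ($4 * 100) / $2, ($4 * 100) / $3; next }
        { print "n/a" }'
}

# Made-up sample: 12 links added by the user, 36 additions of this link overall,
# 9 of them by this user, on 3 different wikis.
link_share 'www.example.com (12, 36, 9, 3)'
```

    High percentages only hint at a preference for the link; as noted above, they say nothing on their own about whether the user is acting in good faith.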

    Poking COIBot

    When {{LinkSummary}}, {{UserSummary}} and/or {{IPSummary}} templates are added to WT:WPSPAM, WT:SBL, WT:SWL or User:COIBot/Poke (the latter for privileged editors), COIBot will generate link reports for the domains and user reports for users and IPs.


    Discussion


    Possible malware

    There's a question at RSN about a possible malware site. Could someone take a look at Wikipedia:Reliable_sources/Noticeboard#Please_check_the_source? WhatamIdoing (talk) 06:01, 12 February 2011 (UTC)

    Ran the URL through a few malware/threat detectors; it seems OK.
    Here are a few scanner tools that could be useful.
    --Hu12 (talk) 19:53, 12 February 2011 (UTC)

    Clean up

    As you can see, the spam blacklist is extremely long and is growing towards a size that is hard for a human to keep an overview of and that probably takes some time to apply (which happens on every edit). Because of that, I wrote a script which takes the "easy" regular expressions and parses them back into domain names. Only those that start with either \b or \. are taken into account, because otherwise the exact domain name which is blacklisted can't be extracted (e.g. foo\.com will also match barfoo.com). Of course it only takes clear cases into account: the domain names may only contain 0-9, a-z or -, the TLDs mustn't contain anything except letters, and all dots must be escaped. After it has extracted those domain names, it checks with nslookup whether they still exist; if not, they are removed from the spam blacklist (they are only removed if nslookup returns NXDOMAIN; server failures etc. are ignored). I've already done that with the global spam blacklist on meta twice (1, 2); there haven't been any problems, and none of the removed domains has been re-added since.

    So now I ran my script for the English Wikipedia; the new spam blacklist can be found here and the removed lines here. It would be great if an administrator could apply the new list, or I can do it myself if there's consensus to do so. Feel free to verify some of the removed lines by hand using your system's nslookup, just your browser, or the various whois sites out there - Hoo man (talk) 17:38, 6 June 2012 (UTC)
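
    The approach described above could be sketched roughly as follows. This is not the script referred to in the comment, only an illustration of the stated rules: an entry must start with \b or \., labels may contain only 0-9, a-z or -, the TLD only letters, and every dot must be escaped. The function names are hypothetical, the optional trailing \b is an assumption about how entries are usually written, and the NXDOMAIN check assumes that nslookup prints the string "NXDOMAIN" for non-existent domains.

```shell
#!/bin/sh
# Hypothetical helper: print the literal domain behind a "simple" blacklist entry,
# or print nothing if the entry is not a clear case under the rules above.
entry_to_domain() {
    printf '%s\n' "$1" |
        LC_ALL=C grep -E '^(\\b|\\\.)[0-9a-z-]+(\\\.[0-9a-z-]+)*\\\.[a-z]+(\\b)?$' |
        sed -e 's/^\\b//' -e 's/^\\\.//' -e 's/\\b$//' -e 's/\\\././g'
}

# Hypothetical helper: succeed only if nslookup reports NXDOMAIN for the domain.
# Other failures (timeouts, server failures) count as "still exists".
is_nxdomain() {
    nslookup "$1" 2>&1 | grep -q 'NXDOMAIN'
}

# Read blacklist entries on stdin and print those whose domain no longer exists.
find_dead_entries() {
    while IFS= read -r entry; do
        domain=$(entry_to_domain "$entry")
        [ -n "$domain" ] && is_nxdomain "$domain" && printf '%s\n' "$entry"
    done
    return 0
}
```

    For example, entry_to_domain '\bexample\.com\b' prints example.com, while an unanchored entry such as foo\.com or one with an unescaped dot yields nothing and would be left untouched. find_dead_entries needs network access, so it is best tried on a small sample of entries first.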

    I'm all for removing dead or outdated entries, especially since there is successful precedent already on meta. We'd just have to make sure the diff is reflected in the logfile. ~Amatulić (talk) 18:11, 6 June 2012 (UTC)
    That's of course easily possible, as you seem to use the same format for logs as meta does. See the meta log entry for the first removal - Hoo man (talk) 18:17, 6 June 2012 (UTC)
    Done. I admit that I have no idea how this page works, but I trust you when you say it will work correctly ;) - Martin (MSGJ · talk) 10:13, 8 June 2012 (UTC)

    Thanks! Please note this edit made by me :) I'll log the change in a second - Hoo man (talk) 11:54, 8 June 2012 (UTC)

    OK, I've just noticed that you use a different log style from what we use on meta (although you link to the meta help page). I've tried to do the "If you remove something from the blacklist, simply remove the relevant entry here." using a small shell script, but that doesn't work well either (especially because cleaning out only the lines which have been removed leaves a lot of leftover trash). Any ideas? - Hoo man (talk) 12:25, 8 June 2012 (UTC)

    As there wasn't any reply, I used my list for logging now (diff). It doesn't seem to have created much waste. Feel free to revert if you have a better idea, or if you consider logging the removals the way we do it on meta a better idea - Hoo man (talk) 13:46, 8 June 2012 (UTC)

    The diff looks fine to me.
    We also have a few entries that are redundant because they are already listed on meta. It would be nice to have a tool that locates matching entries between the two lists. ~Amatulić (talk) 14:28, 8 June 2012 (UTC)

    That's easily doable: double lines. I did that with the following short bash script:

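    #!/bin/bash
    # List entries that appear on both the meta and the English Wikipedia spam blacklists.
    export LC_ALL=C  # sort and comm must agree on the collation order
    wget -q -O - 'http://meta.wikimedia.org/w/index.php?title=Spam_blacklist&action=raw' | sort > meta_spam.txt
    wget -q -O - 'http://en.wikipedia.org/w/index.php?title=MediaWiki:Spam-blacklist&action=raw' | sort > en_spam.txt
    # Print lines common to both files, skipping blank lines and lines starting with a space or '#'
    comm -12 meta_spam.txt en_spam.txt | grep -vE '^[ #]|^$'
    rm meta_spam.txt en_spam.txt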
    

    - Hoo man (talk) 15:15, 8 June 2012 (UTC)