
MediaWiki talk:Spam-blacklist

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by GermanJoe (talk | contribs) at 20:22, 4 June 2019 (→‎Proposed additions: + 5). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

    MediaWiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. It is faster and more easily searchable, though it supports only whole domains and does not support whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot, be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions; the disruption that a faulty entry can cause is substantial.
    5. Close the request entry here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 900312345 after you have closed the request. See here for more info on logging.
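
    The warning about regular expressions in step 4 can be illustrated with a minimal Python sketch. It assumes, as a simplification and not as the extension's actual code, that each blacklist line is a regex fragment matched after the URL scheme and any subdomain characters. The helper name `is_blocked` and the sample URLs are hypothetical.

```python
import re

def is_blocked(url: str, entries: list[str]) -> bool:
    """Return True if any blacklist entry matches the URL (simplified model)."""
    # Assumption: entries are OR-ed together and matched after the scheme,
    # with optional subdomain characters allowed in front of the entry.
    combined = r"https?://+[a-z0-9_\-.]*(?:" + "|".join(entries) + r")"
    return re.search(combined, url, flags=re.IGNORECASE) is not None

# Escaped dot and word boundaries: matches only the intended domain.
careful = [r"\bonefivenine\.com\b"]
# Unescaped dot: "." matches any character, so unrelated hosts are caught too.
sloppy = ["camp-x.com"]

print(is_blocked("http://www.onefivenine.com/india", careful))
print(is_blocked("https://camp-xzcom.example/page", sloppy))
print(is_blocked("https://camp-xzcom.example/page", [r"\bcamp-x\.com\b"]))
```

    In this model the sloppy entry also blocks the unrelated host `camp-xzcom.example`, which is the kind of collateral damage the instructions warn about.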


    Proposed additions

    onefivenine.com

    I've been weeding out uses of this site for months now, and we're down to around 190 instances from what was well over 2000 in January 2018. The site was one of many discussed here, all of which will eventually come in for the same treatment. However, it is still being added and so I find myself going round in circles. - Sitush (talk) 08:06, 8 May 2019 (UTC)

    @Sitush: do you think that we could just list the whole set here and blacklist based on that discussion? (we do blacklist if there is a clear consensus that they should not be used at all, and only make exceptions in very rare cases, which can then easily be handled at the whitelist - I am not sure if this RSN discussion is strong enough for that). --Dirk Beetstra T C 10:43, 8 May 2019 (UTC)
    I would be happy to see all but mapsofindia blacklisted. As the discussion noted, mapsofindia does have some genuine use but the rest are scrapers. Bear in mind that the participants in that discussion, other than Reyk, are all very frequent contributors to India-related articles - they know their stuff. Sorry, struck that - there was a discussion somewhere that involved more members of the India wikiproject, but it isn't the one I link above. Same outcome, though. - Sitush (talk) 10:46, 8 May 2019 (UTC)

    @Sitush: so that makes the above list? --Dirk Beetstra T C 11:01, 8 May 2019 (UTC)

    Seems to, yes, thanks. - Sitush (talk) 11:14, 8 May 2019 (UTC)
    • Yeah, I remove links to 159 every now and then and it does seem to me that I'm seeing instances I removed being put back in. Blacklist sounds like a good plan. Reyk YO! 11:19, 8 May 2019 (UTC)
    • Blacklisting makes sense for onefivenine.com. I don't have an opinion about the other websites, except for census2011.co.in: it's not the best source, but it's very far from being the kind of thing we should blacklist (it's also currently being discussed at the RSN). – Uanfala (talk) 11:39, 16 May 2019 (UTC)
    • The most recent RSN discussion, referred to by Uanfala, was archived here. FWIW, I have been revamping the 187 villages named at List of villages in Mawal taluka using the official census reports rather than the adsense-geared census2011.co.in mentioned in the list. It is time-consuming to fix but not difficult to source, and there were some where the website differed from the official report because the site seems to have missed that there were 6 uninhabited villages + three that bear similar names. - Sitush (talk) 10:03, 31 May 2019 (UTC)
    @Sitush: Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 14:29, 31 May 2019 (UTC)

    company-histories.com

    This domain is operated by Advameg, which previously had all of its domains blacklisted for publishing scraped and improperly licensed content: MediaWiki talk:Spam-blacklist/archives/April 2019 § Advameg sites (city-data.com, filmreference.com, etc.) -- Newslinger talk 06:45, 21 May 2019 (UTC)

    captainjobs.de

    spammed by

    Continued spamming for a new jobs site. A final warning and a block (Syedalam7680) have been ignored. 1-2 other sites have also been sporadically spammed by this user, but this one seems to be the main issue for now. GermanJoe (talk) 19:34, 21 May 2019 (UTC)

    Still spamming in Employment website. GermanJoe (talk) 20:43, 31 May 2019 (UTC)
    @GermanJoe: Added to MediaWiki:Spam-blacklist. -- JJMC89(T·C) 02:26, 3 June 2019 (UTC)

    Famous Birthdays (famousbirthdays.com)

    Repeated violations of the biographies of living persons policy. The provenance of Famous Birthdays's content is highly questionable, and less experienced editors have been inappropriately using this site as a reference for years. This domain is currently on User:XLinkBot/RevertList and User:XLinkBot/RevertReferencesList, but this measure does not adequately address the addition of these inappropriate citations into articles. See WP:RSN § famousbirthdays.com for the current discussion and WP:RSP § Famous Birthdays for past discussions. I don't see a valid use case for this domain (as a reference or as an external link). -- Newslinger talk 21:46, 28 May 2019 (UTC)

    @Newslinger: Added per the discussion on RSN and the listing in RSP. XLinkBot has been tried and not found to be sufficient. --Dirk Beetstra T C 05:16, 29 May 2019 (UTC)

    campusvarta.com

    Repeated, ongoing spamming by SPA socks and dynamic IPs for a student blog (or "EdTech news website" as it calls itself) - see detailed COIBot report for accounts and IPs. No encyclopedic usage. GermanJoe (talk) 11:50, 29 May 2019 (UTC)

    @GermanJoe: Added. --Dirk Beetstra T C 13:21, 29 May 2019 (UTC)

    general-ebooks.com

    At Cerium nitrate, I visited a link and my browser said "... Google Safe Browsing recently detected phishing on www.general-ebooks.com. Phishing sites pretend to be other websites to trick you. ..." That prompted me to replace 2 links to http://www.general-ebooks.com/get/3651357 with https://archive.org/details/chemischekrystal02grotuoft/page/n3 (diff). Linksearch tool lists 7 uses on en (mostly non-article); Google listed some on other-language Wikipedias. One of the links to general-ebooks.com was added in 2014 (diff). Maybe general-ebooks.com went bad? - A876 (talk) 19:29, 2 June 2019 (UTC)

    Indonesian blog spam

    Blogs have been spammed by multiple dynamic IPs and are apparently all maintained by the same - probably Indonesian - spammer. Multiple warnings given and ignored (see especially http spam link warnings for dapurocha.com). Please blacklist the whole bunch - no encyclopedic usage. GermanJoe (talk) 20:22, 4 June 2019 (UTC)

    Proposed removals


    camp-x.com

    I don't know if camp-x.com has been discussed before. It is the work of Lynn Philip Hodgson, whose books are freely referenced in such articles as Camp X, Casa Loma, George McClellan (police officer)... (see https://en.wikipedia.org/w/index.php?search=Lynn+Philip+Hodgson&title=Special%3ASearch&go=Go&ns0=1 for more). It seems to have relevant info on these subjects, especially first hand rather than in book refs that are unavailable online, except thru camp-x.com. I endeavoured to Externally link: Camp-X on the Camp-X article and was blocked. Why not allow it?

    PS I have no conflict of interest. PPS I have been editing WP for about 12 years, yet there are places I find difficult to penetrate, like the Blacklist and the whys and wherefores of such decisions. Am I out of line to say that this makes a mockery of WP's claims to openness and user friendliness? Am I supposed to trawl the logs of such decisions for each year? Is there an easier way, and why can't it come up with the reason for Blacklisting at the time of editing... Somehow it seems like those intrepid editors that use such sites that cannot be named may be regarded as tainted and suspicious themselves. Speaking for myself, I am trying to improve WP, on my own time, and could do without some of the bother. DadaNeem (talk) 23:27, 30 May 2019 (UTC)

    By fiddling around with tracked I found this page: https://en.wikipedia.org/wiki/MediaWiki_talk:Spam-blacklist/archives/July_2011#camp-x.com_removal which states that:

    Defer to Whitelist. This was blocked in December 2009 for spamming by a persistent spammer. The size of the site or who owns it isn't really relevant to the reason for listing. The only article on Wikipedia that would require a link to this site would be Camp X, and for that, you may request that a specific page to be whitelisted. If you want to whitelist just the home page, use www.camp-x.com/index.htm in your request. ~Amatulić (talk) 19:42, 12 July 2011 (UTC)

    That's all I can find. My questions:

    • The original block details are unknown to me.
    • Are they even still relevant 10 years on?
    • Is a block permanent? The site is no longer active, yet I found the latest archive of it on archive.org.
    • No request that I know of was made for a partial lifting for use on Camp X; could at least that be allowed?

    DadaNeem (talk) 00:33, 31 May 2019 (UTC)

    @DadaNeem: Removed from MediaWiki:Spam-blacklist. --Dirk Beetstra T C 04:52, 31 May 2019 (UTC)
    @Beetstra: Many thanks Dirk Beetstra. Please excuse me for going into rant mode. To give a more general positive outcome to this, can you suggest a way to make this process easier? I have a few ideas... BTW while camp-x.com is unblocked, the bitly equivalent, https://bit.ly/2YYBfr2, remains blocked, e.g. I can't save this comment without rendering the bit.ly harmless. Regards DadaNeem (talk) 06:49, 31 May 2019 (UTC)
    @DadaNeem: there is no reason to ever use bit.ly. Use https://web.archive.org/web/20161026113930/http:/www.camp-x.com/camp-x.html. bit.ly is a redirect service, and those services are globally blocked as they are abused on a daily basis, are utterly unneeded (there is by definition a replacement), and hide where you are going to end up. We will not even grant whitelisting for them.
    I am unsure what you want to make less painful here. Is there anything missing in the instructions? --Dirk Beetstra T C 11:36, 31 May 2019 (UTC)
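
    The advice to link an archived copy directly rather than a redirect can be sketched in Python. This is a minimal sketch that assumes the `https://web.archive.org/web/<timestamp>/<original URL>` layout visible in the link above; the function name `wayback_url` and the insistence on a full 14-digit timestamp are choices made for this illustration.

```python
def wayback_url(original_url: str, timestamp: str) -> str:
    """Build a Wayback Machine snapshot URL for a given capture timestamp."""
    # Assumption for this sketch: a full YYYYMMDDhhmmss timestamp is required.
    if not (timestamp.isdigit() and len(timestamp) == 14):
        raise ValueError("timestamp must be 14 digits: YYYYMMDDhhmmss")
    return f"https://web.archive.org/web/{timestamp}/{original_url}"

print(wayback_url("http://www.camp-x.com/camp-x.html", "20161026113930"))
```

    Unlike a shortened link, the resulting URL shows both the capture date and the original destination.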

    census2011

    This is India's official government website for census information. The website is very important for citing accurate reliable information for India topics. It is not spam. Please remove. I don't know why some spammers decide to use it but it is certainly not spam. --NagalimNE (talk) 07:31, 4 June 2019 (UTC)

    • This was part of a big batch of recent additions. I did try to raise an objection, but it seems to have been ignored. These sites are not official sites of the Indian government, they're created and maintained by private individuals who've repackaged the census data. They probably aren't reliable, so their use in articles should be discouraged in favour of the actual official sources. However, they are not spam and should not be blacklisted. The first one at least also has a legitimate use in talk page discussions having to do with article names or disambiguation – it allows you to perform a search for all villages with a given name and link to that search, and as far as I know there's no easy way to do that on the official census website. – Uanfala (talk) 11:43, 4 June 2019 (UTC)

    Troubleshooting and problems

    Logging / COIBot Instructions

    Blacklist logging

    Full instructions for admins


    Quick reference

    For Spam reports or requests originating from this page, use template {{/request|0#section_name}}

    • {{/request|213416274#Section_name}}
    • Insert the oldid (213416274), a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.

    For Spam reports or requests originating from Wikipedia_talk:WikiProject_Spam use template {{WPSPAM|0#section_name}}

    • {{WPSPAM|182725895#Section_name}}
    • Insert the oldid (182725895), a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.
    Note: If you do not log your entries, they may be removed if someone appeals the entry and no valid reasons can be found.
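
    The quick-reference pattern above (template name, oldid, hash, section name with underscores) can be assembled mechanically. The following Python helper is a hypothetical convenience sketch, not part of any existing tool; `log_reference` is an invented name.

```python
def log_reference(template: str, oldid: int, section: str) -> str:
    """Build a log reference such as {{/request|213416274#Section_name}}."""
    # Spaces in the section name are replaced by underscores, per the instructions.
    anchor = section.strip().replace(" ", "_")
    return "{{" + f"{template}|{oldid}#{anchor}" + "}}"

# Request originating from this page:
print(log_reference("/request", 213416274, "Section name"))
# Request originating from Wikipedia_talk:WikiProject_Spam:
print(log_reference("WPSPAM", 182725895, "Section name"))
```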

    Addition to the COIBot reports

    The lower list in the COIBot reports now shows four numbers in brackets after each link (e.g. "www.example.com (0, 0, 0, 0)"):

    1. First number: how many links this user added (the same after each link).
    2. Second number: how many times this link was added to Wikipedia (as far back as the linkwatcher database goes).
    3. Third number: how many times this user added this link.
    4. Fourth number: how many different Wikipedias this user added this link to.

    If the third or fourth number is high relative to the first or the second, the user has at least a preference for using that link. Be careful when drawing other conclusions from these numbers (e.g. a good user who adds a lot of links). If there are more statistics that would be useful, please notify me, and I will see whether I can get the info out of the database and report it. This data is available in real time on IRC.
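
    As a rough illustration of how the four numbers relate, here is a Python sketch that parses one report entry and computes two ratios based on the third number. The class, field and method names are hypothetical, the sample figures are made up, and the ratios are only one possible reading of "high with respect to the first or the second"; COIBot itself does not necessarily compute anything like this.

```python
import re
from dataclasses import dataclass

@dataclass
class LinkStats:
    user_links_total: int   # first number: links added by this user overall
    link_adds_total: int    # second number: times this link was added by anyone
    user_adds_of_link: int  # third number: times this user added this link
    wikis_with_link: int    # fourth number: wikis where this user added it

    def user_focus(self) -> float:
        """Share of the user's link additions that are this link."""
        if self.user_links_total == 0:
            return 0.0
        return self.user_adds_of_link / self.user_links_total

    def link_ownership(self) -> float:
        """Share of all additions of this link made by this user."""
        if self.link_adds_total == 0:
            return 0.0
        return self.user_adds_of_link / self.link_adds_total

def parse_stats(entry: str) -> LinkStats:
    """Parse an entry such as 'www.example.com (12, 15, 11, 3)'."""
    numbers = re.search(r"\((\d+),\s*(\d+),\s*(\d+),\s*(\d+)\)", entry)
    if numbers is None:
        raise ValueError(f"no four-number group in {entry!r}")
    return LinkStats(*map(int, numbers.groups()))

stats = parse_stats("www.example.com (12, 15, 11, 3)")
print(round(stats.user_focus(), 2), round(stats.link_ownership(), 2))
```

    Both ratios being close to 1 would suggest a user who adds little else and accounts for most additions of the link, but as noted above, a prolific good-faith editor can produce similar figures.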

    Poking COIBot

    When {{LinkSummary}}, {{UserSummary}} and/or {{IPSummary}} templates are added to WT:WPSPAM, WT:SBL, WT:SWL or User:COIBot/Poke (the latter for privileged editors), COIBot will generate link reports for the domains and user reports for users and IPs.
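
    A small Python sketch of preparing such a poke: it emits one template line per domain and picks {{IPSummary}} or {{UserSummary}} depending on whether the editor name parses as an IP address. The function name `poke_templates` and the sample inputs are hypothetical, and passing the domain or name as the first template parameter is an assumption of this sketch.

```python
import ipaddress

def poke_templates(domains: list[str], editors: list[str]) -> list[str]:
    """Return wikitext lines that would ask COIBot for link and user reports."""
    lines = ["{{LinkSummary|" + domain + "}}" for domain in domains]
    for editor in editors:
        try:
            # Valid IPv4/IPv6 literals are treated as IP editors.
            ipaddress.ip_address(editor)
            lines.append("{{IPSummary|" + editor + "}}")
        except ValueError:
            # Anything else is treated as an account name.
            lines.append("{{UserSummary|" + editor + "}}")
    return lines

for line in poke_templates(["example.com"], ["192.0.2.1", "ExampleUser"]):
    print(line)
```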


    Discussion