
MediaWiki talk:Spam-blacklist

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by KH-1 (talk | contribs) at 01:18, 11 September 2020 (+1). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

    MediaWiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow-up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though it only supports whole domains and does not support whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot, be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions — the disruption that can be caused is substantial.
    5. Close the request entry on here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 977795190 after you have closed the request. See here for more info on logging.


    Proposed additions

    breakingnews365.in

    Systematic blog spam from various dynamic IPs. GermanJoe (talk) 10:21, 3 September 2020 (UTC)[reply]

    @GermanJoe: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 10:22, 3 September 2020 (UTC)[reply]

    watchlivenow.org

    plus Added after aggressive spamming campaign. OhNoitsJamie Talk 13:14, 3 September 2020 (UTC)[reply]

    bestringtones.net

    Low-level spamming but creating multiple user accounts and nothing useful on this site. Ravensfire (talk) 14:30, 3 September 2020 (UTC)[reply]

    plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 14:47, 3 September 2020 (UTC)[reply]

    smartgenguru.com

    After collecting a couple of blocks this one has just graduated to full blown multiple account use/sockpuppetry. I don't expect that they'll stop if they aren't made to stop. - MrOllie (talk) 16:48, 5 September 2020 (UTC)[reply]

    @MrOllie: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 17:38, 5 September 2020 (UTC)[reply]

    tranio.com

    Citation spamming of the blog section of this real estate listings site has just hit an even dozen sockpuppets, so it seemed to be time to list here. They never respond to warnings - just move on to a new account and continue the additions. - MrOllie (talk) 19:00, 8 September 2020 (UTC)[reply]

    @MrOllie: plus Added to MediaWiki:Spam-blacklist. --GeneralNotability (talk) 19:51, 9 September 2020 (UTC)[reply]

    lindenbotanicals.com

    Spammed by IPs, see COIBot. GeneralNotability (talk) 19:50, 9 September 2020 (UTC)[reply]

    @GeneralNotability: plus Added to MediaWiki:Spam-blacklist. --GeneralNotability (talk) 19:50, 9 September 2020 (UTC)[reply]

    Greenssolicitors

    See [1]

    Please blacklist. -KH-1 (talk) 01:18, 11 September 2020 (UTC)[reply]

    Proposed removals


    xyz domain

    I just discovered that I couldn't post a link to https://d0n.xyz/project/permanent-redirect/ in a discussion at Wikipedia:Village pump (policy). I cannot find an entry for .xyz, only for specific subdomains of it, and d0n.xyz is not one of them. I am at a loss to see how this has become blacklisted or why. SpinningSpark 09:51, 4 September 2020 (UTC)[reply]

    Spinningspark, the whole TLD is blacklisted because editors here had a daily task in removing spam domains on this TLD. In the last 1000 hits your 4 attempts are genuine; the other >40 are spam. Among the thousands of domains on this TLD, so few are legitimate that whitelisting those is more efficient.  Defer to Whitelist. --Dirk Beetstra T C 11:55, 4 September 2020 (UTC)[reply]
    Ok, but where is the entry blacklisting the whole domain? I'm still not seeing it either here or on the global blacklist. But then I only have a rudimentary understanding of regex. SpinningSpark 12:20, 4 September 2020 (UTC)[reply]
    Spinningspark, I think it's the "\bxyz\b" line. Ravensfire (talk) 15:58, 4 September 2020 (UTC)[reply]
    Interesting, .guru was blocked as "\b[_\-0-9a-z]+\.guru\b", should the xyz be similar? Ravensfire (talk) 15:59, 4 September 2020 (UTC)[reply]
    Really? As I said, I have limited skills in regex, but wouldn't that, for instance, catch https://en.wikipedia.org/wiki/xyz if it wasn't on the whitelist? That seems a bit extreme. I didn't find it myself because I was searching for "\.xyz" which I assumed is how it was listed. SpinningSpark 16:18, 4 September 2020 (UTC)[reply]
    Spinningspark, no, it wouldn’t, as you just saw. This, like the one for guru, catches the whole tld. The .guru one is a bit better, but that is because guru is a more common word than xyz. Dirk Beetstra T C 19:21, 4 September 2020 (UTC)[reply]
    The regex \bxyz\b is too broad and would result in needless false hits. If the intent is to block the tld, then it should be formatted like .guru. I have just done so. ~Anachronist (talk) 17:56, 5 September 2020 (UTC)[reply]
    Anachronist, show me false positives? Any links that have genuinely isolated xyz in them? Dirk Beetstra T C 18:11, 5 September 2020 (UTC)[reply]
    Anachronist, and anyway, SpinningSpark's domain would still be blocked. Dirk Beetstra T C 18:13, 5 September 2020 (UTC)[reply]
    Anachronist, I have undone this; your comment misinterprets how the spam blacklist works (it is NOT blocking everything with xyz in it: https://example.com/xyz and https://abcdefghijklmnopqrstuvwxyz.com have xyz in them and are not blocked, as you see here, and neither is the link above), and the entry is now unnecessarily confusing for those with limited regex skills, without any benefit. The only exception I can see is only half-solved here, and others will be extremely few (if any). Dirk Beetstra T C 18:22, 5 September 2020 (UTC)[reply]
    That's fine. I was concerned that any domain starting with xyz, like http://xyzzy.com would be blocked. ~Anachronist (talk) 23:18, 5 September 2020 (UTC)[reply]
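The behaviour discussed in this thread can be checked with a short sketch. It rests on an assumption that is not stated on this page: that the blacklist extension wraps each entry in a prefix anchoring it to the host part of a URL, roughly `(?:https?:)?//+[a-z0-9_\-.]*(entry)`, matched case-insensitively. `HOST_PREFIX` and `is_blocked` are hypothetical names used only for illustration, and Python's `re` module stands in for the regex engine the wiki actually uses.

```python
import re

# ASSUMPTION: each blacklist entry is wrapped in a prefix like this one,
# so the entry can only start somewhere inside the host part of the URL.
HOST_PREFIX = r"(?:https?:)?//+[a-z0-9_\-.]*"


def is_blocked(url: str, entry: str) -> bool:
    """Return True if the (assumed) wrapped blacklist entry matches the URL."""
    pattern = re.compile(HOST_PREFIX + "(" + entry + ")", re.IGNORECASE)
    return pattern.search(url) is not None


BROAD = r"\bxyz\b"                     # the entry discussed in the thread
GURU_STYLE = r"\b[_\-0-9a-z]+\.xyz\b"  # same shape as the .guru entry

urls = [
    "https://d0n.xyz/project/permanent-redirect/",
    "https://example.com/xyz",
    "https://abcdefghijklmnopqrstuvwxyz.com",
    "http://xyzzy.com",
    "https://en.wikipedia.org/wiki/xyz",
    "https://xyz.example.com/",
]

for url in urls:
    print(url, is_blocked(url, BROAD), is_blocked(url, GURU_STYLE))
```

Under these assumptions both entries block the d0n.xyz link and leave the next four URLs alone. They differ only when "xyz" is a host label that is not preceded by another label, as in the last URL, which the `\bxyz\b` form matches and the .guru-style form does not.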

    Logging / COIBot Instructions

    Blacklist logging

    Full instructions for admins


    Quick reference

    For Spam reports or requests originating from this page, use template {{/request|0#section_name}}

    • {{/request|213416274#Section_name}}
    • Insert the oldid 213416274, a hash "#", and the Section_name (Underscoring_spaces_where_applicable):
    • Use within the entry log here.

    For Spam reports or requests originating from Wikipedia_talk:WikiProject_Spam use template {{WPSPAM|0#section_name}}

    • {{WPSPAM|182725895#Section_name}}
    • Insert the oldid 182725895, a hash "#", and the Section_name (Underscoring_spaces_where_applicable):
    • Use within the entry log here.
    Note: If you do not log your entries, they may be removed if someone appeals the entry and no valid reasons can be found.
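As a minimal sketch of the quick reference above, the log template text can be assembled from an oldid and a section name. The helper name `request_template` and its `wpspam` flag are hypothetical, introduced only for illustration; the template names and the oldid#Section_name layout are taken from the instructions above.

```python
def request_template(oldid: int, section: str, wpspam: bool = False) -> str:
    """Build the log template text for a request (hypothetical helper).

    Spaces in the section name are replaced with underscores, as the
    quick reference asks.
    """
    anchor = section.strip().replace(" ", "_")
    name = "WPSPAM" if wpspam else "/request"
    return "{{" + name + "|" + str(oldid) + "#" + anchor + "}}"


# Requests originating from this page
print(request_template(213416274, "Section name"))
# Requests originating from Wikipedia_talk:WikiProject_Spam
print(request_template(182725895, "Section name", wpspam=True))
```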

    Addition to the COIBot reports

    The lower list in the COIBot reports now has, after each link, four numbers between brackets (e.g. "www.example.com (0, 0, 0, 0)"):

    1. First number: how many links this user added (the same after each link).
    2. Second number: how many times this link was added to Wikipedia (as far back as the linkwatcher database goes).
    3. Third number: how many times this user added this link.
    4. Fourth number: to how many different Wikipedias this user added this link.

    If the third number or the fourth number are high with respect to the first or the second, then that means that the user has at least a preference for using that link. Be careful with other statistics from these numbers (e.g. good user who adds a lot of links). If there are more statistics that would be useful, please notify me, and I will have a look if I can get the info out of the database and report it. This data is available in real-time on IRC.

    Poking COIBot

    When adding {{LinkSummary}}, {{UserSummary}} and/or {{IPSummary}} templates to WT:WPSPAM, WT:SBL, WT:SWL and User:COIBot/Poke (the latter for privileged editors) COIBot will generate linkreports for the domains, and userreports for users and IPs.



    Discussion

    Where to find blacklist hits?

    I thought I had asked this before, but I cannot find a record of it in the archives of this page.

    Is there some sort of tool, edit filter, or log, that shows edit attempts blocked by the spam-blacklist?

    This would be useful for examining activity associated with a blacklisted link in requests to de-list an entry. ~Anachronist (talk) 04:29, 2 September 2020 (UTC)[reply]

    Anachronist, special:log/spamblacklist, or ask for a COIBot report (they are listed between the additions with ‘logitem’). Their use is sometimes questionable, since you don’t know the ‘intention’. Dirk Beetstra T C 17:35, 5 September 2020 (UTC)[reply]
    Thanks! No, a log wouldn't show intention, but it would show frequency, and the attempts by IP addresses might suggest COI editing. It's another tool. ~Anachronist (talk) 17:50, 5 September 2020 (UTC)[reply]

    The fico entry

    I just stumbled upon it from MediaWiki talk:Spam-blacklist/archives/August 2020#regexes while trying to figure out why the filosofico.net link at Cesare Cremonini (philosopher) wasn't working properly. I imagine this is catching out several other Italian sites that are by no means spam because -fico is a relatively common suffix in that language (compare: wikt:-fico, wikt:scientifico, etc.) I can't think of any other sites straight off the bat but I wouldn't be at all surprised if there were more. I added the link I found to the whitelist. Graham87 15:15, 8 September 2020 (UTC)[reply]

    I'm not the best at regular expressions, but would adding \.fico help reduce the rate of false positives? This seems like the Scunthorpe problem ... Graham87 15:23, 8 September 2020 (UTC)[reply]
    Not to mention Spanish words like científico and filosófico ... Graham87 15:30, 8 September 2020 (UTC)[reply]
    Graham87, I’ve wrapped it in \b, so it can’t be part of a word Dirk Beetstra T C 16:39, 8 September 2020 (UTC)[reply]
    Beetstra I imagine that the original intent was to match it when part of a word, to weed out stuff like buyficoscoresherespamspamspam.com - MrOllie (talk) 16:53, 8 September 2020 (UTC)[reply]
    MrOllie, no, the problem was a set of typosquatting and TLD-changing spammers; I thought that ‘fico’ was uncommon enough to just take it out completely. Dirk Beetstra T C 17:08, 8 September 2020 (UTC)[reply]
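The trade-off in this thread can be illustrated with a small sketch. It assumes that the original entry was the bare string `fico` and that the blacklist wraps each entry in a host-anchoring prefix, roughly `(?:https?:)?//+[a-z0-9_\-.]*(entry)`; neither detail is confirmed on this page. `HOST_PREFIX` and `matches_entry` are hypothetical names, and Python's `re` module stands in for the wiki's regex engine.

```python
import re

# ASSUMPTION: entries are wrapped in a prefix that limits them to the host part.
HOST_PREFIX = r"(?:https?:)?//+[a-z0-9_\-.]*"


def matches_entry(url: str, entry: str) -> bool:
    """Return True if the (assumed) wrapped blacklist entry matches the URL."""
    return re.search(HOST_PREFIX + "(" + entry + ")", url, re.IGNORECASE) is not None


BARE = r"fico"         # assumed original form of the entry
WRAPPED = r"\bfico\b"  # the form after wrapping it in \b
DOTTED = r"\.fico"     # the alternative suggested in the thread

urls = [
    "http://www.filosofico.net/",
    "https://buyficoscoresherespamspamspam.com",
    "https://fico.example.com",
]

for url in urls:
    print(url, matches_entry(url, BARE), matches_entry(url, WRAPPED), matches_entry(url, DOTTED))
```

Under these assumptions the bare form matches all three hosts, including the legitimate filosofico.net. The `\b` form stops matching both filosofico.net and the run-together spam domain, and still matches a host where "fico" stands alone as a label. The `\.fico` form matches none of the three, since none contains ".fico".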