MediaWiki talk:Spam-blacklist

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by JJMC89 (talk | contribs) at 03:34, 23 March 2020 (→‎morninglazziness: Added to Blacklist using SBHandler). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

    MediaWiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found there.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow-up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though it supports only whole domains and not whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot, be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions — the disruption that can be caused is substantial.
    5. Close the request entry on here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 946913522 after you have closed the request. See here for more info on logging.
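
The caution in step 4 about regular expressions can be illustrated with a small sketch. It assumes, as a simplification, that each blacklist line is a regex fragment searched case-insensitively against the full URL; the real extension combines lines into a larger expression, so this only approximates its behaviour. The helper name `is_blocked` and the sample URLs are hypothetical; the domain imgx.in is one requested further down this page.

```python
import re


def is_blocked(entry: str, url: str) -> bool:
    """Approximate a blacklist hit: search the regex fragment in the URL."""
    return re.search(entry, url, flags=re.IGNORECASE) is not None


# Hypothetical sample URLs: one intended target, two innocent bystanders.
target = "https://imgx.in/some-page"
bystander_tld = "https://imgx.info/about"
bystander_dot = "https://imgxxin.example/"

loose = r"imgx.in"        # unescaped dot, no word boundaries
strict = r"\bimgx\.in\b"  # escaped dot, bounded on both sides

for name, entry in (("loose", loose), ("strict", strict)):
    for url in (target, bystander_tld, bystander_dot):
        print(name, url, is_blocked(entry, url))
```

Under these assumptions the loose fragment also hits both bystander URLs, while the strict fragment hits only the intended target, which is the kind of collateral damage the warning is about.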


    Proposed additions

    dailyhunt.in

    There is virtually no reason this site should be used as it's almost exclusively an aggregate publisher and it very often picks up items from unreliable, blackhat SEO "news" sources. example, see the disclaimer at the bottom. In the event that they do publish something as an aggregate that isn't from a non-rs, the rs should just be used. Praxidicae (talk) 19:52, 19 February 2020 (UTC)[reply]
    @Praxidicae: plus Added to MediaWiki:Spam-blacklist. Agreed, a net negative to Wikipedia, sufficiently so that addition invites questions about the good faith of the user linking it. --Guy (help!) 20:13, 19 February 2020 (UTC)[reply]
    I'm late to this discussion, but dailyhunt.in has been a big problem in Indian entertainment articles as they shamelessly aggregate (aka steal) content from all sorts of sites without even bothering to attribute the source, which creates lots of problems when people assume that the unnamed origin source is reliable. I'm glad that it's been blacklisted. Cyphoidbomb (talk) 06:13, 14 March 2020 (UTC)[reply]

    "duleweboffice"

    Shamelessly nicked from User:Praxidicae/fakenews


    This set all belongs to a gmail account "duleweboffice@gmail.com" and several of these sites, including foreignpolicyi.org, were originally legitimate sites; however, they sniped the domain and it has since become an unreliable and frankly garbage spam site (as is the case for the rest, too). Legitimate uses of this link look like: http://www.foreignpolicyi.org/node/17539 and we should see if there is an archived version somewhere. The illegitimate uses look like this and are rather easy to spot (basically anywhere this is used on entertainment, media personalities and media in general is the spam version). The spam variant looks like this: https://foreignpolicyi.org/tanya-nolan-is-becoming-a-hit-with-new-single-love-ya/


    I did some checking. These sites have been abused on Wikipedia, in some cases severely so. It's hard to conclude anything other than SEO involvement. I salute Praxidicae for this hard work. If we blacklist then at least no new links will be added, and old ones will be nuked as the articles are edited. It's a huge job removing them entirely. Lustiger seth, is there any way to write a bot to copy the contents of the blacklist and compile a table with the number of active links on enWP, ideally just in mainspace? We might be able to use that as the basis for a reward system for Wikignomes. Guy (help!) 20:35, 19 February 2020 (UTC)[reply]

    • Just a quick note that I archived a bunch of foreignpolicyi.org links (to the original site) and deleted all traces to the original, so those are fine but should be blacklisted going forward. Same for vermontrepublic. The rest are just plain old spam and can be blacklisted unless we would rather filter as a honeypot. There's another set by the same person/email (duleweboffice) under the name "santosmilewa" (example) and "kravitzcj" (example). I'll make a list of these shortly. They're all operated by the same 3 blackhat SEO firms along with another handful that are using a dead woman's identity (I filed actual reports with the proper agencies about this FWIW), a fake phone number and a fake real life address (it's public, so I'm not disclosing anything out of the ordinary). Anyhow, my lists are kind of a mess right now so I'll throw some together over the next few hours/days that'll make it all easier. Praxidicae (talk) 21:13, 19 February 2020 (UTC)[reply]
      Praxidicae, heroic work, thanks. Guy (help!) 21:19, 19 February 2020 (UTC)[reply]
    @Praxidicae/fakenews: plus Added to MediaWiki:Spam-blacklist. per User:Praxidicae/fakenews/sbl. --Guy (help!) 21:28, 19 February 2020 (UTC)[reply]
    hi!
    regarding the question about the table: it would be possible, but it would take a long time (weeks or months), i guess. (and i would need some time to adapt my scripts. that's probably the bottleneck.) -- seth (talk) 16:56, 23 February 2020 (UTC)[reply]
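
As a rough illustration of what one row of such a table would need, the sketch below builds a request URL for the MediaWiki Action API's `list=exturlusage` module, restricted to mainspace. The function name `build_link_usage_url` is hypothetical, nothing is fetched here, and a real count would also have to follow the API's continuation responses for domains with many links.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_link_usage_url(domain: str, namespace: int = 0,
                         protocol: str = "https", limit: int = 500) -> str:
    """Build an exturlusage query URL for one blacklisted domain.

    namespace=0 restricts results to mainspace (articles).
    """
    params = {
        "action": "query",
        "list": "exturlusage",
        "euquery": domain,
        "euprotocol": protocol,
        "eunamespace": namespace,
        "eulimit": limit,
        "format": "json",
    }
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(build_link_usage_url("dailyhunt.in"))
```

A bot along these lines would issue one such query per protocol and per blacklist entry, which may be part of why the job is estimated to take so long.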
    The domain scholarlyoa.com was Beall's list. Articles on dodgy academic publishing practices are likely to point to archived copies of it. I discovered the problem when trying to revert section-blanking at World Academy of Science, Engineering and Technology; the last good version had links that are now spam-blacklisted. Not being too familiar with how spam-blacklisting works around here, I'm not sure of the best course of action. (Ping JzG.) XOR'easter (talk) 13:29, 25 February 2020 (UTC)[reply]
    XOR'easter, request whitelisting of specific URLs. Are you familiar with that process? I can help if not. Guy (help!) 15:52, 25 February 2020 (UTC)[reply]
    JzG, I'm not sure the links should just be whitelisted, since the site itself is down, probably permanently, and the actual content we should be pointing readers to is in the archived copies. XOR'easter (talk) 20:11, 25 February 2020 (UTC)[reply]
    @JzG: scholarlyoa being blacklisted is really annoying, and prevents many discussions about predatory journals. It should also be limited to non-archived links, since those are the problematic ones, rather than archived links, which refer to the legitimate site back when Jeffrey Beall ran it. Headbomb {t · c · p · b} 07:19, 27 February 2020 (UTC)[reply]
    Headbomb, No, it does not prevent that, it makes it more difficult as you now cannot link to it directly but have to disable the links when discussing them (which is highly annoying). Unfortunately the AbuseFilter is not a reasonable alternative either, it is too heavy handed for this. That is indeed a shortcoming of the spam-blacklist and of the AbuseFilter. Do these discussions happen so often? Dirk Beetstra T C 10:04, 27 February 2020 (UTC)[reply]
    I think that such problems (hijacked domains that once had legitimate use) will become more common as spammers become more sophisticated and the use of the spam blacklist expands. I dunno, is it possible to selectively whitelist archived versions of a blacklisted URL? I know that in its current form the spam blacklist catches archived versions of a blacklisted link as well. Jo-Jo Eumerus (talk) 11:52, 27 February 2020 (UTC)[reply]
    "It does not prevent that." It does. And they happen often enough on journals-related pages, given Beall's importance in that area. This is best implemented as an edit filter, which would not interfere with non-article space, such as talk spaces. Headbomb {t · c · p · b} 14:45, 27 February 2020 (UTC)[reply]
    Headbomb, it does not prevent discussing, it prevents linking to the material directly (which, I totally agree, is completely annoying). The problem here is that there is no other solution: allowing only the archive links also allows archive links to current material, and does so everywhere (which is exactly what you don't want to be linking to). Removing it from the blacklist altogether also allows the current material, and also everywhere (as above). It is a shortcoming of the spam blacklist. It needs to be changed (well, it needed to be changed years ago). Currently your only way forward is either using it in a 'broken' form (if it is talkpage only likely the best way forward), or getting it whitelisted (not very likely to be granted for use on a talkpage). Everything else is something that needs a change to the software. Note also that the edit filter needs to be rather heavy to be less restrictive than the spam-blacklist. Dirk Beetstra T C 12:32, 1 March 2020 (UTC)[reply]
    Headbomb, Note: the link you were all trying to add (which is the only one ever hitting the filter in the whole so many years of history on that page) is one that is used in the article. Hence, it should be whitelisted just to make sure that we do not run into problems in the future. Dirk Beetstra T C 12:39, 1 March 2020 (UTC)[reply]
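
The trade-off described in this exchange can be sketched in a few lines. The sketch assumes that blacklist and whitelist lines are regex fragments searched against the whole URL and that a whitelist hit overrides the blacklist; both fragments, the helper names, and the snapshot timestamps are hypothetical, not the entries actually in use.

```python
import re

# Hypothetical blacklist fragment for the hijacked domain.
BLACKLIST_ENTRY = r"\bscholarlyoa\.com\b"
# Hypothetical whitelist fragment admitting any Wayback Machine snapshot.
BROAD_ARCHIVE_WHITELIST = (
    r"\bweb\.archive\.org/web/\d+/https?://scholarlyoa\.com\b"
)


def hits(fragment: str, url: str) -> bool:
    """True if the regex fragment is found anywhere in the URL."""
    return re.search(fragment, url, flags=re.IGNORECASE) is not None


def is_allowed(url: str) -> bool:
    """Assumed order: a whitelist hit wins, otherwise the blacklist decides."""
    if hits(BROAD_ARCHIVE_WHITELIST, url):
        return True
    return not hits(BLACKLIST_ENTRY, url)


live = "https://scholarlyoa.com/publishers/"
old_snapshot = ("https://web.archive.org/web/20160101000000/"
                "https://scholarlyoa.com/publishers/")
new_snapshot = ("https://web.archive.org/web/20200101000000/"
                "https://scholarlyoa.com/publishers/")

print(hits(BLACKLIST_ENTRY, old_snapshot))  # archived copy still contains the domain
print(is_allowed(live), is_allowed(old_snapshot), is_allowed(new_snapshot))
```

Because the archived URL embeds the original domain, the blacklist fragment alone catches every snapshot; a broad archive whitelist then lets through snapshots of any date, including ones taken after the domain changed hands.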
    Beetstra, And there's always the potential for a removal request if we think it's a false positive or joe-job. Which in this case is entirely possible: predatory publishers hate the site. Guy (help!) 00:12, 11 March 2020 (UTC)[reply]
    JzG, the spam-blacklist is to protect Wikipedia, it does not matter if things are Joe Jobbed, genuine spam, or through community consensus decided to be bad. Dirk Beetstra T C 07:03, 11 March 2020 (UTC)[reply]
    Beetstra, sure, but there's precedent for removal when the spamming was done with the deliberate intent of getting a site blacklisted. And in this case that is entirely plausible. Guy (help!) 08:46, 11 March 2020 (UTC)[reply]
    I have found my way here because one of the news sites (fin24.com) I have used for years now I have found to be suddenly black listed. It is regarded as a reliable source in South Africa (where it is based) so I was surprised to find it here. Just wondering if I should try and get that source white listed?--Discott (talk) 11:01, 11 March 2020 (UTC)[reply]

    WikiLeaks

    I came across some citations to WikiLeaks. That seems like a really bad idea: pretty much by definition the material they host is in violation of copyright. Guy (help!) 13:05, 20 February 2020 (UTC)[reply]

    As I understand it, the material they host was produced by governments and is not copyrighted.
    There is a possible problem in linking to information that governments consider classified. When I was a defense department employee, we couldn't even look at Wikileaks (even on personal time) due to the danger of being exposed to classified information we weren't cleared to know, which is a serious thing if you're in government. It was a weird situation where the public could do what they wanted but those of us in government service had restrictions. That was years ago; I don't know how they handle it these days.
    To the extent that government documents are reliable sources, citing such documents on Wikileaks should not be a problem if that's the only venue where they can be seen. ~Anachronist (talk) 05:56, 24 February 2020 (UTC)[reply]
    Anachronist, I'm not sure I understand .. is material from a government not copyrighted? I would expect that the organisation (not the individual that wrote it) holds the copyright.
    Though I agree that some of the material can be a reliable source, there is also not a necessity to have a working link to the information (if too much of the info is problematic linking to). Dirk Beetstra T C 06:01, 24 February 2020 (UTC)[reply]
    @Beetstra: I'll answer your question with the lead sentence of our article Copyright status of works by the federal government of the United States. If the communique, document, or other work was written by a government employee, it isn't subject to domestic copyright, but if the work was written by a contractor the situation is muddier. I'd wager that most of the documents on Wikileaks are generated by governments (largely the US government) and therefore not subject to copyright.
    I oppose blacklisting Wikileaks, but if we don't, then citations to it would have to be examined on a case-by-case basis. ~Anachronist (talk) 17:02, 24 February 2020 (UTC)[reply]
    Anachronist, not unless you consider material stolen from the DNC's email servers by the Russians to be "produced by governments". Also British government materials are Crown copyright. So there's absolutely no guarantee. And work product is exempt, I believe. Guy (help!) 19:40, 24 February 2020 (UTC)[reply]
    The DNC stuff isn't produced by governments, of course. I'm thinking more of US military messages, diplomatic communiques, stuff that Chelsea Manning released, and so on. I'm skeptical that government work products are exempt. There's legitimate material in there, and as I said, the citations would need to be examined on a case-by-case basis.
    I note that [link search] reveals an extremely low percentage of Wikileaks links in main article space. Most of them appear to be on talk pages and Wikipedia namespace. I wish the linksearch feature had a filter to show only mainspace pages. Glancing through it, there don't seem to be many articles actually citing Wikileaks. ~Anachronist (talk) 04:33, 25 February 2020 (UTC)[reply]
    Anachronist, did you look through wikileaks.org HTTPS links HTTP links? Guy (help!) 15:54, 25 February 2020 (UTC)[reply]
    Cool. I didn't know about that search parameter. I stand corrected. :) ~Anachronist (talk) 17:13, 25 February 2020 (UTC)[reply]

    Vietnamese websites

    New links are appearing as fast as I can remove them. NinjaRobotPirate (talk) 23:26, 7 March 2020 (UTC)[reply]

    NinjaRobotPirate
    Let's see if we have all links by these users Dirk Beetstra T C 05:21, 8 March 2020 (UTC)[reply]

    spdload.com / webspero.com

    Recurring spam for marketing sites / PR blogs, multiple warnings for each. GermanJoe (talk) 10:34, 10 March 2020 (UTC)[reply]

    @GermanJoe: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 10:37, 10 March 2020 (UTC)[reply]

    thehinduopinion.com

    Minor blog spam, but deceptive impersonation of the newspaper and manipulating of existing source links. GermanJoe (talk) 17:17, 13 March 2020 (UTC)[reply]

    @GermanJoe: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 17:18, 13 March 2020 (UTC)[reply]

    imgx.in

    Though I just happened upon this spam today, there has been a very clear slow-burn effort to spam this site at Wikipedia. Behold the majority of what I reverted recently: imgx.in spam

    • 13 December 2019 - 49.207.140.124 - [1]
    • 14 January 2020 - 49.207.132.52 - [2]
    • 16 January 2020 - 49.206.127.89 - [3]
    • 17 January 2020 - 49.205.77.115 - [4]
    • 18 January 2020 - 49.205.78.118 - [5]
    • 22 January 2020 - 49.207.141.114 - [6]
    • 25 January 2020 - 49.207.131.67 - [7]
    • 25 January 2020 - 49.205.79.220 - [8]
    • 26 January 2020 - 183.83.154.167 - [9]
    • 27 January 2020 - 49.207.143.58 - [10]
    • 29 January 2020 - 49.207.142.210 - [11]
    • 30 January 2020 - 183.83.152.194 - [12]
    • 31 January 2020 - 49.207.138.65 - [13]
    • 2 February 2020 - 49.207.134.250 - [14]
    • 2 February 2020 - 49.207.128.232 - [15]
    • 7 February 2020 - 49.207.136.32 - [16]
    • 9 February 2020 - 183.83.154.105 - [17]
    • 11 February 2020 - 49.207.134.97 - [18]
    • 13 February 2020 - 49.207.131.192 - [19]
    • 14 February 2020 - 49.206.125.120 - [20]
    • 15 February 2020 - 183.83.154.112 - [21]
    • 16 February 2020 - 183.83.154.112 - [22]
    • 17 February 2020 - 49.207.139.89 - [23]
    • 23 February 2020 - 49.207.138.170 - [24]
    • 24 February 2020 - 183.83.154.165 - [25]
    • 24 February 2020 - 183.83.154.165 - [26]
    • 27 February 2020 - 49.207.131.195 - [27]
    • 4 March 2020 - 49.206.125.143 - [28]
    • 7 March 2020 - 49.207.130.177 - [29]
    • 7 March 2020 - 49.207.130.177 - [30]
    • 8 March 2020 - 49.207.134.144 - [31]
    • 9 March 2020 - 49.206.125.95 - [32]
    • 11 March 2020 - 49.207.141.116 - [33]

    Obviously the domain should be blacklisted, but if someone wants to figure out any rangeblocks, that's not exactly my specialty, so I'm grateful in advance. Thanks, Cyphoidbomb (talk) 06:23, 14 March 2020 (UTC)[reply]
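
For anyone wanting to explore the rangeblock question, here is a minimal sketch using Python's standard `ipaddress` module. It groups a handful of the IPv4 addresses listed above by their /16 and finds the smallest single network covering each group. The function names are hypothetical, the sample is only a subset of the list, and a covering range says nothing about how many uninvolved users share it.

```python
import ipaddress
from collections import defaultdict


def covering_network(addresses):
    """Smallest single network containing every address in the list."""
    ips = sorted(ipaddress.ip_address(a) for a in addresses)
    lowest, highest = ips[0], ips[-1]
    net = ipaddress.ip_network(f"{lowest}/{lowest.max_prefixlen}")
    # Widen one prefix bit at a time until the highest address fits.
    while highest not in net:
        net = net.supernet()
    return net


def group_by_slash16(addresses):
    """Group IPv4 addresses by /16, then compute a covering network per group."""
    groups = defaultdict(list)
    for a in addresses:
        key = ipaddress.ip_network(f"{a}/16", strict=False)
        groups[key].append(a)
    return {key: covering_network(members) for key, members in groups.items()}


# A subset of the addresses reported above.
sample = [
    "49.207.140.124", "49.207.128.232", "49.207.143.58",
    "49.205.77.115", "49.205.79.220",
    "183.83.152.194", "183.83.154.167",
]

for slash16, net in group_by_slash16(sample).items():
    print(slash16, "->", net)
```

For this subset the covering networks come out as 49.207.128.0/20, 49.205.76.0/22 and 183.83.152.0/22; whether blocking any of them is proportionate is a separate judgment.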

    @Cyphoidbomb: plus Added to MediaWiki:Spam-blacklist. These IPs only spammed the imgx.in domain, and the blacklist will prevent this in the future. I don't think a range block is necessary unless the IPs start spamming other domains. I've also upgraded Imgxbot from a soft block to a hard block. Thanks for reporting this. — Newslinger talk 08:11, 14 March 2020 (UTC)[reply]
    Cyphoidbomb, I hope you did not manually compile above list ... please ask for a COIBot report next time you see something like this. Dirk Beetstra T C 08:46, 15 March 2020 (UTC)[reply]
    @Beetstra: Sigh... I did. Cyphoidbomb (talk) 23:01, 16 March 2020 (UTC)[reply]
    Cyphoidbomb, :-) next time report the domain here in the LinkSummary template, wait until COIBot has saved/refreshed the report (in the template: 'Reports: .... COIBot') .. Dirk Beetstra T C 05:01, 17 March 2020 (UTC)[reply]
    Beetstra Y'know, I *knew* that, I just don't have an explanation other than being a total shit-for-brains. Cyphoidbomb (talk) 00:07, 18 March 2020 (UTC)[reply]
    Cyphoidbomb, WP:CIR-related block needed? :-p Dirk Beetstra T C 05:28, 18 March 2020 (UTC)[reply]

    essaycorp.co.uk

    Spammed on a couple of articles, appears to be an essay-selling website. No reason to be linking to this. creffett (talk) 13:45, 14 March 2020 (UTC)[reply]

    @Creffett: plus Added to MediaWiki:Spam-blacklist. I also blocked Euricana indefinitely. Thanks for reporting this. — Newslinger talk 17:04, 14 March 2020 (UTC)[reply]

    successpanachahtehai.com

    Spammed on multiple articles by one account (reported as promo-only account to AIV), looks like it's here to sell us something and doesn't appear to have any redeeming value. creffett (talk) 03:07, 15 March 2020 (UTC)[reply]

    @Creffett: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 06:03, 15 March 2020 (UTC)[reply]

    techpassionworld.com

    Recurring blog spam from dynamic IPs, including already partially-blocked LTA IP. GermanJoe (talk) 09:30, 15 March 2020 (UTC)[reply]

    @GermanJoe: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 09:30, 15 March 2020 (UTC)[reply]

    websitestrategies.com.au

    SEO spammer via dynamic IPs, two final warnings have been ignored. GermanJoe (talk) 17:09, 15 March 2020 (UTC)[reply]

    @GermanJoe: plus Added to MediaWiki:Spam-blacklist. --GermanJoe (talk) 17:10, 15 March 2020 (UTC)[reply]

    equifax.cf

    Directly related to the already blacklisted Wikipedia:WikiProject Spam/LinkReports/creditkarma.cf and Wikipedia:WikiProject Spam/LinkReports/creditskarma.cf (where similar hijacking was observed). Reported on ELN by user:Gbear605. Pinging WhatamIdoing. Will immediately blacklist. --Dirk Beetstra T C 10:25, 16 March 2020 (UTC)[reply]

    @Gbear605 and WhatamIdoing: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 10:26, 16 March 2020 (UTC)[reply]
    Thanks. It looks like there are no links in articles, so we're set for now. WhatamIdoing (talk) 17:59, 16 March 2020 (UTC)[reply]

    chiropractornearmereviews.com

    Per Wikipedia:WikiProject_Spam/LinkReports/chiropractornearmereviews.com - spam links for even spammier content. Guy (help!) 12:36, 17 March 2020 (UTC)[reply]

    plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 12:38, 17 March 2020 (UTC)[reply]

    discount-24hour.blogspot.com

    Spammer. plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 16:16, 17 March 2020 (UTC)[reply]

    flicktokick.wordpress.com

    Spam links. Cleaning up now. Some additions go back years. Guy (help!) 16:29, 17 March 2020 (UTC)[reply]

    Cleaned and plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 16:33, 17 March 2020 (UTC)[reply]

    Honestbussinessman24 and market-mirror

    I came across these two services while looking over a cryptocurrency article's sources. Both seemingly lack any sort of editorial control, and both advertise paid article writing services. Honestbussinessman24 even describes itself as Market-mirror in its "About" section. SamHolt6 (talk) 22:56, 17 March 2020 (UTC)[reply]

    @SamHolt6: plus Added to MediaWiki:Spam-blacklist. --Guy (help!) 18:16, 18 March 2020 (UTC)[reply]

    lyricspandits.blogspot.com

    Copyright-violating blogs

    Spam blog. Guy (help!) 18:16, 18 March 2020 (UTC)[reply]

    @JzG: plus Added to MediaWiki:Spam-blacklist. I guess we should have a relatively low threshold for these. --Dirk Beetstra T C 05:42, 19 March 2020 (UTC)[reply]

    sastedeal.com

    Per Wikipedia:WikiProject_Spam/Local/sastedeal.com, persistent addition by IPs, and the links to this site are blatant spam. creffett (talk) 03:04, 21 March 2020 (UTC)[reply]

    @Creffett: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 19:02, 21 March 2020 (UTC)[reply]

    tripraja.com

    Spammed by a variety of IPs and single-purpose users over a long period of time. Appears to be a travel sales website, so don't think it has any redeeming encyclopedic value. creffett (talk) 13:23, 21 March 2020 (UTC)[reply]

    @Creffett: Handled on meta. --Dirk Beetstra T C 19:01, 21 March 2020 (UTC)[reply]

    unearthedarcana.com

    This is basically an SEO spam blog with stolen/copyrighted Dungeons & Dragons content and affiliate links so they earn a commission on sales. It'll probably never be usable as a source so there's no encyclopedic value here. Saireddy9666 added links to Dungeons & Dragons three times and was reverted each time, then apparently returned as Jakkidirajashakerreddy to spam again. (Note that each username includes "reddy".) Woodroar (talk) 15:21, 21 March 2020 (UTC)[reply]

    @Woodroar: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 18:58, 21 March 2020 (UTC)[reply]

    .shop TLD

    • Regex requested to be blacklisted: \b[_\-0-9a-z]+\.guru\b

    As we did with the .guru top-level domain, I am wondering if we should consider blacklisting the .shop TLD. I just went through the link search list and removed .shop domains from every main space article containing them (leaving a few in which the .shop site is the actual website of the article subject).

    There weren't many instances of this, but that may be because .shop is a fairly new TLD, I reckon less than 4 years old. I don't really see strong evidence of abuse except in one case where innodot.shop kept getting added to skeleton watch (that IP user is now blocked); the others I removed were either one-offs, or the domain no longer pointed at anything relevant.

    Any thoughts? Blacklisting a TLD is a big deal. I believe we did the right thing with .guru, and even though .shop isn't a popular domain for spamming yet, the nature of it gives it a higher potential for abuse than .guru, possibly. ~Anachronist (talk) 22:25, 21 March 2020 (UTC)[reply]

    @Anachronist: I am first going to feed this to XLinkBot (references in next addition). Let's see where this gets reverted and review this in a week. If we see a number popping up and a couple of reversions we should likely consider blacklisting. Can you indicate how many you removed? plus Added to User:XLinkBot/RevertList. --Dirk Beetstra T C 06:56, 22 March 2020 (UTC)[reply]
    @Anachronist: plus Added to User:XLinkBot/RevertReferencesList. --Dirk Beetstra T C 06:57, 22 March 2020 (UTC)[reply]
    @Beetstra: Good. It is unlikely that we'll see any activity in a week, so XLinkBot is the best place for now. ~Anachronist (talk) 17:05, 22 March 2020 (UTC)[reply]
    @Beetstra: Regex requested to be blacklisted: \.shop\b may result in false positives. It should be the regex we're using for .guru. I have corrected it above. ~Anachronist (talk) 17:09, 22 March 2020 (UTC)[reply]

    mymotivationalsupport.com

    Site has been spammed by probably one person IP hopping, not even close to a good reference. Ravensfire (talk) 15:44, 22 March 2020 (UTC)[reply]

    plus Added to MediaWiki:Spam-blacklist. Recurring issue since 2019. --GermanJoe (talk) 19:21, 22 March 2020 (UTC)[reply]

    uplaw.us

    Added as a "ref" by a couple of IPv6 ranges to various law-related articles, the links are basically to services uplaw.us offers related to the article topic. creffett (talk) 23:16, 22 March 2020 (UTC)[reply]

    @Creffett: plus Added to MediaWiki:Spam-blacklist. Systematic IP spam since 2019. --GermanJoe (talk) 23:30, 22 March 2020 (UTC)[reply]

    More spam blogs

    morninglazziness

    Users

    Please blacklist.-KH-1 (talk) 03:18, 23 March 2020 (UTC)[reply]

    @KH-1: Last two accounts blocked. plus Added to MediaWiki:Spam-blacklist. — JJMC89(T·C) 03:34, 23 March 2020 (UTC)[reply]

    Proposed removals

    econlib.org

    I have added a reference to an article from a 1945 economic journal in the pricing signal wiki article. I used the journal template, I then added a URL to the site that hosted the article. The website that hosts the article appears to be blacklisted. I just removed the URL, since the reference is valid and points to a historic article that exists outside the web.

    My goal was to find the reason for deletion. My initial impression was that perhaps econlib had credibility issues, maybe there had been an edit war, or it was recurrently added with poor regard to Wikipedia's standards.

    However, what I found in the archives is a very weak reason for blacklisting:

    "All appear to have been added by Vipul (talk · contribs · logs · edit filter log · block log), Riceissa (talk · contribs · logs · edit filter log · block log) or other paid surrogates of Vipul, most are selling legal services related to immigration, and the overall conclusion is SEO. All valid content can be drawn from more authoritative sources such as law books, pages in law faculty websites, official government sources etc. Guy (Help!) 23:24, 10 March 2017 (UTC)"

    Another user comments: "I think econlib.org needs to be removed. This is a legitimate, relatively prominent Economics blog where relatively prominent economists(Sumner, Bryan Caplan) discuss current issues in econ. Dark567 (talk) 01:31, 11 March 2017 (UTC) "

    The concern was heard but nothing was done about it.

    This single user resulted in the blacklist of 8 or 10 sites and econlib appeared to be caught in the fire. I'm not going to bother searching what the particular edits were.

    In addition to removing the individual site from the blacklist, there must be other false positives, so I propose some systemic improvements that can be made.

    First I'll list the steps I took to try to solve this issue:

    1. Check in both the local and global blacklists.
    2. navigate through the archives to find the reason for the ban.
    3. Post here about it.

    Between steps 2 and 3, finding the relevant edits would be useful for determining whether the ban was warranted, but doing so is significantly difficult. So far I would like to see a search tool that:

    • looks in both local and global banlists
    • finds the reason for blacklisting and links the user towards it.
    • is triggered upon an edit that includes the url, and shows the reason to the user.
    • Remove the restriction from talk pages, since it makes it hard to even discuss these blacklists. (it's /library/Essays/hykKnw.html )
    • log attempted edits that include this url. If 1 user made a low quality or vandal edit to wikipedia with that link, but 50 users attempted to make 50 good edits with that link, the url should automatically be suggested for removal.
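
The last bullet can be sketched as a simple threshold rule. This is only an illustration of the proposal, not an existing feature: the function name, the input shape, and the default ratio of 50 (taken from the "1 user versus 50 users" example) are all hypothetical, and it assumes that someone has already classified attempts as abusive or good-faith, which the proposal does not specify how to do.

```python
def suggest_removals(abusive_users, good_faith_users, min_ratio=50):
    """Return domains whose good-faith attempts far outnumber known abusers.

    abusive_users / good_faith_users: dicts mapping a domain to the set of
    usernames in each (pre-classified) category.
    """
    suggestions = []
    for domain, good in good_faith_users.items():
        bad = len(abusive_users.get(domain, set()))
        # Treat "no known abusers" as one, so a single stray attempt
        # never triggers a suggestion on its own.
        if len(good) >= min_ratio * max(bad, 1):
            suggestions.append(domain)
    return sorted(suggestions)


# Hypothetical log data using reserved example domains.
abusive = {"example.org": {"UserA"}}
good_faith = {
    "example.org": {f"Editor{i}" for i in range(50)},
    "spam.example": {"Editor0"},
}

print(suggest_removals(abusive, good_faith))
```

With this made-up data only example.org crosses the threshold; the output would still be a suggestion for human review, not an automatic removal.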

    The benefits would be:

    • Decreasing references without free access to the cited content.
    • Decreasing biases in wikipedia.
    • Decreasing the amount of erroneously rejected edits.
    • Better communicating to users why their edit was rejected.

    The costs of this edit would be:

    • Development time, I'm free and have the skills to implement this.
    • Admin time, this might increase the maintenance of the blacklist.

    So if I get backing from those currently responsible for going through this list, I can post this suggestion to the appropriate section and start working on it. — Preceding unsigned comment added by TZubiri (talk • contribs) 23:06, 8 March 2020 (UTC)[reply]

    @TZubiri: again, a) there is a direct connection between the paid editor you are talking about and Bryan Caplan, and b) by far most of the material is replaceable (outside of the (already whitelisted) encyclopedia there will be very, very few exceptions, and most of the cases we have seen until now are replaceable with links to e.g. WikiSource). Repeated requests on the whitelist have shown the latter.
    The material that was the problem has been deleted, and is visible only to admins. It has however been explained over and over. See my first paragraph. Removing this needs a consensus (which you are free to gather) showing that the site is absolutely needed. The benefits you are quoting are hardly true:
    • most references have free access to the cited content, if it is freely hosted on econlib, it is likely freely hosted elsewhere as it is out of copyright protection (up to, often, WikiSource).
    • that argument is totally useless. The use of references is not affecting the bias. And as you have referenced it now it is totally unbiased and properly referenced. Even better, it is plainly referenced to the official source of the information. Everyone can find the reference if they want to.
    • The edits are not erroneously rejected, someone with a vested interest was editing this, it is rightfully blacklisted.
    • You just have to ask. It is less than 6 hours and you have an answer.
    There is not a lot to develop, we have the searchbox above, and this track that shows you where this was discussed and gives above explanation several times. And hence, it does not increase admin time.
    no Declined,  Defer to Whitelist for specific links on this domain (but this one is freely accessible elsewhere, even on 'neutral' servers, so I would not bother). --Dirk Beetstra T C 06:07, 9 March 2020 (UTC)[reply]

    Thanks for looking into this. I wasn't aware of the direct link of the infractions to the site owners. Since in this case econlib is functioning as a content host for a widely available primary source, I linked to an alternate host. I still think the user interface can be improved: perhaps a summarized reason for the blacklisting could be provided in the rejection message, along with the search results of the searchbox you reference in case users want to dig deeper. You'll have to excuse users who miss this, but there's an overload of information for regular users. I'm interested in your perspective on this idea:

    What do you think would be a good message for users to see when they link to econlib?
    If you had the capacity to do so, would you send different messages to users depending on whether econlib is being cited as a primary or secondary source?
    

    Out of curiosity, is it technically possible to blacklist a website as a secondary source but still allow it to work as a host for primary sources? --TZubiri (talk) 19:19, 9 March 2020 (UTC)

    TZubiri, not using this blacklist, no. Guy (help!) 00:09, 11 March 2020 (UTC)
    TZubiri, except for the encyclopedia there is hardly anything there that needs to be linked. Yes, there is a lot of material in the library that is suitable as primary or secondary source, but, again, it is practically ALL available on 'neutral' websites (up to WikiSource). Whitelisting can take care of the rest, but I do not recall having seen any requests that pass that bar.
    No, the spam-blacklist is black-and-white. That is something that should have changed (and that request is basically 14 years old), but that is up to the WMF. Anyway, I don't think that the software should make the distinction 'oh, it is used as a secondary source, so it is fine to link to this spammed site'. Also, yes, it would be nice to have the possibility of more 'custom' messages on domains. That would be possible, but again, that needs the WMF. Dirk Beetstra T C 07:22, 11 March 2020 (UTC)
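
    The "black-and-white" behaviour described above can be illustrated with a minimal sketch. It assumes a simplified model in which blacklist and whitelist entries are regex fragments tested against the URL alone; the domain, the patterns and the function names are hypothetical and are not taken from the actual extension or from the real lists.

```python
import re

# Hypothetical entries written in the style of blacklist lines (regex fragments).
BLACKLIST = [r"\bspam\.example\b"]
# Hypothetical whitelist entry exempting one section of that domain.
WHITELIST = [r"\bspam\.example/library/Enc/"]


def compile_rules(fragments):
    """Join regex fragments into one case-insensitive pattern (None if empty)."""
    if not fragments:
        return None
    joined = "|".join(f"(?:{fragment})" for fragment in fragments)
    return re.compile(joined, re.IGNORECASE)


def is_blocked(url, blacklist=BLACKLIST, whitelist=WHITELIST):
    """Return True if the URL matches the blacklist and is not whitelisted.

    The only input is the URL string, so a check of this kind has no way
    of knowing whether the link is cited as a primary or secondary source.
    """
    white = compile_rules(whitelist)
    if white is not None and white.search(url):
        return False
    black = compile_rules(blacklist)
    return black is not None and black.search(url) is not None
```

    In this model the only available refinement is a whitelist pattern for specific URLs, which matches the advice given in this thread to request whitelisting for individual links.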

    scholarlyoa.com

    This has got to be the most annoying blacklisting possible. This is a hijacked site, but it (and its archived versions) is extensively used everywhere on Wikipedia when discussing predatory journals, open access, and Beall's list (see https://en.wikipedia.org/wiki/Special:LinkSearch/*.scholarlyoa.com). Whenever someone vandalizes a page, or whitewashes an article, or tries to archive a discussion with those links in (of which there are several), the blacklist is tripped. This should be removed from the BL, because it causes way more headaches than it solves. I can't even make whitelist/blacklist removal requests without triggering the damned thing, hence the spaced version.

    Headbomb {t · c · p · b} 15:16, 16 March 2020 (UTC)

    @Headbomb: Declined. This is basically 2 hits in 3 days/1000 hits on a link that is now whitelisted (and the count over 4000 hits/10 days is basically 3 hits in mainspace and 2 discussions). NO archiving faults (which can easily be resolved), NO use in discussions (which I agree can be annoying if you can’t link, but hey, you maim the link and everyone knows where you found the info and can go there with a little bit more effort). So not “several”. Use in mainspace is limited; it boils down to a couple of pages, not widespread use. Nothing that the whitelist can’t handle (and is generally then quickly resolved as you may have noticed), and the annoyance is clearly minimal. Defer to Whitelist for the rest. --Dirk Beetstra T C 19:04, 16 March 2020 (UTC)
    If there are only two hits in 3 days, that's clearly proof that it does not need to be blacklisted. And the archives are still broken since the last time, because you can't repair them because of the blacklist. Headbomb {t · c · p · b} 19:13, 16 March 2020 (UTC)
    I agree with Headbomb here. Right now, the blacklist would prevent the reversion of vandalism on Abstract and Applied Analysis, Academic journal publishing reform, Aging (journal), Altmetrics, Beall's List, Chinese Chemical Letters, Clinical Practice, Corruption in Canada, Entropy (journal), Epistemologia, Experimental & Clinical Cardiology, Frontiers Media, Future Medicine, Google Scholar, Hindawi Publishing Corporation, Imaging in Medicine, Index Copernicus, International Archives of Medicine, International Journal of Advanced Computer Technology, International Journal of Clinical Rheumatology, Jeffrey Beall, Journal of Cosmology, Journal of Medical Internet Research, Journal of Natural Products, List of academic databases and search engines, List of confidence tricks, MDPI, Mega journal, Neuropsychiatry (journal), Nova Science Publishers, OMICS Publishing Group, Pattern Recognition in Physics, Plastic Surgery (journal), Polonnaruwa (meteorite), Predatory publishing, Pulsus Group, Redalyc, SciELO, Scientific Research Publishing, Sylwan, The Scientific World Journal, The Veliger, Vanity press, Who's Afraid of Peer Review? and Wulfenia (journal). I've had to deal with this recently, and it's quite exasperating. That's not counting its use in Talk and User-talk pages. XOR'easter (talk) 20:00, 16 March 2020 (UTC)
    And also prevent the upgrading of old links to the archived versions. Headbomb {t · c · p · b} 20:01, 16 March 2020 (UTC)

    Headbomb, yes, the links are intentionally broken, you can still copy-paste them. Anyway, that is a technical problem with the Spam-blacklist that has NOT been resolved by the WMF for 14 years.

    Wow, not hitting in 3 whole days .. I deal with spammers who come and return for over 10 years and are continuously trying to link and/or get their blacklisted links removed.

    XOR'easter, no, it does NOT hamper reversion of vandalism (except in some very rare cases). Anyway, for genuine use, as for the archive-links, they should be whitelisted so that in those rare cases where vandalism cannot be reverted it is solved for the future. The use on a mere fraction of our pages (not counting talkpages, maybe 60 out of 6 million ..?) is not a reason to remove; there must be widespread use of the links.

    Get those cases whitelisted. --Dirk Beetstra T C 12:12, 17 March 2020 (UTC)

    It hampered my reversion of vandalism on World Academy of Science, Engineering and Technology; I had to do cutting and rewriting for what should have been a one-click fix. XOR'easter (talk) 16:09, 17 March 2020 (UTC)
    Likewise it hampered my reversion of whitewashing at Allied Academies. There clearly is both widespread and legitimate use of this domain, which by far supersedes spamming efforts. As for "the links are intentionally broken, you can still copy-paste them": no, you can't, because they are broken. They should also be working links, because they point to the intended pages, and aren't spam. That this issue is "technical" in nature is irrelevant; it's an issue, period, because you are blacklisting a site that should not be blacklisted. If you want to prevent this type of spam, we have edit filters for this (set to warn, at least for editconfirmed+, so humans can go through them). Headbomb {t · c · p · b} 16:32, 17 March 2020 (UTC)
    Headbomb, edit filters are not suitable for blocking external links: a) the AbuseFilter is way too heavy on the servers for that, b) that extension is just a bit less outdated than the spam-blacklist extension, and c) being edit-confirmed is not a reason to stop spamming or using links wrongly (I've once trouted an admin for deliberately linking to a copyright violation). Dirk Beetstra T C 17:35, 17 March 2020 (UTC)

    Both could have been rolled back and that would not have triggered the spam-blacklist. And still, get them whitelisted, which is a more permanent solution. 10 ppm is 'widespread use' .. I guess we will see a massive influx of whitelist requests then. --Dirk Beetstra T C 17:32, 17 March 2020 (UTC)

    "Both could have been rolled back and that would not have triggered the spam-blacklist." Patently false, because that's exactly what I did to trigger the blacklist. "edit-confirmed is not a reason to stop spamming or using links wrongly." True. But this forgets that these links are neither used wrongly, nor are spam to begin with. Nor are they being spammed on a scale that requires use of the blacklist. Headbomb {t · c · p · b} 18:17, 17 March 2020 (UTC)
    Headbomb, I agree and have commented the entry out for now. Guy (help!) 18:23, 17 March 2020 (UTC)
    @JzG: thanks. Good to see that someone finally sees reason on this. Headbomb {t · c · p · b} 18:24, 17 March 2020 (UTC)
    Headbomb, We all did from the outset. It's a question of balancing competing interests. In my view, this tips on the side of removal because I strongly suspect a joe-job. Guy (help!) 21:09, 17 March 2020 (UTC)
    Headbomb, if you could not roll back then there is something wrong, as that was possible and should be possible. That needs to be tested and reported if that is not possible anymore. Dirk Beetstra T C 23:39, 17 March 2020 (UTC)

    Logging / COIBot Instructions

    Blacklist logging

    Full instructions for admins


    Quick reference

    For Spam reports or requests originating from this page, use template {{/request|0#section_name}}

    • Example: {{/request|213416274#Section_name}}
    • Insert the oldid (here 213416274), a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.

    For Spam reports or requests originating from Wikipedia_talk:WikiProject_Spam use template {{WPSPAM|0#section_name}}

    • Example: {{WPSPAM|182725895#Section_name}}
    • Insert the oldid (here 182725895), a hash "#", and the Section_name (Underscoring_spaces_where_applicable).
    • Use within the entry log here.
    Note: If you do not log your entries, they may be removed if someone appeals the entry and no valid reasons can be found.
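
    The two log formats above follow the same pattern (template name, oldid, "#", underscored section name), so they can be assembled mechanically. The following is a minimal sketch of that string construction; the function name log_reference and its parameters are hypothetical helpers, not part of any existing tool.

```python
def log_reference(oldid, section_name, template="/request"):
    """Build a log entry reference such as {{/request|213416274#Section_name}}.

    oldid        -- revision id of the page holding the request
    section_name -- section heading; spaces are replaced by underscores
    template     -- "/request" for this page, "WPSPAM" for WikiProject Spam
    """
    anchor = section_name.strip().replace(" ", "_")
    return "{{" + f"{template}|{oldid}#{anchor}" + "}}"


if __name__ == "__main__":
    print(log_reference(213416274, "Section name"))
    print(log_reference(182725895, "Section name", template="WPSPAM"))
```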

    Addition to the COIBot reports

    The lower list in the COIBot reports now has four numbers in brackets after each link (e.g. "www.example.com (0, 0, 0, 0)"):

    1. first number: how many links this user added (the same after each link)
    2. second number: how many times this link was added to Wikipedia (for as far as the linkwatcher database goes back)
    3. third number: how many times this user added this link
    4. fourth number: to how many different Wikipedias this user added this link.

    If the third number or the fourth number is high with respect to the first or the second, then that means that the user has at least a preference for using that link. Be careful with other statistics from these numbers (e.g. a good user who adds a lot of links). If there are more statistics that would be useful, please notify me, and I will have a look if I can get the info out of the database and report it. This data is available in real-time on IRC.
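
    As an illustration of how the four numbers can be read, here is a minimal sketch that parses an entry in the "www.example.com (0, 0, 0, 0)" form and compares the third number with the first and the second. The field names, the regular expression and the helper functions are assumptions made for this example; the sketch sets no threshold for what counts as "high", since none is defined above.

```python
import re
from typing import NamedTuple


class LinkStats(NamedTuple):
    link: str
    user_links: int           # first number: links added by this user
    link_additions: int       # second number: times this link was added
    user_link_additions: int  # third number: times this user added this link
    wikis: int                # fourth number: Wikipedias this user added it to


ENTRY = re.compile(r"^\s*(\S+)\s*\((\d+),\s*(\d+),\s*(\d+),\s*(\d+)\)\s*$")


def parse_entry(text):
    """Parse one 'link (a, b, c, d)' entry into a LinkStats tuple."""
    match = ENTRY.match(text)
    if match is None:
        raise ValueError(f"not a report entry: {text!r}")
    link, *numbers = match.groups()
    return LinkStats(link, *(int(n) for n in numbers))


def preference_ratios(stats):
    """Return (third/first, third/second); 0.0 when a denominator is zero."""
    def ratio(part, whole):
        return part / whole if whole else 0.0
    return (ratio(stats.user_link_additions, stats.user_links),
            ratio(stats.user_link_additions, stats.link_additions))
```

    Ratios close to 1 mean that most of the user's link additions were this link, or that most additions of this link came from this user. As noted above, such figures still need human judgement.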

    Poking COIBot

    When {{LinkSummary}}, {{UserSummary}} and/or {{IPSummary}} templates are added to WT:WPSPAM, WT:SBL, WT:SWL or User:COIBot/Poke (the latter for privileged editors), COIBot will generate link reports for the domains, and user reports for users and IPs.


    Discussion

    I have some words to say: I tried to add two sources, but there's an error due to this 'Spam blacklist'. How should I settle this? — Preceding unsigned comment added by Manwë986 (talk • contribs) 14:04, 27 February 2020 (UTC)

    COIBot suspicious local reports

    (crossposted to WT:WPSPAM)

    m:User:LiWa3 is doing basic statistics on domains that it has seen being added, and when those statistics are suspicious it throws those in the general direction of COIBot. COIBot is then reporting those in a local category or a xwiki category, depending on the type of statistics.

    COIBot has been saving reports for years, and most of those are still lingering (COIBot closes some automatically when they have been cleaned up independently, but with so many it does not check all). I evaluated a good handful of the old ones and closed them, and I have been trying to keep up with some of the new ones for a week. I do find that a significant portion of them need a follow-up (most need cleanup, quite a few outright blacklisting).

    May I ask you to turn on category changes in your watchlist, watchlist Category:Open Local COIBot Reports, and evaluate all that COIBot is opening in there? Please try to close them with an evaluation remark for further reference. Thanks. --Dirk Beetstra T C 06:08, 8 March 2020 (UTC)