
MediaWiki talk:Spam-blacklist: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
m Archiving 2 discussion(s) to MediaWiki talk:Spam-blacklist/archives/October 2021 (bot)
add currency.com
Line 134: Line 134:
*{{spamlink|news-24.fr}}
:Seems to be a fake ref to make people believe it is [[France 24]], as with fr24news.com (discussed [[Wikipedia:Reliable_sources/Noticeboard#Fr24_News,_and_synonym-spam_sites_in_general|here]] on the RS noticeboard). —[[User:AFreshStart|AFreshStart]] ([[User talk:AFreshStart|talk]]) 19:54, 25 October 2021 (UTC)

==currency.com==
*{{Link summary|currency.com}}
*{{User summary|EdvenTermen}}
*{{User summary|Volhadey}}
*{{User summary|GreatValeria}}
*{{User summary|Annabel Aronsson}}
*{{User summary|Ян Алексеевич2323}}
*{{User summary|Amelia SSandens}}
*{{User summary|EvaPolastri}}
*{{User summary|Villanell}}
*{{User summary|Vitadey}}
*{{User summary|Evan polastri}}

:Moved on to sockpuppets after getting an indefinite block on EdvenTermen. - [[User:MrOllie|MrOllie]] ([[User talk:MrOllie|talk]]) 13:18, 26 October 2021 (UTC)


=Proposed removals=

Revision as of 13:18, 26 October 2021

    Mediawiki:Spam-blacklist is meant to be used by the spam blacklist extension. Unlike the meta spam blacklist, this blacklist affects pages on the English Wikipedia only. Any administrator may edit the spam blacklist. See Wikipedia:Spam blacklist for more information about the spam blacklist.


    Instructions for editors

    There are 4 sections for posting comments below. Please make comments in the appropriate section. These links take you to the appropriate section:

    1. Proposed additions
    2. Proposed removals
    3. Troubleshooting and problems
    4. Discussion

    Each section has a message box with instructions. In addition, please sign your posts with ~~~~ after your comment.

    Completed requests are archived. Additions and removals are logged; reasons for blacklisting can be found in the log.

    Adding the templates {{Link summary}} (for domains), {{IP summary}} (for IP editors) and {{User summary}} (for users with an account) causes the COIBot reports to be refreshed. See User:COIBot for more information on the reports.


    Instructions for admins
    Any admin unfamiliar with this page should probably read this first, thanks.
    If in doubt, please leave a request and a spam-knowledgeable admin will follow-up.

    Please consider using Special:BlockedExternalDomains instead, powered by the AbuseFilter extension. This is faster and more easily searchable, though it supports only whole domains and does not support whitelisting.

    1. Does the site have any validity to the project?
    2. Have links been placed after warnings/blocks? Have other methods of control been exhausted? Would referring this to our anti-spam bot, XLinkBot, be a more appropriate step? Is there a WikiProject Spam report? If so, a permanent link would be helpful.
    3. Please ensure all links have been removed from articles and discussion pages before blacklisting. (They do not have to be removed from user or user talk pages.)
    4. Make the entry at the bottom of the list (before the last line). Please do not do this unless you are familiar with regular expressions — the disruption that can be caused is substantial.
    5. Close the request entry here using either {{done}} or {{not done}} as appropriate. The request should be left open for about a week, as there will often be further related sites or an appeal in that time.
    6. Log the entry. Warning: if you do not log any entry you make on the blacklist, it may well be removed if someone appeals and no valid reasons can be found. To log the entry, you will need this number – 1051941924 after you have closed the request. See here for more info on logging.
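The warning in step 4 about regular expressions can be illustrated with a small sketch. It assumes that each blacklist line is a regex fragment joined into one alternation behind a URL-scheme prefix; that combining pattern, the helper names `build_blacklist` and `is_blocked`, and the test URLs are illustrative assumptions, not the extension's actual code.

```python
import re


def build_blacklist(fragments):
    """Join regex fragments into one case-insensitive URL pattern (assumed scheme)."""
    joined = "|".join(fragments)
    return re.compile(r"https?://+[a-z0-9_\-.]*(?:" + joined + r")", re.IGNORECASE)


def is_blocked(pattern, url):
    """Return True if the combined pattern matches anywhere in the URL."""
    return pattern.search(url) is not None


# Escaped dot and word boundaries: matches the domain and its subdomains only.
careful = build_blacklist([r"\bcurrency\.com\b"])
# Unescaped dot and no boundaries: also matches unrelated hosts.
careless = build_blacklist([r"currency.com"])

for url in (
    "https://currency.com/page",
    "https://www.currency.com/",
    "https://mycurrency.com/",
    "https://currency.community/",
):
    print(url, is_blocked(careful, url), is_blocked(careless, url))
```

In this sketch the careless fragment blocks all four URLs, while the careful one blocks only the first two, which is the kind of collateral damage the instruction warns about.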


    Proposed additions


    Copyvio websites

    From discussion at WP:RSN#Fr24 News a series of websites has come up in discussion. These websites are copyright violators or content aggregators or whatever you call it when someone is stealing content from legitimate websites, altering it through translation or synonym swapping, and presenting it as their own. None of these websites are traditional news companies with an editor and staff doing their own reporting. Platonk (talk) 04:56, 18 October 2021 (UTC)[reply]

    I found a huge mother lode of these websites. I added only the ones which had links to their websites used in en-Wikipedia articles (or the list was going to be very, very long) and even then I just quit because the list was getting too long anyway. During my search for these websites, I found a Reddit article discussing this same fake news rabbit hole. Platonk (talk) 05:22, 18 October 2021 (UTC)

    @Platonk: No worries about being long, it is probably best to be complete (and even over-complete - put in the related websites that have not yet been used) as otherwise we will be keeping ourselves busy later on. I am waiting for a good number of reports from COIBot, if there is a significant cross-wiki component to it we may just blacklist it globally. --Dirk Beetstra T C 05:34, 18 October 2021 (UTC)[reply]
    @Beetstra: I'm pretty sure this rabbit hole goes into the hundreds of domains. I think these operators are buying up released/abandoned domains and launching their content-farm engines on them in the hopes that there is residual readership or search engine rankings that increase their hits. Meanwhile, there are just a few of us trying to remove the occurrences of them via insource-searches, which I think you need to have happen before you blacklist it (I could be wrong, though).
    I do have a question for you. If there is a citation to a legitimate old archive.org link of one of these pre-used domains, yet we blacklist it because the content-farm bought it and launched it as a fake news site, what do we do with the old occurrence in an article? I had that happen with hardware-infos.com in this article. Won't that stop anyone from saving an edit to that article unless and until they remove that citation? Platonk (talk) 05:46, 18 October 2021 (UTC)[reply]
    @Platonk: I have made sections for additional sets, please add any further that are found into a new sub-section. I will blacklist the first set in my next edit.
    Blacklisted links that are already on pages are not affected. The spam-blacklist hook acts on new additions of links, not on already existing links. All pages can be edited normally, and even vandalism reverts that bring blacklisted links back in are not a problem (the link addition there is not considered 'new'). They may, however, become a problem if the revert/undo conflicts with subsequent good edits on a page. If only a few links are left, we generally blacklist and let cleanup go on afterwards; if there are hundreds or thousands of links, it is better to clean up most of them first. For the archive links, the best solution is to whitelist any occurrences. --Dirk Beetstra T C 05:57, 18 October 2021 (UTC)
    @Beetstra: Thank you for answering my questions fully and for adding those to the blacklist. I will make a second set for any additional websites. Platonk (talk) 06:12, 18 October 2021 (UTC)[reply]
    @Beetstra: Actually, the undo and vandalism functions both fail because of the blacklisted link. I wanted to revert this edit and instead of removing the citation, mark it as url-status=dead and add an archive-url link to https://web.archive.org/web/20150402152348/http://www.xboxonegaming(dot)nl/2015/03/awesomenauts-assemble-hopelijk-deze-zomer-naar-xbox-one/, but I couldn't do it. Platonk (talk) 07:01, 18 October 2021 (UTC)[reply]
    @Protonk: Yes, but in that case you don't undo, you are making a new link. I am talking about a true undo, or a plain rollback. Dirk Beetstra T C 11:20, 18 October 2021 (UTC)[reply]
    @Beetstra: LOL, "rollback" didn't work for me, either... blacklist error message. I don't know what a "true undo" is, and I don't have rollback privileges. I only know the links I see when I look at the diff ([rollback (AGF)] [rollback] [vandalism] and (undo), the first three I think are Twinkle's button). None of these will let me undo the edit, which is the last edit made. Platonk (talk) 11:45, 18 October 2021 (UTC)[reply]
    @Platonk: Maybe it is part of the admin package (I just rolled-back, did not try undo). You'll have to get the archive link whitelisted though, the software counts that as 'a new link'. -- Dirk Beetstra T C 12:21, 18 October 2021 (UTC)[reply]
    @Beetstra: Wrong 'tonk. ;) Protonk (talk) 19:18, 18 October 2021 (UTC)[reply]
    @Protonk: I just understand my mistake now ... I meant to ping @Platonk:.  :-) -- Dirk Beetstra T C 05:06, 21 October 2021 (UTC)[reply]
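The behaviour discussed in this thread (existing links are left alone, while a newly added archive URL that embeds the blacklisted domain is rejected) can be sketched as a comparison of the links before and after an edit. This is a minimal illustration under stated assumptions: the pattern, the placeholder domain `spamfarm.example`, and the function `blocked_additions` are hypothetical, and the real extension may differ in detail.

```python
import re

# Hypothetical blacklist entry; spamfarm.example is a placeholder domain.
BLACKLIST = re.compile(
    r"https?://+[a-z0-9_\-.]*\bspamfarm\.example\b", re.IGNORECASE
)


def blocked_additions(old_links, new_links):
    """Return blacklisted links present in the new revision but not the old one."""
    added = set(new_links) - set(old_links)
    return sorted(link for link in added if BLACKLIST.search(link))


old = ["http://spamfarm.example/story", "https://good.example/a"]

# Editing the page while keeping the existing blacklisted link adds nothing new.
print(blocked_additions(old, old + ["https://good.example/b"]))

# An archive URL that embeds the blacklisted domain counts as a new link.
archive = "https://web.archive.org/web/2015/http://spamfarm.example/story"
print(blocked_additions(old, old + [archive]))
```

Under these assumptions the first edit is allowed and the second is rejected, which matches the advice above to have the archive link whitelisted first.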

    Is it advisable to remove original links in places where archives already exist? Would it make a difference? A simple example is this edit where I partly "fixed" the problem by switching the link to 'dead', but the original domain is still listed. Removing it would lead to a broken template. I'm guessing, but I assume that search engine algorithms cannot tell that a URL is "dead" according to Wikipedia's in-house formatting, so leaving these URLs in place seems like it might be helping the SEO of these domains. That would be a shame. Grayfell (talk) 20:59, 18 October 2021 (UTC)

    Acquiring someone else's better SEO is why they scoop up old used-but-expired domains rather than inventing new names and starting from scratch. If you are able to change a citation to another domain entirely, that reduces the thieves' benefit. Your edit may have worked because you only changed the url-status parameter. The one I wrestled with last night didn't have the archive-url already in place, and nothing I did could add it without first whitelisting the archive-url (mentioned above as a solution, but which I haven't done yet). Platonk (talk) 02:07, 19 October 2021 (UTC)[reply]
    @Grayfell: I would suggest that the main link of the reference is changed to the archived one, and that the archived one is whitelisted. Although rare, keeping blacklisted (but not whitelisted) links in a document results in problems if the page gets inadvertently broken or vandalised. Also, people will have a tendency to follow the actual link and not the archived one (even if it is marked dead, who ever reads the manual?), which may lead them, in the worst case, to phishing or malware pages (the domain is hijacked, and even if it now leads to something 'fine', you do not know what will come in the future). Following the old link also helps for SEO purposes (Wikipedia is nofollow, but web bugs are a thing, and it is still traffic to their site). Dirk Beetstra T C 05:23, 19 October 2021 (UTC)
    The established method is |url-status=usurped, which prevents the source URL from being included in the HTML; only the archive URL is displayed. If there is no archive URL available, delete the entire cite as unverifiable (if cite web). -- GreenC 14:50, 21 October 2021 (UTC)
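The rule described in the last comment can be written down as a small decision function. This is only a sketch: citations are modelled as plain dictionaries, the function name `mark_usurped` is hypothetical, and leaving other templates untouched when no archive exists is an assumption, since the comment only addresses cite web.

```python
def mark_usurped(template_name, params):
    """Return updated cite parameters, or None if the cite should be deleted."""
    if params.get("archive-url"):
        updated = dict(params)
        # Hide the hijacked source URL; only the archive URL is shown.
        updated["url-status"] = "usurped"
        return updated
    if template_name.strip().lower() == "cite web":
        # No archive available: the comment suggests deleting the cite as unverifiable.
        return None
    # Assumption: other templates without an archive are left for manual review.
    return dict(params)


cite = {
    "url": "http://spamfarm.example/story",
    "archive-url": "https://web.archive.org/web/2015/http://spamfarm.example/story",
    "url-status": "dead",
}
print(mark_usurped("cite web", cite)["url-status"])
print(mark_usurped("cite web", {"url": "http://spamfarm.example/other"}))
```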

    set 1

    Platonk (talk) 05:22, 18 October 2021 (UTC)[reply]
    @Platonk: plus Added to MediaWiki:Spam-blacklist. --Dirk Beetstra T C 05:57, 18 October 2021 (UTC)[reply]

    Dirk Beetstra: I've been getting reports about it at WP:URLREQ, unknowingly, since at least Wikipedia:Link_rot/URL_change_requests#leighrayment.com, when User:Billinghurst reported it in August. We decided a blacklist edit filter was not needed, as the spammer is not actively adding new links and the blacklist can cause other problems like preventing cleanup. There are established methods. At least on enwiki, my bot can add/convert to archive links, toggle |url-status=usurped, and convert {{webarchive}} to a straight archive URL. There is no method to remove the link entirely without damaging the citation, but we can hide it from view in most cases. If there is no archive URL available and it's a {{cite web}}, the entire cite should be deleted as unverifiable: my bot does not have that functionality yet, but I can try to add it. My bot chooses archives close to the access-date or the earliest snapshot possible, and I think this will mitigate spam in the archive URLs. It can also check the archive for spam strings before adding. Retroactively checking archives for spam is more difficult; I will look into it. Finally, the domains will be blacklisted in the IABot database so they are treated as dead in the global wikis (without all these usurped functions, just a straight dead link). Most of this is not possible when the domains are blacklisted. -- GreenC 14:50, 21 October 2021 (UTC)
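The snapshot selection described above (an archive close to the access-date, otherwise the earliest snapshot) could look roughly like the following. This is a hedged sketch, not the bot's code: `choose_snapshot` is a hypothetical name, snapshots are reduced to bare dates, and the spam-string check mentioned in the comment is left out.

```python
from datetime import date


def choose_snapshot(snapshots, access_date=None):
    """Pick the snapshot date nearest the access date, else the earliest one."""
    if not snapshots:
        return None
    if access_date is None:
        return min(snapshots)
    return min(snapshots, key=lambda snap: abs((snap - access_date).days))


snapshots = [date(2015, 4, 2), date(2019, 1, 1), date(2021, 6, 1)]
print(choose_snapshot(snapshots, date(2015, 3, 20)))
print(choose_snapshot(snapshots))
```

Preferring a snapshot near the access-date makes it more likely that the archived copy predates the domain being taken over.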

    jakant.web.fc2.com

    Won't give up despite numerous blocks (last attempt was caught on edit filter). OhNoitsJamie Talk 15:41, 20 October 2021 (UTC)[reply]

    americadailypost.com

    Appears to be a site extremely similar to the already-blacklisted theamericanreporter.com, created for SEO and to support UPE. BubbaJoe123456 (talk) 18:16, 20 October 2021 (UTC)

    jimusnr.com

    Three pages report this as a malware site. Perhaps it should be listed. Diff example. Neils51 (talk) 12:03, 22 October 2021 (UTC)[reply]

    @Neils51:  Defer to Global blacklist, cross-wiki problem. --Dirk Beetstra T C 05:45, 24 October 2021 (UTC)[reply]

    iela.ufsc.br

    Reported as malware site by own anti-virus software (see what prompted the removal). Possibly hijacked website. Should be blacklisted. RandomCanadian (talk / contribs) 00:48, 25 October 2021 (UTC)[reply]

    @RandomCanadian:  Defer to Global blacklist; if it is hijacked, it should probably be on meta anyway (protecting the global community), but I am looking at the report and see whole ranges of IPs (and a lot of them), and of the 4 edits by users with 'advanced rights', 3 are vandalism reverts. I am afraid that this qualifies as spam. --Dirk Beetstra T C 05:45, 25 October 2021 (UTC)
    @RandomCanadian: Handled on meta. --Dirk Beetstra T C 06:45, 25 October 2021 (UTC)[reply]

    legaldictionary.net

    Over the past day or two this site has added a considerable number of links through IPv6 addresses.

    This range also spammed both legaldictionary and animals.net
    157.40.192.0/19 (talk • contribs • count • block log • x-wiki • Edit filter search • WHOIS • RDNS • tracert • robtex.com • StopForumSpam • Google • AboutUs • Project HoneyPot)

    The same pattern of link additions (since cleaned up) is documented in this project spam report. Arllaw (talk) 14:36, 25 October 2021 (UTC)[reply]

    I've taken the liberty of changing the templates above; user/IP links typically use the "User summary" or "IP summary" templates in these reports. OhNoitsJamie Talk 15:00, 25 October 2021 (UTC)[reply]
    Looking at some of the subsequent contributions of that user (presumably the same user with a dynamically-assigned IP), they've switched to using different references (e.g., this one uses a cornell.edu link); it's quite possible they are simply good-faith attempts to define and source legal terms. OhNoitsJamie Talk 15:02, 25 October 2021 (UTC)[reply]
    Their tactic appears to now be to include one or more other references along with the legaldictionary.net link, and to make the additions across two or more edits as opposed to within a single edit. The only consistency I see is the addition of legaldictionary.net. Arllaw (talk) 18:06, 25 October 2021 (UTC)[reply]
    Hmm, looking more closely at the two ranges 2605:6C80:11:0:0:0:0:0/50 and 2604:1580:FE00:5:0:0:0:0/112, I'm inclined to agree with you. I've also added "animals.net" to the report; spammed by same IP ranges, and in an amazing coincidence, both sites have the same content editor. OhNoitsJamie Talk 20:01, 25 October 2021 (UTC)[reply]
    @Arllaw: plus Added to MediaWiki:Spam-blacklist. --OhNoitsJamie Talk 20:05, 25 October 2021 (UTC)[reply]
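For readers who want to check whether a given editor address falls inside the ranges named in this report, Python's standard `ipaddress` module is enough. The range strings below are copied from the discussion; the helper `matching_range` and the sample addresses are illustrative assumptions.

```python
import ipaddress

# CIDR ranges quoted in the report above.
RANGES = [
    ipaddress.ip_network(cidr)
    for cidr in (
        "2605:6C80:11:0:0:0:0:0/50",
        "2604:1580:FE00:5:0:0:0:0/112",
        "157.40.192.0/19",
    )
]


def matching_range(address):
    """Return the first reported range containing the address, or None."""
    ip = ipaddress.ip_address(address)
    for network in RANGES:
        # Membership is False when the IP version differs from the network's.
        if ip in network:
            return network
    return None


for sample in ("2605:6c80:11:3fff::1", "2605:6c80:11:4000::1", "157.40.200.5"):
    print(sample, matching_range(sample))
```

A /50 covers the first two bits of the fourth IPv6 group, so `2605:6c80:11:3fff::1` is inside the first range while `2605:6c80:11:4000::1` is not.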

    Two websites

    All of its articles seem to be copied from other websites. Not sure if correct attribution is given all the time. —AFreshStart (talk) 19:54, 25 October 2021 (UTC)[reply]
    Seems to be a fake ref to make people believe it is France 24, as with fr24news.com (discussed here on the RS noticeboard). —AFreshStart (talk) 19:54, 25 October 2021 (UTC)[reply]

    currency.com

    Moved on to sockpuppets after getting an indefinite block on EdvenTermen. - MrOllie (talk) 13:18, 26 October 2021 (UTC)[reply]

    Proposed removals

    mentaldaily.com

    mentaldaily.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    According to the archive, there was a brief, isolated period in which several of the domain's stories were used as part of a 'prolific spambot' by a user who had created a number of accounts to do it. That occurred several years ago. I can only assume the domain would not be the subject of such occurrences again if removed from the blacklist. I propose removing this domain from the blacklist, given that I am considering using one of the domain's stories, which appears to be notable and has been covered by a magazine, a newspaper, and a television news source that meet the criteria for WP:Reliable sources. The story I am referring to is on menstruation habits, and it would support adding a sentence to one of the menstruation cycle pages about habits that I was only able to find in this media coverage. Here are a few links on what I am referring to: mentaldaily.com/article/2016/11/work-anxiety-caffeine-smoking-can-make-your-period-worse-study / Coverage I found of the story: https://www.glamour.com/story/10-everyday-habits-that-actually-make-your-period-even-worse https://www.standardmedia.co.ke/mobile/amp/article/2001297030/six-foods-that-could-worsen-your-period-pains https://www.liputan6.com/health/read/2657484/menstruasi-terasa-sakit-hindari-lakukan-7-kebiasaan-ini

    Here's one article found on google news about the domain: https://www.ibtimes.sg/meet-these-10-science-research-websites-making-web-great-59989 and another from Media Bias/Fact Check: https://mediabiasfactcheck.com/mental-daily/ Let me know if possible, otherwise, will see if another source has this same information on menstruation habits. Multi7001 (talk) 01:03, 23 October 2021 (UTC)[reply]

    It's not on our local blacklist, it's on meta;  Defer to Global blacklist. From that report, it appears to have been spammed pretty heavily across multiple wikis up until the blacklisting in 2019. Alternatively, if there is a specific link you'd like to add, there's always local whitelisting, though I'm not sure if I agree with you about the WP:RS criteria being met. OhNoitsJamie Talk 03:30, 23 October 2021 (UTC)[reply]
    @Multi7001: Declined,  Defer to Whitelist for specific links on this domain. First, the 'study' that you refer to is not a peer-reviewed study; it is a study performed by 'randomly' selecting women who saw the Facebook ad and decided to do the survey. That is not how you do proper statistical research. Secondly, 2 other articles I read are clearly regurgitated stories (see the 'according to <peer reviewed study>'), not properly performed research either. Thirdly, mentaldaily.com/article/2020/01/misinformation-vs-disinformation is just an opinion piece, which does not make it reliable. Most of this material reads as the type of material that one would use for click-bait. Then, this was heavily spammed to en.wikipedia back in 2017 (the main editor there got blocked; see also Wikipedia:Administrators'_noticeboard/IncidentArchive943#Scorpion293_and_unconfessed_paid_advertising). It got blacklisted here in 2018 (clearly, the block did not help). Attempts to re-create the article (draft) and spamming here were thwarted by that. Then the spamming moved (as en.wikipedia could not be spammed anymore), and we see spamming on other wikis (a sign of persistence). Then we have blacklisting globally in 2019. That type of persistence shows the problem with spam: it does not stop just because we blacklist it somewhere. That also does not give trust that this will stop anytime soon; the aggressive SEO will likely continue (in fact, there are cases that come back and persist for more than a decade). And that also explains why this is re-used elsewhere: they have done proper, aggressive SEO, so their results show high in Google search results. I would strongly advise against delisting this, and suggest just whitelisting the few links that do pass the bar of WP:RS by themselves (and would consider that we first want a WP:RSN discussion before we whitelist specific links). --Dirk Beetstra T C 05:44, 24 October 2021 (UTC)

    Troubleshooting and problems

    File:Newest Brookhaven RP thumbnail.png

    roblox.com: Linksearch en (insource) - meta - de - fr - simple - wikt:en - wikt:fr • Spamcheck • MER-C X-wiki • gs • Reports: Links on en - COIBot - COIBot-Local • Discussions: tracked - advanced - RSN • COIBot-Link, Local, & XWiki Reports - Wikipedia: en - fr - de • Google: search • meta • Domain: domaintools • AboutUs.com

    I have a problem tagging a non-free image: I want to fill in Template:Non-free use rationale video game cover, but when I fill the website parameter with the source website (e.g. roblox.com), I cannot, because it is spam-blacklisted at MediaWiki:Spam-blacklist. Vitaium (talk) 11:56, 25 October 2021 (UTC)

    @Vitaium: There is no reason that it has to be a working link, so you could just take off the 'https://'. Alternatively, you could ask for whitelisting of the specific link ( Defer to Whitelist, see instructions at top of that page). --Dirk Beetstra T C 12:03, 25 October 2021 (UTC)

    Discussion