MediaWiki talk:Spam-blacklist/archives/January 2009


This is the current revision of this page, as edited by Anomalocaris (talk | contribs) at 18:55, 26 April 2019 (rm wikilinks in external links; <font> → <span style>). The present address (URL) is a permanent link to this version.

Proposed Additions

Orthopedic cast page spam

There was also this paragraph added to our Orthopedic cast article about the erotic use of "recreational casts" along with links to three more related web sites:

These are additional related domains:
--A. B. (talkcontribs) 19:20, 29 October 2008 (UTC)[reply]


Added additional domains -- kinky casts and all. --A. B. (talkcontribs) 20:10, 31 October 2008 (UTC)

The original whitelist request that led to these editor-blacklisted sites has already been denied/closed and nobody has suggested they be whitelisted or kept. So I created this Orthopedic cast topic thread in the blacklist area and moved these newly blacklisted site references to it, to maintain a record of historic and ongoing vendor spam abuse on the Orthopedic cast topic. I hope it's the right editing protocol and apologize if not.

The latest/newest vendor spam to be removed from the orthopedic cast page is:

The Orthopedic cast topic has been spammed in the past by spoofed / proxy / dynamic IPs posting commercial pay sites, so I suggest that blacklisting the offending pay sites (as the editor did with the sites above) is probably more productive in killing off this spam than tracking dodgy IP contributors who may or may not be what they appear.

In addition to the above blacklisted links, previous vendor spam references to this page, some by suspect IP posts, have included:

Beetstar


Thanks for the report.
I've run link reports on those five domains and I am not seeing any persistent spamming in the recent past:
Sometimes these reports miss things -- are there any additions I should be aware of?
--A. B. (talkcontribs) 06:13, 3 November 2008 (UTC)[reply]

Will keep an eye out for others. They pop up as single entry citations by vendor sites now and again typical of the latest one.

Beetstar —Preceding undated comment was added at 13:12, 4 November 2008 (UTC).[reply]

Thanks!
FYI, we normally look for 3 to 4 warnings across all accounts before we consider blacklisting, so don't forget to give escalating warnings from the grid at Wikipedia:UTM. It also helps to put a live link (with the http://) to the spam site on the user talk page so we can find all the user accounts. Don't get indignant -- just give a warning and move on.
I hope this helps. Thanks again for your work on this. We take help from all quarters. --13:24, 4 November 2008 (UTC)

unclesirbobby.org.uk

Link
Editors

There was a big effort to get this link included in several dream articles at the beginning of this year (see IPs with warnings). Editor has been back several times over the last year. Spamming is slow and almost always with a different IP address so blocks and protection are impractical as deterrents. Requesting blacklisting. -- SiobhanHansa 18:11, 3 November 2008 (UTC)[reply]

Added. Thanks for reporting. --A. B. (talkcontribs) 19:15, 3 November 2008 (UTC)
Related domain that has also been spammed scarboroughphotos.org.uk :
Thanks -- SiobhanHansa 18:12, 10 November 2008 (UTC)[reply]

dnabaser.com rnabaser.com and sequence-assembler.com

See also DNA Baser history and Wikipedia:Articles_for_deletion/DNA_Baser

Somewhat sophisticated attempts to promote these related products by:

appear now to have devolved into simple spamming by SPAs:

Since requests to the other accounts not to spam appear to have led to the use of these throwaway accounts, I am requesting blacklisting. -- SiobhanHansa 12:31, 22 November 2008 (UTC)

IP 85.16.163.218 continuing to spam. -- SiobhanHansa 13:54, 25 November 2008 (UTC)
Added --A. B. (talkcontribs) 19:16, 2 December 2008 (UTC)
I have just added \bcubic\.3x\.ro\/free-dna-tools\/index\.html\b, a page on a free server that redirects to these sites. Some records:
  • 55 records; Top 10 editors who have added dnabaser.com: Fedra (29), ClueBot (10), Madrigal12 (7), 85.16.163.218 (2), SiobhanHansa (1), 85.16.167.231 (1), Yard05er (1), Mirc007 (1), Wk master editor (1), AVBOT (1).
  • 7 records; Editors who have added rnabaser.com: Fedra (4), Madrigal12 (2), SiobhanHansa (1).
  • 1 record; Editors who have added sequence-assembler.com: Applyalert1 (1).
  • 11 records; Editors who have added cubic.3x.ro: 85.16.163.200 (5), 85.16.162.33 (3), 85.16.163.181 (2), 85.16.163.194 (1).
I suggest immediate blacklisting of any other domains/links used to circumvent blacklisting here, and that IPs in this range (85.16.0.0/16 - EWETEL-DYNDSL-POOL9 - DE) who edit unconstructively on the page Sequence_assembly are blocked. --Dirk Beetstra T C 00:04, 15 December 2008 (UTC)[reply]
Forgot to mention that I yesterday added '\bdownload3k\.com\/Install-DNA-BASER-sequence-assembling-tool\.html\b', as it was used to lead again to (this time a download site for) DNA Baser. --Dirk Beetstra T C 21:19, 17 December 2008 (UTC)[reply]

Proposed Removals

tinyurl

I understand that we don't want spam in article contents, but why is tinyurl blocked? It's very useful for making URLs more usable. We should err on the side of usability, even if that means a few links are put up that are not strictly "encyclopedic".

Why is Facebook blocked as a link? Facebook is fast becoming a place for publishing corporate and other legitimate info that is more trustworthy than any of the media sites, doubtful blogs, etc. that are routinely allowed in Wikipedia! Please consider removing this block.

aceshowbiz.com

Why is this blacklisted? It seems legit to me. Andre666 (talk) 13:07, 24 August 2008 (UTC)

Kingcomp (talk) 08:07, 18 September 2008 (UTC) I manage aceshowbiz.com and need to know when our website spammed prolifically. Did it happen lately or many years ago? We have many articles worth suggesting, such as an exclusive interview with Demi Lovato (Celebrity News, Sep 18, 2008). Please consider unlisting our website from your spam list, as there has been no such action for years. Many years ago aceshowbiz.com was just a small website; now we are partnering with many big, reliable companies. We have no time to think about spamming, just quality. Please visit our website and reconsider. Thank you.

Domain blacklisted on meta
Google Adsense ID: 5315453046799966
servedby.advertising.com: site=72134


Related domains
These should be evaluated for blacklisting as well.


References


Comments for the site-owner
Kingcomp, we typically do not remove domains from the spam blacklist in response to site-owners' requests. Instead, we de-blacklist sites when trusted, high-volume editors request the use of blacklisted links because of their encyclopaedic value in support of our encyclopaedia pages. If such an editor asks to use your links, I'm sure the request will be carefully considered and your links may well be removed. You'll note that we've already whitelisted some of your pages on such a basis.
The global blacklist is used by more than just our 700+ Wikimedia Foundation wikis (Wikipedias, Wiktionaries, etc.). All 3000+ Wikia wikis plus a substantial percentage of the 25,000+ unrelated wikis that run on our MediaWiki software have chosen to incorporate this blacklist in their own spam filtering. Each wiki has a local "whitelist" which overrides the global blacklist for that project only. Some of the non-Wikimedia sites may be interested in these links; by all means feel free to request local whitelisting on those.
Unlike Wikipedia, DMOZ is a web directory specifically designed to categorize and list all Internet sites; if you've not already gotten your sites listed there, I encourage you to do so -- it's a more appropriate venue for your links than our wikis. Their web address: http://www.dmoz.org/.
Declined --A. B. (talkcontribs) 21:17, 3 December 2008 (UTC)
I'd like to request why this is blacklisted still. Any site can be spammed by a crazed editor that thinks they are doing the site a favor. It does not appear to get its information from "users", like some of the other sources we continue to allow to exist. Theoretically, IMDb is "spammed" on every Wiki page related to film and TV, and they are not considered "reliable sources of information".  BIGNOLE  (Contact me) 14:46, 6 December 2008 (UTC)[reply]
Bignole, once in a blue moon we see the crazy uninvolved editor that spams a site, but that's very rare. In this case, all the spam edits traceroute to a location in Indonesia, unlike our zillion IMDB links that have been added by established editors from around the world.
Even though you're an established user, I'm reluctant to remove the entire domain from the blacklist since, based on this domain's history, I lack confidence the site-owner won't go back to persistently spamming us. Are there particular aceshowbiz pages you'd like to see whitelisted? --A. B. (talkcontribs) 16:29, 6 December 2008 (UTC)[reply]
I'm slightly confused. You said the site-owner is the spammer, but then said you traced the spamming to a location in Indonesia. I cannot find any indication that the website's homebase, or any base, is in Indonesia. Did the site-owner admit to being the spammer or something?
Regardless, the only reason I care is because I was trying to find sources to verify the Teen Choice Award nominations for Kristin Kreuk for the Lana Lang (Smallville) article, which is currently under GAC. People do not tend to report on general award nominations, so finding any reliable mentioning of it has been extremely difficult. I cannot use IMDb, because it is currently snubbed from usage completely (which I generally agree with, but not entirely, as such, there is a current proposed guideline for citing IMDb in the works that would allow editors to cite things like Awards, and other information that is less controversial). I came across this link--http://www.aceshowbiz.com/celebrity/kristin_kreuk/awards.html--where AceShowBiz actually has a profile for Kreuk's awards. It takes care of all that I need. Since AceShowBiz does not get its info from users, at least as far as I can tell from their page and which is a big issue with why we typically don't use IMDb (because we cannot tell exactly who it is coming from), AceShowBiz won't be criticized as much as IMDb would be.  BIGNOLE  (Contact me) 16:57, 6 December 2008 (UTC)[reply]
Something which is that hard to source almost certainly fails WP:UNDUE. Guy (Help!) 22:29, 18 December 2008 (UTC)[reply]

mapsofworld.com

Why is this site blacklisted? I was going to use http://finance.mapsofworld.com/company/i/idemitsu-kosan.html for a reference. Is it a spam site? --Apoc2400 (talk) 21:57, 13 November 2008 (UTC)[reply]

Background:
If you just need to reference a specific page, then you can request that its link be "whitelisted" at MediaWiki talk:Spam-whitelist.
 Defer to Whitelist--A. B. (talkcontribs) 21:54, 3 December 2008 (UTC)[reply]
  • That's a lot of links and many of them are to bot-generated pages full of even more links. It seems that somebody unrelated to mapsofworld.com was persistently adding links for that and other sites to many articles over a year ago. If there is no indication that mapsofworld.com was behind the edits, can we just unlist the site? The person who added all the links is probably gone by now. --Apoc2400 (talk) 19:34, 6 December 2008 (UTC)[reply]
  • Yes, there is whitelisting, but that is quite bureaucratic and troublesome. I couldn't find any evidence of one user persistently spamming this link. It has been added here and there, but probably by unrelated people. Am I missing something in the above report? There are really too many links to make much sense of it. --Apoc2400 (talk) 19:47, 6 December 2008 (UTC)
I've personally done mass rollbacks on spamming campaigns of this site (at least a year ago). The obnoxious pop-up advert doesn't help its case either. OhNoitsJamie Talk 19:51, 6 December 2008 (UTC)
Oh, I didn't notice because of Firefox, AdBlock and NoScript. Looking at it in IE, it sure looks like a spam site. I wonder if they stole the text I'm referencing from somewhere else. --Apoc2400 (talk) 19:59, 6 December 2008 (UTC)[reply]
I have the same setup, except I don't use noscript (though Firefox warns me about the popup before it loads it). AdBlock is handy for letting you see quickly what ads have been blocked. A few of the folks here are diligent about correlating AdWord id with other sites, which often turn out to be spam targets as well. OhNoitsJamie Talk 00:44, 7 December 2008 (UTC)[reply]

Whitehat.servehttp.com

I have no idea why this was blacklisted. I used it only on my page, and once as a link for an article on different radixes in math, because I have a base-converting applet. --Deo Favente (talk) 22:26, 3 December 2008 (UTC)

the whole domain servehttp.com is blacklisted on meta, see [2]. -- seth (talk) 02:02, 4 December 2008 (UTC)[reply]
Oh I see, it's blocked because it's a free domain name. I guess I'll still use this site on talk pages if I need to. In that case, there are still domains from the same provider missing from the list. --Deo Favente (talk) 18:09, 11 December 2008 (UTC)
Is it possible to get it whitelisted? And where should I suggest the removal of the other free domain names? Here or on meta? --Deo Favente (talk) 18:09, 15 December 2008 (UTC)
Individual pages may be whitelisted on the application of an established user at MediaWiki talk:Spam-whitelist. If you want to apply for a domain to be removed from the meta blacklist, see m:Talk:Spam blacklist. Stifle (talk) 14:30, 17 December 2008 (UTC)[reply]

sveti-stefan.net

This is a very educational, informative web site, and helpful because it can be used as a tourist guide. It is also the only one whose subject is exactly the place in the topic. I propose removing this website from the blacklist because it is good to have such a site offered in Wikipedia's article about Sveti Stefan, and this site is certainly not a spam site. Thanks. -- Preceding unsigned comment added by Pdjuras (talkcontribs) 17:26, 13 December 2008 (UTC)

  • Declined. Your only contributions to Wikipedia are linking this site and then asking for it to be removed from the blacklist after it was blacklisted due to spamming. I think you may be looking for DMOZ, which, unlike Wikipedia, is a link repository. Guy (Help!) 20:48, 14 December 2008 (UTC)

Dear Sir, I didn't know that I had to be a big contributor to get the opportunity to propose that a site be removed from the blacklist. I didn't expect to get an answer in a pejorative manner like this one. I am a serious man and I don't like it when someone speaks to me like this. That is not nice; it was just my opinion that this site is very appropriate for the topic (Sveti Stefan), especially when I saw the other links. Thanks for your advice, but it's not necessary. Goodbye! -- Preceding unsigned comment added by Pdjuras (talkcontribs) 21:26, 16 December 2008 (UTC)

strivinglife.com

Greetings. This site has three transcripts of interest, only two of which were linked to from Wikipedia pages (see Ghost in the Shell 2: Innocence and Waking Life). The links were removed under the assumption that they were spam (because a redirecting link on another article was corrected, and that article didn't need to have said link), but transcripts are particularly relevant for the latter, and to some extent for the former (although, being an English translation, it may supposedly fall under original research). If I had known that correcting redirecting links would cause so much trouble, I wouldn't have bothered correcting ... Strivinglife (talk) 22:57, 27 December 2008 (UTC)

Here's what caused the block: http://meta.wikimedia.org/wiki/User:COIBot/XWiki/strivinglife.com Note that the Waking Life was an addition, still up for discussion, and the Biological update was a link correction, not a link addition. If I should be posting this elsewhere, please let me know. Strivinglife (talk) 01:05, 29 December 2008 (UTC)[reply]

Troubleshooting and problems

performance

Hi!
1. Some (>100) of the entries are fully redundant and may be removed. This would shorten the list and so increase its performance. For example, at present there exist the entries

\banontalk\.com\b
\bAnonTalk\.com\b
\b'''AnonTalk\.com\b

But all links matched by the second or third entry are matched by the first one, too. The spam extension uses the i-modifier, i.e., patterns are case-insensitive. And because there is a word boundary between "'" and "A", \ba will match here, too.
So if those entries are replaced by just the first one

\banontalk\.com\b

there would be no difference.
2. Furthermore it would increase the performance to use regexp grouping, e.g.

ourworld-top\.cs\.com\/ceoofamcolso
ourworld-top\.cs\.com\/ckelly6447
ourworld-top\.cs\.com\/jcshul
ourworld-top\.cs\.com\/latintexts

could be replaced by

ourworld-top\.cs\.com/(?:ceoofamcolso|ckelly6447|jcshul|latintexts)

(btw. the slash does not need to be escaped, because this is done by the spam extension) just like we do it at meta and de-wikipedia.
Of course one can say that this will make the list more difficult to read. But that's not a real problem, because nobody except some admins really reads the list itself. It needs to be read by the spam extension only. The important part for human beings to read is the spam log. What do you think about that? -- seth (talk) 11:12, 27 November 2008 (UTC)
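The grouping proposal above can be illustrated with a minimal Python sketch. Python's re module stands in here for the PCRE engine behind the spam extension, and the sample URLs and the helper matches_any are made up for illustration; this is not how the extension itself is implemented. The sketch only checks that the grouped pattern blocks the same sample URLs as the four separate entries.

```python
import re

# The four separate entries quoted above.
separate = [
    r"ourworld-top\.cs\.com\/ceoofamcolso",
    r"ourworld-top\.cs\.com\/ckelly6447",
    r"ourworld-top\.cs\.com\/jcshul",
    r"ourworld-top\.cs\.com\/latintexts",
]

# The proposed grouped replacement, using a non-capturing group.
grouped = r"ourworld-top\.cs\.com/(?:ceoofamcolso|ckelly6447|jcshul|latintexts)"


def matches_any(patterns, url):
    """Hypothetical helper: True if any pattern matches the URL, ignoring case."""
    return any(re.search(p, url, re.IGNORECASE) for p in patterns)


# Made-up sample URLs: two that should be blocked, two that should not.
sample_urls = [
    "http://ourworld-top.cs.com/jcshul/index.htm",
    "http://ourworld-top.cs.com/latintexts",
    "http://ourworld-top.cs.com/somebodyelse",
    "http://example.org/",
]

# For each URL: (blocked by separate entries, blocked by grouped entry).
results = [
    (matches_any(separate, url), matches_any([grouped], url))
    for url in sample_urls
]

for url, (old, new) in zip(sample_urls, results):
    print(f"{url}: separate={old} grouped={new}")
```

Both columns agree for every sample, which is the point of the proposal: one line replaces four without changing what is blocked.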

Well, it seems that nobody objects to that. ;-)
Anyhow, I want to add that there is a tool for determining whether a link is blocked somewhere. This tool supports my argument: there is no need to search the SBL by hand, and so no need to avoid regexp constructions that are hard for human beings to read. The only problem now is that there is a wrong entry in the log, which leads to this entry always being displayed. Could anybody just delete the "|"? thx Versageek! -- seth (talk) 03:24, 22 December 2008 (UTC)
Thanks!!! I've been searching by hand whenever I add something - I had no idea that tool was there. Kuru talk 03:29, 22 December 2008 (UTC)[reply]
Well, you couldn't have had an idea, because it wasn't on the toolserver before 12 Dec. It took some time to get a toolserver account. -- seth (talk) 03:39, 22 December 2008 (UTC)
Additional to the first item mentioned above: There are a few entries in the en-SBL that are in meta-SBL too. It would lead to another little speed-up, if those redundant entries were deleted from this local list. Any objections to that? -- seth (talk) 01:33, 22 December 2008 (UTC)[reply]
Since there are no objections, I will delete approx. 200 entries from this list within the next few days. -- seth (talk) 15:10, 4 January 2009 (UTC)
Done. Grouping of regexps will be done later. -- seth (talk) 12:07, 7 January 2009 (UTC)[reply]
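The two kinds of redundancy discussed in this thread can be sketched in Python as follows. This is only an illustration: Python's re module stands in for the extension's regex engine, the sample strings are invented, and the local_list and meta_list contents are placeholder entries, not the real lists.

```python
import re

# The thread states that blacklist patterns are matched case-insensitively.
FLAGS = re.IGNORECASE

# The three entries quoted at the start of this thread.
entries = [
    r"\banontalk\.com\b",
    r"\bAnonTalk\.com\b",
    r"\b'''AnonTalk\.com\b",
]

# Invented sample strings, including one with the ''' prefix.
samples = [
    "http://anontalk.com/",
    "http://AnonTalk.com/board",
    "spam'''AnonTalk.com",
    "http://www.ANONTALK.COM",
    "http://example.org/",
]


def matching_entries(text):
    """Return the indexes of all entries that match the given text."""
    return [i for i, p in enumerate(entries) if re.search(p, text, FLAGS)]


hits = {s: matching_entries(s) for s in samples}

# Whenever any entry matches a sample, entry 0 matches it as well,
# so entries 1 and 2 add nothing for these samples.
covered_by_first = all(0 in idx for idx in hits.values() if idx)

# Second kind of redundancy: local entries that also appear on the meta list.
# Both lists below are placeholders for illustration only.
local_list = [r"\bexample-spam\.com\b", r"\bexample-local\.net\b"]
meta_list = {r"\bexample-spam\.com\b", r"\bexample-meta\.org\b"}
only_local = [e for e in local_list if e not in meta_list]

print(hits)
print(covered_by_first, only_local)
```

A check on sample strings like this cannot prove that one pattern always covers another; it only makes the argument above concrete for the quoted entries.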

wrong syntax and useless escaping

Hi!
there are some wrong entries in the list and some which have useless escaping inside.

   \sweet4ever-forum\.de\.tc\b 
-> \bsweet4ever-forum\.de\.tc\b
   \bdeath\-camps\.org\b
-> \bdeath-camps\.org\b
   \bcreateforum\.com\/phpbb\/?mforum\=reenactor\b
-> \bcreateforum\.com/phpbb/\?mforum=reenactor\b
   \bforums\.zoomshare\.com\/viewtopic\.php\?id\=15507&Reenactor%20Entertainment%20Tech%20Support\b
-> \bforums\.zoomshare\.com/viewtopic\.php\?id=15507&Reenactor%20Entertainment%20Tech%20Support\b
   \bforums\.zoomshare\.com\/viewtopic\.php\?pid\=113522
-> \bforums\.zoomshare\.com/viewtopic\.php\?pid=113522
   \bstores.\ebay\.com\b
-> \bstores\.ebay\.com\b
   \bweddings\-readings\.info\b
-> \bweddings-readings\.info\b
   \bpitbull.\wordpress\.com
-> \bpitbull\.wordpress\.com
   \bccpro2008.\wetpaint\.com\b
-> \bccpro2008\.wetpaint\.com\b
   \bmycincinnatiohiohomeinspector\.com \b
-> \bmycincinnatiohiohomeinspector\.com\b
   \bnepaeuropa2006\.blogspot\.com \b
-> \bnepaeuropa2006\.blogspot\.com\b (anyway, this entry would be redundant, so just delete it completely)
   and so on, there are many of those cases: " \b"
   \bdonmade.ytmnd.com/\b (if you wanted to block subpages only, then leave it like it is.)
(->\bdonmade.ytmnd.com\b)
   there are many of those cases: "/\b"
   \bz7\.invisionfree\.com\/Beyond_Computing\/index\.php?\b
-> \bz7\.invisionfree\.com/Beyond_Computing/index\.php\? (this is probably what you wanted)
   \bz9\.invisionfree\.com\/Trollz\/index\.php?\b
-> \bz9\.invisionfree\.com/Trollz/index\.php\? (similar to the above)
   \bpenis-*enlarge[0-9a-z\-]*\.[a-z]\b to catch the syntax "penis-*enlarge"
-> \bpenis-*enlarge[0-9a-z\-]*\.[a-z]+\b # to catch the syntax "penis-*enlarge"
   or even just
   \bpenis-*enlarge
   because the long one won't match any plain domains.
   \bastore\.amazon\.com\b affiliate Amazon products stores
-> \bastore\.amazon\.com\b # affiliate Amazon products stores
   is\.gd\b malicious redirect
-> is\.gd\b # malicious redirect (anyway, this entry would be redundant, so just delete it completely)
   \bsupermodels\.nl\b malware reported
-> \bsupermodels\.nl\b # malware reported

-- seth (talk) 14:19, 28 November 2008 (UTC)[reply]
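Why some of the entries listed above never block anything can be shown with a short Python sketch. It makes several assumptions: Python's re module stands in for the extension's regex engine, the helper to_pattern models comment handling by dropping everything after "#" and trimming the line (an assumed simplification of the extension's behaviour), and the test URLs are simply the plain domains from the entries.

```python
import re


def to_pattern(line):
    """Assumed simplification: drop a trailing '# comment' and trim the line."""
    return line.split("#", 1)[0].strip()


def blocks(line, url):
    """True if the blacklist line, treated as a regex, matches the URL."""
    return re.search(to_pattern(line), url, re.IGNORECASE) is not None


# (broken entry, corrected entry, URL that was meant to be blocked)
cases = [
    # "\s" is the whitespace class, so the entry demands a space before "weet...".
    (r"\sweet4ever-forum\.de\.tc\b",
     r"\bsweet4ever-forum\.de\.tc\b",
     "http://sweet4ever-forum.de.tc/"),
    # The stray space before the final "\b" must appear literally after ".com".
    (r"\bmycincinnatiohiohomeinspector\.com \b",
     r"\bmycincinnatiohiohomeinspector\.com\b",
     "http://mycincinnatiohiohomeinspector.com/"),
    # Without "#", the remark becomes part of the pattern itself.
    (r"\bsupermodels\.nl\b malware reported",
     r"\bsupermodels\.nl\b # malware reported",
     "http://supermodels.nl/"),
]

# For each case: (broken entry blocks the URL, corrected entry blocks the URL).
results = [(blocks(bad, url), blocks(good, url)) for bad, good, url in cases]

for (bad, good, url), (before, after) in zip(cases, results):
    print(f"{url}: broken={before} corrected={after}")
```

In each case the broken entry lets the link through and the corrected entry catches it.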

Hello?
I'm not admin in en-wiki, so I can't do the corrections by myself.
Perhaps I was not clear enough. Just look at this: links removed. magic? no, just above mentioned bugs. -- seth (talk) 16:56, 2 December 2008 (UTC)[reply]
Done, all corrected. Thanks for pointing out the mistakes in the spam list. I do agree we need someone to actually maintain the list itself; it can get really messy and could use some maintenance. Y. Ichiro (talk) 23:10, 14 December 2008 (UTC)
I see two possibilities:
  1. I become enwiki-admin (restricted to spamlists, because my enwiki editcount is <100), I'm already admin at dewiki and temp-admin at meta.
  2. We keep on maintaining the list like this, i.e. per request and per copy&paste like User_talk:XLinkBot/RevertList#regexp_speed-up.
Because the reaction time here is *sigh* not the best, it would be great if I could edit the spamlist myself. On the other hand, after one big reconditioning, perhaps there wouldn't be much more need for permanent maintenance. -- seth (talk) 00:15, 16 December 2008 (UTC)
Try WP:RFA ;) Stifle (talk) 14:34, 17 December 2008 (UTC)[reply]
I tried a few minutes ago, but failed already. ;-)
And now? -- seth (talk) 23:12, 17 December 2008 (UTC)[reply]
Hmm, not yet failed: [3]. -- seth (talk) 23:29, 17 December 2008 (UTC)[reply]
However,  Done. -- seth (talk) 12:09, 7 January 2009 (UTC)[reply]

Discussion

Um, help...

I have no idea how to make a request, nor link to my profile, but I am Soulen and can you revert the text I added to the Dragon Ball Z Tenkaichi back in, and just not the link to Youtube?


If you can, please pitch in and help whittle this down. We have editors who've been waiting several months.

Thanks, --A. B. (talkcontribs) 18:44, 8 September 2008 (UTC)[reply]

I've cleared most of this. Stifle (talk) 15:19, 25 September 2008 (UTC)[reply]


Archiving

Looks like this page could use more frequent archiving. As a non-admin I see that as a task I could help with. There don't seem to be any standards, so I propose moving any section that has been completed ("denied", "done" or query answered with no further discussion) for 7 days and any that has been marked as "defer to..." for 14 days. Hopefully this will make it easier for admins to see what needs doing. Let me know if there are any objections; otherwise I'll go ahead with this in a day or two. -- SiobhanHansa 18:18, 10 November 2008 (UTC)

Is there any proof this blacklist is accomplishing much of anything? Just seems to be annoying to, well, me personally, ATM. There are legitimate URIs that need to be shortened to be included in edit summaries. ¦ Reisio (talk) 04:17, 29 December 2008 (UTC)[reply]

Complaint