Wikipedia:Bots/Requests for approval/Archivedotisbot: Difference between revisions

Revision as of 13:17, 18 June 2014

Archivedotisbot

Operator: Kww (talk · contribs · SUL · edit count · logs · page moves · block log · rights log · ANI search)

Time filed: 17:29, Saturday May 10, 2014 (UTC)

Automatic, Supervised, or Manual: Automatic

Programming language(s): PHP (based on Chartbot's existing framework)

Source code available:

Function overview: Remove all archival links to archive.is (and its alias, archive.today~~, which was put in place to bypass the blacklist~~)

Links to relevant discussions (where appropriate): WP:Archive.is RFC, MediaWiki talk:Spam-blacklist/archives/December 2013#archive.is, Wikipedia:Administrators' noticeboard/Archive261#Archive.is headache

Edit period(s): One time run, with cleanups for any entries that got missed.

Estimated number of pages affected:

Exclusion compliant (Yes/No):

Already has a bot flag (Yes/No):

Function details:

Remove "archiveurl=" and "archivedate=" parameters whenever the archiveurl points at archive.is or archive.today.
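As an illustration only, the rule above could be sketched as follows. This is a Python sketch, not the bot's PHP source (which has not been released); the regular expressions, the function names, and the simplification that citation templates contain no nested templates are all assumptions.

```python
import re

# Hosts named in the request; how the bot actually matches them is an assumption.
ARCHIVE_HOSTS = ("archive.is", "archive.today")

# A template with no nested templates inside it (a deliberate simplification).
TEMPLATE_RE = re.compile(r"\{\{[^{}]*\}\}")
ARCHIVEURL_RE = re.compile(r"\|\s*archiveurl\s*=\s*([^|}]*)")
ARCHIVEDATE_RE = re.compile(r"\|\s*archivedate\s*=[^|}]*")


def points_at_archive_is(url: str) -> bool:
    """Return True if the URL's host is archive.is or archive.today."""
    match = re.match(r"\s*(?:https?:)?//(?:www\.)?([^/\s]+)", url, re.IGNORECASE)
    return bool(match) and match.group(1).lower() in ARCHIVE_HOSTS


def strip_archive_params(wikitext: str) -> str:
    """Drop archiveurl= and archivedate= from templates archived at archive.is."""

    def fix_template(m: re.Match) -> str:
        template = m.group(0)
        found = ARCHIVEURL_RE.search(template)
        # Leave the template alone unless archiveurl points at the target hosts.
        if not found or not points_at_archive_is(found.group(1)):
            return template
        template = ARCHIVEURL_RE.sub("", template, count=1)
        template = ARCHIVEDATE_RE.sub("", template, count=1)
        return template

    return TEMPLATE_RE.sub(fix_template, wikitext)
```

Checking the host of the archiveurl value, rather than searching for the string anywhere in the template, keeps the sketch from touching citations whose archive lives elsewhere (for example a web.archive.org snapshot). Links placed outside an archiveurl= parameter are not matched at all.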

Amended description in response to comments below.

The bot cannot implement the RFC result and keep links to archive.is. However, to help prevent deadlinking issues, the bot will take two steps:

When removing a link from an article, the bot will add a talk page notice of the form "Archive item nnnn from archive.today, used to support <url>, has been removed from this article".
A centralised list of all removals will be maintained at User:Archivedotisbot/Removal list.

—Kww(talk) 16:52, 16 May 2014 (UTC)[reply]
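A minimal sketch of the two bookkeeping steps described above, again in Python rather than the bot's PHP. The notice wording follows the form quoted in the amended description; the item-id extraction (taking the URL path) and the layout of a removal-list entry are hypothetical, since the request does not specify them.

```python
from urllib.parse import urlparse


def archive_item_id(archive_url: str) -> str:
    """Assume the archive item id is the path of the archive URL."""
    return urlparse(archive_url).path.strip("/")


def talk_page_notice(archive_url: str, supported_url: str) -> str:
    """Build the talk page notice in the form given in the request."""
    return (
        f"Archive item {archive_item_id(archive_url)} from archive.today, "
        f"used to support {supported_url}, has been removed from this article"
    )


def removal_list_entry(article: str, archive_url: str, supported_url: str) -> str:
    """Hypothetical line format for User:Archivedotisbot/Removal list."""
    item = archive_item_id(archive_url)
    return f"* [[{article}]]: archive item {item}, used to support {supported_url}"
```

Recording only the item id, not the full archive link, is a design choice of this sketch: it lets the list be saved even once the site is blacklisted, while still letting editors reconstruct the removed link by hand.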

Discussion

Comment There is no direct connection between the existence of the links and the blacklisting of the archive.is site. Most of the archive links were put there in good faith. As archive.is performs a unique function, the proposer will need to demonstrate the links themselves are actually in violation of policy, and that any given archive is replaceable – meaning the bot ought to be capable of replacing the links with one on another archive site, particularly where the original referring url has gone dead. Non-replacement will lead to diminution of verifiability of citations used. -- Ohc ^¡digame! 01:20, 12 May 2014 (UTC)[reply]

Leaving the links in place wouldn't correspond to the RFC consensus, and having the links in place while the site is blacklisted makes for a painful editing experience.—Kww(talk) 01:31, 12 May 2014 (UTC)[reply]
Blacklisting does not distinguish good-faith edits. Welcome to the alternate universe of the MediaWiki talk:Spam-whitelist. Will the bot honor the whitelist? If so, we should get some links whitelisted before trial so that functionality may be tested. See MediaWiki talk:Spam-whitelist/Archives/2014/03#archive.is/T5OAy. This should be done before the bot runs, to avoid any discontinuity of referencing, as the whitelist approval process can take months to come to consensus. – Wbm1058 (talk) 01:58, 12 May 2014 (UTC)[reply]

Does anybody keep track of all the archive links they place? I can guess but I can never be sure. If a bot is approved, removals of potentially valid and irreplaceable (in some cases) links will be the default scenario unless all editors who consciously used the site come forward with their full list. I fear that even if I whitelisted all the articles I made substantial contributions to, that list would be incomplete. Then, some links I placed will inevitably get picked off by the bot. -- Ohc ^¡digame! 04:30, 12 May 2014 (UTC)[reply]

I have to reject the timing and implication of this request at this time on a couple of key grounds. Archive.today was not made to bypass the filter. There is no evidence that Archive.is operated the Wiki Archive script/bot. The actual situation was resolved by blocks, not the filter - the filter was by-passable for a long time. Kww made a non-neutral RFC that hinged on perceived use as ads, malware and other forms of attack - without any evidence nor any realization of any of these "bad things" would ever or be likely to occur. Frankly, the RFC was not even closed by an admin and it was that person, @Hobit:, that bought into the malware spiel and found Archive.is "guilty" without any evidence presented. Also, this is six months later, if that's not enough reason to give pause - I'll file for a community RFC or ArbCase on removing the Archive.is filter all the quicker. Back in October 2013, I'd have deferred to the opinion then, but not when thousands of Gamespot refs cannot be used because of Archive.org and Webcite's limitations and Kww seems deaf to the verifiability issues. Those who build content and maintain content pages need Archive.is to reduce linkrot from the most unstable resources like GameSpot. ChrisGualtieri (talk) 04:49, 12 May 2014 (UTC)[reply]

I will simply point out that your arguments were raised and rejected at a scrupulously neutral RFC that was widely advertised for months.—Kww(talk) 05:00, 12 May 2014 (UTC)[reply]

False, I wasn't even a part of the RFC. Also, the malware and illegal aspect were repeatedly pushed without evidence. ChrisGualtieri (talk) 16:29, 12 May 2014 (UTC)[reply]

I didn't say that you had participated: I said that your arguments had been presented. The framing of the RFC statement was scrupulously neutral. Arguments were not neutral, but such is the nature of arguments.—Kww(talk) 16:50, 12 May 2014 (UTC)[reply]

Can you prove, with firm evidence, that archive.today was created to "bypass the blacklist"? That domain has existed for months, and during this time, an attacker could have spilled a mess all over Wikipedia, but this has not occurred. Currently, archive.is does not exist (just try typing in the URL), it redirects to archive.today which is the current location of the site. A website may change domains due to any number of legitimate reasons, ranging from problems with the domain name provider, to breaking ccTLD rules. --benlisquare_T•C•E 06:04, 12 May 2014 (UTC)[reply]

Struck the language expressing cause and effect, and simply note that archive.is and archive.today are the same site.—Kww(talk) 06:13, 12 May 2014 (UTC)[reply]

I did close the RfC and am not an admin. I closed the discussion based upon the contributions to that RfC. There was no "Guilty" reading. Rather it was the sense of the participants that archive.is links should be removed because there was a concern that unethical means (unapproved bot, what looked like a bot network, etc.) were used to add those links. I think my close made it really really clear that I was hopeful we could find a way forward that let us use those links. If you (@ChrisGualtieri:) or anyone else would like to start a new RfC to see if consensus has changed, I'd certainly not object. But I do think I properly read the consensus of the RfC and that consensus wasn't irrational. On topic, I think the bot request should be approved--though if someone were to start a new RfC, I'd put that approval on hold until the RfC finished. Hobit (talk) 18:05, 12 May 2014 (UTC)[reply]
Comment. An unapproved(?) bot is already doing archive.is removal/replace: [1] 77.227.74.183 (talk) 06:18, 13 May 2014 (UTC)[reply]

I'm not a bot, so that is completely uncalled for. Werieth (talk) 10:16, 13 May 2014 (UTC)

When I see a bird that walks like a duck and swims like a duck and quacks like a duck, I call that bird a duck.

I see you work 24/7 and insert a high number of unreviewed links like a bot ([2], [3] in Barcelona).

I call you a bot. 90.163.54.9 (talk) 13:03, 13 May 2014 (UTC)[reply]

I don't read Chinese, and it looks like a valid archive. Not sure what the issue is. Comparing http://www.szfao.gov.cn/ygwl/yxyc/ycgy/201101/t20110120_1631663.htm and its archive version http://www.webcitation.org/684VviYTN, the only difference I'm seeing is that it's missing a few images; otherwise it's the same article. Werieth (talk) 13:11, 13 May 2014 (UTC)

The first page has only a frame and is missing its content; the second has only a server error message. No human would insert such links. I also noticed that you inserted many links to archived copies of YouTube video pages, which is nonsense.

You should submit a bot approval request (like this one) and perform a test run before running your bot at mass scale.

Only the fact that you are removing archive.is links in the same transaction prevents editors from undoing your edits. Otherwise most of your edits would be reverted. 90.163.54.9 (talk) 13:14, 13 May 2014 (UTC)

Not sure what you're looking at, but http://www.webcitation.org/684VviYTN looks almost identical to http://www.szfao.gov.cn/ygwl/yxyc/ycgy/201101/t20110120_1631663.htm. The only two differences I see are that the archive is missing the top banner and the QR code at the bottom. As I said, I'm not a bot and thus don't need to file for approval. Werieth (talk) 13:21, 13 May 2014 (UTC)

Forget 684VviYTN, it was my copy-paste error, which I promptly fixed. There are 2 other examples above. 90.163.54.9 (talk) 13:24, 13 May 2014 (UTC)[reply]

taking a look at http://www.apb.es/wps/portal/!ut/p/c1/04_SB8K8xLLM9MSSzPy8xBz9CP0os_hgz2DDIFNLYwMLfzcDAyNjQy9vLwNTV38LM_1wkA6zeH_nIEcnJ0NHAwNfUxegCh8XA2-nUCMDdzOIvAEO4Gig7-eRn5uqX5CdneboqKgIAAeNRE8!/dl2/d1/L2dJQSEvUUt3QS9ZQnB3LzZfU0lTMVI1OTMwOE9GMDAyMzFKS0owNUVPODY!/?WCM_GLOBAL_CONTEXT=/wps/wcm/connect/ExtranetAnglesLib/El%20Port%20de%20Barcelona/el+port/historia+del+port/cami+cap+el+futur/ vs http://web.archive.org/web/20131113091734/http://www.apb.es/wps/portal/!ut/p/c1/04_SB8K8xLLM9MSSzPy8xBz9CP0os_hgz2DDIFNLYwMLfzcDAyNjQy9vLwNTV38LM_1wkA6zeH_nIEcnJ0NHAwNfUxegCh8XA2-nUCMDdzOIvAEO4Gig7-eRn5uqX5CdneboqKgIAAeNRE8!/dl2/d1/L2dJQSEvUUt3QS9ZQnB3LzZfU0lTMVI1OTMwOE9GMDAyMzFKS0owNUVPODY!/?WCM_GLOBAL_CONTEXT=/wps/wcm/connect/ExtranetAnglesLib/El%20Port%20de%20Barcelona/el+port/historia+del+port/cami+cap+el+futur/ it looks like a snapshot of how the webpage looked when it was archived and the page is dynamic. There is one part of the page that appears to be dynamically generated via JavaScript that appears partially broken in the archive but most of the page content persists and is better than not having any of the content if the source goes dead. Instead of complaining about my link recovery work why dont you do something productive? Werieth (talk) 13:36, 13 May 2014 (UTC)[reply]

Productive would be to undo your changes and publicly discuss your bot's algorithms, but that is impossible because you intentionally choose pages with at least one archive.is link and thus abuse the archive.is filter, making your unapproved bot changes irreversible. Also, you label those changes "replace/remove archive.is", although 90% of the changes you made are irrelevant to archive.is. 90.163.54.9 (talk) 15:11, 13 May 2014 (UTC)

Oppose, RFC was non-neutral, biased, and not widely advertised despite Kww's claims; this is obvious from the number of editors who say they had no knowledge of the discussion while clearly being opposed to its outcome. DWB / Are you a bad enough dude to GA Review The Joker? 08:02, 13 May 2014 (UTC)

DWB:It's hard to give much weight to an argument based on a falsehood. It was placed in the centralized discussion template on Sept 13 and not removed until Oct 31. That you personally missed a discussion doesn't invalidate a discussion. The framing of the question was scrupulously neutral.—Kww(talk) 14:44, 13 May 2014 (UTC)[reply]

And yet so many users say they were not aware of it, of course we are all lying. The RFC was based on the premise "Archive.is does what it is meant to but I am going to accuse it of things I cannot prove, and also one user is adding a lot of their links so we should block it". It was not advertised nor neutral. DWB / Are you a bad enough dude to GA Review The Joker? 19:56, 13 May 2014 (UTC)[reply]

It was advertised in the standard places for 45 days. The RFC question presented three alternatives, the first of which was to leave existing links in place, the second of which was to restore all the links to archive.is that had already been removed, and the third (which gained consensus) was to remove them all. That's about as neutral as you can get, and more widely advertised than normal. That your side did not prevail doesn't mean a discussion is flawed, it simply means that it reached a conclusion that you disagree with.—Kww(talk) 20:04, 13 May 2014 (UTC)[reply]

Question I assume the bot, if approved, will replace, not remove the archives? Will it properly include an edit summary? Thank-you. Prhartcom (talk) 14:12, 13 May 2014 (UTC)[reply]
- Good point. I'm not sure there _are_ replacements, though following the AN thread, it looks like a new archive tool is potentially available. So it certainly should be replacing them where possible. In addition, an edit summary which explains what's going on (ideally with a link to a more detailed explanation) should be required (and trivial, I'd think). Hobit (talk) 14:42, 13 May 2014 (UTC)

User:Kww, what is your answer to this? Prhartcom (talk) 17:28, 13 May 2014 (UTC)[reply]

Easy enough to build a centrally accessible list of what was removed and where it pointed. Finding good replacements is not readily automated.—Kww(talk) 17:35, 13 May 2014 (UTC)[reply]

Agreed that it would be a difficult job and that the bot may have to be semi-automated run by an operator making human decisions as it runs in order to get the archives accurately replaced. Obviously you don't want to simply delete archives and have people mad at you, you want to replace them, achieving the archive.is purge goals as well as link rot prevention goals. I wish you the best of luck with it. Prhartcom (talk) 17:45, 13 May 2014 (UTC)[reply]
Oppose Kww, from your comments in this discussion it doesn't sound like you are interested in the goal of preventing link rot, but only in your goal of purging archive.is, so I cannot let you proceed with damaging Wikipedia. Replace, not remove. Prhartcom (talk) 21:26, 14 May 2014 (UTC)

*Oppose Per DWB. Duke Olav Otterson of Bornholm (talk) 15:13, 13 May 2014 (UTC)Blocked as sock.—Kww(talk) 15:45, 13 May 2014 (UTC)[reply]

- Sock of who? Has that user posted here? If not the socking is not abusive and the !vote stands. All the best: Rich Farmbrough, 13:13, 14 May 2014 (UTC).
  - It's an undisclosed alternative account, and is not permitted to participate in community discussions. The block has been upheld by another admin.—Kww(talk) 18:47, 14 May 2014 (UTC)[reply]

Statement: I will make a general comment to whoever closes this thing: this should not be a forum for people that did not prevail at an RFC to attempt to undermine the result. That isn't what a BRFA is about. The RFC had a conclusion, and I am requesting approval to run a bot to implement that conclusion.—Kww(talk) 15:39, 13 May 2014 (UTC)[reply]

Oppose To quote Hawkeye "Do not use a bot to remove links. Per Wikipedia:Archive.is RFC: the removal of Archive.is links be done with care and clear explanation. " Moreover the blocking of The Duke by Kww makes me doubt that Kww is in a good place to run a bot on such a contentious issue. Further the recent discussion was hardly consensual for removing the links, the more time that goes past, the less likely is it that archive.is is abusive as claimed. All the best: Rich Farmbrough, 13:13, 14 May 2014 (UTC).
Oppose I also endorse Rich Farmbrough's view that the RFC was quite clear. The solution is to manually re-find the URLs (or to hunt down replacements at other mirrors) for the ones that have been corrupted by the archiving service (much the same way that I did for List of doping cases in sport and its newly minted subchildren). Removing the parameters outright violates the consensus established at the RFC, and individual editors editing 24/7 to replace these suggests a form of automation and not manually fishing for the appropriate archives. Hasteur (talk) 17:18, 14 May 2014 (UTC)

Rich, Hasteur: the language in the RFC closing is quite clear: "There is a clear consensus for a complete removal of all Archive.is links." Hawkeye's opinion distorts the RFC closing statement, and does not reflect the actual content of the RFC. The care called for is explicit: "To those removing Archive.is from articles, please be sure to make very clear A) why the community made this decision and B) what alternatives are available to them to deal with rotlink." Not replacement. Not exhaustive searching for alternatives. Again, the purpose of a BRFA is not to provide people that disagree with an RFC an alternate venue to restate their opposition.—Kww(talk) 18:47, 14 May 2014 (UTC)

Kww you might want to check the RFC again and check your prejudice at the door. I did support removal, but controlled removal to where we don't instantly deadlink the reference by bulk removing archive.is. I'm not attempting to overturn the previous consensus, I am only saying that botting this is not endorsed. Hasteur (talk) 19:04, 14 May 2014 (UTC)[reply]

It doesn't matter what opinion either of us expressed in the RFC, Hasteur. That's not what the closing statement says. It says that the consensus is to remove them all, and there was no consensus for the level of research that you are demanding prior to removal.—Kww(talk) 21:03, 14 May 2014 (UTC)[reply]

I haven't been following the entire discussion about this issue; is there a "tl;dr" somewhere? Is this bot task planning to remove all archive.is links, with the goal that enwp will stop linking to that site as a whole? Or is this just a "cleanup" run to remove all the spammed links? It would be nice if rather than removing, we could convert them to IA links, but that will probably just be a dream ;) Legoktm (talk) 08:20, 15 May 2014 (UTC)
- @Legoktm: It's in the functional details (Remove "archiveurl=" and "archivedate=" parameters whenever the archiveurl points at archive.is or archive.today.). It means that if the base url is gone, we instantly deadlink the reference. I observe that it's transcended basic disputes and upholding the consensus and gone to the level of "Cutting off the nose to spite the face" tactics to obliterate links to the offending website. Hasteur (talk) 12:51, 15 May 2014 (UTC)
- TLDR version for Legoktm: the sole intent of the bot is to remove every reference to archive.is from English Wikipedia. That was the consensus at WP:Archive.is RFC, so that's what the proposed bot would do. Once proposed, editors that did not prevail at the RFC have taken this opportunity to oppose the bot, many of them presenting distorted versions of the RFC close to support their position. If you look above at Hobit's position, you will see that the closer of the RFC agrees that the bot implements the consensus of the RFC. I maintain that that is the sole criterion by which this BRFA should be judged, and all the conversation above is completely irrelevant to the discussion. The question being asked is "does the bot implement the RFC?" not "does the commenter agree that links to archive.is should be removed?"—Kww(talk) 15:08, 15 May 2014 (UTC)
  - You are correct, though it is perfectly reasonable for those who opposed the removal by any means, to be against the removal by bot, even if they would have supported bot-removal for some other hypothetical links whose removal they supported. Otherwise a system of regression is in place which allows a tyranny of the minority, namely the minority that asks the questions. All the best: Rich Farmbrough, 20:06, 15 May 2014 (UTC).
    - The point is that the people that oppose the bot based on "I don't think the RFC should have generated the result that it did" should have their !votes discarded by whomever closes this thing. There are venues to discuss such things, and BRFA isn't one of them.—Kww(talk) 20:37, 15 May 2014 (UTC)[reply]
      - Kww, please don't strong-arm the process like this by trying to use *fD nomenclature like !vote. There are 2 people (myself and Rich Farmbrough) who oppose the bot for completely separate reasons besides "I don't think the RFC should have generated the result that it did". I am asking that the bot be rejected on the grounds that the deadlinking you propose is more disruptive than a controlled replacement of the links. Hasteur (talk) 20:51, 15 May 2014 (UTC)
        I'm not strongarming the process at all, Hasteur. The RFC result did not call for leaving links in place when replacements could not be found. It did not call for diligent searching for replacements prior to removal. It called for complete removal of links. The alternative you are attempting to hold the bot to did not gain consensus. I'm quite willing to entertain enhancements such as creating a centralized list of removed links or leaving talk page notices indicating what links have been removed, but I'm not willing to entertain leaving the links in place: that would run counter to the RFC result.—Kww(talk) 21:34, 15 May 2014 (UTC)[reply]
        
        Indeed, and that is perfectly legitimate, if frustrating way to !vote. If it were not, for example, we could get the following situation.
        Scenario 1: kww asks for a BRFA to remove .is links. Vote Yes, 26 % No, 74%. (say 25% think the links are good, 24% think that all bots are evil and 25% think it should be done manually)
        
        Scenario 2: RFC - passes 75%, BRFA, passes 51%.
        
        Clearly this process would be anti-consensus. Equally clearly, by extending the process with sufficient stages, and suitably worded alternatives any conclusion could be reached.
        
        All the best: Rich Farmbrough, 11:02, 16 May 2014 (UTC).

Kww doesn't seem to understand that the opposition has nothing to do with "not prevailing", as if this were a trial and we were obliged by some "law" to abide by it. No part of the RFC was neutral or balanced, despite Kww's assertions otherwise. People in the first RFC were under the impression it was all done by a bot; the RFC did not weigh the contributions of other editors or even mention that fact in its opening. The arguments and its closing were highly ambiguous, but it's been more than SIX months and much has transpired in that time. I read the RFC as calling for removal of the bot-added links, not all of them, and the close (supervote or not) did not establish a blacklist; yet a blacklist was made and the bot-added links were not purged as was the expected result. Now we are calling for the complete removal of the entire website based on allegations, malware fears, and the acts of a single user, all while knowing there are no actual issues with the additions, the website or the content displayed itself. And just to top it off, as if that wasn't enough, all in the name of a flawed non-admin-closed RFC that took more than six months and a much larger discussion to provoke this attempt to complete an expanded and derived reading, as if the last six months (and the blacklist not functioning) never happened. It seems consensus can change, and it has. ChrisGualtieri (talk) 17:30, 18 May 2014 (UTC)

First, your reading of the RFC is irrelevant: it was closed, and the closure was never overturned (or even challenged, for that matter). Second, if you believe that the formulation of the RFC was non-neutral, can you at least indicate what part of the original framing was non-neutral? I certainly cast my opinion in one of the discussion sections, but the framing of the circumstances was scrupulously neutral.—Kww(talk) 19:26, 18 May 2014 (UTC)[reply]
- One correction: it was certainly challenged, I think both on my talk page and at either AN or ANI. In any case, ChrisGualtieri, it seems wise to open a new RfC if you feel the last one was defective (either in Kww's wording or my close) or because CCC. I'm not sure why you haven't done that if you feel there were so many problems. As the closer, I've made it pretty clear I'm comfortable with a new RfC. Heck, I'd even be happy to work with you on neutral wording or whatever else might be helpful. I think I closed it correctly and I don't think it was ambiguous; if some part was, please let me know and I'll clarify. But I think enough time has passed that a CCC argument is a perfectly good reason to start a new RfC on the topic. I'd not be surprised if you were correct and consensus has changed. Hobit (talk) 22:35, 18 May 2014 (UTC)
  - We are moving in that direction. Let's talk on your page about an issue or two before a new RFC is made. ChrisGualtieri (talk) 04:05, 19 May 2014 (UTC)[reply]
    - Certainly you could spare a moment to actually identify a specific item in the old RFC that would support your accusations of it being biased. Or is it easier to disrupt this discussion by simply making accusations without supporting them?—Kww(talk) 04:35, 19 May 2014 (UTC)[reply]

Kww has no intention of even lifting the edit filter or discussing the details of it publicly. The data in question shows one bad user and many good users who added Archive.is links, and Rotlink was not being operated by Archive.is. Allegations of illicit activity, bot nets and false identity that require the complete nuking of a site, on the basis of someone whose data doesn't even trace to Archive.is, are a pretty poor excuse to punish the whole on the grounds of some bogeyman. The RFC did not even recognize the good editors who added those links in the first place. It wasn't neutral and it did not even give fair representation to the two users who prominently declared that it would negatively impact their editing. The simple solution was ignored for the sake of preventing or removing the whole. Six months is far too late to suddenly spur the removal because someone disagrees with you. Kww made blind accusations and couldn't support them, but even the lengthy discussion into how those were unsupported did not deter the non-admin closer from a straw count of the !votes despite the entire premise being unsupported by the conclusion. The entire thing hinged on unsupported allegations of illicit activity, malware and that Rotlink was Archive.is, despite evidence to the contrary. I see absolutely no value in a "consensus" rooted in false pretexts; numerous users have made key arguments and Kww has brushed them off without answering them. I cannot support this bot because it represents a hail mary some six months after the fact and rooted in a direct opposition to the edit filter's very existence. ChrisGualtieri (talk) 22:57, 22 May 2014 (UTC)

BRFA doesn't deal with edit filters. If you're concerned about that, take it to WT:EF or WP:AN. Legoktm (talk) 04:34, 31 May 2014 (UTC)[reply]

Break

So, from my reading most of the opponents of the bot don't agree with the RfC's closure, and believe that the RfC itself is invalid. If that's the case, arguing here is pointless. BRFA can't overturn the closure of an RfC. WP:AN or WP:VPP are the places to do that, but Hobit says that that already happened (I didn't look for links), so IMO the closure is valid, and consequently that's not a reason to block approval of the bot. Legoktm (talk) 04:34, 31 May 2014 (UTC)[reply]

Sorry I am late to the discussion, as the original RfC did not catch my attention at the time. The fact that advice to use this service was not removed from Wikipedia:Link rot until 19 March 2014 and not explicitly prohibited until 20 March 2014 didn't help with increasing awareness. I have some questions. Please feel free to add specific answers under each. Wbm1058 (talk) 15:37, 31 May 2014 (UTC)[reply]

As for the March issue, anyone that has tried to add an archive.is link since October 2013 has had his edit blocked as a result. I would think that would be notice enough.—Kww(talk) 16:04, 31 May 2014 (UTC)[reply]

This is proposed to be a one-time run. What are the plans for dealing with such links that may be added after the one-time run? Wbm1058 (talk) 15:37, 31 May 2014 (UTC)[reply]
There's still a filter that prevents additions of archive.is and archive.today, and that filter will remain in place until the blacklist is implemented.—Kww(talk) 16:04, 31 May 2014 (UTC)[reply]
I'm confused. Here is a diff showing an archive.today reference link addition, added 21 April 2014. It doesn't seem to have been blocked, and it's not for a video game review. Wbm1058 (talk) 17:47, 31 May 2014 (UTC)[reply]
The filter hadn't caught up with the name change from archive.is to archive.today. Now it has.—Kww(talk) 21:14, 31 May 2014 (UTC)[reply]
I see. Editors clicking on Show preview after inserting <ref>http://archive.today/xxx</ref> into the edit box see MediaWiki:Abusefilter-warning-archiveis (created 16 November 2013, in response to this 15 November 2013 edit request, which in turn was a response to this discussion, which happened a month after this request was not responded to) above their edit box. This warning, which I believe uses template:edit filter warning, is triggered by Special:AbuseFilter/559, the details of which are hidden from public view. I also note that Special:AbuseLog/xxx admits that "Entries in this list may be constructive or made in good faith and are not necessarily an indication of wrongdoing on behalf of the user" in spite of its name AbuseLog, which presumes wrongdoing. I also note that attempts to make such edits are logged behind the editors' backs without informing them that this is happening, even though no edit was actually saved (was the attempted edit saved?). Be careful who you love ;-) This also raises the question: if an edit filter was installed by 25 October 2013, what was the point of requesting blacklisting in December, which seems redundant? Would I be correct in assuming that any administrator can view the filter, and that the filter is nothing more complicated than looking for <ref>http://archive.today/xxx</ref> or any alias(es) for that site? Will the bot use the same search criteria as the filter? Or use Special:LinkSearch/archive.is and Special:LinkSearch/archive.today? Wbm1058 (talk) 15:28, 1 June 2014 (UTC)
The edit filter is not nefarious, but yes, it is hidden, and no, I will not discuss the details of exactly what it takes to ensure that archive.is and archive.today links are not inserted. Blacklisting is our standard technique, not filters: the filter was installed on an emergency basis because of the attacks. The proposed bot will do exactly what is advertised: look for "archiveurl=" parameters and remove them. That should bring our count down low enough that blacklisting the site and removing the filter is feasible. If I find that someone has undertaken a conscious effort to defeat the bot by fiddling with the parameters, I will probably tweak the bot to get past such things.—Kww(talk) 15:39, 1 June 2014 (UTC)
OK. So links like the one added in this edit, which I linked above, will not be removed by this bot, because they are not inside templates using the archiveurl=n parameter? That makes sense, if the alleged botnet restricted their additions to those using such citation templates. Wbm1058 (talk) 17:27, 2 June 2014 (UTC)[reply]
The blacklist will force removal of such links, but the bot will not. Such links are a small enough percentage of the overall problem that they can be deleted manually without bot assistance.—Kww(talk) 18:05, 2 June 2014 (UTC)[reply]
What is the status of mw:Extension:ArchiveLinks? See mw:Extension talk:ArchiveLinks, where it was proposed that "archive.is should be added". Wbm1058 (talk) 15:37, 31 May 2014 (UTC)[reply]
Your link is to a question that preceded the problems with archive.is.—Kww(talk) 16:04, 31 May 2014 (UTC)[reply]
See also m:WebCite. Ok, so I'm not familiar with the m:New project process and how it may differ from WP:RFC, but why hasn't this been closed yet, with a determination of whether or not there is a consensus for it? Are there some citation backup features provided by Archive.today (Wikipedia:Articles for deletion/Archive.is) that WebCite doesn't offer? Wbm1058 (talk) 17:27, 31 May 2014 (UTC)[reply]
Is link rot really a problem? If, per Wikipedia:Link rot#Internet archives "archive.is not permitted on the English Wikipedia", from that does it follow that whitelisting specific archive.is links is not permitted either? Wbm1058 (talk) 15:37, 31 May 2014 (UTC)[reply]
From my perspective, it's a real problem, but not as large as people make it. Replacement links tend to be available for important information. Note that some of the loudest objectors here are eager to use archive.is because it properly archives pages from one of the video game review sites.—Kww(talk) 16:04, 31 May 2014 (UTC)[reply]
There are many Web archiving initiatives. Are all of these except archive.is legal? If so, can the bot crawl all of them in search of alternatives? Perhaps Mementos can be used to make this task easier? How can we be assured that none of these alternatives will have the same issues as archive.is at some time in the future? Wbm1058 (talk) 15:37, 31 May 2014 (UTC)[reply]
Automatic archival replacement bots have been tried before, and have invariably failed. I've agreed above to keep a master list of all removed links. If people want to manually deal with the list or write scripts that assist people in finding replacements from the list, I view that as a separate task.—Kww(talk) 16:04, 31 May 2014 (UTC)[reply]
Can I look at the source code for this bot? Wbm1058 (talk) 15:37, 31 May 2014 (UTC)[reply]
This is just a simple variation on Chartbot, an approved bot that dealt with the last revamp of Billboard (a revamp that left us with tens of thousands of dead links). I've never released the source. I could, but I would like to know why you want to see it.—Kww(talk) 16:04, 31 May 2014 (UTC)[reply]
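
Illustrative note: the bot's PHP source was never released, so the following is only a rough Python sketch of the behaviour described in the function details and in Kww's replies above. It removes "archiveurl=" and "archivedate=" from a template when the archiveurl points at archive.is or archive.today, and it leaves bare links outside templates alone. All names here (`strip_archive_params`, `clean_wikitext`, the regular expressions) are hypothetical, and the sketch assumes simple templates with no nested templates and no pipe characters inside URLs.

```python
import re

# Hosts named in the request. Matching subdomains such as "www." is an assumption.
ARCHIVE_HOSTS = ("archive.is", "archive.today")

# A template with no nested templates inside it: {{ ... }}
TEMPLATE_RE = re.compile(r"\{\{[^{}]*\}\}")
# "|archiveurl=value" and "|archivedate=value"; a value ends at the next "|" or "}"
ARCHIVEURL_RE = re.compile(r"\|\s*archiveurl\s*=\s*([^|}]*)")
ARCHIVEDATE_RE = re.compile(r"\|\s*archivedate\s*=[^|}]*")
# URL whose host is one of ARCHIVE_HOSTS, optionally with leading subdomain labels
HOST_RE = re.compile(
    r"^(?:https?:)?//(?:[\w-]+\.)*(?:"
    + "|".join(re.escape(host) for host in ARCHIVE_HOSTS)
    + r")(?:[/:?#]|$)",
    re.IGNORECASE,
)


def strip_archive_params(template: str) -> str:
    """Drop archiveurl/archivedate from one template if archiveurl targets a listed host."""
    match = ARCHIVEURL_RE.search(template)
    if not match or not HOST_RE.match(match.group(1).strip()):
        return template  # no archiveurl, or it points somewhere else: leave untouched
    template = ARCHIVEURL_RE.sub("", template, count=1)
    return ARCHIVEDATE_RE.sub("", template, count=1)


def clean_wikitext(text: str) -> str:
    """Apply strip_archive_params to every simple template found in the wikitext."""
    return TEMPLATE_RE.sub(lambda m: strip_archive_params(m.group(0)), text)
```

Under these assumptions, a citation archived at web.archive.org is left unchanged, and so is a bare <ref>http://archive.today/xxx</ref>, which matches the statement above that such links would be handled by the blacklist and manual cleanup rather than by the bot. The talk page notice and the centralised removal list are not covered by this sketch.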

@@ Line 1: / Line 1: @@
-<noinclude>[[Category:Denied Wikipedia bot requests for approval|Archivedotisbot]]</noinclude><div class="boilerplate metadata" style="background-color:
+<noinclude>[[Category:Open Wikipedia bot requests for approval|Archivedotisbot]]</noinclude>
-#DEDACA; margin:2em 0 0 0; padding:0 10px 0 10px; border:1px solid #AAAAAA;">
-:''The following discussion is an archived debate. <span style="color:red">'''Please do not modify it.'''</span> To request review of this BRFA, please start a new section at [[WT:BRFA]].'' The result of the discussion was [[File:Symbol delete vote.svg|20px|alt=|link=]] '''Denied'''{{#ifeq:|no||.}}<!-- from Template:Bot Top-->
 ==[[User:Archivedotisbot|Archivedotisbot]]==
 {{Newbot|Archivedotisbot|}}
@@ Line 143: / Line 141: @@
 #Can I look at the source code for this bot? [[User:Wbm1058|Wbm1058]] ([[User talk:Wbm1058|talk]]) 15:37, 31 May 2014 (UTC)
 #:This is just a simple variation on Chartbot, an approved bot that dealt with the last revamp of Billboard (a revamp that left us with tens of thousands of dead links). I've never released the source. I could, but I would like to know why you want to see it.&mdash;[[User:Kww|Kww]]([[User talk:Kww|talk]]) 16:04, 31 May 2014 (UTC)
-==closure==
-I have closed the discussion per [[WP:IAR]]. Whether the RfC was neutral is not an issue; This is a BRFA, so can not alter the result. However, reading through the discussion, there is clearly not a consensus to run the bot, mainly due to concerns about legitimate archive links being removed and never replaced. I am aware I am not a member of the BAG, however this has sat here for several days, and frankly, there is no point keeping a stale discussion open when there is a clear consensus evident. --[[User:Mdann52|<span style="color:Green">'''Mdann'''</span>]][[Special:Contributions/Mdann52|<span style="color:Red">'''52'''</span>]]<small>[[User talk:Mdann52|<span style="color:Maroon">''talk to me!''</span>]]</small> 10:23, 18 June 2014 (UTC)
-:''The above discussion is preserved as an archive of the debate.  <span style="color:red">'''Please do not modify it.'''</span> To request review of this BRFA, please start a new section at [[WT:BRFA]].''<!-- from Template:Bot Bottom --></div>