Wikipedia:Requests for comment/Archive.is RFC 2

See also Wikipedia:Archive.is RFC (closed), Wikipedia:Archive.is RFC 3 (closed) and Wikipedia:Archive.is RFC 4 (closed)

MALFORMED RFC

Darkwarriorblake has deliberately malformed this RFC, and tampered with the process of consensus finding, by canvassing and misrepresenting the close of the previous RFC. User:Darkwarriorblake is formally warned that any future attempts to skew consensus will result in a block. If any editor wishes to start another RFC in accordance with policy, they are free to do so. — Coffee // have a cup // beans // 23:00, 24 June 2014 (UTC)

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Background

The previous RFC regarding Archive.is concluded that the site should be added to the blacklist, based on statements that an unapproved bot called RotlinkBot, created by User:Rotlink and potentially operating as a malicious botnet, had mass-added Archive.is links to articles. Archive.is is an archiving service similar to sites like Webcite and the Wayback Machine. Unlike the Wayback Machine, it does not abandon snapshots already saved when a site's robots.txt file later changes, while Webcite has presented itself as having an uncertain long-term future.

The ultimate outcome of the previous RFC was that archive.is links, whether added by the bot or by individual users, be removed, with an additional weak consensus that the site be added to the blacklist. Since then, the addition of archive.is to the blacklist has caused more issues by preventing editing, and certain users have chosen, even in the face of dissenting users, to make edits that remove archive.is archives without replacing them, even if this renders the now-dead link completely unusable, creating LINKROT.

This RFC is needed to reflect a common talking point at the previous RFC: that many users who actually used archive.is links were not properly informed about the discussion, and that the actions of a bot, malicious or otherwise, should not have had any impact on the links added by typical Wikipedians. Elements of the previous RFC relied on hypothetical and unproven future threats about Archive.is that should not have had any impact on the inclusion of the links. The original RFC found no issue with the quality of the snapshots provided by Archive.is.

The previous RFC posits that archive.is presents a malware risk based on the actions of RotLink and RotlinkBot, a belief supported by some users but based on a perception of RotLink's mass additions; however, no evidence was provided that the site did or currently does contain malware. Protecting users from such malware was part of the reasoning for adding the site to the blacklist, although any ill-intent on behalf of Archive.is remains unproven. As such, I think it is time that archive.is be removed from the blacklist, while sanctions and restrictions on the use of RotlinkBot and any other bots that attempt to add links in such a way that could be considered spam remain. Darkwarriorblake / SEXY ACTION TALK PAGE! 19:34, 24 June 2014 (UTC)

Please find the previous RFC here, as it did not seem that Darkwarriorblake linked to it. - Favre1fan93 (talk) 20:04, 24 June 2014 (UTC)

Options

1. Remove Archive.is from the blacklist and allow the links to be used by non-automated users

Support per above argument as RFC creator, nothing wrong with Archive.is or the service it provides to the project. Darkwarriorblake / SEXY ACTION TALK PAGE! 19:34, 24 June 2014 (UTC)
Support per DWB. The reasoning for this site's blacklisting seems like a non-issue now. It's just as useful as WebCite or Wayback Machine. Corvoe (speak to me) 19:55, 24 June 2014 (UTC)
Support per DWB. I have not found any issue with the valid archives I did create on the site. If a bot abused this site, then those edits should be taken care of. If not, at least a legitimate bot needs to be created to properly remove the archive.is links, and replace them either with a Webcite or Wayback Machine archive. It has become more detrimental reverting editors removing these links without replacing them, than having them on the site. As well, some sites are now utilizing robots.txt in a way that prevents Webcite and Wayback Machine from archiving, and archive.is is another option to help prevent the WP:LINKROT. - Favre1fan93 (talk) 19:59, 24 June 2014 (UTC)
Support - unless legitimate problems with the site itself are raised, I don't see the purpose of putting a useful site on a blacklist due to the actions of a bot/user. --Pres N 20:02, 24 June 2014 (UTC)
Support; I disagree with the use of the blacklist in general (except maybe for, like, known pornographic sites), and the actions of one bot across part of the site, if that's what this is, don't come close to being enough of a reason to close the domain off entirely. BTW, there's another place to challenge blacklistings; I recently posted there to appeal a site for the Sonic X article. Tezero (talk) 20:23, 24 June 2014 (UTC)

2. Keep Archive.is on the blacklist, continue with removal of all links

Support. Second choice, for all the reasons stated above. Linking to sites that appear to use compromised computers in their operations places both us and our readers at unnecessary risk.—Kww(talk) 20:09, 24 June 2014 (UTC)
Kww, we still have an issue that these links are then not archived anymore, presenting the WP:LINKROT issues. If it is determined that all links are removed still, can we get a valid bot working to replace the links with new archives? - Favre1fan93 (talk) 20:12, 24 June 2014 (UTC)
If I thought it were possible, I would be happy to do so. I don't think it's possible. The problem is that the presence of an archive link doesn't indicate precisely what the editor intended to archive. A trivial example: if someone archived a page on theofficialcharts.com to preserve the position of a single when that single had barely charted, only archives of the page taken in the same Monday-Sunday week will preserve the same information. On the other hand, a Romania Top 100 position is good only during a Tuesday-Monday week, other charts Wednesday to Tuesday. That means there's no way to determine, universally, by date, whether an archive at another archiving site archives the same data. As for comparing the contents, the formats that archive.is uses and the format that the other archives use are different, so the bot can't compare the sites. It's possible to write scripts to help users look for other archives and add archives, but that determination of "yes, the data in this archive supports the citation in the article" requires human intervention.—Kww(talk) 20:22, 24 June 2014 (UTC)
Support We should not be linking to sites that could infect users' computers or rely on that functionality to operate. (note: if someone developed a legit peer-type service, that would be different; it's the unknowing potential misuse of compromised systems that's at issue). --MASEM (t) 20:55, 24 June 2014 (UTC)
There is absolutely no evidence of malware on archive.is at all, it's completely hypothetical. If we should not be linking to such sites we can't link to anything because even eBay can get hacked. Darkwarriorblake / SEXY ACTION TALK PAGE! 20:58, 24 June 2014 (UTC)
No one is arguing that there is evidence of malware on the site at present. If you can provide evidence that suggests the owner of eBay is using compromised botnets to link to eBay, that would be a more accurate parallel to this situation.—Kww(talk) 21:02, 24 June 2014 (UTC)
Even the suggestion of using compromised botnets is hypothetical, it is easy to claim a lot of things without actual evidence, the point is that anything we link to can be compromised, Wikipedia can be compromised, it is impractical to punish an entire site based on theoretical possibilities, especially given the number of archive.is links that were NOT added via bot. Darkwarriorblake / SEXY ACTION TALK PAGE! 21:06, 24 June 2014 (UTC)
This is, in part, something that demanding reliable sources helps with. A good reliable source's website is going to be carefully guarded against intrusion, and when compromises happen (which they do) they are public about it, report what happened, fix it, and warn users about that. Hacks happen, but they are quickly fixed. --MASEM (t) 21:11, 24 June 2014 (UTC)
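
Kww's point about chart weeks can be illustrated with a small date check. The following is a minimal Python sketch, not anything a bot on Wikipedia actually runs; the function names `chart_week_start` and `same_chart_week` and the example dates are hypothetical. It shows that two snapshot dates can fall in the same week for a Monday-Sunday chart yet in different weeks for a Tuesday-Monday chart, so a bot would need per-chart knowledge that a generic date comparison lacks.

```python
from datetime import date, timedelta

# Weekday numbers follow date.weekday(): Monday = 0 ... Sunday = 6.
MONDAY, TUESDAY = 0, 1


def chart_week_start(day: date, week_start: int) -> date:
    """Return the first day of the chart week that contains `day`."""
    offset = (day.weekday() - week_start) % 7
    return day - timedelta(days=offset)


def same_chart_week(a: date, b: date, week_start: int) -> bool:
    """True if both dates fall inside the same chart week."""
    return chart_week_start(a, week_start) == chart_week_start(b, week_start)


# Hypothetical snapshot dates: the original archive and a candidate replacement.
original_snapshot = date(2014, 6, 23)   # a Monday
candidate_snapshot = date(2014, 6, 24)  # the following Tuesday

# A Monday-Sunday chart treats both dates as the same week...
monday_chart_match = same_chart_week(original_snapshot, candidate_snapshot, MONDAY)
# ...but a Tuesday-Monday chart puts them in different weeks.
tuesday_chart_match = same_chart_week(original_snapshot, candidate_snapshot, TUESDAY)
```

Even with the right week boundary for each chart, this only narrows the candidates; confirming that the replacement archive supports the citation would still need a human check, as the comment above argues.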

3. Close RFC as improperly formed

Support: —Kww(talk) 20:05, 24 June 2014 (UTC)
Support: Werieth (talk) 20:07, 24 June 2014 (UTC)
Support: At least try to write an RFC that people who disagree with you might find neutral. Hipocrite (talk) 21:12, 24 June 2014 (UTC)
Support Misleading through omission and misrepresentation. RFC should be considered null and void. Dennis Brown | 2¢ | WER 21:16, 24 June 2014 (UTC)
You guys could always try to say what you feel is missing, as such unhelpful comments could be considered padded votes otherwise. Darkwarriorblake / SEXY ACTION TALK PAGE! 21:32, 24 June 2014 (UTC)
When an RFC, any RFC, is this borked from the start, it is better to throw the baby out with the bathwater, so the RFC neutrality doesn't dominate the discussion at the RFC itself. The interests of neutrality demand it. Dennis Brown | 2¢ | WER 21:38, 24 June 2014 (UTC)
Support. Situation seems a bit more complicated than presented in the RFC header. The previous RFC specifically discussed a malicious botnet that spammed Wikipedia, not just an unauthorized bot that helpfully added links. I agree wholeheartedly with Dennis Brown above. NinjaRobotPirate (talk) 21:39, 24 June 2014 (UTC)
The points from the RFC summary are addressed in the RFC. Darkwarriorblake / SEXY ACTION TALK PAGE! 21:45, 24 June 2014 (UTC)

Discussion

That opening misrepresents things fairly thoroughly. The consensus at the previous RFC was not to remove the links because they were added by a bot, but because the site owner appeared to be using a botnet of compromised computers to add the links. Given the apparent willingness of the site owner to compromise computers, the blacklist was imposed as a solution to reduce the risk of harm to our users. The remainder of the discussion above neglects that central point: the purpose of the blacklist was to prevent harm to our users by linking to the site. Not to punish or reward our editors, not to make comments about the quality of the archival, but to protect users. Those accusations were not spurious by any means, and, so far as proven goes, they were certainly demonstrated to be true by the preponderance of evidence: hence the result of the previous RFC.—Kww(talk) 20:04, 24 June 2014 (UTC)

This does not seem to be the appropriate place to be injecting your personal comments. Darkwarriorblake / SEXY ACTION TALK PAGE! 20:09, 24 June 2014 (UTC)

May I ask why some users feel this RFC was improperly formed? - Favre1fan93 (talk) 20:14, 24 June 2014 (UTC)

Because the opening statement misrepresents the issues.—Kww(talk) 20:23, 24 June 2014 (UTC)

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.