Archive.today: Difference between revisions
archive.is itself does not support chrome/firefox or android, these refer to third-party applications Tag: references removed |
added note about thirdpartyness of the apps |
||
Line 40: | Line 40: | ||
Unlike [[Web crawler|crawlers]] such as [[Wayback Machine]], archive.is only captures individual pages in response to explicit user requests, and so does not obey the [[robots exclusion standard]].<ref>{{cite web |first=Dan |last=Dascalescu |authorlink=Dan Dascalescu |url=https://wiki.dandascalescu.com/reviews/online_services/web_page_archiving |title=Web page archiving – Dan Dascalescu's Wiki (review) |publisher=Wiki.dandascalescu.com |date=18 February 2013 |accessdate=3 October 2013}}</ref> Because of this, website owners cannot unilaterally remove snapshots at will, thus it is a "permanent" archive.<ref name = "vice stealing">{{cite web|url=https://motherboard.vice.com/read/dear-gamergate-please-stop-stealing-our-shit|title=Dear GamerGate: Please Stop Stealing Our Shit|author=|date=|work=Motherboard}}</ref> <!--For example, [https://web.archive.org/web/*/http://findarticles.com/p/articles/mi_m1295/is_9_65/ai_77811474 this] [[Archive.org]] link doesn't work; the error message "Page cannot be displayed due to robots.txt." is displayed, but [https://archive.is/20120717223848/http://findarticles.com/p/articles/mi_m1295/is_9_65/ai_77811474 this] Archive.is archive of the same URL does still work. -- Why commented out: Better example would be where the same error came up at archive.org, and the original content was gone, as in this example but archive.is had *directly* archived the content, rather than indirectly, as in this case. Good enough?--> |
Unlike [[Web crawler|crawlers]] such as [[Wayback Machine]], archive.is only captures individual pages in response to explicit user requests, and so does not obey the [[robots exclusion standard]].<ref>{{cite web |first=Dan |last=Dascalescu |authorlink=Dan Dascalescu |url=https://wiki.dandascalescu.com/reviews/online_services/web_page_archiving |title=Web page archiving – Dan Dascalescu's Wiki (review) |publisher=Wiki.dandascalescu.com |date=18 February 2013 |accessdate=3 October 2013}}</ref> Because of this, website owners cannot unilaterally remove snapshots at will, thus it is a "permanent" archive.<ref name = "vice stealing">{{cite web|url=https://motherboard.vice.com/read/dear-gamergate-please-stop-stealing-our-shit|title=Dear GamerGate: Please Stop Stealing Our Shit|author=|date=|work=Motherboard}}</ref> <!--For example, [https://web.archive.org/web/*/http://findarticles.com/p/articles/mi_m1295/is_9_65/ai_77811474 this] [[Archive.org]] link doesn't work; the error message "Page cannot be displayed due to robots.txt." is displayed, but [https://archive.is/20120717223848/http://findarticles.com/p/articles/mi_m1295/is_9_65/ai_77811474 this] Archive.is archive of the same URL does still work. -- Why commented out: Better example would be where the same error came up at archive.org, and the original content was gone, as in this example but archive.is had *directly* archived the content, rather than indirectly, as in this case. Good enough?--> |
||
Since July 2013, archive.is supports the [[Memento Project]] [[application programming interface]] (API) |
Since July 2013, archive.is supports the [[Memento Project]] [[application programming interface]] (API),<ref name=WSRG>{{cite web|url=https://ws-dl.blogspot.nl/2013/07/2013-07-09-archiveis-supports-memento.html |title=Archive.is Supports Memento |publisher=Web Science and Digital Libraries Research Group at [[Old Dominion University]] |work=Research and Teaching Updates |date=9 July 2013 |archiveurl=https://web.archive.org/web/20130727194715/https://ws-dl.blogspot.de/2013/07/2013-07-09-archiveis-supports-memento.html |archivedate=27 July 2013 |deadurl=no |accessdate=17 September 2013 |first=Michael L. |last=Nelson |df=dmy }}</ref><ref>[https://mementoweb.org/depot/native/archiveis/ "archive.is"] ''Memento Protocol Information''. Memento Development Group. Retrieved 17 September 2013.</ref> Third-party [[Google Chrome|Chrome]] and [[Firefox]] extensions and [[Android application]] <ref>{{cite web|url=https://play.google.com/store/apps/details?id=com.navasgroup.share2archive |title=Share2Archive - Android Apps on Google Play |publisher=Play.google.com |date=2016-11-20 |accessdate=2016-11-29}}</ref> |
||
==Worldwide availability== |
==Worldwide availability== |
Revision as of 16:50, 16 May 2017
Type of site | Web archiving |
---|---|
Available in | Multilingual |
URL | https://archive.is |
Commercial | No |
Registration | No |
archive.is (formerly archive.today) is an archive site that stores snapshots of web pages.[2] Like WebCite, it retrieves one page at a time, each smaller than 50 MB, but it also captures Web 2.0 sites (such as Google Maps and Twitter).
archive.is uses headless browsing to record what embedded resources need to be captured to provide a high-quality memento, and creates a PNG image to provide a static and non-interactive visualization of the representation.[3]
Unlike crawlers such as the Wayback Machine, archive.is captures individual pages only in response to explicit user requests, and so does not obey the robots exclusion standard.[4] Because of this, website owners cannot unilaterally remove snapshots, which makes it a "permanent" archive.[5]
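To illustrate what obeying the robots exclusion standard involves, the following minimal Python sketch shows the kind of check a crawler performs before fetching a page, using the standard library's urllib.robotparser. The robots.txt rules, the user-agent name and the example.com URLs are hypothetical and chosen only for illustration; this is not the actual code of archive.is or the Wayback Machine.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for illustration only).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def crawler_may_fetch(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules allow user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleCrawler" is a made-up user-agent name.
blocked = crawler_may_fetch(ROBOTS_TXT, "ExampleCrawler", "http://example.com/private/page.html")
allowed = crawler_may_fetch(ROBOTS_TXT, "ExampleCrawler", "http://example.com/public/page.html")
```

A crawler that honours the standard would skip the first URL and fetch the second. A service that captures pages only on explicit user request, as described above, performs no such check.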
Since July 2013, archive.is has supported the Memento Project application programming interface (API).[6][7] Third-party Chrome and Firefox extensions and an Android application are also available.[8]
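As a rough sketch of how a Memento client asks an archive for the snapshot closest to a given date, the Python example below builds a TimeGate request. The Accept-Datetime header comes from the Memento protocol; the TimeGate base URL, the function name and the example.com target are assumptions made for illustration and are not confirmed by the text above. The code only constructs the request and does not contact any server.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumed TimeGate endpoint; not confirmed by the article text.
TIMEGATE_BASE = "http://archive.is/timegate/"


def build_timegate_request(target_url: str, when: datetime) -> tuple[str, dict[str, str]]:
    """Build the URL and headers for a Memento TimeGate lookup (no network I/O)."""
    # Accept-Datetime carries the desired snapshot time as an HTTP-date in GMT.
    when_utc = when.astimezone(timezone.utc)
    headers = {"Accept-Datetime": format_datetime(when_utc, usegmt=True)}
    return TIMEGATE_BASE + target_url, headers


# Hypothetical lookup: the capture of example.com closest to 9 July 2013.
url, headers = build_timegate_request(
    "http://example.com/", datetime(2013, 7, 9, tzinfo=timezone.utc)
)
```

The returned URL and headers can be passed to any HTTP client. A Memento-compliant TimeGate would typically answer with a redirect to the capture nearest the requested time.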
Worldwide availability
Finland
On July 21, 2015, the operators blocked access to the service from all Finnish IP addresses, stating on Twitter that they did this in order to avoid escalating a dispute they allegedly had with the Finnish government.[9]
Russia
In Russia, only HTTP access is possible; HTTPS connections are blocked.[10][11]
Hosting
The primary host is the Russian company Hostkey (a known partner of WikiLeaks and the Syrian Electronic Army[12]), and content is distributed through Cloudflare.[13][14]
Domain name
The WHOIS record for the domain name shows that it is registered to Denis Petrov, with an address in Prague.[15]
See also
References
- ^ "Archive.is Site Info". Site Info. Alexa Internet. Retrieved 17 October 2015.
- ^ Martin Brinkmann (22 April 2015). "Create publicly available web page archives with Archive.is". Ghacks. Retrieved 13 June 2015.
- ^ Brunelle, Justin F.; Kelly, Mat; Weigle, Michele C.; Nelson, Michael L. (25 January 2015). "The impact of JavaScript on archivability". International Journal on Digital Libraries. 17 (2). Springer-Verlag Berlin Heidelberg: 95–117. doi:10.1007/s00799-015-0140-8.
- ^ Dascalescu, Dan (18 February 2013). "Web page archiving – Dan Dascalescu's Wiki (review)". Wiki.dandascalescu.com. Retrieved 3 October 2013.
- ^ "Dear GamerGate: Please Stop Stealing Our Shit". Motherboard.
- ^ Nelson, Michael L. (9 July 2013). "Archive.is Supports Memento". Research and Teaching Updates. Web Science and Digital Libraries Research Group at Old Dominion University. Archived from the original on 27 July 2013. Retrieved 17 September 2013.
- ^ "archive.is" Memento Protocol Information. Memento Development Group. Retrieved 17 September 2013.
- ^ "Share2Archive - Android Apps on Google Play". Play.google.com. 20 November 2016. Retrieved 29 November 2016.
- ^ Lapintie, Lassi (22 July 2015). "Suomalaisilta estettiin haktivistien suosimalla verkkosivulla käynti" [Finns' access to website used by hacktivists blocked]. Iltalehti (in Finnish). Retrieved 4 March 2016.
- ^ "Роскомнадзор заблокировал сервис archive..., хранящий копии веб-сайтов" [Roskomnadzor has blocked the archive... service, which stores copies of websites] (in Russian). 29 January 2016. Retrieved 30 January 2016.
- ^ "Russia Blocks Another Archive Site Because It Might Contain Old Pages About Drugs". Techdirt.
- ^ https://medium.com/@Felt/grand-theory-supp-4-abc25d6a8756
- ^ "Netcraft - Search Web by Domain". Netcraft Services. Netcraft Ltd. Retrieved 3 December 2016.
- ^ "Archive.is server and hosting history". EasyCounter. Retrieved 23 January 2017.
- ^ "ISNIC - Mini Whois". www.isnic.is. Retrieved 1 April 2017.