|Slogan||Preserving for posterity|
|Created by||Gunther Eysenbach|
|70,577 (April 2014)|
WebCite is an on-demand archiving service designed to digitally preserve scientifically and educationally important material on the web by making snapshots of Internet content as it existed at the time when a blogger, scholar, or Wikipedia editor cited or quoted from it. The preservation service enables verifiability of claims supported by the cited sources even when the original web pages are revised, removed, or disappear for other reasons, an effect known as link rot.
Comparison to other services
The service differs from the short-term copies held by Google Cache in that it archives pages indefinitely, and WebCite also offers on-the-fly archiving. Since 2013 the Internet Archive has also offered immediate archiving; however, WebCite has some advantages:
- Pages cached by WebCite also capture several layers of underlying links, while the Internet Archive captures only the top page chosen for archiving. The accuracy with which formatting and functionality are preserved also varies greatly between the Internet Archive and WebCite.
- WebCite checks robots.txt only at the time of archiving, whereas the Internet Archive rechecks robots.txt occasionally, so changes in robots.txt (which can be caused by a change in the ownership of the domain name) can result in cached pages being removed from the Internet Archive.
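The difference between the two robots.txt policies can be illustrated with a small sketch. It is not either service's actual implementation: the robots.txt contents, the user-agent name `example-archiver`, and the page address are hypothetical, and the only library used is Python's standard `urllib.robotparser`.

```python
from urllib.robotparser import RobotFileParser


def allows(robots_txt: str, agent: str, url: str) -> bool:
    """Return True if the given robots.txt text lets `agent` fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Hypothetical robots.txt contents before and after a domain changes hands.
ROBOTS_AT_ARCHIVE_TIME = "User-agent: *\nDisallow: /private/\n"
ROBOTS_AFTER_OWNER_CHANGE = "User-agent: *\nDisallow: /\n"

AGENT = "example-archiver"  # hypothetical user-agent name
PAGE = "http://example.org/paper.html"

# Policy 1: consult robots.txt once, when the snapshot is taken.
kept_check_once = allows(ROBOTS_AT_ARCHIVE_TIME, AGENT, PAGE)

# Policy 2: additionally require that the current robots.txt still permits the page.
kept_recheck = kept_check_once and allows(ROBOTS_AFTER_OWNER_CHANGE, AGENT, PAGE)

print(kept_check_once, kept_recheck)
```

Under the first policy the snapshot stays available after the new owner blocks all crawlers; under the second it would be withdrawn.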
WebCite is a non-profit consortium supported by publishers and editors, and it can be used by individuals without charge. Rather than relying on a web crawler that archives pages in a "random" fashion, WebCite lets authors who want to cite web pages in a scholarly article initiate the archiving process themselves. They then cite – instead of or in addition to the original URL – the snapshot address archived by WebCite, with an identifier that specifies the cited source.
Conceived in 1997 by Gunther Eysenbach, WebCite was publicly described the following year when an article on Internet quality control declared that such a service could also measure the citation impact of web pages. In the next year, a pilot service was set up at the address webcite.net (see archived screenshots of that service at the Wayback Machine (archived February 3, 1999)). Although it seemed that the need for WebCite decreased when Google Cache began to offer short-term copies of web pages and the Internet Archive expanded its crawling (which started in 1996), WebCite was the only one allowing "on-demand" archiving by users. WebCite also offers interfaces to scholarly journals and publishers to automate the archiving of cited links. By 2008, over 200 journals had begun routinely using WebCite.
WebCite used to be, but is no longer, a member of the International Internet Preservation Consortium. In a 2012 message on Twitter, Eysenbach commented that "WebCite has no funding, and IIPC charges 4000 Euro/yr in membership fees."
WebCite "feeds its content" to other digital preservation projects, including the Internet Archive. Lawrence Lessig, an American academic who writes extensively on copyright and technology, used WebCite in his amicus brief in the United States Supreme Court case of MGM Studios, Inc. v. Grokster, Ltd.
WebCite has been running a fund-raising campaign using FundRazr since January 2013 with a target of $22,500, a sum which its operators have stated is needed to maintain and modernize the service beyond the end of 2013. This includes the cost of relocating the service to Amazon EC2 cloud hosting and of legal support. It is currently undecided whether WebCite will continue as a non-profit or as a for-profit entity.
WebCite allows on-demand prospective archiving. It is not crawler-based; pages are only archived if the citing author or publisher requests it. No cached copy will appear in a WebCite search unless the author or another person has specifically cached it beforehand.
To initiate the caching and archiving of a page, an author may use WebCite's "archive" menu option or create a WebCite bookmarklet that will allow web surfers to cache pages just by clicking a button in their bookmarks folder.
One can retrieve or cite archived pages through a transparent format in which URL is the URL that was archived and DATE indicates the caching date.
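A link of this kind could be assembled as in the following sketch. The endpoint `https://archive.example/query`, the lowercase parameter names `url` and `date`, and the ISO date format are all assumptions made for illustration; only the URL and DATE placeholders come from the description above.

```python
from datetime import date
from urllib.parse import urlencode

# Hypothetical endpoint; substitute the archive's real query address.
QUERY_BASE = "https://archive.example/query"


def snapshot_link(url: str, cached_on: date, base: str = QUERY_BASE) -> str:
    """Build a link that names the archived URL and the caching date."""
    # Parameter names and ISO date formatting are assumptions, not a documented API.
    params = {"url": url, "date": cached_on.isoformat()}
    # urlencode percent-escapes the embedded URL so it survives as one parameter.
    return f"{base}?{urlencode(params)}"


link = snapshot_link("http://example.org/paper.html", date(2008, 3, 4))
print(link)
```

Percent-encoding the cited address keeps its own `:` and `/` characters from being confused with those of the surrounding link.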
It is important to note that WebCite does not work for pages which contain a no-cache tag. WebCite respects the author's request to not have their web page cached.
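How an archiver might detect such an opt-out can be sketched with Python's standard `html.parser`. The set of tokens checked here (`no-cache` and `noarchive` in a `<meta>` tag's `content` attribute) is an assumption; the exact tags WebCite honours are not specified above.

```python
from html.parser import HTMLParser

# Assumed opt-out tokens; the exact tags a given archive honours may differ.
OPT_OUT_TOKENS = {"no-cache", "noarchive"}


class OptOutScanner(HTMLParser):
    """Sets `opted_out` when a <meta> tag carries an opt-out token."""

    def __init__(self) -> None:
        super().__init__()
        self.opted_out = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # A valueless attribute is reported as None, so fall back to "".
        content = dict(attrs).get("content") or ""
        tokens = {t.strip().lower() for t in content.split(",")}
        if tokens & OPT_OUT_TOKENS:
            self.opted_out = True


def may_archive(html: str) -> bool:
    """Return False if the page asks not to be cached."""
    scanner = OptOutScanner()
    scanner.feed(html)
    return not scanner.opted_out


print(may_archive('<meta http-equiv="Pragma" content="no-cache">'))
print(may_archive('<meta name="robots" content="noindex, noarchive">'))
print(may_archive('<meta name="description" content="An ordinary page">'))
```

The first two pages would be skipped and the third archived.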
One can also archive a page simply by navigating in a browser to a specially formatted link.
The term "WebCite" is a registered trademark. WebCite does not charge individual users, journal editors, or publishers any fee to use its service. WebCite earns revenue from publishers who want to "have their publications analyzed and cited webreferences archived", and accepts donations. Early support came from the University of Toronto.
WebCite maintains the legal position that its archiving activities are allowed by the copyright doctrines of fair use and implied license. To support the fair use argument, WebCite notes that its archived copies are transformative, socially valuable for academic research, and not harmful to the market value of any copyrighted work. WebCite argues that caching and archiving web pages is not considered a copyright infringement when the archiver offers the copyright owner an opportunity to "opt-out" of the archive system, thus creating an implied license. To that end, WebCite will not archive in violation of Web site "do-not-cache" and "no-archive" metadata, as well as robot exclusion standards, the absence of which creates an "implied license" for web archive services to preserve the content.
In a similar case involving Google's web caching activities, on January 19, 2006, the United States District Court for the District of Nevada agreed with that argument in the case of Field v. Google (CV-S-04-0413-RCJ-LRL), holding that fair use and an "implied license" meant that Google's caching of Web pages did not constitute copyright violation. The "implied license" referred to general Internet standards.
Reliability and outages
The service is not always available due to outages caused by hardware failures, maintenance and other reasons. However, it has always returned and there are no reports of previously cached pages being permanently lost.