Help:Using the Wayback Machine: Difference between revisions
Revision as of 08:05, 5 January 2015
This help page is a how-to guide. It explains concepts or processes used by the Wikipedia community. It is not one of Wikipedia's policies or guidelines, and may reflect varying levels of consensus.
This page gives information about using the Wayback Machine to cite archived copies of web pages used by articles. This is useful if a web page has changed, moved, or disappeared; links to the original content can be retained.
Editors are also encouraged to add an archive link as a part of each citation, or at least submit the referenced URL for archiving, at the same time that each citation is created or updated.
Visit the web form at https://archive.org/, enter the original URL of the web page of interest in the "Wayback Machine" search box and then select BROWSE HISTORY. The next screen may
- redirect to the latest archived copy,
- show a box near the bottom of the page with a link inviting the user to "Save this url in the Wayback Machine",
- show a calendar listing the snapshot dates for all archived copies of that page, or
- show an error message explaining why the page cannot be archived.
In short, this is the code that needs to be added to a reference:
<ref>{{<-EXISTING REFERENCE->|archiveurl=http://web.archive.org/web/20021128120000/http://www.originalurl.com|archivedate=2002-11-28|accessdate=2024-11-12|deadurl=yes}}</ref>
URL formats
A link to the Wayback Machine usually starts with http://web.archive.org/web/ followed either by a single asterisk or a 14-digit datetime reference, then a slash and finally the URL of the original web page.
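The URL pattern described above can be sketched as a small helper. This is only an illustration: the function name waybackUrl and the constant WAYBACK_PREFIX are hypothetical, and the sketch assumes the prefix and layout given in the preceding paragraph.

```javascript
// Prefix taken from the paragraph above; the names here are hypothetical.
const WAYBACK_PREFIX = "http://web.archive.org/web/";

// Build a Wayback Machine link from the original URL and either "*"
// (calendar view) or a 14-digit YYYYMMDDhhmmss datetime reference.
function waybackUrl(originalUrl, datetime = "*") {
  if (datetime !== "*" && !/^\d{14}$/.test(datetime)) {
    throw new Error("datetime must be '*' or 14 digits (YYYYMMDDhhmmss)");
  }
  return WAYBACK_PREFIX + datetime + "/" + originalUrl;
}

console.log(waybackUrl("http://www.wikipedia.org/"));
console.log(waybackUrl("http://www.wikipedia.org/", "20020930123525"));
```

Rejecting anything other than an asterisk or exactly 14 digits keeps malformed datetime references from producing links that look valid but are not.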
Initial request
The following example usually shows a calendar linking to all archived copies of the main index page of Wikipedia.
http://web.archive.org/web/*/http://www.wikipedia.org/
Use the above URL format to discover the extent to which the requested page has been archived. Click one of the highlighted dates to select that specific archived copy.
If the target web page hasn't yet been archived, a box appears near the bottom of the page with a link inviting the user to "Save this url in the Wayback Machine". Clicking this invokes a request to
http://web.archive.org/save/http://www.wikipedia.org/
The above URL will show the current version of the requested web page and start the process that will attempt to archive the web page. If successful, the archived copy will become available as soon as the process is completed.
For some requested pages, the Wayback Machine will return an error message explaining why that particular page has not been, and cannot be, archived. In those cases, try a different archiving service such as WebCite.
Specific archive copy
Once the target web page has been archived, each of the specific dated archives can be individually requested using the format shown below.
The next example links to the archived copy of the main index page of Wikipedia exactly as it appeared on 30 September 2002 at 12:35:25 UTC. The datetime format is YYYYMMDDhhmmss.
http://web.archive.org/web/20020930123525/http://www.wikipedia.org/
Use the above format to link directly to a specific archive copy.
Adding an asterisk immediately after the date (or in place of it) is a quick way to show the calendar view of all archived copies.
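As a rough sketch of the YYYYMMDDhhmmss format, the following converts a JavaScript Date into the 14-digit reference using UTC fields. The function name toWaybackDatetime is hypothetical and is not part of any Wayback Machine tooling.

```javascript
// Format a Date as the 14-digit YYYYMMDDhhmmss reference, using UTC fields.
function toWaybackDatetime(date) {
  const pad = (n, width = 2) => String(n).padStart(width, "0");
  return (
    pad(date.getUTCFullYear(), 4) +
    pad(date.getUTCMonth() + 1) + // getUTCMonth() is zero-based
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds())
  );
}

// 30 September 2002 at 12:35:25 UTC, the example used in this section.
const snapshot = new Date(Date.UTC(2002, 8, 30, 12, 35, 25));
console.log(toWaybackDatetime(snapshot)); // 20020930123525
```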
The following flags can be appended to the datetime field to modify the format in which the archived content is displayed:[1][2]
- id_ (Identity): perform no alterations of the original resource; return it as it was archived.
- js_ (JavaScript): return document marked up as JavaScript.
- cs_ (CSS): return document marked up as CSS.
- im_ (Image): return document as an image.
Depending on the circumstances under which the page images were archived, the rendering of these pages may not be consistent; therefore, it is recommended that the flags be tested before being incorporated into Wikipedia documents. When linking to pages which are no longer available, the id_ flag is the most transparent in presenting the intent of the original page, as the following example demonstrates for the Wikipedia page as it appeared on 30 September 2002 at 12:35:25 UTC, without the Wayback Machine Toolbar being displayed. The datetime format is YYYYMMDDhhmmss with id_ appended.
http://web.archive.org/web/20020930123525id_/http://www.wikipedia.org/
Use the above format to link directly to a specific archive copy without the display of the Wayback Machine Toolbar.
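Appending a flag to the datetime field of an existing archive link can be sketched as follows. The helper name withWaybackFlag is hypothetical; the sketch only rewrites the URL string and assumes the four flags listed in this section.

```javascript
// Flags listed in this section; the helper name is hypothetical.
const WAYBACK_FLAGS = ["id_", "js_", "cs_", "im_"];

// Insert (or swap) a flag directly after the 14-digit datetime of a
// dated Wayback Machine URL. Only the URL string is rewritten.
function withWaybackFlag(archiveUrl, flag) {
  if (!WAYBACK_FLAGS.includes(flag)) {
    throw new Error("unknown flag: " + flag);
  }
  const pattern = /^(https?:\/\/web\.archive\.org\/web\/\d{14})(?:[a-z]{2}_)?\//;
  if (!pattern.test(archiveUrl)) {
    throw new Error("not a dated Wayback URL: " + archiveUrl);
  }
  return archiveUrl.replace(pattern, "$1" + flag + "/");
}

console.log(
  withWaybackFlag(
    "http://web.archive.org/web/20020930123525/http://www.wikipedia.org/",
    "id_"
  )
);
```

As the text advises, a link produced this way should still be opened and checked before it is added to an article.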
Latest archive copy
The next example links to the most current version of the archived page.
Using the above format is discouraged: the request is redirected to the long-form URL, including the 14-digit datetime stamp, of the latest archive copy, which defeats the purpose of using the archive to link directly to a specific old version of the page.
Likewise, a similar archive URL but with the number 1 links to the oldest archive copy.
See also: Advanced URL locator hints and tips – Internet Archive
Limitations
Before October 2013 it would often take weeks or months for an archived copy of a web page to become available. Nowadays, a request to archive a particular web page is actioned immediately and the result usually made available within minutes.
The Internet Archive honors the robots exclusion standard. It will not archive sites that disallow access, and it will remove access to previous versions of a disallowed page.
For example, The New York Times has a robots.txt page at http://www.nytimes.com/robots.txt which includes:
User-agent: *
Disallow: /aponline/
Disallow: /archives/
Disallow: /reuters/
Thus, archive requests for URLs within those folders, and within any other similarly listed folder of the New York Times website, will be rejected.
The Washington Post uses the file http://www.washingtonpost.com/robots.txt which includes:
User-agent: ia_archiver
Disallow: /
This directive explicitly blocks the Internet Archive from accessing their entire website.
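The effect of the two robots.txt excerpts above can be approximated with a deliberately simplified check. This is a sketch only, not the Internet Archive's actual logic: it handles just User-agent and Disallow lines with prefix matching, ignores Allow rules and wildcards, and treats a "*" group as applying to every agent. The function name isDisallowed and the sample paths are hypothetical.

```javascript
// Return true if `robotsTxt` disallows `path` for `agent`.
// Simplified: User-agent and Disallow lines only, plain prefix matching.
function isDisallowed(robotsTxt, agent, path) {
  let applies = false;      // does the current group apply to `agent`?
  let inAgentLines = false; // still reading consecutive User-agent lines?
  for (const rawLine of robotsTxt.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, "").trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      if (!inAgentLines) applies = false; // a new group starts here
      inAgentLines = true;
      if (value === "*" || value.toLowerCase() === agent.toLowerCase()) {
        applies = true;
      }
    } else {
      inAgentLines = false;
      if (field === "disallow" && applies && value !== "" && path.startsWith(value)) {
        return true;
      }
    }
  }
  return false;
}

// Excerpts quoted in this section; the paths being tested are made up.
const nytRobots = "User-agent: *\nDisallow: /aponline/\nDisallow: /archives/\nDisallow: /reuters/";
const wapoRobots = "User-agent: ia_archiver\nDisallow: /";
console.log(isDisallowed(nytRobots, "ia_archiver", "/archives/story.html")); // true
console.log(isDisallowed(wapoRobots, "ia_archiver", "/politics/"));          // true
```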
JavaScript bookmarklet
A bookmarklet is a one-click button in a web browser that is stored like a bookmark but uses JavaScript to carry out certain actions. To use one when you're at a dead-link web page and want to visit archives saved by the Wayback Machine, click and drag the following code to your browser's bookmarks toolbar, then name it something memorable, such as Wayback:
javascript:void(window.open('https://web.archive.org/web/*/'+location.href));
Then, when you are at a dead page, you may click the bookmarklet and it will automatically take you to the Wayback Machine's archives of that page.
The preceding code may not work for all users. In that case, you may try the following bookmarklet:
javascript:location.href='http://web.archive.org/web/*/'+document.location.href;
For a bookmarklet that allows you to manually archive a page you are visiting, store the following code in a bookmark on your browser's toolbar, with a name such as Wayback Save:
javascript:void(window.open('https://web.archive.org/save/'+location.href));
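For readers who want to see what the bookmarklets compute, the same string concatenation can be written as an ordinary function. This is a sketch for illustration: the name waybackTarget and the mode labels are hypothetical, pageUrl stands in for location.href, and the example address is made up.

```javascript
// Plain-function version of the bookmarklet logic above. `pageUrl` stands in
// for location.href; the function name and mode labels are hypothetical.
function waybackTarget(pageUrl, mode = "browse") {
  const base = "https://web.archive.org/";
  if (mode === "browse") return base + "web/*/" + pageUrl; // list archived copies
  if (mode === "save") return base + "save/" + pageUrl;    // request a new snapshot
  throw new Error("mode must be 'browse' or 'save'");
}

console.log(waybackTarget("http://www.example.com/missing.html"));
console.log(waybackTarget("http://www.example.com/missing.html", "save"));
```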
Mozilla Firefox Add-on
If you are using the Mozilla Firefox browser, you can install a 404 error add-on, which automatically tries to find a missing page in the Wayback Machine and provides a button similar to the one described above.
NOTE: As of May 5, 2013 or earlier, this add-on was not working properly. The developer writes (on the add-on page, under "About this Add-on"):
- Several Firefox updates have broken one by one all of this plugin's functionalities.
- If even the basic functionality is not working for you, here's a temporary fix: [...]
Alternative: Firefox add-on: Resurrect Pages, https://addons.mozilla.org/en-US/firefox/addon/resurrect-pages/
Using the wayback template
{{wayback}} can create these links for you; use the |url=, |title= and |date= parameters to specify the URL, title and date. For example:
{{wayback |url=http://www.wikipedia.org/ |title=Wikipedia |date=20010727112808 }}
→ Archived 2001-07-27 at the Wayback Machine
Without the date included:
{{wayback |url=http://www.wikipedia.org/ |title=Wikipedia }}
→ Archived (Date missing) at wikipedia.org (Error: unknown archive URL)
Note that the date parameter defaults to * (the calendar view of all archived copies).
Working with cite templates
{{citation}} and all of the Citation Style 1 templates support the |archiveurl= parameter (note that the |archivedate= parameter is also required). Other citation templates may also support |archiveurl=; see their documentation.
{{citation
|url=http://www.wikipedia.org/
|title=Wikipedia Main Page
|archiveurl=http://web.archive.org/web/20020930123525/http://www.wikipedia.org/
|archivedate=2002-09-30
|accessdate=2005-07-06
}}
→ "Wikipedia Main Page". Archived from the original on 2002-09-30. Retrieved 2005-07-06.
- Where an archived resource notes its original publication date, use |date= in place of |accessdate=.
- When adding an archive URL to any citation where the original resource URL is still working, it is useful to add the |deadurl=no parameter. Should the original URL stop working, it is a simple job to either change this to |deadurl=yes or remove the parameter. With |deadurl=no, clicking the title in the footnote opens the original (live) URL, and clicking "Archived" gives the archived copy. Otherwise the title opens the archived page, and "Original" opens the original link (dead unless it has been reinstated).
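Since |archivedate= must accompany |archiveurl=, the date can be read straight from the 14-digit datetime in the archive link. The following is a minimal sketch of that step; the function names archiveParams and toCiteParams are hypothetical, and the output is only a parameter string to paste into an existing citation, not a complete template call.

```javascript
// Derive |url=, |archiveurl= and |archivedate= from a dated Wayback Machine URL.
function archiveParams(archiveUrl) {
  const pattern =
    /^https?:\/\/web\.archive\.org\/web\/(\d{4})(\d{2})(\d{2})\d{6}(?:[a-z]{2}_)?\/(.+)$/;
  const match = pattern.exec(archiveUrl);
  if (!match) throw new Error("not a dated Wayback URL: " + archiveUrl);
  const [, year, month, day, originalUrl] = match;
  return {
    url: originalUrl,
    archiveurl: archiveUrl,
    archivedate: `${year}-${month}-${day}`, // YYYY-MM-DD, as in the example above
  };
}

// Render the parameters in |name=value form.
function toCiteParams(params) {
  return Object.entries(params)
    .map(([name, value]) => `|${name}=${value}`)
    .join(" ");
}

const params = archiveParams(
  "http://web.archive.org/web/20020930123525/http://www.wikipedia.org/"
);
console.log(toCiteParams(params));
```

The |accessdate= and |deadurl= values are left out on purpose, because they depend on when the editor checked the source and on whether the original link still works.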
See also
- {{dead link}}, for flagging dead links
- {{linkrot}}, for flagging pages with bare links
- {{user archiveurl}}, userbox
- {{user Internet Archive}}, userbox
- {{user web archive}}, userbox
- Wikipedia:Link rot, how-to guide for prevention of link rot
- Wikipedia:Using WebCite, using the alternative WebCite archive service
References
- ^ "Wayback Administrator Manual". Internet Archive. Archived from the original on 2014-01-20.
- ^ "How can I view a page without the Wayback code in it?". Internet Archive. Archived from the original on 2013-08-06.