HTTP referer

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by 79.197.126.53 at 16:41, 21 June 2011 (rel = "noreferrer" of HTML 5).

The referrer, or HTTP referrer (also known by the common misspelling referer, which is the spelling used for the HTTP header field), identifies, from the point of view of a webpage or other Internet resource, the address of the resource which links to it. That address is commonly a Uniform Resource Locator (URL), but may also be the more generic Uniform Resource Identifier (URI) or the Internationalized Resource Identifier (IRI). By checking the referrer, the server of the new webpage can see where the request originated.

Referrer logging allows websites and web servers to identify where visitors are coming from, for promotional or security purposes. Checking the referrer is a popular defence against cross-site request forgery, but such security mechanisms are weakened by the ease of disabling or forging a referrer. The referrer is also widely used for statistical purposes.
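The kind of referrer check used against cross-site request forgery can be sketched as follows. This is a minimal illustration, not a recommended defence: the function names origin_of and passes_referrer_check, the example host bank.example, and the choice to reject requests with no referrer are all assumptions made for the example.

```python
from urllib.parse import urlsplit


def origin_of(url):
    """Return (scheme, host, port) for an absolute URL, or None otherwise."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.hostname:
        return None
    default_port = {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, parts.port or default_port)


def passes_referrer_check(headers, site_url):
    """Accept a request only if its Referer names the site's own origin."""
    referer = headers.get("Referer")
    if not referer:
        # Assumption for this sketch: a missing referrer is rejected.
        return False
    return origin_of(referer) == origin_of(site_url)


# Hypothetical site and requests, for illustration only.
site = "https://bank.example/"
same_site = passes_referrer_check({"Referer": "https://bank.example/account"}, site)
other_site = passes_referrer_check({"Referer": "https://evil.example/page"}, site)
```

Because the client controls the header in the requests it sends, a check of this kind only guards against requests triggered from another site in an unmodified browser; it does not stop a client that forges the field.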

A dereferrer is a means to strip the details of the referring website from a link request so that the target website cannot identify the page which was clicked on to originate a request.

Origin of the term referer

The misspelling referer originated in the original proposal by computer scientist Phillip Hallam-Baker to incorporate the field into the HTTP specification.[1] The misspelling was set in stone by the time of its incorporation into the standards document Request for Comments (RFC) 1945; document co-author Roy Fielding has remarked that neither "referrer" nor the misspelling "referer" was recognized by the standard Unix spell checker of the period.[2] "Referer" has since become a widely used spelling in the industry when discussing HTTP referrers. Usage of the misspelling is not universal, though: the correct spelling "referrer" is used in some web specifications, such as the Document Object Model.

Details

When a user visits a webpage, the referrer or referring page is the URL of the previous webpage, from which a link was followed.

More generally, a referrer is the URL of a previous item which led to this request. The referrer for an image, for example, is generally the HTML page on which it is to be displayed. The referrer field is an optional part of the HTTP request sent by the web browser to the web server.[3]
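As an illustration of where the field sits in a request, the sketch below reads the Referer header out of the raw text of an HTTP/1.1 request. The function name referrer_from_request and the example URLs are assumptions for this example; because the field is optional, the function returns None when it is absent.

```python
def referrer_from_request(raw_request):
    """Return the value of the Referer header in a raw HTTP request, or None."""
    # Headers end at the first blank line; anything after it is the body.
    head = raw_request.split("\r\n\r\n", 1)[0]
    lines = head.split("\r\n")
    # lines[0] is the request line, e.g. "GET /logo.png HTTP/1.1".
    for line in lines[1:]:
        name, separator, value = line.partition(":")
        # Header field names are case-insensitive.
        if separator and name.strip().lower() == "referer":
            return value.strip()
    return None


# A request for an image embedded in a (hypothetical) HTML page.
image_request = (
    "GET /images/logo.png HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "Referer: http://www.example.com/index.html\r\n"
    "\r\n"
)
page_url = referrer_from_request(image_request)
```

Here the referrer of the image request is the HTML page on which the image is displayed, as described above.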

Many websites log referrers as part of their attempt to track their users. Most web log analysis software can process this information. Because referrer information can violate privacy, some web browsers allow the user to disable the sending of referrer information. Some proxy and firewall software will also filter out referrer information, to avoid leaking the location of non-public websites. This can, in turn, cause problems: some web servers block parts of their website to web browsers that do not send the right referrer information, in an attempt to prevent deep linking or unauthorised use of images (bandwidth theft). Some proxy software has the ability to give the top-level address of the target website as the referrer, which usually prevents these problems while still not divulging the user's last-visited website.
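The interaction between a server-side referrer check and a proxy that substitutes the top-level address of the target can be sketched as follows. Both functions are hypothetical: allows_image_request stands in for whatever rule a server might apply against unauthorised use of images, and proxy_referrer for the substitution a proxy might perform. The host names are invented for the example.

```python
from urllib.parse import urlsplit


def allows_image_request(referer, own_host):
    """Hypothetical server rule: serve images only to pages on our own host."""
    if not referer:
        # A filtered-out referrer fails this rule, which is the problem
        # described in the text above.
        return False
    return urlsplit(referer).hostname == own_host


def proxy_referrer(target_url):
    """Return the top-level address of the target site for use as a referrer."""
    parts = urlsplit(target_url)
    return f"{parts.scheme}://{parts.netloc}/"


target = "http://images.example.com/photos/cat.jpg"
substitute = proxy_referrer(target)  # "http://images.example.com/"
served = allows_image_request(substitute, "images.example.com")
```

The substituted value satisfies the server's rule while revealing nothing about the page the user was actually viewing.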

Many blogs publish referrer information in order to link back to people who are linking to them, and hence broaden the conversation. This has led, in turn, to the rise of referrer spam: the sending of fake referrer information in order to popularize the spammer's website.

Many pornographic paysites use referrer information to secure their websites. Only web browsers arriving from a small set of approved (login) pages are given access; this facilitates the sharing of materials among a group of cooperating paysites. Referrer spoofing is often used to gain free access to these paysites.

Referrer hiding

Most web servers maintain logs of all traffic, and record the HTTP referrer sent by the web browser for each request. This raises a number of privacy concerns, and as a result, a number of systems to prevent web servers being sent the real referring URL have been developed. These systems work either by blanking the referrer field or by replacing it with inaccurate data. Generally, Internet-security suites blank the referrer data, while web-based services replace it with a false URL, usually their own. This, of course, raises the problem of referrer spam. The technical details of both methods are fairly consistent: software applications act as a proxy server and manipulate the HTTP request, while web-based methods load websites within frames, causing the web browser to send a referrer URL of their website address. Some web browsers give their users the option to turn off referrer fields in the request header.[4]
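The two approaches, blanking the field and replacing it, can be sketched as operations on a dictionary of request headers, roughly as a filtering proxy might apply them before forwarding a request. The function names and the example URLs are assumptions for this illustration and do not describe any particular product.

```python
def blank_referrer(headers):
    """Return a copy of the request headers with any Referer field removed."""
    # Header field names are case-insensitive, so compare in lower case.
    return {name: value for name, value in headers.items()
            if name.lower() != "referer"}


def replace_referrer(headers, substitute_url):
    """Return a copy of the headers with Referer set to a substitute URL."""
    cleaned = blank_referrer(headers)
    cleaned["Referer"] = substitute_url
    return cleaned


original = {
    "Host": "www.example.com",
    "Referer": "http://intranet.example.org/private/page.html",
}
blanked = blank_referrer(original)
replaced = replace_referrer(original, "http://anonymizer.example.net/")
```

Both functions return new dictionaries and leave the original headers unchanged.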

Most web browsers do not send the referrer field when they are instructed to redirect using the "Refresh" field. Exceptions include some versions of Opera and many mobile web browsers. However, this method of redirection is discouraged by the World Wide Web Consortium (W3C).[5]

If a website is accessed from an HTTP Secure (HTTPS) connection and a link points to anywhere except another secure location, then the referrer field is not sent.[6]
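The rule above can be expressed as a small decision function. This is a sketch of the behaviour described here, under the assumption that "secure" simply means an https URL; the function name should_send_referrer and the example URLs are chosen for the illustration.

```python
from urllib.parse import urlsplit


def should_send_referrer(referring_url, target_url):
    """Return False when a secure page links to a non-secure location."""
    from_secure = urlsplit(referring_url).scheme == "https"
    to_secure = urlsplit(target_url).scheme == "https"
    # Only the secure-to-non-secure case suppresses the field.
    return not (from_secure and not to_secure)


secure_to_plain = should_send_referrer(
    "https://mail.example.com/inbox", "http://news.example.org/story")
secure_to_secure = should_send_referrer(
    "https://mail.example.com/inbox", "https://news.example.org/story")
```

In the first case the field is withheld; in the second it may be sent, since both ends use a secure connection.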

The upcoming HTML5 standard will support the link attribute value rel="noreferrer", which instructs the user agent not to send a referrer when the link is followed.[7]
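As an illustration of how the attribute appears in markup, the sketch below scans an HTML fragment and reports, for each link, whether a user agent honouring rel="noreferrer" would send a referrer. The class LinkCollector, the function classify_links and the example URLs are assumptions for this example; the parser is the one in the Python standard library.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, sends_referrer) pairs for the <a> elements of a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel holds a space-separated list of link types.
        rel_tokens = (attributes.get("rel") or "").lower().split()
        self.links.append((href, "noreferrer" not in rel_tokens))


def classify_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


fragment = (
    '<a href="http://example.org/" rel="noreferrer">private link</a> '
    '<a href="http://example.net/">ordinary link</a>'
)
links = classify_links(fragment)
```

For the fragment above, the first link is marked as not sending a referrer and the second as sending one.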

See also

  • Referrer spam, providing fake referrer information in order to popularize a spammer's website.
  • Referrer spoofing, changing referrer information to gain unauthorized access to a website.

References

  1. ^ Hallam-Baker, Phillip. "Re: Is Al Gore The Father of the Internet?" alt.folklore.computers, 2000-09-21
  2. ^ Fielding, Roy. "Re: Referer: (sic)." HTTP-wg, 1995-03-09
  3. ^ "The Referer[sic] request-header field allows the client to specify […] the address (URI) of the resource from which the Request-URI was obtained […]" RFC 2616 § 14.36
  4. ^ http://kb.mozillazine.org/Network.http.sendRefererHeader
  5. ^ http://www.w3.org/TR/WCAG10-HTML-TECHS/#meta-element
  6. ^ "Clients SHOULD NOT include a Referer[sic] header field in a (non-secure) HTTP request if the referring page was transferred with a secure protocol." RFC 2616 § 15.1.3
  7. ^ http://www.globinch.com/2011/01/21/html5-noreferrer-what-is-rel-noreferrer-link-type-attribute/

External links

  • RFC 2616: Hypertext Transfer Protocol – HTTP/1.1
  • IRI – Internationalized Resource Identifiers