User-Agent header

In computing, a user agent is a client application implementing a network protocol used in communications within a client–server distributed computing system. The term most notably refers to applications that access the World Wide Web, but other systems, such as the Session Initiation Protocol (SIP), use the term user agent to refer to both end points of a communications session.^[1]

Web user agents range from Web browsers to search engine crawlers (spiders), as well as mobile phones, screen readers and braille browsers used by people with disabilities. When a user agent operates, it typically identifies itself, its application type, operating system, software vendor, or software revision by submitting a characteristic identification string to its operating peer. In the HTTP, SIP, and SMTP/NNTP^[2] protocols, this string is transmitted in the User-Agent header field. Bots, such as Web crawlers, often also include a URL and/or e-mail address so that the Webmaster can contact the operator of the bot.
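As a minimal sketch of the bot convention described above, the following Python snippet attaches an identification string containing a contact URL and e-mail address to an HTTP request object. The bot name, URL, and address are hypothetical placeholders, and no network traffic is sent.

```python
from urllib.request import Request

# Hypothetical bot name, contact URL, and e-mail address (placeholders only).
BOT_USER_AGENT = "ExampleBot/1.0 (+https://example.org/bot; bot-admin@example.org)"


def build_request(url: str) -> Request:
    """Create a request object that identifies the bot in its User-Agent field."""
    return Request(url, headers={"User-Agent": BOT_USER_AGENT})


request = build_request("https://example.org/")
# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```

Including contact details in the parenthesized part keeps the leading product token (ExampleBot/1.0) easy for servers to match.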

User agent identification

Some user agents identify their software as part of the client–server conversation. In HTTP and SIP, the identity is transmitted via the User-Agent field in the request header, as described by RFC 1945. This string is then used by the communications partner to characterize the client and optionally select suitable content or operating parameters for the session. For example, this may be used to provide properly formatted content for desktop computers and for smartphones.
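The following Python sketch illustrates how a server might read the User-Agent field from a raw HTTP request and use it to choose between desktop and smartphone formatting. The keyword list and the example identification strings are assumptions made for illustration, not a reliable classification scheme.

```python
# Hypothetical keywords; real device detection needs far more care.
SMARTPHONE_KEYWORDS = ("iphone", "android", "mobile")


def user_agent_from_request(raw_request: str) -> str:
    """Return the User-Agent field value from a raw HTTP request, or ''."""
    lines = raw_request.split("\r\n")
    for line in lines[1:]:  # skip the request line, e.g. "GET / HTTP/1.0"
        if not line:  # a blank line ends the header block
            break
        name, _, value = line.partition(":")
        if name.strip().lower() == "user-agent":  # field names are case-insensitive
            return value.strip()
    return ""


def choose_layout(user_agent: str) -> str:
    """Pick a content variant based on keywords in the identification string."""
    lowered = user_agent.lower()
    if any(keyword in lowered for keyword in SMARTPHONE_KEYWORDS):
        return "smartphone"
    return "desktop"


raw = "GET / HTTP/1.0\r\nAccept: */*\r\nuser-agent: ExamplePhone/1.0 (Mobile)\r\n\r\n"
print(choose_layout(user_agent_from_request(raw)))
```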

The user agent string is one of the criteria by which Web crawlers may be excluded from accessing certain parts of a Web site using the Robots Exclusion Standard (robots.txt file).
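A short sketch of this mechanism, using Python's standard urllib.robotparser module: the robots.txt content and the bot names below are hypothetical, chosen only to show that the same URL can be allowed for one user agent and disallowed for another.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: ExampleBot is kept out of /private/,
# every other crawler is allowed everywhere.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

url = "https://example.org/private/page.html"
blocked = parser.can_fetch("ExampleBot/1.0", url)
allowed = parser.can_fetch("OtherBot/2.0", url)
print(blocked, allowed)
```

Compliance is voluntary: the file only tells a crawler what the site operator requests, and a crawler that sends a different identification string is matched against different rules.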

Possible privacy issue

The user agent string could help distinguish Internet users from one another because it differs, often considerably, from user to user. On average, browser versions carry 10.5 bits of identifying information.^[3]
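To give a sense of scale for the figure above, the sketch below assumes that "bits of identifying information" is read as self-information, so that a value shared by a fraction p of users carries -log2(p) bits. Under that assumption, 10.5 bits corresponds to a string shared by roughly one user in 2^10.5.

```python
import math


def surprisal_bits(probability: float) -> float:
    """Self-information, in bits, of a value seen with the given probability."""
    return -math.log2(probability)


def anonymity_set_size(bits: float) -> float:
    """Number of users among whom one is expected to share the same value."""
    return 2 ** bits


print(round(anonymity_set_size(10.5)))  # about 1448
```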

Format

RFC 1945 merely requires the user agent string to consist of one or more product tokens and optional comments, where comments are enclosed in parentheses. For example, if your product were called WikiBrowser, your user agent string might be WikiBrowser/1.0 Gecko/1.0. The parts of this string are as follows:

Product name and version (WikiBrowser/1.0)
Second product token (Gecko/1.0). In this case, the token indicates the underlying software and version.

An unofficial format, based on the above, used by Web browsers is as follows: Mozilla/[version] ([system and browser information]) [platform] ([platform details]) [extensions]. For example, Safari on the iPad has used the following:

 Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405

The components of this string are as follows:

Mozilla/5.0: Previously used to indicate compatibility with the Mozilla rendering engine
(iPad; U; CPU OS 3_2_1 like Mac OS X; en-us): Details of the system in which the browser is running
AppleWebKit/531.21.10: The platform the browser uses
(KHTML, like Gecko): Browser platform details
Mobile/7B405: This is used by the browser to indicate specific enhancements that are available directly in the browser or through third parties. An example of this is Microsoft Live Meeting, which registers an extension so that the Live Meeting service knows whether the software is already installed and can provide a streamlined experience for joining meetings.
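The breakdown above can be reproduced with a small tokenizer. The following Python sketch splits a user agent string into product tokens and parenthesized comments; it is a simplification that assumes comments are not nested, and the function name is illustrative.

```python
import re

# A parenthesized comment (no nesting), or a run of non-space characters.
TOKEN_RE = re.compile(r"\(([^()]*)\)|(\S+)")


def tokenize_user_agent(user_agent: str) -> list[tuple[str, str]]:
    """Split a user agent string into ("product", ...) and ("comment", ...) parts."""
    parts = []
    for match in TOKEN_RE.finditer(user_agent):
        comment, product = match.group(1), match.group(2)
        if comment is not None:
            parts.append(("comment", comment))
        else:
            parts.append(("product", product))
    return parts


IPAD_UA = (
    "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) "
    "AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405"
)

for kind, text in tokenize_user_agent(IPAD_UA):
    print(kind, text)
```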

User agent spoofing

The popularity of various Web browser products has varied throughout the Web's history, and this has influenced the design of Web sites in such a way that Web sites are sometimes designed to work well only with particular browsers, rather than according to uniform standards by the World Wide Web Consortium (W3C) or the Internet Engineering Task Force (IETF). Web sites often include code that detects the browser version and adjusts the page design sent according to the user agent string received. This may mean that less-popular browsers are not sent complex content, even though they might be able to deal with it correctly, or, in extreme cases, are refused all content.^[4] Thus, various browsers have a feature to cloak or spoof their identification to force certain server-side content.

Other HTTP client programs, like download managers and offline browsers, also have the ability to change the user agent string.

Spam bots and Web scrapers often use fake user agents. Mainstream browsers may also present another product's identity: for example, the Android browser identifies itself as Safari in order to aid compatibility.^[5]

At times it has been popular among Web developers to initiate Viewable With Any Browser campaigns,^[6] encouraging developers to design Web pages that work equally well with any browser.

A result of user agent spoofing may be that collected statistics of Web browser usage are inaccurate.
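The effect on statistics can be sketched in Python. The Android string below is an illustrative example of the kind of identification such browsers have sent, and the log contents and classification rules are assumptions for demonstration only. A tally that merely looks for the word "Safari" attributes the Android visits to Safari.

```python
from collections import Counter

# Illustrative Android browser string that also names Safari (assumed example).
ANDROID_UA = (
    "Mozilla/5.0 (Linux; U; Android 2.2; en-us; Nexus One Build/FRF91) "
    "AppleWebKit/533.1 (KHTML, like Gecko) Version/4.0 Mobile Safari/533.1"
)

# Hypothetical access log: two Android visits and one from another client.
LOG = [ANDROID_UA, ANDROID_UA, "ExampleBrowser/1.0"]


def naive_label(user_agent: str) -> str:
    """Classify by a single keyword, as simple statistics scripts might."""
    return "Safari" if "Safari" in user_agent else "other"


def android_aware_label(user_agent: str) -> str:
    """Check for Android first; still only a heuristic."""
    if "Android" in user_agent:
        return "Android browser"
    return naive_label(user_agent)


naive_counts = Counter(naive_label(ua) for ua in LOG)
aware_counts = Counter(android_aware_label(ua) for ua in LOG)
print(naive_counts)
print(aware_counts)
```

Neither tally can detect a client that sends a deliberately false string, so both remain estimates.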

User agent sniffing

The term user agent sniffing refers to the practice of Web sites showing different content depending on the user agent that requests it. On the Internet, this results in a different site being shown when the page is browsed with a specific browser. An example is the Outlook Web Access feature of Microsoft Exchange Server 2003: when viewed with Internet Explorer 6 (or newer), it displays more functionality than the same page does in older browsers, because older browsers could not render the same content.^[citation needed] User agent sniffing is mostly considered poor practice, since it encourages browser-specific design and penalizes new browsers with unrecognized user agent identifications. Instead, the W3C recommends creating standard HTML markup,^[citation needed] which allows correct rendering in as many browsers as possible, and testing for specific browser features rather than particular browser versions or brands.^[7]

Web sites specifically targeted towards mobile phones, like NTT DoCoMo's I-Mode or Vodafone's Vodafone Live! portals, often rely heavily on user agent sniffing, since mobile browsers often differ greatly from each other. Many developments in mobile browsing have been made in the last few years,^[when?] while many older phones that do not possess these new technologies are still heavily used. Therefore, mobile Web portals will often generate completely different markup code depending on the mobile phone used to browse them. These differences can be small, e.g., resizing of certain images to fit smaller screens, or quite extensive, e.g., rendering of the page in WML instead of XHTML.

Encryption strength notations

Web browsers created in the United States, such as Netscape Navigator and Internet Explorer, use the letters U, I, and N to specify the encryption strength in the user agent string. Until 1996, when the United States government allowed encryption with keys longer than 40 bits to be exported, vendors shipped various browser versions with different encryption strengths. "U" stands for "USA" (the version with 128-bit encryption); "I" stands for "International" (the browser has 40-bit encryption and can be used anywhere in the world); and "N" stands (de facto) for "None" (no encryption).^[8] Following the lifting of export restrictions, most vendors supported 256-bit encryption.
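As a rough sketch, the notation can be read out of a user agent string with a few lines of Python. The code assumes that the letter appears as a standalone semicolon-separated field inside the first parenthesized comment, as it does in the iPad string quoted earlier; other layouts would need different handling.

```python
# Meanings of the single-letter notations, as described above.
STRENGTHS = {
    "U": "USA (128-bit)",
    "I": "International (40-bit)",
    "N": "None (no encryption)",
}


def encryption_strength(user_agent: str):
    """Return the meaning of a U/I/N field in the first comment, or None."""
    start = user_agent.find("(")
    if start == -1:
        return None
    end = user_agent.find(")", start)
    if end == -1:
        return None
    for field in user_agent[start + 1:end].split(";"):
        label = STRENGTHS.get(field.strip())
        if label is not None:
            return label
    return None


IPAD_UA = (
    "Mozilla/5.0 (iPad; U; CPU OS 3_2_1 like Mac OS X; en-us) "
    "AppleWebKit/531.21.10 (KHTML, like Gecko) Mobile/7B405"
)
print(encryption_strength(IPAD_UA))
```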

References

[1] RFC 3261, SIP: Session Initiation Protocol, IETF, The Internet Society (2002)
[2] Netnews Article Format. IETF. 2009. sec. 3.2.13. doi:10.17487/RFC5536. RFC 5536.
[3] Peter Eckersley. "Browser Versions Carry 10.5 Bits of Identifying Information on Average", Electronic Frontier Foundation, 27 January 2010. Retrieved 25 August 2011.
[4] Burstein complaining "... I've been rejected until I come back with Netscape"
[5] http://androidcommunity.com/forums/f8/android-browser-reports-itself-as-apple-safari-4701/
[6] "Viewable with Any Browser" campaign
[7] Clary, Bob (10 February 2003). "Browser Detection and Cross Browser Support". Mozilla Developer Center. Mozilla. Retrieved 2009-05-30.
[8] Zawinski, Jamie (1998-03-28). "user-agent strings (obsolete)". mozilla.org. Retrieved 2010-01-08.
