Work is in progress to standardize an interface that web developers can use to asynchronously transfer small amounts of HTTP data from the user agent to a web server. The specification calls these transfers simply beacons (in the context of web development). They can be used to send data to a web server before the document is unloaded, without delaying the unload or affecting the perceived page-load performance of the next navigation.
A web beacon is any of a number of techniques used to track who is reading a web page or email, when, and from which computer. Web beacons can also be used to see if an email was read or forwarded to someone else, or if a web page was copied to another website. The first web beacons were small images.
Some emails and web pages are not wholly self-contained. They may refer to content on another server, rather than including the content directly. When an email client or web browser prepares such an email or web page for display, it ordinarily sends a request to the server to send the additional content.
These requests typically include the IP address of the requesting computer, the time the content was requested, the type of web browser that made the request, and the existence of cookies previously set by that server. The server can store all of this information, and associate it with a unique tracking token attached to the content request.
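As a minimal sketch of the logging described above, the following Python function assembles the kind of record a server could keep for one content request. The function name, field names, and sample values are illustrative assumptions and do not come from any particular server software.

```python
from datetime import datetime, timezone
from http.cookies import SimpleCookie


def build_request_record(remote_addr, headers, token, now=None):
    """Collect the details a server can observe from a single content request."""
    cookies = SimpleCookie(headers.get("Cookie", ""))
    return {
        "token": token,  # unique tracking token attached to the request
        "ip": remote_addr,  # IP address of the requesting computer
        "time": (now or datetime.now(timezone.utc)).isoformat(),
        "user_agent": headers.get("User-Agent", ""),  # browser or mail client
        "cookies": {name: morsel.value for name, morsel in cookies.items()},
    }


# Hypothetical request: the address, headers, and token are made up.
record = build_request_record(
    "203.0.113.7",
    {"User-Agent": "ExampleBrowser/1.0", "Cookie": "visitor=abc123"},
    token="msg-42",
)
print(record)
```

Everything in the record comes from the request itself or from the server's clock, so the requesting user does not have to take any action beyond displaying the content.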
On web pages
As an example of the way web beacons can make user logging easier, consider a company that owns a network of sites. This company may have a network that requires all images to be stored on one host computer while the pages themselves are stored elsewhere. They could use web beacons in order to count and recognize users traveling around the different servers on the network. Rather than gathering statistics and managing cookies on all their servers separately, they can use web beacons to keep them all together.
Originally, a web beacon was a small (usually 1×1 pixel, transparent) GIF or PNG image (or an image of the same color as the background) that was embedded in an HTML page, usually a page on the web or the content of an email. Modern web beacons also use the HTML IFRAME, style, script, input, link, embed, object, and other tags to track usage. Whenever a user opens a page with a graphical browser or email reader, the image or other content is downloaded. This download requires the browser to send a request to the server storing that image or content, allowing the organization running that server to track when the HTML page is viewed.
Images and other content do not have to be invisible: any element can be used for tracking. Typically, advertisements, banners and buttons are fetched from the site of the company that provides them, not from the main site. This allows the third-party site to gather information about visitors whenever they load HTML content from the main site. Companies or organisations whose buttons or images are included on many sites can thus track (part of) the browsing habits of a significant share of web users. Earlier, these were mainly ad-serving or counter-serving companies, but buttons of social media sites are now becoming common.
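The image-based technique can be sketched as follows. This is an illustration under stated assumptions: the host `tracker.example.com`, the `id` parameter, and the function name are hypothetical. The byte string is one commonly used encoding of a 1×1 transparent GIF, not the only possible one.

```python
import base64
from html import escape
from urllib.parse import urlencode

# A commonly used 42-byte, 1x1-pixel transparent GIF, base64-encoded.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def beacon_img_tag(beacon_url, token):
    """Return an <img> tag that makes the browser request the beacon image."""
    src = f"{beacon_url}?{urlencode({'id': token})}"
    return f'<img src="{escape(src, quote=True)}" width="1" height="1" alt="">'


# Hypothetical beacon location and page identifier.
tag = beacon_img_tag("https://tracker.example.com/pixel.gif", "page-17")
print(tag)
print(len(PIXEL_GIF), PIXEL_GIF[:6])
```

The tag is placed in the page or email; the server at the beacon URL answers each request with the pixel bytes and records the request.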
While web beacons are used in the same way in web pages or emails, they have different purposes:
- If the beacon is embedded in an email, the image is requested when the user reads the email for the first time, and can also be requested every time that the user subsequently loads the email;
- Whenever a web page (with or without beacons) is downloaded, the server holding the page knows and can store the IP address of the computer requesting the page; this information can therefore be retrieved from the server log files without any need for beacons. Beacons are used when the monitoring party does not have easy access to the logs of the main web server, and also because IP addresses are often shared amongst multiple users. This may happen when a website owner does not control its web servers (such as in web hotels), when monitoring is done by a third party, or when a greater level of detail needs to be recorded than is possible from web log analysis alone.
As with any files transferred using the Hypertext Transfer Protocol, web beacons are requested by sending the server their URL, and possibly the URL of the page containing them. Both contain information that can be useful for the gatherer:
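- The URL of the web page with the beacon allows a server to determine which page is accessed;
- The URL of the beacon itself can carry additional information, such as a string after a question mark, that identifies the circumstances in which the beacon was requested.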
For example, an email sent to the address email@example.com can contain the embedded image of URL http://firstname.lastname@example.org. Whenever the user reads the email, the image at this URL is requested. The part of the URL after the question mark is ignored by the server for the purpose of determining which file to send, but the complete URL is stored in the server's log file. As a result, the file bug.gif is sent and shown in the email reader; at the same time, the server stores the fact that the particular email sent to email@example.com has been read. Using this system, a spammer or email marketer can send similar emails to a large number of addresses to check which ones are valid and read by the users.
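The way a server can ignore the query string when choosing a file, yet still log it, can be sketched in Python. The URL below is a hypothetical stand-in, not the one from the example above, and the `recipient` parameter name and the in-memory log are assumptions made for illustration.

```python
import posixpath
from urllib.parse import parse_qs, urlsplit

access_log = []  # stands in for the server's log file


def handle_image_request(url):
    """Pick the file from the path alone, but log the complete URL."""
    parts = urlsplit(url)
    filename = posixpath.basename(parts.path)  # the query string plays no role here
    access_log.append(url)  # full URL, including the query string
    return filename


# Hypothetical beacon URL; the query string identifies the recipient.
url = "http://example.com/bug.gif?recipient=somebody%40example.org"
served = handle_image_request(url)
recipient = parse_qs(urlsplit(access_log[-1]).query)["recipient"][0]
print(served, recipient)
```

Whoever reads the log can recover the recipient from the stored URL, even though every recipient was sent the same image file.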
Web beacons can be used in combination with HTTP cookies, like any other object transferred over HTTP.
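A minimal sketch of that combination, assuming a hypothetical cookie named `visitor`: the beacon server reuses the identifier if the request already carries the cookie, and issues a new one otherwise. The function name and cookie attributes are illustrative choices.

```python
import uuid
from http.cookies import SimpleCookie


def visitor_id_for_beacon(cookie_header):
    """Return (visitor_id, set_cookie_value); the second item is None for known visitors."""
    cookies = SimpleCookie(cookie_header or "")
    if "visitor" in cookies:
        return cookies["visitor"].value, None
    visitor_id = uuid.uuid4().hex
    fresh = SimpleCookie()
    fresh["visitor"] = visitor_id
    fresh["visitor"]["path"] = "/"
    # Value suitable for a Set-Cookie header on the beacon response.
    return visitor_id, fresh["visitor"].OutputString()


first_id, set_cookie = visitor_id_for_beacon(None)
second_id, nothing = visitor_id_for_beacon(f"visitor={first_id}")
print(first_id == second_id, set_cookie)
```

Because the cookie is returned with every later beacon request to the same server, requests from different pages can be attributed to the same browser.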
Email web beacons
Web beacons embedded in emails have greater privacy implications than beacons embedded in web pages. Through the use of unique identifiers contained in the URL of the web beacons, the sender of an email containing a web beacon is able to record the exact time that a message was read, as well as the IP address of the computer used to read the mail or of the proxy server that the user went through. In this way, the sender can gather detailed information about when and where each particular recipient reads email. Each subsequent display of the message can also send information back to the sender.
Web beacons are used by email marketers, spammers, and phishers to verify that email addresses are valid, that the content of emails has made it past the spam filters, and that the email is actually viewed by users. When the user reads the email, the email client requests the image, letting the sender know that the email address is valid and that the email was viewed. The email need not contain an advertisement or anything else related to the commercial activity of the sender. This makes detection of such emails harder for mail filters and users.
Tracking via web beacons can be prevented by using email clients that do not download images whose URLs are embedded in HTML emails. Many graphical email clients can be configured to avoid accessing remote images. Examples include the Gmail, Yahoo! and SpamCop/Horde webmail clients, and the Mozilla Thunderbird, Opera, Pegasus Mail, IncrediMail, Evolution, Apple Mail, later versions of Microsoft Outlook, and KMail mail readers. However, other HTML techniques (such as IFrames) can still be used to track email viewing.
Text-based mail readers (such as Pine or Mutt) and graphical email clients with purely text-based HTML capabilities (such as Mulberry) do not interpret HTML or display images, so their users are not subject to tracking by email web beacons. Plain-text email messages cannot contain web beacons because their contents are interpreted as display characters instead of embedded HTML code, so opening messages does not initiate communication. Some email clients offer the option to disable all HTML in every message (thus rendering all messages as plain text), which prevents any web beacons from loading.
Many modern email readers and web-based email services will not load images when opening an HTML email that comes from an unknown sender or is suspected to be spam. The user must explicitly choose to load images. Web beacons can also be filtered out at the server level so that they never reach the end user. MailScanner is an example of gateway software that can disarm IFrames as well as web beacons. Disconnecting from the Internet before reading any downloaded messages, and then deleting the messages suspected of containing web beacons before reconnecting, may also eliminate the threat.
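The kind of check a client or gateway could apply can be sketched with Python's standard HTML parser. This is a simplified illustration and not how MailScanner or any particular mail client works: the class and function names are hypothetical, and only the `src` attribute of a few tags is inspected, so other techniques such as remote style sheets are not covered.

```python
from html.parser import HTMLParser


class RemoteContentFinder(HTMLParser):
    """Collect URLs that a mail client would fetch from a remote server."""

    WATCHED = {"img", "iframe", "script", "embed"}

    def __init__(self):
        super().__init__()
        self.remote_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.WATCHED:
            return
        src = dict(attrs).get("src") or ""
        # Content referenced by cid: or data: is carried inside the message itself.
        if src.lower().startswith(("http://", "https://", "//")):
            self.remote_urls.append((tag, src))


def find_remote_content(html_body):
    finder = RemoteContentFinder()
    finder.feed(html_body)
    finder.close()
    return finder.remote_urls


# Hypothetical message body with one attached image and two remote references.
sample = (
    '<p>Hello</p><img src="cid:logo">'
    '<img src="https://tracker.example.com/bug.gif?id=42" width="1" height="1">'
    '<iframe src="http://tracker.example.com/frame"></iframe>'
)
print(find_remote_content(sample))
```

A filter built on such a list could decline to fetch the flagged URLs, or remove the elements before the message is displayed.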
The Beacon API is a newer technique which does not require the use of invisible images or similar tactics. It is a candidate recommendation of the World Wide Web Consortium, the standards organization for the web. It is designed to allow web developers to track the activity of users by sending information (such as analytics or diagnostic data) back to the web server once the user navigates away from the page. Use of the Beacon API allows tracking without interfering with or delaying navigation away from the site, and the tracking is invisible to the end user. Support for the Beacon API was introduced into Mozilla's Firefox browser in February 2014 and into Google's Chrome browser in November 2014.
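In the browser, a page hands data to the Beacon API by calling `navigator.sendBeacon(url, data)`, which queues an HTTP POST request. The receiving side can be sketched in Python as follows. The JSON payload format, the field names, and the in-memory store are assumptions made for this sketch, and a real receiver would also need to handle malformed payloads.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # in-memory stand-in for an analytics store


def parse_beacon_body(body):
    """Decode a beacon payload, assuming the page sent a UTF-8 JSON string."""
    return json.loads(body.decode("utf-8"))


class BeaconHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        received.append(parse_beacon_body(self.rfile.read(length)))
        self.send_response(204)  # No Content: a response body would go unused
        self.end_headers()


def serve(port=8000):
    """Run the receiver until interrupted (defined here but not called)."""
    HTTPServer(("127.0.0.1", port), BeaconHandler).serve_forever()


# Hypothetical payload, as a page might send when the user leaves it.
print(parse_beacon_body(b'{"event": "unload", "seconds_on_page": 31}'))
```

Since the page is not expected to read the reply, an empty response is a reasonable choice for the receiver.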
- Facebook Beacon
- Email fraud
- Internet privacy
- Web visitor tracking
- Web analytics
- Web storage
- Local shared object
- Do Not Track