A web beacon is any of a number of techniques used to track who is visiting a web page. Beacons can also be used to see whether an email was read or forwarded, or whether a web page was copied to another website.
Using such beacons, companies and organizations can track the online behavior of web users. At first, the companies doing such tracking were mainly advertisers or web analytics companies; later, social media sites also started to use such tracking techniques, for instance through buttons that act as tracking beacons.
There is work in progress to standardize an interface that web developers can use to create web beacons.
The first web beacons were small digital image files that were embedded in a web page or email. The image could be as small as a single pixel, and could be of the same color as the background, or even completely transparent (thus the name “tracking pixel”). When a user opened the page or email in which such an image was embedded, the user might not see the image, but the web browser or email reader would automatically download it. This download required the user’s computer to send a request to the host company’s server, where the source image was stored. That request in turn carried identifying information about the computer, allowing the host to keep track of the user.
This basic technique has been developed further so that all sorts of elements can be used as beacons. Currently these can include visible elements such as graphics, banners or buttons, but also non-pictorial HTML elements of an email or web page, such as frame, style, script, input, link, embed and object elements.
The identifying information provided by the user's computer typically includes its IP address, the time the request was made, the type of web browser or email reader that made the request, and the existence of cookies previously sent by the host server. The host server can store all of this information, and associate it with a session identifier or tracking token that uniquely marks the interaction.
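The record described above can be sketched in a few lines. This is a minimal illustration, not any particular vendor's implementation: the function name `record_pixel_request`, the field names, and the sample header values are assumptions made for the example.

```python
import base64
from datetime import datetime, timezone

# A commonly used 1x1 transparent GIF, stored base64-encoded.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def record_pixel_request(client_ip, path, headers, now=None):
    """Build the log record a beacon host could keep for one image request.

    `headers` is a plain dict of HTTP request headers (hypothetical input
    shape; a real server framework would supply its own request object).
    """
    now = now or datetime.now(timezone.utc)
    return {
        "ip": client_ip,                               # address of the requester
        "time": now.isoformat(),                       # when the request arrived
        "path": path,                                  # which beacon was fetched
        "user_agent": headers.get("User-Agent", ""),   # browser or mail reader
        "cookie": headers.get("Cookie", ""),           # cookies set earlier by this host
    }
```

After storing the record, the host would answer with `PIXEL_GIF` so that the page or email renders normally.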
The use of framing added a new level of versatility to web beacons. Framing allows web pages to refer to content such as images or buttons or HTML elements that are located on other servers, rather than hosting this content directly on their own server. When a user sees the email or the web page, the user’s email reader or web browser prepares the referred content for display. To do so it has to send a request to the third-party server to ask it to send the referred content. As part of that request, the user's computer then has to supply identifying information to the third-party server.
This technique allows companies to embed beacons in content that they do not directly own or operate, and then use such beacons for tracking purposes. The beacons are embedded in an email or web page as images or buttons or other HTML elements, but they are hosted on a different server than the website where they are embedded, and it is to this third-party server that requests and identifying information are sent.
For instance, in the case of an advertisement that is displayed as an image on a web page, the image file would not reside on the page’s host server, but on a server belonging to the advertising company. When a user opens the page, the user's computer downloads the page from the page’s server, finds in it a reference to the image on the advertiser's server, and then requests the image from the advertiser's server. This request requires the user's computer to supply identifying information about itself to the advertiser.
This means that a third-party site, such as an advertiser, can gather information about visitors to a main site, such as a news site or a social media site, even if users do not click on the advertisement. Moreover, given that beacons are not just embedded in visible advertisements but can be embedded in completely invisible elements, a third party can gather such information even if the user is completely unaware of the third party’s existence.
Use by companies
Once a company can identify a particular user, the company can then track that user's behavior across multiple interactions with different websites or web servers. As an example, consider a company that owns a network of websites. This company could store all of its images on one particular server, but store the other contents of its web pages on a variety of other servers. For instance, each server could be specific to a given website, and could even be located in a different city. The company could nevertheless use web beacons to count and recognize individual users who visit the different websites. Rather than gathering statistics and managing cookies for each server independently, the company can analyze all this data together, and track the behavior of individual users across all the different websites, assembling a profile of each user as that user navigates these different environments.
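The cross-site analysis described here amounts to grouping beacon requests by a shared visitor identifier. The following is a rough sketch under stated assumptions: the keys `visitor`, `site` and `page`, and the sample hits, are hypothetical and stand in for whatever a real beacon log contains.

```python
from collections import defaultdict


def build_profiles(hits):
    """Group beacon hits by visitor id, keeping the order in which they arrived.

    Each hit is a dict with the hypothetical keys "visitor", "site" and "page".
    """
    profiles = defaultdict(list)
    for hit in hits:
        profiles[hit["visitor"]].append((hit["site"], hit["page"]))
    return dict(profiles)


# Hits collected by one image server on behalf of two different websites.
hits = [
    {"visitor": "u1", "site": "news.example", "page": "/politics"},
    {"visitor": "u2", "site": "shop.example", "page": "/shoes"},
    {"visitor": "u1", "site": "shop.example", "page": "/cart"},
]
profiles = build_profiles(hits)
```

Because every site's beacons point at the same server, visitor `u1` is recognized on both sites even though the pages themselves are served from different machines.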
Web beacons embedded in emails have greater privacy implications than beacons embedded in web pages. Through an embedded beacon, the sender of an email (or even a third party) can record the same sort of information as an advertiser on a website: the time the email was read, the IP address of the computer used to read it (or the IP address of the proxy server that the reader went through), the type of software used to read the email, and the existence of any cookies previously sent. In this way, the sender or third party can gather detailed information about when and where each particular recipient reads the email. Every subsequent time the email message is displayed, the same information can be sent again to the sender or third party.
Web beacons are used by email marketers, spammers and phishers to verify that an email is read. Using this system, they can send similar emails to a large number of addresses and then check which ones are valid. Valid in this case means that the address is actually in use, that the email has made it past spam filters, and that the content of the email is actually viewed.
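One way such address checking could work is to give each recipient's copy of the message a distinct image URL, so that a later request for that URL identifies which copy was opened. The sketch below is illustrative only; the host `tracker.example`, the file name `bug.gif`, the query parameter `t`, and both function names are assumptions.

```python
import secrets
from urllib.parse import parse_qs, urlencode, urlparse


def make_tracking_urls(recipients, base_url="https://tracker.example/bug.gif"):
    """Return ({token: address}, {address: image URL}) for one mailing."""
    token_to_address = {}
    address_to_url = {}
    for address in recipients:
        token = secrets.token_urlsafe(16)  # random token, unique per recipient
        token_to_address[token] = address
        address_to_url[address] = f"{base_url}?{urlencode({'t': token})}"
    return token_to_address, address_to_url


def address_for_request(requested_url, token_to_address):
    """Map a logged image request back to the recipient whose copy was opened."""
    query = parse_qs(urlparse(requested_url).query)
    token = query.get("t", [""])[0]
    return token_to_address.get(token)  # None if the token is unknown
```

The server can ignore the query string when choosing which image to return, yet still log the full URL, which is all the lookup needs.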
To some extent, this kind of email tracking can be prevented by configuring the email reader software to avoid accessing remote images. Examples of email software able to do this include the Gmail, Yahoo!, and SpamCop/Horde webmail clients; Mozilla Thunderbird, Opera, Pegasus Mail, IncrediMail, Apple Mail, later versions of Microsoft Outlook, and KMail mail readers.
However, since beacons can be embedded in email as non-pictorial elements, the email need not contain an image or advertisement or anything else related to the identity of the monitoring party. This makes detection of such emails difficult.
One way to neutralize such email tracking is to disconnect from the Internet after downloading email but before reading the downloaded messages. (Note that this assumes one is using an email reader that resides on one’s own computer and downloads the emails from the email server to one’s own computer.) In that case, messages containing beacons will not be able to trigger requests to the beacons' host servers, and the tracking will be prevented. But one would then have to delete any messages suspected of containing beacons, or risk having the beacons activate again once the computer is reconnected to the Internet.
The only way to completely avoid email tracking by beacons is to use a text-based email reader (such as Pine or Mutt), or a graphical email reader with purely text-based HTML capabilities (such as Mulberry). These email readers do not interpret HTML or display images, so their users are not subject to tracking by email web beacons. Plain-text email messages cannot contain web beacons because their contents are interpreted as display characters instead of embedded HTML code, so opening such messages does not initiate any communication.
Some email readers offer the option to disable all HTML in every message (thus rendering all messages as plain text), and this too will prevent tracking beacons from working.
More recently, many email readers and web-based email services have moved towards not loading images when opening a hypertext email that comes from an unknown sender, or that is suspected to be spam. The user must explicitly choose to load images. However, beacons can also be embedded in non-pictorial elements of a hypertext email, so blocking images alone does not rule out tracking.
Web beacons can also be filtered out at the server level so that they never reach the end user. MailScanner is an example of gateway software that can neutralize email tracking beacons for all users of a particular server.
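A filter of this kind first has to find the elements in a message that would trigger a remote request. The following sketch uses Python's standard `html.parser` module; it does not reflect how MailScanner or any specific product works, and the tag-to-attribute table is an illustrative assumption that is not exhaustive (it ignores, for example, URLs inside CSS).

```python
from html.parser import HTMLParser

# Attributes that can make a mail client fetch a remote resource on display.
# Illustrative only; a real filter would need a more complete list.
FETCHING_ATTRS = {
    "img": "src", "script": "src", "iframe": "src", "frame": "src",
    "embed": "src", "input": "src", "link": "href", "object": "data",
}
REMOTE_PREFIXES = ("http://", "https://", "//")


class RemoteReferenceFinder(HTMLParser):
    """Collect (tag, url) pairs for elements that point at remote servers."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        wanted = FETCHING_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value and value.lower().startswith(REMOTE_PREFIXES):
                self.found.append((tag, value))


def find_remote_references(html):
    finder = RemoteReferenceFinder()
    finder.feed(html)
    finder.close()
    return finder.found
```

Ordinary hyperlinks (`a` elements) are deliberately left out, since they are only followed when the user clicks them. A gateway could strip or rewrite the elements this function reports before delivering the message.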
The Beacon API
The Beacon API (application programming interface) is a candidate recommendation of the World Wide Web Consortium, the standards organization for the web. It is a standardized interface designed to allow web developers to track the activity of users without slowing down website response times. It does this by sending tracking information back to the beacon's host server after the user has navigated away from the webpage.
Use of the Beacon API allows tracking without interfering with or delaying navigation away from the site, and is invisible to the end-user. Support for the Beacon API was introduced into Mozilla's Firefox browser in February 2014 and in Google's Chrome browser in November 2014.
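In a browser, a page uses this interface by calling `navigator.sendBeacon(url, data)`, which queues an HTTP POST request to the given URL. What the receiving server does with that request is up to its operator. As a rough sketch of the receiving side, assuming the page sends a JSON string (for example `navigator.sendBeacon("/collect", JSON.stringify({event: "unload"}))`, where the `/collect` path and the payload fields are hypothetical):

```python
import json
from http.server import BaseHTTPRequestHandler


def parse_beacon_body(body):
    """Decode a beacon payload, assuming the page sent a JSON object as text."""
    try:
        data = json.loads(body.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError):
        return None
    return data if isinstance(data, dict) else None


class BeaconHandler(BaseHTTPRequestHandler):
    """Minimal receiver for beacon POST requests (illustrative only)."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = parse_beacon_body(self.rfile.read(length))
        if event is not None:
            print(self.client_address[0], event)  # stand-in for real storage
        self.send_response(204)  # no response body; the page may already be gone
        self.end_headers()
```

The handler could be served with `http.server.HTTPServer(("", 8000), BeaconHandler).serve_forever()`. It replies with an empty 204 response because the sending page does not wait for, or read, any reply.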
See also
- Facebook Beacon
- Email fraud
- Internet privacy
- Web visitor tracking
- Web analytics
- Web storage
- Local shared object
- Do Not Track