Web beacon

A web bug is an object embedded in a web page or email that unobtrusively (usually invisibly) allows checking that a user has accessed the content.^[1] Common uses are email tracking and page tagging for web analytics. Alternative names are web beacon, tracking bug, tag, or page tag. Common names for web bugs implemented through an embedded image include tracking pixel, pixel tag, 1×1 gif, and clear gif.^[2] When implemented using JavaScript, they may be called JavaScript tags.

Web bugging is analogous to conventional bugging, but is not as invasive or intrusive. The term should not be confused with the more benign web spider, nor with the more malicious computer worm.

Overview

Web bugs are a group of techniques used to track who is reading a web page or email, when, and from which computer. They can also be used to see if an email was read or forwarded to someone else, or if a web page was copied to another website. The first web bugs were small images.

Some emails and web pages are not wholly self-contained. They may refer to content on another server, rather than including the content directly. When an email client or web browser prepares such an email or web page for display, it ordinarily sends a request to the server to send the additional content.

These requests typically include the IP address of the requesting computer, the time the content was requested, the type of web browser that made the request, and the existence of cookies previously set by that server. The server can store all of this information, and associate it with a unique tracking token attached to the content request.
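The details listed above can be pictured as one log record per request. The following is a minimal sketch, not a standard log format: the function name build_log_record, the field names, and the sample IP address, browser string, and token are all made up for illustration.

```python
from datetime import datetime, timezone


def build_log_record(client_ip, path, headers, token=None):
    """Collect the details a server sees when extra content is requested.

    The field names are illustrative only. A real server would match
    header names case-insensitively; this plain dict lookup does not.
    """
    return {
        "time": datetime.now(timezone.utc).isoformat(),  # when it was requested
        "ip": client_ip,                                 # who requested it
        "path": path,                                    # what was requested
        "user_agent": headers.get("User-Agent", ""),     # type of browser
        "has_cookie": "Cookie" in headers,               # cookie set earlier?
        "token": token,                                  # unique tracking token
    }


record = build_log_record(
    "203.0.113.7",
    "/bug.gif",
    {"User-Agent": "ExampleBrowser/1.0", "Cookie": "visitor=abc123"},
    token="abc123",
)
print(record)
```

Storing records like this one, keyed by the token, is what lets the server tie separate requests to the same reader.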

On web pages

Web bugs are typically used by third parties to monitor the activity of customers at a site.^[3]

As an example of the way web bugs can make user logging easier, consider a company that owns a network of sites. This company may have a network that requires all images to be stored on one host computer while the pages themselves are stored elsewhere. It could use web bugs to count and recognize users moving between the different servers on the network. Rather than gathering statistics and managing cookies on each server separately, it can use web bugs to collect them in one place.

Tracking on web pages can be disabled using a number of techniques.

Turning off a browser's cookies can prevent some web bugs from tracking a customer's specific activity. The web site logs will still record a page request from the customer's IP address, but unique information associated with a cookie cannot be recorded. However, web site server techniques that do not use cookies can be employed to help track a site's cookie-blocking users. For example, a web site can identify a request from a new visitor and send that visitor links that pass a unique ID as a GET parameter.
Browser add-ons and extensions can be used. For example, the Ghostery add-on analyzes JavaScript to detect trackers, web bugs, pixels, and beacons.^[4]
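The cookie-free technique mentioned in the list above, passing a unique ID as a GET parameter, might look roughly like the sketch below. The function name tag_link and the parameter name "vid" are assumptions chosen for the example, not names used by any particular site.

```python
import uuid
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def tag_link(url, visitor_id):
    """Return url with a visitor ID added as a GET parameter.

    The parameter name "vid" is a made-up example.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.append(("vid", visitor_id))
    return urlunsplit(parts._replace(query=urlencode(query)))


# A fresh ID for a visitor the site has not seen before.
visitor_id = uuid.uuid4().hex
print(tag_link("http://example.com/products?page=2", visitor_id))
```

Every link the visitor follows then carries the same ID back to the server, so requests can be grouped even when cookies are turned off.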

In email

Web bugs are frequently used in email marketing as a way of determining which recipients open the email. Doing this allows marketers to know who has seen the promotion or announcement that they have sent, and allows them to back off or re-engage appropriately.

Some email web bug tracking can be disabled by:

Turning off HTML display and displaying only the text.
Turning off the display of images while still using HTML.

Implementation

Originally, a web bug was a small (usually 1×1 pixel) transparent GIF or PNG image (or an image of the same color as the background) that was embedded in an HTML page, usually a page on the web or the content of an email. Modern web bugs also use the HTML IFrame, style, script, input, link, embed, object, and other tags to track usage.^[5] Whenever the user opens the page with a graphical browser or email reader, the image or other information is downloaded. This download requires the browser to request the image from the server storing it, allowing the server to take notice of the download. As a result, the organization running the server is informed when the HTML page has been viewed.
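A rough sketch of both halves of the original image technique follows: the markup that embeds the bug, and the server-side step that notes the download before sending the image. The function names pixel_tag and serve_pixel are hypothetical, the returned tuple stands in for a real HTTP response, and the base64 string is a commonly circulated 1×1 transparent GIF.

```python
import base64
from html import escape

# A commonly used 1x1 transparent GIF, stored as base64 to keep the listing short.
PIXEL_GIF = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def pixel_tag(beacon_url):
    """Return the HTML that embeds the bug in a page or email."""
    return '<img src="{}" width="1" height="1" alt="">'.format(
        escape(beacon_url, quote=True)
    )


def serve_pixel(requested_url, log):
    """Record the request, then answer with the image.

    Returns (status, headers, body) in place of a real HTTP response.
    """
    log.append(requested_url)  # the download itself is the tracking signal
    headers = {
        "Content-Type": "image/gif",
        "Content-Length": str(len(PIXEL_GIF)),
        "Cache-Control": "no-store",  # ask clients to re-request on each view
    }
    return 200, headers, PIXEL_GIF


log = []
print(pixel_tag("http://example.com/bug.gif"))
status, headers, body = serve_pixel("/bug.gif", log)
print(status, headers["Content-Type"], log)
```

The image content is irrelevant to the tracking; what matters is that the request reaches the server and is written to the log.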

The image or other content does not have to be invisible: any element requested from the third party can be used for tracking. Typically advertisements, banners and buttons are fetched from the site to which they are connected, not from the servers of the main content. This gives the external site information about visitors to every site that includes these elements on its pages. Companies or organisations whose buttons or images are included on many sites can thus track (part of) the browsing habits of a significant share of web users. Earlier, these were mainly ad- or counter-serving companies, but buttons of social media sites have since become common.

While web bugs are used in the same way in web pages or emails, they have different purposes:

If the bug is embedded in an email, the image is requested when the user reads the email for the first time, and can also be requested every time that the user subsequently loads the email;
Whenever a web page (with or without bugs) is downloaded, the server holding the page knows and can store the IP address of the computer requesting the page; this information can therefore be retrieved from the server log files without the need for bugs. Bugs are used when the monitoring party does not have easy access to the logs of the main web server. This may be, for example, because the web site owner does not control the web servers (as with hosted sites, or "web hotels") or because monitoring is done by a third party.

As with all files transferred using the Hypertext Transfer Protocol, web bugs are requested by sending the server their URL, and possibly the URL of the page containing them (in the Referer header). Both URLs contain information that can be useful for the server:

The URL of the page containing the bug allows the server to determine which particular web page the user has accessed;
The URL of the bug can be appended with an arbitrary string in various ways while still identifying the same object; this extra information can be used to better identify the conditions under which the bug has been loaded, and can be added while sending the page or by JavaScript scripts after the download.

For example, an email sent to the address somebody@example.org can contain the embedded image of URL http://example.com/bug.gif?somebody@example.org. Whenever the user reads the email, the image at this URL is requested. The part of the URL after the question mark is ignored by the server for the purpose of determining which file to send, but the complete URL is stored in the server's log file. As a result, the file bug.gif is sent and shown in the email reader; at the same time, the server stores the fact that the particular email sent to somebody@example.org has been read. Using this system, a spammer or email marketer can send similar emails to a large number of addresses to check which ones are valid and read by the users.
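The split between the file name and the extra string in the example URL can be reproduced with Python's standard URL parser. This is only a sketch of how a log might be read afterwards; the function name opened_addresses and the sample log entries are assumptions.

```python
from urllib.parse import unquote, urlsplit


def opened_addresses(logged_urls, beacon_path="/bug.gif"):
    """Return the addresses found after the question mark in logged bug requests."""
    opened = set()
    for url in logged_urls:
        parts = urlsplit(url)
        # parts.path selects the file to send; parts.query is the extra string
        if parts.path == beacon_path and parts.query:
            opened.add(unquote(parts.query))
    return opened


sample_log = [
    "http://example.com/bug.gif?somebody@example.org",
    "http://example.com/index.html",
]
print(opened_addresses(sample_log))
```

Only the first log entry names the bug, so only the address attached to it is reported as having opened the email.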

Web bugs can be used in combination with HTTP cookies like any other object transferred using HTTP.

Email web bugs

Web bugs embedded in emails have greater privacy implications than bugs embedded in web pages. Through the use of unique identifiers contained in the URL of the web bugs, the sender of an email containing a web bug is able to record the exact time that a message was read, as well as the IP address of the computer used to read the mail or the proxy server that the user went through. In this way, the sender can gather detailed information about when and where each particular recipient reads email. Every subsequent time the email message is displayed can also send information back to the sender.
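One way to picture the unique identifiers described above is a sender that gives each recipient an opaque token and records the time and IP address whenever that token is requested. This is a sketch under assumptions: the class name OpenTracker, the "id" parameter, and the example.com URL are invented for illustration.

```python
import secrets
from datetime import datetime, timezone


class OpenTracker:
    """Map one opaque token to each recipient and record every open."""

    def __init__(self):
        self.recipient_by_token = {}
        self.opens = []  # (recipient, client_ip, time) tuples

    def beacon_url(self, recipient):
        """Return the bug URL to embed in the email sent to recipient."""
        token = secrets.token_urlsafe(16)
        self.recipient_by_token[token] = recipient
        return "http://example.com/bug.gif?id=" + token

    def record_open(self, token, client_ip):
        """Record one request for the bug; unknown tokens are ignored."""
        recipient = self.recipient_by_token.get(token)
        if recipient is None:
            return None
        event = (recipient, client_ip, datetime.now(timezone.utc))
        self.opens.append(event)
        return event


tracker = OpenTracker()
url = tracker.beacon_url("somebody@example.org")
token = url.split("id=", 1)[1]
print(tracker.record_open(token, "203.0.113.7"))
```

Because each call to record_open appends a new event, repeated displays of the same message show up as separate entries.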

Web bugs are used by email marketers, spammers, and phishers to verify that email addresses are valid, that the content of emails has made it past the spam filters, and that the email is actually viewed by users. When the user reads the email, the email client requests the image, letting the sender know that the email address is valid and that the email was viewed. The email need not contain an advertisement or anything else related to the commercial activity of the sender. This makes detection of such emails harder for mail filters and users.^[6]

Tracking via web bugs can be prevented by using email clients that do not download images whose URLs are embedded in HTML emails. Many graphical email clients can be configured to avoid accessing remote images. Examples include the Gmail, Yahoo!, and SpamCop/Horde webmail clients, and the Mozilla Thunderbird, Opera, Pegasus Mail, IncrediMail, later versions of Microsoft Outlook, and KMail mail readers. Other HTML techniques (such as IFrames) can still be used to track email viewing.
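A client or filter that wants to avoid remote fetches has to look beyond image tags. The sketch below, which assumes a simplified set of URL-carrying attributes and a made-up class name RemoteReferenceFinder, lists every tag in an HTML message that would cause an automatic request to a remote server.

```python
from html.parser import HTMLParser

# Attributes that commonly carry a URL; a simplified, non-exhaustive set.
URL_ATTRS = {"src", "href", "background", "data"}
REMOTE_PREFIXES = ("http://", "https://", "//")


class RemoteReferenceFinder(HTMLParser):
    """Collect (tag, url) pairs for content fetched from a remote server."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            return  # ordinary links are followed only when the user clicks
        for name, value in attrs:
            if name in URL_ATTRS and value:
                if value.lower().startswith(REMOTE_PREFIXES):
                    self.found.append((tag, value))


message = (
    '<p>Hello</p>'
    '<img src="http://example.com/bug.gif?id=1" width="1" height="1">'
    '<iframe src="https://example.com/frame"></iframe>'
    '<a href="http://example.com/">visit</a>'
    '<img src="cid:logo">'
)
finder = RemoteReferenceFinder()
finder.feed(message)
print(finder.found)
```

The image referenced with a cid: URL is an attachment inside the message itself, so it is not reported; the remote image and the IFrame both are.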

Text-based mail readers (such as Pine or Mutt) and graphical email clients with purely text-based HTML capabilities (such as Mulberry) do not interpret HTML or display images, so their users are not subject to tracking by email web bugs. Plain-text email messages cannot contain web bugs because their contents are interpreted as display characters instead of embedded HTML code, so opening messages does not initiate communication. Some email clients offer the option to disable all HTML in every message (thus rendering all messages as plain text), which prevents any web bugs from loading.

Many modern email readers and web-based email services will not load images when opening an HTML email that comes from an unknown sender or is suspected to be spam. The user must explicitly choose to load images. Web bugs can also be filtered out at the server level so that they never reach the end user. MailScanner is an example of gateway software that can disarm IFrames as well as web bugs. Disconnecting from the Internet before reading any downloaded messages and then deleting those messages suspected of containing web bugs before reconnecting may also eliminate the threat.

A hosts file or a filtering web proxy can be used to specify that some servers are never to be contacted for any reason. Such a list must be continually updated, because new tracking servers are periodically brought online and old ones are re-purposed to serve legitimate content.
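The effect of such a blocklist can be sketched as a lookup against hosts-file style entries. The host names tracker.example and beacons.example are hypothetical, as are the function names; a real hosts file is consulted by the operating system's resolver rather than by application code like this.

```python
from urllib.parse import urlsplit

# Hypothetical hosts-file entries that point tracking servers at a null address.
HOSTS_TEXT = """
# tracking servers (example names only)
0.0.0.0 tracker.example
0.0.0.0 beacons.example  # added later
"""


def blocked_hosts(hosts_text):
    """Return the host names mapped to a null or loopback address."""
    blocked = set()
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split fields
        if len(fields) >= 2 and fields[0] in ("0.0.0.0", "127.0.0.1"):
            blocked.update(name.lower() for name in fields[1:])
    return blocked


def is_blocked(url, blocked):
    """Return True if the URL's host is on the blocklist (exact match only)."""
    host = urlsplit(url).hostname
    return host is not None and host in blocked


blocked = blocked_hosts(HOSTS_TEXT)
print(is_blocked("http://tracker.example/bug.gif", blocked))
print(is_blocked("http://example.com/index.html", blocked))
```

As in a hosts file, matching is by exact name, so a tracker that moves to a new host name is not caught until the list is updated.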

As web bugs require the email software to fetch the content, they have never been able to accurately count read rates for email campaigns. As a result of the above-mentioned measures, they may become still less effective.

Disposition-Notification-To email headers may be seen as another form of web bug. See RFC 4021.

References

[1] Stefanie Olsen (July 12, 2000). "Nearly undetectable tracking device raises concern". CNET News. Retrieved July 12, 2012.
[2] Richard M. Smith (November 11, 1999). "The Web Bug FAQ". EFF.org Privacy Archive. Retrieved July 12, 2012.
[3] http://www.mailsbroadcast.com/email.bolts.nuts/about.web.bugs.htm
[4] David Cancel; Felix Shnir (July 2, 2012). "Ghostery". Add-ons for Firefox. mozilla.org. Retrieved July 12, 2012.
[5] Ed Felten (May 25, 2004). "Email Tracking: It Gets Worse". Freedom to Tinker. Center for Information Technology Policy. Retrieved July 12, 2012.
[6] David Berlind (September 26, 2006). "Have you received any 'traceable' PattyMail recently?". ZDNet. Retrieved July 12, 2012.

External links

The Web Bug FAQ from EFF
Did they read it? from the Linux Weekly News
Trojan Marketing
Slashdot on Web Bugs—Slashdot.org Forum on Blocking Web Bugs
"Have you received any 'traceable' PattyMail recently?"—David Berlind, ZDNET
