Archived: Unsanctioned Web Tracking

This is a simplified archive of the page at https://www.w3.org/2001/tag/doc/unsanctioned-tracking/


Abstract

Tracking user activity on the Web using methods other than those defined for the purpose by the Web platform (“unsanctioned tracking”) is harmful to the Web, for a variety of reasons. This Finding details the TAG's stance on different forms of tracking, and how they should be addressed.

Status of This Document

This document has been produced by the W3C Technical Architecture Group (TAG). The TAG approved this finding at its July 2015 F2F. Please send comments on this finding to the publicly archived TAG mailing list www-tag@w3.org (archive).

1. Tracking Your Activity on the Web

When you use the Web, the sites you visit — including advertisements, analytics services, and other included content on them — use various tools to collect information about who you are and what you do on the site. This is very common on the Web; many sites that you browse will share what you do on them with several others — in some cases, dozens.

Collectively, tracking technologies form the basis of common Web features like shopping carts, persistent site preferences, and behavioral advertising, which allows many Web sites to fund themselves.

Some tracking mechanisms are defined by Web standards, and their design takes into account user needs for privacy and control over data flows. One of the best-known and most widespread is cookies [RFC6265]. More recently, other mechanisms such as [webstorage] have been standardized to complement cookies.

In particular, browsers provide explicit ways for you to limit when standards-defined tracking technologies are used, either directly or with extensions. For example, a privacy-conscious user can choose to use a cookie blocker, or manually delete cookies. As such, the standards-defined tracking technologies are effectively “opt out” — while they are on by default, you remain in control of them, as long as you accept that sites may not work as well (or at all) if you don't allow their use.
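As a rough illustration of why cookie-based tracking stays under user control, the following Python sketch models a site that recognizes returning visitors through a cookie and a browser that can discard it. The `Site` and `Browser` classes and the `vid` cookie name are hypothetical, invented for this example; only `http.cookies.SimpleCookie` and `uuid` are real standard-library APIs, and real browsers and servers are considerably more involved.

```python
import uuid
from http.cookies import SimpleCookie


class Site:
    """Hypothetical site that recognizes visitors by a 'vid' cookie."""

    def __init__(self):
        self.visits = {}  # visitor id -> number of requests seen

    def handle(self, cookie_header: str) -> str:
        request_cookies = SimpleCookie(cookie_header)
        if "vid" in request_cookies:
            visitor = request_cookies["vid"].value
        else:
            visitor = uuid.uuid4().hex  # no cookie: looks like a new visitor
        self.visits[visitor] = self.visits.get(visitor, 0) + 1
        return f"vid={visitor}"  # value the site would send in Set-Cookie


class Browser:
    """Hypothetical browser whose only per-site state is its cookie jar."""

    def __init__(self):
        self.jar = SimpleCookie()

    def visit(self, site: Site) -> None:
        # Send back whatever cookies are stored, then store the reply.
        header = "; ".join(f"{name}={m.value}" for name, m in self.jar.items())
        self.jar.load(site.handle(header))

    def clear_cookies(self) -> None:
        self.jar = SimpleCookie()


site, browser = Site(), Browser()
browser.visit(site)
browser.visit(site)      # recognized: same visitor id, second request
browser.clear_cookies()  # the user exercises control
browser.visit(site)      # cookie gone: the site sees a new visitor
print(len(site.visits))  # 2 distinct visitor ids
```

Because the identifier lives only in state that the browser holds and exposes, deleting that state is enough to break the link between the earlier and later visits.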

Standards-defined tracking mechanisms also have the benefit of transparency. Users can inspect cookies and other locally stored data, and user agents can notify the user when a site stores data. Tools such as Lightbeam let users who want to understand how their online activity is tracked document and visualize the use of cookies and tracking pixels.

In practice, many end users do not themselves understand the details of the local storage mechanisms and their use for tracking. However, because tracking based upon standards is visible, researchers, advocates and regulators can use tools to identify and evaluate the privacy-sensitive behavior of online tracking. This work is important input to making tools that can help users manage their privacy appropriately.

2. Unsanctioned Tracking: Tracking without User Control

However, sites also track user activity outside of these well-defined mechanisms:

  • Browser fingerprinting uses small variations in your Web browser implementation and configuration — as well as that of your computer itself — to uniquely identify it and correlate it with your activity.
  • So-called SuperCookies use implementation bugs, browser fingerprinting and other techniques to continue to identify you and correlate your activity even after you clear your cookies (e.g. “re-synchronizing” them).
  • Header enrichment is performed by some network operators who add HTTP request headers that reveal their customers' identities to the Web sites they visit.

Unlike standards-defined tracking, the operation of these unsanctioned techniques is not defined by Web standards, is not visible to users, and is not under their control. If you use the same browser to visit two different sites, it is technically possible for the sites to identify your browser and correlate your behavior between them (and any other site that they work with). While there are a few legitimate uses of such methods (e.g., combatting Denial of Service attacks, or providing greater certainty about user identity for sites such as banks), unsanctioned tracking is often used for purposes that many consider malicious.
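To make the browser fingerprinting technique described above concrete, here is a minimal Python sketch of how observable attributes might be combined into an identifier. The attribute names and values are hypothetical, and hashing a canonical JSON encoding is only one plausible approach, not a description of any particular tracker.

```python
import hashlib
import json


def fingerprint(attributes: dict) -> str:
    """Derive a stable identifier from browser-observable attributes."""
    # Sorting keys makes the result independent of collection order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


# Hypothetical values a script might read from one browser.
browser_attributes = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 10.2)",
    "screen": [1920, 1080, 24],
    "timezone_offset_minutes": -120,
    "fonts": ["DejaVu Sans", "Example Serif", "Noto Sans"],
    "plugins": ["example-pdf-viewer"],
}

# Two unrelated sites compute the identifier independently,
# even if they happen to collect the attributes in a different order.
seen_by_site_a = fingerprint(browser_attributes)
seen_by_site_b = fingerprint(dict(reversed(list(browser_attributes.items()))))
assert seen_by_site_a == seen_by_site_b  # same browser, same identifier

# A browser with one extra font gets a different identifier.
other = dict(browser_attributes, fonts=browser_attributes["fonts"] + ["Rare Font"])
assert fingerprint(other) != seen_by_site_a
```

Nothing in this sketch is stored on the user's machine, so clearing cookies or other local data leaves the identifier unchanged. That is the sense in which the technique is outside user control.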

There is ample evidence that many sites already use such unsanctioned tracking methods. For more information, see resources like Panopticlick, Evercookie, and FPDetective.

3. Why Unsanctioned Tracking is Harmful

Staying in control of personal data is important to many people, because data about a person, in particular their activity on the Web, can be used to understand how they think, work and live. Users expect that their browsing information will be kept relatively private. This trust, together with users' control over their own experience, is a fundamental part of how the Web works.

Recognizing the importance of this information in monetary terms, the World Economic Forum has classified personal data as “a new asset class” — with the implication that if you are unable to control your data, you are on the losing side of a forced transaction.

Furthermore, tracking users' activity without their consent or knowledge is also a blatant violation of the human right to privacy [udhr].

As a result, a growing body of legal, social and technical constraints has developed around the use of standards-based tracking technologies on the Web. Because these technologies are well-defined, it is possible to discuss and regulate their use, as well as build tools to understand, visualize and control them.

For example, the EU Cookie Directive regulates the use of cookies in that jurisdiction; browsers have cookie control interfaces and extensions; and researchers can plot how cookies are used on the Web.

Unsanctioned tracking, on the other hand, offers few such affordances; it is difficult (and sometimes impossible) to detect using purely technical means in the browser. It stems not from a well-defined specification, but instead from exploitation of certain aspects of how the Web works.

The aggregate effect of unsanctioned tracking is to undermine user trust in the Web itself. Moreover, if browsers cannot isolate activity between sites and offer users control over their data, they are unable to act as trusted agents for the user.

Notably, unsanctioned tracking can be harmful even if non-identifying data is shared, because it provides the linkage among disparate information streams across contextual boundaries. For example, the sharing of an opaque fingerprint among a set of unrelated online purchases can provide enough information for advertisers to determine that the user of that browser is pregnant, and hence to target her with pregnancy-specific advertisements even before she has disclosed her pregnancy.
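The linkage problem can be sketched in a few lines of Python. The site names, fingerprint values and purchases below are invented for illustration; the point is only that a shared opaque key is enough to join records that are individually non-identifying.

```python
from collections import defaultdict

# Hypothetical event logs from three unrelated shops. No record holds a name,
# email address, or account id; only an opaque fingerprint value.
shop_logs = {
    "pharmacy.example": [("fp-7c1e", "prenatal vitamins"), ("fp-02ab", "sunscreen")],
    "clothing.example": [("fp-7c1e", "maternity jeans")],
    "books.example": [("fp-7c1e", "baby name guide"), ("fp-9d44", "cookbook")],
}


def link_by_fingerprint(logs):
    """Group events from every site under the fingerprint they share."""
    profiles = defaultdict(list)
    for site, events in logs.items():
        for fp, item in events:
            profiles[fp].append((site, item))
    return dict(profiles)


profiles = link_by_fingerprint(shop_logs)
for site, item in profiles["fp-7c1e"]:
    print(site, item)
```

Each shop on its own sees a single unremarkable purchase; only the joined profile for `fp-7c1e` supports the inference described above.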

4. Limitations of Technical Solutions

We have had numerous discussions throughout the Web community about limiting the browser fingerprinting “surface area” that a browser exposes, by reducing the variability in how browsers behave. In those discussions, we have tried to consider the full span of characteristics about a user, their browser and their activities that may be tracked.

While reducing fingerprinting surface area may mitigate some kinds of unsanctioned tracking, it is inadequate to foil a determined adversary. The variety of documented techniques for browser fingerprinting, from enumerating the extensions installed in the browser to examining exactly how fonts are displayed on screens, continues to increase as new features are developed.
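One way to reason about surface area is to count bits of identifying information: an attribute value shared by a fraction p of browsers contributes -log2(p) bits. The Python sketch below applies that measure to made-up fractions. The figures are hypothetical, and summing the bits assumes the attributes are statistically independent, which is often untrue in practice, so the total is best read as an upper bound.

```python
import math


def surprisal_bits(share: float) -> float:
    """Bits of identifying information for a value seen in `share` of browsers."""
    if not 0.0 < share <= 1.0:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share)


def combined_bits(shares) -> float:
    """Total bits, ASSUMING the attributes are independent (an upper bound)."""
    return sum(surprisal_bits(s) for s in shares)


# Hypothetical fraction of browsers sharing this browser's value per attribute.
observed = {"fonts": 1 / 1024, "screen": 1 / 8, "timezone": 1 / 4}

total = combined_bits(observed.values())  # 10 + 3 + 2 = 15 bits
without_fonts = combined_bits(
    share for name, share in observed.items() if name != "fonts"
)  # 3 + 2 = 5 bits

print(total, 2 ** int(total))  # 15 bits: about one browser in 32768
print(without_fonts)           # hiding the font list leaves 5 bits
```

Under these assumed numbers, removing the font list cuts the fingerprint from 15 bits to 5, but every new feature that exposes a distinguishing value adds bits back, which is why reducing surface area alone does not stop a determined adversary.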

As an extreme example, it has now been shown possible [spy-sandbox] to “listen” to the CPU on a computer to detect mouse, network and other activity, using only some JavaScript in a Web page. This information can then be used in the machine fingerprint.

In this environment, it is impractical for specification design to eliminate fingerprinting; not only would such restriction severely hobble the capability of the Web, it would also break a substantial amount of existing content. Moreover, we cannot expect to eliminate these problems on a general-purpose system: from a theoretical perspective, eliminating browser fingerprinting is essentially the same problem as eliminating covert channels [confinement].

As a result, we cannot solve the issues that unsanctioned tracking raises through solely technical means. At times, they may be more appropriately addressed through policy (e.g., legislation and/or regulation).

5. Findings

Therefore, the TAG:

  • Finds that unsanctioned tracking is actively harmful to the Web, because it is not under the control of users and not transparent.
  • Believes that, because combatting fingerprinting is difficult, new Web specifications should take reasonable measures to avoid adding unneeded fingerprinting surface area. However, added surface area should not be a primary factor in determining whether to add a new feature.
  • Asserts that when a new feature does add fingerprinting surface area, it should be documented as such.
  • Finds that new local storage features and other potential tracking mechanisms should maintain and interoperate with existing user controls.
  • Encourages browser vendors to expose appropriate controls to users who wish to minimize their fingerprinting surface area.
  • Acknowledges that despite best efforts, technical solutions to unsanctioned tracking are not able to completely prevent its use by a determined adversary. Instead, our focus should be on making sure that unsanctioned tracking does not become “normal” on the Web.
  • Encourages policy makers to be aware that unsanctioned tracking may introduce privacy, security and consumer protection concerns within their jurisdiction, and to consider appropriate action.

The TAG is happy to provide guidance to community members who need specific advice regarding fingerprinting in their specifications.

A. References

A.1 Informative references

[RFC6265]
A. Barth. HTTP State Management Mechanism. April 2011. Proposed Standard. URL: https://tools.ietf.org/html/rfc6265
[confinement]
Butler W. Lampson. A Note on the Confinement Problem. URL: http://research.microsoft.com/en-us/um/people/blampson/11-confinement/acrobat.pdf
[spy-sandbox]
Yossef Oren; Vasileios P. Kemerlis; Simha Sethumadhavan; Angelos D. Keromytis. The Spy in the Sandbox – Practical Cache Attacks in Javascript. URL: http://arxiv.org/pdf/1502.07373v2.pdf
[udhr]
Universal Declaration of Human Rights. URL: http://www.un.org/en/documents/udhr/
[webstorage]
Ian Hickson. Web Storage (Second Edition). 9 June 2015. W3C Candidate Recommendation. URL: http://www.w3.org/TR/webstorage/