The Discussion So Far
In the prior two posts we introduced two views of the Google Privacy Sandbox. The first was a view of the specific HTML elements that were created or used to implement the Privacy Sandbox (Figure 1).
Figure 1 - The Browser with Updates for Google Privacy Sandbox
The second view began a discussion of the APIs, which I consider the true “products” of the Privacy Sandbox, along with a set of shared/supporting services that leverage the HTML elements to deliver auctions, bids, and reporting in the browser (Figure 2).
Figure 2 - Services View of the Google Privacy Sandbox
In the next series of posts, we will tie these together at a high level. This will show how the browser structures and the APIs work together to deliver each piece of Sandbox functionality. I am going to cover these in terms of which structures the APIs impact as follows:
- The main browser frame elements
- Browser storage elements
- Browser header elements
- Early discussion about permissioning
Design Principles of the Google Privacy Sandbox
At the outset of our exploration, I think it is worth restating some core design principles of the Google Privacy Sandbox that we will come back to again and again in future posts:
- Remove cookies and other forms of cross-site tracking at an individual browser level. Current privacy thinking and regulatory frameworks focus on protecting user privacy by:
- preventing the use of tools, like cookies, that can be tied together to create an identity graph in which users can be tracked site-to-site.
- preventing aggregation of behavioral data across multiple websites for the purposes of targeting or retargeting ads or other content to specific browsers.
- preventing fingerprinting of browsers, independent of cookies, that would allow the identity of an individual browser to be known, tracked, and (re)targeted.
- Keep all personally identifiable information and user activity local to the browser. As a way of implementing this principle, the Privacy Sandbox assumes that all activity data for the browser/user remains in secure storage in the browser. That is a critical mechanic for ensuring that personally identifiable data cannot be collected in a centralized repository and used to reverse-engineer a specific user's identity.
- Prevent reidentification/fingerprinting of an individual browser and its data via the mixing of browser activity data from multiple participants in auctions and bidding from within the browser itself. This is similar to the third sub-point above, but it is more subtle and critical to understanding the design of the Privacy Sandbox. If PII never leaves the browser, then the attack surface for merging data across actors in ad auctions, bidding, and reporting becomes the browser itself.
All the deep complexity of the Privacy Sandbox and its supporting services exists to ensure that such mixing of data among and between participants in ad auctions cannot occur, even if 'evil actors' want it to. There is an HTML concept of secure contexts, and a W3C specification dedicated to it. The Privacy Sandbox specifically creates secure contexts for each participant and their data so that mixing cannot occur. Like any design principle, it is unclear whether the Privacy Sandbox can, in the long term, implement its needed functionality and still maintain secure contexts at all times. The fenced frames API, for example, calls out that in early instantiations it may not be possible to completely secure a specific context. But whatever the final specification, it will surely create as small an attack surface as possible for the mixing of data by evil actors.
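As a concrete touchpoint, any page can ask the browser whether it is currently running in a secure context; APIs restricted to secure contexts are only exposed when the answer is yes. A minimal check, using the standard window.isSecureContext property:

```js
// window.isSecureContext is a standard web platform property: it is true when
// the page was delivered over HTTPS (or another secure scheme) and is
// therefore eligible for APIs restricted to secure contexts.
if (window.isSecureContext) {
  console.log('Secure context: secure-context-only APIs may be exposed here.');
} else {
  console.log('Not a secure context: those APIs are unavailable.');
}
```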
You will see this design concern woven through many of the issue threads in the Protected Audience API GitHub repository. If you want an example of the type of sophisticated attack that the Privacy Sandbox is designed to handle, see this issue thread. Don't sweat the details (unless you are a true privacy wonk and Privacy Sandbox black belt). Just get a sense of how thoughtful the W3C Protected Audiences Working Group is being about minimizing the attack surface in the browser.
One area in particular - reporting - is receiving a great deal of attention because it represents the most likely function that can accidentally recombine data to create the opportunity for cross-site tracking. In reporting, data about every winning bid, and in the future the top losing bids, from all websites where an advertiser's ads are shown is aggregated for measurement and billing purposes for both buyers and sellers. That aggregation across the volume of data collected, which for one SSP runs over 1 trillion transactions a day, potentially creates an opening for sophisticated algorithms to identify individual browser behavior across websites if great care is not taken.
- Be performant. Here's another, very important way to look at the design of the Google Privacy Sandbox. Because of the three prior design principles, the Privacy Sandbox is essentially recreating an ad server inside the browser while maintaining strict privacy. This means multiple auctions, with potentially hundreds of bids each, will be running concurrently. Not only does the Privacy Sandbox need to prevent the mixing of data across these hundreds of potential sources, it must also run the multiple auctions in parallel and deliver an ad to the browser in under ~25 ms. That is an incredibly difficult design parameter to achieve with today's browser technology, which was never designed to scale to that level of functionality.
Main Browser Frame Elements: Fenced Frames
Having laid out the core design principles of the Privacy Sandbox, let's turn to the first of the new browser elements most critical to its functions: fenced frames. The core design goal of fenced frames is to ensure that a user's identity/information from the advertiser cannot be correlated or mixed with user information from the publisher's site when an ad is served. To understand why fenced frames were necessary to the Sandbox, we need to understand the concepts of opaque URLs and storage partitions in a browser. Then we can explore why iFrames are inadequate for preventing the potential correlation of user data across publisher and advertiser.
Implementation Status
Fenced frames are not part of the current FLEDGE Origin Trial 1 (FOT #1). Instead, FOT #1 includes temporary support for rendering opaque URLs in iFrames. When fenced frames are available for testing, these opaque URLs will be renderable in fenced frames. At a later date, rendering opaque URLs in iFrames will no longer be allowed.
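For readers who want to see the shape of this, here is a minimal sketch of the FOT #1 pattern. It assumes an auctionConfig object assembled elsewhere by the seller's script and an iFrame with the id ad-frame already on the page:

```js
// During FOT #1, a Protected Audience auction resolves to an opaque URL
// (a urn:uuid string minted by the browser) rather than the real render URL.
navigator.runAdAuction(auctionConfig).then((opaqueUrl) => {
  if (opaqueUrl) {
    // The page can hand the opaque URL to an iFrame, but it cannot map the
    // value back to the winning ad's actual location.
    document.getElementById('ad-frame').src = opaqueUrl;
  }
});
```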
Browser Storage Partitions
Until recently, browsers tied specific elements on the page only to the origin from which the resource was loaded. But using origin as the single key for identification potentially enables cross-site tracking; this is essentially how third-party cookies work. The concept also applies more broadly to browser elements such as an iFrame. In the example in Figure 3, website site1 has an embedded iFrame from myvideo.com that delivers a video into the browser. The same iFrame is embedded in website site2. All myvideo.com has to do to perform cross-site tracking of a user's behavior is capture the activity information from each website. The single-key architecture also allows the embedded site to infer specific states about the user in the top-level site by using side-channel techniques such as timing attacks, XS-Leaks, and cross-site state inference attacks (don't worry about exactly how these work for now; we will cover them in a later post).
Figure 3 - Example of How an iFrame Keyed Only to Origin Allows Cross-Site Information Flow
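To make the single-key problem concrete, here is a minimal sketch of what the myvideo.com iFrame could do under unpartitioned storage. The tracking endpoint is hypothetical:

```js
// Runs inside the myvideo.com iFrame on both site1 and site2. With
// single-keyed (unpartitioned) storage, this localStorage bucket is the same
// no matter which site embeds the frame.
let visitorId = localStorage.getItem('visitor-id');
if (!visitorId) {
  visitorId = crypto.randomUUID(); // one ID, reused across every embedder
  localStorage.setItem('visitor-id', visitorId);
}

// Report the embedding site alongside the shared ID. Joining these records
// server-side yields a cross-site view of the browser.
// (https://myvideo.com/track is a hypothetical endpoint.)
navigator.sendBeacon('https://myvideo.com/track', JSON.stringify({
  id: visitorId,
  embedder: document.referrer, // e.g. "https://site1.com/"
}));
```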
Google is moving to a dual-key model with the evolution of the Google Privacy Sandbox (Figure 4). The key consists of two identifiers: the origin of the embedded element and the top-level site into which it is embedded. The information collected is thus partitioned into separate storage buckets. myvideo.com's iFrame can see the user activity in each storage partition, but it has no ability to use a single identifier to see across partitions. By this mechanic, partitioned storage prevents cross-site tracking and reduces the attack surface for side-channel techniques. There are other benefits as well, such as protecting offline data in progressive web apps, but those use cases are outside the scope of this discussion.
A second use case where partitioned storage helps (not shown) is when a publisher has multiple iFrames on their website, which often happens when there are multiple ad units on a page. Before partitioned storage, it was relatively easy for these frames to share information: iFrames from the same origin share cookies and local storage by default, regardless of where on the page they sit. This allows user activity to be tracked across different sections or embedded experiences within the page. Moreover, by writing JavaScript that targets both frames, a publisher or an evil actor can pass data directly between them. This can be used for tracking user behavior or injecting unauthorized content.
Figure 4 - Example of How an iFrame Keyed to Origin and Top-Level Site Reduces Cross-Site Information Sharing
With iFrames in partitioned storage, as in Figure 4, each partition has its own storage, including cookies and local storage. This prevents one iFrame from directly accessing data stored by another in a different partition. And while direct communication is still possible through JavaScript, it becomes more challenging because each iFrame operates within its own isolated JavaScript environment.
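Under the dual key, the very same frame code stops working as a tracker, because the browser selects a different storage bucket per embedder:

```js
// The same myvideo.com frame code as before, run after storage partitioning
// ships. Nothing in the code changes; only the bucket selection does.
// localStorage is now keyed by (frame origin, top-level site):
//   on site1: bucket ("https://myvideo.com", "https://site1.com")
//   on site2: bucket ("https://myvideo.com", "https://site2.com")
const id = localStorage.getItem('visitor-id');
console.log(id); // a different value under each embedder, so the join fails
```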
Limitations of iFrames Beyond Partitioned Storage
So, you might ask: we now have an iFrame with partitioned storage. Why is that not adequate to prevent the information leakage that allows user behavior to be tracked across publishers and/or the adTech companies that insert ads into iFrames on the publisher's page?
The problem with iFrames is that, separately from storage, they have several channels that allow them to communicate with their embedding frame. These include both direct and indirect communication channels. Although I do not want to drill deeply into technical detail, I do feel it is important to call these mechanisms out for those who wish to delve further into the topic. Direct channels include:
- postMessage. This widely used API enables cross-frame messaging, allowing data exchange between the main page and iFrames, even if they have different origins. Malicious scripts can exploit this to leak sensitive information or conduct cross-site tracking (see the sketch after this list).
- window.opener. This property provides access to the window that opened the current one, potentially leaking information or allowing manipulation of that page.
- allow attribute. This attribute specifies a permissions policy for the iFrame, controlling which powerful browser features (such as geolocation or camera access) the embedded document may use. A misconfigured policy can grant an embedded frame more access than intended.
- Shared DOM Properties. In rare cases, specific DOM properties might inadvertently be shared across iFrames, leading to vulnerabilities.
- DOM manipulation. Malicious scripts can manipulate the DOM (Document Object Model) within an iFrame to leak information or influence the behavior of other frames on the page.
- CSP (Content Security Policy). While primarily a security mechanism, a misconfigured CSP can inadvertently block legitimate communication channels, impacting functionality. Improper usage might also leak information through unintended consequences.
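As promised above, here is a minimal sketch of the postMessage channel in action. The element IDs and the relay logic are hypothetical:

```js
// Hypothetical embedder script on a publisher page with two ad iFrames.
// The page relays whatever one frame reports to the other, merging what each
// frame knows about the user.
const frameA = document.getElementById('ad-slot-1').contentWindow;
const frameB = document.getElementById('ad-slot-2').contentWindow;

window.addEventListener('message', (event) => {
  if (event.source === frameA) frameB.postMessage(event.data, '*');
  if (event.source === frameB) frameA.postMessage(event.data, '*');
});

// Inside either iFrame, sending is a one-liner:
//   window.parent.postMessage({ segment: 'in-market-autos' }, '*');
```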
Indirect channels include:
- URLs. The URL of an iFrame itself can leak information, especially if it contains query parameters or encoded data (see the sketch after this list).
- Size attributes. While primarily used for layout, attributes like width and height can be manipulated by malicious scripts to communicate information subtly. This particular item is a bit problematic because the publisher has to communicate the size attributes of the available ad unit in the bid request.
- name attribute. Although rarely used, the name attribute can potentially serve as a communication channel if exploited creatively.
- csp attribute (Content Security Policy: Embedded Enforcement, or CSPEE). This rarely used attribute can potentially be manipulated for cross-site scripting attacks if not implemented carefully.
- resize event. Although primarily used for layout adjustments, the resize event can be exploited to send data encoded in the event parameters, especially in older browsers or with less secure implementations.
- window.parent and window.top. These properties provide access to the parent and top frames respectively, enabling potential information leakage or manipulation of the main page.
- onload and other page lifecycle events. Information might be unintentionally leaked or actions triggered through event listeners attached to various page lifecycle events.
- document.referrer. This property reveals the URL of the document that referred the user to the current page, which might contain sensitive information depending on the context.
- Shared document.domain. In very rare cases, setting the document.domain property to the same value across iFrames can create unintended communication channels, leading to vulnerabilities.
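And here is the URL channel from the first bullet above in its simplest form; getFirstPartyId is a hypothetical helper on the publisher page:

```js
// Hypothetical sketch: the publisher page encodes its own first-party user
// identifier into the ad iFrame's URL as a query parameter.
const userId = getFirstPartyId(); // hypothetical helper returning the publisher's ID
const frame = document.createElement('iframe');
frame.src = 'https://ads.example.com/render?uid=' + encodeURIComponent(userId);
document.body.appendChild(frame);

// The server behind ads.example.com can now join its own data with the
// publisher's identifier for this browser, which is exactly the kind of
// channel opaque URLs are designed to close.
```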
While evil actors who are not the publisher could use these vulnerabilities to perform cross-site tracking across embedded iFrames on a single page, the more obvious vulnerability is that the publisher could, accidentally or intentionally, use these communication channels to collect data across all the iFrames on their page and compile a cross-site view of a browser across multiple advertisers. Partitioned storage alone cannot address those vulnerabilities that can occur within the top-level frame.
Fenced Frames Reduce the Communication Channel Vulnerability
This is the reason a more secure way of delivering ad content to a publisher page was needed. As a result, Google created fenced frames, which:
- Explicitly prevent communication between the fenced frame's content and its embedder on the top-frame site, except for certain information like ad sizes.
- Access storage and the network via partitions, so that no frame outside a given fenced frame document can share information with it.
- May have access to browser-managed, limited unpartitioned user data, such as a Protected Audience interest group.
A fenced frame is structured, like many other HTML elements, as a tree. The root fenced frame and any child iFrames in its tree are not allowed to use typical communication channels to talk to frames outside the tree, or vice versa. Frames within the tree can communicate with each other just like typical iFrames.
A fenced frame behaves similarly to a top-level browsing context, just embedded in another page. It can be thought of as similar to a “tab” since it has minimal communication with the top-level embedding context, is the root of its frame tree, and all the frames within the tree can communicate normally with each other.
On the other hand, fenced frames share certain properties with iFrames. Browser extensions can still access a fenced frame just as they would an iFrame; in the case of advertising, this means an ad blocker would still function against a fenced frame the way it does against an iFrame. Developer tools, accessibility features, JavaScript functionality like DOM manipulation, event handling, and asynchronous operations, and the ability to limit third-party API access all work similarly in both.
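To ground this, here is a sketch of how a fenced frame is expected to be filled from a Protected Audience auction. The auctionConfig object is assumed to be assembled elsewhere by the seller's script, and the exact API surface may shift as the specification evolves:

```html
<!-- A fenced frame element for a standard ad slot. -->
<fencedframe id="ad-slot" width="300" height="250"></fencedframe>

<script>
  // Ask the Protected Audience auction for a FencedFrameConfig rather than a
  // raw URL. The config is opaque to this page: the embedder cannot read the
  // winning render URL out of it.
  navigator.runAdAuction({ ...auctionConfig, resolveToConfig: true })
    .then((fencedFrameConfig) => {
      if (fencedFrameConfig) {
        document.getElementById('ad-slot').config = fencedFrameConfig;
      }
    });
</script>
```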
Opaque URLs as Another Means of Reducing Information Exchange in Fenced Frames
As noted above, one of the potential indirect channels for information leakage between sites is the URL of the embedded iFrame, since unique query parameters or encoded data could provide an attack surface for reconnecting the data between two or more iFrames. To deal with this potential issue, Google has taken another precaution to reduce the attack surface: making the URLs for iFrame documents opaque. This matters especially during FOT #1, since fenced frames are not yet required. Opaque URLs provide at least some protection against information leakage from the iFrame itself, and they will continue to be used for fenced frames once those are available and required.
Opaque URLs are designed to intentionally hide the underlying resource information, such as the server, path, or specific file name that a URL points to. They are typically constructed using a cryptographic hash function that transforms the original URL into a seemingly random string.
A regular URL might look something like this (a made-up product page for illustration):
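```
https://www.example-shop.com/products/shoes/blue-runner-42.html
```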
This URL reveals the server, path, and filename, potentially leaking information about the product being viewed. Its opaque version would look something like this (using a SHA-256 hash; the value below is illustrative):
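```
https://resolver.example/9b74c9897bac770ffc029102a200c5de1bc50e1cdfcf5e8c62f5b8d36e6d0c2a
```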
This URL shows a seemingly random string generated by hashing the original URL, impeding an attacker's ability to decipher the underlying resource information.
Equally important, the iFrame doesn't have direct access to the server or resource based on the opaque URL. Instead, it sends the opaque URL to a designated proxy or resolver service. This service, trusted by the browser, holds the mapping between opaque URLs and their corresponding unhashed versions. Thus, isolation between the iFrame or fenced frame and the top-level frame is enforced quite strictly and the potential for information leakage from various attack vectors is substantially reduced.
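If you want to see the hashing idea in code, here is a minimal sketch using the standard Web Crypto API. In practice the browser mints and resolves opaque URLs internally; this is purely an illustration of the transformation described above:

```js
// Derive a SHA-256 hex digest of a URL string with the Web Crypto API.
async function opaqueFor(url) {
  const bytes = new TextEncoder().encode(url);
  const digest = await crypto.subtle.digest('SHA-256', bytes);
  return Array.from(new Uint8Array(digest))
    .map((b) => b.toString(16).padStart(2, '0'))
    .join('');
}

opaqueFor('https://www.example-shop.com/products/shoes/blue-runner-42.html')
  .then((hash) => console.log(hash)); // 64 hex characters; reveals nothing about the path
```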
Fenced Frames Are Not a Perfect Solution
As noted earlier, the Privacy Sandbox may not be able to completely prevent the mixing of consumer data between advertisers and publishers, or to prevent exploits by evil actors. I'll end this post with a quote from the Fenced Frames Explainer that states the case well:
Fenced frames disable explicit communication channels, but it is still possible to use covert channels to share data between the embedder and embeddee, e.g. global socket pool limit (as mentioned in the xsleaks audit), network side channel and intersection observer as described above, etc. Mitigations to some of these are being brainstormed. We also believe that any use of these known covert channels is clearly hostile to users and undermines web platform intent to the point that it will be realistic for browsers to take action against sites that abuse them.