The Google Privacy Sandbox Explainer: An Introduction

Let me ask you, the reader, a very simple question:

“Do you understand how the Google Privacy Sandbox works?”

By “understand how it works” I mean could you, if asked, create a presentation for managers and investors in your company? Could you describe to your brand’s advertising or privacy groups in moderate detail how it works today? How it currently is envisioned to work when the specification and the underlying systems are completed? Can you do so in enough detail to provide your technically-savvy business teams a sense of the pieces of the platform, the basic “hooks” by which they interoperate, how information flows between advertiser and publisher, and the rationale behind the system's design?

If you can answer yes to that question, then stop reading. The information in this series of articles is too basic for you. This can be due to one of two reasons, or both. First, you may be an engineer at one of the 20 companies involved in the FLEDGE Origin Trial (FOT) #1 who works with this technology every day and attends all the regular W3C meetings that relate to the Sandbox (there are at least four taskforce meetings weekly or biweekly). Or second, you are one of the AdTech Istari (think Gandalf) who has spent months locked away, empty Red Bull cans strewn at your feet, reading by candlelight through stacks of GitHub repositories and developer guides on the Privacy Sandbox website trying to comprehend the wide array of technologies underlying this major rewrite of the ad-supported web, along with all their API endpoints and parameters.

I, however, am neither of these, and most likely neither are you. But we both need to understand the Privacy Sandbox and its impact on the products we have to build. And even though we are very technical product people, understanding the Privacy Sandbox, if you aren’t working at one of the FOT #1 firms, is well-nigh impossible. There are several reasons for this:

Google Sandbox is both a set of technologies and a set of open standards. Much like other open standards, such as Java or Linux, the specifications are being built with, by, and for the community. An open standards process, to use an analogy, is like designing the plane while you are building the plane. But when it comes to Linux or Java, there is usually a stable “production release”, including a reference implementation, that everyone can work from while they work on the next iteration of the specification and reference implementation. In the case of the Privacy Sandbox, the overall design of the core APIs is broadly specified, but the details of implementation of key aspects of the Sandbox are changing weekly as FOT members learn and give feedback. We have not yet achieved a stable V1.0.
Google Sandbox depends on a wide range of other browser-centric technologies. Just learning and internalizing these technologies is a tall order. Moreover, like the Sandbox, they have their own working groups and are evolving in parallel.
Too many groups; too little time. To keep up with current thinking you need to attend all the different W3C working groups (well, at least the core ones) related to the Sandbox. Unless you are directly engaged with FOT #1, it is hard to justify that much time. It is also just plain hard to sit in a meeting where you don't have the level of detail needed to engage or give feedback.
Distributed development across multiple teams. No one group at Google controls all the elements of the Sandbox. For example, neither the group that is evolving worklets nor the group that defines how subresource bundles work is part of the AdTech group that is responsible for the three core Google Sandbox APIs. It seems to me that there are only a few engineers at Google who can, without hesitation, stitch together a single picture of all the technical pieces that make up the Sandbox. Coming at it from outside Google (and I have spoken to people in FOT #1 who feel the same as I do), it is almost impossible to piece together that comprehensive view when so many pieces are changing on a weekly basis.
Still early days. Much of the technology, such as the Trusted Execution Environment, the Key Value Service, and the k-anonymity server, is in early alpha and untested. No one knows exactly how these components will work yet (and remember, the devil is in the details). So, while it is possible to describe in broad strokes what the likely architecture of the final Sandbox platform will be, things can still change drastically by the time an actual V1.0 implementation occurs.
Related proposals evolving weekly. Even more, there are multiple standards proposals in the GitHub repositories from one or more members of the FOT #1 community. They may or may not get implemented – so are they part of the specification or not? They may not be part of the specification, but they are part of the conversation. So it is important to know about them and understand how they could, if implemented, impact the design of the Sandbox.
Integrations with Prebid and OpenRTB remain to be reconciled. The Sandbox has to interact with other open standards like header bidding and OpenRTB from IAB. These interactions are critical to the success of use cases like bid optimization and retargeting, yet how these elements will interact with the Google Sandbox can’t be fully defined until there is a stable V1.0 available and in use. So, understanding them is also like trying to keep your aim centered on a moving target.
Testing in phases to isolate issues and limit risk of delay. Because this is such a complex problem, to minimize risk (and as a product manager I completely agree with this approach), Google in FOT #1 is testing only a limited set of in-browser functionality to make sure the “basic engine” works before it adds the more server-side elements. For example, companies in FOT #1 are allowed to use their own ad servers (under the title “Bring Your Own Server” or BYOS) which do not meet the trust requirements of the Trusted Execution Environment required under the long-term design of the specification.
Loss of "higher-level" perspective comes with deep engagement. Lastly, the Google folks and their FOT #1 partners – like the developers of any software product so complex that you have to live it 24x7 – are so deeply ensconced in the tech that it is hard for them to visualize just how tough it is for tech-savvy business people to grasp how the tech works. They have generated a HUGE amount of content to help educate the industry, and they have done yeoman’s work. But the content tends to be written by engineers for engineers involved in building to the specifications. It has also been developed piece by piece. There is no overarching outline and information flow across every aspect to ‘tell a story’ – like a book might.

So, I’ve decided to begin a series of articles on the Google Privacy Sandbox to provide a “moderately technical” overview of its elements in a “storytelling manner”. This will follow an outline that will stitch together the Sandbox from first principles and build it up piece by piece until the entire structure can be seen and understood as a unified whole. These articles are intended for product managers and other executives in AdTech who wish to understand the Sandbox and its tech at an architectural level, but who don’t want to read the specifications in the GitHub repositories or spend hours on privacysandbox.com going over developer guides. There will be two types of articles:

Architectural Articles. In each of these I will cover one aspect of the architecture and its design in its current state at the time of writing. You can discover these by selecting a keyword under the Categories tab on the main nav bar or reviewing the table of contents under the Chapters tab and clicking on sections on a specific topic.
Update Articles. These will provide updates on critical discussions at the various weekly Sandbox-related meetings at the W3C, the IAB, or that show up in the issue threads in GitHub. I obviously can’t cover all topics and many won’t be worthy of an architectural discussion, but where there are interesting elements to consider I will write about them.

As I end this intro, I want to provide a reference to all the technologies and repositories that impact the Google Sandbox for your use as the series of blog posts expands. There is so much activity related to the Sandbox, either directly or indirectly through more general web technologies, that finding what you need at any given time can be daunting. And then finding the right page in the documentation that speaks to the specific issues you are interested in on that topic – well, that is often like seeking a needle in a haystack. The List of Specifications under the Resources tab on the main nav bar is intended to be used when you need to look up or reference something in the specs or across specifications as you continue reading my posts. Any item listed in this table is either part of the Google Privacy Sandbox, one of its related services, one of its historical antecedent versions, or a related technology that is referenced in one of the specifications (and thus you need to understand it).
