Introduction
We now move into the last two topics before we leave the browser side of the Privacy Sandbox behind: HTTP headers and browser permissions. We already did a quick review of HTTP headers in the post The Big Picture and Core Browser Elements. In this post we will delve a bit further, although not into a complete review of all standard HTTP headers, which are too numerous for the scope of this post and not really needed to understand the Sandbox. In the next post, we will talk about another unique element of the Google Privacy Sandbox: the User Agent Client Hints API. User Agent Client Hints is based on, but is a separate specification from, the more general Client Hints API.
User Agent Client Hints is a collection of HTTP and user-agent features that enables privacy-preserving, proactive content negotiation between a browser and a server. These features allow the browser to control what information can be shared, and with which sites, via an explicit cross-origin delegation mechanism. Once again, in case you have forgotten this by now (is that even possible?), one critical design feature of the Google Privacy Sandbox is to avoid cross-site reidentification of a user or user agent ID. User Agent Client Hints is designed to limit various forms of browser fingerprinting that could be used for such cross-site reidentification.
What are HTTP Headers
HTTP headers are an integral part of the HTTP protocol as it exists today. Headers carry essential information to and from the user agent so that the server side and the browser can communicate effectively. Headers like the user agent header allow the server to send back the right configuration to render the web document correctly on a particular device/browser combination. Others describe the server from which the response came. Others allow or prevent cross-origin resource sharing. Others handle security, mitigating risks like cross-site scripting attacks or clickjacking, among many other vulnerabilities. Those are just a few examples of the range of functions that headers provide in the back and forth between user agent and server side. Most importantly from the perspective of the Sandbox, standards groups can define custom HTTP headers unique to their protocol that can be used to support applications they wish to deploy in the browser.
In many cases, custom HTTP headers like those used in the Google Privacy Sandbox are developed in conjunction with a parallel ability to perform the same function using JavaScript. The reason for this is that JavaScript functions can slow web page response and rendering times, and many publishers do not like cluttering their pages with JavaScript tags. In fact, if you look at adTech history, one of the main reasons that supply-side platforms (SSPs) emerged early on was that publishers didn’t want to add JavaScript tags to their website from every advertiser or demand-side platform (DSP) they interacted with. SSPs required only one tag on the publisher’s website to handle any and all ad requests. Moreover, performance of the code itself becomes critical when you are dealing with an application requiring response times under 100 ms. Headers are often an alternate approach that provides higher performance.
As an example of the conversations that go on around this issue, here is an excerpt from the July 3 meeting notes of the Protected Audience API Working Group about creating HTTP headers to replace core JavaScript calls in the Protected Audience API:
[Yao Xiao] Basically what happens today - the way tagging works - on the advertiser side we inject the iFrame and the server side returns a second response… But there are performance issues around iFrames and, equally, we have to make sure the tag supports the joinAdInterestGroup() API, which is a JavaScript API. But there are companies/users that don’t want to support a JavaScript API, they want a header-based solution instead. We have already done something like this for attribution reporting API and shared storage API. If we are going to move to the header-based approach above, we want to provide header-based support for all three endpoints - joinAdInterestGroup, leaveAdInterestGroup(),.....
[Isaac Foster] Ability to create interest groups via header - doing the light shell with refresh would be highly valued. Publishers are always hesitant to add JavaScript to their page.
How HTTP Headers are Structured
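HTTP headers consist of two parts. The first part is the key (the header name). The second part is the information/value to be communicated. The key and the value in the key/value pair are separated by a colon. If a value carries additional parameters, each parameter is separated from the value, and from the other parameters, by a semicolon. Headers that carry a list of values typically separate the values with commas. Here are two simple examples: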
Content-Type: text/html; charset=UTF-8
Content-Type: multipart/form-data; boundary=sample
The sender includes these headers as part of the header section of the HTTP message.
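To make the key/value/parameter structure concrete, here is a minimal Python sketch that splits the two example headers above into their parts. The function name parse_header_line is hypothetical, and the parsing is deliberately naive: it assumes no quoted semicolons or other edge cases that a real HTTP parser has to handle.

```python
def parse_header_line(line: str):
    """Split 'Name: value; param=x' into (name, value, params).

    Naive illustration only: assumes no semicolons inside quoted strings.
    """
    # The first colon separates the key (header name) from the value.
    name, _, raw_value = line.partition(":")
    # Semicolons separate the main value from its parameters.
    parts = [p.strip() for p in raw_value.split(";")]
    value = parts[0]
    params = {}
    for part in parts[1:]:
        if not part:
            continue
        key, _, val = part.partition("=")
        params[key.strip().lower()] = val.strip().strip('"')
    return name.strip(), value, params


for line in (
    "Content-Type: text/html; charset=UTF-8",
    "Content-Type: multipart/form-data; boundary=sample",
):
    print(parse_header_line(line))
```

In both examples the key is Content-Type, the value is the media type, and charset and boundary are parameters attached to that value.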
Types of HTTP Headers
Figure 1 is a table showing the types of HTTP headers, what they are used for (generally), the restrictions on them, and examples of both standard headers and calls created to support the Google Privacy Sandbox. I am not going to drill further into the different header calls as, again, it isn’t necessary to understand the implications for the Sandbox. I will probably write a tech brief later to go through all headers by category so readers will have that as a resource. The main thing is to understand the example - how the Sandbox has created a variant of a type of header for a specific purpose of supporting its functionality.
Figure 1 - Types of Headers, Their Properties, and Examples
How Headers Work: A Generic Example
Given that we have talked a great deal about browser storage, it would be natural to ask “Where are headers stored in the browser?” In Figure 1, there is a mention of a limit on the length of a single header in Chrome of 4,096 bytes, and a total of 250 KB across all headers from all websites and web pages. That is not a great deal of space to provide in the browser, especially if, like me, you keep over 100 tabs open concurrently. While there is no theoretical limit on the number of headers you could fit within the total size limit (1,000 headers of 250 bytes each is technically possible), filling it is highly impractical. Most websites use a reasonable number of headers (typically fewer than 50). Exceeding that could lead to performance issues and compatibility problems.
So if that is the case, how do headers actually work? Are they stored and, if so, for how long?
What I will do in this section is first talk about the generic mechanism for how to think about header processing and then I will give an example around a specific header.
Figure 2 shows the generic flow for a header request-response cycle. The small black-and-white boxes with stubs represent the RAM for the device to which they are attached.
Figure 2: A Generic Request/Response Header Flow
Step 1: When the user agent makes a call to a website, in this case www.example.com, the user agent builds the request headers in the client’s memory based on the URL, cookies, and other relevant information.
Step 2: The user agent sends the request headers along with the request data to the server.
Step 3: The server receives the request and stores it in memory for processing.
Step 4: The server prepares the response headers and the payload for the user agent.
Step 5: The server sends the response header along with the appropriate payload to the user agent.
Step 6: The server deletes the request and its response headers from the server memory.
Step 7: Upon receiving the response, the user agent stores the response headers in client memory for processing.
Step 8: The user agent uses the response headers to understand the content type, status code, and other crucial details, and the payload is displayed or used by the web page as needed.
Step 9: Once the request-response cycle completes, the user agent discards the headers from memory to free up resources.
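The nine steps above can be sketched as a small in-process simulation. This is only an illustration under stated assumptions: no network is involved, the function names (build_request, parse_head, handle_request) are hypothetical, and "memory" is simply Python variables that go out of scope or are deleted once processing finishes.

```python
def build_request(host: str, path: str, cookies: dict) -> bytes:
    # Step 1: the user agent assembles the request headers in memory.
    headers = {"Host": host, "Accept": "text/html"}
    if cookies:
        headers["Cookie"] = "; ".join(f"{k}={v}" for k, v in cookies.items())
    lines = [f"GET {path} HTTP/1.1"] + [f"{k}: {v}" for k, v in headers.items()]
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


def parse_head(message: bytes):
    # A blank line separates the header section from the payload.
    head, _, body = message.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("ascii").split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body


def handle_request(raw_request: bytes) -> bytes:
    # Steps 3-5: the server holds the request in memory, prepares the
    # response headers and payload, and returns them together.
    _, headers, _ = parse_head(raw_request)
    body = f"<p>Hello from {headers['host']}</p>".encode("utf-8")
    response_head = (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html; charset=UTF-8\r\n"
        f"Content-Length: {len(body)}\r\n\r\n"
    ).encode("ascii")
    # Step 6: the parsed request headers are locals and vanish on return.
    return response_head + body


# Steps 1-2: build and "send" the request.
raw_request = build_request("www.example.com", "/", {"sessionId": "abc123"})
# Steps 3-6 happen on the server side.
raw_response = handle_request(raw_request)
# Step 7: the user agent holds the response headers in memory.
status_line, response_headers, payload = parse_head(raw_response)
# Step 8: the headers tell the user agent how to treat the payload.
print(status_line, response_headers["content-type"], payload.decode("utf-8"))
# Step 9: the headers are discarded once processing is complete.
del response_headers
```

Nothing in this flow writes the headers anywhere durable; they exist only for the lifetime of the request-response cycle.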
Whatever we do with headers in terms of taking in meta-information that is then used to process and return data to a client, in most cases storage is not an issue. Headers are not stored on the client but rather held in memory, and then only until processing of the headers is completed. At that point the header is discarded, making room for subsequent requests and responses.
This does not mean that data carried in HTTP headers is never stored on either the user agent or the server side. The Set-Cookie header is a good example of this. The line below is a response header that causes the user agent to store a cookie in the Cookies SQLite file on the user agent’s local machine.
Set-Cookie: sessionId=abc123; Expires=Mon, 21 Oct 2024 07:28:00 GMT; Path=/
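As a rough illustration of what the user agent extracts from such a header before persisting it, the sketch below parses the value with Python's standard http.cookies module. This is not how a browser is implemented; it only shows which pieces of the header (name, value, expiry, path) end up as stored data. Note that 21 Oct 2024 falls on a Monday, which is the weekday used here.

```python
from http.cookies import SimpleCookie

# SimpleCookie.load() takes the header value, without the "Set-Cookie:" key.
cookie = SimpleCookie()
cookie.load("sessionId=abc123; Expires=Mon, 21 Oct 2024 07:28:00 GMT; Path=/")

morsel = cookie["sessionId"]
print(morsel.value)        # the cookie's value
print(morsel["expires"])   # when the stored cookie should be dropped
print(morsel["path"])      # which URL paths the cookie applies to
```

The header itself is discarded after processing; only the parsed cookie attributes outlive the request-response cycle.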
Headers like Cache-Control, Expires, and ETag are used to control caching behavior. These headers can lead to the storage of responses in the browser cache or intermediary caches.
- Cache-Control: This header can specify directives for caching mechanisms in both requests and responses. For example, Cache-Control: max-age=3600 indicates that the response can be cached for 3600 seconds.
Cache-Control: max-age=3600
- Expires: This header provides an absolute date/time after which the response is considered stale.
Expires: Mon, 21 Oct 2024 07:28:00 GMT
- ETag: This header is used for cache validation. It allows the server to identify if the cached version of a resource matches the current version.
ETag: "686897696a7c876b7e"
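The sketch below shows, in simplified form, how a cache might use two of these headers. The helper names are hypothetical, the freshness check ignores every directive other than max-age, and the validation step assumes the client echoes its cached ETag back in an If-None-Match request header (a standard header not discussed above).

```python
def max_age_seconds(cache_control: str):
    """Return the max-age value from a Cache-Control header, or None."""
    # Cache-Control directives are comma-separated, e.g. "public, max-age=3600".
    for directive in cache_control.split(","):
        name, _, value = directive.strip().partition("=")
        if name.lower() == "max-age" and value.isdigit():
            return int(value)
    return None


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """A cached response is fresh while its age is below max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age


def revalidate(if_none_match: str, current_etag: str) -> int:
    """Server side: 304 if the client's cached ETag still matches, else 200."""
    return 304 if if_none_match == current_etag else 200


print(is_fresh("max-age=3600", 120))    # cached two minutes ago
print(is_fresh("max-age=3600", 7200))   # cached two hours ago
print(revalidate('"686897696a7c876b7e"', '"686897696a7c876b7e"'))
```

A 304 status tells the user agent to keep using the copy already in its cache, so the payload is not sent again.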
There are numerous other headers that cause data to be stored in cache or on the local client. These are just a few examples to give you a sense of the range of ways a header can use or store data locally before it is deleted from client or server memory.
How Headers Work: The Content-Type Header
Now let’s drill into a specific example of how headers are processed. We will use a very common response header, the Content-Type header, as an example (Figure 3). The Content-Type header specifies the original media type of a resource before content encoding. It ensures proper interpretation by the client and helps reduce the likelihood of a cross-site scripting attack.
Figure 3 - Request and Response for the Content-Type Header
The right hand side of Figure 3 shows a server with a resource - in this case a document - that is stored in multiple languages (English, French, Spanish), with multiple formats (html or pdf), with multiple potential encodings (gzip, br, compress). Encodings are compression algorithms used to reduce the amount of data that needs to be transferred over the network. As the diagram shows, there are three versions of the content: a URL for English (URL/en), for French (URL/fr), and a URL for Spanish (URL/sp).
On the left hand side of the diagram is the client that wants to retrieve the English version of the pdf for download. That information is sent in the request headers to the server, letting it know which content type the client needs, the desired language of the content, and the types of content encoding that the user agent can process. In standard HTTP these preferences travel in the Accept, Accept-Language, and Accept-Encoding request headers, respectively.
The server finds the correct content type in the correct language and sends it back using br content encoding, along with headers that indicate what it has sent back (pdf in English, encoded using br). Each line item in the response is a single response header, with the Content-Type header indicating that the server is returning a pdf. After completing the send, the server deletes the original request and the response headers from memory.
When the browser receives that response along with the response header, it uses the information in the response header to use the correct decompression algorithm and then display the English version of the pdf in a browser-based pdf viewer. Once the page is displayed, the user agent deletes the response header.
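The server-side selection described above can be sketched as follows. This is a simplified model rather than a real server: the function names are hypothetical, the variant lists mirror the figure (with "es" assumed as the language tag for Spanish), and the q-value parsing covers only the common "value;q=0.8" form.

```python
def preferences(header_value: str) -> list:
    """Parse 'a, b;q=0.8' into values ordered by descending q weight."""
    weighted = []
    for item in header_value.split(","):
        value, _, params = item.strip().partition(";")
        q = 1.0  # a missing q parameter means full preference
        name, _, weight = params.strip().partition("=")
        if name.strip() == "q" and weight:
            q = float(weight)
        weighted.append((q, value.strip()))
    # Stable sort: ties keep the order in which the client listed them.
    weighted.sort(key=lambda pair: pair[0], reverse=True)
    return [value for q, value in weighted if q > 0]


def first_supported(wanted: list, available: list):
    """Return the client's most-preferred option that the server offers."""
    return next((w for w in wanted if w in available), None)


# What the server in the example has available (assumed identifiers).
SERVER_TYPES = ["text/html", "application/pdf"]
SERVER_LANGUAGES = ["en", "fr", "es"]
SERVER_ENCODINGS = ["gzip", "br", "compress"]


def negotiate(request_headers: dict) -> dict:
    """Pick the response headers that describe the chosen variant."""
    return {
        "Content-Type": first_supported(
            preferences(request_headers["Accept"]), SERVER_TYPES),
        "Content-Language": first_supported(
            preferences(request_headers["Accept-Language"]), SERVER_LANGUAGES),
        "Content-Encoding": first_supported(
            preferences(request_headers["Accept-Encoding"]), SERVER_ENCODINGS),
    }


request_headers = {
    "Accept": "application/pdf",
    "Accept-Language": "en",
    "Accept-Encoding": "br, gzip;q=0.8",
}
print(negotiate(request_headers))
```

The resulting Content-Type, Content-Language, and Content-Encoding values are what the user agent reads to pick the right decompression algorithm and viewer.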
User Agent Header Is In a Class of Its Own for Privacy
You may have noticed that the first row of Figure 1 includes a user agent header example. From a technical perspective, the user agent header is just another request header. That is its header type. But it is in a class of its own when it comes to privacy. This is because the user agent header has been used by data scientists, along with other information like IP address, plug-ins, installed fonts, and screen resolution, to statistically “fingerprint” a browser as another way of tracking. As a result, we will cover the user agent header in extensive detail in the next post.
Next Stop: Fingerprinting
The user agent header is not the only mechanism by which devices can be fingerprinted. So in the next post, we will start with an overview of fingerprinting and the various mechanisms used. Then we will explore two new, interrelated standards that have evolved in the Privacy Sandbox to help reduce “the exposure surface” for fingerprinting. They are the Client Hints API and the User Agent Client Hints API.