Introduction to Cookies Having Independent Partitioned State
Cookies Having Independent Partitioned State (CHIPS) is the first of the five adaptations of browser storage for the Privacy Sandbox we will examine. But in order to talk about CHIPS, why it was needed, and what it does, we must talk about the technology it builds upon: cookies. Now cookies are a well understood technology and I absolutely do not want to write a primer on cookies given the focus should be on what comes next. But I cannot figure out a way to write about CHIPS without delving into cookies in some (what is for me) moderate detail.
Moreover, I have been into Internet technology since 1994 when cookies were first invented at Netscape and I was directly dealing with Netscape on various web standards. Yet until now I knew relatively little of what I will discuss in the next paragraphs. Which leads me to believe that many product and business executives in adTech may not know as much as I would like to believe.
So if you know cookies like the contents of a bag of Chips Ahoy that you snarfed down as a kid, then skip this section and go directly to the discussion of CHIPS. But if not, then stay with me as I review the history and working of browser cookies.
Browser Cookies 101
Mechanics of Browser Cookies
Browser cookies are almost as old as the web itself. They were first “invented” at Netscape in 1994 by Lou Montulli in response to a request from Vint Cerf and John Klensin at MCI, who needed to store information for an ecommerce website they were building but did not want to store all the state information on a central server. Browser cookies were a new application of an already established concept called magic cookies, which were widely used in Unix by the time Netscape came along. Magic cookies are tokens that contain small amounts of data. They are used to identify a particular event or to serve as a "handle, transaction ID, or other token of agreement between cooperating programs."
The analogy for magic cookies I like to use is the use of computer magnetic tape rings to control printing at my business school. At the time, we were on a small DEC VAX 700 mainframe. When assignments were due (and nothing was online then), printing the output of assignments became a bottleneck and printing could take hours. The head of IT used computer magnetic tape rings to give printing rights. If you had one of the rings you could print. If you didn’t, you had to wait until you did. That way printing was fast and you didn't have to wait an unknown amount of time for your assignment to pop out of the printer. This is basically the use of these rings as a “token of agreement” between two cooperating items: a human and a printer. The “content” of the token, if you stretch your imagination, was a binary 0 or 1 that allowed one resource to access another. That is the basic notion behind magic cookies.
Lou Montulli took this notion one step further and used the cookie concept to store a small amount of stateful information about a customer’s interaction with a site in the browser (e.g. what items were abandoned in a shopping cart). This could then be accessed by the site’s owner (so a first-party cookie) the next time that particular browser/user visited the site to recreate the last known state. Cookies were built into Mosaic Netscape V0.9beta in October 1994, and then into Internet Explorer V2 in 1995. The first cookie standard was issued via the Internet Engineering Task Force in 1997 as RFC 2109. It was superseded by RFC 2965 in 2000, which was in turn superseded by the current specification, RFC 6265, in 2011.
Figure 1 shows the basic mechanic of how cookies are created and placed in the browser, and later how they are accessed from the browser.
Figure 1: The Basic Mechanics of Setting and Retrieving a Browser Cookie
Figure 1a: The Initial Request to the Web Server and the Response Setting the Cookie
Figure 1b: On the next request from the browser, the request header includes the cookie
In Figure 1a, a browser makes an initial call to the server for www.theprivacysandbox.com to render a page. The server checks whether the request already carries a cookie for the site (for example, by inspecting the Cookie request header), and if it doesn’t find one it sends a Set-Cookie header in its response. Then, based on the user’s activities, it may set other cookies on the browser for future use. Browsers limit the number of cookies a single domain can store (Chrome, for example, allows 180), which still leaves plenty of room for various uses of cookies within an application.
With every subsequent request to the server, the browser sends all previously stored and appropriately scoped cookies back to the server in the Cookie request header (Figure 1b). The server uses the information in those tokens to take action or make a resource available.
A Set-Cookie HTTP response header looks something like this:
HTTP/2.0 200 OK
Content-Type: text/html
Set-Cookie: __Host-example=34d8g; SameSite=None; Secure; Path=/;
Note the attributes SameSite, Secure, and Path. These are important to understand as we get into CHIPS. While I really do not want to delve into these in any big way, I need to give you enough information to understand the changes that CHIPS has made to ensure better privacy.
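For readers who think better in code, here is a minimal sketch of the round trip in Figure 1, written in Python. It is an illustration only, not how any browser is actually implemented: the helper names parse_set_cookie and cookie_request_header are my own, and the "jar" is just a dictionary standing in for the browser's cookie store.

```python
def parse_set_cookie(header_value):
    """Split a Set-Cookie value into (name, value, attributes)."""
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    name, _, value = parts[0].partition("=")
    attributes = {}
    for part in parts[1:]:
        key, _, val = part.partition("=")
        # Flag attributes such as Secure carry no value; store True for them.
        attributes[key.strip().lower()] = val.strip() if val else True
    return name, value, attributes


def cookie_request_header(jar):
    """Build the Cookie request header value from a {name: value} jar."""
    return "; ".join(f"{name}={value}" for name, value in jar.items())


# The server's response header from the example above.
name, value, attrs = parse_set_cookie(
    "__Host-example=34d8g; SameSite=None; Secure; Path=/;"
)

# Hypothetical stand-in for the browser's cookie store.
jar = {name: value}

# What the browser would send back on the next request.
print(cookie_request_header(jar))
```

The attributes (SameSite, Secure, Path) are stored by the browser but never sent back; only the name and value travel in the Cookie request header.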
Cookie Attributes
Cookie Lifetimes: Expires or Max-age Attributes
There are two basic types of cookies. Session cookies, like sessions, only last for the duration of a browser session (and are not tied to a specific tab in a multi-tab browser session). Permanent cookies, on the other hand, expire at a specific date or when they reach a certain age relative to their initial creation with Set-Cookie. Expires and Max-Age are the two attributes that can be used to set a lifetime for permanent cookies.
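As a rough sketch of how these two attributes translate into a lifetime, the Python function below computes an expiry time from a dictionary of parsed attributes. The function name and the dictionary layout are my own assumptions for illustration. The one rule I have taken from RFC 6265 is that Max-Age wins when both attributes are present.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime


def expiry_time(attributes, received_at):
    """Return when a cookie expires, or None for a session cookie.

    `attributes` is a hypothetical dict of lower-cased attribute names,
    e.g. {"max-age": "3600"} or {"expires": "Wed, 01 Jan 2025 00:00:00 GMT"}.
    """
    if "max-age" in attributes:
        # Max-Age is a number of seconds counted from when the cookie was set.
        return received_at + timedelta(seconds=int(attributes["max-age"]))
    if "expires" in attributes:
        # Expires is an absolute HTTP date.
        expires = parsedate_to_datetime(attributes["expires"])
        if expires.tzinfo is None:
            expires = expires.replace(tzinfo=timezone.utc)
        return expires
    # Neither attribute: a session cookie, dropped when the session ends.
    return None


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(expiry_time({"max-age": "3600"}, now))
print(expiry_time({}, now))
```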
Restricting Cookie Access: Secure and HttpOnly Attributes
It is important to restrict access to cookies by unintended third parties or scripts. There are two attributes that help with this. A cookie with the Secure attribute is only sent to the server with an encrypted request over the HTTPS protocol. It's never sent with unsecured HTTP (except on localhost).
A cookie with the HttpOnly attribute is inaccessible to the JavaScript Document.cookie API. The browser still stores it and sends it to the server, but scripts running on the page cannot read it. Keeping the cookie out of reach of page scripts when the application is server-based reduces the surface for cross-site scripting (XSS) attacks.
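The two rules can be expressed as a pair of small checks. This is a simplified Python sketch with made-up function names. It is not a browser's real logic, and the set of hosts treated as "localhost" is my own assumption.

```python
# Assumption: hosts treated as local for the purpose of the Secure exception.
LOCAL_HOSTS = {"localhost", "127.0.0.1"}


def sent_with_request(attributes, scheme, host="example.com"):
    """Would the browser attach this cookie to a request over `scheme`?"""
    if attributes.get("secure") and scheme != "https":
        # Secure cookies stay off plain HTTP, except on localhost.
        return host in LOCAL_HOSTS
    return True


def visible_to_script(attributes):
    """Can page JavaScript read this cookie through Document.cookie?"""
    return not attributes.get("httponly", False)


session = {"secure": True, "httponly": True}
print(sent_with_request(session, "http"))   # plain HTTP request
print(sent_with_request(session, "https"))  # encrypted request
print(visible_to_script(session))
```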
Cookie Scope: Domain and Path Attributes
The scope of a cookie is the set of sites (origins) or subdomains it applies to. This makes it easy to apply different policies and behaviors to different subdomains within a larger site.
The Domain attribute specifies which sites or subdomains a cookie can apply to. Let’s imagine how this might work for www.mypublication.com. mypublication.com content is free, but there is also a paid subdomain behind a paywall, allnews.mypublication.com.
If the Domain attribute is not set, the cookie applies only to the host that set it (in this case mypublication.com) and not to its subdomains. But if the attribute is set as Domain=mypublication.com, then the cookie applies to both mypublication.com and its subdomain allnews.mypublication.com.
The Path attribute is similar, but it sets the scope of a cookie based on the URL path. For example, it turns out allnews.mypublication.com has two subdirectories: allnews.mypublication.com/politics and allnews.mypublication.com/sports. The politics section has a cookie that tells my server whether you are a Republican, Democrat, or Independent so it can customize the news it displays for you. The sports section has a cookie that tells the server your favorite teams so it can customize that information for the reader. In this case, each cookie would have a Path attribute: one would be Path=/politics, the other Path=/sports, and the appropriate cookie would only be sent to the server if the path of the requested URL began with the matching path.
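The matching rules for Domain and Path are easier to see in code. The Python sketch below follows my reading of the domain-match and path-match rules in RFC 6265. The function names and the host_only flag (true when no Domain attribute was sent) are labels I have chosen for illustration.

```python
def domain_matches(request_host, cookie_domain, host_only):
    """Does a cookie scoped to `cookie_domain` apply to `request_host`?"""
    if host_only:
        # No Domain attribute: only the exact host that set the cookie.
        return request_host == cookie_domain
    # Domain attribute set: the domain itself plus any of its subdomains.
    return request_host == cookie_domain or request_host.endswith("." + cookie_domain)


def path_matches(request_path, cookie_path):
    """Does a cookie scoped to `cookie_path` apply to `request_path`?"""
    if request_path == cookie_path:
        return True
    if request_path.startswith(cookie_path):
        # Require a path-segment boundary so /politics does not match /politicsfoo.
        return cookie_path.endswith("/") or request_path[len(cookie_path)] == "/"
    return False


print(domain_matches("allnews.mypublication.com", "mypublication.com", host_only=False))
print(domain_matches("allnews.mypublication.com", "mypublication.com", host_only=True))
print(path_matches("/politics/elections", "/politics"))
print(path_matches("/sports", "/politics"))
```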
Cookie Security: Secure and SameSite Attributes
The Secure attribute restricts when a browser sends the cookie back to the server. It essentially ensures the cookie is only transmitted over encrypted connections, specifically those using HTTPS (Hypertext Transfer Protocol Secure). This means the communication between the browser and the server is encrypted, making it more difficult for attackers to intercept the cookie data.
The SameSite attribute is a relatively recent addition to cookie functionality and plays a crucial role in mitigating Cross-Site Request Forgery (CSRF) attacks. It allows the server to specify whether a cookie should be sent along with requests made to different websites (cross-site requests). There are three setting options:
- Strict: only send the cookie when the request originates from the same site that set it.
- Lax: also include the cookie in cross-site requests that are initiated through normal user navigation, like clicking a link.
- None: the cookie can be sent with all requests, including cross-site ones, but only if the Secure attribute is set.
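The three options above can be sketched as a single decision function. This Python sketch is a simplification under stated assumptions: the function and parameter names are mine, a missing or unrecognized SameSite value is treated as the middle (Lax) option (which mirrors Chrome's current default), and "normal user actions" is modeled as a top-level navigation using a safe HTTP method.

```python
# HTTP methods considered "safe" (read-only) for cross-site navigations.
SAFE_METHODS = {"GET", "HEAD", "OPTIONS", "TRACE"}


def samesite_allows(samesite, cross_site, top_level_navigation=False,
                    method="GET", secure=False):
    """Decide whether a cookie accompanies a request under its SameSite value."""
    mode = (samesite or "").lower()
    if mode not in ("strict", "lax", "none"):
        mode = "lax"  # assumption: missing or unknown values default to Lax
    if not cross_site:
        return True   # same-site requests always carry the cookie
    if mode == "strict":
        return False  # never sent cross-site
    if mode == "lax":
        # Only on user-driven top-level navigations, such as clicking a link.
        return top_level_navigation and method.upper() in SAFE_METHODS
    # mode == "none": sent with all requests, but only if Secure is set.
    return secure


print(samesite_allows("Strict", cross_site=True, top_level_navigation=True))
print(samesite_allows("Lax", cross_site=True, top_level_navigation=True))
print(samesite_allows("None", cross_site=True, secure=True))
```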
Cookie Prefixes
This is a deep dive area into cookies that I would prefer not to cover. But there is an element in CHIPS that refers to cookie prefixes so I will cover them quickly here.
As cookies are implemented today, a server can't confirm that a cookie was set from a secure origin or even tell where a cookie was originally set. An evil actor could set a cookie on a subdomain with the Domain attribute, which gives access to that cookie on all other subdomains. This leaves the application open to what is known as a session fixation attack.
To counter this, the browser designers created cookie prefixes to assert specific facts about the cookie. Two prefixes are available:
- __Host-. A cookie with this prefix is accepted in a Set-Cookie header only if it's also marked with the Secure attribute, was sent from a secure origin, does not include a Domain attribute, and has the Path attribute set to /. This way, these cookies can be seen as "domain-locked".
- __Secure-. A cookie with this prefix is accepted in a Set-Cookie header only if it's marked with the Secure attribute and was sent from a secure origin. This is weaker than the __Host- prefix.
The browser will reject cookies with these prefixes that don't comply with their restrictions.
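The acceptance rules for the two prefixes translate almost directly into code. Here is a minimal Python sketch. The function name and the shape of the attributes dictionary (lower-cased attribute names, with True for flag attributes) are my own assumptions for illustration.

```python
def prefix_accepts(name, attributes, from_secure_origin):
    """Would a browser accept this cookie, judging only by its name prefix?"""
    has_secure = attributes.get("secure") is True
    if name.startswith("__Host-"):
        # Secure, set from a secure origin, no Domain attribute, and Path=/.
        return bool(
            has_secure
            and from_secure_origin
            and "domain" not in attributes
            and attributes.get("path") == "/"
        )
    if name.startswith("__Secure-"):
        # Weaker rule: Secure and set from a secure origin.
        return bool(has_secure and from_secure_origin)
    # Cookies without a prefix are not subject to these checks.
    return True


good = {"secure": True, "path": "/"}
bad = {"secure": True, "path": "/", "domain": "mypublication.com"}
print(prefix_accepts("__Host-example", good, from_secure_origin=True))
print(prefix_accepts("__Host-example", bad, from_secure_origin=True))
```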
The Privacy Risks of Third-Party Cookies
The discussion above spoke specifically to first-party cookies. But it also applies to third-party cookies, with the exception that in the case of third-party cookies the publisher has to put some code on the page (the ever-present “pixel” as it is called in adTech) to allow the third party to set a cookie in the browser.
Why, you might ask, do I care about third-party cookies when the whole point of Google Privacy Sandbox is that third-party cookies are being deprecated? As mentioned in my previous post, while third-party advertising cookies may be deprecated, there are other use cases for third-party cookies that will continue. Some examples are:
- Website Analytics. One of the most common uses for third-party cookies by publishers and advertisers is to allow one or more third-party analytics partners to track user behavior on the websites where their ads are displayed (e.g., page views, clicks, demographics).
- Embedded Services. Many sites use embedded services from third-parties to enhance their functionality, such as map services or third-party chat embeds. Chat embeds, for example, send information about the user's device and browsing environment to the chat service. This can help optimize the chat window's display and functionality.
- Content Personalization. Website owners can use third-party services to personalize content for users based on their browsing behavior or preferences. This can involve A/B testing different layouts or content variations.
- Session Management Across Subdomains. Some websites use subdomains for specific functionalities (e.g., shop.example.com for an e-commerce store). Third-party cookies can help maintain a consistent user session across these subdomains.
- Shopping Cart Persistence. eCommerce websites leverage third-party cookies to maintain abandoned shopping cart state when users leave their website and return later.
- Content Delivery Networks. CDNs use cookies to track user behavior and optimize content delivery based on factors like location or device type. This can involve setting a cookie to identify the CDN server that served the content to the user.
- Fraud Detection and Prevention. E-commerce websites can utilize third-party fraud detection services to identify and prevent fraudulent transactions. These services might use cookies to track user behavior and identify suspicious activity patterns.
- Maintaining Site Settings. Many sites use cookies to maintain state on site settings. For example, a site with multiple language options might use third-party cookies to remember the user's preferred language.
These use cases, since they are not intended to track an individual across sites, do not present a direct challenge to user privacy. However, if used by an incompetent or evil actor, they could be employed to perform cross-site tracking. That is why Cookies Having Independent Partitioned State (CHIPS) was felt important enough to include in the Privacy Sandbox platform, even though the specification and use of CHIPS applies to sites whether or not they use the Privacy Sandbox.
How CHIPS Works
Let’s examine the threat to privacy posed by traditional third-party cookies and then examine how CHIPS reduces that exposure (Figure 2).
Figure 2 shows the two different cases. In the first case (Figure 2a), there are two sites, both of which use the same chatbot vendor. The chatbot vendor places a tracking pixel and cookie on both sites in order to identify browser features to ensure proper functioning of the chatbot. Without any further protection, the chatbot vendor has access to the cookie on both sites when the call is made and the pixel fires. It is the same cookie and collects the same type of data on both sites. That data can then be stored in a single data store and combined to create a cross-site profile of a user. Basically, the browser is treated as a single entity whose activity is collected across both sites.
Figure 2: Mechanics of Cookies With and Without CHIPS
Figure 2a: How Cookies Work Today without CHIPS
Figure 2b: How Cookies Work with the Partition Attribute Set
The fix to this is to create what is called a partitioned cookie. A partitioned cookie is keyed to the top-level site on which it was set. It cannot be connected to cookies from another site because each cookie sits in its own partitioned space and can only be accessed when the embedded service is loaded under that same top-level site. So if a third-party vendor sets a cookie on myfirstsite.com, it is keyed to that site. When the vendor stores a second cookie on mysecondsite.com, it is keyed to the new site. It is not the same cookie, and there is no easy way for the information contained in the two cookies to be brought together. When I request data using the cookie on myfirstsite.com, it can only send me back that data and have it stored with data for myfirstsite.com.
“But wait!”, you say. “That doesn’t stop me from tracking a user across sites. For example (Figure 3), I can have a user in Browser A buying shoes from myfirstsite.com. I query that cookie and bring that back into my database as the first row with Cookie ID = 123, UserID = BrowserA, site = myfirstsite.com, action = purchase, item = shoes. That same user then goes to mysecondsite.com and buys a dress. Now I have a second row with the data Cookie ID = 456, UserID = BrowserA, site = mysecondsite.com, action = purchase, item = dress. In my database, I can now use Browser A as the match key and build a profile. I have two separate partitioned cookies, but I can still create a cross-site profile. So how does CHIPS help?”
Figure 3 - Why Cross-Site Tracking Can’t Happen With CHIPS
Figure 3a: This data capture would allow cross-site tracking. Why can’t I do this?
Figure 3b: CHIPS works because cookies don’t capture user or browser information that could be used to link the two data points.
Very simply, many third-party cookies are not able to capture a specific browser ID or other user information from the site on which they are embedded. While browsers might provide some information about the user's client, it's often obfuscated and not reliable for user identification across sites, especially if privacy settings are strict. All the third-party provider has is their cookie ID. Given that, two cookies from two different sites cannot be recognized as belonging to the same browser/user. Thus privacy is ensured.
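To make the difference between Figures 2a and 2b concrete, here is a toy cookie jar in Python. It is only a sketch of the keying idea, not Chrome's implementation: the class and method names are mine, and chat.example is a hypothetical chatbot vendor.

```python
from collections import defaultdict


class CookieJar:
    """Toy browser cookie store with and without partitioning."""

    def __init__(self):
        # Traditional third-party cookies: keyed by the cookie's host only.
        self._unpartitioned = defaultdict(dict)
        # Partitioned cookies: keyed by (top-level site, cookie host).
        self._partitioned = defaultdict(dict)

    def set(self, top_level_site, host, name, value, partitioned=False):
        if partitioned:
            self._partitioned[(top_level_site, host)][name] = value
        else:
            self._unpartitioned[host][name] = value

    def get(self, top_level_site, host):
        """Cookies the embedded `host` sees while loaded under `top_level_site`."""
        cookies = dict(self._unpartitioned.get(host, {}))
        cookies.update(self._partitioned.get((top_level_site, host), {}))
        return cookies


jar = CookieJar()
# Hypothetical vendor chat.example sets a partitioned cookie on each site.
jar.set("myfirstsite.com", "chat.example", "id", "123", partitioned=True)
jar.set("mysecondsite.com", "chat.example", "id", "456", partitioned=True)

print(jar.get("myfirstsite.com", "chat.example"))
print(jar.get("mysecondsite.com", "chat.example"))
```

Each site's partition returns only its own cookie, so the vendor sees two unrelated IDs. Had the same calls been made with partitioned=False, the second set would have overwritten the first and the vendor would see the same cookie on both sites, which is exactly the cross-site linkage CHIPS is designed to prevent.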
CHIPS adds a new attribute to the Set-Cookie HTTP response header called Partitioned. So the same Set-Cookie header as before would now look like:
HTTP/2.0 200 OK
Content-Type: text/html
Set-Cookie: __Host-example=34d8g; SameSite=None; Secure; Path=/; Partitioned
Note that when a partitioned cookie is used, the Secure attribute must be set so that the cookie is only sent to the server with an encrypted request over the HTTPS protocol. It is also recommended that developers use the __Host- prefix when setting partitioned cookies in order to bind them to the hostname (and not the registrable domain).
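A quick way to check a header against these two points is a small lint function. The Python sketch below is my own illustration (the function name and messages are invented). It flags a missing Secure attribute as a problem and a missing __Host- prefix as a recommendation.

```python
def check_partitioned(header_value):
    """Return a list of issues with a Set-Cookie value that uses Partitioned."""
    parts = [p.strip() for p in header_value.split(";") if p.strip()]
    name = parts[0].partition("=")[0]
    # Collect attribute names, ignoring their values and letter case.
    flags = {p.partition("=")[0].strip().lower() for p in parts[1:]}

    issues = []
    if "partitioned" in flags:
        if "secure" not in flags:
            issues.append("Partitioned requires Secure")
        if not name.startswith("__Host-"):
            issues.append("__Host- prefix recommended")
    return issues


print(check_partitioned(
    "__Host-example=34d8g; SameSite=None; Secure; Path=/; Partitioned"
))
print(check_partitioned("example=34d8g; SameSite=None; Partitioned"))
```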
CHIPS Still Allows for Unpartitioned Cookies In Transition
There is one important element to note about CHIPS. Right now CHIPS requires the Partitioned attribute to create partitioned cookies. It is effectively an opt-in in a world where unpartitioned cookies still exist. Google took this approach for a couple of reasons:
- There are a number of embedded services which expect an unpartitioned cookie and which may behave in unexpected ways with partitioned cookies without time to adapt to and debug their impact on the application.
- Firefox and Safari had already attempted to require partitioned cookies, and this created some of the very problems that the Partitioned “opt in” is intended to avoid while vendors transition to partitioned cookies.
There are some other subtleties/implications of cookies to mention that I found to be intellectually fascinating. Briefly they include:
- Memory Limitations. Today browsers cap the number of cookies per domain (Chrome's cap is 180). Given the proliferation of cookies, their impact on storage, and the potential for cross-partition leaks with this much information, it has been proposed that the storage space per domain be limited to 10 kibibytes.
Never heard of a kibibyte? Neither had I, but here is the definition:
A kibibyte is 1,024 bytes. This compares to a kilobyte which is actually 1,000 bytes.
Author’s stream of consciousness aside: Really? For 50 years, I thought a kilobyte equaled 1,024 bytes. I mean, I was there when computer memory on an Atari was 256 bytes! How could I have been so wrong for so long?
Moving on. There is a second proposed limit. Cookies should be limited to 10 cookies per partition per domain. Data analyzed from millions of browsers indicates that this will cover 99% of all use cases. You can see a discussion of this issue in the CHIPS GitHub repository.
- Cookie Deletion. When clearing cookies, the browser/client should clear all cookies available to that third-party in the partition for the current top-level site alone. It must not clear the third-party's cookies in other partitions.
Browsers may choose to provide user controls to clear individual partitions of a site’s cookies.
Top-level sites should not be able to clear the third parties' cookies in their partition. This would provide a potential attack vector for top-level sites to interfere with code running in third-party frames.
- Impacts on Extensions. Extensions in some browsers (e.g. Chrome, Firefox) are capable of reading cookies in background contexts using a JavaScript API, for sites for which they have host permissions. When extension pages load subresources from other sites, the partition key used to determine which Partitioned cookies should be included in requests must be the site of the topmost-level frame which is not an extension URL, if the extension has host permissions for that frame. Otherwise the partition key should be the extension URL.
- Impacts on Service Workers. There are some, but we haven’t covered service workers in the discussion so we’ll skip this for now.
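Two of the items above, the proposed limits and the deletion rule, are simple enough to sketch in Python. Treat this as an illustration under stated assumptions rather than the specification: the function names are mine, the store is a plain dictionary keyed by (top-level site, third-party host), chat.example is a hypothetical vendor, and I am assuming that a cookie's size is counted as the bytes of its name plus its value.

```python
MAX_BYTES_PER_PARTITION = 10 * 1024  # 10 kibibytes = 10,240 bytes
MAX_COOKIES_PER_PARTITION = 10


def within_limits(cookies):
    """Check a {name: value} dict against the two proposed per-partition limits."""
    # Assumption: size is measured as name bytes plus value bytes.
    size = sum(len(n.encode()) + len(v.encode()) for n, v in cookies.items())
    return (
        len(cookies) <= MAX_COOKIES_PER_PARTITION
        and size <= MAX_BYTES_PER_PARTITION
    )


def clear_partition(store, top_level_site, third_party):
    """Clear a third party's cookies for the current top-level site only."""
    store.pop((top_level_site, third_party), None)


store = {
    ("myfirstsite.com", "chat.example"): {"id": "123"},
    ("mysecondsite.com", "chat.example"): {"id": "456"},
}
clear_partition(store, "myfirstsite.com", "chat.example")
print(store)  # the partition under mysecondsite.com is untouched
print(within_limits({"id": "123"}))
```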
If you wish to learn more about these details, see the CHIPS explainer in the CHIPS GitHub repository or the very well-written draft specification from Dylan Cutler of Google.