To test my compliance, I use Chrome’s dev tools to inspect which cookies my site sets, and I also use the curl cookie jar to do so programmatically. When that policy was published, my checks did not reveal any cookies. I performed periodic but irregular checks in the intervening period.
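As a rough illustration of the programmatic check, the sketch below reads a cookie jar file of the kind curl writes with its `-c` option (for example `curl -c cookies.txt https://example.com/`) and lists the domain and name of each cookie it recorded. The file name, the URL, and the function name `read_cookie_jar` are placeholders for illustration, not part of my actual setup.

```python
from pathlib import Path


def read_cookie_jar(path):
    """Return (domain, name) pairs from a Netscape-format cookie jar file.

    This is the tab-separated format curl writes with -c / --cookie-jar.
    """
    cookies = []
    for line in Path(path).read_text().splitlines():
        # curl marks HttpOnly cookies by prefixing the domain with "#HttpOnly_".
        if line.startswith("#HttpOnly_"):
            line = line[len("#HttpOnly_"):]
        elif not line.strip() or line.startswith("#"):
            # Skip blank lines and ordinary comment lines.
            continue
        fields = line.split("\t")
        if len(fields) != 7:
            continue  # Not a cookie record; ignore it.
        domain, _subdomains, _path, _secure, _expires, name, _value = fields
        cookies.append((domain, name))
    return cookies
```

An empty result means only that the responses curl itself received set no cookies.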
A few months after I updated the policy, a German court ruled that use of Google Fonts, as hosted through the Google CDN, is out of compliance with GDPR, as they cannot guarantee that the data is not transmitted to the US. Upon learning of this ruling, I changed my site to self-host the Google Fonts implementation.
Most of the header images I use for my articles are my own photos. However, for some content, such as external links, I had been cross-linking to the meta image. I recently identified a risk where this could introduce third-party cookies, and changed to self-hosting these images as well.
Identifying third-party cookies
Cookies set by other CDNs
During a periodic check of my website, I noticed that a single third-party cookie was being set by Cloudflare that I hadn’t encountered in my previous scans. After inspecting the source of this, I discovered that for development, I had been using a CSS framework hosted through a Cloudflare CDN. Either the CDN was not setting this cookie previously, or I had missed it in my scans (more below). This framework was one of multiple CSS and JS frameworks I was using a third-party CDN for; however, only one of the CDN calls resulted in a third-party cookie being set.
After discovering the third-party cookie set by the CDN, I undertook a detailed scan of the site to identify any other areas where third-party cookies might be set. During this investigation, I identified eight posts with embedded tweets and four posts with embedded YouTube videos. Checking these pages specifically, I confirmed that these embeds could set between one and three third-party cookies.
Moreover, I had embedded some Microsoft credentials from Credly on my About page, which also set cookies.
The Cloudflare cookie was set for a CSS framework that I used on the front page of my site. The YouTube and Twitter embeds were in individual posts. Users who visited these pages may have had third-party cookies set. I keep logs for a very limited period of time (last 7 days) for the purposes of debugging. Exploring the logs, I can identify a little less than 8500 requests for my front page, and around 250 requests for one of the posts with an embedded tweet. These are not necessarily unique views and may also include bots, crawlers, archivers and other similar automated tools.
Addressing the issue of these third-party cookies being set was straightforward. First, I replaced all of the embedded YouTube content with a direct link to the video. I did the same with the embedded tweets and credentials. To address the CDN issue, I self-hosted the CSS and JS frameworks, as I had done with Google Fonts earlier.
All content on this site is self-hosted, including all images, media, libraries, and other assets. There is no embedded content on this website. Therefore there are no third-party cookies. However, this site does link to content elsewhere on the internet, and by clicking these links, you may be subject to functional or tracking cookies used by those sites and their providers.
As a result of this, I have implemented a quick check in my build and deployment pipeline that scans all <img> tags for external domains. This check will stop the deployment of any changes to the site that might introduce third-party cookies of this sort.
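A check along these lines could look like the sketch below. It is not my actual pipeline code: the function name `find_external_images` and the host `example.com` are placeholders, and it assumes the built pages are available as HTML strings.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class ExternalImageFinder(HTMLParser):
    """Collect the src of every <img> tag that points at another host."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host
        self.external = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src") or ""
        # Relative URLs have an empty netloc and are therefore self-hosted.
        host = urlparse(src).netloc
        if host and host != self.site_host:
            self.external.append(src)


def find_external_images(html, site_host):
    """Return the src values of externally hosted images in one page."""
    finder = ExternalImageFinder(site_host)
    finder.feed(html)
    return finder.external
```

In a pipeline, the build step would run this over every generated page and exit with a non-zero status if any page returns a non-empty list, which is what stops the deployment.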
In exploring how this issue might have arisen, I come to three conclusions:
- It is possible that some of the services were not setting third-party cookies the last time I assessed the site. As mentioned previously, only one of the CDNs that was hosting CSS libraries was setting a cookie, and the other was not. This may have allowed me to overlook the use of the CDN during my earlier explorations.
- It is moreover possible that I was running the test on a page that was not setting any third-party cookies. As only a few pages could set these cookies (including index.html), running the test on any other page could lead to a false negative. Failing to check index.html, however, would be an oversight.
- It is simply an oversight that embedding third-party content like tweets or YouTube videos would set a third-party cookie. Here, the convenience of copying an embed code allowed me to simply ignore the risk of third-party cookies being used.
Relatedly, the reliance on an external CDN for hosting CSS code was a holdover from a time before I fixed some internal references for my local testing. Here, too, I had failed to “close the loop” in removing CDNs, even as I worked deliberately to eliminate reliance on Google’s CDN to use Google Fonts.
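Furthermore, although I had not relied on the tool in the past, curl is only able to identify first-party cookies, and cannot resolve third-party cookies: it requests only the page itself, not the stylesheets, scripts, and embeds that a browser would go on to load.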
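One way to work around this limitation is to first list every external host a page asks the browser to contact, and then check each of those hosts separately. The sketch below does only the first half. The tag list, the function name `third_party_hosts`, and the host `example.com` are assumptions for illustration; a `<link>` tag does not always trigger a fetch (for example `rel="canonical"`), so the result may include false positives.

```python
from collections import defaultdict
from html.parser import HTMLParser
from urllib.parse import urlparse

# Tags whose URL attribute can make a browser fetch another resource.
FETCHING_ATTRS = {"script": "src", "link": "href", "iframe": "src", "img": "src"}


class ThirdPartyHostFinder(HTMLParser):
    """Map each external host to the set of tags that reference it."""

    def __init__(self, site_host):
        super().__init__()
        self.site_host = site_host
        self.hosts = defaultdict(set)

    def handle_starttag(self, tag, attrs):
        attr = FETCHING_ATTRS.get(tag)
        if attr is None:
            return  # Ordinary links (<a href>) are not fetched on page load.
        url = dict(attrs).get(attr) or ""
        host = urlparse(url).netloc
        if host and host != self.site_host:
            self.hosts[host].add(tag)


def third_party_hosts(html, site_host):
    """Return {host: {tags}} for external resources referenced by a page."""
    finder = ThirdPartyHostFinder(site_host)
    finder.feed(html)
    return dict(finder.hosts)
```

Any host that appears in the result is a candidate for setting a third-party cookie and is worth inspecting in the browser's dev tools.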
While I have undertaken significant effort to protect my readers' privacy, this experience highlights how easy it is to miss a small detail. Using better tools to scan for potential third-party cookie introduction will help ensure compliance with my commitments and prevent mistakes like this from creeping in again.
Third-party cookies are prolific on the internet and it is a good habit for all users to periodically purge cookies. I recommend doing so.
I’m very sorry to my readers for my oversight and apologize for the adverse impact it might have had. To discharge my sins, I will donate 100€ to the Electronic Frontier Foundation. This matches the fine issued by the Munich court in the aforementioned Google Fonts case.