New research examining what happens after Internet users in Europe land on an ad-supported website and express their “privacy choices” — using a flagship ad industry consent management platform which is supposed to allow them to control the types of ads they receive (i.e. non-tracking vs “personalized”) — has raised fresh questions over the IAB Europe’s self-styled Transparency and Consent Framework (TCF).
The TCF is already in hot water with privacy regulators.
Last month the IAB Europe announced that it’s expecting to be found in breach of the EU’s General Data Protection Regulation (GDPR) — and that the framework itself will also be found in breach — although the IAB sought to suggest that a few tweaks would suffice to fix the problems identified by the Belgian data protection authority (DPA).
We’re still waiting for publication of the final decision of the Belgian authority. But its preliminary findings — last year — highlighted a litany of GDPR failures.
Despite all that, the IAB has continued to argue the TCF is working as intended for the close to 800 adtech vendors thought to participate in the system, loudly rejecting criticism of it. Its CEO, Townsend Feehan, for example, pooh-poohed criticism earlier this month — telling Engadget that “none of this [tracking] happens if the user says no”.
However, the new study throws doubt on the claim that if a user says ‘no’ to tracking/behavioral ads via the IAB’s TCF, the adtech industry respectfully falls into line.
A key piece of this research examined how the adtech ecosystem responds to user signals that request only basic (i.e. non-tracking) ads — in other words, what ad vendors actually do when users say no to “personalized” ads.
Here the researchers found evidence to suggest that many adtech vendors continue to track and profile Internet users when they have explicitly said they don’t want tracking-based ads.
While a number of earlier studies have found problems with how publishers in the EU have implemented cookie consents (such as tracking cookies being dropped before a site visitor has even been asked for permission), this new research, carried out by adtech researcher Adalytics, aims to zero in on the TCF framework itself by examining instances where ad-supported websites have faithfully reported users’ ad choices.
Problematic data flows after that point implicate the adtech industry itself — and the claims it makes for the TCF as a flagship compliance tool — because they suggest the framework fails to ensure users’ “privacy choices” are accurately reflected and actually respected once they are passed to ad “partners”.
Listing a series of takeaways, Adalytics writes that the findings suggest:
- Many major ad tech vendors continue to track and profile EU users, even when an EU user has explicitly objected
- The TCF strings that were designed by IAB Europe do not appear to be honored or parsed correctly by many ad tech vendors
- Some ad tech vendors may not be able to demonstrate they obtained user data with user consent, which may expose them to contractual compliance, investor/shareholder disclosure, and regulatory risks
- In many instances, it is impossible for users to support media creators by allowing “basic ads” whilst disallowing tracking and behavioral profiling to protect their own privacy
It’s important to note, though, that there are limits to what the researchers were able to observe via Chrome Developer Tools — given that any processing done on adtech companies’ own servers isn’t verifiable by such external research.
Understanding the full picture of what’s done with people’s data once the adtech ecosystem gets its hands on it is difficult. But that also cuts to the heart of surveillance-based advertising’s problem with complying with the GDPR — which also requires accountability, transparency and security when processing people’s data, as well as a valid legal basis to do so.
Even setting those major overarching problems aside, the mere fact of tracking cookies being dropped and user data being passed around when a person has explicitly said that should not happen looks, well, awkward for the IAB’s TCF.
To study data activity at the adtech end of the framework, the researchers ran tests in a number of EU countries — visiting websites they had manually verified were correctly configured to send the user’s consent string, selecting only basic ads, refusing personalized/tracking-based ads and profiling, and limiting the choice of adtech processor to a single vendor.
The testers also made sure to object to “legitimate interests” so that their consent preferences could not be bypassed in that way.
If the TCF were functioning as Feehan’s remarks to Engadget earlier this month suggest — i.e. if users can deny tracking simply by saying ‘no’ via the TCF — the researchers would have expected to observe data flowing only as each individual had specified it should.
Instead, in most cases, they found data flows that looked very different from the choices that had been expressed.
The paper also details numerous instances of tracking cookies being set prior to the user’s consent choices even being signalled. (Although they say they excluded such examples from their analysis as they were specifically aiming to study what happens after a user has submitted their choices via the TCF.)
Examples cited in the study of adtech vendors appearing to override or ignore TCF signals denying tracking include a visitor with a Belgian IP address to a popular local news website, nieuwsblad.be, who consents to basic ads only, and only to ads from Google (so explicitly rejecting “personalized” ads and profiling). Yet, on checking Chrome Developer Tools for network HTTP requests, this visitor observes HTTPS requests being sent to ib.adnxs.com, a domain owned by AppNexus (aka Xandr), which responds by dropping a cookie called “uuid2” that is set for three months.
“Given that [this user] objected to personalised ads and creating a personalised profile, only provided consent to Google, and the fact that these consent choices were directly included in the HTTP request to adnxs.com, it is not clear why the AppNexus server responded by setting a persistent, advertising related cookie in [the user’s] browser,” the researchers observe.
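For readers who want to see what that kind of check looks like in practice, here’s a minimal sketch — in TypeScript, using a hypothetical captured request URL rather than anything taken from the study — of the two things the researchers describe verifying: that the user’s denial actually travels with the outbound ad request (via the gdpr_consent query parameter), and whether the vendor’s response still sets an advertising cookie anyway.

```typescript
// Minimal sketch (not the researchers' code). The request URL is a
// hypothetical stand-in for one copied from Chrome DevTools' Network tab
// after submitting consent choices via the site's consent banner.
const capturedRequest =
  "https://example-ad-vendor.test/ad?gdpr=1&gdpr_consent=COexampleTCstring";

const url = new URL(capturedRequest);
console.log("gdpr applies flag:", url.searchParams.get("gdpr"));         // "1"
console.log("TC string present:", url.searchParams.has("gdpr_consent")); // true

// Replay the request and check whether the vendor still answers with a
// Set-Cookie header despite the consent signal it was handed.
const res = await fetch(capturedRequest, { redirect: "manual" });
const setCookie = res.headers.get("set-cookie");
console.log(
  setCookie
    ? `Vendor set a cookie anyway: ${setCookie}`
    : "No cookie set by the vendor"
);
```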
In another example, a user with a French IP address visits the news website lemonde.fr and once again — despite not consenting to any cookies or purposes offered in the consent banner — they see an HTTPS request being sent to id5-sync.com, which responds by setting a cookie called “id5” for three months and triggering a 302 HTTP redirect to sync data with rtb-csync.smartadserver.com.
“This specific HTTPS request that was sent to id5-sync.com contains the previously mentioned TCF string in a query string parameter called “gdpr_consent”,” the researchers report, adding that: “The domain id5-sync.com belongs to ID5, a London-based “identity platform for the digital advertising industry”.”
They further note that the ID5 Universal ID is described in a github overview as a “shared, neutral identifier that publishers, advertisers, and ad tech platforms can use to recognise users”.
So, again, if the user is saying they don’t want to be identified and tracked for ads, why is the id5 cookie being dropped at all?
In another example detailed in the study, also involving a French IP address — this time the user visits the newspaper latribune.fr — the user’s consent choices are again apparently tossed in the virtual trash.
In this instance the user had specified they wanted basic ads served by US supply side platform OpenX.
However, openx.net was observed triggering user ID syncs with “numerous other parties”, including Amazon, Google, DataXu, AppNexus, Beeswax (bidr.io), Adelphic Predictive Data platform (ipredictive.com), AdPilot (erne.co), Simplifi Holdings (simpli.fi), and others, per the study.
In another example — involving a French IP user visiting atlantico.fr; and consenting to basic ads from Google only — the user sees a “lot of HTTPS requests being made with this TCF consent string, some of which appear to be user data syncing or setting tracking cookies”.
The researchers go on to note that: “A request sent to s.cpx.to responds by setting a cookie for 1-year called “cpSess”. This domain is owned by London-based Captify, and the “cpSess” cookie appears to be used to store and link personal information about the user” — before citing another source that suggests this cookie is “used as a tracking mechanism for […] advertising companies” and “helps with the delivery of targeted marketing adverts whilst users browse”.
The study details numerous other examples of unexpected data flows and syncing being observed after the user has asked not to be tracked.
In addition to the manual examples cited above, the researchers also detail the results of large-scale automated tests — based on crawler data from 48,698 different publisher domains — and here they found evidence of “tens of thousands” of ad requests erroneously claiming the GDPR does not apply to web users who were actually in the EU…
Out of 35,389 publishers found to have sent an HTTPS request to an ad tech vendor containing either a ‘gdpr=0’ or a ‘gdpr=1’ query string parameter, 28,941 (81.8%) had at least one such request in which the “gdpr=” query string macro parameter was set to “0” — meaning they were signalling that the GDPR does not apply.
More from Adalytics’ write up:
“According to the IAB TCF documentation and to Google’s documentation, “1 indicates that GDPR applies and 0 indicates that it does not.” Many ad tech vendors were observed processing the German and Finnish user data when “gdpr=0”, despite the crawler not having clicked on the CMP consent banner. In each of these cases, the ad tech vendor could have validated that the user was in fact in the EU by performing a simple IP address geolocation lookup. However, it appears that many take this “gdpr=0” parameter at face value, without performing any additional server-side checks, prior to sending tracking cookies or performing user data syncs.”
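To make that concrete, here’s a rough sketch (not any vendor’s actual code) of the kind of server-side sanity check the researchers are describing: cross-checking a claimed “gdpr=0” parameter against an IP geolocation lookup before setting cookies or syncing data. The countryFromIp helper is a hypothetical placeholder for whatever geolocation database a vendor might use.

```typescript
// Hypothetical sketch: don't take a "gdpr=0" query parameter at face value;
// cross-check the claim against the requester's IP geolocation first.
const EU_EEA = new Set([
  "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR",
  "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK",
  "SI", "ES", "SE", "IS", "LI", "NO",
]);

// Placeholder for a real geolocation lookup (e.g. a MaxMind-style database).
declare function countryFromIp(ip: string): string;

function gdprActuallyApplies(gdprParam: string | null, clientIp: string): boolean {
  const claimsNotApplicable = gdprParam === "0";
  const requestFromEu = EU_EEA.has(countryFromIp(clientIp));
  // If the request claims the GDPR doesn't apply but the IP resolves to the
  // EU/EEA, treat the incoming signal as unreliable and assume it applies.
  return !claimsNotApplicable || requestFromEu;
}
```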
To try to verify this finding, Adalytics says it reached out to a programmatic media buyer who was targeting ads to users across the EU, UK, and Switzerland. This buyer had their ads served a total of 111 million times, with the “gdpr_applies” macro set to “0” 42.3% of the time — “indicating that the ad tech company involved was claiming that the GDPR does not apply to these users”.
Yet on checking where those roughly 47 million ads (42.3% of 111 million) were served, by using an IP address geolocation lookup service, Adalytics reports that the advertiser found the majority were served to users based in Spain, Croatia, Italy, France, and other EU countries, meaning the GDPR does actually apply…
“If ad tech companies are not checking the integrity of the “gdpr_applies” strings by performing IP address geolocation checks, they may end up misleading their advertiser clients about the relevant regulations that apply to their advertising campaigns,” Adalytics goes on to warn.
The findings further highlight the scale of the problem of ad fraud — and, where the IAB Europe’s TCF is concerned, suggest the framework offers merely a veneer of legal compliance, because it isn’t actually capable of doing what it says on the tin.
All too often, adtech vendors have not properly implemented and deployed it, per the research findings — making the “Transparency and Consent Framework”, at best, a sham; or, well, compliance theatre.
Little wonder this whole system is in regulatory hot water — although the length of time it’s taken EU watchdogs to interrogate the reality underlying the adtech industry’s claims continues to be extremely troubling for anyone who cares about democratic oversight and fundamental rights.
“The conclusion that we make in our study is that the adtech vendors have not taken the time, money and engineering labor hours to properly configure their servers and APIs,” Adalytics’ founder Krzysztof Franaszek told ProWellTech, discussing the research. “In theory, all TCF participants… should have set up their servers and APIs in a way that checks and validates the TCF strings.
“If an API receives some user data, to which the TCF string shows the user has not consented, the vendor should immediately discard and avoid doing anything with that data.”
Franaszek noted that the researchers did observe at least one example of correct behavior — an instance where AppNexus received a TCF string that allowed neither personalization nor a vendor called Index Exchange to get the data, and had “properly/rightfully configured their server to respond with an HTTP 400 error code, that says ‘Request failed due to privacy signals’”.
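As a rough illustration of what that correct behavior could look like in code — a sketch only, assuming IAB Tech Lab’s open-source @iabtcf/core decoder, an Express-style request handler and a placeholder vendor ID, none of which come from the study — a vendor’s ad or sync endpoint could decode the incoming TC string and refuse to process the request when its own vendor ID has no consent:

```typescript
import { TCString } from "@iabtcf/core"; // IAB Tech Lab's reference TC string decoder

const MY_VENDOR_ID = 123; // placeholder: the vendor's ID in the Global Vendor List

// Express-style handler sketch: decode the TC string carried on the request
// and bail out with a 400 when the user has not consented to this vendor.
function handleAdRequest(req: any, res: any): void {
  const tcString = req.query.gdpr_consent as string | undefined;
  if (req.query.gdpr === "1" && tcString) {
    const tcModel = TCString.decode(tcString);
    if (!tcModel.vendorConsents.has(MY_VENDOR_ID)) {
      // The behavior the study observed from AppNexus in one instance:
      res.status(400).send("Request failed due to privacy signals");
      return;
    }
    // A fuller implementation would also check purpose consents and any
    // legitimate-interest objections before doing anything with the data.
  }
  // Only past this point would setting cookies or syncing user IDs be in scope.
  res.status(204).end();
}
```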
However he pointed out that that’s what “ALL 795 TCF companies should be doing”, and the study shows — on the contrary — many companies are simply not checking the strings.
Also commenting on the findings, Johnny Ryan, a senior fellow at the Irish Council for Civil Liberties — and a long-time adtech whistleblower who has filed numerous complaints over real-time bidding’s (RTB) abuse of personal data — said the study is just the latest illustration that “the TCF was just a bad idea”.
“There is no control. No technical measure to actually control what happens to the data,” Ryan told ProWellTech. “This means the TCF can never fulfil its purported objectives of providing control and transparency.
“The central argument we made to the Belgian APD [DPA] is that the TCF’s inevitable failure, due to the inherent and known insecurity of RTB, means that in addition to being an unnecessary nuisance in the lives of all Europeans, the TCF also infringes the data minimization principle [of GDPR] because it processes TCF string personal data for a purpose that is impossible to achieve.”
Ryan also pointed back to some 2017 IAB correspondence — which came to light in 2019 via FOI — in which Feehan, who was at the time seeking to lobby the European Commission on ePrivacy rules, implied consent is incompatible with RTB, attaching a paper to her email to EU lawmakers with the observation that “it is technically impossible for the user to have prior information about every data controller involved in a real-time bidding (RTB) scenario”.
And for consent to be valid under GDPR it must be specific, informed, and — indeed — freely given.
There are some limitations to the Adalytics study — such as the manual component being limited to a small number of publishers that “appear to correctly encode user consent input, in order to rule out the possibility that a publisher’s or CMP’s errors were resulting in user profiling occurring without a user’s consent”, as Franaszek puts it.
It is also not clear how many/what proportion of the 795 adtech companies using the TCF system aren’t properly respecting user choices. The instance cited above where a user denial of consent was correctly responded to shows it can happen. How often it does is a whole other matter, though.
In any case, those vendors observed by the researchers apparently not respecting users’ denial of tracking included “several multi-billion dollar multinationals like Rubicon, Pubmatic, Amazon, Google and Trade Desk”, per Franaszek.
“I think if you look at the stock market capitalization of the vendors that we examined, it’s far greater than 50% of the total digital advertising industry,” he added.
A more general caveat is that, given the adtech industry’s highly opaque character, it is always difficult to understand how data is or isn’t being processed — and in a section of the paper discussing limitations, the researchers acknowledge that a lot of “nuance” is required to interpret the data.
The study has also not gone through formal peer review, so it has not been subject to the rigorous scrutiny that publication in a peer-reviewed journal would require.
But, well, the findings sure don’t look good for adtech.
Per public documentation in a github repository for v1.0 of the IAB’s TCF, the ad industry body suggested that its then-incoming “technical industry solution” would allow website operators to do the following (a minimal sketch of how that consent signal surfaces in the browser follows the list):
- Control the vendors they wish to allow to access their users’ browsers (for setting and reading cookies) and process their personal data and disclose these choices to other parties in the online advertising ecosystem
- Seek user consent under the ePrivacy Directive (for setting cookies or similar technical applications that access information on a device) and/or the GDPR in line with applicable legal requirements and signal the consent status through the online advertising ecosystem
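In practice (under the current TCF v2.x; v1.0 used an older __cmp API), that consent status is surfaced on the page by the CMP via a __tcfapi JavaScript function, which vendor tags are meant to call before doing anything with cookies or data. Here’s a minimal sketch of reading it, with the shape of tcData as described in the v2 CMP API documentation — an illustration, not any particular vendor’s tag:

```typescript
// Minimal sketch of a vendor tag reading the user's TCF choices via the
// CMP's __tcfapi browser API (TCF v2.x) before setting any cookies.
declare function __tcfapi(
  command: string,
  version: number,
  callback: (tcData: any, success: boolean) => void
): void;

__tcfapi("getTCData", 2, (tcData, success) => {
  if (!success) return;
  console.log("TC string:", tcData.tcString);
  console.log("GDPR applies:", tcData.gdprApplies);

  // Purpose 1 is "Store and/or access information on a device" (i.e. cookies).
  const canStoreCookies = tcData.purpose?.consents?.[1] === true;
  if (!canStoreCookies) {
    // A compliant tag should stop here rather than drop tracking cookies.
    return;
  }
  // ...only now would cookie setting or requesting personalised ads be in scope...
});
```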
The question that naturally follows from those stated capabilities is: well, what then? What happens after users’ ‘privacy choices’ are signalled to the data-gobbling adtech ecosystem? How does the TCF actually ensure ad vendors respect people’s choices?
Adalytics’ research is just the latest study to suggest that industry standard pop-up “privacy choices” are a mirage; a suggestive illusion of choice wrapped in a pantomime of consent in a bid to square the impossible circle for surveillance ads: GDPR compliance.
Because the only way to be sure that the choice you ask the TCF to pass on to adtech vendors gets respected is if you ask for the tracking to continue…
The IAB has been contacted for comment on Adalytics’ research, and also with questions regarding whether it takes any steps to verify that participating ad vendors are complying with TCF strings expressing user choices.
We have also asked the IAB how it expects to adapt the framework — and its own activities — in order to come into compliance with GDPR, in light of the Belgian finding of a breach.
We’ll update this report with any response.