Visible compliance signals are not the same thing as compliance.
Cookie consent banners are now near-universal on the modern web. The European Union's General Data Protection Regulation (GDPR) and its older companion, the ePrivacy Directive, require informed, specific consent before non-essential cookies are placed; the United States, by contrast, regulates web tracking under sectoral and state-level law in an opt-out frame. The market response in both jurisdictions has been the rise of the consent management platform (CMP) and the IAB Transparency and Consent Framework (TCF), now at version 2.2.
A natural empirical question follows: do sites self-identifying with each jurisdiction behave differently along the dimensions consent regulation targets? A second, more important question is whether the visible compliance infrastructure (the banner, the framework, the reject button) is informative about the underlying behavior (whether trackers actually fire before consent and stop after reject).
This paper presents the first matched EU/US measurement study to answer both questions in a single instrument. We crawl 3,607 popular websites stratified by jurisdictional self-identification, instrument a two-state interaction protocol (no-interaction baseline, post-reject), and measure three behaviors that EU consent regulation either implicitly or explicitly prohibits: setting tracking cookies before user interaction, hiding the reject control, and failing to honor a user's reject choice. The matched analytical sample is 1,046 EU and 1,055 US sites. The substantive contribution is a quantitative account of compliance signaling divergence, the gap between the externally visible compliance artifact and the behavior the artifact is supposed to attest to.
Measurement was conducted from a single residential vantage point in Atlanta, Georgia, using Playwright-instrumented headless Chromium 126 on Linux aarch64. The jurisdictional classification (EU member-state ccTLD vs US-targeting gTLD) was validated on a 100-site random sample against four orthogonal signals (HTML language, currency, privacy-policy regulatory references, server IP geolocation), with weak agreement rates of 76% (EU) and 90% (US) under the bucket-consistent standard.
Three numbers that change the audit frame.
Each statistic is computed from the matched analytical sample (n_EU = 1,046, n_US = 1,055). Full odds ratios, confidence intervals, and robustness checks across three outcome definitions are reported in the paper.
Ten findings, one signal-behavior gap.
Findings 1 through 4 are descriptive comparisons under two-proportion z-tests. Findings 5 through 7 come from the multivariate logistic regression. Finding 8 is a site-level case study. Findings 9 and 10 cover robustness and enforcement implications.
US sites place tracking cookies pre-consent 13.3 points more often than EU sites
50.6% of US sites set at least one tracking cookie before any user interaction, versus 37.3% of EU sites (z = -6.14, p < 0.001). Mean pre-consent tracking-cookie counts are 2.82 (US) and 1.56 (EU), a ratio of 1.81. Pre-consent cookies of any kind are common in both buckets (~84%), reflecting functional, session, and CSRF cookies; the bucket difference is concentrated in the tracking subset.
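The z statistic above can be reproduced from the reported proportions and sample sizes. The following is a minimal sketch, assuming a pooled two-proportion z-test applied to the rounded percentages; the function name is illustrative, and rounding in the inputs can shift the last digit.

```python
import math

def two_proportion_z(p1: float, n1: int, p2: float, n2: int) -> tuple[float, float]:
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return z, p_value

# EU first, US second, using the figures reported in the text.
z, p = two_proportion_z(0.373, 1046, 0.506, 1055)
print(f"z = {z:.2f}, p = {p:.2g}")
```

With these inputs z comes out at about -6.14, in line with the reported value. The same function applied to the TCF and reject-control proportions gives values close to the reported 13.61 and 8.95.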
TCF v2 adoption is a 20-fold asymmetry
An active TCF v2 banner was detected on 18.4% of EU sites and just 0.9% of US sites (z = 13.61, p < 0.001). The CMP vendor distribution among TCF-using EU sites is dominated by Quantcast Choice (47%), Sourcepoint (28%), TrustArc (20%), and Sourcepoint MGR (13%).
Reject buttons are 2.7× more discoverable on EU sites
A findable reject control was detected on 22.5% of EU sites and 8.4% of US sites (z = 8.95, p < 0.001). Detection is a lower bound: the crawler does not navigate Manage Preferences sub-menus or cross-origin iframes. CNIL's 2021 sanctions explicitly targeted asymmetric prominence of reject vs accept; our data is consistent with that campaign producing improvement, but more than 75% of EU sites still fail the detector.
Reject is honored more often on EU sites — for tracking cookies specifically
Among sites where reject was clicked, tracking cookies did not increase on 82.6% of EU sites versus 65.2% of US sites (z = 3.36, p < 0.001). However, the proportion not contacting new tracker hosts after the click is statistically indistinguishable across buckets (69.4% EU vs 68.5% US, ns), and total cookie counts increased on a majority of sites in both buckets. Reject suppresses tracking-cookie creation more than it suppresses tracker-host contact.
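The two reject-honoring outcomes above reduce to simple comparisons between the pre-interaction and post-reject snapshots. The following is a minimal sketch, assuming a hypothetical `Snapshot` record; the field names and example values are illustrative and are not taken from the paper's codebase.

```python
from dataclasses import dataclass, field

@dataclass
class Snapshot:
    """Hypothetical per-state capture: tracking-cookie names and tracker hosts."""
    tracking_cookies: set[str] = field(default_factory=set)
    tracker_hosts: set[str] = field(default_factory=set)

def reject_outcomes(pre: Snapshot, post: Snapshot) -> dict[str, bool]:
    return {
        # Outcome 1: the tracking-cookie count did not increase after the click.
        "cookies_not_increased": len(post.tracking_cookies) <= len(pre.tracking_cookies),
        # Outcome 2: no tracker host was contacted for the first time after the click.
        "no_new_tracker_hosts": post.tracker_hosts <= pre.tracker_hosts,
    }

pre = Snapshot({"_ga"}, {"google-analytics.com"})
post = Snapshot({"_ga"}, {"google-analytics.com", "doubleclick.net"})
print(reject_outcomes(pre, post))
```

In this made-up example the site passes the cookie outcome but fails the host outcome, which is the pattern the paragraph above describes: reject can suppress new tracking cookies while new tracker hosts are still contacted.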
The jurisdictional effect persists after multivariate controls
A logistic regression predicting pre-consent tracking from bucket, log10(Tranco rank), TCF state, and major-CMP presence finds a US-over-EU odds ratio of 1.30 (p = 0.048). Signs are stable across three outcome operationalizations (cookies only, hosts only, combined), with the disaggregated cookies-only specification yielding OR = 1.60 (p < 0.001).
Active TCF state is associated with 76% lower odds of pre-consent tracking
An actively running TCF v2 integration at crawl time carries a strong negative coefficient (OR = 0.24, p < 0.001). We read this as a selection effect among sites that have invested in substantive consent compliance, not as a causal property of the framework itself; the framework does not enforce a minimum behavior on the deploying site.
Major-CMP presence predicts more pre-consent tracking, not less
In the same model, the presence of one of the top ten commercial CMP vendors is associated with an 8-fold increase in the odds of pre-consent tracking (OR = 8.04, p < 0.001). The plausible interpretation is selection: sites with the budget and adtech sophistication to procure an enterprise CMP are also sites with substantial monetization stacks running before consent. The two predictors that share a deployment substrate point in opposite directions — the multivariate footprint of compliance signaling divergence.
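The odds ratios reported for the regression findings are exponentiated logit coefficients, and the percentage and fold figures in the headings follow from them. The following is a short sketch of that arithmetic using the three ORs quoted above; the dictionary keys are illustrative labels, not variable names from the paper.

```python
import math

reported_or = {"US vs EU": 1.30, "TCF active": 0.24, "major CMP": 8.04}

for name, odds_ratio in reported_or.items():
    beta = math.log(odds_ratio)          # logit coefficient implied by the OR
    pct_change = (odds_ratio - 1) * 100  # percent change in the odds
    print(f"{name}: beta = {beta:+.2f}, odds change = {pct_change:+.0f}%")
```

An OR of 0.24 corresponds to 76% lower odds, and an OR of 8.04 corresponds to odds roughly eight times as high. These are changes in odds, not in probability; the two diverge when the baseline rate is far from zero.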
Case study, eldiario.es: clicking Reject All doubles the cookie count
eldiario.es is a Spanish online news outlet in the Tranco top 1,200 deploying a Quantcast Choice CMP (TCF cmpId 7) with a clickable reject control. Pre-interaction: 16 cookies and 7 tracker hosts (Amazon advertising, Criteo, Google Analytics, Google Tag Manager, others). Post-reject: 35 cookies and 15 tracker hosts, including TikTok ads, Taboola, Facebook, Twitter ads, Hotjar, and DoubleClick, all newly contacted only after the user clicked the reject button. The visible compliance signal and the underlying behavior point in opposite directions.
Signs are stable across three outcome operationalizations
Refitting the multivariate model under cookies-only, tracker-hosts-only, and combined outcome definitions yields sign-consistent coefficients for bucket, TCF, and major-CMP across all three. Bonferroni correction over the eight primary tests adjusts alpha to 0.00625; all headline results remain significant.
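The adjusted threshold is the family-wise alpha divided by the number of tests. The following is a small sketch, assuming a family-wise alpha of 0.05 (implied by 0.00625 × 8) and checking only the four descriptive z statistics reported in Findings 1 through 4. The remaining primary tests are not enumerated on this page, so they are left out.

```python
import math

FAMILY_ALPHA = 0.05  # assumed; implied by 0.00625 * 8
N_TESTS = 8
adjusted_alpha = FAMILY_ALPHA / N_TESTS

# Absolute z statistics from the four descriptive comparisons.
z_scores = {
    "pre-consent tracking": 6.14,
    "TCF adoption": 13.61,
    "reject discoverability": 8.95,
    "reject honored": 3.36,
}

for name, z in z_scores.items():
    p = math.erfc(z / math.sqrt(2))  # two-sided normal p-value
    print(f"{name}: p = {p:.2g}, below adjusted alpha = {p < adjusted_alpha}")
```

All four comparisons fall below the adjusted threshold under this normal approximation.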
The audit target should be the first second of page load, not the banner
Of TCF-using EU sites in the sample, 35% have at least one tracking cookie set before any user interaction. The TCF does not enforce a minimum behavior on the deploying site; the artifact's presence does not certify compliance. The 14-percentage-point gap in reject-control discoverability is the clearest UX-level enforcement target. Regulators auditing compliance via the presence of compliance artifacts can be misled; the audit target needs to be the underlying behavior.
Four steps, two states per site.
The instrument is designed to be replicable on commodity hardware and to surface the gap between visible consent infrastructure and underlying tracking behavior. Three honest limitations of the host-based tracker classification are reported in full in Section III.E of the paper (substring false-positives/negatives, CNAME cloaking, no payload inspection) — all three attenuate rather than amplify the EU/US contrast.
Stratified Site Selection
Top sites from the Tranco list, partitioned into an EU stratum (28 member-state ccTLDs plus .eu) and a US stratum (.com/.net/.org/.us with a 51-entry exclusion list for non-US-headquartered properties). 78 infrastructure domains are filtered out.
Two-State Crawl Protocol
Playwright with headless Chromium 126. State A: navigate, wait 14s, snapshot cookies, query window.__tcfapi, record tracker hosts. State B: locate and click a reject control via CMP-specific selectors plus a multilingual heuristic across 14 European languages, re-snapshot.
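The two states can be sketched with Playwright's synchronous Python API. This is a simplified illustration, not the study's crawler: the reject phrases are a hypothetical handful rather than the 14-language heuristic, CMP-specific selectors are omitted, and `crawl_two_states` and its return keys are names invented here. It records every requested host; tracker classification is a separate step.

```python
import re
from urllib.parse import urlparse

# Hypothetical phrases in a few languages; the study's heuristic is broader.
REJECT_RE = re.compile(
    r"\b(reject all|decline|alle ablehnen|tout refuser|rechazar todo|rifiuta tutto)\b",
    re.IGNORECASE,
)

def looks_like_reject(label: str) -> bool:
    """True when a button label matches one of the reject phrases."""
    return bool(REJECT_RE.search(label))

def crawl_two_states(url: str, settle_ms: int = 14_000) -> dict:
    # Imported lazily so looks_like_reject() is usable without Playwright installed.
    from playwright.sync_api import sync_playwright

    hosts: dict[str, set[str]] = {"A": set(), "B": set()}
    state = "A"

    def record(request) -> None:
        host = urlparse(request.url).hostname
        if host:
            hosts[state].add(host)  # reads the current value of `state`

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()
        page = context.new_page()
        page.on("request", record)

        # State A: load, wait, snapshot without any interaction.
        page.goto(url, wait_until="domcontentloaded")
        page.wait_for_timeout(settle_ms)
        cookies_a = context.cookies()
        tcf_present = page.evaluate("() => typeof window.__tcfapi === 'function'")

        # State B: click the first visible button that looks like a reject control.
        state = "B"
        clicked = False
        for button in page.get_by_role("button").all():
            if button.is_visible() and looks_like_reject(button.inner_text()):
                button.click()
                clicked = True
                break
        if clicked:
            page.wait_for_timeout(settle_ms)
        cookies_b = context.cookies()
        browser.close()

    return {
        "tcf_present": tcf_present, "reject_clicked": clicked,
        "cookies_a": cookies_a, "cookies_b": cookies_b,
        "hosts_a": hosts["A"], "hosts_b": hosts["B"],
    }
```

The button search here only covers the top-level frame, so controls inside iframes or preference sub-menus are missed, and any detection rate from a sketch like this is a lower bound.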
Cookie & Tracker Classification
Cookies are labelled against the Open Cookie Database (2,245 known names) into Functional, Analytics, Marketing, or Unknown. Network requests are matched against 64 curated tracker-host substrings spanning ad networks, analytics, session replay, tag managers, and adtech CDNs.
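Both classification steps are lookups: an exact match on cookie name and a substring match on request host. The following is a minimal sketch with a few illustrative entries; the category assignments and host fragments are assumptions made for the example, not excerpts from the Open Cookie Database or the study's 64-entry list.

```python
# Illustrative entries only; category labels here are assumed for the example.
COOKIE_CATEGORIES = {"_ga": "Analytics", "_fbp": "Marketing", "PHPSESSID": "Functional"}
TRACKER_SUBSTRINGS = ("doubleclick", "google-analytics", "criteo", "hotjar", "taboola")

def classify_cookie(name: str) -> str:
    """Exact-name lookup; anything not in the table is 'Unknown'."""
    return COOKIE_CATEGORIES.get(name, "Unknown")

def is_tracker_host(host: str) -> bool:
    """Substring match of the request host against the tracker fragments."""
    host = host.lower()
    return any(fragment in host for fragment in TRACKER_SUBSTRINGS)

print(classify_cookie("_ga"), classify_cookie("my_session"))
print(is_tracker_host("stats.g.doubleclick.net"), is_tracker_host("cdn.example.org"))
```

Substring matching is cheap but imprecise: an unrelated host whose name happens to contain a fragment is flagged, and a tracker served from a first-party CNAME is missed.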
Multivariate Modeling
Logistic regression of pre-consent tracking (binary) on bucket, log10(Tranco rank), TCF active state, and major-CMP presence. Three model specifications (M1–M3) are reported with Bonferroni-adjusted significance and McFadden pseudo-R squared.
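Written out, the model described above takes roughly the following form. This is a sketch that assumes a main-effects specification with no interaction terms and the US bucket coded as 1 (consistent with the US-over-EU odds ratio); which predictors enter each of M1 through M3 is not specified here, and the symbol names are chosen for illustration.

```latex
\operatorname{logit} \Pr(Y_i = 1)
  = \beta_0
  + \beta_1\,\mathrm{US}_i
  + \beta_2 \log_{10}(\mathrm{rank}_i)
  + \beta_3\,\mathrm{TCF}_i
  + \beta_4\,\mathrm{CMP}_i,
\qquad
\mathrm{OR}_k = e^{\beta_k},
\qquad
R^2_{\mathrm{McF}} = 1 - \frac{\ln \hat{L}_{\mathrm{model}}}{\ln \hat{L}_{\mathrm{null}}}
```

Here Y_i indicates pre-consent tracking on site i, and the null likelihood is that of the intercept-only model.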
Source code, raw per-site JSONL captures, the 24-column analytical table, regression outputs, and the 100-site jurisdictional-validation sample are available on request. Replication requires an aarch64 Linux host with Python 3.10+, Playwright 1.41+, Chromium installed, and the Tranco list. A paired EU-vantage replication is recommended as future work; Bright Data residential proxies dropped ~85% of subresource CONNECT calls in our trials, breaking CMP initialization.
Audit the behavior, not the artifact.
Three recommendations follow directly from the data. The first two are concrete enforcement targets; the third is a structural shift in how supervisory authorities and CMP buyers evaluate consent compliance.
Audit at the first second of page load, not at the banner
An actively running TCF integration does not certify ePrivacy compliance. In this sample, 35% of TCF-using EU sites place at least one tracking cookie before any user interaction. The instrument-level audit target is the pre-interaction state of the cookie jar and the third-party request log, not the presence of the banner.
Enforce equal-prominence reject UX and measure the click-cost to reach it
Only 22.5% of EU sites in this sample surface a findable reject control at the top level; CNIL's 2021 line on prominence is right but under-enforced. Future measurement work should report the number of clicks required to reach reject, and supervisory authorities should treat a reject control that takes more than one click to reach as a presumptive ePrivacy violation.
Shift the procurement frame from CMP presence to behavioral verification
Major-CMP-vendor presence in this sample is associated with more pre-consent tracking, not less. The visible deployment of an enterprise CMP is not, on the evidence here, a reliable buy-side signal of compliance. Procurement and audit teams should require demonstrable pre-interaction silence in the cookie jar and third-party request log, with periodic external replication.
Get in touch.
Interested in cookie consent measurement, ePrivacy enforcement, TCF compliance, or the wider gap between compliance artifacts and behavior? I welcome citation requests, academic collaboration, regulatory inquiries, and procurement-side replication requests.
Contact Noah