Tracing Cross Border Web Tracking
Costas Iordanou, TU Berlin and UC3M
Georgios Smaragdakis, TU Berlin
Ingmar Poese, BENOCS
Nikolaos Laoutaris, Data Transparency Lab and Eurecat
IMC ’18: Proceedings of the Internet Measurement Conference 2018
A tracking flow is a flow between an end user and a web tracking service. We develop an extensive measurement methodology for quantifying, at scale, the number of tracking flows that cross data protection borders, be they national or international, such as the EU28 border within which the General Data Protection Regulation (GDPR) applies. Our methodology uses a browser extension to fully render advertising and tracking code, various lists and heuristics to extract well-known trackers, passive DNS replication to obtain all the IP ranges of trackers, and state-of-the-art geolocation. We apply our methodology to a dataset from 350 real users of the browser extension over a period of more than four months, and then generalize our results by analyzing billions of web tracking flows from more than 60 million broadband and mobile users of 4 large European ISPs. We show that the majority of tracking flows cross national borders in Europe but, contrary to popular belief, are largely confined within the larger GDPR jurisdiction. Simple DNS redirection and PoP mirroring can increase national confinement while sealing almost all tracking flows within Europe. Last, we show that cross-border tracking is prevalent even in sensitive and hence protected data categories and groups, including health, sexual orientation, minors, and others.
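To make the notion of confinement concrete, the following is a minimal Python sketch of how a tracking flow might be labeled once the user and the tracker's server have each been geolocated to a country. It is an illustration only, not the authors' implementation: the function names, the label strings, and the representation of a flow as a pair of ISO 3166-1 alpha-2 country codes are all assumptions made for this example.

```python
from collections import Counter

# EU28 member states as of 2018, as ISO 3166-1 alpha-2 codes.
EU28 = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR",
    "DE", "GR", "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL",
    "PL", "PT", "RO", "SK", "SI", "ES", "SE", "GB",
})

LEVELS = ("national", "within_eu28", "outside")


def confinement(user_country: str, server_country: str) -> str:
    """Label one flow by the tightest border it stays within (hypothetical helper)."""
    if user_country == server_country:
        return "national"
    if user_country in EU28 and server_country in EU28:
        return "within_eu28"
    return "outside"


def confinement_shares(flows):
    """Return the fraction of flows at each confinement level.

    `flows` is an iterable of (user_country, server_country) pairs;
    this input shape is an assumption of the sketch.
    """
    counts = Counter(confinement(u, s) for u, s in flows)
    total = sum(counts.values())
    if total == 0:
        return {level: 0.0 for level in LEVELS}
    return {level: counts[level] / total for level in LEVELS}
```

For example, `confinement_shares([("DE", "DE"), ("DE", "NL"), ("ES", "US"), ("FR", "IE")])` labels one flow as national, two as crossing a national border but staying within the EU28, and one as leaving the EU28. In practice the hard part is the geolocation that produces the country codes, which this sketch takes as given.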