Website tracking, i.e. tracking of users and their digital habits, is a pervasive phenomenon, and the methods by which it is done are becoming ever more sophisticated.
Most people know of the cookie: the small file that websites place on a visitor's computer, which allows them to obtain personal information about that specific user. But cookies are only one of many technologies that track, monitor and log user behavior.
The purpose of web tracking is for organizations, companies, websites etc. to gain insight into their users, their behavior and preferences. These insights serve to optimize user-friendliness and experience, as well as for statistical purposes, for customization, for commerce, and for profiling and targeted marketing.
However, website tracking also serves more sinister purposes, as revealed in the vast privacy and election scandals that unfolded in 2016 and continue to this day.
The two biggest and most pervasive scandals, sure to be remembered for decades to come and already acknowledged as watershed moments in public privacy awakening, are –
Web tracking is a complicated matter: a smart strategy for commercial businesses, as well as an unregulated threat to the very freedoms that make commercial markets possible.
So, we give here a broad overview of the phenomenon of website tracking: what it is, how it is done, what the technologies and methods are, their implications for our democracies, and how Cookiebot fits into all of this.
Website tracking is the practice by which data is collected about a person's behavior online.
When a user browses on the internet, everything might potentially be tracked:
Website tracking is when users’ digital activity on a website or journey between websites is being monitored or recorded. It's very common, but the transparency leaves room for improvement.
It is often not made clear to users when they are being tracked, how, by whom, where the data is sent and for what purpose, and the tracking often happens without their consent.
This is a primary reason why the EU has adopted a strict regulation for protecting the privacy of its citizens in the digital realm: the GDPR (the General Data Protection Regulation), enforceable since 25 May 2018.
The purpose is to restore the control over one’s own data to the users, by augmenting the transparency and insight into how one is being tracked, by whom, and for what purpose, along with the possibility to prevent it from happening.
There is no simple answer to this question, because tracking is many things. Among the most common reasons are:
Websites track users directly and by means of integrated third party tools such as Google Analytics, mainly in order to gain insight into how their website is being used.
This enables the website’s owner to improve and optimize the functions, functionality and features of their website, so that it meets user requirements as closely as possible.
Webshops and ecommerce websites track users in order to maximize their turnover.
Simply put, the more insight a commercial website has into its customers’ actions, interests and needs, the better it can present its products to the specific user, the more it will sell.
Websites also allow third party advertisers to track their users and display ads to them, in order to earn revenue from their website.
News sites and other websites with editorial content in particular have a large presence of third party website trackers. Many of these sites provide articles for free and lack external funding. Therefore, they have to monetize pageviews with significantly more advertising than websites promoting commercial products or websites owned by governmental or public entities.
Advertisers track users so that they can target their marketing as precisely as possible and display their ads to the most relevant potential consumers.
The technology that allows companies to place ads on somebody else's website is called display advertising.
Our own Cookiebot report on ad tech surveillance on EU government websites reveals some of the scary implications of third party cookies.
Typically, advertisers make use of large scale ad networks to help them market their products to their most relevant audience on the internet. The largest online advertising network is Google AdSense.
This form of targeted advertisement is made possible only through the collection of user data, which in turn is done by website tracking tools.
In January 2016, a study from Princeton University measured and analyzed the online tracking on the top 1 million websites of the internet.
The key finding of the study was that the distribution of third party trackers on the internet takes the form of a classic long tail graph:
Illustration from the Princeton Study on Web Transparency
Even though the researchers detected, in total, over 81,000 third party trackers that were present on at least two websites (which is what identifies them as third party trackers), only a handful of those were present on the majority of the 1 million analyzed websites.
The top five most common website tracking tools were all owned by Google.
Google Analytics, a product used to log visitors to websites that integrates with the company’s ad-targeting systems, was found on almost 70 percent of sites. DoubleClick, a dedicated ad-serving system from Google, was found on close to 50 percent of sites.
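The counting behind such a long tail can be sketched in a few lines. This is only an illustration of the idea described above (a domain counts as a third party tracker once it shows up on at least two sites), not the study's actual tooling. The function names and the toy crawl data are invented for the example.

```python
from collections import Counter

def third_party_prevalence(site_to_domains):
    """Count, for each embedded domain, how many distinct sites load it."""
    counts = Counter()
    for site, domains in site_to_domains.items():
        # A set avoids counting a domain twice on the same site;
        # the site's own domain is first party and is left out.
        counts.update(set(domains) - {site})
    # Keep only domains seen on at least two sites, the criterion
    # the text gives for treating a domain as a third party.
    return {domain: n for domain, n in counts.items() if n >= 2}

def ranked(prevalence):
    """Sort by number of sites, most common first (ties broken by name)."""
    return sorted(prevalence.items(), key=lambda item: (-item[1], item[0]))

# Toy crawl: every site and tracker name here is made up.
crawl = {
    "news.example": ["analytics.example", "ads.example", "news.example"],
    "shop.example": ["analytics.example", "ads.example"],
    "blog.example": ["analytics.example", "rare-widget.example"],
}
print(ranked(third_party_prevalence(crawl)))
```

On a real crawl, the head of this ranked list would hold the few trackers found almost everywhere, and the tail would hold the tens of thousands seen on only a few sites.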
It is one thing to know that web page trackers collect user data, and another to understand how this data can be used in ways that are not immediately visible or intuitive to us but can have far-reaching consequences for our right to privacy and equal treatment.
The term digital phenotyping describes the process by which our online behavior can be used to obtain insight into and map out our health, and thereby also potential health risks and issues. This means that trivial and benign data collected by a website page tracker can be turned into telling clues with accurate prediction abilities.
As an example, research suggests that early Parkinson’s disease can be detected from typing patterns on keyboards, and that the language used in social media posts can predict depressive episodes. This is all data that website user tracking collects every second from millions of people all over the world.
Other known online website tracking technologies are tracking pixels (or pixel tags), web beacons (or ultrasound beacons), and browser fingerprinting (or digital fingerprinting), amongst others.
The cookie is a simple string of text that is stored in users’ browsers when they visit a website. Its purpose is to enable the website to recognize and remember its users. Cookies make up the majority of website trackers online.
The cookie was invented back in 1994 by Lou Montulli and John Giannandrea at Netscape, and originally served to provide websites with a ‘memory’, so that they could, for example, hold items in a shopping cart while the user browsed for goods on the site.
While the cookie still serves this purpose, it can also monitor users and give a great deal of insight into user behavior.
The cookie is widely used for profiling and targeted marketing, and most websites set a great many cookies of first and third party provenance alike.
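How a cookie lets a site "remember" a visitor can be shown with a minimal sketch using Python's standard `http.cookies` module. The cookie name `visitor_id`, the one-year lifetime and the `handle_request` function are assumptions made for this example, not how any particular website or tracker is implemented.

```python
from http.cookies import SimpleCookie
import uuid

def handle_request(cookie_header, known_visitors):
    """Return (visitor_id, Set-Cookie value or None) for one page request.

    cookie_header is the text of the browser's Cookie request header,
    or None on a first visit. known_visitors maps IDs to visit counts.
    """
    cookie = SimpleCookie()
    cookie.load(cookie_header or "")
    if "visitor_id" in cookie:
        # Returning visitor: the browser sent back the ID we set earlier.
        visitor_id = cookie["visitor_id"].value
        known_visitors[visitor_id] = known_visitors.get(visitor_id, 0) + 1
        return visitor_id, None
    # New visitor: mint a random ID and ask the browser to store it.
    visitor_id = uuid.uuid4().hex
    known_visitors[visitor_id] = 1
    response = SimpleCookie()
    response["visitor_id"] = visitor_id
    response["visitor_id"]["path"] = "/"
    response["visitor_id"]["max-age"] = 60 * 60 * 24 * 365  # one year
    return visitor_id, response["visitor_id"].OutputString()

visits = {}
first_id, set_cookie = handle_request(None, visits)
second_id, _ = handle_request(f"visitor_id={first_id}", visits)
print(first_id == second_id, visits[first_id])
```

The same mechanism serves a shopping cart and a tracker alike. The difference lies in who sets the cookie and what is logged against the ID.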
There are also many different cookies: necessary cookies, analytics cookies or statistics cookies, marketing cookies or advertising cookies. The strictly necessary cookies function to make your website operate its most basic functions so that a visitor can visit it. These rarely if ever have any way of tracking users.
However, analytics cookies or statistics cookies are most often third party cookies that track and log user behavior to give insight to the website owner. Marketing cookies and advertising cookies are also most often third party cookies that serve to make targeted advertisement possible. These cookies are website tracking tools both for the companies using them to optimize their sales and for companies like Google and the entire ad tech industry.
Advertising cookies, marketing cookies, analytics cookies, statistics cookies – a lot of different names for the same phenomenon: a way to gain insight into a website’s users for different purposes, but with the same dire implications if left unregulated.
There has been quite a bit of negative public attention to cookies, and many users choose to block cookies from their browsers in an attempt to avoid internet website trackers.
Read our full introduction to internet cookies.
Tracking pixels, also called pixel tags or 1x1 pixels, are transparent images consisting of a single pixel, that are present (albeit virtually invisible) on a webpage or in an email.
When a user loads the webpage or opens the email, the tracking pixel is also loaded, enabling the sender of the tracking pixel, typically an ad server, to record that the webpage has been loaded or the email opened, along with similar user activities.
The purpose is much the same as for third party cookies: to get insight into users for targeted marketing.
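The mechanism can be sketched as follows: the sender embeds an image whose URL carries identifiers, and the server that delivers the image logs the request. This is a simplified illustration. The parameter names `c` and `r`, the domain `tracker.example` and both function names are assumptions made for the example.

```python
from html import escape
from urllib.parse import urlencode, urlsplit, parse_qs

def pixel_tag(base_url, campaign, recipient):
    """Build the <img> tag a sender might embed in a page or an email."""
    url = f"{base_url}?{urlencode({'c': campaign, 'r': recipient})}"
    # Escape the URL so the '&' is valid inside an HTML attribute.
    return f'<img src="{escape(url, quote=True)}" width="1" height="1" alt="">'

def record_pixel_hit(request_url, headers, client_ip):
    """Collect what a tracking server can see when the pixel is fetched."""
    params = parse_qs(urlsplit(request_url).query)
    return {
        "campaign": params.get("c", [None])[0],
        "recipient": params.get("r", [None])[0],
        "ip": client_ip,                       # taken from the connection
        "user_agent": headers.get("User-Agent"),
        "referer": headers.get("Referer"),     # page that embedded the pixel
    }

tag = pixel_tag("https://tracker.example/p.gif", "spring-sale", "user-42")
hit = record_pixel_hit(
    "https://tracker.example/p.gif?c=spring-sale&r=user-42",
    {"User-Agent": "ExampleBrowser/1.0"},
    "203.0.113.7",
)
print(tag)
print(hit)
```

Nothing visible happens for the user: the browser or mail client simply fetches an image, and that fetch is itself the tracking event.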
Information that can be obtained by websites and third parties via tracking pixels includes:
Tracking pixels are a widely used form of analytical or statistical tracking, but one that the General Data Protection Regulation deems unlawful if it is not first consented to by the user.
Web beacons are a variety of techniques for tracking users online. Some of them are known as ultrasound beacons (or ultrasonic beacons, sometimes abbreviated uBeacons) and these are high-pitched sounds that are emitted from the device in use, e.g. when you visit a website that has the web beacon installed.
The sounds emitted from these web beacons are inaudible to humans, but your dog can hear them and, more importantly, all the other devices in proximity to the one you are using can react to them.
Also called Ultrasonic cross-device tracking (uXDT), the uBeacon serves to bridge the gap between the digital world and the physical one.
One of the primary benefits of the ultrasound beacon is that it enables the sender to gain insight into which devices belong together: your PC, mobile, tablet, etc. This solves a headache shared by marketers and other trackers alike, namely that users move between devices.
More and more mobile apps silently track users by means of ultrasound beacons for other sophisticated purposes:
For example, some retail stores have ultrasound beacons installed at their entrance that interact with your mobile phone when you go inside, enabling marketers to track and target consumers in the physical world as well as online.
So if, for instance, you went to a brand store for, say, sneakers, and the store had a uBeacon emitter installed at its entrance, this particular brand of sneakers now knows that you may be interested in their shoes, even if you never went to their website or searched for their shoes online.
Even if a user blocks tracking cookies and uses a VPN to mask their IP address, there are still other methods for tracking users.
One of them is browser fingerprinting, which exploits the uniqueness of your specific computer, device or browser.
Whenever a user visits a website, their computer or device provides the site with highly specific information about their system and settings. The use of this information to identify and track users is known as device or browser fingerprinting, sometimes also referred to as digital fingerprinting.
A browser fingerprint is thus a collection of many different pieces of information about a user's device, combined to create a sort of "fingerprint" for that device that can be tracked across the web.
This browser fingerprint can consist of -
Each piece of information might seem benign at first glance, but combined they can form a unique browser fingerprint that stands out as one among millions of other devices.
Browser fingerprinting is frighteningly accurate: it can successfully identify users 99 percent of the time.
It also means that even if users take privacy precautions, such as using VPNs and blocking cookies in their browser settings, a browser fingerprint, unique to their devices in use, can re-identify the user when they visit a website.
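A rough sketch of the principle: individually harmless attributes are combined and hashed into one stable identifier. The attribute names and values below are assumptions chosen for illustration. Real fingerprinting scripts collect many more signals and use more robust matching than an exact hash.

```python
import hashlib
import json

def browser_fingerprint(attributes):
    """Hash a dict of device/browser attributes into a stable identifier."""
    # Sorting the keys makes the result independent of collection order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Hypothetical attributes a script could read without any cookie.
device = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 12)",
    "screen": "2560x1440x24",
    "timezone": "Europe/Copenhagen",
    "language": "da-DK",
    "fonts": ["Arial", "Helvetica", "Noto Sans"],
}

visit_without_cookies = browser_fingerprint(device)
# A VPN changes the IP address but none of the attributes above.
visit_via_vpn = browser_fingerprint(dict(reversed(list(device.items()))))
other_device = browser_fingerprint({**device, "screen": "1920x1080x24"})

print(visit_without_cookies == visit_via_vpn)    # same device re-identified
print(visit_without_cookies == other_device)     # a different device differs
```

Because the identifier is derived from the device itself rather than stored on it, clearing cookies does not reset it.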
In addition to regular cookies, tracking pixels, pixel tags, web beacons, ultrasound beacons and browser fingerprinting technologies, there exist other methods for tracking users, such as undeletable zombie cookies or super cookies, dynamic cookies, Silverlight Isolated Storage, IndexedDB, etc.
As the world is coming to realize, in the digital age data is an extremely valuable asset that can be used for everything from owning markets and affecting the masses to winning elections.
The methods for gaining insight and tracking users are always evolving, and the means are impressively creative.
The GDPR and the ePrivacy Directive are two EU initiatives to regulate user tracking and protect personal privacy. They are probably not in themselves sufficient, but they are important steps in the right direction: towards a balanced regulation of the ad tech industry and its surveillance capitalism.
Imagine that you’re driving down a highway in your car one late afternoon.
On the not-so-distant horizon you see an empty billboard at the side of the road. As you speed towards the billboard, thousands of companies are engaging in an invisible auction in real time, the highest bidder buying the opportunity to showcase their product on the billboard the second you pass it by.
However, these companies are bidding on more than just the commercial space on an empty digital highway.
They know that you are going to pass by the billboard, because they know which road you’re driving on, just as well as they know which car you’re driving in, what music you’re listening to, how fast you’re going, what you had for lunch, how much gas is in the tank, when you bought your car, oh and what your name is, where you came from, the color of your hair, your vices, dreams and fears.
The billboard for sale is more than an empty advertisement space: it is a tailored and targeted attempt at swaying your behavior using everything there is to know about you against you.
Aware of patterns in your behavior that you don’t even see yourself, these companies monetize your person through targeted advertisement and real time bidding auctions, where data brokers sell to commercial companies the digital billboards timed to the micromoments in which you are most swayable, most easily herded and nudged.
This is surveillance capitalism, as coined and elaborated by Harvard Professor emerita Shoshana Zuboff in her seminal work “The Age of Surveillance Capitalism”.
Surveillance capitalism, as Zuboff describes, is private human experience commodified, bought and sold as behavioral data, which creates whole new markets based on the predictive analyses of said behavioral data.
Surveillance capitalism is eroding democracy from the inside-out, she warns, because the market is driven to find its most predictive behavioral data by intervening, shaping and herding humans towards its commercial outcome.
Website tracking is the harvester of uncountable amounts of personal information every day from billions of people around the planet. For a very long time it has been non-consensual on the part of the users, the people whose data has been obtained, assembled and monetized for the profit of third party companies.
Pushback against non-consensual website tracking is underway. The biggest push comes from the General Data Protection Regulation and its clear rules of prior and informed consent, which govern how websites may track users and how companies and organizations are allowed to handle personal information and user data.
The GDPR is a vital step to regulate the run-amok website tracking of the ad tech industry, and similar legislation is springing up around the world in its wake, such as California’s CCPA (California Consumer Privacy Act), as well as similar data protection law proposals in South Korea, India and Brazil.
The UK's data protection authority, the ICO, has described the ad tech industry as operating illegally, citing a lack of transparency in how data is processed and acted upon in the so-called real time bidding schemes that take place every time a user is served a targeted advertisement.
The truth is that the free services of the Internet are most often paid for by commercial enterprises. Or in other words, the digital roads and highways that you drive down in your car are paid for by the billboards on the side of those roads.
But a balance needs to be struck in how this ecosystem, through which the Internet is financed, operates. Regulators are now attempting to strike that balance.
Striking this balance is also our mission at Cookiebot.
Cookiebot is one of the few fully GDPR and ePR compliant solutions on the market.
Cookiebot consists of three main features:
The Cookiebot scan detects and identifies all known types of tracking on the website.
It scans all the pages of your website by directing 7-8 simulated users at your website with requests every 1.5 seconds. This is few enough not to interfere with your website’s performance, but enough to detect all the types of tracking taking place on your website, including dynamic cookies, ultrasound web beacons, pixel tags and fingerprinting.
When a user visits your website, Cookiebot deactivates all loaded scripts but the strictly necessary ones until the user has given their consent to the cookies, thereby complying with the requirement of prior consent.
All the cookies and other tracking technologies are listed and grouped into four easily understood categories that the user can opt in to or out of.
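The principle of prior consent can be sketched as a simple filter: nothing but strictly necessary scripts may run until the visitor has opted in to the relevant category. This is not Cookiebot's actual implementation. The category names, the `Script` type and the `scripts_to_run` function are assumptions made for illustration.

```python
from dataclasses import dataclass

# Assumed category names; "necessary" never requires consent.
CATEGORIES = ("necessary", "preferences", "statistics", "marketing")

@dataclass(frozen=True)
class Script:
    src: str
    category: str

def scripts_to_run(scripts, consent):
    """Return the scripts allowed under the visitor's current consent.

    consent maps category names to True/False; a missing category
    counts as not consented, so the default is to block.
    """
    allowed = {"necessary"} | {c for c in CATEGORIES if consent.get(c, False)}
    return [s for s in scripts if s.category in allowed]

page = [
    Script("/cart.js", "necessary"),
    Script("https://analytics.example/a.js", "statistics"),
    Script("https://ads.example/t.js", "marketing"),
]

before = scripts_to_run(page, {})                      # no choice made yet
after = scripts_to_run(page, {"statistics": True})     # partial opt-in
print([s.src for s in before])
print([s.src for s in after])
```

The important design choice is the default: an absent or unknown answer is treated as "no", so tracking starts only after an explicit opt-in.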
All received consents are securely stored as documentation that the consent has been given, also a requirement of the GDPR.
The monthly scan results in a full report of all tracking technologies in use on the site, giving insight and control to the website owner as to what tracking is in use on their site.
As required by the GDPR, your users may go back at any time and change their settings or withdraw their consent. In the cookie declaration, features are automatically provided for the user to change or withdraw their consent whenever they want.
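Storing consent as documentation while still letting users change their mind can be modelled as an append-only log in which the most recent entry wins. The sketch below is a simplified illustration under that assumption, not a description of how Cookiebot stores consents. The class and field names are hypothetical.

```python
from datetime import datetime, timezone

class ConsentLog:
    """Append-only record of consent decisions, keyed by an anonymous ID."""

    def __init__(self):
        self._records = []

    def record(self, consent_id, categories):
        """Store a new decision; earlier entries are kept as documentation."""
        entry = {
            "consent_id": consent_id,
            "categories": frozenset(categories),
            "timestamp": datetime.now(timezone.utc).isoformat(),
        }
        self._records.append(entry)
        return entry

    def withdraw(self, consent_id):
        """Withdrawal is recorded as a new decision with no categories."""
        return self.record(consent_id, ())

    def current(self, consent_id):
        """The latest decision applies; no record means no consent."""
        for entry in reversed(self._records):
            if entry["consent_id"] == consent_id:
                return entry["categories"]
        return frozenset()

    def history(self, consent_id):
        """All decisions for one ID, oldest first."""
        return [e for e in self._records if e["consent_id"] == consent_id]

log = ConsentLog()
log.record("visitor-abc", {"statistics", "marketing"})
log.withdraw("visitor-abc")
print(sorted(log.current("visitor-abc")), len(log.history("visitor-abc")))
```

Keeping the earlier entries instead of overwriting them is what allows the log to serve as proof that consent was given at a certain time, even after it has been withdrawn.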