Updated March 31, 2020.
Website tracking, i.e. tracking of users and their digital habits, is a pervasive phenomenon, and the methods by which it is done are becoming ever more sophisticated.
We give a broad overview of the phenomenon of website tracking - how it is done, what its implications are, what the law says about it and Cookiebot's role in the game.
Website tracking is the practice by which data is collected about a person's behavior online.
Most people know of the cookie - the small file that websites place on a visitor's computer that allows them to obtain personal information about that specific user.
Website owners are also generally familiar with the distinction between first-party cookies, i.e. cookies set by the website itself, and third-party cookies, set by external service providers.
What is less well known, however, is that a large majority of the third-party cookies act as back doors or trojan horses, loading further cookies from yet other parties.
A large-scale measurement study from 2020 on website tracking uncovers the alarming reality that most website owners and operators are not aware of –
99% of all cookies are used for web tracking and to serve targeted ads.
72% of all cookies are set by fourth parties, not third parties, i.e. trojan horses.
18% of all cookies are set by fifth parties or higher, i.e. deeper trojan horses.
50% of the additional parties loaded will change between repeated visits.
This trailblazing study out of Ruhr University Bochum and the Institute for Internet Security also showed that –
Subpages set 36% more cookies than a website’s front page or landing pages.
Subpages set an average of 78 cookies, while landing pages set an average of 55 cookies.
These findings show the “dire need for privacy protection mechanism to limit cookie-based tracking”.
They show that website visitor tracking is an issue that is impossible for website owners to manage without technology that can scan, uncover and control all cookies and the trojan horses they load.
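To make the idea of fourth and fifth parties more concrete, here is a minimal Python sketch of how a scanner might assign a "party depth" to a cookie from the chain of resources that led to it being set. The function names, the example URLs and the counting rule are assumptions made for illustration; this is not the methodology of the study cited above, nor a description of how any particular scanner works.

```python
from urllib.parse import urlparse


def site_of(url: str) -> str:
    """Crude 'site' key: the last two labels of the hostname.

    Assumption: good enough for names like example.com; a real tool
    would consult the Public Suffix List (think of co.uk domains).
    """
    host = urlparse(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def party_depth(page_url: str, inclusion_chain: list[str]) -> int:
    """Return 1 for a first-party cookie, 3 for third party, 4 for fourth, ...

    inclusion_chain lists the URLs of the resources that led to the cookie,
    outermost first: the script the page embeds, then whatever that script
    loaded, and so on.
    """
    seen = [site_of(page_url)]
    for url in inclusion_chain:
        site = site_of(url)
        if site not in seen:
            seen.append(site)
    # By convention the visitor is the "second party", so the first
    # external site in the chain counts as the third party.
    return 1 if len(seen) == 1 else len(seen) + 1


# Hypothetical chain: the page embeds a tag from ads-a.test,
# which in turn loads a script from ads-b.test that sets the cookie.
depth = party_depth(
    "https://shop.example/",
    ["https://cdn.ads-a.test/tag.js", "https://sync.ads-b.test/pixel.js"],
)
print(depth)  # 4, i.e. a fourth-party cookie
```

The point of the sketch is that the website owner only ever chose to embed the first link in the chain; everything deeper arrives without their involvement.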
Try Cookiebot free for 30 days... or forever if you have a small website.
Cookiebot is a consent management solution that deep-scans your domain to find all cookies and similar tracking technologies. We help make your website’s use of web tracking compliant with the EU’s GDPR and California’s CCPA.
Cookiebot's GDPR/ePR compliant cookie consent banner in EU to prevent non-compliant web tracking.
Then Cookiebot automatically controls them by blocking everything until the end-users have given their consent.
Cookiebot is one of the world’s leading consent management solutions (CMP), enabling real compliance with the European GDPR/ePR and California’s CCPA by offering granular consent and opt out solutions for websites.
Cookiebot’s CCPA compliant cookie declaration in California to prevent non-compliant web tracking.
The purpose of web tracking is for organizations, companies, websites etc. to gain insight into their users, their behavior and preferences. Website visitor tracking is so pervasive that, as mentioned before, 99% of all cookies on websites are used for this purpose.
The insights that web tracking yields serve to optimize user-friendliness and experience, and are also used for statistical purposes, customization, commerce, profiling and targeted marketing.
However, website tracking also serves more sinister purposes, as revealed in the vast privacy and election scandals that unfolded in 2016 and continue to this day.
The two biggest and most pervasive scandals, sure to be remembered for decades to come and already acknowledged as watershed moments in public privacy awakening, are:
Personal data harvested through website visitor tracking is not “just” data, not just series of numbers and random facts about people, but in fact a powerful tool that can be used to manipulate, discriminate against and infringe on the lives of real, living persons.
That’s why compliance with the data protection laws emerging across the world is paramount, not just to avoid fines, but to protect your users and their private lives.
Website tracking tools are myriad, and they get more and more sophisticated, as they try to circumvent measures put in place to control or block them.
One of the most common website tracking tools is the cookie, and cookies can track a lot of different things about people on your domain.
When a user browses on the internet, everything might potentially be tracked:
Website tracking is when users’ digital activity on a website or journey between websites is being monitored or recorded. It's very common, but the transparency leaves room for improvement.
It's not made clear to users when they are being tracked, how, by whom, where the data is sent and for what purpose, and the tracking happens without their consent.
User tracking without consent is illegal according to the EU’s GDPR. If your website has visitors from inside the European Union, you must obtain their prior, explicit consent to collect and process their personal data. Read more about the GDPR here.
The EU and California have enacted strict regulations for protecting privacy in the digital realm with the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA).
The purpose of these data privacy laws is to empower individuals with agency and control over their own data by forcing ad tech companies and big business to be transparent and give insight into how one is being tracked, by whom, and for what purpose, along with the possibility to prevent it from happening.
GDPR requires that websites must obtain the prior consent of users before any collection and processing of their personal data can take place.
CCPA forces businesses to feature a Do Not Sell My Personal Information link on their website so that users can opt out of having their data sold to third parties.
Both laws also empower users with the rights to be informed and to have collected data deleted.
There is no simple answer to this question, because web tracking is many things. Among the most common reasons are:
Websites track users directly and by means of integrated third-party tools such as Google Analytics, mainly in order to gain insight into how their website is being used.
This enables the website’s owner to improve and optimize the functions, functionality and features of their website, so that it meets user requirements as closely as possible.
Webshops and ecommerce websites track users in order to maximize their turnover.
Simply put, the more insight a commercial website has into its customers’ actions, interests and needs, the better it can present its products to the specific user, the more it will sell.
Websites also allow third-party advertisers to track their users and display ads to them in order to generate revenue from their website.
News sites and other websites with editorial content in particular have a large presence of third-party website trackers. Many of these sites provide articles for free and lack external funding. Therefore, they have to monetize pageviews with significantly more advertising than websites promoting commercial products or websites owned by governmental or public entities.
Advertisers track users so that they can target their marketing as precisely as possible and display their ads to the most relevant potential consumers.
The technology that allows companies to place ads on somebody else's website is called display advertising.
Our own Cookiebot report on ad tech surveillance on EU government websites reveals some of the scary implications of third-party cookies.
Typically, advertisers make use of large-scale ad networks to help them market their products to their most relevant audience on the internet. The largest online advertising network is Google AdSense.
This form of targeted advertisement is made possible only through the collection of user data, which in turn is done by website tracking tools.
In January 2016, a study from Princeton University measured and analyzed the online tracking on the top 1 million websites of the internet.
The key finding of the study was that the distribution of third-party trackers on the internet takes the form of a classic long-tail graph:
Illustration from the Princeton Study on Web Transparency
Even though the researchers all in all detected over 81,000 third-party trackers that were present on at least two websites (thereby indicating that they are third-party trackers), only a handful of those were present on the majority of the 1 million analyzed websites.
The top five most common website tracking tools were all owned by Google.
Google Analytics, a product used to log visitors to websites that integrates with the company’s ad-targeting systems, was found on almost 70 percent of sites. DoubleClick, a dedicated ad-serving system from Google, was found on close to 50 percent of sites.
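The long-tail pattern described above is easy to reproduce on a small scale. The following Python sketch ranks third-party domains by the share of sites on which they appear, keeping only those seen on at least two sites, as in the study's rule of thumb. The crawl data, domain names and function name are invented for illustration and are not taken from the Princeton study.

```python
from collections import Counter


def tracker_prevalence(
    sites: dict[str, set[str]], min_sites: int = 2
) -> list[tuple[str, float]]:
    """Share of sites on which each third-party domain appears, most common first.

    Domains seen on fewer than min_sites sites are dropped, mirroring the
    "present on at least two websites" criterion mentioned above.
    """
    counts = Counter(domain for domains in sites.values() for domain in domains)
    total = len(sites)
    return [
        (domain, n / total)
        for domain, n in counts.most_common()
        if n >= min_sites
    ]


# Hypothetical crawl results: site -> third-party domains observed on it.
crawl = {
    "news.example": {"analytics.test", "ads.test", "widgets.test"},
    "shop.example": {"analytics.test", "ads.test"},
    "blog.example": {"analytics.test"},
    "forum.example": {"analytics.test", "fonts.test"},
}

ranking = tracker_prevalence(crawl)
for domain, share in ranking:
    print(f"{domain}: {share:.0%}")
```

Even in this toy data set, one domain is present everywhere while most others appear on a single site, which is the shape the study reports at the scale of one million websites.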
One thing is knowing that website trackers collect user data.
Another is to understand how this data can be used in ways that are not immediately visible or intuitive to us but can have far-reaching consequences for our right to privacy and equal treatment.
The term digital phenotyping describes the process by which our online behavior can be used to obtain insight into and map out our health, and thereby also potential health risks and issues. This means that trivial and benign data collected by a website page tracker can be turned into telling clues with accurate prediction abilities.
As an example, research suggests that early Parkinson’s disease can be detected from typing patterns on keyboards, and that the language used in social media posts can predict depressive episodes. This is all data that website trackers collect every second from millions of people all over the world.
Other known online website tracking tools are tracking pixels (or pixel tags), web beacons (or ultrasound beacons), and browser fingerprinting (or digital fingerprinting), amongst others.
The cookie is a simple string of text that is stored in users’ browsers when they visit a website. Its purpose is to enable the website to recognize and remember its users. Cookies also make up the majority of website trackers online.
The cookie was invented back in 1994 by Lou Montulli and John Giannandrea at Netscape, and originally served to provide websites with a ‘memory’, so that they could, for example, hold items in a shopping cart while the user browsed for goods on the site.
While the cookie still serves this purpose, it can also monitor users and give a great deal of insight into user behavior.
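To show how little there is to a cookie, the Python sketch below parses a single `Set-Cookie` header with the standard library's `http.cookies` module. The header value, the cookie name `visitor_id` and the domain are hypothetical; real cookies differ in names and attributes.

```python
from http.cookies import SimpleCookie

# Hypothetical Set-Cookie header value sent by a web server.
header = (
    "visitor_id=abc123; Domain=example.com; Path=/; "
    "Max-Age=31536000; Secure; HttpOnly"
)

jar = SimpleCookie()
jar.load(header)

cookie = jar["visitor_id"]
print(cookie.value)       # the identifier sent back on every later request
print(cookie["domain"])   # the hosts the browser will send it to
print(cookie["max-age"])  # lifetime in seconds (here: one year)
```

The "memory" a cookie provides is just this identifier: because the browser returns it with each request, the server can tie requests made days or months apart to the same browser.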
The cookie is widely used for profiling and targeted marketing, and most websites set a great many cookies of first and third-party provenance alike.
There are also many different cookies: necessary cookies, analytics cookies or statistics cookies, marketing cookies or advertising cookies. The strictly necessary cookies function to make your website operate its most basic functions so that a visitor can visit it. These rarely if ever have any way of tracking users.
However, analytics cookies or statistics cookies are most often third-party cookies that track and log user behavior to give insight to the website owner. Marketing cookies and advertising cookies are also most often third-party cookies that serve to make targeted advertisement possible. These cookies are website tracking tools not only for the companies using them to optimize their sales, but also for companies like Google and the entire ad tech industry.
Advertising cookies, marketing cookies, analytics cookies, statistics cookies – a lot of different names for the same phenomenon: a way to gain insight into a website’s users for different purposes, but with the same dire implications if left unregulated.
There has been quite a bit of negative public attention to cookies, and many users choose to block cookies from their browsers in an attempt to avoid internet website trackers.
Read our full introduction to internet cookies.
Tracking pixels, also called pixel tags or 1x1 pixels, are transparent images consisting of a single pixel, that are present (albeit virtually invisible) on a webpage or in an email.
When a user loads the webpage or opens the email, the tracking pixel is also loaded, enabling the sender of the tracking pixel, typically an ad server, to read and record that the webpage is loaded or the email is opened and similar user activities.
The purpose is much the same as for third-party cookies: to get insight into users for targeted marketing.
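What the server side of a tracking pixel might look like can be sketched in a few lines of Python. The function below does not start a web server; it only shows what a handler could record from one request and what it would send back. The function name, the query parameters `campaign` and `uid`, and the example URL are assumptions made for this illustration.

```python
import base64
from datetime import datetime, timezone
from urllib.parse import parse_qs, urlparse

# A commonly circulated 1x1 transparent GIF, base64-encoded.
PIXEL = base64.b64decode(
    "R0lGODlhAQABAIAAAAAAAP///yH5BAEAAAAALAAAAAABAAEAAAIBRAA7"
)


def handle_pixel_request(path: str, headers: dict[str, str], client_ip: str):
    """Return (log_record, response_headers, response_body) for one pixel hit."""
    query = parse_qs(urlparse(path).query)
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "ip": client_ip,
        "user_agent": headers.get("User-Agent", ""),
        "referer": headers.get("Referer", ""),       # page that embedded the pixel
        "campaign": query.get("campaign", [""])[0],  # hypothetical parameter
        "recipient": query.get("uid", [""])[0],      # hypothetical parameter
    }
    response_headers = {
        "Content-Type": "image/gif",
        "Content-Length": str(len(PIXEL)),
        "Cache-Control": "no-store",  # ask for a fresh request on every view
    }
    return record, response_headers, PIXEL


record, response_headers, body = handle_pixel_request(
    "/p.gif?campaign=spring&uid=42",
    {"User-Agent": "ExampleBrowser/1.0", "Referer": "https://news.example/article"},
    "203.0.113.7",
)
print(record)
```

The image itself carries no information; the tracking happens entirely in the request the browser or email client makes to fetch it.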
Information that can be obtained by websites and third parties via tracking pixels includes:
Both the GDPR and the CCPA regard cookies and tracking pixels as part of their definitions of personal data and personal information, respectively.
This means that for websites, businesses and organizations to use such technologies to track their users online, the GDPR requires prior consent to be obtained, while the CCPA requires that users be informed of what categories of data are being collected through such technologies and be given a clear way to opt out of having their data sold to third parties.
Web beacons are a variety of techniques of tracking users online. Some of them are known as ultrasound beacons (or ultrasonic beacons, sometimes abbreviated uBeacons) and these are high-pitched sounds that are emitted from the device in use, e.g. when you visit a website that has the web beacon installed.
The sounds emitted from these web beacons are inaudible to humans, but your dog can hear them and, more importantly, all the other devices in proximity to the one you are using react to them.
Also called Ultrasonic cross-device tracking (uXDT), the uBeacon serves to bridge the gap between the digital world and the physical one.
One of the primary benefits of the ultrasound beacon is that it enables the sender to gain insight into what devices are connected with each other (your PC, mobile, tablet, etc.), thereby solving a headache for marketers and other trackers alike: that users move between devices.
More and more mobile apps silently track users by means of ultrasound beacons for other sophisticated purposes:
For example, some retail stores have ultrasound beacons installed at their entrance that interact with your mobile phone when you go inside, enabling marketers to track and target consumers in the physical world as well as online.
So if, for instance, you went to a brand store for, say, sneakers, and the store had a uBeacon emitter installed at its entrance, this particular sneaker brand now knows that you may be interested in its shoes, even if you never went to its website or searched for its shoes online.
Even if a user blocks tracking cookies and uses a VPN to mask their IP address, there are still other methods for tracking users.
One of them is browser fingerprinting, which exploits the uniqueness of your specific computer, device or browser.
Whenever a user visits a website, their computer or device provides the site with highly specific information about their system and settings. The use of this information to identify and track users is known as device or browser fingerprinting, sometimes also referred to as digital fingerprinting.
A browser fingerprint is thus a collection of many different pieces of information about a user's device, combined to create a sort of "fingerprint" for that device that can be tracked across the web.
This browser fingerprint can consist of -
Each piece of information might seem benign at first glance, but combined they can form a unique browser fingerprint that stands out as one among millions of other devices.
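The combining step can be illustrated with a short Python sketch: a handful of attributes are serialized in a fixed order and hashed into one identifier. The attribute names and values are hypothetical, and real fingerprinting scripts typically collect many more signals; this shows only the principle.

```python
import hashlib
import json


def fingerprint(attributes: dict[str, str]) -> str:
    """Hash a set of browser/device attributes into one stable identifier."""
    # Sorting the keys makes the result independent of collection order.
    canonical = json.dumps(attributes, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical attribute values of the kind a script can read in a browser.
device_a = {
    "user_agent": "ExampleBrowser/1.0 (ExampleOS 10)",
    "screen": "1920x1080x24",
    "timezone": "Europe/Copenhagen",
    "language": "da-DK",
    "fonts": "Arial,Helvetica,Verdana",
}
# The same device with a single additional font installed.
device_b = {**device_a, "fonts": "Arial,Helvetica,Verdana,Comic Sans MS"}

print(fingerprint(device_a))
print(fingerprint(device_a) == fingerprint(device_b))  # False
```

Nothing is stored on the device in this scheme, which is why clearing cookies does not affect it: the identifier is recomputed from the device's own characteristics on every visit.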
Browser fingerprinting is frighteningly accurate: it can successfully identify users 99 percent of the time.
It also means that even if users take privacy precautions, such as using VPNs and blocking cookies in their browser settings, a browser fingerprint, unique to their devices in use, can re-identify the user when they visit a website.
See also this guide by privacy advocates Pixel Privacy on fingerprinting and what one can do about it.
In addition to regular cookies, tracking pixels, pixel tags, web beacons, ultrasound beacons and browser fingerprinting technologies, there exist other methods for tracking users, such as undeletable zombie cookies or super cookies, dynamic cookies, Silverlight Isolated Storage, IndexedDB, etc.
As the world is coming to realize, in the digital age data is an extremely valuable asset that can be used for everything from owning markets and influencing the masses to winning elections.
The methods for gaining insight into and tracking users are always evolving, and the means are impressively creative.
The European GDPR and ePR, and the Californian CCPA are the first, important legal steps towards a future of balanced regulation of the ad tech industry and its surveillance capitalism.
Imagine that you’re driving down a highway in your car one late afternoon.
On the not-so-distant horizon you see an empty billboard on the side of the road. As you speed towards the billboard, thousands of companies are engaging in an invisible auction in real time, the highest bidder buying the opportunity to showcase their product on the billboard the second you pass it by.
However, these companies are bidding on more than just the commercial space on an empty digital highway.
They know that you are going to pass by the billboard, because they know which road you’re driving on, just as well as they know which car you’re driving in, what music you’re listening to, how fast you’re going, what you had for lunch, how much gas is in the tank, when you bought your car, oh and what your name is, where you came from, the color of your hair, your vices, dreams and fears.
The billboard for sale is more than an empty advertisement space: it is a tailored and targeted attempt at swaying your behavior using everything there is to know about you against you.
Aware of patterns in your behavior that you don’t even see yourself, these companies monetize your person through targeted advertisement and real time bidding auctions, where data brokers sell to commercial companies the digital billboards timed to the micromoments in which you are most swayable, most easily herded and nudged.
This is surveillance capitalism, as coined and elaborated by Harvard Professor emerita Shoshana Zuboff in her seminal work “The Age of Surveillance Capitalism”.
Surveillance capitalism, as Zuboff describes, is private human experience commodified, bought and sold as behavioral data, which creates whole new markets based on the predictive analyses of said behavioral data.
Surveillance capitalism is eroding democracy from the inside-out, she warns, because the market is driven to find its most predictive behavioral data by intervening, shaping and herding humans towards its commercial outcome.
Website tracking is the harvester of uncountable amounts of personal information every day from billions of people around the planet. For a very long time it has been non-consensual on the part of the users, the people whose data has been obtained, assembled and monetized for the profit of third-party companies.
Pushback against non-consensual website tracking is underway.
The General Data Protection Regulation (GDPR) and its clear rules of prior and informed consent is a vital step to regulate the run-amok website tracking of the ad tech industry.
The CCPA (California Consumer Privacy Act) is a decisive pushback in California against Silicon Valley and its behemoth ad tech industry infamous for its privacy-invasive practices.
The UK's Information Commissioner's Office (ICO) has ruled that the entire ad industry is operating illegally, citing a lack of transparency in how data is processed and actioned in the so-called real-time bidding schemes that take place every time a user is served a targeted advertisement.
In California, the CCPA took effect on January 1, 2020 and awaits enforcement from the Attorney General beginning July 2020.
The truth is that the free services of the Internet are most often paid for by commercial enterprises. Or in other words, the digital roads and highways that you drive down in your car are paid for by the billboards on the side of those roads.
A balance needs to be struck in how this ecosystem works, that is, in how the Internet is financed. Regulators in both the EU and California are now attempting to strike that balance.
But this balance is also the driving mission at Cookiebot.
Cookiebot is one of the few fully GDPR/ePR and CCPA compliant solutions on the market.
Cookiebot consists of three main features:
The Cookiebot scan detects and identifies all known types of tracking on the website.
It scans all the pages of your website by directing 7-8 simulated users at your website with requests every 1.5 seconds. This is few enough not to interfere with your website’s performance, but enough to detect all the types of tracking taking place on your website, including dynamic cookies, ultrasound web beacons, pixel tags and fingerprinting.
When a user visits your website, Cookiebot deactivates all loaded scripts but the strictly necessary ones until the user has given their consent to the cookies, thereby complying with the requirement of prior consent.
All the cookies and other tracking technologies are listed, and grouped into four comprehensible categories, that the user can choose to opt in and out of.
All received consents are securely stored as documentation that the consent has been given, also a requirement of the GDPR.
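The principle of prior consent described above can be sketched in Python as a simple gate: before any consent is recorded only strictly necessary scripts may run, and afterwards only the categories the user opted into. This is an illustrative model, not Cookiebot's implementation; the category names, the `ConsentRecord` class and the script names are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Assumed names for the four categories; only "necessary" runs without consent.
CATEGORIES = ("necessary", "preferences", "statistics", "marketing")


@dataclass
class ConsentRecord:
    """What a visitor agreed to and when, kept as documentation."""
    visitor_id: str
    granted: frozenset
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def allowed_scripts(
    scripts: dict[str, str], consent: Optional[ConsentRecord]
) -> list[str]:
    """Return the scripts that may run given the visitor's consent (or lack of it)."""
    granted = {"necessary"}
    if consent is not None:
        granted |= set(consent.granted) & set(CATEGORIES)
    return [name for name, category in scripts.items() if category in granted]


# Hypothetical scripts on a page, each tagged with its category.
scripts = {"cart.js": "necessary", "stats.js": "statistics", "ads.js": "marketing"}

print(allowed_scripts(scripts, None))  # ['cart.js']
consent = ConsentRecord("visitor-1", frozenset({"statistics"}))
print(allowed_scripts(scripts, consent))  # ['cart.js', 'stats.js']
```

Storing the record with a timestamp reflects the documentation requirement mentioned above: the website must be able to show that consent was given, not merely act on it.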
The monthly scan results in a full report of all tracking technologies in use on the site, giving insight and control to the website owner as to what tracking is in use on their site.
As required by the GDPR, your users may go back at any time and change their settings or withdraw their consent. In the cookie declaration, features are automatically provided for the user to change or withdraw their consent whenever they want.
As required by the CCPA, our cookie declaration features the Do Not Sell My Personal Information link when your website detects visitors from inside California. Cookiebot’s automatic geotargeting makes compliance with the GDPR/ePR and CCPA simple and straightforward.
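As a rough illustration of geotargeting, the Python sketch below picks a consent model from a visitor's country and region codes. It is a deliberate simplification based only on the rules described in this article (prior consent under the GDPR, a Do Not Sell link under the CCPA); the function name and return format are hypothetical, and how the visitor's location is determined is left out.

```python
# ISO 3166-1 alpha-2 codes of the 27 EU member states.
EU_MEMBER_STATES = frozenset({
    "AT", "BE", "BG", "HR", "CY", "CZ", "DK", "EE", "FI", "FR", "DE", "GR",
    "HU", "IE", "IT", "LV", "LT", "LU", "MT", "NL", "PL", "PT", "RO", "SK",
    "SI", "ES", "SE",
})


def consent_model(country: str, region: str = "") -> dict:
    """Choose which consent experience to show, given country/region codes."""
    country, region = country.upper(), region.upper()
    if country in EU_MEMBER_STATES:
        # GDPR: nothing but necessary cookies before the user opts in.
        return {"law": "GDPR", "prior_consent": True, "do_not_sell_link": False}
    if country == "US" and region == "CA":
        # CCPA: inform the user and offer an opt-out of the sale of their data.
        return {"law": "CCPA", "prior_consent": False, "do_not_sell_link": True}
    # Simplification: other jurisdictions are not modeled in this sketch.
    return {"law": None, "prior_consent": False, "do_not_sell_link": False}


print(consent_model("dk"))
print(consent_model("US", "CA"))
```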