How would you feel if I said I’d seen your web browsing history and that of your neighbours? Everything for the last year: What you view at work, what restaurant you ate at last night, what websites you visit at home, and what Christmas gifts you bought or decided against?
What if I told you I was passing this information on to strangers? And that they were using it to make judgements about you, like what advertisements you see?
What if I told you that you’d given me permission to do this, simply by clicking a button on one of those websites? And that it happens every time you use the Internet? Would it make any difference that I didn’t know your name?
This isn’t hypothetical; it’s exactly the way that behavioural ad targeting works. The academic Shoshana Zuboff has dubbed this ‘surveillance capitalism’, and it’s the business model that drives the online world.
Advertising funds some of the great content we see online. In return, we users receive free cat videos, articles, and apps. But despite the annoying GDPR pop-ups we now see when we surf the web, the value exchange is never made totally clear. Here, we explain the complicated world of online advertising, and how your data is harvested and sold.
Let’s get our definitions straight
The world of online advertising is complex and confusing and very few people understand it all - which is what makes it hard to regulate and to explain.
Let’s start off by breaking online ads into two categories: ads that you see on platforms (such as Facebook, YouTube or sponsored items on Amazon), and banner ads that you see everywhere else, from news sites and mobile apps to connected TVs. For the sake of this article, we are talking only about banner ads.
Behavioural versus contextual targeting
A second important distinction is that these banner ads are matched to you and your online behaviour, not to the content of the website. They’re targeted to you based on your previous browsing behaviour in a process called ‘personalisation’. This means that when you’re reading an article about Christmas, a behaviourally targeted ad will be placed based on knowledge about your past activity (say, a past search for dogs) and so might show, say, holiday toys for dogs.
Adverts that are placed directly within the publication are called ‘contextual’ and do not require any knowledge about you, or your love of dogs. They’re targeted to the content of the article or publication (e.g. advertising deals on Christmas decorations).
How does it work?
The magic happens while you’re waiting for a web page to load. In less time than the blink of an eye (literally: this process takes ⅕ of a second, as opposed to the ¼ needed to blink), data about you travels from the website you’re visiting to an ad auction, where advertisers bid for your attention. You’ll then see the ad from whoever has bid the highest price, but the winner pays only the second-highest price that was bid. This ‘second-price’ feature keeps things competitive and discourages companies from gaming the auction with outrageous bids. This lightning-fast technical process is called real-time bidding.
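To make the auction rule concrete, here is a minimal sketch of a second-price auction, in which the highest bidder wins but pays the second-highest bid. The function name, the bidder names, the prices and the `floor` parameter are all illustrative assumptions, not taken from any real ad exchange.

```python
from typing import Dict, Tuple


def run_second_price_auction(bids: Dict[str, float], floor: float = 0.0) -> Tuple[str, float]:
    """Return (winner, price_paid) for a sealed-bid second-price auction.

    `bids` maps a (hypothetical) bidder name to the amount it offered.
    `floor` is an assumed minimum price; bids below it are ignored.
    """
    eligible = {bidder: amount for bidder, amount in bids.items() if amount >= floor}
    if not eligible:
        raise ValueError("no bids at or above the floor price")

    # Rank bidders from highest to lowest bid.
    ranked = sorted(eligible.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]

    # The winner pays the runner-up's bid, or the floor if nobody else bid.
    price_paid = ranked[1][1] if len(ranked) > 1 else floor
    return winner, price_paid


# Illustrative bids for one ad slot (made-up advertisers and prices).
winner, price = run_second_price_auction(
    {"dog_toys": 2.50, "pizza_delivery": 1.80, "new_bank": 0.90}
)
print(winner, price)  # dog_toys 1.8
```

In this sketch the dog-toy advertiser wins the slot with a bid of 2.50 but is charged only 1.80, the runner-up's bid.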
This process has some advantages. It means you’re more likely to see relevant ads. But it also relies on shadowy data collection that risks your privacy and increases the likelihood of your data being accessed by a large number of other organisations. Would you want your health insurance company knowing you often order late night pizza, or a new bank to have access to your dating profile?
In order to understand the extent of these threats, we need to take a closer look at what’s happening behind the scenes.
How am I being tracked?
When you load a website, small text files known as ‘cookies’ are stored on your computer. These cookies are used to collect and save information about your device, browser settings, IP address, location, or what you’ve read or watched that day. Some of them are necessary for technical purposes: information about your device might enable the site to adapt its layout to fit a mobile screen, or suggest the right language version. Other cookies are needed for user authentication, or to help you autofill forms. Finally, some cookies are used to track and analyse your activity: what articles you read and when, how much time you spend on the website, or which ads you click on.
Cookies can be read only by the websites that installed them. Google.com cannot read cookies installed by bbc.com: it either has to ask the BBC to share information related to that cookie, or create a cookie of its own which monitors you when you visit the BBC’s website. The second scenario is what usually happens: alongside the first-party cookies installed by the website itself, advertising companies set their own third-party cookies. Some websites have as many as 400 external advertising ‘partners’ that they share your data with, most of whom you’ve probably never heard of.
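The difference between first-party and third-party cookies can be sketched with a toy model of a browser. This is a simplified illustration under stated assumptions, not how any real browser is implemented: the `ToyBrowser` class, the `ads.example` tracker and the `petshop.example` site are all hypothetical.

```python
import uuid
from collections import defaultdict


class ToyBrowser:
    """A toy cookie jar: one cookie ID per domain, readable only by that domain."""

    def __init__(self):
        self.cookie_jar = {}                   # domain -> cookie ID
        self.tracker_logs = defaultdict(list)  # what each third party gets to record

    def _cookie_for(self, domain):
        # A domain receives its own cookie back, or sets a new one on first contact.
        if domain not in self.cookie_jar:
            self.cookie_jar[domain] = uuid.uuid4().hex
        return self.cookie_jar[domain]

    def visit(self, site, page, embedded_trackers=()):
        self._cookie_for(site)  # first-party cookie, set by the site itself
        for tracker in embedded_trackers:
            # Third-party cookie: the same ID is sent to the tracker from
            # every site that embeds it, so visits can be linked together.
            cookie_id = self._cookie_for(tracker)
            self.tracker_logs[tracker].append((cookie_id, site, page))


browser = ToyBrowser()
browser.visit("bbc.com", "/news/christmas", embedded_trackers=["ads.example"])
browser.visit("petshop.example", "/dog-toys", embedded_trackers=["ads.example"])

for cookie_id, site, page in browser.tracker_logs["ads.example"]:
    print(cookie_id, site, page)
```

Both log entries carry the same cookie ID, which is how a single advertising company can follow one browser across unrelated websites without ever reading the sites' own cookies.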
Each cookie has an ID which serves as your digital name. Your real-life name is not relevant in the world of online advertising. What matters are your browsing patterns: what you read and watch online, when and where. These observations make it possible for algorithms to ‘guess’ things about you, like your sex or your interests. These assumptions can grow more and more sophisticated and accurate, as data linked to cookies accumulates over time and records your activity on many websites.
How is my data bought and sold?
Once your data is collected, the website sends your cookie ID, together with information about you, such as the link to the article you’re currently reading, to a ‘supply-side platform (SSP)’, a piece of software that manages the website’s ad inventory.
The SSP then sends the identifier and all information linked to it to the ad exchange, where a virtual auction takes place. Sometimes this information is sent to several ad exchanges at the same time. The ad exchange then broadcasts this information to tens if not hundreds of companies representing advertisers, known as ‘demand-side platforms (DSPs)’. Their role is to evaluate whether this particular user is relevant to any campaigns they are currently running on behalf of advertisers. To do so, they match the user ID they received from the ad exchange against their database, looking for additional information about the user.
Thanks to this matching, they may know which interest groups a user has joined on social media, which websites they have visited recently, approximately how old they are, and where they are located. They then place a bid (just like in an old-fashioned auction), hoping to win the chance to show the right ad to the right person at the right time.
The ad exchange selects the winning bid and transmits the good news, together with the content of the ad and the tracking code of the DSP, back to the SSP. Only then do you, the user, see the ad on the website.
All of this happens in less than 200 ms, with algorithms making every decision, from creating a user’s profile to determining the bidding price and the winner of the auction.
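The chain described above (the website's SSP passes a cookie ID to the exchange, DSPs look that ID up in their own databases, and the exchange returns the winning ad) can be sketched in a few lines. This is a rough illustration only: the class names, fields, cookie IDs, campaigns and bid amounts are invented for the example, and real systems exchange far richer data. For simplicity the sketch only picks the highest bidder and leaves out how the price is set.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple


@dataclass
class BidRequest:
    cookie_id: str  # the user's 'digital name'
    page_url: str   # the page currently being loaded


@dataclass
class DSP:
    name: str
    profiles: Dict[str, Set[str]]  # cookie ID -> interests inferred so far
    target_interest: str           # who the current campaign is aimed at
    max_bid: float
    ad: str

    def bid(self, request: BidRequest) -> Optional[float]:
        # Match the incoming ID against the DSP's own database.
        interests = self.profiles.get(request.cookie_id, set())
        return self.max_bid if self.target_interest in interests else None


def run_exchange(request: BidRequest, dsps: List[DSP]) -> Optional[Tuple[str, str]]:
    """Broadcast the request to every DSP and return (winner name, ad), if any."""
    offers = []
    for dsp in dsps:
        amount = dsp.bid(request)
        if amount is not None:
            offers.append((amount, dsp))
    if not offers:
        return None
    _, winner = max(offers, key=lambda offer: offer[0])
    return winner.name, winner.ad


request = BidRequest(cookie_id="abc123", page_url="https://news.example/christmas")
dsps = [
    DSP("pet_dsp", {"abc123": {"dogs"}}, "dogs", 2.0, "Holiday toys for dogs"),
    DSP("travel_dsp", {"zzz999": {"skiing"}}, "skiing", 5.0, "Ski holidays"),
]
result = run_exchange(request, dsps)
print(result)  # ('pet_dsp', 'Holiday toys for dogs')
```

Note that the travel DSP was willing to pay more, but it had no profile for this cookie ID and so never bid: what a bidder already knows about you decides which ads compete for your attention.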
This kind of data exchange is happening all the time as you surf the web. Think about the tabs you have open now, what you’ve looked at during the past week, whether you share your computer with family members. All of this information is being bought, sold and used to profile you for advertising.
For some people, it’s overwhelming and unacceptable, but others are happy to hand over data for the sake of relevant advertising. The problem with the system is that the speed at which these transactions are completed makes it extremely complicated and opaque, and it gives people no real control over how their data is used or who it is shared with. Advertising plays a key role in funding quality journalism, great content, and a free and open web, but many things have to change before this can be considered an ethical process and exchange.