You’ve probably heard the phrase “data is the new oil.” It was coined by Clive Humby – the British mathematician who established Tesco’s Clubcard loyalty program – in 2006.
It’s become such a ubiquitous phrase that arguing its counterpoint – that data is not the new oil – is now a popular trend.
Humby’s point was that both data and oil are relatively worthless in their unrefined state but gain great value once processed. Recent critics of this stance point out that data tends to have value only to specific organizations, depending on what the data represents.
As Antonio García Martínez writes in Wired, “By any reasonable valuation, Amazon’s purchase data is worth an immense fortune … to Amazon,” but not necessarily anyone else. He argues there isn’t a global market established for the trading of data the way there is for the trading of oil. “Oil is literally a liquid, fungible, and transportable commodity,” he writes.
Sure, the metaphor isn’t perfect. Some data has limited value or application regardless of how well it is processed. And it’s clearly not a commodity. But there are several examples of data trading that are similar to the established global oil market. And I’m sure Walmart executives would pay a mint for a year of Amazon sales data.
The largest data-trading market – the 21st-century digital advertising market – uses data to match businesses with customers in a way that ideally improves the experience for both parties: consumers see ads that are relevant to them, and businesses find customers more likely to convert. This is similar to crude oil refining, which transforms a thick, virtually useless liquid into countless marketable products.
When it comes to ad targeting in the programmatic marketplace, the commodity is the audience, and the refinement process is the act of identifying attributes of the audience for targeting, either by segment or at the individual level. There are two primary means of ascribing identity to audiences, differentiated by their data-collection methods.
The first is site-authenticated data collection. Site-authenticated data is sourced from individual authentication events. When a user completes an online form, they generally agree to a privacy policy that includes a data use agreement. The user's data is then married to other data sources that add descriptive meaning, and the combined record becomes the basis of ad targeting. In healthcare professional (HCP) marketing, the identifier used to link these sources is the National Provider Identifier (NPI).
From this point forward, when the user views digital content during an authenticated session or receives an email to their provided address, the user can be deterministically targeted.
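As a rough illustration of what "deterministic" means here, the sketch below matches an authenticated email address against a reference table keyed by hashed email. Everything in it is an assumption made for illustration: the function names, the use of SHA-256 on a normalized email as the join key, the consent flag, and the placeholder NPI value. Real platforms will differ in how they key and store this data.

```python
import hashlib
from typing import Optional


def hash_email(email: str) -> str:
    """Normalize an email (trim, lowercase) and return its SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical reference data: hashed email -> descriptive attributes.
# The NPI below is a made-up placeholder, not a real provider.
NPI_REFERENCE = {
    hash_email("dr.lee@example.com"): {
        "npi": "0000000001",
        "specialty": "Cardiology",
    },
}


def match_authenticated_user(email: str, consented: bool) -> Optional[dict]:
    """Return the linked record for a consenting, authenticated user, else None."""
    if not consented:
        # No data use agreement on file, so the record is not linked.
        return None
    return NPI_REFERENCE.get(hash_email(email))


record = match_authenticated_user("  Dr.Lee@Example.com ", consented=True)
```

The match is exact: the same normalized email always resolves to the same record, and an unknown or non-consenting user resolves to nothing. There is no scoring or guessing involved.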
The second method is people-based data collection. This type of data doesn’t come from a registration event, but rather from a myriad of sources – data licensing, research, and manual verification – that can be on-boarded to a data management platform (DMP).
The job of the DMP is to aggregate data from these various sources into likely groupings using data science, with the objective of assigning an anonymized ID to each individual user. Once assigned, these anonymized IDs are targetable within the programmatic marketplace. Because this methodology is probabilistic in nature, it provides reasonable accuracy, but not to the degree provided by deterministic data.
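To make the contrast with deterministic matching concrete, here is a minimal sketch of probabilistic grouping. It is not how any particular DMP works. The signal names (`ip`, `user_agent`, `zip`), the weights, the threshold, and the `anon-N` ID format are all hypothetical, and a simple weighted-overlap score with union-find stands in for whatever data science a real platform applies.

```python
from itertools import combinations

# Hypothetical weights for how strongly a shared signal suggests the same person.
WEIGHTS = {"ip": 5, "user_agent": 2, "zip": 3}
THRESHOLD = 7  # hypothetical minimum score to treat two observations as one user


def similarity(a: dict, b: dict) -> int:
    """Sum the weights of signals that are present and equal in both observations."""
    return sum(
        weight
        for key, weight in WEIGHTS.items()
        if a.get(key) is not None and a.get(key) == b.get(key)
    )


def assign_anonymous_ids(observations: list) -> list:
    """Group observations that score above the threshold and label each group."""
    parent = list(range(len(observations)))

    def find(i: int) -> int:
        # Union-find lookup with path compression.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(observations)), 2):
        if similarity(observations[i], observations[j]) >= THRESHOLD:
            parent[find(j)] = find(i)

    labels = {}
    ids = []
    for i in range(len(observations)):
        root = find(i)
        if root not in labels:
            labels[root] = f"anon-{len(labels)}"
        ids.append(labels[root])
    return ids


observations = [
    {"ip": "a", "user_agent": "ua1", "zip": "10001"},
    {"ip": "a", "user_agent": "ua2", "zip": "10001"},  # shares ip and zip with the first
    {"ip": "b", "user_agent": "ua1", "zip": "10001"},  # shares only weaker signals
]
ids = assign_anonymous_ids(observations)
```

In this example the first two observations are grouped under one anonymized ID and the third gets its own. Changing the weights or the threshold changes the groupings, which is exactly why this approach is probabilistic: the output is a best guess, not a verified identity.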
In the HCP space, both methods allow for NPI-level targeting. However, in many cases, people-based data does not accommodate individual-level reporting (also known as physician-level reporting). This limitation stems from the absence of a privacy policy stipulating data use. Nonetheless, significant scale can be achieved with people-based data, as individual devices can be targeted whenever a user makes a request on an ad exchange.
The point of the oil metaphor is that data has enormous potential that can be unlocked through processing, and that the source of the underlying commodity being refined – in this case, data – matters.
If you put the wrong oil in your car, you won’t get very far. The same holds true if you add the wrong data to your marketing engine.