Confused by the marketing data landscape? Here’s your primer on consumer data types and their respective pros and cons.
The availability and breadth of consumer data have never been higher (thanks, internet!), and for marketers, all that data is both a gold mine and a quagmire. Why? Because not all marketing data is the same.
Where the data comes from can make a big difference. Was it purchased from a vendor? Accessed through a social network? Aggregated from e-commerce transactions? Offered by the consumers themselves? The source of the consumer data can determine its quality, its usefulness, and in some cases even what companies are legally allowed to do with it.
Third-party data is information aggregated and sold by an entity without a direct relationship with a consumer.
Data vendors compile massive databases of information about consumers from many different sources—public records, credit reporting agencies, etc.—and combine it with data inferred from online behavior. They then sell targeted lists to marketers based on demographic or behavioral criteria. (Fun fact: This practice isn’t a product of the digital age—it actually predates the internet by decades.)
In a word, scale.
If you’re willing to pay for it, third-party data grows your database fast. You can build a large list for targeted campaigns almost immediately, and many third-party data vendors allow you to apply fairly granular filters, limited only by the size of your pocketbook.
The flipside of that scale is that you’ll need to swallow a lot of junk data alongside the good stuff. Third-party data is notoriously unreliable, a fact vividly illustrated by a study profiled in Digiday that found that 84 percent of users in one sample from a data vendor were identified as both male and female. While that’s a particularly egregious example, using third-party data means accepting a high risk of inaccuracy.
There are any number of reasons for this bad data. Consumer data goes stale over time—people get older, move, have families, change their interests and preferences—but that data continues to be bought and sold for years after it’s out of date. The method of collection also comes with flaws: Interests are inferred from online behavior, but as anyone who’s ever fallen down an online rabbit hole knows, people browse things they’d never actually buy all the time. Plus, the cookies that track online behavior only paint a partial picture because they don’t work cross-device, so they miss out on mobile and in-app browsing (never mind the fact that the most popular browsers are cracking down on third-party cookies anyway).
And finally, there’s the data privacy piece. A steady flow of new data privacy legislation puts limitations on third-party data. Notably, the California Consumer Privacy Act, or CCPA, specifically allows consumers to opt out of the sale of their data to third parties, and a federal data privacy law could well follow suit. Third-party data vendors are poised to feel the squeeze.
Second-party data is someone else’s first-party data that you have access to—usually a platform (Google, Twitter, Facebook, etc.), a publisher, or a non-competitive brand you are partnering with.
Second-party data is a bit of a gray area. On the one hand, it's like first-party data in that it is generated by a direct interaction between a brand and a consumer, for instance when someone identifies their gender, age, or location in their social media profile. On the other hand, it's like third-party data in that you are negotiating with another entity to access that data. (PSA: It's worth noting that second-party data-sharing agreements fall under the umbrella of selling data to a third party in certain data privacy legislation, so don't let the name confuse you.)
Second-party data boasts some, if not all, of the scale of third-party data, but typically greater accuracy and transparency. When you strike a deal with another brand or publisher, you have more visibility into how the data was collected and how it is being segmented.
The biggest downside to second-party data is that it’s not your data—your access to that audience for targeting lasts only as long as your agreement with the second party lasts. While you are technically amplifying your reach to a new audience, it’s one that you’re only borrowing.
Partnering with another brand to access their data can also bring new hassles. Negotiating data-sharing agreements is time-consuming, and never more so than in the age of GDPR and CCPA, where consent for data-sharing is either explicitly required or can be revoked retroactively, necessitating strong data governance practices on both sides. The major targeted ad platforms (primarily the Google-Facebook duopoly) make access to second-party data relatively turnkey, but the tradeoff is transparency—their walled gardens function more like black boxes, with advertisers in the dark about who their ads are actually reaching.
First-party data comes from a direct relationship between your brand and your consumers. It can be information they’ve expressly volunteered about themselves or information about actions they’ve taken on your owned channels.
The biggest pro of first-party data? It's yours! Within what is permitted under data privacy laws, you have control over how and how often you use the data. You have total transparency into how it was collected, its accuracy, and its recency. For this reason, marketers overwhelmingly report that they are more confident in the accuracy of first-party data than in that of second- or third-party data.
First-party data also confers a competitive advantage: Unlike second- and third-party data, which any of your competitors could theoretically access, first-party data is unique to your brand. It is most relevant to your marketing needs because it is information that you deliberately chose to collect or that is automatically generated in the course of a customer interacting with your brand.
First-party data flips third-party data on its head; where third-party data scales well but lacks accuracy and relevancy, first-party data tends to be accurate and relevant but harder to scale. This has to do primarily with how it is generated—the universe of potential consumer interactions to harvest data from is simply smaller with a single brand, and creating a significant pool of first-party data frequently requires an investment in technology that not every brand is willing or able to make.
Within the field of first-party data, there is a meaningful but often overlooked distinction between implicit and explicit data.
Most first-party data is implicit, or inferred, data, usually in the form of transactional data (what a consumer bought) or behavioral data (what a consumer did). From this information, marketers can make generalizations about a consumer, for instance assuming that a consumer who purchases a diaper pail is a new parent who might also be in the market for related purchases, such as diapers and wipes.
Inferred data is broadly useful but imprecise—alongside the new parents who are being accurately targeted for baby products are likely people who bought a diaper pail as a gift for a friend or relative, but who aren’t themselves in the market for related products (and who are probably really annoyed by those ads). Essentially, implicit data tells you what a customer does, but forces you to guess about the why behind it.
Explicit data, also known as declared data or zero-party data, is information that a consumer deliberately volunteers to a brand. Declared data validates implicit data and provides much-needed context, uncovering things like the preferences, motivations, and desires that inform a consumer’s behavior and allowing for more nuanced audience segmentation and personalization—in short, the why behind the buy.
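To make the distinction concrete, here is a minimal sketch of how declared data might confirm or override an inference drawn from a purchase. Everything in it is hypothetical and for illustration only: the Customer record, the "baby_products_for" question (the kind of thing a product finder might ask), and the segment names are assumptions, not part of any particular marketing platform.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Customer:
    name: str
    # Implicit (transactional) data: what the customer bought.
    purchases: list[str] = field(default_factory=list)
    # Explicit (declared) data: answers the customer volunteered.
    # The "baby_products_for" key is a hypothetical survey question.
    declared: dict[str, str] = field(default_factory=dict)


def infer_new_parent(customer: Customer) -> bool:
    """Implicit signal only: guess from the purchase history."""
    return "diaper pail" in customer.purchases


def segment(customer: Customer) -> str:
    """Use the declared answer when there is one; otherwise fall back to the guess."""
    answer = customer.declared.get("baby_products_for")
    if answer == "myself":
        return "new_parent"
    if answer == "gift":
        return "gift_buyer"
    return "likely_new_parent" if infer_new_parent(customer) else "unknown"


customers = [
    Customer("Ana", ["diaper pail"], {"baby_products_for": "myself"}),
    Customer("Ben", ["diaper pail"], {"baby_products_for": "gift"}),
    Customer("Cy", ["diaper pail"]),  # no declared data, so we can only infer
]
segments = {c.name: segment(c) for c in customers}
print(segments)
```

All three customers bought the same diaper pail, but only the declared answers separate the actual new parent from the gift buyer. The third customer stays in a "likely" bucket because the purchase alone can't explain the why.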
Traditional marketing activities that produce declared data include forms and surveys (prone to low completion rates) and focus groups (hard to scale), making declared data less readily available than other types of first-party data. That’s changing with the rise of digital content—with the right technology in place, brands can ask for declared data within the context of an interactive digital experience, such as a product finder, lookbook, or other interactive content.
In a nutshell, all consumer data exists on a spectrum, with relevance and accuracy on one side and ease of scale on the other. In many ways it’s a self-fulfilling tradeoff—third-party data requires scale in order to get a large enough cohort within the target parameters, whereas first-party data, and specifically declared data, derives its value from its specificity, meaning a smaller amount of data can go further.
In practice, most brands will use a combination of third-, second-, and first-party data to meet their marketing goals. By recognizing that not all marketing data is the same and understanding the pros and cons of each type, marketers can make the right decision about what types of data to prioritize and why.