The Secure Data Clean Room: Unfettered Match Rates
As we covered in a previous blog, Data Clean Rooms are secure environments where multiple data sources are matched and analyzed without sharing or compromising the data itself. A key element of that functionality is the ability to match customers across data sets. Traditionally, match rates have been the key metric of identity providers. But as we move to a first-party world, identity providers will be just one piece of the match rate puzzle, with brands, media owners, and other data providers also keen to understand their common audiences.
Although frequently used as a north star metric to determine the potential reach of a marketing campaign, a match rate by itself lacks context and can be misleading. Savvy marketers are now seeking greater clarity around how the match rate is calculated, especially when evaluating multiple identity providers to use in their Data Clean Room.
The match rate challenge
The core challenge with match rates and match tests as they exist today is that brands and media owners have little visibility into how they are calculated. Traditionally, to produce a match rate, both parties must move their first-party data to a third-party location. Once the data is centralized in this way, the identity vendor can access, transform, and configure the match against its identity spine. The match test itself can take many days before a percentage is returned to each party. This percentage lacks key context:
- Precision level: What audience level is represented, e.g. individual or household, and what key was used for the match, e.g. email, phone, address, cookie, or a combination?
- Lookback window: How old are the identifiers being used, and when were they last refreshed or actively seen?
- Type of match: Was this an offline-to-offline match, an offline-to-online match, a match just to the match partner’s identity graph, or a reflection of the match scaled against an activation partner or media owner?
- Match Methodology: What type of methodology was used to generate the match and calculate the linkages across fractional identifiers, whether deterministic, probabilistic or a hybrid of both?
- Audience expansion applied: Where audience expansion has been used to increase audience scale, has this already been applied to the match rate returned?
- Addressability: What type of data was used to expand the audience, and how much of that data is based on third-party cookies or other perishable identifiers? Was the match against the media owner or activation partner also based on third-party cookies rather than more stable keys?
In addition to the lack of transparency, when data is shared and centralized, each data owner loses control of their data and risks the privacy and security of their customers. This risk of exposure is too high for customer-centric companies in 2022.
Match rates and identity resolution in a Data Clean Room
As marketers embrace Data Clean Rooms to solve their collaboration needs, it is important to determine the identity resolution and match rate capabilities of the solution. While many Data Clean Rooms are moving to a decentralized infrastructure, this often doesn’t include their data matching capabilities.
The ‘dirty’ element of these solutions is that they often still require data to be moved and matched against a centralized identity spine. As we outlined above, this sharing of data risks exposure and leakage of customer data, as well as privacy infractions. This is a particular concern for highly regulated industries such as healthcare, financial services, and telecommunications.
Additionally, many of these solutions still operate as black boxes. Match tests can take weeks to be completed and when the results are finally returned, the organizations have very little visibility over how the match rate has been achieved. There are some key metrics all organizations should seek when conducting match tests and measuring the success of a match rate:
- Match rate: What is the percentage of records from a company that can be matched to another set of records?
- Reach: How many records or addressable users does that match rate represent?
- Precision level: What level of precision is being achieved, for example, an individual, a household, or a neighborhood?
- Accuracy: What accuracy level has been achieved? This is ideally expressed as a percentage and determines how likely an identity vendor is to match the same individual across datasets. Accuracy can be impacted by using multiple mapping tables or translating the match across multiple identifiers.
- Time to match: Companies need to match, plan, activate and measure at the speed of business, so how quickly can a match be created?
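To make the first two metrics concrete, here is a minimal sketch of how a match rate and its reach relate to each other. It assumes both sides are represented as plain lists of a single shared key (email here), and the function name `match_stats` is hypothetical. It illustrates the arithmetic only, not how any particular vendor or clean room computes its figures.

```python
def match_stats(source_ids, partner_ids):
    """Return (match_rate, reach) for source_ids matched against partner_ids.

    match_rate: share of distinct source records found in the partner data.
    reach: number of distinct source records that matched.
    """
    source = set(source_ids)    # de-duplicate so repeated records count once
    partner = set(partner_ids)
    if not source:
        return 0.0, 0
    matched = source & partner
    return len(matched) / len(source), len(matched)


# Illustrative data only (assumed, not from a real data set).
advertiser = ["a@x.com", "b@x.com", "c@x.com", "d@x.com", "a@x.com"]
media_owner = ["b@x.com", "c@x.com", "e@x.com"]

rate, reach = match_stats(advertiser, media_owner)
print(f"match rate: {rate:.0%}, reach: {reach}")  # match rate: 50%, reach: 2
```

The same 50% could describe two matched records or two million, which is why a match rate reported without reach tells only half the story.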
Data Clean Rooms must embrace transparency, flexibility, and end-to-end decentralization or, as we call it at InfoSum, non-movement of data. Only an approach that enables data matching, planning, activation, and measurement with zero data sharing can deliver maximum data performance with maximum data protection.
The Secure Data Clean Room
InfoSum believes companies must embrace the full power of data collaboration to deliver rich and high-performing marketing, while fully protecting the security of their data and the privacy of their customers. Part of this empowering approach is to enable companies to have complete control and transparency over how they match their data, and how the match rate is achieved.
The Secure Data Clean Room provides organizations with the freedom to match across multiple partners without having to share sensitive data with anyone.
It all starts with our Bunker technology. Each decentralized Bunker is a standalone and private cloud instance managed and controlled by a single data owner that only they can ever access. Organizations self-onboard their data into their Bunker. The data goes through a simple normalization process, after which permissions can be granted to the appropriate parties for the match test to be conducted. This entire process can be completed in minutes, rather than days or weeks as with other solutions.
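The normalization step is not detailed here, so the following is only a sketch of what normalizing a match key commonly involves, under our own assumptions: trimming whitespace, lowercasing an email address, and hashing the result with SHA-256. The function names `normalize_email` and `hashed_key` are hypothetical and do not describe InfoSum's actual process.

```python
import hashlib


def normalize_email(raw):
    """Trim surrounding whitespace and lowercase the address (assumed rules)."""
    return raw.strip().lower()


def hashed_key(raw):
    """Return the SHA-256 hex digest of the normalized email."""
    return hashlib.sha256(normalize_email(raw).encode("utf-8")).hexdigest()


# Two spellings of the same address produce the same match key.
same = hashed_key("  Jane.Doe@Example.com ") == hashed_key("jane.doe@example.com")
print(same)  # True
```

Without a shared normalization rule, two parties holding the same customer can fail to match purely because of formatting differences, which depresses the match rate for reasons unrelated to audience overlap.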
Our intuitive drag and drop tools can then be used to select the Bunkers to be used in the match test - this can include direct matching between an advertiser and media company, or matching against an identity partner. The reach and match rates are returned instantly, alongside the keys (email, phone, address, device ID, etc.) or a combination of keys used to create the match. The organization then has the freedom to select the specific key or combination of keys they wish to use and even test match rates across multiple partners in real-time to optimize their reach based on their marketing goals.
Maximize your match rate
With The Secure Data Clean Room powered by InfoSum, organizations maximize the performance of their data by maximizing the match rate they can achieve. InfoSum focuses on four key areas:
Work directly with activation partners
The Secure Data Clean Room does not rely on mapping tables to connect identities. This removes any potential user error normally associated with tables that are stale or out of date. It also eliminates any risk of data leakage, exposure, or misuse that can occur when one or multiple parties are responsible for the security of such highly sensitive customer data. Within The Secure Data Clean Room, true and direct matches between datasets can be generated with none of the risks.
In one common use case, where an advertiser matches their first-party data against a media owner’s audience, this is done on a direct matching level. This means organizations have a direct and accurate understanding of the reach that will be achieved, rather than just the reach an identity vendor forecasts they can provide. The match rate provided by these identity intermediaries often doesn’t accurately reflect the addressable audience that can be reached, as it doesn’t account for duplication or expired cookies, for example.
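The gap between a reported match and the audience that can actually be reached can be sketched as follows. The example assumes each matched identifier carries a person ID and a last-seen date, and that identifiers older than a chosen window are treated as expired. The `addressable_reach` function and the 90-day window are illustrative assumptions.

```python
from datetime import date, timedelta


def addressable_reach(matched_ids, as_of, max_age_days):
    """Count distinct people with at least one identifier seen within the window.

    matched_ids: iterable of (person_id, last_seen_date) pairs.
    """
    cutoff = as_of - timedelta(days=max_age_days)
    return len({person for person, last_seen in matched_ids if last_seen >= cutoff})


# Illustrative data only (assumed).
matched = [
    ("p1", date(2022, 5, 1)),
    ("p1", date(2022, 5, 20)),   # second identifier for the same person
    ("p2", date(2021, 11, 3)),   # stale identifier, outside the window
    ("p3", date(2022, 4, 15)),
]

raw_matches = len(matched)
reach = addressable_reach(matched, as_of=date(2022, 6, 1), max_age_days=90)
print(raw_matches, reach)  # 4 2
```

Four matched identifiers shrink to two addressable people once the duplicate and the expired identifier are removed.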
Optimize match rate using best performing key
The Secure Data Clean Room puts the power of optimization in the hands of the organizations. As we touched on previously, organizations have transparency over the match rate that each key can deliver. They then have the flexibility to select the key or combination of keys that will deliver them the maximum match rate based on their marketing goals.
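As a rough illustration of comparing keys, the sketch below computes a match rate per key and for a combination of keys over two small in-memory record lists. The record layout and the function names `match_rates_by_key` and `match_rate_any_key` are assumptions made for this example, which says nothing about how the platform performs the comparison internally.

```python
def match_rates_by_key(source_records, partner_records, keys):
    """Return {key: match_rate}: the share of source records whose value
    for that key also appears in the partner data."""
    rates = {}
    for key in keys:
        partner_values = {r[key] for r in partner_records if r.get(key)}
        matched = sum(1 for r in source_records if r.get(key) in partner_values)
        rates[key] = matched / len(source_records) if source_records else 0.0
    return rates


def match_rate_any_key(source_records, partner_records, keys):
    """Share of source records that match the partner data on at least one key."""
    partner_values = {k: {r[k] for r in partner_records if r.get(k)} for k in keys}
    matched = sum(
        1 for r in source_records
        if any(r.get(k) in partner_values[k] for k in keys)
    )
    return matched / len(source_records) if source_records else 0.0


# Illustrative data only (assumed).
source = [
    {"email": "a@x.com", "phone": "111"},
    {"email": "b@x.com", "phone": None},
    {"email": None, "phone": "333"},
    {"email": "d@x.com", "phone": "444"},
]
partner = [
    {"email": "a@x.com", "phone": "999"},
    {"email": None, "phone": "333"},
    {"email": "d@x.com", "phone": "444"},
    {"email": "b@x.com", "phone": None},
]

rates = match_rates_by_key(source, partner, ["email", "phone"])
best_key = max(rates, key=rates.get)
combined = match_rate_any_key(source, partner, ["email", "phone"])
print(rates, best_key, combined)
```

In this toy data, email alone matches 75% of records, phone alone 50%, and either key together 100%. Whether the broader combination is the right choice depends on the precision the campaign requires.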
Use an identity bridge
Sometimes the match rate that can be achieved between two first-party data sets, for example, an advertiser’s CRM and a media owner’s authenticated audience, isn’t high enough to meet the campaign goal. In these circumstances, a second-party data set or a third-party identity graph can be used as an identity bridge. The identity bridge creates a three-way match: records that share no common key directly can still be linked through identifiers the bridge holds for both sides, which raises the match rate that can be achieved.
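A three-way match can be sketched like this. The example assumes the advertiser holds only emails, the media owner holds a mix of emails and phone numbers, and the bridge holds email-to-phone links. All names (`direct_matches`, `bridged_matches`) and data are hypothetical, and the whole computation runs in one process purely for illustration.

```python
def direct_matches(advertiser_emails, media_records):
    """Advertiser emails that appear directly in the media owner's records."""
    media_emails = {r["email"] for r in media_records if r.get("email")}
    return set(advertiser_emails) & media_emails


def bridged_matches(advertiser_emails, media_records, bridge_pairs):
    """Advertiser emails linked, via the bridge, to a phone the media owner holds.

    bridge_pairs: iterable of (email, phone) links held by the bridge.
    """
    media_phones = {r["phone"] for r in media_records if r.get("phone")}
    email_to_phones = {}
    for email, phone in bridge_pairs:
        email_to_phones.setdefault(email, set()).add(phone)
    return {
        e for e in set(advertiser_emails)
        if email_to_phones.get(e, set()) & media_phones
    }


# Illustrative data only (assumed).
advertiser = ["a@x.com", "b@x.com", "c@x.com", "d@x.com"]
media = [
    {"email": "a@x.com", "phone": None},
    {"email": None, "phone": "222"},
    {"email": None, "phone": "333"},
]
bridge = [("b@x.com", "222"), ("c@x.com", "333"), ("z@x.com", "999")]

direct = direct_matches(advertiser, media)
combined = direct | bridged_matches(advertiser, media, bridge)
print(len(direct) / len(advertiser), len(combined) / len(advertiser))  # 0.25 0.75
```

Here the direct match rate is 25%, and adding the bridge lifts it to 75%, because two advertiser records that share no key with the media owner are connected through the bridge's email-to-phone links.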
Expand audience with federated learning
Depending on the goal of the match test, audience expansion can be a useful way of increasing reach. The Secure Data Clean Room includes a federated learning engine that analyzes the characteristics of the matched audience and expands into the unmatched audience.
Perfect your first-party data and data collaboration strategies
The advertising ecosystem is in a state of flux, with so much change and so many unknowns. This uncertainty is only made worse by the impending deprecation of third-party cookies and privacy legislation becoming stricter globally. It is clear that the risk is no longer worth the reward when it comes to the privacy and security of customer data. Control and protection of proprietary data assets need to take priority as enterprise organizations look to future-proof their first-party data strategies.
This means marketers need to develop closer, more direct connections with their supply-chain partners. Black box tactics will no longer be a safe or effective way to generate true addressable scale and accuracy. Brand marketers and media owners will need to invest more in their own data as well as forge expansive data collaboration partnerships to gain access to additional intelligence and scale.
We are at the beginning of a new era for data-driven marketing where sensitive data can be used to execute effective experiences across the open web, without risk. The Secure Data Clean Room provides the necessary balance between privacy and performance, enabling a seamless transition towards a better data onboarding and matching process.