AWS Clean Rooms, InfoSum and the future of data collaboration

Alistair Bastian
Monday, December 5, 2022

Data collaboration has been taking place for decades. However, the industry is facing new challenges from growing consumer awareness and new data privacy laws. As a result, the need for secure and privacy-safe data collaboration has grown, and the term “Data Clean Room” has emerged to categorize this demand. The market is currently filled with a variety of data clean room vendors, each optimized to solve the data collaboration challenge in different ways.

The latest is Amazon’s AWS Clean Rooms, announced only last week. Amazon’s entrance into the clean room market is a clear signal of the importance, demand, and scale of opportunity within it. It should go a long way toward solidifying the term “Data Clean Room” and quieting those who question the value and necessity of clean rooms.

Our key observations about Amazon’s AWS Clean Rooms

Unlike other clean room offerings that are geared towards marketers or analysts, AWS Clean Rooms has a different audience. It is more of a specialized tool within a larger technology toolbox. Like other AWS offerings, it is optimized for commercial entities to embed into existing processes, rather than being a stand-alone product that a marketer can immediately pick up and use. Yet the capabilities it promises to deliver, once harnessed, are impressive.

Of primary interest to InfoSum is AWS’s use of cryptographic computing. Cryptographic computing, as the AWS Clean Rooms feature page mentions, “allows you to collaborate using secure multi-party computation (SMPC), a technique that allows multiple parties to jointly compute a function over their inputs while keeping those inputs private. SMPC ensures that data used in collaborative computations remains encrypted: at rest, in transit, and in use.” This approach is becoming increasingly prevalent among current data clean room vendors.
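To make the idea of jointly computing a function over private inputs more concrete, here is a minimal textbook sketch of one SMPC building block, additive secret sharing, used to compute a sum. It is an illustration only and is not a description of how AWS Clean Rooms or InfoSum is implemented; the function names, the modulus, and the example figures are all assumptions chosen for the sketch.

```python
import secrets

# Illustrative field modulus (a Mersenne prime); an assumption for this sketch.
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split `value` into n random-looking shares that sum to it mod PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_inputs):
    """Compute the sum of all inputs without any party seeing another's input."""
    n = len(private_inputs)
    # Each party i splits its input and sends share j to party j.
    all_shares = [share(v, n) for v in private_inputs]
    # Party j adds up the shares it received; each share alone reveals nothing.
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the partial sums are published; together they yield the total.
    return sum(partial_sums) % PRIME


if __name__ == "__main__":
    # Three hypothetical parties, each holding a private count.
    print(secure_sum([120, 45, 300]))  # prints 465
```

Each party only ever sees random-looking shares and the final total, which is the property the quoted description refers to. Production systems layer authentication, protection against malicious participants, and far richer computations on top of primitives like this.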

How InfoSum is different

AWS’s approach demonstrates that not all clean rooms are the same; they solve collaboration challenges in different ways.

InfoSum has been helping customers navigate and simplify the complexities of secure, privacy-safe data collaboration since long before the term “Data Clean Room” was coined. Our vision has always been to develop the world’s first decentralized data collaboration platform. Among many goals, three have been at the heart of InfoSum’s vision: protecting the privacy of users, enabling decentralized data processing, and enabling multiple parties to collaborate at scale and speed. Taking each of these goals in turn clearly differentiates our approach from that of AWS Clean Rooms.

Data privacy is not the same as data security. Data security relates to keeping personal and sensitive data secure from unauthorized access, whereas data privacy relates to protecting data while it is in use. It covers how data is shared with third parties and how data is legally collected and stored according to regulatory restrictions. It can, however, go much further than this. It can and should also mean providing the ability to reduce or remove the identifiability of individuals and to prevent the unintended tracking, targeting, or reidentification of individuals.

On the InfoSum platform it is not possible, through any query, data join, or analysis, to re-identify an individual. It is also not possible to reveal any personal data or attributes, sensitive or otherwise, about specific individuals or entities. Furthermore, it is not possible to determine whether a specific entity or individual is a member of any collaborating party’s dataset. InfoSum uses robust, state-of-the-art techniques to ensure that the privacy of an individual cannot be compromised, whether accidentally, through misconfiguration, or intentionally, no matter what operations are executed on the platform.

InfoSum feels that this privacy-first approach is not as strong or prominent among many clean room vendors. Some lack basic protections against reidentification attacks on individual-level data. While data access rules can be configured, they typically place all the risk and trust on the configuration, with the option of not enabling any individual-level protection at all.

InfoSum’s second goal is to enable decentralized data processing. InfoSum has been designed so that data residing in different geographical locations, physical locations, or cloud vendors can be queried and joined without raw data being moved. InfoSum’s query engine moves an abstract mathematical model that represents the data between the decentralized and distributed datasets. The models are abstractions in that, although they are built from personal identifiers, they are one-way models: they cannot be reversed. It is not possible to take an InfoSum mathematical model and enumerate the personal data that was used to construct it. Although many clean room vendors can work with data in the cloud, they are typically restricted to specific cloud vendors, where data needs to be centralized into a single cloud provider to enable collaboration. These solutions do not allow collaboration with data on- and off-cloud.

The third key goal of InfoSum is to enable multiple parties to collaborate at scale and speed. “At scale” means that a large number of parties should be able to enter into a collaboration and participate in a query or join. The InfoSum platform enables collaborations to be of any size. “At speed” means not only that new collaborations should be fast to set up and define, but also that the time to insights or activation should be minimal. Once data is added to the InfoSum platform, InfoSum guarantees that, without modification, the dataset can be used in a collaboration with any other dataset on the platform, instantaneously. Entering into multiple collaborations simultaneously, modifying the criteria of a collaboration, or leaving one is also possible with the push of a button. Finally, query execution is optimized, so a query, and hence an insight, can take less than a second. Within InfoSum, a collaborator can enter a large number of collaborations with minimal management overhead. A new clean room, with its own configuration parameters, is not required for each collaboration.

These core strengths are what differentiate InfoSum from other vendors. Privacy, decentralized data processing, and multi-party collaboration at scale and speed are key to a modern collaborative computing platform. Not all clean rooms are the same, and AWS Clean Rooms has significant potential to deliver on capabilities that combine cryptographic computing with deep data science techniques. With some work, it can be integrated with other tools in AWS’s toolbox to solve novel and bespoke use cases.

To 2023 and beyond

Since delivering on its vision, InfoSum has helped over 100 clients and partners solve numerous data collaboration challenges, without exposing competitive knowledge, while complying with evolving regulatory restrictions, and meeting rising consumer expectations for privacy and transparency.  

The secure, privacy-safe data collaboration and clean room market is large and growing, with enough room to accommodate multiple players. The addition of new players continues to be a strong indicator of the demand, opportunity, and longevity of the space. From InfoSum’s perspective, AWS Clean Rooms is a welcome addition; it does little to bring the other vendors closer to InfoSum’s key strengths. At the same time, AWS has provided InfoSum with a tool to close the gap on these same vendors. InfoSum looks forward to leveraging and working with AWS Clean Rooms, for specific use cases, to help complement and enhance InfoSum’s market-leading offering and position in 2023 and beyond.
