Upgrade 2021: CIS LAB Speakers

September 21, 2021 // Upgrade 2021: CIS LAB Speakers

Using Cryptography to Meet Requirements for Use of Aggregate Data While Protecting Privacy

Dan Boneh, Professor, Stanford University

Summary

Organizations have legitimate uses for aggregate data about the users of their products and services, such as improving those products and services. However, individual users expect, and often have a legal right to, a degree of privacy. Is there a way to meet both of these apparently conflicting requirements?

This was the question addressed by Professor Dan Boneh (Stanford University) in his presentation at Upgrade 2021, the NTT Research Summit. In work with a variety of co-authors, Boneh has adopted a model with two collaborating but non-colluding entities, called the server and the helper. Each individual's data is split randomly between the two in such a way that, unaided, neither can reconstruct the data of any individual. Indeed, each entity holds data that is indistinguishable from random noise.
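One common way to realize this kind of split is additive secret sharing, sketched below. This is an illustrative assumption rather than the construction presented in the talk: the modulus, the function names split and reconstruct, and the use of 64-bit integers are all hypothetical choices made for the example.

```python
import secrets
from typing import Tuple

# Assumed share space for this sketch; it only needs to exceed the data range.
MODULUS = 2**64


def split(value: int) -> Tuple[int, int]:
    """Split a value into two additive shares modulo MODULUS."""
    # The server's share is uniformly random and independent of the value.
    server_share = secrets.randbelow(MODULUS)
    # The helper's share is the difference; viewed alone it is also uniform.
    helper_share = (value - server_share) % MODULUS
    return server_share, helper_share


def reconstruct(server_share: int, helper_share: int) -> int:
    """Recombine both shares; requires cooperation of both entities."""
    return (server_share + helper_share) % MODULUS


server_share, helper_share = split(42)
assert reconstruct(server_share, helper_share) == 42
```

Either share on its own is a uniformly random number, which is why neither entity learns anything about an individual value without the other's share.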

Now the problem is how to extract aggregate data from the two data sets without either entity becoming aware of individual data entries at a level of granularity below that of the aggregate. Furthermore, the computational and cryptographic demands on the entities must be as lightweight as possible, as the data sets involved are likely to be large. Finally, there must be a degree of robustness against malicious reporting by a single data source, so that such data cannot prevent aggregation of all the other legitimate data points.

Boneh considered three specific problems. The first is to sum the data values within a particular set of data sources, and this proves to be fairly simple. The second is the detection of so-called “heavy hitters,” values occurring more than a threshold t times in the data. He showed that this can be achieved by organizing the data in a tree structure and using data compression methods to reduce the computational load.
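To illustrate why the summation problem is the simple case, here is a minimal sketch that assumes each value is split into two additive shares modulo a fixed number. The modulus and the names split and private_sum are hypothetical, and the sketch omits the checks needed to reject malformed submissions; it is not presented as the protocol from the talk.

```python
import secrets
from typing import Iterable, Tuple

# Assumed share space; must exceed the largest possible total.
MODULUS = 2**64


def split(value: int) -> Tuple[int, int]:
    """Split a value into two additive shares modulo MODULUS."""
    server_share = secrets.randbelow(MODULUS)
    helper_share = (value - server_share) % MODULUS
    return server_share, helper_share


def private_sum(values: Iterable[int]) -> int:
    """Simulate both entities summing their shares independently."""
    server_total = 0  # running total held only by the server
    helper_total = 0  # running total held only by the helper
    for value in values:
        server_share, helper_share = split(value)  # done by each data source
        server_total = (server_total + server_share) % MODULUS
        helper_total = (helper_total + helper_share) % MODULUS
    # Only the two totals are combined, never the per-individual shares.
    return (server_total + helper_total) % MODULUS


assert private_sum([3, 5, 10]) == 18
```

Because addition commutes with the sharing, each entity only adds up the numbers it already holds, and the single exchange of totals reveals the aggregate and nothing finer.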

The third problem is the construction of anonymized histograms, arranging all values of one aspect of the data in a useful form without allowing any value to be traced to an individual. He showed this can be done by requiring a message authentication code to be included in the submitted data and then applying a robust mixing algorithm. In all cases, only symmetric cryptographic protocols are needed.

Professor Boneh concluded with a discussion of the general problem setting and the likelihood that the role of helper could be played by some trusted party such as the Internet Security Research Group. He believes further applications in the same problem family are likely to emerge.


Dan Boneh

Professor, Stanford University

Professor Boneh heads the applied cryptography group and co-directs the computer security lab. His research focuses on applications of cryptography to computer security. His work includes cryptosystems with novel properties, web security, security for mobile devices, and cryptanalysis. He is the author of over a hundred publications in the field and is a Packard and Alfred P. Sloan fellow. He is a recipient of the 2014 ACM Prize and the 2013 Gödel Prize. In 2011, Dr. Boneh received the Ishii Award for Industry Education Innovation. Professor Boneh received his Ph.D. from Princeton University and joined Stanford in 1997.