Hoeteck Wee
Senior Scientist, NTT Research
Transcript of the presentation Functional Encryption for Attribute Weighted Sums, given at the NTT Upgrade 2020 Research Summit, September 30, 2020
Hi, I’m Hoeteck Wee. I’ll be presenting my work, functional encryption for attribute-weighted sums. This is joint work with Michel Abdalla and Junqing Gong. So the context of this work is to do a private database analysis. So consider we have a database of an attribute value pairs, comprising Xi, a public value, and Zi, a private value. So think, for concreteness, of Xi being, say, demographic and geographic attributes, and Zi is rating in a poll. Now we’ll be interested in computing the average Xi value over some subset of the database. For instance, where the subset of the database is selected by applying some predicate f to the public attribute Xi.
So concrete f we may be interested in would be for instance, to look at say, individuals over the age of 40, who are Fox News subscribers in the state of Wisconsin. Let’s consider a more general setting of attribute-weighted sums, where the output of function f may not be zero-to-one, but could be arbitrary weights. And when f is, the output of f lies in zero to one, this corresponds to the expression from this poll. We’ll be looking at this question from the point of view of functional encryption, where in functional encryption, we have an encryption algorithm that takes as input the database, and outputs a cyphertext, where the key generation algorithm that takes as input the function, and outputs a secret key, and where the decryption algorithm, that takes as input the cyphertext and the secret key, outputs the actual weighted sum.
And we want the additional privacy guarantee that the cyphertext and the secret key should leave no additional information about the private values, beyond what we learned from the attribute-weighted sum. In this work, we construct a functional encryption scheme for attribute-weighted sums for a large class of programs, corresponding to arithmetic branching programs, which contains, as a special case, a Boolean formula. Our scheme has the property that key generation is independent of any central database. This means that we can generate secret keys without knowing a priori, the size of the database. And we can use the same secret key to decrypt a database of any size, containing any number of entries. Also, the encryption algorithm, whose running time depends on N, but is otherwise independent on the complexity of the function f, we achieve strong simulation-based security against bot collusions, and we achieve security under said assumptions of prime-order bilinear groups.
Construction is fairly simple and proceeds in two steps. We start with a scheme for the second N = 1. So this scheme, which we denote with superscript one, that takes as input x and z, and decryption with f(x) times z. And then the second step, we amplify this basic scheme from N = 1 to general arbitrary N. Now for this case N = 1, we can actually get the scheme via some simple tweaks to prior works. So for this talk, we’re going to focus on the second step, which is an amplification procedure, that starts from N = 1, to general N, without blowing up the size of the secret key. Alright, so here’s our first attempt at such an amplification procedure. So, this is a very natural construction, namely, to encrypt the entire database, we apply the basic scheme, and encrypt each of the attribute value pairs {Xi, Zi}.
And then the secret key would basically be the secret key for the basic scheme. To decrypt, we first apply the secret key to decrypt each of the individual N’s that I’ve shown with the basic scheme, to compute a f(Xi, Zi) and then sum up all of these values. Now note that correctness is very straightforward, and for those from the underlying scheme, and we also achieve efficiency in the sense that the size of the secret key only depends on f and is independent of the N, the total number of database entries. The problem, however, is that we do not achieve security, in particular, these partial decryptions leak the individual f(Xi)Zi values. Whereas decryption should only leak the sum these values.
To solve this problem, we basically will introduce additional randomizers to the scheme. So, during encryption, we will additionally pick N random scalars whose sum is zero, and we will apply the basic scheme to encrypt Xi, together with the concatenation of Zi and Wi, as the private value.
And the secret key is similar to that from before, with a slight tweak, so that we generate a secret key for function f, that, when we decrypt the basic cyphertext, we return f(Xi)Zi plus Wi. This is something we can do, and now, if we sum all of these values, the Wi’s cancel, and then we get correctness as before. And these additional randomizers from the Wi’s also guarantee that the partial decryptions don’t leak additional information about the individual f(Xi)Zi’s. Now this works, if we only keep one secret key, however, if we keep multiple secret keys, as is the case when we have collusions, security no longer holds, because we cannot reuse these randomizer Wi’s across multiple keys. To fix this problem, we will computationally randomize the Wi’s using a DDH assumption.
Concretely, during key generation, we will pick a random scalar r that’s chosen fresh for the secret key, and encode it in the exponent, which we denote by the square brackets. And then the secret key will be generated for the function f [r], so that when you decrypt the individual cyphertexts, you don’t compute f(Xi)Zi. Rather, you compute f(Xi)Zi PLUS Wi * r, where r is the r in the secret key. And this computation is done in the exponent. Then to decrypt, we just need to multiply all these values together, which then basically induces a sum in the exponents. Of course, to get the final answer, we’d need to do a brute-force to the script block, So, we only get efficient decryption if the attribute-weighted sum lies in [unintelligible] the size domain.
Alright, let’s think about how we would try to prove security. So, consider the distribution of the partial decryptions. First, we’ll apply the DDH assumption to replace the WiR’s with uniformly random values, fresh randomizers per secret key, thanks to having a fresh r per secret key. And then we can apply this argument to essentially move each of the individual Xi,Zi terms to the first term in the entire series. This gives us security. However, if we try to carry this out, this argument out in the scheme, we need to somehow embed these N units of entropy, corresponding to [unintelligible] into either the cyphertext, or the secret key.
Now, first note that we cannot embed N units of entropy into the secret key, because the size of the secret key cannot grow with N. Now, if we try to embed this N units of entropy into the cyphertext, it also does not work, because we actually need N fresh units of entropy, per secret key query. And if we try to embed all of this into the cyphertext, the cyphertext size would grow with the number of secret key queries, which is something we cannot allow for. So, to solve this problem, we’re going to instead use a different strategy. We’re going to work with the partial sums, and embed these partial sums into the secret key, across and introduce this N units of entropy one unit at a time across N hyper-exponents.
In a bit more detail, this again is the partial decryptions. We start by looking at the first two terms. We’re going to basically apply the DDH as before, to move that f(X2)(Z2) term from the second term to the first term. Now we look at first and the third term, and again we can apply the DDH assumption to the third term, to move f(X3)Z3 to the first term. And we can keep doing this over and over again, moving to fourth f(X4)Z4 to the first term, and so on and so forth, until we move everything to the first term. This basically works. At each step, we will only need to apply DDH once, per secret key and introduce one unit of entropy.
The main problem, however, is that we will now need, in the proof of security, to show that real and simulated cyphertexts are indistinguishable, while giving out a simulated secret key. This is the hole, in general, for the basic scheme; but we can work around with this, by essentially running two copies of the basic scheme. That basically concludes the talk. So, we show how to construct functional encryption for attribute-weighted sums for arithmetic branching programs. In the work we also discussed the extension to a setting where the database is distributed across multiple clients. A couple of open problems, one, use to achieve adaptive security. And another is to get the scheme from lattice assumptions.
Thank you very much.
Functional Encryption for Attribute Weighted Sums
Hoeteck Wee,
Senior Scientist, NTT Research
recent publications by hoeteck wee
- Time-Space Tradeoffs and Short Collisions in Merkle-Damg˚ard Hash Functions
- Candidate Obfuscation via Oblivious LWE Sampling
- Inaccessible Entropy I: Inaccessible Entropy Generators and Statistically Hiding Commitments from One-Way Functions
- Inaccessible Entropy II: IE Functions and Universal One-Way Hashing
- Hoeteck Wee's Google Scholar page
- Summary of Hoetech Wee's presentation