Dan Boneh

Professor, Applied Cryptography and Computer Security, Stanford University

Transcript of the presentation Recent Developments in Cryptography, given at the NTT Upgrade 2020 Research Summit, September 30, 2020

Hi everyone. My name is Dan Boneh, and I want to thank the organizers for inviting me to speak. Since I only have 15 minutes, I decided to talk about something relatively simple that will hopefully be useful to NTT. This is joint work with my students Saba Eskandarian, Sam Kim, and Maurice Shih. The work will appear at the upcoming Asiacrypt and is available on ePrint if anyone wants to learn more about what I’m going to talk about. So, I want to tell you the story of storing encrypted data in the cloud. All of us have lots of data, and typically we’d rather not store the data on our local machines; instead, we’d like to move the data to the cloud, so that the cloud can handle backup and access control, and allow us to share the data with others. However, for some types of data, we’d rather not have the data available in the cloud in the clear.

 

And so, what we do is we encrypt the data before we send it to the cloud, and the customer is the one holding the key. So, the cloud has ciphertext, and the customer is the only one that has the key that can decrypt that data. Now, whenever we deal with encrypted data there’s a very common requirement called key rotation. Key rotation refers to the act of taking a ciphertext and re-encrypting it under a different key without changing the underlying data. Okay, and the reason we do that is so that an old key basically stops working, right? We re-encrypt the data under a new key, and as a result, the old key can no longer decrypt the data. So, it’s a way for us to expire keys, so that only the new key can decrypt the current data stored in the cloud. Of course, when we do this, we have to assume that the cloud doesn’t keep the old ciphertext.

 

So, we’re just going to assume that the cloud deletes the old ciphertexts, and the only thing the cloud has is the latest version of the ciphertext, which can only be decrypted using the latest version of the key. So why do we do key rotations? Well, it turns out it’s actually quite a good idea. For one, like we said, it limits the lifetime of a key. If I give you a key today, you can decrypt the data today, but after I do a key rotation on my data, the key that I gave you no longer works. Okay, so it’s a way to limit the lifetime of a key. And it’s a good idea, for example, in an organization that might have temporary employees: you might give those temporary employees a key, but once they leave, the key effectively stops working after the key rotation has been done. Not only is it a good idea, it’s actually a requirement in many standards. So, for example, NIST requires key rotation, and the payment industry requires periodic key rotation. So, it’s a fairly common requirement out there.

 

The problem is: how do we do key rotation when the data is stored in the cloud? There are two options that immediately come to mind, but both are problematic. The first option is to download the entire dataset onto our client machines. This could be terabytes or petabytes of data. So, it’s a huge amount of data that we might need to download onto the client machine, decrypt under the old key, re-encrypt under the new key, and then upload back to the cloud. That works and it’s fine; the only problem is that it’s very expensive. You have to move the data back and forth in and out of the cloud. The other option, of course, is to send the old key and the new key to the cloud, and then have the cloud decrypt using the old key and re-encrypt using the new key. And, of course, that also works, but it’s insecure, right? Because now the cloud gets to see your data in the clear.

 

So, the question is what to do. And it turns out there’s a better option, which is called updatable encryption. Updatable encryption works as follows. We take our old key and our new key, and we combine them using some sort of re-key generation algorithm. What this algorithm does is generate a short re-encryption key that’s a combination of the old and new key. We can then send the re-encryption key over to the cloud, and the cloud can use this key to re-encrypt the entire dataset. So in doing so, basically the cloud is able to do the rotation for us, but the hope is that the cloud learns nothing about the data in doing that. Okay, so the re-encryption key that we send to the cloud should reveal nothing to the cloud about the actual data that’s being held in the cloud.

 

So updatable encryption is a relatively old concept. I guess it was first studied in one of our papers back in 2013. There were stronger definitions given in the work of Everspaugh et al. in 2017, and there have been a number of papers studying this concept since. So, before we talk about the constructions for updatable encryption, let me quickly make sure the syntax is clear, just so we see how this works. Basically, there’s a key generation algorithm that generates a key from a security parameter. Then, when we encrypt a message using a particular key, we’re going to break the ciphertext into a short header and the actual ciphertext body; both the header and the ciphertext body get sent to the cloud. And like I said, this header is going to be short and independent of the message length. Then, when we want to do a rotation, what we’ll do is basically use the old key and the new key, along with the ciphertext header, to produce what we call a re-encryption key. We’ll denote that by delta.

 

Okay, so the way this works is: we download the short header from the cloud, compute the re-encryption key, and send the re-encryption key to the cloud. The cloud then runs the re-encrypt algorithm, which uses the re-encryption key and the old ciphertext to produce the new ciphertext, and this new ciphertext is stored in the cloud. And again, I repeat that the assumption is that the cloud is going to erase the old ciphertext and is going to erase the re-encryption key that we sent to it. Finally, at the end of the day, when we want to decrypt the actual ciphertext in the cloud, we download the ciphertext from the cloud, decrypt it using the key k, and recover the actual message.
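To make the syntax concrete, here is a minimal toy sketch of the five algorithms in Python. The scheme itself (a linear map standing in for a key homomorphic PRF, message blocks encoded as integers mod a prime) is my own illustration, not the construction from the paper; it is not secure, and it omits the header update and ciphertext integrity that the real schemes provide.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus (illustration only)

def F(k: int, x: bytes) -> int:
    # Toy stand-in for a key-homomorphic PRF: F(k, x) = k * H(x) mod P,
    # so F(k1, x) + F(k2, x) = F(k1 + k2, x). Linear, hence NOT a secure PRF.
    hx = int.from_bytes(hashlib.sha256(x).digest(), "big") % P
    return (k * hx) % P

def key_gen() -> int:
    return secrets.randbelow(P)

def encrypt(k: int, msg: list) -> tuple:
    # The header is short and independent of the message length.
    header = secrets.token_bytes(16)
    ct = [(m + F(k, header + i.to_bytes(8, "big"))) % P for i, m in enumerate(msg)]
    return header, ct

def rekey_gen(k_old: int, k_new: int, header: bytes) -> int:
    # The short re-encryption key delta, computed on the client.
    return (k_new - k_old) % P

def re_encrypt(delta: int, header: bytes, ct: list) -> list:
    # Run by the cloud: shifts every block from the old key to the new key
    # without ever seeing the plaintext.
    return [(c + F(delta, header + i.to_bytes(8, "big"))) % P
            for i, c in enumerate(ct)]

def decrypt(k: int, header: bytes, ct: list) -> list:
    return [(c - F(k, header + i.to_bytes(8, "big"))) % P for i, c in enumerate(ct)]
```

After `re_encrypt(rekey_gen(k1, k2, header), header, ct)`, the result decrypts under the new key k2, and k1 is no longer of any use for it.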

 

So in this new work with my students, we set out to look for more efficient constructions for updatable encryption. The first thing we did is, we realized there were some issues with the current security definitions. And so, we strengthened the security definitions. We strengthened them in a couple of ways, but in particular, we’d like to make sure that the actual ciphertext stored in the cloud doesn’t reveal the number of key rotations. So a rotated ciphertext should look indistinguishable from a fresh ciphertext; but not only that, the definition should also guarantee that the number of key rotations is not leaked just by looking at the ciphertext. Generally, we’d like to hide the number of key rotations, so that it doesn’t reveal private information about what’s encrypted inside the ciphertext.

 

But our main goal was to look at more efficient constructions. So, we look at two constructions. One is based on a lattice-based key homomorphic PRF; actually, the main point of this work was to study the performance of a lattice-based key homomorphic PRF relative to the existing updatable encryption systems. The other construction we give is what’s called a nested construction, which just uses plain old symmetric encryption. And interestingly, what we show is that, in fact, the nested construction is the best construction we have, as long as the number of key rotations is not too high. So if we do under 50 re-encryptions, just go ahead and use the nested construction, built from symmetric encryption. However, if we do more than 50 key rotations, all of a sudden the lattice-based construction becomes the best one that we have. I want to emphasize here that our goal in using lattices was not to get quantum resistance. We wanted to use lattices because lattices are fast. And so we wanted to gain from the performance of lattices, not from the security that they provide. So, I guess before I talk about the constructions, I have to quickly remind you of the security model and what it is we’re trying to achieve.

 

I have to say the security model for updatable encryption is not that easy to explain. The adversary gets to see lots of keys, lots of re-encryption keys, and lots of ciphertexts. So instead of giving you the full definition, I’m just going to give you the intuition for what this definition is trying to achieve, and I’m going to point you to the paper for the details. So really, the definition is concerned with the following setting. Imagine we have a ciphertext that’s encrypted under a certain key k; at some point later on, the ciphertext gets re-encrypted using the re-encryption key delta. Okay, so now the new ciphertext is encrypted under the key k prime. And what we’re basically trying to capture in the definition is that if the adversary gets to see the old ciphertext, the new ciphertext, and the re-encryption key, then they learn nothing about the message and they can’t harm the integrity of the ciphertext.

 

Similarly, if they just see the old key and the new ciphertext, they learn nothing about the message, and they can’t harm the integrity of the ciphertext. And similarly, if they see an old ciphertext and a new key, same thing. Again, this is overly simplified, because in reality the adversary gets to see lots of ciphertexts, lots of keys, and lots of re-encryption keys, and there are all these correctness conditions for when they’re supposed to learn something and when not. So, I’m going to defer this to the paper, but this gives you at least the intuition for what the definition is trying to achieve. So now let’s turn to constructions. The first construction we’ll look at is the classic way to construct updatable encryption, using what’s called a key homomorphic PRF. Key homomorphic PRFs were used by Naor, Pinkas and Reingold [unintelligible] back in ’99. And they were defined in our paper BLMR (Boneh, Lewi, Montgomery and Raghunathan) back in 2013. The point of the BLMR paper was mainly to construct key homomorphic PRFs without random oracles.

 

So first, let me explain what a key homomorphic PRF is. It’s basically a PRF with a homomorphism relative to the key. So, you can see here that if I give you the PRF under two different keys at a point x, you can add those values and get the PRF under the sum of the keys at the same point x. Okay, so that’s what the key homomorphic property lets us do. Key homomorphic PRFs have been used to construct updatable encryption schemes, and the first thing we show is that using a key homomorphic PRF, we can build an updatable encryption scheme that satisfies our stronger security definitions. Again, I’m not going to go through this construction, but just to give you intuition for why key homomorphic PRFs are useful for updatable encryption, let me just say that the re-encryption key is going to be the sum of the old key and the new key. And to see why that’s useful, let’s imagine we’re encrypting a message using counter mode.
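In symbols, the property is F(k1, x) + F(k2, x) = F(k1 + k2, x). Here is a quick sanity check using a toy linear function of my own (chosen precisely because it has this property; it is not a real PRF):

```python
import hashlib

P = 2**127 - 1  # toy prime modulus

def F(k: int, x: bytes) -> int:
    # Toy key-homomorphic function: F(k, x) = k * H(x) mod P.
    # Linearity in k gives the homomorphism, but for the same reason
    # this is NOT a secure PRF; real schemes use DDH or lattices.
    return (k * (int.from_bytes(hashlib.sha256(x).digest(), "big") % P)) % P

k1, k2, x = 1234, 5678, b"point x"
assert (F(k1, x) + F(k2, x)) % P == F(k1 + k2, x)
```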

 

So, you can see here a message is being encrypted using a PRF applied to a counter i. Well, if I give the cloud key one plus key two, the cloud can evaluate F under key one plus key two at the point i; and if we subtract that from the ciphertext, then by the key homomorphic property, F under key one cancels out, and we’re left with an encryption of m under minus key two. So, we were able to transform the ciphertext from an encryption under key one to an encryption under minus key two. And that’s basically the reason why they’re useful. But, of course, in reality, the construction has many, many more bells and whistles to satisfy the security definition.
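That cancellation step can be sketched in a few lines, again using a toy linear stand-in for the key homomorphic PRF (my own illustration, insecure): the cloud is handed only k1 + k2, subtracts F at each counter, and the ciphertext flips from key one to minus key two.

```python
import hashlib
import secrets

P = 2**127 - 1  # toy prime modulus

def F(k: int, i: int) -> int:
    # Toy key-homomorphic function (insecure): F(k, i) = k * H(i) mod P.
    h = int.from_bytes(hashlib.sha256(i.to_bytes(8, "big")).digest(), "big") % P
    return (k * h) % P

k1 = secrets.randbelow(P)
k2 = secrets.randbelow(P)
msg = [10, 20, 30]

# Counter-mode encryption under k1: c_i = m_i + F(k1, i)
ct = [(m + F(k1, i)) % P for i, m in enumerate(msg)]

# The cloud receives only delta = k1 + k2 and subtracts F(delta, i):
# c_i - F(k1 + k2, i) = m_i - F(k2, i), an encryption under "minus k2".
delta = (k1 + k2) % P
ct2 = [(c - F(delta, i)) % P for i, c in enumerate(ct)]

# The holder of k2 decrypts by adding F(k2, i) back.
dec = [(c + F(k2, i)) % P for i, c in enumerate(ct2)]
assert dec == msg
```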

 

Okay, so what do we know about key homomorphic PRFs? Well, the first key homomorphic PRF was based on the DDH assumption, and that’s just a standard PRF from DDH; it’s not difficult to see that this construction is key homomorphic. In this work, we are particularly interested in the key homomorphic PRF that comes from lattices. So, our question was, can we optimize this key homomorphic PRF to get a very fast updatable encryption scheme? And the answer is yes, we can. To do that we use the ring learning-with-errors (RLWE) problem.
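The DDH-based key homomorphic PRF has the form F(k, x) = H(x)^k, where H hashes into the group; the homomorphism is multiplicative, since F(k1, x) · F(k2, x) = H(x)^(k1+k2). A toy sketch over Z_p* with a Mersenne prime modulus, which is my own simplification (a real implementation would hash onto a proper prime-order group):

```python
import hashlib
import secrets

p = 2**127 - 1  # Mersenne prime; toy group Z_p* (illustration only)

def H(x: bytes) -> int:
    # Toy hash "into the group" (a real scheme hashes onto a prime-order subgroup).
    return int.from_bytes(hashlib.sha256(x).digest(), "big") % p

def F(k: int, x: bytes) -> int:
    # DDH-style key homomorphic PRF: F(k, x) = H(x)^k mod p.
    return pow(H(x), k, p)

k1 = secrets.randbelow(p - 1)
k2 = secrets.randbelow(p - 1)
x = b"some input"
# Multiplicative key homomorphism: F(k1, x) * F(k2, x) = F(k1 + k2, x),
# with key addition taken modulo the group order (here p - 1).
assert (F(k1, x) * F(k2, x)) % p == F((k1 + k2) % (p - 1), x)
```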

 

So, our goal was really to evaluate updatable encryption as it applies to lattices. That’s the first construction. The second construction, like I said, is purely based on symmetric encryption, and it’s an enhancement of what we call the trivial updatable encryption scheme. So, what’s the trivial updatable encryption scheme? Well, basically we look at a standard encryption where we encrypt a message using some message encryption key, and then we encrypt the message key using the actual client key.

 

These are all symmetric encryptions. The client key would be k, and the header would be the message encryption key, encrypted under the client key. Now, when we want to rotate the keys, all we do is generate a new message encryption key, call it key body prime, and send that over to the cloud. The cloud encrypts the entire old ciphertext under the new key, and then encrypts the new key along with the old key under a new client key, which we call k prime. So, what gets sent to the cloud is this key body prime and header prime, and the cloud is able to do its operation and re-encrypt the old ciphertext. The new client key becomes k prime. And, of course, we can continue this over and over, in an onion-like encryption where we keep encrypting the old ciphertext under a new message key.
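Here is a toy sketch of that trivial scheme in Python. The XOR-based stream cipher, the key-chain header layout, and all the function names are my own illustration (insecure, for intuition only), and for brevity the client-side and cloud-side steps of a rotation are folded into one function:

```python
import hashlib
import secrets

def stream_xor(key: bytes, data: bytes) -> bytes:
    # Toy stream cipher: XOR with a SHA-256-based keystream (NOT secure).
    out = bytearray()
    ctr = 0
    while len(out) < len(data):
        out += hashlib.sha256(key + ctr.to_bytes(8, "big")).digest()
        ctr += 1
    return bytes(a ^ b for a, b in zip(data, out))

def encrypt(client_key: bytes, msg: bytes):
    k_body = secrets.token_bytes(32)           # message encryption key
    header = stream_xor(client_key, k_body)    # k_body wrapped under the client key
    return header, stream_xor(k_body, msg)

def rotate(old_client_key: bytes, new_client_key: bytes, header: bytes, body: bytes):
    # Client side: recover the chain of body keys and prepend a fresh one.
    chain = stream_xor(old_client_key, header)
    k_new = secrets.token_bytes(32)
    new_header = stream_xor(new_client_key, k_new + chain)
    # Cloud side: add one onion layer over the existing body.
    return new_header, stream_xor(k_new, body)

def decrypt(client_key: bytes, header: bytes, body: bytes) -> bytes:
    chain = stream_xor(client_key, header)
    for i in range(0, len(chain), 32):         # peel the layers, newest first
        body = stream_xor(chain[i:i + 32], body)
    return body
```

Note that in this sketch the header grows by one 32-byte key per rotation, so its length reveals exactly how many rotations have happened.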

 

The benefit of the scheme, of course, is that it only uses symmetric encryption, so it’s actually quite fast. So that’s pretty good. Unfortunately, it’s not quite secure. And the reason is that the ciphertext effectively grows with the number of key rotations, so the ciphertext actually leaks the number of key rotations, and it doesn’t satisfy our definitions. Nevertheless, we’re able to give a nested construction that does satisfy our definitions; it does hide the number of key rotations. And again, there are lots of details in this construction, so I’m going to point you to the paper for how the nested encryption works.

 

So now we get to the main point that I wanted to make, which is comparing the different constructions. So, let’s compare the lattice-based construction with the DDH-based construction and the symmetric nested construction. For the DDH-based construction, we’re going to use the EPRS system as our comparison point.

 

So, you can see that for four-kilobyte message blocks, the lattice-based system is about 300 times faster than the DDH system. And the reason we’re able to get such high throughput is, of course, that lattices are more efficient, but also that we’re able to use the AVX instructions for speedup, and we’ve optimized the ring that we’re using quite a bit, specifically for this purpose. Nevertheless, when we compare it to the symmetric system, we see that the symmetric system is still an order of magnitude faster than even the lattice system. And so, for encryption and re-encryption purposes, the symmetric-based system is the fastest that we have. When we go to larger message blocks, 32-kilobyte message blocks, you see that the benefit of the lattice system over the DDH system is even greater, but the symmetric system performs even better.

 

Now, if you think back to how the symmetric system works, it creates many layers of encryption, and as a result, during decryption we have to decrypt all these layers. So, decryption in the symmetric system takes time linear in the number of re-encryptions. You can see this in this graph, where the time to decrypt increases linearly with the number of re-encryptions, whereas the key homomorphic methods take a constant amount of time to decrypt, no matter how many re-encryptions there are. The crossover point is at about 50 re-encryptions, which is why we said that if, in the lifetime of the ciphertext, we expect fewer than 50 re-encryptions, you might as well use the symmetric nested system; but if you’re doing frequent re-encryptions, let’s say weekly re-encryptions, you might end up with many more than 50, in which case the lattice-based key homomorphic scheme is the best updatable encryption system we have today. So I’m going to stop here, but let me leave you with one open problem, if you’re interested in questions in this area.

 

So, let me say that in our lattice-based construction, because of the noise involved in lattice constructions, it turns out we had to slightly weaken our definition of security to get the security proof to go through. I think it’s an interesting problem to see whether we can build a lattice-based system that’s as efficient as the one we have, but that satisfies our full security definition. Okay, so I’ll stop here, and I’m happy to take any questions.

 

Thank you very much.

dan boneh headshot3

Dan Boneh,
Professor, Applied Cryptography and Computer Security, Stanford University

Your Privacy

When you visit any website, it may store or retrieve information on your browser, mostly in the form of cookies. This information might be about you, your preferences or your device and is mostly used to make the site work as you expect it to. The information does not usually directly identify you, but it can give you a more personalized web experience. Because we respect your right to privacy, you can choose not to allow some types of cookies. Click on the different category headings to find out more and change our default settings. However, blocking some types of cookies may impact your experience of the site and the services we are able to offer.