Security Cryptography Whatever
Encrypting Facebook Messenger with Jon Millican and Timothy Buck
Facebook Messenger has finally been end-to-end encrypted, a couple of years after Mark Zuckerberg announced it! Plus Instagram DMs are trialing ephemeral E2EE DMs too! We invited on Jon Millican and Timothy Buck from Meta to discuss this major cross-platform endeavor, and how David Bowie fits into their personal Labyrinth.
Transcript: https://securitycryptographywhatever.com/2023/12/28/e2ee-fb-messenger/
Links:
- https://www.facebook.com/notes/2420600258234172
- https://eprint.iacr.org/2022/1044.pdf
- https://engineering.fb.com/2023/12/06/security/building-end-to-end-security-for-messenger/
- https://www.theverge.com/2023/12/6/23991501/facebook-messenger-default-end-to-end-encryption-meta
- https://www.threads.net/@jonmillican/post/C0kQPAyoFpr
- https://engineering.fb.com/wp-content/uploads/2023/12/MessengerEnd-to-EndEncryptionOverview_12-6-2023.pdf
- https://engineering.fb.com/wp-content/uploads/2023/12/TheLabyrinthEncryptedMessageStorageProtocol_12-6-2023.pdf
- https://engineering.fb.com/2022/03/10/security/code-verify/
- https://chrome.google.com/webstore/detail/code-verify/llohflklppcaghdpehpbklhlfebooeog
"Security Cryptography Whatever" is hosted by Deirdre Connolly (@durumcrustulum), Thomas Ptacek (@tqbf), and David Adrian (@davidcadrian)
Deirdre:Hello, welcome to Security, Cryptography, Whatever. I'm Deirdre.
David:I'm David.
Thomas:Chicago is further away from New Zealand than you are from
Deirdre:And who are you?
Jon:Good to know.
Thomas:This is the, this is the longest, most, this is not the longest, most geographically diverse one we've done so far.
Jon:Disappointing.
Deirdre:That was Thomas, by the way. We have two special guests today. We have Jon Millican. Hi, Jon.
Jon:Hello.
Deirdre:Hi. And we have Timothy Buck. Hi, Tim.
Tim:Hi.
Deirdre:Hey, uh, our special guests are from Meta, and we invited them on today because Facebook Messenger and Instagram DMs, uh, have just released end to end encrypted messaging by default, uh, especially for, you know, Facebook Messenger on all of your platforms, all of your surfaces that Facebook Messenger is generally supported on, is that correct?
Jon:Kind of. Um, so we're, um, we've just announced one to one default end to end encryption is rolling out on Facebook Messenger at this point. Uh, and that, that is across all platforms, uh, although the rollout will, will, uh, take a little while to, to reach completion. On Instagram, it's, it's just ephemeral messages so far, but that's coming later.
Deirdre:That's pretty cool, and I want to get to that in a little bit, but let's, let's go to the big one first, which is, there was news like a couple of years ago about trying to make all of Meta's messaging channels, which is Facebook Messenger, Instagram DMs, and of course, WhatsApp messaging. And I think there's, is there another one? There's, you have Threads now, but let's not talk about Threads. They don't have DMs. They just are Instagram, but whatever. Um, and try to make all of that end to end encrypted. And it seems like that was a big lift, and you've kind of gotten over the biggest hump of that lift, which is Facebook Messenger. Mostly because Facebook Messenger is supported everywhere where Facebook is supported, which is, uh, on mobile, and on web, and it's been on web the longest, I think. And so the fact that you've gotten it rolled out, period, is a big accomplishment. Can you kind of give us a high level of what you had to address to actually get this rolled out?
Tim:Well, I think we can maybe do a two part answer. I can give some context from the product perspective, and then Jon can get into a little bit more details of the technical hurdles that we had. Um, so as you said, Facebook Messenger is all over the place, right? We have several different web places, facebook.com, messenger.com, um, we have the mobile apps and not just, you know, there's the Messenger mobile apps on iOS and Android, and there's also the Facebook mobile apps on iOS and Android. And then there are desktop apps as well, Windows and Mac, um, it's available in like Oculus and other places, right? So,
Deirdre:Oh!
Tim:it's, it's all over the place, right? A whole bunch of different apps that exist, uh, and a very large number of people use these. Um, even things that, like, I mean, I, I personally am not a huge, like, Facebook.com user. There's a very large number of people who use Facebook.com and message on it all the time. And this is not just messaging, but it's also calling that is end to end encrypted. And there's a large number of features that all have to be rebuilt, uh, within that. So people have come to expect, uh, a large number of features in Messenger across messaging and calling. Well over a hundred. And each of those had to be redesigned and rebuilt, obviously. Uh, originally they were very server centric. A lot of the logic was on the server. And that doesn't work anymore, right? Because we don't, we don't have, uh, you know, access to that information. Um, and so those, those needed to be rebuilt on every single, uh, surface. Uh, and done in a way that had the quality that users expect. Um, so it's a very, very large effort. Um, and that's just rebuilding the things that people expect. But encryption obviously also comes with things like backups and keys and other, like, encryption specific things that, um, we needed to build and make easy to use and, uh, educate people about across all of those services, across all of those languages, and across all of those cultures. Um, so, yes, a very large project, but we're really excited that this is now going out as the default, uh, for those billions of people.
Deirdre:Hell yeah. Like, so this is definitely in the billions. I think WhatsApp is in 2 billion, 3 billion user accounts and many more
Tim:yeah, I don't know what numbers we can say, but they're very large, yes.
Jon:Yeah. Yeah. Billions.
Deirdre:So Meta, Meta, Meta once again with either the biggest or second biggest end to end encryption rollout, all in one fell swoop. Uh, whichever, whichever it ranks with whenever. I'm pretty sure WhatsApp when it turned on end to end encryption by default was smaller than whatever, whatever the numbers are turning on now.
Jon:That's very
Deirdre:Uh, so that's huge. That's huge for
Tim:It would also have been less complicated, right? Because it was mostly just mobile.
Jon:Yeah. And actually beyond that, because, um, WhatsApp was always designed sort of with its features working primarily in a device to device, uh, function. Oh, well, device to device mechanism. Whereas with Messenger, there's so much server side functionality, like the bytes you upload are not the bytes which are received. In WhatsApp, it largely was. And it makes a huge difference. Yeah.
Deirdre:Absolutely. Uh, one thing that strikes me as something that you'll have to adapt, like if this is using Signal protocol under the hood for device to device or at least one to one, um, uh, handling, I know that there's descriptions of groups, uh, in your white paper, but the, the standard Signal protocol kind of assumes that your identity is bound to a device bound identity key. Uh, whereas Facebook Messenger definitely was not designed that way. You log, you still log in with, uh, Facebook, or just Facebook Messenger, uh, username and password, or whatever. However you authenticate with accounts.facebook.com or whatever, and then once you've done that you can do whatever you want. Can you talk a little bit about how you adapted for that sort of identity management system?
Jon:Totally, yeah. So, with the sort of constraint as you described it, and this desire that comes with most encrypted systems, uh, to be able to identify the endpoints in question at any given moment, we basically had to treat each endpoint as its own, almost, identity. So you have an identity key per device, uh, and they're added and removed and users can see as they're being added and removed and check the whole list. And yeah, as you say, you can, you can log in, uh, anywhere. You can do pretty much anything apart from read any messages that were sent to you before that moment where you logged in. And this is... right, and that's where the, uh, uh, our new model of message storage, some call it a backup, some call it secure storage, but our encrypted storage system essentially, um, comes in.
Deirdre:And this is Labyrinth?
Jon:This is Labyrinth, exactly, yeah. And so, so the context is that historically, uh, Messenger has always allowed you to log in anywhere and just access everything. You can scroll back through all of your history, you can send and receive new messages, etc. But, being able to guarantee that someone can do that wherever their Facebook account is available essentially means that Facebook has to have access to that message. And that's obviously counter to the goal of Facebook not having access to your messages, which is sort of our approximate high level statement around end to end encryption. Not quite sure what the exact wording is that we generally use.
Deirdre:Mm
Jon:So, that essentially means we now need to store the messages for the user in an encrypted manner, if they are to access them on another device. Or we're going to store them off of our own properties, and I'll address that part in a sec. But, yeah, so if we're storing them in an encrypted manner, then much like with a lot of cryptographic solutions, what we've done is we've transformed one problem into a key management problem. And so it's now the user has to have a way to transfer those keys when they access a new device. And so that's really what Labyrinth does, is it gives us a way to store, store it all for the user, but then we have our recovery mechanisms, which we need in place. Um, to, to, to briefly address that sort of point I mentioned around, well, why not store it off, uh, off of our own platforms, which is what some other systems do, uh, most notably WhatsApp, or at least most notably to me, maybe, maybe that's just as a Meta employee, but, um, the difference here is really the, the constraints that we're working in, where, with WhatsApp, you're storing all of your messages locally on your device, and just backing them up periodically. And you are primarily located on a single device, and then the multi device model sort of extends out from there. With us, we, A, haven't, don't have a history of requiring users to store a large amount of data on their messaging clients. Um, this is I guess particularly notable in the case of web ones, where it may not even be possible. Uh, but even on mobile devices, if we suddenly ask someone to store an additional few gigabytes, that's not always going to go down well.
Deirdre:Mm hmm.
Jon:So that was really where this came from, and this is why we don't really call it a backup, we call it a secure storage system, because the idea is that you only need to keep a very limited amount of data on the device, and you can page it in as and when it's necessary.
Deirdre:So this is interesting because it's, when I think of, oh, I'm logging in on a new device, or I, I say I have an identity system, I'm like, oh, I'm in, I have a new phone, I'm going to log in to Facebook with my new phone, and I want to get, I don't know, whatever the most recent cache of encrypted, unreadable by Facebook servers, uh, Facebook Messenger messages that are addressed to me or addressed to my account ID, not necessarily my device ID, but I, I could see a system that basically lets me get the last day of, you know, kind of cached in the pipeline messages to be delivered, and then I get the last day's worth of messages, you know, up to a certain number or a certain size, blah, blah, blah, blah, blah, and like, I have enough context to just sort of, uh, get up to speed on my new device, and then new messages that are addressed to my ID, including my newly enrolled device to my ID, will also be sent to my device. I can understand that at Facebook's scale, this might not work. Because even if you cached us like a max cap of encrypted messages for every device and for every user ID, that's literally billions and billions and billions and billions of just blobs sitting there waiting to be downloaded and, you know, it just doesn't work. Um, can you tell us a little bit more about how Labyrinth works and how that both helps with, like, main message history across devices and also helps do it at Facebook Messenger scale?
Thomas:I'm gonna, I'm gonna butt in before you do that, right? Cause I wanna make sure I've got like, I wanna make sure I follow where we're at, right? So like, I get like, I get roughly what Labyrinth is, is, is going to do, is, is aiming to do here. You're still running Signal protocol for device to device stuff. So for direct messaging, you're still running, like, a direct, secure messaging protocol. And, like, Labyrinth is a high fidelity record of all the messages that have been sent, so that you can bring up new devices and have a complete message history. And you're doing that because it's the UX expectation for everybody that uses Facebook Messenger. And I am one of those people. So, I guess I wanna, I have two, I have two questions, right? So first of all, did you guys, this is the simpler question. Did you... are you going to Labyrinth all the previous messages that were sent?
Jon:That is an interesting question. I don't think we have a clear, um, plan on that one yet, but it would be, uh, a lot of work to do.
Thomas:Yeah, okay. A broader question I have for you, right, is you have a high fidelity, quick to look up, sorta kinda forward ratcheting message store. Why have the signal protocol part?
Jon:Yeah, that's a really good question, actually. Um, and this is, this is perhaps partly due to precedent and partly due to the complexity of... so, I mean, the precedent is we were already using Signal Protocol in Messenger, in Secret Conversations, albeit at a small scale, and in WhatsApp, obviously at a very large scale. On the other hand, developing a new protocol for end to end encryption and storage is actually a lot more complex. I mean, we've, we've, we've seen, um, this exercise of developing a new protocol just for end to end at the IETF recently. We were involved in that with MLS and it's, it's a lot of work and it takes a lot to make sure you've got it right. Whereas Signal's out there, we've already been using it and we have proofs, uh, from academia about a bunch of properties that it, it achieves.
Tim:I think there's an interesting like user control question in there as well, which is like, we don't require you to have the secure storage turned on. If you don't want to have this turned on, you can just, you know, not set up, not set it up, basically. Um, and, um, so users have more control, and then they can just use the, the Signal, uh, uh, protocol experience.
Thomas:That's a great answer. So, is that surfaced in the system? Or are you guys planning to surface it so that, like, both counterparties have to agree to have their messages stored?
Jon:It's stored per user. Hm.
Tim:Yeah, yeah, yeah. So, so the way that this works is when you are, you know, basically going to be transitioned to encryption, um, you'll have an, uh, an introduction, and as part of this introduction, you're asked to set up secure storage. And so what you need to do in that process is to choose a key that Meta doesn't have access to. Um, and, you know, there's several different options that we give people. Um, one, um, is a PIN. Um, when you set that PIN, um, Jon can get into more details of how we do PINs with HSMs and a bunch of really interesting stuff behind the scenes there to make sure we don't actually have access to that PIN. But you can create a PIN, um, that you can then reuse across devices to get back in. Um, you can also do things like store, uh, the key, um, with a third party. If you want to store your key in, you know, your Apple backups or in your Google Drive or some other place, you can choose to do that. Um, or if you are really hardcore about it, you can just literally have a 40 digit code and you, like, write it down somewhere or, like, memorize it and there's, like, no PIN, no anything else. You're using a 40 digit code to, to get back in to that message history. Um, and, or you can say, no, like, I don't want this to be on and, and for, for your messages, they, they will not be stored in secure storage.
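[Editor's sketch] To give a feel for the 40 digit code option Tim mentions, here is a minimal Python sketch of generating such a code from a cryptographically secure random source. The function names, the grouping into blocks of four, and the idea that the code is simply 40 uniformly random decimal digits are all assumptions for illustration; the conversation does not describe how Messenger actually derives or formats the code.

```python
import math
import secrets

CODE_DIGITS = 40  # the length mentioned in the conversation


def generate_recovery_code(digits: int = CODE_DIGITS) -> str:
    """Return `digits` uniformly random decimal digits (hypothetical format)."""
    return "".join(secrets.choice("0123456789") for _ in range(digits))


def format_for_display(code: str, group: int = 4) -> str:
    """Split the code into space-separated groups so it is easier to write down."""
    return " ".join(code[i:i + group] for i in range(0, len(code), group))


def entropy_bits(digits: int = CODE_DIGITS) -> float:
    """Entropy of a uniformly random decimal string of the given length."""
    return digits * math.log2(10)


code = generate_recovery_code()
print(format_for_display(code))
print(f"about {entropy_bits():.1f} bits of entropy")
```

Under the uniform-digits assumption, 40 decimal digits carry roughly 132.9 bits of entropy, which is why a code like this can stand in for a key directly, with no PIN or HSM involved.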
Deirdre:hmm, mm hmm,
Thomas:So that's, you're already describing what's probably the most cryptographically kind of complex user experience of any major thing that users use. It's not just the largest deployment. You're also doing more with kind of the cryptographic user experience than I've ever heard of before.
Deirdre:mm.
Jon:Yeah, we're trying to simplify it so it doesn't feel like you're doing key management, but yeah, under the hood you basically are.
Thomas:I mean, we're happy about it,
Deirdre:Ha ha ha!
Thomas:You should, you should go further in the other direction.
Deirdre:Um,
Jon:back PGP, right? Mm
Deirdre:to new, complicated cryptography. Uh, part of Labyrinth involves a new kind of primitive under the hood, Oblivious Revocable Functions, which was very cool because I think I saw the ePrint just float across my radar a couple of weeks ago, and I was like, oh, that, what are they doing with that? I'm not really sure. And it's, it's basically a new way to provide better unlinkability between the encrypted attachments, uh, in your message bodies from Facebook servers. Uh, can you tell us a little bit about that?
Jon:Sure. Yeah, so when we came up with this and when we were designing Labyrinth, we essentially realized that there were certain properties which were just going to be a bit stronger in terms of privacy if Meta was not able to make the link between certain things in storage, but we still need the ability to index into them, or authenticate access to them, so that when someone actually does try to read them, we know what's happening and can verify at that moment in time, and so we were sort of spitballing this and like, well, we could have a service that performs this mapping or whatever. And then someone was just like, could we do this with fun cryptography, Jon? And a couple of days later, I'm like, I feel like you might be able to. Um, then some of the other, but far better cryptographers came and actually helped make it secure. And so it was really just, uh, we had a need, and we, specifically with attachments, because these are... when these are transmitted, they're uploaded once, and then shared between the members of that thread. So that provides very direct linkage between the content in multiple people's mailboxes.
Deirdre:Right.
Jon:And if we were able to not have that, that just felt slightly better.
Deirdre:Yeah.
Jon:And this was essentially where that came from. Um,
Deirdre:Uh, I need to read a little bit more about this, but it's kind of like a, it's a way of setting up sort of like a PRF, but with the ability for two different parties to be able to come up with the same outputs, but in like a non obvious way, unless you know the secret or something like that. Um, at least from the outside.
Jon:Yeah, I mean, at its heart, what you have is a, let's see, the input goes into a PRF. It's then used in, um, a, um, an exponentiation, elliptic curve exponentiation, and then goes out, goes through a PRF again. And that, that exponent is consistent between all pairs running an ORF, but each, um, party and each, each pair can split up that, that exponent in different ways, essentially. So it's, it's like there's some, some weird Diffie Hellman going on in the middle, um, and taking, taking advantage of the, the, the properties of the math here, um, that, that allow us to just tweak it each time.
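[Editor's sketch] The PRF, exponentiation, PRF pipeline Jon outlines can be illustrated roughly as follows. This is a toy under loud assumptions, not the construction from the ORF paper: it uses modular exponentiation in a tiny prime-order subgroup instead of an elliptic curve, HMAC-SHA256 as the PRF, and hypothetical function names and key handling. It only shows that two client/server pairs holding different splits of the same exponent land on the same output.

```python
import hashlib
import hmac
import secrets

# Toy group (assumption): P = 2*Q + 1 is a tiny safe prime, so the squares
# modulo P form a subgroup of prime order Q. Not secure; illustration only.
P, Q = 2039, 1019


def prf(key: bytes, data: bytes) -> bytes:
    """HMAC-SHA256 as a stand-in PRF."""
    return hmac.new(key, data, hashlib.sha256).digest()


def to_group(digest: bytes) -> int:
    """Map a digest to a non-identity element of the order-Q subgroup."""
    base = int.from_bytes(digest, "big") % (P - 3) + 2  # value in [2, P-2]
    return pow(base, 2, P)


def split_exponent(exponent: int) -> tuple[int, int]:
    """Split `exponent` into two shares whose product is `exponent` mod Q."""
    client_share = secrets.randbelow(Q - 1) + 1
    server_share = exponent * pow(client_share, -1, Q) % Q
    return client_share, server_share


def evaluate(label: bytes, key_in: bytes, key_out: bytes,
             client_share: int, server_share: int) -> bytes:
    element = to_group(prf(key_in, label))        # PRF on the input
    partial = pow(element, client_share, P)       # client's part of the exponent
    full = pow(partial, server_share, P)          # server's part of the exponent
    return prf(key_out, full.to_bytes(2, "big"))  # PRF on the way out


key_in, key_out = secrets.token_bytes(32), secrets.token_bytes(32)
exponent = secrets.randbelow(Q - 1) + 1
pair_a = split_exponent(exponent)  # one device/server pair
pair_b = split_exponent(exponent)  # another pair, split differently
out_a = evaluate(b"attachment-123", key_in, key_out, *pair_a)
out_b = evaluate(b"attachment-123", key_in, key_out, *pair_b)
print(out_a == out_b)
```

Because the group has prime order, every nonzero share is invertible, which is what lets each pair choose its own split without changing the combined exponent.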
Deirdre:Cute. Uh, I saw a note somewhere deep in Labyrinth that was, the client oblivious, uh, oblivious revocable functions are currently accessible to the server, they're only used today for strong attachment unlinkability, uh, and, and so on and so on. Um, any plans of splitting these up somehow? Is there sort of like a migration path forward to get even the, even better properties there?
Jon:Yeah, great questions. So, essentially, the history of this, and, uh, I gave a talk on Labyrinth, actually, at a workshop at Crypto in August last year, and that was a much more ambitious version, actually, which was using the ORF in a bunch more places for greater unlinkability, and what we realized was that this thing was so difficult to debug, and we didn't yet know the properties of the system, that if we were going to try and go down that path, we would really struggle to build a good experience that would actually be appropriate to replace the existing one. So there was, there was a moment where we, we said, right, I think we're going to have to basically break the secrecy of this ORF at the moment. Check that all of our use cases that we're using it for are okay. Just for that, that particular primitive, and it wasn't, we weren't using it for any, protecting any content or anything. And then, make a plan to hopefully reintroduce it in certain places in future. So, particularly for attachments, we, we are hoping to, to bring back that secrecy, but that's going to involve a complex migration, some new client, some client is going to have to generate a new set of secrets, distribute it across all the other clients in Labyrinth, and then, somehow make that sufficiently reliable. I'm sure we'll get there.
Thomas:Can I, uh, can I take a stab at trying to understand, like, maybe one more iota of ORF than I do right now? So, like, I read the preprint or whatever. I got to the point where as I skimmed, then I got to the point where there was, like, a game and an adversary, and it's like, this is the part of the paper that I normally skip until I can find the actual construction, and I didn't find it, and I fell off the end of the paper. So, my understanding of what you're going for here is, like, if you chose a simpler setting for this, you can think of like, you know, a file server, right, and you've got clients of the file server. They're uploading files. The server is just there to give the bytes back, right? The clients push the file up and they want the file back. The server's got no business knowing what the file name is, right? So, like, the idea here is that when we're talking about linkability, we're talking about essentially the file name. And what we've got is a client and a server who are somehow agreeing on a label for a file that is cryptographically random, right? Um, but there isn't a single static mapping between the file, the filename, and the label that we're using on the server. Both because, and you can stop me when I become wrong here, but the two complicating things you have here are, number one, you've got multiple devices, and they don't all share a key. This would be trivial if you just like, you know, the filename is the HMAC of a client secret, right? But you can't do that here, right? And the other problem is, you want to revoke, or roll, or ratchet that secret forward in some way? And the part that breaks my brain, like, I can sort of work out, like, I have a sort of intuition for how, like, Diffie Hellman magic happens here, of how we would do the former part, where we just have a bunch of people agreeing on a secret label. The part that breaks my brain is, you can somehow ratchet this forward and land at the same label? Sort
Jon:Yeah, okay, this is a really good point, and, uh, I should have mentioned before one of the literal three words in the name of this thing, revocable. I really only spoke to the oblivious function part. But, yes, so, the intuition here is that, I guess, P to the XY equals P to the XZ times Y over Z, and that, that's all we do. So when we add a new client, what we have is the existing client, which knows X, and the server knows that Y corresponds to that client. This client, um, computes a new Z, reports that to the server, the server computes Y over Z, stores that, the client then transmits XZ over to the new one, and then that new client does the same process again, um, with, I shouldn't have started with X, Y, and Z, it generates a Q, so that new client then ends up with XZQ, and the server ends up with Y over ZQ, and that's it. At that point, the multiplier and divisor cancel each other out within the exponent, and so overall they're all just computing P to the XY. Does that make sense?
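[Editor's note] Written out, the cancellation Jon describes looks like the following. This follows the spoken explanation only; reading "Y over Z" as multiplication by the inverse of Z modulo the group order is an assumption about how the division would be realized, not something stated in the conversation.

```latex
\[
P^{xy}
\;=\; P^{(xz)\cdot\frac{y}{z}}
\;=\; P^{(xzq)\cdot\frac{y}{zq}}
\]
% first device holds x,    server holds y
% new device receives xz,  server holds y/z
% after re-blinding: xzq,  server holds y/(zq)
```

Each device/server pair holds a different factorization, but the product of the two shares is always $xy$, so every pair computes the same group element.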
Thomas:of? Like, in my head right now, I've got, like, kind of similar vibe to, like, blinding. Right?
Jon:I think so, yeah.
Thomas:yeah, okay, we'll edit this and like, before you started talking there, we'll just have a thing where like, everyone get out your piece of paper and pencil and just start writing,
Jon:Sounds great. Um, yeah. And I guess the other thing I should mention here is because we're doing all of this math, what's the point? Um, the idea is that at the moment one of these devices is removed, and because the setting that Messenger works in here is very highly multi device, we're adding devices all the time, you can remove a device and revoke one at any time. At that moment, assuming the server is instantaneously, I guess, honest but curious, the server will delete its, its Y corresponding to that client, whether that be Y or Y over Z or Y over ZQ or whatever we land on. And at that point, whatever secret the client stores is completely meaningless, and it can't be combined with any of the other ones, because if, if Q's disappeared from, from the entire universe, then it's not useful anymore.
Deirdre:And this is, like, logging in on a new browser on Facebook.com is considered a new device enrollment. So you need to be able to handle this case, like, you could basically scale it up to every single time someone uses Facebook Messenger. They're enrolling a new device and they need to be able to revoke and enroll and set this all up and be able to, to, to pull down these more unlinkable, uh, message attachments and, and all that sort of
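[Editor's sketch] The enroll-then-revoke lifecycle can be sketched in Python as below. As before, this is a toy: modular arithmetic in a tiny prime-order subgroup stands in for the elliptic curve, and the `Server` class, its method names, and the device names are hypothetical. It shows that a newly enrolled device reaches the same value as the original one, and that once the server deletes its share for a device, that device's secret is useless on its own.

```python
import secrets

# Toy group (assumption): order-Q subgroup of the integers modulo the safe
# prime P = 2*Q + 1. H is a fixed element of that subgroup standing in for
# the hashed input. Not secure; illustration only.
P, Q = 2039, 1019
H = 4


def random_scalar() -> int:
    """Random nonzero exponent, invertible modulo the prime Q."""
    return secrets.randbelow(Q - 1) + 1


class Server:
    def __init__(self) -> None:
        self.shares: dict[str, int] = {}  # device name -> server-side share

    def enroll(self, existing: str, new: str, blind: int) -> None:
        """Store (existing share) / blind as the share for the new device."""
        self.shares[new] = self.shares[existing] * pow(blind, -1, Q) % Q

    def revoke(self, device: str) -> None:
        """Delete the share; the device's own secret no longer combines with anything."""
        self.shares.pop(device, None)

    def finish(self, device: str, partial: int) -> int:
        if device not in self.shares:
            raise PermissionError(f"{device} has been revoked")
        return pow(partial, self.shares[device], P)


server = Server()
x, y = random_scalar(), random_scalar()
server.shares["phone"] = y                  # first device: client x, server y
target = pow(H, x * y % Q, P)               # the value every pair should reach

z = random_scalar()                         # phone picks a blinding factor
server.enroll("phone", "laptop", z)         # server now holds y/z for the laptop
laptop_secret = x * z % Q                   # phone hands x*z to the laptop

same_before = server.finish("laptop", pow(H, laptop_secret, P)) == target

server.revoke("laptop")
try:
    server.finish("laptop", pow(H, laptop_secret, P))
    laptop_locked_out = False
except PermissionError:
    laptop_locked_out = True

phone_still_works = server.finish("phone", pow(H, x, P)) == target
print(same_before, laptop_locked_out, phone_still_works)
```

The sketch leaves out the second re-blinding step (the Q in the conversation), where the new device multiplies its own secret by a fresh factor and has the server divide its share by the same factor.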
Jon:Right, yeah. And this is just in the
Tim:not, it's not quite every single time. yeah,
Deirdre:No, but, like, I could conceivably be that user that only does Facebook.com, uh, logins, and I have a fresh, uh, you know, incognito mode every time, and
David:Phone only used once, Deirdre over here only opening up facebook.com on an incognito browser.
Tim:No, but you're, you're right. That, that is a use case that is handled by this for sure.
Deirdre:Okay, cool.
David:And then if you do that though, you, you still do get the history in the web browser, um, assuming that you've opted into, to labyrinth by creating a pin or memorizing a 40 digit
Tim:Yeah, so you'd have to do two things, right? You need to log in, right? And then after you log in, you need to provide your key, whether that be the PIN or that be some other key that you have set up. Um, and once those two things happen, then you also
Thomas:which is like a super interesting product decision, right? Like, this is what I was thinking about before when I was saying how cryptographically sophisticated this is, like, you'd think like a normal product manager input to someone, this would be like, the user logs in, they get everything. Like, the login is the thing that authenticates it here, and you've deliberately separated those things, and I imagine that of the X billion people that use Facebook, Y billion of them are going to be mad at you about this.
Jon:Hopefully Y will be smaller than 1 in that instance,
Tim:likely be a
Jon:that's I'm sure there'll be a subset, yes.
Tim:Yeah.
Thomas:I
Tim:Jon, you should talk about your triangle here.
Jon:Yeah, exactly. So you've got three desirable properties, and I keep on writing this down and then forgetting exactly what they all are. But, um, it's essentially, one of them is you can log into the network, wherever, and use, uh, use most of the functionality of the network. Two is that wherever you can use all of the functionality of the network, you can also send and receive messages. And three is, if you are able to send and receive messages, you also always have your history available to you. And it's just one of these classic pick two situations, once the network does not have the messages for you. And, for us, it was, the, the, yeah, the decision we took was it was non negotiable that you can continue logging into Facebook whenever you want to with your username and password and that if you do so, messaging will function, at least for new messages, and then that just naturally meant we, we have to sacrifice that guarantee that your message history is always available.
Tim:We have done quite a bit to make the message history experience less painful and easier to use for users. That's why there are a series of different options, like, you know, storing the key with iCloud or Google or having a PIN that's easily remembered and things like that.
Thomas:I think it sounds like I'm dunking on you, and it's the opposite thing, right? It's like, the cynical view of all this stuff is like you're a big tech company, and whatever encryption, whatever end to end stuff you're doing is all performative. Right? And, like, one, I think people should pay more attention to this, but, like, one sign that you might be looking at something performative is if it's extraordinarily usable under all possible circumstances, and here you've surfaced a trade off, right? Like, there's an explicit concession to this, and you wouldn't have done it unless the, like, I mean, I don't know, the counter cynic in me would say you wouldn't have exposed that, that you wouldn't have that somewhat rough edge there, um, if there wasn't an actual privacy reason for it, which I think is, it's pretty neat to see that happening.
Jon:Yeah, thank you. So certainly lots of things we've done for privacy reasons here, definitely.
Thomas:Can we talk about epochs? Explain epochs.
Jon:Sure, okay, so this is all going back to that notion of device revocation in Labyrinth. Also, it kind of relates to your earlier question around, well, are we going to retroactively apply Labyrinth to all historical content? So, the ideal situation is that any of your message history is only available to The devices which you have currently have authorized on your account, but the the reality is I mean we can prevent it being available But decryptable should we say but that trade off is that whenever a device is removed from your account you're then going to have to re encrypt everything and That could be vast amounts of data And so the trade off we went for is saying what we want to make sure is that messages are only decryptable by devices which at one point had access to them. And of course, there's other engineering going into making sure that dices shouldn't be keeping around the keys and shouldn't have access. Like access control doesn't go out of the window just because you're encrypting things. But yes, so what's happening in Labyrinth is that means And a lot of the complexity in the protocol comes from this, that when you remove a device, we will be generating new entropy that gets mixed into the, the secrecy, essentially, it's almost like a hash ratchet, um, within the centre of the protocol. That then gets distributed to all other clients and then they can use that to, use that moving forward to encrypt future messages stored into Labyrinth. We chose to use essentially an HPKE for distributing that ebook entropy. And in a world where everything is ratcheted, that, that, that could seem like an odd choice, but essentially the, the justification here was really around reliability in that we need this to work for, You may have generated a recovery code which, within the protocol, acts exactly like a device. That might not be used for, you know, a year or more, then use it to recover. 
And if you have a hash ratchet, which is broken at some point, you're kind of stuck. So we needed to use something that we knew was going to work in one shot, and so we went with HPKE there. Mm
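The epoch rotation Jon describes can be sketched roughly as follows. This is a minimal illustration, not Labyrinth's actual construction: the function names, the labels, and the choice of HKDF with SHA-256 are all assumptions, and the HPKE step that delivers the fresh entropy to the remaining devices is left out entirely.

```python
import hashlib
import hmac
import secrets


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869) with SHA-256."""
    return hmac.new(salt, ikm, hashlib.sha256).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF-Expand (RFC 5869) with SHA-256."""
    out, block, counter = b"", b"", 1
    while len(out) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        out += block
        counter += 1
    return out[:length]


def next_epoch_secret(current_secret: bytes, fresh_entropy: bytes) -> bytes:
    """Mix fresh entropy into the current epoch secret (hypothetical label)."""
    prk = hkdf_extract(salt=current_secret, ikm=fresh_entropy)
    return hkdf_expand(prk, b"sketch epoch secret")


def message_key(epoch_secret: bytes, message_id: bytes) -> bytes:
    """Derive a per-message storage key from an epoch secret (hypothetical label)."""
    return hkdf_expand(epoch_secret, b"sketch message key" + message_id)


epoch0 = secrets.token_bytes(32)

# A device is revoked: a remaining device picks fresh entropy and rotates.
# In the protocol described above, `fresh` would be sent to each remaining
# device (and to recovery codes, which act like devices) in one shot.
fresh = secrets.token_bytes(32)
epoch1 = next_epoch_secret(epoch0, fresh)

# The revoked device still knows epoch0, but without `fresh` it can only guess.
revoked_guess = next_epoch_secret(epoch0, secrets.token_bytes(32))
```

Because the new secret depends on both the old secret and the fresh entropy, old messages stay readable to devices that once had access, while a removed device cannot follow the rotation.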
Deirdre:What's very interesting about, this is basically all within Labyrinth right now. This is like after, basically, you, you have things getting delivered to enrolled devices via the signal
Jon:Right.
Deirdre:And the fact that you have epochs of enrolled device keys and root entropy and HPKE, I'm just like, hmm, this is looking a lot more like MLS and TreeKEM and a lot of the things that you might see in that collection of documents. Is this sort of like a happy accident, or sort of, you know, mutual pollination of ideas, or sort of, what do you think
Jon:I'd probably say mutual pollination of ideas here. I mean, I was, I was closely involved in particularly the early stages of MLS. Um, and so the notion of, hey, something's changed in the devices. We're in a new epoch. That terminology maps directly into Labyrinth. Um, HPKE less so. That was more just, it was the right tool for the job. But yeah, there's a hell of a lot of cross pollination within end to end encryption.
Deirdre:Like, explicitly, MLS is trying to be scalable for very large groups. It is very large in that you have many, many, many accounts. You may have many devices per account. But not the same way that MLS is trying to scale to tens of thousands of members in a single group and forward every add, every remove, every rekey, you know, blah, blah, blah. Like, that's how they do epochs and stuff like that. But it seems like there's a convergent evolution of, like, multiple devices, uh, incoming, outgoing, ratcheting, uh, and that sort of thing. So even if it's, like, uh, slightly different domain applications, there seems to be a convergence of, like, how these things are getting designed. Even if it's a sort of storage versus, like, full on management of the actual message delivery, which MLS is trying to do.
Jon:Yeah, I mean, I think a lot of the problems that they're solving are quite similar, and so I think some of it is honestly coincidental, but similar solutions were needed.
Deirdre:Yeah. Um, I wanted to ask, uh, I saw a single line that said, we found a thing during formal verification of Labyrinth. And I grepped and I grepped and I grepped and I couldn't find any more information about formal verification of Labyrinth. What did you do? What did you find? Tell me more, please.
Jon:Sure. Yeah. So this was actually more on the earlier iteration of Labyrinth than the later one. But we were careful when we removed, uh, a certain aspect of it that, uh, stuff should still map the same. But yes, so there, we worked with Karthik Bhargavan, um, to basically get some, I think it was symbolic model, uh, proofs around our claims. And I think we originally had five or six claims we were aiming for in Labyrinth. I think we've reduced that down to two or three security claims. Um, but yeah, I can't remember the exact results, but we tried to highlight them in the white paper, like certain things. For example, the equivalence with protecting everything just under a single symmetric key. Uh, that was like our top line. If we can't say that, then we've got a big problem. And then we got proofs around, uh, what was happening in epochs, epoch rotations subject to certain conditions, such as, if a collusion occurs at a certain point, and you restore from an earlier epoch and roll forward, that creates more attack risks than if you're restoring from the absolute latest epoch. But yeah, no, it was great to have that formal verification, and again, if you're looking at sort of cross pollination, having worked on MLS, and all the formal verification going on there, it was certainly an obvious thing to look at.
Deirdre:Uh, I love that. Did you do some of that work before you had designed the oblivious revocable function stuff? Or did you have to model that, the symbolic thing?
Jon:Um, we, we did this after the Oblivious Provocable Function. Man, I can't even say it. Um, if I remember correctly, uh, Karthik had to make certain assumptions about that because our proofs, I believe, were in the computational model when we did the ORF, so it didn't directly map, but, uh, but yeah, under those assumptions that he made it, it seemed to work.
Deirdre:Okay. I guess you can model, you can kind of hand wave over the specific niceties of their language. You'll compute the same thing, but they're revocable. Like, yeah, you can just kind of hand wave that they will always agree and, you know, things like that. And then you can get all these other things to be proven or something like that. Cool. All right. So, one innovation that Facebook Messenger Secret Conversations specifically, uh, brought to encrypted messaging was message franking. Um, it looks like that is fully present. Uh, any changes there? I didn't see anything in the white paper that indicated that anything changed there, but do you want to talk about that a little bit?
Jon:Sure, I mean, that one will be fairly quick. I'm pretty sure we didn't really change it much. Uh, we re-implemented it for the new stack, but, um, aside from that, it's essentially the same idea, exactly the same motivations. We want to make sure that, uh, people are able to report messages, and that it's very hard to actually, um, spoof a report, and, you know, particularly with certain types of content, the impact of just being accused of having shared it can be very great, so we wanted to be relatively confident when reports come in that they're actually authentic to some degree.
Deirdre:Is message franking all on the delivery layer, or does anything with hooking in Labyrinth impact it? I'm thinking mostly from, like, a security analysis perspective of the whole system. It's like, oh, we've designed franking without Labyrinth in mind. Does bringing in Labyrinth have an impact on how you think about either franking or reporting or anything like that for kind of Facebook, uh, messaging as a whole? This is kind of generic, but
Jon:Yeah, that's a really good question. Um, I mean, one of the impacts was we had to make sure that we were storing our payloads in the same format that we were transmitting them. Which isn't necessarily the obvious choice, and that comes with trade offs itself. But, yeah, if you're franking it, you don't want to be deserializing it and then having to reserialize, because that's a recipe for lossiness or stuff going slightly wrong. So that, that was definitely, um, a big impact. And then the, let me think, there was another one we had to think about. So I'm having to page, page it back into my cache. You know what? I've lost it, but there was another interesting one.
Deirdre:No, but that's a really good point, because, like, things that were just sort of easy decisions to make, which is like, you're gonna MAC over the entire ciphertext as it was on the wire, because that's all you had at the time. And now you're like, oh, wait, we can't change this now. Because if you had something franked on the, you know, as the Signal ciphertext, and then, you know, someone tried to, like, bring it up from Labyrinth, and it was completely different, it was, like, decrypted, re-encrypted, or re-encoded, or, you know, and then re-encrypted for Labyrinth, and then you're like, oh, wait, I want to show you something that only exists in Labyrinth now, and you're just sort of, like, you basically lose franking. If you didn't keep, uh, you know, all of your records of the message you had franked in the delivery mechanism, um, it's kind of screwed. You're just sort of like big shrug emoji. So, you know, I'm hoping that there weren't any decisions about, uh, you know, franking over the ciphertext that you regretted when you were deploying Labyrinth or designing Labyrinth, but, uh, you know, what's done is done.
Jon:So there's an interesting point there, which is that this is actually a really good reason to frank the plaintext rather than the ciphertext.
Deirdre:Okay. Yeah, yeah, yeah.
Jon:So you have your franking key, which is, you know, just random or pseudo random, but is hidden from Meta, which means that when you frank the plaintext, you're not really revealing anything. You're just revealing 32 bytes of randomness, or however long our franking tag is. But yeah, because we don't want to keep Signal ciphertexts around, because actually, Signal ciphertexts are not useful to persist, because the keys are constantly changing. It would just add an entire new layer of complexity.
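Franking the plaintext, as Jon describes it, can be sketched along these lines. This is a simplified illustration of the general message franking idea using HMAC-SHA256; the function names, the context string, and the exact inputs to each MAC are assumptions, not Messenger's wire format.

```python
import hashlib
import hmac
import secrets


def frank(plaintext: bytes) -> tuple[bytes, bytes]:
    """Sender: commit to the plaintext under a fresh random franking key.

    The key travels to the recipient inside the encrypted payload; the
    server only ever sees the tag, which looks like 32 random bytes.
    """
    franking_key = secrets.token_bytes(32)
    franking_tag = hmac.new(franking_key, plaintext, hashlib.sha256).digest()
    return franking_key, franking_tag


def server_stamp(server_key: bytes, franking_tag: bytes, context: bytes) -> bytes:
    """Server: bind the opaque tag to delivery context (hypothetical format)."""
    return hmac.new(server_key, franking_tag + context, hashlib.sha256).digest()


def verify_report(server_key: bytes, plaintext: bytes, franking_key: bytes,
                  context: bytes, stamp: bytes) -> bool:
    """Server: check a report in which the recipient reveals plaintext and key."""
    tag = hmac.new(franking_key, plaintext, hashlib.sha256).digest()
    expected = server_stamp(server_key, tag, context)
    return hmac.compare_digest(expected, stamp)


server_key = secrets.token_bytes(32)
context = b"alice|bob|1700000000"  # assumed sender|recipient|timestamp encoding

key, tag = frank(b"hello")
stamp = server_stamp(server_key, tag, context)

genuine = verify_report(server_key, b"hello", key, context, stamp)
spoofed = verify_report(server_key, b"something else", key, context, stamp)
```

Since the tag is computed over the plaintext, the same report still verifies after the message has been re-encrypted for storage, which is the point Jon makes about not needing to keep Signal ciphertexts around.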
Deirdre:Good.
Jon:Um, and I also remembered the other point, actually, with bringing Labyrinth into the system, which is, now that we have, if we're saying, we need to report the last 30 messages, or whatever it is, whatever that reporting window is, you want to make sure that the client already knows which messages it's reporting before it does that. And so there's a bit of an interaction there with the cache, the local cache, to make sure that the server can't just send down a different set of messages from the ones you may have thought you were revealing at the point that it pages them in. And so we did have to think about that. But yeah, we figured it out.
Deirdre:I keep forgetting that franking is thinking about the context, not just of a single message. It's thinking about, you know, messages and possibly after, uh, when you're actually making a report. I'm always like, you MAC a message, and it's like, no, context is also very valid, because you can make a joke, and in context it's fine, but out of context it seems very bad, or something like that, um, and then you report it to Meta, and, you know, you get your account deactivated for a month, or something like that. Uh, yeah, things that you forget about. Okay, one last thing before we pivot. Um, deploying end to end encrypted anything on the web is just a harder thing than end to end encrypted mobile apps or even desktop apps, uh, to a degree. And I know that you kind of did what WhatsApp has kind of done here, which is leveraging Code Verify, uh, I think it's a browser extension, to help, like,
Jon:yeah.
Deirdre:out-of-band check that the version of the web app software that's getting served to you to be one of the ends of your end to end encrypted messenger, so the web app version of Facebook Messenger that's now supporting end to end encryption, is what you are expecting it to be. Can you talk a little bit about that, and especially how you got that working in, like, the big blue Facebook.com app?
Jon:Sure, I can talk a little bit about it. But as I'm sure you can imagine, we had, uh, an entire web team looking at getting this working. Um, who, I'm sure, could wax lyrical about this for days on end, if we asked. Um, but yeah, it's certainly this constraint of, we need to sort of know in advance what code either will or might run in the browser so that you can attest to it and have it covered by the Code Verify extension, that was
Deirdre:Mm.
Jon:very difficult, particularly in the context of Facebook.com. It was, I mean, it was already hard for Messenger.com. Um, but yeah, we had to essentially look at all of the frameworks that we're using, all of the, essentially, developer efficiency tooling that's been built, and some of it established for years. And while I think certain aspects of the site's architecture have actually moved closer to what's useful for this in more recent years, um, there was a bit of that push and pull between where are we tweaking the architecture of the site and where are we just saying, you know what, the Code Verify extension is a lot more complicated than one naively may think. Like, um, I mean, it attests to manifests of JavaScript code which might run, and then it has multiple manifests, because we don't want to be paging in megabytes at a time, um, and most people don't need that. So there's the most common manifest, which is quick to load, and the long tail manifest, which covers everything else. Um, I have a lot of respect for the perseverance of that team who did that. They did amazing work on that.
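The two-tier manifest idea can be sketched like this. It is only an illustration of the lookup pattern Jon describes (a small, quick-to-load manifest backed by a lazily fetched long tail); the class, the hash choice, and the loader callback are hypothetical and do not reflect the Code Verify extension's real code or manifest format, and how the manifests themselves are authenticated is out of scope here.

```python
import hashlib
from typing import Callable, Optional


def digest(source: str) -> str:
    """Hash a script's source text (SHA-256 is an assumption for this sketch)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


class ManifestChecker:
    """Check scripts against a common manifest, then a lazily loaded long tail."""

    def __init__(self, common: set[str], load_long_tail: Callable[[], set[str]]):
        self.common = common
        self._load_long_tail = load_long_tail
        self._long_tail: Optional[set[str]] = None  # fetched only if needed

    def is_expected(self, source: str) -> bool:
        h = digest(source)
        if h in self.common:
            return True  # fast path: most page loads stop here
        if self._long_tail is None:
            self._long_tail = self._load_long_tail()  # pay the cost once
        return h in self._long_tail


fetches = []


def fetch_long_tail() -> set[str]:
    """Stand-in for downloading the larger manifest."""
    fetches.append("long-tail")
    return {digest("rarelyUsedFeature();")}


checker = ManifestChecker({digest("renderInbox();")}, fetch_long_tail)

common_ok = checker.is_expected("renderInbox();")         # no long-tail fetch
rare_ok = checker.is_expected("rarelyUsedFeature();")     # triggers one fetch
unknown_ok = checker.is_expected("exfiltrateMessages();") # not in any manifest
```

Splitting the manifest this way keeps the common case cheap while still covering every script that might run, which matches the "we don't want to be paging in megabytes at a time" concern.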
Deirdre:I can imagine. Cool. Uh, from the web perspective, I know that, like, you've done a ton of work to make this workable, and having an identity management system that is like, you can log in, Facebook log in, and then treating all of these clients as new devices and, like, managing them partially with Labyrinth makes deploying to web like this plausible. Because if you just treated your identity, your web client, as like, here's an identity key, I hope you don't lose it, that's your whole identity, like, that doesn't quite work. So having all of this infrastructure in place, and including Code Verify in it, uh, makes this possible. Makes it f-ing possible. It means that other people can look to you and say that it's possible. Cool,
Jon:exactly. Yeah.
Deirdre:Yeah.
Tim:Well, one cool thing about Code Verify before we move on is that, uh, it works across all of, uh, Meta's apps, right? So it helps verify WhatsApp, facebook.com, messenger.com, Instagram. Um, it's sort of built in a way where there's one plugin that helps you verify the encrypted messaging on the web across all of those different, um, services.
Deirdre:That's so useful. And I know that other people have talked about either trying to fork it or emulate it or something like that and try to get like code verified but completely different other set of web apps or something like that.
Tim:It is open source, right?
Jon:to see
Deirdre:looking at.
Jon:Yeah, it's open source. And, um,
Tim:okay. I
Jon:yeah, I mean, I think in a, um, in a certain way, it would be great sort of in the long run to see something along these lines kind of being standardized, and some sort of, like, web binary transparency, um, et cetera. But, um, yeah, we had to start somewhere.
Deirdre:Um, and really quick, uh, we mentioned Instagram DMs in the beginning and this kind of, this kind of dovetails with all the work that had to be done for like persistent history across like newly logged in devices and all sorts of stuff. Instagram DMs are having ephemeral one on one encrypted DMs in preview now. I completely understand why you just sort of like, cool! Ephemeral, non persistent history, they only live for a little bit of a time, like, easy mode, this is like the easy mode of deploying end to end encryption. Like, is, that, is that basically it for now?
Tim:Yeah, yeah, so I guess the story there is, uh, you know, that's one step forward, right? The end goal here is to bring default end to end encryption, persistent storage, all that stuff, uh, to Instagram as well. But, uh, it's another very, very large, uh, system with its own complexities, its own set of features that have to be rebuilt, its own set of apps that have to be supported. Um, and so, you know, we're kind of taking one step at a time. And, um, it does already have, um, optional end to end encryption, uh, similar, uh, to what we had on Messenger for many years. Um, so if you want, like, a persistent thread that's not disappearing, you can create that. Um, or, uh, like, as you mentioned, the disappearing messages feature is sort of a way to get that into people's normal threads and, uh, allow us to, like, make sure a lot of the pieces of the puzzle work, uh, without having to plug in, you know, the message history and the key setup and PINs and all of that stuff all at the same time. Um, we will get there, uh, but one step at a time.
Deirdre:Yeah, that's awesome. Alright, David. Heh heh
David:Yeah, so you mentioned, you know, the triangle earlier, and you had to introduce, like, this additional PIN or backup system. And I'm curious what that, like, conversation was like with the rest of the organization. Like, how did all of this come about? And how receptive were the teams whose entire goal is usually to reduce the number of clicks it takes to log in to being like, what if we added more clicks to log in?
Tim:Well, well, I guess
Jon:yes.
David:you're willing to. talk about
Tim:To clarify a little bit, right, uh, if you're on, like, let's say, you know, Facebook.com, the primary reason you're there is probably not messaging, right? Like, messaging is one feature among many that people are using on Facebook. And so, um, you can continue to access the vast majority of, like, public Facebook without this additional step. Right? Um, you know, you can use Facebook groups and all this stuff just logging in. Um, and then you can also do messaging, right? Again, without having to re-verify anything with an additional PIN or anything. You can send new messages. Um, and so sometimes that's all you need, right? You log in on a web browser on, I don't know, let's say a computer at a library or a friend's computer or whatever it may be. You don't need the message history in that case. You have full access to Facebook. You have full access to messaging the person you need to message. Um, but if you do want that additional, um, you know, message history, then, you know, we're continuing to reduce the friction to make it easy for you to get back access there. It was obviously a difficult conversation for a lot of different people with competing, uh, goals, but it is helpful when, um, the founder of the company has publicly stated that this is going to happen.
Deirdre:Yes!
Tim:as a, uh, as a forcing
David:my follow up was going to be,
Tim:yeah, for those conversations. And, and, and I think,
David:highest ranked person that, like
Tim:yeah, I mean, yeah, Mark wants encryption, right? And so that's really what it comes down to, like, we all do, right? Like, we want to improve the privacy of your messaging and your calling. We haven't really talked about calling, but we also have encrypted calling, uh, for all of this as well. Um, and when there's sort of been that, like, top level buy in, um, we've got to figure out how to make this, uh, this major technical hurdle happen so that we can improve the privacy of billions of people. Um, there are a lot of trade-off conversations and difficult conversations, but at the end of the day, we want to be able to make this privacy claim and legitimately improve people's privacy in this dramatic way. And so at some points, there are, like, lines in the sand where you say either we're doing this or we're not, right? Like, if we're doing it, then this is what we're doing. And if not, then why are we? Why is this whole project? Right? And so coming back to that conversation, and very, very senior people who were, um, very committed to making this happen, um, was critical, right? Like, it's crucial in a company of the size of Meta.
Deirdre:It definitely, definitely helps when the tippy top of your very large org has said on the record, Hey, we're going to do it. And you could just be like, See previous statement. Are you going to help achieve Mark's goal?
Tim:started with, Mark said, that started with, uh, so many.
Deirdre:Yeah.
David:baby Raze!
Deirdre:Oh my gosh. I feel like I'm going to go like, earworm Mark Zuckerberg now and be like, okay, now we're going to do post quantum engineering.
Thomas:like, there's been like one monumental aligning security statement at some point at every single one of the major techs. Like, it happened at Microsoft after the Summer of Worms, it happened at Google after they got owned up by the whatever thing, uh, it happened with Mark with a particularly potent hit of barbecue sauce, um, I don't know when Apple did it, but it happened, right? It's just interesting that that works.
Deirdre:Not me. I was too young for some of those. To pivot back, I, poor Jon, I say post quantum and he just, like, looks exhausted, but, uh, Signal, uh, Signal Foundation, uh, just released a couple of weeks ago their kind of first stab at making at least Signal Protocol post quantum, and maybe they'll eventually get to the rest of the other features that Signal, the service, uh, supports cryptographically, which is, they made the first handshake for a pairwise conversation, which, uh, the original was called triple Diffie Hellman or extended triple Diffie Hellman. And their post quantum variant is a hybrid called PQXDH. Uh, and it includes, uh, you add some Kyber keys in there and you shove them in the KDF and, yada, yada, yada, uh, and you know, you get a little bit of formal analysis from Karthik et al, and then you make some fixes and, uh, it makes it even better. Is there any discussion of piloting that in end to end encrypted, uh, Meta products?
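The "shove them in the KDF" step of a hybrid handshake can be sketched as below. This is a simplified illustration of combining classical and post-quantum shared secrets, not the PQXDH specification: the random byte strings stand in for real Diffie-Hellman outputs and a real Kyber encapsulation, and the function names and label are hypothetical.

```python
import hashlib
import hmac
import secrets


def kdf(ikm: bytes, info: bytes) -> bytes:
    """Single-block HKDF-SHA256 (RFC 5869) with an all-zero salt."""
    prk = hmac.new(b"\x00" * 32, ikm, hashlib.sha256).digest()
    return hmac.new(prk, info + b"\x01", hashlib.sha256).digest()


def hybrid_session_key(dh_outputs: list[bytes], kem_shared_secret: bytes) -> bytes:
    """Derive one session key from classical and post-quantum secrets together."""
    ikm = b"".join(dh_outputs) + kem_shared_secret
    return kdf(ikm, b"sketch hybrid handshake")


# Placeholders: a real handshake would compute these from key agreement.
dh_outputs = [secrets.token_bytes(32) for _ in range(3)]
kem_secret = secrets.token_bytes(32)

session_key = hybrid_session_key(dh_outputs, kem_secret)

# Model an attacker who later recovers every Diffie-Hellman output but not the
# KEM secret: their best attempt uses a guessed KEM secret.
attacker_key = hybrid_session_key(dh_outputs, secrets.token_bytes(32))
```

The design choice being illustrated is that the session key depends on every input, so recovering only the classical secrets later (the harvest now, decrypt later scenario) is not enough to reconstruct it.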
Jon:So yes, obviously we saw that, um, have been looking at it with interest. Interestingly, I think in our case, something like Labyrinth is actually a much more natural starting point in terms of the post quantum threat. Um, 'cause, um, I mean, with what Signal built, it protects against the harvest now, decrypt later attack. And the dependency of that is the harvesting. Uh, in the case of Labyrinth, that is done by the design of the product, whereas obviously Signal ciphertexts are designed to be ephemeral. And, actually, going back to that question of, oh, why didn't we just design our own protocol which handles delivery and storage
Deirdre:Mm mm
Jon:this wasn't the primary reason, but one actual benefit we get from our design is that Labyrinth, if you sort of ignore all of the epoch rotation, and you sort of say, for a protocol that allows you to add but not remove devices, that's, we think, post quantum secure, um, in that threat
Deirdre:mm-Hmm.
Jon:because it's got that basis of symmetric cryptography. So you've got this one key, you need to use it to go forward or back. Well, it's not necessarily one key, it's rotating, but if you get one of them, you can go forward and back. But you can crack all the asymmetric cryptography that you like, and you're still not going to get into the core of the Labyrinth protocol there. So that was sort of where we were thinking. But, I mean, I say you can crack all the asymmetric cryptography you like; we've got a few pieces of Labyrinth-adjacent, uh, or internal asymmetric repertoire. We've got the HPKE, which is potentially a natural target for thinking about that, um. You then also have two of the ways in which you can add a device in Labyrinth. One of them being using these HSMs, and so we'd need the negotiation with the HSM itself to be post quantum secure. Currently we're using OPAQUE, so that wouldn't be
Deirdre:Mm-Hmm.
Jon:And then also, if you're doing a direct device add, then we use a CPace key exchange, which again is based on classical primitives. So I think experimenting with Signal, uh, post quantum is definitely interesting. I'm thinking more about Labyrinth at the moment, uh, for that reason, but I'm sure we will be, you know, looking across the board, as all big tech companies are, um, over the next, um, however many
Deirdre:I do think that, like, we're seeing even more deployments of PAKEs like OPAQUE and, I think you said, CPace, than we ever have, and as far as I know, like, the post quantum variants of those, like, there aren't really any. Like, maybe there's some, like, fully homomorphic encryption based ones that are very expensive and very heavy, and, you know, like, maybe you can do something like that with lattices, but, like, no one wants to use them. So, like, research in that area of making something that's, you know, hybrid secure or post quantum secure is probably well motivated, because shifting the attack, the deliciousness, from, you know, Diffie Hellman and, you know, something that's like extended triple Diffie Hellman to, like, if we crack the OPAQUE, uh, you know, sign up enrollment for a new device, we get everything. Or, you know, we get a lot. Like, you've stored all the goodies for us, so we'll just take that, thank you. Uh, so, I know some cryptographers listen to our podcast. If you want a good research topic, post quantum resilient PAKEs, I think, uh, efficient, useful, post quantum or hybrid, uh, PAKEs, I think will be very useful to some people. Um, Jon, Tim, uh, thank you so much. Uh, we'll be linking, uh, both of your white papers. And there's just a lot of cool stuff in here. There's the Labyrinth stuff, uh, there's the ORF, uh, primitive, which is really cool new stuff. Um, and all the other little details, uh, of getting this deployed. Thank you so much for talking with us.
Jon:Thank you very much. It's been really great joining you.
Tim:Yeah, thanks for having us.