Security Cryptography Whatever

Some cryptography & security people talk about security, cryptography, and whatever else is happening.

Biscuits with Geoffroy Couprie

January 29, 2022 • Security, Cryptography, Whatever

We've trashed JWTs, discussed PASETO, Macaroons, and now, Biscuits! Actually, multiple iterations of Biscuits! Pairings and gamma signatures and Datalog, oh my! 🍪

Transcript:
https://securitycryptographywhatever.com/2022/01/29/biscuits-with-geoffroy-couprie/

Links:

Biscuits V2: https://www.biscuitsec.org

Experiments iterating on Biscuits: https://github.com/biscuit-auth/biscuit/tree/master/experimentations

Apache Pulsar: https://pulsar.apache.org

Spec: https://github.com/biscuit-auth/biscuit/blob/master/SPECIFICATIONS.md

Find us at:
https://twitter.com/scwpod
https://twitter.com/durumcrustulum
https://twitter.com/tqbf
https://twitter.com/davidcadrian

"Security Cryptography Whatever" is hosted by Deirdre Connolly (@durumcrustulum), Thomas Ptacek (@tqbf), and David Adrian (@dadrian)

Geoffroy 0:00

We do not deserve Turing-complete languages. We do not deserve them.

Deirdre 0:03

No, we don't. Hello, welcome to Security, Cryptography, Whatever. I'm Deirdre. We also have our co-host, David. How are you doing, David?

David 0:24

Doing swell.

Deirdre 0:25

Doing swell? We also have Thomas. How are you, Thomas?

Thomas 0:29

I am speaking. So you know that I am here.

Deirdre 0:31

Yay. You're here. We also have our special guest today, Geoffroy Couprie. Am I saying that right? Yay. And he is the co-creator of Biscuits, yet another authentication token, and we wanted to learn more about it.

Geoffroy 0:49

Yeah. Yeah. I'm really happy to be here. Uh, I've been listening to the show for, for a long time now and— like, the beginning, and finally some good crypto content.

Deirdre 0:58

Yay. We have fans. We thought that the first version of Biscuits had pairings in it. Are we wrong? Did it have pairings in it?

Geoffroy 1:08

Yes it had and, uh, it used the excellent, uh, bellman library.

Thomas 1:15

This podcast is fantastic. Our listeners' intro to Biscuits is, it has pairings in it. What else? What else do you need to know?

Deirdre 1:26

Yeah. For anyone who doesn't know pairing-based cryptography, would you tell us a little bit more about pairing-based cryptography and why the first version of Biscuits used it?

David 1:37

Or maybe even what Biscuits are, before we get into whether or not they use pairings? In case you're not a listener of the back catalog.

Deirdre 1:46

Hmm

Geoffroy 1:47

So yeah. Biscuits was, uh, designed like this: there's Macaroons, this authorization token that can be attenuated offline. So you have a token, you add a restriction, and you get a new token from there. You do not need to talk to the original server for that. You can just do that directly and it will then be usable for verification. Uh, Macaroons need, um, a secret; it uses a series of HMACs. It's very elegant, but you need the secret to create the origin token. You also need the secret to verify. And so we thought, okay, can we get public key crypto for that? And we tried different schemes. And one of the first ones was pairing-based cryptography, which, uh, I learned about when I was very young, like I was just starting my career. Um, at my first internship, there was a cryptographer there. Um, I said, oh yeah, the crypto looks so cool. I want to learn that. And they said, here's a paper, go read that. And that was like, random oracle model, or something like that. And then I read that pairing-based cryptography is the future. It was in 2010, and it's still the future. It's still very cool.
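The HMAC chaining Geoffroy describes for Macaroons can be sketched roughly as follows. This is a minimal illustration, not the Macaroons wire format: the function names (`mint`, `attenuate`, `verify`), the caveat strings, and the choice of SHA-256 are assumptions made for the example. It only checks the tag; evaluating the caveats themselves is left out.

```python
import hashlib
import hmac


def mint(root_key, identifier):
    """Create a token with no caveats; the tag is HMAC(root_key, identifier)."""
    tag = hmac.new(root_key, identifier, hashlib.sha256).digest()
    return [identifier], tag


def attenuate(token, caveat):
    """Add a restriction offline: the old tag becomes the key for the new tag."""
    parts, tag = token
    new_tag = hmac.new(tag, caveat, hashlib.sha256).digest()
    return parts + [caveat], new_tag


def verify(root_key, token):
    """Recompute the whole chain; this needs the same secret used to mint."""
    parts, tag = token
    expected = hmac.new(root_key, parts[0], hashlib.sha256).digest()
    for caveat in parts[1:]:
        expected = hmac.new(expected, caveat, hashlib.sha256).digest()
    return hmac.compare_digest(expected, tag)


root = b"server-side secret"          # hypothetical root secret
token = mint(root, b"user=42")
narrow = attenuate(token, b"path=/uploads/cats/")  # no call to the server needed
```

Note that `verify` takes `root`, the same key that `mint` takes. That is the limitation discussed in this part of the conversation: any service able to verify a token can also mint one.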

Thomas 3:04

Pairing curves. I was young. I didn't know better. So the motivation for Biscuits, as you're putting it, is just that with other fancy tokens, uh, the parties that can verify the tokens can effectively mint the tokens too, because it's a single root shared secret, right? So like, that's the primary thing that you're trying to solve there: here's a cryptographic token where you can verify it, but being able to verify it doesn't give you any other ability.

Geoffroy 3:31

Yeah, because, uh, we used Macaroons at my previous company, Clever Cloud, a French hosting company, platform as a service. Uh, we used Macaroons as a front for APIs, like we had a Macaroons-to-S3 API. So you get the Macaroon, and any service that gets a Macaroon can just upload something easily to an S3 bucket, with a restriction to a specific folder, something like that. So it was very nice to use, but yeah, if you want to use the Macaroon somewhere and verify somewhere else, when you have microservices, it gets unwieldy. Um, and so, yeah. Could we get that with public key crypto, please, please, please? That was the point. And so when we got to the drawing board it was, okay, so we have a token and we need to add something and still get a valid signature. Can we just find a way to add signatures to themselves? And that's where, uh, pairing-based crypto came in. And then the check get an explanation with right there. Uh, but basically you have a target group and you can add some things in the target group, and that corresponds to the same operation in the origin, uh, scalars.

Deirdre 4:54

Yes. The, the multi linear map. Yes, I'm remembering it now. Cool.

Geoffroy 5:00

Yeah. And the idea was like, okay, you can sign something, you have your private key, that's in the origin scalar group, and you sign something and you get a point in the target group. And for each signature, you get one of these points and you can add them all. And when you verify the signatures, you can verify from what's been, uh, combined. So it was really elegant, but pairing-based crypto was still a bit slow for tokens.

Deirdre 5:30

How slow is too slow?

Geoffroy 5:33

Uh, oh, I can look up the numbers we had, uh, but it was not usable for public APIs. So.

Thomas 5:44

I would assume part of the concern with, uh, with pairing curves would just be that everyone using them would be scared shitless of them. At least that's my basic attitude about pairing curves: run away, like, drop and run, like the plutonium canister. Deirdre and Geoffroy, are you guys generally comfortable with pairing curves? Deirdre, you could probably explain to people what pairing curves are and why, uh, the ill-informed such as myself might be scared.

Deirdre 6:11

So, uh, pairing curves are not the same curves that you would use for your out-of-the-box, uh, elliptic curve Diffie-Hellman or a really simple digital signature like ECDSA or EdDSA or whatever. They have different properties, including a different underlying group structure that allows you to do the multilinear map that Geoffroy was just describing, so that you can basically add things that are defined over these other two groups, and it results in, like, a transform of a point in a resulting group. And vice versa. Those curves are nice for that specific purpose, but they are not really secure for doing, quote unquote, "regular" elliptic curve cryptography with them. So for example, in Zcash, which I work on, we use pairing-friendly curves for stuff inside these ZK proofs, in circuits. But then we use different curves for stuff outside of the circuit, that are compatible with the pairing-based curves, because we don't want to do regular elliptic curve signatures and stuff like that, and, um, Diffie-Hellman stuff with the pairing-based curves. They let you do specific stuff, but you don't want to use them for other stuff. And it's kind of similar to how, as I've talked about, isogeny-based cryptography uses specific curves that you wouldn't use for the other two either, because they have very particular purposes. Um, does that help?

Geoffroy 7:49

I guess. So you do the operations in like normal curves, but then for the specific operation that can only be done in pairing you do that there. And then you try to send back that.

Deirdre 7:59

Yeah, kind of, yeah. There's curves that are defined over the scalar field of the other one and vice versa, so that they kind of translate nicely. And for those, while I look it up, it's like the Jubjub curve and BLS12-381; they work together nicely for Zcash Sapling. Yeah, but they're tricky, and yes, the pairing operations can be more expensive.

Geoffroy 8:29

Right. Yeah. I have the data there, and at best I could get a verification in one or two milliseconds, and then you get the overhead of the rest of Biscuit, which is not enormous, but you already pay a huge cost at signature and verification.

Deirdre 8:51

One to two milliseconds doesn't sound too bad, and I'm looking at the, like, latest, up-to-date numbers. Uh, there was a post that you wrote last April, and it says that you can get the new version of Biscuits doing keygen, token creation, serialization, deserialization, signature validation, and facts verification all under half a millisecond. So what changed from the pairing definition of Biscuits to, uh, the modern iteration?

Geoffroy 9:23

Yeah. Um, so the one to two milliseconds was using the MCL library, which is a specific, I think, C++ library for pairing, while, uh, what I was using was not bellman, it was the pairing crate. That was, I think, the previous one. And yeah, at the time it was like, a token with three blocks was 30 milliseconds, uh, verification. So it was too much, way too much. And we went through different designs. We got, uh, on GitHub, Kara folks helped a lot, uh, in the initial development of the spec and proposed a solution based on verifiable random functions.

Deirdre 10:06

Geoffroy 10:07

Uh, there was an IETF draft. Everything is in the repository, uh, in experimentations. I kept a specific folder for all the things we tried that did not pan out, but could still be fun to look at. Yeah. And that was cool. And we tried that for a while, and I found a big vuln in there, that like, you could just get back anything you want and sign everything you want. So I could not find a way to make it work properly. So I went to the next one. And that's when we get into blockchain territory. Uh, so this was proposed by Tony Arcieri. Uh, yeah, he helped a lot initially and pointed me to a lot of interesting papers. So yeah, he proposed, uh, gamma signatures. Uh, it's a scheme that was designed to reduce the size of, uh, Bitcoin blocks. The idea was you take the signatures and you can aggregate them, and it takes less space. And I looked at the properties and it looked like, okay, so once you've assembled the signatures, you cannot get them back. So if you take the signatures of all the blocks and assemble them, you cannot just remove a block and get a valid signature. So the properties should hold, and you can get an aggregated signature and add another one, and it should also work fine. The only thing I needed there was a proper prime order group

Deirdre 11:50

yeah.

Geoffroy 11:50

and that's where we used Ristretto, which is like the— so, so nice to use.

Deirdre 11:56

A fast, uh, correctly encoded prime order group. It's

Geoffroy 12:01

Right. Yeah, because I'm not like a cryptographer per se. I am like a crypto software engineer, I guess. Um, I need some good basic tools. And when it says you need a prime order group, and, okay, yeah, you need to define the hash-to-curve thing, I'm just, uh, no, no, no. I did not want to do that. And everything comes in Ristretto, it's so, so cool.

Deirdre 12:26

Yeah,

David 12:27

I feel like we extol the virtues of Ristretto like on almost every podcast. Like we could just be a Ristretto fan podcast.

Deirdre 12:36

It's a good group, David, a good... they're all good tweets, bront. I have some things in my sights, such as DoubleOdd curves that got introduced about a year ago by, uh, Thomas Pornin. But, uh, until they kind of gain traction, a nice prime order group defined over a very popular, very fast curve, which is Curve25519, that's a nice way to go for stuff like this, and the implementations in the curve25519-dalek library are very nice.

Geoffroy 13:12

Yeah, it's such a nice library. Like, it's so obvious to see how the things work, and all the types are well done. It was very, very nice to use.

Deirdre 13:21

Yeah. So the current version of Biscuits is this aggregated gamma signatures defined over the Ristretto group. Is that— okay? All right.

Geoffroy 13:34

Yeah. So, uh, this was for Biscuits 1.0, the first version that we released, I think, eight months ago. The feedback on that was that, okay, you want to do an implementation of this? You need Ristretto. And you don't have Ristretto everywhere, but you get a good Ristretto implementation in libsodium. Uh, so I had an example of, okay, if you want to do the implementation of Biscuit, you take all of those libsodium calls, and you do them in order and it should work fine. The feedback on that was like, okay, it's still too complicated. Can we find another way? And it was still not that, uh, that fast. That's when we did the blog post and everything was, like, under half a millisecond. So it was fine, but we thought, okay, it should get faster and easier to understand. And so the current scheme, which I hope will be the last one, maybe, making no promises there: it's just a series of, uh, signatures, with, uh, Ed25519. And the idea is that, uh, the last block comes with the next private key. So the previous block signs the public key of the private key you get. It's an ephemeral key that was just generated. And if you want to add a block, you sign the block with this key and create a new ephemeral key. And, uh, you send that to someone. Like, you're the one that had the block, the superior block. So having access to that private key, you don't care about it, because you already have a better token, and you just derive and give the right to the next one to derive the next block. And this is actually extremely simple and understandable by a lot of implementers. So we've been very happy with that.
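The chaining idea in that answer (each block is signed together with the next ephemeral public key, and the token carries the last private key) can be sketched as below. Python's standard library has no Ed25519, so this sketch swaps in a Lamport one-time signature built from SHA-256 purely to stay self-contained; the scheme discussed in the episode uses Ed25519. The token layout and all function names here are assumptions for illustration, not the Biscuit format.

```python
import hashlib
import os


def h(data):
    return hashlib.sha256(data).digest()


# --- Lamport one-time signatures: a stdlib-only stand-in for Ed25519 ---
def keygen():
    sk = [(os.urandom(32), os.urandom(32)) for _ in range(256)]
    pk = [(h(a), h(b)) for a, b in sk]
    return sk, pk


def msg_bits(msg):
    digest = h(msg)
    return [(digest[i // 8] >> (i % 8)) & 1 for i in range(256)]


def sign(sk, msg):
    # Reveal one secret per message bit; each key must sign only once.
    return [sk[i][bit] for i, bit in enumerate(msg_bits(msg))]


def verify(pk, msg, sig):
    bits = msg_bits(msg)
    return len(sig) == 256 and all(h(s) == pk[i][bits[i]] for i, s in enumerate(sig))


def pk_bytes(pk):
    return b"".join(a + b for a, b in pk)


# --- The chain itself ---
def mint(root_sk, payload):
    """Root signs (first block, next public key); token carries next private key."""
    next_sk, next_pk = keygen()
    sig = sign(root_sk, payload + pk_bytes(next_pk))
    return {"blocks": [(payload, next_pk, sig)], "next_sk": next_sk}


def append_block(token, payload):
    """Anyone holding the token can add a block, without the root key."""
    next_sk, next_pk = keygen()
    sig = sign(token["next_sk"], payload + pk_bytes(next_pk))
    return {"blocks": token["blocks"] + [(payload, next_pk, sig)], "next_sk": next_sk}


def check_chain(root_pk, token):
    """Verification needs only the root PUBLIC key."""
    pk = root_pk
    for payload, next_pk, sig in token["blocks"]:
        if not verify(pk, payload + pk_bytes(next_pk), sig):
            return False
        pk = next_pk
    # The carried private key must match the last signed public key.
    return [(h(a), h(b)) for a, b in token["next_sk"]] == pk


root_sk, root_pk = keygen()
broad = mint(root_sk, b"right: read *")
narrow = append_block(broad, b"check: file == cats.jpg")
```

Someone who only receives `narrow` cannot strip the last block, because they never see the earlier ephemeral private key. The party that appended the block did hold that key, but as Geoffroy says, that party already had the broader token anyway.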

Thomas 15:39

You already have a better token because you're doing an attenuated token scheme, like Macaroons. So like, you have the, uh, you know, a verifiable token that lets you read anything, and we're talking about the operation that would go from that token that lets you read anything to a token that lets you read cats.jpg and nothing else. And so you're talking about, like, okay, I have a secret that applies to the cats.jpg thing, but I'm already holding, you know, the secret that gives me access to everything. And that's the logic that you're using there for the secrets relationship.

Geoffroy 16:16

Right, this is the way that the one that creates the token can be a different one than the one that verifies the token. So you do not need to trust everybody with your shared secret.

Thomas 16:30

So you're using this at Clever Cloud, I assume. You were using Macaroons before and now you're using Biscuits.

Geoffroy 16:39

So, uh, I'm not at Clever Cloud anymore, but they are still happily using Biscuits. The main place where it's used is in the Pulsar cluster. So Pulsar is a queue topic system, a bit like Kafka, but with a different approach to scalability. And they needed to make a big cluster that would be used by, uh, a lot of the customers, and the authorizations were not good enough. And the idea with Biscuits is, like, we do not know how customers are going to use all of the tenants and namespaces and queues and everything they have, and maybe they want to just have this application that can produce on this topic and consume from this other one. And another application needs different permissions, and we cannot decide that for them. And so we just give them a token that says, okay, this is your namespace. You do whatever you want in there. And then you can derive a token that can do exactly what you want. And this is something we've done also internally for applications. Like, we can have an agent on a specific machine or virtual machine, and it's going to get access to one topic. And you get everything like that with Biscuits, and on the server side, we have rules saying, okay, you get the token that has this authorization for this namespace. And then we check the authorizations that are in the token that say, okay, but this is only this topic and this operation. So this is pretty, pretty powerful.

Deirdre 18:11

And the way you're doing this is you're encoding these constraints with Datalog, right?

Geoffroy 18:17

Yeah. Yeah. So all of the authorizations are done with Datalog in, uh, in the token and also on the server side. It took quite some time to get to a good solution for authorization. We tried different things, a regex-based system, like, various approaches. And at some point we got, uh, onto the paper 'Datalog with constraints', a really cool one about, okay, you get the authorization rules and you can have them apply also on other data that comes in, like the time, an IP address, something like that. And you can include a lot of data in Datalog facts, and also some stuff on the side. We have some specific operations that could be relevant for authorization. And Datalog is old, like seventies old. So it's a well-known technology. So I think we can rely on it, but it's been not that well-known, a bit forgotten about, for a long time. But it's reappearing. There's been, like, databases, Datomic and stuff like that. And even authorization. Uh, there's Open Policy Agent. They have the Rego language, which is derived from Datalog, and also, uh, recall REST company that's making server-side authorization, and they have their own Datalog language as well, so logic languages are a very great fit for authorization.

Thomas 19:50

So like, Biscuits are inspired by Macaroons, right? And like, Macaroons are kind of famously simple, and that attenuation logic we were just talking about, like going from read[*] to read[cats.jpg] or whatever, Macaroons express that as just a list of booleans, right? It's just a list of predicates, and all the predicates have to evaluate true for the token to be authorized. Right. So that's considerably simpler than Datalog is. So presumably, like, the core problem that you're trying to solve with Biscuits is, okay, Macaroons are great, but, you know, there's no way to have readers and writers with the tokens; everyone's both a reader and a writer. And so we need public key. And then public key is its own can of worms, right? Because to do that kind of chained set of predicates or chained set of conditions, you need to have kind of a public key chaining scheme. And that's what we're talking about when we're talking about, like, gamma signatures over Ristretto, or the current scheme that you've got, or pairing curves, right? But the other weird thing that you did was you switched from a simple list of predicates to Datalog. So I guess two things there. Right. So one is, I'm curious about the motivation. Like, did you guys run into something where the simple list of predicates wasn't good enough? Were there, like, specific scenarios that prompted it? And I guess my second thing would be just like, sell me on Datalog. Like, what is Datalog gonna let me do that a list of predicates wouldn't let me do? And I'm sure there is something. I'm genuinely asking.

Geoffroy 21:16

Datalog gives you just a list of predicates. It is that. Um, the experience there on Macaroons: uh, you need to encode your predicates in a way, uh, because if I remember correctly, Macaroons just gives you, okay, this block is this byte array, and that's it. Something like that. And you need a way to encode, this is how the predicate looks. And if you use that from one service in, I dunno, Rust, and another one in Python, you need to have consistent parsing of your predicates. So we needed, uh, like, a well-defined, specified way to encode the policies. And that's where Datalog is. It just matches: okay, I have this data and I have this information. I know it's a request on this file with this operation. I have this list of roles for my user. I can find a match or not. It's finding a predicate. A Biscuit token can have data, the facts; rules that can create new facts from data, but you don't really need them in the tokens; and it has checks. And a check is, I do a query on my list of data and it must match something. If any check in the token or on the verifier side does not match, we fail the authorization. And it's that. It's a list of predicates, but encoded in a very small, very compact way, and very well specified so that every implementation works the same way. And so, yeah, it's just that, it's predicates. You have to make that part, uh, simple, like it just says yes or no. And then you can have more interesting queries if you want. But the basic approach is, it says yes or no.
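The facts, rules, and checks described in that answer can be modelled with a very small matcher. This is a toy sketch, not Biscuit's Datalog dialect or any library's API: facts are plain tuples, strings starting with `$` are variables, and the predicate names (`role`, `role_right`, `right`) are invented for the example.

```python
def unify(pattern, fact, env):
    """Match one predicate pattern against one fact, extending the bindings."""
    if len(pattern) != len(fact) or pattern[0] != fact[0]:
        return None
    env = dict(env)
    for term, value in zip(pattern[1:], fact[1:]):
        if isinstance(term, str) and term.startswith("$"):
            if env.setdefault(term, value) != value:
                return None  # variable already bound to something else
        elif term != value:
            return None
    return env


def query(body, facts, env=None):
    """Yield every variable binding that satisfies all predicates in body."""
    env = env or {}
    if not body:
        yield env
        return
    for fact in facts:
        bound = unify(body[0], fact, env)
        if bound is not None:
            yield from query(body[1:], facts, bound)


def apply_rules(rules, facts):
    """Derive new facts until nothing changes (a fixpoint)."""
    facts = set(facts)
    while True:
        new = {tuple(env.get(t, t) for t in head)
               for head, body in rules
               for env in query(body, facts)} - facts
        if not new:
            return facts
        facts |= new


def authorize(facts, rules, checks):
    """Every check must match at least once, otherwise authorization fails."""
    world = apply_rules(rules, facts)
    return all(any(True for _ in query(check, world)) for check in checks)


# right($user, $op) <- role($user, $role), role_right($role, $op)
rules = [(("right", "$user", "$op"),
          [("role", "$user", "$role"), ("role_right", "$role", "$op")])]
base = {("user", "alice"), ("role", "alice", "editor"),
        ("role_right", "editor", "read"), ("resource", "file1.txt")}
checks = [[("resource", "file1.txt")],
          [("user", "$u"), ("operation", "$op"), ("right", "$u", "$op")]]
```

Because rules can only recombine constants that already appear in the facts, the set of derivable facts is finite and `apply_rules` always stops, which matches the point made a little later about evaluation being guaranteed to finish.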

Deirdre 23:03

And there's no chance that this is Turing complete, right?

Geoffroy 23:09

Yep. It's even, so I don't remember the proof, but it's more or less guaranteed to finish.

Deirdre 23:16

okay.

Geoffroy 23:18

We do not deserve Turing-complete languages. We do not deserve them.

Deirdre 23:22

No, we don't.

Thomas 23:23

What's the, I'm looking at, like, I'm looking at https://biscuitsec.org, right? Which is the website for this protocol. Everyone should look at Biscuits. Biscuits are neat. Um, I'm looking at, like, the Datalog example at the bottom of the page here. What's the wire encoding for this stuff in a token? Like, it's obviously not this.

Geoffroy 23:42

Yep. It's in Protobuf. Uh, so every part of the code has a specific type. Uh, it's in the specification, and in the repository there's a Protobuf file, and every fact, type, and everything has a binary representation. So it's pretty compact. For strings, there's a string interning system. So if you have a string that appears in some place and is reused somewhere, you will only pay the cost once of it sitting in the token. So there's lots of cool tricks like that to make it, like, a bit smaller. You do not need to parse a text format from the token, like you often do in JSON tokens or would do with Macaroons. And even that is part, uh, so the specification comes with samples so that everybody's on the same page on: I parse that token, I put these policies, I should get this result and nothing more. And this has helped a lot, uh, for implementations also. When they try some things and they mess up the code, they can see with the samples, every time, if something failed.
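The string interning trick can be illustrated with a simple symbol table. This sketch only shows the general idea of storing each distinct string once and referring to it by index; the actual Biscuit encoding is defined by its Protobuf schema, and the function names and fact layout here are assumptions.

```python
def intern_strings(facts):
    """Replace every string by its index in a shared symbol table."""
    symbols, index, encoded = [], {}, []
    for fact in facts:
        row = []
        for s in fact:
            if s not in index:           # first time we see this string
                index[s] = len(symbols)
                symbols.append(s)
            row.append(index[s])
        encoded.append(tuple(row))
    return symbols, encoded


def restore(symbols, encoded):
    """Inverse of intern_strings: look each index up in the symbol table."""
    return [tuple(symbols[i] for i in row) for row in encoded]


facts = [("right", "file1.txt", "read"),
         ("right", "file1.txt", "write"),
         ("right", "file2.txt", "read")]
symbols, encoded = intern_strings(facts)
```

Here nine string occurrences shrink to five stored strings plus small integers, and the saving grows with every additional fact that repeats a name.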

Thomas 24:56

So as a user of Biscuit tokens, right? So part of the idea, like I would say for both Macaroons and for Biscuits, I guess, um, is like, the core of these things is extensibility, right? It's not having to come up with precisely the security model that you're going to expose to users kind of a priori, but instead giving them all the nuts and bolts they would need to come up with any kind of feasible, you know, policy that they would want to express. Right? So the core thing you're trying to do here is to get people who are not Biscuit implementers, right, who are not, like, um, building new Biscuit libraries or whatever, but are just using these tokens. Like, mentally in my head, I'm mapping this to our service, like, um, what our users do when they deploy a new application or whatever. Right. So I guess the next question I have is just looking at this, like, the operation of taking an already sealed token and then concatenating new conditions on it. What does that look like in terms of Datalog? Like, if I look at these tokens in their textual representation here, am I literally just catting conditions on the end, like another 'allow if' clause or whatever? Or like, what does that actually look like as a user that wanted to, you know, take a token and make a more specific token?

Geoffroy 26:05

Yep. So the idea is like, the Biscuit token is a chain of blocks, uh, secured by crypto. Uh,

Deirdre 26:14

it's like, uh, it's like a, not quite ratcheted, but it's literally, like, I hand you the secret key for the next thing, when you verify the previous thing. With

Geoffroy 26:26

Yeah.

Deirdre 26:26

ed25519

Geoffroy 26:27

And when you get the token and you want to verify that, uh, so the first block is always the one created by the origin, the one that we trust the most, and that can contain, like, the basic rights. I don't know, it's a user ID, or it's saying you have access to this folder, and it's in the first block. And so this is where most of the policies are executed from, on the authorizer's side, or the verifier's, if you like. And then for the other blocks, we look at their conditions, the checks they put on the data, and they can see all the facts that have been generated before. So they cannot create data that can affect the authorizer or affect the first block, but they can see what data has been created before and apply conditions on that. So they can also see the data that was created by the authorizer, and the authorizer knows, I don't know, this is a request to file1.txt, or this is a request from this IP, and can put the data in there, and further checks can see the data and put checks on it. And so every block is pretty well isolated and should not affect the previous ones.
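The isolation between blocks described above can be sketched as follows. This is a simplified model under stated assumptions, not the scoping rules of the Biscuit specification: checks are plain Python functions over a set of fact tuples, only the first block and the authorizer contribute trusted facts, and the names `authorize`, `policy`, and the fact shapes are hypothetical.

```python
def authorize(blocks, authorizer_facts, policy):
    """Evaluate every block's checks against facts from trusted sources only."""
    # Facts from the first (authority) block and from the authorizer are trusted.
    trusted = set(authorizer_facts) | set(blocks[0]["facts"])
    for block in blocks:
        # Facts in later blocks are deliberately NOT added to `trusted`.
        for check in block["checks"]:
            if not check(trusted):
                return False
    return policy(trusted)


def policy(facts):
    """Allow if some trusted right covers the requested resource and operation."""
    ops = {f[1] for f in facts if f[0] == "operation"}
    resources = {f[1] for f in facts if f[0] == "resource"}
    return any(f[0] == "right" and r.startswith(f[1]) and f[2] in ops
               for f in facts for r in resources)


authority = {"facts": {("user", 42),
                       ("right", "/folder1/", "read"),
                       ("right", "/folder1/", "write")},
             "checks": []}
# An attenuation block: only reads, only on one file.
read_only = {"facts": set(),
             "checks": [lambda f: ("operation", "read") in f,
                        lambda f: ("resource", "/folder1/file1.txt") in f]}
# A block that tries to grant itself more rights; its fact is ignored.
greedy = {"facts": {("right", "/", "admin")}, "checks": []}


def request(resource, operation):
    """Facts the authorizer adds about the incoming request."""
    return {("resource", resource), ("operation", operation)}
```

With these definitions, a later block can narrow what the token allows but cannot widen it: `read_only` turns a read/write token into a read-only one, while the `right` fact smuggled in by `greedy` never reaches the policy.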

Thomas 27:41

Yeah, I'm sold on the security part of it. Right? Like, I get the crypto design here and how the block chaining and the semantics of that add up to, like, it's safe to extend a token. I'm more just curious about the programming model. Like, as a user, am I operating at, like, an API level, with like a check object or something like that that I'm adding? Or am I literally adding text that gets compiled down to Datalog? Or like, what does that look like?

Geoffroy 28:08

Yeah. So, um, right now I see Biscuit as, um, like, uh, basically a tool that's underneath stuff that people build upon. So I do not see users, like, being handed a token, and suddenly they're fiddling with Datalog and everything. Uh, so on the microservices side, uh, that's one of the use cases: you have a token, you get a request and you go to one service, and then the service calls another one, but wants to reduce the rights for that, and then the next one reduces and sends to the next one. And programmatically this is: you get a token, you deserialize it, and you can call a method that says, add this check, uh, that says, okay, now you can only do this operation. Before, you could do all of the operations. Now you can do only a read on that. And this is an add, and then you serialize, and then you pass it to the next one, and programmatically it's relatively easy to do. Uh, the APIs in the various libraries are nice to use now. On the API client side, it depends, uh, because since it relies on your API and what you provide, you still need to provide some tooling for the user side to say, okay, I do not want to write Datalog. Uh, even if I'm a technical user of, uh, a hosting company, maybe I do not want to just fiddle with that right away. So maybe you want a good, uh, web interface that says, okay, yeah, I have my token and I want to derive it in that way so that I can access this part. That's why we provide WebAssembly tools that can be used from JS or the front end with ECMAScript modules. But yeah. The usability part? Like, in most systems, you would hide the Biscuit, and if people really want to modify it exactly the way they want, like, they get the token, they see that everything is available and they can just extract it. There's a CLI, they can attenuate in the CLI the way they want, then just put the token back the way they want.

Deirdre 30:18

Yeah. I can see, like, a test on the biscuitsec.org page, like sample test authorization policies in Datalog. And it's like, here is an ID. Here is a file name. Here is, like, a URL, and different kinds of rights you can have. That's not baked into Biscuits or Datalog at all. You have to have some other way to go look up whether that is a right that anyone can have, that that is an ID that is real, that that is a resource that exists, or you're trying to create it and you're not going to collide with something. You have to integrate it with those other systems. Like, the Biscuit part is just allowing you to present a contract, but you still have to go do other work to say whether you were allowed to do the thing. And so, like, you kind of said you would have a UI of a web app that's like, here are the possible resources you can add to this Biscuit, behind the scenes, or your user ID, I'm going to just inject that for you. All that sort of stuff would have to be part of your application. It's not just handled for you in the Biscuit or Datalog.

Geoffroy 31:29

Yeah.

David 31:30

I was going to say, like, specifically, I used the Go version of Biscuits for stuff at work, and really the actual objects you're dealing with look like a builder and a verifier object, both of which can take rules, predicates, or caveats or facts, um, as input to constrain or provide information about the environment. And that just looks like, I dunno, any builder API that has to build a predicate, right? You've got, like, at some point you end up with a literal and at some point you end up with, like, a comparison thing, and maybe you're creating a bunch of objects that look like an AST, or maybe you're just providing a string and having it parsed out from the Datalog. But

Geoffroy 32:11

Yeah. And the thing is, like, Biscuit could do your entire authorization system. But it's not necessary. Uh, like if you have a huge, complicated role-based system with, like, thousands of users and tens of thousands of roles, because you always have more roles than users, you won't load everything inside the Datalog machine so that it can check. Like, it could do it; I did some tests. You could still keep everything in memory and be quite fast in that verification, but it's maybe not the best use case. But you can just use part of the system, like do the nice query to the database with the roles, when you know it will find a good role for this user, and then just insert the data you need inside the authorizer, only just the facts you need. Or you can just have it, like, completely stateless. Okay, you have a token that contains the user ID. You extract that and you verify the roles in your system. You already have, like, Spring Security, or Devise or whatever, but then you have the token that comes with specific checks that say, okay, but I only want a query to this endpoint. And the token does not need to know the rest of the authorization system. It just needs, I want to verify this part of the API. So you can integrate that in your system, in your API, without it being too traumatic.

Thomas 33:32

That's okay. That's like a big general— uh you're you're gonna, you're going to get a lot of specific questions from me right now. Just because this is, this is the part of the podcast we call, the guests does Tom's job for him. Um, but this is essentially what I'm working on right now. Right? So, um, like one of my like semantic issues with, Uh, you know, with the fancy tokens are like, I have like, you have the the permissions, the thing that the token lets you do, encoded in the token, but you also have ambient permissions, right? so like our request comes in to, you know, deploy an app in our system. Right? Well, the token can say whatever the token wants to say, but beyond the token, there's also a system of ownership of like you're a member of this organization, which owns these apps. And no matter what the token says, you know, th there's still an ambient authority check for like, are you allowed to do something with this app? And like, you get to, you get to questions like, okay. So I'm trying to evaluate a token here. do I first need to make the database query, to get the list of all possible apps that this thing could be looking at, like the ambient authority for the token? Because if so that's painful, right? Like having to do that on every single token validation, it is yucky. Right. or is it simple to like, you know, as you're evaluating the token, pull in additional facts as needed, like, can you load them lazily, like, you know, needs access to this application, to this particular application in our schema, right? Like, is that an easy thing to do or do I have to have like all the lists of facts that I could possibly need before I can do the evaluation?

Geoffroy 34:58

so loading lazily is something I really want to try. Uh, so you, you, you point to a specific problem of those other authorization systems, uh, like OPA, uh, has I think three or four different ways to load data from the outside. Like one of them is like you preload everything. Another one is you a part of the query cannot directly to initial because somewhere, uh, also takes a specific Datalog query and converts that to SQL that can call to the ORM you have in your system. And they made a lot of work to, to plug into ORMs in various languages. It's it's amazing work really. And it's amazing to see, uh, it's, it's a bit complex, but really cool. Uh,

David 35:46

like, I, I heard you screaming. Um, but like it's not as bad as like you're making it out to be, cause what this ends up looking like, at least how we implemented it is, like you have the facts that we have are the host, the user, the, the method, whether it's a GET or a POST and the path, because we want it to be able to issue tokens for administrators, IE employees of the company, to be able to do a subset of routes as if they were a customer. Um, so that we didn't, in the admin panel. And

Thomas 36:18

That's how you guys do impersonation. It's how you guys do impersonation.

David 36:20

Exactly. Um, and so we provide authority facts in the token. Let's say, this is for user with ID x, and this is for the host with URL: blah. and this is for path blah and method, you know, GET, And then the verifier like is just middle. It runs as part of middleware on our HTTP server. So it just immediately says, oh, I know what route I am. I'm providing, you know, uh, this is my route. I'm going to provide that as an ambient fact. This is my method and provide that as an ambient fact, this is my host name and would provide that as an ambient fact that it feeds it into the verifier. And the check is simply like, is the ambient host and the ambient path and the ambient method, the same as, what the authority put in. And then that check is actually encoded that those are the variables you're looking at is put in by the authority. But like, yeah, it's reliant upon the service to correctly say like this is the resource that is attempted to being asked or attempted to being accessed.
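The check David describes (the issuer pins a host, path, and method in the token, the middleware supplies the same three values from the live request as ambient facts, and the two must match) can be sketched like this. It is a plain-Python illustration rather than the Biscuit Go API; the function names, the dictionary layout, and the `api.example.com` host are assumptions made for the example.

```python
# Keys the issuer pins in the token and the service must supply per request.
PINNED_KEYS = ("host", "path", "method")


def ambient_facts_from_request(host, path, method):
    """What the HTTP middleware knows about the request it is handling."""
    return {"host": host, "path": path, "method": method}


def check_request(authority_facts, ambient_facts):
    """Return True only when every pinned fact matches the live request."""
    for key in PINNED_KEYS:
        if key not in authority_facts or key not in ambient_facts:
            return False
        if authority_facts[key] != ambient_facts[key]:
            return False
    return True


# Facts the issuer put in the token (hypothetical values).
authority = {
    "user": "x",
    "host": "api.example.com",
    "path": "/apps",
    "method": "GET",
}
```

As in the conversation, the result is only as trustworthy as the ambient facts: the service has to report the resource being accessed correctly.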

Thomas 37:23

the other way to come at it And it's just to say, okay, like we don't, we don't have a notion of ambient authority, right? Like the, you know, w we simply won't issue tokens that are outside of the ambient authority of this user anyways. So we can encode the entire authentication authorization scheme into the token. There'd be no need to look up whether you have access to this application, because you wouldn't have the tote with the token that would say that you could, if you didn't

David 37:44

it depends if you want, like, do you want there to be a predicate that says this is allowed? Or do you just want the signature verify to do all of the work for you? Right. So the reason that we provide these ambient things is because then you're able to write a predicate, that's basically like one equals one, but it's effectively checking does host equal host, does method equal method, does path equal path, is all my, ambient thing is checking. And you could just implement that in code and skip the predicates. or you could, you know, have it be explicitly, uh, and only use the facts. You could just have the authority, so the original issuer, say: this is this user, this is this path or whatever, and just verify the signature, pull out the facts, move on with your life. but if you put in like the one equals one predicates, then later on, you could write perhaps a more complicated predicate if you decided that that, that mattered.

Geoffroy 38:37

sort of like, if you want to encode everything in the token, it's fine. But for user authentication, like things change, like you get into another team, you get shared the document, like things changed between the moment that token was created and the moment you get and use it. So you can even reissue them, revoke them. So there are ways to do that. But like, if you put everything in the token, at some point you get the misfire for, for query and the user is not happy. So yeah, we need good ways to call into the system. And right now, considering. I say that again, we cannot replace everything that's that's existing right now. So plug that into an existing system that has RBAC or something like that, that already works. And once this specific part is done, you can get the checks from the user because, the point, one of the points of having the offline attenuation, which is even not offline delegation, you can do, it's not necessarily the same user, the same service that uses them. It will map more the way the user uses the system. Like you have your own rules in your API that's secure. This is the basic system, but it cannot map every use case of the customer you have in front of you. So if they want just a small thing, okay, I need to delegate to this part or this service that I need to have this CI thing that must read from this repository and that one into another organization, and then write to this field repository. And then I this secret, but not that one. It gets complicated very quickly. if you have huge companies with multiple, uh, sub companies that they bought, they have each their own Active Directory and whatever it's— you cannot pre plan for everything and having a way to do delegation exactly like the user wants it's a good trapdoor for that.

Deirdre 40:33

So in that scenario, you probably want built in expiration or revocation into your, into your Biscuit. Um, I, the, the different blocks that you add have revocation ID so that you can like take one that you've gotten, and then you can say, I revoke a previous like S— block, right? That's something you can do. Can you build and like, I assume it's pretty easy just in Datalog to be like, if, you know, milliseconds since the epoch is greater than blah, like revoke or something like that, but then you could in theory, take that Biscuit and be like, I revoked that block that says this will expire after this many milliseconds since the epoch. So what do you recommend for like managing expiration of Biscuits? to avoid some of these problems?

Geoffroy 41:22

if you revoke a block, it revokes all of the tokens derived from that block. So if you say, I want to revoke this block that was before, that says there's an expiration, your token that expires is revoked as well. So that, that's why like it's, it has to be that way, because if you revoke the token of a user, like all of the derived tokens should be invalidated as well.
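The behaviour in this exchange (every block has a revocation ID, an expiry is just a check in some block, and revoking a block invalidates every token that contains it rather than stripping the check out) can be sketched as below. This is a toy model in plain Python, not the Biscuit implementation: `Block`, `Token`, `attenuate`, and `is_valid` are hypothetical names, the string IDs stand in for the IDs the real format derives from the token itself, and timestamps are arbitrary integers.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Block:
    revocation_id: str
    expires_at: Optional[int] = None  # None means this block adds no expiry check


@dataclass(frozen=True)
class Token:
    blocks: Tuple[Block, ...]

    def attenuate(self, block):
        """Derive a new token by appending a block; earlier blocks are kept."""
        return Token(self.blocks + (block,))


def is_valid(token, revoked_ids, now):
    for block in token.blocks:
        # A revoked block poisons every token that contains it.
        if block.revocation_id in revoked_ids:
            return False
        # Every expiry check in every block must still hold.
        if block.expires_at is not None and now >= block.expires_at:
            return False
    return True


root = Token((Block("rev-root"),))
short_lived = root.attenuate(Block("rev-ttl", expires_at=1_000))
delegated = short_lived.attenuate(Block("rev-ci"))
```

In this model, revoking `rev-ttl` does not produce a token without an expiry; it invalidates `short_lived` and `delegated`, while `root`, which never contained that block, stays valid.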

Thomas 41:47

Okay. I have another, Geoffroy does my job for me question, which is: how do you think about size for these things? How do you think about like encoding efficiency? What do you think is like the so, you know, imagine we're doing this for services that have lots and lots of requests, right? Like you're, you're talking about like timing to the, you know, sub-millisecond timing for validation because you have high traffic APIs, right? So what am I, am I nuts to think that, like, at some point I do care about how big these things are, if only because they're going to fit somehow into an HTTP header or something, or, like how, how much do you, like when you think about this, how much do you think about, like, how big these things can get?

Geoffroy 42:26

It was like one of the huge, huge topics in the design of the token. And I tried various different formats, uh, from, I don't know, CBOR, uh, Protobuf, Cap'n Proto. Um, yeah, I, I tried various things and in the end, uh, Protobuf was a good trade off because it was small enough and used enough in various languages so that I knew that the implementation should be okay. the, the last part of making the token smaller was about, uh, the Datalog encoding. So if, if I use like textual representation it would be huge, but with the way binary representation is done, it's, it's okay. It's really very fast three refine to see. And I try to keep the tokens, like under, uh, one kilobyte. th the, the part that annoys me the most is that the signatures and the public keys, it's 42 bytes every time.

Thomas 43:24

This is the problem I run into with, with Macaroons. As we're building this out is like, no matter how tight you encode the predicates and like I'm down to like predicates that, or like, you know, I'm using MessagePack. And I have like a lot of really common operations, are just integer comparison so that they're just, they're tiny predicates, but the token is huge because you have all these repeated blocks of, in my case, just, you know, truncated, you know, HMAC-SHA hashes, or whatever, but still like, it just adds up really quickly. But your general take is anything under one K and you just don't have to care.

Geoffroy 43:53

oh, I'd say for, for cookies, or even HTTP header. I think the limit is like eight K, but I would not recommend that ever. Uh it's it's not, 1 K's fine. At this point, you can, you could have like a part in your API where you, where you just say, okay, I have this attenuated token, I send it to you and send me back a token where you have everything in one block or everything's done correctly that way. And then there, I think people have done that. Like the reissue token thing, when you get like one block and one signature and the smallest thing possible.
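To put numbers on the size discussion: a token kept under one kilobyte of raw bytes grows by about a third once it is base64-encoded for a header or cookie, which still leaves plenty of room under the rough eight-kilobyte limit mentioned here. The sketch below works through that arithmetic. The constants simply restate the figures from the conversation, real header limits vary by server and proxy, and the choice of the URL-safe base64 variant is an assumption for the example.

```python
import base64

TARGET_RAW_BYTES = 1024          # "under one kilobyte" rule of thumb from above
HEADER_LIMIT_BYTES = 8 * 1024    # rough limit mentioned above; servers differ


def encoded_size(raw_len):
    """Length of padded base64 output for raw_len input bytes."""
    return 4 * ((raw_len + 2) // 3)


def fits_in_header(raw_token, limit=HEADER_LIMIT_BYTES):
    """Check the encoded token, not the raw one, against the header budget."""
    encoded = base64.urlsafe_b64encode(raw_token)
    return len(encoded) <= limit


# A placeholder 1 KiB token made of zero bytes, used only to measure sizes.
sample = bytes(TARGET_RAW_BYTES)
print(encoded_size(len(sample)))   # 1368
print(fits_in_header(sample))      # True
```

The closed-form `encoded_size` matches what `base64.urlsafe_b64encode` produces, so a service can budget token size before serializing anything.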

Thomas 44:30

how do you see, how do you see Biscuits interacting in a typical application? How do you see a Biscuit interacting with, you know, let's call them an authentication token, right? Like the thing that attests that you're validly logged into the site, if you were doing a Greenfield design, if you were doing something from scratch, would you do the entire thing in Biscuits or would you have, you know, an opaque session token that is part of like the ambient facts that you evaluate, that you evaluate for the token? Or how do you see that working?

Geoffroy 44:55

I'd use the Biscuit, like as cookie or something containing the session ID and either have, like, that was what we thought we were talking before. Like, a good way to load data from the, from the system. That, that would be like the largest part of how do I design my roles, how do I design, do I want to do attribute-based something or role based or, yeah, I, this is the big part will be in the data model, not in the front, getting the token back and deserializing it and verifying the key. there are specific points that need attention. So there's revocation, uh, because the revocation IDs are native in the tokens. So that was one of the big points that came, uh, a colleague of mine who was the CTO of Clever Cloud, used Macaroons a lot in his current company. And one of the big things was, okay, if you have not put revocation IDs in your token, at some point, you will need to have put revocation in your token in the past, and you will have a bad time. And so in Biscuits they're generated from the format itself and guaranteed to be unique. So that way, if at some point you need to retrofit revocation in your system, you can just use those IDs, but you still need a way to ex— so you can extract them easily from the token, but then how do you verify them? Do you have the list, the revocation list that's pushed to every node or do you call into a central system? So this is a part that you have to provide, and this is something where we can provide advice, but it depends on the system.

Thomas 46:31

With Macaroons, like one of like, one of the suggested ways of doing revocation is, that you could bind tokens to a third-party revocation checking service, right? Like Macaroons all have a random nonce at the root anyways. So they're all kind of, they're not explicitly revocation, but like you have some notion of being able to revoke, uh, you know, the root of a bunch of tokens. Right. But then like you can have as a predicate on it, you know, it has to be signed off on by the revocation checking service, which means that like, when users go to use that token, they first have to submit the token to the revocation checking service, get the signature back and then present like the collection of tokens.

Deirdre 47:05

or you have to have like a, have the revocation checkings or the verification service has to push and keep all the clients that need to check that the tokens up-to-date and like, you might have to have a fallback or something like that.

Thomas 47:17

Yeah, I guess I'm generally curious, right? Because this would also be like a natural way to express authentication as well. Um, is like, if you're, if you're doing the entire thing with your tokens and you don't have an ambient notion of your, you know, who you're logged in as which makes sense. Cause you're, you're trying to give people tokens to do things, right. So having them log in as well is a little weird. So. if you're doing that, like then a natural thing to do is to have an authentication service, the authentication service, you know, binds a claim to your, your token that says, okay, this satisfies the 'are you in this organization?' thing, I checked it, you know, and signed off on it. Do you do much with third-party tokens, with tokens linking to other services?

Geoffroy 47:53

So. Not not right now. we're currently looking to a good way to provide a feature like the third-party tokens. Honestly, I've not been a fan of third-party caveats and stuff like that in, in Macaroons because it gets very complex, very fast and people tend to abuse them. But we need something like that, uh, to back data from something, another signer or something else, like a use case that was requested by the good people of Flynn, which was a container open source platform as a service company that are very good people that did do the Biscuit Go implementation. so they, they came with that use case. Okay. We need to have a caveat that says, give me this data signed by this key. And this is a very, very just case. And that's a bit like the third-party caveat thing you will need. Okay. If you want this token to be valid, you have to provide me this information. And we we're currently looking to a good way to provide that, yeah.

Thomas 48:49

You You said that third party, uh, that third-party caveats tend to get abused. I know this is a, this is a thing I'm supposed to, I think I'm supposed to ask Sophie Schmieg about Um, somebody, um, about like all of the, before I, you know, plunge headfirst into the waters of Macaroons. Right. but like you said that like your, your general concern is they, they tend to get abused. Like, do you have off the top of your head, like an idea of, or a description of what you're talking about there? Like you would know better than I would have is why I'm asking.

Geoffroy 49:19

uh, like just right now. No, I think there were good examples in the talk by Tess Rinearson, uh, on that. Uh, but I like, I do not have like, example, like right now.

David 49:33

So what's next for Biscuits? Like, are you feeling good about where they're at now and that we're at a V2 or V3, whatever version we're at, V2. I think that go tags don't align or something. Maybe. I don't remember. so is that, is it about language parity now? Or feeling good or is it just evangelism? Like what's next for Biscuits?

Geoffroy 49:57

so right now there are a few implementations that, um, that have not yet come up to the V2. So there's the Go and the Java one, I think the Java one got a very good PR from a friend recently, so it should be good soon. Uh, there's a Swift one in the works. There's a C# and I think I've seen people talking about that. Basically to grow. And I think that's the right time to come into the project because the, the core, the core that we tried with or was like hard, hard to use, still and we cleaned up a lot of things. And now we can go back to adding more interesting features and see how to match more use cases. So yeah, evangelism, documentation, providing good examples. That's why we did all of the, WASM web component things where every use case, every example in the website is executable, it's it took a lot of time to get that, but now it's so nice to use.

Deirdre 50:54

from the Rust implementation right? So

Geoffroy 50:56

yeah,

Deirdre 50:56

the Rust one that's up to date, which is nice.

Geoffroy 50:58

yeah, yeah, yeah. It's uh, everything just deploys the like, right away. It's, it's pretty cool. And yeah, making sure that it can be used easily, trying to find a way to integrate that in, uh, current web frameworks. Uh, yeah, right now I need to see good examples and I am like committed to get people to try it. So. there are still lots of things to explore. we have not seen like all the use case because there's the authentication thing, but I've seen people try licensing. Um, it's, it's still a very general platform. It's a stub that signs things on. There's a language in there that conveyed it stuff. It's

Thomas 51:37

Would you generally like w when you think about people adopting Biscuits, right, would you generally target people that would otherwise be using JWTs? Like, would you see Biscuits as like a super set of all the applications that you would use, you know, with a typical bearer JWT token? Is there a set of like, like token applications where you don't think Biscuits are a great fit after?

Geoffroy 51:58

uh, you cannot use Biscuits in OIDC yet because it's not in the spec. Uh, it, it

Thomas 52:04

Leaving, leaving aside? OIDC.

Geoffroy 52:07

It can do, whatever, uh, the JSON Web Token, can, because you can just put data into token and it's signed and that's it. The same feature you can have that already. I think it's still a hard sell if people are committed to JSON Web Token right now. And so we have to try small things. I'm not here to like, 'we replace these because JSON web tokens are bad for this and that reason', that says, okay, let's just first find a nice place to put the Biscuits and let people try them. And then we will,

Thomas 52:39

Geoffroy, Geoffroy, Geoffroy. You've got to understand what this podcast is about.

Deirdre 52:44

You don't need to convince us of that. Um,

Thomas 52:47

We are here to.

David 52:48

you to take out your JWT and

Geoffroy 52:50

I have to sound reasonable sometimes.

Deirdre 52:54

like, could I put a Biscuit in like a URL slug? Cause there's a lot of like HMAC'd tokens and like maybe they're JWTs or whatever. But I think if I were like, you know, prudish in my, I'm not trying to throw the kitchen sink in my Biscuit that I throw in as a URL slug. So I could just pass it around the URL. I could probably do that. Right.

Geoffroy 53:16

yeah, you can, you can base64 it. There's no issue. Um, we, we even took the time to say, if you base64 this, it's this version of base64 and not that other one, because we know base64.

David 53:30

If I encode a Biscuit to XML, does that count as SAML?

Deirdre 53:36

stop it.

Geoffroy 53:38

you say this, someone will do it, I know it will happen.

Thomas 53:43

So w we did, like, we did a show, like, I don't know how long ago this was like five months ago or whatever, but we did a run down on tokens. and you know, w we, we got through like the list of all possible tokens. I think we got some point to Biscuits and like I had general concerns about complexity, about, you know, doing public key with a Macaroons construction that you guys had to use pairing curves or Bitcoin gamma signatures. And then there's the Datalog in it. And like, I think if you, if you, if you read through this stuff carefully and then actually talk to you about it, none of this stuff is scary per se, right? Like, um, I.

Deirdre 54:18

straightforward.

Thomas 54:19

And having the experience now of trying to map a universe of, you know, different API thingies into the really kind of tight fitting confines of Macaroons, like that's actually kind of scary in and of itself, right? There's a lot of complexity, like, if you read the Macaroons paper, it's like the whole world crystallizes into this single set of, you know, first principles that you can just operate from. And then in reality, it's nothing like that. It's much, much harder than that. Right. So like, I think that the last thing I've ever said about Biscuits was that I found them kind of, um, complex and I wasn't really sure that the complexity was earned by what it was doing. But at this point, especially I think with the Datalog stuff, which was just like, I think the immediate, like, you know, ickies that I got from it was,' wow', that the actual evaluation of the predicates here is much more complicated than just like a list of predicates or whatever. But Yeah. I guess at this point, I'd want to say I'm pretty sold on the idea here. Like I like the way that it's, I like how expressive the token language is. it seems pretty intuitive when you look at it and the cryptography has obviously gotten a lot simpler, um, than when you guys were first kicking around the idea. So it's, it's, it seems like a pretty big accomplishment, right? Like it's taking Macaroons and then making them usable with public key signatures. Pretty big win. So uh great, great work. I w you know, if I had a green, if I had a Greenfield design where I hadn't already made myself path dependent on Macaroons, I'd probably seriously consider using Biscuits.

Geoffroy 55:41

Yeah, I agree. Yeah. I guess the complexity was a big thing under that. That's why we got into V2. Um, but there's also like a kind of, it's a nerd catnip, like there's Datalog and crypto and Protobuf authorization. And like people get into that,. Oh, this is cool. I want to try it because it is complex and fun to learn. It's it's a very fun project to get into.

Deirdre 56:02

Hmm. Are you considering other parameter sets besides Ed25519, in general?

Geoffroy 56:10

Yeah. Yeah. Uh, we've explored, uh, using ECDSA. Um, and we have, I have a general understanding and hope of how it can be done and you could even have like a token that's signed by one type of key on the block by another one and each will work correctly.

Thomas 56:27

This is for FIPS kits? I assume this is for FIPS kits.

Geoffroy 56:31

Oh yeah. The name is Ganondorf. Yeah, FIPS kits

Deirdre 56:34

FIPS kits, or Bitcoin kits, gets whatever. I don't think they need it, but everyone loves that Bitcoin curve. So just, just curious and like you basically just have to, you're not doing anything special with the signatures. It's just literally a signature that is fits within your speed and size parameters. You can swap it out for not EdDSA. It could be ECDSA. It could be whatever you want it to be. If you have a favorite. We can do post quantum secure, isogeny-based Biscuits. Calling it mine, mine. They're mine. Sorry. I'm doing them. They're going to be, they're going to be big and slow compared to these Biscuits, but they're mine.

Thomas 57:21

Someday like the topic of ECDSA or the P-curves is going to come up early enough on the show where we can have a rant about the evil that is FIPS and what it does to people. Um, but it will not be this day.

Geoffroy 57:33

I encourage people to look and think about delegation a bit. Uh it's it goes, this is the side point of Biscuit. Like, oh, can you take what you had in your API? And you have your stem and new way to situation, but like you just give back, people control on how they use the system. Like you, you're only the steward of the data, their data, and their usage, and then they decide how they use that. How to delegate? And I want to see cool use cases like that because, uh, yeah, this is, they will provide, like, when you've seen authorization systems with SAML, OAuth, everything has always been too centralized, uh, the authorization and the usage and it was just missing something. And I think Biscuit will provide that, but I need to like push people to try it. Right. So I want to see delegation used in very cool ways.

Deirdre 58:23

Cool. Geoffroy. Thank you so much. This has been awesome. I, we confirmed that the first version was using pairings. We had a lot of consensus doubt, but you confirmed it and this is a very cool walk through Biscuits. Thank you so much.

Geoffroy 58:37

thank you. I, I had a blast.

Deirdre 58:40

Alright.

Thomas 58:40

Biscuits are neat, people should use Biscuits.

Deirdre 58:42

Nom, nom, nom, nom, nom.