biscuit-auth / biscuit
delegated, decentralized, capabilities based authorization token
License: Apache License 2.0
we do not specify a revocation system here, but support for a revocation id (that would allow revoking a token and all its derived tokens) could be useful
a verifier should be able to run queries and gather data from the token, like revocation ids: in some cases, loading all of the revocation ids as logic facts might be too much data, so we would validate the token, get the revocation ids, then look them up elsewhere.
Since the facts we want to query might be generated by block or verifier rules, those queries should be run as part of the verification process. The verifier validates the token, and if there were queries, returns the associated data.
This is currently implemented in the Rust version, but not in the Java version yet.
After a few iterations on both cryptographic primitives and language designs, biscuit is in a phase of iterative improvements.
Biscuit is used in production in several places; the Rust implementation now has a stable API and is used as the basis for other implementations (biscuit-wasm and biscuit-python).
Biscuit blocks are versioned and this mechanism has allowed gradual introduction of new features without disrupting existing deployments. Features that are not part of token serialization and authorization are not versioned and remain under the responsibility of each implementation. This includes datalog parsing for instance.
This roadmap starts from biscuit v2, which is the minimal version supported by libraries. Due to an issue in the initial release of biscuit v2, a breaking change had to be released, so biscuit v2 corresponds to version 3 of blocks. In the roadmap, only block versions will be used (v3+).
To get to a usable token implementation, here's what we would need now:
With #6 and #8, along with some out of band discussions, we have a better idea of how attenuation should work:
We're evaluating a Datalog-like language to express the caveats. It is simple to implement and allows complex queries. It can also be used to generate the authority field in a compact way.
We have been exploring example queries to get a feel for how it could work.
We need to support our goals of a token that can be attenuated offline and verified in a decentralized way. To that end, we explored a few cryptographic systems:
All three of them would be usable, but we will need an audit of the schemes before deciding which one to go with.
It would be useful to have an alternative mode to transform a biscuit token to a symmetric construction, a bit like macaroons.
That mode is not well defined yet, but the idea would be to send an asymmetric token to the authentication service, which will check the token and its caveats, and create a new token with the same caveats, but using a symmetric mode, possibly with encryption.
At the cost of one RTT, we get a token that is much faster to check and can be fixed for requests to only one service (the one that knows the secret key).
It could be interesting to have a Set atom type in the spec.
We recently merged the feature into biscuit-go (see: biscuit-auth/biscuit-go@19073ab)
This allows grouping several atoms under a single one, with constraints to check whether all or none of the set's values are included in a given set:
# given the fact:
users(#ambient, ["alice", "bob", "eve"])
# doesn't match, since eve is not in the constrained list
*allow($username) <- users(#ambient, $usernames) @ $usernames in ["alice", "bob"]
# matches, $usernames does not contain any of "admin" or "root"
*allow($username) <- users(#ambient, $usernames) @ $usernames not in ["admin", "root"]
The set does not restrict embedding different atom types (i.e. [#a, "alice", 42]) but I don't see much usage of that yet. It doesn't seem to affect the constraint anyway.
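The membership semantics above could be sketched like this (a minimal illustration, not the actual biscuit-go implementation):

```rust
use std::collections::HashSet;

// "in": every value of the fact's set must appear in the allowed set.
fn set_in(fact_values: &[&str], allowed: &HashSet<&str>) -> bool {
    fact_values.iter().all(|v| allowed.contains(v))
}

// "not in": no value of the fact's set may appear in the denied set.
fn set_not_in(fact_values: &[&str], denied: &HashSet<&str>) -> bool {
    fact_values.iter().all(|v| !denied.contains(v))
}
```

With users = ["alice", "bob", "eve"], the in ["alice", "bob"] constraint fails (eve is missing from the allowed set) while not in ["admin", "root"] succeeds, matching the two rules above.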
wdyt ?
Compared to macaroons, the attenuation syntax seems to allow capabilities extension, whereas macaroons structurally guarantee that no extension can happen (by relying on boolean algebra).
The example features diff-like attenuation (instead of general properties the query must satisfy, attenuations are modifications to the token's rule set). In isolation, each attenuation rule cannot be checked.
This implies that a reduction has to happen before the receiving service can check if the token is okay for a given request. On the contrary, macaroons verifiers check each caveat in isolation.
This leads to an important question: "who's responsible for the reduction?".
Compare this to macaroons, where we have the best of both worlds: syntax is not constrained and attenuation is proved by basic boolean algebra rules.
Unless biscuit ships with a rock-solid attenuation language, with structural proofs that expansion is impossible (in the current spec, this implies among other things computing regex intersections), it means that the burden of proving attenuation is transferred to the owner, which would make it a dangerous token to use.
posix, posix extended, perl, pcre?
The difference between strings and symbols can be confusing, because there are special symbols, #authority and #ambient, and normal symbols that can be user defined. Those are just considered special strings with limited constraint checks, and a bit faster to compare since they're actually stored as an index in the symbols table.
I'd propose that we store all strings in the symbol table (this can reduce storage size as strings tend to be repeated) and reserve the symbol type for reserved symbols (only #authority and #ambient for now).
This would require that the Datalog engine accesses the symbol table during execution (for string constraints checks)
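A minimal sketch of that interning scheme (illustrative names, not the actual biscuit-rust types): each distinct string is stored once and predicates refer to it by index, so comparisons become integer equality:

```rust
// Minimal symbol table: interning returns the index of an existing
// entry, or appends the string and returns its new index.
#[derive(Default)]
struct SymbolTable {
    symbols: Vec<String>,
}

impl SymbolTable {
    fn insert(&mut self, s: &str) -> usize {
        match self.symbols.iter().position(|sym| sym.as_str() == s) {
            Some(index) => index,
            None => {
                self.symbols.push(s.to_string());
                self.symbols.len() - 1
            }
        }
    }

    fn get(&self, index: usize) -> Option<&str> {
        self.symbols.get(index).map(String::as_str)
    }
}
```

The Datalog engine would then resolve indexes through `get` whenever a string constraint needs the actual value.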
Currently, blocks can be added to any non-sealed biscuit. A block can contain both caveats (like in macaroons), but also facts and rules. This requires careful thought from both the biscuit minter and the biscuit verifier: the goal is to avoid confusing facts from the biscuit minter and facts from blocks.
To that end, the special #authority
atom is used.
It's special in two ways:
All that makes the semantics rather complex (and way more complex than, say, macaroons). Another possibility would be to completely forbid rules and facts in blocks, and only allow caveats. This would bring the attenuation semantics closer to those of macaroons, which are arguably easier to reason about.
I still see the value in more complexity in blocks, but I think it should be optional (ideally, opt-in). In that case, I think the biscuit should carry this information when minted. This would avoid validating a restricted biscuit with a verifier that expects an unrestricted one.
All in all, while Datalog brings huge improvements over macaroons (putting aside asymmetric crypto), the ability to add rules and facts in blocks makes the complexity explode and makes auditing biscuit properties way harder.
As an additional bonus, not allowing rules in biscuits completely removes an attack channel where a legitimate biscuit is attenuated with rules crafted to exploit the Datalog engine.
I'm simplifying the text syntax, but it can still be a bit confusing, so I'd like to propose a different version, that's further from traditional Datalog syntax.
I'd like to change from:
right(#authority, $1, #read) <- resource(#ambient, $1), owner(#ambient, $0, $1)
to
right(#authority, $1, #read) if resource(#ambient, $1) and owner(#ambient, $0, $1)
with if and and case insensitive, so we can write
right(#authority, $1, #read) IF resource(#ambient, $1) AND owner(#ambient, $0, $1)
if it's easier to read without text highlighting
And for caveats:
caveat1($0, $1) <- right(#authority, $0, $1), resource(#ambient, $0), operation(#ambient, $1)
to
check right(#authority, $0, $1) and resource(#ambient, $0) and operation(#ambient, $1)
or another keyword, like verify, satisfy, ensures, assert, etc.
For constraints, instead of moving them to the end like this:
valid_date("file1") <- time(#ambient, $0 ), resource(#ambient, "file1") @ $0 <= 2019-12-04T09:46:41+00:00
we would write:
valid_date("file1") if time(#ambient, $0 ) and resource(#ambient, "file1") and $0 <= 2019-12-04T09:46:41+00:00
And constraints could be anywhere in the rule's body, like this:
valid_date("file1") if time(#ambient, $0 ) and $0 <= 2019-12-04T09:46:41+00:00 and resource(#ambient, "file1")
To make the logic language more useful, we should provide common patterns to be reused, with explanations of their design, etc, for patterns that cannot be integrated right away in the token API.
As an example:
currently, we've been using "atom" to talk about terms, while atom is generally used to designate the predicates in the head and body of a rule
To implement use cases such as third party caveats, or using a biscuit token as attestation accumulating acknowledgement from multiple parties, caveats should be able to verify a public key signature.
A few questions here:
Signature verification would work well as a constraint affecting 3 elements (message, key, signature) that could be filled in various ways:
verify(message, key, sig?): a signature from a specific key must be provided
verify(message, key?, sig?): a signature must be provided, from any key, but we could have other constraints on that key here (like the key must be in a certain set)
verify(message?, key, sig?): specify we must have a valid signature from a key, with other constraints on the message

When I designed the serialization format, I was not too familiar with Protobuf, so there were a few mistakes I made, like representing sum types with an index instead of using oneof. As an example, I wrote:
message Constraint {
  required uint32 id = 1;

  enum Kind {
    INT = 0;
    STRING = 1;
    DATE = 2;
    SYMBOL = 3;
    BYTES = 4;
  }

  required Kind kind = 2;

  optional IntConstraint int = 3;
  optional StringConstraint str = 4;
  optional DateConstraint date = 5;
  optional SymbolConstraint symbol = 6;
  optional BytesConstraint bytes = 7;
}
while it could be written:
message Constraint {
  required uint32 id = 1;

  oneof ConstraintEnum {
    IntConstraint int = 2;
    StringConstraint str = 3;
    DateConstraint date = 4;
    SymbolConstraint symbol = 5;
    BytesConstraint bytes = 6;
  }
}
I'd like to clean this up, but that would be a breaking change
to something more explicit like InSet/NotInSet
Following from #35, we have designed a biscuit signature flow, where the authority block defines facts requesting the user to append a signature, like so:
The authority block defines facts:
should_sign(#authority, dataID, alg, pubkey)
data(#authority, dataID, challenge)
where:
And also a caveat:
*valid(0?) <- should_sign(#authority, $0, $1, $2), valid_signature(#ambient, $0, $1, $2)
Which forces the verifier to provide a valid_signature fact matching the should_sign fact, in order to verify the biscuit.
This allows the verifier to make a query for a user signature:
*to_validate($0,$1,$2,$3,$4,$5,$6,$7) <- should_sign(#authority,$0,$1,$2), data(#authority,$0,$3), signature($0,$2,$4,$5,$6)
With the variables being:
$0: dataID (integer)
$1: alg (string)
$2: pubkey (byte array)
$3: challenge (byte array)
$4: signature (byte array) - the user provided signature
$5: signerNonce (byte array) - for anti-replay
$6: signerTimestamp (date) - for anti-replay

This expects the user to have appended a block to the biscuit containing a fact as follows:
signature(dataID, pubkey, signature, signerNonce, signerTimestamp)
But we can't verify it yet, as we expect the signature to have been constructed like so:
signature = challenge | tokenHash | signerNonce | signerTimestamp
where tokenHash has been introduced, being a hash over all the biscuit blocks preceding the signature block.
Biscuit
|_ authority -|
|_ b1 |
|_ b2 _| tokenHash
|_ signature block
|_ b3
...
So in order to verify the signature, the verifier needs to create a hash of N blocks, N being the index of the block containing the signature fact, minus 1.
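Here's a rough sketch of that verification step (std's DefaultHasher stands in for a real cryptographic hash such as SHA-256; all names are illustrative):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Hash the first n serialized blocks (authority, b1, ...), i.e. all
// blocks preceding the one that carries the signature fact.
fn token_hash(blocks: &[Vec<u8>], n: usize) -> u64 {
    let mut hasher = DefaultHasher::new();
    for block in &blocks[..n] {
        hasher.write(block);
    }
    hasher.finish()
}

// Rebuild the signed message:
// challenge | tokenHash | signerNonce | signerTimestamp
fn signed_message(challenge: &[u8], hash: u64, nonce: &[u8], timestamp: i64) -> Vec<u8> {
    let mut message = Vec::new();
    message.extend_from_slice(challenge);
    message.extend_from_slice(&hash.to_le_bytes());
    message.extend_from_slice(nonce);
    message.extend_from_slice(&timestamp.to_le_bytes());
    message
}
```

The verifier would then check the signature fact's byte array against this message with the declared pubkey and alg.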
And now we're wondering: what would be the best way to retrieve this block index?
do we need operations like these:
they would not be too hard to implement, but integrating them into the syntax might be complex
this will be useful to store arbitrary data in the tokens without relying on base64 and also to match on binary data like public keys. There's an example implementation in biscuit-auth/biscuit-rust#11
I'm using a hex: prefix when printing and parsing; it could be a good idea to add a base64: prefix.
the * character is used to indicate the head of a rule or caveat, but it has been very confusing for users, so I'd propose that we remove it.
Additionally, I'd like to remove the head from caveats, or make it empty by default. We'll have to choose a new syntax for caveats. Something like ? <- fact1("term")?
Last one: there's a constraint on rules to produce facts that contain at least one term, but I'm not sure that constraint really makes sense, so maybe we could remove it.
in a system with a large number of users or resources, loading all of the data in the verifier might be costly.
I am adding an optional "context" field to the blocks, that can be queried before verification, to give an id to look up or some filters for the data
Do you think it would make sense to support a version/instantiation of the tokens that exposes the same API & caveat language, but uses a Macaroon-like, symmetric-crypto-only construction?
Requirements
Rationale
Compared to pubkey biscuits, Macaroons provide very different tradeoffs between performance and security/applicability (all verifiers need access to the token-minting key). It could be quite useful to support a symmetric mode, for instance to support "caching" the validation (and expiration checking) of a pubkey-based credential by sending back to the user-agent a symmetric, time-limited version of it.
Having the same features and caveat language as the pubkey version supports this sort of translation between the two; in general, there should be as little difference as possible, from a user/developer's perspective, to limit cognitive overhead.
Lastly, there is a triple reason to encrypt those tokens:
In order to make implementations easier to verify, we should provide, along with the specification, a series of test cases with the expected results.
To test a token, we will need:
Here's a temporary list of test cases we should provide:
variable type: validation error
It's a reserved keyword in Java, which blocks the implementation if we want to be consistent in naming.
I went with "Number" for now.
The first block of a token is called the "authority block", and is used to define some rights and rules that will be used when validating other blocks. Here is the process:
Facts from the authority block (either directly or generated from rules) start with the #authority symbol to differentiate them from facts generated in the other blocks. This was useful in earlier versions, where data generated in one block's validation could be carried over to the next block: we wanted to avoid accidentally increasing rights by adding more authority facts for the validation in the next blocks.
Since the validation of each block is independent from the other ones, a block adding a fact can only affect its own validation, so I do not think the authority tag is still needed. The only issue I would see is in verifier queries that would try to list the facts we can access, but since it would get the facts from each block then return their intersection, a block adding more facts would have no effect. For queries that return the union of results (like revocation ids), we would need to be more careful. Maybe we should mark separately "filter queries" and "aggregation queries"?
This can simplify Biscuit significantly:
add_authority_* methods at token creation
could it happen that a block provides facts that match what a caveat's query must produce? Not sure it is possible with the way queries are implemented, but maybe this should be verified
we should have a clear definition of what is accepted or not in the regular expression constraints, as different engines will have different behaviour. As an example, we probably want to disallow backtracking, since a regex can be provided in an attenuated block (backtracking can lead to high CPU usage).
with biscuit-auth/biscuit-rust#3 I am adding a way to check authority facts from the verifier side, thus allowing tokens with only the authority part. But if we're going to support that use case, we might as well allow block level caveats in the authority block (example: for an expiration date on the entire token).
Current problem: the block format contains facts and rules, and as a convention, for the authority block, the rules member contains rules that generate authority facts, and in other blocks, that member contains caveats.
I'd propose adding another member, so blocks would contain facts, rules and caveats.
For the authority block, we would have:
For other blocks:
Hello, I work on some use cases using biscuit web tokens and I would like to use the sealed biscuit feature, but I have a problem with that: after calling the function used to seal the biscuit, I cannot print or serialize the result.
do you have any suggestions for this ?
Hello. Just found biscuit by watching https://docs.rs/releases and decided to review it as you use a few things I'm independently interested in: Ristretto, Macaroons, and Datalog.
Edit: I managed to skip over the "non-goal: revocation". Whoops. I'll leave my feedback though.
The delegation chains embed intermediate public keys in the token, this is sub-optimal for space. Implicit certificates & certificate-less public-key signature schemes suffice for one caveat; though I don't know if it is safe to generalize for multiple hops. I do know that pairings can solve this space problem, at the cost of time. I see you've been looking into pairings already; have you settled on the ristretto-friendly naive chaining?
Global synchronization of clocks is an impractical assumption in an asynchronous distributed system. Despite TLS' misplaced trust on global time, relying on globally synchronous clocks is poor practice. It would be suitable for the validator to ping the appropriate clock for validation.
Revocation may be trivial for macaroons where the single validator never has stale information; but revocation is difficult with distributed validators as they'd all need the up to date revocation state. Again the validator should ping the appropriate party for revocation information.
The two issues above strongly promote asking the normal symmetric mint/validation service to handle all of time, revocation, and validation. The root entity must still be trusted in the asymmetric variant. I do see value in third party validation when the root party is either offline or unable to handle the validation load.
One suitable (but weaker) approach for revocation would be to dedicate a stateful pub/sub service to store the revocation sets such that neither it, nor non-validating-users can see the revocation patterns. This service would store and relay a journal of revocation events, with optional bitmap snapshots. The revoking party can only append to the revocation set. The validators can only subscribe to this growing set and detect (and prove) any truncation attacks.
This is an interesting use of Datalog; though I've been thinking about using a DFA (Deterministic Finite Automaton; basically a regular expression). Only the parties adding caveats need the message schema (or subset they attenuate). The automaton may reference validator state such as dynamic sets (I.e. group memberships).
Additionally a DFA (or any join-semilattice) may be reduced such that several caveats flatten to one; stripping redundancies. This might accelerate validators and would be desirable for clients with many caveats. The reduction is both efficient and safe.
Example: caveats write(*, *) and {read("/foo"), write("/foo", *)} reduce to write("/foo", *), which matches write("/foo", "bar").
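The reduction in this example could be sketched over (operation, resource) patterns with "*" as a wildcard (a toy model ignoring write's extra argument, not a real DFA implementation):

```rust
// The meet of two patterns is their common specialization, if any.
fn meet<'a>(a: (&'a str, &'a str), b: (&'a str, &'a str)) -> Option<(&'a str, &'a str)> {
    fn part<'b>(x: &'b str, y: &'b str) -> Option<&'b str> {
        if x == "*" {
            Some(y)
        } else if y == "*" || x == y {
            Some(x)
        } else {
            None // incompatible, e.g. read vs write
        }
    }
    Some((part(a.0, b.0)?, part(a.1, b.1)?))
}

// Flatten two caveats into one by keeping all pairwise meets.
fn reduce<'a>(
    xs: &[(&'a str, &'a str)],
    ys: &[(&'a str, &'a str)],
) -> Vec<(&'a str, &'a str)> {
    let mut out = Vec::new();
    for &x in xs {
        for &y in ys {
            if let Some(m) = meet(x, y) {
                if !out.contains(&m) {
                    out.push(m);
                }
            }
        }
    }
    out
}
```

Reducing write(*, *) against {read("/foo"), write("/foo", *)} leaves only write("/foo", *), as in the example.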
DFAs are much more efficient (when they suffice) but are not as expressive as Datalog. I am not convinced Datalog's complexity brings anything sufficient to warrant its usage for caveats.
the current API uses functions to create facts and rules, but it does not make the Datalog code very readable.
Instead of writing this:
builder.add_authority_fact(&fact( "right", &[s("authority"), string("file1"), s("read")]));
it could be nicer to write this:
builder.add_authority_fact("right(#authority, \"file1\", #read)");
Additionally, we could provide a UI to edit rights with text, test things out from a page with the wasm version, etc.
The token printer (cf the samples) already provides a syntax, but we might want to modify it:
right(#authority, "file1", #read): the name is "right" and the values are inside the parens
123: integer
"abc": string
#abc: symbol (a string that is represented by its index in a table, this reduces token size and makes evaluation fast)
0?: variable, used in rules and queries
caveat1(0?) <- resource(#ambient, 0?) && right(#authority, 0?, #read): caveat1(0?) is the head, separated from the rest with the arrow
expiration(0?) <- time(#ambient, 0?) | 0? <= 1545264000
Another common Datalog syntax is the following:
parent(abc, def)
ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
?- ancestor(bill, X).
cf https://docs.racket-lang.org/datalog/datalog.html for a BNF grammar of Datalog
Usually, we speak in terms of lower/greater and not lower/larger
The logic language we're exploring in #11 can already be useful, but it might be better to have an easy-to-use API that covers most use cases, and then allows people with more specific needs to use the low level tools (the example is a bit rust-y but I'm also thinking of other languages).
Basic API:
// generate a root key pair
create_key() -> KeyPair
// generate root token
create_token(root_key: KeyPair, authority: [Facts], caveats: [Caveats]) -> Token
// derive a new token
derive_token(key: &KeyPair, caveats: &[Caveats]) -> Token
// verify
verify_token(root_public_key: &PublicKey, token: &Token, ambient: [Facts], query: Query) -> bool
Generating authority facts?
Generating a caveat?
Generating a query?
Please add a license to this repo and biscuit-rust/java. Thanks!
Hello guys,
Have you thought about an implementation in Node.js?
Regards
with @clementd-fretlink, we've been looking at a Datalog-like language to express caveats.
Here are some ideas that emerged:
Current questions:
To make it easier to reason about this language, I propose that we write some example facts, rules and queries in that issue.
First example:
authority=[right(file1, read), right(file2, read), right(file1, write)]
----------
caveat1 = resource(X) & operation(read) & right(X, read) // restrict to read operations
----------
caveat2 = resource(file1) // restrict to file1 resource
With resource(file1), operation(read) as ambient facts, caveat1 succeeds because resource(file1) & operation(read) & right(file1, read) is true, and caveat2 succeeds because the fact resource(file1) is present.
With resource(file1), operation(write), caveat1 fails but caveat2 succeeds.
With resource(file2), operation(read), caveat1 succeeds but caveat2 fails.
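Ignoring variable unification, the checks in this example reduce to membership tests over the union of authority and ambient facts; a minimal sketch (facts as plain strings, purely illustrative):

```rust
use std::collections::HashSet;

// A caveat succeeds when every fact it requires is present, either as
// an authority fact or as an ambient fact.
fn caveat_holds(required: &[&str], authority: &HashSet<&str>, ambient: &HashSet<&str>) -> bool {
    required
        .iter()
        .all(|fact| authority.contains(fact) || ambient.contains(fact))
}
```

With the authority set from the first example and caveat1 instantiated with X = file1, the check succeeds for a read and fails for a write, as described above.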
Right now, when generating a new token or a verifier, all of the facts and rules are entered manually in the code. We might want to load them from a file or memory instead. Since we already have protobuf definitions to transport them inside a token, I'd propose we reuse protobuf to store them
Right now the values used in constraints are statically defined in the rule or caveat, but it could be useful to make them more dynamic.
As an example, if we wanted to define that the owner of a folder has all rights on any file in that folder or its subfolders, we would be able to write the following rule (with variable names instead of numbers for readability):
right($path, $operation) <- user($user_id), owner($user_id, $folder), operation($operation),
resource($path) @ $path matches $folder*
With that rule, we can define a path prefix constraint using the folder defined in the owner fact.
With this we could also compare integer variables between facts, and I think there could be some applications with set constraints too.
This would be a breaking change in the binary format.
right now, we put some hardcoded limits on the number of iterations in a "world run", but it would be better to leave that decision to the user. We can get a token with some degenerate cases that would produce a lot of facts or run for a large number of iterations, so there should be a hard limit there.
A few options:
to ease migrations in case of breaking changes, I'd like to add a version field (uint32) to the format. I know that protobuf is designed for compatibility between versions, but I do not want to see a case where a new field gets ignored by an old Biscuit version, as that might create security issues.
One decision to make, do we put the version field:
Biscuit message that contains the serialized blocks and the signature (the version field would not be signed)

currently a caveat contains only one query, so it is difficult to model caveats using OR logic.
previously, it was possible to make intermediate rules for each case, and have a caveat that checks the rule's result, but with the recent block merge, any block could just generate the facts for those rules in advance.
This would change the format at https://github.com/CleverCloud/biscuit/blob/master/schema.proto#L28 to have each caveat contain a list of rules.
When executed, the caveat would try each rule until one of them succeeds, or return an error if all failed.
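A sketch of those semantics (illustrative types, not the actual schema change): the caveat succeeds as soon as one of its rules matches, and errors only when all of them fail:

```rust
// A caveat now carries a list of queries; running it tries each one in
// turn and succeeds on the first match (OR semantics).
struct Caveat {
    queries: Vec<Box<dyn Fn(&[&str]) -> bool>>,
}

fn check_caveat(caveat: &Caveat, facts: &[&str]) -> Result<(), &'static str> {
    if caveat.queries.iter().any(|query| query(facts)) {
        Ok(())
    } else {
        Err("all of the caveat's queries failed")
    }
}
```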
We have 4 different methods we can use:
We have to choose one, depending on a few criteria:
While the pairings solution is interesting, it is quite slow (cf the benchmark results at https://github.com/CleverCloud/biscuit/tree/master/code#benchmarks-summary ). Also, there's no guarantee of finding high quality crypto libraries to use in various languages. So I think we'll eliminate it right away.
The other solutions are fast enough, and based on the Ristretto group. There's a good implementation in Rust, curve25519-dalek, and a new one in Java, curve25519-elisabeth. Using this group reduces the risk of implementation errors.
The challenge token solution makes fast and small tokens, but its behaviour has an annoying tradeoff: when we want to verify the token, an additional operation must be performed that prevents further attenuation. This might not be what we want for the token.
The gamma signatures solution produces short signatures and is faster than most other schemes.
RFC 3339 might be simpler
This isn't an issue, but more a contribution. We have worked on a playground equivalent to jwt.io. We've seen it was on the roadmap, so hopefully it can help.
It is available at https://www.biscuitsec.org and the code is open source. In the readme we credited Clever Cloud, but please let us know if you want the wording or the display to change.
We also tried to vary the examples, and to explain through everyday life scenarios (compared to your main use case for Pulsar).
Anyway, thanks for the biscuit, we think it's a great project.
Fabien & Mohamed
verifier caveats should be applied even when there is only an authority block
Hello !
While checking the language format in the specification (https://github.com/CleverCloud/biscuit/blob/master/SPECIFICATIONS.md#logic-language) and the recent commits (0b015a6#diff-03a3756d2f8eaaf446a958595f62f98f), I'm unsure what the latest format to be used is, seeing variables defined as 0? or $0, a * prefix on caveats, a ! prefix on predicates, @ on conditions...
So I'm wondering: what is the most up to date version of the syntax I could follow?
Thanks!
Variables are defined with an index, and this does not make them very readable. It would help to have variable names instead.
We could reuse the symbol table to provide those: a rule can still be stored with variables containing numbers, but those numbers could be indexes in the symbol table. They would be converted between names and numbers when generating the blocks or printing the rules.
The following rule:
*caveat1($0, $1) <- right(#authority, $0, $1), resource(#ambient, $0), operation(#ambient, $1)
could be printed as:
*caveat1($file, $operation) <- right(#authority, $file, $operation), resource(#ambient, $file), operation(#ambient, $operation)
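Resolving those indexes for display could then be as simple as (an illustrative sketch, not the actual biscuit-rust code):

```rust
// A variable is serialized as an index into the symbol table; printing
// resolves it back to a name, falling back to the raw number if the
// index is unknown.
fn display_var(symbols: &[&str], index: usize) -> String {
    match symbols.get(index) {
        Some(name) => format!("${}", name),
        None => format!("${}", index),
    }
}
```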
Biscuit has been in development for two years and is now used in production. Most of the initial roadmap is done (we still need to commission an audit).
So it will soon be time for a stable release and more public communication. Before that, I'd prefer that we clean things up a bit, there are design decisions that were left alone because fixing them would be breaking changes, but a 1.0 release would be the right time to take care of them (here I consider a breaking change anything that would invalidate how currently existing tokens would be validated).
This will be a meta issue for the work needed towards the major release:
I'll make a branch of this repo with updated test samples once I've started that work on the Rust library.
Do you see anything else we would need?
cc @divarvel @daeMOn63 @titanous @Keruspe @KannarFr @BlackYoup @meh
initial feedback indicates the rights syntax is hard to understand, especially how adding or removing rights works. And regexps add complexity there.
it will soon be time to define the serialization format for biscuit tokens. To sum up the current ideas: