Code Monkey home page Code Monkey logo

Comments (22)

Ankit01Gehlot avatar Ankit01Gehlot commented on May 28, 2024

@koraa can you please explain the requirements? I would like to work on this.

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

/attempt #21

Hello @koraa I'm interested in working on this but I have a few questions. As this issue is fairly old now has there been any major updates that would affect this issue? I can see there is a lot of ProVerif code covering secrecy and availability but does any of it cover (or overlap with) identity hiding?

Options

from rosenpass.

koraa avatar koraa commented on May 28, 2024

@Ankit01Gehlot @digital-phoenix Hi there! Sorry for not finding the time to elaborate on the goals of this issue before.

Thank you for approaching this.

As this issue is fairly old now has there been any major updates that would affect this issue?

I don't think so. The protocol is the same as a couple months ago; there is a version two (#136) I am working on, but I don't think it would affect identity hiding.

I can see there is a lot of ProVerif code covering secrecy and availability but does any of it cover (or overlap with) identity hiding?

I made no attempt to cover identity hiding in the current ProVerif model.

I think the task would be split in two major steps:

  1. Review cryptographic literature indicating which notion of identity hiding we should consider
  2. Come up with a simplified formulation of identity hiding suitable for the symbolic model
  3. Ensure that the proof terminates.

You might end up creating a third ProVerif process to accomplish this – the proof framework used in the noninterruptability and the secrecy/authenticity model will probably not work for identity hiding.

Please note, that this task is basically a simplified form of cryptographic (mathematical) proof.

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

Hi @koraa thanks for elaborating. In terms of academic literature are there any specific references you would recommend, or should I do an informal meta-analysis to discover the most discussed issues around identity hiding. Also should I approach this issue as an ongoing discussion or would it be better if I presented a proposal on how I believe this should be implemented?

from rosenpass.

koraa avatar koraa commented on May 28, 2024

Hi @koraa thanks for elaborating. In terms of academic literature are there any specific references you would recommend, or should I do an informal meta-analysis to discover the most discussed issues around identity hiding. Also should I approach this issue as an ongoing discussion or would it be better if I presented a proposal on how I believe this should be implemented?

If you feel very assured about what you are doing, you could come up with a proposal immediately, but I think a safer bet would be discussing here!

Personally, I would start by looking for literature on the definition of identity hiding; you will probably find some old paper (pre 2005) since this seems like a fairly basic property. I would guess that you will come upon some indistinguishably based definition with a security game with oracles akin to the old relations among security notions paper1.

I would expect a definition to provide decryption and encryption oracles; send the challenger two public keys, encrypt a cipher text of the challenger's choice with one of the keys (chosen at random) and ask the challenger to guess which of the keys this was. The associated probability clause would assert, that the difference in probability of a right guess vs a wrong guess approaches zero (50/50 chance).

You might also come across simulation based definitions; I know less about these.

I would also go ahead and look for papers on identity hiding specific to (1) symbolic analysis frameworks such as ProVerif and Tamarin (2) advanced proof assistants such as EasyCrypt, CryptoVerif or SSProof and (3) identity hiding analysis on related protocols such as WireGuard, Noise, Post-Quantum WireGuard or KEMTLS (4) finding a definition of identity hiding for KEMs2.

You could post other suitable sources here and then we could discuss!

Footnotes

  1. Bellare, Mihir, et al. "Relations among notions of security for public-key encryption schemes." Advances in Cryptology—CRYPTO'98: 18th Annual International Cryptology Conference Santa Barbara, California, USA August 23–27, 1998 Proceedings 18. Springer Berlin Heidelberg, 1998. https://link.springer.com/chapter/10.1007/BFb0055718

  2. Xagawa, Keita. "Anonymity of NIST PQC round 3 KEMs." Annual International Conference on the Theory and Applications of Cryptographic Techniques. Cham: Springer International Publishing, 2022. https://eprint.iacr.org/2021/1323

from rosenpass.

koraa avatar koraa commented on May 28, 2024

@digital-phoenix Note the follow up work to Xagawa's paper:

Maram, Varun, and Keita Xagawa. "Post-quantum anonymity of Kyber." IACR International Conference on Public-Key Cryptography. Cham: Springer Nature Switzerland, 2023. https://link.springer.com/chapter/10.1007/978-3-031-31368-4_1

The paper seems to show that Kyber provides anonymity; which is less relevant for us since we just use Kyber for ephemeral keys.

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@koraa so far I've found theses resources:

Lipp, Benjamin, Bruno Blanchet, and Karthikeyan Bhargavan. "A mechanised cryptographic proof of the WireGuard virtual private network protocol." 2019 IEEE European Symposium on Security and Privacy (EuroS&P). IEEE, 2019.

Brain, Martin, et al. "Verifying Classic McEliece: Examining the role of formal methods in post-quantum cryptography standardisation." Code-Based Cryptography Workshop. Cham: Springer Nature Switzerland, 2022.

Weninger, Andreas Johann. Privacy preserving authenticated Kkey exchange: Modelling, constructions, proofs and formal verification: Modellierung, Konstruktionen, Beweise und Verification. Diss. Wien, 2020.

Cheval, Vincent, and Bruno Blanchet. "Proving more observational equivalences with ProVerif." International conference on principles of security and trust. Berlin, Heidelberg: Springer Berlin Heidelberg, 2013.

Baelde, David, Alexandre Debant, and Stéphanie Delaune. "Proving Unlinkability using ProVerif through Desynchronised Bi-Processes." 2023 IEEE 36th Computer Security Foundations Symposium (CSF). IEEE, 2023.

The first two references are valuable examples of cryptographic proofs of protocols relating to rosenpass. While the next three are more focused around verifying privacy.

Also I'm a little bit confused by the biscuit decryption process. It appears that in order to decrypt a biscuit one needs: the biscuit_key, the nonce included with the biscuit, the static public key of the responder, the session id of the initiator and the session id of the responder. But the RespHello packet appears to include all the needed information except for the biscuit_key (which is global to the server) and the spkr. Does this mean that the only secret information one would need to decrypt the biscuit is the spkr or am I missing something?

from rosenpass.

koraa avatar koraa commented on May 28, 2024

The first two references are valuable examples of cryptographic proofs of protocols relating to rosenpass. While the next three are more focused around verifying privacy.

Splendid!

Also I'm a little bit confused by the biscuit decryption process. It appears that in order to decrypt a biscuit one needs: the biscuit_key, the nonce included with the biscuit, the static public key of the responder, the session id of the initiator and the session id of the responder. But the RespHello packet appears to include all the needed information except for the biscuit_key (which is global to the server) and the spkr. Does this mean that the only secret information one would need to decrypt the biscuit is the spkr or am I missing something?

To the responder, spkr is also global to the server. The responder issues the biscuit, the responder also decrypts it.

The only secret information needed to decrypt the biscuit is the responder-global biscuit_key; this is by design. The responder uses this to outsource their session storage into the RespHello/InitConf packages.

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

I missed the part where the responder both encrypts and decrypts the biscuit. I was focusing on the biscuit because it contains the pidi which could be used to identify the initiator. Since the biscuit requires the biscuit_key and that key is never transmitted (from what I can tell) an attacker would be unlikely to identify someone through the biscuit.

So far my understanding is that the only message types associated with a users identity are the pidiC, the biscuit and to a lesser degree the session ids. As the biscuit is used for availability then the Proverif availability code should also prove the security of the biscuit. By extension it should also prove that an attacker cannot learn the identity of an initiator through the biscuit.

If my understanding is correct, if an attacker was able to decrypt the pidiC then they would be able to hijack the conversation. Since there appears to be Proverif code showing the attacker cannot hijack the conversation this implies an attacker cannot learn the identity of an initiator through the pidiC.

This would leave the issue of an attacker being able to monitor a session through a session id and identifying a user through their interactions. Learning a users identity through this approach would be equivalent to the privacy risks around de-idenified data. Worth noting the length of a session may determine how big of a risk this is.

Let me know if I've missed anything.

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@koraa feel free to let me know if my last post was completely off the mark.

from rosenpass.

koraa avatar koraa commented on May 28, 2024

@digital-phoenix

I missed the part where the responder both encrypts and decrypts the biscuit. I was focusing on the biscuit because it contains the pidi which could be used to identify the initiator. Since the biscuit requires the biscuit_key and that key is never transmitted (from what I can tell) an attacker would be unlikely to identify someone through the biscuit.

Exactly.

So far my understanding is that the only message types associated with a users identity are the pidiC, the biscuit and to a lesser degree the session ids.

I think it would probably help to be able to come up with a nice definition of identity hiding.
I would probably define it as: The attackers ability to learn spki or spkr, assuming they have a short (two element) list of candidates. In that way, asking for data fragments that "contain" the identity is too weak. You also have to account for known public-key attacks.

From what I see:

  • sctr is linked to spkr (hence the reliance on anonymity of the SKEM)
  • scti, is linked to spki
  • ck is also linked to both, eventually (this could be important for brute force attacks – if the attacker learns ck, they might be able to do something)
  • As you say, biscuit is linked to both through being linked to ck
  • auth is linked to ct and thus auth is also linked to both scti and sctr

As far as I can see, sidi and sidr are linked to the public keys through no other way than being part of the handshake messages.
Both are randomly generated.

As the biscuit is used for availability then the Proverif availability code should also prove the security of the biscuit. By extension it should also prove that an attacker cannot learn the identity of an initiator through the biscuit.

Yes, I think you could put it that way. The code already does that. Right?

If my understanding is correct, if an attacker was able to decrypt the pidiC then they would be able to hijack the conversation. Since there appears to be Proverif code showing the attacker cannot hijack the conversation this implies an attacker cannot learn the identity of an initiator through the pidiC.

Why would they be able to hijack the conversation? Being able to decrypt pidiC would just break anonymity.
Decrypting pidiC is one example of a scenario that could lead to loss of anonymity; or goal is to cover all possible examples.

This would leave the issue of an attacker being able to monitor a session through a session id and identifying a user through their interactions.

Which interactions? In the context of the proverif model the user does not interact beyond a key exchange; of course in the real world they do and with sufficient knowledge of the usage patter during data transmission, participants could be identified. But this is out of scope for this analysis.

Learning a users identity through this approach would be equivalent to the privacy risks around de-idenified data. Worth noting the length of a session may determine how big of a risk this is.

Agreed.

I think you analysis of the protocol is incomplete, but quite astute. It is good to be able to reason about a protocol in your mind before attempting a formal approach.
The next step would be to formulate a anonymity notion using ProVerifs query language.

Thank you for working on this, and sorry for my delay in responding. I tried to take a bit of a break since I was overworked.

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@koraa Thanks for the response it helped clarify many aspects of the protocol.

Like your suggestion my initial definition for identity hiding would be through a kind of game/challenge to an attacker. Take a network where participants are assigned either the role of initiator or responder. Then over a period of time allow initiators to start sessions and communicate with responders. While this is happening give an attacker the ability to: fully monitor the network, intercept and modify messages, initiate sessions with responders, and force specific initiators to start sessions with the attacker. Provide the attacker an oracle that provides a random list containing all public keys of the participants. Then identity hiding is broken for any participants the attacker is able to match a public key with.

Notably as is the case for Wireguard the mac variable breaks identity hiding for all responders. As a result I suggest we focus on proving identity hiding for initiators.

Please let me know how you feel about this first attempt at a definition for identity hiding and if it needs any editing.

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@koraa Do you think it is possible an attacker could initiate a handshake with some responder and examine the biscuit data? Then could the attacker use the fact that they know all the data used to encrypt the biscuit except for the biscuit_key and the fact they could recognize the peer id the biscuit encrypts to brute force the biscuit_key? Then could the attacker use the biscuit_key and a list of public keys to retrieve the peer id of other initiators that recently started a session with the same responder?

from rosenpass.

koraa avatar koraa commented on May 28, 2024

@koraa Thanks for the response it helped clarify many aspects of the protocol.

Like your suggestion my initial definition for identity hiding would be through a kind of game/challenge to an attacker. Take a network where participants are assigned either the role of initiator or responder. Then over a period of time allow initiators to start sessions and communicate with responders. While this is happening give an attacker the ability to: fully monitor the network, intercept and modify messages, initiate sessions with responders, and force specific initiators to start sessions with the attacker. Provide the attacker an oracle that provides a random list containing all public keys of the participants. Then identity hiding is broken for any participants the attacker is able to match a public key with.

This is nice; it matches the Dolev-Yao model, which is what CryptoVerif is modeled after.

Notably as is the case for Wireguard the mac variable breaks identity hiding for all responders. As a result I suggest we focus on proving identity hiding for initiators.

Yup; maybe we should fix that actually. We can just use a fixed MAC label instead. I added a TODO to #136
I would rather make sure:

  1. That the ProVerif model clearly shows that the MAC prevents identity hiding
  2. That the ProVerif model indicates whether there are any other attacks, if the MAC does not expose the responder identity

Please let me know how you feel about this first attempt at a definition for identity hiding and if it needs any editing.

I think you got it for the most part :)

Do you think it is possible an attacker could initiate a handshake with some responder and examine the biscuit data? Then could the attacker use the fact that they know all the data used to encrypt the biscuit except for the biscuit_key and the fact they could recognize the peer id the biscuit encrypts to brute force the biscuit_key?

That would not fly; the they would have to brute force $2^{256}$ possible states…the biscuit_key is secure enough.

However, looking at the biscuit definition there is a way to find the responder identity: By the AEAD definition in proverif, the ad (additional data) field is not secret. This is somewhat synthetic, but not entirely: Our AEAD uses the poly1305 polynomial MAC which provides weakened security.

let ad = lhash("biscuit additional data", spkr, sidi, sidr);
let ct = XAEAD::enc(k, n, pt, ad);

By the definition of the AEAD in ProVerif, the attacker could:

  1. Retrieve the additional data, getting access to the ad field
  2. The ad field is a hash protected by proper preimage resistance, but the attacker knows sidi and sidr, so they could enumerate a list of peer public keys using the ad value

There are a couple was to fix this, here are two ways to protect ad using K.

let k2 = lhash("biscuit additional data", k, spkr, sidi, sidr);
let ct = XAEAD::enc(k2, n, pt, 0);
let ad = lhash("biscuit additional data", k, spkr, sidi, sidr);
let k2 = lhash("biscuit additional data", k);
let ct = XAEAD::enc(k2, n, pt, ad);

However, I don't think this is even needed. There is no good reason why we should add spkr to ad in the first place: The responder knows they are the only ones who know the biscuit key, so in that way spkr is already tied to the biscuit key. Removing the spkr just yields the following code, which is slightly faster and simpler:

let ct = XAEAD::enc(k, n, pt, sidi || sidr);

Could you make sure your analysis covers that too?

Let me say thank you; I was surprised anyone picked up this issue and there are already actual findings.

This is amazing work! How do you feel about writing a science paper with your analysis results and how to fix Rosenpass to provide it?

Please stick to this issue; if you feel like the bounty is not high enough, ask me to raise it!

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@koraa thanks for the very kind feedback. I'll gladly write a scientific paper although I'll need your help to make sure it is as accurate as possible.

In terms of the Proverif analysis is the mac currently included in the model? For the parts that are going to be updated such as: the mac and the ad field vulnerability. Do you want me to change the way the model works in Proverif or just show the flaws in the current model using Proverif?

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@koraa also it would help a lot if you could increase the bounty as writing the paper and doing further analysis will take a lot of work.

from rosenpass.

koraa avatar koraa commented on May 28, 2024

@koraa thanks for the very kind feedback. I'll gladly write a scientific paper although I'll need your help to make sure it is as accurate as possible.

Good! We will do it together! It will either be a paper analyzing privacy hiding in RP specifically, or a paper introducing the symbolic analysis of rosenpass in general. You would write a first draft and I would add my extra information after.

In terms of the Proverif anralysis is the mac currently included in the model? For the parts that are going to be updated such as: the mac and the ad field vulnerability. Do you want me to change the way the model works in Proverif or just show the flaws in the current model using Proverif?

The goal generally is to represent the protocol as it is implemented in the src/ folder and as it is specified in the whitepaper. I am already working on a protocol version that addresses the findings from this issue (#136). Updating the protocol in the analysis will be easy once we have a security queries for identity hiding.

@koraa also it would help a lot if you could increase the bounty as writing the paper and doing further analysis will take a lot of work.

OK. I think we should also start doing better project management on this; having you work on one giant issue in an all-or-nothing fashion puts a lot of risk on your end and that is not fair to you.

We should have a call to discuss your situation and what we can do to support your work on this; I would like to specifically talk about how you are handling taxes on your end because when I started freelancing, I did not handle that well and ended up owing the tax authorities a bunch of money.

Could you please join the Rosenpass developer chat and divide this project into multiple tasks? If we choose this option we can issue per-task bounties, so you get paid in smaller increments to reduce your risks.

@AliceOrunitia could you take care of handling this and being a point of contact for @digital-phoenix ?

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@koraa Sorry for taking a while to get back to you I got side tracked by another project. I can definitely schedule a call for sometime next week. How do I join the Rosenpass developer chat?

from rosenpass.

koraa avatar koraa commented on May 28, 2024

@koraa Sorry for taking a while to get back to you I got side tracked by another project. I can definitely schedule a call for sometime next week. How do I join the Rosenpass developer chat?

No problem; thanks for sticking to this. The developer chat can be found here:

https://matrix.to/#/#rosenpass-dev:matrix.org

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@koraa I joined the developer chat. My display name is James Brownlee but I'm not entirely sure how to use matrix chat.

from rosenpass.

AliceOrunitia avatar AliceOrunitia commented on May 28, 2024

@digital-phoenix Hi, I sent you a message on Matrix, but I suspect it's wise to copypaste it here:

Hi!

Are you available for a call with Karolin to discuss this issue on Tuesday 21st November at 14:10 until 15:00 CET? I'd set up calendar invites, so would need an email to ping that to. We'd likely use our BBB video service, unless you have other preferences.

Otherwise, we can look at other dates or the discussion can simply continue on the issue page.

Kind regards,
Alice Bowman

from rosenpass.

digital-phoenix avatar digital-phoenix commented on May 28, 2024

@kora I pushed some changes that use weaksecret to show that spkr and spki are vulnerable due to the mac variable. The code still needs a lot of cleanup but you can take a look at it here https://github.com/digital-phoenix/rosenpass

from rosenpass.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.