Code Monkey home page Code Monkey logo

Comments (21)

fletcher avatar fletcher commented on May 20, 2024

First -- thanks for submitting! This is a niche project, and I haven't really promoted it in any way. But still, it's been quiet!

And thanks for pointing out the flaw!

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

@bernardpaulus -- Reading through code again, and wondering if I am missing something. Do you subtract 1 from the prime twice? once before calling nonzero_random() and once inside it? Is this on purpose?

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

I guess it seems from Wikipedia that this is on purpose. So it seems that my initial code was incorrect in two ways, that partially cancelled each other out. By not adding 1, I allowed a coefficient to be 0. By not subtracting 1, when I fixed the first problem I would introduce the problem of allowing a coefficient to be equal to the prime, rather than 1 less than the prime.

If my understanding is incorrect, please let me know. Otherwise, ignore. ;)

Thanks!

from c-sss.

bernardpaulus avatar bernardpaulus commented on May 20, 2024

Hello Fletcher!

Not too fast! I actually modified the wikipedia article to gain more visibility on this and quicker feedback and I'm regularly checking it for changes.

Now you said you looked up wikipedia to check my change, I realized I was wrong and reverted the edit.

=> more details follow

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

In reading from Shamir's 1979 paper (copied from OCR'd image, so forgive "typos"):

The coefficients a~..... ak_~ in q(x) are randomly chosen from a uniform distribution over the integers in [0, p), and the values D~..... Dn are computed modulo p.

The key is "[0,p)", which would imply that the coefficients can be 0, but cannot be equal to the prime.

Correct?

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

It looks as though some of this has been discussed here:

https://crypto.stackexchange.com/questions/29945/security-guarantees-of-shamirs-secret-sharing-when-some-co-efficients-are-zero

from c-sss.

bernardpaulus avatar bernardpaulus commented on May 20, 2024

So, yes, the idea was to stop generating random numbers in the interval [0, 255], but instead generate them on [1, 256]

To be more clear about the change, we moved from:

coef[i] = rand() % 256; // [0, 255]

to

coef[i] = rand() % 255 + 1; // [1, 255]

the same applies for arc4random_uniform.

I think this is required to avoid weak polynomials. Indeed, if you take wikipedia's example:

(a0 = 1234 ; a1 = 166; a2 = 94)

And, if, by chance a2 is 0 instead, the above is equivalent to:

splitting phase

(a0 = 1234 ; a1 = 166 ; a2  = 0 )
f(x) = 1234 + 166 x
Ds = [(1, 1400), (2, 1566), (3, 1732), (4, 1898), (5, 2064), (6, 2230)]

recovery phase

now, only two shares, say (x0 = 1, y0 =1400) and (x1 = 2, y1 = 1566), are sufficient to recover the secret:

Lagrange interpolation with two datapoints, check the wikipedia article for the formulas.

 l0(x) = (x - x1) / (x0 - x1) = (x - 2) / (1 - 2) = -x + 2
 l1(x) = (x - x0) / (x1 - x0) = (x - 1) / (2 - 1) = x - 1
 L(x) = y0 * l0(x) + y1 * l1(x) = 1400 * ( -x + 2 ) + 1566 ( x - 1 )
        = -1400x + 2800 + 1566 x - 1566 = 166x + 1234

we recovered our polynomial with only two shares, when 3 are supposed to be required.
Let's recover the secret:

L(0) = 0x + 1234 = 1234

It works similarly if a1 is 0 instead of a2 in the example. Also, in the general case, each factor equal to 0 reduces by one the number of shares required to find out the secret. This is a matter of degrees of freedom of the polynomial.

Caveat: I don't have enough time to transpose this to finite field arithmetic right now.

possible attack

Given an attacker that attempts to recover the secret (= a password)
with k - 1 shares instead of the k required
and who as access to an oracle (= a login form) to verify if his guess is true,
the attacker therefore has a non-negligible probability (= better than brute force) of recovering the secret (= being able to log in)
by just attempting a recovery with his k - 1 shares.

Given:

  • (p = 257), the size of the finite field,

  • (k = 3) the required number of shares

  • and uniformly distributed coefficients in [0, p-1] = [0, 256]
    The probability of him recovering the password by attempting a recovery with k - 1 keys is thus:

    P(p, k) = 1 - ((p-1) / p) ** (k-1) // NOT(the probability that no coefficient is 0)

For the above parameters, the attacker has the following probability to recover the shares with only two shares:

 P(257, 3) = 0.008 // rounded up, 0.8%

Which is better than brute-forcing.

To prevent anyone to have a non-negligible chance of recovering the secret with less than the k required shares, we have to ensure the underlying polynomial has the required degrees of freedom, e.g. that no coefficient is equal to 0.

from c-sss.

bernardpaulus avatar bernardpaulus commented on May 20, 2024

Cryptostatis' answer is in line with the above (just upvoted it)

from c-sss.

bernardpaulus avatar bernardpaulus commented on May 20, 2024

Hum... the simplest would be trying it out with the code.

Will do that tonight :)

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

I am not a mathematician or cryptologist (just a physician with a college background in engineering.)

But, if this is truly a flaw in Shamir's scheme, wouldn't there be published criticism discussing this somewhere? There may be, but I haven't found it yet. My search is limited to google, and not very refined since I am not particularly experienced in focused searches of the literature in this field. I have found other descriptions of weaknesses, but these seem to refer to limitations of how this can be used, not actually weaknesses in the methodology or strength of "encryption." They all seem to involve dishonest shareholders, not outsiders with no information about the shares.

EDIT: For completeness, I'll add that it's possible I am misunderstanding something in the paper.

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

Another page with a purported flaw in Shamir's scheme that seems to not be true (based on comments):

https://lardbucket.org/blog/archives/2007/10/30/a-flaw-in-shamirs-secret-sharing-method/

I don't claim to understand finite fields well enough to understand all the implications of them on Shamir's scheme, but I think that is an important part of this from what I'm reading.

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

As a first test, I tweaked the code to always generate 0 for the first coefficient, and random numbers for the rest. It still requires the specified number of shares to reconstruct the key in my tests.

Conversely, when I set all of the coefficients to 0, then it only requires a single share (they are all identical) to recover the key. When I set all coefficients to 1, it again requires all the shares to recover the key.

I'm not saying this, by itself, disproves your theory. Just trying to add data.

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

Here's a challenge.... ;)

I encrypted a short phrase using a modified version of the code. It is set to require 4 of 4 shares to reconstruct the secret. I modified the source to force one of the coefficients to be 0 (I won't tell you which one....).

I also included a similar example (different phrase) using the regular non-zero coefficients. You have a 50/50 chance of guessing which is which.

Here are 3 of the 4 required shares - example A:

0104AA312146142D8BB6
0204AA6B95CDE66B1170
0304AAA84BBE717861D0

And, example B:

0104AA3FA2C366FEECD4
0204AA0DA2E673BF0535
0304AAAB3DFD0884E081

Of course, the object is to determine the phrase, showing your work to prove it's not just using psychology to determine what i might have picked as phrases.

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

In reading through your comment and trying to understand it, I have a question/comment.

I understand your comments through the splitting phase.

But it seems that you make an assumption in your recovery phase, namely you assume that the x^3 coefficient is 0, but you only know that because you set it to 0 in advance. Therefore you use the formulas to solve for a 2nd degree polynomial, and are given two data points. Therefore, of course you found the right answer. But you only know this because you are both the dealer and attacker.

What would happen if you (as attacker), appropriately used the equations for a 3rd degree polynomial with the same two data points? Would you still get an answer? Or instead, as I suspect, would you get a family of equally likely answers?

Would the same thing not also be true if the attacker chose to fix the largest coefficient to another set value, e.g. 5? If 5 happened to be correct, then they could solve with a smaller number of keys.

So, if an attacker accurately guesses the first coefficient, then they could solve the problem with one fewer key than nominally required. But this would seem to be the same as a brute force attack, no? The attacker would have no way of knowing whether the guess about the first coefficient was correct.

Am I understanding this correctly? If not, what am I missing?

Thanks!!

from c-sss.

bernardpaulus avatar bernardpaulus commented on May 20, 2024

Hello,

A quick answer, because I didn't had the time and won't have more time this weekend:

  • indeed, the attack above only works if, by chance the coefficient of the term of the highest degree is 0
  • the more general attack requires the attacker to try out each of the different coefficients, resulting in (k - 1) possible passwords. This requires another approach to recover the secret, based on the equation system instead of the lagrange polynomial.

Coming back to you on that next week :)

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

@bernardpaulus The suspense is killing me! ;)

from c-sss.

ferrerrajc avatar ferrerrajc commented on May 20, 2024

Hi there.

It is my understanding that if your coefficients can be zero, then the security of the algorithm is not reduced. Even if you know k - 1 points on the curve, all values for s are equally likely. As a matter of fact, if coefficients are not allowed to be zero, then some values of s are not possible. Take, for example, p = 3 and k = 2. With the point (1, 1) and the knowledge that a1 is nonzero, it must be that s is 0 or 2. Otherwise, 1 = 1 + a1 which implies a1 = 0. This is exaggerated with such a small prime, but you get the picture.

However, if it is the case that some coefficients will be zero (or any other known value), then the security of the algorithm is crippled. An attacker needs to know fewer shares to guess the secret. If there's one known coefficient value (zero or otherwise), then once the attacker has k - 1 shares, they only need to solve some k - 1 linear systems of equations to get as many possible secrets.

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

@ferrerrajc That is my understanding as well. In this case, 0 is equally as likely as any other coefficient (except for the intentionally crippled examples I posted.)

from c-sss.

bernardpaulus avatar bernardpaulus commented on May 20, 2024

Hi,

Sorry for not getting back at you earlier! @ferrerrajc the likeliness argument is very interesting, but could you specify the size of the finite field in your example? (P in wikipedia's notation)

Cheers :)

from c-sss.

ferrerrajc avatar ferrerrajc commented on May 20, 2024

Hey @bernardpaulus,

In my example, I used p = 3 for my prime. So the secret can only take one of the values 0, 1, or 2.
I wanted a very small prime and polynomial to demonstrate the effect of limiting coefficient values.

from c-sss.

fletcher avatar fletcher commented on May 20, 2024

I pushed a change to develop that restores the full range for the random numbers (e.g. 0 to 256, with prime of 257).

Based on everything I have been able to read, I believe this is correct and the range stated in Shamir's paper.

If I have misrepresented the original paper, then I definitely want to know and fix it.

If there is widely accepted proof that Shamir was wrong, and 0 should not be allowed, I am happy to change that as well.

Otherwise, individuals can modify their own version to remove 0 at the risk of actually decreasing the security of their implementation slightly (by taking the number of possibilities from 257 to 256).

from c-sss.

Related Issues (9)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.