
emp-zk's People

Contributors

carlweng, fionser, gconeice, lgtm-com[bot], olea-digitalis, secantzhang, vingstar, wangxiao1254, weikengchen


emp-zk's Issues

Implementation's sensitivity to network latency

We know QuickSilver is almost constant-round:

  • The offline phase grows linearly with the statement size but is extremely slow, due to the iterated LPN.
  • The online phase takes three rounds.

The current implementation, however, seems to be quite sensitive to the network latency. In a statement that I am proving,

  • if the RTT = 4ms, the total time is 32s.
  • if the RTT = 40ms, the total time is 89s.

The total time seems largely independent of the network bandwidth. I also tried changing the TCP congestion control algorithm to HTCP with a 32MB TCP buffer, but there is basically no change.

After looking at the code, I am still unsure what is happening. Some early findings:

  • BoolIO flushes frequently, but removing those flushes does not improve the total time.
  • The code calls io->send_data for many small elements during input and multiplication, which is where most of the latency comes from.

Nevertheless, a potential optimization is to defer all input/multiplication communication until the reveal step, i.e., call io->send only three times during the online phase. This is complicated but might be generally useful: instead of hoping the network stack coalesces the data correctly, we make a few big IO calls ourselves.

A PR might be incoming.

Latency grows with the number of GGM trees, i.e., O(T)

This is a follow-up to #7 regarding the phenomenon that when the network latency goes up, the running time also goes up.

I used the following parameters adapted from Ferret:

const static int N_REG_Fp = 10608640;
const static int T_REG_Fp = 1295;
const static int K_REG_Fp = 589824;
const static int BIN_SZ_REG_Fp = 13;
const static int N_PRE_REG_Fp = 649728;
const static int T_PRE_REG_Fp = 1269;
const static int K_PRE_REG_Fp = 36288;
const static int BIN_SZ_PRE_REG_Fp = 9;
const static int N_PRE0_REG_Fp = 51200;
const static int T_PRE0_REG_Fp = 800;
const static int K_PRE0_REG_Fp = 5060;
const static int BIN_SZ_PRE0_REG_Fp = 6;

Network bandwidth 2Gbps with 20ms RTT. 32 threads.

I varied T_REG_Fp together with BIN_SZ_REG_Fp (keeping N fixed) and obtained the following results.

T = 1295,  BIN = 13: 17.3s
T = 2590,  BIN = 12: 21.9s
T = 5180,  BIN = 11: 28.6s
T = 10360, BIN = 10: 41.7s
T = 20720, BIN = 9:  68.0s

Note that this experiment is not well controlled, since communication and computation also increase with T. So I repeated it with a lower RTT of 4ms.

T = 1295,  BIN = 13: 10.8s
T = 2590,  BIN = 12: 11.8s
T = 5180,  BIN = 11: 13.1s
T = 10360, BIN = 10: 15.9s
T = 20720, BIN = 9:  21.5s

An important discovery is that, if we disable the malicious checks, the time goes down significantly.

T = 1295,  BIN = 13:  9.3s at RTT = 4ms, 10.6s at RTT = 40ms
T = 20720, BIN = 9:  10.0s at RTT = 4ms, 11.3s at RTT = 40ms

A PR will be pushed soon to fix this issue.

What happens is that, in SPCOT:

  • Alice is the sender of the GGM tree; Bob is the receiver of the GGM tree.
  • Alice is the receiver of the checking challenge; Bob is the sender of the checking challenge.

Since each tree's consistency check involves such a back-and-forth, each thread performs essentially O(T) rounds.

Static Library?

After building the library I see /usr/local/lib/libemp-zk.dylib (on macOS), but I don't see a static version built.
Is building one not supported or not recommended?

Compilation Error on Xcode

I can successfully install ot, zk, and tool on macOS, with all the tests passing.
However, the following simple program, which only references setup_zk_bool, does not compile in Xcode.

#include <emp-tool/emp-tool.h>
#include <emp-zk/emp-zk.h>

int main(int argc, const char * argv[]) {
    setup_zk_bool((BoolIO<NetIO>**)nullptr, 0, 0);
    return 0;
}

The errors are the following:

emp-tool/utils/aes_opt.h:12:15: Always_inline function '_mm_aesenclast_si128' requires target feature 'aes', but would be inlined into function 'ks_rounds' that is compiled without support for 'aes'

emp-tool/utils/aes_opt.h:79:13: Always_inline function '_mm_aesenc_si128' requires target feature 'aes', but would be inlined into function 'ParaEnc' that is compiled without support for 'aes'

emp-tool/utils/aes_opt.h:89:12: Always_inline function '_mm_aesenclast_si128' requires target feature 'aes', but would be inlined into function 'ParaEnc' that is compiled without support for 'aes'

Since all the tests pass using the same compiler on the same machine, I suspect the issue has to do with compiler configuration: Xcode may be telling the compiler to be strict and raise errors where it could just warn. But I don't know which flags are responsible.

A small Xcode project containing the program is available here.
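For what it's worth (an educated guess, not a confirmed fix): the error messages point at the 'aes' target feature being disabled, and the library's own CMake build passes AES-NI target flags that an Xcode project does not add by default. A command-line build with clang would look roughly like the following; in Xcode the equivalent would be adding the same flags to the target's "Other C++ Flags".

```shell
# Hypothetical invocation; paths and the exact flag set may differ.
clang++ -std=c++11 -maes -msse4.1 main.cpp \
    -I/usr/local/include -L/usr/local/lib \
    -lemp-zk -lemp-tool -o zk_test
```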

Working over 254-bit fields

Hi, we are considering using emp-zk for our project by modifying it to work over a 254-bit field, specifically BN254 or, ideally, BLS12-381.
Could you kindly offer your intuition as to what kind of CPU performance degradation we should expect compared to the current 61-bit field? Thanks.

Implementations of prime fields with some 2-arity

This issue serves as a note on EMP-ZK for a prime field that is not a Mersenne prime but has high two-adicity (i.e., 2^k divides p - 1 for a large k).

This is needed in some situations:

  • Integration with SPDZ using HE-based preprocessing, where the homomorphic encryption is based on ring-LWE and the plaintext field (where QuickSilver may integrate) has some two-adicity.
  • Other situations, probably involving ring-LWE/SIS problems.

The code in this secret gist (https://gist.github.com/weikengchen/ac6de8086f2144bd3677e2771aaf015c) is an example of a prime field:

p = 2^{62} - 2^{26} - 2^{25} + 1

So it has a high two-adicity of 25.

The code in this secret gist (https://gist.github.com/weikengchen/1d7dbe19706f1c229741933323d00a8a) is an example of a prime field:

p = 2^{62} - 2^{16} + 1

So it has a two-adicity of 16.

The second one is about 2.5x slower than the Mersenne prime. This may improve if we batch the reduction x >= p ? x - p : x with vector instructions, since that step appears to account for a large part of the overhead. Note also that these two gists do not apply some optimizations present in the original version, which may be worth porting over.

Efficient VOLE for one-shot zero-knowledge proofs

This issue is just to leave a note; it is mainly an engineering addition.

Currently, we generate the offline material in big batches of N, because the efficient LPN map K -> N is typically "big".
Therefore, even if the two parties are proving a very small statement, the one-shot time is not small.

There are many solutions to this:

  1. When computing the LPN map K -> N, we instead compute only K -> N' where N' < N. The limitation is that this does not fully use K; K could be smaller if one computed the parameters more carefully.

  2. Use the original OT extension.

Both might be worth looking into.
