
sok's People

Contributors

alexanderviand · dependabot[bot] · pjattke

sok's Issues

Improve plotting

  • Rename the current plotting folder to "paper_plots"
  • Add a second set of large, colorful plots designed to explore the data rather than to be printed

Both sets of plots should be generated by the Github Runner.
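A minimal sketch of how one script could emit both sets, assuming matplotlib; the folder names, rc settings, and the plot_results(ax, df) helper are illustrative placeholders, not the existing plotting code.

# Sketch: render each figure twice, once with compact print-friendly settings
# ("paper_plots") and once with large, colorful settings for data exploration.
# plot_results(ax, df) is a hypothetical helper that draws one benchmark figure.
import os
import matplotlib
matplotlib.use("Agg")  # headless, e.g. on the Github Runner
import matplotlib.pyplot as plt

STYLES = {
    "paper_plots":   {"figure.figsize": (4, 3), "font.size": 8},
    "explore_plots": {"figure.figsize": (16, 9), "font.size": 14,
                      "axes.prop_cycle": plt.cycler(color=plt.cm.tab20.colors)},
}

def render_all(df, plot_results):
    for folder, rc in STYLES.items():
        os.makedirs(folder, exist_ok=True)
        with plt.rc_context(rc):
            fig, ax = plt.subplots()
            plot_results(ax, df)
            fig.savefig(os.path.join(folder, "runtime.png"), bbox_inches="tight")
            plt.close(fig)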

Design & implement Benchmarks for libraries

Benchmarking the libraries against each other:

Micro-benchmarks

  • Ctxt-Ctxt Multiplication time (incl. relin)
  • Ctxt-Ptxt Multiplication time (incl. relin)
  • Ctxt-Ctxt Addition time
  • Ctxt-Ptxt Addition time
  • Sk Encryption time
  • Pk Encryption Time
  • Decryption time
  • Rotation (native, i.e. single-key) [Only where applicable]
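A sketch of a generic timing harness for the micro-benchmarks above; the benchmark names mirror the list, while the per-operation callables are hypothetical placeholders that would wrap the respective library's calls.

# Sketch: wall-clock timing harness for the micro-benchmarks listed above.
# Each entry maps a benchmark name to a zero-argument callable performing one
# operation; the callables are tool-specific and only stubbed out here.
import statistics
import time

def time_op(op, runs=100, warmup=10):
    for _ in range(warmup):
        op()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        op()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

# Hypothetical per-tool setup; for a SEAL wrapper, for example, these would
# call into the library's encrypt/multiply/relinearize functions.
micro_benchmarks = {
    "ctxt_ctxt_mul_incl_relin": lambda: None,
    "ctxt_ptxt_mul_incl_relin": lambda: None,
    "ctxt_ctxt_add":            lambda: None,
    "ctxt_ptxt_add":            lambda: None,
    "sk_encryption":            lambda: None,
    "pk_encryption":            lambda: None,
    "decryption":               lambda: None,
    "rotation_single_key":      lambda: None,  # only where applicable
}

for name, op in micro_benchmarks.items():
    mean_s, std_s = time_op(op)
    print(f"{name},{mean_s:.9f},{std_s:.9f}")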

Libraries to evaluate

BFV/BGV

CKKS

CGGI

  • TFHE
  • Palisade [FHEW] (see abcd2ed)
  • Concrete

Parameter settings to evaluate

  • Plaintext space: 1-bit (binary) or 64-bit (t = 2^64)
  • For levelled schemes: n = 16k, 128-bit security; choose q such that the parameters support either 1 level or 10 levels

We should fix two plaintext space sizes, e.g. 1-bit (binary) and 64-bit, and use those for all benchmarks, just so that we don't blow up the space of possibilities too much.
For levelled schemes, we should evaluate with 1-mult params and with e.g. 10 levels.

For binary-only libraries like TFHE, the 64-bit version then uses an adder/multiplier circuit. In that case, Ctxt-Ptxt optimizations are possible but not implemented, which is fine, since this isn't the core of the paper anyway.
For fair comparison among implementations of the same scheme, we should hardcode the parameters to be equal everywhere.
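To make "hardcode the parameters everywhere" concrete, a sketch of a single shared parameter table that every implementation of the same scheme would read from; the modulus sizes below are placeholders, not vetted 128-bit-secure choices.

# Sketch: one shared, hardcoded parameter table so that all implementations of
# the same scheme run with identical settings. The coeff_modulus bit counts
# are illustrative placeholders, NOT vetted parameter choices.
FIXED_PARAMS = {
    "bfv_binary_1level": {
        "poly_modulus_degree": 16384,      # n = 16k
        "plain_modulus_bits": 1,           # binary plaintext space
        "coeff_modulus_bits": [60, 60],    # placeholder: enough for 1 mult
        "security_bits": 128,
    },
    "bfv_64bit_10level": {
        "poly_modulus_degree": 16384,
        "plain_modulus_bits": 64,
        "coeff_modulus_bits": [60] * 11,   # placeholder: ~10 levels
        "security_bits": 128,
    },
}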

Finally, we could consider implementing the benchmark applications (NN, cardio, Chi-Squared) for each library, but this is out of scope for the current timeframe.

Generate metadata file as output in Cardio SEAL-BFV & Cingulata

Add a metadata file (text file) that describes the environment and settings used for the benchmark run.
Some of this information can only be gathered inside the container, while the rest must be collected on the EC2 VM, hence the split below.

In-Container Information

This information is only available in the container, and we need a script for each combination of (benchmark program, tool).

  • FHE scheme parameters p, q, n
    • see print_parameters(...) in examples.h
  • Ciphertext size (inputs + result)
  • Key size
  • Number of operations (additions, multiplications, relinearizations)
    • this and this could be helpful functions for Cingulata

System / EC2 VM Information

This information is not available in the Docker container and must be gathered on the EC2 VM.
The script collecting this information is usable for all benchmark programs.
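A sketch of such a VM-side collection script, assuming a Linux host with git available; the field names and output path are illustrative.

# Sketch: gather host-level metadata on the EC2 VM and write it next to the
# benchmark results. Field names and the output path are illustrative.
import datetime
import json
import platform
import subprocess

def collect_host_metadata():
    meta = {
        "timestamp": datetime.datetime.utcnow().strftime("%Y%m%d_%H%M%S"),
        "hostname": platform.node(),
        "kernel": platform.release(),
        "machine": platform.machine(),
    }
    try:
        meta["git_sha"] = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"], text=True).strip()
    except (OSError, subprocess.CalledProcessError):
        meta["git_sha"] = "unknown"
    with open("/proc/cpuinfo") as f:
        meta["cpu_model"] = next(
            (line.split(":", 1)[1].strip() for line in f
             if line.startswith("model name")), "unknown")
    return meta

if __name__ == "__main__":
    with open("metadata_host.json", "w") as out:
        json.dump(collect_host_metadata(), out, indent=2)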

@AlexanderViand What do you think about adding the git commit SHA to the S3 folder name? Would this break something related to the visualization website?

Note to self - this would require the following change to line 43 of benchmark.yml:

$(echo $(date +'%Y%m%d_%H%M%S')_$(git rev-parse --short "$GITHUB_SHA"))

Cingulata compilation

  • Describe the Cingulata compilation in the Wiki.
  • Inform Alexander so he can make required changes to the Eval Docker image.

nGraph-HE build not working

See the reported issue. If no response is received by July 30, send a mail to Fabian Boemer ([email protected]). Put Rosario Cammarota ([email protected]) and Casimir Wierzynski ([email protected]) in CC?

Mail snippets:

We are currently working on a survey of FHE compilers & optimisation techniques for an SoK-style paper on the topic ...

We're planning to benchmark and compare as many tools & techniques as possible, including some apples-to-oranges comparison between tools with very different focuses. We want to explore what types of applications benefit most from which kinds of optimisations, aiding developers in selecting tools based on their application needs, and maybe identify opportunities for synergies between tools.

We would love to include in our evaluation, and .... *however,

<we would be delighted to hear ...>

Benchmarking Workflow

Some aspects for improving the current SoK benchmarking setup:

  • Use a dedicated docker-entrypoint.sh script for each tool
    • Missing for SEALion
  • Implement support for running the same program with different configurations (e.g., varying input size)
    • The goal should be to avoid code duplication by having each program's code only once in the repository
  • Run each program (not tool!) in a separate container to speed up benchmark execution
    • Maybe it makes sense to dynamically generate the docker-entrypoint.sh scripts from a template (one per program) instead of maintaining a static script per {tool, program} configuration (see the sketch after this list).
    • As this adds complexity to the Github Action, it may make more sense to write a Python script that is then executed by the Github Actions runner (instead of writing the logic directly in the action).
  • Use the AWS CLI to send SSM commands to the VMs
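A sketch of the template-based generation mentioned above, assuming one hypothetical <program>.sh.tmpl per program with $tool and $program placeholders; the tool/program lists and paths are made up.

# Sketch: generate docker-entrypoint.sh scripts from one template per program
# instead of keeping a static script per {tool, program} pair. The template
# paths, placeholder names, and the tool/program lists are hypothetical.
from pathlib import Path
from string import Template

TOOLS = ["Cingulata", "SEAL-BFV", "SEALion"]
PROGRAMS = ["cardio", "chi_squared", "nn"]

def generate_entrypoints(template_dir="templates", out_dir="entrypoints"):
    for program in PROGRAMS:
        template = Template(Path(template_dir, f"{program}.sh.tmpl").read_text())
        for tool in TOOLS:
            target = Path(out_dir, tool, program, "docker-entrypoint.sh")
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(template.substitute(tool=tool, program=program))

if __name__ == "__main__":
    generate_entrypoints()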

Define output format of benchmark programs

The CSV output schema used by cardio in Cingulata currently looks like:

num_conditions, t_keygen, t_input_encryption, t_computation, t_decryption
15, 0.713642, 9.640915500, 51.604768300, 0.234077
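A sketch of how a benchmark program could emit exactly this schema using only the standard csv module; the call at the bottom just reproduces the sample row above.

# Sketch: emit the CSV schema shown above from a benchmark program.
import csv

def write_result(path, num_conditions, t_keygen, t_input_encryption,
                 t_computation, t_decryption):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["num_conditions", "t_keygen", "t_input_encryption",
                         "t_computation", "t_decryption"])
        writer.writerow([num_conditions, t_keygen, t_input_encryption,
                         t_computation, t_decryption])

# Reproduces the sample row above.
write_result("cingulata_cardio.csv", 15, 0.713642, 9.6409155, 51.6047683, 0.234077)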

TODO

  • Discuss how the output should generally look (i.e., independent of the eval benchmark program).
  • Discuss which metadata to include: separate text file in same tool directory?
  • Think about naming conventions. Currently:
s3://sok-repository-eval-benchmarks/<timestamp_folder>/<tool_folder>/<toolname_benchmark-program.csv>

where timestamp_folder uses the format YYYYMMDD_HHMMSS, tool_folder is named according to the tool (e.g., Cingulata), and the benchmark's result file is composed of the tool's name and the benchmark program (e.g., cingulata_cardio.csv).
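A sketch of building and uploading the corresponding S3 key with boto3 (credentials assumed to be available from the environment); the bucket name is taken from the convention above.

# Sketch: build the S3 key according to the naming convention above and upload
# a result file with boto3 (credentials assumed to come from the environment).
import datetime
import boto3

def upload_result(local_path, tool, program,
                  bucket="sok-repository-eval-benchmarks"):
    timestamp = datetime.datetime.utcnow().strftime("%Y%m%d_%H%M%S")
    key = f"{timestamp}/{tool}/{tool.lower()}_{program}.csv"
    boto3.client("s3").upload_file(local_path, bucket, key)
    return key

# e.g. upload_result("cingulata_cardio.csv", "Cingulata", "cardio")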

Add wiki page for cardio benchmark

Create a wiki page for the cardio benchmark that contains:

  • pseudocode of the main computation (a rough sketch is included below)
  • the high-level idea of the program (i.e., what does it compute)
  • approximations/changes made to accommodate tools
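For the pseudocode item, a plaintext-only sketch of the general shape of the cardio computation (a risk score accumulated from boolean conditions over patient attributes); the concrete conditions shown are illustrative, not necessarily the exact set used in the benchmark.

# Plaintext-only sketch of the cardio computation's general shape: a risk
# score accumulated from boolean conditions over patient attributes.
# The conditions are illustrative; in the FHE versions every comparison and
# addition is evaluated on encrypted data.
def cardio_risk(patient):
    conditions = [
        patient["is_man"] and patient["age"] > 50,
        not patient["is_man"] and patient["age"] > 60,
        patient["smoker"],
        patient["diabetic"],
        patient["high_blood_pressure"],
        patient["hdl_cholesterol"] < 40,
        patient["daily_physical_activity_min"] < 30,
    ]
    return sum(int(c) for c in conditions)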

Add "naive" versions of manually implemented applications?

We currently contrast manually-optimized vs. compiler-optimized programs. Except for the NN task, this tends to result in the manual versions significantly outperforming the compiled versions, so maybe it would be interesting to add a non-optimized ("naive") manual implementation?

E.g., cardio-seal could use a ripple-carry adder instead of the Sklansky adder (see the sketch below).

One more option would be to never use in-place operations, to always relinearize directly after each multiplication (which we might do already anyway), and maybe even to always use ctxt-ctxt operations, to really simulate a "poor" implementation?
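To illustrate the ripple-carry vs. Sklansky point, a plaintext sketch of the ripple-carry structure a "naive" version would follow; in the encrypted setting each and/xor becomes a ciphertext gate, giving multiplicative depth linear in the bit-width, whereas a Sklansky (parallel-prefix) adder reaches logarithmic depth.

# Plaintext sketch of a ripple-carry adder over little-endian bit lists.
def ripple_carry_add(a_bits, b_bits):
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    out.append(carry)
    return out

# 5 + 3: [1,0,1] + [1,1,0] -> [0,0,0,1] (= 8, little-endian)
assert ripple_carry_add([1, 0, 1], [1, 1, 0]) == [0, 0, 0, 1]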

Cardio

  • SEAL-BFV

NN

  • Not necessary, since even our best attempt is still worse than what tools can do

Chi-Squared

  • SEAL-BFV

Parametrization of cardio example (ex. Cingulata)

  • Parametrization is not easily possible in Cingulata, as the circuit is generated using cmake/make magic.
  • We would need to create a well-defined circuit anyway to enable reproducibility between tools and to inhibit trivial optimizations (e.g., caused by just duplicating existing conditions N times -- tested, and it did not make sense).
  • Do we need parametrization at all? There are already 80 test programs to write (see the last columns in System Eval Setups).

Plotting: One plot script per benchmark program?

  • One plot script per benchmark program?
    • Idea: the script determines the latest workflow run folder, then retrieves all CSV files for a specific eval program (e.g., cardio).
  • What should the plots look like?
    • This depends on which information we include.
    • Examples:
      • Bar plots grouped by encryption_t, computation_t, decryption_t; one bar for each tool.
        • Disadvantage: Total time difficult to compare.
      • Stacked + grouped plot: one bar for each tool, divided into encryption_t, computation_t, decryption_t (see the sketch below).
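A sketch of the stacked + grouped variant, assuming matplotlib; the tools and timing numbers are made up.

# Sketch: stacked bar plot, one bar per tool, each split into
# encryption_t, computation_t and decryption_t. The numbers are made up.
import matplotlib.pyplot as plt

tools = ["Cingulata", "SEAL-BFV", "TFHE"]
timings = {                        # seconds, purely illustrative
    "encryption_t":  [9.6, 1.2, 0.4],
    "computation_t": [51.6, 12.3, 80.1],
    "decryption_t":  [0.2, 0.1, 0.1],
}

fig, ax = plt.subplots()
bottom = [0.0] * len(tools)
for phase, values in timings.items():
    ax.bar(tools, values, bottom=bottom, label=phase)
    bottom = [b + v for b, v in zip(bottom, values)]
ax.set_ylabel("time [s]")
ax.legend()
fig.savefig("cardio_stacked.png", bbox_inches="tight")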

Neural Network Application

To implement the NN-Inference application, the following needs to be done:

  • Train Plaintext version of LeNet-5 for MNIST using approximated activation functions (see the activation sketch after this list)
  • Train Plaintext version of SqueezeNet for CIFAR-10 using approximated activation functions
  • Add plotting code for application
  • Implement networks in SEALion
  • Implement networks in nGraph-HE (Depends on nGraph-HE build issues being solved, see #4)
  • Implement networks "manually optimized" in SEAL-CKKS-SIMD (recycle code from MSR Private AI Bootcamp Project)
  • Implement networks naively in SEAL-CKKS
  • Consider how to modify network for integer-only (SEAL-BFV/SEAL-BFV-SIMD, ALCHEMY, Ramparts)
  • Implement networks naively in SEAL-BFV
  • Implement networks "manually optimized" in SEAL-BFV-SIMD
  • Implement networks in ALCHEMY
  • Implement networks in Ramparts
  • Consider how to modify/quantize network for binary-only (Cingulata, TFHE)
  • Implement networks in Cingulata
  • Implement networks in TFHE
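For the "approximated activation functions" items in the list above, a sketch of what the plaintext training could look like, assuming TensorFlow/Keras; the square activation (as used in CryptoNets-style networks) and the layer sizes are illustrative, not the final architecture.

# Sketch: LeNet-5-style MNIST model trained with a polynomial ("square")
# activation so the network stays FHE-friendly. Assumes TensorFlow/Keras;
# layer sizes and the choice of x^2 are illustrative.
import tensorflow as tf

def square(x):
    return x * x  # polynomial activation, evaluable under FHE

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(6, 5, activation=square, input_shape=(28, 28, 1)),
    tf.keras.layers.AveragePooling2D(),
    tf.keras.layers.Conv2D(16, 5, activation=square),
    tf.keras.layers.AveragePooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation=square),
    tf.keras.layers.Dense(84, activation=square),
    tf.keras.layers.Dense(10),  # logits; softmax is only needed in the clear
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
model.fit(x_train[..., None] / 255.0, y_train, epochs=1)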

Ideas for Visualization Website

Some ideas that came up that could be interesting to add to the visualization website:

  • Add button to send repository_dispatch event for triggering Github Action workflows

  • As an extension of the first suggestion, it would be nice to display the URL of the workflow status page after the button was pressed

  • The workflow is very simple and just sends the commands via AWS SSM to the EC2 instances. However, a successful workflow run does not guarantee that all benchmarks were executed successfully. Up to now, the easiest way to check is to verify that the expected files were written to S3.

    • Maybe our Flask application could provide a simple POST endpoint that is called from the Docker Eval image after finishing the benchmark run. It would take the tool name and show a green light if a tool successfully finished the benchmark (a rough sketch follows). ...but probably that's too much work and does not provide a lot of value.
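A sketch of that POST endpoint, assuming Flask; the route, payload fields, and in-memory status store are all hypothetical.

# Sketch: minimal Flask endpoint the Docker Eval image could POST to after a
# benchmark run; the visualization page could poll /status for green lights.
from flask import Flask, jsonify, request

app = Flask(__name__)
finished_tools = {}  # tool name -> True once the benchmark reported success

@app.route("/benchmark-finished", methods=["POST"])
def benchmark_finished():
    payload = request.get_json(force=True)
    finished_tools[payload["tool"]] = True
    return jsonify(status="ok"), 200

@app.route("/status")
def status():
    return jsonify(finished_tools)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)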
