
sok's People

Contributors

alexanderviand · dependabot[bot] · pjattke

sok's Issues

Improve plotting

  • Rename the current plotting folder to "paper_plots"
  • Add a second set of large, colorful plots designed to explore the data rather than to be printed

Both sets of plots should be generated by the Github Runner.
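A minimal sketch of how one script could emit both sets, assuming matplotlib; the folder names, rc settings, and the plot_results(ax, df) helper are illustrative placeholders, not the existing plotting code.

# Sketch: render each figure twice, once with compact print-friendly settings
# ("paper_plots") and once with large, colorful settings for data exploration.
# plot_results(ax, df) is a hypothetical helper that draws one benchmark figure.
import os
import matplotlib
matplotlib.use("Agg")  # headless, e.g. on the Github Runner
import matplotlib.pyplot as plt

STYLES = {
    "paper_plots":   {"figure.figsize": (4, 3), "font.size": 8},
    "explore_plots": {"figure.figsize": (16, 9), "font.size": 14,
                      "axes.prop_cycle": plt.cycler(color=plt.cm.tab20.colors)},
}

def render_all(df, plot_results):
    for folder, rc in STYLES.items():
        os.makedirs(folder, exist_ok=True)
        with plt.rc_context(rc):
            fig, ax = plt.subplots()
            plot_results(ax, df)
            fig.savefig(os.path.join(folder, "runtime.png"), bbox_inches="tight")
            plt.close(fig)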

Design & implement Benchmarks for libraries

Benchmarking the libraries against each other:

Micro-benchmarks

  • Ctxt-Ctxt Multiplication time (incl. relin)
  • Ctxt-Ptxt Multiplication time (incl. relin)
  • Ctxt-Ctxt Addition time
  • Ctxt-Ptxt Addition time
  • Sk Encryption time
  • Pk Encryption Time
  • Decryption time
  • Rotation (native, i.e. single-key) [Only where applicable]
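A sketch of a generic timing harness for the micro-benchmarks above; the benchmark names mirror the list, while the per-operation callables are hypothetical placeholders that would wrap the respective library's calls.

# Sketch: wall-clock timing harness for the micro-benchmarks listed above.
# Each entry maps a benchmark name to a zero-argument callable performing one
# operation; the callables are tool-specific and only stubbed out here.
import statistics
import time

def time_op(op, runs=100, warmup=10):
    for _ in range(warmup):
        op()
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        op()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)

# Hypothetical per-tool setup; for a SEAL wrapper, for example, these would
# call into the library's encrypt/multiply/relinearize functions.
micro_benchmarks = {
    "ctxt_ctxt_mul_incl_relin": lambda: None,
    "ctxt_ptxt_mul_incl_relin": lambda: None,
    "ctxt_ctxt_add":            lambda: None,
    "ctxt_ptxt_add":            lambda: None,
    "sk_encryption":            lambda: None,
    "pk_encryption":            lambda: None,
    "decryption":               lambda: None,
    "rotation_single_key":      lambda: None,  # only where applicable
}

for name, op in micro_benchmarks.items():
    mean_s, std_s = time_op(op)
    print(f"{name},{mean_s:.9f},{std_s:.9f}")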

Libraries to evaluate

BFV/BGV

CKKS

CGGI

  • TFHE
  • Palisade [FHEW] (see abcd2ed)
  • Concrete

Parameter settings to evaluate

  • Plaintext space: 1-bit (binary) or 64-bit (t = 2^64)
  • For levelled schemes: n = 16k, 128-bit security; choose q such that the parameters support either 1 level or 10 levels

We should fix two plaintext space sizes, e.g. 1-bit (binary) and 64-bit, and use those for all benchmarks, just so that we don't blow up the space of possibilities too much.
For levelled schemes, we should evaluate with 1-mult params and with e.g. 10 levels.

For binary-only libraries like TFHE, the 64-bit version then uses an adder/multiplier circuit. In that case, Ctxt-Ptxt optimizations are possible but not implemented, which is fine, since this isn't the core of the paper anyway.
For fair comparison among implementations of the same scheme, we should hardcode the parameters to be equal everywhere.
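To make "hardcode the parameters everywhere" concrete, a sketch of a single shared parameter table that every implementation of the same scheme would read from; the modulus sizes below are placeholders, not vetted 128-bit-secure choices.

# Sketch: one shared, hardcoded parameter table so that all implementations of
# the same scheme run with identical settings. The coeff_modulus bit counts
# are illustrative placeholders, NOT vetted parameter choices.
FIXED_PARAMS = {
    "bfv_binary_1level": {
        "poly_modulus_degree": 16384,      # n = 16k
        "plain_modulus_bits": 1,           # binary plaintext space
        "coeff_modulus_bits": [60, 60],    # placeholder: enough for 1 mult
        "security_bits": 128,
    },
    "bfv_64bit_10level": {
        "poly_modulus_degree": 16384,
        "plain_modulus_bits": 64,
        "coeff_modulus_bits": [60] * 11,   # placeholder: ~10 levels
        "security_bits": 128,
    },
}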

Finally, we could consider implementing the benchmark applications (NN, cardio, Chi-Squared) for each library, but this is out of scope for the current timeframe.

Generate metadata file as output in Cardio SEAL-BFV & Cingulata

Add a metadata file (text file) that describes the environment and settings used for the benchmark run.
Some of this information can only be gathered inside the container, while the rest must be collected on the EC2 VM, hence the split below.

In-Container Information

This information is only available in the container, and we need a script for each combination of (benchmark program, tool).

  • FHE scheme parameters p, q, n
    • see print_parameters(...) in examples.h
  • Ciphertext size (inputs + result)
  • Key size
  • Number of operations (additions, multiplications, relinearizations)
    • this and this could be helpful functions for Cingulata

System / EC2 VM Information

This information is not available in the Docker container and must be gathered on the EC2 VM.
The script collecting this information is usable for all benchmark programs.
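A sketch of such a VM-side collection script, assuming a Linux host with git available; the field names and output path are illustrative.

# Sketch: gather host-level metadata on the EC2 VM and write it next to the
# benchmark results. Field names and the output path are illustrative.
import datetime
import json
import platform
import subprocess

def collect_host_metadata():
    meta = {
        "timestamp": datetime.datetime.utcnow().strftime("%Y%m%d_%H%M%S"),
        "hostname": platform.node(),
        "kernel": platform.release(),
        "machine": platform.machine(),
    }
    try:
        meta["git_sha"] = subprocess.check_output(
            ["git", "rev-parse", "--short", "HEAD"], text=True).strip()
    except (OSError, subprocess.CalledProcessError):
        meta["git_sha"] = "unknown"
    with open("/proc/cpuinfo") as f:
        meta["cpu_model"] = next(
            (line.split(":", 1)[1].strip() for line in f
             if line.startswith("model name")), "unknown")
    return meta

if __name__ == "__main__":
    with open("metadata_host.json", "w") as out:
        json.dump(collect_host_metadata(), out, indent=2)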

@AlexanderViand What do you think about adding the git commit SHA to the S3 folder name? Would this break something related to the visualization website?

Note to self - this would require the following change to line 43 of benchmark.yml:

$(echo $(date +'%Y%m%d_%H%M%S')_$(git rev-parse --short "$GITHUB_SHA"))

Cingulata compilation

  • Describe the Cingulata compilation in the Wiki.
  • Inform Alexander so he can make required changes to the Eval Docker image.

nGraph-HE build not working

See the reported issue. If no response is received by July 30, send a mail to Fabian Boemer ([email protected]). Put Rosario Cammarota ([email protected]) and Casimir Wierzynski ([email protected]) in CC?

Mail snippets:

We are currently working on a survey of FHE compilers & optimisation techniques for an SoK-style paper on the topic ...

We're planning to benchmark and compare as many tools & techniques as possible, including some apples-to-oranges comparison between tools with very different focuses. We want to explore what types of applications benefit most from which kinds of optimisations, aiding developers in selecting tools based on their application needs, and maybe identify opportunities for synergies between tools.

We would love to include in our evaluation, and .... *however,

<we would be delighted to hear ...>

Benchmarking Workflow

Some aspects for improving the current SoK benchmarking setup:

  • Use a dedicated docker-entrypoint.sh script for each tool
    • Missing for SEALion
  • Implement support for running the same program with different configurations (e.g., varying input size)
    • The goal should be to avoid code duplication by having each program's code only once in the repository
  • Run each program (not tool!) in a separate container to speed up benchmark execution
    • Maybe it makes sense to dynamically generate the docker-entrypoint.sh scripts from a template (one per program) instead of maintaining a static script per {tool, program} configuration (see the sketch after this list).
    • As this adds complexity to the Github Action, it may make more sense to write a Python script that is then executed by the Github Actions runner (instead of writing the logic directly in the action).
  • Use the AWS CLI to send SSM commands to the VMs
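A sketch of the template-based generation mentioned above, assuming one hypothetical <program>.sh.tmpl per program with $tool and $program placeholders; the tool/program lists and paths are made up.

# Sketch: generate docker-entrypoint.sh scripts from one template per program
# instead of keeping a static script per {tool, program} pair. The template
# paths, placeholder names, and the tool/program lists are hypothetical.
from pathlib import Path
from string import Template

TOOLS = ["Cingulata", "SEAL-BFV", "SEALion"]
PROGRAMS = ["cardio", "chi_squared", "nn"]

def generate_entrypoints(template_dir="templates", out_dir="entrypoints"):
    for program in PROGRAMS:
        template = Template(Path(template_dir, f"{program}.sh.tmpl").read_text())
        for tool in TOOLS:
            target = Path(out_dir, tool, program, "docker-entrypoint.sh")
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(template.substitute(tool=tool, program=program))

if __name__ == "__main__":
    generate_entrypoints()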

Define output format of benchmark programs

The CSV output schema used by cardio in Cingulata currently looks like:

num_conditions, t_keygen, t_input_encryption, t_computation, t_decryption
15, 0.713642, 9.640915500, 51.604768300, 0.234077
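A sketch of how a benchmark program could emit exactly this schema using only the standard csv module; the call at the bottom just reproduces the sample row above.

# Sketch: emit the CSV schema shown above from a benchmark program.
import csv

def write_result(path, num_conditions, t_keygen, t_input_encryption,
                 t_computation, t_decryption):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["num_conditions", "t_keygen", "t_input_encryption",
                         "t_computation", "t_decryption"])
        writer.writerow([num_conditions, t_keygen, t_input_encryption,
                         t_computation, t_decryption])

# Reproduces the sample row above.
write_result("cingulata_cardio.csv", 15, 0.713642, 9.6409155, 51.6047683, 0.234077)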

TODO

  • Discuss how the output should generally look (i.e., independent of the eval benchmark program).
  • Discuss which metadata to include: separate text file in same tool directory?
  • Think about naming conventions. Currently:
s3://sok-repository-eval-benchmarks/<timestamp_folder>/<tool_folder>/<toolname_benchmark-program.csv>

where timestamp_folder uses the format YYYYMMDD_HHMMSS, tool_folder is named according to the tool (e.g., Cingulata), and the benchmark's result file is composed of the tool's name and the benchmark program (e.g., cingulata_cardio.csv).
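A sketch of building and uploading the corresponding S3 key with boto3 (credentials assumed to be available from the environment); the bucket name is taken from the convention above.

# Sketch: build the S3 key according to the naming convention above and upload
# a result file with boto3 (credentials assumed to come from the environment).
import datetime
import boto3

def upload_result(local_path, tool, program,
                  bucket="sok-repository-eval-benchmarks"):
    timestamp = datetime.datetime.utcnow().strftime("%Y%m%d_%H%M%S")
    key = f"{timestamp}/{tool}/{tool.lower()}_{program}.csv"
    boto3.client("s3").upload_file(local_path, bucket, key)
    return key

# e.g. upload_result("cingulata_cardio.csv", "Cingulata", "cardio")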

Add wiki page for cardio benchmark

Create a wiki page for the cardio benchmark that contains:

  • pseudocode of the main computation (a rough sketch is included below)
  • the high-level idea of the program (i.e., what does it compute)
  • approximations/changes made to accommodate tools
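For the pseudocode item, a plaintext-only sketch of the general shape of the cardio computation (a risk score accumulated from boolean conditions over patient attributes); the concrete conditions shown are illustrative, not necessarily the exact set used in the benchmark.

# Plaintext-only sketch of the cardio computation's general shape: a risk
# score accumulated from boolean conditions over patient attributes.
# The conditions are illustrative; in the FHE versions every comparison and
# addition is evaluated on encrypted data.
def cardio_risk(patient):
    conditions = [
        patient["is_man"] and patient["age"] > 50,
        not patient["is_man"] and patient["age"] > 60,
        patient["smoker"],
        patient["diabetic"],
        patient["high_blood_pressure"],
        patient["hdl_cholesterol"] < 40,
        patient["daily_physical_activity_min"] < 30,
    ]
    return sum(int(c) for c in conditions)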

Add "naive" versions of manually implemented applications?

We currently contrast manually-optimized vs. compiler-optimized programs. Except for the NN task, this tends to result in the manual versions significantly outperforming the compiled versions, so maybe it would be interesting to add a non-optimized ("naive") manual implementation?

E.g., cardio-seal could use a ripple-carry adder instead of the Sklansky adder (see the sketch below).

One more option would be to never use in-place operations, to always relinearize directly after each multiplication (which we might do already anyway), and maybe even to always use ctxt-ctxt operations, to really simulate a "poor" implementation?
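To illustrate the ripple-carry vs. Sklansky point, a plaintext sketch of the ripple-carry structure a "naive" version would follow; in the encrypted setting each and/xor becomes a ciphertext gate, giving multiplicative depth linear in the bit-width, whereas a Sklansky (parallel-prefix) adder reaches logarithmic depth.

# Plaintext sketch of a ripple-carry adder over little-endian bit lists.
def ripple_carry_add(a_bits, b_bits):
    carry, out = 0, []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    out.append(carry)
    return out

# 5 + 3: [1,0,1] + [1,1,0] -> [0,0,0,1] (= 8, little-endian)
assert ripple_carry_add([1, 0, 1], [1, 1, 0]) == [0, 0, 0, 1]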

Cardio

  • SEAL-BFV

NN

  • Not necessary, since even our best attempt is still worse than what tools can do

Chi-Squared

  • SEAL-BFV

Parametrization of cardio example (ex. Cingulata)

  • Parametrization is not easily possible in Cingulata, as the circuit is generated using cmake/make magic.
  • We would need to create a well-defined circuit anyway to enable reproducibility between tools and to inhibit trivial optimizations (e.g., caused by just duplicating existing conditions N times -- tested, and it did not make sense).
  • Do we need parametrization at all? There are already 80 test programs to write (see the last columns in System Eval Setups).

Plotting: One plot script per benchmark program?

  • One plot script per benchmark program?
    • Idea: the script determines the latest workflow run folder, then retrieves all CSV files for a specific eval program (e.g., cardio).
  • What should the plots look like?
    • This depends on which information we include.
    • Examples:
      • Bar plots grouped by encryption_t, computation_t, decryption_t; one bar for each tool.
        • Disadvantage: Total time difficult to compare.
      • Stacked + grouped plot: one bar for each tool, divided into encryption_t, computation_t, decryption_t (see the sketch below).
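A sketch of the stacked + grouped variant, assuming matplotlib; the tools and timing numbers are made up.

# Sketch: stacked bar plot, one bar per tool, each split into
# encryption_t, computation_t and decryption_t. The numbers are made up.
import matplotlib.pyplot as plt

tools = ["Cingulata", "SEAL-BFV", "TFHE"]
timings = {                        # seconds, purely illustrative
    "encryption_t":  [9.6, 1.2, 0.4],
    "computation_t": [51.6, 12.3, 80.1],
    "decryption_t":  [0.2, 0.1, 0.1],
}

fig, ax = plt.subplots()
bottom = [0.0] * len(tools)
for phase, values in timings.items():
    ax.bar(tools, values, bottom=bottom, label=phase)
    bottom = [b + v for b, v in zip(bottom, values)]
ax.set_ylabel("time [s]")
ax.legend()
fig.savefig("cardio_stacked.png", bbox_inches="tight")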

Neural Network Application

To implement the NN-Inference application, the following needs to be done:

  • Train Plaintext version of LeNet-5 for MNIST using approximated activation functions (see the activation sketch after this list)
  • Train Plaintext version of SqueezeNet for CIFAR-10 using approximated activation functions
  • Add plotting code for application
  • Implement networks in SEALion
  • Implement networks in nGraph-HE (Depends on nGraph-HE build issues being solved, see #4)
  • Implement networks "manually optimized" in SEAL-CKKS-SIMD (recycle code from MSR Private AI Bootcamp Project)
  • Implement networks naively in SEAL-CKKS
  • Consider how to modify network for integer-only (SEAL-BFV/SEAL-BFV-SIMD, ALCHEMY, Ramparts)
  • Implement networks naively in SEAL-BFV
  • Implement networks "manually optimized" in SEAL-BFV-SIMD
  • Implement networks in ALCHEMY
  • Implement networks in Ramparts
  • Consider how to modify/quantize network for binary-only (Cingulata, TFHE)
  • Implement networks in Cingulata
  • Implement networks in TFHE
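For the "approximated activation functions" items in the list above, a sketch of what the plaintext training could look like, assuming TensorFlow/Keras; the square activation (as used in CryptoNets-style networks) and the layer sizes are illustrative, not the final architecture.

# Sketch: LeNet-5-style MNIST model trained with a polynomial ("square")
# activation so the network stays FHE-friendly. Assumes TensorFlow/Keras;
# layer sizes and the choice of x^2 are illustrative.
import tensorflow as tf

def square(x):
    return x * x  # polynomial activation, evaluable under FHE

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(6, 5, activation=square, input_shape=(28, 28, 1)),
    tf.keras.layers.AveragePooling2D(),
    tf.keras.layers.Conv2D(16, 5, activation=square),
    tf.keras.layers.AveragePooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(120, activation=square),
    tf.keras.layers.Dense(84, activation=square),
    tf.keras.layers.Dense(10),  # logits; softmax is only needed in the clear
])
model.compile(optimizer="adam",
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=["accuracy"])

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
model.fit(x_train[..., None] / 255.0, y_train, epochs=1)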

Ideas for Visualization Website

Some ideas that came up that could be interesting to add to the visualization website:

  • Add button to send repository_dispatch event for triggering Github Action workflows

  • As an extension of the first suggestion, it would be nice to display the URL of the workflow status page after the button was pressed

  • The workflow is very simple and just sends the commands via AWS SSM to the EC2 instances. However, a successful workflow run does not guarantee that all benchmarks were executed successfully. Up to now, the easiest way to check is to verify that the expected files were written to S3.

    • Maybe our Flask application could provide a simple POST endpoint that is called from the Docker Eval image after finishing the benchmark run. It would take the tool name and show a green light if a tool successfully finished the benchmark (a rough sketch follows). ...but probably that's too much work and does not provide a lot of value.
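A sketch of that POST endpoint, assuming Flask; the route, payload fields, and in-memory status store are all hypothetical.

# Sketch: minimal Flask endpoint the Docker Eval image could POST to after a
# benchmark run; the visualization page could poll /status for green lights.
from flask import Flask, jsonify, request

app = Flask(__name__)
finished_tools = {}  # tool name -> True once the benchmark reported success

@app.route("/benchmark-finished", methods=["POST"])
def benchmark_finished():
    payload = request.get_json(force=True)
    finished_tools[payload["tool"]] = True
    return jsonify(status="ok"), 200

@app.route("/status")
def status():
    return jsonify(finished_tools)

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)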
