
whitenoise-core's Introduction

Please note that we are renaming the toolkit and will be introducing the new name in the coming weeks.




Core Differential Privacy Library

See also the accompanying system repository and samples repository for this project.


Differential privacy is the gold standard definition of privacy protection. This project aims to connect theoretical solutions from the academic community with the practical lessons learned from real-world deployments, to make differential privacy broadly accessible to future deployments. Specifically, we provide several basic building blocks that can be used by people involved with sensitive data, with implementations based on vetted and mature differential privacy research. Here in the Core, we provide a pluggable open source library of differentially private algorithms and mechanisms for releasing privacy preserving queries and statistics, as well as APIs for defining an analysis and a validator for evaluating these analyses and composing the total privacy loss on a dataset.

The mechanisms library provides a fast, memory-safe native runtime for validating and running differentially private analyses. The runtime and validator are built in Rust, while Python support is available and R support is forthcoming.

Differentially private computations are specified as an analysis graph that can be validated and executed to produce differentially private releases of data. Releases include metadata about accuracy of outputs and the complete privacy cost of the analysis.
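As a rough illustration of what "the complete privacy cost of the analysis" can mean, the sketch below adds up the cost of several releases using basic sequential composition. The `Release` class and `total_privacy_cost` function are hypothetical names invented for this example and are not part of the library's API; the library's own accounting may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Release:
    """Hypothetical record of one differentially private release."""
    name: str
    epsilon: float
    delta: float = 0.0


def total_privacy_cost(releases):
    """Basic sequential composition: epsilons and deltas simply add up."""
    total_epsilon = sum(r.epsilon for r in releases)
    total_delta = sum(r.delta for r in releases)
    return total_epsilon, total_delta


releases = [Release("mean_age", epsilon=0.5), Release("count", epsilon=0.25)]
print(total_privacy_cost(releases))  # (0.75, 0.0)
```

Basic composition gives a simple upper bound on the total loss; tighter accounting methods exist but are outside the scope of this sketch.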


More about the Core

Components

The primary releases available in the library, and the mechanisms for generating these releases, are enumerated below. For a full listing of the extensive set of components available in the library see this documentation.

Statistics            Mechanisms   Utilities
Count                 Gaussian     Cast
Histogram             Geometric    Clamping
Mean                  Laplace      Digitize
Quantiles                          Filter
Sum                                Imputation
Variance/Covariance                Transform
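To show how a utility, a statistic and a mechanism from the table can fit together, here is a minimal sketch that clamps the data, takes the mean, and adds Laplace noise. It is not the library's implementation: the function names are hypothetical, the dataset size is assumed to be public, and neighboring datasets are assumed to differ in the value of one record. Sampling Laplace noise with ordinary floating-point arithmetic, as done here, is also known to be unsafe for real deployments.

```python
import random


def clamp(values, lower, upper):
    """Clamping utility: force every value into [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample, as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_mean(values, lower, upper, epsilon, rng):
    """Sketch of an epsilon-DP mean for a dataset of known, public size."""
    if not values:
        raise ValueError("values must be non-empty")
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    n = len(values)
    clamped = clamp(values, lower, upper)
    # Changing one clamped record moves the mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / n
    scale = sensitivity / epsilon
    return sum(clamped) / n + laplace_noise(scale, rng)


rng = random.Random(0)  # seeded only to make the example repeatable
estimate = dp_mean([1.0, 2.0, 3.0, 100.0], lower=0.0, upper=10.0, epsilon=1.0, rng=rng)
print(estimate)
```

Clamping comes first because the sensitivity bound only holds once every record is known to lie in `[lower, upper]`.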

Architecture

There are three sub-projects that address individual architectural concerns. These sub-projects communicate via protobuf messages that encode a graph description of an arbitrary computation, called an analysis.

1. Validator

  • Location: /validator-rust

The core library is the validator, which provides a suite of utilities for checking and deriving sufficient conditions for an analysis to be differentially private. This includes checking whether specific properties have been met for each component; deriving sensitivities, noise scales and accuracies for various definitions of privacy; building reports; and dynamically validating individual components. This library is written in Rust.
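The kind of derivation described above can be sketched for one simple case: a Laplace-noised mean under pure epsilon-differential privacy. The `Properties` class and the three functions are hypothetical stand-ins, not the validator's API. The sketch assumes a public dataset size and neighboring datasets that differ in one record's value.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class Properties:
    """Hypothetical static properties tracked for a column of data."""
    lower: Optional[float] = None
    upper: Optional[float] = None
    num_records: Optional[int] = None


def mean_sensitivity(props):
    """Derive the sensitivity of a mean, or fail if preconditions are unmet."""
    if props.lower is None or props.upper is None:
        raise ValueError("data must be clamped before deriving a sensitivity")
    if props.num_records is None:
        raise ValueError("the number of records must be known")
    return (props.upper - props.lower) / props.num_records


def laplace_scale(sensitivity, epsilon):
    """Noise scale b = sensitivity / epsilon for the Laplace mechanism."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return sensitivity / epsilon


def laplace_accuracy(scale, alpha):
    """Bound a such that |noise| exceeds a with probability alpha."""
    return scale * math.log(1.0 / alpha)


props = Properties(lower=0.0, upper=10.0, num_records=100)
scale = laplace_scale(mean_sensitivity(props), epsilon=0.5)
print(scale, laplace_accuracy(scale, alpha=0.05))
```

The accuracy formula follows from the Laplace tail bound P(|X| > a) = exp(-a / b), solved for a at probability alpha.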

2. Runtime

  • Location: /runtime-rust

There must also be a medium to execute the analysis, called a runtime. There is a reference runtime written in Rust, but runtimes may be written using any computation framework--be it SQL, Spark or Dask--to address your individual data needs.
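To make the idea of a runtime concrete, the sketch below evaluates a tiny analysis graph by resolving each node's arguments before running the node itself. The graph layout (a dict of node ids), the `COMPONENTS` table and the `execute` function are assumptions made for illustration; the real analysis is a protobuf message and the reference runtime is written in Rust. No noise mechanism is included, so the output of this sketch is not differentially private.

```python
def clamp(data, lower, upper):
    """Force every value into [lower, upper]."""
    return [min(max(x, lower), upper) for x in data]


def mean(data):
    """Arithmetic mean of a non-empty list."""
    return sum(data) / len(data)


# Hypothetical component table: component name -> implementation.
COMPONENTS = {"clamp": clamp, "mean": mean}

# Hypothetical graph: literal nodes carry a value, component nodes name
# their implementation and map argument names to the ids of other nodes.
graph = {
    0: {"value": [1.0, 2.0, 3.0, 100.0]},
    1: {"value": 0.0},
    2: {"value": 10.0},
    3: {"component": "clamp", "args": {"data": 0, "lower": 1, "upper": 2}},
    4: {"component": "mean", "args": {"data": 3}},
}


def execute(graph, node_id, cache=None):
    """Evaluate one node, recursively evaluating and caching its inputs."""
    cache = {} if cache is None else cache
    if node_id not in cache:
        node = graph[node_id]
        if "value" in node:
            cache[node_id] = node["value"]
        else:
            kwargs = {name: execute(graph, dep, cache)
                      for name, dep in node["args"].items()}
            cache[node_id] = COMPONENTS[node["component"]](**kwargs)
    return cache[node_id]


print(execute(graph, 4))  # 4.0
```

Because the graph only names components and their inputs, the same description could in principle be executed by a different backend that supplies its own component table.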

3. Bindings

Finally, there are helper libraries for building analyses, called bindings. Bindings may be written for any language, and are thin wrappers over the validator and/or runtime(s). Language bindings are currently available for Python, with support for at minimum R, Rust and SQL forthcoming.

Note on Protocol Buffers

  • Location: /validator-rust/prototypes

Communication among projects is handled via Protocol Buffer definitions in the /validator-rust/prototypes directory. All three sub-projects implement:

  • Protobuf code generation
  • Protobuf serialization/deserialization
  • Communication over FFI
  • Handling of distributable packaging

The projects have compiled cross-platform in the past, but more testing is needed. The validator and reference runtime compile to standalone libraries that may be linked into your project, allowing communication over C foreign function interfaces.

Installation

Refer to troubleshooting.md for install problems.

PyPI packages

Refer to core-python, which contains the Python bindings, including links to PyPI packages.

Crates.io

The crates are intended for library consumers.

The Rust validator and runtime are available as crates on Crates.io.

From Source

The source install is intended for library developers.

You may find it easier to use the library with this repository set up as a submodule of some set of language bindings. In this case, switch to the language bindings setup. You can still push commits and branches from the core submodule of whatever bindings language you prefer.

  1. Clone the repository

     git clone git@github.com:opendifferentialprivacy/whitenoise-core.git
    
  2. Install system dependencies (rust, gcc)
    Mac:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    xcode-select --install

    Linux:

    curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
    sudo apt-get install diffutils gcc make m4

    Windows: Install WSL and refer to the Linux instructions.

  3. In a new terminal:
    Build crate

     cargo build
    

    Test crate

     cargo test
    

    Document crate

     cargo rustdoc --open
    

    Build production docs

     ./build_docs.sh
    

There are crates in validator-rust and runtime-rust, and a virtual crate in the root that runs commands on both. Switch between crates via cd, or by passing a manifest path to cargo, for example --manifest-path=validator-rust/Cargo.toml.


Getting Started

Jupyter Notebook Examples

We have numerous Jupyter notebooks demonstrating the use of the Core library and validator through our Python bindings. These are in our accompanying samples repository which has exemplars, notebooks and sample code demonstrating most facets of this project.

Example notebook figures: relative error distributions, release box plots, histogram releases, utility simulations, bias simulations.

Core Rust Documentation

The Rust documentation includes full documentation on all pieces of the library and validator, including extensive component by component descriptions with examples.

Communication

  • Please use GitHub issues for bug reports, feature requests, install issues, and ideas.
  • Gitter is available for general chat and online discussions.
  • For other requests, please contact us at [email protected].
    • Note: We encourage you to use GitHub issues, especially for bugs.

Releases and Contributing

Please let us know if you encounter a bug by creating an issue.

We appreciate all contributions. We welcome pull requests with bug fixes without prior discussion.

If you plan to contribute new features, utility functions or extensions to the core, please first open an issue and discuss the feature with us.

  • A PR sent without discussion may be rejected, because we may be taking the core in a different direction than you are aware of.

Contributing Team

Joshua Allen, Christian Covington, Eduardo de Leon, Ira Globus-Harris, James Honaker, Jason Huang, Saniya Movahed, Michael Phelan, Raman Prasad, Michael Shoemate, You?
