
pytorch-privacy's Introduction

Differential Privacy in PyTorch

This implementation borrows code from tf-privacy and follows the paper "A General Approach to Adding Differential Privacy to Iterative Training Procedures".

This is a minimal implementation intended as a quick start. There are two training commands: train for traditional training and train_dp for differentially private training. train_dp clips each computed gradient and adds noise to it (gradients are clipped per example first, before being averaged over the batch).
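For intuition, the per-example clip-then-noise step can be sketched as follows. This is a simplified illustration, not the repo's actual code: the function name `dp_sgd_step` and its parameters are hypothetical, and the per-example loop is the slow-but-clear formulation.

```python
import torch
import torch.nn as nn

def dp_sgd_step(model, loss_fn, xb, yb, lr=0.05, clip_norm=1.0, noise_multiplier=1.1):
    """One DP-SGD step (illustrative): clip each per-example gradient to
    L2 norm <= clip_norm, sum the clipped gradients, add Gaussian noise
    with std noise_multiplier * clip_norm, then average and apply."""
    params = [p for p in model.parameters() if p.requires_grad]
    accum = [torch.zeros_like(p) for p in params]
    batch = xb.shape[0]
    for i in range(batch):  # per-example gradients (slow but simple)
        loss = loss_fn(model(xb[i:i + 1]), yb[i:i + 1])
        grads = torch.autograd.grad(loss, params)
        # global L2 norm of this example's gradient, across all parameters
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = (clip_norm / (total_norm + 1e-6)).clamp(max=1.0)
        for a, g in zip(accum, grads):
            a += g * scale
    with torch.no_grad():
        for p, a in zip(params, accum):
            noise = torch.randn_like(a) * noise_multiplier * clip_norm
            p -= lr * (a + noise) / batch
```

Note that the noise is added once to the summed (clipped) gradients, not once per example; its scale is tied to the clipping bound, which is what makes the sensitivity analysis work.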

To run it, configure utils/params.yaml, install the dependencies with pip install -r requirements.txt, and execute:

python training.py --params utils/params.yaml

The current result is 97.5% accuracy with noise multiplier 1.1 and clipping bound S=1.

To compute the epsilon value via RDP, use this code from the original tf-privacy repo:

!pip install tensorflow_privacy

from tensorflow_privacy.privacy.analysis import compute_dp_sgd_privacy
compute_dp_sgd_privacy.compute_dp_sgd_privacy(n=60000, batch_size=250, noise_multiplier=1.3, epochs=15, delta=1e-5)

pytorch-privacy's People

Contributors

ebagdasa


pytorch-privacy's Issues

Questions about privacy mechanism and using TF to calculate privacy costs

I'm new to DP, and some of the privacy mechanisms and privacy-cost calculations are hard for me to understand.
In the code, it seems that every batch in each epoch computes gradients and must satisfy DP:
for i, data in tqdm(enumerate(trainloader, 0), leave=True): ...
For example, with 600 examples in total and a batch size of 100, there are 6 batches per epoch, and all 6 batches are used for gradient computation (with DP).
However, this seems inconsistent with the traditional DP-SGD procedure in Abadi et al.'s "Deep learning with differential privacy", where a probability q is used to randomly sample a batch in each step, and the privacy cost is then computed with the Moments Accountant. That is, each step uses one randomly sampled batch whose size is not fixed, with expected value qN.
So I would like to know: is processing every batch in each epoch, as your code does, still consistent with the privacy-cost calculation in TF?
Perhaps I am not expressing myself clearly enough; thank you for your patience.
I would also be very grateful for any reference or intuitive explanation. Thanks! :)
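For intuition on the sampling scheme the question contrasts with: Abadi et al.'s DP-SGD includes each training example independently with probability q (Poisson subsampling), so a "lot" has a random size with expected value qN, rather than a fixed mini-batch size. A minimal stdlib-only sketch (the function name is illustrative):

```python
import random

def poisson_subsample(n_examples, q, rng=None):
    """Indices of a Poisson-subsampled 'lot': each example is included
    independently with probability q, so the lot size is random with
    expected value q * n_examples (unlike a fixed-size mini-batch)."""
    rng = rng or random.Random(0)
    return [i for i in range(n_examples) if rng.random() < q]

# With N = 600 and q = 100/600, the expected lot size is 100,
# but individual lots vary in size from step to step.
lots = [poisson_subsample(600, 100 / 600, random.Random(s)) for s in range(1000)]
avg = sum(len(lot) for lot in lots) / len(lots)
```

The privacy amplification results behind the Moments Accountant assume this kind of random subsampling, which is why iterating over fixed shuffled batches gives only an approximation of that analysis.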

A typo in README

There is a typo in the training code example. It should be:
python training.py --params utils/params.yaml
instead of:
python training.yaml --params utils/params.yaml
