Comments (5)

dayu11 commented on June 27, 2024

Hey Frankozay,

Thank you for your question. I would suggest getting started with some basic mechanisms in differential privacy, i.e., how we can make the output of an algorithm differentially private. See this page for a short introduction to the Laplace and Gaussian mechanisms.

For example, if you want to perturb the logits, you can apply the Gaussian or Laplace mechanism to them. This corresponds to clipping the norm of the logits and then adding noise. One thing worth mentioning is that making the logits differentially private is not sufficient for learning with DP, because the gradient computation also needs intermediate activations, which are not DP by default.
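As a concrete illustration, here is a minimal NumPy sketch of the clip-then-add-noise recipe: the function name and parameters are hypothetical, not from the repo.

```python
import numpy as np

def gaussian_mechanism(logits, clip_norm, sigma, seed=None):
    """Clip the L2 norm of `logits` to `clip_norm`, then add Gaussian noise
    with standard deviation sigma * clip_norm (the clipped sensitivity)."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=float)
    norm = np.linalg.norm(logits)
    # Scale down only if the norm exceeds the clip threshold.
    clipped = logits * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, sigma * clip_norm, size=clipped.shape)
    return clipped + noise

private_logits = gaussian_mechanism([3.0, -1.0, 0.5], clip_norm=1.0, sigma=1.0, seed=0)
```

Clipping bounds how much any one input can move the output; the noise scale is then tied to that bound, which is exactly the sensitivity argument the mechanism rests on.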

One reason we add noise to gradients is that our objective function is the average of all individual loss functions. This means each individual gradient contributes only a 1/n fraction to the overall gradient, which greatly reduces the sensitivity. You can come back to this explanation once you learn the definition of sensitivity in a DP mechanism.
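The sensitivity argument above can be sketched in code: clip each per-example gradient so that any single example changes the sum by at most the clip norm, then add noise calibrated to that sensitivity before averaging. This is a simplified, hypothetical sketch of the DP-SGD idea, not the repo's implementation.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, sigma, seed=None):
    """Clip each per-example gradient to `clip_norm`, sum them, add Gaussian
    noise scaled to the sensitivity `clip_norm`, then average over n examples."""
    rng = np.random.default_rng(seed)
    n = len(per_example_grads)
    total = np.zeros_like(per_example_grads[0], dtype=float)
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # After clipping, removing any one example changes `total`
        # by at most clip_norm in L2 norm.
        total += g * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, sigma * clip_norm, size=total.shape)
    return (total + noise) / n  # each example contributes at most clip_norm / n
```

The 1/n factor is what makes this practical: the noise is added once to the sum, so its relative impact on the averaged update shrinks as the batch grows.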

Thanks again for the interesting question!

from differentially-private-deep-learning.

Frankozay commented on June 27, 2024

Thank you very much for your detailed reply! Your advice is very useful to me. I have read your suggestions and partially understood the mechanisms. If you don't mind, I have a few more questions.

“Making logits differentially private is not sufficient for learning with DP.” Does this mean that the added noise must participate in training to satisfy learning with DP? For example, by adding noise to the dataset?

And at the experimental level, I'm still having trouble understanding the privacy section of the code. If I want to run a demo experiment, such as applying the Gaussian mechanism to the logits or the dataset, which specific parts of your code do I need to focus on?

Thanks again for your patient reply! I appreciate you taking time out of your busy schedule to answer my questions!

Frankozay commented on June 27, 2024

An additional question: in the code,
sigma, eps = get_sigma(sampling_prob, steps, args.eps, args.delta, rgp=False)
is the eps here the exact value obtained at the end of training, or is it an approximation?

dayu11 commented on June 27, 2024

Thank you for your questions.

-- Does this mean that the added noise must participate in training to satisfy learning with DP? Like add noise to dataset?

I'm not sure that I fully understand your question. One thing that is certain is that the model update must be the output of a DP mechanism. If you only add noise to the logits, you can only use the logits to update the model.

Regarding adding noise to the dataset: adding noise to individual datapoints requires too much noise for the perturbed data to remain useful. Recall that DP ensures the output does not change much if any individual datapoint is removed. If you add noise to the dataset, your goal is effectively to release individual datapoints. Say you want to release a datapoint X_i as \tilde X_i; by the definition of DP, \tilde X_i should not change much even if X_i is not in the dataset. Therefore, DP does not permit releasing high-quality individual datapoints. You may be interested in DP generative models, which aim to learn the distribution of the private data instead of releasing it directly.

-- If I want to make a demo experiment like apply Gaussian Mechanism to the logits or dataset, what specific part of your code do I need to focus on to complete the demo?

You can implement a Gaussian mechanism yourself and play with this code. That code contains the simplest deep-learning pipeline, so you should be able to see where to make modifications.

-- Is the eps here the exact value obtained at the end of the training, or is it an approximation?

The eps value is exact: sigma is calibrated so that the resulting eps is strictly smaller than the target eps.
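To illustrate how such a calibration can work, here is a self-contained sketch that binary-searches for the smallest noise multiplier whose resulting eps stays below a target. It uses the classical single-shot Gaussian mechanism bound sigma >= sqrt(2 ln(1.25/delta)) / eps (valid for eps < 1); the repo's `get_sigma` uses a different, composition-aware accountant, and all names here are hypothetical.

```python
import math

def single_shot_eps(sigma, delta):
    """eps for one application of the Gaussian mechanism with unit sensitivity,
    from the classical bound sigma >= sqrt(2 ln(1.25/delta)) / eps."""
    return math.sqrt(2 * math.log(1.25 / delta)) / sigma

def calibrate_sigma(target_eps, delta, lo=1e-3, hi=1e3, iters=100):
    """Binary-search the smallest sigma whose eps does not exceed target_eps.
    Returns (sigma, eps); eps is the exact value for the returned sigma."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if single_shot_eps(mid, delta) <= target_eps:
            hi = mid  # mid already satisfies the budget; try smaller sigma
        else:
            lo = mid  # mid spends too much; need more noise
    return hi, single_shot_eps(hi, delta)

sigma, eps = calibrate_sigma(target_eps=0.5, delta=1e-5)
```

Because the search keeps only candidates that satisfy the budget, the returned eps is guaranteed to be at most the target, which mirrors the relationship described above.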

I would suggest reading the original DP-SGD paper to get a better view of this topic. After reading it, you should be able to understand how they prove the privacy guarantee for the gradients : ). The privacy accounting in this repo uses their tool to compute privacy costs.

Frankozay commented on June 27, 2024

Thank you again for your detailed reply! I will carefully read the suggestions and materials you provided. Thanks!
