Comments (5)

dayu11 commented on June 27, 2024

Hey Frankozay,

Thank you for your question. I would suggest getting started with some basic mechanisms in differential privacy, i.e., how we can make the output of an algorithm differentially private. See this page for a short introduction to the Laplace and Gaussian mechanisms.

For example, if you want to perturb the logits, you can apply the Gaussian or Laplace mechanism to them. This corresponds to clipping the norm of the logits and then adding noise. One thing worth mentioning is that making the logits differentially private is not sufficient for learning with DP, because the gradient computation also needs intermediate activations, which are not DP by default.
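As a concrete illustration, here is a minimal NumPy sketch of the clip-then-add-noise recipe: the function name and parameters are hypothetical, not from the repo.

```python
import numpy as np

def gaussian_mechanism(logits, clip_norm, sigma, seed=None):
    """Clip the L2 norm of `logits` to `clip_norm`, then add Gaussian noise
    with standard deviation sigma * clip_norm (the clipped sensitivity)."""
    rng = np.random.default_rng(seed)
    logits = np.asarray(logits, dtype=float)
    norm = np.linalg.norm(logits)
    # Scale down only if the norm exceeds the clip threshold.
    clipped = logits * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, sigma * clip_norm, size=clipped.shape)
    return clipped + noise

private_logits = gaussian_mechanism([3.0, -1.0, 0.5], clip_norm=1.0, sigma=1.0, seed=0)
```

Clipping bounds how much any one input can move the output; the noise scale is then tied to that bound, which is exactly the sensitivity argument the mechanism rests on.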

One reason we add noise to gradients is that our objective function is the average of all individual loss functions. This means each individual gradient contributes only a 1/n fraction to the overall gradient, which greatly reduces the sensitivity. You can come back to this explanation once you learn the definition of sensitivity in a DP mechanism.
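The sensitivity argument above can be sketched in code: clip each per-example gradient so that any single example changes the sum by at most the clip norm, then add noise calibrated to that sensitivity before averaging. This is a simplified, hypothetical sketch of the DP-SGD idea, not the repo's implementation.

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm, sigma, seed=None):
    """Clip each per-example gradient to `clip_norm`, sum them, add Gaussian
    noise scaled to the sensitivity `clip_norm`, then average over n examples."""
    rng = np.random.default_rng(seed)
    n = len(per_example_grads)
    total = np.zeros_like(per_example_grads[0], dtype=float)
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # After clipping, removing any one example changes `total`
        # by at most clip_norm in L2 norm.
        total += g * min(1.0, clip_norm / max(norm, 1e-12))
    noise = rng.normal(0.0, sigma * clip_norm, size=total.shape)
    return (total + noise) / n  # each example contributes at most clip_norm / n
```

The 1/n factor is what makes this practical: the noise is added once to the sum, so its relative impact on the averaged update shrinks as the batch grows.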

Thanks again for the interesting question!

from differentially-private-deep-learning.

Frankozay commented on June 27, 2024

Thank you very much for your detailed reply! Your advice is very useful to me. I have read your suggestions and partially understood the mechanisms. If you don't mind, I have a few more questions.

“Making logits differentially private is not sufficient for learning with DP.” Does this mean that the added noise must participate in training to satisfy learning with DP? For example, by adding noise to the dataset?

And at the experimental level, I'm still having trouble understanding the privacy section of the code. If I want to run a demo experiment, such as applying the Gaussian mechanism to the logits or the dataset, which specific parts of your code do I need to focus on?

Thanks again for your patient reply! I appreciate you taking time out of your busy schedule to answer my questions!

Frankozay commented on June 27, 2024

An additional question: in the code,
sigma, eps = get_sigma(sampling_prob, steps, args.eps, args.delta, rgp=False)
is the eps here the exact value obtained at the end of training, or is it an approximation?

dayu11 commented on June 27, 2024

Thank you for your questions.

-- Does this mean that the added noise must participate in training to satisfy learning with DP? Like add noise to dataset?

I'm not sure that I fully understand your question. One thing that is certain is that the model update must be the output of a DP mechanism. If you only add noise to the logits, you can only use the logits to update the model.

Regarding adding noise to the dataset: adding noise to individual datapoints requires too much noise for the perturbed data to remain useful. Recall that DP ensures the output does not change much if any individual datapoint is removed. If you add noise to the dataset, your goal is effectively to release individual datapoints. Say you want to release a datapoint X_i as \tilde X_i; by the definition of DP, \tilde X_i should not change much even if X_i is not in the dataset. Therefore, DP does not permit releasing high-quality individual datapoints. You may be interested in DP generative models, which aim to learn the distribution of the private data instead of releasing it directly.

-- If I want to make a demo experiment like apply Gaussian Mechanism to the logits or dataset, what specific part of your code do I need to focus on to complete the demo?

You can implement a Gaussian mechanism yourself and play with this code. That code contains the simplest deep-learning pipeline, so you should be able to see where to make modifications.

-- Is the eps here the exact value obtained at the end of the training, or is it an approximation?

The eps value is exact: sigma is calibrated so that the resulting eps is strictly smaller than the target eps.
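To illustrate how such a calibration can work, here is a self-contained sketch that binary-searches for the smallest noise multiplier whose resulting eps stays below a target. It uses the classical single-shot Gaussian mechanism bound sigma >= sqrt(2 ln(1.25/delta)) / eps (valid for eps < 1); the repo's `get_sigma` uses a different, composition-aware accountant, and all names here are hypothetical.

```python
import math

def single_shot_eps(sigma, delta):
    """eps for one application of the Gaussian mechanism with unit sensitivity,
    from the classical bound sigma >= sqrt(2 ln(1.25/delta)) / eps."""
    return math.sqrt(2 * math.log(1.25 / delta)) / sigma

def calibrate_sigma(target_eps, delta, lo=1e-3, hi=1e3, iters=100):
    """Binary-search the smallest sigma whose eps does not exceed target_eps.
    Returns (sigma, eps); eps is the exact value for the returned sigma."""
    for _ in range(iters):
        mid = (lo + hi) / 2
        if single_shot_eps(mid, delta) <= target_eps:
            hi = mid  # mid already satisfies the budget; try smaller sigma
        else:
            lo = mid  # mid spends too much; need more noise
    return hi, single_shot_eps(hi, delta)

sigma, eps = calibrate_sigma(target_eps=0.5, delta=1e-5)
```

Because the search keeps only candidates that satisfy the budget, the returned eps is guaranteed to be at most the target, which mirrors the relationship described above.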

I would suggest reading the original DP-SGD paper to get a better view of this topic. After reading it, you should be able to understand how they prove the privacy guarantee for the gradients : ). The privacy accounting in this repo uses their tool to compute privacy costs.

Frankozay commented on June 27, 2024

Thank you again for your detailed reply! I will carefully read the suggestions and materials you provided. Thanks!
