bayesiandefense's People

Contributors

chongruo, xuanqing94

bayesiandefense's Issues

KL loss

Can KLLoss and the re-parameterization be customized?

Correct use of noise_fn

I have a question regarding your weight_noise.py file. It seems that the noise_fn function is not actually used by any other module. Take linear.py as an example: there are two definitions of the forward() function. The first one (defined in line 39) uses noise_fn, but it is later overwritten in line 46 by an implementation of forward that does not call noise_fn at all. So I have two questions: Is this really intended? Is noise_fn necessary at all?

Edit: Is it correct that noise_fn is only relevant if one wants to train the network with Bayes by Backprop, and that otherwise the reparameterization trick is used?
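For context, here is a minimal sketch of what I understand the reparameterization trick in a noisy linear layer to look like (hypothetical class and attribute names, not the repo's exact code):

import torch
import torch.nn as nn
import torch.nn.functional as F

class NoisyLinear(nn.Module):
    """Linear layer with a Gaussian weight posterior, sampled via reparameterization."""
    def __init__(self, in_features, out_features):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(out_features, in_features))
        self.log_sigma = nn.Parameter(torch.full((out_features, in_features), -3.0))

    def forward(self, x):
        # Reparameterization trick: sample eps ~ N(0, I) and form w = mu + sigma * eps,
        # so gradients flow to mu and log_sigma through a deterministic transform.
        eps = torch.randn_like(self.mu)
        weight = self.mu + torch.exp(self.log_sigma) * eps
        return F.linear(x, weight)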

Thanks for your help and thanks for sharing your code/experiments of your paper!

Unable to download checkpoint files

Hi,

Downloading the checkpoint files from http://muradin.cs.ucdavis.edu:9876/ was not successful. I tried Chrome and Firefox, as well as wget from the terminal.

Please let me know how I can get these checkpoints :)

Thanks,

VGG with Variational Inference not training

Hi,
I am training a VGG VI network on CIFAR-10 and the validation accuracy stays very low (around 20%) even after 200 epochs of training. The model is overfitting, with training accuracy going past 70%. I added L2 regularization (weight_decay) to the optimizer, but validation accuracy still does not improve.
Is there a reason for this? What am I doing wrong? I used all the default parameters.
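For reference, this is roughly how I added the weight decay (the values here are only illustrative; everything else follows the repo defaults):

import torch
import torch.nn as nn

model = nn.Linear(10, 10)  # placeholder; in my run this is the VGG VI network built by the repo
# weight_decay adds an L2 penalty on all parameters inside the optimizer step
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)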

Thanks,
Kumar

Effectiveness of the defense

Hi,
thanks for your interesting paper and for releasing the source code for the experiments, too. When I saw your paper for the first time, I was really interested in it as the results looked very promising.

During the past few weeks, I have worked with this code and tried to reproduce your results. Doing this, I noticed some things in your code that I would like to talk about.

My first observation is about your experiments with the VGG network. I noticed that you calculate the KL-divergence for all layers in the network, but you do not use the sum of these divergences; you actually use only the divergence of the last layer (cf. L39 of vgg_vi.py). Therefore, your proposed regularization is applied only to the last layer during training, not to the entire network, which differs from the description in the ICLR paper. Did you do this on purpose? I tried to train the network using the divergence of the whole network and failed with the hyperparameters reported in the paper. Is there something I am overlooking?
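To make the distinction concrete, here is a toy sketch of the two regularization targets (illustrative stand-ins, not your actual Rand* layers):

import torch
import torch.nn as nn

class ToyBayesLayer(nn.Module):
    """Toy stand-in for a Rand* layer: returns (output, its KL term)."""
    def __init__(self, dim):
        super().__init__()
        self.mu = nn.Parameter(torch.zeros(dim, dim))
        self.log_sigma = nn.Parameter(torch.full((dim, dim), -3.0))

    def forward(self, x):
        w = self.mu + torch.exp(self.log_sigma) * torch.randn_like(self.mu)
        # Closed-form KL(N(mu, sigma^2) || N(0, 1)), summed over the weights
        kl = 0.5 * (self.mu ** 2 + torch.exp(2 * self.log_sigma)
                    - 2 * self.log_sigma - 1).sum()
        return x @ w.t(), kl

layers = nn.ModuleList([ToyBayesLayer(8) for _ in range(3)])
out = torch.randn(4, 8)

kl_sum = 0
for l in layers:
    out, kl = l(out)
    kl_sum += kl

# Regularizing with `kl` penalizes only the last layer's posterior;
# regularizing with `kl_sum` penalizes every Bayesian layer, as described in the paper.
print(kl.item(), kl_sum.item())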

My second observation is about your STL10 experiments, which I focused on after my initial observations about the VGG network. I noticed that one can make the PGD attack on the Bayesian network more effective by averaging the gradients over multiple forward-backward passes before actually performing the PGD step on the input (a sketch of this modification is given below, after the table). Using this method I was able to decrease the accuracy of your model (based on the checkpoint you've uploaded). The following table compares the results listed in the ICLR print with the ones I obtained with my modified attack:

Accuracy (%) under PGD, by perturbation strength ε:

                  ε = 0.015   ε = 0.035   ε = 0.055   ε = 0.07
Your PGD attack      51.8        37.6        27.2        21.1
My PGD attack        47.0        30.3        16.0         8.6

Here, one sees that for every value of the perturbation strength the modified attack is stronger, i.e. it decreases the model's accuracy more. This is especially true for large perturbations (e.g. 21.1% vs. 8.6% at ε = 0.07).
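For reference, the gradient-averaging modification looks roughly like this (a simplified sketch with illustrative names and parameters, not my exact attack script):

import torch
import torch.nn.functional as F

def averaged_pgd_step(model, x, y, x_orig, eps, step_size, n_samples=10):
    """One L_inf PGD step where the input gradient is averaged over several
    stochastic forward passes of the Bayesian network."""
    grad_sum = torch.zeros_like(x)
    for _ in range(n_samples):
        x_adv = x.clone().detach().requires_grad_(True)
        logits, _ = model(x_adv)          # the Bayesian model returns (logits, kl)
        loss = F.cross_entropy(logits, y)
        grad_sum += torch.autograd.grad(loss, x_adv)[0]
    grad_avg = grad_sum / n_samples
    # Standard PGD update with the averaged gradient, projected back into the eps-ball.
    x = x + step_size * grad_avg.sign()
    x = torch.min(torch.max(x, x_orig - eps), x_orig + eps).clamp(0, 1)
    return x.detach()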
Next, I experimented a bit with the training of the plain VGG network (without the Bayesian layers). By tweaking the hyperparameters I obtained the following accuracies under a PGD attack with the same attack parameters you used for the results in your paper:

Accuracy (%) under PGD, by perturbation strength ε:

                       ε = 0.015   ε = 0.035   ε = 0.055   ε = 0.07
Your hyperparameters      46.7        27.4        12.8         7.0
My hyperparameters        43.5        30.7        18.9        12.8

Now, one can compare this set of results with the results I got with my modified PGD attack on your Bayesian model. It looks like the Bayesian model/training is not really improving the robustness of the network.

I am really looking forward to your answers and explanations, hoping that all of this can easily be solved.

Thanks,
Roland

Forward method in vgg_vi.py

In the code "BayesianDefense/models/vgg_vi.py",

def forward(self, x):
    kl_sum = 0
    out = x
    for l in self.features:
        if type(l).__name__.startswith("Rand"):
            out, kl = l.forward(out)
            if kl is not None:
                kl_sum += kl
        else:
            out = l.forward(out)
    out = out.view(out.size(0), -1)
    out, kl = self.classifier.forward(out)
    kl_sum += kl
    return out, kl

I guess kl, one of the return values, should be kl_sum, i.e. the last line should presumably read return out, kl_sum so that the accumulated divergence of all layers is returned rather than only the classifier's.
I am looking forward to your explanation :)

Thanks,
HarryKim
