xuanqing94 / BayesianDefense
Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network
License: MIT License
At https://github.com/xuanqing94/BayesianDefense/blob/master/attacker/pgd.py#L25 (line 97 at commit c4c0be9):
`diff.clamp_(-eps, eps)`
should be
`diff = diff / diff.max() * eps`.
I wonder whether this affects the credibility of the proposed defense.
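For context, the two operations behave quite differently: `clamp_` is an exact element-wise projection onto the ℓ∞ ball of radius `eps`, while dividing by the maximum rescales every coordinate proportionally. A toy NumPy illustration (the repository itself uses PyTorch; the values here are made up, and the rescaling variant uses the absolute maximum so it works with signed perturbations):

```python
import numpy as np

eps = 0.1
diff = np.array([0.05, -0.3, 0.2])  # toy perturbation; some entries exceed eps

# Option 1: element-wise clamp (the code as written) -- an exact
# projection onto the l-infinity ball of radius eps; entries already
# inside the ball are left untouched.
clamped = np.clip(diff, -eps, eps)

# Option 2: rescale by the largest magnitude (the suggested change) --
# shrinks *all* coordinates proportionally so the largest one has
# magnitude exactly eps, preserving the direction of diff.
rescaled = diff / np.abs(diff).max() * eps

print(clamped)   # [ 0.05 -0.1   0.1 ]
print(rescaled)  # only the -0.3 entry maps to -0.1; the others shrink too
```

Which one is "correct" depends on whether the attack is meant to project onto the ε-ball (standard PGD) or to normalize the step; they coincide only when a single coordinate dominates.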
Can `KLLoss` and the reparameterization be customized?
I have a question regarding your `weight_noise.py` file. I feel that the `noise_fn` function is not really used in any other module. Take `linear.py`, for example: there are two definitions of the `forward()` function. The first one (defined in line 39) uses `noise_fn`, but it is later overwritten in line 46 by an implementation of `forward` that does not call `noise_fn` at all. So now I have two questions: Is this really intended? Is `noise_fn` necessary at all?
Edit: Is it correct that the `noise_fn` function is relevant if one wants to use Bayes by Backprop to train the network, and that otherwise the reparameterization trick is used?
Thanks for your help, and thanks for sharing the code/experiments of your paper!
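To make the question concrete, my understanding of the reparameterization trick in a stochastic linear layer is sketched below (an assumption about the intent, not confirmed by the author; `reparam_linear` and all shapes are illustrative, and NumPy stands in for PyTorch):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparam_linear(x, mu, rho):
    """One stochastic forward pass of a Bayesian linear layer via the
    reparameterization trick: w = mu + sigma * eps with sigma = softplus(rho).
    The noise eps is sampled outside the learnable transform, so gradients
    with respect to mu and rho flow through deterministically."""
    sigma = np.log1p(np.exp(rho))        # softplus keeps sigma positive
    eps = rng.standard_normal(mu.shape)  # fresh Gaussian noise per pass
    w = mu + sigma * eps                 # sampled weights
    return x @ w.T

# hypothetical shapes: batch of 2, 4 input features, 3 output features
mu = np.zeros((3, 4))
rho = np.full((3, 4), -5.0)  # small initial sigma
x = np.ones((2, 4))
out = reparam_linear(x, mu, rho)
print(out.shape)  # (2, 3)
```

If this matches what the second `forward` does, then `noise_fn` would indeed only matter for a Bayes-by-Backprop-style noise injection.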
Hi,
Downloading the checkpoint files from http://muradin.cs.ucdavis.edu:9876/ was not successful. I also tried downloading them in Chrome/Firefox and from the terminal using `wget`.
Please let me know how I can get these checkpoints :)
Thanks,
Hi,
I am training a VGG VI network on CIFAR-10, and the validation accuracy remains very low (20%) even after training for 200 epochs. The model was overfitting, with training accuracy reaching past 70%. I added L2 regularization (`weight_decay`) to the optimizer, but there was still no increase in validation accuracy.
Is there a reason for it? What am I doing wrong here? I used all the default parameters.
Thanks,
Kumar
Hi,
thanks for your interesting paper and for releasing the source code for the experiments, too. When I saw your paper for the first time, I was really interested in it as the results looked very promising.
During the past few weeks, I have worked with this code and tried to reproduce your results. Doing this, I noticed some things in your code that I would like to talk about.
My first observation is about your experiments with the VGG network. I noticed that you calculate the KL divergence for all layers in the network but do not use the sum of these divergences, only the divergence of the last layer (cf. L39 of vgg_vi.py). Therefore, your proposed regularization is applied only to the last layer during training, not to the entire network. This differs from the description you give in the ICLR paper. Did you do this on purpose? I tried to train the network using the divergence over the whole network and failed with the hyperparameters reported in the paper. Is there something I am overlooking?
My second observation is about your STL10 experiments, which I focused on after my initial observations of the issues with the VGG network. I noticed that one can improve the effectiveness of the PGD attack on the Bayesian network by averaging the gradients over multiple forward-backward passes before actually performing the PGD step on the input. Using this method, I was able to decrease the accuracy of your model (based on the checkpoint you uploaded). The following table compares the results listed in the ICLR print with the ones I obtained with my modified attack:
Attack | ε = 0.015 | ε = 0.035 | ε = 0.055 | ε = 0.07
---|---|---|---|---
Your PGD attack | 51.8 | 37.6 | 27.2 | 21.1
My PGD attack | 47.0 | 30.3 | 16.0 | 8.6
Here, one sees that for every value of the perturbation strength the modified attack is stronger, i.e. decreases the model's accuracy more strongly. This is especially true for large perturbations (e.g. 21.1% vs. 8.6%).
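For clarity, the gradient-averaging modification I used can be sketched as follows (a minimal NumPy sketch, not the exact code I ran; `grad_fn` is a hypothetical stand-in for one stochastic forward-backward pass through the Bayesian network):

```python
import numpy as np

rng = np.random.default_rng(0)

def pgd_averaged(x, grad_fn, eps=0.07, alpha=0.01, steps=20, n_avg=10):
    """PGD where each step averages gradients over n_avg stochastic
    forward-backward passes before taking the sign step. Averaging
    stabilizes the attack direction against a randomized model."""
    x_adv = x.copy()
    for _ in range(steps):
        g = np.mean([grad_fn(x_adv) for _ in range(n_avg)], axis=0)
        x_adv = x_adv + alpha * np.sign(g)          # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)    # project to the eps-ball
    return x_adv

# toy randomized "gradient": a fixed direction plus per-call noise
grad_fn = lambda z: np.ones_like(z) + rng.standard_normal(z.shape)
x = np.zeros(5)
x_adv = pgd_averaged(x, grad_fn)
print(np.abs(x_adv - x).max() <= 0.07)  # True: the perturbation stays bounded
```

With `n_avg = 1` this reduces to ordinary PGD; increasing it approximates the expected gradient of the stochastic model.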
Next, I played around a little with the training of the normal VGG network (without Bayesian reasoning). By tweaking the hyperparameters, I was able to get these accuracies under a PGD attack with the same attack parameters as you used for the results in your paper:
Training method | ε = 0.015 | ε = 0.035 | ε = 0.055 | ε = 0.07
---|---|---|---|---
Your hyperparameters | 46.7 | 27.4 | 12.8 | 7.0
My hyperparameters | 43.5 | 30.7 | 18.9 | 12.8
Now, one can compare this set of results with the results I got with my modified PGD attack on your Bayesian model. It looks like the Bayesian model/training is not really improving the robustness of the network.
I am really looking forward to your answers and explanations, hoping that all of this can easily be solved.
Thanks,
Roland
In the code `BayesianDefense/models/vgg_vi.py`:

```python
def forward(self, x):
    kl_sum = 0
    out = x
    for l in self.features:
        if type(l).__name__.startswith("Rand"):
            out, kl = l.forward(out)
            if kl is not None:
                kl_sum += kl
        else:
            out = l.forward(out)
    out = out.view(out.size(0), -1)
    out, kl = self.classifier.forward(out)
    kl_sum += kl
    return out, kl
```
I guess kl, one of the return values, should be kl_sum.
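Assuming the intent is to regularize every stochastic layer, the accumulation would look like this stripped-down, standalone sketch (the lambda "layers" are stand-ins for the repo's `Rand*` modules, each returning `(output, kl)`):

```python
# Minimal sketch of the intended KL accumulation across stochastic layers.
def forward(layers, x):
    kl_sum = 0.0
    out = x
    for layer in layers:
        out, kl = layer(out)
        if kl is not None:
            kl_sum += kl
    return out, kl_sum  # return the accumulated KL, not `kl`,
                        # which holds only the last layer's term

# toy stand-in layers: (transform, kl contribution)
layers = [
    lambda z: (z + 1, 0.5),   # stochastic layer
    lambda z: (z * 2, None),  # deterministic layer, no KL
    lambda z: (z - 1, 0.25),  # stochastic layer
]
out, kl_sum = forward(layers, 0)
print(out, kl_sum)  # 1 0.75
```

Returning `kl` instead of `kl_sum` silently drops every contribution except the classifier's, which would also explain the earlier observation that only the last layer is regularized.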
I am looking forward to explanations :)
Thanks,
HarryKim
According to the RSE paper (http://openaccess.thecvf.com/content_ECCV_2018/papers/Xuanqing_Liu_Towards_Robust_Neural_ECCV_2018_paper.pdf), the test process should add several outputs together to make a prediction. But in your code, it seems you just run the network once and make the prediction.
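In other words, my reading of the paper's test procedure is something like the following sketch (a toy illustration, not the repo's code; `model` here is a made-up randomized classifier returning class probabilities):

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_predict(model, x, n_samples=10):
    """RSE-style test-time prediction: average the class probabilities of
    several stochastic forward passes, then take the argmax, instead of
    predicting from a single noisy pass."""
    probs = np.mean([model(x) for _ in range(n_samples)], axis=0)
    return int(probs.argmax())

# toy randomized model over 3 classes: noisy logits favoring class 1
def model(x):
    logits = np.array([0.0, 1.0, 0.2]) + 0.5 * rng.standard_normal(3)
    e = np.exp(logits - logits.max())
    return e / e.sum()

print(ensemble_predict(model, None))
```

A single-pass prediction corresponds to `n_samples=1` and is noticeably noisier for a randomized network.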