xuanqing94 / BayesianDefense
Adv-BNN: Improved Adversarial Defense through Robust Bayesian Neural Network
License: MIT License
At https://github.com/xuanqing94/BayesianDefense/blob/master/attacker/pgd.py#L25 (line 97 at commit c4c0be9):
`diff.clamp_(-eps, eps)`
should be
`diff = diff / diff.max() * eps`.
I wonder whether this affects the credibility of the proposed defense.
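For context, the two operations behave quite differently: `clamp_` is an exact element-wise projection onto the ℓ∞ ball of radius `eps`, while dividing by the maximum rescales every coordinate proportionally. A toy NumPy illustration (the repository itself uses PyTorch; the values here are made up, and the rescaling variant uses the absolute maximum so it works with signed perturbations):

```python
import numpy as np

eps = 0.1
diff = np.array([0.05, -0.3, 0.2])  # toy perturbation; some entries exceed eps

# Option 1: element-wise clamp (the code as written) -- an exact
# projection onto the l-infinity ball of radius eps; entries already
# inside the ball are left untouched.
clamped = np.clip(diff, -eps, eps)

# Option 2: rescale by the largest magnitude (the suggested change) --
# shrinks *all* coordinates proportionally so the largest one has
# magnitude exactly eps, preserving the direction of diff.
rescaled = diff / np.abs(diff).max() * eps

print(clamped)   # [ 0.05 -0.1   0.1 ]
print(rescaled)  # only the -0.3 entry maps to -0.1; the others shrink too
```

Which one is "correct" depends on whether the attack is meant to project onto the ε-ball (standard PGD) or to normalize the step; they coincide only when a single coordinate dominates.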
Can `KLLoss` and the reparameterization be customized?
I have a question regarding your `weight_noise.py` file. I feel that the `noise_fn` function is not really used in any other module. Take `linear.py`, for example: there are two definitions of the `forward()` function. The first one (defined in line 39) uses `noise_fn`, but it is later overwritten in line 46 by an implementation of `forward` that does not call `noise_fn` at all. So now I have two questions: Is this really intended? Is `noise_fn` necessary at all?
Edit: Is it correct that the `noise_fn` function is relevant if one wants to use Bayes by Backprop to train the network, and that otherwise the reparameterization trick is used?
Thanks for your help, and thanks for sharing the code/experiments of your paper!
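To make the question concrete, my understanding of the reparameterization trick in a stochastic linear layer is sketched below (an assumption about the intent, not confirmed by the author; `reparam_linear` and all shapes are illustrative, and NumPy stands in for PyTorch):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparam_linear(x, mu, rho):
    """One stochastic forward pass of a Bayesian linear layer via the
    reparameterization trick: w = mu + sigma * eps with sigma = softplus(rho).
    The noise eps is sampled outside the learnable transform, so gradients
    with respect to mu and rho flow through deterministically."""
    sigma = np.log1p(np.exp(rho))        # softplus keeps sigma positive
    eps = rng.standard_normal(mu.shape)  # fresh Gaussian noise per pass
    w = mu + sigma * eps                 # sampled weights
    return x @ w.T

# hypothetical shapes: batch of 2, 4 input features, 3 output features
mu = np.zeros((3, 4))
rho = np.full((3, 4), -5.0)  # small initial sigma
x = np.ones((2, 4))
out = reparam_linear(x, mu, rho)
print(out.shape)  # (2, 3)
```

If this matches what the second `forward` does, then `noise_fn` would indeed only matter for a Bayes-by-Backprop-style noise injection.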
Hi,
Downloading the checkpoint files from http://muradin.cs.ucdavis.edu:9876/ was not successful. I also tried downloading them in Chrome/Firefox and from the terminal using `wget`.
Please let me know how I can get these checkpoints :)
Thanks,
Hi,
I am training a VGG VI network on CIFAR-10, and the validation accuracy remains very low (20%) even after training for 200 epochs. The model was overfitting, with training accuracy reaching past 70%. I added L2 regularization (`weight_decay`) to the optimizer, but there was still no increase in validation accuracy.
Is there a reason for it? What am I doing wrong here? I used all the default parameters.
Thanks,
Kumar
Hi,
thanks for your interesting paper and for releasing the source code for the experiments, too. When I saw your paper for the first time, I was really interested in it as the results looked very promising.
During the past few weeks, I have worked with this code and tried to reproduce your results. Doing this, I noticed some things in your code that I would like to talk about.
My first observation is about your experiments with the VGG network. I noticed that you calculate the KL divergence for all layers in the network but do not use the sum of these divergences, only the divergence of the last layer (cf. L39 of vgg_vi.py). Therefore, your proposed regularization is applied only to the last layer during training, not to the entire network. This differs from the description you give in the ICLR paper. Did you do this on purpose? I tried to train the network using the divergence over the whole network and failed with the hyperparameters reported in the paper. Is there something I am overlooking?
My second observation is about your STL10 experiments, which I focused on after my initial observations of the issues with the VGG network. I noticed that one can improve the effectiveness of the PGD attack on the Bayesian network by averaging the gradients over multiple forward-backward passes before actually performing the PGD step on the input. Using this method, I was able to decrease the accuracy of your model (based on the checkpoint you uploaded). The following table compares the results listed in the ICLR print with the ones I obtained with my modified attack:
Attack | ε = 0.015 | ε = 0.035 | ε = 0.055 | ε = 0.07
---|---|---|---|---
Your PGD attack | 51.8 | 37.6 | 27.2 | 21.1
My PGD attack | 47.0 | 30.3 | 16.0 | 8.6
Here, one sees that for every value of the perturbation strength the modified attack is stronger, i.e. decreases the model's accuracy more strongly. This is especially true for large perturbations (e.g. 21.1% vs. 8.6%).
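For clarity, the gradient-averaging modification I used can be sketched as follows (a minimal NumPy sketch, not the exact code I ran; `grad_fn` is a hypothetical stand-in for one stochastic forward-backward pass through the Bayesian network):

```python
import numpy as np

rng = np.random.default_rng(0)

def pgd_averaged(x, grad_fn, eps=0.07, alpha=0.01, steps=20, n_avg=10):
    """PGD where each step averages gradients over n_avg stochastic
    forward-backward passes before taking the sign step. Averaging
    stabilizes the attack direction against a randomized model."""
    x_adv = x.copy()
    for _ in range(steps):
        g = np.mean([grad_fn(x_adv) for _ in range(n_avg)], axis=0)
        x_adv = x_adv + alpha * np.sign(g)          # ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)    # project to the eps-ball
    return x_adv

# toy randomized "gradient": a fixed direction plus per-call noise
grad_fn = lambda z: np.ones_like(z) + rng.standard_normal(z.shape)
x = np.zeros(5)
x_adv = pgd_averaged(x, grad_fn)
print(np.abs(x_adv - x).max() <= 0.07)  # True: the perturbation stays bounded
```

With `n_avg = 1` this reduces to ordinary PGD; increasing it approximates the expected gradient of the stochastic model.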
Next, I played around a little with the training of the normal VGG network (without Bayesian reasoning). By tweaking the hyperparameters, I was able to get these accuracies under a PGD attack with the same attack parameters as you used for the results in your paper:
Training method | ε = 0.015 | ε = 0.035 | ε = 0.055 | ε = 0.07
---|---|---|---|---
Your hyperparameters | 46.7 | 27.4 | 12.8 | 7.0
My hyperparameters | 43.5 | 30.7 | 18.9 | 12.8
Now, one can compare this set of results with the results I got with my modified PGD attack on your Bayesian model. It looks like the Bayesian model/training is not really improving the robustness of the network.
I am really looking forward to your answers and explanations, hoping that all of this can easily be solved.
Thanks,
Roland
In the code `BayesianDefense/models/vgg_vi.py`:

```python
def forward(self, x):
    kl_sum = 0
    out = x
    for l in self.features:
        if type(l).__name__.startswith("Rand"):
            out, kl = l.forward(out)
            if kl is not None:
                kl_sum += kl
        else:
            out = l.forward(out)
    out = out.view(out.size(0), -1)
    out, kl = self.classifier.forward(out)
    kl_sum += kl
    return out, kl
```
I guess kl, one of the return values, should be kl_sum.
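Assuming the intent is to regularize every stochastic layer, the accumulation would look like this stripped-down, standalone sketch (the lambda "layers" are stand-ins for the repo's `Rand*` modules, each returning `(output, kl)`):

```python
# Minimal sketch of the intended KL accumulation across stochastic layers.
def forward(layers, x):
    kl_sum = 0.0
    out = x
    for layer in layers:
        out, kl = layer(out)
        if kl is not None:
            kl_sum += kl
    return out, kl_sum  # return the accumulated KL, not `kl`,
                        # which holds only the last layer's term

# toy stand-in layers: (transform, kl contribution)
layers = [
    lambda z: (z + 1, 0.5),   # stochastic layer
    lambda z: (z * 2, None),  # deterministic layer, no KL
    lambda z: (z - 1, 0.25),  # stochastic layer
]
out, kl_sum = forward(layers, 0)
print(out, kl_sum)  # 1 0.75
```

Returning `kl` instead of `kl_sum` silently drops every contribution except the classifier's, which would also explain the earlier observation that only the last layer is regularized.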
I am looking forward to explanations :)
Thanks,
HarryKim
According to the RSE paper (http://openaccess.thecvf.com/content_ECCV_2018/papers/Xuanqing_Liu_Towards_Robust_Neural_ECCV_2018_paper.pdf), the test process should add several outputs together to make a prediction. But in your code, it seems you just run the network once and make the prediction.
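In other words, my reading of the paper's test procedure is something like the following sketch (a toy illustration, not the repo's code; `model` here is a made-up randomized classifier returning class probabilities):

```python
import numpy as np

rng = np.random.default_rng(0)

def ensemble_predict(model, x, n_samples=10):
    """RSE-style test-time prediction: average the class probabilities of
    several stochastic forward passes, then take the argmax, instead of
    predicting from a single noisy pass."""
    probs = np.mean([model(x) for _ in range(n_samples)], axis=0)
    return int(probs.argmax())

# toy randomized model over 3 classes: noisy logits favoring class 1
def model(x):
    logits = np.array([0.0, 1.0, 0.2]) + 0.5 * rng.standard_normal(3)
    e = np.exp(logits - logits.max())
    return e / e.sum()

print(ensemble_predict(model, None))
```

A single-pass prediction corresponds to `n_samples=1` and is noticeably noisier for a randomized network.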