Athena: A Framework for Defending Machine Learning Systems Against Adversarial Attacks
Home Page: https://softsys4ai.github.io/athena/
License: MIT License
This will serve as a baseline attack; we also compare against a model trained on random noise.
Hi, I'm trying to reproduce your experiment, but while evaluating FGSM attacks under white-box settings, I found that the normalized L2 dissimilarity for FGSM at eps=0.1 is only about 0.007. As in your code, the upper bound for a white-box attack is determined by some pre-generated adversarial examples. I'm wondering how you completed the experiments shown in Fig. 7 of your paper?
So far, we have empirical evidence showing a significant difference between BS (benign samples) and AE (adversarial examples) in inference time on the clean model and many transform models. Check the column "inference Probability" in Inference_Time-T_Test.xlsx for the t-test results on the inference time of BS and AE for every model.
Question: what is the root cause of this difference?
Investigation ideas:
- Which part of the network (lower layers or upper layers) has more impact on this difference?
- Measure the time cost of each layer for both BS and AE.
- Do inferences on BS and AE activate different numbers of neurons?
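As a starting point for the per-layer timing measurement, here is a minimal NumPy sketch; the two stand-in layers and all shapes are placeholders for the real Keras models. Run it once on a BS batch and once on an AE batch, then compare the timings layer by layer.

```python
import time
import numpy as np

# Placeholder two-layer network; in practice these would be the layers of
# the clean/transform Keras models.
rng = np.random.default_rng(0)
W1, W2 = rng.random((32, 64)), rng.random((64, 10))
layers = [lambda a: np.maximum(a @ W1, 0.0),  # "lower" layer (ReLU)
          lambda a: a @ W2]                   # "upper" layer (logits)

def time_per_layer(batch):
    """Return the wall-clock time spent in each layer for one batch."""
    timings, out = [], batch
    for layer in layers:
        start = time.perf_counter()
        out = layer(out)
        timings.append(time.perf_counter() - start)
    return timings

bs_batch = rng.random((256, 32))  # stands in for benign samples
ae_batch = rng.random((256, 32))  # stands in for adversarial examples
bs_times, ae_times = time_per_layer(bs_batch), time_per_layer(ae_batch)
```

Comparing the per-layer totals for the two batches would localize whether the lower or upper layers account for the inference-time gap.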
We need to prepare a Python notebook containing our end-to-end approach, with some specific examples.
white-box
[Error Message]
Traceback (most recent call last):
File "detection_as_defense.py", line 67, in
transformationList)
File "/home/kevinsjh_gmail_com/adversarial_transformers/util.py", line 1087, in predictionForTest
tranSamples = transform_images(curSamples, transformType)
File "/home/kevinsjh_gmail_com/adversarial_transformers/transformation.py", line 748, in transform_images
elif (transformation_type in TRANSFORMATION.NOISES):
File "/home/kevinsjh_gmail_com/adversarial_transformers/transformation.py", line 671, in add_noise
img_noised = skimage.util.random_noise(img, mode=noise_mode)
File "/home/kevinsjh_gmail_com/anaconda3/lib/python3.7/site-packages/skimage/util/noise.py", line 155, in random_noise
out = np.random.poisson(image * vals) / float(vals)
File "mtrand.pyx", line 4005, in mtrand.RandomState.poisson
ValueError: lam value too large.
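One likely cause (an inference from the traceback, not verified against the repo): in Poisson mode, skimage's `random_noise` computes `np.random.poisson(image * vals)`, and NumPy rejects `lam` values above roughly 9.2e18, so extreme pixel intensities overflow the check. A NumPy sketch mimicking that internal computation shows that rescaling the image to [0, 1] first keeps `lam` small:

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.random((8, 8)) * 1e12  # hypothetical out-of-range pixel values

# Same scaling skimage's random_noise uses for Poisson noise:
vals = 2 ** np.ceil(np.log2(len(np.unique(img))))

# Rescaling to [0, 1] first keeps lam = img01 * vals small enough:
img01 = (img - img.min()) / (img.max() - img.min())
noised = rng.poisson(img01 * vals) / float(vals)
```

If this diagnosis holds, normalizing images before the noise transformation should avoid the crash.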
After investigating the accuracy of all models (the clean model plus 32 transform models) under the FGSM250 AEs, we observed that a few transform models, whose accuracy was previously higher than the clean model's, now score lower than the clean model. Check the table below.
| Model Type | testset | trainset |
|---|---|---|
| Clean | 0.1143 | 0.1663 |
| rotate270 | 0.1034 | 0.1394 |
| erosion | 0.1336 | 0.1182 |
| opening | 0.3172 | 0.1363 |
| shift_bottom_right | 0.1033 | 0.1181 |
We should use t-SNE to look into some properties of the models trained on transformations and provide some intuition about why they work.
Use logging to manage output information, rather than the print function.
Use ensemble models trained on FGSM AEs and test them on BIM AEs.
Use ensemble models trained on BIM AEs and test them on FGSM AEs.
Some transformations cause crashes, while others generate black images.
Transformations to add:
- new filter transformations
- denoising transformations
- geometric transformations
- segmentation transformations
Rename some of the existing transformation types to organize all transformations better.
Add unit tests.
We'd like to provide the one-pixel attack in our toolkit.
Provide CW attacks (L0, L2, and L_inf norms).
We'd like to provide a simple black-box attack.
Use FLAGS to manage configuration.
File "scripts.py", line 308, in <module>
tf.app.run()
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/platform/app.py", line 40, in run
_run(main=main, argv=argv, flags_parser=_parse_flags_tolerate_undef)
File "/usr/local/lib/python3.5/dist-packages/absl/app.py", line 300, in run
_run_main(main, args)
File "/usr/local/lib/python3.5/dist-packages/absl/app.py", line 251, in _run_main
sys.exit(main(argv))
File "scripts.py", line 290, in main
generate_adversarial_examples(DATA.mnist, ATTACK.JSMA)
File "scripts.py", line 177, in generate_adversarial_examples
theta=theta, gamma=gamma)
File "/home/tester/advML/attacks/attacker.py", line 107, in get_adversarial_examples
X_adv, Y = whitebox.generate(model_name, X, Y, attack_method, attack_params)
File "/home/tester/advML/attacks/whitebox.py", line 172, in generate
adv_x = attacker.generate(model.input, **attack_params)
File "/home/tester/.local/lib/python3.5/site-packages/cleverhans/attacks/__init__.py", line 948, in generate
labels, nb_classes = self.get_or_guess_labels(x, kwargs)
File "/home/tester/.local/lib/python3.5/site-packages/cleverhans/attacks/__init__.py", line 281, in get_or_guess_labels
preds = self.model.get_probs(x)
File "/home/tester/.local/lib/python3.5/site-packages/cleverhans/utils_keras.py", line 179, in get_probs
return self.get_layer(x, name)
File "/home/tester/.local/lib/python3.5/site-packages/cleverhans/utils_keras.py", line 227, in get_layer
output = self.fprop(x)
File "/home/tester/.local/lib/python3.5/site-packages/cleverhans/utils_keras.py", line 203, in fprop
self.keras_model = KerasModel(new_input, out_layers)
File "/usr/local/lib/python3.5/dist-packages/keras/legacy/interfaces.py", line 91, in wrapper
return func(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 93, in __init__
self._init_graph_network(*args, **kwargs)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 231, in _init_graph_network
self.inputs, self.outputs)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1366, in _map_graph_network
tensor_index=tensor_index)
File "/usr/local/lib/python3.5/dist-packages/keras/engine/network.py", line 1347, in build_map
for i in range(len(node.inbound_layers)):
**TypeError: object of type 'InputLayer' has no len()**
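This `TypeError` is a common symptom of version skew between cleverhans' `KerasModelWrapper` and the installed Keras: the wrapper rebuilds a `keras.Model` from the wrapped layers, and newer Keras graph code no longer stores `node.inbound_layers` as a list. Pinning era-matching versions usually avoids it; the exact pins below are an assumption, not taken from the repo:

```shell
# Assumed compatible pins for the cleverhans 2.x-era Keras wrapper;
# adjust to the versions the repo actually targets.
pip install "tensorflow==1.13.1" "keras==2.2.4" "cleverhans==2.1.0"
```

Also make sure the model is built with standalone `keras` rather than `tensorflow.keras`; mixing the two produces the same kind of graph-traversal failure.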
Presume that attackers have access to all weak defenses but are unaware of the ensemble strategies.
Support the PGD attack.
We should investigate how adversarial attacks change the class activation mapping (CAM) and how our transformations change the map. This may reveal some understanding of why our approach works.
Support DeepFool (L_inf norm).
Approach 1: add randomness (random noise) to AEs and then use them to train ensemble models
Approach 2: use the strongest type of AEs to build ensemble models for defense
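Approach 1 can be sketched in a few lines; the noise scale `sigma` and the batch shape are hypothetical hyperparameters:

```python
import numpy as np

rng = np.random.default_rng(0)
aes = rng.random((100, 28, 28, 1))  # placeholder batch of adversarial examples
sigma = 0.05                        # hypothetical noise scale

# Perturb the AEs with Gaussian noise, clipped back to the valid pixel
# range, before using them as training data for the ensemble models.
noisy_aes = np.clip(aes + rng.normal(0.0, sigma, aes.shape), 0.0, 1.0)
```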
The U of SC computing resource center supports Sylabs.io. With that, we should be able to create a Docker container, which gives us root privilege, and deploy it to the computing resource center to access the GPU nodes.
Try it out after the conference deadlines.
The scripts, which work well on local machines, do not work on the RCI nodes; TensorFlow throws some exceptions. I strongly suspect a version gap: I am using TensorFlow 1.13 on local machines, while the RCI nodes have TensorFlow 1.12.
We need to look into this issue and fix the bugs on RCI.
Support the MIM attack.
Hi, is it possible to implement a reset function for the other transformations?
With some figures and appropriate plots.
The idea is to differentiate BS and AE based on their outputs from distinct transform models.
In Detecing adversarial samples from artifacts.pdf, it is shown that different models make different mistakes when presented with the same AEs. 2018-arXiv-PictureAE-Picture_AE_detection_bimodel.pdf proposes a bi-model approach that concatenates the outputs of an image from two distinct models as its feature representation and then feeds it to a binary classifier. The approach is claimed to reach >90% detection accuracy on MNIST and CIFAR-10.
Similarly, we can concatenate/stack the outputs of the transform models for an input image, use the stacked vector as a representation of the image, and feed it into a binary classifier. This might achieve higher detection accuracy and generalize better across different types of attacks.
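A sketch of the stacking idea with simulated softmax outputs; the number of models, the synthetic data, and the classifier choice are all assumptions, not the repo's actual pipeline:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_images, n_models, n_classes = 200, 5, 10

# Simulated softmax outputs: models agree on benign samples (tiled copies),
# while each model predicts differently on AEs (independent draws).
benign = np.tile(rng.dirichlet(np.full(n_classes, 0.1), size=n_images // 2),
                 (1, n_models))
aes = rng.dirichlet(np.full(n_classes, 0.1),
                    size=(n_images // 2, n_models)).reshape(n_images // 2, -1)

# Concatenated outputs become the image's feature vector; a binary
# classifier then separates benign (0) from adversarial (1).
X = np.vstack([benign, aes])
y = np.concatenate([np.zeros(n_images // 2), np.ones(n_images // 2)])
clf = LogisticRegression(max_iter=1000).fit(X, y)
```

In the real setting, `X` would come from running each input through all transform models and concatenating their softmax vectors.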
Investigation:
- Identify patterns in the prediction outputs of BS and AE.
- For BS and each type of AE, plot a boxplot of the average, min, and max accuracy across all transform models.
Detection approach 1: majority voting
The prediction outputs for AEs are much more diverse than those for BS. That is, the number of transform models that agree with each other on an AE will be much smaller than on a BS. If this number is below some threshold, say 75% x total_number_of_models, the input image is marked as an AE; otherwise, it is considered a benign sample.
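A minimal sketch of the voting rule; the label-array interface is a simplification of the real pipeline:

```python
import numpy as np

def flagged_as_ae(predicted_labels, threshold_ratio=0.75):
    """Mark an input as an AE when too few transform models agree."""
    _, counts = np.unique(predicted_labels, return_counts=True)
    return bool(counts.max() < threshold_ratio * len(predicted_labels))

# Four of five models agree -> benign; all disagree -> flagged as an AE.
assert flagged_as_ae(np.array([3, 3, 3, 3, 7])) is False
assert flagged_as_ae(np.array([3, 1, 4, 1, 5])) is True
```

The 0.75 ratio is the threshold from the note above; it would need tuning per dataset and model pool.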
Detection approach 2: distance matrix
Empirical evidence to collect: the distance between the prediction outputs of a benign sample from two distinct transform models is close to 0, while the distance between the prediction outputs of an AE from two distinct transform models should be much larger than 0.
Try different distance metrics: L2, entropy, KL divergence, cosine, correlation.
Distance matrix: for an image, create a distance matrix by computing the distances between its prediction outputs for each pair of transform models, then investigate any property that differs between AE and BS.
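The distance matrix for one image can be sketched with SciPy; the model outputs here are simulated, and the metric string can be swapped for cosine, correlation, etc.:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(1)
outputs = rng.dirichlet(np.ones(10), size=4)  # softmax of 4 transform models

# Symmetric matrix of pairwise L2 distances between the models' outputs;
# entry (i, j) is the distance between model i's and model j's prediction.
dist_matrix = squareform(pdist(outputs, metric="euclidean"))
```

Statistics of this matrix (e.g. mean off-diagonal distance) could then serve as the AE-vs-BS discriminator.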