
Ensemble Adversarial Training

This repository contains code to reproduce results from the paper:

Ensemble Adversarial Training: Attacks and Defenses
Florian Tramèr, Alexey Kurakin, Nicolas Papernot, Dan Boneh and Patrick McDaniel
ArXiv report: https://arxiv.org/abs/1705.07204


REQUIREMENTS

The code was tested with Python 2.7.12, TensorFlow 1.0.1, and Keras 1.2.2.

EXPERIMENTS

We start by training a few simple MNIST models. These are described in mnist.py.

python -m train models/modelA --type=0
python -m train models/modelB --type=1
python -m train models/modelC --type=2
python -m train models/modelD --type=3

Then, we can use (standard) Adversarial Training or Ensemble Adversarial Training (we train for either 6 or 12 epochs in the paper). With Ensemble Adversarial Training, we additionally augment the training data with adversarial examples crafted from external pre-trained models (models A, C and D here):

python -m train_adv models/modelA_adv --type=0 --epochs=12
python -m train_adv models/modelA_ens models/modelA models/modelC models/modelD --type=0 --epochs=12
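The key ingredient of Ensemble Adversarial Training is that, at each training step, part of the mini-batch is replaced with adversarial examples crafted against fixed, pre-trained models (A, C and D above) in addition to the model being trained. The following is a minimal sketch of that idea in TF 2 / Keras style; it is an illustration rather than the repository's code (which targets TensorFlow 1.0.1 / Keras 1.2.2), and the source-model schedule, batch mixing ratio, eps value and integer class labels are simplifying assumptions:

```
import random
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(1e-4)

def fgsm(source_model, x, y, eps):
    # One-step FGSM crafted against `source_model`, which may be a frozen,
    # pre-trained model rather than the model currently being trained.
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_fn(y, source_model(x, training=False))
    grad = tape.gradient(loss, x)
    return tf.clip_by_value(x + eps * tf.sign(grad), 0.0, 1.0)

def ens_adv_train_step(model, static_models, x, y, eps=0.3):
    # Replace half of the batch with adversarial examples crafted against a
    # randomly chosen source: either the current model or a pre-trained one.
    source = random.choice([model] + list(static_models))
    k = int(tf.shape(x)[0]) // 2
    x_mix = tf.concat([fgsm(source, x[:k], y[:k], eps), x[k:]], axis=0)
    with tf.GradientTape() as tape:
        loss = loss_fn(y, model(x_mix, training=True))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```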

The accuracy of the models on the MNIST test set can be computed using

python -m simple_eval test [model(s)]

To evaluate robustness to various attacks, we use

python -m simple_eval [attack] [source_model] [target_model(s)] [--parameters (opt)]

The attack can be:

  • fgs: standard FGSM. Parameters: eps (the norm of the perturbation).
  • rand_fgs: our FGSM variant that prepends the gradient computation with a random step (sketched below). Parameters: eps (the norm of the total perturbation); alpha (the norm of the random perturbation).
  • ifgs: the iterative FGSM. Parameters: eps (the norm of the perturbation); steps (the number of iterative FGSM steps).
  • CW: the Carlini and Wagner attack. Parameters: eps (the norm of the perturbation); kappa (the attack confidence).
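Since rand_fgs differs from the standard FGSM only in its initial random step, here is a minimal sketch of the variant for reference. It is written in TF 2 / Keras style, so it illustrates the attack rather than reproducing the repository's implementation, and it assumes inputs scaled to [0, 1] and integer class labels:

```
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def rand_fgsm(source_model, x, y, eps, alpha):
    # RAND+FGSM: take a random step of norm alpha first, then a regular FGSM
    # step of norm (eps - alpha) from the perturbed point, so that the total
    # perturbation stays within eps (parameter names follow the list above).
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    x_rand = tf.clip_by_value(
        x + alpha * tf.sign(tf.random.normal(tf.shape(x))), 0.0, 1.0)
    with tf.GradientTape() as tape:
        tape.watch(x_rand)
        loss = loss_fn(y, source_model(x_rand, training=False))
    grad = tape.gradient(loss, x_rand)
    return tf.clip_by_value(x_rand + (eps - alpha) * tf.sign(grad), 0.0, 1.0)
```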

Note that due to GPU non-determinism, the obtained results may vary by a few percent compared to those reported in the paper. Nevertheless, we consistently observe the following:

  • Standard Adversarial Training performs worse on transferred FGSM examples than on a "direct" FGSM attack on the model due to a gradient masking effect.
  • Our RAND+FGSM attack outperforms the FGSM when applied to any model. The gap is particularly pronounced for the adversarially trained model.
  • Ensemble Adversarial Training is more robust than (standard) adversarial training to transferred examples computed using any of the attacks above (see the transfer-evaluation sketch below).
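To make "transferred examples" concrete: the adversarial examples are crafted against one (source) model and then evaluated on a different (target) model, which never exposes its gradients to the attacker. Below is a rough TF 2 sketch of that black-box evaluation; it is an illustration, not the repository's simple_eval code, and it assumes inputs in [0, 1] and integer class labels:

```
import tensorflow as tf

def transfer_error_rate(source_model, target_model, x, y, eps=0.3):
    # Craft FGSM examples against source_model, then measure how often they
    # are misclassified by target_model (the transfer setting above).
    loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    x = tf.convert_to_tensor(x, dtype=tf.float32)
    with tf.GradientTape() as tape:
        tape.watch(x)
        loss = loss_fn(y, source_model(x, training=False))
    grad = tape.gradient(loss, x)
    x_adv = tf.clip_by_value(x + eps * tf.sign(grad), 0.0, 1.0)
    preds = tf.argmax(target_model(x_adv, training=False), axis=-1)
    wrong = tf.cast(preds != tf.cast(y, preds.dtype), tf.float32)
    return float(tf.reduce_mean(wrong))
```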
CONTACT

Questions and suggestions can be sent to [email protected]

ensemble-adv-training's Issues

No gradients provided for any variable: ['cnn_model_2/conv2d_6/kernel:0', 'cnn_model_2/conv2d_6/bias:0', 'cnn_model_2/conv2d_7/kernel:0', 'cnn_model_2/conv2d_7/bias:0', 'cnn_model_2/dense_4/kernel:0', 'cnn_model_2/dense_4/bias:0', 'cnn_model_2/dense_5/kernel:0', 'cnn_model_2/dense_5/bias:0'].


ValueError Traceback (most recent call last)
in ()

in train(train_dataset, train_labels_dataset, epochs)
9 labels = label_batch
10 print('logits->',logits.shape , 'labels->',labels.shape)
---> 11 train_step(logits,labels)
12
13 if (epoch + 1) % 15 == 0:

in train_step(logits, labels)
12 #loss = cross_entropy(labels,tf.argmax(logits,axis=1))
13 gradient_of_cnn = cnn_tape.gradient(loss,model.trainable_variables)
---> 14 cnn_optimizer.apply_gradients(zip(gradient_of_cnn,model.trainable_variables))

~/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py in apply_gradients(self, grads_and_vars, name)
394 ValueError: If none of the variables have gradients.
395 """
--> 396     grads_and_vars = _filter_grads(grads_and_vars)
    397     var_list = [v for (_, v) in grads_and_vars]
    398

~/anaconda3/lib/python3.7/site-packages/tensorflow/python/keras/optimizer_v2/optimizer_v2.py in _filter_grads(grads_and_vars)
922 if not filtered:
923 raise ValueError("No gradients provided for any variable: %s." %
--> 924 ([v.name for _, v in grads_and_vars],))
925 if vars_with_empty_grads:
926 logging.warning(

ValueError: No gradients provided for any variable: ['cnn_model_2/conv2d_6/kernel:0', 'cnn_model_2/conv2d_6/bias:0', 'cnn_model_2/conv2d_7/kernel:0', 'cnn_model_2/conv2d_7/bias:0', 'cnn_model_2/dense_4/kernel:0', 'cnn_model_2/dense_4/bias:0', 'cnn_model_2/dense_5/kernel:0', 'cnn_model_2/dense_5/bias:0'].


```
input_shape = (28, 28, 1)
class cnn_model(tf.keras.Model):
    def __init__(self,inputs=(28,28,1)):
        super(cnn_model,self).__init__()
        
        #self.conv1 = layers.Conv2D(32,(3,3),activation='relu',input_shape= input_shape)
        self.conv1 = layers.Conv2D(32, 3, 3, padding='same', activation='relu')
        self.maxpool = layers.MaxPool2D((2,2))
        self.conv2 = layers.Conv2D(64,(3,3),activation ='relu')
        self.conv3 = layers.Conv2D(128,(3,3),activation='relu')
        self.flatten = layers.Flatten()
        self.dense64 = layers.Dense(64,activation='relu')
        self.dense10 = layers.Dense(10,activation='relu')
        self.dropout = layers.Dropout(0.25)
    def call(self,x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.maxpool(x)
        x = self.dropout(x)
        x = self.flatten(x)
        x = self.dense64(x)
        x = self.dense10(x)
        return x

#loss = tf.losses.mean_squared_error(labels,logits)
cnn_optimizer = tf.optimizers.Adam(1e-4)
#optimizer = tf.train. (learning_rate=0.001)
#print(loss)


checkpoint_dir = './training_checkpoints'
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt")
checkpoint = tf.train.Checkpoint(cnn_optimizer=cnn_optimizer)

# Notice the use of `tf.function`
# This annotation causes the function to be "compiled".
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

#@tf.function
def train_step(logits,labels):    
    
    with tf.GradientTape() as cnn_tape:
        print("train_step")
        loss = tf.losses.mean_squared_error(labels,tf.argmax(logits,axis=1))
        #loss = tf.losses.BinaryCrossentropy(multi_class_labels=labels, logits=logits)
        #loss = cross_entropy(labels,tf.argmax(logits,axis=1))
        gradient_of_cnn = cnn_tape.gradient(loss,model.trainable_variables)
        cnn_optimizer.apply_gradients(zip(gradient_of_cnn,model.trainable_variables))

def train(train_dataset,train_labels_dataset,epochs):
    for epoch in range(epochs):
        start = time.time()
        
        for train_batch,label_batch in zip(train_dataset,train_labels_dataset):
            print(train_batch.shape,label_batch.shape)
            #logits = tf.argmax(model(train_batch),axis=1)
            logits = model(train_batch)
            labels = label_batch
            print('logits->',logits.shape , 'labels->',labels.shape)
            train_step(logits,labels)
            
        if (epoch + 1) % 15 == 0:
            checkpoint.save(file_prefix = checkpoint_prefix)
        print ('Time for epoch {} is {} sec'.format(epoch + 1, time.time()-start))

%%time
train(train_dataset, train_labels_dataset, EPOCHS)
```
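A likely explanation for the error above (my reading of the posted code, not an answer from the repository authors): the forward pass is run outside the GradientTape, and the loss is built on tf.argmax(logits), which is not differentiable, so the tape has no path from the loss back to the model's variables. A minimal corrected train_step, assuming integer class labels, would run the model inside the tape and use a differentiable loss:

```
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
cnn_optimizer = tf.keras.optimizers.Adam(1e-4)

def train_step(model, images, labels):
    # Run the forward pass inside the tape and use a differentiable loss
    # (no tf.argmax), so gradients can flow back to model.trainable_variables.
    with tf.GradientTape() as tape:
        logits = model(images, training=True)
        loss = loss_fn(labels, logits)
    grads = tape.gradient(loss, model.trainable_variables)
    cnn_optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```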

About the convergence of adversarial training.

Hello. Thanks for your great work. May I ask how to judge the convergence of adversarial training? Just from the loss curve?
As training goes on, the models become more robust. What happens if the attack fails during training?

Tool for loss surfaces?

Hello,
Could you disclose the tool you used for visualizing the loss surfaces of models in the publication?
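For reference, loss surfaces of this kind are typically produced by evaluating the model's loss on a 2-D grid of perturbations around an input, e.g. along the signed-gradient direction and a random direction, and plotting the grid with matplotlib. The sketch below shows that general procedure; it is an assumption about the method, not the authors' actual tool:

```
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)

def loss_surface(model, x, y, d1, d2, span=0.1, steps=21):
    # Evaluate the loss on the grid x + a*d1 + b*d2 for a, b in [-span, span].
    eps = np.linspace(-span, span, steps)
    grid = np.zeros((steps, steps))
    for i, a in enumerate(eps):
        for j, b in enumerate(eps):
            x_pert = tf.clip_by_value(x + a * d1 + b * d2, 0.0, 1.0)
            grid[i, j] = float(loss_fn(y, model(x_pert, training=False)))
    return eps, grid

# Example usage: d1 = sign of the loss gradient at x, d2 = a random sign
# direction, both with the same shape as x (e.g. (1, 28, 28, 1) for MNIST).
# eps, grid = loss_surface(model, x, y, d1, d2)
# plt.contourf(eps, eps, grid); plt.colorbar(); plt.show()
```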

How to solve the problem of "No gradients provided for any variable"

I tried to run the code, but I always get the same problem:

Does anyone know how to solve this problem?

tensorflow/lib/python2.7/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
('X_train shape:', (60000, 28, 28, 1))
(60000, 'train samples')
(10000, 'test samples')
Loaded MNIST test data.
Traceback (most recent call last):
  File "/anaconda2/envs/tensorflow/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/anaconda2/envs/tensorflow/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/DeepLearning/ensemble-adv-training/train.py", line 57, in <module>
    main(args.model, args.type)
  File "/DeepLearning/ensemble-adv-training/train.py", line 39, in main
    tf_train(x, y, model, X_train, Y_train, data_gen)
  File "tf_utils.py", line 79, in tf_train
    optimizer = tf.train.AdamOptimizer().minimize(loss)
  File "/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/training/optimizer.py", line 350, in minimize
    ([str(v) for _, v in grads_and_vars], loss))
ValueError: No gradients provided for any variable, check your graph for ops that do not support gradients, between variables ["<tf.Variable 'conv2d_1/kernel:0' shape=(5, 5, 1, 64) dtype=float32_ref>", "<tf.Variable 'conv2d_1/bias:0' shape=(64,) dtype=float32_ref>", "<tf.Variable 'conv2d_2/kernel:0' shape=(5, 5, 64, 64) dtype=float32_ref>", "<tf.Variable 'conv2d_2/bias:0' shape=(64,) dtype=float32_ref>", "<tf.Variable 'dense_1/kernel:0' shape=(25600, 128) dtype=float32_ref>", "<tf.Variable 'dense_1/bias:0' shape=(128,) dtype=float32_ref>", "<tf.Variable 'dense_2/kernel:0' shape=(128, 10) dtype=float32_ref>", "<tf.Variable 'dense_2/bias:0' shape=(10,) dtype=float32_ref>"] and loss Tensor("Mean:0", shape=(), dtype=float32).
