
hsja's Issues

Which true gradient does HSJA estimate? (in function "approximate_gradient")

Dear Sir,

In your code (HSJA/hsja.py, line 165 in daecd5c):

def approximate_gradient(model, sample, num_evals, delta, params):

I thought this function is used to estimate the true gradient, i.e. the gradient computed by the following code:

    def get_grad(self, model, x, true_labels, target_labels):
        # White-box reference gradient: d cw_loss / d x via autograd.
        with torch.enable_grad():
            # x = torch.clamp(x, min=0, max=1.0).cuda()
            x = x.cuda()
            x.requires_grad_()
            logits = torch.softmax(model(x), dim=1)
            if true_labels is not None:
                true_labels = true_labels.cuda()
            if target_labels is not None:
                target_labels = target_labels.cuda()
            loss = self.cw_loss(logits, true_labels, target_labels)
            gradient = torch.autograd.grad(loss, x, retain_graph=True)[0].cpu().detach()
        return gradient

where cw_loss is defined as follows:

    def cw_loss(self, logit, label, target=None):
        if target is not None:
            # targeted cw loss: logit_t - max_{i\neq t}logit_i
            _, argsort = logit.sort(dim=1, descending=True)
            target_is_max = argsort[:, 0].eq(target).long()
            second_max_index = target_is_max.long() * argsort[:, 1] + (1 - target_is_max).long() * argsort[:, 0]
            target_logit = logit[torch.arange(logit.shape[0]), target]
            second_max_logit = logit[torch.arange(logit.shape[0]), second_max_index]
            return target_logit - second_max_logit
        else:
            # untargeted cw loss: max_{i\neq y}logit_i - logit_y
            _, argsort = logit.sort(dim=1, descending=True)
            gt_is_max = argsort[:, 0].eq(label).long()
            second_max_index = gt_is_max.long() * argsort[:, 1] + (1 - gt_is_max).long() * argsort[:, 0]
            gt_logit = logit[torch.arange(logit.shape[0]), label]
            second_max_logit = logit[torch.arange(logit.shape[0]), second_max_index]
            return second_max_logit - gt_logit

But when I compute the cosine similarity between the two gradients, I find it is very low, about 0.02.
Which true gradient does approximate_gradient approximate?
Could you give me some code or an example, please?
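Not an official answer, but for reference, a minimal sketch of how such a comparison is usually set up; cosine_similarity below is a hypothetical helper, taking the outputs of approximate_gradient and the get_grad shown above:

    import numpy as np

    def cosine_similarity(estimated_grad, reference_grad):
        # Flatten both gradients and compare directions only; HSJA's estimate is a
        # direction, so its magnitude carries no information.
        g1 = np.asarray(estimated_grad, dtype=np.float64).ravel()
        g2 = np.asarray(reference_grad, dtype=np.float64).ravel()
        return float(g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2) + 1e-12))

One thing that might matter when the similarity looks low: HSJA's estimate is taken at the near-boundary sample produced by the binary search, not at the clean image, so the two gradients should be evaluated at the same point.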

Is it possible and easy to enable batch processing?

Hey,

Really nice work, and well-written code as well. I would like to ask whether batch processing (optimizing several images at the same time) could be implemented on top of the code base, and whether it will be supported in the future. If not, are there any suggestions for implementing batch mode myself?

Thanks!
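Not from the maintainers, but as a stopgap one can wrap the existing single-image attack in a plain loop; the sketch below assumes hsja.py exposes a per-image entry point hsja(model, sample, **kwargs), and it buys convenience rather than speed:

    import numpy as np
    from hsja import hsja  # assumed single-image entry point in this repo's hsja.py

    def attack_batch(model, samples, **hsja_kwargs):
        # Naive "batch" mode: attack each image independently, one after another.
        # Wall-clock time is unchanged; only the interface is batched.
        adversarials = [hsja(model, sample, **hsja_kwargs) for sample in samples]
        return np.stack(adversarials)

True batching would presumably require vectorising the decision_function queries across images that sit at different stages of the binary search and gradient estimation, which is the non-trivial part.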

Question about the L-infinity norm attack of HSJA

I read your code carefully, and the following points about the L-infinity norm version of HSJA confuse me.

  1. In this line: https://github.com/Jianbo-Lab/HSJA/blob/master/hsja.py#L225, why do you set highs = dists_post_update rather than highs = np.ones(len(perturbed_images)) as in the L2 norm version? And why is thresholds = np.minimum(dists_post_update * params['theta'], params['theta']) also different from the L2 norm version?

  2. In the function def project, why do you use the following code to project with alphas? The L2 norm version is easier to understand: it returns (1 - alphas) * original_image + alphas * perturbed_images, i.e. a point on the line segment between original_image and perturbed_images. How should the following L-inf code be understood? (See the small numerical check after this list.)

elif params['constraint'] == 'linf':
    out_images = clip_image(
        perturbed_images,
        original_image - alphas,
        original_image + alphas
        )
    return out_images
  3. Why do you use update = np.sign(gradf) as the update direction in the L-infinity norm version? (HSJA/hsja.py, line 99 in 6a91456: update = np.sign(gradf))

  4. Furthermore, is the gradf returned by def approximate_gradient the same as the normal vector of the decision hyperplane?
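Regarding question 2, here is a small numerical check (not the authors' answer); clip_image is re-implemented as element-wise clipping, which matches how the repo's helper appears to be used:

    import numpy as np

    def clip_image(image, clip_min, clip_max):
        # Element-wise clipping of each pixel into [clip_min, clip_max].
        return np.minimum(np.maximum(clip_min, image), clip_max)

    rng = np.random.default_rng(0)
    original_image = rng.random((8, 8))
    perturbed_image = np.clip(original_image + rng.normal(scale=0.3, size=(8, 8)), 0.0, 1.0)

    alpha = 0.1
    projected = clip_image(perturbed_image, original_image - alpha, original_image + alpha)

    # Clipping each pixel into [original - alpha, original + alpha] keeps the result
    # inside the L-inf ball of radius alpha around the original image, playing the
    # role that the straight-line blend plays for L2:
    print(np.max(np.abs(projected - original_image)) <= alpha + 1e-12)  # True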

Why is the perturbed image not clipped to min=0 and max=1 after the binary search step? Is it a bug?

I notice that the statement perturbed = clip_image(perturbed + epsilon * update, clip_min, clip_max) is placed before perturbed, dist_post_update = binary_search_batch(sample, perturbed[None], model, params) in lines 110-114 (geometric_progression): https://github.com/Jianbo-Lab/HSJA/blob/master/hsja.py#L114
Why is there no clipping operation after perturbed, dist_post_update = binary_search_batch(sample, perturbed[None], model, params)? Is it a bug?
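Not an authoritative answer, but one relevant observation: for the L2 constraint, the binary search only blends between the original sample and the already-clipped perturbed image, and a blend of two in-range images stays in range. A tiny self-contained check of that property (this re-implements only the L2 projection formula quoted in the issue above, not the repo's exact code):

    import numpy as np

    def project_l2(original_image, perturbed_image, alpha):
        # L2-style projection: a point on the segment between the two images.
        return (1 - alpha) * original_image + alpha * perturbed_image

    rng = np.random.default_rng(1)
    original_image = rng.random((32, 32, 3))    # valid image in [0, 1]
    perturbed_image = rng.random((32, 32, 3))   # also in [0, 1], since it was clipped beforehand

    for alpha in np.linspace(0.0, 1.0, 11):
        blended = project_l2(original_image, perturbed_image, alpha)
        assert 0.0 <= blended.min() and blended.max() <= 1.0  # the blend never leaves [0, 1]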

Question about delta

The pseudocode in the paper says that, for a given delta, when the classifier's outputs are all zero, delta is halved; but I do not seem to see this step in the code. Could you please explain this? I still occasionally run into this situation when using your code to run experiments under different gamma values or different networks.
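For concreteness, a sketch of the delta-halving retry as described above. This loop is NOT in hsja.py (that is exactly the question); decision_function and approximate_gradient are assumed importable from hsja.py with the signatures used elsewhere in these issues:

    import numpy as np
    from hsja import approximate_gradient, decision_function  # assumed importable from this repo

    def approximate_gradient_with_fallback(model, sample, num_evals, delta, params,
                                           min_delta=1e-8):
        # Sketch only: mirrors the paper's pseudocode as described in this issue.
        while delta > min_delta:
            # Probe with delta-sized random perturbations around the boundary sample.
            rv = np.random.randn(num_evals, *sample.shape)
            rv /= np.sqrt(np.sum(rv ** 2, axis=tuple(range(1, rv.ndim)), keepdims=True))
            decisions = decision_function(model, sample + delta * rv, params)
            if np.any(decisions):
                # At least one probe crossed the boundary, so the estimate is informative;
                # fall back to the repo's estimator at this delta (the extra probe above
                # is wasteful and kept only for clarity).
                return approximate_gradient(model, sample, num_evals, delta, params), delta
            delta /= 2.0  # every classifier decision was zero: halve delta and retry
        raise RuntimeError("delta shrank below min_delta without any boundary crossing")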

Provide parameters and example models for ImageNet

The default parameters do not work well for an Inception v3 model on the ImageNet dataset in my setup. This repo does not provide ImageNet models or corresponding parameters for HSJA. Could you provide example parameters and example models for the ImageNet dataset? Thanks a lot!

MNIST model for attack

Hi,

Thank you for providing the source code of the attack. However, this code only attacks the CIFAR-10 ResNet model. Is there a pretrained model that you used for MNIST images? If so, could you tell us which model you used?

Any information will be of great help!

Thanks!

How is input norm calculated?

I have a quick question about the input norm calculation. I think the l-inf norm is just the maximum difference in input space, and previous attack works mostly use 0.05 or 8/255. In your paper, the l-inf norms are quite different between ImageNet/CIFAR-100 and MNIST/CIFAR-10; is that because you compute the norm in [0, 255] space for ImageNet/CIFAR-100?
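To make the question concrete, a small illustration of how the same perturbation reads very differently depending on the pixel scale (the numbers are purely illustrative, not taken from the paper):

    import numpy as np

    rng = np.random.default_rng(0)
    clean = rng.random((32, 32, 3))                                   # image scaled to [0, 1]
    adv = np.clip(clean + rng.uniform(-8 / 255, 8 / 255, clean.shape), 0.0, 1.0)

    linf_01 = np.max(np.abs(adv - clean))                 # norm measured in [0, 1] space
    linf_255 = np.max(np.abs(255 * adv - 255 * clean))    # same perturbation in [0, 255] space

    print(linf_01)    # about 8/255 ≈ 0.031
    print(linf_255)   # about 8.0, i.e. 255x larger for the identical perturbation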

Problem running this code: AttributeError: 'KerasTensor' object has no attribute '_id'

I hit a problem running this code, but I don't know how to solve it. Please help. The traceback is:

    construct_original_network
        grads = tape.gradient(preds[:,c], x)
    File "C:\Users\高婷\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\backprop.py", line 1086, in gradient
        unconnected_gradients=unconnected_gradients)
    File "C:\Users\高婷\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\imperative_grad.py", line 77, in imperative_grad
        compat.as_str(unconnected_gradients.value))
    AttributeError: 'KerasTensor' object has no attribute '_id'

Unnecessary binary search?

Could you explain the difference between the binary search in the initialize method [1] and the binary search right after it [2]? It seems to me that the second one is unnecessary, since the image has already been projected onto the boundary.

[1] Binary search in the initialize method

HSJA/hsja.py, lines 288 to 300 in daecd5c:

# Binary search to minimize l2 distance to original image.
low = 0.0
high = 1.0
while high - low > 0.001:
    mid = (high + low) / 2.0
    blended = (1 - mid) * sample + mid * random_noise
    success = decision_function(model, blended[None], params)
    if success:
        high = mid
    else:
        low = mid
initialization = (1 - high) * sample + high * random_noise

[2] Binary search right after initialize

HSJA/hsja.py, lines 81 to 86 in daecd5c:

# Project the initialization to the boundary.
perturbed, dist_post_update = binary_search_batch(sample,
                                                  np.expand_dims(perturbed, 0),
                                                  model,
                                                  params)
dist = compute_distance(perturbed, sample, constraint)

A problem about convergence

Hello, I've run into a problem.
When I use MNIST, CIFAR-10, or ImageNet, the l2 norm is almost unchanged after the first binary search.
Looking forward to your reply, thanks.

HSJA is extremely slow when I use the params in this repo

I find that when I set the params to 'max_iter=64, max_eval=10000, init_eval=100' to attack a ResNet on MNIST, it is very time-consuming. I use a batch of 200 MNIST images, but the algorithm spends more than 12 hours producing the adversarial examples for this batch. BTW, I use an RTX Titan to test the speed. Is it normal for this to take so long, or am I doing something wrong?
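A rough back-of-the-envelope query count for those settings, assuming (as in the paper) that the per-iteration evaluation budget grows like init_eval * sqrt(t) and is capped at max_eval, and ignoring the extra binary-search and initialization queries:

    import math

    max_iter, max_eval, init_eval = 64, 10000, 100

    queries_per_image = sum(min(int(init_eval * math.sqrt(t)), max_eval)
                            for t in range(1, max_iter + 1))

    print(queries_per_image)        # ~34,500 gradient-estimation queries per image
    print(200 * queries_per_image)  # ~6.9 million model queries for the 200-image batch

Whether that is consistent with 12 hours then depends mostly on the per-query forward-pass cost and on the fact that the images are attacked sequentially, one at a time.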

The approximate_gradient function does not seem to follow the equations in the paper

HSJA/hsja.py, line 180 in 6a91456:

fval = 2 * decisions.astype(float).reshape(decision_shape) - 1.0

It seems that fval is the Boolean decision mapped to ±1, and that the estimate then averages fval minus its mean (a baseline). I do not understand this. Is this another approximation?

On the other hand, referring to eq. 11, this looks to me more like an estimate of an integral. Could you give me some details? Thanks.
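For reference, here is a stripped-down re-implementation of the estimator around the quoted line, as I read it; decision_function is assumed importable from hsja.py with the signature used elsewhere in these issues, and the edge-case handling of the original is omitted:

    import numpy as np
    from hsja import decision_function  # assumed importable from this repo

    def approximate_gradient_sketch(model, sample, num_evals, delta, params):
        # Random unit directions u_b around the (near-boundary) sample.
        rv = np.random.randn(num_evals, *sample.shape)
        rv /= np.sqrt(np.sum(rv ** 2, axis=tuple(range(1, rv.ndim)), keepdims=True))

        # phi(sample + delta * u_b) in {0, 1}, mapped to {-1, +1}.
        decisions = decision_function(model, sample + delta * rv, params)
        fval = 2.0 * decisions.astype(float).reshape(num_evals, *([1] * (rv.ndim - 1))) - 1.0

        # Baseline subtraction (the step the question asks about), then a Monte Carlo
        # average of fval * u_b: this approximates the direction of the gradient of the
        # decision indicator smoothed over the delta-sphere.
        fval -= np.mean(fval)
        gradf = np.mean(fval * rv, axis=0)
        return gradf / (np.linalg.norm(gradf) + 1e-12)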

The value of theta is different from the paper

Hi Jianbo,

Thanks for your nice work; its experimental results are really impressive.

I notice that the values of theta are actually different in the paper and the source code. From equation 17 in the paper, theta should be 0.001 * d^{-3/2} under the l2 constraint. However, in the source code, theta is set to 1 * d^{-3/2} in this newer commit, or 0.01 * d^{-1/2} in this older commit.
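To make the gap concrete, here are the three values implied by the formulas quoted above for a CIFAR-10-sized input (d = 32 * 32 * 3; the expressions come from this issue, not re-derived from the paper):

    d = 32 * 32 * 3                      # CIFAR-10 input dimension

    theta_paper = 0.001 * d ** -1.5      # eq. 17 in the paper, as quoted above
    theta_newer = 1.0 * d ** -1.5        # newer commit
    theta_older = 0.01 * d ** -0.5       # older commit

    print(theta_paper)   # ≈ 5.9e-09
    print(theta_newer)   # ≈ 5.9e-06  (1000x larger than the quoted eq. 17 value)
    print(theta_older)   # ≈ 1.8e-04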
