hsja's Issues
Which true gradient does HSJA estimate? (in the function `approximate_gradient`)
Dear Sir:
I thought the function `def approximate_gradient(model, sample, num_evals, delta, params)` in your code (line 165 in daecd5c) was used to estimate the true gradient, as in the following code:
```python
def get_grad(self, model, x, true_labels, target_labels):
    with torch.enable_grad():
        # x = torch.clamp(x, min=0, max=1.0).cuda()
        x = x.cuda()
        x.requires_grad_()
        logits = torch.softmax(model(x), dim=1)
        if true_labels is not None:
            true_labels = true_labels.cuda()
        if target_labels is not None:
            target_labels = target_labels.cuda()
        loss = self.cw_loss(logits, true_labels, target_labels)
        # gradient of the CW loss w.r.t. the input image
        gradient = torch.autograd.grad(loss, x, retain_graph=True)[0].cpu().detach()
    return gradient
```
where `cw_loss` is defined as follows:
```python
def cw_loss(self, logit, label, target=None):
    if target is not None:
        # targeted CW loss: logit_t - max_{i != t} logit_i
        _, argsort = logit.sort(dim=1, descending=True)
        target_is_max = argsort[:, 0].eq(target).long()
        second_max_index = target_is_max * argsort[:, 1] + (1 - target_is_max) * argsort[:, 0]
        target_logit = logit[torch.arange(logit.shape[0]), target]
        second_max_logit = logit[torch.arange(logit.shape[0]), second_max_index]
        return target_logit - second_max_logit
    else:
        # untargeted CW loss: max_{i != y} logit_i - logit_y
        _, argsort = logit.sort(dim=1, descending=True)
        gt_is_max = argsort[:, 0].eq(label).long()
        second_max_index = gt_is_max * argsort[:, 1] + (1 - gt_is_max) * argsort[:, 0]
        gt_logit = logit[torch.arange(logit.shape[0]), label]
        second_max_logit = logit[torch.arange(logit.shape[0]), second_max_index]
        return second_max_logit - gt_logit
```
But when I compute the cosine similarity between the two gradients, I find it is very low, about 0.02. Which true gradient does `approximate_gradient` approximate? Can you give me some code or an example, please?
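Since the issue asks for an example, here is a minimal sketch of the comparison, under my own assumptions: both gradients have been flattened to 1-D arrays of equal length, and `est_grad` / `true_grad` are hypothetical names, not variables from the repo.

```python
import numpy as np

def cosine_similarity(est_grad, true_grad):
    """Cosine similarity between the HSJA estimate and a white-box gradient."""
    est = np.ravel(est_grad).astype(np.float64)
    true = np.ravel(true_grad).astype(np.float64)
    return est @ true / (np.linalg.norm(est) * np.linalg.norm(true))
```

One plausible reason for a low value: HSJA's `approximate_gradient` estimates the gradient direction of the boundary indicator at a point on (or very near) the decision boundary, so comparing it against a CW-loss gradient taken at a point far from the boundary, or using a large sampling radius `delta`, can be expected to give low cosine similarity.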
Is it possible and easy to enable batch processing?
Hey,
Really nice work, and well-written code as well. I would like to ask whether batch processing (optimizing several images at the same time) is possible to implement on top of this code base, and whether it will be supported in the future. If not, do you have any suggestions for implementing batch mode myself?
Thanks!
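A minimal sketch of the usual first step toward batch mode, under my own assumptions (the `model` here is a hypothetical callable mapping an `(N, H, W, C)` array to `(N, num_classes)` logits; this is not a feature of the repo):

```python
import numpy as np

def batched_decision_function(model, images, original_label, clip_min=0.0, clip_max=1.0):
    """One forward pass for N candidate images; returns a boolean per image
    that is True when the prediction differs from original_label (untargeted)."""
    images = np.clip(images, clip_min, clip_max)
    logits = model(images)
    return np.argmax(logits, axis=1) != original_label
```

The harder part is the per-image control flow: each image's binary search and step-size search converge at different rates, so a full batch mode also needs masking of already-converged images.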
Question about the L-infinity norm attack of HSJA.
I read your code carefully, and the following points about the L-infinity version of HSJA confuse me.

1. At https://github.com/Jianbo-Lab/HSJA/blob/master/hsja.py#L225, why do you set `highs = dists_post_update` rather than `highs = np.ones(len(perturbed_images))` as in the L2 version? And why is `thresholds = np.minimum(dists_post_update * params['theta'], params['theta'])` different from the L2 version?

2. In the function `def project`, why do you use the following code to project with `alphas`? The L2 version is easier to understand: `return (1-alphas) * original_image + alphas * perturbed_images` gives the point on the line segment between `original_image` and `perturbed_images`. How should I understand the following L-inf code? (See the sketch after this list.)

```python
elif params['constraint'] == 'linf':
    out_images = clip_image(
        perturbed_images,
        original_image - alphas,
        original_image + alphas
    )
    return out_images
```

3. Why do you use `update = np.sign(gradf)` as the gradient direction in the L-infinity version? (Line 99 in 6a91456)

4. Furthermore, is the `gradf` returned by `def approximate_gradient` the same vector as the normal vector of the decision hyperplane?
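On points 2 and 3, here is a minimal numpy sketch of the two standard l-inf facts involved (my own illustration, not code from the repo; `x`, `p`, and `g` are hypothetical arrays):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=8)        # original image (flattened)
p = x + rng.normal(0, 0.5, size=8)   # some perturbed image
alpha = 0.1

# Point 2: projecting p onto the l-inf ball of radius alpha around x is exactly
# coordinate-wise clipping to [x - alpha, x + alpha].
proj = np.clip(p, x - alpha, x + alpha)
assert np.max(np.abs(proj - x)) <= alpha + 1e-12

# Point 3: over all steps u with ||u||_inf <= 1, the inner product <g, u> is
# maximized by u = sign(g) (Hoelder's inequality), so sign(gradf) is the
# steepest-ascent direction in the l-inf geometry.
g = rng.normal(size=8)
u = rng.normal(size=8)
u /= np.max(np.abs(u))               # normalize so ||u||_inf = 1
assert g @ np.sign(g) >= g @ u
```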
Why is the perturbed image not clipped to min=0 and max=1 after the binary search step? Bug?
I notice that the statement `perturbed = clip_image(perturbed + epsilon * update, clip_min, clip_max)` is placed before `perturbed, dist_post_update = binary_search_batch(sample, perturbed[None], model, params)` at lines 110-114 (geometric_progression): https://github.com/Jianbo-Lab/HSJA/blob/master/hsja.py#L114. Why is there no clipping operation after `binary_search_batch`? Is it a bug?
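A sketch of my own reasoning about why the clip may be redundant (an assumption, not an authoritative answer): in the l2 case the binary search returns a convex combination of `sample` and `perturbed`, and a convex combination of two in-range images stays in range.

```python
import numpy as np

rng = np.random.default_rng(1)
clip_min, clip_max = 0.0, 1.0
sample = rng.uniform(clip_min, clip_max, size=16)      # in range by definition
perturbed = rng.uniform(clip_min, clip_max, size=16)   # clipped just before the search

for alpha in rng.uniform(0, 1, size=100):
    mid = (1 - alpha) * sample + alpha * perturbed     # the l2 projection step
    assert clip_min <= mid.min() and mid.max() <= clip_max
```

The l-inf projection is itself a clip to `[original_image - alphas, original_image + alphas]`, which likewise cannot move an in-range `perturbed_images` out of `[clip_min, clip_max]`.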
Question about delta
The pseudocode in the paper describes that, for a given delta value, when the classifier outputs are all zero, the delta value is reduced by half, but I do not see this step in the code. Could you please explain this? I still occasionally encounter this problem when using your code to run experiments with different gamma values or different networks.
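A hypothetical sketch of such a fallback (my own construction based on the halving described in the question; the names loosely follow hsja.py, but the function, its signature, and the retry loop are assumptions, not repository code):

```python
import numpy as np

def approximate_gradient_with_fallback(decision_function, sample, num_evals,
                                       delta, max_halvings=5):
    """Monte Carlo gradient-direction estimate that halves delta and retries
    whenever every sampled decision is identical (i.e., no usable signal)."""
    for _ in range(max_halvings):
        rv = np.random.randn(num_evals, *sample.shape)
        rv /= np.sqrt(np.sum(rv ** 2, axis=tuple(range(1, rv.ndim)), keepdims=True))
        decisions = decision_function(sample + delta * rv)   # booleans, shape (num_evals,)
        fval = 2.0 * decisions.reshape(-1, *([1] * (rv.ndim - 1))) - 1.0
        if np.abs(np.mean(fval)) < 1.0:    # decisions disagree: keep this delta
            fval -= np.mean(fval)          # baseline subtraction
            gradf = np.mean(fval * rv, axis=0)
            return gradf / np.linalg.norm(gradf)
        delta /= 2.0                       # all-same decisions: halve delta, retry
    return None                            # give up after max_halvings attempts
```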
Provide parameters and example models for ImageNet
The default parameters do not work well for the Inception v3 model on the ImageNet dataset in my setup, and this repo does not provide ImageNet models or the corresponding HSJA parameters. Could you provide example parameters and example models for the ImageNet dataset? Thanks a lot!
What does baseline subtraction mean in the gradient estimation?
The code starting at https://github.com/Jianbo-Lab/HSJA/blob/master/hsja.py#L182 confuses me. What do these lines mean? I can understand the code above them, which estimates gradients and is essentially the RGF method.
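A minimal sketch of my reading of that step, assuming a toy linear decision boundary (everything here, including the names `w` and `phi`, is illustrative): subtracting the mean of the decisions acts as a control variate, leaving the estimated direction unchanged in expectation while reducing its variance when the +1/-1 decisions are imbalanced.

```python
import numpy as np

rng = np.random.default_rng(0)
d, B = 50, 200
w = rng.normal(size=d)
w /= np.linalg.norm(w)                     # true boundary normal

def phi(u):
    # +1/-1 decision at x + delta*u for a slightly offset linear boundary,
    # so the decisions are imbalanced and the baseline matters
    return np.sign(u @ w + 0.1)

def estimate(baseline):
    rv = rng.normal(size=(B, d))
    rv /= np.linalg.norm(rv, axis=1, keepdims=True)
    f = phi(rv)
    if baseline:
        f = f - f.mean()                   # the baseline subtraction in question
    g = (f[:, None] * rv).mean(axis=0)
    return g / np.linalg.norm(g)

raw = [estimate(False) @ w for _ in range(100)]
sub = [estimate(True)  @ w for _ in range(100)]
print(np.mean(raw), np.std(raw))           # noisier, weaker alignment with w
print(np.mean(sub), np.std(sub))           # typically higher mean, lower spread
```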
MNIST model for attack
Hi,
Thank you for providing the source code of the attack. However, this code only attacks the CIFAR-10 ResNet model. Is there a pretrained model that you used for the MNIST images? If so, could you tell us which model you used?
Any information will be of great help!
Thanks!
How is input norm calculated?
I have a quick question about the input-norm calculation. I think the linf norm is just the max difference in input space, and previous attack works mostly use 0.05 or 8/255. In your paper, the linf norms differ a lot between ImageNet/CIFAR-100 and MNIST/CIFAR-10; is that because you calculate the norm on the [0, 255] scale for ImageNet/CIFAR-100?
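For concreteness, a tiny sketch of the scale factor the question is about (my arithmetic, not a value taken from the paper):

```python
eps_01 = 8 / 255          # a common linf budget on the [0, 1] input scale
eps_255 = eps_01 * 255    # the same budget expressed on the [0, 255] scale
print(eps_01, eps_255)    # 0.0313..., 8.0 -- a factor-of-255 difference
```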
There is a problem when I run this code, but I don't know how to solve it. Please help:

```
construct_original_network grads = tape.gradient(preds[:,c], x)
  File "C:\Users\高婷\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\backprop.py", line 1086, in gradient
    unconnected_gradients=unconnected_gradients)
  File "C:\Users\高婷\AppData\Roaming\Python\Python37\site-packages\tensorflow\python\eager\imperative_grad.py", line 77, in imperative_grad
    compat.as_str(unconnected_gradients.value))
AttributeError: 'KerasTensor' object has no attribute '_id'
```
Why does this line of code exist? Help, please.
Line 333 in daecd5c
Which stepsize_search performs better in terms of queries and distortion?
Your code has two options for `stepsize_search`: `geometric_progression` and `grid_search`. Which one should I use in my experiments for the best performance in terms of queries and distortion?
Unnecessary binary search?
Could you explain the difference between the binary search in the initialize method [1] and the binary search right after it [2]? It seems to me that the second one is unnecessary, since the image has already been projected onto the boundary.
[1] Binary search in the initialize method
Lines 288 to 300 in daecd5c
[2] Binary search right after initialize
Lines 81 to 86 in daecd5c
A problem about convergence
Hello, I've run into a problem: when I use MNIST, CIFAR-10, or ImageNet, the l2 norm stays almost unchanged after the first binary search.
Looking forward to your reply. Thanks!
HSJA is extremely slow when I use the params in this repo
I find that when I set the params to 'max_iter=64, max_eval=10000, init_eval=100' to attack a ResNet on MNIST, it is very time-consuming: for a batch of 200 MNIST images, the algorithm spends more than 12 hours obtaining the adversarial examples. BTW, I use an RTX Titan to test the speed. Is it normal for it to cost this much time, or am I doing something wrong?
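A rough back-of-the-envelope sketch of where the time goes, under my assumption that the per-iteration query budget follows the paper's schedule B_t = B_0 * sqrt(t), capped at max_eval (binary-search and step-size queries come on top of this):

```python
import math

max_iter, init_eval, max_eval = 64, 100, 10000
per_image = sum(min(int(init_eval * math.sqrt(t)), max_eval)
                for t in range(1, max_iter + 1))
print(per_image)          # ~34,000 gradient-estimation queries per image
print(200 * per_image)    # ~6.9 million queries for the 200-image batch
```

At millions of sequential single-image forward passes, a 12-hour wall time is not surprising.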
The approximate_gradient function seems not to follow the equations in the paper
Line 180 in 6a91456
It seems that you use fval as the average of the difference between the binary decision function and a baseline in the estimate. I do not understand this; is it another approximation? On the other hand, referring to eq. 11, it looks to me more like an estimate of an integral. Could you give me some details? Thanks.
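A short derivation sketch of my own reading (not the authors' statement): the baseline-subtracted estimator differs from the plain Monte Carlo average of eq. 11 only by a term whose expectation is zero, since the sampled directions $u_b$ have mean zero:

```latex
\widehat{\nabla S}(x)
  = \frac{1}{B}\sum_{b=1}^{B}\bigl(\phi(x+\delta u_b)-\bar{\phi}\bigr)\,u_b
  = \underbrace{\frac{1}{B}\sum_{b=1}^{B}\phi(x+\delta u_b)\,u_b}_{\text{plain estimator}}
    \;-\; \bar{\phi}\cdot\underbrace{\frac{1}{B}\sum_{b=1}^{B}u_b}_{\mathbb{E}[\,\cdot\,]=0},
\qquad
\bar{\phi} = \frac{1}{B}\sum_{b=1}^{B}\phi(x+\delta u_b).
```

So the baseline does not bias the estimate; it acts as a control variate that reduces variance when the $\phi$ values are imbalanced.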
The value of theta is different from the paper
Hi Jianbo,
Thanks for your nice work; its experimental results are really impressive.
I notice that the values of theta actually differ between the paper and the source code. From equation 17 in the paper, theta should be 0.001 * d^{-3/2} under the l2 constraint. However, in the source code, theta is set to 1 * d^{-3/2} in this newer commit, or 0.01 * d^{-1/2} in this older commit.
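To make the discrepancy concrete, a small sketch for a CIFAR-10-sized input (d = 32*32*3 = 3072); the three formulas are the ones quoted above, which I have not independently verified against the repo:

```python
d = 32 * 32 * 3
paper_theta  = 0.001 * d ** -1.5    # eq. 17 in the paper (l2)
newer_commit = 1.0   * d ** -1.5    # 1000x the paper's value
older_commit = 0.01  * d ** -0.5    # a different exponent entirely
print(paper_theta, newer_commit, older_commit)
```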
Hi, I want to know which variable in hsja.py represents the number of queries.