yapengtian / av-robustness-cvpr21
Can audio-visual integration strengthen robustness under multimodal attacks?
License: MIT License
main_attack.py, line 550:
builder.build_sound / builder.build_frame
`fc_dim` is an unexpected argument. I deleted it and the code ran fine. Should it be removed?
Hi, sorry to disturb you, but I ran into some problems when trying to reproduce your results. My results are sometimes more than 10% lower than yours, so I suspect something is wrong in how I run your code. My procedure on the AVE dataset is as follows:
(1) Run extract_audio.py and extract_frames.py to get the audio and frame files.
(2) Run train_attack_AVE.sh.
(3) Run eval_attack_AVE.sh to evaluate the robustness of the model obtained in (2). Since the epsilons in your defense settings are all 0.006 while some in your attack settings are 0.012, I changed the 0.012 values in main_attack.py to 0.006 to match main_defense.py, so the defense can be compared with other methods under the same settings.
(4) Run train_defense_AVE.sh.
(5) Run eval_defense_AVE.sh to evaluate the defense approach you proposed. As I understand it, this step should reproduce the results in your paper, but it didn't.
Is there something wrong with my steps?
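For reference, the procedure above can be written as a single script. The script names are the ones quoted in this thread; treat this as a sketch of the intended order of operations, not a verified pipeline (paths and arguments depend on your checkout and dataset location):

```shell
#!/usr/bin/env bash
# Sketch of the AVE reproduction steps quoted above; run from the repo root.
set -e
python extract_audio.py       # (1) extract audio tracks from the AVE videos
python extract_frames.py      # (1) extract video frames
bash train_attack_AVE.sh      # (2) train under multimodal attacks
bash eval_attack_AVE.sh       # (3) evaluate attacks (eps aligned to 0.006)
bash train_defense_AVE.sh     # (4) train the proposed defense
bash eval_defense_AVE.sh      # (5) evaluate the defense
```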
Besides the procedure, I also have some other questions:
When I ran extract_audio.py and extract_frames.py, I found that more than 20 videos in the AVE dataset had no audio files. Is this normal?
Are your reported defense results on the AVE dataset actually obtained with 60 epochs rather than 30? I had to run 30 + 30 epochs to reach your numbers.
The epsilons are 0.06 and 0.12 in your paper but 0.006 and 0.012 in your code; are these typos? I used 0.006 and 0.012, assuming the values in the code are the correct ones.
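For context on why the factor of ten matters: in standard L-infinity adversarial attacks, eps directly bounds the per-element change on inputs scaled to [0, 1]. A minimal NumPy sketch (not the authors' code; `clamp_perturbation` is a hypothetical helper) of how the budget constrains a perturbation:

```python
import numpy as np

def clamp_perturbation(x, x_adv, eps):
    """Project x_adv back into the L-infinity eps-ball around x, then into [0, 1]."""
    delta = np.clip(x_adv - x, -eps, eps)
    return np.clip(x + delta, 0.0, 1.0)

x = np.full((3, 4, 4), 0.5)      # clean input scaled to [0, 1]
x_adv = x + 0.05                 # candidate perturbation of size 0.05
small = clamp_perturbation(x, x_adv, eps=0.006)  # budget from the code
large = clamp_perturbation(x, x_adv, eps=0.06)   # budget from the paper
print(np.abs(small - x).max())   # clipped down to the 0.006 budget
print(np.abs(large - x).max())   # 0.05 fits inside the 0.06 budget
```

So with eps = 0.006 the 0.05 perturbation is clipped almost entirely away, while with eps = 0.06 it passes through untouched; results under the two budgets are not comparable.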
Hi, thanks for your amazing work.
May I ask when the code will be updated? Looking forward to it!
Hello, I am wondering:
Why is the sampling rate for extracting waveforms 11025 Hz in README.md but 11000 Hz in extract_audio.py?
Why is the recommended number of epochs for the AVE dataset 100 in your paper but 30 in train_defense_AVE.sh?
Why are the frame and audio learning rates for defense training on AVE 1e-3 and 1e-4 respectively in your paper, but 1e-4 and 1e-3 in train_defense_AVE.sh?
Hi, thank you for sharing your work. I am trying to reproduce it, but it seems that 'data/val.csv' is missing?