huawei-noah / disout
Code for AAAI 2020 paper, Beyond Dropout: Feature Map Distortion to Regularize Deep Neural Networks (Disout).
License: BSD 3-Clause "New" or "Revised" License
In resnet.py and train.py, weight_behind of Disout is set; however, in the ImageNet training process it is reset to None. Why? According to the paper, the distortion should be guided by the ERC of the next layer, where the next layer's weights are used to approximately compute the gradient of the distortion.
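For reference, this is the wiring I would expect from the paper: each Disout module is pointed at the weights of the layer behind it so the ERC can be estimated from them. The import path, constructor arguments, and hyperparameter values below are my guesses based on the attributes referenced in these issues, not a quote of the released resnet.py.

    import torch.nn as nn
    from disout import Disout  # assumed module path; names guessed from these issues

    conv1 = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
    disout1 = Disout(dist_prob=0.09, block_size=6, alpha=5.0)  # placeholder values
    conv2 = nn.Conv2d(64, 64, kernel_size=3, padding=1, bias=False)
    # point the Disout that follows conv1 at the weights of the next layer,
    # so the ERC of that layer can guide the distortion
    disout1.weight_behind = conv2.weight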
There is an async parameter in train_imagenet.py, but it is deprecated now (async became a reserved keyword in Python 3.7; PyTorch uses non_blocking instead).
Hi! Thanks for sharing your great work! I have some questions to ask you. Can Disout be used for 3D-CNN? What should I do? Thank you very much!
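Not an official answer, but the block mask looks mechanically straightforward to extend to 3D: the sketch below swaps the 2D max-pool for a 3D one. This is my guess under the DropBlock-style reading of the released 2D code, assuming an odd block_size; the function name is mine.

    import torch
    import torch.nn.functional as F

    def block_mask_3d(x, dist_prob=0.1, block_size=3):
        # x: (N, C, D, H, W); expand random seed voxels into cubic blocks
        gamma = dist_prob / block_size ** 3
        seeds = (torch.rand_like(x) < gamma).float()
        return F.max_pool3d(seeds, kernel_size=block_size,
                            stride=1, padding=block_size // 2)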
So the LinearScheduler for linearly increasing the distortion (similar to DropBlock) cannot work for multi-GPU training, since it uses a plain Python variable i (not a tensor). When we do the following:
def step(self):
    if self.i < len(self.drop_values):
        self.disout.dist_prob = self.drop_values[self.i]
    self.i += 1
The value of i never gets updated. You can try it if you want. My question is: how did you run this code to train ImageNet and get those results?
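For what it's worth, the usual cause is that nn.DataParallel re-replicates the module on every forward pass, so attribute updates made inside a replica are thrown away. A minimal workaround sketch, stepping the scheduler on the original module from the training loop (build_model, loader, and the scheduler attribute are hypothetical placeholders):

    import torch

    net = build_model()                        # placeholder for the actual network
    model = torch.nn.DataParallel(net).cuda()

    for images, targets in loader:
        # step on the original module, outside forward(); the replicas that
        # DataParallel creates this iteration then inherit the new dist_prob
        net.scheduler.step()
        output = model(images.cuda(non_blocking=True))
        # loss / backward / optimizer step as usual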
Dear authors,
Thanks for your great work; it's very useful for training CNNs. Would you mind sharing the training configuration so that I can reproduce the method? Another question: I noticed that DropBlock is trained for 270 epochs, while the default number of training epochs for Disout is 540. Does that make a big difference to the result?
Thanks
Can anyone tell me how to set the hyperparameters of Disout in a recurrent neural network?
Thanks for your great work!
I am very curious about the "Feature Map Distortion" experiments mentioned in the paper (for fully connected layers). The code contains only the block-wise distortion.
Could you provide an example of feature map distortion?
If we scale weight_behind by 10, the network will output the same features because of BN. Will the magnitude of the distortion still stay the same?
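While waiting for an official example of the fully connected case asked above, here is a minimal sketch built around the dist = alpha * 0.01 * var**0.5 * randn line quoted from the released code; the class name, arguments, and the additive combination rule are my assumptions, not the authors' implementation.

    import torch
    import torch.nn as nn

    class DisoutFC(nn.Module):
        # guessed FC variant: instead of zeroing units like dropout, perturb
        # selected units with noise scaled by the feature-map statistics
        def __init__(self, dist_prob=0.1, alpha=1.0):
            super().__init__()
            self.dist_prob = dist_prob
            self.alpha = alpha

        def forward(self, x):                      # x: (batch, features)
            if not self.training or self.dist_prob == 0:
                return x
            mask = (torch.rand_like(x) < self.dist_prob).float()
            var = x.var()                          # scalar variance of the feature map
            dist = self.alpha * 0.01 * var.sqrt() * torch.randn_like(x)
            return x + mask * dist                 # distort selected units, keep the rest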
Hi,
Thanks for your great work!
I am reproducing your first experiment, but even the baseline method I implemented gives results very different from those in the paper. (In your paper, Table 1 says the accuracy of the conventional CNN on CIFAR-10 reaches 81.99%.)
My implementation follows the "Implementation details" in your "Experiments on Fully Connected Layers" section as closely as possible. Maybe I missed some details or tricks (such as padding, momentum, etc.); could you provide more information about this experiment?
Hi, I have some questions about the code.
I rewrote the code in TensorFlow yesterday.
In my opinion, the released code shows that Disout is basically DropBlock with a random value added.
(0) In your code, weight_behind is always set to None.
(1) I am also confused about the ERC after reading the paper. Why use dist = self.alpha * 0.01 * (var ** 0.5) * torch.randn(*x.shape, device=x.device) to get the ERC? Where is this mentioned in your paper?
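To make the "DropBlock plus a random value" reading concrete, here is my sketch of the block-wise case, assuming an odd block_size; the function name and the exact combination rule are my assumptions, not the released implementation.

    import torch
    import torch.nn.functional as F

    def disout_block_sketch(x, dist_prob=0.1, block_size=3, alpha=1.0):
        # x: (N, C, H, W); build a DropBlock-style block mask, then add a
        # random distortion inside the blocks instead of zeroing them
        gamma = dist_prob / block_size ** 2            # seed probability, as in DropBlock
        seeds = (torch.rand_like(x) < gamma).float()
        block_mask = F.max_pool2d(seeds, kernel_size=block_size,
                                  stride=1, padding=block_size // 2)
        dist = alpha * 0.01 * x.var().sqrt() * torch.randn_like(x)
        return x + block_mask * dist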
Dear Huawei-Noah:
Thank you very much for your work. I hope to reproduce the ideas in the paper, but while debugging the code I found that the code for the experiments on fully connected layers is missing. Could you provide the train_cnn.py and train100_cnn.py code? Thank you very much!
https://github.com/huawei-noah/Disout/blob/master/models/resnet.py#L188
The forward pass performs only one step of the distortion update per weight update (see Line 106 in 1c8591f).
But the paper says:
Could anyone explain where this inconsistency comes from?
BTW, there is a typo in the paper: I guess it should be "l+1" rather than "l+l".