
disout's People

Contributors

iamhankai, yehuitang


disout's Issues

Inconsistent training behavior in CIFAR and ImageNet

In resnet.py and train.py, the CIFAR training script sets the weight_behind attribute of Disout, but in the ImageNet training process weight_behind is reset to None. Why? According to the paper, the distortion should be guided by the ERC of the next layer, and the weights of that next layer are used to approximately compute the gradient of the distortion.

Can anyone help me solve this problem?

Hi, I am a CS master's student at TYUST, and I'm very interested in Disout, but I could not find this objective function in your code:

(two screenshots of the objective function from the paper)

Can you tell me where it is in the code? Thank you!

Can Disout be used for 3D-CNN?

Hi! Thanks for sharing your great work. I have a question: can Disout be used for a 3D-CNN, and if so, what would I need to change? Thank you very much!
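One possible direction (a NumPy sketch, not code from this repository): the block-wise mask used by Disout and DropBlock generalizes to 3D feature maps by sampling block centers over the (D, H, W) volume and expanding each center into a cubic block. All names here are illustrative assumptions.

```python
import numpy as np

def block_mask_3d(shape, block_size, gamma, rng):
    """Sample a 3D block mask: 1 = keep, 0 = inside a dropped cubic block.

    shape is (N, C, D, H, W); gamma is the seed probability. This is a
    sketch of extending DropBlock/Disout-style masking to 3D-CNNs, not
    the Disout repository's implementation.
    """
    n, c, d, h, w = shape
    half = block_size // 2
    # Sample seeds only where a full block fits inside the volume.
    seeds = np.zeros(shape, dtype=bool)
    inner = rng.random((n, c, d - 2 * half, h - 2 * half, w - 2 * half)) < gamma
    seeds[:, :, half:d - half, half:h - half, half:w - half] = inner
    mask = np.ones(shape, dtype=float)
    # Expand every seed into a block_size^3 cube of zeros.
    for ni, ci, di, hi, wi in zip(*np.nonzero(seeds)):
        mask[ni, ci,
             di - half:di + half + 1,
             hi - half:hi + half + 1,
             wi - half:wi + half + 1] = 0.0
    return mask

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 4, 8, 8, 8))  # (N, C, D, H, W) feature map
mask = block_mask_3d(x.shape, block_size=3, gamma=0.05, rng=rng)
x_disturbed = x * mask
```

In Disout the masked positions would receive a distortion value guided by the ERC rather than being zeroed; this sketch only covers the mask geometry, which is the part that changes when moving from 2D to 3D.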

Linear scheduler in multi-gpu training

The LinearScheduler for linearly increasing the distortion probability (similar to DropBlock) cannot work in multi-GPU training, since it uses a plain Python variable i (not a tensor). When we call the following:

def step(self):
    if self.i < len(self.drop_values):
        self.disout.dist_prob = self.drop_values[self.i]
    self.i += 1

the value of i never gets updated on the replicas. You can try it if you want. My question is: how did you run this code to train ImageNet and get those results?
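For reference, a minimal single-process sketch of such a scheduler (class and attribute names mirror the snippet above but are assumptions, not the repository's code) behaves as intended; the problem only appears when replicas hold separate copies of i:

```python
class LinearScheduler:
    """Ramp a distortion probability linearly from start to stop over nr_steps.

    Illustrative sketch only: names (dist_prob, drop_values, step) follow
    the snippet quoted in this issue, not the Disout repository.
    """

    def __init__(self, disout, start_value, stop_value, nr_steps):
        self.disout = disout
        self.i = 0  # plain Python counter: per-process state, not shared across GPUs
        delta = (stop_value - start_value) / max(nr_steps - 1, 1)
        self.drop_values = [start_value + delta * k for k in range(nr_steps)]

    def step(self):
        if self.i < len(self.drop_values):
            self.disout.dist_prob = self.drop_values[self.i]
        self.i += 1


class FakeDisout:
    """Stand-in holding only the dist_prob attribute."""
    dist_prob = 0.0


disout = FakeDisout()
sched = LinearScheduler(disout, start_value=0.0, stop_value=0.1, nr_steps=5)
for _ in range(5):
    sched.step()
print(disout.dist_prob)  # 0.1 after the full ramp in a single process
```

With torch.nn.DataParallel, module replicas are re-created on every forward pass, so in-place attribute updates made inside a replica are discarded; calling the scheduler's step() from the training loop on the source module (outside the replicated forward), or storing the counter as a registered buffer, avoids the issue.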

Training configuration

Dear authors,

Thanks for your great work; it's very useful for training CNNs. Would you mind sharing the training configuration so that I can reproduce the method? Another question: I noticed that DropBlock is trained for 270 epochs, while the default for Disout is 540 epochs. Does this make a big difference to the result?

Thanks

Could you provide an example of Feature Map Distortion?

Thanks for your great work!

I am very curious about the "Feature Map Distortion" experiments mentioned in the paper (for fully connected layers). The released code only includes block-wise distortion.

Could you provide an example of Feature Map Distortion?

Scale of weight_behind

If we scale weight_behind by 10, the network will output the same features because of BN. Will the magnitude of the distortion still stay the same?

Could you provide more details of "Experiments on Fully Connected Layers"?

Hi,

Thanks for your great work!

I am reproducing your first experiment, but even my implementation of the baseline differs considerably from the results in the paper. (Table 1 of your paper reports 81.99% accuracy for the conventional CNN on CIFAR-10.)

My implementation follows the "Implementation details" in the "Experiments on Fully Connected Layers" section as closely as possible. Perhaps I missed some details or tricks (such as padding or momentum); could you provide more information about this experiment?

May I submit a PR to your repository?

Hi,

I'm very interested in your work on Disout, and I have reproduced the conventional-CNN experiment from the paper. I would like to submit a PR to your repository, and I wonder if you would be willing to accept it.

Here are my results for the conventional CNNs:

(screenshot of results)

The weight_behind parameter

Hi, I have some questions about the code. I rewrote it in TensorFlow yesterday. As far as I can tell from the released code, Disout is DropBlock with an added random value.

1. In your code, weight_behind is always set to None.
2. I am also confused about the ERC after reading the paper. Why does `dist = self.alpha * 0.01 * (var ** 0.5) * torch.randn(*x.shape, device=x.device)` give the ERC? Where is this mentioned in your paper?

Project code

Dear Huawei-Noah:

Thank you very much for your work. I hope to reproduce the ideas in the article, but while debugging the code I found that the code for the experiments on fully connected layers is missing. Could you provide the train_cnn.py and train100_cnn.py code? Thank you very much!
