Code Monkey home page Code Monkey logo

mobileformer's Introduction

MobileFormer

An implementation of MobileFormer proposed by Yinpeng Chen, Xiyang Dai et al.

Including

[1] Mobile-Former proposed in: 
                        Yinpeng Chen, Xiyang Dai et al., Mobile-Former: Bridging MobileNet and Transformer. 
                        arxiv.org/abs/2108.05895
[2] Dynamtic ReLU proposed in: 
                        Yinpeng Chen, Xiyang Dai et al., Dynamtic ReLU. 
                        arxiv.org/abs/2003.10027v2
[3] Lite-BottleNeck proposed in: 
                        Yunsheng Li, Yinpeng Chen et al., MicroNet: Improving Image Recognition with Extremely Low FLOPs. 
                        arxiv.org/abs/2108.05894v1
[4] Adam-W proposed in:
                        Ilya Loshchilov & Frank Hutter, Decoupled Weight Decay Regularization.
                        arxiv.org/abs/1711.05101v3
[5] Mixup proposed in:
                        Hongyi Zhang, Moustapha Cisse et al., Mixup: Beyond Empircal Risk Minimization.
                        arxiv.org/abs/1710.09412
[6] Multi-FocalLoss (not used), focal loss is proposed in:
                        Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, Piotr Dollár, Focal Loss for Dense Object Detection.
                        arxiv.org/abs/1708.02002

Note

(1) Due to the expanded DW conv used in strided Mobile-Former blocks, 
    the out_channel should be divisible by expand_size of the next block.
(2) Adam-W and Mixup is embedded in train.py.
(3) Use run() in train.py to train('run') or search('search'). There is an example in the train.py.

'###### The '#'s #######'

'##### are aligned #####'

No pre-train parameters for now.

About Training:

Following DeiT, there is an optional learning rate and weight decay set for grid search (if you want):
    LR from [5e-4, 3e-4, 5e-5] * batchsize / 256 ( or 512)
    WD from [0.03, 0.04, 0.05]
Looooooooong Training for CNN, but for transformer, its ok (maybe).

mobileformer's People

Contributors

slwang9353 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.