Introduction | Updates | Results & Pretrained Models | Statement
This repository contains the code, models, and test results for the paper ViTAE: Vision Transformer Advanced by Exploring Intrinsic Inductive Bias. ViTAE is built from several reduction cells and normal cells, which introduce scale-invariance and locality into vision transformers.
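Since the code is not yet released, the following is only a rough sketch of how a stack of reduction cells and normal cells could change the token grid of a 224x224 input. It assumes that a reduction cell downsamples the grid and that a normal cell keeps it unchanged. The `Cell` class, the `token_grid_sizes` helper, the downsampling ratios (4, 2, 2), and the number of cells are all hypothetical choices made for illustration; they are not taken from this repository.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    """Hypothetical description of one cell in the stack."""
    kind: str            # "reduction" or "normal"
    downsample: int = 1  # assumed spatial downsampling ratio; normal cells keep 1


def token_grid_sizes(image_size, cells):
    """Return the side length of the token grid after each cell."""
    sizes = []
    side = image_size
    for cell in cells:
        if cell.kind == "reduction":
            if side % cell.downsample != 0:
                raise ValueError("grid side must be divisible by the ratio")
            side //= cell.downsample  # reduction cells shrink the grid
        # normal cells leave the grid size unchanged
        sizes.append(side)
    return sizes


# Assumed layout: three reduction cells followed by two normal cells.
layout = [
    Cell("reduction", 4),
    Cell("reduction", 2),
    Cell("reduction", 2),
    Cell("normal"),
    Cell("normal"),
]

sizes = token_grid_sizes(224, layout)
print(sizes)  # [56, 28, 14, 14, 14]
```

Under these assumptions the reduction cells shrink the grid from 224 to 14 tokens per side, and the normal cells then operate on a fixed 14x14 grid.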
06/08/2021 The paper is posted on arXiv! The code will be made publicly available once it has been cleaned up.
name | resolution | acc@1 | acc@5 | acc@RealTop-1 | Pretrained |
---|---|---|---|---|---|
ViTAE-T | 224x224 | 75.3 | 92.7 | 82.9 | Coming Soon |
ViTAE-6M | 224x224 | 77.9 | 94.1 | 84.9 | Coming Soon |
ViTAE-13M | 224x224 | 81.0 | 95.4 | 86.9 | Coming Soon |
ViTAE-S | 224x224 | 82.0 | 95.9 | 87.0 | Coming Soon |
This project is for research purposes only. For any other questions, please contact yufei.xu at outlook.com or qmzhangzz at hotmail.com.