Computer Vision Models
Classification
- AlexNet - ImageNet Classification with Deep Convolutional Neural Networks, 2012
- VGGNets - Very Deep Convolutional Networks for Large-Scale Image Recognition, 2014
- GoogLeNet - Going Deeper with Convolutions, 2014
- Inception-V3 - Rethinking the Inception Architecture for Computer Vision, 2015
- Inception-V4 and Inception-ResNet - Inception-v4, Inception-ResNet and the Impact of Residual Connections on Learning, 2016
- ResNet - Deep Residual Learning for Image Recognition, 2015
- SqueezeNet - SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size, 2016
- ResNeXt - Aggregated Residual Transformations for Deep Neural Networks, 2016
- Res2Net - Res2Net: A New Multi-scale Backbone Architecture, 2019
- ReXNet - Rethinking Channel Dimensions for Efficient Model Design, 2020
- Xception - Xception: Deep Learning with Depthwise Separable Convolutions, 2016
- DenseNet - Densely Connected Convolutional Networks, 2016
- DLA - Deep Layer Aggregation, CVPR, 2017
- DPN - Dual Path Networks, 2017
- Non-Local - Non-local Neural Networks, CVPR, 2017
- NASNet-A - Learning Transferable Architectures for Scalable Image Recognition, 2017
- PNasNet - Progressive Neural Architecture Search, 2017
- MobileNets - MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications, 2017
- MobileNetV2 - MobileNetV2: Inverted Residuals and Linear Bottlenecks, 2018
- MobileNetV3 - Searching for MobileNetV3, 2019
- ShuffleNet - ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices, 2017
- ShuffleNet V2 - ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design, 2018
- MnasNet - MnasNet: Platform-Aware Neural Architecture Search for Mobile, 2018
- GhostNet - GhostNet: More Features from Cheap Operations, CVPR, 2019
- SKNets - Selective Kernel Networks, CVPR, 2019
- ResNeSt - ResNeSt: Split-Attention Networks, 2020
- HRNet - Deep High-Resolution Representation Learning for Visual Recognition, 2019
- CSPNet - CSPNet: A New Backbone that can Enhance Learning Capability of CNN, 2019
- EfficientNet - EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks, 2019
- EfficientNetV2 - EfficientNetV2: Smaller Models and Faster Training, 2021
- RegNet - Designing Network Design Spaces, 2020
- GPU-EfficientNets - Neural Architecture Design for GPU-Efficient Networks, 2020
- HaloNets - Scaling Local Self-Attention for Parameter Efficient Visual Backbones, 2021
- LambdaNetworks - LambdaNetworks: Modeling Long-Range Interactions Without Attention, 2021
- RepVGG - RepVGG: Making VGG-style ConvNets Great Again, 2021
- HardCoRe-NAS - HardCoRe-NAS: Hard Constrained diffeRentiable Neural Architecture Search, 2021
- NFNet - High-Performance Large-Scale Image Recognition Without Normalization, 2021
- NF-ResNets - Characterizing signal propagation to close the performance gap in unnormalized ResNets, 2021
- ConvMixer - Patches Are All You Need?, 2021
- ConvNeXt - A ConvNet for the 2020s, CVPR, 2022
Transformer
- ViT - An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale, ICLR, 2020
- DeiT - Training data-efficient image transformers & distillation through attention, 2020
- Swin Transformer - Swin Transformer: Hierarchical Vision Transformer using Shifted Windows, ICCV, 2021
- Twins - Twins: Revisiting the Design of Spatial Attention in Vision Transformers, NeurIPS, 2021
MLP
- MLP-Mixer - MLP-Mixer: An all-MLP Architecture for Vision, 2021
- ResMLP - ResMLP: Feedforward networks for image classification with data-efficient training, 2021
- gMLP - Pay Attention to MLPs, 2021
Self-supervised
- MAE - Masked Autoencoders Are Scalable Vision Learners, 2021
Object Detection
- R-CNN - Rich feature hierarchies for accurate object detection and semantic segmentation, CVPR, 2013
- Fast R-CNN - Fast R-CNN, ICCV, 2015
- Faster R-CNN - Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks, 2015
- YOLOv1 - You Only Look Once: Unified, Real-Time Object Detection, 2015
- SSD - SSD: Single Shot MultiBox Detector, ECCV, 2015
- FPN - Feature Pyramid Networks for Object Detection, 2016
Semantic Segmentation
- FCN - Fully Convolutional Networks for Semantic Segmentation, CVPR, 2014
- UNet - U-Net: Convolutional Networks for Biomedical Image Segmentation, MICCAI, 2015
- PSPNet - Pyramid Scene Parsing Network, CVPR, 2016
- DeepLabv3 - Rethinking Atrous Convolution for Semantic Image Segmentation, 2017
- DeepLabv3+ - Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation, ECCV, 2018
- Mask R-CNN - Mask R-CNN, 2017
Generative Models
GANs
- GAN - Generative Adversarial Networks, 2014
- DCGAN - Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks, ICLR, 2016
- WGAN - Wasserstein GAN, 2017
VAEs
- VAE - Auto-Encoding Variational Bayes, 2013
- β-VAE - beta-VAE: Learning Basic Visual Concepts with a Constrained Variational Framework, ICLR, 2017