
solo-learn

A library of self-supervised methods for unsupervised visual representation learning powered by PyTorch Lightning. We aim to provide state-of-the-art self-supervised methods in a comparable environment while also implementing useful training tricks. The library is self-contained, but the models can also be used outside of solo-learn.
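Since the models can be reused outside the library, a common task is extracting a pretrained encoder's weights from a training checkpoint. The sketch below is an assumption-laden illustration, not solo-learn's documented API: it assumes the Lightning checkpoint stores encoder weights under a `backbone.` prefix, which you should adjust to match your checkpoint's actual keys.

```python
# Sketch: reusing a pretrained encoder outside solo-learn.
# ASSUMPTION (not documented API): encoder weights live under a "backbone."
# prefix inside the checkpoint's state_dict; adjust the prefix as needed.
import torch
import torch.nn as nn

def extract_backbone_state(checkpoint: dict, prefix: str = "backbone.") -> dict:
    """Keep only encoder weights and strip the prefix so they load into a plain model."""
    state = checkpoint.get("state_dict", checkpoint)
    return {k[len(prefix):]: v for k, v in state.items() if k.startswith(prefix)}

# Toy stand-in for an encoder (use e.g. torchvision's resnet18 in practice).
encoder = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Simulate a saved training checkpoint whose keys carry the "backbone." prefix.
ckpt = {"state_dict": {f"backbone.{k}": v for k, v in encoder.state_dict().items()}}

fresh = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))
missing, unexpected = fresh.load_state_dict(extract_backbone_state(ckpt))
print(len(missing), len(unexpected))  # both 0: all encoder weights matched
```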


News

  • [Oct 10 2021]: 👹 Restructured augmentation pipelines to allow more flexibility and multicrop. Also added multicrop for BYOL.
  • [Sep 27 2021]: 🐕 Added NNSiam, NNBYOL, new tutorials for implementing new methods 1 and 2, more testing and fixed issues with custom data and linear evaluation.
  • [Sep 19 2021]: 🦘 Added online k-NN evaluation.
  • [Sep 17 2021]: 🤖 Added ViT and Swin.
  • [Sep 13 2021]: 📖 Improved Docs and added tutorials for pretraining and offline linear eval.
  • [Aug 13 2021]: 🐳 DeepCluster V2 is now available.
  • [Jul 31 2021]: 🦔 ReSSL is now available.
  • [Jul 21 2021]: 🧪 Added Custom Dataset support.
  • [Jul 21 2021]: 🎠 Added AutoUMAP.

Methods available:

  • Barlow Twins
  • BYOL
  • DeepCluster V2
  • DINO
  • MoCo V2+
  • NNBYOL
  • NNCLR
  • NNSiam
  • ReSSL
  • SimCLR
  • SimSiam
  • SwAV
  • VICReg
  • W-MSE


Extra flavor

Data

  • Increased data processing speed by up to 100% using NVIDIA DALI.
  • Asymmetric and symmetric augmentations.

Evaluation and logging

  • Online linear evaluation via stop-gradient for easier debugging and prototyping (optionally available for the momentum encoder as well).
  • Online k-NN evaluation.
  • Standard offline linear evaluation.
  • All the perks of PyTorch Lightning (mixed precision, gradient accumulation, clipping, automatic logging and much more).
  • Easy-to-extend modular code structure.
  • Custom model logging with a simpler file organization.
  • Automatic feature space visualization with UMAP.
  • Common metrics and more to come...
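The online linear evaluation above relies on a stop-gradient: the probe is trained on detached features, so its supervised loss never reaches the encoder. A minimal sketch of that mechanism in plain PyTorch (the names `encoder` and `classifier` are illustrative, not solo-learn's API):

```python
# Online linear evaluation via stop-gradient: the linear probe trains on
# detached features, so its loss does not backpropagate into the encoder.
import torch
import torch.nn as nn

torch.manual_seed(0)
encoder = nn.Linear(16, 8)      # stand-in for a self-supervised backbone
classifier = nn.Linear(8, 10)   # online linear probe

x = torch.randn(4, 16)
y = torch.randint(0, 10, (4,))

feats = encoder(x)
logits = classifier(feats.detach())  # stop-gradient: encoder sees no probe loss
loss = nn.functional.cross_entropy(logits, y)
loss.backward()

print(encoder.weight.grad is None)         # True: gradient stopped at detach
print(classifier.weight.grad is not None)  # True: the probe still learns
```

Because the probe's gradients stop at the `detach`, its accuracy can be monitored during pretraining without perturbing the self-supervised objective.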

Training tricks

  • Multi-crop data loading following SwAV:
    • Note: currently, only SimCLR supports this.
  • Exclude batchnorm and biases from LARS.
  • No LR scheduler for the projection head in SimSiam.
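The LARS exclusion trick above is usually implemented by splitting parameters into optimizer groups: 1-D tensors (biases and batchnorm weights) get no weight decay and are flagged to skip LARS adaptation. A sketch under that assumption (the `lars_exclude` flag name is illustrative; real LARS implementations vary):

```python
# Sketch of excluding batchnorm and biases from LARS / weight decay.
# ASSUMPTION: the "lars_exclude" group key is hypothetical; a real LARS
# optimizer would read its own flag name. Plain SGD is used here to show
# that extra per-group options pass through PyTorch optimizers.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU())

decay, no_decay = [], []
for name, p in model.named_parameters():
    # Biases and batchnorm parameters are 1-D; conv/linear weights have ndim >= 2.
    (no_decay if p.ndim <= 1 else decay).append(p)

param_groups = [
    {"params": decay, "weight_decay": 1e-4},
    {"params": no_decay, "weight_decay": 0.0, "lars_exclude": True},
]
optimizer = torch.optim.SGD(param_groups, lr=0.1, momentum=0.9)
print(len(decay), len(no_decay))  # 1 conv weight vs. conv bias + BN weight/bias
```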

Requirements

  • torch
  • tqdm
  • einops
  • wandb
  • pytorch-lightning
  • lightning-bolts

Optional:

  • nvidia-dali

NOTE: if you are using CUDA 10.X, change nvidia-dali-cuda110 to nvidia-dali-cuda100 in setup.py, line 7.


Installation

To install the repository with Dali and/or UMAP support, use:

pip3 install .[dali,umap]

If no Dali/UMAP support is needed, the repository can be installed as:

pip3 install .

NOTE: If you want to modify the library, install it in editable (dev) mode:

pip3 install -e .

NOTE 2: Soon to be available on PyPI.


Training

For pretraining the encoder, use one of the bash scripts in bash_files/pretrain/.

After that, for offline linear evaluation, follow the examples in bash_files/linear.

NOTE: The scripts are kept up to date and follow the recommended parameters of each paper as closely as possible, but double-check them before running.


Results

Note: hyperparameters may not be optimal; we will eventually re-run the lower-performing methods.

CIFAR-10

| Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| Barlow Twins | ResNet18 | 1000 | ❌ | 92.10 | | 99.73 | | 🔗 |
| BYOL | ResNet18 | 1000 | ❌ | 92.58 | | 99.79 | | 🔗 |
| DeepCluster V2 | ResNet18 | 1000 | ❌ | 88.85 | | 99.58 | | 🔗 |
| DINO | ResNet18 | 1000 | ❌ | 89.52 | | 99.71 | | 🔗 |
| MoCo V2+ | ResNet18 | 1000 | ❌ | 92.94 | | 99.79 | | 🔗 |
| NNCLR | ResNet18 | 1000 | ❌ | 91.88 | | 99.78 | | 🔗 |
| ReSSL | ResNet18 | 1000 | ❌ | 90.63 | | 99.62 | | 🔗 |
| SimCLR | ResNet18 | 1000 | ❌ | 90.74 | | 99.75 | | 🔗 |
| SimSiam | ResNet18 | 1000 | ❌ | 90.51 | | 99.72 | | 🔗 |
| SwAV | ResNet18 | 1000 | ❌ | 89.17 | | 99.68 | | 🔗 |
| VICReg | ResNet18 | 1000 | ❌ | 92.07 | | 99.74 | | 🔗 |
| W-MSE | ResNet18 | 1000 | ❌ | 88.67 | | 99.68 | | 🔗 |

CIFAR-100

| Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| Barlow Twins | ResNet18 | 1000 | ❌ | 70.90 | | 91.91 | | 🔗 |
| BYOL | ResNet18 | 1000 | ❌ | 70.46 | | 91.96 | | 🔗 |
| DeepCluster V2 | ResNet18 | 1000 | ❌ | 63.61 | | 88.09 | | 🔗 |
| DINO | ResNet18 | 1000 | ❌ | 66.76 | | 90.34 | | 🔗 |
| MoCo V2+ | ResNet18 | 1000 | ❌ | 69.89 | | 91.65 | | 🔗 |
| NNCLR | ResNet18 | 1000 | ❌ | 69.62 | | 91.52 | | 🔗 |
| ReSSL | ResNet18 | 1000 | ❌ | 65.92 | | 89.73 | | 🔗 |
| SimCLR | ResNet18 | 1000 | ❌ | 65.78 | | 89.04 | | 🔗 |
| SimSiam | ResNet18 | 1000 | ❌ | 66.04 | | 89.62 | | 🔗 |
| SwAV | ResNet18 | 1000 | ❌ | 64.88 | | 88.78 | | 🔗 |
| VICReg | ResNet18 | 1000 | ❌ | 68.54 | | 90.83 | | 🔗 |
| W-MSE | ResNet18 | 1000 | ❌ | 61.33 | | 87.26 | | 🔗 |

ImageNet-100

| Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| Barlow Twins 🚀 | ResNet18 | 400 | ✔️ | 80.38 | 80.16 | 95.28 | 95.14 | 🔗 |
| BYOL 🚀 | ResNet18 | 400 | ✔️ | 80.16 | 80.32 | 95.02 | 94.94 | 🔗 |
| DeepCluster V2 | ResNet18 | 400 | ❌ | 75.36 | 75.4 | 93.22 | 93.10 | 🔗 |
| DINO | ResNet18 | 400 | ✔️ | 74.84 | 74.92 | 92.92 | 92.78 | 🔗 |
| DINO 😪 | ViT Tiny | 400 | ❌ | 63.04 | TODO | 87.72 | TODO | 🔗 |
| MoCo V2+ 🚀 | ResNet18 | 400 | ✔️ | 78.20 | 79.28 | 95.50 | 95.18 | 🔗 |
| NNCLR 🚀 | ResNet18 | 400 | ✔️ | 79.80 | 80.16 | 95.28 | 95.30 | 🔗 |
| ReSSL | ResNet18 | 400 | ✔️ | 76.92 | 78.48 | 94.20 | 94.24 | 🔗 |
| SimCLR 🚀 | ResNet18 | 400 | ✔️ | 77.04 | 77.48 | 94.02 | 93.42 | 🔗 |
| SimSiam | ResNet18 | 400 | ✔️ | 74.54 | 78.72 | 93.16 | 94.78 | 🔗 |
| SwAV | ResNet18 | 400 | ✔️ | 74.04 | 74.28 | 92.70 | 92.84 | 🔗 |
| VICReg 🚀 | ResNet18 | 400 | ✔️ | 79.22 | 79.40 | 95.06 | 95.02 | 🔗 |
| W-MSE | ResNet18 | 400 | ✔️ | 67.60 | 69.06 | 90.94 | 91.22 | 🔗 |

🚀 methods where hyperparameters were heavily tuned.

😪 ViT is very compute-intensive and unstable, so we are slowly running larger architectures with larger batch sizes. At the moment, the total batch size is 128 and we needed to use float32 precision. If you want to contribute by running it, let us know!

ImageNet

| Method | Backbone | Epochs | Dali | Acc@1 (online) | Acc@1 (offline) | Acc@5 (online) | Acc@5 (offline) | Checkpoint |
|---|---|---|---|---|---|---|---|---|
| Barlow Twins | ResNet50 | 100 | ✔️ | | | | | |
| BYOL | ResNet50 | 100 | ✔️ | 68.63 | 68.37 | 88.80 | 88.66 | 🔗 |
| DeepCluster V2 | ResNet50 | 100 | ✔️ | | | | | |
| DINO | ResNet50 | 100 | ✔️ | | | | | |
| MoCo V2+ | ResNet50 | 100 | ✔️ | | | | | |
| NNCLR | ResNet50 | 100 | ✔️ | | | | | |
| ReSSL | ResNet50 | 100 | ✔️ | | | | | |
| SimCLR | ResNet50 | 100 | ✔️ | | | | | |
| SimSiam | ResNet50 | 100 | ✔️ | | | | | |
| SwAV | ResNet50 | 100 | ✔️ | | | | | |
| VICReg | ResNet50 | 100 | ✔️ | | | | | |
| W-MSE | ResNet50 | 100 | ✔️ | | | | | |

Training efficiency for DALI

We report the training efficiency of some methods using a ResNet18 with and without DALI (4 workers per GPU) on a server with an Intel i9-9820X CPU and two RTX 2080 Ti GPUs.

| Method | Dali | Total time for 20 epochs | Time for 1 epoch | GPU memory (per GPU) |
|---|---|---|---|---|
| Barlow Twins | ❌ | 1h 38m 27s | 4m 55s | 5097 MB |
| Barlow Twins | ✔️ | 43m 2s | 2m 10s (56% faster) | 9292 MB |
| BYOL | ❌ | 1h 38m 46s | 4m 56s | 5409 MB |
| BYOL | ✔️ | 50m 33s | 2m 31s (49% faster) | 9521 MB |
| NNCLR | ❌ | 1h 38m 30s | 4m 55s | 5060 MB |
| NNCLR | ✔️ | 42m 3s | 2m 6s (64% faster) | 9244 MB |

Note: the GPU memory increase does not scale with the model; rather, it scales with the number of DALI workers.


Citation

If you use solo-learn, please cite our preprint:

@misc{turrisi2021sololearn,
      title={Solo-learn: A Library of Self-supervised Methods for Visual Representation Learning},
      author={Victor G. Turrisi da Costa and Enrico Fini and Moin Nabi and Nicu Sebe and Elisa Ricci},
      year={2021},
      eprint={2108.01775},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://github.com/vturrisi/solo-learn},
}
