Code Monkey home page Code Monkey logo

cmc's Introduction

Contrastive Multiview Coding

This repo covers the implementation for CMC, which learns representations from multiview data in a self-supervised way (by multiview, we mean multiple sensory, multiple modal data, or literally multiple viewpoint data. It's flexible to define what is a "view"):

"Contrastive Multiview Coding" Paper, Project Page.

Teaser Image

Highlights

(1) Representation quality as a function of number of contrasted views.

We found that, the more views we train with, the better the representation (of each single view).

(2) Contrastive objective v.s. Predictive objective

We compare the contrastive objective to cross-view prediction, finding an advantage to the contrastive approach.

(3) Unsupervised v.s. Supervised

ResNet-101 trained with our unsupervised CMC objective (only contrasting Luminance with Chrominance) surpasses supervisedly trained AlexNet on ImageNet classification (60.1% v.s. 59.3%). For this first time on ImageNet classification, unsupervised methods are surpassing the classic supervised-AlexNet proposed in 2012 (CPC++ and DIM++ also achieve this milestone concurrently).

Installation

This repo was tested with Ubuntu 16.04.5 LTS, Python 3.5, PyTorch 0.4.0, and CUDA 9.0. But it should be runnable with recent PyTorch versions >=0.4.0

Note: It seems to us that training with Pytorch version >= 1.0 yields slightly worse results. If you find the similar discrepancy and figure out the problem, please report this since we are trying to fix it as well.

Training Encoder with CMC on ImageNet

NCE flags:

  • --nce_k: number of negatives to contrast for each positive. Default: 4096
  • --nce_m: the momentum for dynamically updating the memory. Default: 0.5
  • --nce_t: temperature that modulates the distribution. Default: 0.07 for ImageNet, 0.1 for STL-10

Path flags:

  • --data_folder: specify the ImageNet data folder.
  • --model_path: specify the path to save model.
  • --tb_path: specify where to save tensorboard monitoring events.

An example of command line for training CMC (Default: AlexNet on Single GPU)

CUDA_VISIBLE_DEVICES=0 python train_CMC.py --data_folder path/to/data --model_path path/to/save --tb_path path/to/tensorboard

By default, the training scripts will use L and ab as two views to contrast with each other. If you want to specify other image channels as different views, simply modifying here,

Training Linear Classifier

Path flags:

  • --data_folder: specify the ImageNet data folder. Should be the same as above.
  • --save_path: specify the path to save the linear classifier.
  • --tb_path: specify where to save tensorboard events monitoring linear classifier training.

Specify the checkpoint that you want to evaluate with --model_path flag, this path should directly point to the .pth file.

Therefore, an example of command line for evaluating, say ./models/ckpt.pth, should look like:

CUDA_VISIBLE_DEVICES=0 python LinearProbing.py --data_folder path/to/data --save_path path/to/save --tb_path path/to/tensorboard --model_path ./models/ckpt.pth

Citation

If you find this repo useful for your research, please consider citing the paper

@inproceedings{tian2019cmc,
  title={Contrastive Multiview Coding},
  author={Yonglong Tian and Dilip Krishnan and Phillip Isola},
  journal={arXiv preprint arXiv:1906.05849},
  year={2019}
}

For any questions, please contact Yonglong Tian ([email protected]).

Acknowledgements

Part of this code is inspired by Zhirong Wu's unsupervised learning algorithm lemniscate.

cmc's People

Contributors

hobbitlong avatar

Watchers

 avatar  avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.