Code Monkey home page Code Monkey logo

spherical-cnn's Introduction

Learning SO(3) Equivariant Representations with Spherical CNNs

Abstract

We address the problem of 3D rotation equivariance in convolutional neural networks. 3D rotations have been a challenging nuisance in 3D classification tasks requiring higher capacity and extended data augmentation in order to tackle it. We model 3D data with multi-valued spherical functions and we propose a novel spherical convolutional network that implements exact convolutions on the sphere by realizing them in the spherical harmonic domain. Resulting filters have local symmetry and are localized by enforcing smooth spectra. We apply a novel pooling on the spectral domain and our operations are independent of the underlying spherical resolution throughout the network. We show that networks with much lower capacity and without requiring data augmentation can exhibit performance comparable to the state of the art in standard retrieval and classification benchmarks.

Demo

This repository contains a demo, where we train and test the model on the SO(3)-rotated ModelNet40 dataset.

Check the requirements in requirements.txt

Download the dataset here (7.0Gb).

The following commands should

  • uncompress the downloaded dataset from /tmp to ~/data/m40-so3-64,
  • clone this repo,
  • create a virtualenv,
  • install the requirements,
  • train and test the model.
mkdir -p ~/data/m40-so3-64 && tar -xvf /tmp/m40_tf_csph_3d_48aug_64.tar.gz -C ~/data/m40-so3-64

git clone https://github.com/daniilidis-group/spherical-cnn.git
cd spherical-cnn

virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt
TF_CPP_MIN_LOG_LEVEL=1 python3 scripts/train.py \
                               @params/model-64.txt \
                               @params/m40-64.txt \
                               @params/training.txt \
                               --dset_dir ~/data/m40-so3-64 \
                               --logdir /tmp/m40-so3 \
                               --run_id m40-so3

Sample output:

Running on m40-so3
Namespace(...)
Loading dataset from from_cached_tfrecords...
Loading model two_branch. Logdir /tmp/m40-so3
Start training...
epoch=1; lr=0.0010 train: 0.2286, valid: 0.2286
epoch=2; lr=0.0010 train: 0.4335, valid: 0.4335
epoch=3; lr=0.0010 train: 0.5975, valid: 0.5975
(...)
epoch=46; lr=0.0000 train: 0.9630, valid: 0.9630
epoch=47; lr=0.0000 train: 0.9640, valid: 0.9640
epoch=48; lr=0.0000 train: 0.9724, valid: 0.9724
Start testing...
| model   |  train |    val |   test | train time |
| m40-so3 | 0.9724 | 0.9724 | 0.8679 |      88.31 |

This reproduces the SO(3)/SO(3) result in table 1 (86.9%) of the paper. It runs in about 75min on a Nvidia GeForce GTX 1080 Ti, and requires ~2.2 Gb of GPU memory.

Change --dset_dir if you saved the dataset elsewhere.

Call pytest to run the unit tests.

References

Esteves, C., Allen-Blanchette, C., Makadia, A., & Daniilidis, K. Learning SO(3) Equivariant Representations with Spherical CNNs. European Conference on Computer Vision, ECCV 2018 (oral). http://arxiv.org/abs/1711.06721

@article{esteves17_learn_so_equiv_repres_with_spher_cnns,
  author = {Esteves, Carlos and Allen-Blanchette, Christine and Makadia, Ameesh and Daniilidis, Kostas},
  title = {Learning SO(3) Equivariant Representations With Spherical Cnns},
  journal = {CoRR},
  year = {2017},
  url = {http://arxiv.org/abs/1711.06721},
  archivePrefix = {arXiv},
  eprint = {1711.06721},
  primaryClass = {cs.CV},
}

Authors

Carlos Esteves [1], Christine Allen-Blanchette [1], Ameesh Makadia [2], Kostas Daniilidis [1]

[1] GRASP Laboratory, University of Pennsylvania

[2] Google

spherical-cnn's People

Contributors

machc avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.