Equivariant Multi-View Networks

Abstract

Several popular approaches to 3D vision tasks process multiple views of the input independently with deep neural networks pre-trained on natural images, achieving view permutation invariance through a single round of pooling over all views. We argue that this operation discards important information and leads to subpar global descriptors. In this paper, we propose a group convolutional approach to multiple view aggregation where convolutions are performed over a discrete subgroup of the rotation group, enabling, thus, joint reasoning over all views in an equivariant (instead of invariant) fashion, up to the very last layer. We further develop this idea to operate on smaller discrete homogeneous spaces of the rotation group, where a polar view representation is used to maintain equivariance with only a fraction of the number of input views. We set the new state of the art in several large scale 3D shape retrieval tasks, and show additional applications to panoramic scene classification.
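
To make the contrast between invariant pooling and equivariant group convolution concrete, here is a minimal sketch in plain Python. It assumes the cyclic group C_12 (rotations about a single axis in 30-degree steps) as a simple stand-in for the discrete rotation subgroup used in the paper. The names `rotate_views` and `group_conv` are illustrative and are not taken from this repository.

```python
def rotate_views(f, s):
    """Act on per-view features with group element s: (L_s f)[g] = f[(g - s) mod n]."""
    n = len(f)
    return [f[(g - s) % n] for g in range(n)]


def group_conv(f, h):
    """Group convolution on C_n: (f * h)[g] = sum_k f[k] * h[(g - k) mod n]."""
    n = len(f)
    return [sum(f[k] * h[(g - k) % n] for k in range(n)) for g in range(n)]


# One scalar feature per view (12 views) and a filter with localized support.
views = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
filt = [2, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, -1]

shift = 5
out = group_conv(views, filt)
out_rotated = group_conv(rotate_views(views, shift), filt)

# Equivariance: rotating the input rotates the output in the same way.
equivariant = out_rotated == rotate_views(out, shift)

# Invariance: global max pooling returns the same value for any rotation,
# but it no longer records which view produced it.
invariant = max(rotate_views(views, shift)) == max(views)

print(equivariant, invariant)
```

Stacking several such layers keeps the per-view structure available until a final pooling step, which is the motivation stated in the abstract.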

Demo

This repository contains demos for our best models on rotated and aligned ModelNet40.

Check the requirements in requirements.txt. Our codebase has been tested on TensorFlow 1.15 but the dependency is commented out in requirements.txt to silence GitHub’s security warnings.

Download the datasets here.

The following commands should clone this repo, create a virtualenv, and install the requirements:

git clone https://github.com/daniilidis-group/emvn.git
cd emvn
virtualenv -p python3 env
source env/bin/activate
pip install -r requirements.txt
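
Because the TensorFlow dependency is commented out in requirements.txt, the install step above presumably does not pull it in. The following is a small sketch for checking whether it is present in the active virtualenv. It assumes the package is importable under its usual module name `tensorflow`; the helper `is_importable` is illustrative.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


# "tensorflow" is the usual import name; adjust if your environment differs.
if is_importable("tensorflow"):
    print("tensorflow found")
else:
    # 1.15 is the version this README reports testing against.
    print("tensorflow not found; consider: pip install tensorflow==1.15")
```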

Training on aligned ModelNet40

Change --data and --logdir appropriately.

python3 emvn/train.py \
        --data /path/to/m40canon_{}.lmdb \
        --logdir /tmp/emvn_m40canon \
        --epochs 15 \
        --batch-size 12 \
        --skip_eval \
        --eval_retrieval \
        --retrieval_include_same \
        --triplet_loss \
        --optimizer nesterov \
        --lr-decay-mode cos \
        --gconv_support 0,8,1,15,12,25,21,19,29,7,11,20,4 \
        --gcc 512,512,512 \
        --n_homogeneous 20 \
        --lr 3e-3 \
        --homogeneous_only1st \
        --n_fc_before_gconv 1 \
        --n_group_elements 60 \
        --pretrained
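
The `{}` in the --data path appears to be a placeholder for the split name: the sample log below lists m40canon_train.lmdb and m40canon_test.lmdb. A minimal sketch of that substitution follows. How train.py performs it internally is an assumption, and `resolve_split_paths` is a hypothetical helper.

```python
def resolve_split_paths(template, splits=("train", "test")):
    """Fill the '{}' placeholder in a path template with each split name."""
    return {split: template.format(split) for split in splits}


paths = resolve_split_paths("/path/to/m40canon_{}.lmdb")
print(paths["train"])  # /path/to/m40canon_train.lmdb
print(paths["test"])   # /path/to/m40canon_test.lmdb
```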

Sample outputs:

[2019-10-28 15:17:08,566:INFO] Loading data...
[1028 15:17:08 @format.py:93] Found 2468 entries in m40canon_test.lmdb
Classes=40, Views=1
[1028 15:17:09 @format.py:93] Found 9843 entries in m40canon_train.lmdb
[1028 15:17:09 @format.py:93] Found 2468 entries in m40canon_test.lmdb
Loading 100 inputs from pretrained model...
[2019-10-28 15:17:11,027:INFO] Loaded model...
[2019-10-28 15:17:11,027:INFO] Loaded model; params=23522920
[2019-10-28 15:17:14,366:INFO] Running on cuda:0
[2019-10-28 15:17:14,368:INFO] Checkpoint /tmp/emvn_m40canon/latest.pth.tar not found; ignoring.
[2019-10-28 15:17:14,368:INFO] Epoch: [1/15]
        Iter [10/820] Loss: 3.4836 Time batch: 0.2294 s; Time GPU 0.1063 s; Time to load: 0.2292 s
(...)
        Iter [820/820] Loss: 0.0081 Time batch: 0.2310 s; Time GPU 0.1035 s; Time to load: 0.2310 s
[2019-10-28 16:05:08,857:INFO] Time taken: 190.51 sec...
[2019-10-28 16:05:08,858:INFO]  Saving latest model
[2019-10-28 16:05:09,671:INFO] Total time for this epoch: 191.31975293159485 s
Starting evaluation...
        Iter [10/411] Loss: 0.0165  Acc: 100.0000
(...)
        Iter [410/411] Loss: 0.0059  Acc: 100.0000
[2019-10-28 16:07:14,435:INFO] Evaluating retrieval...
[2019-10-28 16:07:15,352:INFO] Computed pairwise distances between 2466 samples
[2019-10-28 16:07:17,191:INFO] acc per class=[100.0, 92.0, 98.0, 85.0, 96.0, 98.0, 100.0, 100.0, 98.0, 95.0, 75.0, 95.0, 98.83720930232558, 95.0, 95.34883720930233, 20.0, 98.0, 100.0, 100.0, 90.0, 100.0, 99.0, 98.0, 81.3953488372093, 100.0, 99.0, 82.0, 95.0, 98.0, 100.0, 98.0, 100.0, 80.0, 97.0, 95.0, 100.0, 90.0, 80.0, 90.0, 100.0]
[2019-10-28 16:07:17,192:INFO]  combined: 92.61, Acc: 94.61, mAP: 93.36, Loss: 0.2238

Results are within 0.2% of Table 2/Ours-R-12 in the paper.
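
The retrieval evaluation in the log computes pairwise distances between descriptors and reports mAP. The sketch below shows one common way to turn a distance matrix into mean average precision. It is not the repository's evaluation code, and the `include_self` switch is only a guess at what --retrieval_include_same might control.

```python
def average_precision(ranked_relevance):
    """AP of one ranked list of booleans (True = same class as the query)."""
    hits, precisions = 0, []
    for rank, relevant in enumerate(ranked_relevance, start=1):
        if relevant:
            hits += 1
            precisions.append(hits / rank)
    return sum(precisions) / len(precisions) if precisions else 0.0


def retrieval_map(dist, labels, include_self=False):
    """Mean AP over all queries, given a square matrix of pairwise distances."""
    n = len(labels)
    aps = []
    for q in range(n):
        # Rank every other sample (optionally the query itself) by distance.
        candidates = [i for i in range(n) if include_self or i != q]
        candidates.sort(key=lambda i: dist[q][i])
        aps.append(average_precision([labels[i] == labels[q] for i in candidates]))
    return sum(aps) / n


# Toy 1-D descriptors; the distance is the absolute difference.
descriptors = [0, 1, 10, 11]
dist = [[abs(a - b) for b in descriptors] for a in descriptors]

print(retrieval_map(dist, [0, 0, 1, 1]))  # classes are well separated
print(retrieval_map(dist, [0, 1, 0, 1]))  # classes are interleaved
```

In the first call every query's nearest neighbor shares its label, so the score is perfect; in the second the same-class sample is ranked second or third, which lowers the mean.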

Training on rotated ModelNet40

Change --data and --logdir appropriately.

python3 emvn/train.py \
        --data /path/to/m40rot_{}.lmdb \
        --logdir /tmp/emvn_m40rot \
        --epochs 15 \
        --batch-size 6 \
        --skip_eval \
        --eval_retrieval \
        --retrieval_include_same \
        --triplet_loss \
        --optimizer nesterov \
        --lr-decay-mode cos \
        --lr 1.5e-3 \
        --gconv_support 0,8,1,15,12,25,21,19,29,7,11,20,4 \
        --gcc 512,512,512 \
        --n_fc_before_gconv 1 \
        --n_group_elements 60 \
        --pretrained
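
The command uses --lr-decay-mode cos with --lr 1.5e-3 over 15 epochs. As a rough illustration, a common form of cosine decay is sketched below. Whether the repository steps the rate per epoch or per iteration, and whether it uses a warm-up or a nonzero floor, are assumptions not confirmed by this README.

```python
import math


def cosine_lr(base_lr, epoch, total_epochs):
    """Cosine annealing from base_lr at epoch 0 toward 0 at total_epochs."""
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * epoch / total_epochs))


# Learning rate at the start of each of the 15 epochs.
for epoch in range(15):
    print(epoch, cosine_lr(1.5e-3, epoch, 15))
```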

Sample outputs:

[2019-10-28 15:20:01,355:INFO] Loading data...
[1028 15:20:01 @format.py:93] Found 2468 entries in m40rot_test.lmdb
Classes=40, Views=1
[1028 15:20:03 @format.py:93] Found 9843 entries in m40rot_train.lmdb
[1028 15:20:03 @format.py:93] Found 2468 entries in m40rot_test.lmdb
Loading 100 inputs from pretrained model...
[2019-10-28 15:20:04,614:INFO] Loaded model...
[2019-10-28 15:20:04,622:INFO] Loaded model; params=21687912
[2019-10-28 15:20:08,627:INFO] Running on cuda:0
[2019-10-28 15:20:08,727:INFO] Checkpoint /tmp/emvn_m40rot/latest.pth.tar not found; ignoring.
[2019-10-28 15:20:08,728:INFO] Epoch: [1/15]
        Iter [10/1640] Loss: 4.7577 Time batch: 0.5615 s; Time GPU 0.2113 s; Time to load: 0.5800 s
(...)
        Iter [1640/1640] Loss: 0.0071 Time batch: 0.3312 s; Time GPU 0.1352 s; Time to load: 0.3332 s
[2019-10-28 17:58:58,240:INFO] Time taken: 541.46 sec...
[2019-10-28 17:58:58,241:INFO]
Starting evaluation...
        Iter [10/822] Loss: 0.0001  Acc: 100.0000
(...)
        Iter [820/822] Loss: 2.6854  Acc: 66.6667
[2019-10-28 18:00:35,861:INFO] Evaluating retrieval...
[2019-10-28 18:00:36,760:INFO] Computed pairwise distances between 2466 samples
[2019-10-28 18:00:38,703:INFO] acc per class=[100.0, 86.0, 100.0, 85.0, 97.0, 98.0, 85.0, 99.0, 98.0, 95.0, 75.0, 100.0, 83.72093023255815, 95.0, 82.55813953488372, 45.0, 97.0, 99.0, 100.0, 85.0, 100.0, 98.0, 97.0, 76.74418604651163, 95.0, 98.0, 79.0, 90.0, 88.0, 90.0, 95.0, 100.0, 65.0, 77.0, 90.0, 99.0, 77.0, 84.0, 55.0, 77.77777777777777]
[2019-10-28 18:00:38,704:INFO]  combined: 87.80, Acc: 90.67, mAP: 88.38, Loss: 0.3442

Results are within 0.4% of Table 3/Ours-R-60 in the paper.
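
The iteration counts in the two training logs follow from the dataset size and batch size. With 9843 training entries, a batch size of 12 gives 820 full batches and a batch size of 6 gives 1640, matching `Iter [820/820]` and `Iter [1640/1640]`. This is consistent with the last partial batch being dropped, although that is an inference from the logs rather than something the README states. A short check, with `iterations_per_epoch` as an illustrative helper:

```python
def iterations_per_epoch(num_samples, batch_size, drop_last=True):
    """Number of batches per epoch, with or without the final partial batch."""
    if drop_last:
        return num_samples // batch_size
    return -(-num_samples // batch_size)  # ceiling division


print(iterations_per_epoch(9843, 12))  # 820, as in the aligned ModelNet40 log
print(iterations_per_epoch(9843, 6))   # 1640, as in the rotated ModelNet40 log
```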

Training on aligned SHREC’17

Coming soon!

Training on rotated SHREC’17

Coming soon!

Reference

Carlos Esteves*, Yinshuang Xu*, Christine Allen-Blanchette, Kostas Daniilidis. “Equivariant Multi-View Networks”. The IEEE International Conference on Computer Vision (ICCV), 2019.

@InProceedings{Esteves_2019_ICCV,
author = {Esteves, Carlos and Xu, Yinshuang and Allen-Blanchette, Christine and Daniilidis, Kostas},
title = {Equivariant Multi-View Networks},
booktitle = {The IEEE International Conference on Computer Vision (ICCV)},
month = {October},
year = {2019}
}

Authors

Carlos Esteves*, Yinshuang Xu*, Christine Allen-Blanchette, Kostas Daniilidis

GRASP Laboratory, University of Pennsylvania

emvn's Issues

Rotation Matrix for View configurations

Hi Carlos,

Congrats on being accepted as an ICCV oral!
I'm wondering how to obtain the rotation matrices for the different view configurations in your paper: 12x5, 20x3, 60x1.
It would be great if you could share the code to generate these matrices.
