
deeperinversecompositionalalgorithm's Introduction

Taking a Deeper Look at the Inverse Compositional Algorithm (CVPR 2019, Oral Presentation)


Summary

This is the official repository of our CVPR 2019 paper:

Taking a Deeper Look at the Inverse Compositional Algorithm, Zhaoyang Lv, Frank Dellaert, James M. Rehg, Andreas Geiger, CVPR 2019

@inproceedings{Lv19cvpr,  
  title     = {Taking a Deeper Look at the Inverse Compositional Algorithm}, 
  author    = {Lv, Zhaoyang and Dellaert, Frank and Rehg, James and Geiger, Andreas},  
  booktitle = {CVPR},  
  year      = {2019}  
}

Please cite the paper if you use the provided code in your research.


Contact

Please drop me an email if you have any questions regarding this project.

Zhaoyang Lv ([email protected], [email protected])

Setup

The code was developed with PyTorch 1.0, CUDA 9.0, and Ubuntu 16.04. PyTorch > 1.0 and CUDA > 9.0 were also tested on some machines. If you need to integrate the code into a code base that uses PyTorch < 1.0, note that some functions (particularly the matrix inverse operator) are not backward compatible. Contact me if you need instructions.

You can reproduce the setup using our Anaconda environment configuration:

conda env create -f setup/environment.yml

Activate the environment each time before you run:

conda activate deepICN # the environment name will be (deepICN)

Quick Inference Example

Here is a quick run using the pre-trained model on a short TUM trajectory:

python run_example.py

Be careful with the depth reader when you switch to a different dataset, which may use a depth scaling different from that of the TUM dataset. Also note that this pre-trained model was trained for egocentric motion estimation on the TUM dataset, not for object-centric motion estimation. Both factors may affect the result.
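As a minimal sketch of the depth-scaling caveat: TUM RGB-D stores depth as 16-bit PNG values in which 5000 corresponds to one metre and 0 marks a missing measurement. The helper below is hypothetical (it is not this repository's depth reader) and only illustrates why the scale has to be set per dataset.

```python
# Hypothetical helper, not part of this repository's data loaders.
TUM_DEPTH_SCALE = 5000.0  # TUM RGB-D: a raw value of 5000 equals 1 metre


def raw_depth_to_metres(raw_values, scale=TUM_DEPTH_SCALE):
    """Convert raw integer depth values to metres; 0 marks missing depth."""
    return [v / scale if v > 0 else float("nan") for v in raw_values]
```

For another dataset, pass the scale documented by that dataset, for example `raw_depth_to_metres(values, scale=1000.0)` if its depth is stored in millimetres.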

To run the full training and evaluation, please follow the steps below.

Prepare the datasets

TUM RGBD Dataset: Download the dataset from TUM RGBD to '$YOUR_TUM_RGBD_DIR'. Create a symbolic link to the data directory:

ln -s $YOUR_TUM_RGBD_DIR code/data/data_tum

MovingObjects3D Dataset: Download the dataset from MovingObjs3D to '$YOUR_MOV_OBJS_3D_DIR'. Create a symbolic link to the data directory:

ln -s $YOUR_MOV_OBJS_3D_DIR code/data/data_objs3D
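A quick way to confirm that both links resolve before starting a long training run is sketched below. The function name is hypothetical; only the two link paths come from the commands above.

```python
import os

# Link paths taken from the `ln -s` commands above.
EXPECTED_LINKS = ("code/data/data_tum", "code/data/data_objs3D")


def missing_dataset_dirs(paths=EXPECTED_LINKS, root="."):
    """Return the paths that do not resolve to an existing directory.

    os.path.isdir follows symbolic links, so a dangling link is reported too.
    """
    return [p for p in paths if not os.path.isdir(os.path.join(root, p))]
```

Calling `missing_dataset_dirs()` from the repository root returns an empty list when both datasets are linked correctly.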

Run training

Train example with TUM RGBD dataset:

python train.py --dataset TUM_RGBD 

# or run with the specific setting
python train.py --dataset TUM_RGBD \
--encoder_name ConvRGBD2 \
--mestimator MultiScale2w \
--solver Direct-ResVol \
--keyframes 1,2,4,8 # mixing keyframes subsampled from 1,2,4,8 for training.
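To illustrate what mixing keyframe intervals could look like, here is a small sketch. It assumes that an interval k pairs frame i with frame i + k; that interpretation, and both function names, are assumptions rather than the repository's data-loader code.

```python
def parse_keyframes(spec):
    """Parse a comma-separated option such as '1,2,4,8' into integers."""
    return [int(s) for s in spec.split(",") if s.strip()]


def frame_pairs(num_frames, keyframes):
    """List (reference, target) index pairs for every interval k.

    Assumption: interval k pairs frame i with frame i + k.
    """
    pairs = []
    for k in keyframes:
        pairs.extend((i, i + k) for i in range(num_frames - k))
    return pairs
```

Under this assumption, larger intervals produce pairs with larger relative motion, which is why mixing several intervals widens the range of motions seen during training.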

To check the full training settings, run the help option:

python train.py --help

Use TensorBoard to check progress during training:

tensorboard --logdir logs/TUM_RGBD --port 8000 # go to localhost:8000 to check the training and validation curves

Train example with MovingObjects3D: Everything is the same as above except for the dataset name:

python train.py --dataset MovingObjs3D \
--keyframes 1,2,4 # mixing keyframes subsampled from 1,2,4 for training

# check the training progress using tensorboard
tensorboard --logdir logs/MovingObjs3D --port 8001

We will soon release instructions for the other two datasets used in the paper, BundleFusion and DynamicBundleFusion.

Run evaluation with the pretrained models

Run the pretrained model: If you have set up the datasets properly, you can run the learned model with the checkpoint we provide in the trained model directory:

python evaluate.py --dataset TUM_RGBD \
--trajectory fr1/rgbd_dataset_freiburg1_360 \
--encoder_name ConvRGBD2 \
--mestimator MultiScale2w \
--solver Direct-ResVol \
--keyframes 1 \
--checkpoint trained_models/TUM_RGBD_ABC_final.pth.tar

You can substitute the trajectory, the keyframe, and the checkpoint file. Training and evaluation share the same config settings. To check the full settings, run the help option:

python evaluate.py --help

Results: The evaluation results are generated automatically in both '*.pkl' and '*.csv' formats in the folder 'test_results/'.
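The generated files can be inspected with the standard library alone. The sketch below loads whatever '*.pkl' and '*.csv' files it finds; the layout of their contents is not described here, so it returns them unprocessed. The function name is hypothetical.

```python
import csv
import glob
import os
import pickle


def load_results(folder="test_results"):
    """Load every .pkl and .csv file found in `folder`, keyed by file name."""
    results = {}
    for path in sorted(glob.glob(os.path.join(folder, "*.pkl"))):
        with open(path, "rb") as f:
            results[os.path.basename(path)] = pickle.load(f)
    for path in sorted(glob.glob(os.path.join(folder, "*.csv"))):
        with open(path, newline="") as f:
            results[os.path.basename(path)] = list(csv.reader(f))  # rows of strings
    return results
```

Only unpickle files you generated yourself, since `pickle.load` can execute arbitrary code from an untrusted file.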

Run a baseline: This implementation can be simplified to a vanilla Lucas-Kanade method that minimizes the photometric error without any learnable module. Note that it is not the RGBD VO baseline we report in the paper. It may not be the optimal Lucas-Kanade baseline to compare with, since it uses the same stopping criterion and Gauss-Newton solver within the same framework as our learned model. There are no extra bells and whistles, but it may provide a baseline for exploring the algorithm in various directions.

python evaluate.py --dataset TUM_RGBD \
--trajectory fr1/rgbd_dataset_freiburg1_360 \
--encoder_name RGB --mestimator None --solver Direct-Nodamping \
--keyframes 1 # optionally 1,2,4,8, etc.
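For readers new to the method, the idea behind an inverse compositional Lucas-Kanade baseline can be sketched on a toy 1D problem: estimating a pure shift between a template and a shifted copy by Gauss-Newton on the photometric error. This is an illustration only, not the repository's SE(3) implementation; the signal, sampling grid, and function names are made up for the example.

```python
import math


def template(x):
    """Smooth 1D 'image': a Gaussian bump centred at x = 5."""
    return math.exp(-((x - 5.0) ** 2) / 4.0)


def template_grad(x):
    """Analytic derivative of the template."""
    return -(x - 5.0) / 2.0 * template(x)


def estimate_shift(true_shift, iterations=20):
    """Estimate p such that image(x + p) matches template(x)."""
    def image(x):
        return template(x - true_shift)  # observed, shifted signal

    xs = [0.1 * i for i in range(101)]  # sample positions in [0, 10]
    # Inverse compositional: the Jacobian and Hessian come from the template,
    # so they are computed once and reused in every iteration.
    J = [template_grad(x) for x in xs]
    H = sum(j * j for j in J)
    p = 0.0
    for _ in range(iterations):
        residual = [image(x + p) - template(x) for x in xs]
        delta = sum(j * r for j, r in zip(J, residual)) / H
        p -= delta  # compose with the inverse of the incremental warp
    return p
```

The precomputed `J` and `H` are what distinguish the inverse compositional form from the forward additive one, where the Jacobian would be re-evaluated on the warped image at each step.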

License

MIT License

Copyright (c) 2019 Zhaoyang Lv, Frank Dellaert, James M. Rehg, Andreas Geiger

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

deeperinversecompositionalalgorithm's People

Contributors

binbin-xu, lvzhaoyang


deeperinversecompositionalalgorithm's Issues

Cannot load pretrained model

It seems that I cannot load a pre-trained model with the code. I used the Anaconda environment from the repo. I guess the problem is the CUDA version: I use CUDA 10.0, while your requirements say CUDA > 9.0. Did you also test with a newer version? Thanks a lot.

    net.load_state_dict(torch.load(config.checkpoint)['state_dict'])
  File "/home/anaconda3/envs/deepICN/lib/python3.6/site-packages/torch/nn/modules/module.py", line 769, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for LeastSquareTracking:
    Missing key(s) in state_dict: "encoder.net0.0........
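A "Missing key(s)" error means the key names in the checkpoint do not match the model's parameter names, so a useful first step is to compare the two sets. The sketch below does that on plain lists of names. The key names in the usage note are hypothetical, and the "module." prefix (which `nn.DataParallel` adds when a wrapped model is saved) is only one possible cause, not a confirmed diagnosis of this issue.

```python
def compare_keys(model_keys, checkpoint_keys):
    """Return (missing, unexpected) key lists, as load_state_dict reports them."""
    model, ckpt = set(model_keys), set(checkpoint_keys)
    return sorted(model - ckpt), sorted(ckpt - model)


def strip_prefix(keys, prefix="module."):
    """Remove a leading prefix such as the one nn.DataParallel adds."""
    return [k[len(prefix):] if k.startswith(prefix) else k for k in keys]
```

With a real model, the inputs would be `net.state_dict().keys()` and the keys of the loaded `'state_dict'` entry. For example, `compare_keys(["layer.weight"], ["module.layer.weight"])` reports one missing and one unexpected key, and both lists become empty after `strip_prefix` is applied to the checkpoint keys.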

Theoretical Error between Formula 11 and 12

The photometric error in formula 12, whose last dimension is reshaped to H*W, cannot be used directly to compute the delta se3, whose last dimension is 6.

The code in inverse_update_pose :

    inv_H = invH(H)
    xi = torch.bmm(inv_H, Rhs)
    # simplified inverse compositional for SE3
    d_R = geometry.batch_twist2Mat(-xi[:, :3].view(-1,3))
    d_t = -torch.bmm(d_R, xi[:, 3:])

has misled us.

The last dimension of xi is H*W, so xi[:, :3] and xi[:, 3:] only take its head and tail.

What I recommend is to add several FC layers between residual_photometric_error and residual_pose_error to learn the projection from H*W to 6.
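One way to check this concern is to trace the tensor shapes through the solve. The sketch below assumes that H and Rhs are the usual Gauss-Newton normal-equation terms, J^T J and J^T r, with one Jacobian row per pixel; the excerpt above does not show how they are built, so this is an assumption. The helper and the image size are illustrative.

```python
def matmul_shape(a, b):
    """Shape of a batched matrix product, following torch.bmm's rule."""
    (batch_a, n, k_a), (batch_b, k_b, m) = a, b
    if batch_a != batch_b or k_a != k_b:
        raise ValueError(f"incompatible shapes {a} and {b}")
    return (batch_a, n, m)


B, H_img, W_img = 2, 120, 160      # illustrative batch and image size
N = H_img * W_img                  # number of pixels
J_t = (B, 6, N)                    # transposed Jacobian
J = (B, N, 6)                      # one row per pixel, six pose parameters
r = (B, N, 1)                      # stacked photometric residuals

hessian_shape = matmul_shape(J_t, J)               # J^T J
rhs_shape = matmul_shape(J_t, r)                   # J^T r
xi_shape = matmul_shape(hessian_shape, rhs_shape)  # the inverse keeps the Hessian's shape
```

Under that assumption the H*W dimension is summed out when forming J^T r, so xi has six entries per sample, and xi[:, :3] and xi[:, 3:] would split it into two 3-vectors.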

Question about inverse_update_pose function

Thanks for the great work. I have a question related to the pose update:
In models/algorithms.py, in the function
def inverse_update_pose(H, Rhs, pose):
...
pose = geometry.batch_Rt_compose(R, t, d_R, d_t)
...
However, in models/geometry.py, batch_Rt_compose was defined as
def batch_Rt_compose(d_R, d_t, R0, t0):

Could you please advise on how the update was performed? Thanks a lot.
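The effect of the argument order can be sketched with a small pure-Python example. It assumes that the function composes its first transform on the left of its second, i.e. it returns (d_R R0, d_R t0 + d_t). The body of batch_Rt_compose is not shown above, so that is an assumption, and `rt_compose` is a hypothetical stand-in for it.

```python
import math


def matmul3(A, B):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec3(A, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]


def rt_compose(d_R, d_t, R0, t0):
    """Assumed behaviour: the first transform multiplies the second from the left."""
    R1 = matmul3(d_R, R0)
    t1 = [a + b for a, b in zip(matvec3(d_R, t0), d_t)]
    return R1, t1


def rot_z(angle):
    """Rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


R, t = rot_z(math.pi / 2), [1.0, 0.0, 0.0]   # current pose
d_R, d_t = rot_z(0.0), [0.0, 2.0, 0.0]       # increment (pure translation)

# Passing the pose first puts it on the left: T_pose * T_increment.
_, t_right = rt_compose(R, t, d_R, d_t)
# Passing the increment first puts it on the left: T_increment * T_pose.
_, t_left = rt_compose(d_R, d_t, R, t)
```

The two calls give different translations, so under this assumption calling the function as (R, t, d_R, d_t) amounts to right-multiplying the current pose by the increment, whatever the parameter names in the definition suggest.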

Pretrained weights and eval code for MovingObjects3D?

Hi, thanks for the great work! Do you have any plan to release the pretrained weights and eval code for the MovingObjects3D dataset?
It would help follow-up work evaluate and benchmark results on this dataset.

jacobian_warping computation of right-multiplication

Thanks for your great work. I have a question related to the jacobian_warping:
In models/algorithms.py, in the function
def compute_jacobian_warping(p_invdepth, K, px, py):
...
dx_dp = torch.cat((-xy, 1+x2, -y, invd, O, -invd*x), dim=2)
dy_dp = torch.cat((-1-y2, xy, x, O, invd, -invd*y), dim=2)
...
The Jacobian (dx_dp, dy_dp) here is the left-multiplication Jacobian.
Are the left-multiplication and right-multiplication Jacobians the same?
I am not familiar with right-multiplication. Could you explain it? Thanks a lot.
