Towards Better Generalization: Joint Depth-Pose Learning without PoseNet

Created by Wang Zhao, Shaohui Liu, Yezhi Shu, Yong-Jin Liu.

Introduction

This implementation is based on our CVPR 2020 paper "Towards Better Generalization: Joint Depth-Pose Learning without PoseNet". You can find the arXiv version of the paper here. In this repository we release code and pre-trained models for TrianFlow (our method) and a strong baseline, PoseNet-Flow.

Installation

The code is based on Python 3.6. You can use either virtualenv or conda to set up a dedicated environment, then run:

pip install -r requirements.txt

Run a demo

To run the depth prediction demo, first download the pretrained model from here.

python test.py --config_file ./config/default_1scale.yaml --gpu 0 --mode depth --task demo --image_path ./data/demo/kitti.png --pretrained_model ./models/pretrained/depth_pretrained.pth --result_dir ./data/demo

This produces a predicted depth map for the demo image.

Run experiments

Prepare training data:

  1. For the KITTI depth and flow tasks, download the KITTI raw dataset using the script provided on the official website. You also need the KITTI 2015 dataset to evaluate the predicted optical flow. Run the following commands to generate the ground-truth files for the Eigen test images:
cd ./data/eigen
python export_gt_depth.py --data_path /path/to/your/kitti/root
  2. For the KITTI Odometry task, download the KITTI Odometry dataset.
  3. For the NYUv2 experiments, download the NYUv2 raw sequences, the labeled data mat, and the training/test split mat from here. Put the labeled data and the splits file under the same directory. The data structure should be:
nyuv2
  | basements  
  | cafe
  | ...
nyuv2_test
  | nyu_depth_v2_labeled.mat 
  | splits.mat
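
Before starting an NYUv2 run, a quick check of the layout above can catch path mistakes early. The following is a minimal sketch and not part of the repository: the function name `check_nyuv2_layout` and its two arguments are hypothetical, and it relies only on the folder and file names shown above.

```python
from pathlib import Path

# File names taken from the layout described above.
REQUIRED_TEST_FILES = ("nyu_depth_v2_labeled.mat", "splits.mat")


def check_nyuv2_layout(raw_dir, test_dir):
    """Return a list of problems found; an empty list means the layout looks right.

    raw_dir  -- directory holding the raw scene folders (basements, cafe, ...)
    test_dir -- directory holding the labeled data mat and the splits mat
    Both argument names are illustrative, not options of the training code.
    """
    problems = []
    raw_dir, test_dir = Path(raw_dir), Path(test_dir)

    if not raw_dir.is_dir():
        problems.append(f"missing raw sequence directory: {raw_dir}")
    elif not any(p.is_dir() for p in raw_dir.iterdir()):
        problems.append(f"no scene folders found in: {raw_dir}")

    for name in REQUIRED_TEST_FILES:
        if not (test_dir / name).is_file():
            problems.append(f"missing file: {test_dir / name}")

    return problems


if __name__ == "__main__":
    # Placeholder paths; replace them with your own.
    for problem in check_nyuv2_layout("./nyuv2", "./nyuv2_test"):
        print(problem)
```

The check only looks at names and does not open the mat files, so it says nothing about whether their contents are valid.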

Training:

  1. Modify the configuration file in the ./config directory to set up your paths. The config file contains the important paths and the default hyper-parameters used in training.
  2. For KITTI depth, we use a three-stage training schedule:
1. python train.py --config_file ./config/kitti.yaml --gpu [gpu_id] --mode flow --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models]
2. python train.py --config_file ./config/kitti.yaml --gpu [gpu_id] --mode depth --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models] --flow_pretrained_model [path/to/your/stage1_flow_model]
3. python train.py --config_file ./config/kitti_3stage.yaml --gpu [gpu_id] --mode depth_pose --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models] --depth_pretrained_model [path/to/your/stage2_depth_model]

If you are running experiments on a dataset for the first time, the code first processes the data and saves it under the [prepared_base_dir] path defined in your config file. For other datasets, such as KITTI Odometry and NYUv2, run the same commands with the appropriate config file.
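
The three KITTI stages above depend on each other: stage 2 loads the flow model from stage 1, and stage 3 loads the depth model from stage 2. As an illustration only, the sketch below assembles the three command lines from shared settings so that the flags stay consistent between stages. The helper `build_stage_commands` and the example checkpoint paths are hypothetical; only the flags and config file names come from the commands above.

```python
import shlex


def build_stage_commands(gpu_id, prepared_save_dir, model_dir, flow_ckpt, depth_ckpt):
    """Build the three train.py invocations as argument lists.

    flow_ckpt  -- path to the stage-1 flow model (chosen by you after stage 1)
    depth_ckpt -- path to the stage-2 depth model (chosen by you after stage 2)
    """
    def cmd(config, mode, *extra):
        return [
            "python", "train.py",
            "--config_file", config,
            "--gpu", str(gpu_id),
            "--mode", mode,
            "--prepared_save_dir", prepared_save_dir,
            "--model_dir", model_dir,
            *extra,
        ]

    return [
        cmd("./config/kitti.yaml", "flow"),
        cmd("./config/kitti.yaml", "depth", "--flow_pretrained_model", flow_ckpt),
        cmd("./config/kitti_3stage.yaml", "depth_pose", "--depth_pretrained_model", depth_ckpt),
    ]


if __name__ == "__main__":
    # All values below are placeholders.
    commands = build_stage_commands(
        gpu_id=0,
        prepared_save_dir="kitti_prepared",
        model_dir="./models/kitti_run",
        flow_ckpt="./models/kitti_run/stage1_flow.pth",
        depth_ckpt="./models/kitti_run/stage2_depth.pth",
    )
    for stage, args in enumerate(commands, start=1):
        print(f"# stage {stage}")
        print(" ".join(shlex.quote(a) for a in args))
```

The sketch prints the commands instead of running them, because the checkpoint to pass to the next stage has to be picked by hand once the previous stage has finished.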

We also implement and release code for the strong baseline PoseNet-Flow method. You can run it with two-stage training:

1. python train.py --config_file [path/to/your/config/file] --gpu [gpu_id] --mode flow --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models]
2. python train.py --config_file [path/to/your/config/file] --gpu [gpu_id] --mode flowposenet --prepared_save_dir [name_of_your_prepared_dataset] --model_dir [your/directory/to/save/training/models] --flow_pretrained_model [path/to/your/stage1_flow_model]

Evaluation:

We provide pretrained models here for different tasks. The performance may differ slightly from the paper due to randomness.

  1. To evaluate monocular depth estimation on the KITTI Eigen test split, run:
python test.py --config_file ./config/kitti.yaml --gpu [gpu_id] --mode depth --task kitti_depth --pretrained_model [path/to/your/model] --result_dir [path/to/save/results]
  2. To evaluate monocular depth estimation on the NYUv2 test split, run:
python test.py --config_file ./config/nyu.yaml --gpu [gpu_id] --mode depth --task nyuv2 --pretrained_model [path/to/your/model] --result_dir [path/to/save/results]
  3. To evaluate optical flow estimation on KITTI 2015, run:
python test.py --config_file ./config/kitti.yaml --gpu [gpu_id] --mode flow_3stage --task kitti_flow --pretrained_model [path/to/your/model] --result_dir [path/to/save/results]
  4. To evaluate visual odometry on the KITTI Odometry dataset, first get predictions on a single sequence and then evaluate them:
python infer_vo.py --config_file ./config/odo.yaml --gpu [gpu_id] --traj_save_dir_txt [where/to/save/the/prediction/file] --sequences_root_dir [the/root/dir/of/your/image/sequences] --sequence [the sequence id] --pretrained_model [path/to/your/model]
python ./core/evaluation/eval_odom.py --gt_txt [path/to/your/groundtruth/poses/txt] --result_txt [path/to/your/prediction/txt] --seq [sequence id to evaluate]

To evaluate on a sampled KITTI Odometry dataset, sample the raw image sequences and the ground-truth pose txt in the same way. Then run infer_vo.py on the sampled image sequence, and run eval_odom.py with the predicted txt and the sampled ground-truth txt to get the results.
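
As a rough illustration of the sampling step, the sketch below keeps every `stride`-th frame of a sequence, applying the same rule to the ground-truth pose file (assumed to hold one pose per line, in frame order) and to the image file names. The function names, the `stride` parameter, and the `.png` extension are assumptions made for this example. The sketch does not copy or rename images, and whether infer_vo.py expects a particular file naming is not covered here.

```python
from pathlib import Path


def sample_pose_file(gt_txt, out_txt, stride):
    """Write every stride-th pose line of gt_txt to out_txt; return the number kept."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    lines = [ln for ln in Path(gt_txt).read_text().splitlines() if ln.strip()]
    kept = lines[::stride]
    Path(out_txt).write_text("\n".join(kept) + "\n")
    return len(kept)


def sample_image_names(image_dir, stride):
    """Return the names of every stride-th image in image_dir, sorted by file name."""
    if stride < 1:
        raise ValueError("stride must be >= 1")
    names = sorted(p.name for p in Path(image_dir).iterdir() if p.suffix == ".png")
    return names[::stride]
```

Using the same stride and the same starting frame for both functions keeps each sampled pose aligned with its image, provided the image names sort in frame order.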

Citation

If you find our work useful in your research, please consider citing:

@inproceedings{zhao2020towards,
  title={Towards Better Generalization: Joint Depth-Pose Learning without PoseNet},
  author={Zhao, Wang and Liu, Shaohui and Shu, Yezhi and Liu, Yong-Jin},
  booktitle={Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

Related Projects

Digging into Self-Supervised Monocular Depth Prediction.

Competitive Collaboration: Joint Unsupervised Learning of Depth, Camera Motion, Optical Flow and Motion Segmentation.

Visual Odometry Revisited: What Should Be Learnt?
