
Depth from Motion (DfM)

This repository is the official implementation for DfM and MV-FCOS3D++.

(Demo images: perspective-view and 3D detection results.)

Introduction

This is an official release of the paper Monocular 3D Object Detection with Depth from Motion & MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones.

The code is still undergoing a large refactoring. We plan to eventually re-organize this repo into the core code for this project plus the mmdet3d dependency.

Please stay tuned for the clean release of all the configs and models.

Note: We will also release the refactored code in the official mmdet3d soon.

Monocular 3D Object Detection with Depth from Motion,
Tai Wang, Jiangmiao Pang, Dahua Lin
In: Proc. European Conference on Computer Vision (ECCV), 2022
[arXiv][Bibtex]

MV-FCOS3D++: Multi-View Camera-Only 4D Object Detection with Pretrained Monocular Backbones,
Tai Wang, Qing Lian, Chenming Zhu, Xinge Zhu, Wenwei Zhang
In: arXiv preprint, 2022
[arXiv][Bibtex]

Results

DfM

The results of DfM and its corresponding config are shown below.

We have released the preliminary model for reproducing the results on the KITTI validation set.

The complete model checkpoints and logs will be released soon.

| Backbone | Lr schd | Mem (GB) | Inf time (fps) | Easy | Moderate | Hard | Download |
| :------: | :-----: | :------: | :------------: | :-----: | :------: | :-----: | :------: |
| ResNet34 | - | - | - | 29.3570 | 20.2645 | 17.4731 | model |

MV-FCOS3D++

The results of MV-FCOS3D++ (baseline version) and its corresponding config are shown below.

We have released the preliminary config for reproducing the results on the Waymo validation set.

(To comply with the license agreement of the Waymo dataset, the pre-trained models on Waymo are not released.)

The complete model configs and logs will be released soon.

| Backbone | Lr schd | Mem (GB) | Inf time (fps) | mAPL | mAP | mAPH | Download |
| :-----------: | :-----: | :------: | :------------: | :--: | :-: | :--: | :------: |
| ResNet101+DCN | - | - | - | - | | | |

Installation

It requires the following OpenMMLab packages:

  • MMCV-full >= v1.6.0 (recommended for the latest iou3d computation)
  • MMDetection >= v2.24.0
  • MMSegmentation >= v0.20.0

The mmcv-full version is a hard requirement; the other versions above are recommended. Lower versions of mmdet and mmseg may also work but have not been tested yet.

Example commands are shown as follows.

```shell
conda create --name dfm python=3.7 -y
conda activate dfm
conda install pytorch==1.9.0 torchvision==0.10.0 cudatoolkit=11.3 -c pytorch -c conda-forge
pip install mmcv-full==1.6.0
pip install mmdet==2.24.0
pip install mmsegmentation==0.20.0
git clone https://github.com/Tai-Wang/Depth-from-Motion.git
cd Depth-from-Motion
pip install -v -e .
```
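As a quick sanity check after installation, a short script (our own sketch, not part of the repo) can compare the installed package versions against the minimums listed above:

```python
# Sketch: verify the installed OpenMMLab packages meet the minimum versions.
# Package keys are the pip distribution names used in the install commands.
from importlib import metadata

MIN_VERSIONS = {
    "mmcv-full": (1, 6, 0),
    "mmdet": (2, 24, 0),
    "mmsegmentation": (0, 20, 0),
}

def parse_version(v):
    """Turn a version string like '1.6.0' into a comparable tuple (1, 6, 0)."""
    return tuple(int(p) for p in v.split(".")[:3] if p.isdigit())

def check_requirements():
    for pkg, minimum in MIN_VERSIONS.items():
        try:
            installed = parse_version(metadata.version(pkg))
        except metadata.PackageNotFoundError:
            print(f"{pkg}: not installed")
            continue
        status = "OK" if installed >= minimum else "too old"
        print(f"{pkg}: {'.'.join(map(str, installed))} ({status})")

if __name__ == "__main__":
    check_requirements()
```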

License

This project is released under the Apache 2.0 license.

Usage

Data preparation

First prepare the raw KITTI and Waymo data following MMDetection3D.

Then we prepare the data related to temporally consecutive frames. (This part is still unstable; details are being modified and tested.)

For KITTI, we need to additionally download the pose and label files of the raw data here and the official mapping (between the raw data and the 3D detection benchmark split) here. Then we can run the data converter script:

```shell
python tools/create_data.py kitti --root-path ./data/kitti --out-dir ./data/kitti --extra-tag kitti
```
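The devkit mapping files downloaded above link each 3D-detection frame index to its raw-data sequence. For illustration (the helper name is ours, not from the repo), they can be parsed like this:

```python
def load_kitti_raw_mapping(mapping_file, rand_file):
    """Map each 3D-detection frame index to its (date, drive, frame) in the raw data.

    train_mapping.txt has one 'date drive frame' triple per line;
    train_rand.txt holds comma-separated 1-based indices into that list.
    """
    with open(mapping_file) as f:
        mapping = [line.split() for line in f if line.strip()]
    with open(rand_file) as f:
        rand = [int(x) for x in f.read().split(",") if x.strip()]
    # detection frame i was drawn from raw-data entry rand[i] (1-based)
    return {i: tuple(mapping[r - 1]) for i, r in enumerate(rand)}
```

For example, `load_kitti_raw_mapping("devkit/mapping/train_mapping.txt", "devkit/mapping/train_rand.txt")[0]` would return the raw sequence that detection frame 0 came from.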

For Waymo, we need to additionally generate the ground-truth bin file for the camera-only setting (only boxes covered by the perception range of the cameras are considered). Besides, we recommend using the latest Waymo dataset, which includes the camera-synced annotations tailored to this setting.

```shell
python tools/create_waymo_gt_bin.py
```
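The exact logic of the script is not reproduced here; conceptually, a box counts as camera-covered when it projects into at least one camera image. A minimal geometric sketch of that test (function name and signature are our own assumptions):

```python
import numpy as np

def in_camera_fov(box_center, extrinsic, intrinsic, image_shape):
    """True if a 3D box center (world frame) projects inside the camera image.

    extrinsic: 3x4 world-to-camera matrix; intrinsic: 3x3 camera matrix;
    image_shape: (height, width) in pixels.
    """
    # transform world point into the camera frame
    pt_cam = extrinsic @ np.append(np.asarray(box_center, float), 1.0)
    if pt_cam[2] <= 0:  # behind the camera plane
        return False
    # perspective projection onto the image plane
    uvw = intrinsic @ pt_cam
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]
    h_img, w_img = image_shape
    return 0.0 <= u < w_img and 0.0 <= v < h_img
```

A box would then be kept if `any(in_camera_fov(center, E, K, shape) for E, K, shape in cameras)` over all cameras of the vehicle.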

The final data structure looks like below:

```
mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│   ├── kitti
│   │   ├── ImageSets
│   │   ├── testing
│   │   │   ├── calib
│   │   │   ├── image_2
│   │   │   ├── prev_2
│   │   │   ├── velodyne
│   │   ├── training
│   │   │   ├── calib
│   │   │   ├── image_2
│   │   │   ├── prev_2
│   │   │   ├── label_2
│   │   │   ├── velodyne
│   │   ├── raw
│   │   │   ├── 2011_09_26_drive_0001_sync
│   │   │   ├── xxxx (other raw data files)
│   │   ├── devkit
│   │   │   ├── mapping
│   │   │   │   ├── train_mapping.txt
│   │   │   │   ├── train_rand.txt
│   ├── waymo
│   │   ├── waymo_format
│   │   │   ├── training
│   │   │   ├── validation
│   │   │   ├── testing
│   │   │   ├── gt.bin
│   │   │   ├── cam_gt.bin
│   │   ├── kitti_format
│   │   │   ├── ImageSets
```
Training and testing

For training and testing, you can follow the standard commands in mmdet to train and test the model:

```shell
# train DfM on KITTI
./tools/slurm_train.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${WORK_DIR}
```

For simple inference and evaluation, you can use the command below:

```shell
# evaluate DfM on KITTI and MV-FCOS3D++ on Waymo
./tools/slurm_test.sh ${PARTITION} ${JOB_NAME} ${CONFIG_FILE} ${CKPT_PATH} --eval mAP
```

Acknowledgement

This codebase is based on MMDet3D and it benefits a lot from LIGA-Stereo.

Citation

```bibtex
@inproceedings{wang2022dfm,
    title={Monocular 3D Object Detection with Depth from Motion},
    author={Wang, Tai and Pang, Jiangmiao and Lin, Dahua},
    year={2022},
    booktitle={European Conference on Computer Vision (ECCV)},
}
@article{wang2022mvfcos3d++,
  title={{MV-FCOS3D++: Multi-View} Camera-Only 4D Object Detection with Pretrained Monocular Backbones},
  author={Wang, Tai and Lian, Qing and Zhu, Chenming and Zhu, Xinge and Zhang, Wenwei},
  journal={arXiv preprint},
  year={2022}
}
```
