
Efficient Spatial-Temporal Information Fusion for LiDAR-Based 3D Moving Object Segmentation

Paper | Project page | Supp. Video

This repo contains the code for our paper:

Efficient Spatial-Temporal Information Fusion for LiDAR-Based 3D Moving Object Segmentation.
Jiadai Sun, Yuchao Dai, Xianjing Zhang, Jintao Xu, Rui Ai, Weihao Gu, and Xieyuanli Chen
Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2022

demo video


How to Use

Installation

# Ubuntu 18.04 or above is recommended.
conda env create -f environment.yaml
conda activate mos3d

# Install SoftPool following https://github.com/alexandrosstergiou/SoftPool
git clone https://github.com/alexandrosstergiou/SoftPool.git
cd SoftPool/pytorch
make install
# (optional)
make test

# Install TorchSparse following https://github.com/mit-han-lab/torchsparse
sudo apt install libsparsehash-dev 
pip install --upgrade git+https://github.com/mit-han-lab/[email protected]
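
After installation, a quick import check can catch a broken CUDA extension early. The following is a minimal sketch; the package names SoftPool and torchsparse and the soft_pool2d call are assumptions based on the upstream repositories, not part of this repo:

# sanity_check.py -- minimal environment check (sketch, see note above)
import torch
import torchsparse                   # sparse 3D convolution backend
from SoftPool import soft_pool2d     # CUDA SoftPool operator

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())

# run SoftPool once on a dummy tensor to confirm the CUDA extension was built
x = torch.randn(1, 3, 8, 8).cuda()
print("soft_pool2d output shape:", soft_pool2d(x, kernel_size=2).shape)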

Pretrained Model on KITTI-MOS

Download the toy dataset and pretrained weights, and unzip them into the project directory. You can also use gdown to download them from the command line.

Download commands for the toy dataset and checkpoints:
gdown --id 1t8OuDgFzUspWtYVHSfiGkXtGrBsuvtWL # for toy-data
unzip toydata.zip

mkdir log && cd log
gdown --id 199hRJBs-3MVgqrd4Tb08Eo5pjBG74cSX # for checkpoints
unzip ckpt_motionseg3d_pointrefine.zip

Then you can run inference and visualize the predictions with the commands below. If you use the toy dataset, please first modify the seq_id list under valid in model_path/data_cfg.yaml.
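
A minimal sketch of adjusting that valid split programmatically, assuming data_cfg.yaml stores the splits under a split: key as in RangeNet++/LMNet-style configs (the sequence id 38 matches the toy-data visualization command below):

# edit_valid_split.py -- sketch; the split layout of data_cfg.yaml is an assumption
import yaml

cfg_path = "./log/motionseg3d_pointrefine/data_cfg.yaml"
with open(cfg_path) as f:
    cfg = yaml.safe_load(f)

cfg["split"]["valid"] = [38]   # keep only the sequence(s) shipped with the toy data

with open(cfg_path, "w") as f:
    yaml.safe_dump(cfg, f)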

# Run inference to generate the predictions.
python infer.py -d ./toydata -m ./log/motionseg3d_pointrefine -l ./pred/oursv1 -s valid
python infer.py -d ./toydata -m ./log/motionseg3d_pointrefine -l ./pred/oursv2 -s valid --pointrefine

# Visualize the predictions.
python utils/visualize_mos.py -d ./toydata -p ./pred/oursv2 --offset 0 -s 38

Data Preparation for SemanticKITTI-MOS and KITTI-Road-MOS (newly annotated by us)

  1. Download KITTI Odometry Benchmark Velodyne point clouds (80 GB) from here.
  2. Download KITTI Odometry Benchmark calibration data (1 MB) from here.
  3. Download SemanticKITTI label data (179 MB) from here (alternatively, the data in Files corresponds to the same data).
  4. Download KITTI-Road Velodyne point clouds from the original website; more details can be found in config/kitti_road_mos.md.
  5. Download the KITTI-Road-MOS label data annotated by us, together with the pose and calib files, from here (6.1 MB).
  6. Extract everything into the same folder, as follows:
Expected directory structure of SemanticKITTI:
DATAROOT
├── sequences
│   └── 08
│       ├── calib.txt                       # calibration file provided by KITTI
│       ├── poses.txt                       # ground truth poses file provided by KITTI
│       ├── velodyne                        # velodyne 64 LiDAR scans provided by KITTI
│       │   ├── 000000.bin
│       │   ├── 000001.bin
│       │   └── ...
│       ├── labels                          # ground truth labels provided by SemanticKITTI
│       │   ├── 000000.label
│       │   ├── 000001.label
│       │   └── ...
│       └── residual_images_1               # the proposed residual images
│           ├── 000000.npy
│           ├── 000001.npy
│           └── ...
  7. Next, run the data preparation script (based on LMNet) to generate the residual images; a minimal sketch of the underlying computation follows the command below. More parameters for the data preparation can be found in config/data_preparing.yaml.
python utils/auto_gen_residual_images.py 
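
The residual images follow the idea of LMNet: each past scan is transformed into the current frame, rendered into a range image, and the per-pixel normalized range difference to the current range image forms one residual channel. A minimal sketch of that last step, assuming the two range images are already aligned (the actual script additionally handles pose transformation and projection):

# residual_sketch.py -- normalized range difference (sketch of the LMNet-style residual)
import numpy as np

def residual_image(range_cur, range_past_reproj, eps=1e-6):
    # pixels without a valid return in either range image are left at zero
    valid = (range_cur > 0) & (range_past_reproj > 0)
    res = np.zeros_like(range_cur)
    res[valid] = np.abs(range_cur[valid] - range_past_reproj[valid]) / (range_cur[valid] + eps)
    return res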

Inference on SemanticKITTI-MOS

The newly labeled KITTI-Road-MOS data is divided into train/valid splits.
Which data is used can be controlled by specifying --data_config during training. During inference, if you use the toy dataset or have not downloaded KITTI-Road-MOS, please modify the seq_id list under valid in model_path/data_cfg.yaml.

# validation split
python infer.py -d DATAROOT -m ./log/model_path/logs/TIMESTAMP/ -l ./predictions/ -s valid 

# test split
python infer.py -d DATAROOT -m ./log/model_path/logs/TIMESTAMP/ -l ./predictions/ -s test

The predictions/labels will be saved to ./predictions/.
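
The saved .label files use the SemanticKITTI binary format: one uint32 per point, with the semantic label in the lower 16 bits. A minimal sketch for inspecting one prediction; the output path is only an example, and the class ids 9 (static) and 251 (moving) are taken from the semantic-kitti-mos label definitions, so treat them as assumptions if you changed the config:

# inspect_pred.py -- load one predicted label file (sketch)
import numpy as np

pred = np.fromfile("./predictions/sequences/08/predictions/000000.label", dtype=np.uint32)
sem = pred & 0xFFFF                      # lower 16 bits hold the semantic label

print("moving points:", int((sem == 251).sum()), "of", sem.size)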

Evaluation on SemanticKITTI-MOS validation split

# Only on seq08
python utils/evaluate_mos.py -d DATAROOT -p ./predictions/ --datacfg config/labels/semantic-kitti-mos.raw.yaml

# On seq08 + road-validation-split
python utils/evaluate_mos.py -d DATAROOT -p ./predictions/ --datacfg config/labels/semantic-kitti-mos.yaml
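
The reported metric is the IoU of the moving class, IoU = TP / (TP + FP + FN), accumulated over all points of the evaluated sequences. A minimal sketch of that computation from boolean per-point masks (an illustration of the metric only, not the evaluation script itself):

# moving_iou.py -- IoU of the moving class (sketch)
import numpy as np

def moving_iou(gt_moving: np.ndarray, pred_moving: np.ndarray) -> float:
    tp = np.sum(gt_moving & pred_moving)
    fp = np.sum(~gt_moving & pred_moving)
    fn = np.sum(gt_moving & ~pred_moving)
    return tp / float(tp + fp + fn)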

Training on SemanticKITTI-MOS

The training is separated into two phases, and switching between phases is currently controlled manually. --data_config determines whether to use the newly labeled KITTI-Road-MOS data, e.g. -dc config/labels/semantic-kitti-mos.yaml or -dc config/labels/semantic-kitti-mos.raw.yaml.

  • Phase 1 (multi-GPU): Only the range image is used for input and supervision. The training log and checkpoint will be stored in ./log/ours_motionseg3d/logs/TIMESTAMP/.
export CUDA_VISIBLE_DEVICES=0,1,2,3
python train.py -d DATAROOT -ac ./train_yaml/mos_coarse_stage.yml -l log/ours_motionseg3d
  • Phase 2 (single GPU): After the first phase of training, use the following command to start the second phase of training for the PointRefine module.
export CUDA_VISIBLE_DEVICES=0
python train_2stage.py -d DATAROOT -ac ./train_yaml/mos_pointrefine_stage.yml -l log/ours_motionseg3d_pointrefine -p "./log/ours_motionseg3d/logs/TIMESTAMP/"

Citation

If you find this code useful for your research, please use the following BibTeX entry.

@inproceedings{sun2022mos3d,
  title={Efficient Spatial-Temporal Information Fusion for LiDAR-Based 3D Moving Object Segmentation},
  author={Sun, Jiadai and Dai, Yuchao and Zhang, Xianjing and Xu, Jintao and Ai, Rui and Gu, Weihao and Chen, Xieyuanli},
  booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2022},
  organization={IEEE}
}

Acknowledgment

We would like to thank Yufei Wang and Mochu Xiang for their insightful and effective discussions.
Some of the code in this repo is borrowed from LMNet and spvnas.

Copyright

Copyright 2022, Jiadai Sun, Xieyuanli Chen, Xianjing Zhang, HAOMO.AI Technology Co., Ltd., China.

This project is free software made available under the GPL v3.0 License. For details see the LICENSE file.
