Code Monkey home page Code Monkey logo

lidiff's Introduction

Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion

Paper | Sup. material | Video

This repo contains the code for the scene completion diffusion method proposed in the CVPR'24 paper: "Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion".

Our method leverages diffusion process as a point-wise local problem, disentangling the scene data distribution during in the diffusion process, learning only the point local neighborhood distribution. From our formulation we can achieve a complete scene representation from a single LiDAR scan directly operating over the 3D points.

Dependencies

Installing python packages pre-requisites:

sudo apt install build-essential python3-dev libopenblas-dev

pip3 install -r requirements.txt

Installing MinkowskiEngine:

pip3 install -U MinkowskiEngine==0.5.4 --install-option="--blas=openblas" -v --no-deps

To setup the code run the following command on the code main directory:

pip3 install -U -e .

SemanticKITTI Dataset

The SemanticKITTI dataset has to be download from the official site and extracted in the following structure:

./lidiff/
└── Datasets/
    └── SemanticKITTI
        └── dataset
          └── sequences
            ├── 00/
            │   ├── velodyne/
            |   |       ├── 000000.bin
            |   |       ├── 000001.bin
            |   |       └── ...
            │   └── labels/
            |       ├── 000000.label
            |       ├── 000001.label
            |       └── ...
            ├── 08/ # for validation
            ├── 11/ # 11-21 for testing
            └── 21/
                └── ...

Ground truth generation

To generate the ground complete scenes you can run the map_from_scans.py script. This will use the dataset scans and poses to generate the sequence map to be used as ground truth during training:

python3 map_from_scans.py --path Datasets/SemanticKITTI/dataset/sequences/

Once the sequences map is generated you can then train the model.

Training the diffusion model

For training the diffusion model, the configurations are defined in config/config.yaml, and the training can be started with:

python3 train.py

For training the refinement network, the configurations are defined in config/config_refine.yaml, and the training can be started with:

python3 train_refine.py

Trained model

You can download the trained model weights and save then to lidiff/checkpoints/:

Diffusion Scene Completion Pipeline

For running the scene completion inference we provide a pipeline where both the diffusion and refinement network are loaded and used to complete the scene from an input scan. You can run the pipeline with the command:

python3 tools/diff_completion_pipeline.py --diff DIFF_CKPT --refine REFINE_CKPT -T DENOISING_STEPS -s CONDITIONING_WEIGHT

We provide one scan as example in lidiff/Datasets/test/ so you can directly test it out with our trained model by just running the code above.

Citation

If you use this repo, please cite as :

@inproceedings{nunes2024cvpr,
    author = {Lucas Nunes and Rodrigo Marcuzzi and Benedikt Mersch and Jens Behley and Cyrill Stachniss},
    title = {{Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion}},
    booktitle = {{Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR)}},
    year = {2024}
}

lidiff's People

Contributors

nuneslu avatar ellingtonkirby avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.