
lidiff's Introduction

Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion

Paper | Sup. material | Video

This repo contains the code for the scene completion diffusion method proposed in the CVPR'24 paper: "Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion".

Our method formulates the diffusion process as a point-wise local problem, disentangling the scene data distribution from the diffusion process and learning only the local neighborhood distribution of each point. With this formulation we can achieve a complete scene representation from a single LiDAR scan, operating directly on the 3D points.

Dependencies

Install the system and Python prerequisites (we used Python 3.8):

sudo apt install build-essential python3-dev libopenblas-dev

pip3 install -r requirements.txt

Installing MinkowskiEngine:

pip3 install -U MinkowskiEngine==0.5.4 --install-option="--blas=openblas" -v --no-deps

To set up the code, run the following command in the repository's root directory:

pip3 install -U -e .
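
To verify the installation, a quick sanity check like the one below can be run. The importable package name lidiff is assumed from the repository layout, not stated in the original instructions:

# sanity_check.py -- minimal import check (package name "lidiff" is an assumption)
import torch
import MinkowskiEngine as ME
import lidiff  # should resolve if the editable install above succeeded

print("torch:", torch.__version__, "CUDA available:", torch.cuda.is_available())
print("MinkowskiEngine:", ME.__version__)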

SemanticKITTI Dataset

The SemanticKITTI dataset has to be downloaded from the official site and extracted into the following structure (a quick sanity check of this layout is sketched after the tree):

./lidiff/
└── Datasets/
    └── SemanticKITTI/
        └── dataset/
            └── sequences/
                ├── 00/
                │   ├── velodyne/
                │   │   ├── 000000.bin
                │   │   ├── 000001.bin
                │   │   └── ...
                │   └── labels/
                │       ├── 000000.label
                │       ├── 000001.label
                │       └── ...
                ├── 08/ # for validation
                ├── 11/ # 11-21 for testing
                └── 21/
                    └── ...

Ground truth generation

To generate the complete ground-truth scenes you can run the map_from_scans.py script. It uses the dataset scans and poses to build the sequence map used as ground truth during training:

python3 map_from_scans.py --path Datasets/SemanticKITTI/dataset/sequences/

Once the sequence maps are generated you can train the model.
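
The issues below mention a map_clean.npy file inside each sequence folder; assuming that is the output written by map_from_scans.py, the result can be inspected with a short snippet like this (the file name and location are assumptions, not taken from the script itself):

import numpy as np

# assumed output location; adjust if map_from_scans.py writes elsewhere
map_path = "lidiff/Datasets/SemanticKITTI/dataset/sequences/00/map_clean.npy"
scene = np.load(map_path)
print("aggregated map points:", scene.shape)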

Training the diffusion model

For training the diffusion model, the configurations are defined in config/config.yaml, and the training can be started with:

python3 train.py

For training the refinement network, the configurations are defined in config/config_refine.yaml, and the training can be started with:

python3 train_refine.py

Trained model

You can download the trained model weights and save them to lidiff/checkpoints/:

Diffusion Scene Completion Pipeline

For running scene completion inference we provide a pipeline in which both the diffusion and refinement networks are loaded and used to complete the scene from an input scan. You can run the pipeline with:

python3 tools/diff_completion_pipeline.py --diff DIFF_CKPT --refine REFINE_CKPT -T DENOISING_STEPS -s CONDITIONING_WEIGHT

We provide one scan as an example in lidiff/Datasets/test/ so you can test our trained model directly by running the command above.
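
The completed scene can then be inspected with any point-cloud viewer; a minimal Open3D sketch is shown below. The output path is a placeholder, check where the pipeline actually writes its results:

import open3d as o3d

# placeholder path: point this at the .ply written by the completion pipeline
pcd = o3d.io.read_point_cloud("results/completed_scene.ply")
print(pcd)
o3d.visualization.draw_geometries([pcd])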

Citation

If you use this repo, please cite it as:

@inproceedings{nunes2024cvpr,
    author = {Lucas Nunes and Rodrigo Marcuzzi and Benedikt Mersch and Jens Behley and Cyrill Stachniss},
    title = {{Scaling Diffusion Models to Real-World 3D LiDAR Scene Completion}},
    booktitle = {{Proc. of the IEEE/CVF Conf. on Computer Vision and Pattern Recognition (CVPR)}},
    year = {2024}
}

lidiff's People

Contributors

ellingtonkirby, nuneslu


lidiff's Issues

Python version

Hi!

It seems that the packages listed in requirements.txt are quite outdated and are not available for all Python versions when installing with miniconda and pip. Could you please tell us which Python version you used?

If possible, please also outline the environment setup procedure you followed.

Thanks!

Other datasets

Dear authors,

I'm curious whether the model can be trained on other datasets, such as nuScenes or Waymo. Have you ever tried that?

Thanks.

Question about local point denoising

Hi,
First, thanks for the earlier reply about the PyTorch3D install problem!

I would like to ask about the concept of local point denoising. How should I understand constructing the diffusion by adding local noise offsets to each point p_m? Why not reparameterize each point the way DDPM does for each image (Equation 1 in the paper), instead of adding noise directly to each point p_m? If possible, I would like to know how you arrived at this formulation, why it works with scene data, and its mathematical explanation.

Thank you!

Best regards,
Yifan
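
For readers following this discussion, here is a minimal NumPy sketch contrasting the two noising schemes the question refers to: the standard DDPM reparameterization (Eq. 1 in the paper), which scales points towards the origin, versus adding a noise offset directly to each point so it stays anchored near its original position. This is only an illustration of the question, not the authors' implementation:

import numpy as np

rng = np.random.default_rng(0)
points = rng.uniform(-50.0, 50.0, size=(1000, 3))   # toy LiDAR-like points (meters)
alpha_bar_t = 0.5                                    # cumulative noise schedule value at step t
eps = rng.standard_normal(points.shape)

# standard DDPM reparameterization: points are pulled towards the origin as t grows
x_t_ddpm = np.sqrt(alpha_bar_t) * points + np.sqrt(1.0 - alpha_bar_t) * eps

# local offset scheme: noise is added around each point, preserving the scene layout
x_t_local = points + np.sqrt(1.0 - alpha_bar_t) * eps

print("mean |displacement| DDPM :", np.abs(x_t_ddpm - points).mean())
print("mean |displacement| local:", np.abs(x_t_local - points).mean())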

Python version

Hello! Thank you for the code release.

I was wondering which Python version the project is run with?

provide test code?

Although I found the eval_path.py code, it cannot be run directly, and I don't know whether it is the intended test code.

pytorch3d versions

Hi, whenever I try to install PyTorch3D with pip3 install pytorch3d==0.7.4, it gives the following error:

ERROR: Could not find a version that satisfies the requirement pytorch3d==0.7.4 (from versions: 0.1.1, 0.2.0, 0.2.5, 0.3.0)
ERROR: No matching distribution found for pytorch3d==0.7.4

Do you have any idea what is the problem?

map_clean.npy files does not exist

Hello, Hope you are doing well.

After getting a container working with the environment set up, I tried to run some experiments and got this error:

FileNotFoundError: [Errno 2] No such file or directory: '/workspace/dataset/sequences/00/map_clean.npy'

How can I generate this file? (What preprocessing should be done to obtain it?)

thank you

Point cloud input number

Hi again,

In my understanding, scans in SemanticKITTI have on average 480,000 points, but in your code and configuration you use 180,000 points for the aggregated input p_full and 18,000 points for p_part. If I understand correctly, the aggregated point cloud (from map_clean) will have many more points, even after filtering out moving objects and points beyond the maximum range, so the current configuration requires extreme downsampling. Is there a reason you picked 180,000 as num_points and 1/10 as the downsampling rate for p_part?
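
For context, random subsampling to a fixed budget such as the num_points value in config.yaml can be illustrated as below; this is just a generic sketch, not the repository's dataloader:

import numpy as np

def subsample(points: np.ndarray, num_points: int, seed: int = 0) -> np.ndarray:
    """Randomly pick num_points points (with replacement if the cloud is smaller)."""
    rng = np.random.default_rng(seed)
    replace = points.shape[0] < num_points
    idx = rng.choice(points.shape[0], size=num_points, replace=replace)
    return points[idx]

# e.g. aggregated ground truth down to 180,000 points, single scan down to 18,000
# full = subsample(aggregated_map, 180000)
# part = subsample(single_scan, 18000)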

Datasets/test file

Hello. Could you please tell me which sequence the 000123.ply file is from?

Compute Requirements

Hey! Thanks for the great work!

Could you please elaborate on the compute requirements for both training and inference? Specifically:

  • the name of the GPU used.
  • the number of GPUs used for training.
  • the time to train on a specific dataset (say, SemanticKITTI).

Thanks!

Point cloud completion for specific targets

I'm working on a project in which the background point cloud is unimportant. Can I modify the model so that the network only completes specific targets? Could you tell me what I should do?

Problem with Pytorch3D

Hi,

First of all, thank you for the great work! I am currently working on a university project based on it, but I am running into difficulties installing PyTorch3D on my Windows 10 computer, and at this stage it is hard to borrow a suitable machine or rent a server. Are there any possible alternatives to the PyTorch3D dependency?

Cheers,
Yifan

Add image as control condition

I tried adding images as an additional control condition, but the loss starts oscillating after the second epoch and does not decrease. Have the authors tried this experiment before?

Used GPU

Hi, which GPUs, and how many, were used for the training?

An illegal memory access was encountered

First, I want to express my gratitude for the excellent work you're doing. I've encountered an issue during training that I hope you can help with. The details of the error are as follows:

RuntimeError: merge_sort: failed to synchronize: cudaErrorIllegalAddress: an illegal memory access was encountered
[KeOps] error: cuMemFree(buffer) failed with error CUDA_ERROR_ILLEGAL_ADDRESS

I found that the error occurred during the backward pass of the MinkGlobalEnc module. Could you please provide some guidance on how to resolve this issue?

Thank you very much for your help!
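
A common way to narrow down such illegal-memory-access errors is to force synchronous CUDA kernel launches so the failing op shows up in the stack trace. This is generic PyTorch debugging advice, not something specific to this repository:

import os

# must be set before CUDA is initialized, i.e. before torch is imported in the training script
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import torch  # imported after setting the env var on purpose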

data: num_points

Dear Author,

Does data: num_points in config.yaml mean the maximum number of points the network can take as input, or the number of ground-truth points?

Training with custom dataset.

Hi!,

I am training and testing this network on my custom dataset. I was able to modify the code for training, but my results look bizarre; have you encountered this issue? Also, the training loss decreases at first but gets stuck at around 0.9 after a few epochs. Are there any configurations that I am missing?

Here is the input, ground truth, and the result, respectively.
