
Official PyTorch Implementation of PointBeV: A Sparse Approach to BeV Predictions

PointBeV: A Sparse Approach to BeV Predictions
Loick Chambon, Eloi Zablocki, Mickael Chen, Florent Bartoccioni, Patrick Perez, Matthieu Cord.
Valeo AI, Sorbonne University

PointBeV reaches state-of-the-art on several segmentation tasks (vehicle segmentation with and without visibility filtering above) while allowing a trade-off between performance and memory consumption. It can also be used with different sampling patterns, for instance a LiDAR pattern.
Illustration of different sampling patterns, respectively: full, regular, drivable HD map, lane HD map, front camera, and LiDAR. PointBeV is flexible with respect to the sampling pattern.

Abstract

We propose PointBeV, a novel sparse BeV segmentation model operating on sparse BeV features instead of dense grids. This approach offers precise control over memory usage, enabling the use of long temporal contexts and accommodating memory-constrained platforms. PointBeV employs an efficient two-pass strategy for training, enabling focused computation on regions of interest. At inference time, it can be used with various memory/performance trade-offs and flexibly adjusts to new specific use cases. PointBeV achieves state-of-the-art results on the nuScenes dataset for vehicle, pedestrian, and lane segmentation, showcasing superior performance in static and temporal settings despite being trained solely with sparse signals.
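The two-pass training strategy mentioned above can be sketched as follows. This is a minimal NumPy illustration of the coarse-to-fine idea only (all names and the windowing heuristic are ours, not the repository's implementation): a first pass predicts on a coarse set of points, then a second pass densifies around the most confident ones.

```python
import numpy as np

def two_pass_sampling(coarse_logits, top_k=50, radius=1):
    """Coarse-to-fine point selection: keep the top-k most activated
    coarse cells, then densify a small window around each of them."""
    H, W = coarse_logits.shape
    top = np.argsort(coarse_logits.ravel())[-top_k:]   # most confident cells
    ys, xs = np.unravel_index(top, (H, W))
    fine_mask = np.zeros((H, W), dtype=bool)
    for y, x in zip(ys, xs):                           # dilate each anchor
        fine_mask[max(0, y - radius):y + radius + 1,
                  max(0, x - radius):x + radius + 1] = True
    return fine_mask

logits = np.random.default_rng(0).standard_normal((50, 50))
mask = two_pass_sampling(logits)
print(mask.sum())  # number of points the fine pass evaluates
```

The fine pass then only computes features and predictions where the mask is True, which is what keeps memory usage under control.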

The PointBeV architecture operates on sparse representations. It uses an efficient Sparse Feature Pulling module to propagate features from images to the BeV grid, and a Sparse Attention module for temporal aggregation.
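To give an intuition for feature pulling, here is a plain-PyTorch sketch of sampling image features at sparse 2D locations. Note this is not the repository's custom CUDA kernel (compiled in `pointbev/ops/gs`), just an equivalent dense `grid_sample` formulation; the function name and shapes are our own.

```python
import torch
import torch.nn.functional as F

def sparse_feature_pulling(img_feats, points_2d):
    """Pull camera features at sparse 2D locations.

    img_feats: (B, C, H, W) image feature map.
    points_2d: (B, N, 2) sampling coordinates in [-1, 1] (x, y), e.g.
               sparse BeV points projected into the image plane.
    Returns:   (B, N, C) features for the sampled points only.
    """
    # grid_sample expects a (B, H_out, W_out, 2) grid; treat the N sparse
    # points as a 1-pixel-high image so no dense BeV grid is materialised.
    grid = points_2d.unsqueeze(1)                      # (B, 1, N, 2)
    sampled = F.grid_sample(img_feats, grid, align_corners=False)
    return sampled.squeeze(2).transpose(1, 2)          # (B, N, C)

feats = torch.randn(2, 64, 28, 50)       # toy camera feature map
pts = torch.rand(2, 500, 2) * 2 - 1      # 500 sparse points per sample
out = sparse_feature_pulling(feats, pts)
print(out.shape)  # torch.Size([2, 500, 64])
```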

✏️ Bibtex

If this work is helpful for your research, please consider citing the following BibTeX entry and starring this repository.

@inproceedings{chambon2024pointbev,
      title={PointBeV: A Sparse Approach to BeV Predictions}, 
      author={Loick Chambon and Eloi Zablocki and Mickael Chen and Florent Bartoccioni and Patrick Perez and Matthieu Cord},
      year={2024},
      booktitle={CVPR}
}

Updates:

  • 【28/02/2024】 Code released.
  • 【27/02/2024】 PointBeV has been accepted to CVPR 2024.

🚀 Main results

🔥 Vehicle segmentation

PointBeV is originally designed for vehicle segmentation. It can be used with different sampling patterns and different memory/performance trade-offs. It can also be used with temporal context to improve the segmentation.

Vehicle segmentation of various static models at 448x800 image resolution with visibility filtering. More details can be found in our paper.
Models   PointBeV (ours)   BAEFormer   SimpleBeV   BEVFormer   CVT
IoU      47.6              41.0        46.6        45.5        37.7
Below we illustrate the model output. On the ground truth, we distinguish vehicles with low visibility (vis < 40%), in light blue, from those with higher visibility (vis > 40%), in dark blue. PointBeV is able to segment vehicles with low visibility, which is challenging for other models; these often correspond to occluded vehicles.
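The visibility filtering used above can be sketched as follows. nuScenes bins each box's visibility into four levels (1: 0-40%, 2: 40-60%, 3: 60-80%, 4: 80-100% visible); the dict schema below is our own illustration, not the repository's dataloader code.

```python
def filter_by_visibility(annotations, min_visibility=2):
    """Keep boxes whose visibility level is >= min_visibility, i.e. the
    'with filtering' setting drops the hardest (<40% visible) boxes."""
    return [a for a in annotations if a["visibility"] >= min_visibility]

anns = [{"token": "a", "visibility": 1},   # mostly occluded vehicle
        {"token": "b", "visibility": 4}]   # clearly visible vehicle
print(filter_by_visibility(anns))  # only the visibility-4 box remains
```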

We also illustrate the results of a temporal model on random samples taken from the nuScenes validation set. The model used for the visualisation was trained without visibility filtering, at a 448x800 resolution.

✨ Sparse inference

PointBeV can perform inference with fewer points than other models. We illustrate this below with a vehicle segmentation model: thanks to its sparse approach, PointBeV maintains similar performance while using 1/10 of the points required by dense models. The sampling mask is shown in green; predictions are only made on the sampled points.
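A sampling mask of this kind can be built in a few lines. The uniform-random pattern and the function name below are our own choices for illustration; the repository also supports structured patterns (regular grid, HD map, LiDAR, etc.).

```python
import numpy as np

def random_sampling_mask(H, W, keep_ratio=0.1, seed=0):
    """Sparse sampling mask over an H x W BeV grid: predictions are
    computed only where the mask is True."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(H * W, size=int(H * W * keep_ratio), replace=False)
    mask = np.zeros(H * W, dtype=bool)
    mask[idx] = True
    return mask.reshape(H, W)

mask = random_sampling_mask(200, 200, keep_ratio=0.1)
print(mask.sum())  # 4000 points instead of the full 40000
```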

🔥 Pedestrian and lane segmentation

PointBeV can also be used for other segmentation tasks, such as pedestrian or HD map segmentation.

Pedestrian segmentation of various static models at 224x480 resolution. More details can be found in our paper.
Models   PointBeV (ours)   TBP-Former   ST-P3   FIERY   LSS
IoU      18.5              17.2         14.5    17.2    15.0

🔥 Lane segmentation

Lane segmentation of various static models at different resolutions. More details can be found in our paper.
Models   PointBeV (ours)   MatrixVT   M2BeV   PeTRv2   BeVFormer
IoU      49.6              44.8       38.0    44.8     25.7

🔨 Setup

➡️ Create the environment.

git clone https://github.com/...
cd PointBeV
micromamba create -f environment.yaml -y
micromamba activate pointbev

➡️ Install cuda dependencies.

cd pointbev/ops/gs; python setup.py build install; cd -

➡️ Datasets.

We used nuScenes dataset for our experiments. You can download it from the official website: https://www.nuscenes.org/nuscenes.

mkdir data
ln -s $PATH/nuscenes data/nuScenes
pytest tests/test_datasets.py

➡️ Backbones:

Backbone weights are downloaded the first time the code is run. We store them in a dedicated folder so that subsequent runs can retrieve them quickly.

wget https://download.pytorch.org/models/resnet50-0676ba61.pth -P backbones
wget https://download.pytorch.org/models/resnet101-63fe2227.pth -P backbones
wget https://github.com/lukemelas/EfficientNet-PyTorch/releases/download/1.0/efficientnet-b4-6ed6700e.pth -P backbones

Optional: preprocess the dataset to train HD map models. Building HD maps on the fly slows down the dataloader, so we strongly advise saving the preprocessed dataset.

python pointbev/data/dataset/create_maps.py --split val train --version=trainval
python pointbev/data/dataset/create_maps.py --split mini_val mini_train --version=mini

The directory will be as follows.

PointBeV
├── data
│   ├── nuScenes
│   │   ├── samples
│   │   ├── sweeps
│   │   ├── v1.0-mini
│   │   └── v1.0-trainval
│   └── nuscenes_processed_map
│       ├── label
│       │   ├── mini_train
│       │   ├── mini_val
│       │   ├── train
│       │   └── val
│       └── map_0.1

🔄 Training

Sanity check.

pytest tests/test_model.py

Overfitting.

python pointbev/train.py flags.debug=True task_name=debug

Training with simple options:

python pointbev/train.py \
model/net/backbone=efficientnet \ # Specify the backbone.
data.batch_size=8 \ # Select a batch size.
data.valid_batch_size=24 \ # Can be a different batch size, to speed up validation.
data.img_params.min_visibility=1 \ # With or without visibility filtering.
data/[email protected]_params=scale_0_3 \ # Image resolution.
task_name=folder # Where to save the experiment in the logs folder.

If you want to train with our reproduced BeVFormer static code (by specifying model=BeVFormer), do not forget to compile the CUDA dependency.

cd pointbev/ops/defattn; python setup.py build install; cd -

Then select BeVFormer model when running code:

python pointbev/train.py \
model=BeVFormer 

🔄 Evaluation

To evaluate a checkpoint, do not forget to specify the actual resolution and the visibility filtering applied.

python pointbev/train.py train=False test=True task_name=eval \
ckpt.path=PATH_TO_CKPT \
model/net/backbone=efficientnet \
data/[email protected]_params=scale_0_5 \
data.img_params.min_visibility=1 

If you evaluate a pedestrian or an HD map model, do not forget to change the annotations.

python pointbev/train.py train=False test=True task_name=eval \
ckpt.path=PATH_TO_CKPT \
model/net/backbone=resnet50 \
data/[email protected]_params=scale_0_3 \
data.img_params.min_visibility=2 \
data.filters_cat="[pedestrian]" # Instead of filtering vehicles, we filter pedestrians for GT.

If you evaluate a temporal model do not forget to change the model and the temporal frames.

python pointbev/train.py train=False test=True task_name=eval \
model=PointBeV_T \
data.cam_T_P='[[-8,0],[-7,0],[-6,0],[-5,0],[-4,0],[-3,0],[-2,0],[-1,0],[0,0]]' \
ckpt.path=PATH_TO_CKPT \
model/net/backbone=resnet50 \
data/[email protected]_params=scale_0_3 \
data.img_params.min_visibility=2 \
data.filters_cat="[pedestrian]"

About the temporal frames, T_P means 'Time_Pose'. For instance:

  • [[-1,0]] outputs the T=-1 BeV at the T=0 location.
  • [[0,-1]] outputs the T=0 BeV at the T=-1 location.
  • [[-8,0],[-7,0],[-6,0],[-5,0],[-4,0],[-3,0],[-2,0],[-1,0],[0,0]] outputs the T=-8 to T=0 BeV at the T=0 location.
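A hypothetical helper spelling out this (Time, Pose) convention is shown below; the real `data.cam_T_P` option is parsed by the training config, not by this function.

```python
def describe_t_p(cam_T_P):
    """Render a list of [Time, Pose] pairs as human-readable sentences."""
    return [f"BeV at T={t} expressed at the T={p} ego location"
            for t, p in cam_T_P]

for line in describe_t_p([[-2, 0], [-1, 0], [0, 0]]):
    print(line)
```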

Checkpoints

Backbone   Resolution   Visibility   IoU
Eff-b4     224x480      1            38.69
Eff-b4     448x800      1            42.09
Eff-b4     224x480      2            43.97
Eff-b4     448x800      2            47.58

👍 Acknowledgements

Many thanks to these excellent open source projects:

To structure our code we used this template: https://github.com/ashleve/lightning-hydra-template

Todo:

  • Release other checkpoints.
