Awesome papers about Multi-Camera 3D Object Detection and Segmentation in Bird's-Eye-View, such as DETR3D, BEVDet, BEVFormer, BEVDepth, UniAD
awesome-bev-perception-multi-cameras's Introduction
Awesome BEV Perception from Multi-Cameras
LSS: Lift, Splat, Shoot: Encoding Images from Arbitrary Camera Rigs by Implicitly Unprojecting to 3D [paper ] [Github ]
DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries [paper ] [Github ]
CaDDN: Categorical Depth Distribution Network for Monocular 3D Object Detection [paper ] [Github ]
FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras [paper ] [Github ]
CVT: Cross-view Transformers for real-time Map-view Semantic Segmentation [paper ] [Github ]
Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection [paper ]
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers [paper ] [Github ]
PETR: Position Embedding Transformation for Multi-View 3D Object Detection [paper ][Github ]
SpatialDETR: Robust Scalable Transformer-Based 3D Object Detection from Multi-View Camera Images with Global Cross-Sensor Attention [paper ] [Github ]
BEVDet: High-Performance Multi-Camera 3D Object Detection in Bird-Eye-View [paper ] [Github ]
BEVDet4D: Exploit Temporal Cues in Multi-camera 3D Object Detection [paper ]
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images [paper ][Github ]
M2BEV: Multi-Camera Joint 3D Detection and Segmentation with Unified Birds-Eye View Representation [paper ]
BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving [paper ] [Github ]
PolarDETR: Polar Parametrization for Vision-based Surround-View 3D Detection [paper ] [Github ]
(CoRL 2022) LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation [paper ] [Github ]
(AAAI 2023) PolarFormer: Multi-camera 3D Object Detection with Polar Transformers [paper ] [Github ]
(ICRA 2023) CrossDTR: Cross-view and Depth-guided Transformers for 3D Object Detection [paper ] [Github ]
(AAAI 2023) BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection [paper ][Github ]
A Simple Baseline for BEV Perception Without LiDAR [paper ] [Github ]
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision [paper ]
AeDet: Azimuth-invariant Multi-view 3D Object Detection [paper ] [Github ]
(WACV 2023) BEVSegFormer: Bird’s Eye View Semantic Segmentation From Arbitrary Camera Rigs [paper ]
Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection [paper ][Github ]
VideoBEV: Exploring Recurrent Long-term Temporal Fusion for Multi-view 3D Perception [paper ]
HoP: Temporal Enhanced Training of Multi-view 3D Object Detector via Historical Object Prediction [paper ]
StreamPETR: Exploring Object-Centric Temporal Modeling for Efficient Multi-View 3D Object Detection [paper ][Github ]
SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos [paper ][Github ]
(AAAI 2023) BEVStereo: Enhancing Depth Estimation in Multi-view 3D Object Detection with Dynamic Temporal Stereo [paper ] [Github ]
STS: Surround-view Temporal Stereo for Multi-view 3D Detection [paper ]
End to End BEV Perception
ST-P3: End-to-end Vision-based Autonomous Driving via Spatial-Temporal Feature Learning [paper ][Github ]
UniAD: Planning-oriented Autonomous Driving [paper ][Github ]
(ICLR 2023) BEVDistill: Cross-Modal BEV Distillation for Multi-View 3D Object Detection [paper ] [Github ]
TiG-BEV: Multi-view BEV 3D Object Detection via Target Inner-Geometry Learning [paper ][Github ]
RoboBEV: Towards Robust Bird's Eye View Detection under Corruptions [paper ] [Github ]
Fast-BEV: A Fast and Strong Bird’s-Eye View Perception Baseline [paper ] [Github ]
MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception [paper ][Github ]
(ICRA 2022) HDMapNet: An Online HD Map Construction and Evaluation Framework [paper ] [Github ]
(ICLR 2023) MapTR: Structured Modeling and Learning for Online Vectorized HD Map Construction [paper ] [Github ]
FUTR3D: A Unified Sensor Fusion Framework for 3D Detection [paper ] [Github ]
(NeurIPS 2022) BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework [paper ] [Github ]
(NeurIPS 2022) Unifying Voxel-based Representation with Transformer for 3D Object Detection [paper ] [Github ]
BEVFusion: Multi-Task Multi-Sensor Fusion with Unified Bird's-Eye View Representation [paper ] [Github ]
CMT: Cross Modal Transformer via Coordinates Encoding for 3D Object Detection [paper ] [Github ]
BEVFusion4D: Learning LiDAR-Camera Fusion Under Bird's-Eye-View via Cross-Modality Guidance and Temporal Aggregation [paper ]
Vision-Centric BEV Perception: A Survey [paper ] [Github ]
Delving into the Devils of Bird's-eye-view Perception: A Review, Evaluation and Recipe [paper ][Github ]
TPVFormer: An academic alternative to Tesla's Occupancy Network [Github ]
UniWorld: Autonomous Driving Pre-training via World Models [paper ][Github ]
Occ-BEV: Multi-Camera Unified Pre-training via 3D Scene Reconstruction [paper ][Github ]
Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders [paper ][Github ]
Focal Sparse Convolutional Networks for 3D Object Detection [paper ] [Github ]
Voxel Field Fusion for 3D Object Detection [paper ] [Github ]
Scaling up Kernels in 3D CNNs [paper ] [Github ]
awesome-bev-perception-multi-cameras's Issues
Hello, thanks for your great effort in collecting papers for reference. Our paper SparseBEV [arXiv] [github] has recently been accepted to ICCV 2023. Please include our paper in your list.
Hi chaytonmin!
I'd like to contribute to your awesome repository, but I'm not allowed to create a new branch and open a pull request from it. Could you tell me how I can add new lines to the README.md? Thank you.
Hi! I am the first author of CrossDTR. It is my pleasure to present my paper at ICRA 2023.
BEV Segmentation:
X-Align: Cross-Modal Cross-View Alignment for Bird's-Eye-View Segmentation
BEV Detection/Cross sensor KD:
X3KD: Knowledge Distillation Across Modalities, Tasks and Stages for Multi-Camera 3D Object Detection