3D-Shape-Analysis-Paper-List

A collection of papers, libraries, and datasets I have recently read, gathered for anyone interested in



Statistics: 🔥 code is available & stars >= 100  |  ⭐ citation >= 50

3D Detection & Segmentation

  • [Arxiv] Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding [Project]
  • [Arxiv] RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding [Project]
  • [CVPR2023] EFEM: Equivariant Neural Field Expectation Maximization for 3D Object Segmentation Without Scene Supervision [Project]
  • [CVPR2023] PiMAE: Point Cloud and Image Interactive Masked Autoencoders for 3D Object Detection [Project]
  • [Arxiv] Mask3D for 3D Semantic Instance Segmentation [github]
  • [ECCV2022] ObjectBox: From Centers to Boxes for Anchor-Free Object Detection [github]
  • [Arxiv] Masked Autoencoders for Self-Supervised Learning on Automotive Point Clouds
  • [CVPR2022] HyperDet3D: Learning a Scene-conditioned 3D Object Detector
  • [Arxiv] AutoAlign: Pixel-Instance Feature Aggregation for Multi-Modal 3D Object Detection

Before 2022

  • [AAAI2022] AFDetV2: Rethinking the Necessity of the Second Stage for Object Detection from Point Clouds
  • [AAAI2022] Static-Dynamic Co-Teaching for Class-Incremental 3D Object Detection
  • [NeurIPS2021] Revisiting 3D Object Detection From an Egocentric Perspective
  • [Arxiv] Embracing Single Stride 3D Object Detector with Sparse Transformer [github]
  • [AAAI2022] Learning Auxiliary Monocular Contexts Helps Monocular 3D Object Detection
  • [Arxiv] 3D-VField: Learning to Adversarially Deform Point Clouds for Robust 3D Object Detection
  • [Arxiv] Fast Point Transformer
  • [3DV2021] Open-set 3D Object Detection
  • [Arxiv] FCAF3D: Fully Convolutional Anchor-Free 3D Object Detection [Project]
  • [TPAMI2021] Point Cloud Instance Segmentation with Semi-supervised Bounding-Box Mining
  • [Arxiv] Online Adaptation for Implicit Object Tracking and Shape Reconstruction in the Wild
  • [Arxiv] RAANet: Range-Aware Attention Network for LiDAR-based 3D Object Detection with Auxiliary Density Level Estimation [github]
  • [Arxiv] SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking [github]
  • [NeurIPS2021] Multimodal Virtual Point 3D Detection [Project]
  • [BMVC2021] 3D Object Tracking with Transformer [github]
  • [3DV2021] Learning 3D Semantic Segmentation with only 2D Image Supervision
  • [3DV2021] NeuralDiff: Segmenting 3D objects that move in egocentric videos [Project]
  • [BMVC2021] FAST3D: Flow-Aware Self-Training for 3D Object Detectors
  • [ICCV2021] Guided Point Contrastive Learning for Semi-supervised Point Cloud Semantic Segmentation
  • [CORL2021] DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries [github]
  • [NeurIPS2021] Object DGCNN: 3D Object Detection using Dynamic Graphs [github]
  • [Arxiv] Improved Pillar with Fine-grained Feature for 3D Object Detection
  • [Arxiv] 3D-FCT: Simultaneous 3D Object Detection and Tracking Using Feature Correlation
  • [ICCVW2021] MonoCInIS: Camera Independent Monocular 3D Object Detection using Instance Segmentation
  • [Arxiv] GSIP: Green Semantic Segmentation of Large-Scale Indoor Point Clouds
  • [Arxiv] Pix2seq: A Language Modeling Framework for Object Detection
  • [Arxiv] MVM3Det: A Novel Method for Multi-view Monocular 3D Detection
  • [ICCV2021] NEAT: Neural Attention Fields for End-to-End Autonomous Driving [github]
  • [ICCV2021] Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection
  • [ICCV2021] 4D-Net for Learned Multi-Modal Alignment
  • [ICCV2021] Active Learning for Deep Object Detection via Probabilistic Modeling [github]
  • [ICCV2021] An End-to-End Transformer Model for 3D Object Detection [Project]
  • [ICCV2021] Improving 3D Object Detection with Channel-wise Transformer
  • [ICCV2021] Voxel Transformer for 3D Object Detection
  • [CVPR2021] To the Point: Efficient 3D Object Detection in the Range Image With Graph Convolution Kernels
  • [Arxiv] M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers
  • [ICCV2021] Exploring Simple 3D Multi-Object Tracking for Autonomous Driving
  • [ICCV2021] LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector
  • [ICCV2021] Instance Segmentation in 3D Scenes using Semantic Superpoint Tree Networks [github]
  • [ICCV2021] RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection
  • [ICCV2021] Is Pseudo-Lidar needed for Monocular 3D Object detection?
  • [IROS2021] PTT: Point-Track-Transformer Module for 3D Single Object Tracking in Point Clouds [github]
  • [ICCV2021] Oriented R-CNN for Object Detection [github]
  • [ICCV2021] Box-Aware Feature Enhancement for Single Object Tracking on Point Clouds [github]
  • [IROS2021] Joint Multi-Object Detection and Tracking with Camera-LiDAR Fusion for Autonomous Driving
  • [ACMMM2021] From Voxel to Point: IoU-guided 3D Object Detection for Point Cloud with Voxel-to-Point Decoder [github]
  • [ICCV2021] DRINet: A Dual-Representation Iterative Learning Network for Point Cloud Segmentation
  • [ICCV2021] Hierarchical Aggregation for 3D Instance Segmentation [github]
  • [Arxiv] Investigating Attention Mechanism in 3D Point Cloud Object Detection [pytorch]
  • [ICCV2021] VMNet: Voxel-Mesh Network for Geodesic-Aware 3D Semantic Segmentation [pytorch]
  • [ICCV2021] Geometry Uncertainty Projection Network for Monocular 3D Object Detection
  • [Arxiv] Aug3D-RPN: Improving Monocular 3D Object Detection by Synthetic Images with Virtual Depth
  • [Arxiv] DV-Det: Efficient 3D Point Cloud Object Detection with Dynamic Voxelization
  • [ICCV2021] ReDAL: Region-based and Diversity-aware Active Learning for Point Cloud Semantic Segmentation
  • [ICCV2021] Rank & Sort Loss for Object Detection and Instance Segmentation [pytorch]
  • [Arxiv] Multi-Modality Task Cascade for 3D Object Detection [github]
  • [ACMMM2021] Neighbor-Vote: Improving Monocular 3D Object Detection through Neighbor Distance Voting
  • [Arxiv] Monocular 3D Object Detection: An Extrinsic Parameter Free Approach
  • [Arxiv] Real-time 3D Object Detection using Feature Map Flow [pytorch]
  • [Arxiv] To the Point: Efficient 3D Object Detection in the Range Image with Graph Convolution Kernels
  • [CVPR2021] RSN: Range Sparse Net for Efficient, Accurate LiDAR 3D Object Detection
  • [Arxiv] Sparse PointPillars: Exploiting Sparsity in Birds-Eye-View Object Detection
  • [Arxiv] ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection [Project]
  • [CVPR2021] 3D Spatial Recognition without Spatially Labeled 3D [Project]
  • [Arxiv] Lite-FPN for Keypoint-based Monocular 3D Object Detection
  • [TPAMI] MonoGRNet: A General Framework for Monocular 3D Object Detection
  • [Arxiv] Lidar Point Cloud Guided Monocular 3D Object Detection
  • [Arxiv] Geometry-aware data augmentation for monocular 3D object detection
  • [Arxiv] OCM3D: Object-Centric Monocular 3D Object Detection
  • [CVPR2021] Objects are Different: Flexible Monocular 3D Object Detection [github]
  • [CVPR2021] HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection
  • [Arxiv] Group-Free 3D Object Detection via Transformers [pytorch]
  • [CVPR2021] GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection [pytorch]
  • [CVPR2021] Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds [pytorch]
  • [CVPR2021] Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection [github]
  • [CVPR2021] Delving into Localization Errors for Monocular 3D Object Detection [github]
  • [CVPR2021] 3D-MAN: 3D Multi-frame Attention Network for Object Detection
  • [CVPR2021] LiDAR R-CNN: An Efficient and Universal 3D Object Detector [github]
  • [CVPR2021] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [pytorch]
  • [CVPR2021] M3DSSD: Monocular 3D Single Stage Object Detector
  • [CVPR2021] MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation
  • [Arxiv] SparsePoint: Fully End-to-End Sparse 3D Object Detector
  • [Arxiv] RangeDet: In Defense of Range View for LiDAR-based 3D Object Detection
  • [ICRA2021] YOLOStereo3D: A Step Back to 2D for Efficient Stereo 3D Detection [github]
  • [CVPR2021] ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection [github]
  • [Arxiv] Offboard 3D Object Detection from Point Cloud Sequences
  • [CVPR2021] DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution [github]
  • [Arxiv] Pseudo-labeling for Scalable 3D Object Detection
  • [Arxiv] DPointNet: A Density-Oriented PointNet for 3D Object Detection in Point Clouds
  • [Arxiv] PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection [pytorch]
  • [Arxiv] Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss
  • [Arxiv] CubifAE-3D: Monocular Camera Space Cubification for Auto-Encoder based 3D Object Detection
  • [Arxiv] Self-Attention Based Context-Aware 3D Object Detection [pytorch]
  • [Arxiv] Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

Before 2021

  • [Arxiv] It's All Around You: Range-Guided Cylindrical Network for 3D Object Detection
  • [Arxiv] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection [Project]
  • [Arxiv] Demystifying Pseudo-LiDAR for Monocular 3D Object Detection
  • [3DV2020] PanoNet3D: Combining Semantic and Geometric Understanding for LiDAR Point Cloud Detection
  • [AAAI2021] PC-RGNN: Point Cloud Completion and Graph Neural Network for 3D Object Detection
  • [Arxiv] SegGroup: Seg-Level Supervision for 3D Instance and Semantic Segmentation
  • [Arxiv] 3D Object Detection with Pointformer
  • [WACV2021] CenterFusion: Center-based Radar and Camera Fusion for 3D Object Detection [pytorch]
  • [Arxiv] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation [pytorch]
  • [Arxiv] Learning to Predict the 3D Layout of a Scene
  • [Arxiv] Canonical Voting: Towards Robust Oriented Bounding Box Detection in 3D Scenes [Project]
  • [Arxiv] DyCo3D: Robust Instance Segmentation of 3D Point Clouds through Dynamic Convolution
  • [Arxiv] Temporal-Channel Transformer for 3D Lidar-Based Video Object Detection in Autonomous Driving
  • [NeurIPS2020] Every View Counts: Cross-View Consistency in 3D Object Detection with Hybrid-Cylindrical-Spherical Voxelization
  • [NeurIPS2020] Group Contextual Encoding for 3D Point Clouds [pytorch]
  • [Arxiv] 3D Object Recognition By Corresponding and Quantizing Neural 3D Scene Representations [Project]
  • [Arxiv] A Density-Aware PointRCNN for 3D Objection Detection in Point Clouds
  • [Arxiv] Monocular 3D Detection with Geometric Constraints Embedding and Semi-supervised Training
  • [ECCV2020] Reinforced Axial Refinement Network for Monocular 3D Object Detection
  • [Arxiv] RUHSNet: 3D Object Detection Using Lidar Data in Real Time [pytorch]
  • [IROS2020] 3D Multi-Object Tracking: A Baseline and New Evaluation Metrics [Project][Code]
  • [ECCV2020] Virtual Multi-view Fusion for 3D Semantic Segmentation
  • [ACMMM2020] Weakly Supervised 3D Object Detection from Point Clouds
  • [ECCV2020] Weakly Supervised 3D Object Detection from Lidar Point Cloud [pytorch]
  • [ECCV2020] Kinematic 3D Object Detection in Monocular Video
  • [IROS2020] Object-Aware Centroid Voting for Monocular 3D Object Detection
  • [ECCV2020] Pillar-based Object Detection for Autonomous Driving
  • [Arxiv] Local Grid Rendering Networks for 3D Object Detection in Point Clouds
  • [Arxiv] Learning to Detect 3D Objects from Point Clouds in Real Time
  • [Arxiv] SVGA-Net: Sparse Voxel-Graph Attention Network for 3D Object Detection from Point Clouds
  • [CVPR2020] PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation
  • [CVPR2020] FroDO: From Detections to 3D Objects
  • [CVPR2020] Physically Realizable Adversarial Examples for LiDAR Object Detection
  • [CVPR2020] Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection
  • [CVPR2020] End-to-end 3D Point Cloud Instance Segmentation without Detection
  • [CVPR2020] MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
  • [CVPR2020] Structure Aware Single-stage 3D Object Detection from Point Cloud
  • [CVPR2020] Learning Depth-Guided Convolutions for Monocular 3D Object Detection [pytorch] 🔥
  • [CVPR2020] What You See is What You Get: Exploiting Visibility for 3D Object Detection
  • [CVPR2020] Density Based Clustering for 3D Object Detection in Point Clouds
  • [CVPR2020] PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
  • [CVPR2020] MLCVNet: Multi-Level Context VoteNet for 3D Object Detection
  • [CVPR2020] PointPainting: Sequential Fusion for 3D Object Detection
  • [CVPR2020] Joint 3D Instance Segmentation and Object Detection for Autonomous Driving
  • [CVPR2020] Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud [tensorflow]
  • [CVPR2020] HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
  • [CVPR2020] A Hierarchical Graph Network for 3D Object Detection on Point Clouds
  • [Arxiv] H3DNet: 3D Object Detection Using Hybrid Geometric Primitives [pytorch]
  • [CVPR2020] P2B: Point-to-Box Network for 3D Object Tracking in Point Clouds
  • [Arxiv] 3D-CVF: Generating Joint Camera and LiDAR Features Using Cross-View Spatial Feature Fusion for 3D Object Detection
  • [CVPR2020] Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking
  • [CVPR2020] Learning to Evaluate Perception Models Using Planner-Centric Metrics
  • [CVPR2020] Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation [pytorch]
  • [Arxiv] SSN: Shape Signature Networks for Multi-class Object Detection from Point Clouds [github]
  • [CVPR2020] End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection [github]
  • [Arxiv] Finding Your (3D) Center: 3D Object Detection Using a Learned Loss
  • [CVPR2020] 3D-MPA: Multi Proposal Aggregation for 3D Semantic Instance Segmentation
  • [CVPR2020] Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation
  • [CVPR2020] OccuSeg: Occupancy-aware 3D Instance Segmentation
  • [CVPR2020] Learning to Segment 3D Point Clouds in 2D Image Space
  • [AAAI2020] ZoomNet: Part-Aware Adaptive Zooming Neural Network for 3D Object Detection
  • [Arxiv] MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships
  • [Arxiv] HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection
  • [Arxiv] SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation
  • [Arxiv] 3DSSD: Point-based 3D Single Stage Object Detector
  • [Arxiv] Monocular 3D Object Detection with Decoupled Structured Polygon Estimation and Height-Guided Depth Estimation
  • [CVPR2020] ImVoteNet: Boosting 3D Object Detection in Point Clouds with Image Votes
  • [Arxiv] A Review on Object Pose Recovery: from 3D Bounding Box Detectors to Full 6D Pose Estimators
  • [Arxiv] ScanRefer: 3D Object Localization in RGB-D Scans using Natural Language
  • [Arxiv] Objects as Points [github] ⭐🔥
  • [Arxiv] RTM3D: Real-time Monocular 3D Detection from Object Keypoints for Autonomous Driving [github]
  • [CVPR2020] DSGN: Deep Stereo Geometry Network for 3D Object Detection [github]
  • [Arxiv] Learning and Memorizing Representative Prototypes for 3D Point Cloud Semantic and Instance Segmentation
  • [Arxiv] PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection
  • [Arxiv] Object as Hotspots: An Anchor-Free 3D Object Detection Approach via Firing of Hotspots
  • [CVPR2020] SESS: Self-Ensembling Semi-Supervised 3D Object Detection
  • [NeurIPS2019] PerspectiveNet: 3D Object Detection from a Single RGB Image via Perspective Points
  • [NeurIPS2019] Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds
  • [ICCV2019] Deep Hough Voting for 3D Object Detection in Point Clouds
  • [AAAI2020] JSNet: Joint Instance and Semantic Segmentation of 3D Point Clouds
  • [ICCV2019] M3D-RPN: Monocular 3D Region Proposal Network for Object Detection [pytorch]
  • [ICCV2019] 3D Instance Segmentation via Multi-Task Metric Learning
  • [Arxiv] Single-Stage Monocular 3D Object Detection with Virtual Cameras
  • [Arxiv] Depth Completion via Deep Basis Fitting
  • [Arxiv] Relation Graph Network for 3D Object Detection in Point Clouds
  • [CVPR2019] 3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans [pytorch] 🔥
  • [ICCV2019] Rescan: Inductive Instance Segmentation for Indoor RGBD Scans [C++]
  • [ICCV2019] Transferable Semi-Supervised 3D Object Detection From RGB-D Data
  • [ICCV2019] STD: Sparse-to-Dense 3D Object Detector for Point Cloud
  • [CVPR2019] PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud [pytorch]
  • [Arxiv] Fast Point R-CNN
  • [Arxiv] Class-balanced Grouping and Sampling for Point Cloud 3D Object Detection [pytorch] 🔥
  • [ECCV2018] 3DMV: Joint 3D-Multi-View Prediction for 3D Semantic Scene Segmentation [pytorch] 🔥

Shape Representation

  • [CVPR2023] Masked Scene Contrast: A Scalable Framework for Unsupervised 3D Representation Learning [github]
  • [Arxiv] Neural Vector Fields: Implicit Representation by Explicit Learning
  • [ECCV2022] NeuMesh: Learning Disentangled Neural Mesh-based Implicit Field for Geometry and Texture Editing [Project]
  • [Arxiv] Masked Autoencoders in 3D Point Cloud Representation Learning
  • [Arxiv] NeuralODF: Learning Omnidirectional Distance Fields for 3D Shape Representation
  • [Siggraph2022] Learning Smooth Neural Functions via Lipschitz Regularization [Project]
  • [Siggraph2022] Dual Octree Graph Networks for Learning Adaptive Volumetric Shape Representations [Project]
  • [Arxiv] A Level Set Theory for Neural Implicit Evolution under Explicit Flows
  • [CVPR2022] GIFS: Neural Implicit Function for General Shape Representation [Project]
  • [Arxiv] PINs: Progressive Implicit Networks for Multi-Scale Neural Representations
  • [Arxiv] Distillation with Contrast is All You Need for Self-Supervised Point Cloud Representation Learning
  • [Arxiv] Spelunking the Deep: Guaranteed Queries for General Neural Implicit Surfaces
  • [Arxiv] MINER: Multiscale Implicit Neural Representations
  • [Arxiv] De-rendering 3D Objects in the Wild
  • [Arxiv] Implicit Autoencoder for Point Cloud Self-supervised Representation Learning

Before 2022

  • [Arxiv] End-to-End Learning of Multi-category 3D Pose and Shape Estimation
  • [Arxiv] Point2Cyl: Reverse Engineering 3D Objects from Point Clouds to Extrusion Cylinders
  • [Arxiv] Representing 3D Shapes with Probabilistic Directed Distance Fields
  • [Arxiv] Text2Mesh: Text-Driven Neural Stylization for Meshes [Project]
  • [Arxiv] PointCLIP: Point Cloud Understanding by CLIP [github]
  • [Arxiv] Voint Cloud: Multi-View Point Cloud Representation for 3D Understanding
  • [Arxiv] Gradient-SDF: A Semi-Implicit Surface Representation for 3D Reconstruction
  • [Arxiv] Intuitive Shape Editing in Latent Space
  • [NeurIPS2021] Learning Object-Centric Representations of Multi-Object Scenes from Multiple Views [github]
  • [Arxiv] Neural Fields as Learnable Kernels for 3D Reconstruction
  • [NeurIPS2021] OctField: Hierarchical Implicit Functions for 3D Modeling [github]
  • [3DV2021] RefRec: Pseudo-labels Refinement via Shape Reconstruction for Unsupervised 3D Domain Adaptation [github]
  • [3DV2021] PolyNet: Polynomial Neural Network for 3D Shape Recognition with PolyShape Representation [Project]
  • [Arxiv] BACON: Band-limited Coordinate Networks for Multiscale Scene Representation [Project]
  • [Arxiv] UNIST: Unpaired Neural Implicit Shape Translation Network [Project]
  • [Arxiv] Representing Shape Collections with Alignment-Aware Linear Models [Project]
  • [ICCV2021] Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds
  • [Arxiv] DeepCurrents: Learning Implicit Representations of Shapes with Boundaries
  • [3DV] AIR-Nets: An Attention-Based Framework for Locally Conditioned Implicit Representations [github]
  • [Arxiv] HyperCube: Implicit Field Representations of Voxelized 3D Models
  • [Arxiv] ARAPReg: An As-Rigid-As Possible Regularization Loss for Learning Deformable Shape Generators
  • [ICCV2021] Multiresolution Deep Implicit Functions for 3D Shape Representation
  • [ICCV2021] Learning Canonical 3D Object Representation for Fine-Grained Recognition
  • [Arxiv] Point Discriminative Learning for Unsupervised Representation Learning on 3D Point Clouds
  • [Arxiv] A Deep Signed Directional Distance Function for Object Shape Representation
  • [Arxiv] 3D Neural Scene Representations for Visuomotor Control [Project]
  • [Arxiv] A-SDF: Learning Disentangled Signed Distance Functions for Articulated Shape Representation [Project]
  • [Arxiv] ShapeMOD: Macro Operation Discovery for 3D Shape Programs [Project]
  • [Arxiv] CoCoNets: Continuous Contrastive 3D Scene Representations [Project]
  • [Arxiv] DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes with Biharmonic Coordinates [Project]

Before 2021

  • [CVPR2021] clDice-a Novel Topology-Preserving Loss Function for Tubular Structure Segmentation [github]
  • [CVPR2021] Point2Skeleton: Learning Skeletal Representations from Point Clouds [pytorch]
  • [Arxiv] ParaNet: Deep Regular Representation for 3D Point Clouds
  • [Arxiv] Geometric Adversarial Attacks and Defenses on 3D Point Clouds [tensorflow]
  • [Arxiv] Learning Category-level Shape Saliency via Deep Implicit Surface Networks
  • [Arxiv] pi-GAN: Periodic Implicit Generative Adversarial Networks for 3D-Aware Image Synthesis
  • [Arxiv] Deep Implicit Templates for 3D Shape Representation
  • [NeurIPS2020] MetaSDF: Meta-learning Signed Distance Functions [Project]
  • [Arxiv] RISA-Net: Rotation-Invariant Structure-Aware Network for Fine-Grained 3D Shape Retrieval [tensorflow]
  • [Arxiv] Overfit Neural Networks as a Compact Shape Representation
  • [Arxiv] DSM-Net: Disentangled Structured Mesh Net for Controllable Generation of Fine Geometry [Project]
  • [Arxiv] PatchNets: Patch-Based Generalizable Deep Implicit 3D Shape Representations
  • [Arxiv] CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations
  • [Arxiv] RocNet: Recursive Octree Network for Efficient 3D Deep Representation
  • [ECCV2020] GeLaTO: Generative Latent Textured Objects [Project]
  • [ECCV2020] Ladybird: Quasi-Monte Carlo Sampling for Deep Implicit Field Based 3D Reconstruction with Symmetry
  • [Arxiv] Neural Sparse Voxel Fields
  • [CVPR2020] StructEdit: Learning Structural Shape Variations [github]
  • [Arxiv] PAI-GCN: Permutable Anisotropic Graph Convolutional Networks for 3D Shape Representation Learning [github]
  • [CVPR2020] Learning Generative Models of Shape Handles [Project page]
  • [CVPR2020] DualSDF: Semantic Shape Manipulation using a Two-Level Representation [github]
  • [CVPR2020] Learning Unsupervised Hierarchical Part Decomposition of 3D Objects from a Single RGB Image [pytorch]
  • [NeurIPS2019] Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations [pytorch]
  • [Arxiv] Label-Efficient Learning on Point Clouds using Approximate Convex Decompositions
  • [Arxiv] Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
  • [Arxiv] Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction
  • [Arxiv] SeqXY2SeqZ: Structure Learning for 3D Shapes by Sequentially Predicting 1D Occupancy Segments From 2D Coordinates
  • [CVPR2020] D3Feat: Joint Learning of Dense Detection and Description of 3D Local Features
  • [Arxiv] Implicit Geometric Regularization for Learning Shapes
  • [Arxiv] Analytic Marching: An Analytic Meshing Solution from Deep Implicit Surface Networks
  • [Arxiv] Adversarial Generation of Continuous Implicit Shape Representations [pytorch]
  • [Arxiv] A Novel Tree-structured Point Cloud Dataset For Skeletonization Algorithm Evaluation [dataset]
  • [CVPRW2019] SkelNetOn 2019: Dataset and Challenge on Deep Learning for Geometric Shape Understanding [project]
  • [Arxiv] Skeleton Extraction from 3D Point Clouds by Decomposing the Object into Parts
  • [Arxiv] InSphereNet: a Concise Representation and Classification Method for 3D Object
  • [Arxiv] Deep Structured Implicit Functions
  • [CVIU] 3D articulated skeleton extraction using a single consumer-grade depth camera
  • [ICLR2019] Point Cloud GAN [tensorflow]
  • [ICCV2019] Learning Shape Templates with Structured Implicit Functions
  • [ICCV2019] 3D Point Cloud Generative Adversarial Network Based on Tree Structured Graph Convolutions [pytorch]
  • [ICCV2019] Implicit Surface Representations as Layers in Neural Networks
  • [CVPR2019] DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation [pytorch] 🔥 ⭐
  • [SIGGRAPH2019] StructureNet: Hierarchical Graph Networks for 3D Shape Generation [pytorch]
  • [SIGGRAPH Asia2019] LOGAN: Unpaired Shape Transform in Latent Overcomplete Space [tensorflow]
  • [TOG] Voxel Cores: Efficient, robust, and provably good approximation of 3D medial axes
  • [SIGGRAPH2018] P2P-NET: Bidirectional Point Displacement Net for Shape Transform [tensorflow]
  • [ICML2018] Learning Representations and Generative Models for 3D Point Clouds [tensorflow] 🔥⭐
  • [NeurIPS2018] Discovery of Latent 3D Keypoints via End-to-end Geometric Reasoning [tensorflow][project page] ⭐🔥
  • [AAAI2018] Unsupervised Articulated Skeleton Extraction from Point Set Sequences Captured by a Single Depth Camera
  • [3DV2018] Parsing Geometry Using Structure-Aware Shape Templates
  • [SIGGRAPH2017] GRASS: Generative Recursive Autoencoders for Shape Structures [pytorch] 🔥
  • [TOG] Erosion Thickness on Medial Axes of 3D Shapes
  • [Vis Comput] Distance field guided L1-median skeleton extraction
  • [CGF] Contracting Medial Surfaces Isotropically for Fast Extraction of Centred Curve Skeletons
  • [CGF] Improved Use of LOP for Curve Skeleton Extraction
  • [SIGGRAPH Asia2015] Deep Points Consolidation [C++ & Qt]
  • [SIGGRAPH2015] Burning The Medial Axis
  • [SIGGRAPH2009] Curve Skeleton Extraction from Incomplete Point Cloud [matlab] ⭐
  • [TOG] SDM-NET: deep generative network for structured deformable mesh
  • [TOG] Robust and Accurate Skeletal Rigging from Mesh Sequences 🔥
  • [TOG] L1-medial skeleton of point cloud [C++] 🔥
  • [EUROGRAPHICS2016] 3D Skeletons: A State-of-the-Art Report 🔥
  • [SGP2012] Mean Curvature Skeletons [C++] 🔥
  • [SMIC2010] Point Cloud Skeletons via Laplacian-Based Contraction [Matlab] 🔥

Shape & Scene Completion

  • [ECCV2022] CompNVS: Novel View Synthesis with Scene Completion
  • [ECCV2022] PatchRD: Detail-Preserving Shape Completion by Learning Patch Retrieval and Deformation [Project]
  • [Arxiv] SRPCN: Structure Retrieval based Point Completion Network
  • [ICRA2022] Temporal Point Cloud Completion with Pose Disturbance
  • [Arxiv] Towards realistic symmetry-based completion of previously unseen point clouds [github]

Before 2022

  • [AAAI2022] Not All Voxels Are Equal: Semantic Scene Completion from the Point-Voxel Perspective
  • [AAAI2022] Attention-based Transformation from Latent Features to Point Clouds
  • [Arxiv] MonoScene: Monocular 3D Semantic Scene Completion [Project]
  • [Arxiv] Semi-supervised Implicit Scene Completion from Sparse LiDAR [github]
  • [NeurIPS2021] Density-aware Chamfer Distance as a Comprehensive Metric for Point Cloud Completion [github]
  • [Arxiv] PU-Transformer: Point Cloud Upsampling Transformer
  • [BMVC2021] Self-Supervised Point Cloud Completion via Inpainting
  • [IROS2021] Graph-Guided Deformation for Point Cloud Completion
  • [IROS2021] Semantic Segmentation-assisted Scene Completion for LiDAR Point Clouds [github]
  • [Arxiv] 3D Point Cloud Completion with Geometric-Aware Adversarial Augmentation
  • [Arxiv] PC2-PU: Patch Correlation and Position Correction for Effective Point Cloud Upsampling
  • [ICCV2021] Voxel-based Network for Shape Completion by Leveraging Edge Generation [github]
  • [ICCV2021] PoinTr: Diverse Point Cloud Completion with Geometry-Aware Transformers [github]
  • [ICCV2021] SnowflakeNet: Point Cloud Completion by Snowflake Point Deconvolution with Skip-Transformer [github]
  • [Arxiv] CarveNet: Carving Point-Block for Complex 3D Shape Completion
  • [IJCAI2021] IMENet: Joint 3D Semantic Scene Completion and 2D Semantic Segmentation through Iterative Mutual Enhancement
  • [CVPR2021] Point Cloud Upsampling via Disentangled Refinement [github]
  • [TVCG2021] Consistent Two-Flow Network for Tele-Registration of Point Clouds [Project]
  • [Arxiv] 4DComplete: Non-Rigid Motion Estimation Beyond the Observable Surface [Project]
  • [CVPR2021] Unsupervised 3D Shape Completion through GAN Inversion [Project]
  • [Arxiv] ASFM-Net: Asymmetrical Siamese Feature Matching Network for Point Completion
  • [CVPR2021] Variational Relational Point Completion Network [Project]
  • [CVPR2021] View-Guided Point Cloud Completion
  • [CVPR2021] Semantic Scene Completion via Integrating Instances and Scene in-the-Loop [pytorch]
  • [CVPR2021] Denoise and Contrast for Category Agnostic Shape Completion
  • [CVPR2021] Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding
  • [CVPR2021] PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths
  • [CVPR2021] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion
  • [Arxiv] VPC-Net: Completion of 3D Vehicles from MLS Point Clouds

Before 2021

  • [Arxiv] PMP-Net: Point Cloud Completion by Learning Multi-step Point Moving Paths
  • [Arxiv] S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds
  • [Arxiv] Semantic Scene Completion using Local Deep Implicit Functions on LiDAR Data
  • [Arxiv] Learning-based 3D Occupancy Prediction for Autonomous Navigation in Occluded Environments
  • [3DV2020] SCFusion: Real-time Incremental Scene Reconstruction with Semantic Completion
  • [Arxiv] Refinement of Predicted Missing Parts Enhance Point Cloud Completion [pytorch]
  • [Arxiv] Unsupervised Partial Point Set Registration via Joint Shape Completion and Registration
  • [Arxiv] LMSCNet: Lightweight Multiscale 3D Semantic Completion [Demo]
  • [ECCV2020] SoftPoolNet: Shape Descriptor for Point Cloud Completion and Classification
  • [ECCV2020] Weakly-supervised 3D Shape Completion in the Wild
  • [Arxiv] Point Cloud Completion by Learning Shape Priors
  • [Arxiv] KAPLAN: A 3D Point Descriptor for Shape Completion
  • [Arxiv] SPSG: Self-Supervised Photometric Scene Generation from RGB-D Scans
  • [Arxiv] GRNet: Gridding Residual Network for Dense Point Cloud Completion
  • [Arxiv] Deep Octree-based CNNs with Output-Guided Skip Connections for 3D Shape and Scene Completion
  • [CVPR2020] Point Cloud Completion by Skip-attention Network with Hierarchical Folding
  • [CVPR2020] Cascaded Refinement Network for Point Cloud Completion [github]
  • [CVPR2020] Anisotropic Convolutional Networks for 3D Semantic Scene Completion [github]
  • [AAAI2020] Attention-based Multi-modal Fusion Network for Semantic Scene Completion
  • [CVPR2020] 3D Sketch-aware Semantic Scene Completion via Semi-supervised Structure Prior [github]
  • [ECCV2020] Multimodal Shape Completion via Conditional Generative Adversarial Networks [pytorch]
  • [CVPR2020] RevealNet: Seeing Behind Objects in RGB-D Scans
  • [CVPR2020] Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
  • [CVPR2020] PF-Net: Point Fractal Network for 3D Point Cloud Completion
  • [Arxiv] 3D Gated Recurrent Fusion for Semantic Scene Completion
  • [ICCVW2019] EdgeConnect: Structure Guided Image Inpainting using Edge Prediction [pytorch] 🔥⭐
  • [ICRA2020] Depth Based Semantic Scene Completion with Position Importance Aware Loss
  • [CVPR2020] SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans
  • [Arxiv] PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes
  • [ICLR2020] Unpaired Point Cloud Completion on Real Scans using Adversarial Training [tensorflow]
  • [AAAI2020] Morphing and Sampling Network for Dense Point Cloud Completion [pytorch]
  • [ICCVW2019] Render4Completion: Synthesizing Multi-View Depth Maps for 3D Shape Completion
  • [ICCV2019] ForkNet: Multi-branch Volumetric Semantic Completion from a Single Depth Image [tensorflow]
  • [ICCV2019] Cascaded Context Pyramid for Full-Resolution 3D Semantic Scene Completion [Caffe3D]
  • [ICCV2019] Multi-Angle Point Cloud-VAE: Unsupervised Feature Learning for 3D Point Clouds from Multiple Angles by Joint Self-Reconstruction and Half-to-Half Prediction
  • [Arxiv] EdgeNet: Semantic Scene Completion from RGB-D images
  • [CVPR2019] TopNet: Structural Point Cloud Decoder [pytorch & tensorflow]
  • [CVPR2019] Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image
  • [CVPR2019] Leveraging Shape Completion for 3D Siamese Tracking [pytorch]
  • [CVPR2019] RL-GAN-Net: A Reinforcement Learning Agent Controlled GAN Network for Real-Time Point Cloud Shape Completion [pytorch]
  • [3DV2018] PCN: Point Completion Network [tensorflow] 🔥
  • [ECCV2018] Efficient Semantic Scene Completion Network with Spatial Group Convolution [pytorch]
  • [CVPR2018] ScanComplete: Large-Scale Scene Completion and Semantic Segmentation for 3D Scans [tensorflow] 🔥⭐
  • [CVPR2018] Learning 3D Shape Completion from Laser Scan Data with Weak Supervision [torch][torch]
  • [IJCV2018] Learning 3D Shape Completion under Weak Supervision [torch][torch]
  • [ICCV2017] High-Resolution Shape Completion Using Deep Neural Networks for Global Structure and Local Geometry Inference ⭐
  • [ICCV2017] Shape Completion using 3D-Encoder-Predictor CNNs and Shape Synthesis [torch] 🔥⭐
  • [CVPR2017] Semantic Scene Completion from a Single Depth Image [caffe] 🔥⭐
  • [CVPR2016] Structured Prediction of Unobserved Voxels From a Single Depth Image [resource] ⭐

Shape Reconstruction & Generation

  • [Arxiv] PODIA-3D: Domain Adaptation of 3D Generative Model Across Large Domain Gap Using Pose-Preserved Text-to-Image Diffusion [Project]
  • [Arxiv] 3D-aware Image Generation using 2D Diffusion Models [Project]
  • [Arxiv] HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion [Project]
  • [Arxiv] DITTO-NeRF: Diffusion-based Iterative Text To Omni-directional 3D Model [Project]
  • [Arxiv] Make-It-3D: High-Fidelity 3D Creation from A Single Image with Diffusion Prior [Project]
  • [Arxiv] RealFusion: 360° Reconstruction of Any Object from a Single Image [Project]
  • [Arxiv] 3DGen: Triplane Latent Diffusion for Textured Mesh Generation
  • [Arxiv] Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation [Project]
  • [CVPR2023] Controllable Mesh Generation Through Sparse Latent Point Diffusion Models [Project]
  • [CVPR2023] NEF: Neural Edge Fields for 3D Parametric Curve Reconstruction from Multi-view Images [Project]
  • [ICLR2023] MeshDiffusion: Score-based Generative 3D Mesh Modeling [Project]
  • [CVPR2023] PartNeRF: Generating Part-Aware Editable 3D Shapes without 3D Supervision [Project]
  • [Arxiv] Instruct-NeRF2NeRF: Editing 3D Scenes with Instructions [Project]
  • [CVPR2023] SINE: Semantic-driven Image-based NeRF Editing with Prior-guided Editing Field [Project]
  • [Arxiv] NeuDA: Neural Deformable Anchor for High-Fidelity Implicit Surface Reconstruction
  • [Arxiv] Delicate Textured Mesh Recovery from NeRF via Adaptive Surface Refinement [Project]
  • [Arxiv] 3D generation on ImageNet [Project]
  • [Arxiv] Text-driven Visual Synthesis with Latent Diffusion Prior [Project]
  • [Arxiv] VQ3D: Learning a 3D-Aware Generative Model on ImageNet [Project]
  • [Arxiv] TEXTure: Text-Guided Texturing of 3D Shapes [Project]
  • [Arxiv] LEGO-Net: Learning Regular Rearrangements of Objects in Rooms [Project]
  • [Arxiv] DisCoScene: Spatially Disentangled Generative Radiance Fields for Controllable 3D-aware Scene Synthesis [Project]
  • [Arxiv] GeoCode: Interpretable Shape Programs [Project]
  • [Arxiv] Dream3D: Zero-Shot Text-to-3D Synthesis Using 3D Shape Prior and Text-to-Image Diffusion Models [Project]
  • [Arxiv] Point-E: A System for Generating 3D Point Clouds from Complex Prompts [Project]
  • [Arxiv] LoopDraw: a Loop-Based Autoregressive Model for Shape Synthesis and Editing
  • [Arxiv] SDFusion: Multimodal 3D Shape Completion, Reconstruction, and Generation [Project]
  • [Arxiv] NeRDi: Single-View NeRF Synthesis with Language-Guided Diffusion as General Image Priors
  • [Arxiv] Diffusion-SDF: Text-to-Shape via Voxelized Diffusion [Project]
  • [Arxiv] 3D-LDM: Neural Implicit 3D Shape Generation with Latent Diffusion Models
  • [Arxiv] Score Jacobian Chaining: Lifting Pretrained 2D Diffusion Models for 3D Generation [Project]
  • [Arxiv] SparseFusion: Distilling View-conditioned Diffusion for 3D Reconstruction [Project]
  • [Arxiv] 3D Neural Field Generation using Triplane Diffusion [Project]
  • [Arxiv] Neural Volumetric Mesh Generator
  • [Arxiv] Tetrahedral Diffusion Models for 3D Shape Generation
  • [Arxiv] MagicPony: Learning Articulated 3D Animals in the Wild [Project]
  • [Arxiv] RenderDiffusion: Image Diffusion for 3D Reconstruction, Inpainting and Generation [Project]
  • [Arxiv] Magic3D: High-Resolution Text-to-3D Content Creation [Project]
  • [Arxiv] Latent-NeRF for Shape-Guided Generation of 3D Shapes and Textures
  • [NeurIPS2022] LION: Latent Point Diffusion Models for 3D Shape Generation [Project]
  • [NeurIPS2022] GET3D: A Generative Model of High Quality 3D Textured Shapes Learned from Images [Project]
  • [ECCV2022] Cross-Modal 3D Shape Generation and Manipulation [Project]
  • [ECCV2022] Deforming Radiance Fields with Cages
  • [NeurIPS2021] NeRS: Neural Reflectance Surfaces for Sparse-view 3D Reconstruction in the Wild [Project]
  • [CVPR2022] CLIP-Forge: Towards Zero-Shot Text-to-Shape Generation [github]
  • [CVPR2022] Multi-View Mesh Reconstruction with Neural Deferred Shading [Project]
  • [Arxiv] Neural Surface Reconstruction of Dynamic Scenes with Monocular RGB-D Camera [Project]
  • [Arxiv] Unbiased 4D: Monocular 4D Reconstruction with a Neural Deformation Model
  • [Arxiv] 3DILG: Irregular Latent Grids for 3D Generative Modeling [Project]
  • [CVPR2022] FvOR: Robust Joint Shape and Pose Optimization for Few-view Object Reconstruction [Project]
  • [CVPR2022] Topologically-Aware Deformation Fields for Single-View 3D Reconstruction [Project]
  • [Arxiv] Planes vs. Chairs: Category-guided 3D shape learning without any 3D cues [Project]
  • [Arxiv] Neural Vector Fields for Surface Representation and Inference
  • [CVPR2022] Pre-train, Self-train, Distill: A simple recipe for Supersizing 3D Reconstruction [Project]
  • [CVPR2022] BARC: Learning to Regress 3D Dog Shape from Images by Exploiting Breed Information [Project]
  • [CVPR2022] φ-SfT: Shape-from-Template with a Physics-Based Deformation Model [Project]
  • [CVPR2022] OcclusionFusion: Occlusion-aware Motion Estimation for Real-time Dynamic 3D Reconstruction [Project]
  • [Arxiv] Neural Dual Contouring
  • [Arxiv] POCO: Point Convolution for Surface Reconstruction [Project]
  • [ICCV2021] SurfGen: Adversarial 3D Shape Synthesis with Explicit Surface Discriminators [github]

Before 2022

  • [Arxiv] DoodleFormer: Creative Sketch Drawing with Transformers
  • [NeurIPS2021] Class-agnostic Reconstruction of Dynamic Objects from Videos [Project]
  • [Arxiv] The Shape Part Slot Machine: Contact-based Reasoning for Generating 3D Shapes from Parts
  • [Arxiv] MeshUDF: Fast and Differentiable Meshing of Unsigned Distance Field Networks [github]
  • [Arxiv] TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers [github]
  • [Arxiv] JoinABLe: Learning Bottom-up Assembly of Parametric CAD Joints
  • [Arxiv] Image Based Reconstruction of Liquids from 2D Surface Detections
  • [Arxiv] TaylorImNet for Fast 3D Shape Reconstruction Based on Implicit Surface Function
  • [NeurIPS2021] Deep Marching Tetrahedra: a Hybrid Representation for High-Resolution 3D Shape Synthesis [Project]
  • [ICML2021] Neural-Pull: Learning Signed Distance Functions from Point Clouds by Learning to Pull Space onto Surfaces [tensorflow]
  • [Arxiv] StyleSDF: High-Resolution 3D-Consistent Image and Geometry Generation [Project]
  • [3DV2021] High Fidelity 3D Reconstructions with Limited Physical Views [Project]
  • [3DV2021] Multi-Category Mesh Reconstruction From Image Collections [github]
  • [Arxiv] Style Agnostic 3D Reconstruction via Adversarial Style Transfer [https://github.com/Felix-Petersen/style-agnostic-3d-reconstruction]
  • [Arxiv] BANMo: Building Animatable 3D Neural Models from Many Casual Videos [Project]
  • [Arxiv] EditVAE: Unsupervised Part-Aware Controllable 3D Point Cloud Shape Generation
  • [Arxiv] Differentiable Stereopsis: Meshes from multiple views using differentiable rendering [Project]
  • [ICCV2021] Neural Strokes: Stylized Line Drawing of 3D Shapes
  • [ACMMM2021] Single Image 3D Object Estimation with Primitive Graph Networks
  • [Arxiv] Octree Transformer: Autoregressive 3D Shape Generation on Hierarchically Structured Sequences
  • [Arxiv] ABO: Dataset and Benchmarks for Real-World 3D Object Understanding [Project]
  • [ICCV2021] Common Objects in 3D: Large-Scale Learning and Evaluation of Real-life 3D Category Reconstruction [github]
  • [Arxiv] Learnable Triangulation for Deep Learning-based 3D Reconstruction of Objects of Arbitrary Topology from Single RGB Images
  • [ICCV2021] Learning Signed Distance Field for Multi-view Surface Reconstruction
  • [Arxiv] Image2Lego: Customized LEGO Set Generation from Images
  • [ICCV2021] Unsupervised Learning of Fine Structure Generation for 3D Point Clouds by 2D Projection Matching [github]
  • [Arxiv] Object Wake-up: 3-D Object Reconstruction, Animation, and in-situ Rendering from a Single Image
  • [Arxiv] DOVE: Learning Deformable 3D Objects by Watching Videos [Project]
  • [Arxiv] Active 3D Shape Reconstruction from Vision and Touch
  • [NeurIPS2020] 3D Shape Reconstruction from Vision and Touch [pytorch]
  • [Arxiv] LegoFormer: Transformers for Block-by-Block Multi-view 3D Reconstruction
  • [Arxiv] Shape from Blur: Recovering Textured 3D Shape and Motion of Fast Moving Objects
  • [Arxiv] View Generalization for Single Image Textured 3D Models [Project]
  • [Arxiv] Shape As Points: A Differentiable Poisson Solver
  • [Arxiv] Neural Implicit 3D Shapes from Single Images with Spatial Patterns
  • [IJCAI2021] Spline Positional Encoding for Learning 3D Implicit Signed Distance Fields
  • [Arxiv] Z2P: Instant Rendering of Point Clouds
  • [CVPR2021] Multi-view 3D Reconstruction of a Texture-less Smooth Surface of Unknown Generic Reflectance
  • [CVPR2021] Birds of a Feather: Capturing Avian Shape Models from Images [Project]
  • [Arxiv] DeepCAD: A Deep Generative Network for Computer-Aided Design Models
  • [Arxiv] StrobeNet: Category-Level Multiview Reconstruction of Articulated Objects
  • [CVPR2021] Sketch2Model: View-Aware 3D Modeling from Single Free-Hand Sketches
  • [Arxiv] Sign-Agnostic CONet: Learning Implicit Surface Reconstructions by Sign-Agnostic Optimization of Convolutional Occupancy Networks
  • [IJCAI2021] PointLIE: Locally Invertible Embedding for Point Cloud Sampling and Recovery
  • [Arxiv] UNISURF: Unifying Neural Implicit Surfaces and Radiance Fields for Multi-View Reconstruction
  • [CVPR2021] Shape and Material Capture at Home
  • [CVPR2021] StereoPIFu: Depth Aware Clothed Human Digitization via Stereo Vision [Project]
  • [Arxiv] CAPRI-Net: Learning Compact CAD Shapes with Adaptive Primitive Assembly
  • [CVPR2021] Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction [Project]
  • [CVPR2021] Online Learning of a Probabilistic and Adaptive Scene Representation
  • [CVPR2021] Fostering Generalization in Single-view 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors
  • [Arxiv] Sketch2Mesh: Reconstructing and Editing 3D Shapes from Sketches
  • [CVPR2021] Deep Implicit Moving Least-Squares Functions for 3D Reconstruction [Project]
  • [Arxiv] PC2WF: 3D Wireframe Reconstruction from Raw Point Clouds
  • [CVPR2021] Diffusion Probabilistic Models for 3D Point Cloud Generation [Project]
  • [Arxiv] ShaRF: Shape-conditioned Radiance Fields from a Single View [Project]
  • [Arxiv] Shelf-Supervised Mesh Prediction in the Wild
  • [Arxiv] HyperPocket: Generative Point Cloud Completion
  • [Arxiv] Im2Vec: Synthesizing Vector Graphics without Vector Supervision [resource]
  • [Arxiv] Secrets of 3D Implicit Object Shape Reconstruction in the Wild
  • [Arxiv] Joint Learning of 3D Shape Retrieval and Deformation
  • [Arxiv] Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes

Before 2021

  • [Arxiv] Learning Delaunay Surface Elements for Mesh Reconstruction
  • [Arxiv] Compositionally Generalizable 3D Structure Prediction
  • [Arxiv] Online Adaptation for Consistent Mesh Reconstruction in the Wild
  • [Arxiv] Sign-Agnostic Implicit Learning of Surface Self-Similarities for Shape Modeling and Reconstruction from Raw Point Clouds
  • [Arxiv] Deep Optimized Priors for 3D Shape Modeling and Reconstruction
  • [Arxiv] Do 2D GANs Know 3D Shape? Unsupervised 3D Shape Reconstruction from 2D Image GANs [Project]
  • [Arxiv] DUDE: Deep Unsigned Distance Embeddings for Hi-Fidelity Representation of Complex 3D Surfaces
  • [3DV2020] Learning to Infer Semantic Parameters for 3D Shape Editing [Project]
  • [3DV2020] Cycle-Consistent Generative Rendering for 2D-3D Modality Translation [Project]
  • [3DV2020] A Divide et Impera Approach for 3D Shape Reconstruction from Multiple Views
  • [Arxiv] A Closed-Form Solution to Local Non-Rigid Structure-from-Motion
  • [Arxiv] Deformed Implicit Field: Modeling 3D Shapes with Learned Dense Correspondence
  • [Arxiv] D-NeRF: Neural Radiance Fields for Dynamic Scenes
  • [Arxiv] Modular Primitives for High-Performance Differentiable Rendering
  • [CVPR2021] NeuralFusion: Online Depth Fusion in Latent Space
  • [Arxiv] Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Deforming Scene from Monocular Video [Project]
  • [NeurIPS2020] Continuous Object Representation Networks: Novel View Synthesis without Target View Supervision [Project]
  • [NeurIPS2020] SDF-SRN: Learning Signed Distance 3D Object Reconstruction from Static Images [Project]
  • [NeurIPS2020] Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance [Project]
  • [NeurIPS2020] Convolutional Generation of Textured 3D Meshes [Project]
  • [Arxiv] Vid2CAD: CAD Model Alignment using Multi-View Constraints from Videos
  • [NeurIPS2020] UCLID-Net: Single View Reconstruction in Object Space [Project]
  • [NeurIPS2020] CaSPR: Learning Canonical Spatiotemporal Point Cloud Representations [Project]
  • [NeurIPS2020] Generative 3D Part Assembly via Dynamic Graph Learning [pytorch]
  • [NeurIPS2020] Learning Deformable Tetrahedral Meshes for 3D Reconstruction [Project]
  • [NeurIPS2020] SoftFlow: Probabilistic Framework for Normalizing Flow on Manifolds [pytorch]
  • [Arxiv] Training Data Generating Networks: Linking 3D Shapes and Few-Shot Classification
  • [Arxiv] MeshMVS: Multi-View Stereo Guided Mesh Reconstruction
  • [Arxiv] Learning Occupancy Function from Point Clouds for Surface Reconstruction
  • [Arxiv] GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering [github]
  • [3DV2020] A Progressive Conditional Generative Adversarial Network for Generating Dense and Colored 3D Point Clouds
  • [3DV2020] Better Patch Stitching for Parametric Surface Reconstruction
  • [NeurIPS2020] Skeleton-bridged Point Completion: From Global Inference to Local Adjustment [Project Page]
  • [Arxiv] NeRF++: Analyzing and Improving Neural Radiance Fields [pytorch]
  • [Arxiv] Improved Modeling of 3D Shapes with Multi-view Depth Maps
  • [SIGGRAPH2020] One Shot 3D Photography [Project]
  • [BMVC2020] Large Scale Photometric Bundle Adjustment
  • [ECCV2020] Interactive Annotation of 3D Object Geometry using 2D Scribbles [Project]
  • [BMVC2020] Visibility-aware Multi-view Stereo Network
  • [ECCV2020] Pix2Surf: Learning Parametric 3D Surface Models of Objects from Images
  • [ECCV2020] 3D Bird Reconstruction: a Dataset, Model, and Shape Recovery from a Single View [Project][Pytorch]
  • [BMVC2020] 3D-GMNet: Single-View 3D Shape Recovery as A Gaussian Mixture
  • [SIGGRAPH2020] Self-Sampling for Neural Point Cloud Consolidation
  • [ECCV2020] Stochastic Bundle Adjustment for Efficient and Scalable 3D Reconstruction [github]
  • [Arxiv] NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections [Project]
  • [Arxiv] MeshODE: A Robust and Scalable Framework for Mesh Deformation
  • [Arxiv] MRGAN: Multi-Rooted 3D Shape Generation with Unsupervised Part Disentanglement
  • [ECCV2020] Meshing Point Clouds with Predicted Intrinsic-Extrinsic Ratio Guidance [pytorch]
  • [ECCV2020] Who Left the Dogs Out? 3D Animal Reconstruction with Expectation Maximization in the Loop
  • [ECCV2020] Dense Hybrid Recurrent Multi-view Stereo Net with Dynamic Consistency Checking
  • [ECCV2020] Shape and Viewpoint without Keypoints
  • [Arxiv] Object-Centric Multi-View Aggregation
  • [ECCV2020] Points2Surf: Learning Implicit Surfaces from Point Clouds
  • [NeurIPS2020] Neural Mesh Flow: 3D Manifold Mesh Generation via Diffeomorphic Flows [Project]
  • [Arxiv] Pix2Vox++: Multi-scale Context-aware 3D Object Reconstruction from Single and Multiple Images
  • [Arxiv] Neural Non-Rigid Tracking
  • [NeurIPS2020] MeshSDF: Differentiable Iso-Surface Extraction
  • [Arxiv] 3D Reconstruction of Novel Object Shapes from Single Images
  • [NeurIPS2020] ShapeFlow: Learnable Deformations Among 3D Shapes [pytorch]
  • [Arxiv] 3D Shape Reconstruction from Free-Hand Sketches
  • [Arxiv] Convolutional Occupancy Networks
  • [SIGGRAPH2020] Point2Mesh: A Self-Prior for Deformable Meshes
  • [Arxiv] PointTriNet: Learned Triangulation of 3D Point Sets
  • [Arxiv] A Simple and Scalable Shape Representation for 3D Reconstruction
  • [SIGGRAPH2020] Vid2Curve: Simultaneous Camera Motion Estimation and Thin Structure Reconstruction from an RGB Video
  • [CVPR2020] From Image Collections to Point Clouds with Self-supervised Shape and Pose Networks [tensorflow]
  • [CVPR2020] Through the Looking Glass: Neural 3D Reconstruction of Transparent Shapes [github]
  • [Arxiv] PolyGen: An Autoregressive Generative Model of 3D Meshes
  • [Arxiv] Combinatorial 3D Shape Generation via Sequential Assembly
  • [Arxiv] Few-Shot Single-View 3-D Object Reconstruction with Compositional Priors
  • [Arxiv] Neural Object Descriptors for Multi-View Shape Reconstruction
  • [CVPR2020] SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings [pytorch]
  • [Arxiv] Modeling 3D Shapes by Reinforcement Learning
  • [ECCV2020] ParSeNet: A Parametric Surface Fitting Network for 3D Point Clouds [pytorch]
  • [Arxiv] Self-Supervised 2D Image to 3D Shape Translation with Disentangled Representations
  • [Arxiv] Universal Differentiable Renderer for Implicit Neural Representations
  • [Arxiv] Learning 3D Part Assembly from a Single Image
  • [Arxiv] Curriculum DeepSDF
  • [Arxiv] PT2PC: Learning to Generate 3D Point Cloud Shapes from Part Tree Conditions
  • [Arxiv] Self-supervised Single-view 3D Reconstruction via Semantic Consistency
  • [Arxiv] Meta3D: Single-View 3D Object Reconstruction from Shape Priors in Memory
  • [Arxiv] STD-Net: Structure-preserving and Topology-adaptive Deformation Network for 3D Reconstruction from a Single Image [new]
  • [Arxiv] Curvature Regularized Surface Reconstruction from Point Cloud
  • [Arxiv] Hypernetwork approach to generating point clouds
  • [Arxiv] Inverse Graphics GAN: Learning to Generate 3D Shapes from Unstructured 2D Data
  • [Arxiv] Meshlet Priors for 3D Mesh Reconstruction
  • [Arxiv] Front2Back: Single View 3D Shape Reconstruction via Front to Back Prediction
  • [Arxiv] SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization
  • [CVPR2019] Occupancy Networks: Learning 3D Reconstruction in Function Space [pytorch] 🔥⭐
  • [NeurIPS2019] DISN: Deep Implicit Surface Network for High-quality Single-view 3D Reconstruction [tensorflow]
  • [NeurIPS2019] Learning to Infer Implicit Surfaces without 3D Supervision
  • [CVPR2019] A Skeleton-bridged Deep Learning Approach for Generating Meshes of Complex Topologies from Single RGB Images [pytorch & tensorflow]
  • [Arxiv] Deep Level Sets: Implicit Surface Representations for 3D Shape Inference
  • [CVPR2019] Learning Implicit Fields for Generative Shape Modeling [tensorflow] 🔥
  • [ICCV2019] Point-based Multi-view Stereo Network [pytorch] ⭐
  • [Arxiv] TSRNet: Scalable 3D Surface Reconstruction Network for Point Clouds using Tangent Convolution
  • [Arxiv] DR-KFD: A Differentiable Visual Metric for 3D Shape Reconstruction
  • [ICCV2019] GraphX-Convolution for Point Cloud Deformation in 2D-to-3D Conversion
  • [ICCV2019] Pixel2Mesh++: Multi-View 3D Mesh Generation via Deformation [pytorch]
  • [ICCV2019] Few-Shot Generalization for Single-Image 3D Reconstruction via Priors
  • [ICCV2019] Deep Mesh Reconstruction from Single RGB Images via Topology Modification Networks
  • [AAAI2018] Learning Efficient Point Cloud Generation for Dense 3D Object Reconstruction [tensorflow] ⭐🔥
  • [NeurIPS2017] MarrNet: 3D Shape Reconstruction via 2.5D Sketches [torch] ⭐🔥

3D Scene Understanding

  • [Arxiv] CLIP-FO3D: Learning Free Open-world 3D Scene Representations from 2D Dense CLIP
  • [CVPR2023] Learning 3D Scene Priors with 2D Supervision [Project]
  • [CVPR2023] Mask3D: Pre-training 2D Vision Transformers by Learning Masked 3D Priors
  • [Arxiv] Decoupling Human and Camera Motion from Videos in the Wild [Project]
  • [CVPR2022] PhotoScene: Photorealistic Material and Lighting Transfer for Indoor Scenes [github]
  • [Arxiv] Semantic Instance Segmentation of 3D Scenes Through Weak Bounding Box Supervision [Project]
  • [CVPR2022] Learning Multi-View Aggregation In the Wild for Large-Scale 3D Semantic Segmentation [github]
  • [CVPR2022] 3D-SPS: Single-Stage 3D Visual Grounding via Referred Point Progressive Selection
  • [CVPR2022] BEHAVE: Dataset and Method for Tracking Human Object Interactions [Project]

Before 2022

  • [Arxiv] Transferable End-to-end Room Layout Estimation via Implicit Encoding [Project]
  • [Arxiv] ScanQA: 3D Question Answering for Spatial Scene Understanding
  • [Arxiv] 3D Question Answering
  • [Arxiv] MVLayoutNet: 3D layout reconstruction with multi-view panoramas
  • [SGP2021] Roominoes: Generating Novel 3D Floor Plans From Existing 3D Rooms
  • [Arxiv] 4DContrast: Contrastive Learning with Dynamic Correspondences for 3D Scene Understanding
  • [Arxiv] Pose2Room: Understanding 3D Scenes from Human Activities [Project]
  • [NeurIPS2021] SEAL: Self-supervised Embodied Active Learning using Exploration and 3D Consistency [Project]
  • [Arxiv] D3Net: A Speaker-Listener Architecture for Semi-supervised Dense Captioning and Visual Grounding in RGB-D Scans [Project]
  • [Arxiv] Recognizing Scenes from Novel Viewpoints
  • [Arxiv] Putting 3D Spatially Sparse Networks on a Diet
  • [Arxiv] Cerberus Transformer: Joint Semantic, Affordance and Attribute Parsing [github]
  • [NeurIPS2021] Neural Scene Flow Prior [github]
  • [ICCV2021] Structured Bird's-Eye-View Traffic Scene Understanding from Onboard Images [Project]
  • [Arxiv] RoomStructNet: Learning to Rank Non-Cuboidal Room Layouts From Single View
  • [EMNLP2021] Language-Aligned Waypoint (LAW) Supervision for Vision-and-Language Navigation in Continuous Environments [Project]
  • [Arxiv] KITTI-360: A Novel Dataset and Benchmarks for Urban Scene Understanding in 2D and 3D [Project]
  • [CVPR2021] OpenRooms: An End-to-End Open Framework for Photorealistic Indoor Scene Datasets [github]
  • [Arxiv] Pointly-supervised 3D Scene Parsing with Viewpoint Bottleneck [github]
  • [TPAMI2021] Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR-based Perception [github]
  • [Arxiv] PQ-Transformer: Jointly Parsing 3D Objects and Layouts from Point Clouds [github]
  • [Arxiv] Residual 3D Scene Flow Learning with Context-Aware Feature Extraction
  • [ICCV2021] Learning to Generate Scene Graph from Natural Language Supervision [github]
  • [ICCV2021] The Surprising Effectiveness of Visual Odometry Techniques for Embodied PointGoal Navigation [Project]
  • [ICCV2021] Graph-to-3D: End-to-End Generation and Manipulation of 3D Scenes Using Scene Graphs
  • [ICCV2021] PICCOLO: Point Cloud-Centric Omnidirectional Localization
  • [ICCV2021] Unconditional Scene Graph Generation
  • [Arxiv] Learning Indoor Layouts from Simple Point-Clouds
  • [Arxiv] LanguageRefer: Spatial-Language Model for 3D Visual Grounding
  • [Arxiv] WiCluster: Passive Indoor 2D/3D Positioning using WiFi without Precise Labels
  • [CVPR2021] Zillow Indoor Dataset: Annotated Floor Plans With 360deg Panoramas and 3D Room Layouts [github]
  • [ICRA2021] Efficient and Robust LiDAR-Based End-to-End Navigation [Project]
  • [ICLR2021] VTNet: Visual Transformer Network for Object Goal Navigation
  • [CVPR2021] Self-Point-Flow: Self-Supervised Scene Flow Estimation from Point Clouds with Optimal Transport and Random Walk
  • [CVPR2021] HCRF-Flow: Scene Flow from Point Clouds with Continuous High-order CRFs and Position-aware Flow Embedding
  • [Arxiv] FloorPlanCAD: A Large-Scale CAD Drawing Dataset for Panoptic Symbol Spotting
  • [Arxiv] SCTN: Sparse Convolution-Transformer Network for Scene Flow Estimation
  • [Arxiv] Collision Replay: What Does Bumping Into Things Tell You About Scene Geometry? [Project]
  • [Arxiv] Pri3D: Can 3D Priors Help 2D Representation Learning?
  • [Arxiv] LaLaLoc: Latent Layout Localisation in Dynamic, Unvisited Environments
  • [CVPRW] OmniLayout: Room Layout Reconstruction from Indoor Spherical Panoramas [github]
  • [Arxiv] Learning to Reconstruct 3D Non-Cuboid Room Layout from a Single RGB Image [pytorch]
  • [Arxiv] SQN: Weakly-Supervised Semantic Segmentation of Large-Scale 3D Point Clouds with 1000× Fewer Labels [github]
  • [CVPR2021] FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds
  • [CVPR2021] Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud [github]
  • [ICRA] Reconstructing Interactive 3D Scenes by Panoptic Mapping and CAD Model Alignments [Project]
  • [Arxiv] Contextual Scene Augmentation and Synthesis via GSACNet
  • [Arxiv] In-Place Scene Labelling and Understanding with Implicit Scene Representation
  • [CVPR2021] Bidirectional Projection Network for Cross Dimension Scene Understanding [github]
  • [Arxiv] Free-form Description Guided 3D Visual Graph Network for Object Grounding in Point Cloud [github]
  • [CVPR2021] Visual Room Rearrangement [Project]
  • [Arxiv] MonteFloor: Extending MCTS for Reconstructing Accurate Large-Scale Floor Plans
  • [Arxiv] Structured Scene Memory for Vision-Language Navigation
  • [Arxiv] House-GAN++: Generative Adversarial Layout Refinement Networks
  • [Arxiv] Weakly Supervised Learning of Rigid 3D Scene Flow
  • [ICLR2021] End-to-End Egospheric Spatial Memory
  • [Arxiv] Single-Shot Cuboids: Geodesics-based End-to-end Manhattan Aligned Layout Estimation from Spherical Panoramas [Project]
  • [Arxiv] A modular vision language navigation and manipulation framework for long horizon compositional tasks in indoor environment
  • [Arxiv] Deep Reinforcement Learning for Producing Furniture Layout in Indoor Scenes
  • [Arxiv] Where2Act: From Pixels to Actions for Articulated 3D Objects [Project]

Before 2021

  • [Arxiv] PanopticFusion: Online Volumetric Semantic Mapping at the Level of Stuff and Things
  • [Arxiv] AI2-THOR: An Interactive 3D Environment for Visual AI [Project]
  • [Arxiv] Audio-Visual Floorplan Reconstruction
  • [Arxiv] PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
  • [Arxiv] RAFT-3D: Scene Flow using Rigid-Motion Embeddings
  • [Arxiv] GenScan: A Generative Method for Populating Parametric 3D Scan Datasets
  • [Arxiv] LayoutGMN: Neural Graph Matching for Structural Layout Similarity
  • [Arxiv] Seeing Behind Objects for 3D Multi-Object Tracking in RGB-D Sequences
  • [Arxiv] P4Contrast: Contrastive Learning with Pairs of Point-Pixel Pairs for RGB-D Scene Understanding
  • [Arxiv] Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
  • [Arxiv] Localising In Complex Scenes Using Balanced Adversarial Adaptation
  • [Arxiv] Efficient RGB-D Semantic Segmentation for Indoor Scene Analysis
  • [NeurIPS2020] Multi-Plane Program Induction with 3D Box Priors [Project]
  • [Arxiv] HoHoNet: 360 Indoor Holistic Understanding with Latent Horizontal Features
  • [Arxiv] Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts
  • [Arxiv] Generative Layout Modeling using Constraint Graphs
  • [NeurIPS2020] Rel3D: A Minimally Contrastive Benchmark for Grounding Spatial Relations in 3D [pytorch]
  • [NeurIPS2020] Learning Affordance Landscapes for Interaction Exploration in 3D Environments [Project]
  • [NeurIPS2020W] Unsupervised Domain Adaptation for Visual Navigation
  • [Arxiv] Embodied Visual Navigation with Automatic Curriculum Learning in Real Environments
  • [Arxiv] 3D Room Layout Estimation Beyond the Manhattan World Assumption
  • [Arxiv] OpenBot: Turning Smartphones into Robots [Project]
  • [Arxiv] Audio-Visual Waypoints for Navigation
  • [Arxiv] Learning Affordance Landscapes for Interaction Exploration in 3D Environments [Project]
  • [ECCV2020] Occupancy Anticipation for Efficient Exploration and Navigation [Project]
  • [Arxiv] Retargetable AR: Context-aware Augmented Reality in Indoor Scenes based on 3D Scene Graph
  • [Arxiv] Generating Person-Scene Interactions in 3D Scenes
  • [Arxiv] GeoLayout: Geometry Driven Room Layout Estimation Based on Depth Maps of Planes
  • [ECCV2020] ReferIt3D: Neural Listeners for Fine-Grained 3D Object Identification in Real-World Scenes
  • [Arxiv] Structural Plan of Indoor Scenes with Personalized Preferences
  • [Arxiv] HoliCity: A City-Scale Data Platform for Learning Holistic 3D Structures [Project]
  • [CVPR2020] End-to-End Optimization of Scene Layout [Project]
  • [Arxiv] Improving Target-driven Visual Navigation with Attention on 3D Spatial Relationships
  • [CVPR2020] Learning 3D Semantic Scene Graphs from 3D Indoor Reconstructions
  • [Arxiv] LayoutMP3D: Layout Annotation of Matterport3D
  • [CVPR2020] Local Implicit Grid Representations for 3D Scenes
  • [Arxiv] Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes
  • [CVPR2020] RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds [tensorflow] 🔥
  • [CVPR2020] Intelligent Home 3D: Automatic 3D-House Design from Linguistic Descriptions Only
  • [ICRA2020] 3DCFS: Fast and Robust Joint 3D Semantic-Instance Segmentation via Coupled Feature Selection
  • [Arxiv] Indoor Scene Recognition in 3D
  • [Journal] Dark, Beyond Deep: A Paradigm Shift to Cognitive AI with Humanlike Common Sense
  • [Arxiv] BlockGAN: Learning 3D Object-aware Scene Representations from Unlabelled Images
  • [Arxiv] 3D Dynamic Scene Graphs: Actionable Spatial Perception with Places, Objects, and Humans [Project] Related: [Arxiv] [Arxiv]
  • [ICCV2019] U4D: Unsupervised 4D Dynamic Scene Understanding
  • [ICCV2019] UprightNet: Geometry-Aware Camera Orientation Estimation from Single Images
  • [ICCV2019] Habitat: A Platform for Embodied AI Research [habitat-api] [habitat-sim] ⭐
  • [ICCV2019] SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences [project page] ⭐
  • [ICCV2019] Neural Inverse Rendering of an Indoor Scene From a Single Image
  • [ICCV2019] SceneGraphNet: Neural Message Passing for 3D Indoor Scene Augmentation [pytorch]
  • [ICCV2019] RIO: 3D Object Instance Re-Localization in Changing Indoor Environments [dataset]
  • [ICCV2019] CamNet: Coarse-to-Fine Retrieval for Camera Re-Localization
  • [NeurIPS2018] Learning to Exploit Stability for 3D Scene Parsing

3D Scene Reconstruction & Generation

  • [CVPR2023] Neuralangelo: High-Fidelity Neural Surface Reconstruction [Project]
  • [Arxiv] Instruct 3D-to-3D: Text Instruction Guided 3D-to-3D conversion [Project]
  • [Arxiv] FastSurf: Fast Neural RGB-D Surface Reconstruction using Per-Frame Intrinsic Refinement and TSDF Fusion Prior Learning
  • [CVPR2023] I$^2$-SDF: Intrinsic Indoor Scene Reconstruction and Editing via Raytracing in Neural SDFs [Project]
  • [Arxiv] CC3D: Layout-Conditioned Generation of Compositional 3D Scenes [Project]
  • [Arxiv] RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction
  • [Arxiv] Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation
  • [Arxiv] Text2Room: Extracting Textured 3D Meshes from 2D Text-to-Image Models [Project]
  • [Arxiv] Compositional 3D Scene Generation using Locally Conditioned Diffusion [Project]
  • [Arxiv] Set-the-Scene: Global-Local Training for Generating Controllable NeRF Scenes [Project]
  • [BMVC2022] SPARC: Sparse Render-and-Compare for CAD model alignment in a single RGB image [github]
  • [Arxiv] NICER-SLAM: Neural Implicit Scene Encoding for RGB SLAM
  • [Arxiv] Text-To-4D Dynamic Scene Generation
  • [Arxiv] Behind the Scenes: Density Fields for Single View Reconstruction [Project]
  • [Arxiv] MIME: Human-Aware 3D Scene Generation [Project]
  • [CVPR2022] PlaneMVS: 3D Plane Reconstruction from Multi-View Stereo
  • [CVPR2022] Neural 3D Scene Reconstruction with the Manhattan-world Assumption [Project]
  • [CVPR2022] 3D Scene Painting via Semantic Image Synthesis
  • [Siggraph2022] SNeRF: Stylized Neural Implicit Representations for 3D Scenes [Project]
  • [Siggraph2022] Neural 3D Reconstruction in the Wild [Project]
  • [Arxiv] GO-Surf: Neural Feature Grid Optimization for Fast, High-Fidelity RGB-D Surface Reconstruction [Project]
  • [Arxiv] RayTran: 3D pose estimation and shape reconstruction of multiple objects from videos with ray-traced transformers
  • [Arxiv] iSDF: Real-Time Neural Signed Distance Fields for Robot Perception [Project]
  • [Arxiv] NeuRIS: Neural Reconstruction of Indoor Scenes Using Normal Priors [Project]
  • [CVPR2022] PlanarRecon: Real-time 3D Plane Detection and Reconstruction from Posed Monocular Videos [Project]
  • [CVPR2022] Learning 3D Object Shape and Layout without 3D Supervision [Project]
  • [Arxiv] MonoSDF: Exploring Monocular Geometric Cues for Neural Implicit Surface Reconstruction [Project]
  • [Arxiv] BlobGAN: Spatially Disentangled Scene Representations [Project]
  • [CVPR2022] NeRFusion: Fusing Radiance Fields for Large-Scale Scene Reconstruction
  • [Arxiv] ATEK: Augmenting Transformers with Expert Knowledge for Indoor Layout Synthesis

Before 2022

  • [Arxiv] IterMVS: Iterative Probability Estimation for Efficient Multi-View Stereo [github]
  • [Arxiv] What's Behind the Couch? Directed Ray Distance Functions (DRDF) for 3D Scene Reconstruction [Project]
  • [Arxiv] Input-level Inductive Biases for 3D Reconstruction
  • [Arxiv] ROCA: Robust CAD Model Retrieval and Alignment from a Single Image
  • [Arxiv] Multi-View Stereo with Transformer
  • [3DV2021] 3DVNet: Multi-View Depth Prediction and Volumetric Refinement
  • [Arxiv] VoRTX: Volumetric 3D Reconstruction With Transformers for Voxelwise View Selection and Fusion
  • [Arxiv] CIRCLE: Convolutional Implicit Reconstruction and Completion for Large-scale Indoor Scene
  • [Arxiv] Joint stereo 3D object detection and implicit surface reconstruction
  • [CoRL2021] TANDEM: Tracking and Dense Mapping in Real-time using Deep Multi-view Stereo [Project]
  • [NeurIPS2021] Voxel-based 3D Detection and Reconstruction of Multiple Objects from a Single Image [Project]
  • [NeurIPS2021] Panoptic 3D Scene Reconstruction From a Single RGB Image
  • [Arxiv] NICE-SLAM: Neural Implicit Scalable Encoding for SLAM [Project]
  • [BMVC2021] PlaneRecNet: Multi-Task Learning with Cross-Task Consistency for Piece-Wise Plane Detection and Reconstruction from a Single RGB Image [github]
  • [ICCV2021] Scene Synthesis via Uncertainty-Driven Attribute Synchronization [github]
  • [NeurIPS2021] ATISS: Autoregressive Transformers for Indoor Scene Synthesis [Project]
  • [ICCV2021] Learning Indoor Inverse Rendering with 3D Spatially-Varying Lighting
  • [Arxiv] Black-Box Test-Time Shape REFINEment for Single View 3D Reconstruction
  • [Arxiv] Indoor Scene Generation from a Collection of Semantic-Segmented Depth Images
  • [ICCV2021] Vis2Mesh: Efficient Mesh Reconstruction from Unstructured Point Clouds of Large Scenes with Learned Virtual View Visibility [github]
  • [ICCV2021] 3DIAS: 3D Shape Reconstruction with Implicit Algebraic Surfaces [Project]
  • [ICCV2021] VolumeFusion: Deep Depth Fusion for 3D Scene Reconstruction
  • [Arxiv] AA-RMVSNet: Adaptive Aggregation Recurrent Multi-view Stereo Network
  • [Arxiv] NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis
  • [ICCV2021] Out-of-Core Surface Reconstruction via Global $TGV$ Minimization
  • [ICCV2021] Discovering 3D Parts from Image Collections [Project]
  • [ICCV2021] PlaneTR: Structure-Guided Transformers for 3D Plane Recovery [pytorch]
  • [Arxiv] TransformerFusion: Monocular RGB Scene Reconstruction using Transformers [Project]
  • [Arxiv] Indoor Panorama Planar 3D Reconstruction via Divide and Conquer
  • [Arxiv] NeuS: Learning Neural Implicit Surfaces by Volume Rendering for Multi-view Reconstruction
  • [CVPR2021] Mirror3D: Depth Refinement for Mirror Surfaces [Project]
  • [CVPR2021] Plan2Scene: Converting Floorplans to 3D Scenes [Project]
  • [Arxiv] Translational Symmetry-Aware Facade Parsing for 3D Building Reconstruction
  • [Arxiv] Learning to Stylize Novel Views [Project]
  • [Arxiv] Stylizing 3D Scene via Implicit Representation and HyperNetwork
  • [CVPR2021] SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction from Video Data [Project]
  • [Arxiv] The Boombox: Visual Reconstruction from Acoustic Vibrations [Project]
  • [Arxiv] Joint Pose and Shape Estimation of Vehicles from LiDAR Data
  • [CVPR2021] NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video [Project]
  • [Arxiv] DDR-Net: Learning Multi-Stage Multi-View Stereo With Dynamic Depth Range [pytorch]
  • [Arxiv] Planar Surface Reconstruction from Sparse Views [Project]
  • [Arxiv] Neural RGB-D Surface Reconstruction
  • [Arxiv] RetrievalFuse: Neural 3D Scene Reconstruction with a Database
  • [ICCV2021] PlenOctrees for Real-time Rendering of Neural Radiance Fields [C++]
  • [Arxiv] iMAP: Implicit Mapping and Positioning in Real-Time
  • [CVPR2021] Monte Carlo Scene Search for 3D Scene Understanding
  • [CVPR2021] Holistic 3D Scene Understanding from a Single Image with Implicit Representation
  • [CVPR2021] RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction [pytorch]
  • [Arxiv] IBRNet: Learning Multi-View Image-Based Rendering [Project]
  • [Arxiv] STaR: Self-supervised Tracking and Reconstruction of Rigid Objects in Motion with Neural Rendering [Project]

Before 2021

  • [ToG2018] Deep convolutional priors for indoor scene synthesis [github]
  • [Arxiv] MO-LTR: Multiple Object Localization, Tracking and Reconstruction from Monocular RGB Videos
  • [Arxiv] DI-Fusion: Online Implicit 3D Reconstruction with Deep Priors
  • [3DV2020] Scene Flow from Point Clouds with or without Learning
  • [Arxiv] Stable View Synthesis
  • [Arxiv] Neural Scene Graphs for Dynamic Scenes
  • [3DV2020] RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty [pytorch]
  • [Arxiv] FlowStep3D: Model Unrolling for Self-Supervised Scene Flow Estimation
  • [Arxiv] MoNet: Motion-based Point Cloud Prediction Network
  • [Arxiv] MonoRec: Semi-Supervised Dense Reconstruction in Dynamic Environments from a Single Moving Camera
  • [Arxiv] Efficient Initial Pose-graph Generation for Global SfM
  • [Arxiv] Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes [Project]
  • [Arxiv] RGBD-Net: Predicting color and depth images for novel views synthesis
  • [Arxiv] SSCNav: Confidence-Aware Semantic Scene Completion for Visual Semantic Navigation [Project]
  • [Arxiv] From Points to Multi-Object 3D Reconstruction
  • [Arxiv] Worldsheet: Wrapping the World in a 3D Sheet for View Synthesis from a Single Image [Project]
  • [Arxiv] SceneFormer: Indoor Scene Generation with Transformers [pytorch]
  • [NeurIPS2020] Neural Sparse Voxel Fields [Project]
  • [Arxiv] Towards Part-Based Understanding of RGB-D Scans
  • [Arxiv] Dynamic Plane Convolutional Occupancy Networks
  • [NeurIPS2020] Neural Unsigned Distance Fields for Implicit Function Learning [Project]
  • [Arxiv] Holistic static and animated 3D scene generation from diverse text descriptions [pytorch]
  • [Arxiv] Semi-Supervised Learning of Multi-Object 3D Scene Representations
  • [ECCV2020] CAD-Deform: Deformable Fitting of CAD Models to 3D Scans
  • [ECCV2020] Mask2CAD: 3D Shape Prediction by Learning to Segment and Retrieve
  • [ECCV2020] Learnable Cost Volume Using the Cayley Representation
  • [ECCV2020] Topology-Change-Aware Volumetric Fusion for Dynamic Scene Reconstruction
  • [ECCV2020] Convolutional Occupancy Networks
  • [CVPR2020] MARMVS: Matching Ambiguity Reduced Multiple View Stereo for Efficient Large Scale Scene Reconstruction
  • [ECCV2020] CoReNet: Coherent 3D scene reconstruction from a single RGB image
  • [CVPR2020] DOPS: Learning to Detect 3D Objects and Predict their 3D Shapes
  • [ECCV2020] SceneCAD: Predicting Object Alignments and Layouts in RGB-D Scans
  • [Arxiv] Removing Dynamic Objects for Static Scene Reconstruction using Light Fields
  • [Arxiv] Atlas: End-to-End 3D Scene Reconstruction from Posed Images
  • [Arxiv] Scan2Plan: Efficient Floorplan Generation from 3D Scans of Indoor Scenes
  • [Arxiv] Plane Pair Matching for Efficient 3D View Registration
  • [CVPR2020] Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes from a Single Image [pytorch]
  • [Arxiv] Indoor Layout Estimation by 2D LiDAR and Camera Fusion
  • [Arxiv] General 3D Room Layout from a Single View by Render-and-Compare
  • [ICCV2019] Learning to Reconstruct 3D Manhattan Wireframes from a Single Image
  • [CVPR2019] PlaneRCNN: 3D Plane Detection and Reconstruction from a Single Image [pytorch] 🔥
  • [ICCV2019] 3D Scene Reconstruction with Multi-layer Depth and Epipolar Transformers
  • [ICCV Workshop2019] Silhouette-Assisted 3D Object Instance Reconstruction from a Cluttered Scene
  • [ICCV2019] 3D-RelNet: Joint Object and Relation Network for 3D prediction [pytorch]
  • [3DV2019] Pano Popups: Indoor 3D Reconstruction with a Plane-Aware Network
  • [CVPR2018] Factoring Shape, Pose, and Layout from the 2D Image of a 3D Scene [pytorch]
  • [IROS2017] Indoor Scan2BIM: Building Information Models of House Interiors
  • [CVPR2017] 3DMatch: Learning Local Geometric Descriptors from RGB-D Reconstructions [github]

NeRF

  • [Arxiv] SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes
  • [Arxiv] Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering [https://city-super.github.io/scaffold-gs/]
  • [NeurIPS2023] PyNeRF: Pyramidal Neural Radiance Fields
  • [Arxiv] K-Planes: Explicit Radiance Fields in Space, Time, and Appearance
  • [Arxiv] PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics
  • [ICCV2023] Seal-3D: Interactive Pixel-Level Editing for Neural Radiance Fields [github]
  • [Arxiv] Zip-NeRF: Anti-Aliased Grid-Based Neural Radiance Fields [Project]
  • [CVPR2023] Seeing Through the Glass: Neural 3D Reconstruction of Object Inside a Transparent Container [Project]
  • [Arxiv] CompoNeRF: Text-guided Multi-object Compositional NeRF with Editable 3D Scene Layout
  • [Arxiv] LERF: Language Embedded Radiance Fields [Project]
  • [CVPR2023] Nerflets: Local Radiance Fields for Efficient Structure-Aware 3D Scene Representation from 2D Supervision
  • [CVPR2023] HelixSurf: A Robust and Efficient Neural Implicit Surface Learning of Indoor Scenes with Iterative Intertwined Regularization [github]
  • [Arxiv] BakedSDF: Meshing Neural SDFs for Real-Time View Synthesis [Project]
  • [Arxiv] NerfDiff: Single-image View Synthesis with NeRF-guided Distillation from 3D-aware Diffusion [Project]
  • [Arxiv] HR-NeuS: Recovering High-Frequency Surface Geometry via Neural Implicit Surfaces
  • [Arxiv] 3D-aware Blending with Generative NeRFs [Project]
  • [Arxiv] Factor Fields: A Unified Framework for Neural Fields and Beyond
  • [Arxiv] Removing Objects From Neural Radiance Fields
  • [Arxiv] Interactive Segmentation of Radiance Fields [Project]
  • [Arxiv] Robust Dynamic Radiance Fields [Project]
  • [Arxiv] NeRF-Art: Text-Driven Neural Radiance Fields Stylization [Project]
  • [Arxiv] 4K-NeRF: High Fidelity Neural Radiance Fields at Ultra High Resolutions [Project]
  • [Arxiv] EditableNeRF: Editing Topologically Varying Neural Radiance Fields by Key Points
  • [Arxiv] SSDNeRF: Semantic Soft Decomposition of Neural Radiance Fields [Project]
  • [Arxiv] NeRFEditor: Differentiable Style Decomposition for Full 3D Scene Editing [Project]
  • [Arxiv] Ref-NPR: Reference-Based Non-Photorealistic Radiance Fields [Project]
  • [WACV2023] ScanNeRF: a Scalable Benchmark for Neural Radiance Fields [Project]
  • [Arxiv] LaTeRF: Label and Text Driven Object Radiance Fields
  • [Arxiv] Beyond RGB: Scene-Property Synthesis with Neural Radiance Fields
  • [CVPR2022] RigNeRF: Fully Controllable Neural 3D Portraits [Project]
  • [Arxiv] Panoptic Neural Fields: A Semantic Object-Aware Neural Scene Representation
  • [Arxiv] D2NeRF: Self-Supervised Decoupling of Dynamic and Static Objects from a Monocular Video [Project]
  • [Arxiv] Artemis: Articulated Neural Pets with Appearance and Motion synthesis [Project]
  • [Arxiv] KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints [Project]
  • [Arxiv] Control-NeRF: Editable Feature Volumes for Scene Rendering and Manipulation
  • [Arxiv] PVSeRF: Joint Pixel-, Voxel- and Surface-Aligned Radiance Field for Single-Image Novel View Synthesis
  • [Arxiv] Block-NeRF: Scalable Large Scene Neural View Synthesis [Project]
  • [Arxiv] Pix2NeRF: Unsupervised Conditional π-GAN for Single Image to Neural Radiance Fields Translation
  • [Arxiv] NeSF: Neural Semantic Fields for Generalizable Semantic Segmentation of 3D Scenes [Project]
  • [Arxiv] HumanNeRF: Free-viewpoint Rendering of Moving People from Monocular Video [github]
  • [Arxiv] NeROIC: Neural Rendering of Objects from Online Image Collections [Project]
  • [Arxiv] DFA-NeRF: Personalized Talking Head Generation via Disentangled Face Attributes Neural Rendering
  • [Arxiv] InfoNeRF: Ray Entropy Minimization for Few-Shot Neural Volume Rendering [Project]

Before 2022

  • [Arxiv] Mega-NeRF: Scalable Construction of Large-Scale NeRFs for Virtual Fly-Throughs [Project]
  • [Arxiv] Light Field Neural Rendering [Project]
  • [Arxiv] CG-NeRF: Conditional Generative Neural Radiance Fields
  • [Arxiv] Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields [Project]
  • [Arxiv] MoFaNeRF: Morphable Facial Neural Radiance Field
  • [Arxiv] Dense Depth Priors for Neural Radiance Fields from Sparse Input Views
  • [Arxiv] NeRF-SR: High-Quality Neural Radiance Fields using Super-Sampling [Project]
  • [Arxiv] RegNeRF: Regularizing Neural Radiance Fields for View Synthesis from Sparse Inputs [Project]
  • [Arxiv] NeRFReN: Neural Radiance Fields with Reflections [Project]
  • [Arxiv] NeuSample: Neural Sample Field for Efficient View Synthesis [Project]
  • [Arxiv] Urban Radiance Fields [Project]
  • [Arxiv] GeoNeRF: Generalizing NeRF with Geometry Priors [Project]
  • [Arxiv] NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images [Project]
  • [Arxiv] VaxNeRF: Revisiting the Classic for Voxel-Accelerated Neural Radiance Field [github]
  • [Arxiv] Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction [github]
  • [Arxiv] LOLNeRF: Learn from One Look
  • [Arxiv] Instant Neural Graphics Primitives with a Multiresolution Hash Encoding [Project]
  • [NeurIPS2021] Neural View Synthesis and Matching for Semi-Supervised Few-Shot Learning of 3D Pose [github]
  • [Arxiv] PERF: Performant, Explicit Radiance Fields
  • [Arxiv] Plenoxels: Radiance Fields without Neural Networks [Project]
  • [NeurIPS2021] Neural Human Performer: Learning Generalizable Radiance Fields for Human Performance Rendering [Project]
  • [ICCV2021] CodeNeRF: Disentangled Neural Radiance Fields for Object Categories [github]
  • [ICCV2021] Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering [Project]
  • [ICCV2021] Differentiable Surface Rendering via Non-Differentiable Sampling
  • [ICCV2021] Putting NeRF on a Diet: Semantically Consistent Few-Shot View Synthesis [Project]
  • [Arxiv] Fast and Explicit Neural View Synthesis
  • [Arxiv] Depth-supervised NeRF: Fewer Views and Faster Training for Free [Project] [pytorch]
  • [Arxiv] A Higher-Dimensional Representation for Topologically Varying Neural Radiance Fields [Project]
  • [Arxiv] NeRF in detail: Learning to sample for view synthesis
  • [Arxiv] NeRFactor: Neural Factorization of Shape and Reflectance Under an Unknown Illumination [Project]
  • [Arxiv] Neural Trajectory Fields for Dynamic Novel View Synthesis
  • [Arxiv] Editing Conditional Radiance Fields [Project]
  • [CVPR2021] Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes
  • [Arxiv] GNeRF: GAN-based Neural Radiance Field without Posed Camera
  • [Arxiv] BARF: Bundle-Adjusting Neural Radiance Fields [Project]
  • [Arxiv] MVSNeRF: Fast Generalizable Radiance Field Reconstruction from Multi-View Stereo
  • [CVPR2021] Neural Lumigraph Rendering [Project]
  • [Arxiv] Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields
  • [Arxiv] KiloNeRF: Speeding up Neural Radiance Fields with Thousands of Tiny MLPs
  • [Arxiv] FastNeRF: High-Fidelity Neural Rendering at 200FPS
  • [CVPR2021] NeX: Real-time View Synthesis with Neural Basis Expansion [Project]
  • [Arxiv] DONeRF: Towards Real-Time Rendering of Neural Radiance Fields using Depth Oracle Networks [Project]
  • [Arxiv] NeRF--: Neural Radiance Fields Without Known Camera Parameters [Project]

Before 2021

  • [Arxiv] pixelNeRF: Neural Radiance Fields from One or Few Images [Project]
  • [Arxiv] NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis [Project]
  • [Arxiv] Neural Radiance Flow for 4D View Synthesis and Video Processing [Project]
  • [Arxiv] Deformable Neural Radiance Fields [Project]
  • [Arxiv] DeRF: Decomposed Radiance Fields
  • [Arxiv] NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis

About Human Body

  • [Arxiv] Animatable Gaussians: Learning Pose-dependent Gaussian Maps for High-fidelity Human Avatar Modeling [Project]
  • [Arxiv] D3GA - Drivable 3D Gaussian Avatars [Project]
  • [Arxiv] NPC: Neural Point Characters from Video [Project]
  • [Arxiv] Normal-guided Garment UV Prediction for Human Re-texturing
  • [Arxiv] Sketch2Cloth: Sketch-based 3D Garment Generation with Unsigned Distance Fields
  • [Arxiv] PointAvatar: Deformable Point-based Head Avatars from Videos [Project]
  • [Arxiv] PhoMoH: Implicit Photorealistic 3D Models of Human Heads
  • [Arxiv] 3DHumanGAN: Towards Photo-Realistic 3D-Aware Human Image Generation
  • [Arxiv] Rodin: A Generative Model for Sculpting 3D Digital Avatars Using Diffusion [Project]
  • [Arxiv] Generating Holistic 3D Human Motion from Speech [Project]
  • [Arxiv] MoFusion: A Framework for Denoising-Diffusion-based Motion Synthesis [Project]
  • [Arxiv] RANA: Relightable Articulated Neural Avatars [Project]
  • [Arxiv] Magic: Multi Art Genre Intelligent Choreography Dataset and Network for 3D Dance Generation
  • [Arxiv] One-shot Implicit Animatable Avatars with Model-based Priors [Project]
  • [Arxiv] PhysDiff: Physics-Guided Human Motion Diffusion Model [Project]
  • [Arxiv] Instant Volumetric Head Avatars [Project]
  • [Arxiv] EVA3D: Compositional 3D Human Generation from 2D Image Collections [Project]
  • [ECCV2022] Compositional Human-Scene Interaction Synthesis with Semantic Control [Project]
  • [ECCV2022] Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis [Project]
  • [CVPR2022] Photorealistic Monocular 3D Reconstruction of Humans Wearing Clothing [Project]
  • [ECCV2022] DiffuStereo: High Quality Human Reconstruction via Diffusion-based Stereo Using Sparse Cameras [Project]
  • [CVPR2022] Capturing and Inferring Dense Full-Body Human-Scene Contact [Project]
  • [Arxiv] Realistic One-shot Mesh-based Head Avatars [Project]
  • [CVPR2022] SmartPortraits: Depth Powered Handheld Smartphone Dataset of Human Portraits for State Estimation, Reconstruction and Synthesis
  • [Arxiv] DAD-3DHeads: A Large-scale Dense, Accurate and Diverse Dataset for 3D Head Alignment from a Single Image [Project]
  • [CVPR2022] Structured Local Radiance Fields for Human Avatar Modeling
  • [CVPR2022] ImFace: A Nonlinear 3D Morphable Face Model with Implicit Neural Representations
  • [Arxiv] AutoAvatar: Autoregressive Neural Fields for Dynamic Avatar Modeling [Project]

Before 2022

  • [Arxiv] The Wanderings of Odysseus in 3D Scenes [Project]
  • [Arxiv] Putting People in their Place: Monocular Regression of 3D People in Depth [github]
  • [Arxiv] Tracking People by Predicting 3D Appearance, Location & Pose [Project]
  • [Arxiv] Adversarial Parametric Pose Prior
  • [NeurIPS2021] Garment4D: Garment Reconstruction from Point Cloud Sequences [Project]
  • [Arxiv] MobRecon: Mobile-Friendly Hand Mesh Reconstruction from Monocular Image [github]
  • [Arxiv] Total Scale: Face-to-Body Detail Reconstruction from Sparse RGBD Sensors
  • [Arxiv] GLAMR: Global Occlusion-Aware Human Mesh Recovery with Dynamic Cameras [Project]
  • [3DV2021] LatentHuman: Shape-and-Pose Disentangled Latent Representation for Human Bodies [Project]
  • [Arxiv] A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose
  • [Arxiv] MHFormer: Multi-Hypothesis Transformer for 3D Human Pose Estimation [github]
  • [Arxiv] Multi-Person 3D Motion Prediction with Multi-Range Transformers [Project]
  • [Arxiv] DD-NeRF: Double-Diffusion Neural Radiance Field as a Generalizable Implicit Body Representation
  • [Arxiv] Creating and Reenacting Controllable 3D Humans with Differentiable Rendering
  • [Arxiv] Deep Two-Stream Video Inference for Human Body Pose and Shape Estimation
  • [BMVC2021] AniFormer: Data-driven 3D Animation with Transformer [Project]
  • [ACMMM2021] VoteHMR: Occlusion-Aware Voting Network for Robust 3D Human Mesh Recovery from Partial Point Clouds
  • [Arxiv] Playing for 3D Human Recovery [Project]
  • [ICCV2021] Learning to Regress Bodies from Images using Differentiable Semantic Rendering [Project]
  • [Arxiv] ICON: Implicit Clothed humans Obtained from Normals [github]
  • [ICCV2021] Hierarchical Kinematic Probability Distributions for 3D Human Shape and Pose Estimation from Images in the Wild [Project]
  • [Arxiv] SPEC: Seeing People in the Wild with an Estimated Camera [Project]
  • [NeurIPS2021] Tracking People with 3D Representations [github]
  • [Arxiv] A Skeleton-Driven Neural Occupancy Representation for Articulated Hands
  • [Arxiv] GraFormer: Graph Convolution Transformer for 3D Pose Estimation [github]
  • [ICCV2021] Graph-Based 3D Multi-Person Pose Estimation Using Multi-View Images
  • [ICCV2021] Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation [github]
  • [ICCV2021] 3D Human Texture Estimation from a Single Image with Transformers
  • [ICCV2021] DensePose 3D: Lifting Canonical Surface Maps of Articulated Objects to the Third Dimension
  • [Arxiv] SNARF: Differentiable Forward Skinning for Animating Non-Rigid Neural Implicit Shapes [Project]
  • [ICCV2021] Probabilistic Modeling for Human Mesh Recovery [Project]
  • [ICCV2021] Unsupervised Dense Deformation Embedding Network for Template-Free Shape Correspondence
  • [ACMMM2021] DC-GNet: Deep Mesh Relation Capturing Graph Convolution Network for 3D Human Shape Reconstruction
  • [SiggraphAsia2019] Neural State Machine for Character-Scene Interactions [github]
  • [ICCV2021] Learning Motion Priors for 4D Human Body Capture in 3D Scenes [Project]
  • [Arxiv] Deep Virtual Markers for Articulated 3D Shapes
  • [ICCV2021] Gravity-Aware Monocular 3D Human-Object Reconstruction [Project]
  • [ICCV2021] Learning Anchored Unsigned Distance Functions with Gradient Direction Alignment for Single-view Garment Reconstruction
  • [Arxiv] D3D-HOI: Dynamic 3D Human-Object Interactions from Videos [github]
  • [ICCV2021] Stochastic Scene-Aware Motion Prediction [Project] [github]
  • [ICCV2021] ARCH++: Animation-Ready Clothed Human Reconstruction Revisited
  • [ICCV2021] EventHPE: Event-based 3D Human Pose and Shape Estimation
  • [ACMMM2021] Learning Multi-Granular Spatio-Temporal Graph Network for Skeleton-based Action Recognition [github]
  • [ACMMM2021] Skeleton-Contrastive 3D Action Representation Learning [github]
  • [Arxiv] Learning Local Recurrent Models for Human Mesh Recovery
  • [Arxiv] H3D-Net: Few-Shot High-Fidelity 3D Head Reconstruction [Project]
  • [Arxiv] Unsupervised 3D Human Mesh Recovery from Noisy Point Clouds [github]
  • [Arxiv] MetaAvatar: Learning Animatable Clothed Human Models from Few Depth Images [Project]
  • [Arxiv] Deep3DPose: Realtime Reconstruction of Arbitrarily Posed Human Bodies from Single RGB Images
  • [Arxiv] THUNDR: Transformer-based 3D HUmaN Reconstruction with Markers
  • [Arxiv] Bridge the Gap Between Model-based and Model-free Human Reconstruction
  • [Arxiv] Neural Actor: Neural Free-view Synthesis of Human Actors with Pose Control
  • [Arxiv] Scene-aware Generative Network for Human Motion Synthesis
  • [Arxiv] Human Motion Prediction Using Manifold-Aware Wasserstein GAN
  • [CVPR2021] Function4D: Real-time Human Volumetric Capture from Very Sparse Consumer RGBD Sensors [Project]
  • [Arxiv] TRiPOD: Human Trajectory and Pose Dynamics Forecasting in the Wild [Project]
  • [CVPR2021] We are More than Our Joints: Predicting how 3D Bodies Move [Project]
  • [CVPR2021] LEAP: Learning Articulated Occupancy of People [Project]
  • [Arxiv] 3DCrowdNet: 2D Human Pose-Guided 3D Crowd Human Pose and Shape Estimation in the Wild
  • [CVPR2021] SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements [Project]
  • [Arxiv] Action-Conditioned 3D Human Motion Synthesis with Transformer VAE [Project]
  • [Arxiv] Dynamic Surface Function Networks for Clothed Human Bodies [github]
  • [Arxiv] Neural Articulated Radiance Field [github]
  • [Arxiv] Mesh Graphormer
  • [CVPR2021] SimPoE: Simulated Character Control for 3D Human Pose Estimation [Project]
  • [Arxiv] TRAJEVAE - Controllable Human Motion Generation from Trajectories [Project]
  • [CVPR2021] Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors [Project]
  • [CVPR2021] Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction [Project]
  • [CVPR2021] Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction [github]
  • [Arxiv] Probabilistic 3D Human Shape and Pose Estimation from Multiple Unconstrained Images in the Wild
  • [Arxiv] 3D Human Pose Estimation with Spatial and Temporal Transformers [pytorch]
  • [CVPR2021] Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks
  • [Arxiv] DanceNet3D: Music Based Dance Generation with Parametric Motion Transformer
  • [Arxiv] Aggregated Multi-GANs for Controlled 3D Human Motion Prediction [Project]
  • [AAAI] PC-HMR: Pose Calibration for 3D Human Mesh Recovery from 2D Images/Videos
  • [Arxiv] NeuralHumanFVV: Real-Time Neural Volumetric Human Performance Rendering using RGB Cameras
  • [CVPR2021] SMPLicit: Topology-aware Generative Model for Clothed People [Project]
  • [CVPR2021] HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation [pytorch]
  • [Arxiv] Single-Shot Motion Completion with Transformer [Project]
  • [EG2021] Walk2Map: Extracting Floor Plans from Indoor Walk Trajectories
  • [Arxiv] Forecasting Characteristic 3D Poses of Human Actions
  • [Arxiv] Capturing Detailed Deformations of Moving Human Bodies
  • [Arxiv] A-NeRF: Surface-free Human 3D Pose Refinement via Neural Rendering [Project]
  • [Arxiv] Learn to Dance with AIST++: Music Conditioned 3D Dance Generation [Project]
  • [Arxiv] S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling
  • [Arxiv] PandaNet: Anchor-Based Single-Shot Multi-Person 3D Pose Estimation
  • [Arxiv] Neural Body: Implicit Neural Representations with Structured Latent Codes for Novel View Synthesis of Dynamic Humans [Project]
  • [Arxiv] Chasing the Tail in Monocular 3D Human Reconstruction with Prototype Memory
  • [3DV2020] PLACE: Proximity Learning of Articulation and Contact in 3D Environments [Project]
  • [ICCV2019] Resolving 3D Human Pose Ambiguities with 3D Scene Constraints [Project]

Before 2021

  • [ICCV2021] Monocular, One-stage, Regression of Multiple 3D People [github]
  • [ECCV2020] History Repeats Itself: Human Motion Prediction via Motion Attention [pytorch]
  • [ECCV2020] 3D Human Shape and Pose from a Single Low-Resolution Image with Self-Supervised Learning [Project]
  • [Arxiv] Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes [Project]
  • [Arxiv] End-to-End Human Pose and Mesh Reconstruction with Transformers
  • [Arxiv] Human Mesh Recovery from Multiple Shots [Project]
  • [NeurIPS2020] 3D Multi-bodies: Fitting Sets of Plausible 3D Human Models to Ambiguous Image Data [Project]
  • [Arxiv] Holistic 3D Human and Scene Mesh Estimation from Single View Images
  • [Arxiv] Beyond Static Features for Temporally Consistent 3D Human Pose and Shape from a Video
  • [Arxiv] Pose2Pose: 3D Positional Pose-Guided 3D Rotational Pose Prediction for Expressive 3D Human Pose and Mesh Estimation
  • [Arxiv] NeuralAnnot: Neural Annotator for in-the-wild Expressive 3D Human Pose and Mesh Training Sets
  • [Arxiv] 4D Human Body Capture from Egocentric Video via 3D Scene Grounding [Project]
  • [Arxiv] Populating 3D Scenes by Learning Human-Scene Interaction [Project]
  • [ECCV2020] Long-term Human Motion Prediction with Scene Context [Project]
  • [Arxiv] Vid2Actor: Free-viewpoint Animatable Person Synthesis from Video in the Wild [Project]
  • [Arxiv] ANR: Articulated Neural Rendering for Virtual Avatars
  • [Arxiv] Generating 3D People in Scenes without People [Project]
  • [ICCV2019] Holistic++ Scene Understanding: Single-view 3D Holistic Scene Parsing and Human Pose Estimation with Human-Object Interaction and Physical Commonsense
  • [CVPR2019] Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments [Project]
  • [TOG2016] Pigraphs: learning interaction snapshots from observations [Project]

General Methods

  • [CVPR2023] Masked Jigsaw Puzzle: A Versatile Position Embedding for Vision Transformers [github]
  • [Arxiv] HexPlane: A Fast Representation for Dynamic Scenes [Project]
  • [Arxiv] Joint Representation Learning for Text and 3D Point Cloud
  • [Arxiv] Ponder: Point Cloud Pre-training via Neural Rendering
  • [Arxiv] 3D Point Cloud Pre-training with Knowledge Distillation from 2D Images
  • [Arxiv] Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning? [Project]
  • [Arxiv] Attentive Mask CLIP
  • [Arxiv] Synthetic-to-Real Domain Generalized Semantic Segmentation for 3D Indoor Point Clouds
  • [Arxiv] Frozen CLIP Model is Efficient Point Cloud Backbone
  • [Arxiv] Continuous diffusion for categorical data
  • [Arxiv] EVA: Exploring the Limits of Masked Visual Representation Learning at Scale
  • [Arxiv] Neural Density-Distance Fields [Project]
  • [Arxiv] Understanding Masked Image Modeling via Learning Occlusion Invariant Feature
  • [Arxiv] Jigsaw-ViT: Learning Jigsaw Puzzles in Vision Transformer [Project]
  • [Arxiv] Masked Surfel Prediction for Self-Supervised Point Cloud Learning [github]
  • [Arxiv] Point-M2AE: Multi-scale Masked Autoencoders for Hierarchical Point Cloud Pre-training [github]
  • [Arxiv] 3D-Aware Video Generation [Project]
  • [Arxiv] Learning Viewpoint-Agnostic Visual Representations by Recovering Tokens in 3D Space [Project]
  • [Arxiv] Masked Frequency Modeling for Self-Supervised Visual Pre-Training [Project]
  • [Arxiv] GRAM-HD: 3D-Consistent Image Generation at High Resolution with Generative Radiance Manifolds [Project]
  • [Arxiv] Diffusion Models for Video Prediction and Infilling [Project]
  • [Arxiv] MaskViT: Masked Visual Pre-Training for Video Prediction [Project]
  • [Arxiv] Random Walks for Adversarial Meshes
  • [ICLR2022] Rethinking Network Design and Local Geometry in Point Cloud: A Simple Residual MLP Framework [github]
  • [CVPR2022] Rethinking Semantic Segmentation: A Prototype View [github]
  • [Arxiv] How to Understand Masked Autoencoders
  • [ICLR2022] QuadTree Attention for Vision Transformers [github]
  • [Arxiv] Contrastive Neighborhood Alignment

Before 2022

  • [Arxiv] Domain Adaptation on Point Clouds via Geometry-Aware Implicits
  • [ICCV2021] Progressive Seed Generation Auto-encoder for Unsupervised Point Cloud Learning
  • [Arxiv] Variance-Aware Weight Initialization for Point Convolutional Neural Networks
  • [Arxiv] Learning to Detect Every Thing in an Open World [Project]
  • [Arxiv] Point-BERT: Pre-training 3D Point Cloud Transformers with Masked Point Modeling [Project]
  • [Arxiv] CpT: Convolutional Point Transformer for 3D Point Cloud Processing
  • [Arxiv] Swin Transformer V2: Scaling Up Capacity and Resolution [github]
  • [Arxiv] TransMix: Attend to Mix for Vision Transformers [github]
  • [Arxiv] Self-supervised GAN Detector [github]
  • [NeurIPS2021] Residual Relaxation for Multi-view Representation Learning
  • [ICCV2021] Video Autoencoder: self-supervised disentanglement of static 3D structure and motion [Project]
  • [NeurIPS2021] SAPE: Spatially-Adaptive Progressive Encoding for Neural Optimization [Project]
  • [Arxiv] Efficient Geometry-aware 3D Generative Adversarial Networks [Project]
  • [Arxiv] Self-attention Does Not Need $O(n^2)$ Memory
  • [Arxiv] CAP-Net: Correspondence-Aware Point-view Fusion Network for 3D Shape Analysis
  • [Arxiv] PointMixer: MLP-Mixer for Point Cloud Understanding
  • [NeurIPS2021] Blending Anti-Aliasing into Vision Transformer
  • [ICCV2021] Learning Inner-Group Relations on Point Clouds
  • [Arxiv] Point-Voxel Transformer: An Efficient Approach To 3D Deep Learning
  • [Siggraph2021] SP-GAN: Sphere-Guided 3D Shape Generation and Manipulation [Project] [github]
  • [ICCV2021] GraphFPN: Graph Feature Pyramid Network for Object Detection
  • [Arxiv] CKConv: Learning Feature Voxelization for Point Cloud Analysis
  • [ICCV2021] Generic Attention-model Explainability for Interpreting Bi-Modal and Encoder-Decoder Transformers [pytorch]
  • [Arxiv] Volume Rendering of Neural Implicit Surfaces
  • [CVPR2021] Iso-Points: Optimizing Neural Implicit Surfaces with Hybrid Representations
  • [Arxiv] DeepMesh: Differentiable Iso-Surface Extraction
  • [Arxiv] Neural Marching Cubes
  • [Arxiv] Geometry-Consistent Neural Shape Representation with Implicit Displacement Fields
  • [Arxiv] Light Field Networks: Neural Scene Representations with Single-Evaluation Rendering
  • [ICML2021] Revisiting Point Cloud Shape Classification with a Simple and Effective Baseline [pytorch]
  • [Arxiv] Deep Medial Fields
  • [Arxiv] Subdivision-Based Mesh Convolution Networks [Jittor]
  • [Arxiv] VA-GCN: A Vector Attention Graph Convolution Network for learning on Point Clouds [pytorch]
  • [Arxiv] Aggregating Nested Transformers
  • [Arxiv] Rethinking the Design Principles of Robust Vision Transformer [pytorch]
  • [Siggraph2021] Acorn: Adaptive Coordinate Networks for Neural Scene Representation
  • [Arxiv] Walk in the Cloud: Learning Curves for Point Clouds Shape Analysis [Project]
  • [Arxiv] Pay Attention to MLPs
  • [Arxiv] ResMLP: Feedforward networks for image classification with data-efficient training
  • [Arxiv] RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition
  • [Arxiv] MLP-Mixer: An all-MLP Architecture for Vision
  • [Arxiv] Vector Neurons: A General Framework for SO(3)-Equivariant Networks
  • [CVPR2021] MongeNet: Efficient Sampler for Geometric Deep Learning [Project]
  • [Arxiv] Point Cloud Learning with Transformer
  • [Arxiv] Dual Transformer for Point Cloud Analysis
  • [Arxiv] AttWalk: Attentive Cross-Walks for Deep Mesh Analysis
  • [Arxiv] Learning from 2D: Pixel-to-Point Knowledge Transfer for 3D Pretraining
  • [Arxiv] Field Convolutions for Surface CNNs
  • [Arxiv] Rethinking Spatial Dimensions of Vision Transformers [pytorch] 🔥
  • [CVPR2021] PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds [pytorch]
  • [Arxiv] Concentric Spherical GNN for 3D Representation Learning
  • [Arxiv] High-Performance Large-Scale Image Recognition Without Normalization
  • [Arxiv] Generative Models as Distributions of Functions
  • [Arxiv] Point-set Distances for Learning Representations of 3D Point Clouds
  • [Arxiv] Compressed Object Detection
  • [Arxiv] A linearized framework and a new benchmark for model selection for fine-tuning
  • [Arxiv] The Devils in the Point Clouds: Studying the Robustness of Point Cloud Convolutions
  • [Arxiv] Self-Supervised Pretraining of 3D Features on any Point-Cloud [pytorch]
  • [3DV2020] Learning Rotation-Invariant Representations of Point Clouds Using Aligned Edge Convolutional Neural Networks

Before 2021

  • [ICCV2019] Efficient Learning on Point Clouds with Basis Point Sets [pytorch]
  • [CVPR2019] On the Continuity of Rotation Representations in Neural Networks [pytorch]
  • [Arxiv] Diffusion is All You Need for Learning on Surfaces
  • [Arxiv] SPU-Net: Self-Supervised Point Cloud Upsampling by Coarse-to-Fine Reconstruction with Self-Projection Optimization
  • [3DV2020] Rotation-Invariant Point Convolution With Multiple Equivariant Alignments
  • [Arxiv] One Point is All You Need: Directional Attention Point for Feature Learning
  • [Arxiv] PCT: Point Cloud Transformer
  • [Arxiv] Hausdorff Point Convolution with Geometric Priors
  • [Arxiv] MARNet: Multi-Abstraction Refinement Network for 3D Point Cloud Analysis [Github]
  • [Arxiv] Point Transformer
  • [Arxiv] Learning geometry-image representation for 3D point cloud generation
  • [Arxiv] Deeper or Wider Networks of Point Clouds with Self-attention?
  • [NeurIPS2020] Primal-Dual Mesh Convolutional Neural Networks [pytorch]
  • [NeurIPS2020] Rational neural networks [tensorflow]
  • [NeurIPS2020] Exchangeable Neural ODE for Set Modeling [Project]
  • [NeurIPS2020] SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks [Project]
  • [NeurIPS2020] NVAE: A Deep Hierarchical Variational Autoencoder [pytorch]
  • [NeurIPS2020] Implicit Graph Neural Networks [pytorch]
  • [NeurIPS2020] The Autoencoding Variational Autoencoder [pytorch]
  • [Arxiv] PointManifold: Using Manifold Learning for Point Cloud Classification
  • [Arxiv] RelationNet++: Bridging Visual Representations for Object Detection via Transformer Decoder
  • [Arxiv] Pre-Training by Completing Point Clouds [pytorch]
  • [NeurIPS2020] Rotation-Invariant Local-to-Global Representation Learning for 3D Point Cloud
  • [Arxiv] IF-Defense: 3D Adversarial Point Cloud Defense via Implicit Function based Restoration [pytorch]
  • [Arxiv] DV-ConvNet: Fully Convolutional Deep Learning on Point Clouds with Dynamic Voxelization and 3D Group Convolution
  • [Arxiv] Spatial Transformer Point Convolution
  • [Arxiv] Minimal Adversarial Examples for Deep Learning on 3D Point Clouds
  • [BMVC2020] Black Magic in Deep Learning: How Human Skill Impacts Network Training
  • [ECCV2020] PointMixup: Augmentation for Point Clouds [Code]
  • [ECCV2020] DR-KFS: A Differentiable Visual Similarity Metric for 3D Shape Reconstruction
  • [Arxiv] Unsupervised 3D Learning for Shape Analysis via Multiresolution Instance Discrimination
  • [Arxiv] Global Context Aware Convolutions for 3D Point Cloud Understanding
  • [ECCV2020] Shape Adaptor: A Learnable Resizing Module [pytorch]
  • [ACMMM2020] Differentiable Manifold Reconstruction for Point Cloud Denoising [pytorch]
  • [ECCV2020] Discrete Point Flow Networks for Efficient Point Cloud Generation
  • [Siggraph2020] Neural Subdivision
  • [Arxiv] PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding
  • [Arxiv] Accelerating 3D Deep Learning with PyTorch3D
  • [Arxiv] Natural Graph Networks
  • [ECCV2020] Progressive Point Cloud Deconvolution Generation Network [github]
  • [Arxiv] Point Set Voting for Partial Point Cloud Analysis
  • [Arxiv] PointMask: Towards Interpretable and Bias-Resilient Point Cloud Processing
  • [Arxiv] Fully Convolutional Mesh Autoencoder using Efficient Spatially Varying Kernels
  • [Arxiv] A Closer Look at Local Aggregation Operators in Point Cloud Analysis [github]
  • [NeurIPS2020] Implicit Neural Representations with Periodic Activation Functions [pytorch] 🔥
  • [Arxiv] Rethinking Sampling in 3D Point Cloud Generative Adversarial Networks
  • [Arxiv] Local-Area-Learning Network: Meaningful Local Areas for Efficient Point Cloud Analysis
  • [Arxiv] TearingNet: Point Cloud Autoencoder to Learn Topology-Friendly Representations
  • [Arxiv] MeshWalker: Deep Mesh Understanding by Random Walks
  • [Arxiv] MOPS-Net: A Matrix Optimization-driven Network for Task-Oriented 3D Point Cloud Downsampling
  • [Arxiv] DPDist: Comparing Point Clouds Using Deep Point Cloud Distance
  • [CVPR2020] PointASNL: Robust Point Clouds Processing using Nonlocal Neural Networks with Adaptive Sampling
  • [AAAI2020] Shape-Oriented Convolution Neural Network for Point Cloud Analysis
  • [Arxiv] Joint Supervised and Self-Supervised Learning for 3D Real-World Challenges
  • [Arxiv] LightConvPoint: Convolution for Points [pytorch]
  • [Arxiv] Variational Auto-Decoder [pytorch]
  • [Arxiv] Generative PointNet: Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification
  • [CVPR2020] DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes [pytorch]
  • [CVPR2020] RPM-Net: Robust Point Matching using Learned Features [github]
  • [CVPR2020] Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
  • [CVPR2020] PointGMM: a Neural GMM Network for Point Clouds
  • [Arxiv] Dynamic ReLU
  • [CVPR2020] SampleNet: Differentiable Point Cloud Sampling [pytorch]
  • [Arxiv] Defense-PointNet: Protecting PointNet Against Adversarial Attacks
  • [CVPR2020] FPConv: Learning Local Flattening for Point Convolution [pytorch]
  • [SIGGRAPH2019] MeshCNN: A Network with an Edge [pytorch] 🔥⭐
  • [ICCV2019] Total Denoising: Unsupervised Learning of 3D Point Cloud Cleaning [tensorflow]
  • [ICCV2019] PU-GAN: a Point Cloud Upsampling Adversarial Network 🔥
  • [CVPR2019] Relation-Shape Convolutional Neural Network for Point Cloud Analysis [pytorch] 🔥
  • [CVPR2019] Patch-based Progressive 3D Point Set Upsampling [tensorflow] [pytorch] 🔥
  • [TOG2019] Dynamic Graph CNN for Learning on Point Clouds [Project] 🔥 ⭐
  • [ECCV2018] EC-Net: an Edge-aware Point set Consolidation Network [project page]
  • [CVPR2018] PU-Net: Point Cloud Upsampling Network ⭐🔥
  • [Arxiv] PointAugment: an Auto-Augmentation Framework for Point Cloud Classification
  • [ICLR2017] Deep Learning with Sets and Point Clouds
  • [NeurIPS2017] Deep Sets
  • [Siggraph2006] Designing with Distance Fields

Others (inc. Networks in Classification, Matching, Registration, Alignment, Depth, Normal, Pose, Keypoints, etc.)

  • [Arxiv] ConceptLab: Creative Generation using Diffusion Prior Constraints [Project]
  • [Arxiv] Fast Complementary Dynamics via Skinning Eigenmodes [Project]
  • [Arxiv] Visual Instruction Inversion: Image Editing via Visual Prompting [Project]
  • [Arxiv] Objaverse-XL: A Universe of 10M+ 3D Objects
  • [Arxiv] Temporally Consistent Online Depth Estimation Using Point-Based Fusion [Project]
  • [CVPR2023] Open-Vocabulary Panoptic Segmentation with Text-to-Image Diffusion Models [Project]
  • [Arxiv] Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models [github]
  • [Arxiv] Pix2Video: Video Editing using Image Diffusion [Project]
  • [Arxiv] Cross-domain Compositing with Pretrained Diffusion Models [Project]
  • [Arxiv] 3D-aware Conditional Image Synthesis [Project]
  • [CVPR2022] Focal Length and Object Pose Estimation via Render and Compare [github]
  • [CVPR2022] Kubric: A scalable dataset generator

Before 2022

  • [Arxiv] Channel-Wise Attention-Based Network for Self-Supervised Monocular Depth Estimation
  • [Arxiv] Toward Practical Self-Supervised Monocular Indoor Depth Estimation
  • [Arxiv] PartImageNet: A Large, High-Quality Dataset of Parts [github]
  • [Arxiv] AdaAfford: Learning to Adapt Manipulation Affordance for 3D Articulated Objects via Few-shot Interactions
  • [Arxiv] Benchmarking Detection Transfer Learning with Vision Transformers
  • [Arxiv] Panoptic Segmentation: A Review [github]
  • [NeurIPS2021] Sparse Steerable Convolutions: An Efficient Learning of SE(3)-Equivariant Features for Estimation and Tracking of Object Poses in 3D Space [github]
  • [Arxiv] Attention Mechanisms in Computer Vision: A Survey
  • [Arxiv] Leveraging Geometry for Shape Estimation from a Single RGB Image [github]
  • [Arxiv] Deep Point Set Resampling via Gradient Fields [github]
  • [Arxiv] Efficient 3D Deep LiDAR Odometry [github]
  • [NeurIPS2021] 3DP3: 3D Scene Perception via Probabilistic Programming
  • [NeurIPS2021] CoFiNet: Reliable Coarse-to-fine Correspondences for Robust Point Cloud Registration [github]
  • [BMVC2021] Cascading Feature Extraction for Fast Point Cloud Registration
  • [Arxiv] Pseudo Supervised Monocular Depth Estimation with Teacher-Student Network
  • [BMVC2021] Multi-Stream Attention Learning for Monocular Vehicle Velocity and Inter-Vehicle Distance Estimation
  • [Arxiv] Occlusion-Robust Object Pose Estimation with Holistic Representation [github]
  • [BMVC2021] Depth-only Object Tracking
  • [3DV2021] Self-Supervised Monocular Scene Decomposition and Depth Estimation
  • [Arxiv] Deep Point Cloud Normal Estimation via Triplet Learning
  • [3DV2021] Attention meets Geometry: Geometry Guided Spatial-Temporal Attention for Consistent Self-Supervised Monocular Depth Estimation
  • [CORL2021] LENS: Localization enhanced by NeRF synthesis
  • [3DV2021] PLNet: Plane and Line Priors for Unsupervised Indoor Depth Estimation [github]
  • [Arxiv] Unsupervised Pose-Aware Part Decomposition for 3D Articulated Objects
  • [ICCV2021] PCAM: Product of Cross-Attention Matrices for Rigid Registration of Point Clouds [Project]
  • [ICCV2021] Excavating the Potential Capacity of Self-Supervised Monocular Depth Estimation
  • [ICCV2021] StereOBJ-1M: Large-scale Stereo Image Dataset for 6D Object Pose Estimation
  • [IROS2021] KDFNet: Learning Keypoint Distance Field for 6D Object Pose Estimation
  • [ICCV2021] Estimating and Exploiting the Aleatoric Uncertainty in Surface Normal Estimation [github]
  • [Arxiv] Leveraging SE(3) Equivariance for Self-Supervised Category-Level Object Pose Estimation [Project]
  • [ICCV2021] Deep Hough Voting for Robust Global Registration
  • [Arxiv] You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors [Project]
  • [ICCV2021] A Robust Loss for Point Cloud Registration
  • [Arxiv] Geometry-Aware Self-Training for Unsupervised Domain Adaptation on Object Point Clouds
  • [IROS2021] Category-Level 6D Object Pose Estimation via Cascaded Relation and Recurrent Reconstruction Networks [Project] [github]
  • [ICCV2021] StructDepth: Leveraging the structural regularities for self-supervised indoor depth estimation [github]
  • [ICCV2021] SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation
  • [ICCV2021] Self-supervised Monocular Depth Estimation for All Day Images using Domain Separation
  • [ICCV2021] AdaFit: Rethinking Learning-based Normal Estimation on Point Clouds [Project]
  • [Arxiv] DnD: Dense Depth Estimation in Crowded Dynamic Indoor Scenes
  • [ICCV2021] Towards Interpretable Deep Networks for Monocular Depth Estimation [github]
  • [Arxiv] UPDesc: Unsupervised Point Descriptor Learning for Robust Registration
  • [IROS2021] BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models [github]
  • [Arxiv] RigNet: Repetitive Image Guided Network for Depth Completion
  • [Arxiv] DCL: Differential Contrastive Learning for Geometry-Aware Depth Synthesis
  • [ACMMM2021] BridgeNet: A Joint Learning Network of Depth Map Super-Resolution and Monocular Depth Estimation [Project] [github]
  • [Arxiv] Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation
  • [ICCV2021] HRegNet: A Hierarchical Network for Large-scale Outdoor LiDAR Point Cloud Registration [Project] [pytorch]
  • [Arxiv] Score-Based Point Cloud Denoising
  • [Arxiv] HIDA: Towards Holistic Indoor Understanding for the Visually Impaired via Semantic Instance Segmentation with a Wearable Solid-State LiDAR Sensor
  • [Arxiv] Learn to Learn Metric Space for Few-Shot Segmentation of 3D Shapes
  • [Arxiv] EdgeConv with Attention Module for Monocular Depth Estimation
  • [ICML2021] Implicit-PDF: Non-Parametric Representation of Probability Distributions on the Rotation Manifold [Project]
  • [ICRA2021] An Adaptive Framework For Learning Unsupervised Depth Completion [github] [github]
  • [ICRA2021] TSDF++: A Multi-Object Formulation for Dynamic Object Tracking and Reconstruction [github]
  • [Siggraph2021] Orienting Point Clouds with Dipole Propagation
  • [CVPR2021] The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth
  • [Arxiv] Fully Convolutional Line Parsing [pytorch]
  • [CVPR2021] Depth Completion using Plane-Residual Representation
  • [Arxiv] Domain Adaptive Monocular Depth Estimation With Semantic Information
  • [CVPR2021] Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries [github]
  • [Arxiv] Local Metrics for Multi-Object Tracking
  • [Arxiv] Full Surround Monodepth from Multiple Cameras
  • [CVPR2021] RGB-D Local Implicit Function for Depth Completion of Transparent Objects [Project]
  • [CVPR2021] Learning Camera Localization via Dense Scene Matching [pytorch]
  • [Arxiv] LSG-CPD: Coherent Point Drift with Local Surface Geometry for Point Cloud Registration
  • [ICRA2021] PlaneSegNet: Fast and Robust Plane Estimation Using a Single-stage Instance Segmentation CNN
  • [Arxiv] Learning Fine-Grained Segmentation of 3D Shapes without Part Labels
  • [CVPR2021] Skeleton Merger: an Unsupervised Aligned Keypoint Detector
  • [CVPR2021] Beyond Image to Depth: Improving Depth Prediction using Echoes
  • [CVPR2021] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism [Project]
  • [CVPR2021] Self-supervised Geometric Perception
  • [Arxiv] StablePose: Learning 6D Object Poses from Geometrically Stable Patches
  • [Arxiv] A Parameterised Quantum Circuit Approach to Point Set Matching
  • [Arxiv] Adjoint Rigid Transform Network: Self-supervised Alignment of 3D Shapes
  • [Arxiv] Video Transformer Network
  • [ICLR2021] NeMo: Neural Mesh Models of Contrastive Features for Robust 3D Pose Estimation [pytorch]
  • [Arxiv] NBDT: Neural-Backed Decision Tree [pytorch]
  • [Arxiv] AdaBins: Depth Estimation using Adaptive Bins [pytorch]
  • [Arxiv] Unsupervised Monocular Depth Reconstruction of Non-Rigid Scenes
  • [Arxiv] CorrNet3D: Unsupervised End-to-end Learning of Dense Correspondence for 3D Point Clouds

Before 2021

  • [NeurIPS2019] PRNet: Self-Supervised Learning for Partial-to-Partial Registration [pytorch]
  • [Arxiv] iNeRF: Inverting Neural Radiance Fields for Pose Estimation [Project]
  • [Arxiv] Boosting Monocular Depth Estimation with Lightweight 3D Point Fusion
  • [Arxiv] 3D Registration for Self-Occluded Objects in Context
  • [Arxiv] Continuous Surface Embeddings
  • [Arxiv] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration
  • [Arxiv] MVTN: Multi-View Transformation Network for 3D Shape Recognition
  • [Arxiv] PREDATOR: Registration of 3D Point Clouds with Low Overlap
  • [Arxiv] Deep Magnification-Arbitrary Upsampling over 3D Point Clouds
  • [Arxiv] Occlusion Guided Scene Flow Estimation on 3D Point Clouds
  • [NeurIPS2020] An Analysis of SVD for Deep Rotation Estimation
  • [EG2020W] SHREC 2020 track: 6D object pose estimation
  • [ACCV2020] Best Buddies Registration for Point Clouds
  • [3DV] A New Distributional Ranking Loss With Uncertainty: Illustrated in Relative Depth Estimation
  • [BMVC2020] View-consistent 4D Light Field Depth Estimation
  • [BMVC2020] Neighbourhood-Insensitive Point Cloud Normal Estimation Network [Project]
  • [ECCV2020] DeepGMR: Learning Latent Gaussian Mixture Models for Registration [Project]
  • [ECCV2020] Motion Capture from Internet Videos [Project]
  • [ECCV2020] Depth Completion with RGB Prior
  • [ECCV2020] 6D Camera Relocalization in Ambiguous Scenes via Continuous Multimodal Inference
  • [Arxiv] Self-Supervised Learning of Point Clouds via Orientation Estimation
  • [SIGGRAPH2020] SymmetryNet: Learning to Predict Reflectional and Rotational Symmetries of 3D Shapes from Single-View RGB-D Images [Project]
  • [ECCV2020] Learning Stereo from Single Images [github]
  • [Arxiv] Learning Long-term Visual Dynamics with Region Proposal Interaction Networks [Project]
  • [ECCV2020] Beyond Controlled Environments: 3D Camera Re-Localization in Changing Indoor Scenes [Project]
  • [ECCV2020] Unsupervised Shape and Pose Disentanglement for 3D Meshes
  • [Arxiv] PVSNet: Pixelwise Visibility-Aware Multi-View Stereo Network
  • [ECCV2020] P2Net: Patch-match and Plane-regularization for Unsupervised Indoor Depth Estimation
  • [CVPR2020] Learning multiview 3D point cloud registration [pytorch]
  • [CVPR2020] Feature-metric Registration: A Fast Semi-supervised Approach for Robust Point Cloud Registration without Correspondences
  • [Siggraph2020] Consistent Video Depth Estimation
  • [Arxiv] Deep Feature-preserving Normal Estimation for Point Cloud Filtering
  • [Arxiv] Pseudo RGB-D for Self-Improving Monocular SLAM and Depth Prediction
  • [CVPR2020] Towards Better Generalization: Joint Depth-Pose Learning without PoseNet [pytorch]
  • [Arxiv] Monocular Camera Localization in Prior LiDAR Maps with 2D-3D Line Correspondences
  • [Arxiv] Adversarial Texture Optimization from RGB-D Scans
  • [Arxiv] SAPIEN: A SimulAted Part-based Interactive ENvironment
  • [CVPR2020] G2L-Net: Global to Local Network for Real-time 6D Pose Estimation with Embedding Vector Features
  • [Arxiv] On Localizing a Camera from a Single Image
  • [Arxiv] DeepFit: 3D Surface Fitting via Neural Network Weighted Least Squares
  • [CVPR2020] KFNet: Learning Temporal Camera Relocalization using Kalman Filtering
  • [Arxiv] Neural Contours: Learning to Draw Lines from 3D Shapes
  • [Arxiv] 3dDepthNet: Point Cloud Guided Depth Completion Network for Sparse Depth and Single Color Image
  • [Arxiv] Unsupervised Learning of Category-Specific Symmetric 3D Keypoints from Point Sets
  • [CVPR2020] End-to-End Learning Local Multi-view Descriptors for 3D Point Clouds
  • [Arxiv] PnP-Net: A hybrid Perspective-n-Point Network
  • [CVPR2020] MobilePose: Real-Time Pose Estimation for Unseen Objects with Weak Shape Supervision
  • [CVPR2020] D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry
  • [ICIP2020] Triangle-Net: Towards Robustness in Point Cloud Classification
  • [ICRA2020] Robust 6D Object Pose Estimation by Learning RGB-D Features
  • [Arxiv] Predicting Sharp and Accurate Occlusion Boundaries in Monocular Depth Estimation Using Displacement Fields
  • [Arxiv] Single Image Depth Estimation Trained via Depth from Defocus Cues [pytorch]
  • [Arxiv] DepthTransfer: Depth Extraction from Video Using Non-parametric Sampling
  • [Arxiv] Target-less registration of point clouds: A review
  • [Arxiv] Quaternion Equivariant Capsule Networks for 3D point clouds
  • [Arxiv] Category-Level Articulated Object Pose Estimation
  • [Arxiv] A Quantum Computational Approach to Correspondence Problems on Point Sets
  • [Arxiv] DeepSFM: Structure From Motion Via Deep Bundle Adjustment
  • [Arxiv] P2GNet: Pose-Guided Point Cloud Generating Networks for 6-DoF Object Pose Estimation
  • [ICCV2019] Learning Local RGB-to-CAD Correspondences for Object Pose Estimation
  • [ICCV2019] Joint Embedding of 3D Scan and CAD Objects [dataset]
  • [ICLR2019] BA-Net: Dense Bundle Adjustment Networks [tensorflow]
  • [ICCV2019] GP2C: Geometric Projection Parameter Consensus for Joint 3D Pose and Focal Length Estimation in the Wild
  • [ICCV2019] Closed-Form Optimal Two-View Triangulation Based on Angular Errors
  • [ICCV2019] Polarimetric Relative Pose Estimation
  • [ICCV2019] End-to-End CAD Model Retrieval and 9DoF Alignment in 3D Scans
  • [ICCV2019] Deep Non-Rigid Structure from Motion
  • [CVPR2019] On the Continuity of Rotation Representations in Neural Networks [pytorch]
  • [Arxiv] Deep Interpretable Non-Rigid Structure from Motion [tensorflow]
  • [Arxiv] IKEA Furniture Assembly Environment for Long-Horizon Complex Manipulation Tasks [dataset]
  • [CVPR2019] Scan2CAD: Learning CAD Model Alignment in RGB-D Scans [pytorch] 🔥
  • [3DV2019] Location Field Descriptors: Single Image 3D Model Retrieval in the Wild
  • [CVPR2016] Marr Revisited: 2D-3D Alignment via Surface Normal Prediction [caffe]

Survey, Resources and Tools

  • [Dataset] Aria Synthetic Environments Dataset
  • [Dataset] Aria Digital Twin: A New Benchmark Dataset for Egocentric 3D Machine Perception [Project]
  • [Dataset] CAD-Estate: Large-scale CAD Model Annotation in RGB Videos [github]
  • [Arxiv] Teaching CLIP to Count to Ten
  • [Arxiv] ControlNet
  • [Arxiv] T2I-Adapter
  • [Arxiv] OmniObject3D: Large-Vocabulary 3D Object Dataset for Realistic Perception, Reconstruction and Generation [Project]
  • [Arxiv] SDFStudio: A Unified Framework for Surface Reconstruction [Project]
  • [Arxiv] Objaverse: A Universe of Annotated 3D Objects [Project]
  • [Arxiv] Omni3D: A Large Benchmark and Model for 3D Object Detection in the Wild [Project]
  • [NeurIPS2021] ARKitScenes: A Diverse Real-World Dataset For 3D Indoor Scene Understanding Using Mobile RGB-D Data [github]
  • [Dataset] ReplicaCAD [Project]
  • [PhDthesis] Synthesizing Photorealistic Images with Deep Generative Learning
  • [ICCVW2021] V2X-Sim: A Virtual Collaborative Perception Dataset for Autonomous Driving [Project]
  • [Arxiv] TransCG: A Large-Scale Real-World Dataset for Transparent Object Depth Completion and Grasping [Project]
  • [Arxiv] A Survey of Neural Trojan Attacks and Defenses in Deep Learning
  • [Arxiv] Tiny Object Tracking: A Large-scale Dataset and A Baseline [github]
  • [Arxiv] A survey of top-down approaches for human pose estimation
  • [Arxiv] A Survey on RGB-D Datasets
  • [Arxiv] Avoiding Overfitting: A Survey on Regularization Methods for Convolutional Neural Networks

Before 2022

  • [Arxiv] iSeg3D: An Interactive 3D Shape Segmentation Tool
  • [Arxiv] Benchmarking Pedestrian Odometry: The Brown Pedestrian Odometry Dataset (BPOD) [Project]
  • [Arxiv] PandaSet: Advanced Sensor Suite Dataset for Autonomous Driving [Project]
  • [Arxiv] Few-Shot Object Detection: A Survey
  • [Arxiv] Paris-CARLA-3D: A Real and Synthetic Outdoor Point Cloud Dataset for Challenging Tasks in 3D Mapping [Project]
  • [Arxiv] PyTorchVideo: A Deep Learning Library for Video Understanding [Project]
  • [Arxiv] DIML/CVL RGB-D Dataset: 2M RGB-D Images of Natural Indoor and Outdoor Scenes [Project]
  • [Arxiv] A Review on Human Pose Estimation
  • [ICCV2021] BuildingNet: Learning to Label 3D Buildings [Project]
  • [ICCV2021] Omnidata: A Scalable Pipeline for Making Multi-Task Mid-Level Vision Datasets from 3D Scans [Project]
  • [Arxiv] Habitat-Matterport 3D Dataset (HM3D): 1000 Large-scale 3D Environments for Embodied AI
  • [Arxiv] MINERVAS: Massive INterior EnviRonments VirtuAl Synthesis [Project]
  • [Arxiv] UrbanScene3D: A Large Scale Urban Scene Dataset and Simulator [Project]
  • [Arxiv] SODA10M: Towards Large-Scale Object Detection Benchmark for Autonomous Driving [Project]
  • [Arxiv] A Survey on Human-aware Robot Navigation
  • [Arxiv] One Million Scenes for Autonomous Driving: ONCE Dataset [Project]
  • [Arxiv] 3D Object Detection for Autonomous Driving: A Survey
  • [Arxiv] The Oxford Road Boundaries Dataset
  • [CVPR2021] 3D AffordanceNet: A Benchmark for Visual Object Affordance Understanding
  • [Arxiv] 3DB: A Framework for Debugging Computer Vision Models [github]
  • [Arxiv] NViSII: A Scriptable Tool for Photorealistic Image Generation [github]
  • [Dataset] Structured3D: A Large Photo-realistic Dataset for Structured 3D Modeling
  • [Survey] 3D Semantic Scene Completion: a Survey
  • [Survey] Deep Learning based 3D Segmentation: A Survey
  • [Survey] A comprehensive survey on point cloud registration
  • [Survey] Domain Generalization: A Survey
  • [Dataset] SUM: A Benchmark Dataset of Semantic Urban Meshes
  • [Survey] Attention Models for Point Clouds in Deep Learning: A Survey
  • [Benchmark] H3D: Benchmark on Semantic Segmentation of High-Resolution 3D Point Clouds and textured Meshes from UAV LiDAR and Multi-View-Stereo [Project]
  • [Survey] Dynamic Neural Networks: A Survey
  • [Survey] Online Continual Learning in Image Classification: An Empirical Survey
  • [Survey] Deep Learning for Visual Tracking: A Comprehensive Survey
  • [Survey] Occlusion Handling in Generic Object Detection: A Review
  • [Survey] Curriculum Learning: A Survey
  • [Github] Awesome Neural Radiance Fields
  • [Survey] Neural Volume Rendering: NeRF And Beyond
  • [Survey] Transformers in Vision: A Survey
  • [Survey] Efficient Transformers: A Survey
  • [Survey] Semantics for Robotic Mapping, Perception and Interaction: A Survey
  • [Survey] Generative Adversarial Networks in Computer Vision: A Survey and Taxonomy

Before 2021

  • [Dataset] The Replica Dataset: A Digital Replica of Indoor Spaces [github]
  • [IROS2021] iGibson 1.0: a Simulation Environment for Interactive Tasks in Large Realistic Scenes [Project]
  • [Dataset] Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild with Pose Annotations [Github]
  • [Survey] Skeleton-based Approaches based on Machine Vision: A Survey
  • [Survey] Deep Learning-Based Human Pose Estimation: A Survey [Github]
  • [Dataset] Hypersim: A Photorealistic Synthetic Dataset for Holistic Indoor Scene Understanding [Github]
  • [Survey] A Review and Comparative Study on Probabilistic Object Detection in Autonomous Driving [Github]
  • [Dataset] RELLIS-3D Dataset: Data, Benchmarks and Analysis [Github]
  • [Arxiv] Motion Prediction on Self-driving Cars: A Review
  • [Github] TESSE: Unity-based simulator to enable research in perception, mapping, learning, and robotics
  • [Survey] A Survey on Visual Transformer
  • [Survey] A Survey on Contrastive Self-supervised Learning
  • [Survey] A Survey of Surface Reconstruction from Point Clouds
  • [Dataset] Torch-Points3D: A Modular Multi-Task Framework for Reproducible Deep Learning on 3D Point Clouds [Project]
  • [Thesis] Learning to Reconstruct and Segment 3D Objects
  • [Survey] An Overview Of 3D Object Detection
  • [Survey] A Brief Review of Domain Adaptation
  • [Dataset] Announcing the Objectron Dataset
  • [Tutorial] Video Action Understanding: A Tutorial
  • [Arxiv] Fusion 360 Gallery: A Dataset and Environment for Programmatic CAD Reconstruction [Page]
  • [Survey] Multi-Task Learning with Deep Neural Networks: A Survey
  • [Survey] Deep Learning for 3D Point Cloud Understanding: A Survey
  • [Thesis] COMPUTATIONAL ANALYSIS OF DEFORMABLE MANIFOLDS: FROM GEOMETRIC MODELING TO DEEP LEARNING
  • [Arxiv] F*: An Interpretable Transformation of the F-measure
  • [Dataset] Gibson Database of 3D Spaces
  • [BMVC2020] Black Magic in Deep Learning: How Human Skill Impacts Network Training
  • [Arxiv] PyTorch Metric Learning
  • [Arxiv] RGB-D Salient Object Detection: A Survey [Project]
  • [Arxiv] AiRound and CV-BrCT: Novel Multi-View Datasets for Scene Classification [Project]
  • [CVPR2020] OASIS: A Large-Scale Dataset for Single Image 3D in the Wild [Project]
  • [Arxiv] 3D-FUTURE: 3D FUrniture shape with TextURE
  • [Arxiv] 3D-FRONT: 3D Furnished Rooms with layOuts and semaNTics [Project][Link]
  • [Arxiv] Differentiable Rendering: A Survey
  • [Arxiv] Visual Relationship Detection using Scene Graphs: A Survey
  • [Arxiv] Polarization Human Shape and Pose Dataset
  • [Arxiv] IDDA: a large-scale multi-domain dataset for autonomous driving [Project page]
  • [CVPR2020] RoboTHOR: An Open Simulation-to-Real Embodied AI Platform [Project page]
  • [EG2020] State of the Art on Neural Rendering
  • [IJCAI-PRICAI2020] 3D-FUTURE: 3D FUrniture shape with TextURE
  • [Arxiv] Toronto-3D: A Large-scale Mobile LiDAR Dataset for Semantic Segmentation of Urban Roadways
  • [Arxiv] KeypointNet: A Large-scale 3D Keypoint Dataset Aggregated from Numerous Human Annotations
  • [Arxiv] A Review on Generative Adversarial Networks: Algorithms, Theory, and Applications
  • [Arxiv] From Seeing to Moving: A Survey on Learning for Visual Indoor Navigation (VIN)
  • [Arxiv] DIODE: A Dense Indoor and Outdoor DEpth Dataset [dataset]
  • [Github] Various GANs with Pytorch.
  • [Arxiv] SemanticPOSS: A Point Cloud Dataset with Large Quantity of Dynamic Instances [dataset]
  • [CVM] A Survey on Deep Geometry Learning: From a Representation Perspective
  • [Arxiv] A survey on Semi-, Self- and Unsupervised Techniques in Image Classification
  • [Arxiv] fastai: A Layered API for Deep Learning
  • [Arxiv] AU-AIR: A Multi-modal Unmanned Aerial Vehicle Dataset for Low Altitude Traffic Surveillance [dataset]
  • [Arxiv] VIRTUAL KITTI 2 [dataset]
  • [Arxiv] Tutorial on Variational Autoencoders
  • [Arxiv] Review: deep learning on 3D point clouds
  • [Arxiv] Image Segmentation Using Deep Learning: A Survey
  • [CVPR2018] Pixels, Voxels, and Views: A Study of Shape Representations for Single View 3D Object Shape Prediction
  • [Arxiv] Evolution of Image Segmentation using Deep Convolutional Neural Network: A Survey
  • [Arxiv] MCMLSD: A Probabilistic Algorithm and Evaluation Framework for Line Segment Detection
  • [Arxiv] Deep Learning for 3D Point Clouds: A Survey
  • [Arxiv] A Survey on Deep Learning-based Architectures for Semantic Segmentation on 2D images
  • [Arxiv] A Survey on Deep Learning Architectures for Image-based Depth Reconstruction
  • [Arxiv] secml: A Python Library for Secure and Explainable Machine Learning
  • [Arxiv] Bundle Adjustment Revisited
  • [ICCV2019] Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement
  • [Arxiv] SIFT Meets CNN: A Decade Survey of Instance Retrieval
  • [ICCV2019] Revisiting Point Cloud Classification: A New Benchmark Dataset and Classification Model on Real-World Data [tensorflow]
  • [Arxiv] BlendedMVS: A Large-scale Dataset for Generalized Multi-view Stereo Networks [dataset]
  • [Arxiv] Imbalance Problems in Object Detection: A Review [repository]
  • [IJCV] Deep Learning for Generic Object Detection: A Survey
  • [Arxiv] Differentiable Visual Computing (Ph.D. thesis)
  • [BMVC2018] InteriorNet: Mega-scale Multi-sensor Photo-realistic Indoor Scenes Dataset [dataset]
  • [ICCV2017] The Mapillary Vistas Dataset for Semantic Understanding of Street Scenes [dataset] [script] ⭐
  • [Arxiv] SynthCity: A large scale synthetic point cloud [dataset]
  • [Github] Mesh Voxelization (SDFs or Occupancy grids)
  • [Github] SDFGen (to generate grid-based signed distance field (level set))
  • [Github] Blender renderer for python
  • [Github] Volumetric TSDF Fusion of RGB-D Images in Python
  • [Github] Volumetric TSDF Fusion of Multiple Depth Maps
  • [Github] PyFusion
  • [Github] PyRender
  • [Github] PyMCubes
  • [Github] Watertight and Simplified Meshes through TSDF Fusion (Python tool for obtaining watertight meshes using TSDF fusion.)
  • [Github] Several tools about SDF functions.
  • [Github] 3DMatch Toolbox
  • [stackoverflow] Computing truncated signed distance function (TSDF) from a point cloud
  • [Github] voxblox: A library for flexible voxel-based mapping, mainly focusing on truncated and Euclidean signed distance fields.
  • [Github] Discregrid: A static C++ library for the generation of discrete functions on a box-shaped domain. This is especially suited for the generation of signed distance fields.
  • [Github] awesome-voxel: Voxel resources for coders
  • [Github] gvdb-voxels: Sparse volume compute and rendering on NVIDIA GPUs
  • [Github] pyntcloud: A Python library for working with 3D point clouds.
  • [Github] Open3D: A Modern Library for 3D Data Processing
  • [Github] mesh_to_sdf: Calculate signed distance fields for arbitrary meshes
  • [Github] Detecting & Penalizing Mesh Intersections
  • [CVPR2021] Picasso: A CUDA-based Library for Deep Learning over 3D Meshes [Github]
  • [Github] NVIDIA DALI: A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications
  • [Arxiv] Shuffler: A Large Scale Data Management Tool for Machine Learning in Computer Vision
  • [Arxiv] PyGAD: An Intuitive Genetic Algorithm Python Library [Github]
  • [ICRA2014] A Benchmark for RGB-D Visual Odometry, 3D Reconstruction and SLAM [Project]
  • [CVPR2016] SceneNet: Understanding Real World Indoor Scenes With Synthetic Data [Project]

Contributors

gaozhongpai, hippogriff, raincrash, wufeim, yinyunie