ObjectPoseEstimationDatasets

This repo summarizes datasets used for object pose estimation, along with rendering methods used to generate synthetic training data.

In the following tables, a 3D CAD model is referred to as a model and an object pictured in a 2D image is referred to as an object.

Rendering methods

Differentiable Renderer

Neural 3D Mesh Renderer: Kato et al. CVPR 2018

RenderNet: Nguyen-Phuoc et al. NIPS 2018

Blender Render

This repo provides Python code that renders images from 3D models using Blender as a Python module, which is easy to install and produces photo-realistic images.

TODO: add scripts showing how to use it.

Other Blender-based work, which renders one model at a time, can be found here.
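
Until the usage scripts are added, the viewpoint-sampling step that usually precedes such a rendering loop can be sketched as follows. This is only an illustration under stated assumptions: the function names `camera_position` and `sample_viewpoints`, the Z-up world frame, and the azimuth/elevation parameterization are hypothetical and are not taken from this repo's code. Each returned position would then be handed to the renderer as the camera location looking at the model placed at the origin.

```python
import math


def camera_position(azimuth_deg, elevation_deg, distance):
    """Return the (x, y, z) camera location on a sphere around the origin.

    Assumes a Z-up world frame; azimuth is measured in the XY plane from +X.
    """
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = distance * math.cos(el) * math.cos(az)
    y = distance * math.cos(el) * math.sin(az)
    z = distance * math.sin(el)
    return (x, y, z)


def sample_viewpoints(n_azimuth, elevations_deg, distance):
    """Enumerate evenly spaced azimuths for every requested elevation."""
    views = []
    for el in elevations_deg:
        for i in range(n_azimuth):
            az = 360.0 * i / n_azimuth
            views.append((az, el, camera_position(az, el, distance)))
    return views


if __name__ == "__main__":
    # 4 azimuths x 2 elevations = 8 candidate camera poses at distance 2.
    for az, el, pos in sample_viewpoints(4, [0.0, 30.0], 2.0):
        print(az, el, pos)
```

Sampling on a regular grid keeps the sketch deterministic; random sampling of the same three parameters is an equally plausible choice.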

Physical Simulator

PyBullet: a very popular one in the Robotics community.

Others

Glumpy: does not support headless rendering (it failed when run over SSH).

UnrealCV: an extension of Unreal Engine 4 that helps programs interact with the virtual world and communicate with external programs.

SyntheticComputerVision: a summary of many techniques used to generate synthetic images.

Attention: 3D models should be aligned in the same way using MeshLab to ensure a consistent orientation when moving between datasets.
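
As a rough illustration of what such an alignment amounts to, the sketch below applies one fixed rotation to every vertex of a model. It assumes, purely as an example, that one dataset stores models Y-up while the target convention is Z-up; the names `Y_UP_TO_Z_UP` and `apply_rotation` are hypothetical, and the actual axes of each dataset have to be checked by hand (for instance in MeshLab).

```python
# Rotation of +90 degrees about the X axis: (x, y, z) -> (x, -z, y).
# It maps the Y-up axis (0, 1, 0) to the Z-up axis (0, 0, 1).
# Assumption: the source model is Y-up and the target convention is Z-up.
Y_UP_TO_Z_UP = (
    (1.0, 0.0, 0.0),
    (0.0, 0.0, -1.0),
    (0.0, 1.0, 0.0),
)


def apply_rotation(rotation, vertices):
    """Rotate every (x, y, z) vertex by a 3x3 rotation matrix."""
    rotated = []
    for v in vertices:
        rotated.append(
            tuple(sum(row[k] * v[k] for k in range(3)) for row in rotation)
        )
    return rotated


if __name__ == "__main__":
    vertices = [(0.0, 1.0, 0.0), (1.0, 2.0, 3.0)]
    print(apply_rotation(Y_UP_TO_Z_UP, vertices))
```

Applying the same matrix to every model of a dataset keeps poses comparable across datasets, which is the purpose of the alignment step.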

3D model datasets

To test a network's ability to generalize (that is, to evaluate it on images containing 3D models not seen during training), the following datasets can be used to generate synthetic training data.

Note that ABC contains generic, arbitrary industrial CAD models, while ShapeNetCore and ModelNet contain objects from common categories such as cars and chairs.

| Dataset | Number of categories | Number of models | Reference |
| --- | --- | --- | --- |
| ABC | - | 1 million | CVPR 2019 |
| ShapeNetCore | 55 | ~51,300 | arXiv 2015 |
| ModelNet-40 | 40 | 26,960 | CVPR 2015 |

Objects in the controlled environments

This table lists the datasets collected in BOP (Benchmark for 6D Object Pose Estimation), which provide accurate 3D object models and precise 2D-3D alignment.

| Dataset | Annotation | Statistics | Reference |
| --- | --- | --- | --- |
| HomebrewedDB | 6D Pose + Depth + BoundingBox | 33 models in 13 videos with 17,420 frames | Preprint 2019 |
| YCB-Video | 6D Pose + Depth + Mask | 21 models in 92 videos with 133,827 frames | RSS 2018 |
| T-LESS | 6D Pose + Depth | 30 models in 20 videos with ~49K frames | WACV 2017 |
| Doumanoglou | 6D Pose + Depth | 2 models in 3 videos with 183 frames | CVPR 2016 |
| Tejani | 6D Pose + Depth | 6 models in 6 videos with 2,067 frames | ECCV 2014 |
| Occluded-LINEMOD | 6D Pose + Depth | 8 models in 1,214 frames with 8,992 objects | ECCV 2014 |
| LINEMOD | 6D Pose + Depth for one object | 15 models in 15 videos with 18,273 frames | ACCV 2012 |
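
To make the link between a 6D pose annotation and 2D-3D alignment concrete, here is a minimal sketch of how a model point is mapped into the image. It assumes a pose given as a 3x3 rotation plus a translation from the model frame to the camera frame, and a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`. The function names and the numeric values are illustrative only and do not come from any of the datasets above.

```python
def transform_point(rotation, translation, point):
    """Map a model-frame point into the camera frame: R * p + t."""
    return tuple(
        sum(rotation[r][k] * point[k] for k in range(3)) + translation[r]
        for r in range(3)
    )


def project_point(camera_point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = camera_point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)


if __name__ == "__main__":
    # Hypothetical pose: no rotation, model placed 2 units in front of the camera.
    identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
    translation = (0.0, 0.0, 2.0)
    cam = transform_point(identity, translation, (0.1, -0.2, 0.0))
    print(project_point(cam, fx=500.0, fy=500.0, cx=320.0, cy=240.0))
```

When the pose and the 3D model are accurate, projecting every model vertex this way overlays the model on the pictured object, which is what precise 2D-3D alignment means in practice.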

Objects in the wild

In this table, Pix3D and ScanNet provide precise 2D-3D alignment, while the others provide only a coarse alignment. PASCAL3D+ is the de facto benchmark for viewpoint estimation.

| Dataset | Annotation | Statistics | Reference |
| --- | --- | --- | --- |
| ApolloCar3D | 6D Pose + Mask | 34 car models with 60K+ objects in 5,277 images | CVPR 2019 |
| Pix3D | 6D Pose + Mask | 9 categories containing 395 models in 10,069 images | CVPR 2018 |
| ScanNet | 6D Pose + Segmentation + Depth | 2.5M RGB-D frames in 1,515 scenes | CVPR 2017 |
| ObjectNet3D | Euler Angles + BoundingBox | 100 categories with 201,888 objects in 90,127 images | ECCV 2016 |
| PASCAL3D+ | Euler Angles + BoundingBox | 12 categories with 36,292 objects in 30,889 images | WACV 2014 |
| KITTI | 3D BoundingBox | 80,256 objects in 14,999 images | CVPR 2012 |
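
Datasets annotated with Euler angles describe a viewpoint with three angles rather than a full rotation matrix. The sketch below shows one way to compose such angles into a matrix. The Z-X-Z order used here is chosen only for illustration and is an assumption: each dataset defines its own axis convention and sign rules, so the order must be checked against that dataset's documentation before use. All function names are hypothetical.

```python
import math


def rot_z(deg):
    """Rotation matrix about the Z axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def rot_x(deg):
    """Rotation matrix about the X axis."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def euler_to_rotation(azimuth_deg, elevation_deg, inplane_deg):
    """Compose R = Rz(inplane) * Rx(elevation) * Rz(azimuth).

    Illustrative Z-X-Z order only, not any dataset's official convention.
    """
    return matmul(rot_z(inplane_deg), matmul(rot_x(elevation_deg), rot_z(azimuth_deg)))


if __name__ == "__main__":
    for row in euler_to_rotation(30.0, 20.0, 10.0):
        print(row)
```

Whatever convention is used, the result should be orthonormal with determinant 1, which is a cheap sanity check after conversion.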
