Code Monkey home page Code Monkey logo

a2d2's Introduction

a2d2

Setup

Approach

Use the approach form this paper 3D Bounding Box Estimation Using Deep Learning and Geometry (https://arxiv.org/pdf/1612.00496.pdf)

Regress on stable 3d object properties using deep convolutional neural netword and then combine these estimates with geometric constrains provided by a 2D object bounding box to produce a complete 3D bounding box.

first network output estimates 3D object orientation second network output estimates 3D object dimensions

method takes the 2D detection bounding box and estimates a 3D bounding box

Problem

Given the LIDAR and CAMERA data, determine the location and orientation in 3D of other vehicles around the car.

Solution

2D object detection on camera image is easy and can be solved by various CNN-based solutions like YOLO and RCNN. The tricky part here is the 3D requirement. It becomes a little easier with LIDAR data, accompanied with the calibration matrices.

So the solution is straight-forward in three processing steps:

  • Detect 2D BBoxes of other vehicles visible on image frame captured by CAMERA. This can be achieved by YOLOv2 or SqueezeDet. It turns out that SqueezeDet works better for this job and is selected.
  • Determine the dimension and the orientation of detected vehicles. As demonstrated by https://arxiv.org/abs/1612.00496, dimension and orientation of other vehicles can be regressed from the image patch of corresponding 2D BBoxes.
  • Determine the location in 3D of detected vehicles. This can be achived by localizing the point cloud region whose projection stays within the detected 2D BBoxes.

References

a2d2's People

Contributors

wittmaan avatar dependabot[bot] avatar

Stargazers

 avatar  avatar

Watchers

 avatar paper2code - bot avatar

a2d2's Issues

Projecting 3D bounding box to image

Hi @wittmaan,

I'm struggling to plot a 3D bounding box (ground truth or prediction) to an image. The A2D2 dataset provides the camera intrinsic matrix (K) and the rotation+translation of the camera frame to the global frame. This results in the K [R | T] matrix needed to project a 3D point to 2D space, but it gives me strange results.

Did you find a solution for this?

Thanks,

Michiel

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.