Code Monkey home page Code Monkey logo

robustplanarpose's Introduction

Pose Estimation For Planar Target
by Nghia Ho

http://nghiaho.com/?page_id=576

This is an OpenCV port of Robust Pose Estimation from a Planar Target (2006) by Gerald Schweighofer and Axel Pinz using their Matlab code from the link in the paper. It is used to determine the pose of a planar target. It can be used for Augmented Reality to track a planar target, such as a business card.

The algorithm is very robust and does not require accurate calibration of the camera for Augmented Reality applications. I found that a rough guess for the FOV of my webcam worked.

Changes from last version

Added a check to see if the root finding code returned any valid solutions (thanks to csantos)
Requirements

OpenCV 2.x or later

Usage

On Linux, type make to compile the demo code and ./demo to see how it works. On Windows you have to make your own project.

Note: For Visual Studio users you might get a compiler error about undefined M_PI_2, please visit http://msdn.microsoft.com/en-us/library/4hwaceh6.aspx for a fix.

To use this code in your own program you only need to call the function

RPP:Rpp(const cv::Mat &model_3D, const cv::Mat &iprts, cv::Mat &Rlu, cv::Mat &tlu, int &it1, double &obj_err1, double &img_err1);

A description of the parameters are as follows:

model_3D are the fixed 3D points of the object you want to track, they should never change throughout the program. For example, say we have a square card we want to track, we can arbitrarily choose (-1,-1,0), (-1,1,0), (1,1,0), (1,-1,0) as the 4 corners point in 3D space. The values you choose will determine the reference frame for which the estimated pose will be relative to. Using the same example, (0,0,0) in this case will be at the centre of the square card.

iprts are the normalised image points. You will need to form a typical 3×3 camera matrix to normalise the points, which we shall call K. K typically looks like

    [ fx 0  cx ]
K = [ 0  fy cy ]
    [ 0  0  1  ]

fx/fy are the focal lengths (in pixels) and cx/cy are the optical centres. For Augmented Reality application, I find you can get away with rough approximations. For example, say the webcam has a horizontal FOV of about 60 degrees and the image size is 640×480. Then the 4 parameters are:

cx = 640/2 = 320
cy = 480/2 = 240
fx = fy = cx/tan(60/2 * pi / 180) = 554.26

Now that you have K, to normalise the points simply do

[ x' ]         [ x ]
[ y' ] = K^(-1)[ y ]
[ 1  ]         [ 1 ]

where x and y are the un-normalised image points

Rlu is the returned 3×3 rotation matrix. If you pass in a valid 3×3 matrix the it will be used as the initial guess of the rotation. In a tracking scenario you would continuously feedback the returned Rlu as input for the next frame.

tlu is the returned 3×1 translation vector

it1 is the number of iterations done

obj_err1 is the object projection error in 3D

img_err1 is the image projection error, in normalised image point space

License

Simplified BSD License, unless superseded by a license from the original Matlab code or the copy and paste code for Rpoly.cpp.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.