
depth-map-cv's Introduction

Depth Map

Objective:

With a single camera it is not possible to estimate the distance of a point P from the camera located at point O. Every point on the projective line through O and P maps to the same point p in the image, so depth cannot be recovered from one view alone.
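The ambiguity can be checked numerically. The following is a minimal sketch, assuming an ideal pinhole camera at the origin looking along the Z axis; the function name `project` and the focal length value are illustrative and not part of the original code.

```python
def project(point, f=1.0):
    """Project a 3D point (X, Y, Z) in camera coordinates onto the image plane."""
    X, Y, Z = point
    return (f * X / Z, f * Y / Z)


P = (2.0, 1.0, 4.0)
# Any positive multiple of P lies on the same ray through the camera centre O.
Q = tuple(3.0 * c for c in P)

print(project(P))  # (0.5, 0.25)
print(project(Q))  # (0.5, 0.25)
```

Both points land on the same image location even though Q is three times farther away, which is why a second view is needed.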


However, a stereo camera system solves this problem. Here I explore how a parallel camera system can be used to estimate the depth of objects in an image from scratch, without using any stereo-matching libraries.


Calculating depth (z):

$$\text{disparity } (d) = x_l - x_r \qquad (1)$$

$$\text{depth } (z) = \frac{f \cdot B}{d} \qquad (2)$$
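Equation (2) follows from similar triangles. As a sketch, assume (this setup is not spelled out in the original) that the left camera sits at the origin, the right camera is shifted by the baseline B along the x-axis, both share the focal length f, and the scene point has coordinates (X, Y, Z) in the left camera's frame:

```latex
x_l = \frac{f X}{Z}, \qquad x_r = \frac{f (X - B)}{Z}
\quad\Longrightarrow\quad
d = x_l - x_r = \frac{f B}{Z}
\quad\Longrightarrow\quad
z = Z = \frac{f B}{d}
```

With f in pixels and B in mm, d is in pixels and z comes out in mm.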

a. Image preprocessing

● Resize both stereo images to the same size (500, w) using cv2.resize(). This also speeds up processing, since the original images were around (2000, 2000). Function ⇒ resizeImage().
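A minimal sketch of how the target size could be computed, assuming that 500 is the new height and that w is chosen to preserve the aspect ratio (the original does not state this). The helper `target_dsize` is hypothetical and is not the author's resizeImage().

```python
def target_dsize(height, width, new_height=500):
    """Return (new_width, new_height), the order cv2.resize expects for dsize."""
    scale = new_height / height
    new_width = max(1, round(width * scale))
    return (new_width, new_height)


print(target_dsize(2000, 2000))  # (500, 500)
print(target_dsize(2000, 3000))  # (750, 500)
```

With OpenCV installed, the result could be passed as `cv2.resize(img, target_dsize(*img.shape[:2]))`. Note that shrinking an image by a factor s also shrinks pixel disparities and the focal length in pixels by the same factor, so f should be scaled accordingly before applying equation (2).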

b. Disparity Calculation

● Once the stereo images have the same size, the corresponding location in the right image is found for each window patch in the left image by normalized cross-correlation along the epipolar line.
● The disparity is then calculated from the corresponding locations using equation (1). get_disparity_parallel() calls compute_row(), which fetches window patches from the left image one by one and passes each to norm_cross_correlation(). norm_cross_correlation() returns the correlation map along the epipolar line to compute_row(), which finds the correspondence location and computes the disparity. Functions ⇒ get_disparity_parallel(), compute_row() and norm_cross_correlation().
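The matching step can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the author's implementation: it uses the zero-mean variant of normalized cross-correlation, images stored as nested lists, and a search restricted to the same row at x_r = x_l - d for d ≥ 0. The names `ncc`, `patch`, `disparity_at`, `half` and `max_disp` are hypothetical.

```python
import math
import random


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    if denom == 0:
        return 0.0  # flat patch: correlation undefined, treat as no match
    return sum(p * q for p, q in zip(da, db)) / denom


def patch(img, row, col, half):
    """Flatten the (2*half+1) x (2*half+1) window centred at (row, col)."""
    return [img[r][c]
            for r in range(row - half, row + half + 1)
            for c in range(col - half, col + half + 1)]


def disparity_at(left, right, row, col, half=1, max_disp=16):
    """Disparity x_l - x_r of the best NCC match on the same row."""
    ref = patch(left, row, col, half)
    best_d, best_score = 0, -2.0
    for d in range(max_disp + 1):
        xr = col - d
        if xr - half < 0:
            break  # window would leave the image
        score = ncc(ref, patch(right, row, xr, half))
        if score > best_score:
            best_d, best_score = d, score
    return best_d


# Synthetic pair: the right view is the left view shifted by SHIFT pixels.
rng = random.Random(0)
H, W, SHIFT = 5, 40, 3
left = [[rng.randint(0, 255) for _ in range(W)] for _ in range(H)]
right = [[left[r][min(c + SHIFT, W - 1)] for c in range(W)] for r in range(H)]

print(disparity_at(left, right, row=2, col=20))
```

On this synthetic pair the search should recover the shift used to build the right image. Real images need larger windows and a bounded search range, such as the ndisp value from calib.txt.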

c. Depth Estimation

● Once the disparity map is available, the depth of each pixel is calculated using equation (2). Function ⇒ get_depth()
● Pixels with zero disparity get a depth of infinity; for visualization purposes, these infinite values are handled by the function replace_inf().
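A minimal sketch of these two steps, assuming the disparity map is a nested list and that infinite depths are replaced by the largest finite depth. The original does not say which replacement value replace_inf() uses, so that choice and the function names below are assumptions.

```python
import math


def depth_from_disparity(disparity, f, B):
    """Apply z = f * B / d per pixel; zero disparity maps to infinity."""
    return [[f * B / d if d != 0 else math.inf for d in row]
            for row in disparity]


def replace_inf_with_max(depth):
    """Replace infinite depths with the largest finite depth in the map."""
    finite = [z for row in depth for z in row if math.isfinite(z)]
    fill = max(finite) if finite else 0.0
    return [[z if math.isfinite(z) else fill for z in row] for row in depth]


disparity = [[0, 10], [20, 40]]
depth = depth_from_disparity(disparity, f=1000.0, B=100.0)
print(depth)                        # [[inf, 10000.0], [5000.0, 2500.0]]
print(replace_inf_with_max(depth))  # [[10000.0, 10000.0], [5000.0, 2500.0]]
```

Clamping to the maximum finite depth keeps the far background at the far end of the colour scale instead of breaking the visualization range.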

Stereo Images:


Depth Map:

Semantics of calib.txt file terms =>

cam0,1: camera matrices for the rectified views, in the form [f 0 cx; 0 f cy; 0 0 1], where f is the focal length in pixels and (cx, cy) is the principal point (note that cx differs between view 0 and 1)

doffs: x-difference of principal points, doffs = cx1 - cx0

baseline: camera baseline in mm

width, height: image size

ndisp: a conservative bound on the number of disparity levels; the stereo algorithm MAY utilize this bound and search from d = 0 .. ndisp-1

isint: whether the ground-truth (GT) disparities only have integer precision (true for the older datasets; in this case submitted floating-point disparities are rounded to ints before evaluating)

vmin, vmax: a tight bound on minimum and maximum disparities, used for color visualization; the stereo algorithm MAY NOT utilize this information

dyavg, dymax: average and maximum absolute y-disparities, providing an indication of the calibration error present in the imperfect datasets.
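The values needed for equation (2) can be read from calib.txt with a small parser. This is a sketch that assumes one `key=value` pair per line, with matrices written in the bracketed form shown for cam0,1. The function name `parse_calib` is hypothetical and the sample values are made up for illustration, not taken from a real dataset.

```python
def parse_calib(text):
    """Parse 'key=value' lines of a calib.txt-style file into a dict."""
    calib = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if value.startswith("["):
            # Matrix: rows separated by ';', entries by whitespace.
            rows = value.strip("[]").split(";")
            calib[key] = [[float(x) for x in row.split()] for row in rows]
        else:
            try:
                calib[key] = float(value)
            except ValueError:
                calib[key] = value  # keep non-numeric entries as text
    return calib


# Illustrative values only.
sample = """cam0=[1000.0 0 320.0; 0 1000.0 240.0; 0 0 1]
cam1=[1000.0 0 350.0; 0 1000.0 240.0; 0 0 1]
doffs=30.0
baseline=100.0
width=640
height=480
"""

calib = parse_calib(sample)
f = calib["cam0"][0][0]                            # focal length in pixels
doffs = calib["cam1"][0][2] - calib["cam0"][0][2]  # cx1 - cx0
print(f, calib["baseline"], doffs)  # 1000.0 100.0 30.0
```

Here f and baseline correspond to f and B in equation (2).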

Resource => https://vision.middlebury.edu/stereo/data/scenes2014/

depth-map-cv's People

Contributors

nobeldang
