
total-generalized-variation-stereo-matching-pipeline's Introduction

Stereo implementation of total generalized variation (TGV) on GPU

This is a CUDA C implementation of Fast and Accurate Large-Scale Stereo Reconstruction Using Variational Methods [1], supplied as part of the supplementary material to my honours thesis. To my knowledge, this is the only open-source implementation of stereo TGV. The results below are from the first scene in the KITTI 2015 stereo evaluation data set.

Cones animation: disparity images from each iteration of TGV's gradient descent
KITTI animation: disparity images from each iteration of TGV's gradient descent

Key:

  • MC = Matching costs
  • CA = Cost aggregation[2]
  • TGV = Total generalized variation[1]

Figures:

  • Left image
  • Right image
  • Estimated disparity image (MC)
  • Estimated disparity image (MC+CA)
  • Estimated disparity image (MC+CA+TGV)
  • Ground truth disparity image

Requirements

  • CUDA
  • CMake
  • Optional: ImageMagick, to convert .pgm to .png: mogrify -format png *.pgm

Usage

# Build
mkdir build
cd build
cmake ../
make

# Binary descriptor representation of census transform on a 7x7 window
export CENSUS_DESCRIPTOR="90,144,91,144,92,144,93,144,94,144,95,144,\
96,144,107,144,108,144,109,144,110,144,111,144,112,144,113,144,124,144,\
125,144,126,144,127,144,128,144,129,144,130,144,141,144,142,144,143,144,\
144,144,145,144,146,144,147,144,158,144,159,144,160,144,161,144,162,144,\
163,144,164,144,175,144,176,144,177,144,178,144,179,144,180,144,181,144,\
192,144,193,144,194,144,195,144,196,144,197,144,198,144,144,144,144,144,\
144,144,144,144,144,144,144,144,144,144,144,144,144,144,144,144,144,144,\
144,144,144,144,144,144,144,144"

# Usage:
# ./matching <left image> <right image> \
#   <estimated disparity image> <encoded binary descriptor> \
#   <max disparity> <disparity multiplier>

# Pipeline:
# 1. Matching costs
# 2. Cost Aggregation[2]
# 3. TGV[1]
./matching ../kitti/reference.pgm ../kitti/target.pgm ../out.pgm \
  $CENSUS_DESCRIPTOR 128 1
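
The numbers in CENSUS_DESCRIPTOR are not explained in this README. One reading that fits every value listed is that they are 64 pairs of pixel indices into a 17x17 patch stored in row-major order (index = row * 17 + column, so the centre pixel is 144), with unused pairs padded as 144,144. This is an inference from the values, not something the author states. Under that assumption, a minimal C sketch that regenerates the 7x7 census string could look like this (census_descriptor, PATCH_SIZE and NUM_PAIRS are illustrative names, not taken from the repository):

```c
#include <stdio.h>

/* Assumed layout (inferred from the values, not confirmed by the author):
 * indices address a 17x17 patch in row-major order, so the centre is 144. */
#define PATCH_SIZE 17
#define NUM_PAIRS 64

/* Writes the census descriptor for a (2 * radius + 1)^2 window as a
 * comma-separated list of NUM_PAIRS index pairs "pixel,centre".
 * Unused pairs are padded with "centre,centre".
 * Returns the number of characters written, or -1 if the window has more
 * than NUM_PAIRS pixels or the buffer is too small. */
int census_descriptor(int radius, char *out, size_t out_size)
{
    const int centre_rc = PATCH_SIZE / 2;
    const int centre = centre_rc * PATCH_SIZE + centre_rc;
    const int side = 2 * radius + 1;
    size_t used = 0;
    int pairs = 0;

    if (radius < 0 || radius > centre_rc || side * side > NUM_PAIRS)
        return -1;

    /* One pair per window pixel, visited row by row. */
    for (int r = centre_rc - radius; r <= centre_rc + radius; ++r) {
        for (int c = centre_rc - radius; c <= centre_rc + radius; ++c) {
            int n = snprintf(out + used, out_size - used, "%s%d,%d",
                             pairs ? "," : "", r * PATCH_SIZE + c, centre);
            if (n < 0 || (size_t)n >= out_size - used)
                return -1;
            used += (size_t)n;
            ++pairs;
        }
    }

    /* Pad up to NUM_PAIRS with pairs that compare the centre to itself. */
    for (; pairs < NUM_PAIRS; ++pairs) {
        int n = snprintf(out + used, out_size - used, ",%d,%d", centre, centre);
        if (n < 0 || (size_t)n >= out_size - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

Calling census_descriptor(3, buffer, sizeof buffer) yields the same 128 values as the export above, written on a single line without the backslash continuations.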

Limitations

  • The upper bound for the max disparity is 256. This can be increased by modifying the shared memory buffers of size 256 in selection-functions.cu.
  • Only greyscale .pgm format files can be used for input and output.
  • TGV parameter sets are specified at compile time. They are defined in selection-functions.cu.
...
// [1] mentions for low resolution Middlebury use λd=1.0 and λs=0.2
#define LAMBDA_S 0.2f
#define LAMBDA_A (8.0f * LAMBDA_S)
#define LAMBDA_D 1.0f
...
// [1] mentions for 2015 KITTI use λd=0.4 and λs=1.0
#define LAMBDA_S 1.0f
#define LAMBDA_A (8.0f * LAMBDA_S)
#define LAMBDA_D 0.4f
...
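
Because both parameter sets define the same macros, only one can be active in a given build. One possible way to switch between them without editing the values by hand is a preprocessor guard. The sketch below is an illustration only: TGV_PRESET_KITTI is a hypothetical macro that does not exist in the repository, and the values are the ones quoted above.

```c
/* Hypothetical preset switch: TGV_PRESET_KITTI is NOT a macro from the
 * repository; it only illustrates choosing one parameter set per build,
 * for example by passing -DTGV_PRESET_KITTI to the compiler. */
#if defined(TGV_PRESET_KITTI)
/* KITTI 2015 values quoted above: λd = 0.4, λs = 1.0 */
#define LAMBDA_S 1.0f
#define LAMBDA_D 0.4f
#else
/* Low-resolution Middlebury values quoted above: λd = 1.0, λs = 0.2 */
#define LAMBDA_S 0.2f
#define LAMBDA_D 1.0f
#endif

/* λa is tied to λs in both parameter sets listed above. */
#define LAMBDA_A (8.0f * LAMBDA_S)
```

Without the hypothetical flag, this selects the Middlebury values.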

Descriptors

Part of my research was learning binary descriptors using genetic algorithms (GAs) for use in stereo matching. The definitions of some descriptors from the paper are listed here.
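
To make the role of a binary descriptor concrete, here is a minimal host-side C sketch of how such a descriptor might be evaluated and compared. It rests on assumptions that this README does not confirm: each descriptor is read as 64 pairs of row-major indices into a 17x17 patch, each pair contributes one bit (set when the first pixel is darker than the second), and the matching cost is the Hamming distance between the left and right descriptors. The names sample, describe and hamming_cost are illustrative and do not come from the repository.

```c
#include <stdint.h>

#define PATCH_SIZE 17   /* assumed patch width (inferred, not confirmed) */
#define NUM_PAIRS 64    /* one bit per index pair, 64-bit descriptor */

/* Reads a pixel, clamping the coordinates to the image border. */
static uint8_t sample(const uint8_t *image, int width, int height,
                      int x, int y)
{
    if (x < 0) x = 0;
    if (y < 0) y = 0;
    if (x >= width) x = width - 1;
    if (y >= height) y = height - 1;
    return image[y * width + x];
}

/* Builds the 64-bit descriptor of the patch centred on (x, y).
 * pairs holds 2 * NUM_PAIRS patch indices: a0, b0, a1, b1, ... */
uint64_t describe(const uint8_t *image, int width, int height,
                  int x, int y, const int *pairs)
{
    const int half = PATCH_SIZE / 2;
    uint64_t bits = 0;
    for (int i = 0; i < NUM_PAIRS; ++i) {
        int a = pairs[2 * i];
        int b = pairs[2 * i + 1];
        uint8_t pa = sample(image, width, height,
                            x + a % PATCH_SIZE - half,
                            y + a / PATCH_SIZE - half);
        uint8_t pb = sample(image, width, height,
                            x + b % PATCH_SIZE - half,
                            y + b / PATCH_SIZE - half);
        /* Assumed comparison direction: bit is 1 when pa is darker. */
        bits = (bits << 1) | (uint64_t)(pa < pb);
    }
    return bits;
}

/* Matching cost: number of bits in which two descriptors differ. */
int hamming_cost(uint64_t left, uint64_t right)
{
    uint64_t diff = left ^ right;
    int cost = 0;
    while (diff) {
        diff &= diff - 1;   /* clear the lowest set bit */
        ++cost;
    }
    return cost;
}
```

Under these assumptions, the cost of assigning disparity d to pixel (x, y) would be hamming_cost(describe(left, ..., x, y, pairs), describe(right, ..., x - d, y, pairs)). Padding pairs such as 144,144 compare a pixel with itself, so they always produce a zero bit and never change the cost. In a CUDA kernel the bit-counting loop can be replaced by the __popcll intrinsic.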

Results

1. Census (C)


KITTI 3px threshold error: 10.96%
export CENSUS_DESCRIPTOR="90,144,91,144,92,144,93,144,94,144,95,144,\
96,144,107,144,108,144,109,144,110,144,111,144,112,144,113,144,124,144,\
125,144,126,144,127,144,128,144,129,144,130,144,141,144,142,144,143,144,\
144,144,145,144,146,144,147,144,158,144,159,144,160,144,161,144,162,144,\
163,144,164,144,175,144,176,144,177,144,178,144,179,144,180,144,181,144,\
192,144,193,144,194,144,195,144,196,144,197,144,198,144,144,144,144,144,\
144,144,144,144,144,144,144,144,144,144,144,144,144,144,144,144,144,144,\
144,144,144,144,144,144,144,144"

2. GA optimized (GA-R)


KITTI 3px threshold error: 8.26%
export GA_R_DESCRIPTOR="131,135,90,210,93,244,253,168,158,162,104,36,\
122,61,252,260,162,106,106,42,204,18,70,37,29,180,13,44,108,103,84,284,\
156,170,202,30,82,167,165,141,37,108,253,179,129,125,287,12,128,275,154,\
107,42,145,7,89,215,244,235,231,127,21,86,123,175,205,280,285,78,49,52,\
46,155,161,88,98,51,44,278,178,145,163,99,112,185,176,93,117,98,96,223,\
71,63,94,170,176,108,238,287,30,125,74,203,96,82,48,191,125,157,150,36,\
62,85,124,84,79,104,102,275,189,46,60,65,184,190,154,159,40"

3. GA optimized, census seed (GA-RC)


KITTI 3px threshold error: 8.14%
export GA_RC_DESCRIPTOR="90,144,91,144,92,144,93,144,94,144,95,144,96,\
144,107,144,119,27,84,50,1,94,160,166,84,132,264,237,186,77,259,180,209,\
204,195,43,188,136,91,86,254,248,85,11,126,149,111,61,27,40,203,283,119,\
124,87,36,84,152,246,180,241,25,11,58,195,187,112,181,88,156,203,162,271,\
279,102,20,230,138,140,123,153,121,54,62,155,222,151,135,165,152,109,128,\
145,143,120,280,221,104,98,101,107,104,168,142,170,124,4,76,126,148,147,\
99,152,197,70,74,119,228,247,193,22,6,210,145,161,218,113,103"
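
The error figures above are 3px threshold errors. This README does not say exactly how they were computed, so the following C sketch is only one plausible reading: the percentage of pixels with valid ground truth whose estimated disparity is off by more than the threshold. It assumes that a ground truth value of 0 marks a pixel without ground truth, as in the KITTI disparity maps. The official KITTI 2015 metric additionally requires the error to exceed 5% of the true disparity, which this sketch does not model. The function name threshold_error_percent is illustrative.

```c
#include <stddef.h>

/* Returns the percentage of valid pixels whose absolute disparity error
 * exceeds threshold. Pixels with truth <= 0 are skipped as invalid
 * (assumption: 0 marks missing ground truth). Returns 0 if no pixel is
 * valid. */
double threshold_error_percent(const float *estimate, const float *truth,
                               size_t count, float threshold)
{
    size_t valid = 0;
    size_t bad = 0;
    for (size_t i = 0; i < count; ++i) {
        if (truth[i] <= 0.0f)
            continue;
        ++valid;
        float diff = estimate[i] - truth[i];
        if (diff < 0.0f)
            diff = -diff;
        if (diff > threshold)
            ++bad;
    }
    return valid ? 100.0 * (double)bad / (double)valid : 0.0;
}
```

For a 3px threshold, pass 3.0f as the last argument after converting both disparity images to floating-point disparities in pixels.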

Related Publications

What to cite

The paper related to this work has not been published yet. If you use this code in your research please cite:

@misc{milson2018,
  author = {Andrew Milson},
  title = {GPU Total Generalized Variation Stereo},
  year = {2018},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/andrewmilson/total-generalized-variation-stereo-matching-pipeline}},
}
