RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty

Benjamin Graham, David Novotny
3DV 2020

This is the official implementation of RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty in PyTorch.

Link to paper | Poster

ScanNet reconstruction

RidgeSfM applied to the ScanNet test set

Scene 0707_00, frame skip rate k=1

Scene 0708_00, frame skip rate k=3

Scene 0709_00, frame skip rate k=10

Scene 0710_00, frame skip rate k=30

Below we illustrate the depth uncertainty factors of variation for a frame from scene 0708.

ScanNet Depth Factors of variation
Top left: an input image.
Bottom left: the predicted depth.
Middle and right: we use SVD to reduce the 32 factor-of-variation (FoV) planes to 12 planes, and display them as 4 RGB images; each of the 4x3 color planes represents one factor of variation.
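
The SVD reduction described in the caption can be sketched roughly as follows. This is a minimal illustration, not the code used to produce the figure: the function name `factors_to_rgb`, the NumPy implementation, and the per-plane min/max scaling for display are all assumptions.

```python
import numpy as np

def factors_to_rgb(factors, n_keep=12):
    """Reduce (C, H, W) factor planes to n_keep planes via SVD and pack them
    into n_keep // 3 RGB images with values in [0, 1].

    Hypothetical helper for illustration only.
    """
    if n_keep % 3 != 0:
        raise ValueError("n_keep must be a multiple of 3 to form RGB images")
    c, h, w = factors.shape
    flat = factors.reshape(c, h * w)  # one row per factor plane
    # Thin SVD: flat = u @ diag(s) @ vt; rows of vt are orthonormal spatial patterns
    u, s, vt = np.linalg.svd(flat, full_matrices=False)
    # Keep the n_keep strongest patterns, weighted by their singular values
    reduced = (s[:n_keep, None] * vt[:n_keep]).reshape(n_keep // 3, 3, h, w)
    # Assumed display scaling: stretch each color plane independently to [0, 1]
    lo = reduced.min(axis=(2, 3), keepdims=True)
    hi = reduced.max(axis=(2, 3), keepdims=True)
    rgb = (reduced - lo) / np.maximum(hi - lo, 1e-12)
    return rgb.transpose(0, 2, 3, 1)  # (n_keep // 3, H, W, 3)
```

With 32 input planes and `n_keep=12`, the result is a stack of four `H x W x 3` arrays that can be passed to an image viewer one at a time.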

RidgeSfM applied to a video taken on a mobile phone

We applied RidgeSfM to a short video taken with a mobile phone camera. There is no ground truth pose, so the bottom right-hand corner of the video is blank.

Living room - skip rate k=3

RidgeSfM applied to the KITTI odometry dataset

We trained a depth prediction network on the KITTI depth prediction training set, then processed videos from the KITTI Visual Odometry dataset. We used the 'camera 2' image sequences, cropping the input RGB images to size 1216x320, with R2D2 as the keypoint detector and a frame skip rate of k=3. The scenes are larger spatially, so for visualization we increased the number of K-Means centroids to one million.

Scene 6 - skip rate k=3

Scene 7 - skip rate k=3

Setup

Download the SuperPoint weights and model code (the URLs are quoted so that the shell does not interpret the '?'):

wget "https://github.com/magicleap/SuperGluePretrainedNetwork/blob/master/models/weights/superpoint_v1.pth?raw=true" -O ridgesfm/weights/superpoint_v1.pth
wget "https://raw.githubusercontent.com/magicleap/SuperGluePretrainedNetwork/master/models/superpoint.py" -O ridgesfm/superpoint.py

  • To download one test scene:
    • Download scannet.zip to RidgeSfM/ridgesfm/data and check that RidgeSfM/ridgesfm/data/scannet/test_scenes/0707_00/seq=0707_00-maxframes=300-frameskip=10-start=1.pth exists.
    • Run python ridgesfm.py scene.n=0 scene.frameskip=10
  • To prepare the full ScanNet dataset:
    • Download the ScanNet dataset to ridgesfm/data/scannet_sens/[train|test]/
    • Download SensReader to ridgesfm/data
    • Run bash prepare_scannet.sh in ridgesfm/data/
    • Run python ridgesfm.py scene.n=x scene.frameskip=y with x = 0,1,...,98 or 99 and y = 1,3,10 or 30
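
To run every scene and frame skip combination listed above, the commands can be generated in a loop. The following is a small sketch under assumptions: the helper names `ridgesfm_commands` and `run_all` are hypothetical, and it assumes the script is launched from the directory containing ridgesfm.py. Only the `scene.n` and `scene.frameskip` overrides shown above are used.

```python
import itertools
import subprocess

SCENES = range(100)           # scene.n = 0, 1, ..., 99
FRAMESKIPS = (1, 3, 10, 30)   # scene.frameskip values listed above

def ridgesfm_commands(scenes=SCENES, frameskips=FRAMESKIPS):
    """Yield one argument list per (scene, frameskip) combination."""
    for n, k in itertools.product(scenes, frameskips):
        yield ["python", "ridgesfm.py", f"scene.n={n}", f"scene.frameskip={k}"]

def run_all(dry_run=True):
    """Print the commands, or execute them one after another if dry_run is False."""
    for cmd in ridgesfm_commands():
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first failing scene
            subprocess.run(cmd, check=True)
```

Calling `run_all()` prints the 400 commands without executing anything; `run_all(dry_run=False)` runs them sequentially.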

To process your own video

  • Calibrate your camera using calibrate/calibrate.ipynb
  • Then run python ridgesfm.py scenes=calibrate/ scene.n=0 scene.frameskip=10

Videos are scaled and/or cropped to a resolution of 640x480. The notebook calculates a camera intrinsic matrix for the rescaled video. RidgeSfM will work best when this camera intrinsic matrix is similar to that of the depth prediction network's training data, i.e. [[578, 0, 319.5], [0, 578, 239.5], [0, 0, 1]].
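
To illustrate how scaling and cropping change the intrinsic matrix, here is a minimal sketch. It is not the notebook's code: it assumes the video is scaled uniformly until it covers 640x480 and then center-cropped, and that pixel centers sit at integer coordinates. The function names `rescale_intrinsics` and `focal_ratio` are hypothetical.

```python
import numpy as np

TARGET_W, TARGET_H = 640, 480
# Intrinsics of the depth network's training data, as quoted above
REFERENCE_K = np.array([[578.0, 0.0, 319.5],
                        [0.0, 578.0, 239.5],
                        [0.0, 0.0, 1.0]])

def rescale_intrinsics(K, width, height, target_w=TARGET_W, target_h=TARGET_H):
    """Intrinsics after uniform scaling to cover the target size, then a center crop.

    Assumes pixel centers at integer coordinates, so the image spans
    [-0.5, width - 0.5] horizontally before scaling.
    """
    K = np.asarray(K, dtype=float)
    scale = max(target_w / width, target_h / height)
    fx, fy = K[0, 0] * scale, K[1, 1] * scale
    # Scale the principal point about the image corner at (-0.5, -0.5)
    cx = (K[0, 2] + 0.5) * scale - 0.5
    cy = (K[1, 2] + 0.5) * scale - 0.5
    # Shift by the margins removed by the center crop
    cx -= (width * scale - target_w) / 2.0
    cy -= (height * scale - target_h) / 2.0
    return np.array([[fx, 0.0, cx], [0.0, fy, cy], [0.0, 0.0, 1.0]])

def focal_ratio(K, reference=REFERENCE_K):
    """Ratio of the rescaled focal length to the reference; values near 1 are preferable."""
    return float(K[0, 0] / reference[0, 0])
```

For example, a 1280x960 video with focal length 1156 and principal point (639.5, 479.5) maps exactly onto the reference matrix, giving a focal ratio of 1.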

Dependencies:

License

RidgeSfM is CC-BY-NC licensed, as found in the LICENSE file. Terms of use. Privacy

Citations

If you find this code useful in your research then please cite:

@InProceedings{ridgesfm2020,
    author       = "Benjamin Graham and David Novotny",
    title        = "Ridge{S}f{M}: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty",
    booktitle    = "International Conference on 3D Vision (3DV)",
    year         = "2020",
}
