CostDCNet

This repository contains the accompanying code for CostDCNet: Cost Volume Based Depth Completion for a Single RGB-D Image, ECCV 2022.

Overview

Successful depth completion from a single RGB-D image requires both extracting plentiful 2D and 3D features and merging these heterogeneous features appropriately. We propose a novel depth completion framework, CostDCNet, based on the cost volume-based depth estimation approach that has been successfully employed for multi-view stereo (MVS). The key to high-quality depth map estimation in the approach is constructing an accurate cost volume. To produce a quality cost volume tailored to single-view depth completion, we present a simple but effective architecture that can fully exploit the 3D information, three options to make an RGB-D feature volume, and a per-plane pixel shuffle for efficient volume upsampling. Our framework consists of lightweight (~1.8M parameters) deep neural networks, running in real time (~30ms). Nevertheless, thanks to our simple but effective design, CostDCNet demonstrates depth completion results comparable to or better than the state-of-the-art (SOTA) methods.
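
To make the per-plane pixel shuffle concrete, below is a minimal PyTorch sketch of how such an upsampling step might look; the class name, tensor layout, and scale handling are our assumptions for illustration, not code from this repository. It folds the depth dimension into the batch so a standard 2D pixel shuffle can upsample every depth plane of the volume independently.

    import torch
    import torch.nn as nn

    class PerPlanePixelShuffle(nn.Module):
        """Hypothetical sketch: upsample each depth plane of a 5D volume
        (B, C*r^2, D, H, W) to (B, C, D, H*r, W*r) via 2D pixel shuffle."""
        def __init__(self, scale):
            super().__init__()
            self.shuffle = nn.PixelShuffle(scale)

        def forward(self, volume):
            b, c, d, h, w = volume.shape
            # Fold depth into the batch so each plane is an ordinary 2D map.
            planes = volume.permute(0, 2, 1, 3, 4).reshape(b * d, c, h, w)
            up = self.shuffle(planes)               # (B*D, C/r^2, H*r, W*r)
            _, c2, h2, w2 = up.shape
            return up.reshape(b, d, c2, h2, w2).permute(0, 2, 1, 3, 4)

    # Example: a 16-channel, 16-plane volume upsampled 2x in H and W.
    out = PerPlanePixelShuffle(2)(torch.rand(1, 16, 16, 30, 40))
    print(out.shape)  # torch.Size([1, 4, 16, 60, 80])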

Getting Started

Prerequisites

  • Ubuntu 18.04 or higher
  • CUDA 11.1 or higher
  • PyTorch 1.8 or higher
  • Python 3.8 or higher

Environment Setup (Anaconda)

We recommend using Anaconda:

conda create -n costDCNet python==3.8.12
conda activate costDCNet
conda install openblas-devel -c anaconda
conda install pytorch==1.8.0 torchvision==0.9.0 torchaudio==0.8.0 cudatoolkit=11.1 -c pytorch -c conda-forge
pip install -U git+https://github.com/NVIDIA/MinkowskiEngine -v --no-deps --install-option="--blas_include_dirs=${CONDA_PREFIX}/include" --install-option="--blas=openblas"
pip install -r requirements.txt
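
After installation, a quick sanity check (our suggestion, not part of the repository) can confirm that PyTorch sees the GPU and that MinkowskiEngine imports and builds a sparse tensor:

    import torch
    import MinkowskiEngine as ME

    print(torch.__version__, torch.cuda.is_available())  # expect 1.8.0 / True

    # A tiny 3D sparse tensor; the first coordinate column is the batch index.
    coords = torch.IntTensor([[0, 0, 0, 0], [0, 1, 1, 1]])
    feats = torch.rand(2, 3)
    print(ME.SparseTensor(features=feats, coordinates=coords))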

Testing (NYUv2)

We use the preprocessed NYUv2 dataset, following NLSPN.

python eval_nyu.py --data_path PATH_TO_NYUv2
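
If you want to inspect the data before running the script, the samples should follow the sparse-to-dense style h5 layout that NLSPN uses; the file path and the 'rgb'/'depth' keys in this sketch are assumptions about that format, so adjust them to your local copy:

    import h5py
    import numpy as np

    # Hypothetical sample path under PATH_TO_NYUv2 (assumed layout).
    with h5py.File('PATH_TO_NYUv2/val/official/00001.h5', 'r') as f:
        rgb = np.array(f['rgb'])      # assumed (3, H, W) uint8 color image
        depth = np.array(f['depth'])  # assumed (H, W) float depth in meters
    print(rgb.shape, depth.shape)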

License

This software is being made available under the terms in the LICENSE file.

Any exemption to these terms requires a license from Pohang University of Science and Technology.

Citing CostDCNet

@inproceedings{kam2022costdcnet,
  title={CostDCNet: Cost Volume Based Depth Completion for a Single RGB-D Image},
  author={Kam, Jaewon and Kim, Jungeon and Kim, Soongjin and Park, Jaesik and Lee, Seungyong},
  booktitle={Computer Vision--ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23--27, 2022, Proceedings, Part II},
  pages={257--274},
  year={2022},
  organization={Springer}
}

Related projects

NOTE: Our implementation is based on the following repositories: NLSPN (NYUv2 preprocessing) and MinkowskiEngine (sparse 3D convolutions).

costdcnet's Issues

Questions regarding experimental setup for Table 2.

Greetings! I appreciate your remarkable work and would like to inquire about the experimental setup described in Table 2.
(screenshot of Table 2)
Specifically, I am curious about the setup in which you compared Point-Fusion and your proposed method when the number of input points is 32.

We attempted to replicate your approach by using the provided CostDCNet model (available at https://github.com/kamse/CostDCNet/tree/main/weights) for inference, but we could not achieve the same level of performance reported in your paper. Consequently, we surmised that the results in the paper were obtained by training CostDCNet with 32 points instead of 500. Could you clarify whether this is the case?
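
For reference, an N-point sparse-input setting like the one discussed here is typically simulated by randomly keeping N valid pixels of the dense depth map; the helper below is a hedged sketch (function name and shapes are our assumptions, not this repository's evaluation code):

    import torch

    def sample_sparse_depth(dense_depth, num_points=32):
        # Keep num_points random valid pixels of an (H, W) map, zero the rest.
        valid = torch.nonzero(dense_depth > 0)             # (K, 2) pixel coords
        keep = valid[torch.randperm(valid.shape[0])[:num_points]]
        sparse = torch.zeros_like(dense_depth)
        sparse[keep[:, 0], keep[:, 1]] = dense_depth[keep[:, 0], keep[:, 1]]
        return sparse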

Some questions regarding the paper

Thank you for the interesting work! I just had a quick question regarding the ablation study in Table 4.
(screenshot of Table 4)

Going from Row 1 to Row 2, the 2D (RGBD) UNet is replaced with the proposed 3D UNet. I was wondering: how does the number of parameters decrease when going from the 2D to the 3D UNet? I would have thought that moving from 2D to 3D convolutions would increase the parameter count. More specifically, what is the structure of the 2D UNet? And am I correct in assuming that the loss for the 2D UNet is simply direct regression + L1 instead of soft-argmax + L1?

Also, when the outputs of the 2D and 3D encoders are added to generate the fused volume, what exactly are the 2D and 3D features that are used?
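
For context on the soft-argmax + L1 scheme mentioned above: a cost volume is commonly converted to depth by taking the softmax-weighted average of the per-plane depth hypotheses. The sketch below illustrates that general recipe; the function name and tensor shapes are assumptions, not the paper's exact implementation.

    import torch
    import torch.nn.functional as F

    def soft_argmax_depth(cost_volume, depth_values):
        # cost_volume: (B, D, H, W) scores over D depth planes.
        # depth_values: (D,) depth hypothesis assigned to each plane.
        prob = torch.softmax(cost_volume, dim=1)
        return (prob * depth_values.view(1, -1, 1, 1)).sum(dim=1)  # (B, H, W)

    # Training would then apply an L1 loss on valid pixels, e.g.:
    # loss = F.l1_loss(pred_depth[mask], gt_depth[mask])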

training code

Hi kamse,
Thank you for your excellent work! It's very inspiring to me. Could you provide your training code?
