
Single View Reconstruction

3D Scene Reconstruction from a Single Viewport

Maximilian Denninger and Rudolph Triebel

Accepted paper at ECCV 2020. paper, short-video, long-video

The author (Maximilian Denninger) gave a talk about the paper, which can be found here.

Overview

data overview image

Abstract

We present a novel approach to infer volumetric reconstructions from a single viewport, based only on an RGB image and a reconstructed normal image. To overcome the problem of reconstructing regions in 3D that are occluded in the 2D image, we propose to learn this information from synthetically generated high-resolution data. To do this, we introduce a deep network architecture that is specifically designed for volumetric TSDF data by featuring a specific tree net architecture. Our framework can handle a 3D resolution of 512³ by introducing a dedicated compression technique based on a modified autoencoder. Furthermore, we introduce a novel loss shaping technique for 3D data that guides the learning process towards regions where free and occupied space are close to each other. As we show in experiments on synthetic and realistic benchmark data, this leads to very good reconstruction results, both visually and in terms of quantitative measures.
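The loss shaping described above can be illustrated with a small sketch. This is an illustrative weighting of our own, not the exact formulation from the paper: voxels whose TSDF value is close to zero, i.e. near the boundary between free and occupied space, receive a higher loss weight.

```python
import numpy as np

def loss_shaping_weights(tsdf, sigma=0.1):
    """Illustrative only: up-weight voxels near the free/occupied boundary.

    `sigma` is a hypothetical bandwidth, not a value from the paper.
    """
    return np.exp(-np.abs(tsdf) / sigma)

# A few truncated signed distance values, from deep in free space to deep inside.
tsdf = np.array([-1.0, -0.05, 0.0, 0.05, 1.0])
weights = loss_shaping_weights(tsdf)
# The zero-crossing voxel gets the maximum weight; far-away voxels are down-weighted.
```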

Content description

This repository contains everything necessary to reproduce the results presented in our paper, including the generation of the data and the training of our model. Be aware that the data generation is time consuming: even though each process is heavily optimized, billions of truncated signed distance values and weights have to be calculated, in addition to all the color and normal images. After compression, the data used for training our model was around 1 TB in size.

As SUNCG is no longer available, we cannot upload the data we used for training, as it falls under the SUNCG license restrictions. If you do not have access to the SUNCG dataset, you can try the 3D-Front dataset and adapt the code to this new dataset.

Citation

If you find our work useful, please cite us with:

@inproceedings{denninger2020,
  title={3D Scene Reconstruction from a Single Viewport},
  author={Denninger, Maximilian and Triebel, Rudolph},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  year={2020}
}

Environment

Before you execute any of the modules in this project, please install the conda environment:

conda env create -f environment.yml

This will create the SingleViewReconstruction environment, which you can activate with:

conda activate SingleViewReconstruction

This environment uses TensorFlow 1.15 with Python 3.7 and also includes some OpenGL packages for the visualizer.

Quick and easy complete run of the pipeline

There is a script which provides a full run of the BlenderProc pipeline; you will need the "SingleViewReconstruction" environment.

But be aware before you execute this script: it will run a lot of code and download a lot of data to your PC.

This program will download BlenderProc and then Blender. It will also download the SceneNet dataset and the corresponding texture library used by SceneNet. It will render some color & normal images for the pipeline and will also generate a ground-truth output voxelgrid against which the results can be compared.

Before running this, make sure that you adapt the SDFGen/CMakeLists.txt file. See this README.md.

python run_on_example_scenes_from_scenenet.py

This will take a while; afterwards you can look at the generated scene with:

python TSDFRenderer/visualize_tsdf.py BlenderProc/output_dir/output_0.hdf5

Data generation

This is a quick overview of the data generation process, which is entirely based on the SUNCG house files.

data overview image

  1. The SUNCG house.json file is converted with the SUNCGToolBox into a house.obj and a camera positions file; for more information: SUNCG
  2. Then, these two files are used to generate the TSDF voxelgrids; for more information: SDFGen
  3. The voxelgrid is used to calculate the loss weights via the LossCalculatorTSDF
  4. These are used to first train an autoencoder and then compress the 512³ voxelgrids down to a size of 32³x64, which we call encoded. See CompressionAutoEncoder.
  5. Now only the color & normal images are missing; for those we use BlenderProc with the config file defined here.

These are then combined with this script into several tf-records, which are then used to train our SingleViewReconstruction network.
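For a sense of scale, the encoding in step 4 compresses each raw 512³ TSDF grid by a factor of 64; a quick back-of-the-envelope check (not code from this repository):

```python
raw_values = 512 ** 3          # values per full-resolution TSDF grid
encoded_values = 32 ** 3 * 64  # values per encoded grid (32³ spatial, 64 channels)
compression_factor = raw_values // encoded_values
print(compression_factor)  # 64
```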

Download of the trained models

We provide a script to easily download all models trained in this approach:

  1. The SingleViewReconstruction model
  2. The compression autoencoder model (CompressionAutoEncoder)
  3. The normal generation model (UNetNormalGen)

python download_models.py


singleviewreconstruction's Issues

Volume and Surface area estimation

Hi,
I have a problem where I need to estimate the volume and surface area of multiple similar (curved) objects from a 2D image. Is it possible to do this using the trained model provided in your repo? If yes, please point me in the right direction. Also, will I be required to use multiple views of the same objects for my problem? (It is possible for me to obtain multiple views for my use case.)

Thanks in advance.

I get an error in run_on_example_scenes_from_scenenet.py.

If you execute the following, an error will occur.
python run_on_example_scenes_from_scenenet.py

This script will perform a lot of steps automatically, including the download of SceneNet, BlenderProc, weights of the models used in this repo, and Blender.
If you agree with the download of the dataset and the open source code BlenderProc, type in yes:yes
download BlenderProc
download the full dataset SceneNet with corresponding textures
Download the scenenet textures
Access denied with the following error:

Cannot retrieve the public link of the file. You may need to change
the permission to 'Anyone with the link', or have had many accesses. 

You may still be able to access the file from the browser:

 https://drive.google.com/uc?id=0B_CLZMBI0zcuQ3ZMVnp1RUkyOFk 

tar (child): /content/SingleViewReconstruction/BlenderProc/BlenderProc/resources/scenenet/SceneNetData/../texture_folder.tgz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
Traceback (most recent call last):
File "run_on_example_scenes_from_scenenet.py", line 217, in
scenenet_data_folder = download_SceneNet(blender_proc_git_path)
File "run_on_example_scenes_from_scenenet.py", line 66, in download_SceneNet
os.remove(output)
FileNotFoundError: [Errno 2] No such file or directory: '/content/SingleViewReconstruction/BlenderProc/BlenderProc/resources/scenenet/SceneNetData/../texture_folder.tgz'

The data file is lost!

The house.obj and camera position files are then stored in a data folder; however, the link to the data is missing.

Which data to use ?

Hey @themasterlink
Thanks for the nice repo also sorry for not being detailed .
I downloaded the models by running python download_models.py, and all the models are in the required folder. Then, when I run predict_datapoint.py with python predict_datapoint.py --output OUTPUT --use_pretrained_weights /home/trilok/SVR-final/SingleViewReconstruction/SingleViewReconstruction/model, I get an issue; here is the complete traceback. Can you please help me out?
Thanks
Trilok

/home/trilok/SVR-final/SingleViewReconstruction/SingleViewReconstruction/src/SettingsReader.py:16: YAMLLoadWarning: calling yaml.load() without Loader=... is deprecated, as the default Loader is unsafe. Please read https://msg.pyyaml.org/load for full details.
  settings = yaml.load(stream)
Traceback (most recent call last):
  File "predict_datapoint.py", line 200, in <module>
    predict_some_sample_points(hdf5_paths, model_path, args.output, args.use_pretrained_weights, args.use_gen_normal)
  File "predict_datapoint.py", line 90, in predict_some_sample_points
    settings = SettingsReader(settings_file_path, data_folder)
  File "/home/trilok/SVR-final/SingleViewReconstruction/SingleViewReconstruction/src/SettingsReader.py", line 85, in __init__
    raise Exception("No paths were found, please generate data before using this script, check this path: {}".format(self.folder_path))
Exception: No paths were found, please generate data before using this script, check this path: data

P.S :- Also the color_normal_mean.hdf5 file is there in the data folder

How to reconstruct 3D scene from just one rgb image with pretrained model?

Hello @themasterlink
Thanks for your great work for Indoor SingleViewReconstruction. Sorry that I'm newer in this deep learning task.
I'm confused about how to reconstruct a 3D scene from just one RGB image with a pre-trained model.
The input image may be like the following:
test

After analyzing the readme of all subfolders and discussion in #3, I am still very confused about how a jpg -->
I try to develop the SingleJPGReconstruction.py with the following step:

  1. Gain normal_img from the test.jpg with UNetNormalGen:
    I am blocked on the h5py input and the setting_file.yml.
    The inputs of generate_encoded_outputs.py are an h5py file and a settings_file.yml, which seem to be generated by BlenderProc. I have no idea how to generate this input from a single image without BlenderProc.
    I tried to use the command python generate_predicted_normals.py --model_path model/model.ckpt --path ../data in SingleViewReconstruction/SingleViewReconstruction.

  2. Combine normal_image and color_image into one hdf5 file.
    #3 says the mean images (normal_mean_image and color_mean_image) should be subtracted from the data.
    The color_normal_mean.hdf5 is used here:

SingleViewReconstruction/SingleViewReconstruction/generate_tf_records.py, line 123 at f1475ca:

    normal_o -= normal_mean_img

and here:

SingleViewReconstruction/SingleViewReconstruction/src/DataSetLoader.py, line 95 at f1475ca:

    color_img -= self.mean_img
  3. Try predict_datapoint.py:
    I tried the command python predict_datapoint.py data/color_normal_mean.hdf5 --output OUTPUT --use_pretrained_weights.
    There wasn't a train*.tf_record.

The plan of course didn't work. Could you give me some help on how to make it? Thanks sincerely.
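For reference, the normalization referenced in step 2 boils down to subtracting the stored per-pixel mean images before feeding the network. A minimal numpy sketch (the array shapes are illustrative assumptions, not the repo's actual image resolution):

```python
import numpy as np

# Hypothetical stand-ins for images loaded from the hdf5 files.
color_img = np.random.rand(8, 8, 3).astype(np.float32)
normal_img = np.random.rand(8, 8, 3).astype(np.float32)
color_mean = np.full((8, 8, 3), 0.5, dtype=np.float32)
normal_mean = np.full((8, 8, 3), 0.5, dtype=np.float32)

# As in generate_tf_records.py and DataSetLoader.py: subtract the training means.
color_input = color_img - color_mean
normal_input = normal_img - normal_mean
```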

Missing color_normal_mean.hdf5 for pretrained model

Hi Maximilian,
I am trying to predict some scene reconstructions using your pretrained model. However, a "color_normal_mean.hdf5" file is needed to apply the correct normalization to my own input images, as done in training. Since I do not know the mean of the colors and normals in the training data used for the released model, I would appreciate it if you could upload the color_normal_mean.hdf5 which corresponds to your pretrained model.
Thanks!

Replica-dataset

Hi,
First, thank you for your great Paper, I really enjoyed reading it.
I am currently trying the network with the SUNCG dataset, but as you mention in the paper you also used the Replica dataset to evaluate your network, could you provide some input on how you did this? By that I mean: data adaptation/generation, evaluation script on the different datasets, etc. Every input on this matter is more than welcome.
Thank you very much for your great work.

Camera pose

Thank you for your amazing work!

I'm new to the 3d reconstruction work.
I have read your paper and found that the camera pose is extremely important to the reconstruction task.
image
image

Could you please tell me how I can get a 3D reconstruction if I input an image and do not know the corresponding camera pose from when the image was taken?

Thank you

performance

Are you testing on the dataset again? What's the performance of SingleViewReconstruction?

Environment installation failure

Hi,

I'm having trouble installing the environment using the .yml file. Whenever I run conda env create -f environment.yml I get the following error:

Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:

  • cudatoolkit==10.0.130=0
  • mkl-service==2.3.0=py37he904b0f_0
  • numpy-base==1.18.5=py37hde5b4d6_0
  • tensorflow==1.15.0=gpu_py37h0f0df58_0
  • readline==8.0=h7b6447c_0
  • ncurses==6.2=he6710b0_1
  • python==3.7.7=hcff3b4d_5
  • sqlite==3.32.3=h62c20be_0
  • libstdcxx-ng==9.1.0=hdf63c60_0
  • libprotobuf==3.12.3=hd408876_0
  • tensorflow-gpu==1.15.0=h0d30ee6_0
  • libedit==3.1.20191231=h7b6447c_0
  • mkl==2020.1=217
  • _tflow_select==2.1.0=gpu
  • ld_impl_linux-64==2.33.1=h53a641e_7
  • mkl_fft==1.1.0=py37h23d657b_0
  • wrapt==1.12.1=py37h7b6447c_1
  • tensorflow-base==1.15.0=gpu_py37h9dcbed7_0
  • zlib==1.2.11=h7b6447c_3
  • cupti==10.0.130=0
  • intel-openmp==2020.1=217
  • xz==5.2.5=h7b6447c_0
  • numpy==1.18.5=py37ha1c710e_0
  • c-ares==1.15.0=h7b6447c_1001
  • mkl_random==1.1.1=py37h0573a6f_0
  • libgfortran-ng==7.3.0=hdf63c60_0
  • libgcc-ng==9.1.0=hdf63c60_0
  • tk==8.6.10=hbc83047_0
  • hdf5==1.10.4=hb1b8bf9_0
  • h5py==2.10.0=py37h7918eee_0
  • grpcio==1.27.2=py37hf8bcb03_0
  • openssl==1.1.1g=h7b6447c_0
  • libffi==3.3=he6710b0_1
  • cudnn==7.6.5=cuda10.0_0
  • protobuf==3.12.3=py37he6710b0_0

What OS was this environment originally installed on and exported from?
I'm running on MacOS Mojave Version 10.14.6
Any assistance would be great!

Thanks
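The ResolvePackageNotFound entries above all carry Linux build strings, and the CUDA packages (cudatoolkit, cudnn, cupti) have no macOS builds at all. A common workaround, offered here as a suggestion rather than an official fix, is to strip the build strings from environment.yml so conda can re-solve for your platform (the CUDA-only packages would still need to be removed by hand); a small helper for that:

```python
import re

def strip_build_string(line):
    """Turn 'pkg==1.2.3=build' (or 'pkg=1.2.3=build') into 'pkg==1.2.3'."""
    return re.sub(r"^(\s*-?\s*[^=\s]+={1,2}[^=\s]+)=[^=\s]+\s*$", r"\1", line)

print(strip_build_string("  - readline==8.0=h7b6447c_0"))  # "  - readline==8.0"
```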
