
derender3d's Introduction

De-rendering 3D Objects in the Wild

Paper | Video | Project Page

This is the official implementation for the CVPR 2022 paper:

De-rendering 3D Objects in the Wild

Felix Wimbauer (1,2), Shangzhe Wu (1) and Christian Rupprecht (1)
(1) Visual Geometry Group, University of Oxford, (2) Technical University of Munich

CVPR 2022 (arXiv)


A method for de-rendering a 3D object from a single image into shape, material, and lighting, that is trained in a weakly-supervised fashion relying only on rough shape estimates.

📋 Abstract

With increasing focus on augmented and virtual reality applications (XR) comes the demand for algorithms that can lift objects from images and videos into representations that are suitable for a wide variety of related 3D tasks. Large-scale deployment of XR devices and applications means that we cannot solely rely on supervised learning, as collecting and annotating data for the unlimited variety of objects in the real world is infeasible. We present a weakly supervised method that is able to decompose a single image of an object into shape (depth and normals), material (albedo, reflectivity and shininess) and global lighting parameters. For training, the method only relies on a rough initial shape estimate of the training objects to bootstrap the learning process. This shape supervision can come for example from a pretrained depth network or - more generically - from a traditional structure-from-motion pipeline. In our experiments, we show that the method can successfully de-render 2D images into a decomposed 3D representation and generalizes to unseen object categories. Since in-the-wild evaluation is difficult due to the lack of ground truth data, we also introduce a photo-realistic synthetic test set that allows for quantitative evaluation.

@InProceedings{wimbauer2022rendering,
  title={De-rendering 3D Objects in the Wild},
  author={Wimbauer, Felix and Wu, Shangzhe and Rupprecht, Christian},
  booktitle={CVPR},
  year={2022}
}

πŸ—οΈοΈ Setup

🐍 Python Environment

We use Conda to manage our Python environment:

conda env create -f environment.yml

Then, activate the conda environment:

conda activate derender3d

📸 Checkpoints

We provide download links for pretrained models for CelebA-HQ and Co3D. The models are stored under results/models, the same location where training checkpoints are written.

setup/download_model.sh {celebahq|co3d}

💾 Processed Datasets

We provide download links for the processed Co3D dataset. For CelebA-HQ, the licensing is unclear, which is why we can only provide instructions to reproduce the dataset. Datasets are stored under datasets. If you prefer another storage location, you can create soft-links to the respective locations in the datasets folder.

setup/download_processed_co3d.sh
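The soft-link option mentioned above can be sketched in Python as follows. This is a minimal illustration, not part of the repository: the helper name link_dataset is hypothetical, and the name co3d for the link inside datasets is an assumption based on the path datasets/co3d that appears elsewhere on this page.

```python
from pathlib import Path


def link_dataset(storage_dir, datasets_root, name):
    """Create datasets_root/name as a soft-link that points at storage_dir.

    `link_dataset` is a hypothetical helper, not a function from the repository.
    """
    storage_dir = Path(storage_dir)
    datasets_root = Path(datasets_root)
    datasets_root.mkdir(parents=True, exist_ok=True)

    link = datasets_root / name
    # Refuse to overwrite an existing folder or link.
    if link.is_symlink() or link.exists():
        raise FileExistsError(f"{link} already exists; remove it first")

    link.symlink_to(storage_dir.resolve(), target_is_directory=True)
    return link
```

For example, link_dataset("/path/to/storage/co3d", "datasets", "co3d") would make datasets/co3d resolve to the external location, where /path/to/storage/co3d is a placeholder for your own directory.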

🎀 Demo

Coming Soon

For now, please have a look at the scripts directory, which provides many useful code snippets for data inspection, image generation, relighting videos, and consistency videos.

πŸ‹οΈ Training

We provide experiment configurations under experiments/release to reproduce the results we reported in the paper. To perform training, run the following commands:

CelebA-HQ

python run.py --config experiments/release/celebahq.yml --num_workers 8 --gpu 0
python run.py --config experiments/release/celebahq_nr.yml --num_workers 8 --gpu 0

Co3D

python run.py --config experiments/release/co3d.yml --num_workers 8 --gpu 0

📊 Evaluation

To recalculate the numbers we report in the paper, please run the scripts/eval_cosy.py script. This requires you to set up the Co3D checkpoint and the COSy dataset, as explained above.

python scripts/eval_cosy.py

Manual Dataset Creation

Coming Soon

TODO

  • Check reproducibility
  • Refactor and clean up code
  • Create download scripts for data and trained models
  • Check conda environment
  • Write detailed ReadMe
  • Create demo
  • Create fork for Unsup3D with data setup scripts (for CelebA-HQ)
  • Create fork for Co3D with data setup scripts

Acknowledgements

This repository is largely based on the Unsup3D repository by Shangzhe Wu.


derender3d's Issues

Why is the N_ref map half red and half green?

I tried this code and saved the normal map computed from depth and the refined normal map, like this:
Normal from depth map:
(image)
Normal from network:
(image)
I'm confused: why is the refined normal map half red and half green? Looking forward to your response!
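For readers comparing saved normal maps, the following sketch shows one common convention for turning unit normals into colours. It maps each component from [-1, 1] to [0, 255]. It is an illustration only and makes no claim about how this repository saves its maps or about the cause of the effect described above. Both function names are hypothetical.

```python
def normal_to_rgb(normal):
    """Map a unit normal with components in [-1, 1] to 8-bit RGB (common convention)."""
    return tuple(
        int(round((min(1.0, max(-1.0, c)) * 0.5 + 0.5) * 255))
        for c in normal
    )


def clipped_to_rgb(normal):
    """Clip components to [0, 1] without remapping, so negative values become 0."""
    return tuple(int(round(min(1.0, max(0.0, c)) * 255)) for c in normal)


# A normal pointing along +z and one pointing along -x, under both encodings.
for n in [(0.0, 0.0, 1.0), (-1.0, 0.0, 0.0)]:
    print(n, normal_to_rgb(n), clipped_to_rgb(n))
```

With the remapping, negative components stay distinguishable. Without it, they all collapse to zero. That makes it worth checking which encoding was used before comparing two saved maps.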

How to train on custom dataset

Hi, thanks for the great work.

May I ask how I can train the model on my own dataset? From the paper, it seems there is a step that extracts coarse light and albedo information, and I would like to know how to do that for my own dataset.

Thanks in advance.

Questions about the data setup and how to setup my own datasets?

Amazing work! I am very glad you released the training code and the preprocessed Co3D dataset.

But I cannot reproduce the Co3D dataset or the CelebA-HQ dataset.

What is the “extracted” folder in the dataloader?
train_val_precomputed_dir = cfgs.get('train_val_precomputed_dir', None)

Why are the dimensions of light (1x3) and view matrix (2x3) different from those of unsup3d?

When will the data setup scripts be released?

How could I do the preprocessing step on my own or other published scanned object datasets such as GSO (Google Scanned Objects)?

Thanks again for this fantastic job!

Questions about Coarse Light & Albedo estimates

Thanks for your work and code. I have a few questions about extracting the coarse light and albedo estimates.

  1. How should the total variation regularization (TV) mentioned in the paper be calculated?
  2. How large is the image size for which the calculation takes less than 1 second as mentioned in the paper?
  3. If I want to estimate the coarse light parameters and albedo map on another dataset, which optimization method should I choose? Can scipy.optimize.least_squares work?
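As background for the first question, here is a minimal sketch of a standard anisotropic total variation term. It is the mean absolute difference between horizontally and vertically adjacent pixels. This is the textbook definition and is not confirmed to be the exact formulation used in the paper. The function name total_variation and the normalisation by the number of neighbour pairs are assumptions.

```python
def total_variation(image):
    """Anisotropic total variation of a 2D image given as a list of rows.

    Returns the mean absolute difference over all horizontal and vertical
    neighbour pairs (the averaging is an assumption, a plain sum is also common).
    """
    height, width = len(image), len(image[0])
    total, pairs = 0.0, 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:  # right neighbour
                total += abs(image[y][x + 1] - image[y][x])
                pairs += 1
            if y + 1 < height:  # bottom neighbour
                total += abs(image[y + 1][x] - image[y][x])
                pairs += 1
    return total / pairs if pairs else 0.0


flat = [[0.5, 0.5], [0.5, 0.5]]
edge = [[0.0, 1.0], [0.0, 1.0]]
print(total_variation(flat), total_variation(edge))
```

A smooth albedo map yields a small value, while sharp edges increase it. This is why such a term is often used as a smoothness regulariser.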

Single Image Relighting

Excellent work!
I was wondering how to apply target lighting to a single source image, as in Fig. 6 of the paper.
Could you explain how to do that? Thanks!
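The general idea behind relighting from a decomposition can be sketched as follows: once albedo and normals are available, the image is re-shaded under new lighting parameters. The snippet uses a generic Phong-style model with an ambient term, a diffuse term and an optional specular term. That model is an assumption made for illustration and is not taken from the repository. All function and parameter names are hypothetical, and both light_dir and view_dir point away from the surface.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(v):
    length = math.sqrt(_dot(v, v))
    return tuple(x / length for x in v)


def shade_pixel(albedo, normal, light_dir, ambient, diffuse,
                spec_strength=0.0, shininess=1.0, view_dir=(0.0, 0.0, 1.0)):
    """Shade one pixel: albedo * (ambient + diffuse * max(0, n.l)) + specular."""
    n, l, v = _unit(normal), _unit(light_dir), _unit(view_dir)
    n_dot_l = _dot(n, l)
    specular = 0.0
    if n_dot_l > 0.0 and spec_strength > 0.0:
        # Mirror the light direction about the normal: r = 2 (n.l) n - l
        r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
        specular = spec_strength * max(0.0, _dot(r, v)) ** shininess
    shading = ambient + diffuse * max(0.0, n_dot_l)
    return tuple(min(1.0, max(0.0, a * shading + specular)) for a in albedo)


def relight(albedo_map, normal_map, light_dir, ambient, diffuse, **material):
    """Re-shade every pixel of an albedo/normal pair under the given light."""
    return [
        [shade_pixel(a, n, light_dir, ambient, diffuse, **material)
         for a, n in zip(albedo_row, normal_row)]
        for albedo_row, normal_row in zip(albedo_map, normal_map)
    ]
```

Calling relight with the same albedo and normal maps but a different light_dir produces the relit image. The predicted material parameters would take the place of spec_strength and shininess.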

How to inference the model on custom image?

Hi,

Thanks for the great work. When I try to run inference and load the checkpoints (Co3D), whatever script I use, I always get the error: ModuleNotFoundError: No module named 'unsup3d'.

@Brummi Could you please guide me on how to run inference on custom images using the pre-trained Co3D models? Thanks!

can you tell me where can I get the algorithm of computing normal map from depth map?

Hey there,
one stupid question: I don't really understand the code of get_normal_from_depth and some related code like depth_to_3d_grid. Can you please tell me where I can learn about this (computing a normal map from a depth map)? It's quite different from what I saw on https://stackoverflow.com/questions/34644101/calculate-surface-normals-from-depth-image-using-neighboring-pixels-cross-produc
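As general background to this question, a common textbook way to obtain normals from a depth map has two steps. First, back-project every pixel to a 3D point. Then take the cross product of the two tangent vectors formed by neighbouring points. The sketch below follows that recipe with a simple pinhole camera. It is not a description of the repository's get_normal_from_depth or depth_to_3d_grid. The names normals_from_depth and depth_to_points, the default field of view, and the sign convention (normals face the camera, which looks along +z) are all assumptions.

```python
import math


def depth_to_points(depth, fov_deg=10.0):
    """Back-project an H x W depth map (list of rows) to 3D points (pinhole model)."""
    h, w = len(depth), len(depth[0])
    focal = (w / 2.0) / math.tan(math.radians(fov_deg) / 2.0)  # in pixels
    cx, cy = (w - 1) / 2.0, (h - 1) / 2.0
    return [
        [((x - cx) * d / focal, (y - cy) * d / focal, d) for x, d in enumerate(row)]
        for y, row in enumerate(depth)
    ]


def _sub(a, b):
    return tuple(p - q for p, q in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normals_from_depth(depth, fov_deg=10.0):
    """Per-pixel unit normals from finite differences of the back-projected grid."""
    pts = depth_to_points(depth, fov_deg)
    h, w = len(pts), len(pts[0])
    normals = []
    for y in range(h):
        row = []
        for x in range(w):
            # Central differences inside the image, one-sided at the borders.
            x0, x1 = max(x - 1, 0), min(x + 1, w - 1)
            y0, y1 = max(y - 1, 0), min(y + 1, h - 1)
            tangent_x = _sub(pts[y][x1], pts[y][x0])
            tangent_y = _sub(pts[y1][x], pts[y0][x])
            n = _cross(tangent_y, tangent_x)  # order chosen so normals face the camera
            length = math.sqrt(sum(c * c for c in n))
            row.append(tuple(c / length for c in n) if length > 0 else (0.0, 0.0, 0.0))
        normals.append(row)
    return normals
```

The neighbouring-pixel cross product discussed in the linked Stack Overflow question is the same idea. The main difference here is that the differences are taken between 3D points, not raw depth values, so the camera model enters the result.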

Why Do we Need Pre-Computed for Test Images

Hello @Brummi, I was trying to run inference using scripts/images_decomposition_co3d.py. However, it looks like the code always relies on precomputed data such as depth, albedo and normal maps. For example, when running for category = 'bench', test_path_precompute is always set to 'datasets/co3d/extracted_bench/precomputed/val', and inspecting the dictionary (data_dict) shows that all the tensors are already set to non-zero values:

print('data_dict.keys(): ', data_dict.keys())
print(data_dict['input_im'].shape)
print(data_dict['recon_albedo'].shape)
print(data_dict['recon_depth'].shape)
print(data_dict['recon_normal'].shape)
print(data_dict['recon_normal'][0,0,:10,:10])

Shouldn't the test image be processed from the single RGB image alone, without relying on any precomputed inputs? That is the impression I had from reading the paper. All these tensors should be initialized to zero except input_im. Maybe a single minimal inference example would help here: loading the model and running inference on a sample face or object image.

cannot download pre-trained model

Thanks for sharing your excellent work!
I want to test this model, but I can't download it using setup/download_model.sh (I received nothing). I then tried opening https://www.robots.ox.ac.uk/~vgg/research/derender3d/data in a browser, but received "You don't have permission to access this resource."
Can you help? Thanks a lot.

reproduce Table2

Hi, thanks for the reproducibility of your code.

A small question about the Co3D results and eval_cosy.py.
I got the following result:

  35.5Hz
  Normal_l1:        0.96366
  Normal_mse:       0.17515
  Normal_dot:       0.26274
  Normal_deviation: 37.98527
  Albedo_sie:       0.07594
  Albedo_l1:        0.85689
  Albedo_ssim:      0.76152
  Spec_l1:          0.12306
  Spec_mse:         0.07565
  Spec_sie:         0.04684

Of these, Spec_l1: 0.12306 and Spec_mse: 0.07565 differ from the results reported in Table 2.
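When comparing such output against a table, it can help to parse the "name: value" pairs into a dictionary first. The sketch below does this for text in the format shown above. The function name parse_metrics is hypothetical and not part of the repository.

```python
import re

# Matches pairs such as "Normal_l1: 0.96366"; tokens without a colon (e.g. "35.5Hz") are skipped.
_PAIR = re.compile(r"(\w+):\s*([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)")


def parse_metrics(text):
    """Return a dict mapping each metric name in `text` to its float value."""
    return {name: float(value) for name, value in _PAIR.findall(text)}


line = "35.5Hz  Normal_l1: 0.96366  Spec_l1: 0.12306  Spec_mse: 0.07565"
metrics = parse_metrics(line)
print(metrics["Spec_l1"], metrics["Spec_mse"])
```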
