CVPR 2022 - derender3d: A method for de-rendering a 3D object from a single image into shape, material, and lighting, that is trained in a weakly-supervised fashion relying only on rough shape estimates.

License: MIT License

De-rendering 3D Objects in the Wild

Paper | Video | Project Page

This is the official implementation for the CVPR 2022 paper:

De-rendering 3D Objects in the Wild

Felix Wimbauer¹,², Shangzhe Wu¹ and Christian Rupprecht¹
¹Visual Geometry Group, University of Oxford, ²Technical University of Munich

CVPR 2022 (arXiv)

📋 Abstract

With increasing focus on augmented and virtual reality applications (XR) comes the demand for algorithms that can lift objects from images and videos into representations that are suitable for a wide variety of related 3D tasks. Large-scale deployment of XR devices and applications means that we cannot solely rely on supervised learning, as collecting and annotating data for the unlimited variety of objects in the real world is infeasible. We present a weakly supervised method that is able to decompose a single image of an object into shape (depth and normals), material (albedo, reflectivity and shininess) and global lighting parameters. For training, the method only relies on a rough initial shape estimate of the training objects to bootstrap the learning process. This shape supervision can come for example from a pretrained depth network or - more generically - from a traditional structure-from-motion pipeline. In our experiments, we show that the method can successfully de-render 2D images into a decomposed 3D representation and generalizes to unseen object categories. Since in-the-wild evaluation is difficult due to the lack of ground truth data, we also introduce a photo-realistic synthetic test set that allows for quantitative evaluation.

@InProceedings{wimbauer2022rendering,
  title={De-rendering 3D Objects in the Wild},
  author={Wimbauer, Felix and Wu, Shangzhe and Rupprecht, Christian},
  booktitle={CVPR},
  year={2022}
}

🏗️️ Setup

🐍 Python Environment

We use Conda to manage our Python environment:

conda env create -f environment.yml

Then, activate the Conda environment:

conda activate derender3d

📸 Checkpoints

We provide download links for pretrained models for CelebA-HQ and Co3D. Models are stored under results/models, the same location where training checkpoints are saved.

setup/download_model.sh {celebahq|co3d}

💾 Processed Datasets

We provide download links for the processed Co3D dataset. For CelebA-HQ, the licensing is unclear, which is why we can only provide instructions to reproduce the dataset. Datasets are stored under datasets. If you prefer another storage location, you can create soft links to the respective locations in the datasets folder.

setup/download_processed_co3d.sh

🎤 Demo

Coming Soon

For now, please have a look at the scripts directory, which provides many useful code snippets for data inspection, image generation, relighting videos, and consistency videos.

🏋️ Training

We provide experiment configurations under experiments/release to reproduce the results we reported in the paper. To perform training, run the following commands:

CelebA-HQ

python run.py --config experiments/release/celebahq.yml --num_workers 8 --gpu 0
python run.py --config experiments/release/celebahq_nr.yml --num_workers 8 --gpu 0

Co3D

python run.py --config experiments/release/co3d.yml --num_workers 8 --gpu 0

📊 Evaluation

To recalculate the numbers we report in the paper, please run the scripts/eval_cosy.py script. This requires the Co3D checkpoint and the COSy dataset to be set up, as explained above.

python scripts/eval_cosy.py

Manual Dataset Creation

Coming Soon

TODO

Check reproducibility
Refactor and clean up code
Create download scripts for data and trained models
Check conda environment
Write detailed ReadMe
Create demo
Create fork for Unsup3D with data setup scripts (for CelebA-HQ)
Create fork for Co3D with data setup scripts

Acknowledgements

This repository is largely based on the Unsup3D repository by Shangzhe Wu.

derender3d's Issues

Why is the N_ref map half red and half green?

I tried this code and saved both the normal map computed from depth and the refined normal map:
Normal from depth map:

Normal from network:

I'm confused: why is the refined normal map half red and half green? Looking forward to your response!

How to train on custom dataset

Hi, thanks for the great work.

May I ask how I can train the model on my own dataset? From the paper, it seems there is a step that extracts coarse light and albedo information, and I would like to know how to do that for my own dataset.

Thanks in advance.

Questions about the data setup and how to set up my own datasets

Amazing work there! Very glad you release the training code and the preprocessed Co3D datasets.

But I cannot reproduce the Co3D dataset or the CelebA-HQ dataset.

What is the “extracted” folder in the dataloader?

Why are the dimensions of light (1x3) and view matrix (2x3) different from those of unsup3d?

When will the data setup scripts be released?

How could I do the preprocessing step on my own or other published scanned object datasets such as GSO (Google Scanned Objects)?

Thanks again for this fantastic job!

Questions about Coarse Light & Albedo estimates

Thanks for your work and code, I have a few questions about Extracting Coarse Light & Albedo.

How should the total variation regularization (TV) mentioned in the paper be calculated?
How large is the image size for which the calculation takes less than 1 second as mentioned in the paper?
If I want to estimate the coarse light parameters and albedo map on another dataset, which optimization method should I choose? Would scipy.optimize.least_squares work?
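
For readers with the same question, here is a minimal sketch of the two pieces asked about: an anisotropic total variation term, and a light fit with scipy.optimize.least_squares. It is not the authors' implementation. The shading model (ambient plus clamped diffuse term), the light parametrization, and all function names are assumptions made for illustration only.

```python
import numpy as np
from scipy.optimize import least_squares


def total_variation(img):
    """Anisotropic TV: mean absolute difference between neighbouring pixels."""
    dx = np.abs(img[:, 1:] - img[:, :-1])
    dy = np.abs(img[1:, :] - img[:-1, :])
    return dx.mean() + dy.mean()


def shading(params, normals):
    """Hypothetical shading model: ambient + diffuse * max(0, n . l)."""
    ambient, diffuse, lx, ly = params
    # Assumed parametrization: light direction (lx, ly, 1), normalized.
    light = np.array([lx, ly, 1.0])
    light /= np.linalg.norm(light)
    return ambient + diffuse * np.clip(normals @ light, 0.0, None)


def fit_light(normals, brightness):
    """Fit (ambient, diffuse, lx, ly) so the shading matches the brightness map.

    normals: (H, W, 3) unit normals, brightness: (H, W).
    """
    def residuals(params):
        return (shading(params, normals) - brightness).ravel()

    x0 = np.array([0.5, 0.5, 0.0, 0.0])  # arbitrary starting guess
    return least_squares(residuals, x0).x
```

With only four unknowns the fit is cheap, so the runtime is dominated by the number of pixels passed in as residuals. A TV penalty on an estimated map could be appended to the residual vector, but how the paper weights it is not specified here.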

Single Image Relighting

Excellent work!
I was wondering how to apply a target lighting to a single source image, as in Fig. 6 of the paper.
Could you explain how to do that? Thanks!

How to run inference on a custom image?

Hi,

Thanks for the great work. When I try to run inference and load the checkpoints (Co3D), every script fails with the same error: ModuleNotFoundError: No module named 'unsup3d'.

@Brummi Could you please guide me on how to run inference on custom images using the pre-trained Co3D models? Thanks!

Where can I find the algorithm for computing a normal map from a depth map?

Hey there,
one stupid question: I don't really understand the code of get_normal_from_depth and some related code like depth_to_3d_grid. Can you please tell me where I can learn how to compute a normal map from a depth map? It's quite different from what I saw on https://stackoverflow.com/questions/34644101/calculate-surface-normals-from-depth-image-using-neighboring-pixels-cross-produc
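
As a point of reference, a common way to do this is to back-project the depth map to a grid of 3D points and take the cross product of the two tangent vectors at each pixel. The sketch below illustrates that general idea with NumPy. It is not the repository's get_normal_from_depth or depth_to_3d_grid; the pinhole camera, the fov_deg default, and the sign convention are assumptions.

```python
import numpy as np


def depth_to_points(depth, fov_deg=10.0):
    """Back-project a depth map (H, W) to camera-space points (H, W, 3).

    Assumes a pinhole camera with the principal point at the image centre;
    fov_deg is a hypothetical horizontal field of view.
    """
    h, w = depth.shape
    focal = 0.5 * w / np.tan(0.5 * np.radians(fov_deg))
    xs = (np.arange(w) - (w - 1) / 2.0) / focal
    ys = (np.arange(h) - (h - 1) / 2.0) / focal
    grid_x, grid_y = np.meshgrid(xs, ys)
    return np.stack([grid_x * depth, grid_y * depth, depth], axis=-1)


def normals_from_depth(depth, fov_deg=10.0):
    """Estimate unit normals for the interior pixels, shape (H-2, W-2, 3)."""
    points = depth_to_points(depth, fov_deg)
    # Tangent vectors along image x and y via central differences.
    tangent_x = points[1:-1, 2:] - points[1:-1, :-2]
    tangent_y = points[2:, 1:-1] - points[:-2, 1:-1]
    # The cross product is perpendicular to both tangents. With this ordering
    # a fronto-parallel plane gets the normal (0, 0, 1); flip the sign if the
    # opposite convention (normals facing the camera) is needed.
    normal = np.cross(tangent_x, tangent_y)
    norm = np.linalg.norm(normal, axis=-1, keepdims=True)
    return normal / np.maximum(norm, 1e-12)
```

The difference from the per-pixel approach in the linked Stack Overflow question is mainly that the differences are taken on back-projected 3D points rather than on raw depth values, so the camera intrinsics are accounted for.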

Why not estimate coarse albedo from coarse brightness (B_u) directly?

If I get the optimised $L_c$ from Equation 3, then the aggregated shading map is very close to $2 B_u$.
Then watching Equation 4, why not just represent coarse albedo from $B_u$ ?
$$\tilde{A}=I_u (2 B_u)^{-1} $$
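
The proposal in this question can be written down in a few lines. The sketch below only illustrates the formula $\tilde{A}=I_u (2 B_u)^{-1}$ as stated by the issue author; the function name, the array shapes, and the eps clamp are assumptions, and it is not the method used in the paper.

```python
import numpy as np


def coarse_albedo_from_brightness(image, brightness, eps=1e-6):
    """Divide the image by twice the brightness map.

    image: (H, W, 3) array, brightness: (H, W) array.
    eps is a hypothetical lower bound that avoids division by zero.
    """
    shading = 2.0 * brightness
    return image / np.maximum(shading, eps)[..., None]
```

Note that any such division amplifies noise wherever the brightness is close to zero, which is why the clamp is needed.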

Why is there no $I/d^2$ term in Equations 2 and 3?

The distance from the light source to the object surface is related to depth. Why is it not considered in your light model?

Why Do We Need Pre-Computed Inputs for Test Images?

Hello @Brummi, I was trying to run inference using scripts/images_decomposition_co3d.py. However, it looks like the code always relies on precomputed data, e.g., depth, albedo, and normal maps. For example, when running with category = 'bench', test_path_precompute is always set to 'datasets/co3d/extracted_bench/precomputed/val', and inspecting the dictionary (data_dict) shows that all the tensors already hold non-zero values:

print('data_dict.keys(): ', data_dict.keys())
print(data_dict['input_im'].shape)
print(data_dict['recon_albedo'].shape)
print(data_dict['recon_depth'].shape)
print(data_dict['recon_normal'].shape)
print(data_dict['recon_normal'][0,0,:10,:10])

Shouldn't inference on a test image rely only on the single RGB image, without any pre-computed inputs? That is the impression I had from reading the paper. All these tensors should be initialized to zero except input_im. Maybe a single minimal inference example would help here: loading the model and running inference on a sample face/object image.

Is material stored in vertex color?

Where to download CelebAHQ cropped / extracted / unsup3d

The script you've provided in setup/download_processed_celebahq.sh does not provide links for downloading the CelebAHQ dataset. What links should we use if we want to train the network on this dataset? Thanks!

Demo release time?

Cannot download pre-trained model

Thanks for sharing your excellent work!
I want to test this model, but I cannot download it using setup/download_model.sh (nothing is received). I also tried opening https://www.robots.ox.ac.uk/~vgg/research/derender3d/data directly, but got: You don't have permission to access this resource.
Can you help? Thanks a lot.

Reproducing Table 2

Hi, thanks for making your code reproducible.

A small question about the Co3D results and eval_cosy.py.
I got the following result:

  35.5Hz       Normal_l1: 0.96366      Normal_mse: 0.17515     Normal_dot: 0.26274     Normal_deviation: 37.98527      Albedo_sie: 0.07594     Albedo_l1: 0.85689      Albedo_ssim: 0.76152    Spec_l1: 0.12306        Spec_mse: 0.07565       Spec_sie: 0.04684

Of these, Spec_l1: 0.12306 and Spec_mse: 0.07565 differ from the results reported in Table 2.
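
When comparing such output against a table, it can help to turn the printed line into a dictionary first. The helper below is a hypothetical convenience for that, assuming the "name: value" format shown in the output above; it is not part of the repository.

```python
import re


def parse_metrics(line):
    """Parse 'name: value' pairs from one line of evaluation output."""
    pairs = re.findall(r"(\w+):\s*([-+]?\d*\.?\d+)", line)
    return {name: float(value) for name, value in pairs}


line = ("35.5Hz Normal_l1: 0.96366 Normal_mse: 0.17515 Normal_dot: 0.26274 "
        "Normal_deviation: 37.98527 Albedo_sie: 0.07594 Albedo_l1: 0.85689 "
        "Albedo_ssim: 0.76152 Spec_l1: 0.12306 Spec_mse: 0.07565 Spec_sie: 0.04684")
metrics = parse_metrics(line)
```

The leading throughput value (35.5Hz) has no "name:" prefix and is therefore skipped.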