Code Monkey home page Code Monkey logo

l2g-nerf's Introduction

L2G-NeRF: Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields

Project Page | Paper | Video

Yue Chen¹, Xingyu Chen¹, Xuan Wang², Qi Zhang³, Yu Guo¹, Ying Shan³, Fei Wang¹.

¹Xi'an Jiaotong University, ²Ant Group, ³Tencent AI Lab.

This repository is an official implementation of L2G-NeRF using pytorch.

💻 Installation

Hardware

  • We implement all experiments on a single NVIDIA GeForce RTX 2080 Ti GPU.
  • L2G-NeRF takes about 4.5 and 8 hours for training in synthetic objects and real-world scenes, respectively, while training BARF takes about 8 and 10.5 hours.

Software

  • Clone this repo by git clone https://github.com/rover-xingyu/L2G-NeRF
  • This code is developed with Python3. PyTorch 1.9+ is required. It is recommended use Anaconda to set up the environment, use conda env create --file requirements.yaml python=3 to install the dependencies and activate it by conda activate L2G-NeRF

🔑 Training and Evaluation

Data download

Both the Blender synthetic data and LLFF real-world data can be found in the NeRF Google Drive. For convenience, you can download them with the following script:

# Blender
gdown --id 18JxhpWD-4ZmuFKLzKlAw-w5PpzZxXOcG # download nerf_synthetic.zip
unzip nerf_synthetic.zip
rm -f nerf_synthetic.zip
mv nerf_synthetic data/blender
# LLFF
gdown --id 16VnMcF1KJYxN9QId6TClMsZRahHNMW5g # download nerf_llff_data.zip
unzip nerf_llff_data.zip
rm -f nerf_llff_data.zip
mv nerf_llff_data data/llff

Running L2G-NeRF

To train and evaluate L2G-NeRF:

# <GROUP> and <NAME> can be set to your likes, while <SCENE> is specific to datasets

# NeRF (3D): Synthetic Objects
# Blender (<SCENE>={chair,drums,ficus,hotdog,lego,materials,mic,ship})
python3 train.py \
--model=l2g_nerf --yaml=l2g_nerf_blender \
--group=exp_synthetic --name=l2g_lego \
--data.scene=lego --gpu=3 \
--data.root=/the/data/path/of/nerf_synthetic/ \
--camera.noise_r=0.07 --camera.noise_t=0.5

python3 evaluate.py \
--model=l2g_nerf --yaml=l2g_nerf_blender \
--group=exp_synthetic --name=l2g_lego \
--data.scene=lego --gpu=3 \
--data.root=/the/data/path/of/nerf_synthetic/ \
--data.val_sub= --resume

# NeRF (3D): Real-World Scenes
# LLFF (<SCENE>={fern,flower,fortress,horns,leaves,orchids,room,trex})
python3 train.py \
--model=l2g_nerf --yaml=l2g_nerf_llff \
--group=exp_LLFF --name=l2g_fern \
--data.scene=fern --gpu=3 \
--data.root=/the/data/path/of/nerf_llff_data/ \
--loss_weight.global_alignment=2

python3 evaluate.py \
--model=l2g_nerf --yaml=l2g_nerf_llff \
--group=exp_LLFF --name=l2g_fern \
--data.scene=fern --gpu=3 \
--data.root=/the/data/path/of/nerf_llff_data/ \
--loss_weight.global_alignment=2 \
--resume

# Neural image Alignment (2D): Rigid
# use the image of “Girl With a Pearl Earring” renovation ©Koorosh Orooj (CC BY-SA 4.0) for rigid image alignment
python3 train.py \
--model=l2g_planar --yaml=l2g_planar \
--group=exp_planar --name=l2g_girl  \
--warp.type=rigid --warp.dof=3 \
--data.image_fname=data/girl.jpg \
--data.image_size=[595,512] \
--data.patch_crop=[260,260] \
--seed=1 --gpu=3

# Neural image Alignment (2D): Homography
# use the image of “cat” from ImageNet for homography image alignment
python3 train.py \
--model=l2g_planar --yaml=l2g_planar \
--group=exp_planar --name=l2g_cat \
--warp.type=homography --warp.dof=8 \
--data.image_fname=data/cat.jpg \
--data.image_size=[360,480] \
--data.patch_crop=[180,180] \
--gpu=0

Running BARF

If you want to train and evaluate the BARF extension of the original NeRF model that jointly optimizes poses (coarse-to-fine positional encoding):

# <GROUP> and <NAME> can be set to your likes, while <SCENE> is specific to datasets

# NeRF (3D): Synthetic Objects
# Blender (<SCENE>={chair,drums,ficus,hotdog,lego,materials,mic,ship})
python3 train.py \
--model=barf --yaml=barf_blender \
--group=exp_synthetic --name=barf_lego \
--data.scene=lego --gpu=1 \
--data.root=/the/data/path/of/nerf_synthetic/ \
--camera.noise_r=0.07 --camera.noise_t=0.5

python3 evaluate.py \
--model=barf --yaml=barf_blender \
--group=exp_synthetic --name=barf_lego \
--data.scene=lego --gpu=1 \
--data.root=/the/data/path/of/nerf_synthetic/ \
--data.val_sub= --resume

# NeRF (3D): Real-World Scenes
# LLFF (<SCENE>={fern,flower,fortress,horns,leaves,orchids,room,trex})
python3 train.py \
--model=barf --yaml=barf_llff \
--group=exp_LLFF --name=barf_fern \
--data.scene=fern --gpu=1 \
--data.root=/the/data/path/of/nerf_llff_data/

python3 evaluate.py \
--model=barf --yaml=barf_llff \
--group=exp_LLFF --name=barf_fern \
--data.scene=fern --gpu=1 \
--data.root=/the/data/path/of/nerf_llff_data/ \
--resume

# Neural image Alignment (2D): Rigid
# use the image of “Girl With a Pearl Earring” renovation ©Koorosh Orooj (CC BY-SA 4.0) for rigid image alignment
python3 train.py \
--model=planar --yaml=planar \
--group=exp_planar --name=barf_girl \
--warp.type=rigid --warp.dof=3 \
--data.image_fname=data/girl.jpg \
--data.image_size=[595,512] \
--data.patch_crop=[260,260] \
--seed=1 --gpu=1

# Neural image Alignment (2D): Homography
# use the image of “cat” from ImageNet for homography image alignment
python3 train.py \
--model=planar --yaml=planar \
--group=exp_planar --name=barf_cat \
--warp.type=homography --warp.dof=8 \
--data.image_fname=data/cat.jpg \
--data.image_size=[360,480] \
--data.patch_crop=[180,180] \
--gpu=2

Running Naive

If you want to train and evaluate the Naive extension of the original NeRF model that jointly optimizes poses (full positional encoding):

# <GROUP> and <NAME> can be set to your likes, while <SCENE> is specific to datasets

# NeRF (3D): Synthetic Objects
# Blender (<SCENE>={chair,drums,ficus,hotdog,lego,materials,mic,ship})
python3 train.py \
--model=barf --yaml=barf_blender \
--group=exp_synthetic --name=nerf_lego \
--data.scene=lego --gpu=2 \
--data.root=/home/cy/PNW/datasets/nerf_synthetic/ \
--barf_c2f=null \
--camera.noise_r=0.07 --camera.noise_t=0.5

python3 evaluate.py \
--model=barf --yaml=barf_blender \
--group=exp_synthetic --name=nerf_lego \
--data.scene=lego --gpu=2 \
--data.root=/home/cy/PNW/datasets/nerf_synthetic/ \
--barf_c2f=null \
--data.val_sub= --resume

# NeRF (3D): Real-World Scenes
# LLFF (<SCENE>={fern,flower,fortress,horns,leaves,orchids,room,trex})
python3 train.py \
--model=barf --yaml=barf_llff \
--group=exp_LLFF --name=nerf_fern \
--data.scene=fern --gpu=2 \
--data.root=/home/cy/PNW/datasets/nerf_llff_data/ \
--barf_c2f=null

python3 evaluate.py \
--model=barf --yaml=barf_llff \
--group=exp_LLFF --name=nerf_fern \
--data.scene=fern --gpu=2 \
--data.root=/home/cy/PNW/datasets/nerf_llff_data/ \
--barf_c2f=null --resume 

# Neural image Alignment (2D): Rigid
# use the image of “Girl With a Pearl Earring” renovation ©Koorosh Orooj (CC BY-SA 4.0) for rigid image alignment
python3 train.py \
--model=planar --yaml=planar \
--group=exp_planar --name=naive_girl \
--warp.type=rigid --warp.dof=3 \
--data.image_fname=data/girl.jpg \
--data.image_size=[595,512] \
--data.patch_crop=[260,260] \
--seed=1 --gpu=2 --barf_c2f=null

# Neural image Alignment (2D): Homography
# use the image of “cat” from ImageNet for homography image alignment
python3 train.py \
--model=planar --yaml=planar \
--group=exp_planar --name=naive_cat \
--warp.type=homography --warp.dof=8 \
--data.image_fname=data/cat.jpg \
--data.image_size=[360,480] \
--data.patch_crop=[180,180] \
--gpu=3 --barf_c2f=null

Running reference NeRF

If you want to train and evaluate the reference NeRF models (assuming known camera poses):

# <GROUP> and <NAME> can be set to your likes, while <SCENE> is specific to datasets

# NeRF (3D): Synthetic Objects
# Blender (<SCENE>={chair,drums,ficus,hotdog,lego,materials,mic,ship})
python3 train.py \
--model=nerf --yaml=nerf_blender \
--group=exp_synthetic --name=ref_lego \
--data.scene=lego --gpu=0 \
--data.root=/home/cy/PNW/datasets/nerf_synthetic/ 

python3 evaluate.py \
--model=nerf --yaml=nerf_blender \
--group=exp_synthetic --name=ref_lego \
--data.scene=lego --gpu=0 \
--data.root=/home/cy/PNW/datasets/nerf_synthetic/ \
--data.val_sub= --resume

# NeRF (3D): Real-World Scenes
# LLFF (<SCENE>={fern,flower,fortress,horns,leaves,orchids,room,trex})
python3 train.py \
--model=nerf --yaml=nerf_llff \
--group=exp_LLFF --name=ref_fern \
--data.scene=fern --gpu=0 \
--data.root=/home/cy/PNW/datasets/nerf_llff_data/


python3 evaluate.py \
--model=nerf --yaml=nerf_llff \
--group=exp_LLFF --name=ref_fern \
--data.scene=fern --gpu=0 \
--data.root=/home/cy/PNW/datasets/nerf_llff_data/ \
--resume

😆 Visualization

Results and Videos

All the results will be stored in the directory output/<GROUP>/<NAME>. You may want to organize your experiments by grouping different runs in the same group. Many videos will be created to visualize the pose optimization process and novel view synthesis.

TensorBoard

The TensorBoard events include the following:

  • SCALARS: the rendering losses and PSNR over the course of optimization. For L2G_NeRF/BARF/Naive, the rotational/translational errors with respect to the given poses are also computed.
  • IMAGES: visualization of the RGB images and the RGB/depth rendering.

Visdom

The visualization of 3D camera poses is provided in Visdom: Run visdom -port 8600 to start the Visdom server. The Visdom host server is default to localhost; this can be overridden with --visdom.server (see options/base.yaml for details). If you want to disable Visdom visualization, add --visdom!.

Mesh

The extract_mesh.py script provides a simple way to extract the underlying 3D geometry using marching cubes (supporte for the Blender dataset). Run as follows:

python3 extract_mesh.py \
--model=l2g_nerf --yaml=l2g_nerf_blender \
--group=exp_synthetic --name=l2g_lego \
--data.scene=lego --gpu=3 \
--data.root=/home/cy/PNW/datasets/nerf_synthetic/ \
--data.val_sub= --resume

🔎 Codebase structure

The main engine and network architecture in model/l2g_nerf.py inherit those from model/nerf.py.

Some tips on using and understanding the codebase:

  • The computation graph for forward/backprop is stored in var throughout the codebase.
  • The losses are stored in loss. To add a new loss function, just implement it in compute_loss() and add its weight to opt.loss_weight.<name>. It will automatically be added to the overall loss and logged to Tensorboard.
  • If you are using a multi-GPU machine, you can set --gpu=<gpu_number> to specify which GPU to use. Multi-GPU training/evaluation is currently not supported.
  • To resume from a previous checkpoint, add --resume=<ITER_NUMBER>, or just --resume to resume from the latest checkpoint.
  • To eliminate the global alignment objective, set --loss_weight.global_alignment=null, the ablation is equivalent to a local registration method.

Citation

If you find this project useful for your research, please use the following BibTeX entry.

@inproceedings{chen2023local,
  title={Local-to-global registration for bundle-adjusting neural radiance fields},
  author={Chen, Yue and Chen, Xingyu and Wang, Xuan and Zhang, Qi and Guo, Yu and Shan, Ying and Wang, Fei},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={8264--8273},
  year={2023}
}

Acknowledge

Our code is based on the awesome pytorch implementation of Bundle-Adjusting Neural Radiance Fields (BARF). We appreciate all the contributors.

l2g-nerf's People

Contributors

rover-xingyu avatar fanegg avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.