GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation

Yinghao Xu*, Zifan Shi*, Wang Yifan, Hansheng Chen, Ceyuan Yang, Sida Peng, Yujun Shen, Gordon Wetzstein

3d-generation.mp4

Todo List

  • Release gradio demo code.
  • Release inference code.
  • Release pretrained models.
  • Release training code.

GRM Demo

Requirements

  • 64-bit Python 3.10 and PyTorch 2.0.1 or higher.
  • CUDA 11.8
  • Install the packages with the following commands:
conda create -n grm python=3.10
conda activate grm 
pip install -r requirements.txt --extra-index-url https://download.pytorch.org/whl/cu118
cd third_party/diff-gaussian-rasterization && pip install -e . && cd ../..
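
After installation, it can help to confirm that the environment matches the requirements listed above. The following is a minimal sketch, not part of the repository: `parse_version` and `check_environment` are hypothetical helper names, and the script assumes that Python 3.10 is a minimum rather than an exact pin.

```python
import re
import sys

# Requirements taken from the list above. Treating Python 3.10 as a minimum
# (rather than an exact version) is an assumption of this sketch.
MIN_PYTHON = (3, 10)
MIN_TORCH = (2, 0, 1)
EXPECTED_CUDA = "11.8"


def parse_version(text):
    """Turn a version string such as '2.0.1+cu118' into the tuple (2, 0, 1)."""
    core = text.split("+", 1)[0]  # drop a local build suffix such as '+cu118'
    numbers = []
    for piece in core.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_environment():
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    if sys.version_info[:2] < MIN_PYTHON:
        problems.append("Python 3.10 or higher is required")
    try:
        import torch
    except ImportError:
        problems.append("PyTorch is not installed")
        return problems
    if parse_version(torch.__version__) < MIN_TORCH:
        problems.append("PyTorch 2.0.1 or higher is required, found " + torch.__version__)
    if torch.version.cuda != EXPECTED_CUDA:
        problems.append("PyTorch was built against CUDA " + str(torch.version.cuda))
    if not torch.cuda.is_available():
        problems.append("no CUDA device is visible to PyTorch")
    return problems
```

Calling `check_environment()` from a Python shell inside the `grm` environment returns the list of detected problems, which is empty when every check passes.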

Pretrained weights

Pretrained weights can be downloaded from Hugging Face.

# Download weights
mkdir checkpoints && cd checkpoints
wget https://huggingface.co/justimyhxu/GRM/resolve/main/grm_u.pth -O grm_u.pth
wget https://huggingface.co/justimyhxu/GRM/resolve/main/grm_r.pth -O grm_r.pth
wget https://huggingface.co/justimyhxu/GRM/resolve/main/grm_zero123plus.pth -O grm_zero123plus.pth
cd ..

We provide three checkpoints, which differ in the camera layout used during training. We use the OpenCV coordinate system (camera x-axis right, y-axis down, z-axis forward).

| Checkpoint | Training settings |
| --- | --- |
| grm_u.pth | All views have an elevation of 20 degrees, and the azimuths are uniformly spaced over the full 360 degrees. |
| grm_r.pth | The azimuths roughly cover the full 360 degrees. |
| grm_zero123plus.pth | Three views have an elevation of 30 degrees, with azimuths evenly spaced at 120-degree intervals. A fourth view has an elevation of -20 degrees and an azimuth offset by 60 degrees from one of the three. |

You also need to download the SV3D checkpoint. Replace HF_TOKEN in the command below with your Hugging Face access token.

cd checkpoints
wget --header="Authorization: Bearer HF_TOKEN" https://huggingface.co/stabilityai/sv3d/resolve/main/sv3d_p.safetensors -O sv3d_p.safetensors && cd ..
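
The same download can be scripted from Python if `wget` is unavailable. The following is a minimal sketch using only the standard library. The function names are hypothetical, the URL pattern is copied from the `wget` commands above, and reading the token from an `HF_TOKEN` environment variable is an assumption of this example rather than something the repository prescribes.

```python
import os
import shutil
import urllib.request


def build_checkpoint_request(repo_id, filename, token=None):
    """Build a request for https://huggingface.co/<repo_id>/resolve/main/<filename>."""
    url = "https://huggingface.co/{}/resolve/main/{}".format(repo_id, filename)
    headers = {}
    if token:
        # Same header that the wget command passes via --header.
        headers["Authorization"] = "Bearer " + token
    return urllib.request.Request(url, headers=headers)


def download_checkpoint(repo_id, filename, out_dir="checkpoints", token=None):
    """Stream one checkpoint file into out_dir and return the local path."""
    os.makedirs(out_dir, exist_ok=True)
    target = os.path.join(out_dir, filename)
    request = build_checkpoint_request(repo_id, filename, token)
    with urllib.request.urlopen(request) as response, open(target, "wb") as handle:
        shutil.copyfileobj(response, handle)
    return target


# Example call (performs a network download, so it is left commented out):
# download_checkpoint("stabilityai/sv3d", "sv3d_p.safetensors",
#                     token=os.environ.get("HF_TOKEN"))
```

The GRM checkpoints can be fetched the same way with `repo_id="justimyhxu/GRM"` and no token, mirroring the unauthenticated `wget` commands.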

Inference

# Point the environment at the CUDA 11.8 toolkit
export CUDA_HOME=/usr/local/cuda-11.8
export PATH=$CUDA_HOME/bin:$PATH
export LD_LIBRARY_PATH=$CUDA_HOME/lib64:$LD_LIBRARY_PATH

# text-to-3D
python test.py --prompt 'a car made out of cheese'
# image-to-3D with zero123plus-v1.1
python test.py --image_path examples/dragon2.png --model zero123plus-v1.1
# image-to-3D with zero123plus-v1.2
python test.py --image_path examples/dragon2.png --model zero123plus-v1.2 --fuse_mesh True --optimize_texture True
# image-to-3D with SV3D
python test.py --image_path examples/dragon2.png --model sv3d --fuse_mesh True --optimize_texture True

Add --fuse_mesh True to extract a textured mesh. Add --optimize_texture True to further optimize the texture of the extracted mesh.
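
For batch runs, the commands above can be assembled programmatically. The sketch below is a hypothetical wrapper, not part of the repository. It uses only the flags shown in this section (`--prompt`, `--image_path`, `--model`, `--fuse_mesh`, `--optimize_texture`) and assumes it is run from the repository root, where `test.py` lives.

```python
import sys


def build_test_command(image_path=None, prompt=None, model=None,
                       fuse_mesh=False, optimize_texture=False):
    """Return the argument list for one test.py run, using the flags shown above."""
    if (image_path is None) == (prompt is None):
        raise ValueError("pass exactly one of image_path or prompt")
    command = [sys.executable, "test.py"]
    if prompt is not None:
        command += ["--prompt", prompt]
    else:
        command += ["--image_path", str(image_path)]
    if model is not None:
        command += ["--model", model]
    if fuse_mesh:
        command += ["--fuse_mesh", "True"]
    if optimize_texture:
        command += ["--optimize_texture", "True"]
    return command


# Example: run image-to-3D with SV3D on every PNG in examples/
# (left commented out because it launches inference):
# import pathlib, subprocess
# for image in sorted(pathlib.Path("examples").glob("*.png")):
#     subprocess.run(build_test_command(image_path=image, model="sv3d",
#                                       fuse_mesh=True), check=True)
```

Building the command as a list rather than a shell string avoids quoting problems with prompts that contain spaces or quotes.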

Gradio Demo

We provide an offline gradio demo, which can be run with the following command:

python app.py

Results

Blender Demo

blender_demo.mp4

Sparse-view Reconstruction

sparse-view.mp4

Acknowledgement

We thank all of the following amazing codes:

BibTeX

@article{xu2024grm,
    author    = {Xu, Yinghao and Shi, Zifan and Yifan, Wang and Peng, Sida and Yang, Ceyuan and Shen, Yujun and Wetzstein, Gordon},
    title     = {GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation},
    journal   = {arxiv: 2403.14621},
    year      = {2024},
}
