
Zero-1-to-3: Zero-shot One Image to 3D Object

Ruoshi Liu1, Rundi Wu1, Basile Van Hoorick1, Pavel Tokmakov2, Sergey Zakharov2, Carl Vondrick1
1Columbia University, 2Toyota Research Institute

Updates

We have optimized our code base with some simple tricks, and the current demo now runs with around 22 GB of VRAM, so it can run on an RTX 3090 (Ti)!

Weights are also available at:

wget https://cv.cs.columbia.edu/zero123/assets/105000.ckpt
wget https://cv.cs.columbia.edu/zero123/assets/165000.ckpt

and in a Hugging Face repo (still uploading):

https://huggingface.co/datasets/cvlab/zero123-weights 

Note that we have released two sets of model weights. By default, we use 105000.ckpt, which is the checkpoint after 105000 iterations of finetuning on Objaverse. 165000.ckpt is also available.
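
As a rough illustration of choosing between the two released checkpoints, the sketch below builds the download URL from the iteration count and fetches the file only if it is not already present. The helper names (`checkpoint_url`, `fetch_checkpoint`) and the `dest_dir` parameter are hypothetical and not part of the repository; only the base URL and the two iteration counts come from the text above.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL and iteration counts are taken from the wget commands above.
BASE_URL = "https://cv.cs.columbia.edu/zero123/assets"
RELEASED_ITERATIONS = (105000, 165000)


def checkpoint_url(iterations: int = 105000) -> str:
    """Return the download URL for a released checkpoint (hypothetical helper)."""
    if iterations not in RELEASED_ITERATIONS:
        raise ValueError(f"no released checkpoint for {iterations} iterations")
    return f"{BASE_URL}/{iterations}.ckpt"


def fetch_checkpoint(iterations: int = 105000, dest_dir: str = ".") -> Path:
    """Download the checkpoint into dest_dir unless the file already exists."""
    target = Path(dest_dir) / f"{iterations}.ckpt"
    if not target.exists():
        target.parent.mkdir(parents=True, exist_ok=True)
        urlretrieve(checkpoint_url(iterations), str(target))
    return target
```

Calling `fetch_checkpoint()` with no arguments would fetch the default 105000.ckpt into the current directory; passing `165000` selects the later checkpoint instead.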

Usage

Novel View Synthesis

conda create -n zero123 python=3.9
conda activate zero123
cd zero123
pip install -r requirements.txt
git clone https://github.com/CompVis/taming-transformers.git
pip install -e taming-transformers/
git clone https://github.com/openai/CLIP.git
pip install -e CLIP/
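
After the steps above, a quick sanity check can confirm that the two editable installs are visible to the interpreter. This is only a sketch: `missing_modules` and `REQUIRED` are hypothetical names, and the check assumes the packages expose the top-level import names `taming` and `clip`.

```python
from importlib.util import find_spec


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names provided by taming-transformers and CLIP.
REQUIRED = ["taming", "clip"]
```

Running `missing_modules(REQUIRED)` inside the activated `zero123` environment should return an empty list if both installs succeeded.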

Download the checkpoint and place it under zero123:

https://drive.google.com/drive/folders/1geG1IO15nWffJXsmQ_6VLih7ryNivzVs?usp=sharing

Run our gradio demo for novel view synthesis:

python gradio_new.py

Note that this app uses around 22 GB of VRAM, so it may not run on every GPU.
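
One way to check in advance whether a machine is likely to meet this requirement is to query the total memory of each GPU with `nvidia-smi`. The sketch below is an assumption-laden illustration, not part of the repository: the function names are hypothetical, the 22 GB figure is approximated as 22 * 1024 MiB, and total memory is only an upper bound on what is actually free.

```python
import subprocess

# Assumption: "around 22 GB" is approximated as 22 * 1024 MiB.
REQUIRED_MIB = 22 * 1024


def parse_total_mib(smi_output: str) -> list:
    """Parse one integer (MiB) per non-empty line of nvidia-smi CSV output."""
    return [int(line.strip()) for line in smi_output.splitlines() if line.strip()]


def query_total_mib() -> list:
    """Ask nvidia-smi for each GPU's total memory; return [] if that fails."""
    cmd = ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"]
    try:
        result = subprocess.run(cmd, capture_output=True, text=True, check=True)
        return parse_total_mib(result.stdout)
    except (FileNotFoundError, subprocess.CalledProcessError, ValueError):
        return []


def gpus_with_enough_memory(totals_mib, required_mib=REQUIRED_MIB):
    """Return the indices of GPUs whose total memory meets the requirement."""
    return [i for i, total in enumerate(totals_mib) if total >= required_mib]
```

For example, `gpus_with_enough_memory(query_total_mib())` lists the GPU indices that have at least the assumed amount of total memory; an empty list means no suitable GPU was found or `nvidia-smi` is unavailable.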

3D Reconstruction

cd 3drec
pip install -r requirements.txt
python run_zero123.py \
    --scene 'pikachu' \
    --index 0 \
    --n_steps 10000 \
    --lr 0.05 \
    --sd.scale 100.0 \
    --emptiness_weight 0 \
    --depth_smooth_weight 10000. \
    --near_view_weight 10000. \
    --train_view True \
    --prefix 'experiments/exp_wild' \
    --vox.blend_bg_texture False \
    --nerf_path 'data/nerf_wild'
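
To launch the same reconstruction from a script, for instance with a different scene name or index, the command above can be assembled programmatically. This is a minimal sketch: `build_command` is a hypothetical helper, and every flag and value is copied from the command shown above rather than taken from the script's documentation.

```python
def build_command(scene: str, index: int = 0, n_steps: int = 10000) -> list:
    """Assemble the run_zero123.py argument list using the flags shown above."""
    return [
        "python", "run_zero123.py",
        "--scene", scene,
        "--index", str(index),
        "--n_steps", str(n_steps),
        "--lr", "0.05",
        "--sd.scale", "100.0",
        "--emptiness_weight", "0",
        "--depth_smooth_weight", "10000.",
        "--near_view_weight", "10000.",
        "--train_view", "True",
        "--prefix", "experiments/exp_wild",
        "--vox.blend_bg_texture", "False",
        "--nerf_path", "data/nerf_wild",
    ]
```

The resulting list can be passed to `subprocess.run(build_command("pikachu"), cwd="3drec", check=True)` when invoked from the repository root.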

We tested the installation process on Ubuntu 20.04 with an NVIDIA GPU of the Ampere architecture.

Acknowledgement

This repository is based on Stable Diffusion, Objaverse, and SJC. We would like to thank the authors of these works for publicly releasing their code. We would also like to thank the authors of NeRDi and SJC for their helpful feedback.

We would like to thank Changxi Zheng for many helpful discussions. This research is based on work partially supported by the Toyota Research Institute, the DARPA MCS program under Federal Agreement No. N660011924032, and the NSF NRI Award #1925157.

Citation

@misc{liu2023zero1to3,
      title={Zero-1-to-3: Zero-shot One Image to 3D Object}, 
      author={Ruoshi Liu and Rundi Wu and Basile Van Hoorick and Pavel Tokmakov and Sergey Zakharov and Carl Vondrick},
      year={2023},
      eprint={2303.11328},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}
