STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians

¹Nanjing University ²CASIA ³Fudan University
*equal contribution ⁺corresponding author

Update

6.20: IMPORTANT. Fixed a bug caused by the new version of diff_gauss. The newest version of diff_gauss returns color, depth, norm, alpha, radii, extra, whereas the previous version returned color, depth, alpha, radii. Using an older version of this code with the newest diff_gauss causes a mismatch error and may feed the normal map into the alpha loss in place of alpha, resulting in bad results.
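If you maintain your own fork and need to support both diff_gauss versions, the two return layouts described above can be told apart by their length. The following is a minimal sketch, not code from this repo: `unpack_raster_outputs` is a hypothetical helper, and it assumes the rasterizer result is a plain tuple in exactly one of the two orders listed in the update note.

```python
from typing import Any, Dict, Sequence


def unpack_raster_outputs(outputs: Sequence[Any]) -> Dict[str, Any]:
    """Map a rasterizer return tuple to named fields (hypothetical helper)."""
    if len(outputs) == 6:
        # Newer layout: color, depth, norm, alpha, radii, extra
        color, depth, norm, alpha, radii, extra = outputs
    elif len(outputs) == 4:
        # Older layout: color, depth, alpha, radii (no normal map, no extra)
        color, depth, alpha, radii = outputs
        norm, extra = None, None
    else:
        raise ValueError(
            f"unexpected number of rasterizer outputs: {len(outputs)}"
        )
    return {
        "color": color,
        "depth": depth,
        "norm": norm,
        "alpha": alpha,
        "radii": radii,
        "extra": extra,
    }
```

Reading `alpha` by name rather than by position avoids the failure mode above, where the third element of the new tuple (the normal map) is mistaken for alpha.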

5.26: Added Text/Image-to-4D data; see below.

5.21: Moved the RGB loss into the batch loop. Added visualization code.

⚙️ Installation

pip install -r requirements.txt

git clone --recursive https://github.com/slothfulxtx/diff-gaussian-rasterization.git
pip install ./diff-gaussian-rasterization

pip install ./simple-knn

Video-to-4D

To generate the examples on the project page, download the dataset from Google Drive, place it in the dataset folder, and run:

python main.py --config configs/stag4d.yaml path=dataset/minions save_path=minions

# use gui=True to turn on the visualizer (recommended)
python main.py --config configs/stag4d.yaml path=dataset/minions save_path=minions gui=True

To generate the spatial-temporal consistent data from scratch, you should place your RGBA data in the following layout:

├── dataset
│   ├── your_data
│   │   ├── 0_rgba.png
│   │   ├── 1_rgba.png
│   │   ├── 2_rgba.png
│   │   ├── ...
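Before starting a long run, it can help to check that the folder matches this naming scheme. The sketch below is not part of the repo: `list_rgba_frames` is a hypothetical helper, and it assumes frame indices start at 0 and are consecutive, as the listing above suggests.

```python
import re
from pathlib import Path
from typing import Dict, List

# Matches file names such as "0_rgba.png" or "12_rgba.png".
FRAME_PATTERN = re.compile(r"^(\d+)_rgba\.png$")


def list_rgba_frames(data_dir: str) -> List[Path]:
    """Return frames sorted by index; raise if none exist or indices have gaps."""
    indexed: Dict[int, Path] = {}
    for entry in Path(data_dir).iterdir():
        match = FRAME_PATTERN.match(entry.name)
        if match:
            indexed[int(match.group(1))] = entry
    if not indexed:
        raise FileNotFoundError(f"no *_rgba.png frames found in {data_dir}")
    # Assumption: indices run 0, 1, 2, ... with no gaps.
    missing = sorted(set(range(max(indexed) + 1)) - set(indexed))
    if missing:
        raise ValueError(f"missing frame indices: {missing}")
    return [indexed[i] for i in range(len(indexed))]
```

For example, `list_rgba_frames("dataset/your_data")` returns the frame paths in temporal order, or raises if a frame such as 1_rgba.png is absent.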

and then run

python scripts/gen_mv.py --path dataset/your_data --pipeline_path xxx/guidance/zero123pp

python main.py --config configs/stag4d.yaml path=data_path save_path=saving_path gui=True
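If you process several sequences, the two steps above can be assembled programmatically. This is only a sketch under stated assumptions: `build_commands` is a hypothetical helper, it uses only the flags and key=value overrides shown in this README, and the `xxx/guidance/zero123pp` prefix is left as the same placeholder used above.

```python
import shlex
from typing import List


def build_commands(
    data_dir: str, pipeline_path: str, save_name: str, gui: bool = True
) -> List[List[str]]:
    """Build the multi-view generation command followed by the training command."""
    gen_mv = [
        "python", "scripts/gen_mv.py",
        "--path", data_dir,
        "--pipeline_path", pipeline_path,
    ]
    train = [
        "python", "main.py",
        "--config", "configs/stag4d.yaml",
        f"path={data_dir}",
        f"save_path={save_name}",
    ]
    if gui:
        # Same override as in the commands above; turns on the visualizer.
        train.append("gui=True")
    return [gen_mv, train]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected first.
    for cmd in build_commands(
        "dataset/your_data", "xxx/guidance/zero123pp", "your_data"
    ):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root, so that training only starts if multi-view generation succeeded.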

To visualize the result, replace main.py with visualize.py; the result will be saved to the valid/xxx path, e.g.:

python visualize.py --config configs/stag4d.yaml path=dataset/minions save_path=minions

Text-to-4D

For Text-to-4D generation, we recommend using SDXL and SVD to generate a reasonable video. Then, after matting the video, use the commands above to generate a good 4D result. (This pipeline contains many independent parts and is fairly complex, so we may upload the whole workflow after integration if possible.)
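Once the video has been matted into RGBA frames, the frames need to follow the N_rgba.png naming that the Video-to-4D section expects. A minimal sketch of that staging step, assuming the matting tool has already written one RGBA PNG per frame and that you pass the paths in temporal order; `stage_matted_frames` is a hypothetical helper, not part of this repo, and it does not perform any matting itself.

```python
import shutil
from pathlib import Path
from typing import Iterable, List


def stage_matted_frames(frames: Iterable[str], out_dir: str) -> List[Path]:
    """Copy matted RGBA PNG frames to out_dir as 0_rgba.png, 1_rgba.png, ..."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    staged: List[Path] = []
    for index, frame in enumerate(frames):
        # Frame order is taken from the input order, not from the file names.
        destination = target / f"{index}_rgba.png"
        shutil.copyfile(frame, destination)
        staged.append(destination)
    return staged
```

For example, `stage_matted_frames(sorted_frame_paths, "dataset/your_data")` produces a folder that can be passed to scripts/gen_mv.py and main.py as described above; `sorted_frame_paths` stands for whatever ordered list your matting step produces.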

If you want to generate the examples in the paper, the corresponding data is also available here on Google Drive. Remember to set size to 26 in the config or pass size=26 in the command:

python main.py --config configs/stag4d.yaml path=dataset/xxx save_path=xxx size=26

Tips for better quality

If you are willing to trade time for better quality, here are some tips you can try to further improve the generated results.

1. Use a larger batch size.

2. Run for more steps.

Citation

If you find our work useful for your research, please consider citing our paper as well as Consistent4D:

@article{zeng2024stag4d,
      title={STAG4D: Spatial-Temporal Anchored Generative 4D Gaussians}, 
      author={Yifei Zeng and Yanqin Jiang and Siyu Zhu and Yuanxun Lu and Youtian Lin and Hao Zhu and Weiming Hu and Xun Cao and Yao Yao},
      year={2024},
      eprint={2403.14939},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

@article{jiang2023consistent4d,
      title={Consistent4D: Consistent 360{\deg} Dynamic Object Generation from Monocular Video}, 
      author={Yanqin Jiang and Li Zhang and Jin Gao and Weimin Hu and Yao Yao},
      year={2023},
      eprint={2311.02848},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgment

This repo is built on DreamGaussian and Zero123plus. Thanks to all the authors for their great work.
