SGAM: Building a Virtual 3D World through Simultaneous Generation and Mapping

Yuan Shen1, Wei-Chiu Ma2, Shenlong Wang1
University of Illinois at Urbana-Champaign1, Massachusetts Institute of Technology2

Accepted at NeurIPS 2022.

Paper link | Project Page | Colab Quickstart

(The GIF animation above is generated via SGAM with only the first RGB-D frame known.)

TL;DR

We present a new 3D scene generation framework that simultaneously generates sensor data at novel viewpoints and builds a 3D map. Our framework is illustrated in the diagram below.

[Figure: SGAM framework diagram]

Quickstart

Try our Colab notebook to play with our trained models on CLEVR-Infinite and GoogleEarth-Infinite!

Installation

  • Manual installation (only tested on Ubuntu 18.04):

    1. Create a Conda environment and install some of the Python packages
      conda create -n sgam python=3.9.13
      conda activate sgam
      pip install -r requirement.txt
      
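    2. Install PyTorch and PyTorch Lightning (the +cu113 builds are served from the PyTorch wheel index, not PyPI)
      pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113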
      pip install pytorch_lightning==1.5.10
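
After installing, it can help to confirm that the environment actually holds the pinned versions. The snippet below is a minimal sketch, not part of the SGAM codebase: `EXPECTED` and `check_versions` are hypothetical names, and the pins are copied from the steps above. It only reads package metadata, so it does not need a GPU.

```python
from importlib import metadata

# Version pins copied from the installation steps above.
# "pytorch-lightning" is the pip distribution name of pytorch_lightning.
EXPECTED = {
    "torch": "1.12.1+cu113",
    "torchvision": "0.13.1+cu113",
    "pytorch-lightning": "1.5.10",
}


def check_versions(expected, get_version=metadata.version):
    """Return {package: (wanted, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, wanted in expected.items():
        try:
            installed = get_version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != wanted:
            mismatches[name] = (wanted, installed)
    return mismatches


if __name__ == "__main__":
    problems = check_versions(EXPECTED)
    for name, (wanted, installed) in problems.items():
        print(f"{name}: expected {wanted}, found {installed}")
    if not problems:
        print("All pinned packages match.")
```

An empty result means every pinned package is installed at the expected version.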
      

Video

YouTube video: https://www.youtube.com/watch?v=GrtooGn_Rws

Data

CLEVR-Infinite Dataset

  • Note that the depth is nonlinear when rendered from Blender. See line 103 of data/clevr-infinite.py for how we convert the depth from nonlinear to linear.
  • For a quick look at our dataset, here is one tiny scene example.
  • Data for two validation scenes can be downloaded from this link.
  • Our training, validation, and test sets are available for download at this link.
  • To generate more training data at scale, we provide a Blender script in the clevr_generation directory. We randomly distribute primitive 3D objects by simulating objects falling and colliding. The detailed steps are as follows:
    1. Find a machine with a GPU and install Blender 2.92:
    sudo snap install blender --channel=2.92/stable --classic
    
    2. (Optional) To visualize one CLEVR-Infinite scene, run:
    /snap/bin/blender random_scene.blend
    
    3. Specify the output directory in line 253.

    4. Run the following command to render. You can change the iteration number to set the number of random scenes.

      bash blender_generation.sh
    
    5. Run the postprocessing script to get the RGB images, depth maps, and transform.json:
      python convert_exr.py
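
The note above about nonlinear depth refers to the conversion in data/clevr-infinite.py, which remains the authoritative version. As a generic illustration only, the sketch below shows one common nonlinear-to-linear conversion: the standard perspective depth-buffer formula. It is an assumption that this resembles the dataset's encoding; `linearize_depth`, `near`, and `far` are hypothetical names, not taken from the repository.

```python
def linearize_depth(d, near, far):
    """Map a normalized perspective depth-buffer value to linear depth.

    Assumption (not confirmed by the SGAM source): d lies in [0, 1] and was
    produced by a standard perspective projection with clip planes near/far.
    """
    z_ndc = 2.0 * d - 1.0  # rescale [0, 1] to normalized device range [-1, 1]
    return (2.0 * near * far) / (far + near - z_ndc * (far - near))


if __name__ == "__main__":
    # Equal steps in buffer value do not correspond to equal steps in depth.
    for d in (0.0, 0.5, 0.9, 1.0):
        print(d, linearize_depth(d, near=0.1, far=100.0))
```

With this formula a buffer value of 0 maps to the near plane and 1 maps to the far plane, with most of the buffer's range spent close to the camera.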
    

GoogleEarth-Infinite Dataset

  • Please reach out to us by email if you would like access to the dataset.

Trained Models

We provide our trained models for GoogleEarth-Infinite and CLEVR-Infinite. Please download the pre-trained checkpoints and organize them as follows:

SGAM   
└───trained_models
    └───google_earth
    │   │   config.yaml
    │   │   XXX.ckpt
    │    
    └───clevr-infinite   
        │   config.yaml
        │   XXX.ckpt
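
A quick way to confirm the layout above before running inference is sketched below. This helper is not part of the repository: `missing_checkpoint_files` is a hypothetical name, and the check assumes each dataset directory needs one `config.yaml` and at least one `.ckpt` file, as the tree suggests.

```python
from pathlib import Path

# Directory names taken from the layout shown above.
DATASETS = ("google_earth", "clevr-infinite")


def missing_checkpoint_files(root):
    """List what is missing under <root>/trained_models for each dataset."""
    problems = []
    for name in DATASETS:
        model_dir = Path(root) / "trained_models" / name
        if not (model_dir / "config.yaml").is_file():
            problems.append(f"{name}: config.yaml not found")
        # Any checkpoint file name is accepted; XXX.ckpt is a placeholder.
        if not any(model_dir.glob("*.ckpt")):
            problems.append(f"{name}: no .ckpt file found")
    return problems


if __name__ == "__main__":
    # Assumes the script is run from the SGAM repository root.
    for problem in missing_checkpoint_files("."):
        print(problem)
```

An empty list means both model directories contain a config and a checkpoint.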

Training

  1. VQGAN codebook training
python train_generative_sensing_model.py --base configs/codebooks/XXX.yaml --gpus 0, -t True
  2. Conditional generation
python train_generative_sensing_model.py --base configs/conditional_generation/XXX.yaml --gpus 0, -t True

Inference

CLEVR-Infinite

python main_scene_generation.py --dataset="clevr-infinite" --use_rgbd_integration True

GoogleEarth-Infinite

python main_scene_generation.py --dataset="google_earth" --use_rgbd_integration True
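
To run both inference commands back to back, a small wrapper like the one below may be convenient. It is a sketch under stated assumptions: `build_command` and `run_all` are hypothetical helpers, only the flags shown above are used, and the script is assumed to be run from the repository root.

```python
import subprocess
import sys

# Dataset names as used by the two commands above.
DATASETS = ("clevr-infinite", "google_earth")


def build_command(dataset):
    """Build the inference command for one dataset, mirroring the README."""
    return [
        sys.executable,
        "main_scene_generation.py",
        f"--dataset={dataset}",
        "--use_rgbd_integration",
        "True",
    ]


def run_all(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for dataset in DATASETS:
        cmd = build_command(dataset)
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    run_all(dry_run=True)
```

The default dry run only prints the commands, so the wrapper can be checked before committing GPU time.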

Acknowledgement

We thank Vlas Zyrianov for his feedback on our paper drafts. Our codebase is built on top of the VQGAN codebase. Many thanks to Patrick Esser and Robin Rombach for making their code available.

Citation

If you find our work useful, please cite it using the BibTeX entry below. Thanks!

@inproceedings{
    shen2022sgam,
    title={{SGAM}: Building a Virtual 3D World through Simultaneous Generation and Mapping},
    author={Yuan Shen and Wei-Chiu Ma and Shenlong Wang},
    booktitle={Thirty-Sixth Conference on Neural Information Processing Systems},
    year={2022},
    url={https://openreview.net/forum?id=17KCLTbRymw}
}
