gaussianformer's Introduction

GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction

Yuanhui Huang, Wenzhao Zheng$\dagger$, Yunpeng Zhang, Jie Zhou, Jiwen Lu$\ddagger$

$\dagger$ Project leader $\ddagger$ Corresponding author

GaussianFormer proposes 3D semantic Gaussians as an object-centric representation for driving scenes that is more efficient than dense 3D occupancy grids.

[Figure: teaser]

News

  • [2024/05/28] Paper released on arXiv.
  • [2024/05/28] Demo release.

Demo

[Figure: demo, with legend]

Overview

[Figure: comparisons]

Motivated by the universal approximation ability of Gaussian mixtures, we propose an object-centric 3D semantic Gaussian representation that describes the fine-grained structure of 3D scenes without dense grids. We propose the GaussianFormer model, which uses sparse convolution and cross-attention to efficiently transform 2D images into 3D Gaussian representations. To generate dense 3D occupancy, we design a Gaussian-to-voxel splatting module that can be implemented efficiently in CUDA. While achieving comparable performance, GaussianFormer reduces the memory consumption of existing 3D occupancy prediction methods by 75.2% to 82.2%.
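
Since the code is not yet released, the following is only a minimal sketch of what Gaussian-to-voxel splatting could look like, not the authors' implementation. It assumes axis-aligned Gaussians (rotation omitted), coordinates in voxel units, opacity already folded into the semantic weights, and a 3-sigma neighborhood cutoff. The names `SemanticGaussian`, `splat_to_voxels`, and `voxel_label` are hypothetical, and the pure-Python loops stand in for what would be a CUDA kernel in practice.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class SemanticGaussian:
    # Field names are illustrative assumptions, not the authors' API.
    mean: Tuple[float, float, float]   # center, in voxel units
    scale: Tuple[float, float, float]  # per-axis standard deviation (no rotation)
    semantics: List[float]             # per-class weights, opacity folded in


def splat_to_voxels(gaussians, grid_shape, num_classes, radius_sigmas=3.0):
    """Accumulate each Gaussian's semantics into the voxels near its mean."""
    nx, ny, nz = grid_shape
    # Dense output grid: occ[x][y][z] is a list of per-class scores.
    occ = [[[[0.0] * num_classes for _ in range(nz)]
            for _ in range(ny)] for _ in range(nx)]
    for g in gaussians:
        # Visit only voxels inside a box of radius_sigmas standard deviations,
        # so the cost scales with the Gaussian's extent, not the grid size.
        bounds = []
        for m, s, n in zip(g.mean, g.scale, grid_shape):
            lo = max(0, math.floor(m - radius_sigmas * s))
            hi = min(n - 1, math.ceil(m + radius_sigmas * s))
            bounds.append((lo, hi))
        (x0, x1), (y0, y1), (z0, z1) = bounds
        for x in range(x0, x1 + 1):
            for y in range(y0, y1 + 1):
                for z in range(z0, z1 + 1):
                    # Voxel centers sit at integer index + 0.5.
                    d2 = sum(((c + 0.5 - m) / s) ** 2
                             for c, m, s in zip((x, y, z), g.mean, g.scale))
                    weight = math.exp(-0.5 * d2)
                    for k in range(num_classes):
                        occ[x][y][z][k] += weight * g.semantics[k]
    return occ


def voxel_label(scores, empty_threshold=0.05, empty_label=-1):
    """Pick the strongest class, or empty_label if nothing contributes enough.

    The threshold rule is an assumption made for this sketch.
    """
    best = max(range(len(scores)), key=lambda k: scores[k])
    return best if scores[best] >= empty_threshold else empty_label


# Usage: one Gaussian of class 1 centered in a 5x5x5 grid.
demo = SemanticGaussian(mean=(2.5, 2.5, 2.5), scale=(0.5, 0.5, 0.5),
                        semantics=[0.0, 1.0])
demo_occ = splat_to_voxels([demo], grid_shape=(5, 5, 5), num_classes=2)
print(voxel_label(demo_occ[2][2][2]), voxel_label(demo_occ[0][0][0]))
```

Restricting each Gaussian to a local neighborhood is what keeps the cost tied to the number of Gaussians rather than to the full dense grid; a GPU version would parallelize over Gaussians or voxels and accumulate contributions per voxel.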

[Figure: overview]

Getting Started

Code coming soon~

Related Projects

Our work is inspired by these excellent open-source repositories: TPVFormer, PointOcc, SelfOcc, SurroundOcc, OccFormer, and BEVFormer.

Citation

If you find this project helpful, please consider citing the following paper:

@article{huang2024gaussian,
    title={GaussianFormer: Scene as Gaussians for Vision-Based 3D Semantic Occupancy Prediction},
    author={Huang, Yuanhui and Zheng, Wenzhao and Zhang, Yunpeng and Zhou, Jie and Lu, Jiwen},
    journal={arXiv preprint arXiv:2405.17429},
    year={2024}
}

gaussianformer's People

Contributors

wzzheng, huang-yh

gaussianformer's Issues

Code Released

Hopefully the code will be released soon, thanks.

Code release

Could you release your code in the future?

Thank you for your excellent research!

Self-encoding

Hello. Thank you for your excellent work. I would like to know how you handle gaussians that fall on the same grid during sparse convolution.

Code Release Time

Hello, thank you very much to your team for open-sourcing such an excellent project. May I ask when the code will be released?

Rendered RGB images and semantic maps

Hi, I noticed that Table 6 of the paper (arXiv version) reports that the photometric loss does not improve performance. However, I am wondering what the rendered RGB images and semantic maps look like. Could you provide some visualization results?

Thanks!
