argus-3d's Introduction

Argus-3D: Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability

Paper | Project Page

Installation

You can create an anaconda environment called argus-3d using

conda env create -f environment.yaml
conda activate argus-3d

Next, compile the extension modules. You can do this via

python setup.py build_ext --inplace
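
If the build succeeds, the compiled extensions should import without errors. The quick smoke test below is only a sketch: the module path is an assumption inferred from the src/utils/libmcubes path mentioned in the issues further down, so adjust it to the actual package layout.

# Smoke test: verify that a compiled extension module imports cleanly.
# NOTE: the module path is an assumption, not a documented entry point of this repo.
import importlib

for name in ["src.utils.libmcubes.mcubes"]:
    try:
        importlib.import_module(name)
        print("OK:", name)
    except ImportError as exc:
        print("FAILED:", name, "-", exc)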

Generation

Download the stage1 checkpoint and place it in output/PR256_ED512_EN8192.

Download the stage2 checkpoint and place it in output/PR256_ED512_EN8192/class-guide/transformer3072_24_32.

Then you can try class-guided generation by running:

python generate_class-guide.py --batch_size 16 --cate chair

This script should create a folder output/PR256_ED512_EN8192/class-guide/transformer3072_24_32/class_cond where the output meshes are stored.

Note: Our model requires significant memory, and we recommend running it on a GPU with high VRAM capacity (40GB or above). Generating a single mesh takes approximately 50 seconds on average on an A100 (80GB), and about 6 minutes on a V100 (32GB).
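
Once generation finishes, you can inspect the exported meshes with a small script, for example using trimesh. The snippet below is a sketch, not part of the repo: the output file extensions (.obj/.off) are assumptions, so adjust the glob patterns to whatever generate_class-guide.py actually writes.

# Sketch: list and inspect generated meshes with trimesh (pip install trimesh).
# NOTE: the *.obj / *.off patterns are assumptions about the output format.
from pathlib import Path
import trimesh

out_dir = Path("output/PR256_ED512_EN8192/class-guide/transformer3072_24_32/class_cond")
for mesh_path in sorted(out_dir.glob("**/*.obj")) + sorted(out_dir.glob("**/*.off")):
    mesh = trimesh.load(mesh_path, force="mesh")
    print(mesh_path.name, "vertices:", len(mesh.vertices), "faces:", len(mesh.faces))
    # mesh.show()  # opens an interactive viewer if a backend such as pyglet is installed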

Dataset

The occupancies, point clouds, and supplementary rendered images based on the Objaverse dataset can be downloaded from https://huggingface.co/datasets/BAAI/Objaverse-MIX
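
To fetch the data programmatically, the generic huggingface_hub client can mirror the dataset repository. The snippet below is a sketch: the local directory is an arbitrary choice, and since the full dataset is large you may want to restrict the download with allow_patterns.

# Sketch: download the Objaverse-MIX dataset repository from the Hugging Face Hub.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="BAAI/Objaverse-MIX",
    repo_type="dataset",
    local_dir="data/Objaverse-MIX",  # arbitrary local path
)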

Coming Soon

  • Image-guided generation
  • Text-guided generation
  • Training code

Shout-outs

Thanks to everyone who makes their code and models available.

Thanks for open-sourcing!

BibTeX

@inproceedings{luo2023learning,
      author = {Luo, Simian and Qian, Xuelin and Fu, Yanwei and Zhang, Yinda and Tai, Ying and Zhang, Zhenyu and Wang, Chengjie and Xue, Xiangyang},
      title = {Learning Versatile 3D Shape Generation with Improved Auto-regressive Models},
      booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
      year = {2023},
      month = {10},
      pages = {14093-14103},
      doi = {10.1109/ICCV51070.2023.01300}
}
@misc{qian2024pushing,
      title={Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability}, 
      author={Xuelin Qian and Yu Wang and Simian Luo and Yinda Zhang and Ying Tai and Zhenyu Zhang and Chengjie Wang and Xiangyang Xue and Bo Zhao and Tiejun Huang and Yunsheng Wu and Yanwei Fu},
      year={2024},
      eprint={2402.12225},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

argus-3d's Issues

Can this run on Windows?

Very impressive work. Can this run on Windows? When I execute the command on Windows, it reports:
ResolvePackageNotFound:

  • ld_impl_linux-64=2.38
  • libgomp=11.2.0
  • libgcc-ng=11.2.0
  • nspr=4.35
  • nss=3.89.1
  • libgfortran-ng=11.2.0
  • ncurses=6.4
  • readline=8.2
  • dbus=1.13.18
  • gst-plugins-base=1.14.1
  • gmp=6.2.1
  • libxkbcommon=1.0.1
  • nettle=3.6
  • libedit=3.1.20221030
  • libgfortran5=11.2.0
  • gnutls=3.6.13
  • libstdcxx-ng=11.2.0
  • gstreamer=1.14.1
  • libuuid=1.41.5
  • openh264=2.1.1
  • _openmp_mutex=5.1

Asking for Argus-3D configuration information

I would like to ask about the step of compiling the extension modules with python setup.py build_ext --inplace. The error is as follows:

File "/home/hjq/anaconda3/envs/argus-3d/lib/python3.8/site-packages/torch/utils/cpp_extension.py", line 1733, in _run_ninja_build
raise RuntimeError(message) from e
RuntimeError: Error compiling objects for extension

Is my server environment configured incorrectly?

Can we generate textures or vertex colors

Your program is very powerful, with higher quality than Shap-E, which is amazing. I also have a few small questions: 1. Can we generate textures or vertex colors with this tool? 2. I see there are 55 categories for class-guided generation; can I generate other kinds of items? We are the iSoftStone Innovation Research Institute, and we also do research and application work on automatic modeling. We hope to cooperate if possible, thank you!

Problem with compiling the extension modules

Thank you for the great work!
When I run
python setup.py build_ext --inplace
it fails with
'ValueError: 'src/utils/libmcubes/mcubes.pyx' doesn't match any files'
I don't see a libmcubes directory in the GitHub project.

How to process my own datasets?

Great work! I'd like to use my own datasets for training. Do you have any scripts for converting my own datasets (meshes in .obj/.glb/.stl format) to the same format as Objaverse-MIX? Many thanks! Besides, I have some questions regarding this dataset:

  1. I've taken a deep look into this dataset and found some extremely low-quality data inside it. Is there any way to filter these out?
  2. How can I filter a sub-dataset by category? Is there any metadata I can use?
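
For reference, one generic way to turn a mesh into surface points plus occupancy samples is with trimesh. The sketch below is not the official Objaverse-MIX preprocessing pipeline; the sample counts, normalization, and output keys are all assumptions.

# Sketch: convert a mesh (.obj/.glb/.stl) into surface points and occupancy labels.
# NOTE: generic recipe, not this repo's official pipeline; requires a watertight mesh.
import numpy as np
import trimesh

mesh = trimesh.load("my_model.glb", force="mesh")  # hypothetical input file

# Normalize the mesh into a unit cube centered at the origin.
center = mesh.bounds.mean(axis=0)
scale = (mesh.bounds[1] - mesh.bounds[0]).max()
mesh.apply_translation(-center)
mesh.apply_scale(1.0 / scale)

# Sample a surface point cloud and uniformly distributed occupancy queries.
surface_points, _ = trimesh.sample.sample_surface(mesh, 100000)
query_points = np.random.uniform(-0.55, 0.55, size=(100000, 3))
occupancies = mesh.contains(query_points)

np.savez_compressed(
    "my_model_points.npz",  # hypothetical output file
    surface_points=surface_points.astype(np.float32),
    query_points=query_points.astype(np.float32),
    occupancies=np.packbits(occupancies),
)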

It's too slow and the result is a cube

I used a 4090 (24GB) GPU to generate the knife class with batch_size 16.

It shows that 126 batches are needed in total, and each batch takes about 20-30 minutes. Is something wrong?

# total 0/126
Category knife:   0%|                                   | 0/126 [00:00<?, ?it/s]

# each
 55%|█████████████████████▉                  | 562/1024 [09:06<15:16,  1.98s/it]

A question

If I only want to generate a single car, rather than generating all the cars, how should I go about it?

No offense intended to the authors; I'll skip the praise and talk about the gap, as I understand it, between this and an ideal model.

  1. 3D data can be used, but I maintain that supervising directly in 3D is not the final path; ultimately supervision should come from the two constraints of "2D images + extremely rich text descriptions".
  2. On the architecture side, suppose the final system can generate roughly 1 million classes/objects reasonably well (leaving scenes aside). I think at most the first 10,000 classes/objects might need end-to-end learning, while 99% of later new objects should not need end-to-end training; they would only run a small sub-network of the whole algorithm (a few stages) to extract the necessary features, which could be stored much like an ordinary database. At inference time, they could be retrieved by nearest-neighbor search (a neural network could provide a more suitable distance) and then go directly through the rest of the pipeline. Only in this way could it scale to 10 million classes/objects.
    The numbers here are rough guesses; the core point is that there must be a nearly zero-training, linear/sub-linear scaling mode. For diffusion-style approaches that rely on parameter memorization, I think that for 3D (shape + texture + motion + interaction + ...), reaching a level that ordinary people find impressive in the real world, or that UE5 can use directly (MetaHuman-style, with LODs), would make the model enormous. The codebook parts of the paper, i.e. the narrow neck of the VAE, will sooner or later become the capacity bottleneck once this is extended to the real world.
  3. I'll stop there to avoid getting flamed.

Finally, a serious joke. In terms of data: Objaverse-MIX looks large for 3D, but compared with the text and 2D-image data needed to reach a ChatGPT-level "wow" effect for ordinary people, it is short by at least two orders of magnitude; how do you make that up? In terms of compute: 8 A100s training for 4 weeks, apparently from scratch, is probably also two orders of magnitude short.

Loading Stage2 Fail !!!!!!

My GPU is an A6000 (48GB) and my RAM is 128GB, but I waited a very long time and the result is:

Using Transformer Model:  FAST_Transformer_joint Baseline Quant Single !!
Using Single Quantizer  !!!! Reduce 4 High Reso 256 Relu Quantize 256 Symm 1
FAST_transformer_builder_baseline No drop Quant single cond class
z emb shape torch.Size([8192, 3072])

Loading Stage2 Fail !!!!!!

Current best validation metric (iou): -inf
Total number of parameters: 3670362821
output path: output/PR256_ED512_EN8192/class-guide/transformer3072_24_32
100%|███████████████████████████████████████| 1024/1024 [32:10<00:00,  1.88s/it]
100%|███████████████████████████████████████| 1024/1024 [32:04<00:00,  1.88s/it]

I also get the cube.
I want to know why my stage 2 loading failed!
