
MeshAnything's Introduction

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Yiwen Chen1,2*, Tong He2†, Di Huang2, Weicai Ye2, Sijin Chen3, Jiaxiang Tang4
Xin Chen5, Zhongang Cai6, Lei Yang6, Gang Yu7, Guosheng Lin1†, Chi Zhang8†
*Work done during a research internship at Shanghai AI Lab.
†Corresponding authors.
1S-Lab, Nanyang Technological University, 2Shanghai AI Lab,
3Fudan University, 4Peking University, 5University of Chinese Academy of Sciences,
6SenseTime Research, 7Stepfun, 8Westlake University


Demo GIF

Release

  • [6/17] 🔥🔥 We released the 350M version of MeshAnything.

Contents

  • Installation
  • Usage
  • Important Notes
  • TODO
  • Acknowledgement
  • BibTeX

Installation

Our environment has been tested on Ubuntu 22 with CUDA 11.8 on A100, A800, and A6000 GPUs.

  1. Clone our repo and create the conda environment:
git clone https://github.com/buaacyw/MeshAnything.git && cd MeshAnything
conda create -n MeshAnything python==3.10.13
conda activate MeshAnything
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
pip install flash-attn --no-build-isolation

or

pip install git+https://github.com/buaacyw/MeshAnything.git

And use it directly in your code:

import MeshAnything
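
For scripted use, a minimal sketch is shown below. It reuses the get_args and load_model helpers from main.py (app.py imports exactly these two), assumes the repo root is the working directory, and populates sys.argv with the documented CLI flags; the exact call signatures are assumptions, so check main.py if they differ.

# A minimal sketch, not an official API: reuse main.py's helpers.
# get_args parses the same flags as the command-line usage below,
# so we populate sys.argv before calling it (an assumption about its signature).
import sys
from main import get_args, load_model

sys.argv = ["main.py", "--input_path", "examples/wand.ply",
            "--out_dir", "mesh_output", "--input_type", "mesh"]
args = get_args()          # parse the documented CLI flags
model = load_model(args)   # build the MeshAnything model from the args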

Usage

Local Gradio Demo

python app.py

Mesh Command-Line Inference

# folder input
python main.py --input_dir examples --out_dir mesh_output --input_type mesh

# single file input
python main.py --input_path examples/wand.ply --out_dir mesh_output --input_type mesh

# Preprocess with Marching Cubes first
python main.py --input_dir examples --out_dir mesh_output --input_type mesh --mc

Point Cloud Command-Line Inference

# Note: to use your own point cloud, make sure normals are included.
# The file must be a .npy array of shape (N, 6), where N is the number of points: the first 3 columns are the xyz coordinates and the last 3 are the normal vector.

# inference for folder
python main.py --input_dir pc_examples --out_dir pc_output --input_type pc_normal

# inference for single file
python main.py --input_path pc_examples/mouse.npy --out_dir pc_output --input_type pc_normal
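
To build such a .npy file from your own mesh, here is a minimal sketch, assuming trimesh is available (trimesh is not a stated dependency of this repo; any library that yields surface points plus normals works, and the file names are placeholders):

# Sketch: write a pc_normal input in the documented (N, 6) layout,
# xyz coordinates in the first 3 columns, normals in the last 3.
import numpy as np
import trimesh  # assumption: install separately if missing

mesh = trimesh.load("my_object.obj", force="mesh")
points, face_idx = trimesh.sample.sample_surface(mesh, 8192)  # sample surface points
normals = mesh.face_normals[face_idx]                         # normal of each sampled face
pc = np.hstack([points, normals]).astype(np.float32)          # shape (N, 6)
np.save("pc_examples/my_object.npy", pc)

The saved file can then be passed to the pc_normal commands above.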

Important Notes

  • It takes about 7 GB of GPU memory and 30 s to generate a mesh on an A6000 GPU.
  • The input mesh will be normalized to a unit bounding box. The up vector of the input mesh should be +Y for better results; see the sketch after this list for reorienting a Z-up mesh.
  • Limited by computational resources, MeshAnything is trained on meshes with fewer than 800 faces and cannot generate meshes with more than 800 faces. The shape of the input mesh should be sharp enough; otherwise, it will be hard to represent it with only 800 faces. For this reason, outputs of feed-forward 3D generation methods often have insufficient shape quality and make poor inputs. We suggest using results from 3D reconstruction, scanning, and SDS-based methods (like DreamCraft3D) as the input to MeshAnything.
  • Please refer to https://huggingface.co/spaces/Yiwen-ntu/MeshAnything/tree/main/examples for more examples.
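
As mentioned in the notes above, here is a minimal sketch for reorienting a Z-up mesh to the expected +Y up convention, again assuming trimesh is available (a -90° rotation about +X maps +Z to +Y; file names are placeholders):

# Sketch: rotate a Z-up mesh into the +Y up convention expected above.
import numpy as np
import trimesh  # assumption: install separately if missing

mesh = trimesh.load("scan_zup.obj", force="mesh")
rot = trimesh.transformations.rotation_matrix(-np.pi / 2, [1, 0, 0])  # maps +Z to +Y
mesh.apply_transform(rot)
mesh.export("scan_yup.obj")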

TODO

The repo is still under construction; thanks for your patience.

  • Release of training code.
  • Release of larger model.

Acknowledgement

Our code is based on several wonderful repos.


BibTeX

@misc{chen2024meshanything,
  title={MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers},
  author={Yiwen Chen and Tong He and Di Huang and Weicai Ye and Sijin Chen and Jiaxiang Tang and Xin Chen and Zhongang Cai and Lei Yang and Gang Yu and Guosheng Lin and Chi Zhang},
  year={2024},
  eprint={2406.10163},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

MeshAnything's People

Contributors

buaacyw, dylanebert, hardikdava, icoz69


MeshAnything's Issues

Building wheel for mesh2sdf (pyproject.toml) did not run successfully.

I am on Windows 11. I tried several environments with all possible combinations of CUDA and PyTorch, after reading the other issue about the Windows build. When I try to install the requirements I keep getting an error for mesh2sdf. I already had Visual Studio installed with C++, but I also installed the C++ Build Tools, both the 2019 and 2022 versions. This is the error I keep getting:

Building wheels for collected packages: mesh2sdf
Building wheel for mesh2sdf (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for mesh2sdf (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [17 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-310
creating build\lib.win-amd64-cpython-310\mesh2sdf
copying mesh2sdf\compute.py -> build\lib.win-amd64-cpython-310\mesh2sdf
copying mesh2sdf\__init__.py -> build\lib.win-amd64-cpython-310\mesh2sdf
running build_ext
building 'mesh2sdf.core' extension
creating build\temp.win-amd64-cpython-310
creating build\temp.win-amd64-cpython-310\Release
creating build\temp.win-amd64-cpython-310\Release\csrc
"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DVERSION_INFO=1.1.0 -Icsrc -IC:\Users\Username\AppData\Local\Temp\pip-build-env-50vvqli3\overlay\Lib\site-packages\pybind11\include -IC:\Users\Username\anaconda3\envs\MeshAnything\include -IC:\Users\Carlo\anaconda3\envs\MeshAnything\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.40.33807\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" /EHsc /Tpcsrc/makelevelset3.cpp /Fobuild\temp.win-amd64-cpython-310\Release\csrc/makelevelset3.obj /std:c++latest /EHsc /bigobj
makelevelset3.cpp
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.40.33807\include\cstdlib(11): fatal error C1083: Cannot open include file: 'math.h': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\cl.exe' failed with exit code 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for mesh2sdf
Failed to build mesh2sdf
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (mesh2sdf)

About point cloud encoder

Hi, @buaacyw

Thanks for sharing your wonderful research. Just a small question: do I understand correctly that you use the point cloud encoder frozen, with exactly the same weights provided by the Michelangelo paper? So this part of the model is trained only on ShapeNet, not ShapeNet + Objaverse?

Input requirement

I tried my own point cloud and it does not work well. I noticed that the model cannot generate more than 800 faces; does that mean the input should be very smooth?

I also downloaded the car example from your Hugging Face link, and it seems to have more than 800 faces. I'm a little confused.

How to convert images to mesh?

It seems like the input should be a mesh or a point cloud; does that mean we should first convert the images to a mesh or point cloud?

Poor results using mesh input files

This is a photogrammetry scan of a little pouch, processed as a mesh. Once I get the point cloud converted to .npy I'll try that method as well. A potential issue is holes: is a sealed mesh required for good results? Here's what I've been getting.
Without a polygon limit, it just gave me the input mesh with maybe half the polygons optimized away. It was still an extremely high-poly mesh. I want something usable.

Here's the run with a 250-triangle limit: the result is the little bunch of polygons toward the bottom, which looks nothing like the input mesh.

python main.py --input_dir ..\props\magnifyingglass\bag\obj --out_dir ..\props\magnifyingglass\bag --input_type mesh --n_max_triangles 250

Look at this output! Utterly awful! I've got to be doing something wrong if you are capable of getting the results in your photos.

[screenshots of the output]

Training code

Hello, great work! Can you please provide the training code and details? Thank you.

Large-scale point clouds

Hello, experts! Could your masterpiece handle large-scale point cloud inputs and turn them into a nice mesh?

Question about MeshAnything

Hello! 😊 I discovered your MeshAnything project while exploring trending python repositories. Awesome job! 🎉 Could you send me more details on Telegram? Also, please review my work and follow me on GitHub @nectariferous. Thanks!

How to do text and images?

Is that not possible right now?

The input types are mesh and pc_normal, but this shows text and image:

[image]

Maybe I'm missing something.

Point Clouds

Hey, I see the picture of point clouds being used as an input. How do I do this?

ImportError: libtorch_cuda_cpp.so: cannot open shared object file: No such file or directory

Thank you for the great work. Could you please check the following error that occurred while executing app.py?

Traceback (most recent call last):
  File "/root/projects/MeshAnything/app.py", line 8, in <module>
    from main import get_args, load_model
  File "/root/projects/MeshAnything/main.py", line 6, in <module>
    from MeshAnything.models.meshanything import MeshAnything
  File "/root/projects/MeshAnything/MeshAnything/models/meshanything.py", line 5, in <module>
    from MeshAnything.models.shape_opt import ShapeOPTConfig
  File "/root/projects/MeshAnything/MeshAnything/models/shape_opt.py", line 2, in <module>
    from transformers.models.opt.modeling_opt import OPTForCausalLM, OPTModel, OPTDecoder, OPTLearnedPositionalEmbedding, OPTDecoderLayer
  File "/root/anaconda3/envs/meshany/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 46, in <module>
    from flash_attn import flash_attn_func, flash_attn_varlen_func
  File "/root/anaconda3/envs/meshany/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/root/anaconda3/envs/meshany/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: libtorch_cuda_cpp.so: cannot open shared object file: No such file or directory

How to get the color?

After running the code, I get an .obj file with only the mesh and no color. What should I do if I want to get the .obj with color?

Running on Windows

Hey, I tried to run it on Windows, but flash-attn is not installing on Windows.

From what I found in different GitHub repos, flash-attn needs CUDA 12.0 or higher to run on Windows systems.

Maybe in the near future you can update your script to run on a higher CUDA version, so we can test it on a Windows system too =)

How to convert 3D gaussian splatting to mesh?

Hi,

Thank you for sharing your impressive work!
In your paper, you demonstrate that MeshAnything is capable of converting 3D GS to mesh. I am curious about the implementation process. Does MeshAnything simply store the centers of the 3D GS and ignore the other parameters, or does it use a special algorithm to convert 3D GS to a point cloud? Can this be achieved using the currently released code?

Some questions about FlashAttention

Building the wheel is very slow, so I downloaded "flash_attn-2.6.1+cu118torch2.1cxx11abiTRUE-cp310-cp310-linux_x86_64.whl" and installed it via pip, matching torch==2.1.1, CUDA 11.8, and Python 3.10 on my Linux system. There are still errors, as shown below:
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /home/whuav/anaconda3/envs/MeshAnything/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so)

Any suggestions?

It took ~51 s on an A6000 when running the point cloud mouse example

Got a bunch of warnings when trying the example:

FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
You are using a model of type opt to instantiate a model of type shape_opt. This is not supported for all configurations of models and can yield errors.
The model was loaded with use_flash_attention_2=True, which is deprecated and may be removed in a future release. Please use attn_implementation="flash_attention_2" instead.
You are attempting to use Flash Attention 2.0 without specifying a torch dtype. This might lead to unexpected behaviour
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in ShapeOPT is torch.float32. You should run training or inference using Automatic Mixed-Precision via the with torch.autocast(device_type='torch_device'): decorator, or load the model with the torch_dtype argument. Example: model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in ShapeOPTModel is torch.float32. You should run training or inference using Automatic Mixed-Precision via the with torch.autocast(device_type='torch_device'): decorator, or load the model with the torch_dtype argument. Example: model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in ShapeOPTDecoder is torch.float32. You should run training or inference using Automatic Mixed-Precision via the with torch.autocast(device_type='torch_device'): decorator, or load the model with the torch_dtype argument. Example: model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)
The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.

I'm not sure whether the above warnings slow the run down or not. Could someone help take a look? Thank you!
