
MeshAnything's Introduction

MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers

Yiwen Chen1,2*, Tong He2†, Di Huang2, Weicai Ye2, Sijin Chen3, Jiaxiang Tang4
Xin Chen5, Zhongang Cai6, Lei Yang6, Gang Yu7, Guosheng Lin1†, Chi Zhang8†
*Work done during a research internship at Shanghai AI Lab.
†Corresponding authors.
1S-Lab, Nanyang Technological University, 2Shanghai AI Lab,
3Fudan University, 4Peking University, 5University of Chinese Academy of Sciences,
6SenseTime Research, 7Stepfun, 8Westlake University


Demo GIF

Release

  • [6/17] 🔥🔥 We released the 350M version of MeshAnything.

Contents

  • Installation
  • Usage
  • Important Notes
  • TODO
  • Acknowledgement
  • BibTeX

Installation

Our environment has been tested on Ubuntu 22 with CUDA 11.8 on A100, A800, and A6000 GPUs.

  1. Clone our repo and create the conda environment:
git clone https://github.com/buaacyw/MeshAnything.git && cd MeshAnything
conda create -n MeshAnything python==3.10.13
conda activate MeshAnything
pip install torch==2.1.1 torchvision==0.16.1 torchaudio==2.1.1 --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
pip install flash-attn --no-build-isolation

or

pip install git+https://github.com/buaacyw/MeshAnything.git

And use it directly in your code:

import MeshAnything
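
For scripted use, a minimal sketch is shown below. It reuses the get_args and load_model helpers from main.py (app.py imports exactly these two), assumes the repo root is the working directory, and populates sys.argv with the documented CLI flags; the exact call signatures are assumptions, so check main.py if they differ.

# A minimal sketch, not an official API: reuse main.py's helpers.
# get_args parses the same flags as the command-line usage below,
# so we populate sys.argv before calling it (an assumption about its signature).
import sys
from main import get_args, load_model

sys.argv = ["main.py", "--input_path", "examples/wand.ply",
            "--out_dir", "mesh_output", "--input_type", "mesh"]
args = get_args()          # parse the documented CLI flags
model = load_model(args)   # build the MeshAnything model from the args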

Usage

Local Gradio Demo

python app.py

Mesh Command-Line Inference

# folder input
python main.py --input_dir examples --out_dir mesh_output --input_type mesh

# single file input
python main.py --input_path examples/wand.ply --out_dir mesh_output --input_type mesh

# Preprocess with Marching Cubes first
python main.py --input_dir examples --out_dir mesh_output --input_type mesh --mc

Point Cloud Command-Line Inference

# Note: to use your own point cloud, make sure normals are included.
# The file must be a .npy array of shape (N, 6), where N is the number of points: the first 3 columns are the xyz coordinates and the last 3 are the normal vector.

# inference for folder
python main.py --input_dir pc_examples --out_dir pc_output --input_type pc_normal

# inference for single file
python main.py --input_path pc_examples/mouse.npy --out_dir pc_output --input_type pc_normal
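
To build such a .npy file from your own mesh, here is a minimal sketch, assuming trimesh is available (trimesh is not a stated dependency of this repo; any library that yields surface points plus normals works, and the file names are placeholders):

# Sketch: write a pc_normal input in the documented (N, 6) layout,
# xyz coordinates in the first 3 columns, normals in the last 3.
import numpy as np
import trimesh  # assumption: install separately if missing

mesh = trimesh.load("my_object.obj", force="mesh")
points, face_idx = trimesh.sample.sample_surface(mesh, 8192)  # sample surface points
normals = mesh.face_normals[face_idx]                         # normal of each sampled face
pc = np.hstack([points, normals]).astype(np.float32)          # shape (N, 6)
np.save("pc_examples/my_object.npy", pc)

The saved file can then be passed to the pc_normal commands above.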

Important Notes

  • It takes about 7 GB of GPU memory and 30 s to generate a mesh on an A6000 GPU.
  • The input mesh will be normalized to a unit bounding box. The up vector of the input mesh should be +Y for better results; see the sketch after this list for reorienting a Z-up mesh.
  • Limited by computational resources, MeshAnything is trained on meshes with fewer than 800 faces and cannot generate meshes with more than 800 faces. The shape of the input mesh should be sharp enough; otherwise, it will be hard to represent it with only 800 faces. For this reason, outputs of feed-forward 3D generation methods often have insufficient shape quality and make poor inputs. We suggest using results from 3D reconstruction, scanning, and SDS-based methods (like DreamCraft3D) as the input to MeshAnything.
  • Please refer to https://huggingface.co/spaces/Yiwen-ntu/MeshAnything/tree/main/examples for more examples.
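
As mentioned in the notes above, here is a minimal sketch for reorienting a Z-up mesh to the expected +Y up convention, again assuming trimesh is available (a -90° rotation about +X maps +Z to +Y; file names are placeholders):

# Sketch: rotate a Z-up mesh into the +Y up convention expected above.
import numpy as np
import trimesh  # assumption: install separately if missing

mesh = trimesh.load("scan_zup.obj", force="mesh")
rot = trimesh.transformations.rotation_matrix(-np.pi / 2, [1, 0, 0])  # maps +Z to +Y
mesh.apply_transform(rot)
mesh.export("scan_yup.obj")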

TODO

The repo is still under construction; thanks for your patience.

  • Release of training code.
  • Release of larger model.

Acknowledgement

Our code is based on several wonderful repos.


BibTeX

@misc{chen2024meshanything,
  title={MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers},
  author={Yiwen Chen and Tong He and Di Huang and Weicai Ye and Sijin Chen and Jiaxiang Tang and Xin Chen and Zhongang Cai and Lei Yang and Gang Yu and Guosheng Lin and Chi Zhang},
  year={2024},
  eprint={2406.10163},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}

MeshAnything's People

Contributors

buaacyw, dylanebert, hardikdava, icoz69


MeshAnything's Issues

Building wheel for mesh2sdf (pyproject.toml) did not run successfully.

I am on Windows 11. I tried several environments with all possible combinations of CUDA and PyTorch, after reading the other issue about the Windows build. When I try to install the requirements I keep getting an error for mesh2sdf. I already had Visual Studio installed with C++, but I also installed the C++ Build Tools, both the 2019 and 2022 versions. This is the error I keep getting:

Building wheels for collected packages: mesh2sdf
Building wheel for mesh2sdf (pyproject.toml) ... error
error: subprocess-exited-with-error

× Building wheel for mesh2sdf (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> [17 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win-amd64-cpython-310
creating build\lib.win-amd64-cpython-310\mesh2sdf
copying mesh2sdf\compute.py -> build\lib.win-amd64-cpython-310\mesh2sdf
copying mesh2sdf\__init__.py -> build\lib.win-amd64-cpython-310\mesh2sdf
running build_ext
building 'mesh2sdf.core' extension
creating build\temp.win-amd64-cpython-310
creating build\temp.win-amd64-cpython-310\Release
creating build\temp.win-amd64-cpython-310\Release\csrc
"C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\cl.exe" /c /nologo /O2 /W3 /GL /DNDEBUG /MD -DVERSION_INFO=1.1.0 -Icsrc -IC:\Users\Username\AppData\Local\Temp\pip-build-env-50vvqli3\overlay\Lib\site-packages\pybind11\include -IC:\Users\Username\anaconda3\envs\MeshAnything\include -IC:\Users\Carlo\anaconda3\envs\MeshAnything\Include "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.40.33807\include" "-IC:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Auxiliary\VS\include" /EHsc /Tpcsrc/makelevelset3.cpp /Fobuild\temp.win-amd64-cpython-310\Release\csrc/makelevelset3.obj /std:c++latest /EHsc /bigobj
makelevelset3.cpp
C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.40.33807\include\cstdlib(11): fatal error C1083: Cannot open include file: 'math.h': No such file or directory
error: command 'C:\Program Files (x86)\Microsoft Visual Studio\2022\BuildTools\VC\Tools\MSVC\14.40.33807\bin\HostX86\x64\cl.exe' failed with exit code 2
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for mesh2sdf
Failed to build mesh2sdf
ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (mesh2sdf)

About point cloud encoder

Hi, @buaacyw

Thanks for sharing your wonderful research. Just a small question: do I understand correctly that you use the point cloud encoder frozen, with exactly the same weights provided by the Michelangelo paper? So this part of the model is trained only on ShapeNet, not ShapeNet + Objaverse?

Input requirement

I tried my own point cloud and it does not work well. I noticed that the model cannot generate more than 800 faces; does that mean the input should be very smooth?

I also downloaded the car example from your Hugging Face link, and it seems to have more than 800 faces. I'm a little confused.

How to convert images to mesh?

It seems like the input should be a mesh or a point cloud; does that mean we should first convert the images to a mesh or point cloud?

Poor results using mesh input files

This is a photogrammetry scan of a little pouch, processed as a mesh. Once I get the point cloud converted to .npy I'll try that method as well. A potential issue is holes: is a sealed mesh required for good results? Here's what I've been getting.
Without a polygon limit, it just gave me the input mesh with maybe half the polygons optimized away. It was still an extremely high-poly mesh. I want something usable.

Here's the run with a 250-triangle limit: the result is the little bunch of polygons toward the bottom, which looks nothing like the input mesh.

python main.py --input_dir ..\props\magnifyingglass\bag\obj --out_dir ..\props\magnifyingglass\bag --input_type mesh --n_max_triangles 250

Look at this output! Utterly awful! I've got to be doing something wrong if you are capable of getting the results in your photos.

[screenshots of the output]

Training code

Hello, great work! Can you please provide the training code and details? Thank you.

Large-scale point clouds

Hello, experts! Could your masterpiece handle large-scale point cloud inputs and turn them into a nice mesh?

Question about MeshAnything

Hello! 😊 I discovered your MeshAnything project while exploring trending python repositories. Awesome job! 🎉 Could you send me more details on Telegram? Also, please review my work and follow me on GitHub @nectariferous. Thanks!

How to do text and images?

Is that not possible right now?

The input types are mesh and pc_normal, but this shows text and image:

[image]

Maybe I'm missing something.

Point Clouds

Hey, I see the picture of point clouds being used as an input. How do I do this?

ImportError: libtorch_cuda_cpp.so: cannot open shared object file: No such file or directory

Thank you for the great work. Could you please check the following error that occurred while executing app.py?

Traceback (most recent call last):
  File "/root/projects/MeshAnything/app.py", line 8, in <module>
    from main import get_args, load_model
  File "/root/projects/MeshAnything/main.py", line 6, in <module>
    from MeshAnything.models.meshanything import MeshAnything
  File "/root/projects/MeshAnything/MeshAnything/models/meshanything.py", line 5, in <module>
    from MeshAnything.models.shape_opt import ShapeOPTConfig
  File "/root/projects/MeshAnything/MeshAnything/models/shape_opt.py", line 2, in <module>
    from transformers.models.opt.modeling_opt import OPTForCausalLM, OPTModel, OPTDecoder, OPTLearnedPositionalEmbedding, OPTDecoderLayer
  File "/root/anaconda3/envs/meshany/lib/python3.10/site-packages/transformers/models/opt/modeling_opt.py", line 46, in <module>
    from flash_attn import flash_attn_func, flash_attn_varlen_func
  File "/root/anaconda3/envs/meshany/lib/python3.10/site-packages/flash_attn/__init__.py", line 3, in <module>
    from flash_attn.flash_attn_interface import (
  File "/root/anaconda3/envs/meshany/lib/python3.10/site-packages/flash_attn/flash_attn_interface.py", line 10, in <module>
    import flash_attn_2_cuda as flash_attn_cuda
ImportError: libtorch_cuda_cpp.so: cannot open shared object file: No such file or directory

How to get the color?

After running the code, I get an .obj file with only the mesh and no color. What should I do if I want to get the .obj with color?

Running on Windows

Hey, I tried to run it on Windows, but flash-attn is not installing on Windows.

From what I found in different GitHub repos, flash-attn needs CUDA 12.0 or higher to run on Windows systems.

Maybe in the near future you can update your script to run on a higher CUDA version, so we can test it on a Windows system too =)

How to convert 3D gaussian splatting to mesh?

Hi,

Thank you for sharing your impressive work!
In your paper, you demonstrate that MeshAnything is capable of converting 3D GS to mesh. I am curious about the implementation process. Does MeshAnything simply store the centers of the 3D GS and ignore the other parameters, or does it use a special algorithm to convert 3D GS to a point cloud? Can this be achieved using the currently released code?

Some questions about FlashAttention

Building the wheel is very slow, so I downloaded "flash_attn-2.6.1+cu118torch2.1cxx11abiTRUE-cp310-cp310-linux_x86_64.whl" and installed it via pip, matching torch==2.1.1, CUDA 11.8, and Python 3.10 on my Linux system. There are still errors, as shown below:
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.26' not found (required by /home/whuav/anaconda3/envs/MeshAnything/lib/python3.10/site-packages/flash_attn_2_cuda.cpython-310-x86_64-linux-gnu.so)

Any suggestions?

It took ~51 s on an A6000 when running the point cloud mouse example

Got a bunch of warnings when trying the example:

FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.
You are using a model of type opt to instantiate a model of type shape_opt. This is not supported for all configurations of models and can yield errors.
The model was loaded with use_flash_attention_2=True, which is deprecated and may be removed in a future release. Please use attn_implementation="flash_attention_2" instead.
You are attempting to use Flash Attention 2.0 without specifying a torch dtype. This might lead to unexpected behaviour
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in ShapeOPT is torch.float32. You should run training or inference using Automatic Mixed-Precision via the with torch.autocast(device_type='torch_device'): decorator, or load the model with the torch_dtype argument. Example: model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in ShapeOPTModel is torch.float32. You should run training or inference using Automatic Mixed-Precision via the with torch.autocast(device_type='torch_device'): decorator, or load the model with the torch_dtype argument. Example: model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)
Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes, but the current dype in ShapeOPTDecoder is torch.float32. You should run training or inference using Automatic Mixed-Precision via the with torch.autocast(device_type='torch_device'): decorator, or load the model with the torch_dtype argument. Example: model = AutoModel.from_pretrained("openai/whisper-tiny", attn_implementation="flash_attention_2", torch_dtype=torch.float16)
The BetterTransformer implementation does not support padding during training, as the fused kernels do not support attention masks. Beware that passing padded batched data during training may result in unexpected outputs. Please refer to https://huggingface.co/docs/optimum/bettertransformer/overview for more details.

I'm not sure whether the above warnings slow the run down or not. Could someone help take a look? Thank you!
