maua's Introduction

Maua

๐Ÿ‘ท โ›๏ธ WIP ๐Ÿ› ๏ธ ๐Ÿ‘ท

Maua is a Python library (and command line interface) for synthesizing images, video, and audio using deep learning.

While many research groups publish code to reproduce the results of their papers, it is often still time-consuming to set up the environment correctly and figure out how to run the algorithms on your own data. The goal of Maua is to collect these methods in one place and make them easy to use as creative tools. The primary design goal is composability: combining different methods in unique ways multiplies the space of possible results, and with it the creative freedom.

Maua is still under construction, so the API and functionality are subject to change.

Installation

Deep learning is very compute-hungry, so a decent GPU is practically a requirement. Install CUDA first, then install Maua using pip as follows:

pip install numpy Cython torch --extra-index-url https://download.pytorch.org/whl/cu116
pip install git+https://github.com/maua-maua-maua/maua.git --extra-index-url https://pypi.ngc.nvidia.com --extra-index-url https://download.pytorch.org/whl/cu116

So far, installation has only been tested on an Ubuntu 20.04 machine with NVIDIA GPUs. Other configurations are possible but might be more involved. If you run into problems, feel free to open an issue!
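Before running the pip commands, it can help to confirm that the basic GPU prerequisites are visible from Python. The following is a minimal sketch, not part of Maua: the function name `check_environment` is made up for illustration, and it only reports whether `nvidia-smi` is on the PATH, whether `torch` can be imported, and whether PyTorch sees a CUDA device.

```python
import importlib.util
import shutil


def check_environment():
    """Report whether the basic GPU prerequisites appear to be present.

    Hypothetical helper for illustration; Maua does not ship this function.
    """
    report = {
        # The NVIDIA driver normally installs the nvidia-smi tool
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        # find_spec returns None when the package is not installed
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        import torch  # imported lazily so the check also runs without torch

        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

If `cuda_available` is False even though `torch` is installed, the installed PyTorch wheel may not match the CUDA version named in the index URL above.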

Compiling Extensions

python maua/submodules/pycuda/configure.py --cuda-enable-gl
mv siteconf.py maua/submodules/pycuda
pip install -e maua/submodules/pycuda

git clone https://github.com/NVIDIA/apex
cd apex
python setup.py install --cuda_ext --cpp_ext 
cd ..

pip install maua[flow]

Usage

WARNING: some of the examples below might be out of date.

Command line

Use --help to list the available options:

python -m maua --help
python -m maua autoregressive --help
python -m maua.diffusion.sample --help

Examples

Generate images with classifier-free guided diffusion:

python -m maua.diffusion.sample --text "A street art mural of a dapper turtle with wings"

Generate images by finetuning RuDALL-E on a set of images:

python -m maua autoregressive ru finetune --input_dir /path/to/directory/of/images/

Upscale images using RealESRGAN:

python -m maua super image upscale /path/to/image.png /path/to/image2.png /path/to/image3.png --model_name RealESRGAN-pbaylies-hr-paintings
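For batch jobs it can be convenient to assemble the upscaling command from a script. The sketch below only builds the argument list for the command shown above; `build_upscale_command` is a hypothetical helper name, and the subcommand and `--model_name` flag are taken from the example as written, which may be out of date.

```python
import sys


def build_upscale_command(image_paths, model_name):
    """Build the argv list for the upscaling command from the README example.

    Hypothetical helper for illustration; it mirrors the documented command
    and does not call any Maua API directly.
    """
    paths = [str(path) for path in image_paths]
    if not paths:
        raise ValueError("at least one image path is required")
    return [
        sys.executable, "-m", "maua", "super", "image", "upscale",
        *paths,
        "--model_name", model_name,
    ]


if __name__ == "__main__":
    command = build_upscale_command(
        ["/path/to/image.png", "/path/to/image2.png"],
        "RealESRGAN-pbaylies-hr-paintings",
    )
    # Pass the list to subprocess.run(command, check=True) to execute it
    print(" ".join(command))
```

Passing the arguments as a list rather than a single string avoids shell quoting problems with paths that contain spaces.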

Python

All of the command line functions are also available for import within Python.

Examples

High resolution diffusion:

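from pathlib import Path

from maua.diffusion.sample import main as diffusion
from maua.super.image import upscale

# Sample five images from the same prompt
images = [diffusion(text="A street art mural of a dapper turtle with wings", timesteps=100) for _ in range(5)]

# Upscale each image and save it under a numbered filename
Path("output").mkdir(exist_ok=True)
for i, pil_image in enumerate(upscale(images, model_name='latent-diffusion')):
    pil_image.save(f"output/{i}.png")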

Thanks

Thank you to everyone who makes their work available to the community. Maua incorporates open source code from all over the internet; without the work of these amazing people this wouldn't be possible. Below is a (probably very) incomplete list of people whose code has directly or indirectly contributed to Maua:

@crowsonkb, @ProGamerGov, @lucidrains, @dvschultz, @eps696, @l4rz, @caillonantoine, @ben-hayes, @adrienchaton, @sportsracer48, @afiaka87, @aydao, @rosinality, @genekogan, @dribnet, @alexjc, @htoyryla, @neverix, @sniklaus, @xinntao, @cszn, @JingyunLiang, @kentsyx, @kakaobrain, @yu45020, @twitter@advadnoun, @twitter@danielrussruss, @twitter@ai_curio

Citations

Maua relies on many innovations coming directly from the research community. If you use Maua in any way for research-related purposes, you should be citing some papers. For now, please do a quick web search based on the file path (these are generally named after the method). A full list of papers to cite will be compiled and documented clearly in the future.

License

The main license for this repository is GPL-v3. However, due to the wide variety of sources of code, different components might fall under different licenses. Efforts are still underway to ensure all parts of the Maua library are licensed and attributed correctly. If there are any issues with licensing, please open an issue and they will be rectified ASAP!

The code is provided free of charge for the purpose of enabling people to make art and explore deep learning synthesis. Due to the aforementioned licensing, commercial use of Maua may be tricky. You will need to make sure that you are adhering to the licensing terms of all the submodules!

In general, output does not fall under the license of the code. Copyrights of model weights and model outputs are still a major gray area so use these commercially at your own risk! As a rule of thumb, try to consider if what you are making really is transformative. If you've simply scraped someone's social media page and trained a model to mimic them, you need to give them credit and should not use the model or outputs commercially without discussing with them.

maua's People

Contributors

jcbrouwer

maua's Issues

installation difficulties on windows

When I try to do the basic install I run into a repeated error:

Running command git submodule update --init --recursive -q
error: unable to create file docs/d0/d14/structnlohmann_1_1detail_1_1is__compatible__integer__type__impl_3_01RealIntegerType_00_01Compatie5920c849e839ebb9f8c57349c900796.html: Filename too long

_attn/cutlass'
error: subprocess-exited-with-error

× git submodule update --init --recursive -q did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error
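The "Filename too long" message is consistent with the legacy Windows path limit of roughly 260 characters: pip clones into a deep temporary directory, and the long generated documentation filename pushes the full path over the limit. A commonly used workaround is to enable long paths in Git for Windows with `git config --global core.longpaths true`; whether that alone is enough for this repository has not been confirmed here. The sketch below is an illustration only, with a hypothetical helper name, showing how to check whether a checkout directory plus a relative path would exceed the limit.

```python
from pathlib import PureWindowsPath

# Legacy Windows limit; the count includes the terminating null character
WINDOWS_MAX_PATH = 260


def exceeds_windows_max_path(checkout_dir, relative_path, limit=WINDOWS_MAX_PATH):
    """Return True if the joined path would not fit within the legacy limit.

    Hypothetical helper for illustration; it only measures string length and
    does not touch the file system.
    """
    full_path = PureWindowsPath(checkout_dir) / relative_path
    # +1 accounts for the terminating null character
    return len(str(full_path)) + 1 > limit


if __name__ == "__main__":
    # Made-up checkout directory and file name, for demonstration only
    temp_dir = r"C:\Users\someone\AppData\Local\Temp\pip-req-build-example"
    long_name = "docs/d0/d14/" + "x" * 250 + ".html"
    print(exceeds_windows_max_path(temp_dir, long_name))
```

Cloning into a short directory such as a folder directly under the drive root reduces the total path length and may avoid the error without changing any settings.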

Install difficulties

Keep seeing this repeated error on install

Running command git submodule update --init --recursive -q
error: unable to create file docs/d0/d14/structnlohmann_1_1detail_1_1is__compatible__integer__type__impl_3_01RealIntegerType_00_01Compatie5920c849e839ebb9f8c57349c900796.html: Filename too long
_attn/cutlass'
error: subprocess-exited-with-error

× git submodule update --init --recursive -q did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

Installation fails

Hi! I'm trying to get this project running specifically for the audio-visual part. I'm following the installation instructions in the root README, running in a Google Colab notebook using a T4 GPU.

I've run the following:
!pip install nvidia-cudnn-cu116 --extra-index-url https://pypi.ngc.nvidia.com
as per the earlier issue here.

Then:
!pip install numpy Cython torch --extra-index-url https://download.pytorch.org/whl/cu116

and finally:
!pip install git+https://github.com/maua-maua-maua/maua.git --extra-index-url https://pypi.ngc.nvidia.com --extra-index-url https://download.pytorch.org/whl/cu116

Which then gives the output:

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com/, https://download.pytorch.org/whl/cu116
Collecting git+https://github.com/maua-maua-maua/maua.git
  Cloning https://github.com/maua-maua-maua/maua.git to /tmp/pip-req-build-0gqqay2x
  Running command git clone --filter=blob:none --quiet https://github.com/maua-maua-maua/maua.git /tmp/pip-req-build-0gqqay2x
  Resolved https://github.com/maua-maua-maua/maua.git to commit d968fd91f0d9b371ae5a5aa8e7ad9e9a906ded22
  Running command git submodule update --init --recursive -q
  Preparing metadata (setup.py) ... done
Collecting apex@ git+https://github.com/NVIDIA/apex (from maua==0.4)
  Cloning https://github.com/NVIDIA/apex to /tmp/pip-install-hyxcyr_i/apex_ee92ddee1c1c471a964997c89113d2cc
  Running command git clone --filter=blob:none --quiet https://github.com/NVIDIA/apex /tmp/pip-install-hyxcyr_i/apex_ee92ddee1c1c471a964997c89113d2cc
  Resolved https://github.com/NVIDIA/apex to commit 141bbf1cf362d4ca4d94f4284393e91dda5105a5
  Running command git submodule update --init --recursive -q
  Installing build dependencies ... done
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Getting requirements to build wheel ... error
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

The end game here is that I just want to be able to run the audio-visual code with a pretrained StyleGAN3 model preferably in Google Colab, so any solution which gets me there I'm happy with 😄 Here's a test environment for you.

Attempting to run upscaling on images

I'm trying to upscale some images and am trying to use this repo to do so.
I have run into a couple of issues.

During the installation steps:

git clone --recursive https://github.com/maua-maua-maua/maua.git 
cd maua
conda create -n maua python=3.8 pytorch torchvision torchaudio cudatoolkit=11.3 cudatoolkit-dev=11.3 cudnn mpi4py Cython pip=21.3.1 -c nvidia -c pytorch -c conda-forge
conda activate maua
pip install -r requirements.txt
pip install -r audio/requirements.txt
pip install cupy-cuda113==9.6

When running pip install -r requirements.txt, the process gets stuck when attempting to build the wheel for nestedtensor.
I commented this out hoping that for upscaling, this package will not be required.

When I tried to run the provided upscaling command,
python -m maua super /path_to_my_image.png --model_name RealESRGAN-pbaylies-hr-paintings
I got an error message because pytorch-lightning wasn't installed.
I resolved this by installing the package from here:
https://www.pytorchlightning.ai/

I then reran the upscaling command but got the following error:
Segmentation fault (core dumped)

I am using Ubuntu 18.04. I have a Tesla T4 gpu. My CUDA version is 11.4 and my NVIDIA driver version is 470.82.00.
When this error occurred, my cpu was about 98% idle, I had about 10Gb of RAM free, and my T4 was completely free.

Could you please advise?

control of video output size

Is there a way to change the width/height of a video output? I'm currently using an init video that is 1024 x 512, and when I try to change the output size in the code to (512, 512), it gives me this:

RuntimeError: The size of tensor a (512) must match the size of tensor b (1024) at non-singleton dimension 2

If I try changing either width or height, I get this error. I may just be modifying the wrong parameter; apologies in advance!
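The error text is what PyTorch reports when two tensors in an elementwise operation cannot be broadcast against each other, which suggests the init video frames are still at their original size while the rest of the pipeline uses the new output size. The pure-Python sketch below illustrates the broadcasting rule only; the helper name and the example shapes are assumptions for illustration, not taken from Maua's code.

```python
def first_broadcast_conflict(shape_a, shape_b):
    """Return the index of the first dimension (scanning from the last) where
    two shapes cannot be broadcast together, or None if they are compatible.

    Two sizes are compatible when they are equal or when either one is 1.
    Hypothetical helper for illustration.
    """
    ndim = max(len(shape_a), len(shape_b))
    # Shorter shapes are left-padded with 1s, as in NumPy and PyTorch
    a = (1,) * (ndim - len(shape_a)) + tuple(shape_a)
    b = (1,) * (ndim - len(shape_b)) + tuple(shape_b)
    for dim in reversed(range(ndim)):
        if a[dim] != b[dim] and a[dim] != 1 and b[dim] != 1:
            return dim
    return None


if __name__ == "__main__":
    # Assumed shapes: a 512-wide output frame and a 1024-wide init frame
    print(first_broadcast_conflict((3, 512, 512), (3, 512, 1024)))
```

If this is the cause, resizing the init video to the target size before processing (or setting the output size to match the init video) should make the shapes agree.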

video diffusion slow down

When processing lengthy videos (at least a few minutes) with diffusion, the process slows down progressively (probably taking days instead of hours). This may be due to the big optflow data files (in my case ~5 GB); might it be better to save them per frame?
