MaxDiffusion

Overview

MaxDiffusion is a collection of reference implementations of latent diffusion models written in pure Python/JAX that run on XLA devices, including Cloud TPUs and GPUs. MaxDiffusion aims to be a starting point for ambitious diffusion projects in both research and production. We encourage you to start by experimenting with MaxDiffusion out of the box, then fork and modify it to meet your needs.

The goal of this project is to provide reference implementations for latent diffusion models that help developers get started with training, tuning, and serving solutions on XLA devices, including Cloud TPUs and GPUs. We started with Stable Diffusion inference on TPUs and welcome code contributions that extend the project.

MaxDiffusion supports:

  • Stable Diffusion 2 base (training and inference)
  • Stable Diffusion 2.1 (training and inference)
  • Stable Diffusion XL (training and inference)
  • Stable Diffusion Lightning (inference)
  • ControlNet inference (Stable Diffusion 1.4 and SDXL)

WARNING: The training code is purely experimental and is under development.

Getting Started

We recommend starting with a single TPU host and then moving to multihost.

Minimum requirements: Ubuntu 22.04, Python 3.10, and TensorFlow >= 2.12.0.

Getting Started: Local Development for single host

Local development is a convenient way to run MaxDiffusion on a single host.

  1. Create a single-host TPU (v4-8) and SSH into it.
  2. Clone MaxDiffusion in your TPU VM.
  3. Within the root directory of the MaxDiffusion git repo, install dependencies by running:
     pip3 install "jax[tpu]" -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
     pip3 install -r requirements.txt
     pip3 install .
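
Once the install finishes, a quick check that JAX can see the accelerator may save debugging time later. The sketch below is an illustration, not part of MaxDiffusion: `summarize_devices` and `check_jax_backend` are hypothetical helper names, and the only JAX API used is `jax.devices()` together with each device's `platform` attribute.

```python
from collections import Counter


def summarize_devices(devices):
    """Count devices per platform string (for example "tpu", "gpu" or "cpu")."""
    return dict(Counter(device.platform for device in devices))


def check_jax_backend():
    """Return per-platform device counts, or None when JAX is not importable."""
    try:
        import jax  # imported lazily so the helper also loads without JAX
    except ImportError:
        return None
    return summarize_devices(jax.devices())


# Typical use on the TPU VM (hypothetical helpers, not a MaxDiffusion API):
#   counts = check_jax_backend()
#   print(counts)
```

On a working TPU VM the result should contain a `tpu` entry. A result that lists only `cpu` usually suggests that the TPU build of JAX was not picked up.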

Training

After installation completes, run the training script.

  • Stable Diffusion XL

    export LIBTPU_INIT_ARGS=""
    python -m src.maxdiffusion.train_sdxl src/maxdiffusion/configs/base_xl.yml run_name="my_xl_run" base_output_directory="gs://your-bucket/" per_device_batch_size=1

    To generate images with a trained checkpoint, run:

    python -m src.maxdiffusion.generate src/maxdiffusion/configs/base_xl.yml run_name="my_run" pretrained_model_name_or_path=<your_saved_checkpoint_path> from_pt=False attention=dot_product
  • Stable Diffusion 2 base

    export LIBTPU_INIT_ARGS=""
    python -m src.maxdiffusion.models.train src/maxdiffusion/configs/base_2_base.yml run_name="my_run" base_output_directory="gs://your-bucket/"

    To generate images with a trained checkpoint, run:

    python -m src.maxdiffusion.generate src/maxdiffusion/configs/base_2_base.yml run_name="my_run" pretrained_model_name_or_path=<your_saved_checkpoint_path> from_pt=False attention=dot_product
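
The commands above all follow the same shape: a YAML config file followed by `key=value` arguments that override individual settings. The sketch below illustrates that general convention only. It is not MaxDiffusion's config loader, whose behavior (type handling, validation) may differ; `parse_overrides`, `apply_overrides`, and the default values are hypothetical, and rejecting unknown keys is an assumption made for the example.

```python
def parse_overrides(tokens):
    """Turn ["run_name=my_run", ...] into {"run_name": "my_run", ...}."""
    overrides = {}
    for token in tokens:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {token!r}")
        overrides[key] = value  # values stay strings in this sketch
    return overrides


def apply_overrides(defaults, overrides):
    """Return a copy of defaults with overrides applied; unknown keys are rejected."""
    unknown = sorted(set(overrides) - set(defaults))
    if unknown:
        raise KeyError(f"keys not present in the config file: {unknown}")
    merged = dict(defaults)
    merged.update(overrides)
    return merged


# Stand-in for values loaded from a YAML file such as base_xl.yml (made-up defaults).
defaults = {"run_name": "", "base_output_directory": "", "per_device_batch_size": "2"}
cli = ["run_name=my_xl_run", "base_output_directory=gs://your-bucket/", "per_device_batch_size=1"]
config = apply_overrides(defaults, parse_overrides(cli))
```

Rejecting keys that are absent from the file is one way to catch a mistyped option name early instead of silently ignoring it.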

Inference

To generate images, run the following command:

  • Stable Diffusion XL

    Single-host and multihost inference is supported with sharding annotations:

    python -m src.maxdiffusion.generate_sdxl src/maxdiffusion/configs/base_xl.yml run_name="my_run"

    Single host pmap version:

    python -m src.maxdiffusion.generate_sdxl_replicated
  • Stable Diffusion 2 base

    python -m src.maxdiffusion.generate src/maxdiffusion/configs/base_2_base.yml run_name="my_run"
    
  • Stable Diffusion 2.1

    python -m src.maxdiffusion.generate src/maxdiffusion/configs/base21.yml run_name="my_run"

  • SDXL Lightning

    Single-host and multihost inference is supported with sharding annotations:

    python -m src.maxdiffusion.generate_sdxl src/maxdiffusion/configs/base_xl.yml run_name="my_run" lightning_repo="ByteDance/SDXL-Lightning" lightning_ckpt="sdxl_lightning_4step_unet.safetensors"

  • ControlNet

    ControlNet may require extra system libraries for OpenCV:

    apt-get update && apt-get install ffmpeg libsm6 libxext6 -y

    • Stable Diffusion 1.4

      python src/maxdiffusion/controlnet/generate_controlnet_replicated.py
    • Stable Diffusion XL

      python src/maxdiffusion/controlnet/generate_controlnet_sdxl_replicated.py

Getting Started: Multihost development

Multihost training for Stable Diffusion 2 base can be run using the following command:

TPU_NAME=<your-tpu-name>
ZONE=<your-zone>
PROJECT_ID=<your-project-id>
gcloud compute tpus tpu-vm ssh $TPU_NAME --zone=$ZONE --project $PROJECT_ID --worker=all --command="
git clone https://github.com/google/maxdiffusion
cd maxdiffusion
pip3 install jax[tpu] -f https://storage.googleapis.com/jax-releases/libtpu_releases.html
pip3 install -r requirements.txt
pip3 install .
python -m src.maxdiffusion.models.train src/maxdiffusion/configs/base_2_base.yml run_name=my_run base_output_directory=gs://your-bucket/"
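
Because `--worker=all` runs the command on every worker, each host launches its own training process. A rough way to reason about what a setting such as `per_device_batch_size` means in that case is sketched below, under the assumption of plain data parallelism in which every device processes its own slice of the batch. MaxDiffusion's actual batching logic may differ; `global_batch_size` is a hypothetical helper, and the host and device counts are made-up numbers.

```python
def global_batch_size(per_device_batch_size, num_hosts, devices_per_host):
    """Samples consumed per step when each device processes its own slice."""
    if min(per_device_batch_size, num_hosts, devices_per_host) < 1:
        raise ValueError("all arguments must be positive integers")
    return per_device_batch_size * num_hosts * devices_per_host


# Made-up topologies: one host with 4 devices, then 2 hosts with 4 devices each.
single_host = global_batch_size(per_device_batch_size=1, num_hosts=1, devices_per_host=4)
multi_host = global_batch_size(per_device_batch_size=1, num_hosts=2, devices_per_host=4)
print(single_host, multi_host)  # prints "4 8"
```

With JAX installed, `jax.process_count()` and `jax.local_device_count()` report the real host and per-host device counts for a given slice.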

Comparison to Alternatives

MaxDiffusion started as a fork of Diffusers, a Hugging Face diffusion library written in Python, PyTorch, and JAX. MaxDiffusion is compatible with Hugging Face JAX models. MaxDiffusion is more complex and was designed to run distributed across TPU Pods.

Development

Whether you are forking MaxDiffusion for your own needs or intending to contribute back to the community, a full suite of tests can be found in tests and src/maxdiffusion/tests.

To run unit tests and lint, simply run:

python -m pytest
ruff check --fix .

The full suite of end-to-end tests is in tests and src/maxdiffusion/tests. We run them on a nightly cadence.

Contributors

jfacevedo-google, entrpn, zhiyuli-goog, parambole, sshahrokhi
