ControlLoRA: A Light Neural Network To Control Stable Diffusion Spatial Information

By combining the ideas of lllyasviel/ControlNet and cloneofsimo/lora, ControlLoRA makes it easy to fine-tune Stable Diffusion to control its spatial information. ControlLoRA is a simple, small network (~7M parameters, ~25M of storage).

ControlNet is large, which makes it hard to send to your friends. With the LoRA idea, we don't even need to transfer the entire Stable Diffusion model. Sharing the 25M ControlLoRA is enough and saves time.
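The size argument above can be sketched with a little arithmetic. This is a minimal illustration, not ControlLoRA's actual layer layout: the layer width, the rank, and the 4-bytes-per-weight (float32) storage format are all assumptions chosen for the example.

```python
def dense_params(d_in: int, d_out: int) -> int:
    """Number of weights in a full d_out x d_in matrix."""
    return d_in * d_out


def low_rank_params(d_in: int, d_out: int, rank: int) -> int:
    """Number of weights in a rank-r factorization: up (d_out x r) times down (r x d_in)."""
    return rank * (d_in + d_out)


def storage_mib(num_params: int, bytes_per_param: int = 4) -> float:
    """Approximate file size in MiB, assuming fixed-width weights (4 bytes = float32)."""
    return num_params * bytes_per_param / 2**20


# Hypothetical layer size and rank, chosen only for illustration.
d_in, d_out, rank = 320, 320, 4
full = dense_params(d_in, d_out)
small = low_rank_params(d_in, d_out, rank)
print(f"full matrix: {full} weights, rank-{rank} update: {small} weights")

# Rough size of a ~7M-parameter network if stored as float32 (an assumption).
print(f"~7M parameters at 4 bytes each: {storage_mib(7_000_000):.1f} MiB")
```

Under these assumptions the low-rank update needs a small fraction of the weights of the full matrix, and a network of about 7M float32 parameters lands in the same range as the ~25M figure quoted above.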

You can try the pretrained models with the Gradio apps in the apps directory. Models for more dataset types, and Gradio apps to support them, are wanted. The annotator directory is borrowed from ControlNet.

You can download some pretrained models from Hugging Face. Note that I only used 100 MPII pictures to train the openpose model, so its results are not good. I suggest you train your own ControlLoRA.

How To Train

Refer to the scripts in the tasks directory. They borrow heavily from the diffusers training code.

You can add or modify config files in the configs directory to customize the ControlLoRA model architecture. To improve the model's results, you can change some blocks to other residual block types from diffusers, and you can increase the number of layers per block by modifying the config files.

Work In Progress

  • More of the task types mentioned in ControlNet.

  • Experiments with mixing LoRA and ControlLoRA.

    We could inject pretrained LoRA models before the ControlLoRA. See mix_lora_and_control_lora.py for more details.

    Example image: portrait of male HighCWu
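The mixing experiment above can be pictured as stacking two low-rank updates on one base weight. The sketch below is only an illustration of that idea with tiny hand-written matrices. It is not the code in mix_lora_and_control_lora.py, and the helper names (`matmul`, `merge_low_rank`) and all values are hypothetical.

```python
def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_low_rank(weight, up, down, scale=1.0):
    """Return weight + scale * (up @ down), the usual LoRA-style update."""
    delta = matmul(up, down)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Hypothetical 2x2 base weight and two rank-1 updates.
base = [[1.0, 0.0], [0.0, 1.0]]
style_up, style_down = [[1.0], [0.0]], [[0.0, 1.0]]      # stands in for a pretrained LoRA
control_up, control_down = [[0.0], [1.0]], [[1.0, 0.0]]  # stands in for the ControlLoRA update

# Inject the pretrained LoRA first, then apply the second update on top of it.
with_style = merge_low_rank(base, style_up, style_down)
mixed = merge_low_rank(with_style, control_up, control_down, scale=0.5)
print(mixed)
```

In this simplified picture both updates are purely additive, so the final weight does not depend on which one is merged first. The injection order matters in practice only for how the layers are wrapped in code.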

ControlLoRA with Canny Edge

sd-diffusiondb-canny-model-control-lora, on 100 openpose pictures, 30k training steps

Stable Diffusion 1.5 + ControlLoRA (using simple Canny edge detection)

python apps/gradio_canny2image.py

Based heavily on the ControlNet code.

The Gradio app also lets you change the Canny edge thresholds. Try it to see their effect.
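To give a feel for what the two Canny thresholds do, here is a minimal sketch of the double-threshold (hysteresis) step only. It assumes the gradient magnitudes are already computed, skips smoothing and non-maximum suppression, and is not the code used by the app. The function name `hysteresis` and the sample grid are hypothetical.

```python
from collections import deque


def hysteresis(magnitude, low, high):
    """Canny-style double thresholding on a grid of gradient magnitudes.

    Pixels >= high are strong edges. Pixels >= low are kept only when they
    connect (8-neighborhood) to a strong edge, directly or through other kept pixels.
    """
    rows, cols = len(magnitude), len(magnitude[0])
    edges = [[False] * cols for _ in range(rows)]
    queue = deque()
    # Seed the search with every strong pixel.
    for r in range(rows):
        for c in range(cols):
            if magnitude[r][c] >= high:
                edges[r][c] = True
                queue.append((r, c))
    # Grow edges outward through neighboring pixels that pass the low threshold.
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (
                    0 <= nr < rows
                    and 0 <= nc < cols
                    and not edges[nr][nc]
                    and magnitude[nr][nc] >= low
                ):
                    edges[nr][nc] = True
                    queue.append((nr, nc))
    return edges


def count_edges(edges):
    """Number of pixels marked as edges."""
    return sum(cell for row in edges for cell in row)


# Hypothetical gradient magnitudes: a vertical line plus one isolated weak pixel.
grid = [
    [0, 120, 0, 0],
    [0, 210, 0, 150],
    [0, 130, 0, 0],
]
print(count_edges(hysteresis(grid, low=100, high=200)))
print(count_edges(hysteresis(grid, low=100, high=140)))
```

Lowering the high threshold turns the isolated pixel into a strong edge of its own, while raising the low threshold would trim the weaker ends of the line. That is roughly the trade-off the sliders expose.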

Prompt: "bird"

Prompt: "cute dog"

ControlLoRA with Human Pose

sd-mpii-pose-model-control-lora, on 100 openpose pictures, 30k training steps

Stable Diffusion 1.5 + ControlLoRA (using human pose)

python apps/gradio_pose2image.py

Based heavily on the ControlNet code.

This model clearly deserves a better UI for manipulating the pose skeleton directly, but Gradio is somewhat difficult to customize. For now, you need to input an image, and Openpose will detect the pose for you.

Note that I only used 100 MPII pictures to train the openpose model, so its results are not good. I suggest you train your own ControlLoRA.

Prompt: "Chief in the kitchen"

Prompt: "An astronaut on the moon"

PS: I don't know why my gallery doesn't show the full images. I have to click one of the outputs to see its full result.

Discuss together

QQ Group: 艾梦的小群

QQ Channel: 艾梦的AI造梦堂

Discord: AI Players - AI Dream Bakery

Citation

@software{wu2023controllora,
    author = {Wu Hecong},
    month = {2},
    title = {{ControlLoRA: A Light Neural Network To Control Stable Diffusion Spatial Information}},
    url = {https://github.com/HighCWu/ControlLoRA},
    version = {1.0.0},
    year = {2023}
}
