attention-interpolation-diffusion's Introduction

PAID: (Prompt-guided) Attention Interpolation of Text-to-Image Diffusion


He Qiyuan¹, Wang Jinghao², Liu Ziwei², Angela Yao¹ ✉
¹ Computer Vision & Machine Learning Group, National University of Singapore
² S-Lab, Nanyang Technological University
✉ Corresponding Author

📌 Release

[03/2024] Code and paper are publicly available.

📑 Abstract

TL;DR: AID (Attention Interpolation via Diffusion) is a training-free method that enables a text-to-image diffusion model to generate interpolations between different conditions with high consistency, smoothness, and fidelity. Its variant, PAID, adds further control over the interpolation via prompt guidance.

▶️ PAID Results

🏍️ Google Colab

Try PAID directly with Stable Diffusion 2.1 or SDXL on Google's free GPU!

🚗 Local Setup using Jupyter Notebook

  1. Clone the repository and install the requirements:
git clone https://github.com/QY-H00/attention-interpolation-diffusion.git
cd attention-interpolation-diffusion
pip install -r requirements.txt
  2. Open play.ipynb or play_sdxl.ipynb and have fun!

🛳️ Local Setup using Gradio

  1. Install Gradio:
pip install gradio
  2. Launch the Gradio interface:
gradio gradio_src/app.py

🎲 Customized Interpolation

Our method exposes several configuration options, so you can adjust the settings freely and obtain a wide range of interpolation results. Here are some examples:

Prompt guidance

1. "A dog driving car"

2. "A car with dog furry texture"

3. "A toy named dog-car"

4. "A painting of car and dog drawn by Vincent van Gogh"

$\alpha$ and $\beta$ of the Beta prior

1. $\alpha=1, \beta=1$

2. $\alpha=1, \beta=8$

3. $\alpha=8, \beta=1$
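To build intuition for the three settings above, here is a minimal sketch that assumes the Beta prior controls where the interpolation coefficients fall in (0, 1). It places `n` coefficients at evenly spaced quantiles of Beta($\alpha$, $\beta$). The function names and the quantile-based placement are illustrative assumptions, not the repository's implementation.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at a point x strictly inside (0, 1)."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def beta_quantile_coefficients(n, a, b, grid=10000):
    """Hypothetical helper: n coefficients at evenly spaced Beta(a, b) quantiles."""
    step = 1.0 / grid
    # Midpoints of the grid cells, so x is never exactly 0 or 1.
    xs = [(i + 0.5) * step for i in range(grid)]

    # Midpoint-rule approximation of the cumulative distribution function.
    cdf, total = [], 0.0
    for x in xs:
        total += beta_pdf(x, a, b) * step
        cdf.append(total)
    cdf = [c / total for c in cdf]  # normalise so the last value is exactly 1

    # Invert the CDF at the targets 1/(n+1), 2/(n+1), ..., n/(n+1).
    coefficients, j = [], 0
    for k in range(1, n + 1):
        target = k / (n + 1)
        while cdf[j] < target:
            j += 1
        coefficients.append(xs[j])
    return coefficients


if __name__ == "__main__":
    for a, b in [(1, 1), (1, 8), (8, 1)]:
        coeffs = beta_quantile_coefficients(3, a, b)
        print(f"alpha={a}, beta={b}:", [round(c, 3) for c in coeffs])
```

Under this reading, $\alpha=1, \beta=1$ spaces the coefficients uniformly, $\alpha=1, \beta=8$ pushes them toward 0 (roughly 0.04, 0.08, 0.16 for three coefficients), and $\alpha=8, \beta=1$ pushes them toward 1. Which endpoint corresponds to which prompt depends on the pipeline's convention.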

📝 Supported Models

| Model Name | Link |
| --- | --- |
| Stable Diffusion 1.4-512 | CompVis/stable-diffusion-v1-4 |
| Stable Diffusion 1.5-512 | runwayml/stable-diffusion-v1-5 |
| Stable Diffusion 2.1-768 | stabilityai/stable-diffusion-2-1 |
| Stable Diffusion XL-1024 | stabilityai/stable-diffusion-xl-base-1.0 |
| Animagine XL 3.1 | cagliostrolab/animagine-xl-3.1 |
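As a rough sketch of how the Hub IDs listed above could be used, the snippet below maps each model name to its ID and picks a stock diffusers pipeline class for it. `SUPPORTED_MODELS`, `pipeline_class_name`, and `load_pipeline` are hypothetical names introduced here. The snippet loads the plain diffusers pipelines, not this repository's interpolation pipeline, and it assumes that IDs containing `-xl-` are SDXL-based checkpoints.

```python
# Hub IDs copied from the model list; the dict itself is a hypothetical helper.
SUPPORTED_MODELS = {
    "Stable Diffusion 1.4-512": "CompVis/stable-diffusion-v1-4",
    "Stable Diffusion 1.5-512": "runwayml/stable-diffusion-v1-5",
    "Stable Diffusion 2.1-768": "stabilityai/stable-diffusion-2-1",
    "Stable Diffusion XL-1024": "stabilityai/stable-diffusion-xl-base-1.0",
    "Animagine XL 3.1": "cagliostrolab/animagine-xl-3.1",
}


def pipeline_class_name(repo_id):
    """Pick a diffusers pipeline class name (assumption: '-xl-' marks SDXL)."""
    if "-xl-" in repo_id.lower():
        return "StableDiffusionXLPipeline"
    return "StableDiffusionPipeline"


def load_pipeline(model_name):
    """Load the stock diffusers pipeline for one of the listed models."""
    # Imported lazily so the lookup above works without diffusers installed.
    import diffusers

    repo_id = SUPPORTED_MODELS[model_name]
    pipeline_cls = getattr(diffusers, pipeline_class_name(repo_id))
    # Downloads the weights from the Hugging Face Hub on first use.
    return pipeline_cls.from_pretrained(repo_id)
```

Calling `load_pipeline("Stable Diffusion 2.1-768")` would download the checkpoint, so run it only where the required disk space and a suitable GPU are available.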

✒️ Citation

If you find this repository or our paper useful, please consider citing:

@misc{he2024aid,
      title={AID: Attention Interpolation of Text-to-Image Diffusion}, 
      author={Qiyuan He and Jinghao Wang and Ziwei Liu and Angela Yao},
      year={2024},
      eprint={2403.17924},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

❤️ Acknowledgement

We thank the following repositories for their great work: diffusers, transformers.

➕️ More Results with SD1.5

Realistic Style

Pikachu -> Gundam

Computer -> Phone

Anime Style

Ninja -> Cat

Ninja -> Dog

Oil-Painting Style

Starry Night -> Mona Lisa

Skyscraper -> Town

attention-interpolation-diffusion's People

Contributors

qy-h00, king159


attention-interpolation-diffusion's Issues

interpolation with real images

Hi,
I really enjoyed reading the PAID paper you recently posted on arXiv.
Thank you for your great work and code.

After reading the paper and the code, I have a question.
Is it possible to do interpolation with real images?
I ask because I tried it using DDIM inversion, but it did not seem to work well.

Thank you.
