Code Monkey home page Code Monkey logo

ip-adapter's Introduction

IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models


Introduction

we present IP-Adapter, an effective and lightweight adapter to achieve image prompt capability for the pretrained text-to-image diffusion models. An IP-Adapter with only 22M parameters can achieve comparable or even better performance to a fine-tuned image prompt model. IP-Adapter can be generalized not only to other custom models fine-tuned from the same base model, but also to controllable generation using existing controllable tools. Moreover, the image prompt can also work well with the text prompt to accomplish multimodal image generation.

arch

Release

  • [2023/8/18] ๐Ÿ”ฅ Add code and models for SDXL 1.0. Demo is here.
  • [2023/8/16] ๐Ÿ”ฅ We release the code and models.

Dependencies

  • diffusers >= 0.19.3

Download Models

you can download models from here. To run the demo, you should also download the following models:

How to Use

  • ip_adapter_demo: image variations, image-to-image, and inpainting with image prompt.

image variations

image-to-image

inpainting

structural_cond

multi_prompts

Best Practice

  • If you only use the image prompt, you can set the scale=1.0 and text_prompt=""(or some generic text prompts, e.g. "best quality", you can also use any negative text prompt). If you lower the scale, more diverse images can be generated, but they may not be as consistent with the image prompt.
  • For multimodal prompts, you can adjust the scale to get best results. In most cases, setting scale=0.5 can get good results. For the version of SD 1.5, we recommend using community models to generate good images.

Citation

If you find IP-Adapter useful for your your research and applications, please cite using this BibTeX:

@article{ye2023ip-adapter,
  title={IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models},
  author={Ye, Hu and Zhang, Jun and Liu, Sibo and Han, Xiao and Yang, Wei},
  booktitle={arXiv preprint arxiv:2308.06721},
  year={2023}
}

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.