TIGeR

Unified Text-to-Image Generation and Retrieval

Leigang Qu, Haochuan Li, Tan Wang*, Wenjie Wang*, Yongqi Li, Liqiang Nie, and Tat-Seng Chua

(* Corresponding Authors)

National University of Singapore, Nanyang Technological University, Hong Kong Polytechnic University, Harbin Institute of Technology (Shenzhen)

Introduction

This repository contains the code and links for the Unified Text-to-Image Generation and Retrieval (TIGeR) work. We demonstrate the potential of the intrinsic discriminative abilities of current Multimodal Large Language Models (MLLMs) and propose a training-free method to unify text-to-image generation and retrieval. We also build a benchmark, TIGeR-Bench, to comprehensively evaluate unified performance across recent MLLMs.

Framework

Overview of the framework to unify text-to-image generation and retrieval. Images from the database are first tokenized into discrete codes, and a lookup table maintains the correspondence between discrete codes and images. The given prompt X is first fed into an MLLM, and Forward Beam Search is performed to retrieve and generate images in parallel. The prompt and the obtained images are then fed into the same MLLM for Reverse Re-Ranking and Decision-making.
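The retrieval side of this overview can be illustrated with a minimal sketch. It is not the repository's implementation. It assumes every database image has already been tokenized into a fixed-length tuple of integer codes, and it replaces the MLLM with a toy scoring function. The names `build_lookup`, `build_prefixes`, `constrained_beam_search`, and `toy_logprob` are hypothetical, and generation, Reverse Re-Ranking, and Decision-making are not covered.

```python
from typing import Callable, Dict, List, Sequence, Set, Tuple

Code = Tuple[int, ...]


def build_lookup(db: Dict[str, Sequence[int]]) -> Dict[Code, str]:
    """Lookup table: discrete code sequence -> image id."""
    return {tuple(codes): image_id for image_id, codes in db.items()}


def build_prefixes(lookup: Dict[Code, str]) -> Set[Code]:
    """Every prefix of every stored code sequence (used to constrain search)."""
    prefixes: Set[Code] = set()
    for codes in lookup:
        for i in range(len(codes) + 1):
            prefixes.add(codes[:i])
    return prefixes


def constrained_beam_search(
    step_logprob: Callable[[Code, int], float],
    lookup: Dict[Code, str],
    beam_size: int,
) -> List[Tuple[str, float]]:
    """Beam search restricted to code sequences that exist in the database.

    `step_logprob(prefix, token)` stands in for the model's next-token
    log-probability given the prompt; here it is an assumed callable.
    """
    prefixes = build_prefixes(lookup)
    length = len(next(iter(lookup)))  # assumption: all sequences share a length
    vocab = sorted({tok for codes in lookup for tok in codes})
    beams: List[Tuple[Code, float]] = [((), 0.0)]
    for _ in range(length):
        candidates: List[Tuple[Code, float]] = []
        for prefix, score in beams:
            for tok in vocab:
                nxt = prefix + (tok,)
                if nxt in prefixes:  # keep only prefixes of real images
                    candidates.append((nxt, score + step_logprob(prefix, tok)))
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:beam_size]
    # Completed sequences are guaranteed to be keys of the lookup table.
    return [(lookup[codes], score) for codes, score in beams]


# Toy database: image id -> discrete codes (made-up values).
db = {"cat.jpg": [1, 2, 3], "dog.jpg": [1, 2, 4], "car.jpg": [5, 6, 7]}
lookup = build_lookup(db)

# Toy scorer: prefers tokens close to a fixed target sequence.
target = (1, 2, 4)


def toy_logprob(prefix: Code, tok: int) -> float:
    return -float(abs(tok - target[len(prefix)]))


results = constrained_beam_search(toy_logprob, lookup, beam_size=2)
print(results)
```

Restricting each step to prefixes that occur in the database means every finished beam maps back to a stored image through the lookup table. Dropping that restriction would let the same search produce code sequences for new images instead.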

Release

Release the evaluation code for the unified task and the retrieval task.
Release the inference code for unified text-to-image generation and retrieval.
2024-7-6 Release TIGeR-Bench on Hugging Face.
2024-6-9 Release the paper of TIGeR on arXiv.

Acknowledgement

We thank the authors of SEED-LLaMA and LaVIT for making their code available.

If you find our work useful in your research, please consider citing TIGeR:

@article{qu2024unified,
  title={Unified Text-to-Image Generation and Retrieval},
  author={Qu, Leigang and Li, Haochuan and Wang, Tan and Wang, Wenjie and Li, Yongqi and Nie, Liqiang and Chua, Tat-Seng},
  journal={arXiv preprint arXiv:2406.05814},
  year={2024}
}
