Word-As-Image for Semantic Typography

A few examples of our Word-As-Image illustrations in various fonts and for different textual concepts. The semantically adjusted letters are created fully automatically by our method and can then be used for further creative design, as we illustrate here.

A word-as-image is a semantic typography technique where a word illustration presents a visualization of the meaning of the word, while also preserving its readability. We present a method to create word-as-image illustrations automatically. This task is highly challenging as it requires semantic understanding of the word and a creative idea of where and how to depict these semantics in a visually pleasing and legible manner. We rely on the remarkable ability of recent large pretrained language-vision models to distill textual concepts visually. We target simple, concise, black-and-white designs that convey the semantics clearly. We deliberately do not change the color or texture of the letters and do not use embellishments. Our method optimizes the outline of each letter to convey the desired concept, guided by a pretrained Stable Diffusion model. We incorporate additional loss terms to ensure the legibility of the text and the preservation of the style of the font. We show high-quality and engaging results on numerous examples and compare to alternative techniques.

Description

Official implementation of the Word-As-Image for Semantic Typography paper.

Setup

  1. Clone the repo:
git clone https://github.com/WordAsImage/Word-As-Image.git
cd Word-As-Image
  2. Create a new conda environment and install the libraries:
conda create --name word python=3.8.15
conda activate word
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 --extra-index-url https://download.pytorch.org/whl/cu113
conda install -y numpy scikit-image
conda install -y -c anaconda cmake
conda install -y -c conda-forge ffmpeg
pip install svgwrite svgpathtools cssutils numba torch-tools scikit-fmm easydict visdom freetype-py shapely
pip install opencv-python==4.5.4.60
pip install kornia==0.6.8
pip install wandb
  3. Install diffusers:
pip install diffusers==0.8
pip install transformers scipy ftfy accelerate
  4. Install diffvg:
git clone https://github.com/BachiLi/diffvg.git
cd diffvg
git submodule update --init --recursive
python setup.py install
  5. Paste your Hugging Face access token for Stable Diffusion in the TOKEN file.
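Before the first run, it can help to confirm that the environment from the steps above is complete. The following is a minimal sketch, not part of the repository: the helper names `missing_modules` and `token_is_set` are hypothetical, the module list is an assumed subset of the import names of the packages installed above (for example, diffvg is imported as `pydiffvg` and opencv-python as `cv2`), and the TOKEN file is assumed to sit in the directory the script is run from.

```python
import importlib.util
from pathlib import Path

# Assumed import names for a subset of the packages installed in the setup steps.
REQUIRED_MODULES = [
    "torch", "torchvision", "diffusers", "transformers",
    "pydiffvg", "kornia", "cv2", "wandb",
]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def token_is_set(token_path):
    """Return True if the token file exists and holds non-whitespace text."""
    path = Path(token_path)
    return path.is_file() and path.read_text().strip() != ""


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    print("missing modules:", ", ".join(missing) or "none")
    # Assumption: the TOKEN file lives in the current working directory.
    print("TOKEN file filled in:", token_is_set("TOKEN"))
```

Run it from the repository root with the `word` environment active. A missing module usually points at the install step that needs to be repeated.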

Run Experiments

conda activate word
cd Word-As-Image

# Modify the parameters in run_word_as_image.sh as needed, then run:
bash run_word_as_image.sh

# Or call main.py directly:
python code/main.py --experiment <experiment> --semantic_concept <concept> --optimized_letter <letter> --seed <seed> --font <font_name> --use_wandb <0/1> --wandb_user <user name>
  • --semantic_concept : the semantic concept to insert
  • --optimized_letter : one letter in the word to optimize
  • --font : font name, the .ttf file should be located in code/data/fonts/

Optional arguments:

  • --word : The text to work on, default: the semantic concept
  • --config : Path to config file, default: code/config/base.yaml
  • --experiment : You can specify any experiment in the config file, default: conformal_0.5_dist_pixel_100_kernel201
  • --log_dir : Default: output folder
  • --prompt_suffix : Default: "minimal flat 2d vector. lineal color. trending on artstation"

Examples

python code/main.py --semantic_concept "BUNNY" --optimized_letter "Y" --font "KaushanScript-Regular" --seed 0

python code/main.py --semantic_concept "LEAVES" --word "NATURE" --optimized_letter "T" --font "HobeauxRococeaux-Sherman" --seed 0

  • Note that the arguments are case-sensitive. Both uppercase and lowercase letters are supported; the output follows the case of the input letters.
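Because each run optimizes a single letter, trying every letter of a word means issuing one command per letter. The sketch below only builds those command lines from the flags documented above, so they can be inspected before anything is executed. The helper names `build_command` and `commands_for_word` are hypothetical and not part of the repository.

```python
import shlex


def build_command(concept, letter, font, seed=0, word=None):
    """Build one main.py invocation using only the flags documented above."""
    cmd = [
        "python", "code/main.py",
        "--semantic_concept", concept,
        "--optimized_letter", letter,
        "--font", font,
        "--seed", str(seed),
    ]
    if word is not None:
        # --word defaults to the semantic concept when omitted.
        cmd += ["--word", word]
    return cmd


def commands_for_word(concept, font, seeds=(0,), word=None):
    """One command per distinct letter of the text and per seed."""
    text = word if word is not None else concept
    # Keep each letter once, in order of first appearance; skip whitespace.
    letters = list(dict.fromkeys(ch for ch in text if not ch.isspace()))
    return [
        build_command(concept, letter, font, seed, word)
        for letter in letters
        for seed in seeds
    ]


if __name__ == "__main__":
    for cmd in commands_for_word("BUNNY", "KaushanScript-Regular", seeds=(0, 1)):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, or the lists can be passed to `subprocess.run` one at a time. Every command is a full optimization run, so it is worth starting with a small set of letters and seeds.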

Tips

If the outcome does not meet your quality expectations, you could try the following options:

  1. Adjusting the weight α of the L_acap loss, which preserves the letter's structure after deformation.
  2. Modifying the σ parameter of the low-pass filter used in the L_tone loss, which limits the degree of deviation from the original letter.
  3. Changing the number of control points, as this can influence the outputs.
  4. Experimenting with different seeds, as each may produce slightly different results.
  5. Changing the font type, as this can also result in various outputs.
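To build intuition for the σ parameter in tip 2, the toy sketch below compares two small rasters after Gaussian low-pass filtering. It is an illustration only and does not reproduce the repository's loss: it assumes the penalty is a mean squared difference between the blurred original and the blurred modified raster, it uses plain Python lists to stay dependency-free, and all function names are hypothetical.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian weights, truncated at about 3 * sigma."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(image, kernel):
    """Blur every row with the kernel; pixels outside the image count as zero."""
    radius = len(kernel) // 2
    width = len(image[0])
    return [
        [
            sum(kernel[j + radius] * row[x + j]
                for j in range(-radius, radius + 1)
                if 0 <= x + j < width)
            for x in range(width)
        ]
        for row in image
    ]


def transpose(image):
    return [list(col) for col in zip(*image)]


def low_pass(image, sigma):
    """Separable Gaussian blur: rows first, then columns."""
    kernel = gaussian_kernel(sigma)
    return transpose(blur_rows(transpose(blur_rows(image, kernel)), kernel))


def tone_penalty(original, current, sigma):
    """Mean squared difference between the two low-pass filtered rasters."""
    a = low_pass(original, sigma)
    b = low_pass(current, sigma)
    count = len(a) * len(a[0])
    return sum((p - q) ** 2
               for row_a, row_b in zip(a, b)
               for p, q in zip(row_a, row_b)) / count


if __name__ == "__main__":
    size = 33
    original = [[0.0] * size for _ in range(size)]
    current = [row[:] for row in original]
    current[16][16] = 1.0  # one small local change
    for sigma in (1.0, 4.0):
        print(sigma, tone_penalty(original, current, sigma))
```

In this toy setting, the same local change is penalized less as σ grows, because a wider filter averages fine detail away. That is one way to read why σ influences how far the letter can drift from its original shape.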

Acknowledgement

Our implementation is based on the Stable Diffusion text-to-image model from Hugging Face's Diffusers library, combined with Diffvg. The framework is built on LIVE.

Licence

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
