fakeglyph's Introduction

fakeglyph

[Image: samples.gif]

This repository lets you train generative models that try to produce letters.

Setup

  • Create a Python environment with Python 3.10 or later.
  • Clone the repository with git clone https://github.com/kumatheworld/fakeglyph.git.
  • Install the required packages with pip install -r requirements.txt.
  • Set the device. Change the device field of configs/train.yaml to cuda, cpu, or whichever device you want to use.
  • Define the letter set and font file used to create the dataset. Edit configs/data/cjk.yaml and fakeglyph/data/charset.py accordingly. The default dataset consists of about 20K Chinese characters, but you can define your own letter set with any font.
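
As a rough illustration of defining a custom letter set, the sketch below builds a list of characters from inclusive Unicode code point ranges. The helper name chars_in_ranges and the ranges chosen here are assumptions made for illustration; they are not taken from fakeglyph/data/charset.py, whose actual interface may differ.

```python
from collections.abc import Iterable


def chars_in_ranges(ranges: Iterable[tuple[int, int]]) -> list[str]:
    """Return the characters whose code points fall in the given inclusive ranges.

    Hypothetical helper for illustration; not part of fakeglyph.
    """
    return [chr(cp) for first, last in ranges for cp in range(first, last + 1)]


# The CJK Unified Ideographs block spans U+4E00 to U+9FFF.
CJK_UNIFIED = chars_in_ranges([(0x4E00, 0x9FFF)])

# A much smaller letter set: uppercase and lowercase ASCII letters.
LATIN = chars_in_ranges([(0x41, 0x5A), (0x61, 0x7A)])
```

A font may not cover every code point in a range, so it is worth checking the rendered glyphs before training.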

Train models

The first run takes some time because it generates the dataset, which is then cached under datasets/. You can change the model architecture by editing the config files under configs/model/.

Train (β-)VAE

Run python main.py train model=vae to train the β-VAE. To train the vanilla VAE instead, run python main.py train model=vae model.beta=1 model.reduction=batchmean.
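
For context, a β-VAE objective adds a KL term weighted by β to a reconstruction term, and β=1 recovers the vanilla VAE. The sketch below is a generic PyTorch illustration, not the repository's implementation: the function name beta_vae_loss, the binary cross-entropy reconstruction loss, and the per-sample averaging are all assumptions, and how model.reduction is actually applied in fakeglyph may differ.

```python
import torch
import torch.nn.functional as F


def beta_vae_loss(
    recon_logits: torch.Tensor,  # decoder output before the sigmoid, shape (N, ...)
    target: torch.Tensor,        # images scaled to [0, 1], same shape as recon_logits
    mu: torch.Tensor,            # posterior mean, shape (N, D)
    logvar: torch.Tensor,        # posterior log-variance, shape (N, D)
    beta: float = 1.0,
) -> torch.Tensor:
    """Generic beta-VAE loss: reconstruction + beta * KL, averaged over the batch."""
    n = target.shape[0]
    # Sum over all pixels, then divide by the batch size (a "batchmean"-style reduction).
    recon = F.binary_cross_entropy_with_logits(recon_logits, target, reduction="sum") / n
    # Closed-form KL(N(mu, sigma^2) || N(0, I)) for a diagonal Gaussian posterior.
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp()) / n
    return recon + beta * kl
```

Larger values of beta weight the KL term more heavily relative to reconstruction.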

Train GAN

Run python main.py train model=gan.

fakeglyph's People

Contributors

kumatheworld

fakeglyph's Issues

Stabilize GAN training

The current GAN performance looks catastrophic. I tried some bigger models but that didn't work.

[Image: samples]

A regularization method such as R1 regularization may help; I find it easy to implement and worth trying.
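
R1 regularization penalizes the squared norm of the discriminator's gradient with respect to real inputs. A minimal PyTorch sketch follows; the function name r1_penalty and the default gamma=10.0 are assumptions for illustration, and nothing here is taken from the repository's training loop.

```python
import torch
from torch import nn


def r1_penalty(discriminator: nn.Module, real: torch.Tensor, gamma: float = 10.0) -> torch.Tensor:
    """R1 gradient penalty: (gamma / 2) * E[ ||grad_x D(x)||^2 ] on real samples."""
    # Track gradients with respect to the real inputs themselves.
    real = real.detach().requires_grad_(True)
    scores = discriminator(real)
    # Gradient of the summed scores w.r.t. the inputs; create_graph=True keeps
    # the penalty differentiable w.r.t. the discriminator parameters.
    (grad,) = torch.autograd.grad(scores.sum(), real, create_graph=True)
    # Squared L2 norm per sample, averaged over the batch.
    return 0.5 * gamma * grad.flatten(1).pow(2).sum(dim=1).mean()
```

The returned value would be added to the discriminator loss on real batches before calling backward(); the weight gamma usually needs tuning.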

Unify View and Interpolate in fakeglyph/model/units.py

You can define a module that looks something like the following.

from typing import Callable
from torch import nn

class FunctionalModule(nn.Module):
    def __init__(self, func: Callable) -> None:
        super().__init__()
        self.forward = func  # calling the module now runs func

Then View and Interpolate can become class methods of this module rather than separate classes. It would also be nice to support type hints and override __repr__() so that it describes the wrapped function clearly.
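
One way this could look is sketched below. The behavior assumed for View (reshape everything except the batch dimension) and Interpolate (a thin wrapper around torch.nn.functional.interpolate) is a guess, since the existing classes in fakeglyph/model/units.py are not shown here. The sketch also uses extra_repr(), which nn.Module's default __repr__() already includes, instead of overriding __repr__() directly.

```python
from typing import Callable

import torch
import torch.nn.functional as F
from torch import nn


class FunctionalModule(nn.Module):
    """Wrap a tensor function as a module, with a readable repr."""

    def __init__(self, func: Callable[[torch.Tensor], torch.Tensor], description: str) -> None:
        super().__init__()
        self.func = func
        self.description = description

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.func(x)

    def extra_repr(self) -> str:
        # Shown inside the parentheses of the module's repr.
        return self.description

    @classmethod
    def view(cls, *shape: int) -> "FunctionalModule":
        # Assumed semantics: reshape everything except the batch dimension.
        return cls(lambda x: x.view(x.shape[0], *shape), f"view(N, {', '.join(map(str, shape))})")

    @classmethod
    def interpolate(cls, scale_factor: float, mode: str = "nearest") -> "FunctionalModule":
        return cls(
            lambda x: F.interpolate(x, scale_factor=scale_factor, mode=mode),
            f"interpolate(scale_factor={scale_factor}, mode={mode!r})",
        )
```

For example, FunctionalModule.view(1, 8, 8) maps a tensor of shape (N, 64) to (N, 1, 8, 8). One trade-off of wrapping lambdas is that a whole module containing them cannot be pickled, although saving the state_dict is unaffected.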

Try diffusion models

Since the results from the current VAEs and GANs don't look satisfactory, you might want to try diffusion models. Hugging Face's diffusers library would be a good starting point.

Validate datasets

The current dataset creation pipeline doesn't validate the images and might silently generate trivial images. Below is what they look like.

[Image: samples]

I got this by using fakeglyph.data.charset.cjk_extension, which isn't supported by the font /System/Library/Fonts/PingFang.ttc. PIL.ImageFont.FreeTypeFont.getbbox() doesn't raise an error in this case, but you could check its return value instead. It ends up being something like (left, top, right, bottom) = (0, 68, 64, 68) when size=64, that is, a box of zero height.
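
A check along these lines could be sketched as follows. The helper name is_blank_bbox is hypothetical, and the usage shown in the comments assumes Pillow with placeholder names font_path and charset; it is not the repository's pipeline.

```python
def is_blank_bbox(bbox: tuple[float, float, float, float]) -> bool:
    """Return True if a (left, top, right, bottom) box has zero width or height."""
    left, top, right, bottom = bbox
    return right <= left or bottom <= top


# Usage sketch (requires Pillow; font_path and charset are placeholders):
#   from PIL import ImageFont
#   font = ImageFont.truetype(font_path, size=64)
#   supported = [c for c in charset if not is_blank_bbox(font.getbbox(c))]
```

This only catches the zero-area case described above. Some fonts draw a visible placeholder glyph for unsupported characters, which would need a different check.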
