The Learnable Typewriter
A Generative Approach to Text Analysis

Official PyTorch implementation of The Learnable Typewriter: A Generative Approach to Text Analysis.
Authors: Yannis Siglidis, Nicolas Gonthier, Julien Gaubil, Tom Monnier, Mathieu Aubry.
Research Institute: Imagine, LIGM, Ecole des Ponts, Univ Gustave Eiffel, CNRS, Marne-la-Vallée, France

Install 🌱

conda create --name ltw pytorch==1.9.1 torchvision==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge
conda activate ltw
python -m pip install -r requirements.txt

Datasets ☀️ Models 🔨

Download & extract datasets.zip and runs.zip in the parent folder.

Inference

For minimal inference and plotting we provide a standalone notebook (Open in Colab).

To reproduce the figures of the paper run the scripts/figures.ipynb notebook.

Helper scripts are also provided to perform evaluation on the corresponding datasets:

python scripts/eval.py -i <MODEL-PATH> {--eval, --eval_best}

and produce figures and sprites for certain samples:

python scripts/eval.py -i <MODEL-PATH> -s {train, val, test} -id 0 0 0 -is 1 2 3 --plot_sprites

Training 🌼

Training and model configuration are performed through hydra. We supply the corresponding config files for all our baseline experiments.

Google 📰

python scripts/train.py supervised-google.yaml
python scripts/train.py unsupervised-google.yaml

Copiale 📜

python scripts/train.py supervised-copiale.yaml
python scripts/train.py unsupervised-copiale.yaml

Fontenay ⛪

python scripts/train.py supervised-fontenay.yaml

and finetune with:

python scripts/fontenay.py -i fontenay/fontenay/<MODEL_NAME> -o fontenay/fontenay-ft/ --max_epochs 150 -k "training.optimizer.lr=0.001"

Additional command-line overrides, written in hydra syntax, can be applied to any of the experiment config files above to modify them further.
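To make the override idea concrete, here is a minimal pure-Python sketch of how a dotted `key=value` override such as `training.optimizer.lr=0.001` maps onto a nested config. It is a simplification for illustration only: hydra uses its own override grammar, and the helper names `parse_value` and `apply_override`, as well as the starting value `0.01`, are hypothetical and not part of this repository.

```python
from copy import deepcopy


def parse_value(raw: str):
    """Best-effort conversion of an override value (int, float, bool, else str)."""
    for cast in (int, float):
        try:
            return cast(raw)
        except ValueError:
            pass
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    return raw


def apply_override(config: dict, override: str) -> dict:
    """Return a copy of `config` with one dotted 'a.b.c=value' override applied."""
    key, _, raw = override.partition("=")
    result = deepcopy(config)  # leave the caller's config untouched
    node = result
    *parents, leaf = key.split(".")
    for name in parents:
        # Walk down the nesting, creating intermediate levels if missing.
        node = node.setdefault(name, {})
    node[leaf] = parse_value(raw)
    return result


# Hypothetical starting config; the real config files define their own values.
base = {"training": {"optimizer": {"lr": 0.01}}}
updated = apply_override(base, "training.optimizer.lr=0.001")
```

Only the leaf named by the dotted path changes; every other key keeps the value from the config file.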

Custom Dataset 💾

Trying the LT on a new dataset is dead easy.

First create a config file:

configs/<DATASET_ID>.yaml

...

DATASET-TAG:
  path: <DATASET-NAME>/
  sep: ''                    # How the character separator is denoted in the annotation. 
  space: ' '                 # How the space is denoted in the annotation.
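As a rough illustration of what `sep` and `space` could mean for a label, the sketch below splits an annotation string into symbols. This is one plausible reading of the two comments above, not the repository's actual parsing code; the function name `split_label` is hypothetical.

```python
from typing import List


def split_label(label: str, sep: str = "", space: str = " ") -> List[str]:
    """Split an annotation label into symbols.

    Assumption: an empty `sep` means every character is its own symbol;
    otherwise symbols are delimited by `sep`. Any symbol equal to `space`
    is treated as a blank between words.
    """
    symbols = list(label) if sep == "" else label.split(sep)
    # Normalise the dataset-specific space marker to a plain space.
    return [" " if symbol == space else symbol for symbol in symbols]


# Default config (sep: '', space: ' '): one symbol per character.
default_symbols = split_label("a cat")

# A hypothetical annotation style using '|' as separator and '_' as space.
custom_symbols = split_label("a|_|c|a|t", sep="|", space="_")
```

Under this reading, both calls yield the same symbol sequence, which is why the two fields only need to describe how the annotation was written.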

Then create the dataset folder:

datasets/<DATASET-NAME>
├── annotation.json
└── images
  ├── <image_id>.jpg
  └── ...

The annotation.json file should be a dictionary with entries of the form:

    "<image_id>": {
        "split": "train",                            # {"train", "val", "test"} - "val" is ignored in the unsupervised case.
        "label": "A beautiful calico cat."           # The text that corresponds to this line.
    },

You can completely ignore the annotation.json file in the case of unsupervised training without evaluation.
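A small script along the following lines could generate `annotation.json` in the format described above. It is a sketch under stated assumptions: the helper names `build_annotation` and `write_annotation` and the image id `line_0001` are hypothetical, and the only structure it relies on is the `split`/`label` entry layout shown in this section.

```python
import json
from pathlib import Path

VALID_SPLITS = {"train", "val", "test"}


def build_annotation(lines):
    """Build the annotation dictionary from (image_id, split, label) tuples."""
    annotation = {}
    for image_id, split, label in lines:
        if split not in VALID_SPLITS:
            raise ValueError(f"{image_id}: unknown split {split!r}")
        if image_id in annotation:
            raise ValueError(f"duplicate image id {image_id!r}")
        annotation[image_id] = {"split": split, "label": label}
    return annotation


def write_annotation(annotation, dataset_dir):
    """Write annotation.json inside the dataset folder and return its path."""
    dataset_dir = Path(dataset_dir)
    dataset_dir.mkdir(parents=True, exist_ok=True)
    path = dataset_dir / "annotation.json"
    with path.open("w", encoding="utf-8") as handle:
        # ensure_ascii=False keeps non-ASCII characters readable in the file.
        json.dump(annotation, handle, ensure_ascii=False, indent=4)
    return path


# Example entry; "line_0001" would correspond to images/line_0001.jpg.
example = build_annotation([("line_0001", "train", "A beautiful calico cat.")])
```

Calling `write_annotation(example, "datasets/<DATASET-NAME>")` with your own folder name then places the file next to the `images` directory.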

Logging 📉

Logging is done through tensorboard. To visualize results run:

tensorboard --logdir ./<run_dir>/

If you want to dive in deeper, check out our experimental features.

Citing 💫

@misc{the-learnable-typewriter,
	title = {The Learnable Typewriter: A Generative Approach to Text Line Analysis},
	author = {Siglidis, Ioannis and Gonthier, Nicolas and Gaubil, Julien and Monnier, Tom and Aubry, Mathieu},
	publisher = {arXiv},
	year = {2023},
	url = {https://arxiv.org/abs/2302.01660},
	keywords = {Computer Vision and Pattern Recognition (cs.CV), FOS: Computer and information sciences, FOS: Computer and information sciences},
	doi = {10.48550/ARXIV.2302.01660},
	copyright = {Creative Commons Attribution 4.0 International}
}

Also check out 🌈

If you like this project, also have a look at the related work produced by our team.

Acknowledgements ✨

We would like to thank Malamatenia Vlachou and Dominique Stutzmann for sharing ideas, insights and data for applying our method in paleography; Vickie Ye and Dmitriy Smirnov for useful insights and discussions; Romain Loiseau, Mathis Petrovich, Elliot Vincent, Sonat Baltacı for manuscript feedback and constructive insights. This work was partly supported by the European Research Council (ERC project DISCOVER, number 101076028), ANR project EnHerit ANR-17-CE23-0008, ANR project VHS ANR-21-CE38-0008 and HPC resources from GENCI-IDRIS (2022-AD011012780R1, AD011012905).
