
comic-text-detector's Introduction

This repository contains scripts for training a text detector based on manga-image-translator. The detector extracts bounding boxes, text lines, and text segmentation masks from manga or comics to support later steps of comic translation such as text removal, recognition, and lettering.

Several excellent projects, such as manga-image-translator, manga_ocr, and SickZil-Machine, offer deep learning models that automate the remaining work. We are working on a computer-aided comic/manga translation software that would (hopefully) put them together.

Download the text detection model from https://github.com/zyddnys/manga-image-translator/releases/tag/beta-0.2.1 or Google Drive.

Examples

AisazuNihaIrarenai-003 (source: manga109, © Yoshi Masako)

AisazuNihaIrarenai-003-mask

AisazuNihaIrarenai-003-bboxes

Training Details

Our current model can be summarized as below.

All models were trained in a weakly supervised manner, due to the lack of available high-quality annotations, on around 13 thousand anime and comic style images: 1/3 from Manga109-s, 1/3 from DCM, and 1/3 synthetic data.

We used the text detection model of manga-image-translator to generate text line annotations for manga, and Manga-Text-Segmentation with some post-processing to generate masks for both manga and comics. Synthetic data were generated using around 4k text-free anime-girl pictures from https://t.me/SugarPic. The text-rendering, Unet, and DBNet training scripts can be found in this repo. The text block detector was trained using the official yolov5 repository.
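Since the text block detector is trained with yolov5, block annotations have to be stored in the YOLOv5 label format: one line per object, `class x_center y_center width height`, with coordinates normalized to the image size. The following is a minimal sketch of that conversion, not the conversion code used in this repo. The function name `to_yolo_label` and the single class id `0` are assumptions made for illustration.

```python
def to_yolo_label(box, image_width, image_height, class_id=0):
    """Convert a pixel box (x1, y1, x2, y2) into one YOLOv5 label line.

    `class_id=0` assumes a single "text block" class (hypothetical).
    """
    x1, y1, x2, y2 = box
    # YOLOv5 expects the box centre and size, normalized to [0, 1].
    x_center = (x1 + x2) / 2 / image_width
    y_center = (y1 + y2) / 2 / image_height
    width = (x2 - x1) / image_width
    height = (y2 - y1) / image_height
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def to_yolo_label_file(boxes, image_width, image_height):
    """Join the label lines for all boxes of one image into file content."""
    return "\n".join(to_yolo_label(b, image_width, image_height) for b in boxes)
```

For example, `to_yolo_label((100, 50, 300, 150), 1000, 500)` gives the line for a 200x100 pixel block in a 1000x500 image. The resulting text would be written to a `.txt` file named after the image.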

We will not share the training sets or fonts publicly (we don't have the right to), and 2/3 of the training set is not very clean anyway. The training is therefore reproducible only if you have enough images and fonts. You can use the models provided in this repo to generate labels for comics/manga, and the comic-style text rendering script to generate synthetic data. Please refer to examples.ipynb for more details.
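To illustrate the idea behind synthetic data, the sketch below renders text onto a text-free image with Pillow and returns the composited image together with a pixel-accurate mask and a bounding box. It is a simplified stand-in and not the comic-style rendering script of this repo. The function name `render_synthetic_sample`, the use of Pillow's default font, and the plain black fill are all assumptions; a realistic pipeline would use real comic fonts, varied layouts, and effects.

```python
from PIL import Image, ImageDraw, ImageFont


def render_synthetic_sample(background, text, position, font=None, fill=(0, 0, 0)):
    """Render `text` on a copy of `background`; return (image, mask, bbox).

    Hypothetical helper: the default font stands in for real comic fonts.
    """
    font = font or ImageFont.load_default()

    # Draw the text in white on a black single-channel canvas: this is the
    # segmentation mask, obtained for free because we control the rendering.
    mask = Image.new("L", background.size, 0)
    ImageDraw.Draw(mask).text(position, text, fill=255, font=font)

    # Bounding box (left, top, right, bottom) of the non-zero mask pixels,
    # or None if nothing was drawn.
    bbox = mask.getbbox()

    # Blend a solid colour layer over the background wherever the mask is set.
    base = background.convert("RGB")
    colour_layer = Image.new("RGB", base.size, fill)
    image = Image.composite(colour_layer, base, mask)
    return image, mask, bbox
```

Calling it with a text-free picture, for example `render_synthetic_sample(Image.open(path), "HELLO", (20, 30))` with your own `path`, yields one training sample whose mask and box are exact by construction, which is what makes synthetic data useful when clean annotations are missing.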

Acknowledgements

comic-text-detector's People

Contributors

dmmaze
