Introduction

Vedastr is an open source scene text recognition toolbox based on PyTorch. It is designed to be flexible, supporting rapid implementation and evaluation of scene text recognition tasks.

Features

  • Modular design
    The scene text recognition framework is decomposed into components, so a customized framework can be built by combining different modules.

  • Flexibility
    The components within a module can be changed easily.

  • Module expansibility
    It is easy to integrate a new module into the vedastr project.

  • Support of multiple frameworks
    The toolbox supports several popular scene text recognition frameworks, e.g., CRNN, TPS-ResNet-BiLSTM-Attention, etc.

  • Good performance
    We re-implement the best model from deep-text-recognition-benchmark and obtain a higher average accuracy. We also implement a simple baseline (ResNet-FC) whose performance is acceptable.
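The modular design described above can be illustrated with a small registry pattern. This is a minimal sketch in plain Python, not vedastr's actual API: the names `register`, `build`, `build_pipeline`, `Scale` and `Offset` are hypothetical, and the toy stages stand in for real components such as a feature extractor or a sequence model.

```python
from typing import Any, Callable, Dict, List

Stage = Callable[[Any], Any]

# Hypothetical registry: maps a string name to a factory that builds a stage.
_REGISTRY: Dict[str, Callable[..., Stage]] = {}


def register(name: str):
    """Decorator that records a stage factory under a string name."""
    def wrap(factory: Callable[..., Stage]) -> Callable[..., Stage]:
        _REGISTRY[name] = factory
        return factory
    return wrap


def build(cfg: Dict[str, Any]) -> Stage:
    """Instantiate one stage from a config dict such as {'type': 'Scale', 'factor': 2}."""
    params = dict(cfg)  # copy so the caller's config is left untouched
    factory = _REGISTRY[params.pop("type")]
    return factory(**params)


def build_pipeline(cfgs: List[Dict[str, Any]]) -> Stage:
    """Chain the configured stages so each one feeds the next."""
    stages = [build(c) for c in cfgs]

    def run(x: Any) -> Any:
        for stage in stages:
            x = stage(x)
        return x

    return run


# Toy components standing in for real modules (illustration only).
@register("Scale")
def make_scale(factor: float) -> Stage:
    return lambda xs: [v * factor for v in xs]


@register("Offset")
def make_offset(delta: float) -> Stage:
    return lambda xs: [v + delta for v in xs]


pipeline = build_pipeline([
    {"type": "Scale", "factor": 2},
    {"type": "Offset", "delta": 1},
])
print(pipeline([1, 2, 3]))  # [3, 5, 7]
```

With this kind of layout, swapping a component means changing one entry in the config list, and adding a new module means registering one more factory.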

License

This project is released under the Apache 2.0 license.

Benchmark and model zoo

Note:

MODEL                         CASE SENSITIVE   IIIT5k_3000   SVT     IC03_867   IC13_1015   IC15_2077   SVTP    CUTE80   AVERAGE
TPS-ResNet-BiLSTM-Attention   False            87.33         87.79   95.04      92.61       74.45       81.09   74.91    84.95
ResNet-FC                     False            85.03         86.4    94         91.03       70.29       77.67   71.43    82.38

AVERAGE : Average accuracy over all test datasets
TPS : Thin-plate spline (TPS) spatial transformer network
CASE SENSITIVE : If true, the output is case sensitive and contains common characters. If false, the output is not case sensitive and contains only numbers and letters.
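To make the CASE SENSITIVE setting concrete, the sketch below shows one way word-level accuracy could be computed in both modes. It is an illustration under stated assumptions, not vedastr's evaluation code: the helper names `normalize` and `accuracy` are hypothetical, and it assumes that in the case-insensitive mode both predictions and labels are lowercased and stripped of everything except ASCII letters and digits before comparison.

```python
import string
from typing import Sequence

# Characters kept in case-insensitive mode (assumption: ASCII letters and digits).
ALNUM = set(string.ascii_lowercase + string.digits)


def normalize(text: str, case_sensitive: bool = False) -> str:
    """Return the text unchanged, or lowercased and reduced to letters and digits."""
    if case_sensitive:
        return text
    return "".join(ch for ch in text.lower() if ch in ALNUM)


def accuracy(preds: Sequence[str], labels: Sequence[str],
             case_sensitive: bool = False) -> float:
    """Percentage of predictions that exactly match their label after normalization."""
    if len(preds) != len(labels):
        raise ValueError("preds and labels must have the same length")
    if not labels:
        return 0.0
    correct = sum(
        normalize(p, case_sensitive) == normalize(t, case_sensitive)
        for p, t in zip(preds, labels)
    )
    return 100.0 * correct / len(labels)


preds = ["Hello", "w0rld!", "cat"]
labels = ["hello", "W0RLD", "dog"]
print(round(accuracy(preds, labels), 2))                       # 66.67
print(round(accuracy(preds, labels, case_sensitive=True), 2))  # 0.0
```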

Installation

Requirements

  • Linux
  • Python 3.6+
  • PyTorch 1.1.0 or higher
  • CUDA 9.0 or higher

We have tested the following OS and software versions:

  • OS: Ubuntu 16.04.6 LTS
  • CUDA: 9.0
  • Python: 3.6.9

Install vedastr

a. Create a conda virtual environment and activate it.

conda create -n vedastr python=3.6 -y
conda activate vedastr

b. Install PyTorch and torchvision following the official instructions, e.g.,

conda install pytorch torchvision -c pytorch

c. Clone the vedastr repository.

git clone https://github.com/Media-Smart/vedastr.git
cd vedastr
vedastr_root=${PWD}

d. Install dependencies.

pip install -r requirements.txt
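After installation, a short script can confirm that the environment meets the requirements listed above (Python 3.6+, PyTorch 1.1.0 or higher). This is a minimal sketch rather than part of vedastr: the helper names `parse_version` and `check_environment` are hypothetical, and the script reports only whether CUDA is visible to PyTorch, not the CUDA toolkit version.

```python
import importlib.util
import re
import sys

MIN_PYTHON = (3, 6)
MIN_TORCH = (1, 1, 0)


def parse_version(text: str) -> tuple:
    """Return up to three leading numeric components, e.g. '1.13.1+cu117' -> (1, 13, 1)."""
    public = text.split("+")[0]  # drop a local build tag such as '+cu117'
    return tuple(int(part) for part in re.findall(r"\d+", public)[:3])


def check_environment() -> dict:
    """Collect simple pass/fail flags for the Python and PyTorch requirements."""
    report = {"python_ok": sys.version_info[:2] >= MIN_PYTHON}
    if importlib.util.find_spec("torch") is None:
        # PyTorch is not installed in this environment.
        report["torch_ok"] = False
        report["cuda_available"] = False
        return report
    import torch
    report["torch_ok"] = parse_version(torch.__version__) >= MIN_TORCH
    report["cuda_available"] = torch.cuda.is_available()
    return report


print(check_environment())
```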

Prepare data

a. Download the LMDB data from deep-text-recognition-benchmark, which contains the training, validation and evaluation data.

b. Create the data directory as follows:

cd ${vedastr_root}
mkdir ${vedastr_root}/data

c. Put the downloaded LMDB data into this data directory. The directory structure should look as follows:

data
└── data_lmdb_release
    ├── evaluation
    ├── training
    │   ├── MJ
    │   │   ├── MJ_test
    │   │   ├── MJ_train
    │   │   └── MJ_valid
    │   └── ST
    └── validation
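Before training, it can help to check that the data directory matches the layout shown above. The following is a small sketch using only the standard library. The helper name `missing_dirs` is hypothetical, and the expected paths are taken directly from the tree above.

```python
from pathlib import Path

# Sub-directories expected under the data directory, copied from the tree above.
EXPECTED = [
    "data_lmdb_release/evaluation",
    "data_lmdb_release/training/MJ/MJ_test",
    "data_lmdb_release/training/MJ/MJ_train",
    "data_lmdb_release/training/MJ/MJ_valid",
    "data_lmdb_release/training/ST",
    "data_lmdb_release/validation",
]


def missing_dirs(data_root: str) -> list:
    """Return the expected sub-directories that do not exist under data_root."""
    root = Path(data_root)
    return [rel for rel in EXPECTED if not (root / rel).is_dir()]


# Run from ${vedastr_root}; an empty list means the layout is complete.
print(missing_dirs("data"))
```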

Train

a. Config

Modify the configuration as needed in a config file such as configs/clova.py.

b. Run

python tools/trainval.py configs/clova.py

Snapshots and logs will be generated at vedastr/workdir.

Test

a. Config

Modify the configuration as needed in a config file such as configs/clova.py.

b. Run

python tools/test.py configs/clova.py path_to_clova_weights
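The train and test commands can also be driven from a Python script, for example to run them back to back. This is a sketch, not a vedastr tool: the helpers `build_command` and `run` are hypothetical, the script and config paths are the ones used in this README, and `path_to_clova_weights` remains a placeholder to be replaced with a real checkpoint path. The default `dry_run=True` only prints each command.

```python
import subprocess
import sys
from typing import List, Optional


def build_command(script: str, config: str, weights: Optional[str] = None) -> List[str]:
    """Build the argument list for a vedastr tool, using the current Python interpreter."""
    cmd = [sys.executable, script, config]
    if weights is not None:
        cmd.append(weights)
    return cmd


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(" ".join(cmd))
    if not dry_run:
        # check=True raises CalledProcessError if the tool exits with an error.
        subprocess.run(cmd, check=True)


train_cmd = build_command("tools/trainval.py", "configs/clova.py")
test_cmd = build_command("tools/test.py", "configs/clova.py", "path_to_clova_weights")

run(train_cmd)
run(test_cmd)
```

Calling `run(train_cmd, dry_run=False)` from the vedastr root directory would start the actual training.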

Contact

This repository is currently maintained by Jun Sun (@ChaseMonsterAway), Hongxiang Cai (@hxcai), and Yichao Xiong (@mileistone).

Credits

We borrowed a lot of code from mmcv, mmdetection, deep-text-recognition-benchmark and vedaseg. Thanks to open-mmlab, clovaai and Media-Smart.
