kanji-classifier

An OCR application that classifies almost 3,000 Japanese kanji. Run print_all_characters.py to see the full list of supported characters.
Deployed at kanji.al3xbro.me

Dependencies:

  • TensorFlow, Keras: 2.12
  • NumPy, OpenCV

Note: Follow the guide at https://www.tensorflow.org/guide/gpu to use your GPU for training
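
After following the GPU guide, a quick check like the one below can confirm that TensorFlow actually sees the device. This is a minimal sketch and not part of the repository; the helper name list_gpus is made up for illustration, and it returns an empty list when TensorFlow is not installed.

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or [] if TensorFlow is missing."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    # tf.config.list_physical_devices is the standard TF 2.x device query.
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    gpus = list_gpus()
    print(f"GPUs visible to TensorFlow: {gpus if gpus else 'none'}")
```

If the list is empty, training still runs, but on the CPU.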

Performance:

  • 92% accuracy for both validation and training sets.
  • 0.18 training loss and 0.21 validation loss.
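
To make these numbers easier to interpret, here is a small sketch of how accuracy and loss can be computed from a model's softmax outputs. It assumes the loss is categorical cross-entropy, which is the usual choice for multi-class classification but is not stated in this README; the function name accuracy_and_loss is hypothetical.

```python
import numpy as np


def accuracy_and_loss(probs, labels):
    """Compute top-1 accuracy and mean cross-entropy loss.

    probs:  array of shape (n_samples, n_classes), each row sums to 1.
    labels: array of shape (n_samples,) holding the true class indices.
    """
    probs = np.asarray(probs, dtype=np.float64)
    labels = np.asarray(labels, dtype=np.int64)

    # Accuracy: fraction of samples whose highest-probability class is correct.
    predicted = probs.argmax(axis=1)
    accuracy = float((predicted == labels).mean())

    # Cross-entropy: negative log of the probability given to the true class.
    true_class_probs = probs[np.arange(len(labels)), labels]
    loss = float(-np.log(np.clip(true_class_probs, 1e-12, 1.0)).mean())
    return accuracy, loss


if __name__ == "__main__":
    example_probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
    example_labels = [0, 1, 1]
    print(accuracy_and_loss(example_probs, example_labels))
```

Under that assumption, a loss of 0.21 means the model assigns the correct character a probability of about exp(-0.21) ≈ 0.81 on (geometric) average.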

To Train:

  1. Download an image dataset of your choice.
  2. Modify the config.py file to contain the correct paths.
  3. Run the image_preprocessing.py script to process images.
  4. Run the delete_hiragana.py script to remove hiragana from the dataset.
  5. Run the model_training.py script to train your model. The script uses data augmentation to help the model generalize.
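
The README does not say which augmentations model_training.py applies. As an illustration of the general idea only, the sketch below shifts a grayscale character image by a random offset, a common augmentation for handwriting. The function name random_shift and the choice of transform are assumptions, not the repository's code.

```python
import numpy as np


def random_shift(image, max_shift, rng):
    """Translate a 2-D grayscale image by a random offset.

    Pixels shifted out of the frame are dropped, and the exposed
    area is filled with 0.
    """
    height, width = image.shape
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))

    shifted = np.zeros_like(image)
    # Source and destination windows that overlap after the shift.
    src_rows = slice(max(0, -dy), min(height, height - dy))
    dst_rows = slice(max(0, dy), min(height, height + dy))
    src_cols = slice(max(0, -dx), min(width, width - dx))
    dst_cols = slice(max(0, dx), min(width, width + dx))
    shifted[dst_rows, dst_cols] = image[src_rows, src_cols]
    return shifted


if __name__ == "__main__":
    rng = np.random.default_rng(seed=0)
    image = np.arange(1, 17).reshape(4, 4)
    print(random_shift(image, max_shift=1, rng=rng))
```

Each training epoch then sees slightly different versions of the same character, which discourages the model from memorizing exact pixel positions.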

Testing:

  • Run the predicting.py script to test your model.
  • Try writing kanji in your own handwriting and testing your model on that. Have fun!
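
With almost 3,000 classes, it often helps to look at the few most likely characters rather than only the top guess, since many kanji differ by a stroke or two. The sketch below shows one way to do that from a probability vector. It is not taken from predicting.py; the function name top_k_predictions and the class_names list are hypothetical.

```python
import numpy as np


def top_k_predictions(probs, class_names, k=5):
    """Return the k most likely (character, probability) pairs, best first."""
    probs = np.asarray(probs, dtype=np.float64)
    k = min(k, probs.size)
    # argsort is ascending, so reverse it and keep the first k indices.
    best = np.argsort(probs)[::-1][:k]
    return [(class_names[i], float(probs[i])) for i in best]


if __name__ == "__main__":
    # Toy example with three classes standing in for the full character set.
    names = ["日", "月", "火"]
    for character, probability in top_k_predictions([0.1, 0.6, 0.3], names, k=2):
        print(character, probability)
```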

Resources:

  • The ETL8G and ETL9G datasets from etlcdb were used for training and validation.
  • Images were extracted from these datasets with etlcdb-image-extractor. Thank you!
