
inboxpraveen / asr-accuracy-tool


🎙️📝 A powerful Flask-based web application that leverages the latest Hugging Face ASR models to provide real-time speech-to-text (STT) transcripts with an intuitive user interface for easy correction. Perfect for enhancing the quality of training datasets for ASR models. 🚀

License: MIT License

Python 68.81% CSS 0.15% JavaScript 1.95% HTML 29.10%
accuracy asr automatic-speech-recognition dataset-generation huggingface huggingface-transformers speech-recognition speech-to-text transformers

asr-accuracy-tool's Introduction

ASR-Accuracy-Tool 🔈

🎙️ A powerful Flask-based web application that leverages the latest Hugging Face ASR models to provide real-time speech-to-text (STT) transcripts with an intuitive user interface for easy correction. Perfect for enhancing the quality of training datasets for ASR models, building NLP applications driven by accurate text data, and much more.

Screenshots 🎥 of Application

  1. Home Page - It shows a simple form where you choose the directory that contains your audio files. This directory may itself contain further nested directories. Both relative and absolute paths are accepted.

Main Screen

  2. Processing Page - This dynamic page is driven by a Celery background task and refreshes every 10 seconds with any new transcriptions. It shows overall progress as the number of segments transcribed out of the total number of segments. An editable column lets you correct each transcript, and you can listen to the complete audio while transcripts continue to be generated.
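The directory handling described for the Home Page can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `find_audio_files` and the `AUDIO_EXTENSIONS` set are assumptions, since the README does not list the supported formats.

```python
from pathlib import Path

# Assumed set of extensions; the README does not say which formats are supported.
AUDIO_EXTENSIONS = {".wav", ".mp3", ".flac"}


def find_audio_files(directory):
    """Return sorted absolute paths of audio files under `directory`, including nested folders."""
    # resolve() turns a relative path into an absolute one, so both kinds of input work.
    root = Path(directory).expanduser().resolve()
    if not root.is_dir():
        raise NotADirectoryError(f"Not a directory: {root}")
    # rglob("*") walks every level of nesting below the chosen directory.
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS
    )
```

Resolving the path once up front means the rest of the pipeline only ever sees absolute paths, whichever form the user typed into the form.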

Transcribe Screen 1

Transcribe Screen 2
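The progress display and periodic refresh of the Processing Page could be backed by a small status payload like the one below. This is a hedged sketch only: `build_status`, its parameters, and the dictionary keys are hypothetical names, and the real application may structure its Celery task results differently.

```python
def build_status(transcripts, total_segments, already_sent=0):
    """Summarize transcription progress for one polling request.

    transcripts: finished segment texts, in order (hypothetical structure).
    total_segments: number of segments expected overall.
    already_sent: how many transcripts the page has already received.
    """
    done = len(transcripts)
    # Guard against division by zero when no segments are known yet.
    percent = round(100 * done / total_segments, 1) if total_segments else 0.0
    return {
        "progress_percent": percent,
        "new_transcripts": transcripts[already_sent:],  # empty list if nothing new
        "finished": total_segments > 0 and done >= total_segments,
    }


print(build_status(["hello", "world"], 4, already_sent=1))
# {'progress_percent': 50.0, 'new_transcripts': ['world'], 'finished': False}
```

Returning only the transcripts the page has not seen yet keeps each 10-second refresh small, even for long recordings.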

🎬 Video Demo Coming Soon...

Features:

  • Real-time audio-to-text conversion using state-of-the-art ASR models from Hugging Face.
  • User-friendly interface for reviewing and correcting transcripts.
  • Seamless integration with Hugging Face's model hub for easy model selection and updates.
  • Export corrected transcripts in common formats for training and analysis.
  • Built with scalability in mind for handling large datasets.
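As an illustration of the export feature, the sketch below writes corrected transcripts to CSV. The README does not specify which export formats are supported, so the CSV format, the function name `export_corrections`, and the column names are all assumptions made for this example.

```python
import csv


def export_corrections(rows, destination):
    """Write (audio_file, model_transcript, corrected_transcript) rows to a CSV file."""
    # newline="" lets the csv module control line endings on every platform.
    with open(destination, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        # Hypothetical column names, chosen for this sketch only.
        writer.writerow(["audio_file", "model_transcript", "corrected_transcript"])
        writer.writerows(rows)
```

Keeping the model output next to the corrected text makes it possible to measure how much the transcripts changed during review, which is useful when the file later feeds ASR training or evaluation.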

Why Use It:

Enhance the accuracy of your ASR models by easily creating high-quality training datasets. Correct and fine-tune ASR transcripts with ease, all powered by cutting-edge Hugging Face models.

Stay Updated: ⭐

Stay tuned for regular updates as we incorporate the latest advancements in ASR technology!

To-Do Improvements 🚧

This project is open to the community, and you are welcome to join me. I am primarily focusing on the following improvements:

  1. Add custom models for Speech Recognition
  2. Add support for macOS and Windows platforms
  3. Optimize memory by sharing resources, instead of loading a separate model instance for each concurrent Celery worker
  4. Add support for more audio extensions
  5. Add automatic setup and configuration scripts that make the tool more robust to changes
  6. Improvement to this documentation

Other contributions are also welcome. They will be slightly lower in priority, but thanks a lot for your input.

Contributions Welcome:

👩‍💻 Contributions from the community are welcome to make this tool even more powerful and accessible to everyone. Join me in building a better world of ASR use cases!
