Code Monkey home page Code Monkey logo

robinasr's Introduction

RobinASR

This repository contains Robin's Automatic Speech Recognition (RobinASR) for the Romanian language based on the DeepSpeech2 architecture, together with a KenLM language model to imporve the transcriptions.

The pretrained text-to-speech model can be downloaded from here and the pretrained KenLM can be downloaded from here.

Also, make sure to visit:

Installation

Docker

We offer two docker containers that are available on dockerhub and that provide the RobinASR out of the box:

  • for running on GPU:
docker pull racai/robinasr:gpu
docker run --gpus all -p 8888:8888 --net=host --ipc=host racai/robinasr:gpu
  • for running on CPU:
docker pull racai/robinasr:cpu
docker run -p 8888:8888 --net=host --ipc=host racai/robinasr:cpu

You can also create your own docker image by following these steps:

  1. Download the pretrained text-to-speech model and the pretrained KenLM at the above links, and copy them in a models directory inside this repository.

  2. Build the docker image using the Dockerfile. Make sure that deepspeech_pytorch/configs/inference_config.py has the desired configuration.

docker build --tag RobinASR .
  1. Run the docker image.
docker run --gpus all -p 8888:8888 --net=host --ipc=host RobinASR

From Source

  1. You must have Python 3.6+ and PyTorch 1.5.1+ installed in your system. Also. Cuda 10.1+ is required if you want to use the (recommended) GPU version.

  2. Clone the repository and install its dependencies:

git clone https://github.com/racai-ai/RobinASR.git
cd RobinASR
pip3 install -r requirements.txt
pip3 install -e .
  1. Install Nvidia Apex:
git clone --recursive https://github.com/NVIDIA/apex.git
cd apex && pip install .
  1. If you want to use Beam Search and the KenLM language model, you must install CTCDecode:
git clone --recursive https://github.com/parlance/ctcdecode.git
cd ctcdecode && pip install .

Inference Server

Firstly, take a look at the configuration file in deepspeech_pytorch/configs/inference_config.py and make sure that the configuration meets your requirements. Then, run the following command:

python3 server.py

Train a New Model

You must create 3 csv manifest files (train, valid and test) that contain on each line the the path to a wav file and the path to its corresponding transcription, separated by commas:

path_to_wav1,path_to_txt1
path_to_wav2,path_to_txt2
path_to_wav3,path_to_txt3
...

Then you must modify correspondingly with your configuration the file located at deepspeech_pytorch/configs/train_config.py and start training with:

python train.py

Acknowledgments

We would like to thank Sean Narnen for making his DeepSpeech2 implementation publicly-available. We used a lot of his code in our implementation.

Cite

If you are using this repository, please cite the following paper as a thank you to the authors:

Avram, A.M., Păiș, V. and Tufis, D., 2020, October. Towards a Romanian end-to-end automatic speech recognition based on Deepspeech2. In Proc. Rom. Acad. Ser. A (Vol. 21, pp. 395-402).

or in BibTeX format:

@inproceedings{avram2020towards,
  title={Towards a Romanian end-to-end automatic speech recognition based on Deepspeech2},
  author={Avram, Andrei-Marius and Păiș, Vasile and Tufiș, Dan},
  booktitle={Proceedings of the Romanian Academy, Series A},
  pages={395--402},
  year={2020}
}

robinasr's People

Contributors

avramandrei avatar pvf2005 avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

robinasr's Issues

Error 404

Hello Andrei and thanks for sharing the source code and setup instructions for RobinASR. We have successfully setup the environment with the DockerHub and have it up and running. However, we had hard time on accessing the service and using it as we are constantly getting a 404 error. We would appreciate any help regarding this matter.

As a side note, having checked out the demo page - the RobinASR works really good and is an excellent offline solution for Romanian SR. Is there a way we could commercially setup this service on our premises?

Many thanks,
P

Training phase question ?

Will the train phase, use the "deepspeech_final.pth" model or can it be made to enhance this model ?
I am asking because I just want to improve the model not build one from scratch.

Thanks

Error while installing from git

When I'm trying to install from git with the following command:
!pip3 install -r requirements.txt

I get the following message with the error " × python setup.py egg_info did not run successfully."
What could be the problem?

Thank you very much

Requirement already satisfied: flask in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 1)) (2.2.5)
Collecting hydra-core (from -r requirements.txt (line 2))
  Downloading hydra_core-1.3.2-py3-none-any.whl (154 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 154.5/154.5 kB 1.3 MB/s eta 0:00:00
Collecting jupyter (from -r requirements.txt (line 3))
  Downloading jupyter-1.0.0-py2.py3-none-any.whl (2.7 kB)
Requirement already satisfied: librosa in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 4)) (0.10.1)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 5)) (3.7.1)
Requirement already satisfied: numpy in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 6)) (1.23.5)
Collecting optuna (from -r requirements.txt (line 7))
  Downloading optuna-3.4.0-py3-none-any.whl (409 kB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 409.6/409.6 kB 7.4 MB/s eta 0:00:00
Requirement already satisfied: pytest in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 8)) (7.4.3)
Collecting python-levenshtein (from -r requirements.txt (line 9))
  Downloading python_Levenshtein-0.23.0-py3-none-any.whl (9.4 kB)
Requirement already satisfied: scipy in /usr/local/lib/python3.10/dist-packages (from -r requirements.txt (line 10)) (1.11.4)
Collecting sklearn (from -r requirements.txt (line 11))
  Downloading sklearn-0.0.post12.tar.gz (2.6 kB)
  error: subprocess-exited-with-error
  
  × python setup.py egg_info did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  Preparing metadata (setup.py) ... error
error: metadata-generation-failed

× Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

model tflite

Buna ziua,

Ar fi fain si un model pretrained tflite.

Multumesc!

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.