deepspeech2's Introduction

DeepSpeech2

Python 3 installation and complete setup, for prediction (inference) only, of Baidu's Deep Speech 2 model.

Implementation of the DeepSpeech2 architecture for ASR. It is an open-source implementation of an end-to-end Automatic Speech Recognition (ASR) engine, based on Baidu's Deep Speech 2 paper and built on the PaddlePaddle platform. Baidu's pre-trained model for English is used for inference.

System Requirements:

OS: Ubuntu 16.04.5 LTS
Language: Python3 and Bash
Database: None

Tools:

PaddlePaddle
SoundFile
Efficient Signal Resampling (resampy)
Python Speech Feature extraction (python_speech_features)

Packages:

The required Python packages are listed in requirements.txt.

Environment setup

Folders setup

sh folder_setup.sh

OR

git clone https://github.com/PaddlePaddle/DeepSpeech/
cd DeepSpeech/
mkdir checkpoints
cd checkpoints/
mkdir baidu
cd baidu/
mkdir step_final/
cd step_final/
cd ..
cd ..
cd ..
cd models/
cd baidu_en8k/
sh download_model.sh
cd ..
cd ..
cp -p models/baidu_en8k/params.pdparams checkpoints/baidu/step_final/params.pdparams
cp -p ../data_utils/audio.py ../DeepSpeech/data_utils/audio.py
cp -p ../data_utils/data.py ../DeepSpeech/data_utils/data.py
cp -p ../data_utils/utility.py ../DeepSpeech/data_utils/utility.py
cp -p ../model_utils/model.py ../DeepSpeech/model_utils/model.py
cp -p ../utils/utility.py ../DeepSpeech/utils/utility.py
cp -p ../utils/error_rate.py ../DeepSpeech/utils/error_rate.py
cp -p ../decoders/swig_wrapper.py ../DeepSpeech/decoders/swig_wrapper.py
cp -p ../swig_decoders.py ../DeepSpeech/swig_decoders.py
cp -p ../_swig_decoders.py ../DeepSpeech/_swig_decoders.py
cp -p ../_swig_decoders.cpython-35m-x86_64-linux-gnu.so ../DeepSpeech/_swig_decoders.cpython-35m-x86_64-linux-gnu.so
cp -p ../create_manifest.py ../DeepSpeech/create_manifest.py
cp -p ../infer_deepspeech2_baidu.py ../DeepSpeech/infer_deepspeech2_baidu.py
cp -p ../models/lm/download_lm_en.sh ../DeepSpeech/models/lm/download_lm_en.sh
cp -p ../deploy/demo_server.py ../DeepSpeech/deploy/demo_server.py
cp -p ../deploy/demo_client.py ../DeepSpeech/deploy/demo_client.py
mkdir test_dataset
cp -r ../vaibhav_data_test_wav16k/ ../DeepSpeech/test_dataset/vaibhav_data_test_wav16k/
cd data
mkdir baidu_en8k/
cd ..
cd models/lm
sh download_lm_en.sh
cd ../..
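The commands above leave a number of files at fixed locations inside DeepSpeech/. As a minimal sketch for checking the result, the Python below reports which of those files are absent. The helper name missing_files and the EXPECTED_FILES list are illustrative and not part of this repo; the paths are a subset of the copy targets in the commands above.

```python
import os

# Paths relative to the cloned DeepSpeech/ directory, taken from the
# folder setup commands. This is a subset; extend it as needed.
EXPECTED_FILES = [
    "checkpoints/baidu/step_final/params.pdparams",
    "data_utils/audio.py",
    "data_utils/data.py",
    "data_utils/utility.py",
    "model_utils/model.py",
    "utils/utility.py",
    "utils/error_rate.py",
    "decoders/swig_wrapper.py",
    "swig_decoders.py",
    "create_manifest.py",
    "infer_deepspeech2_baidu.py",
    "deploy/demo_server.py",
    "deploy/demo_client.py",
]


def missing_files(root, expected=EXPECTED_FILES):
    """Return the expected paths that do not exist as files under root."""
    return [p for p in expected if not os.path.isfile(os.path.join(root, p))]
```

Calling missing_files("DeepSpeech") from the directory that contains the clone should return an empty list once the folder setup has completed.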

Utilities setup

Either run required_packages.sh

OR

sudo apt-get install python3-pip
sudo python3 -m pip install paddlepaddle

Libraries setup

sudo python3 -m pip install -r requirements.txt

OR

sudo python3 -m pip install scipy==1.2.1
sudo python3 -m pip install SoundFile==0.9.0.post1
sudo python3 -m pip install resampy==0.1.5
sudo python3 -m pip install python_speech_features
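After installing, it can help to confirm that the pinned versions are the ones actually present. The following is a rough sketch, not part of this repo: the names PINNED, installed_version and check_pins are made up for illustration, and the pins are copied from the pip commands above (python_speech_features has no pin there, so only its presence is checked).

```python
# Versions copied from the pip commands; None means "any version".
PINNED = {
    "scipy": "1.2.1",
    "SoundFile": "0.9.0.post1",
    "resampy": "0.1.5",
    "python_speech_features": None,
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        # Available on Python 3.8 and later.
        from importlib.metadata import version, PackageNotFoundError
    except ImportError:
        # Fallback for older interpreters such as Python 3.5.
        import pkg_resources
        try:
            return pkg_resources.get_distribution(name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(name)
    except PackageNotFoundError:
        return None


def check_pins(pins, lookup=installed_version):
    """Return a list of problems; an empty list means every pin matches."""
    problems = []
    for name, wanted in pins.items():
        found = lookup(name)
        if found is None:
            problems.append("{} is not installed".format(name))
        elif wanted is not None and found != wanted:
            problems.append(
                "{}: found {}, expected {}".format(name, found, wanted)
            )
    return problems
```

Running check_pins(PINNED) in the same interpreter used for inference lists any missing or mismatched packages. The lookup parameter exists only so the check can be exercised without touching the real environment.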

Other steps

Replace data_utils/utility.py in the cloned DeepSpeech directory with the version from this repo.

deepspeech2's People

Contributors

vaibhavabhimanyoohiwase
vaibhavhiwasekonvergeai

deepspeech2's Issues

While running the demo_server file, I am getting the following error:

Traceback (most recent call last):
  File "deploy/demo_server.py", line 601, in <module>
    main()
  File "deploy/demo_server.py", line 597, in main
    start_server(kwargs)
  File "deploy/demo_server.py", line 504, in start_server
    vocab_list)
  File "/home/arti/nlp_engine/GIT/deepspeech2_python3/DeepSpeech2/DeepSpeech/deploy/../model_utils/model.py", line 478, in init_ext_scorer
    language_model_path, vocab_list)
  File "/home/arti/nlp_engine/GIT/deepspeech2_python3/DeepSpeech2/DeepSpeech/deploy/../decoders/swig_wrapper.py", line 23, in __init__
    swig_decoders.Scorer.__init__(self, alpha, beta, model_path, vocabulary)
  File "/home/arti/nlp_engine/GIT/deepspeech2_python3/DeepSpeech2/DeepSpeech/deploy/../swig_decoders.py", line 1140, in __init__
    _swig_decoders.Scorer_swiginit(self, _swig_decoders.new_Scorer(alpha, beta, lm_path, vocabulary))
TypeError: in method 'new_Scorer', argument 4 of type 'std::vector< std::string,std::allocator< std::string > > const &'
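
The report does not say what was passed as the vocabulary, so the following is a guess rather than a diagnosis. Argument 4 of new_Scorer is the vocabulary, and SWIG raises this kind of TypeError when it cannot convert the Python object into a vector of strings. One situation where that can happen under Python 3 is a vocabulary list holding bytes objects (for example, entries produced by a Python 2 style .encode("utf-8")) instead of str. Assuming that is the cause, a small hypothetical helper (to_str_vocab is not part of this repo or of DeepSpeech) could normalize the list before it reaches the scorer:

```python
def to_str_vocab(vocab):
    """Return vocab as a list of str, decoding any bytes entries as UTF-8.

    Raises TypeError for entries that are neither str nor bytes, so an
    unexpected vocabulary shows up here rather than inside the SWIG layer.
    """
    result = []
    for token in vocab:
        if isinstance(token, bytes):
            token = token.decode("utf-8")
        elif not isinstance(token, str):
            raise TypeError(
                "unexpected vocabulary entry: {!r}".format(token)
            )
        result.append(token)
    return result
```

If the assumption holds, passing to_str_vocab(vocab_list) instead of vocab_list at the call shown in the traceback would be one thing to try. If the error persists, the vocabulary is probably not the problem.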
