SpeeQ

SpeeQ, pronounced "speekiu", is a Python-based speech recognition framework that lets developers and researchers experiment with and train a variety of speech recognition models. It provides pre-implemented model architectures that can be trained with just a few lines of code, which makes it well suited to quick prototyping and testing.

To get started, refer to the documentation. If you need assistance or want to stay connected, please join our Discord Server.

Installation

To install the package, follow the steps below:

  1. Create and activate a Python virtual environment:

    python3 -m venv env
    source env/bin/activate

  2. Install the package from source:

    git clone https://github.com/msalhab96/SpeeQ.git
    cd SpeeQ
    pip install -r requirements.txt
    pip install -e .
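
After the steps above, a quick sanity check can confirm that the interpreter is running inside the virtual environment and that the package can be found. The sketch below is only an illustration: the helper names `in_virtualenv` and `is_installed` are made up for this example, and the import name `speeq` is an assumption based on the repository name, so adjust it if the installed package is named differently.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment."""
    # Inside a venv, sys.prefix points at the environment directory, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


def is_installed(module_name: str) -> bool:
    """Return True when a top-level module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "speeq" is an assumed import name, not confirmed by this README.
    print("virtual environment active:", in_virtualenv())
    print("speeq importable:", is_installed("speeq"))
```

If the second line reports `False`, re-run `pip install -e .` from the repository root with the environment activated.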

Implemented Models/Papers

| Model name | Paper | Type |
| --- | --- | --- |
| Deep Speech 1 | Deep Speech: Scaling up end-to-end speech recognition | CTC |
| Deep Speech 2 | Deep Speech 2: End-to-End Speech Recognition in English and Mandarin | CTC |
| Conformer | Conformer: Convolution-augmented Transformer for Speech Recognition | CTC |
| Jasper | Jasper: An End-to-End Convolutional Neural Acoustic Model | CTC |
| Wav2Letter | Wav2Letter: an End-to-End ConvNet-based Speech Recognition System | CTC |
| QuartzNet | QuartzNet: Deep Automatic Speech Recognition with 1D Time-Channel Separable Convolutions | CTC |
| Squeezeformer | Squeezeformer: An Efficient Transformer for Automatic Speech Recognition | CTC |
| RNNTransducer | Sequence Transduction with Recurrent Neural Networks | Transducer |
| ConformerTransducer | Conformer: Convolution-augmented Transformer for Speech Recognition | Transducer |
| ContextNet | ContextNet: Improving Convolutional Neural Networks for Automatic Speech Recognition with Global Context | Transducer |
| VGGTransformer-Transducer | Transformer-Transducer: End-to-End Speech Recognition with Self-Attention | Transducer |
| Transformer-Transducer | Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss | Transducer |
| BasicAttSeq2SeqRNN | N/A | Seq2Seq (encoder/decoder) |
| LAS | Listen, Attend and Spell | Seq2Seq (encoder/decoder) |
| RNNWithLocationAwareAtt | Attention-Based Models for Speech Recognition | Seq2Seq (encoder/decoder) |
| SpeechTransformer | Speech-Transformer: A No-Recurrence Sequence-to-Sequence Model for Speech Recognition | Seq2Seq (encoder/decoder) |
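
For readers unfamiliar with the Type column: CTC models emit one label per audio frame, including a special blank label, and the transcript is recovered by merging consecutive repeats and then dropping blanks. The snippet below is a generic, minimal sketch of that greedy collapse step in plain Python. It is not SpeeQ's API; the function name `ctc_greedy_collapse` and the choice of `0` as the blank id are assumptions made for illustration.

```python
from itertools import groupby
from typing import List, Sequence


def ctc_greedy_collapse(frame_ids: Sequence[int], blank_id: int = 0) -> List[int]:
    """Collapse per-frame label ids into an output label sequence.

    Step 1: merge runs of identical consecutive ids into a single id.
    Step 2: remove the blank id (assumed to be 0 in this sketch).
    """
    merged = [label for label, _ in groupby(frame_ids)]
    return [label for label in merged if label != blank_id]


# The blank between the two 3s keeps them as separate output labels.
print(ctc_greedy_collapse([0, 3, 3, 0, 3, 5, 5, 0]))  # prints [3, 3, 5]
```

Transducer and Seq2Seq models decode differently, conditioning each output label on the labels already emitted, so this collapse step applies only to the CTC rows.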

Contribution

Your contributions are highly valued and appreciated! Our aim is to create an open and transparent environment that facilitates easy and straightforward contributions to this project. This can include reporting any issues or bugs you encounter, engaging in discussions regarding the current codebase, submitting fixes, proposing new features, or even becoming a maintainer yourself. We believe that your input is crucial to the continued growth and success of this framework. To start contributing to the framework, please consult the guidelines for contributions.

License & Citation

The framework is licensed under the MIT License. If you use it, please consider citing it with the following BibTeX entry:

@software{Salhab_SpeeQ_A_framework_2023,
author = {Salhab, Mahmoud},
doi = {10.5281/zenodo.7708780},
license = {MIT},
month = {3},
title = {{SpeeQ: A framework for automatic speech recognition}},
url = {https://github.com/msalhab96/SpeeQ},
version = {0.0.1},
year = {2023}
}
