Topic: asr

Something interesting about asr

👇 Here are 1007 public repositories matching this topic...

abhirooptalasila / autosub

asr,A CLI script to generate subtitle files (SRT/VTT/TXT) for any video using either DeepSpeech or Coqui

User: abhirooptalasila

speech-to-text ffmpeg sox deepspeech python asr mozilla-deepspeech autosub subtitle srt

ahmetoner / whisper-asr-webservice

asr,OpenAI Whisper ASR Webservice API

User: ahmetoner

Home Page: https://ahmetoner.github.io/whisper-asr-webservice

automatic-speech-recognition speech-recognition speech-to-text openai-whisper docker asr speech

ailln / cn2an

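asr,📦 Quickly convert between "Chinese numerals" and "Arabic numerals" (latest features: conversion of fractions, dates, temperatures, and more)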

User: ailln

Home Page: https://www.dovolopor.com/cn2an

arabic-numerals chinese-numerals python arabic-numbers nlp-tool nlp-library speech-recognition asr pypi cn2an

alphacep / vosk-android-demo

asr,Offline speech recognition for Android with Vosk library.

Organization: alphacep

android kaldi offline speech-recognition asr vosk

alphacep / vosk-api

asr,Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

Organization: alphacep

speech-recognition asr voice-recognition speech-to-text android ios raspberry-pi deep-learning deep-neural-networks speech-to-text-android

alphacep / vosk-server

asr,WebSocket, gRPC and WebRTC speech recognition server based on Vosk and Kaldi libraries

Organization: alphacep

websocket speech-recognition kaldi python asr grpc saas webrtc vosk

athena-team / athena

asr,an open-source implementation of sequence-to-sequence based speech processing engine

Organization: athena-team

Home Page: https://athena-team.readthedocs.io

speech-recognition asr transformer tensorflow ctc unsupervised-learning sequence-to-sequence deployment wfst speaker-recognition tts speech-synthesis

cheshirecc / faster-whisper-gui

asr,faster_whisper GUI with PySide6

User: cheshirecc

faster-whisper openai transcribe vad voice-transcription whisper whisperx asr

coqui-ai / stt

asr,🐸STT - The deep learning toolkit for Speech-to-Text. Training and deploying STT models has never been so easy.

Organization: coqui-ai

Home Page: https://coqui.ai

stt speech-to-text tensorflow deep-learning automatic-speech-recognition asr voice-recognition speech-recognition speech-recognizer speech-recognition-api

delta-ml / delta

asr,DELTA is a deep learning based natural language and speech processing platform.

Organization: delta-ml

Home Page: https://delta-didi.readthedocs.io/

nlp deep-learning tensorflow speech sequence-to-sequence seq2seq speech-recognition text-classification speaker-verification nlu

dmitryryumin / interspeech-2023-papers

asr,INTERSPEECH 2023 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023 conference. Explore the latest advances in speech and language processing. Code included. Star the repository to support the advancement of speech technology!

User: dmitryryumin

interspeech speech-technology machine-translation speech-synthesis asr prosody self-supervised-learning speech-production speech-coding transmission

freewym / espresso

asr,Espresso: A Fast End-to-End Neural Speech Recognition Toolkit

User: freewym

asr end-to-end fairseq kaldi python pytorch speech-recognition

funaudiollm / sensevoice

asr,Multilingual Voice Understanding Model

Organization: funaudiollm

Home Page: https://funaudiollm.github.io/

ai asr gpt-4o speech-recognition speech-to-text aigc audio-event-classification cross-lingual llm python

hirofumi0810 / neural_sp

asr,End-to-end ASR/LM implementation with PyTorch

User: hirofumi0810

pytorch speech-recognition automatic-speech-recognition asr ctc attention-mechanism attention seq2seq sequence-to-sequence speech language-model transformer language-modeling rnn-transducer transformer-xl streaming

iceychris / libreasr

asr,💬 An On-Premises, Streaming Speech Recognition System

User: iceychris

Home Page: https://news.ycombinator.com/item?id=25099847

asr speech-recognition pytorch fastai rnn-transducer deep-learning esp32-lyrat python

ictnlp / streamspeech

asr,StreamSpeech is an “All in One” seamless model for offline and simultaneous speech recognition, speech translation and speech synthesis.

Organization: ictnlp

Home Page: https://ictnlp.github.io/StreamSpeech-site/

seamless simultaneous-translation speech speech-recognition speech-synthesis speech-to-text speech-translation translation all-in-one machine-translation streaming-audio text-to-speech asr tts voice text-to-audio non-autoregressive speech-enhancement audio-processing speech-processing

innovatorved / whisper.api

asr,This project provides an API with user level access support to transcribe speech to text using a finetuned and processed Whisper ASR model.

User: innovatorved

Home Page: https://innovatorved-whisper-api.hf.space/

asr innovatorved transcribe whisper hacktoberfest

jdepoix / youtube-transcript-api

asr,This is a python API which allows you to get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles and it does not require an API key nor a headless browser, like other selenium based solutions do!

User: jdepoix

youtube-api subtitles youtube transcripts youtube-subtitles youtube-transcripts python transcript subtitle cli

k2-fsa / sherpa

asr,Speech-to-text server framework with next-gen Kaldi

Organization: k2-fsa

Home Page: https://k2-fsa.github.io/sherpa

asr cpp python ctc pytorch transducer websocket end-to-end-asr speech-recognition

k2-fsa / sherpa-onnx

asr,Speech-to-text, text-to-speech, and speaker recognition using next-gen Kaldi with onnxruntime without Internet connection. Support embedded systems, Android, iOS, Raspberry Pi, RISC-V, x86_64 servers, websocket server/client, C/C++, Python, Kotlin, C#, Go, NodeJS, Java, Swift, Dart, JavaScript, Flutter

Organization: k2-fsa

Home Page: https://k2-fsa.github.io/sherpa/onnx/index.html

asr onnx windows linux macos cpp android ios raspberry-pi aarch64

kaituoxu / speech-transformer

asr,A PyTorch implementation of Speech Transformer, an End-to-End ASR with Transformer network on Mandarin Chinese.

User: kaituoxu

asr attention attention-is-all-you-need end-to-end pytorch self-attention transformer

linto-ai / whisper-timestamped

asr,Multilingual Automatic Speech Recognition with word-level timestamps and confidence

Organization: linto-ai

deep-learning speech speech-recognition speech-to-text asr machine-learning python python3 pytorch attention-is-all-you-need

m-bain / whisperx

asr,WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

User: m-bain

asr speech speech-recognition speech-to-text whisper

mahmoudashraf97 / whisper-diarization

asr,Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper

User: mahmoudashraf97

asr speaker-diarization speech speech-recognition speech-to-text whisper

mravanelli / pytorch-kaldi

asr,pytorch-kaldi is a project for developing state-of-the-art DNN/RNN hybrid speech recognition systems. The DNN part is managed by pytorch, while feature extraction, label computation, and decoding are performed with the kaldi toolkit.

User: mravanelli

speech-recognition gru dnn kaldi rnn-model pytorch timit deep-learning deep-neural-networks recurrent-neural-networks

mravanelli / sincnet

asr,SincNet is a neural architecture for efficiently processing raw audio samples.

User: mravanelli

deep-learning audio waveform filtering cnn convolutional-neural-networks speaker-recognition speaker-verification speaker-identification speech-recognition

nvidia / nemo

asr,A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)

Organization: nvidia

Home Page: https://docs.nvidia.com/nemo-framework/user-guide/latest/overview.html

machine-translation speaker-recognition asr tts generative-ai multimodal deeplearning neural-networks speaker-diariazation speech-translation

openspeech-team / openspeech

asr,Open-Source Toolkit for End-to-End Speech Recognition leveraging PyTorch-Lightning and Hydra.

Organization: openspeech-team

Home Page: https://openspeech-team.github.io/openspeech/

asr speech recognition speech-recognition open end-to-end e2e

paddlepaddle / paddlespeech

asr,Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

Organization: paddlepaddle

Home Page: https://paddlespeech.readthedocs.io

transformer conformer speech-translation streaming-asr speech-alignment punctuation-restoration streaming-tts speech-synthesis tts asr

peterh0323 / streamer-sales

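asr,Streamer-Sales 销冠 (Sales Champion): a sales-livestreamer LLM 🛒🎁 that, given a product's features, presents the product in a way designed to spark viewers' desire to buy. 🚀⭐ Includes a detailed data-generation pipeline ❗ 📦 Also integrates LMDeploy accelerated inference 🚀, RAG (retrieval-augmented generation) 📚, TTS (text-to-speech) 🔊, digital-human generation 🦸, an Agent that queries the web for real-time information 🌐, and ASR (speech-to-text) 🎙️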

User: peterh0323

Home Page: https://openxlab.org.cn/apps/detail/HinGwenWong/Streamer-Sales

chat-application internlm-chat-7b internlm2 llm chatbot text-generation chat chatgpt gpt rag

picovoice / cheetah

asr,On-device streaming speech-to-text engine powered by deep learning

Organization: picovoice

Home Page: https://picovoice.ai/

speech-to-text asr automatic-speech-recognition online-speech-recognition speech-recognition stt transcription voice-recognition streaming-speech-to-text

purfview / whisper-standalone-win

asr,Whisper & Faster-Whisper standalone executables for those who don't want to bother with Python.

User: purfview

openai speech-to-text transcriber whisper asr speech-recognition subtitles ctranslate2 faster-whisper whisper-faster

pykaldi / pykaldi

asr,A Python wrapper for Kaldi

Organization: pykaldi

Home Page: https://pykaldi.github.io

python wrapper kaldi openfst asr speech-recognition speech language-model feature-extraction clif numpy

rapidai / rapidasr

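asr,A commercial-grade open-source automatic speech recognition library: works out of the box, supports all platforms, and recognizes mixed Chinese-English speech. A cross-platform implementation of ASR inference. It's based on ONNXRuntime and FunASR. We provide a set of easier APIs to call ASR models.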

Organization: rapidai

asr paraformer paddlespeech wenet

snakers4 / open_stt

asr,Open STT

User: snakers4

speech-to-text russian dataset stt asr automatic-speech-recognition

snakers4 / silero-models

asr,Silero Models: pre-trained speech-to-text, text-to-speech and text-enhancement models made embarrassingly simple

User: snakers4

speech-recognition speech-to-text stt asr pretrained-models english german spanish stt-benchmark pytorch

sooftware / conformer

asr,[Unofficial] PyTorch implementation of "Conformer: Convolution-augmented Transformer for Speech Recognition" (INTERSPEECH 2020)

User: sooftware

conformer transformer cnn transformer-xl asr speech-recognition pytorch conv convolution augmented

sooftware / kospeech

asr,Open-Source Toolkit for End-to-End Korean Automatic Speech Recognition leveraging PyTorch and Hydra.

User: sooftware

Home Page: https://sooftware.github.io/kospeech/

speech-recognition asr korean-speech end-to-end las-models ksponspeech pytorch seq2seq e2e-asr las

speechbrain / speechbrain

asr,A PyTorch-based Speech Toolkit

Organization: speechbrain

Home Page: http://speechbrain.github.io

speech-recognition speech-toolkit speaker-recognition speech-to-text speech-enhancement speech-separation audio audio-processing speech-processing speechrecognition

speechio / chinese_text_normalization

asr,Chinese text normalization for speech processing

Organization: speechio

text-normalization chinese asr kaldi-asr thrax-gramma sparrowhawk speech-recognition

srvk / eesen

asr,The official repository of the Eesen project

Organization: srvk

Home Page: http://arxiv.org/abs/1507.08240

tensorflow ctc-loss asr ctc kaldi speech-recognition speech-to-text

tensorflow / lingvo

asr,Lingvo

Organization: tensorflow

speech-recognition translation speech-to-text machine-translation mnist seq2seq language-model tts asr lm nlp tensorflow speech research distributed gpu-computing speech-synthesis

wenet-e2e / wenet

asr,Production First and Production Ready End-to-End Speech Recognition Toolkit

Organization: wenet-e2e

Home Page: https://wenet-e2e.github.io/wenet/

asr automatic-speech-recognition conformer e2e-models production-ready pytorch speech-recognition transformer whisper

wzpan / wukong-robot

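asr,🤖 wukong-robot is a simple, flexible, and elegant Chinese voice-dialogue robot / smart speaker project. It supports multi-turn conversation with ChatGPT and may be the first open-source smart speaker project to support brain-computer interaction.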

User: wzpan

Home Page: https://wukong.hahack.com/

ai speaker asr tts unit homeassistant raspeberry-pi amazon-echo alexa snowboy

xiangyuecn / recorder

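asr,HTML5 JS audio recording in mp3, wav, ogg, webm, amr, g711a, and g711u formats; supports PC and some Android and iOS browsers, Hybrid Apps (Android and iOS app source code provided), and WeChat; provides ASR speech-to-text, an H5 voice-call chat example, and DTMF encoding/decoding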

User: xiangyuecn

Home Page: https://xiangyuecn.github.io/Recorder/

recorder record javascript html5 h5 luyin mp3 wav amr ogg

yeyupiaoling / masr

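asr,A streaming and non-streaming automatic speech recognition framework implemented in PyTorch, compatible with both online and offline recognition. It currently supports the Conformer, Squeezeformer, and DeepSpeech2 models and several data augmentation methods.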

User: yeyupiaoling

deepspeech pytorch asr deep-learning speech-recognition speech-to-text speech conformer squeezeformer

yeyupiaoling / paddlepaddle-deepspeech

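asr,Speech recognition implemented with PaddlePaddle, targeting Chinese speech. The project is complete and recognizes well. Supports training and inference on Windows and Linux, and inference on Nvidia Jetson development boards.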

User: yeyupiaoling

Home Page: https://yeyupiaoling.blog.csdn.net/article/details/102904306

paddlepaddle deepspeech chinese asr deepspeech2 docker nvidia-docker speech-recognition speech-to-text deep-learning

yeyupiaoling / ppasr

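asr,End-to-end Chinese speech recognition implemented with PaddlePaddle, from getting started to real-world use: very simple introductory examples and very practical enterprise projects. Supports the currently most popular models: DeepSpeech2, Conformer, and Squeezeformer.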

User: yeyupiaoling

asr paddlepaddle deep-learning chinese speech-to-text speech speech-recognition streaming-asr conformer squeezeformer

yeyupiaoling / whisper-finetune

asr,Fine-tune the Whisper speech recognition model to support training without timestamp data, training with timestamp data, and training without speech data. Accelerate inference and support Web deployment, Windows desktop deployment, and Android deployment

User: yeyupiaoling

asr ctranslate2 huggingface whisper lora speech-recognition transformers chinese pytorch android

zw76859420 / asr_theory

asr,Speech recognition theory, papers, and PPT slides

User: zw76859420

Home Page: https://www.meta-speech.com

asr kaldi tensorflow keras papers ppt deeplearning k2 kaldi2
