thanhkm Goto Github PK

followers: 1 following: 0 repos: 65 gists: 0

Name: Thanh KM

Type: User

Bio: Research interests: NLP, Generative AI, Text to Speech, Speech recognition

Thanh KM's Projects

amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audio, music, and speech generation research and development.

bigvgan

Official PyTorch implementation of BigVGAN (ICLR 2023)

clap

Contrastive Language-Audio Pretraining

consistencyvc-voive-conversion

Using joint training speaker encoder with consistency loss to achieve cross-lingual voice conversion and expressive voice conversion

contextnet

TensorFlow 2 implementation of ContextNet, an improved convolutional RNN-transducer-based architecture for end-to-end speech recognition using global context

control-vc

This is the implementation for "ControlVC: Zero-Shot Voice Conversion with Time-Varying Controls on Pitch and Rhythm"

coursera-fullstack-course4

coursera-fullstack-course5

ctcdecode

PyTorch CTC Decoder bindings

ctci-6th-edition

Cracking the Coding Interview 6th Ed. Solutions

dddm-vc

Official PyTorch Implementation for "DDDM-VC: Decoupled Denoising Diffusion Models with Disentangled Representation and Prior Mixup for Verified Robust Voice Conversion" (AAAI 2024)

deepspeech

A PaddlePaddle implementation of DeepSpeech2 architecture for ASR.

deepspeech.pytorch

Speech Recognition using DeepSpeech2.

descript-audio-codec

State-of-the-art audio codec with 90x compression factor. Supports 44.1kHz, 24kHz, and 16kHz mono/stereo audio.

diff-hiervc

Official PyTorch Implementation of "Diff-HierVC: Diffusion-based Hierarchical Voice Conversion with Robust Pitch Generation and Masked Prior for Zero-shot Speaker Adaptation"

dissc

Official repository for "Speaking Style Conversion With Discrete Self-Supervised Units". https://arxiv.org/abs/2212.09730

end-to-end-asr-pytorch

This is an open-source project (formerly named Listen, Attend and Spell - PyTorch Implementation) for end-to-end ASR implemented with PyTorch, the well-known deep learning toolkit.

epijudge

EPI Judge - Preview Release

espnet

End-to-End Speech Processing Toolkit

faceswap

Unofficial project based on the original /r/Deepfakes thread. Many thanks to its author!

freevc

FreeVC: Towards High-Quality Text-Free One-Shot Voice Conversion

hierspeech

hierspeechpp

The official implementation of HierSpeech++

hubert

HuBERT content encoders for: A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion

kaldi

This is the official location of the Kaldi project.

libri-light

Dataset for lightly supervised training using the LibriVox audiobook recordings. https://librivox.org/

listen-attend-and-spell

TensorFlow implementation of "Listen, Attend and Spell" by William Chan. This project uses TensorFlow's input pipeline and Estimator API, which makes training and evaluation truly end-to-end.
