auzxb

followers: 26 following: 59 repos: 69 gists: 0

Type: User

Bio: Interested in Machine Learning and Deep Learning. Focus on Speech Synthesis and NLP

Location: Shenzhen

About Me

👋 A senior algorithm engineer for multi-modal audio at Tencent Music Entertainment Group.
🔭 Extensive experience in speech synthesis, singing voice synthesis, signal processing, machine learning, and multi-modal algorithms.
📫 My research topics are listed on my Google Scholar profile.
⚡ Contact me by email: [email protected]

auzxb's Projects

actionformer_release

Code release for ActionFormer (ECCV 2022)

alta

A complete training recipe for kaldi-based Automatic Lyrics Transcription.

audiosignalprocessingforml

Code and slides for my YouTube series "Audio Signal Processing for Machine Learning"

auraloss

Collection of audio-focused loss functions in PyTorch

awesome-diffusion-models

A collection of resources and papers on Diffusion Models

bigvgan

Unofficial pytorch implementation of BigVGAN: A Universal Neural Vocoder with Large-Scale Training

bigvgan-1

Official PyTorch implementation of BigVGAN (ICLR 2023)

chinese-poetry

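The most comprehensive database of classical Chinese poetry: nearly 14,000 poets from the Tang and Song dynasties, close to 55,000 Tang poems plus 260,000 Song poems, and 21,050 ci (lyric poems) by 1,564 ci poets of the Northern and Southern Song periods.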

crepe

CREPE: A Convolutional REpresentation for Pitch Estimation -- pre-trained model (ICASSP 2018)

deep-q-learning

PyTorch Implementation of Deep Q-Learning with Experience Replay in Atari Game Environments, as made public by Google DeepMind

deepchorus

An end-to-end chorus detection model DeepChorus.

deeplearningforaudiowithpython

Code and slides for the "Deep Learning (For Audio) With Python" course on TheSoundOfAI Youtube channel.

Real Time Speech Enhancement in the Waveform Domain (Interspeech 2020). We provide a PyTorch implementation of the paper Real Time Speech Enhancement in the Waveform Domain, in which we present a causal speech enhancement model that works on the raw waveform and runs in real time on a laptop CPU. The proposed model is based on an encoder-decoder architecture with skip connections. It is optimized in both the time and frequency domains, using multiple loss functions. Empirical evidence shows that it can remove various kinds of background noise, including stationary and non-stationary noise, as well as room reverb. Additionally, we suggest a set of data augmentation techniques, applied directly to the raw waveform, which further improve model performance and generalization.
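The idea of optimizing in both the time and frequency domains can be illustrated with a small sketch. This is not the loss used in the paper or its repository: it is a minimal standard-library example that assumes an L1 term on the waveform plus an L1 term on frame-wise magnitude spectra. The function names, frame size, hop, and weighting are all hypothetical.

```python
import cmath
import math


def frame_magnitudes(signal, frame_size=8, hop=4):
    """Magnitude spectra of overlapping frames via a naive DFT (no window)."""
    spectra = []
    for start in range(0, len(signal) - frame_size + 1, hop):
        frame = signal[start:start + frame_size]
        mags = []
        # Keep only the non-redundant bins of a real-valued signal.
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(frame)
            )
            mags.append(abs(coeff))
        spectra.append(mags)
    return spectra


def time_domain_l1(estimate, target):
    """Mean absolute error between two waveforms of equal length."""
    return sum(abs(e - t) for e, t in zip(estimate, target)) / len(target)


def spectral_l1(estimate, target, frame_size=8, hop=4):
    """Mean absolute error between frame-wise magnitude spectra."""
    est = frame_magnitudes(estimate, frame_size, hop)
    tgt = frame_magnitudes(target, frame_size, hop)
    diffs = [abs(a - b) for fe, ft in zip(est, tgt) for a, b in zip(fe, ft)]
    return sum(diffs) / len(diffs)


def combined_loss(estimate, target, spectral_weight=1.0):
    """Time-domain term plus a weighted frequency-domain term (assumed form)."""
    return time_domain_l1(estimate, target) + spectral_weight * spectral_l1(
        estimate, target
    )


# Toy signals: a clean sinusoid and the same sinusoid with alternating noise.
clean = [math.sin(2 * math.pi * n / 8) for n in range(32)]
noisy = [x + 0.1 * (-1) ** n for n, x in enumerate(clean)]

print(combined_loss(clean, clean))
print(combined_loss(noisy, clean))
```

A real training setup would compute the spectral term with a windowed STFT at several resolutions on tensors; the naive DFT here only keeps the example dependency-free.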
