wendongj,wendong,github

micronet

micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / "Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference") and low-bit (≤2b) ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op adaptation (upsample), dynamic_shape.

mst-plus-plus

"MST++: Multi-stage Spectral-wise Transformer for Efficient Spectral Reconstruction" (CVPRW 2022) & (Winner of NTIRE 2022 Challenge on Spectral Reconstruction from RGB)

mtfaa-net

Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement

nafnet

The state-of-the-art image restoration model without nonlinear activation functions.

neural_sp

End-to-end ASR/LM implementation with PyTorch

nlp_chinese_corpus

Large Scale Chinese Corpus for NLP

paddlespeech

Easy-to-use Speech Toolkit including SOTA/Streaming ASR with punctuation, influential TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

paper-reading

Paragraph-by-paragraph close reading of classic and new deep learning papers

parallelwavegan

Unofficial Parallel WaveGAN (+ MelGAN & Multi-band MelGAN & HiFi-GAN & StyleMelGAN) with PyTorch

percepnet

(Under construction) Unofficial implementation of PercepNet: A Perceptually-Motivated Approach for Low-Complexity, Real-Time Enhancement of Fullband Speech

prml_learning

learning formula

python-speech-enhancement

A Python library for speech enhancement

pytorch-cifar100

Practice on CIFAR-100 (ResNet, DenseNet, VGG, GoogleNet, InceptionV3, InceptionV4, Inception-ResNetv2, Xception, Resnet In Resnet, ResNext, ShuffleNet, ShuffleNetv2, MobileNet, MobileNetv2, SqueezeNet, NasNet, Residual Attention Network, SENet, WideResNet)

pywsj0-mix

wsj0-{2, 3, 4, 5} mix generation scripts, in Python.

research-and-analysis-of-speech-enhancement-or-dereverberation

This repository contains material on speech enhancement and dereverberation. On the one hand, I summarize this work to deepen my own understanding. On the other hand, I hope that everyone interested in speech enhancement, beginner or expert, will ask me questions so that we can make progress together. Much of my summary is still rough, and I welcome corrections!

room-simulation

Supporting code for the paper "A study on more realistic room simulation for far-field keyword spotting".

sinet

Unofficial TensorFlow 2 implementation of SINet: Extreme Lightweight Portrait Segmentation Networks with Spatial Squeeze Modules and Information Blocking Decoder

skip-dpcrn-samples

skiprnn-2017-telecombcn

Skip RNN: Learning to Skip State Updates in Recurrent Neural Networks (ICLR 2018)

speaker-verification-pytorch

speaker_extraction

Target speaker extraction and verification for multi-talker speech

specmix

This is the project for the Interspeech 2021 paper "SpecMix: A Mixed Sample Data Augmentation method for Training with Time-Frequency Domain Features"

speech-resources

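Speech-related labs, companies, resources, internships, and more; recommendations and self-nominations are welcome (listed in no particular order)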

speech-separation-paper-tutorial

Must-read papers on speech separation based on neural networks

speech_dataset

The dataset of Speech Recognition

squeezeformer

Squeezeformer: An Efficient Transformer for Automatic Speech Recognition

statistical-learning-method_code

Hand-written implementations of all the algorithms in Li Hang's book "Statistical Learning Methods" (《统计学习方法》)

summaryofloansuspension

A compilation of loan-suspension notices from provinces and cities across China

svoice

We provide a PyTorch implementation of the paper "Voice Separation with an Unknown Number of Multiple Speakers", in which we present a new method for separating a mixed audio sequence where multiple voices speak simultaneously. The new method employs gated neural networks that are trained to separate the voices at multiple processing steps, while maintaining the speaker in each output channel fixed. A different model is trained for every number of possible speakers, and the model with the largest number of speakers is employed to select the actual number of speakers in a given sample. Our method greatly outperforms the current state of the art, which, as we show, is not competitive for more than two speakers.

tensorflowasr

A collection of end-to-end speech recognition models for TensorFlow 2, with a real-time factor (RTF) of around 0.1 / Mandarin State-of-the-art Automatic Speech Recognition in TensorFlow 2

wendong's Projects
