mortyzhou-shef-bit
Name: yhzhouowo
Type: User
Bio: Living with attention is all we need.
Location: UoS -> NUS & BIT
My Academic Homepage
ASR for dysarthric speakers with Kaldi
AudioLDM: Generate speech, sound effects, music and beyond, with text.
Autoregressive Predictive Coding: An unsupervised autoregressive model for speech representation learning
A curated list of research in Systems for Edge Intelligence and Computing (Edge MLSys), including frameworks, tools, repositories, etc. Paper notes are also provided.
Reading list for research topics in embodied vision
A curated list of Multimodal Related Research.
An ultimately comprehensive paper list of Vision Transformer/Attention, including papers, codes, and related websites
CMU MultimodalSDK is a machine learning platform for development of advanced multimodal models as well as easily accessing and processing multimodal datasets.
A curated list of research papers and resources on code-switching
A toolkit for non-parallel voice conversion based on vector-quantized variational autoencoder
My own notes, plus links collected from others
Dialog Evaluation Paper List: covers multiple dialog tasks
DiffWave is a fast, high-quality neural vocoder and waveform synthesizer.
Source code for "DYGAN-VC: IMPROVING SPEECH CONTENT PRESERVATION FOR GAN VOICE CONVERSION USING DYNAMIC CONVOLUTION"
End-to-End Speech Processing Toolkit
ESPnet Model Zoo
Facebook AI Research Sequence-to-Sequence Toolkit written in Python.
Include Basis-MelGAN, MelGAN, HifiGAN and Multiband-HifiGAN, maybe NHV in the future.
Download a large file from Google Drive (curl/wget fails because of the security notice).
Google Drive direct download of big files
Official PyTorch implementation of the paper "Image-to-image Translation via Hierarchical Style Disentanglement" (CVPR 2021 Oral).
INTERSPEECH 2019 Tutorial Materials
A pure Python module for reading and writing Kaldi ark files
The website for the CMU Language Technologies Institute low resource NLP bootcamp 2020
Calculation of MCD (dB) between two speech waveforms
Implementation of meta-transfer-learning for ASR and LM (ACL 2020)
Implementation of "MOSNet: Deep Learning based Objective Assessment for Voice Conversion"
Non-Autoregressive Predictive Coding