whitefu Goto Github PK

followers: 37 following: 430 repos: 486 gists: 0

Type: User

Bio: speech synthesis & voice conversion & speech enhancement

whitefu's Projects

versatile_audio_super_resolution

Versatile audio super resolution (any -> 48kHz) with AudioSR.

vi-svs

Singing voice synthesis built with VITS and Opencpop; different from VISinger.

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

viscpm

Chinese and English Multimodal Large Model Series (Chat and Paint), based on the CPM foundation models

visdom

A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

vits2_pytorch

Unofficial VITS2 TTS implementation in PyTorch

vits_chinese

vits chinese, tts chinese, tts mandarin: the easiest-to-train speech synthesis system ever, with the best audio quality

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

voicecraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

voicesmith

VoiceSmith makes training text to speech models easy.

vqmivc

Official implementation of VQMIVC: One-shot Voice Conversion @ Interspeech 2021

wav2lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.

wavegrad2

Unofficial PyTorch Implementation of WaveGrad2

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

wetextprocessing

Text Normalization & Inverse Text Normalization

whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

wizardlm

Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder

xphonebert

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)

yayi2

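YAYI 2 is a new-generation open-source large language model developed by 中科闻歌 (Wenge Technology), pretrained on more than 2 trillion tokens of high-quality, multilingual corpora. (Repo for YaYi 2 Chinese LLMs)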

This is a Python API that lets you get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.
