whitefu's Projects

vc-lm

Convert any person's voice timbre into thousands of different timbres.

vi-svs

Singing voice synthesis built with VITS and Opencpop; differs from VISinger.

video-retalking

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

viscpm

Chinese and English Multimodal Large Model Series (Chat and Paint), based on the CPM foundation models

visdom

A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and Numpy.

vits

VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

vits_chinese

vits chinese, tts chinese, tts mandarin: described by its author as the easiest-to-train, best-sounding speech synthesis system in history

vocos

Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis

voice_datasets

🔊 A comprehensive list of open-source datasets for voice and sound computing (95+ datasets).

voicecraft

Zero-Shot Speech Editing and Text-to-Speech in the Wild

voicesmith

VoiceSmith makes training text to speech models easy.

vqmivc

Official implementation of VQMIVC: One-shot Voice Conversion @ Interspeech 2021

wav2lip

This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020.

wavegrad2

Unofficial PyTorch Implementation of WaveGrad2

wenet

Production First and Production Ready End-to-End Speech Recognition Toolkit

whisper_streaming

Whisper realtime streaming for long speech-to-text transcription and translation

whisperx

WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)

wizardlm

Family of instruction-following LLMs powered by Evol-Instruct: WizardLM, WizardCoder

xphonebert

XPhoneBERT: A Pre-trained Multilingual Model for Phoneme Representations for Text-to-Speech (INTERSPEECH 2023)

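yayi2

YAYI 2 is a new-generation open-source large language model developed by Wenge Technology (中科闻歌), pretrained on more than 2 trillion tokens of high-quality, multilingual corpus. (Repo for YaYi 2 Chinese LLMs)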

youtube-transcript-api

This is a Python API that lets you get the transcript/subtitles for a given YouTube video. It also works for automatically generated subtitles, and it requires neither an API key nor a headless browser, unlike Selenium-based solutions.
