alexandajerry Goto Github PK

followers: 39 following: 112 repos: 74 gists: 2

Name: Alexanda

Type: User

Bio: Postgraduate student of Phonetics

Hi there 👋

I'm AlexandaJerry, a postgraduate student studying phonetics.

🔭 I’m currently working on speech-related research.
🌱 I’m currently learning Python programming and deep learning.
👯 I’m interested in Praat scripting and R visualization.

Alexanda's Projects

-speech-signal-processing-experiment-tutorial-_python

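The code accompanying 《语音信号处理试验教程》 (Speech Signal Processing Experiment Tutorial, Liang Ruiyu et al.) is mostly written in MATLAB. Since Python is now popular, this project reimplements most of that material in Python.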

alexandajerry

annotated_deep_learning_paper_implementations

🧑‍🏫 50! Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, ...), gans(cyclegan, stylegan2, ...), 🎮 reinforcement learning (ppo, dqn), capsnet, distillation, ... 🧠

audio-preprocessing-scripts

Scripts for automated dataset creation

auto_labeling_for_bert_vits2

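This project handles data preprocessing. The first step processes the collected audio, using FunASR timestamps to strip out empty background audio. It also includes the labels prepared before the data is fed to BERT.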

automatic_speech_annotator

Automatic speech annotator that processes speech with voice activity detection, overlapping speech detection, speaker diarization, and automatic speech recognition

awesome-data-annotation

A list of tools for annotating data, managing annotations, etc.

awesome-python-scientific-audio

Curated list of Python software and packages related to scientific research in audio

book-text-to-speech

A book about Text-to-Speech (TTS) in Chinese.

bulk_transcribe_youtube_videos_from_playlist

Easily take an entire YouTube playlist and turn it into high quality transcripts using Whisper.

capswriter-offline

The offline version of CapsWriter, a handy voice input tool for PC

charsiu

Charsiu: A neural phonetic aligner.

chatpaper

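Use ChatGPT to summarize arXiv papers. Speeds up the whole research workflow by using ChatGPT for full-paper summarization, professional translation, polishing, peer review, and responses to reviewers.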

chenyme-aavt-

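A fully automatic (audio) video translation project. It uses Whisper to recognize speech, a large AI model to translate the subtitles, and finally merges the subtitles with the video to produce the translated video.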

coursera-ml-andrewng-notes

Personal notes on Andrew Ng's machine learning course

cross-modal-bert

CM-BERT: Cross-Modal BERT for Text-Audio Sentiment Analysis (MM2020)

ctc-forced-aligner

Text to speech alignment using CTC forced alignment

dataset_generator_for_vits

A VITS Dataset Generation Tool for Converting Video to Short Audio Based on Damo Academy Video Cutting Technology

deeplearning_ai_books

deeplearning.ai (notes and resources for Andrew Ng's deep learning courses)

dlhlp2020-spring

Deep Learning on Human Language Processing (2020, Spring) NTU-EECS

ears_dataset

Expressive Anechoic Recordings of Speech (EARS)

emotion2vec

Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

emotivoice

EmotiVoice 😊: a Multi-Voice and Prompt-Controlled TTS Engine

faster-whisper-gui

faster_whisper GUI with PySide6

fricative_analysis

ft-w2v2-ser

Official implementation for the paper Exploring Wav2vec 2.0 fine-tuning for improved speech emotion recognition

galgame-engine-collect

Everything about visual novels, aiming to build the most complete resource collection on the web

gmkvextractguiflatpak

📦 Flatpak Package of gMKVExtractGUI, a small GUI utility to extract tracks, chapters and CUE sheets from mkv files

gpt-sovits

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

label-studio

Label Studio is a multi-type data labeling and annotation tool with standardized output format
