endimionzf
Type: User
Text-Prompted Generative Audio Model
The most powerful and modular stable diffusion GUI, api and backend with a graph/nodes interface.
Real-time face swap for PC streaming or video calls
Analyze videos, perform detections, index frames & detected objects, search by examples
Code for the paper Hybrid Spectrogram and Waveform Source Separation
ffmpeg shell scripts
Repository of useful FFmpeg commands for archivists!
Flowframes Windows GUI for video interpolation using DAIN (NCNN) or RIFE (CUDA/NCNN)
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
This repository does not contain code; its purpose is issue tracking and the wiki.
Instant voice cloning by MyShell.
A modern, web-based photo management server. Run it on your home server and it will let you find the right photo from your collection on any device. Smart filtering is made possible by object recognition, face recognition, location awareness, color analysis and other ML algorithms.
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Advanced data structures for handling temporal segments with attached labels.
Python and OpenCV-based scene cut/transition detection program & library.
Voice data <= 10 mins can also be used to train a good VC model!
The best looking and most functional webui for RVC related tasks. See website for UI demo:
Extract scenecuts from video files using ffmpeg
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Stash App's own OpenSource video indexing and Perceptual Hashing MetaData API
An editing tool that uses AI to transcribe, understand content and search for anything in your footage, integrated with ChatGPT and other AI models
A Gradio web UI for Large Language Models. Supports transformers, GPTQ, AWQ, EXL2, llama.cpp (GGUF), Llama models.
Minimal extension of OpenAI's Whisper adding speaker diarization with special tokens
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Whisper as a Service (GUI and API with queuing for OpenAI Whisper)
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
Port of OpenAI's Whisper model in C/C++
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Open source, local, and self-hosted highly optimized language inference server supporting ASR/STT, TTS, and LLM across WebRTC, REST, and WS