rtvc's Introduction

rtvc

Real-time voice cloning using the ElevenLabs API.

https://rtvc.hunterparcells.com/

Disclaimer

This project was largely thrown together in a single afternoon. The UI is currently very crude, and many things may still be unoptimized.

Better solutions than this project probably exist, such as dedicated voice changer applications or instant voice cloning tools, but I thought this would be interesting to make.

Requirements

  • A browser that supports the Web Speech API. Chrome Desktop is recommended.
  • A subscription to ElevenLabs. This is normally $5/month, but they currently have an offer of $1 for your first month.
  • If you plan to play the audio through a microphone, you will need to route your desktop audio to a virtual mic output. I use VoiceMeeter for this.
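
The browser requirement can be checked at runtime before the app starts listening. The following is a minimal sketch, not code from this repository: the `supportsSpeechRecognition` helper and the `SpeechCapableScope` type are hypothetical names, and it assumes the speech recognition constructor is exposed either as `SpeechRecognition` or as the prefixed `webkitSpeechRecognition` that Chrome uses.

```typescript
// Minimal shape of the global object we inspect; real browsers expose far more.
interface SpeechCapableScope {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Returns true when either the standard or the webkit-prefixed
// speech recognition constructor is present on the given scope.
function supportsSpeechRecognition(scope: SpeechCapableScope): boolean {
  return (
    typeof scope.SpeechRecognition === "function" ||
    typeof scope.webkitSpeechRecognition === "function"
  );
}
```

In a browser, the global `window` object would be the argument (a type cast may be needed), and the UI could show a warning instead of the record button when the result is false.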

Running Locally

  1. Clone or download this repository.
  2. Install the required dependencies by running npm i (requires Node.js).
  3. Build the app using npm run build.
  4. Run using npm start.

For development, skip steps 3-4 and instead run npm run dev.

rtvc's People

Contributors

hparcells

rtvc's Issues

Add Customization Options for Stability and Similarity Boost

The stability and similarity_boost options for generating speech seem to be very useful after a little experimentation.

  • These options should be added to the UI and passed into the API call.
  • It would be nice if these options could use a React hook.
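
As a rough illustration of passing these options into the API call, the sketch below builds a request description with both values clamped to the 0 to 1 range. It is not this project's code: `buildSpeechRequest`, `clamp01`, and the two interfaces are hypothetical names, and the endpoint path, the `xi-api-key` header, and the `voice_settings` body field are assumptions based on the public ElevenLabs text-to-speech API that should be checked against the current documentation.

```typescript
// The two tunable options discussed in this issue.
interface VoiceSettings {
  stability: number;
  similarity_boost: number;
}

// Plain description of an HTTP request; hypothetical type for this sketch.
interface SpeechRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Clamp a slider value into the 0..1 range (assumed valid range); NaN becomes 0.
function clamp01(value: number): number {
  if (isNaN(value)) {
    return 0;
  }
  return Math.min(1, Math.max(0, value));
}

// Build the request for one synthesis call. Nothing is sent here;
// the caller decides how to perform the request.
function buildSpeechRequest(
  apiKey: string,
  voiceId: string,
  text: string,
  settings: VoiceSettings
): SpeechRequest {
  return {
    // Assumed endpoint shape: /v1/text-to-speech/{voice_id}
    url: "https://api.elevenlabs.io/v1/text-to-speech/" + encodeURIComponent(voiceId),
    method: "POST",
    headers: {
      "xi-api-key": apiKey,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      text: text,
      voice_settings: {
        stability: clamp01(settings.stability),
        similarity_boost: clamp01(settings.similarity_boost),
      },
    }),
  };
}
```

Keeping the request builder as a pure function means a React hook could hold the two slider values in state and call it whenever a transcript is ready, without the hook needing to know the request format.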

Audio Playback Stops Randomly/Stutters Until Manually Pressing the Red "Stop" Button

Describe the bug
The audio playback of the synthesized/cloned voice stops randomly and stutters.

To Reproduce
Steps to reproduce the behavior:

  1. Start RTVC
  2. Speak and wait for the program to transcribe your message
  3. Wait for the program to send the transcript to the ElevenLabs API
  4. The audio playback will start then immediately stop (usually after the first word) and will not continue (or will stutter) until you press the red "Stop" button.

This happens regardless of whether or not you continue to speak while the program is listening for speech.

Expected behavior
No stuttering or halting of audio playback.

Smartphone (please complete the following information):

  • Device: Samsung S21 FE
  • OS: Android 13
  • Browser: Chrome (latest)

Display Character Quota and Limit

The user's current remaining and allocated characters from their plan should be displayed somewhere. This information can be found at the /v1/user/subscription endpoint. API documentation can be found at https://api.elevenlabs.io/docs.
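
A minimal sketch of turning the subscription response into a displayable quota is shown below. The field names `character_count` (characters used) and `character_limit` (characters allocated) are assumptions about the response shape and should be confirmed against the API documentation; `summarizeQuota` and both interfaces are hypothetical names.

```typescript
// Subset of the subscription response used here; field names are an assumption.
interface SubscriptionInfo {
  character_count: number; // characters used so far
  character_limit: number; // characters allocated by the plan
}

interface QuotaSummary {
  used: number;
  limit: number;
  remaining: number;
  label: string;
}

// Derive the remaining characters and a short label for the UI.
// Remaining is floored at zero in case usage exceeds the limit.
function summarizeQuota(info: SubscriptionInfo): QuotaSummary {
  const remaining = Math.max(0, info.character_limit - info.character_count);
  return {
    used: info.character_count,
    limit: info.character_limit,
    remaining: remaining,
    label: remaining + " of " + info.character_limit + " characters remaining",
  };
}
```

The parsed JSON body of the subscription request would be passed in, and the `label` could be rendered next to the API key field.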

As part of this, the transcript sent through the API should be limited to 5000 characters. ElevenLabs' website says "The maximum number of characters you can generate in a single request on the Platform is 2,500." but the actual limit seems to be 5000, as that is the text box's limit. Perhaps this information is outdated?
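
One simple way to enforce such a limit is to truncate the transcript before it is sent. This is only a sketch: `limitTranscript` and `MAX_REQUEST_CHARACTERS` are hypothetical names, the 5000 value is the unconfirmed figure from this issue, and length is counted in UTF-16 code units, which may differ from how the API counts characters.

```typescript
// Unconfirmed per-request limit taken from the issue text above.
const MAX_REQUEST_CHARACTERS = 5000;

// Return the transcript unchanged when it fits, otherwise cut it
// to the first maxLength UTF-16 code units.
function limitTranscript(
  transcript: string,
  maxLength: number = MAX_REQUEST_CHARACTERS
): string {
  if (transcript.length <= maxLength) {
    return transcript;
  }
  return transcript.slice(0, maxLength);
}
```

Making the limit a parameter keeps the change small if the real limit turns out to be 2,500.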

ElevenLabs API Key Details

For data validation and condition checks, knowing the exact length and format of API keys would be useful. Judging from my own API key, it seems to be a 32-character alphanumeric (lowercase only) string. If anyone has other details or can confirm this, please let me know.

For now, the useEffect(() => {}, [apiKey]) hook only fetches voices when the API key entered in the input field is exactly 32 characters long.
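
If the format described above is confirmed, the length check could be tightened into a pattern check. The sketch below assumes keys are exactly 32 lowercase alphanumeric characters, which is based on a single key and is not verified; `looksLikeApiKey` and `API_KEY_PATTERN` are hypothetical names.

```typescript
// Assumed format: exactly 32 characters, each a lowercase letter or a digit.
const API_KEY_PATTERN = /^[a-z0-9]{32}$/;

// Returns true when the candidate matches the assumed key format.
// This only checks the shape; it cannot tell whether the key is valid.
function looksLikeApiKey(candidate: string): boolean {
  return API_KEY_PATTERN.test(candidate);
}
```

This is stricter than the current length-only check, so it would reject real keys if the assumption about the character set turns out to be wrong.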
