Overdub API

Introduction

Welcome to Overdub API. This API is an automated tool that creates narrated, subtitled videos from raw content with minimal guidance. It streamlines the conversion of raw video into accessible, social-media-ready content.

Getting Started

Prerequisites

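To use this software, you will need Python, an ElevenLabs API key, and a Google Cloud project with the gcloud CLI installed and authenticated. The steps below cover each of these.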

Cloning the Repository

git clone https://github.com/your-username/Overdub-API.git
cd Overdub-API

Setting Up the Virtual Environment

It's best practice to create a virtual environment to manage dependencies for the project:

python -m venv venv
source venv/bin/activate  # On Windows use venv\Scripts\activate

Install Requirements

Install the required packages for the project using the provided requirements.txt file:

pip install -r requirements.txt

Setup Environment Variables

Rename the .env.example file to .env and add your ElevenLabs API key to it.

ELEVENLABS_API_KEY='your-api-key'
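As a quick sanity check before running anything, the key can be read back and validated. The sketch below uses only the standard library. The helper names parse_env_text and load_api_key are hypothetical, and the project itself may load the .env file differently (for example through a dotenv library), which this README does not confirm.

```python
import os
from pathlib import Path


def parse_env_text(text):
    """Parse simple KEY=value lines, ignoring blank lines and # comments."""
    values = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes, as in ELEVENLABS_API_KEY='your-api-key'.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_api_key(env_path=".env"):
    """Return the key from the process environment, falling back to the .env file.

    Hypothetical helper: not part of the project's documented code.
    """
    key = os.environ.get("ELEVENLABS_API_KEY")
    path = Path(env_path)
    if not key and path.is_file():
        key = parse_env_text(path.read_text(encoding="utf-8")).get("ELEVENLABS_API_KEY")
    if not key or key == "your-api-key":
        raise RuntimeError("ELEVENLABS_API_KEY is missing or still set to the placeholder")
    return key
```

Calling load_api_key() from the project root raises a clear error if the placeholder value was never replaced.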

Google Cloud Configuration

Ensure you have installed the gcloud CLI and set it up for your local environment. Refer to the Google Cloud documentation to authenticate and configure it. Then configure your Google Cloud project to match the settings in the settings.py file.

Configure settings.py

Adjust the settings.py file to match your requirements. The file includes parameters for video processing, API keys, and other configurations.

How it Works

Content Processing

The software processes each video file listed in the input CSV. By default, this file should be located in the __temp__/ directory. The CSV should contain video URLs or file paths, with an optional content direction field to describe what is happening in the video, which can improve the results.
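To make the input format concrete, here is a minimal sketch of reading such a CSV with the standard library. The column names "source" and "content_direction" and the function read_jobs are assumptions for illustration. Check the project's sample CSV or settings.py for the actual headers.

```python
import csv
import io


def read_jobs(csv_file):
    """Yield (source, content_direction) pairs from an open CSV file.

    The column names "source" and "content_direction" are assumptions.
    """
    for row in csv.DictReader(csv_file):
        source = (row.get("source") or "").strip()
        if not source:
            continue  # skip rows without a video URL or file path
        # The content direction is optional, so an empty cell becomes None.
        direction = (row.get("content_direction") or "").strip() or None
        yield source, direction


# In-memory sample standing in for a CSV file placed in __temp__/.
sample = io.StringIO(
    "source,content_direction\n"
    "https://example.com/clip.mp4,A chef plating a dessert\n"
    "videos/raw/demo.mp4,\n"
)
jobs = list(read_jobs(sample))
```

Each entry in jobs pairs a video location with its optional description, so rows without a content direction are still processed.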

Metadata Generation

The API uses Google's Vertex AI to generate metadata. It classifies content through image data extracted from the video, complemented with the optional content direction field if provided.

Overdubbing and Subtitles

Once metadata is obtained, ElevenLabs' capabilities are used to overdub the video with natural-sounding speech. Subsequent steps involve generating dub timestamps for the overdub and creating dynamic subtitles that fit the requirements for various social media platforms, all of which are configurable within the settings.py file.
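The subtitle step can be pictured as grouping timed words into short cues. The sketch below assumes word-level timestamps given as (word, start_seconds, end_seconds) tuples and renders them in the SRT format. Both the input shape and the choice of SRT are assumptions, since this README does not state which subtitle format or timestamp structure the project uses. The max_words parameter stands in for the kind of per-platform setting that settings.py might expose.

```python
def format_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def build_srt(words, max_words=3):
    """Group (word, start, end) tuples into cues of at most max_words words.

    Hypothetical helper: the input shape is an assumption, not the project's API.
    """
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(word for word, _, _ in chunk)
        start, end = chunk[0][1], chunk[-1][2]
        cues.append(
            f"{len(cues) + 1}\n"
            f"{format_timestamp(start)} --> {format_timestamp(end)}\n"
            f"{text}\n"
        )
    # SRT cues are separated by a blank line.
    return "\n".join(cues)
```

Smaller max_words values produce the fast, short captions common on vertical video platforms. Larger values read more like conventional subtitles.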

Output

Processed videos are written to the __temp__/02_processed directory. Any generated metadata is stored in __temp__/01_metadata.
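If you want to locate results programmatically, the two directories can be mapped from a source video as in the sketch below. The directory names come from this README. The file naming scheme (the source file's stem plus .json or .mp4) and the function output_paths are assumptions, so verify them against the files the tool actually writes.

```python
from pathlib import Path

TEMP_DIR = Path("__temp__")
METADATA_DIR = TEMP_DIR / "01_metadata"
PROCESSED_DIR = TEMP_DIR / "02_processed"


def output_paths(source):
    """Return (metadata_path, processed_path) for a source video path or URL.

    The naming scheme (<stem>.json and <stem>.mp4) is an assumption.
    """
    # Drop any URL query string before taking the file stem.
    stem = Path(source.split("?")[0]).stem or "video"
    return METADATA_DIR / f"{stem}.json", PROCESSED_DIR / f"{stem}.mp4"
```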

Usage

To start processing your content, place your CSV file in the __temp__/ directory, or specify a different location in settings.py, and run the main script:

python main.py

Make sure your environment variables and Google Cloud settings are correctly configured before running the script.

Contributing

Currently, this project is self-maintained. If you are interested in contributing, please open a pull request. Thank you.

License

This project is released under the MIT License.

Disclaimer

By using this API, you agree to the terms and conditions of both ElevenLabs and Google Cloud services. We strive to make automated video processing as seamless as possible. With this tool, we simplify the path from raw content to engaging digital media. Should you encounter any issues or have suggestions, please open up an issue in the repository. Happy processing!

Contributors

oscaem
