autoreel's Introduction

Unified-Multimodal Transformer Pipeline for Political Content Creation: TikTok Reel Generator (Under Development)

Welcome to the Unified-Multimodal Transformer for Political Content Generation repository! This tool uses transformers and multimodal learning to create engaging TikTok reels from political videos. It aims to identify the most interesting and click-worthy segments of a video, edit them automatically, and generate a reel that is both captivating and informative.

output_video.mp4

To Do:

  • (In progress) Retrain the UMT (Unified Multi-modal Transformer) model, https://github.com/TencentARC/UMT, on political content to extract segments with high click potential.
  • Implement auto-posting to social channels.
  • Add continuous monitoring of channels for interview- and podcast-length videos.
  • Automatically crop videos to a 9:16 aspect ratio.
  • Integrate the Whisper speech-to-text engine (English, Urdu, Hindi, etc.) and an ASR model for cutting cleanly around speech.
  • Generate .srt files.
  • Chop SRT captions to fewer than 8 words and 2 lines per frame.
  • Animate the SRT captions on top of the video in Python.
  • Add a range of Instagram-style filters.
  • Add an interface for downloading videos to upload to accounts.
  • Wrap the app in Gradio.
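
The subtitle items above (generate .srt files, keep each caption under 8 words and within 2 lines) can be sketched in plain Python. This is only an illustration, not the project's implementation: the function names, the `(word, start, end)` tuple format for word timings, and the rule that cues of three words or fewer stay on one line are all assumptions.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def chunk_words(words, max_words=7):
    """Group (word, start, end) tuples into cues of at most max_words words."""
    return [words[i:i + max_words] for i in range(0, len(words), max_words)]


def two_lines(tokens):
    """Lay a cue out on at most two lines (assumed rule: short cues stay on one)."""
    if len(tokens) <= 3:
        return " ".join(tokens)
    mid = (len(tokens) + 1) // 2
    return " ".join(tokens[:mid]) + "\n" + " ".join(tokens[mid:])


def to_srt(words, max_words=7):
    """Build SRT text from word-level timings (hypothetical input format)."""
    cues = []
    for index, chunk in enumerate(chunk_words(words, max_words), start=1):
        start, end = chunk[0][1], chunk[-1][2]
        text = two_lines([word for word, _, _ in chunk])
        cues.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues in the SRT format.
    return "\n".join(cues)


# Made-up word timings, one word per second, for demonstration only.
demo_words = [(f"w{i}", float(i), i + 0.5) for i in range(9)]
demo_srt = to_srt(demo_words)
```

The default of `max_words=7` reflects the "< 8 words" target. Word-level timings would have to come from the speech-to-text step, whose output format is not specified here.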

The Power of Unified-Multimodal Transformers

I built this project to understand how transformers are constructed and to explore a key application: automating political campaigning. The tool uses Unified-Multimodal Transformers, which process several forms of data at once. This lets the model analyze the speech, text, and visual cues in political content together, giving it a deeper understanding of a video's context and its potential impact on viewers.
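
To make the segment-selection idea concrete, here is a minimal sketch. It assumes the model emits one relevance score per fixed-length clip; this repository does not document the model's output format, so the scores, the clip length, and the `best_window` helper are all hypothetical. Given such scores, the most promising reel is the contiguous run of clips with the highest total score.

```python
def best_window(scores, window):
    """Return (start_index, total) of the contiguous window with the highest sum.

    scores: per-clip relevance scores (hypothetical model output).
    window: number of consecutive clips that make up one reel.
    """
    if not 0 < window <= len(scores):
        raise ValueError("window must be between 1 and len(scores)")
    total = sum(scores[:window])
    best_start, best_total = 0, total
    for start in range(1, len(scores) - window + 1):
        # Slide the window one clip to the right: add the new clip, drop the old one.
        total += scores[start + window - 1] - scores[start - 1]
        if total > best_total:
            best_start, best_total = start, total
    return best_start, best_total


# Made-up scores for seven 2-second clips; pick the best 3-clip (6-second) reel.
clip_seconds = 2
demo_scores = [1, 3, 9, 8, 7, 2, 1]
start_clip, total_score = best_window(demo_scores, window=3)
reel_start = start_clip * clip_seconds
reel_end = reel_start + 3 * clip_seconds
```

The sliding sum keeps the search linear in the number of clips. Ties go to the earliest window, which is an arbitrary choice.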

Human-like Editing Capabilities

One of the key features of the tool is its ability to edit the extracted video segments the way a person would. The goal is for the final TikTok reels to look professional and polished, which makes them more appealing to viewers and raises the chances of the content going viral.

Conclusion

Coupled with other available technologies such as LLMs, this tool could change the game for political campaigning by automating the spread of content. The cost of training the transformer to identify segments is too high for now, so the project has been paused and made public. I plan to develop it further later.

Installation:

  1. Set up an Azure VM
  2. Install and set up Docker
  3. Run the following command:

pip3 install -r requirements.txt

All missing MediaPipe files can be found here.

Compiled AutoFlip Docker Image can be found at this link
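
AutoFlip reframes video in a content-aware way. For comparison, the simplest way to reach the 9:16 target from the To Do list is a fixed centered crop. The sketch below only computes the crop rectangle. The function name is hypothetical, and rounding the dimensions down to even numbers is an assumption made because many video encoders expect even sizes.

```python
def center_crop_9_16(width, height):
    """Return (x, y, crop_w, crop_h) for the largest centered 9:16 crop."""
    target_w, target_h = 9, 16
    if width * target_h > height * target_w:
        # Frame is wider than 9:16: keep the full height and trim the sides.
        crop_h = height
        crop_w = height * target_w // target_h
    else:
        # Frame is 9:16 or narrower: keep the full width and trim top and bottom.
        crop_w = width
        crop_h = width * target_h // target_w
    # Round down to even dimensions (assumption: the encoder needs even sizes).
    crop_w -= crop_w % 2
    crop_h -= crop_h % 2
    x = (width - crop_w) // 2
    y = (height - crop_h) // 2
    return x, y, crop_w, crop_h


# A 1920x1080 landscape frame keeps its full height and a 606-pixel-wide strip.
landscape_box = center_crop_9_16(1920, 1080)
```

A centered crop ignores where the speaker is in the frame, which is the limitation that content-aware reframing is meant to address.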

autoreel's People

Contributors

harisrab
