
AutoMash

Automatically create YouTube mashups. Given a list of videos and a text, AutoMash cuts the videos together so that the speakers in the videos appear to say the given text. This is best understood by considering the following two examples.

First example. This was created automatically from this video and the following text.

In today's video I'm gonna tell you why you will waste four years of your life when you study computer science. With a computer science degree you can easily get outsourced within your first few years. The reason you clicked on this video is because you wanted to know how to get a job. However a computer science degree will not do that for you, study math or physics instead.

Second example. This was created automatically from this video and the following text.

In the mainstream it's always talked about Napoleon's invasion of America. However, the majority of people who have researched this know that he actually managed to destroy the American colonies. There's a lot of potential here, and I've never really seen anyone doubt that.

How it works

AutoMash will download the given YouTube videos and then use a speech-to-text tool to get a transcript of each video. Currently, AutoMash is compatible with three speech-to-text tools: Vosk, DeepSpeech and IBM Watson. After transcribing the videos, AutoMash uses a greedy algorithm to find the longest sequences of words in the transcript that fit the words in the given text. Finally, AutoMash extracts the video sequences that correspond to these sequences of words and cuts them together into the final video.
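The greedy matching step can be sketched as follows. This is a minimal illustration under assumptions, not AutoMash's actual implementation: the function name plan_greedy and its return format (pairs of transcript position and match length) are made up for this example.

```python
def plan_greedy(transcript, target):
    """Greedily cover `target` with the longest word sequences that
    appear contiguously in `transcript` (both given as word lists).
    Returns a list of (start_index_in_transcript, length) pairs."""
    plan = []
    i = 0
    while i < len(target):
        best = None
        # Try every transcript position and keep the longest match
        # against the remaining target words.
        for s in range(len(transcript)):
            n = 0
            while (s + n < len(transcript) and i + n < len(target)
                   and transcript[s + n] == target[i + n]):
                n += 1
            if n > 0 and (best is None or n > best[1]):
                best = (s, n)
        if best is None:
            raise ValueError(f"word {target[i]!r} not found in transcript")
        plan.append(best)
        i += best[1]
    return plan
```

For the transcript "you will waste four years of your life" and the target "you will waste your life", this produces two snippets: "you will waste" and "your life". Longer matched sequences mean fewer cuts, which is why the video sounds better when your text reuses long phrases from the transcript.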

Vosk, DeepSpeech or IBM Watson

TL;DR: Use Vosk. If this yields bad results, try IBM Watson. I do not recommend DeepSpeech.

You can use either Vosk, DeepSpeech or IBM Watson to get video transcripts. Vosk and DeepSpeech are both easier to configure than IBM Watson and free to use without limitations. However, for the models I tried (see below), DeepSpeech both produces worse results and is slower than Vosk and IBM Watson: transcribing one minute of video with DeepSpeech takes about a minute of real time on my Ryzen 5 1600. Hence, it is generally recommended to use Vosk over DeepSpeech. IBM Watson seems to yield slightly better results than Vosk. The downsides of IBM Watson are that configuring it takes a bit more work and that it can only be used to transcribe 500 minutes of audio per month.

How to install

Install the repository

Get the repository:

  • Clone the repository git clone https://github.com/ocatias/AutoMash

  • Go to directory cd AutoMash

  • Create a folder for the virtual environment mkdir virtual_env

  • Create virtual environment python3 -m venv virtual_env

  • Activate virtual environment:

    • For Windows: .\virtual_env\Scripts\activate.bat
    • For Linux: source virtual_env/bin/activate
  • If you want text subtitles in your videos, set add_subtitles in config.yaml to 1 and install ImageMagick before installing the other dependencies. If you do not want text subtitles, set add_subtitles to 0.

  • Install dependencies pip install -r requirements.txt

  • Next you need to configure one of Vosk, DeepSpeech or IBM Watson.

Configure Vosk

Download the language model into the AutoMash folder curl -LO http://alphacephei.com/vosk/models/vosk-model-en-us-0.22.zip and unzip it. Then open the config.yaml file and set transcription_tool to vosk and model_path to the path of the unzipped folder, for example to vosk-model-en-us-0.22.
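For example, the relevant entries in config.yaml would then look roughly like this (key names taken from the instructions above; any other settings in the file are omitted here):

```yaml
transcription_tool: vosk
model_path: vosk-model-en-us-0.22   # path of the unzipped model folder
```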

Configure DeepSpeech

Download the language model into the AutoMash folder curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.9.3/deepspeech-0.9.3-models.pbmm. Then open the config.yaml file and set transcription_tool to deepspeech and model_path to the path of the model, for example to deepspeech-0.9.3-models.pbmm.

Configure IBM Watson

You will need a free account which will allow you to transform 500 minutes of audio into text per month for free.

  1. Register here. Account creation sometimes fails; this seems to be a faulty IBM anti-fraud measure. It helps to try different email addresses, private browsing / incognito mode, or different browsers. For me it worked with Firefox in private browsing and a Gmail address.
  2. Go to the tutorial here and follow the points under IBM Cloud® only to get the API key and URL.
  3. Create a file called watson.key containing these two lines:
Url
API Key
  4. In config.yaml set transcription_tool to watson.

How to use AutoMash

Before you can create the mashup you will need to decide on a list of YouTube videos that you want to use. Next we get a transcript for these videos from your selected transcription tool, then we write the text for the final video and create a video plan. Afterwards we create the video and, if necessary, finetune the cuts.

Transcribe the video

Use python src\create_lexicon.py PROJECT_NAME YT_URL_1 YT_URL_2 ... to send the audio to your selected transcription tool and retrieve the transcripts. Here PROJECT_NAME is the name of your project, which will be used to name newly created files, and YT_URL_1 YT_URL_2 ... is a list of YouTube video URLs separated by spaces.

Create a video plan

The above step will have created a file called PROJECT_NAME.txt in the AutoMash\tmp directory. This text file, called the lexicon, contains all the video transcripts and is there to help you write the text for the mashup video. When writing your text, make sure to only use words that appear in the lexicon; the final video will sound better if you use longer sequences from the lexicon. You can also use punctuation to signify where pauses should be inserted: , gives a short pause, ; a medium pause and . a long pause (pause lengths can be configured in config.yaml). When you have decided on a text, use python src\plan_video.py PROJECT_NAME "TEXT" to create the video plan. Here TEXT is the text you just came up with; note that it needs to be wrapped in ". For example, if your text is Hello, this is a text then you can create the video plan with python src\plan_video.py PROJECT_NAME "Hello, this is a text".

Create the video

You can now create the final video with python src\create_video.py PROJECT_NAME. This will create the video named PROJECT_NAME.mp4 directly in the AutoMash directory.

If you are unhappy with how the video turns out you can either change the text by doing the steps under Create a video plan again, or you can finetune the cuts and length of video sequences as explained in the section below.

(Optional) Manually edit the video plan

Maybe some of the cuts in the video bother you, for example a video sequence ends too quickly or starts too late. In that case you can manually edit the video plan to fix this. The video plan can be found under tmp\PROJECT_NAME_video_plan.txt and has the following format:

some words	0	0	0
some more words even longer sentences	0	0	0

Here each line corresponds to a video sequence that will be cut directly out of a YouTube video. The three numbers influence the length of this sequence. The first number controls the beginning of the sequence: setting it to 0.5 means the sequence starts 0.5 seconds earlier, and setting it to -0.5 means it starts 0.5 seconds later. The second number controls the end of the sequence: setting it to 0.5 lets the sequence end 0.5 seconds later, and setting it to -0.5 makes it end 0.5 seconds earlier. The last number controls the pause after the speaker's words: setting it to 0.5 adds a 0.5 second pause after this phrase (during the pause the video continues but no audio is played).
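The effect of the three offsets can be sketched in a few lines of Python. These helpers are hypothetical, written only to illustrate the semantics described above; the real scripts may handle the file differently.

```python
def parse_plan_line(line):
    """Split one tab-separated video-plan line into the phrase and the
    three offsets: start shift, end shift, and extra pause (seconds)."""
    words, before, after, pause = line.rstrip("\n").split("\t")
    return words, float(before), float(after), float(pause)

def adjusted_bounds(start, end, before, after):
    """A positive `before` moves the snippet's start earlier, and a
    positive `after` moves its end later; negative values do the
    opposite. All times are in seconds."""
    return start - before, end + after
```

For instance, a raw snippet spanning 10.0 to 12.0 seconds with offsets 0.5 and -0.5 would be cut from 9.5 to 11.5 seconds.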

Afterwards, just create the video again as described above.


automash's Issues

I'm having a problem with the Create the video step

When I run:

python src\create_video.py PROJECT_NAME

I get this error:

(AUTOMASHin) C:\Users\camer\Desktop\Video projects__SCRIPTS\AutoMash(USED FOR VIDEO CUTTING)>python src\create_video.py PROJECT_NAME
Moviepy - Building video tmp\the funny stuff is_0.02_0.02_0.1_0.05_0.05.mp4.
MoviePy - Writing audio in the funny stuff is_0.02_0.02_0.1_0.05_0.05TEMP_MPY_wvf_snd.mp4
chunk: 0%| | 0/23 [00:00<?, ?it/s, now=None]Traceback (most recent call last):
File "src\create_video.py", line 45, in
snippets_path = [get_snippet_path(fade_in_time, fade_out_time, data_path, lexicon, words, time_before + 0.02 , time_after + 0.02, pause_after + pause_between_phrases) for (words, time_before, time_after, pause_after) in data]
File "src\create_video.py", line 45, in
snippets_path = [get_snippet_path(fade_in_time, fade_out_time, data_path, lexicon, words, time_before + 0.02 , time_after + 0.02, pause_after + pause_between_phrases) for (words, time_before, time_after, pause_after) in data]
File "C:\Users\camer\Desktop\Video projects__SCRIPTS\AutoMash(USED FOR VIDEO CUTTING)\src\helpers.py", line 71, in get_snippet_path
video.write_videofile(output_path, audio_codec='aac')
File "", line 2, in write_videofile
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
return f(clip, *a, **k)
File "", line 2, in write_videofile
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 137, in use_clip_fps_by_default
return f(clip, *new_a, **new_kw)
File "", line 2, in write_videofile
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 22, in convert_masks_to_RGB
return f(clip, *a, **k)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\video\VideoClip.py", line 317, in write_videofile
logger=logger)
File "", line 2, in write_audiofile
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
return f(clip, *a, **k)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\AudioClip.py", line 209, in write_audiofile
logger=logger)
File "", line 2, in ffmpeg_audiowrite
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
return f(clip, *a, **k)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\io\ffmpeg_audiowriter.py", line 169, in ffmpeg_audiowrite
logger=logger):
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\AudioClip.py", line 85, in iter_chunks
fps=fps, buffersize=chunksize)
File "", line 2, in to_soundarray
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 54, in requires_duration
return f(clip, *a, **k)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\AudioClip.py", line 126, in to_soundarray
snd_array = self.get_frame(tt)
File "", line 2, in get_frame
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 89, in wrapper
return f(*new_a, **new_kw)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 95, in get_frame
return self.make_frame(t)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\AudioClip.py", line 296, in make_frame
for c, part in zip(self.clips, played_parts)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\AudioClip.py", line 297, in
if (part is not False)]
File "", line 2, in get_frame
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 89, in wrapper
return f(*new_a, **new_kw)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 95, in get_frame
return self.make_frame(t)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 138, in
newclip = self.set_make_frame(lambda t: fun(self.get_frame, t))
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\fx\audio_fadeout.py", line 11, in fading
gft = gf(t)
File "", line 2, in get_frame
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 89, in wrapper
return f(*new_a, **new_kw)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 95, in get_frame
return self.make_frame(t)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 138, in
newclip = self.set_make_frame(lambda t: fun(self.get_frame, t))
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\fx\audio_fadein.py", line 10, in fading
gft = gf(t)
File "", line 2, in get_frame
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 89, in wrapper
return f(*new_a, **new_kw)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 95, in get_frame
return self.make_frame(t)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 138, in
newclip = self.set_make_frame(lambda t: fun(self.get_frame, t))
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 190, in
return self.fl(lambda gf, t: gf(t_func(t)), apply_to,
File "", line 2, in get_frame
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\decorators.py", line 89, in wrapper
return f(*new_a, **new_kw)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\Clip.py", line 95, in get_frame
return self.make_frame(t)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\io\AudioFileClip.py", line 78, in
self.make_frame = lambda t: self.reader.get_frame(t)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\io\readers.py", line 180, in get_frame
self.buffer_around(fr_min)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\io\readers.py", line 241, in buffer_around
self.seek(new_bufferstart)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\io\readers.py", line 140, in seek
self.skip_chunk(pos-self.pos)
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\io\readers.py", line 102, in skip_chunk
s = self.proc.stdout.read(self.nchannels * chunksize * self.nbytes)
AttributeError: 'NoneType' object has no attribute 'stdout'

I tried to solve it by downgrading moviepy to 1.0.0, but then I got this error:

(AUTOMASHin) C:\Users\camer\Desktop\Video projects__SCRIPTS\AutoMash(USED FOR VIDEO CUTTING)>python src\create_video.py PROJECT_NAME
Moviepy - Building video tmp\the funny stuff is_0.02_0.02_0.1_0.05_0.05.mp4.
MoviePy - Writing audio in %s
MoviePy - Done.
Moviepy - Writing video tmp\the funny stuff is_0.02_0.02_0.1_0.05_0.05.mp4

Moviepy - Done !
Moviepy - video ready tmp\the funny stuff is_0.02_0.02_0.1_0.05_0.05.mp4
Moviepy - Building video tmp\what you_0.02_0.02_0.1_0.05_0.05.mp4.
MoviePy - Writing audio in %s
MoviePy - Done.
Moviepy - Writing video tmp\what you_0.02_0.02_0.1_0.05_0.05.mp4

Moviepy - Done !
Moviepy - video ready tmp\what you_0.02_0.02_0.1_0.05_0.05.mp4
Moviepy - Building video tmp\will_0.02_0.02_0.1_0.05_0.05.mp4.
MoviePy - Writing audio in %s
MoviePy - Done.
Moviepy - Writing video tmp\will_0.02_0.02_0.1_0.05_0.05.mp4

Moviepy - Done !
Moviepy - video ready tmp\will_0.02_0.02_0.1_0.05_0.05.mp4
Moviepy - Building video tmp\find_0.02_0.02_0.1_0.05_0.05.mp4.
MoviePy - Writing audio in %s
MoviePy - Done.
Moviepy - Writing video tmp\find_0.02_0.02_0.1_0.05_0.05.mp4

Moviepy - Done !
Moviepy - video ready tmp\find_0.02_0.02_0.1_0.05_0.05.mp4
Moviepy - Building video PROJECT_NAME.mp4.
Moviepy - Writing video PROJECT_NAME.mp4

Moviepy - Done !
Moviepy - video ready PROJECT_NAME.mp4
Exception ignored in: <function FFMPEG_AudioReader.__del__ at 0x000001D9A02D1F78>
Traceback (most recent call last):
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\io\readers.py", line 254, in __del__
self.close_proc()
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\site-packages\moviepy\audio\io\readers.py", line 150, in close_proc
self.proc.terminate()
File "D:_PROGRAMS\anaconda3NEW\envs\AUTOMASHin\lib\subprocess.py", line 1343, in terminate
_winapi.TerminateProcess(self._handle, 1)
OSError: [WinError 6] The handle is invalid
