abmami / multilingual-video-transcription-using-whisper

A Python tool for transcribing videos using Whisper

License: MIT License

Python 100.00%

openai-whisper python pytube pytube-projects transcription video-to-text youtube

multilingual-video-transcription-using-whisper's Introduction

👋 Hello, I'm Abdessalem Mami

😊 About Me

I'm a Software Engineering Student with a passion for Artificial Intelligence.

🎯 Interests

Probability and Statistics • Software Engineering • Machine Learning • Computer Vision • Natural Language Processing • Visualization • Clean Code • Open Source Software • Knowledge Sharing

🌐 Contacts

Website: abmami.github.io

E-Mail: [email protected]

multilingual-video-transcription-using-whisper's Issues

Error transcription local video

The script has worked well for transcribing local videos, but when I tried a new video, it failed with an error.

Here's my process so far:
source venv/bin/activate

then
python3 transcribe.py --local

and its output is:

Option: from local files
Transcribing /home/user/Documents/Multilingual-Video-Transcription-using-Whisper/data/videos/video.mp4
Transcribing /home/user/Documents/Multilingual-Video-Transcription-using-Whisper/data/videos/.gitkeep
Traceback (most recent call last):
  File "/home/user/Documents/Multilingual-Video-Transcription-using-Whisper/venv/lib/python3.10/site-packages/whisper/audio.py", line 46, in load_audio
    ffmpeg.input(file, threads=0)
  File "/home/user/Documents/Multilingual-Video-Transcription-using-Whisper/venv/lib/python3.10/site-packages/ffmpeg/_run.py", line 325, in run
    raise Error('ffmpeg', out, err)
ffmpeg._run.Error: ffmpeg error (see stderr output for detail)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/Documents/Multilingual-Video-Transcription-using-Whisper/transcribe.py", line 93, in <module>
    transcript = transcribe(model, video_path, args.save)
  File "/home/user/Documents/Multilingual-Video-Transcription-using-Whisper/transcribe.py", line 40, in transcribe
    result = model.transcribe(video_path)
  File "/home/user/Documents/Multilingual-Video-Transcription-using-Whisper/venv/lib/python3.10/site-packages/whisper/transcribe.py", line 121, in transcribe
    mel = log_mel_spectrogram(audio, padding=N_SAMPLES)
  File "/home/user/Documents/Multilingual-Video-Transcription-using-Whisper/venv/lib/python3.10/site-packages/whisper/audio.py", line 130, in log_mel_spectrogram
    audio = load_audio(audio)
  File "/home/user/Documents/Multilingual-Video-Transcription-using-Whisper/venv/lib/python3.10/site-packages/whisper/audio.py", line 51, in load_audio
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e
RuntimeError: Failed to load audio: ffmpeg version 4.4.2-0ubuntu0.22.04.1 Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 11 (Ubuntu 11.2.0-19ubuntu1)
configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared
WARNING: library configuration mismatch
avcodec configuration: --prefix=/usr --extra-version=0ubuntu0.22.04.1 --toolchain=hardened --libdir=/usr/lib/x86_64-linux-gnu --incdir=/usr/include/x86_64-linux-gnu --arch=amd64 --enable-gpl --disable-stripping --enable-gnutls --enable-ladspa --enable-libaom --enable-libass --enable-libbluray --enable-libbs2b --enable-libcaca --enable-libcdio --enable-libcodec2 --enable-libdav1d --enable-libflite --enable-libfontconfig --enable-libfreetype --enable-libfribidi --enable-libgme --enable-libgsm --enable-libjack --enable-libmp3lame --enable-libmysofa --enable-libopenjpeg --enable-libopenmpt --enable-libopus --enable-libpulse --enable-librabbitmq --enable-librubberband --enable-libshine --enable-libsnappy --enable-libsoxr --enable-libspeex --enable-libsrt --enable-libssh --enable-libtheora --enable-libtwolame --enable-libvidstab --enable-libvorbis --enable-libvpx --enable-libwebp --enable-libx265 --enable-libxml2 --enable-libxvid --enable-libzimg --enable-libzmq --enable-libzvbi --enable-lv2 --enable-omx --enable-openal --enable-opencl --enable-opengl --enable-sdl2 --enable-pocketsphinx --enable-librsvg --enable-libmfx --enable-libdc1394 --enable-libdrm --enable-libiec61883 --enable-chromaprint --enable-frei0r --enable-libx264 --enable-shared --enable-version3 --disable-doc --disable-programs --enable-libaribb24 --enable-libopencore_amrnb --enable-libopencore_amrwb --enable-libtesseract --enable-libvo_amrwbenc --enable-libsmbclient
libavutil 56. 70.100 / 56. 70.100
libavcodec 58.134.100 / 58.134.100
libavformat 58. 76.100 / 58. 76.100
libavdevice 58. 13.100 / 58. 13.100
libavfilter 7.110.100 / 7.110.100
libswscale 5. 9.100 / 5. 9.100
libswresample 3. 9.100 / 3. 9.100
libpostproc 55. 9.100 / 55. 9.100
/home/user/Documents/Multilingual-Video-Transcription-using-Whisper/data/videos/.gitkeep: Invalid data found when processing input

I've tried several things, including re-encoding the video to H.264 with ffmpeg while keeping it as an .mp4 file, but I get the same error as above.

If you have any idea how to sort this, it'd be awesome!

PS: tested on Linux Mint 21.1 Cinnamon, Linux Kernel v6.1.0-1025-oem, Graphics Card Nvidia TU106 GeForce RTX 2060, CPU 12th Gen Intel© Core™ i7-12700F × 12
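
A note on the output above: the log shows the script moving on to data/videos/.gitkeep after video.mp4, and the final ffmpeg line reports "Invalid data found when processing input" for .gitkeep, not for the video. So the failure may come from the placeholder file in the videos folder and not from the video's encoding. One possible workaround, sketched here on the assumption that the script passes every entry of the folder to model.transcribe (the repository's actual loop is not shown in this report), is to filter the folder by extension first. The names list_video_files and VIDEO_EXTENSIONS are hypothetical and not part of the project.

```python
from pathlib import Path

# Assumed set of video extensions; adjust it to the formats you actually use.
VIDEO_EXTENSIONS = {".mp4", ".mkv", ".mov", ".avi", ".webm"}


def list_video_files(directory):
    """Return the sorted regular files in `directory` that have a video extension.

    Hypothetical helper: dotfiles such as .gitkeep have an empty suffix,
    so they never match and are skipped.
    """
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS
    )


# Possible use inside the script (assumes `model` is an already loaded Whisper model):
# for video_path in list_video_files("data/videos"):
#     result = model.transcribe(str(video_path))
```

Deleting or moving .gitkeep out of data/videos should have the same effect for a quick test, if the placeholder really is the cause.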
