
subsai's Introduction

Hi there πŸ‘‹ I am Abdeladim

πšƒπš˜πšπšŠπš• πš‚πšπšŠπš›πšœ β€’ abdeladim-s β€’ π™Άπš’πšπ™·πšžπš‹ πšπš˜πš•πš•πš˜πš πšŽπš›πšœ

abdeladim-s

abdeladim-s

subsai's People

Contributors

abdeladim-s, fernandocamporredondo, hecate2, jlinkels, kaminn, lostxine, lotka, redcathode, sanskar-mk2, woodongwong


subsai's Issues

Subsai Google Colab

It showed this error:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
ipython 7.9.0 requires jedi>=0.10, which is not installed.

then later it showed:

ModuleNotFoundError: No module named 'dl_translate'

so I installed dl-translate and pywhispercpp, but I kept getting a numpy version incompatibility:
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
numba 0.56.4 requires numpy<1.24,>=1.18, but you have numpy 1.24.2 which is incompatible.

What can I do?
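
A hedged first step, taken straight from the resolver message above: pin numpy into the range numba accepts, then restart the Colab runtime so the new version is loaded:

pip install "numpy>=1.18,<1.24"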

Also, can I run the Subsai WebUI in Google Colab? It showed this error both on Windows and on Google Colab:

Traceback (most recent call last):
  File "/usr/local/bin/subsai-webui", line 5, in <module>
    from subsai.webui import run
  File "/usr/local/lib/python3.9/dist-packages/subsai/__init__.py", line 1, in <module>
    from subsai.main import SubsAI, Tools
  File "/usr/local/lib/python3.9/dist-packages/subsai/main.py", line 27, in <module>
    from subsai.configs import AVAILABLE_MODELS
  File "/usr/local/lib/python3.9/dist-packages/subsai/configs.py", line 11, in <module>
    from subsai.models.whisper_model import WhisperModel
  File "/usr/local/lib/python3.9/dist-packages/subsai/models/whisper_model.py", line 13, in <module>
    import whisper
  File "/usr/local/lib/python3.9/dist-packages/whisper/__init__.py", line 13, in <module>
    from .model import ModelDimensions, Whisper
  File "/usr/local/lib/python3.9/dist-packages/whisper/model.py", line 13, in <module>
    from .transcribe import transcribe as transcribe_function
  File "/usr/local/lib/python3.9/dist-packages/whisper/transcribe.py", line 20, in <module>
    from .timing import add_word_timestamps
  File "/usr/local/lib/python3.9/dist-packages/whisper/timing.py", line 7, in <module>
    import numba
  File "/usr/local/lib/python3.9/dist-packages/numba/__init__.py", line 42, in <module>
    from numba.np.ufunc import (vectorize, guvectorize, threading_layer,
  File "/usr/local/lib/python3.9/dist-packages/numba/np/ufunc/__init__.py", line 3, in <module>
    from numba.np.ufunc.decorators import Vectorize, GUVectorize, vectorize, guvectorize
  File "/usr/local/lib/python3.9/dist-packages/numba/np/ufunc/decorators.py", line 3, in <module>
    from numba.np.ufunc import _internal
SystemError: initialization of _internal failed without raising an exception

Thank you

Won't start... The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.

OK. I have Windows 10 and PyCharm with Python 3.9.13.
I used the Python code from the main page and, of course, installed the subsai package via git.

And I got this...
[19:46:18] WARNING  The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.  (torch_audio_backend.py:19)
           WARNING  torchvision is not available - cannot save figures  (train_logger.py:227)
           WARNING  The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.  (torch_audio_backend.py:19)

Process finished with exit code -1073741795 (0xC000001D)

[bug] error: Microsoft Visual C++ 14.0 or greater is required

Windows 11
intel 12700

I tried running as admin/non-admin in cmd and also PowerShell, but neither works.

I installed the "Microsoft Build Tools" as requested in the error and restarted, but the error persists.

Please help.


Building wheel for webrtcvad (pyproject.toml) ... error
error: subprocess-exited-with-error

Γ— Building wheel for webrtcvad (pyproject.toml) did not run successfully.
β”‚ exit code: 1
╰─> [9 lines of output]
running bdist_wheel
running build
running build_py
creating build
creating build\lib.win32-cpython-311
copying webrtcvad.py -> build\lib.win32-cpython-311
running build_ext
building '_webrtcvad' extension
error: Microsoft Visual C++ 14.0 or greater is required. Get it with "Microsoft C++ Build Tools": https://visualstudio.microsoft.com/visual-cpp-build-tools/
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for webrtcvad
Successfully built subsai whisper-timestamped
Failed to build webrtcvad
ERROR: Could not build wheels for webrtcvad, which is required to install pyproject.toml-based projects
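
If installing the Build Tools alone doesn't resolve it, double-check that the "Desktop development with C++" workload was actually selected in the installer; the compiler itself is not included by default. As a hedged alternative (assumption: the community webrtcvad-wheels fork on PyPI, which ships prebuilt Windows wheels, is an acceptable substitute), compilation can be avoided entirely:

pip install webrtcvad-wheels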

Installing on Windows => ModuleNotFoundError: jax requires jaxlib to be installed.

I'm trying to install it on Windows.
I've created a new conda environment:

conda create -n subsai
conda activate subsai

Then:
pip install git+https://github.com/abdeladim-s/subsai

At the end I try to run it:
subsai-webui

I get the error:

ModuleNotFoundError: jax requires jaxlib to be installed. See https://github.com/google/jax#installation for installation instructions.

I've checked the installation instructions.
It looks like for Windows, only CPU is available.
For GPU I read:

Note: wheels only available on linux.

Am I missing something?

Help with CUDA

2023-06-28 16:27:34.564 Processing audio with duration 11:05.966
Could not load library cudnn_cnn_infer64_8.dll. Error code 126
Please make sure cudnn_cnn_infer64_8.dll is in your library path!

But I already have CUDA installed.
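
A hedged sketch of one common workaround, assuming cuDNN is installed but its folder is not on the Windows DLL search path (the directory below is an assumption; point it at wherever cudnn_cnn_infer64_8.dll actually lives):

import os

# On Python 3.8+ for Windows, DLL directories must be registered explicitly;
# PATH alone is no longer searched for dependent DLLs.
os.add_dll_directory(r"C:\Program Files\NVIDIA\CUDNN\v8\bin")

import torch  # import/launch subsai only after the directory is registered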

requirements.txt whisper-timestamped version wrong

requirements.txt points at the wrong whisper-timestamped version.

After manually removing the @-part after git+https://github.com/linto-ai/whisper-timestamped in line 7

and re-running docker compose build --no-cache, it works for me.
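
For reference, the fix described above amounts to replacing line 7 of requirements.txt (the @-part pins a specific package version) with the bare URL:

git+https://github.com/linto-ai/whisper-timestamped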

error when installing subsai

I got this error:
ERROR: Cannot install subsai and subsai==1.2.0 because these package versions have conflicting dependencies.

The conflict is caused by:
subsai 1.2.0 depends on faster-whisper==0.5.1
whisperx 3.1.1 depends on faster-whisper>=0.8

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

I am no developer, so I don't know how to fix this. Any tips?

Help to understand parameters

First, thank you for your work.
I have a movie in Russian and I would like English or Spanish subtitles.
How should I call subsai?
Is there a model which works best for languages other than English?
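
A sketch of one possible invocation, pieced together from the translation flags used in other issues on this page (treat the exact config keys as assumptions; mBART-50 uses ru_RU, en_XX and es_XX style language codes):

subsai ./movie.mp4 --model openai/whisper --model-configs '{"model_type": "medium", "language": "ru"}' --format srt --translation-model facebook/mbart-large-50-many-to-many-mmt --translation-source-lang ru_RU --translation-target-lang en_XX

As for model choice, the multilingual Whisper sizes (small, medium, large) generally handle non-English audio better than the English-only *.en variants.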

OSError: libcublasLt.so.11

Hi @abdeladim-s, once again thanks for the great work. These are the early days, so there are probably quite a few teething pains. I think there is great potential in the subsai framework, so hang in there :)

I am stuck during the import step of the notebook. I get the following error:

OSError: /opt/conda/envs/subsai/lib/python3.9/site-packages/nvidia/cublas/lib/libcublas.so.11: symbol cublasLtGetStatusString version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference

I have tried many different ways to work around it, but I've not been successful. It seems that the error is caused by a version (environment) conflict between torch 1.13 and the nvidia toolkit. I have tried uninstalling, reinstalling and new env settings, but the error doesn't go away. The error goes away if I downgrade torch to 1.12, but that breaks some code dependencies in subsai.

Here is the error stack:

---------------------------------------------------------------------------
OSError                                   Traceback (most recent call last)
File /opt/conda/envs/subsai/lib/python3.9/site-packages/torch/__init__.py:172, in _load_global_deps()
    171 try:
--> 172     ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)
    173 except OSError as err:
    174     # Can only happen of wheel with cublas as PYPI deps
    175     # As PyTorch is not purelib, but nvidia-cublas-cu11 is

File /opt/conda/envs/subsai/lib/python3.9/ctypes/__init__.py:374, in CDLL.__init__(self, name, mode, handle, use_errno, use_last_error, winmode)
    373 if handle is None:
--> 374     self._handle = _dlopen(self._name, mode)
    375 else:

OSError: /opt/conda/envs/subsai/lib/python3.9/site-packages/torch/lib/../../nvidia/cublas/lib/libcublas.so.11: symbol cublasLtGetStatusString version libcublasLt.so.11 not defined in file libcublasLt.so.11 with link time reference

During handling of the above exception, another exception occurred:

OSError                                   Traceback (most recent call last)
Cell In[2], line 2
      1 import os
----> 2 import torch
      3 from pathlib import Path
      4 import pysubs2

File /opt/conda/envs/subsai/lib/python3.9/site-packages/torch/__init__.py:217
    206 else:
    207     # Easy way.  You want this most of the time, because it will prevent
    208     # C++ symbols from libtorch clobbering C++ symbols from other
   (...)
    214     #
    215     # See Note [Global dependencies]
    216     if USE_GLOBAL_DEPS:
--> 217         _load_global_deps()
    218     from torch._C import *  # noqa: F403
    220 # Appease the type checker; ordinarily this binding is inserted by the
    221 # torch._C module initialization code in C

File /opt/conda/envs/subsai/lib/python3.9/site-packages/torch/__init__.py:178, in _load_global_deps()
    176 if 'libcublas.so.11' not in err.args[0]:
    177     raise err
--> 178 _preload_cuda_deps()
    179 ctypes.CDLL(lib_path, mode=ctypes.RTLD_GLOBAL)

File /opt/conda/envs/subsai/lib/python3.9/site-packages/torch/__init__.py:158, in _preload_cuda_deps()
    155         continue
    156     break
--> 158 ctypes.CDLL(cublas_path)
    159 ctypes.CDLL(cudnn_path)

File /opt/conda/envs/subsai/lib/python3.9/ctypes/__init__.py:374, in CDLL.__init__(self, name, mode, handle, use_errno, use_last_error, winmode)
    371 self._FuncPtr = _FuncPtr
    373 if handle is None:
--> 374     self._handle = _dlopen(self._name, mode)
    375 else:
    376     self._handle = handle

Thanks.
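
One workaround often reported for this torch 1.13 clash between the pip-installed NVIDIA libraries and the rest of the environment (a hedged sketch, not a guaranteed fix): remove the conflicting PyPI CUDA packages and reinstall torch from the matching CUDA wheel index:

pip uninstall -y nvidia-cublas-cu11 nvidia-cudnn-cu11
pip install --force-reinstall torch==1.13.1 --index-url https://download.pytorch.org/whl/cu117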

whisper-timestamped crashes on creating plots -- but not when used standalone

Hi,

I am trying to use the linto-ai/whisper-timestamped model from within subsai. I don't use any model configuration:
subsai -m linto-ai/whisper-timestamped -f srt test0.webm
whisper-timestamped insists on loading the matplotlib library, and crashes.

whisper_timestamped is not supposed to do that: when the plot option is False, whisper_timestamped should not import this library. In subsai this option is plot_word_alignment, and it is False by default.
The strange thing is, when I call whisper_timestamped directly from the command line:
whisper_timestamped --output_format srt --output_dir /tmp/ --vad True --task transcribe assets/audio/test0.mp3
the plot option is false and the command executes successfully.

From a previous installation I know that if I do install matplotlib, I can call the whisper-timestamped model from within subsai, but then matplotlib launches a graph for every sentence, which is not really useful.

It seems that the whisper_timestamped CLI passes the model configuration to whisper_timestamped.transcribe() in a different way than subsai does.

Is this a known issue?

If not, I want to start working on this during the weekend and see why whisper_timestamped thinks it should plot.
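
For anyone reproducing this, a minimal sketch of pinning the option explicitly through the Python API (the create_model/transcribe calls mirror other issues on this page; whether an explicit False changes the import behaviour is exactly what is being investigated here):

from subsai import SubsAI

subs_ai = SubsAI()
# Pass the plot option explicitly instead of relying on the default.
model = subs_ai.create_model('linto-ai/whisper-timestamped',
                             {'plot_word_alignment': False})
subs = subs_ai.transcribe('test0.webm', model)
subs.save('test0.srt')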

Error in CLI documentation

In the documentation on the GitHub page, under Usage -> CLI -> "Example of simple usage", you have:
subsai ./assets/test1.mp4 --model openai/whisper --model-configs '{"model_type": "small"}' --format srt
When it should be:
subsai ./assets/test1.mp4 --model openai/whisper --model-configs "{\"model_type\": \"small\"}" --format srt

Error when trying to transcribe using whisper-timestamped

When I set the model to whisper-timestamped and pressed transcribe, this error showed up.

This ORT build has ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'] enabled. Since ORT 1.9, you are required to explicitly set the providers parameter when instantiating InferenceSession. For example, onnxruntime.InferenceSession(..., providers=['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'CPUExecutionProvider'], ...)

max_length deprecated

Hi, I get this warning when I run subsai with the command:
subsai ./Videos/one.mp4 --model openai/whisper --model-configs '{"model_type": "small"}' --format srt --translation-model facebook/mbart-large-50-many-to-many-mmt --translation-source-lang en_XX --translation-target-lang pl_PL

.local/lib/python3.10/site-packages/transformers/generation/utils.py:1288: UserWarning: Using `max_length`'s default (200) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.
  warnings.warn(

It's only a warning. Will you be replacing the max_length parameter with max_new_tokens in the future?
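
For context, the warning comes from the Hugging Face generation API; the replacement it recommends looks like this (a generic transformers sketch, not subsai code; the input text is illustrative):

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

name = "facebook/mbart-large-50-many-to-many-mmt"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSeq2SeqLM.from_pretrained(name)

tok.src_lang = "en_XX"  # source language, as in the command above
inputs = tok("It's only a warning.", return_tensors="pt")
# max_new_tokens bounds only the generated tokens, unlike max_length,
# which also counts the input length.
out = model.generate(**inputs,
                     forced_bos_token_id=tok.lang_code_to_id["pl_PL"],
                     max_new_tokens=200)
print(tok.decode(out[0], skip_special_tokens=True))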

whisperX results in "unexpected keyword argument" error for 3 fields

Command I input into Windows PowerShell:

subsai .\file --model m-bain/whisperX --model-configs '{\"model_type\": \"large-v2\", \"device\": \"cuda\"}' --format srt

And result is:

TypeError: <lambda>() got an unexpected keyword argument 'repetition_penalty'
Reading the source code for whisperX, there is a repetition_penalty field that isn't present in your code:

https://github.com/m-bain/whisperX/blob/b1a98b78c9152ace9f9801593b5fa0c7d5d96b0f/whisperx/asr.py#L66

default_asr_options = {
    "beam_size": 5,
    "best_of": 5,
    "patience": 1,
    "length_penalty": 1,
    "repetition_penalty": 1,
    "no_repeat_ngram_size": 0,
    "temperatures": [0.0, 0.2, 0.4, 0.6, 0.8, 1.0],
    "compression_ratio_threshold": 2.4,
    "log_prob_threshold": -1.0,
    "no_speech_threshold": 0.6,
    "condition_on_previous_text": False,
    "prompt_reset_on_temperature": 0.5,
    "initial_prompt": None,
    "prefix": None,
    "suppress_blank": True,
    "suppress_tokens": [-1],
    "without_timestamps": True,
    "max_initial_timestamp": 0.0,
    "word_timestamps": False,
    "prepend_punctuations": "\"'β€œΒΏ([{-",
    "append_punctuations": "\"'.。,,!!??:οΌšβ€)]}、",
    "suppress_numerals": False,
}

But you don't have all of these as-is, so they may not all be required.

Streamlit memory error

After inputting the media path and hitting Enter on the web UI, the memory usage goes through the roof, and it crashes. Here is the full error shown in the console.

(subsai) D:\Workspace>subsai-webui
[17:46:06] WARNING  The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.  (torch_audio_backend.py:19)
[17:46:46] WARNING  The torchaudio backend is switched to 'soundfile'. Note that 'sox_io' is not supported on Windows.  (torch_audio_backend.py:19)
2023-07-11 17:46:47.647 WARNING streamlit.runtime.caching.cache_data_api: No runtime found, using MemoryCacheStorageManager

  You can now view your Streamlit app in your browser.

  Local URL: http://localhost:8501
  Network URL: http://192.168.31.61:8501

Warning: temperature does not have a supported UI
Warning: temperature does not have a supported UI
2023-07-11 17:48:36.401 Uncaught app exception
Traceback (most recent call last):
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\streamlit\components\v1\components.py", line 151, in create_instance
    serialized_json_args = json.dumps(json_args)
                           ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\minh\.conda\envs\subsai\Lib\json\__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\minh\.conda\envs\subsai\Lib\json\encoder.py", line 200, in encode
    chunks = self.iterencode(o, _one_shot=True)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\minh\.conda\envs\subsai\Lib\json\encoder.py", line 258, in iterencode
    return _iterencode(o, 0)
           ^^^^^^^^^^^^^^^^^
MemoryError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\subsai\webui.py", line 501, in <module>
    run()
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\subsai\webui.py", line 494, in run
    webui()
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\subsai\webui.py", line 470, in webui
    event = st_player(_media_file_base64(file_path), **options, height=500, key="player")
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\streamlit_player\__init__.py", line 84, in st_player
    event = _component_func(
            ^^^^^^^^^^^^^^^^
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\streamlit\components\v1\components.py", line 79, in __call__
    return self.create_instance(*args, default=default, key=key, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\streamlit\runtime\metrics_util.py", line 311, in wrapped_func
    result = non_optional_func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\minh\.conda\envs\subsai\Lib\site-packages\streamlit\components\v1\components.py", line 153, in create_instance
    raise MarshallComponentException(
streamlit.components.v1.components.MarshallComponentException: ('Could not convert component args to JSON', MemoryError())

200MB Limit

"File must be 200.0MB or smaller." when inputting a file over 200mb, I was hopping it can be transcribed/merged without showing the video if >200mb as I would assume thats the main issue, that the library being used can only shows videos <200mb, if this is the case can we still have it so we can download the translated/transcribed video at the end without the preview?

Symlink Support

I was using faster-whisper, and the downloaded model uses relative symlinks (to avoid duplication, I suppose), but the webui (or ctranslate2) doesn't like that:

Traceback (most recent call last):
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\streamlit\runtime\scriptrunner\script_runner.py", line 565, in _run_script
    exec(code, module.__dict__)
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\subsai\webui.py", line 545, in <module>
    run()
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\subsai\webui.py", line 538, in run
    webui()
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\subsai\webui.py", line 318, in webui
    subs = _transcribe(file_path, stt_model_name, model_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 194, in wrapper
    return cached_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 223, in __call__
    return self._get_or_create_cached_value(args, kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 248, in _get_or_create_cached_value
    return self._handle_cache_miss(cache, value_key, func_args, func_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\streamlit\runtime\caching\cache_utils.py", line 302, in _handle_cache_miss
    computed_value = self._info.func(*func_args, **func_kwargs)
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\subsai\webui.py", line 189, in _transcribe
    model = subs_ai.create_model(model_name, model_config=model_config)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\subsai\main.py", line 96, in create_model
    return AVAILABLE_MODELS[model_name]['class'](model_config)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\subsai\models\faster_whisper_model.py", line 240, in __init__
    self.model = WhisperModel(model_size_or_path=self._model_size_or_path,
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\AI\LLM\Subs-AI\venv\Lib\site-packages\faster_whisper\transcribe.py", line 120, in __init__
    self.model = ctranslate2.models.Whisper(
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Unable to open file 'model.bin' in model 'C:\Users\Victor\.cache\huggingface\hub\models--guillaumekln--faster-whisper-large-v2\snapshots\f541c54c566e32dc1fbce16f98df699208837e8b'

models--guillaumekln--faster-whisper-large-v2\snapshots\f541c54c566e32dc1fbce16f98df699208837e8b is a folder that contains the model files; those files are symlinks to the actual files in the models--guillaumekln--faster-whisper-large-v2\blobs folder.

If I copy those files over and rename them, this error goes away, and the rest works flawlessly.
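
A hedged workaround sketch: download a real, symlink-free copy of the model with huggingface_hub and point the model path config at that folder (argument names assume a recent huggingface_hub; the local directory is arbitrary):

from huggingface_hub import snapshot_download

# Materialize real files instead of symlinked blobs (Windows-friendly).
path = snapshot_download("guillaumekln/faster-whisper-large-v2",
                         local_dir="./faster-whisper-large-v2",
                         local_dir_use_symlinks=False)
# Then pass `path` wherever subsai expects model_size_or_path.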

[install issue] ld: unsupported tapi file type '!tapi-tbd' in YAML file

I have encountered an issue when I run:
pip install git+https://github.com/abdeladim-s/subsai

---details----
ld: unsupported tapi file type '!tapi-tbd' in YAML file '/Library/Developer/CommandLineTools/SDKs/MacOSX13.sdk/usr/lib/libSystem.tbd' for architecture arm64
clang-16: error: linker command failed with exit code 1 (use -v to see invocation)
error: command '/opt/homebrew/opt/llvm/bin/clang' failed with exit code 1
error: subprocess-exited-with-error

Γ— python setup.py bdist_wheel did not run successfully.
β”‚ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /Users/m/anaconda3/bin/python -u -c '<pip setuptools caller, shown unescaped below>' bdist_wheel -d /private/var/folders/bv/x8bqw0hd5hz7fdsvcdl9f43r0000gn/T/pip-wheel-t5tnh_6c

# This is <pip-setuptools-caller> -- a caller that pip uses to run setup.py
#
# - It imports setuptools before invoking setup.py, to enable projects that directly
#   import from `distutils.core` to work with newer packaging standards.
# - It provides a clear error message when setuptools is not installed.
# - It sets `sys.argv[0]` to the underlying `setup.py`, when invoking `setup.py` so
#   setuptools doesn't think the script is `-c`. This avoids the following warning:
#     manifest_maker: standard file '-c' not found".
# - It generates a shim setup.py, for handling setup.cfg-only projects.

import os, sys, tokenize

try:
    import setuptools
except ImportError as error:
    print(
        "ERROR: Can not execute `setup.py` since setuptools is not available in "
        "the build environment.",
        file=sys.stderr,
    )
    sys.exit(1)

__file__ = '/private/var/folders/bv/x8bqw0hd5hz7fdsvcdl9f43r0000gn/T/pip-install-fzmlj4fc/webrtcvad_3f3618882ee14b95b613dfb742749d32/setup.py'
sys.argv[0] = __file__

if os.path.exists(__file__):
    filename = __file__
    with tokenize.open(__file__) as f:
        setup_py_code = f.read()
else:
    filename = "<auto-generated setuptools caller>"
    setup_py_code = "from setuptools import setup; setup()"

exec(compile(setup_py_code, filename, "exec"))
cwd: /private/var/folders/bv/x8bqw0hd5hz7fdsvcdl9f43r0000gn/T/pip-install-fzmlj4fc/webrtcvad_3f3618882ee14b95b613dfb742749d32/
Building wheel for webrtcvad (setup.py) ... error
ERROR: Failed building wheel for webrtcvad
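
One workaround sometimes reported for this linker error (a sketch; it assumes the Homebrew LLVM shown in the log is shadowing Apple's toolchain) is to build with the command-line-tools clang instead:

CC=/usr/bin/clang CXX=/usr/bin/clang++ pip install git+https://github.com/abdeladim-s/subsai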

ModuleNotFoundError: No module named 'importlib.util'

This project looks great.

However, when installing with "pip install git+https://github.com/abdeladim-s/subsai", the installation fails with:

ERROR: Could not find a version that satisfies the requirement importlib<1.1.0,>=1.0.4 (from subsai)
ERROR: No matching distribution found for importlib<1.1.0,>=1.0.4

I am running Python 3.9.2 on Debian 11 Bullseye.
As I understand it, importlib is a component of Python 3 and should not have to be installed.
Importing importlib.util in Python succeeds:

>>> import importlib.util
>>> 

So it seems that the installation fails on the requirement importlib<1.1.0,>=1.0.4. How can I work around this?

Thanks
jlinkels

Auto-translating?

First, thanks for this great tool! I'm getting it to work via the command line, but for some reason some videos are auto-translating from French (the original language) to English. This is what I'm typing into the command line:

C:\Users>subsai ./assets/video.mp4 --model openai/whisper --format srt

Any idea why some output srt files are auto-translated and some are in the original language? I'm trying to get the original-language files, which are in French in this instance.

Thanks so much!!!
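
Whisper auto-detects the language and can also run its translate task; a hedged sketch of pinning both through model configs ('language' and 'task' are standard openai/whisper options; whether subsai forwards them is an assumption), using the Windows-style quoting from the CLI-documentation issue above:

subsai ./assets/video.mp4 --model openai/whisper --model-configs "{\"model_type\": \"small\", \"language\": \"fr\", \"task\": \"transcribe\"}" --format srt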

[bug] whisperX word segmentation fails: KeyError: 'start'

Running the latest version from the docker image.

Version: 1.1.1
===================================

[-] Model name: m-bain/whisperX
[-] Model configs: {'model_type': 'large-v2', 'segment_type': 'word', 'language': 'en', 'device': 'cpu'}
---
[+] Initializing the model
[2023-07-19 14:10:15.381] [ctranslate2] [thread 7] [warning] The compute type inferred from the saved model is float16, but the target device or backend do not support efficient float16 computation. The model weights have been automatically converted to use the float32 compute type instead.
[14:10:15] INFO     Lightning automatically upgraded your loaded checkpoint from v1.5.4 to v2.0.5. To apply the upgrade to your files      utils.py:128
                    permanently, run `python -m pytorch_lightning.utilities.upgrade_checkpoint --file
                    ../root/.cache/torch/whisperx-vad-segmentation.bin`
Model was trained with pyannote.audio 0.0.1, yours is 2.1.1. Bad things might happen unless you revert pyannote.audio to 0.x.
Model was trained with torch 1.10.0+cu102, yours is 2.0.1. Bad things might happen unless you revert torch to 1.x.
[+] Processing file: /media_files/Marble Hornets Season 1.mp4
Traceback (most recent call last):
  File "/opt/conda/bin/subsai", line 8, in <module>
    sys.exit(main())
  File "/opt/conda/lib/python3.10/site-packages/subsai/cli.py", line 143, in main
    run(media_file_arg=args.media_file,
  File "/opt/conda/lib/python3.10/site-packages/subsai/cli.py", line 87, in run
    subs = subs_ai.transcribe(file, model)
  File "/opt/conda/lib/python3.10/site-packages/subsai/main.py", line 114, in transcribe
    return stt_model.transcribe(media_file)
  File "/opt/conda/lib/python3.10/site-packages/subsai/models/whisperX_model.py", line 146, in transcribe
    event = SSAEvent(start=pysubs2.make_time(s=word["start"]), end=pysubs2.make_time(s=word["end"]))
KeyError: 'start'

It ran for the expected length of time, so it probably finished the transcription and died at the end.

AttributeError: module 'numpy' has no attribute 'complex'

After installing (with no errors) and running the command subsai-webui, I get:

C:\Users\<user>\miniconda3\lib\site-packages\h5py\__init__.py:36: UserWarning: h5py is running against HDF5 1.12.2 when it was built against 1.12.1, this may cause problems
  _warn(("h5py is running against HDF5 {0} when it was built against {1}, "
C:\Users\<user>\miniconda3\lib\site-packages\paramiko\transport.py:219: CryptographyDeprecationWarning: Blowfish has been deprecated
  "class": algorithms.Blowfish,
Traceback (most recent call last):
  File "C:\Users\<user>\miniconda3\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\<user>\miniconda3\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "C:\Users\<user>\miniconda3\Scripts\subsai-webui.exe\__main__.py", line 4, in <module>
  File "C:\Users\<user>\miniconda3\lib\site-packages\subsai\__init__.py", line 1, in <module>
    from subsai.main import SubsAI, Tools
  File "C:\Users\<user>\miniconda3\lib\site-packages\subsai\main.py", line 27, in <module>
    from subsai.configs import AVAILABLE_MODELS
  File "C:\Users\<user>\miniconda3\lib\site-packages\subsai\configs.py", line 12, in <module>
    from subsai.models.whisperX_model import WhisperXModel
  File "C:\Users\<user>\miniconda3\lib\site-packages\subsai\models\whisperX_model.py", line 16, in <module>
    import whisperx
  File "C:\Users\<user>\miniconda3\lib\site-packages\whisperx\__init__.py", line 1, in <module>
    from .transcribe import load_model
  File "C:\Users\<user>\miniconda3\lib\site-packages\whisperx\transcribe.py", line 10, in <module>
    from .asr import load_model
  File "C:\Users\<user>\miniconda3\lib\site-packages\whisperx\asr.py", line 13, in <module>
    from .vad import load_vad_model, merge_chunks
  File "C:\Users\<user>\miniconda3\lib\site-packages\whisperx\vad.py", line 9, in <module>
    from pyannote.audio import Model
  File "C:\Users\<user>\miniconda3\lib\site-packages\pyannote\audio\__init__.py", line 29, in <module>
    from .core.inference import Inference
  File "C:\Users\<user>\miniconda3\lib\site-packages\pyannote\audio\core\inference.py", line 35, in <module>
    from pyannote.audio.core.model import Model
  File "C:\Users\<user>\miniconda3\lib\site-packages\pyannote\audio\core\model.py", line 45, in <module>
    from pyannote.audio.core.task import Problem, Resolution, Specifications, Task
  File "C:\Users\<user>\miniconda3\lib\site-packages\pyannote\audio\core\task.py", line 40, in <module>
    from torch_audiomentations import Identity
  File "C:\Users\<user>\miniconda3\lib\site-packages\torch_audiomentations\__init__.py", line 1, in <module>
    from .augmentations.background_noise import AddBackgroundNoise
  File "C:\Users\<user>\miniconda3\lib\site-packages\torch_audiomentations\augmentations\background_noise.py", line 11, in <module>
    from ..utils.io import Audio
  File "C:\Users\<user>\miniconda3\lib\site-packages\torch_audiomentations\utils\io.py", line 5, in <module>
    import librosa
  File "C:\Users\<user>\miniconda3\lib\site-packages\librosa\__init__.py", line 211, in <module>
    from . import core
  File "C:\Users\<user>\miniconda3\lib\site-packages\librosa\core\__init__.py", line 9, in <module>
    from .constantq import *  # pylint: disable=wildcard-import
  File "C:\Users\<user>\miniconda3\lib\site-packages\librosa\core\constantq.py", line 1059, in <module>
    dtype=np.complex,
  File "C:\Users\<user>\miniconda3\lib\site-packages\numpy\__init__.py", line 305, in __getattr__
    raise AttributeError(__former_attrs__[attr])
AttributeError: module 'numpy' has no attribute 'complex'.
`np.complex` was a deprecated alias for the builtin `complex`. To avoid this error in existing code, use `complex` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.complex128` here.
The aliases was originally deprecated in NumPy 1.20; for more details and guidance see the original release note at:
    https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations. Did you mean: 'complex_'?

I don't know Python/NumPy well enough to tell whether this is just a case of using a deprecated type or a fault on my end, since there's no other issue about it. Is it a bug?
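
It is not a fault on your end: np.complex was removed in NumPy 1.24, so the pinned librosa is simply incompatible with the NumPy that got installed. A hedged workaround is to keep NumPy below 1.24 (newer librosa releases no longer use the alias):

pip install "numpy<1.24"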

Providing lyrics / Dictionary / Hints

Hello, this project is awesome! I'm trying to make a tool that generates karaoke timings automatically... I was wondering if there's a method of providing the lyrics to the model ahead of time so it could just determine the timing? It's already 70-90% of the way there, but matching based on syllables/lines/string similarity can be problematic, because when it's wrong, it's wildly wrong.
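
Whisper has a limited hook for this: the initial_prompt option (it also appears in the whisperX defaults quoted in another issue on this page) biases decoding toward supplied text without forcing it. A sketch, assuming subsai passes the option through to the model:

from subsai import SubsAI

lyrics = "First line of the song / second line / ..."
subs_ai = SubsAI()
# initial_prompt nudges the decoder toward the known wording; it does not
# guarantee the transcript matches the lyrics token-for-token.
model = subs_ai.create_model('openai/whisper',
                             {'model_type': 'small', 'initial_prompt': lyrics})
subs = subs_ai.transcribe('song.mp3', model)
subs.save('song.srt')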

Help with the interview with the creator of OpenAI

I have been using your project for several days and it has been working more than perfectly, but today I got the idea of getting the .srt file for Lex Fridman's interview with OpenAI's creator.

https://www.youtube.com/watch?v=L_Guz73e6fw

As you can see, the video lasts more than 2 hours, and when I try to process it with your tool I get this error:
[screenshot of the error]

Is there any way to get the .srt file of such a long video using your tool?

Thank you for this beautiful project; I hope you can help me.


Edit: Now I am using this script:

[screenshot of the script]

It seems to be working:
[screenshot]

I'll leave it running and go to sleep.

Preparing metadata (pyproject.toml) ... error

 Preparing metadata (pyproject.toml) ... error
  error: subprocess-exited-with-error

  Γ— Preparing metadata (pyproject.toml) did not run successfully.
  β”‚ exit code: 1
  ╰─> [68 lines of output]
      <string>:5: DeprecationWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html
      C:\Users\marti\AppData\Local\Temp\pip-build-env-a6jqn33o\overlay\Lib\site-packages\setuptools\dist.py:946: SetuptoolsDeprecationWarning: The namespace_packages parameter is deprecated.
      !!

              ********************************************************************************
              Please replace its usage with implicit namespaces (PEP 420).

              See https://setuptools.pypa.io/en/latest/references/keywords.html#keyword-namespace-packages for details.
              ********************************************************************************

      !!
        ep.load()(self, ep.name, value)
      C:\Users\marti\AppData\Local\Temp\pip-build-env-a6jqn33o\overlay\Lib\site-packages\setuptools\dist.py:744: SetuptoolsDeprecationWarning: Invalid dash-separated options
      !!

              ********************************************************************************
              Usage of dash-separated 'author-email' will not be supported in future
              versions. Please use the underscore name 'author_email' instead.

              By 2023-Sep-26, you need to update your project and remove deprecated calls
              or your builds will no longer be supported.

              See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
              ********************************************************************************

      !!
        opt = self.warn_dash_deprecation(opt, section)
      C:\Users\marti\AppData\Local\Temp\pip-build-env-a6jqn33o\overlay\Lib\site-packages\setuptools\dist.py:744: SetuptoolsDeprecationWarning: Invalid dash-separated options
      !!

              ********************************************************************************
              Usage of dash-separated 'long-description' will not be supported in future
              versions. Please use the underscore name 'long_description' instead.

              By 2023-Sep-26, you need to update your project and remove deprecated calls
              or your builds will no longer be supported.

              See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
              ********************************************************************************

      !!
        opt = self.warn_dash_deprecation(opt, section)
      C:\Users\marti\AppData\Local\Temp\pip-build-env-a6jqn33o\overlay\Lib\site-packages\setuptools\dist.py:744: SetuptoolsDeprecationWarning: Invalid dash-separated options
      !!

              ********************************************************************************
              Usage of dash-separated 'long-description-content-type' will not be supported in future
              versions. Please use the underscore name 'long_description_content_type' instead.

              By 2023-Sep-26, you need to update your project and remove deprecated calls
              or your builds will no longer be supported.

              See https://setuptools.pypa.io/en/latest/userguide/declarative_config.html for details.
              ********************************************************************************

      !!
        opt = self.warn_dash_deprecation(opt, section)
      -- Building version 2.1.1
      running dist_info
      creating C:\Users\marti\AppData\Local\Temp\pip-modern-metadata-4bqzl5q2\pyannote.audio.egg-info
      writing C:\Users\marti\AppData\Local\Temp\pip-modern-metadata-4bqzl5q2\pyannote.audio.egg-info\PKG-INFO
      writing dependency_links to C:\Users\marti\AppData\Local\Temp\pip-modern-metadata-4bqzl5q2\pyannote.audio.egg-info\dependency_links.txt
      writing entry points to C:\Users\marti\AppData\Local\Temp\pip-modern-metadata-4bqzl5q2\pyannote.audio.egg-info\entry_points.txt
      writing namespace_packages to C:\Users\marti\AppData\Local\Temp\pip-modern-metadata-4bqzl5q2\pyannote.audio.egg-info\namespace_packages.txt
      writing requirements to C:\Users\marti\AppData\Local\Temp\pip-modern-metadata-4bqzl5q2\pyannote.audio.egg-info\requires.txt
      writing top-level names to C:\Users\marti\AppData\Local\Temp\pip-modern-metadata-4bqzl5q2\pyannote.audio.egg-info\top_level.txt
      writing manifest file 'C:\Users\marti\AppData\Local\Temp\pip-modern-metadata-4bqzl5q2\pyannote.audio.egg-info\SOURCES.txt'
      error: [WinError 267] The directory name is invalid ("Nom de rΓ©pertoire non valide")
      [end of output]

  note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed

Γ— Encountered error while generating package metadata.
╰─> See above for output.

note: This is an issue with the package mentioned above, not pip.
hint: See above for details.

Can't create .exe with PyInstaller when subsai is imported

I am trying to create a small program that works with subsai. I build the Python script into an .exe file using PyInstaller (auto-py-to-exe). When I build it, I get the following error:

File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\auto_py_to_exe\packaging.py", line 131, in package run_pyinstaller() File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\__main__.py", line 180, in run run_build(pyi_config, spec_file, **vars(args)) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\__main__.py", line 61, in run_build PyInstaller.building.build_main.main(pyi_config, spec_file, **kwargs) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\building\build_main.py", line 1035, in main build(specfile, distpath, workpath, clean_build) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\building\build_main.py", line 957, in build exec(code, spec_namespace) File "C:\Users\FARZAD~1\AppData\Local\Temp\tmpbkihmg4b\main.spec", line 21, in <module> a = Analysis( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\building\build_main.py", line 442, in __init__ self.__postinit__() File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\building\datastruct.py", line 184, in __postinit__ self.assemble() File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\building\build_main.py", line 603, in assemble priority_scripts.append(self.graph.add_script(script)) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 268, in add_script self._top_script_node = super().add_script(pathname) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1359, in add_script self._process_imports(n) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2811, in _process_imports target_modules = self._safe_import_hook(*import_info, **kwargs) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 432, in _safe_import_hook return super()._safe_import_hook(target_module_partname, source_module, target_attr_names, level, edge_attr) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2262, in _safe_import_hook target_modules = self.import_hook( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1431, in import_hook target_package, target_module_partname = self._find_head_package( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1610, in _find_head_package target_package = self._safe_import_module( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 479, in _safe_import_module return super()._safe_import_module(module_basename, module_name, parent_package) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1992, in _safe_import_module self._process_imports(n) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2811, in _process_imports target_modules = self._safe_import_hook(*import_info, **kwargs) File "C:\Users\Farzad 
Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 432, in _safe_import_hook return super()._safe_import_hook(target_module_partname, source_module, target_attr_names, level, edge_attr) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2262, in _safe_import_hook target_modules = self.import_hook( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1444, in import_hook submodule = self._safe_import_module(head, mname, submodule) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 479, in _safe_import_module return super()._safe_import_module(module_basename, module_name, parent_package) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1992, in _safe_import_module self._process_imports(n) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2811, in _process_imports target_modules = self._safe_import_hook(*import_info, **kwargs) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 432, in _safe_import_hook return super()._safe_import_hook(target_module_partname, source_module, target_attr_names, level, edge_attr) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2262, in _safe_import_hook target_modules = self.import_hook( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1444, in import_hook submodule = self._safe_import_module(head, mname, submodule) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 479, in _safe_import_module return super()._safe_import_module(module_basename, module_name, parent_package) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1992, in _safe_import_module self._process_imports(n) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2811, in _process_imports target_modules = self._safe_import_hook(*import_info, **kwargs) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 432, in _safe_import_hook return super()._safe_import_hook(target_module_partname, source_module, target_attr_names, level, edge_attr) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2262, in _safe_import_hook target_modules = self.import_hook( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1431, in import_hook target_package, target_module_partname = self._find_head_package( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1610, in _find_head_package target_package = self._safe_import_module( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 479, in _safe_import_module return super()._safe_import_module(module_basename, module_name, 
parent_package) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1992, in _safe_import_module self._process_imports(n) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2811, in _process_imports target_modules = self._safe_import_hook(*import_info, **kwargs) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 432, in _safe_import_hook return super()._safe_import_hook(target_module_partname, source_module, target_attr_names, level, edge_attr) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2262, in _safe_import_hook target_modules = self.import_hook( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1431, in import_hook target_package, target_module_partname = self._find_head_package( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1610, in _find_head_package target_package = self._safe_import_module( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 479, in _safe_import_module return super()._safe_import_module(module_basename, module_name, parent_package) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1992, in _safe_import_module self._process_imports(n) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2811, in _process_imports target_modules = self._safe_import_hook(*import_info, **kwargs) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 432, in _safe_import_hook return super()._safe_import_hook(target_module_partname, source_module, target_attr_names, level, edge_attr) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2262, in _safe_import_hook target_modules = self.import_hook( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1444, in import_hook submodule = self._safe_import_module(head, mname, submodule) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 479, in _safe_import_module return super()._safe_import_module(module_basename, module_name, parent_package) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1992, in _safe_import_module self._process_imports(n) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2811, in _process_imports target_modules = self._safe_import_hook(*import_info, **kwargs) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 432, in _safe_import_hook return super()._safe_import_hook(target_module_partname, source_module, target_attr_names, level, edge_attr) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2262, in _safe_import_hook target_modules = 
self.import_hook( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1431, in import_hook target_package, target_module_partname = self._find_head_package( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1610, in _find_head_package target_package = self._safe_import_module( File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 479, in _safe_import_module return super()._safe_import_module(module_basename, module_name, parent_package) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 1992, in _safe_import_module self._process_imports(n) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\lib\modulegraph\modulegraph.py", line 2811, in _process_imports target_modules = self._safe_import_hook(*import_info, **kwargs) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 368, in _safe_import_hook excluded_imports = self._find_all_excluded_imports(source_module.identifier) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\analysis.py", line 356, in _find_all_excluded_imports excluded_imports.update(module_hook.excludedimports) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\imphook.py", line 316, in __getattr__ self._load_hook_module() File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\depend\imphook.py", line 383, in _load_hook_module self._hook_module = importlib_load_source(self.hook_module_name, self.hook_filename) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\compat.py", line 612, in importlib_load_source return mod_loader.load_module() File "<frozen importlib._bootstrap_external>", line 529, in _check_name_wrapper File "<frozen importlib._bootstrap_external>", line 1029, in load_module File "<frozen importlib._bootstrap_external>", line 854, in load_module File "<frozen importlib._bootstrap>", line 274, in _load_module_shim File "<frozen importlib._bootstrap>", line 711, in _load File "<frozen importlib._bootstrap>", line 680, in _load_unlocked File "<frozen importlib._bootstrap_external>", line 850, in exec_module File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\_pyinstaller_hooks_contrib\hooks\stdhooks\hook-webrtcvad.py", line 15, in <module> datas = copy_metadata('webrtcvad') File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\PyInstaller\utils\hooks\__init__.py", line 942, in copy_metadata dist = pkg_resources.get_distribution(package_name) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\pkg_resources\__init__.py", line 477, in get_distribution dist = get_provider(dist) File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\pkg_resources\__init__.py", line 353, in get_provider return working_set.find(moduleOrReq) or require(str(moduleOrReq))[0] File "C:\Users\Farzad Nikzad\AppData\Roaming\Python\Python39\site-packages\pkg_resources\__init__.py", line 897, in require needed = self.resolve(parse_requirements(requirements)) File "C:\Users\Farzad 
Nikzad\AppData\Roaming\Python\Python39\site-packages\pkg_resources\__init__.py", line 783, in resolve raise DistributionNotFound(req, requirers) pkg_resources.DistributionNotFound: The 'webrtcvad' distribution was not found and is required by the application

And this is the part of my code in which subsai is used; when I delete this part, the exe file is created without any problem.

def create_subtitles(self):

    # Validate the directories selected in the UI.
    if not self.input_dir or not self.output_dir or not self.audio_dir:
        messagebox.showerror("Error", "Please select input, output, and audio directories.")
        return

    subs_ai = SubsAI()
    model_type = self.model_type_var.get()
    # Build the model once; the original code created it twice, and the
    # second call silently discarded the chosen model_type.
    model = WhisperTimeStamped({'model_type': model_type, 'segment_type': 'word'})

    for filename in os.listdir(self.input_dir):
        if filename.endswith('.mp4') or filename.endswith('.avi'):

            video_path = os.path.join(self.input_dir, filename)
            subtitle_path = os.path.join(self.output_dir, filename[:-4] + '.ass')
            audio_path = os.path.join(self.audio_dir, filename[:-4] + '.mp3')

            # Extract the audio track with ffmpeg.
            subprocess.run(['ffmpeg', '-i', video_path, '-vn', '-acodec', 'libmp3lame', '-q:a', '2', audio_path])

            # Transcribe the audio and save the subtitles.
            subs = subs_ai.transcribe(audio_path, model)
            subs.save(subtitle_path)

Any way to increase the file limit from 200 MB?

Hello, I love the program; it works amazingly well! I just want to find out if there is an easy way to increase the file limit from 200 MB to, say, 1 GB or higher.

Edit: I found out you can add a config.toml file to bypass the limit, but in doing so it pretty much kills my machine.
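
For reference, the Streamlit override mentioned in the edit looks like this (a sketch; the file goes in .streamlit/config.toml next to where the webui is launched, or in ~/.streamlit/). The machine load is likely a separate problem: the webui's player base64-encodes the whole media file into the page (see _media_file_base64 in the Streamlit memory error issue above), which grows with file size:

# .streamlit/config.toml
[server]
maxUploadSize = 1024  # in MB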

Enhancement: provide user-friendly packages

For ordinary users, pip is still not friendly enough for installation. Maybe it's better to package with PyInstaller for direct distribution, or put in a little more effort to turn this into an Electron or Tauri app, so the project can reach a larger user group. https://github.com/chidiwilliams/buzz has done it, but it's oriented more to transcription and less to captioning, so such a job may still be worthwhile here.

Upload and download via webui?

Hello,

Thank you for your great work, which saves me a lot of time and effort. Please forgive me if I didn't get it correctly: is there a way to upload video/audio files and download sub files via the webui? I think it would make this project easier to use if we could upload and download files directly in the webui.

I found the following links to streamlit docs might be helpful:
How do you retrieve the filename of a file uploaded with st.file_uploader?
How to download a file in Streamlit?

Thanks again,
Best,

Circular Import

Debug

davidliao@Davids-MacBook-Air project % /usr/bin/env /usr/local/bin/python3 /Users/davidliao/.vscode/extensions/ms-python.python-2023.6.1/pythonFiles/lib/python/debugpy/adapter/../../debugpy/launcher 57695 -- /Users/davidliao/Desktop/project/srt.py
Traceback (most recent call last):
  File "/Users/davidliao/Desktop/project/srt.py", line 2, in <module>
    from subsai import SubsAI
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/subsai/__init__.py", line 1, in <module>
    from subsai.main import SubsAI, Tools
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/subsai/main.py", line 27, in <module>
    from subsai.configs import AVAILABLE_MODELS
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/subsai/configs.py", line 8, in <module>
    from ffsubsync.constants import DEFAULT_MAX_SUBTITLE_SECONDS, DEFAULT_START_SECONDS, DEFAULT_MAX_OFFSET_SECONDS, \
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/ffsubsync/__init__.py", line 21, in <module>
    from .ffsubsync import main  # noqa
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/ffsubsync/ffsubsync.py", line 34, in <module>
    from ffsubsync.speech_transformers import (
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/ffsubsync/speech_transformers.py", line 23, in <module>
    from ffsubsync.generic_subtitles import GenericSubtitle
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/ffsubsync/generic_subtitles.py", line 9, in <module>
    import srt
  File "/Users/davidliao/Desktop/project/srt.py", line 2, in <module>
    from subsai import SubsAI
ImportError: cannot import name 'SubsAI' from partially initialized module 'subsai' (most likely due to a circular import) (/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/subsai/__init__.py)

Code

from subsai import SubsAI

def createsrt(path,name):
    file = f'./{path}/{name}.mp4'
    subs_ai = SubsAI()
    model = subs_ai.create_model('openai/whisper', {'model_type': 'base'})
    subs = subs_ai.transcribe(file, model)
    subs.save('test1.srt')

createsrt("mp4","input")

I am on macOS Monterey (M2 silicon) using VS Code, on the latest release, and have ffmpeg and setuptools-rust installed.
Using Python 3.10.4; I tried 3.9.6 as well.

I'm unsure what to do here, as I've never run into a circular import error; I just copied the code and replaced the path variable with my own. I've attempted to use it as a script/outside of the function, but I get the same error.

Also, is this usable with mp3 files as well? I assume that would be faster.
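
A note for readers: the traceback shows ffsubsync's import srt resolving to /Users/davidliao/Desktop/project/srt.py, so the script's own filename shadows the srt package that ffsubsync depends on, which re-enters subsai and creates the cycle. Renaming the script (and deleting any leftover __pycache__) should fix it, e.g.:

mv srt.py create_subs.py

And yes, audio-only input should work: other issues on this page pass mp3 files straight to transcribe().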

WebUI process dies immediately when finishing transcription

Running on Windows 11,
CUDA (NVidia RTX 3090)
python 3.10.6

Server is run with subsai-webui --server.maxUploadSize 50000 in case that's relevant.

using faster-whisper and large-v2

Trying to process a 1 hr 17 min wav file, just under 900 MB. It gets to the final segment, then just shuts down with no error message, as if the process exited successfully. The UI does not get updated, and the transcriptions are inaccessible. Tried both headless and not; no difference. I can't actually export srts from the UI because of this. I was hoping to use the webui because I have other AI generative stuff on this server but work on different computers at home.

I do get an occasional message on the UI that says "Connection failed with status 0".

Not sure what I'm doing wrong here. Let me know what other details you need.

Pip installs UNKNOWN

pip install git+https://github.com/abdeladim-s/subsai
Defaulting to user installation because normal site-packages is not writeable
Collecting git+https://github.com/abdeladim-s/subsai
  Cloning https://github.com/abdeladim-s/subsai to /tmp/pip-req-build-chkbjtlp
  Running command git clone --filter=blob:none --quiet https://github.com/abdeladim-s/subsai /tmp/pip-req-build-chkbjtlp
  Resolved https://github.com/abdeladim-s/subsai to commit 4cfbda07aad112057304e145d1f11023f3a892c2
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Building wheels for collected packages: UNKNOWN
  Building wheel for UNKNOWN (pyproject.toml) ... done
  Created wheel for UNKNOWN: filename=UNKNOWN-0.0.0-py3-none-any.whl size=13277 sha256=2836aa2f641949f305ce021aa33f6b45abb86ff4f26cf71cbadd2abdde9dd43f
  Stored in directory: /tmp/pip-ephem-wheel-cache-qv6qy6s3/wheels/09/1b/a3/6f2b2255cd57a3c69767cb2d735e1f329967c8fe3b750da6ae
Successfully built UNKNOWN
Installing collected packages: UNKNOWN
Successfully installed UNKNOWN-0.0.0

Running Linux Mint 21.1
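
An UNKNOWN-0.0.0 install usually means the build tooling was too old to read the project's metadata; a hedged first step is to upgrade the packaging stack, then retry:

python3 -m pip install --upgrade pip setuptools wheel
python3 -m pip install git+https://github.com/abdeladim-s/subsai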

Passing a filepath for models

I'm trying to use the facebook m2m 1.2B model, but Python can't download it before the connection times out and the download crashes. Is it possible to pass a filepath to a model I've already downloaded? It looks like that's possible in dl-translate, but I'm not sure how I'd access that model_or_path feature via the subsai options.
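
For reference, dl-translate itself accepts a local path as its first argument; a standalone sketch, independent of the subsai options (model_family is required when loading from a bare path):

import dl_translate as dlt

# Point model_or_path at an already-downloaded checkpoint directory.
mt = dlt.TranslationModel("/path/to/m2m100-1.2B", model_family="m2m100")
print(mt.translate("Bonjour le monde", source="French", target="English"))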

How to use translate via a CLI command

I use this command:
subsai -f srt --model ggerganov/whisper.cpp --model-configs '{"model_type": "base.en","n_threads":10}' --translation-configs '{"type": "facebook/m2m100_418M"}' -tsl English -ttl Chinese 111.avi
It runs, but it only generates the srt; it does not translate.
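
A hedged observation: the working translation example in the max_length issue above uses --translation-model plus the source/target flags rather than --translation-configs, so this variant may be worth trying (flag spelling taken from that issue; your model name kept as-is):

subsai 111.avi -f srt --model ggerganov/whisper.cpp --model-configs '{"model_type": "base.en", "n_threads": 10}' --translation-model facebook/m2m100_418M --translation-source-lang English --translation-target-lang Chinese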

Incorrect param on WhisperTimeStamped

Hi!
First of all, thanks for this awesome lib!! I love it!

I was testing the WhisperTimeStamped model, and reading the documentation in the source code I found the option to set 'segment_type' with these possible values:
'options': ['Sentence', 'Word']
[screenshot]

but the code expects the lowercase value 'word':
[screenshot]

So the options should either indicate the value properly, or the code should be changed to accept 'Word'.

It took me a while to realize why it was not working.
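
Until that's fixed, the lowercase spelling works; a minimal sketch using the config keys as they appear elsewhere on this page:

from subsai import SubsAI

model = SubsAI().create_model('linto-ai/whisper-timestamped',
                              {'model_type': 'base', 'segment_type': 'word'})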
