
whispercpp's Introduction

I just wondered how things were put together. - Claude Shannon

whispercpp's People

Contributors

aarnphm, asxzy, dependabot[bot], dgtlntv, github-actions[bot], hay, iantanwx, mmyjona, pajowu, remkade, rroohhh, sorgfresser


whispercpp's Issues

bug: RuntimeError: src/whispercpp/context.cc#L69: c.wctx is not initialized

Describe the bug

Hello! First of all thank you for your work!

I'd like to report a problem I'm having when trying to load quantized models. Could someone help me figure out whether I'm doing something wrong?

To reproduce

  1. Given a quantized Whisper model in ggml format (e.g. ggml-model-whisper-small-q5_1.bin)
  2. When I try to load it with Whisper.from_pretrained("ggml-model-whisper-small-q5_1.bin")
  3. I get the error RuntimeError: src/whispercpp/context.cc#L69: c.wctx is not initialized
    I also get the following log:
whisper_model_load: unknown tensor '<garbled binary data>' in model file

Expected behavior

The model should load without crashing.

Environment

python: 3.10.12
OS: Ubuntu 22.04

Docs don't explain whether an additional whisper.cpp build/install step is required

Describe the bug

Carried over from: ggerganov/whisper.cpp#9 (comment)

How can I make this work? I've cloned this whisper.cpp repo and run make main and make stream. I've made a virtualenv and installed whispercpp. When I try to run the stream.py example, I get:

Traceback (most recent call last):
  File "stream.py", line 44, in <module>
    default=w.api.SAMPLE_RATE,
  File "/Users/tailee/Projects/whisper.cpp/venv/lib/python3.8/site-packages/whispercpp/utils.py", line 144, in __getattr__
    self._module = self._load()
  File "/Users/tailee/Projects/whisper.cpp/venv/lib/python3.8/site-packages/whispercpp/utils.py", line 122, in _load
    module = importlib.import_module(self.__name__)
  File "/Users/tailee/.pyenv/versions/3.8.16/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 657, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 556, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1166, in create_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
ImportError: dlopen(/Users/tailee/Projects/whisper.cpp/venv/lib/python3.8/site-packages/whispercpp/api_cpp2py_export.so, 0x0002): symbol not found in flat namespace '_PyCMethod_New'

Do I need to make and install some shared libraries somewhere? If so, I could not find any instructions for this in whisper.cpp or whispercpp docs and existing GitHub issues.

To reproduce

Steps to reproduce (from memory):

git clone [email protected]:ggerganov/whisper.cpp.git
cd whisper.cpp
bash ./models/download-ggml-model.sh large
make main
make stream
python -m virtualenv venv
source venv/bin/activate
pip install whispercpp
curl https://raw.githubusercontent.com/aarnphm/whispercpp/main/examples/stream/stream.py > stream.py
python stream.py

This raises the same ImportError traceback shown above.

Expected behavior

No exception.

Environment

Python: 3.8.16 via PyEnv
macOS: Ventura 13.1
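A likely cause, worth verifying: `_PyCMethod_New` was added to CPython in 3.9 (PEP 573), so a wheel compiled against 3.9+ headers cannot be loaded by a 3.8 interpreter. A quick sanity check along these lines could fail fast with a clear message instead of a cryptic dlopen error (the exact minimum version the whispercpp wheel needs is an assumption here):

```python
import sys

# _PyCMethod_New first appeared in CPython 3.9 (PEP 573), so an extension
# compiled against 3.9+ headers cannot be dlopen'ed by a 3.8 interpreter.
def wheel_supports_interpreter(minimum=(3, 9)):
    """Return True if the running interpreter meets the assumed minimum."""
    return sys.version_info[:2] >= minimum

# usage sketch:
#   if not wheel_supports_interpreter():
#       raise SystemExit("whispercpp wheel likely requires Python >= 3.9")
```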

feat: Cuda support?

Feature request

Is there CUDA support?

Motivation

Transcription would be much faster if CUDA were used.

Other

No response

bug: Cannot change transcription language

Describe the bug

Even if the original audio is not English speech, the transcription is always translated into English.
I've tried to change the language property of the params using api.Params.language = 'it', but it didn't work.

To reproduce

import ffmpeg
import numpy as np
from whispercpp import Whisper
from whispercpp import api

try:
    y, _ = (
        ffmpeg.input("/Users/michelangelopucci/Downloads/untitled folder 2/output.wav", threads=0)
        .output("-", format="s16le", acodec="pcm_s16le", ac=1)
        .run(
            cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True
        )
    )
except ffmpeg.Error as e:
    raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

arr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0

api.Params.language = 'it'
w = Whisper.from_pretrained("large")
a = w.transcribe(arr)
print(a)
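Assigning to `api.Params.language` mutates the class, not the params object that `from_pretrained` builds internally, so it has no effect on the transcription. A sketch of the instance-level alternative, using the `with_language` helper that this binding's params appear to expose elsewhere in these issues (worth verifying against the installed version):

```python
def transcribe_italian(path: str) -> str:
    """Transcribe the audio at `path` in Italian."""
    from whispercpp import Whisper  # requires the bindings to be installed

    w = Whisper.from_pretrained("large")
    # Configure w.params (the params object from_pretrained created);
    # assigning to the api.Params *class* never reaches this object.
    w.params.with_language("it")
    return w.transcribe_from_file(path)
```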

Expected behavior

No response

Environment

Python 3.9.6

bug: from_pretrained() gives HTTP Error 401: Unauthorized

Describe the bug

Calling Whisper.from_pretrained("base") gives an error:

>>> whisper = Whisper.from_pretrained("base")
Downloading model base. It may take a while...Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/miniconda3/envs/misc/lib/python3.10/site-packages/whispercpp/__init__.py", line 106, in from_pretrained
    utils.download_model(model_name, basedir=basedir), no_state=no_state
  File "/opt/miniconda3/envs/misc/lib/python3.10/site-packages/whispercpp/utils.py", line 71, in download_model
    request.urlretrieve(MODELS_URL[model_name], model_path)
  File "/opt/miniconda3/envs/misc/lib/python3.10/urllib/request.py", line 241, in urlretrieve
    with contextlib.closing(urlopen(url, data)) as fp:
  File "/opt/miniconda3/envs/misc/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/miniconda3/envs/misc/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/opt/miniconda3/envs/misc/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/opt/miniconda3/envs/misc/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/opt/miniconda3/envs/misc/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/opt/miniconda3/envs/misc/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 401: Unauthorized

When I try to manually open the url (https://huggingface.co/datasets/ggerganov/whisper.cpp/resolve/main/ggml-base.bin) in the browser, hugging face is popping up a login dialog.


To reproduce

  1. Create conda environment with python 3.10
  2. pip install whispercpp
  3. Open python shell
  4. from whispercpp import Whisper
  5. whisper = Whisper.from_pretrained("base")
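The traceback's URL points at `datasets/ggerganov/whisper.cpp`, which Hugging Face now appears to gate behind a login; the same ggml files are also served from the ungated model repo. A sketch of fetching one manually and loading it by path (the URL pattern is an assumption to verify; `from_pretrained` accepting a local model path matches its usage elsewhere in these issues):

```python
import urllib.request
from pathlib import Path

def model_url(name: str) -> str:
    # model-repo path, in place of the gated datasets/... path in the traceback
    return f"https://huggingface.co/ggerganov/whisper.cpp/resolve/main/ggml-{name}.bin"

def download_model(name: str, basedir: Path) -> Path:
    """Download ggml-<name>.bin into basedir if missing and return its path."""
    basedir.mkdir(parents=True, exist_ok=True)
    path = basedir / f"ggml-{name}.bin"
    if not path.exists():
        urllib.request.urlretrieve(model_url(name), path)
    return path

# usage sketch:
#   w = Whisper.from_pretrained(str(download_model("base", Path.home() / ".ggml-models")))
```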

Expected behavior

No errors

Environment

python: 3.10
platform: OSX 13 (apple silicon)

bug: Runs Exclusively on CPU

Describe the bug

This binding is about 10 times slower than native whisper.cpp because it runs exclusively on the CPU on my M2 device.
whisper.cpp runs fine on the GPU on its own, so there is no reason this should not be possible for the Python bindings.

To reproduce

I ran this code:

from whispercpp import Whisper

w = Whisper.from_pretrained("large")
transcript = w.transcribe_from_file("output.wav")

I compared with whisper cpp command:
./main -f output.wav -m models/ggml-large.bin -otxt
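When benchmarking the bindings against native whisper.cpp it helps to time model loading and transcription separately, since `from_pretrained` can include a one-time model download that `./main` never pays. A small harness (a sketch; the commented whispercpp calls mirror the snippet above and assume the bindings are installed):

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

# usage sketch:
#   from whispercpp import Whisper
#   w, load_s = timed(Whisper.from_pretrained, "large")
#   text, run_s = timed(w.transcribe_from_file, "output.wav")
#   print(f"load: {load_s:.1f}s  transcribe: {run_s:.1f}s")
```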

Expected behavior

The bindings should run on the GPU and be roughly 10x faster, matching native whisper.cpp.

Environment

python 3.11
MacOS Sonoma
M2

bug(ci): wheels on 3.11

Describe the bug

Something breaks with upstream 3.11 from python-build-standalone (which rules_python depends on).

To reproduce

No response

Expected behavior

No response

Environment

bazel: 6
on main

bug: ARM wheels

Describe the bug

QEMU currently doesn't seem to work for building ARM wheels on CI; this needs fixing.

To reproduce

No response

Expected behavior

No response

Environment

main

bug: Wheel doesn't build

Describe the bug

[ 34%] Building C object CMakeFiles/SDL2main.dir/src/main/dummy/SDL_dummy_main.c.o
ccache: error: Failed to create directory /home/chris/.ccache/tmp: Read-only file system
make[2]: *** [CMakeFiles/SDL2main.dir/build.make:76: CMakeFiles/SDL2main.dir/src/main/dummy/SDL_dummy_main.c.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:113: CMakeFiles/SDL2main.dir/all] Error 2
make: *** [Makefile:136: all] Error 2
_____ END BUILD LOGS _____
rules_foreign_cc: Build wrapper script location: bazel-out/k8-fastbuild/bin/external/com_github_libsdl_sdl2/SDL_foreign_cc/wrapper_build_script.sh
rules_foreign_cc: Build script location: bazel-out/k8-fastbuild/bin/external/com_github_libsdl_sdl2/SDL_foreign_cc/build_script.sh
rules_foreign_cc: Build log location: bazel-out/k8-fastbuild/bin/external/com_github_libsdl_sdl2/SDL_foreign_cc/CMake.log

Target //:whispercpp_wheel failed to build
Use --verbose_failures to see the command lines of failed build steps.
INFO: Elapsed time: 294.182s, Critical Path: 133.17s
INFO: 125 processes: 108 internal, 16 linux-sandbox, 1 local.
FAILED: Build did NOT complete successfully

To reproduce

./tools/bazel build //:whispercpp_wheel

Expected behavior

No response

Environment

Python 3.8.13
Ubuntu 20.04

Sample code doesn't work

Describe the bug

Attempting to run the sample code results in the error: type object 'whispercpp.Whisper' has no attribute 'from_pretrained'

$ python
Python 3.11.7 (main, Dec  8 2023, 18:56:57) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from whispercpp import Whisper
Saving models to: ~/.ggml-models
>>> w = Whisper.from_pretrained("medium.en")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: type object 'whispercpp.Whisper' has no attribute 'from_pretrained'

To reproduce

No response

Expected behavior

No response

Environment

Ubuntu 22.04, Python 3.11.7

Bug: ERROR: Failed to initialized SDL: dsp: No such audio device

Describe the bug

Streaming issue: the bindings can't find or list any audio devices.

To reproduce

Standard installation instructions

"""Some streaming examples."""

import os
import sys
import typing as t

import whispercpp_py as w


def main(**kwargs: t.Any):
    kwargs.pop("list_audio_devices")
    mname = kwargs.pop("model_name", os.getenv("GGML_MODEL", "tiny.en"))
    iterator: t.Iterator[str] | None = None
    try:
        iterator = w.Whisper.from_pretrained(mname).stream_transcribe(**kwargs)
    finally:
        assert iterator is not None, "Something went wrong!"
        sys.stderr.writelines(
            ["\nTranscription (line by line):\n"] + [f"{it}\n" for it in iterator]
        )
        sys.stderr.flush()


if __name__ == "__main__":
    import argparse

    parser = argparse.ArgumentParser()
    parser.add_argument("--model_name", required=False)
    parser.add_argument(
        "--device_id", type=int, help="Choose the audio device", default=0
    )
    parser.add_argument(
        "--length_ms",
        type=int,
        help="Length of the audio buffer in milliseconds",
        default=5000,
    )
    parser.add_argument(
        "--sample_rate",
        type=int,
        help="Sample rate of the audio device",
        default=w.api.SAMPLE_RATE,
    )
    parser.add_argument(
        "--n_threads",
        type=int,
        help="Number of threads to use for decoding",
        default=8,
    )
    parser.add_argument(
        "--step_ms",
        type=int,
        help="Step size of the audio buffer in milliseconds",
        default=2000,
    )
    parser.add_argument(
        "--keep_ms",
        type=int,
        help="Length of the audio buffer to keep in milliseconds",
        default=200,
    )
    parser.add_argument(
        "--max_tokens",
        type=int,
        help="Maximum number of tokens to decode",
        default=32,
    )
    parser.add_argument("--audio_ctx", type=int, help="Audio context", default=0)
    parser.add_argument(
        "--list_audio_devices",
        action="store_true",
        default=False,
        help="Show available audio devices",
    )

    args = parser.parse_args()

    if args.list_audio_devices:
        w.utils.available_audio_devices()
        sys.exit(0)

    main(**vars(args))
$ python3 stream.py --list_audio_devices
ERROR: Failed to initialized SDL: dsp: No such audio device

 $ python3 stream.py --model_name ggml-base.en.bin
whisper_init_from_file_no_state: loading model from 'ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 2
whisper_model_load: mem required  =  218.00 MB (+    6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx     =  140.60 MB
whisper_model_load: model size    =  140.54 MB
whisper_init_state: kv self size  =    5.25 MB
whisper_init_state: kv cross size =   17.58 MB
ERROR: Failed to initialized SDL: dsp: No such audio device
Traceback (most recent call last):
  File "/home/acheong/.models/whisper_ggml/stream.py", line 15, in main
    iterator = w.Whisper.from_pretrained(mname).stream_transcribe(**kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/acheong/venv/lib/python3.11/site-packages/whispercpp_py/__init__.py", line 256, in stream_transcribe
    raise RuntimeError("Failed to initialize audio capture device.")
RuntimeError: Failed to initialize audio capture device.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/acheong/.models/whisper_ggml/stream.py", line 82, in <module>
    main(**vars(args))
  File "/home/acheong/.models/whisper_ggml/stream.py", line 17, in main
    assert iterator is not None, "Something went wrong!"
           ^^^^^^^^^^^^^^^^^^^^
AssertionError: Something went wrong!
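The `dsp` in the error is SDL's legacy OSS audio backend, which modern Ubuntu doesn't provide. SDL chooses its backend from the `SDL_AUDIODRIVER` environment variable, so forcing a current backend before the bindings initialize SDL may help (a sketch; the right backend name is system-dependent):

```python
import os

# Must run before the extension module initializes SDL.
# Candidate values on Linux: "pulseaudio", "alsa", "pipewire".
os.environ.setdefault("SDL_AUDIODRIVER", "pulseaudio")
```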

Expected behavior

(venv) [ 12:14AM ]  [ acheong@InsignificantV3:~/.models/whisper_ggml/whisper.cpp(master✔) ]
 $ ./stream -m ~/.models/whisper_ggml/ggml-base.en.bin -t 8 --step 500 --length 5000
init: found 1 capture devices:
init:    - Capture device #0: 'Built-in Audio Analog Stereo'
init: attempt to open default capture device ...
init: obtained spec for input device (SDL Id = 2):
init:     - sample rate:       16000
init:     - format:            33056 (required: 33056)
init:     - channels:          1 (required: 1)
init:     - samples per frame: 1024
whisper_init_from_file_no_state: loading model from '/home/acheong/.models/whisper_ggml/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51864
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head  = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 512
whisper_model_load: n_text_head   = 8
whisper_model_load: n_text_layer  = 6
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 2
whisper_model_load: mem required  =  310.00 MB (+    6.00 MB per decoder)
whisper_model_load: adding 1607 extra tokens
whisper_model_load: model ctx     =  140.66 MB
whisper_model_load: model size    =  140.54 MB
whisper_init_state: kv self size  =    5.25 MB
whisper_init_state: kv cross size =   17.58 MB

main: processing 8000 samples (step = 0.5 sec / len = 5.0 sec / keep = 0.2 sec), 8 threads, lang = en, task = transcribe, timestamps = 0 ...
main: n_new_line = 9, no_context = 1

 This is your... this more you.
 (drum roll)
whisper_print_timings:     load time =    85.85 ms
whisper_print_timings:     fallbacks =   1 p /   0 h
whisper_print_timings:      mel time =  1614.78 ms
whisper_print_timings:   sample time =   293.89 ms /   431 runs (    0.68 ms per run)
whisper_print_timings:   encode time = 10957.55 ms /     8 runs ( 1369.69 ms per run)
whisper_print_timings:   decode time =  1747.24 ms /   420 runs (    4.16 ms per run)
whisper_print_timings:    total time = 16279.01 ms

Environment

$ python -V
Python 3.11.2

acheong@InsignificantV3 
----------------------- 
OS: Ubuntu 23.04 x86_64 
Host: Laptop AB 
Kernel: 6.2.8-060208-generic 
Uptime: 9 hours, 3 mins 
Packages: 4237 (dpkg), 47 (nix-default), 14 (flatpak), 27 (snap) 
Shell: zsh 5.9 
Resolution: 2256x1504 
DE: GNOME 44.0 
WM: Mutter 
WM Theme: WhiteSur-Dark 
Theme: WhiteSur-Dark [GTK2/3] 
Icons: WhiteSur-dark [GTK2/3] 
Terminal: gnome-terminal 
CPU: 11th Gen Intel i7-1165G7 (8) @ 4.700GHz 
GPU: Intel TigerLake-LP GT2 [Iris Xe Graphics] 
Memory: 7047MiB / 15769MiB

bug: pip install fail

Describe the bug

Got this error when running pip install whispercpp

Collecting whispercpp
  Using cached whispercpp-0.0.12.tar.gz (684 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Building wheels for collected packages: whispercpp
  Building wheel for whispercpp (PEP 517) ... error
  ERROR: Command errored out with exit status 1:
   command: /Users/sudochia/anaconda3/envs/py39/bin/python /Users/sudochia/anaconda3/envs/py39/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py build_wheel /var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/tmpoqby07_u
       cwd: /private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-install-46wgz2zd/whispercpp_d0cf4eb3d1ae4e1694bd86e4bcf9c89f
  Complete output (104 lines):
  /private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/config/pyprojecttoml.py:108: _BetaConfiguration: Support for `[tool.setuptools]` in `pyproject.toml` is still *beta*.
    warnings.warn(msg, _BetaConfiguration)
  running bdist_wheel
  INFO: Running bazel wrapper (see //tools/bazel for details), bazel version 6.0.0 will be used instead of system-wide bazel installation.
  Downloading bazel, will try URLs: https://storage.googleapis.com/grpc-bazel-mirror/github.com/bazelbuild/bazel/releases/download/6.0.0/bazel-6.0.0-darwin-x86_64 https://github.com/bazelbuild/bazel/releases/download/6.0.0/bazel-6.0.0-darwin-x86_64
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed

    0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    0   257    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  curl: (22) The requested URL returned error: 404
    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                   Dload  Upload   Total   Spent    Left  Speed

    0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
    0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

    1 48.1M    1  767k    0     0   813k      0  0:01:00 --:--:--  0:01:00  813k
   29 48.1M   29 14.2M    0     0  7329k      0  0:00:06  0:00:01  0:00:05 12.8M
   61 48.1M   61 29.7M    0     0  9430k      0  0:00:05  0:00:03  0:00:02 12.6M
   74 48.1M   74 35.9M    0     0  9587k      0  0:00:05  0:00:03  0:00:02 12.1M
   96 48.1M   96 46.2M    0     0  9788k      0  0:00:05  0:00:04  0:00:01 11.6M
  100 48.1M  100 48.1M    0     0   9.7M      0  0:00:04  0:00:04 --:--:-- 11.9M
  Starting local Bazel server and connecting to it...
  Loading:
  Loading: 0 packages loaded
  Analyzing: target //:extensions (1 packages loaded, 0 targets configured)
  Analyzing: target //:extensions (41 packages loaded, 19 targets configured)
  Analyzing: target //:extensions (49 packages loaded, 465 targets configured)
  Analyzing: target //:extensions (49 packages loaded, 465 targets configured)
  Analyzing: target //:extensions (49 packages loaded, 465 targets configured)
  Analyzing: target //:extensions (50 packages loaded, 495 targets configured)
  Analyzing: target //:extensions (50 packages loaded, 495 targets configured)
  Analyzing: target //:extensions (51 packages loaded, 573 targets configured)
  Analyzing: target //:extensions (51 packages loaded, 573 targets configured)
  Analyzing: target //:extensions (54 packages loaded, 2035 targets configured)
  Analyzing: target //:extensions (54 packages loaded, 2035 targets configured)
  Analyzing: target //:extensions (54 packages loaded, 2035 targets configured)
  Analyzing: target //:extensions (56 packages loaded, 2089 targets configured)
  Analyzing: target //:extensions (58 packages loaded, 2185 targets configured)
  Analyzing: target //:extensions (58 packages loaded, 2185 targets configured)
  INFO: Analyzed target //:extensions (60 packages loaded, 11089 targets configured).
  INFO: Found 1 target...
  [0 / 8] [Prepa] Writing file extensions.sh
  [5 / 8] checking cached actions
  [23 / 121] Compiling ggml.c; 0s darwin-sandbox ... (4 actions running)
  [23 / 121] Compiling ggml.c; 1s darwin-sandbox ... (5 actions running)
  [23 / 121] Compiling ggml.c; 3s darwin-sandbox ... (6 actions, 5 running)
  [25 / 121] Compiling ggml.c; 4s darwin-sandbox ... (5 actions running)
  [25 / 121] Compiling ggml.c; 7s darwin-sandbox ... (6 actions, 5 running)
  INFO: From Compiling ggml.c:
  external/com_github_ggerganov_whisper/ggml.c:1465:19: warning: unused function 'ggml_up64' [-Wunused-function]
  static inline int ggml_up64(int n) {
                    ^
  1 warning generated.
  [27 / 121] Compiling whisper.cpp; 9s darwin-sandbox ... (5 actions running)
  [111 / 123] Compiling whisper.cpp; 10s darwin-sandbox ... (5 actions, 4 running)
  [115 / 127] Compiling whisper.cpp; 11s darwin-sandbox ... (4 actions running)
  [116 / 127] Compiling whisper.cpp; 14s darwin-sandbox ... (4 actions, 3 running)
  [118 / 127] Compiling src/whispercpp/api_cpp2py_export.cc; 15s darwin-sandbox ... (3 actions, 2 running)
  ERROR: /private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-install-46wgz2zd/whispercpp_d0cf4eb3d1ae4e1694bd86e4bcf9c89f/BUILD.bazel:127:17: Linking api_cpp2py_export.so failed: (Exit 1): cc_wrapper.sh failed: error executing command (from target //:api_cpp2py_export.so) external/llvm_toolchain/bin/cc_wrapper.sh @bazel-out/darwin-fastbuild/bin/api_cpp2py_export.so-2.params

  Use --sandbox_debug to see verbose messages from the sandbox and retain the sandbox build root for debugging
  clang: error: invalid linker name in argument '-fuse-ld=lld'
  Target //:extensions failed to build
  Use --verbose_failures to see the command lines of failed build steps.
  INFO: Elapsed time: 132.521s, Critical Path: 17.18s
  INFO: 122 processes: 112 internal, 9 darwin-sandbox, 1 local.
  FAILED: Build did NOT complete successfully
  ERROR: Build failed. Not running target
  Building pybind11 extension...
  Traceback (most recent call last):
    File "/Users/sudochia/anaconda3/envs/py39/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 349, in <module>
      main()
    File "/Users/sudochia/anaconda3/envs/py39/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 331, in main
      json_out['return_val'] = hook(**hook_input['kwargs'])
    File "/Users/sudochia/anaconda3/envs/py39/lib/python3.9/site-packages/pip/_vendor/pep517/in_process/_in_process.py", line 248, in build_wheel
      return _build_backend().build_wheel(wheel_directory, config_settings,
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 413, in build_wheel
      return self._build_with_temp_dir(['bdist_wheel'], '.whl',
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 398, in _build_with_temp_dir
      self.run_setup()
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/build_meta.py", line 335, in run_setup
      exec(code, locals())
    File "<string>", line 39, in <module>
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/__init__.py", line 108, in setup
      return distutils.core.setup(**attrs)
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 185, in setup
      return run_commands(dist)
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/_distutils/core.py", line 201, in run_commands
      dist.run_commands()
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 969, in run_commands
      self.run_command(cmd)
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/dist.py", line 1221, in run_command
      super().run_command(command)
    File "/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-build-env-udds7pkd/overlay/lib/python3.9/site-packages/setuptools/_distutils/dist.py", line 988, in run_command
      cmd_obj.run()
    File "<string>", line 27, in run
    File "<string>", line 22, in compile_ext
    File "/Users/sudochia/anaconda3/envs/py39/lib/python3.9/subprocess.py", line 424, in check_output
      return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
    File "/Users/sudochia/anaconda3/envs/py39/lib/python3.9/subprocess.py", line 528, in run
      raise CalledProcessError(retcode, process.args,
  subprocess.CalledProcessError: Command '['/private/var/folders/sq/stfprb0j06d3gjdr3ghknsyc0000gn/T/pip-install-46wgz2zd/whispercpp_d0cf4eb3d1ae4e1694bd86e4bcf9c89f/tools/bazel', 'run', '//:extensions']' returned non-zero exit status 1.
  ----------------------------------------
  ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp which use PEP 517 and cannot be installed directly

To reproduce

Just run pip install whispercpp on a MacBook Pro with an Intel chip

Expected behavior

No response

Environment

Python 3.9.12
MacBook Pro 2.9 GHz Quad-Core Intel Core i7

bug: api_cpp2py_export.so: Exec format error

Describe the bug

So I'm trying to get this running in a docker container and ran into the following error:

ctx = api.Context.from_file("/app/models/ggml-base.en.bin")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/whispercpp/utils.py", line 144, in __getattr__
    self._module = self._load()
                   ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/whispercpp/utils.py", line 122, in _load
    module = importlib.import_module(self.__name__)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 676, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 573, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1233, in create_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
ImportError: Error loading shared library /usr/local/lib/python3.11/site-packages/whispercpp/api_cpp2py_export.so: Exec format error

I get the same error using >>> w = Whisper.from_pretrained("tiny.en")

To reproduce

So far my setup is relatively simple:

# Backend build stage
FROM python:3.11-alpine

WORKDIR /app

COPY requirements.txt /app/requirements.txt

RUN pip install --trusted-host pypi.python.org -r requirements.txt

# Copy the backend source files
COPY . /app

EXPOSE 8000

ENV NAME fastapi-app

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
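"Exec format error" when loading a `.so` usually means a binary/runtime mismatch: here most likely a glibc (manylinux) wheel landing on Alpine's musl libc, or an x86_64 wheel pulled on an arm64 host. A variant worth trying is a glibc-based base image (a sketch of the same Dockerfile on `python:3.11-slim`):

```dockerfile
# Debian-based (glibc) image: manylinux wheels and their bundled .so files
# can load here, unlike on the musl-based python:3.11-alpine
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt /app/requirements.txt
RUN pip install --trusted-host pypi.python.org -r requirements.txt
COPY . /app
EXPOSE 8000
ENV NAME=fastapi-app
CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "8000"]
```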

Expected behavior

No response

Environment

I installed whispercpp==0.0.17 using pip (requirements.txt).

bug: significantly lower performance compared to original whisper.cpp

Describe the bug

I installed the latest version via:

pip install git+https://github.com/aarnphm/whispercpp.git -vv

Then I tested with a one-minute-long WAV file.

Here's the output of the original whisper.cpp command:
./main -m models/ggml-small.bin -f out.wav --language auto --max-len 1
whisper_init_from_file_no_state: loading model from 'models/ggml-small.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: ftype         = 1
whisper_model_load: qntvr         = 0
whisper_model_load: type          = 3
whisper_model_load: mem required  =  743.00 MB (+   16.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     =  464.68 MB
whisper_model_load: model size    =  464.44 MB
whisper_init_state: kv self size  =   15.75 MB
whisper_init_state: kv cross size =   52.73 MB

system_info: n_threads = 4 / 12 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | VSX = 0 | COREML = 0 | 

main: processing 'out.wav' (958952 samples, 59.9 sec), 4 threads, 1 processors, lang = auto, task = transcribe, timestamps = 1 ...

whisper_full_with_state: auto-detected language: ru (p = 0.993210)

# recognition results go here ...

whisper_print_timings:     load time =   571.00 ms
whisper_print_timings:     fallbacks =   0 p /   0 h
whisper_print_timings:      mel time =   310.98 ms
whisper_print_timings:   sample time =   378.25 ms /   426 runs (    0.89 ms per run)
whisper_print_timings:   encode time = 11926.32 ms /     4 runs ( 2981.58 ms per run)
whisper_print_timings:   decode time =  9821.29 ms /   425 runs (   23.11 ms per run)
whisper_print_timings:    total time = 23272.83 ms

Total execution time is 23 seconds.

Here's my python code which uses this library:

import time
from whispercpp import Whisper

start = time.time()

w = Whisper.from_pretrained(model_name="/whispercpp/models/ggml-small.bin")

w.params.with_language("auto")

print(w.transcribe_from_file("out.wav"))

end = time.time()
print(end - start)
And here's the output on the same file:
whisper_init_from_file_no_state: loading model from '/whispercpp/models/ggml-small.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab       = 51865
whisper_model_load: n_audio_ctx   = 1500
whisper_model_load: n_audio_state = 768
whisper_model_load: n_audio_head  = 12
whisper_model_load: n_audio_layer = 12
whisper_model_load: n_text_ctx    = 448
whisper_model_load: n_text_state  = 768
whisper_model_load: n_text_head   = 12
whisper_model_load: n_text_layer  = 12
whisper_model_load: n_mels        = 80
whisper_model_load: f16           = 1
whisper_model_load: type          = 3
whisper_model_load: mem required  =  608.00 MB (+   16.00 MB per decoder)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: model ctx     =  464.56 MB
whisper_model_load: model size    =  464.44 MB
whisper_init_state: kv self size  =   15.75 MB
whisper_init_state: kv cross size =   52.73 MB
whisper_full_with_state: auto-detected language: ru (p = 0.993206)

# recognition results go here...

183.6768798828125

Total execution time is 183s.

The difference is almost 9x.
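As a side note, the Python script above times model loading and transcription together, while the whisper.cpp total is reported separately from process startup; a small stdlib harness like this (the whispercpp calls are left as comments, since they need the model file) would make the comparison apples-to-apples, though it will not account for the whole 9x gap:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, results):
    # Record wall-clock seconds for the enclosed block under `label`.
    start = time.perf_counter()
    try:
        yield
    finally:
        results[label] = time.perf_counter() - start

timings = {}
with timed("load", timings):
    pass  # w = Whisper.from_pretrained("/whispercpp/models/ggml-small.bin")
with timed("transcribe", timings):
    pass  # print(w.transcribe_from_file("out.wav"))
print(timings)
```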

To reproduce

No response

Expected behavior

I'd expect performance on par with the original whisper.cpp.

Environment

Macbook pro 16, 2,6 GHz 6-Core Intel Core i7; 32 GB RAM
Python 3.10
Latest versions of whisper.cpp and this library as of 5th June 2023.

bug: static link library based on platform

Describe the bug

The included shared library should follow platform conventions. Right now .so is used on every platform.

MacOS -> .dylib
Linux -> .so
Windows -> .dll

To reproduce

No response

Expected behavior

No response

Environment

bazel: 6
macos

bug: Quantized Models not Loading

Describe the bug

When I pass the path of a quantized model into Whisper, it attempts to load the model however fails with the message whisper_model_load: ERROR not all tensors loaded from model file - expected 245, got 3. I don't believe this is an issue with the bindings or whisper.cpp, but rather an issue with the version of whisper.cpp built by Bazel.

A few lines before the error, whisper_model_load: f16 = 8 is printed; however, f16 should be ftype according to this commit on whisper.cpp. I confirmed this by loading the quantized model into whisper.cpp directly, which also confirms that the quantized model is valid. I built and installed the whisper.cpp bindings directly from source, so I am unsure why Bazel isn't building the right commit.

To reproduce

  • Install and build whisper.cpp bindings according to the readme
  • Quantize a model according to the instructions on whisper.cpp's readme
  • Attempt to load the quantized model by passing the full path of the quantized model into from_pretrained

Expected behavior

The model loads and runs without error.

Environment

Python: 3.10.9
OS: WSL Ubuntu 22.04
Bazel: 6.0.0 (installed from the Bazel script in ./tools/)
Bindings commit: ca5863c04c9e0eb3359ef0330dde06747cef406c
whisper.cpp submodule commit: 14bee39b290f81381a259c53a169ffac8fcce987

feat: wheels for arm devices

Feature request

Currently, main provides a hermetic LLVM toolchain, which takes a while to install the first time.

It would be nicer if we could build ARM wheels for Linux and macOS.

Motivation

No response

Other

No response

feat: cuBLAS Support

Feature request

It would be nice to be able to compile with cuBLAS support while installing/building locally. I haven't found a way to do so, but I am also unfamiliar with Bazel so apologies if this is already possible, and if so, how would I go about doing this?

Motivation

This would offload lots of processing from the CPU and onto the GPU speeding up transcribing time considerably for those with a powerful GPU.

Other

No response

feat: async streaming APIs

Feature request

Currently, stream_transcribe is blocking, which should be async instead.

Motivation

No response

Other

No response

feat(extension): HuggingFace support

Feature request

This is rather a huge extension, but it can optionally switch to HuggingFace whisper as backend to run inference on GPU for now. 😄

Motivation

No response

Other

No response

from_pretrained load local model

Describe the bug

code is
w = Whisper.from_pretrained('/Users/haowmazs/testdata/whisper.cpp-master/models/ggml-medium.bin')

throw exception
RuntimeError: '/Users/haowmazs/testdata/whisper.cpp-master/models/ggml-medium.bin' is not a valid preconverted model. Choose one of ['tiny.en', 'tiny', 'base.en', 'base', 'small.en', 'small', 'medium.en', 'medium', 'large-v1', 'large']

To reproduce

No response

Expected behavior

No response

Environment

Python 3.9.7
whispercpp 0.017

bug: Doesn't work on installing on Windows

Describe the bug

Trying to use this port on windows.

I also tried building it on Windows in several ways, but all of them led to dead-end errors, so that's a problem for another time.

I used it in a WSL2 Ubuntu environment on the same machine, following the same steps, and it works. So the problem is almost certainly the OS.

To reproduce

  1. Installed on Windows with pip install whispercpp; it reports that the installation succeeded
  2. When running code that imports the library, the following error occurs:

Error: None (reason: No module named 'whispercpp.api_cpp2py_export')

when running this line of code

Whisper.from_pretrained(params.model, './models')

Expected behavior

For the package to be usable on windows

Environment

python 3.9.3
Windows 11

feat: CoreML support

Feature request

I would like a simple flag to pass in order to enable CoreML.

Motivation

Speeds up inference many times.

Other

No response

feat: Command mode

Feature request

Would it be possible to include the command mode from the main branch, which accepts voice commands from the microphone?

Motivation

N/A

Other

N/A

feat: wheels

Feature request

  • manylinux and darwin wheels

Motivation

No response

Other

No response

bug: Pip install fails

Describe the bug

Running pip install git+https://github.com/aarnphm/whispercpp.git results in the error:

Collecting git+https://github.com/aarnphm/whispercpp.git
  Cloning https://github.com/aarnphm/whispercpp.git to /tmp/pip-req-build-vzxr7w4b
  Running command git clone --filter=blob:none --quiet https://github.com/aarnphm/whispercpp.git /tmp/pip-req-build-vzxr7w4b
  Resolved https://github.com/aarnphm/whispercpp.git to commit 00f331b54be5dddaa4eb1611b5f533f8d29f18c1
  Running command git submodule update --init --recursive -q
  fatal: remote error: upload-pack: not our ref c9d5095f0c64455b201f1cd0b547efcf093ee7c3
  fatal: the remote end hung up unexpectedly
  Fetched in submodule path 'extern/whispercpp/bindings/ios', but it did not contain c9d5095f0c64455b201f1cd0b547efcf093ee7c3. Direct fetching of that commit failed.
  Failed to recurse into submodule path 'extern/whispercpp'
  error: subprocess-exited-with-error
  
  × git submodule update --init --recursive -q did not run successfully.
  │ exit code: 1
  ╰─> See above for output.
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× git submodule update --init --recursive -q did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.

To reproduce

No response

Expected behavior

No response

Environment

Ubuntu

bug: streaming

Describe the bug

Probably has to do with whisper_state

To reproduce

No response

Expected behavior

No response

Environment

main

bug: submodule cron update doesn't update bazel commit pinning

Describe the bug

There are a lot of commits bumping the whisper.cpp submodule to the newest version, but the whisper.cpp commit hash in rules/deps.bzl stays the same.

Wouldn't it make more sense to update that hash in that case as well?

To reproduce

No response

Expected behavior

No response

Environment

github

bug: params.with_max_segment_length(1) doesn't produce word-level segments

Describe the bug

Given this code:

model = w.Whisper.from_pretrained("tiny")
params = model.params.with_max_segment_length(1).build()

samples = [] # np.array of samples from pydub
model.context.full(params, samples)
for s in range(model.context.full_n_segments()):
    print(model.context.full_get_segment_text(s))

I would expect the output to be individual segments, but instead I just get normal sentences.
It doesn't seem like with_max_segment_length makes any difference.

I may be misunderstanding some things - my main goal is to just get word-level timestamps by outputting individual words.
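In whisper.cpp itself, the max_len splitting is only applied when token_timestamps is also enabled, so enabling that flag first may be what's missing. The builder method name in the comment below is an assumption (check the params object for the exact spelling); the splitter function is just a plain-Python fallback that discards timing:

```python
# In whisper.cpp, params.max_len only takes effect when token_timestamps
# is also enabled, so try turning both on (with_token_timestamps is an
# assumed builder name -- verify it against the installed bindings):
#
#   params = (
#       model.params.with_token_timestamps(True)
#       .with_max_segment_length(1)
#       .build()
#   )

def split_segments_into_words(segments):
    # Fallback: naively split already-transcribed segment texts into
    # words. This loses per-word timing, which only token-level
    # timestamps from the model can provide.
    return [word for text in segments for word in text.split()]

print(split_segments_into_words([" Hello world.", " How are you?"]))
```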

To reproduce

No response

Expected behavior

No response

Environment

python: 3.11
platform: MacOS 13.3

bug: Readme issues

Describe the bug

I detected multiple README flaws.

  1. It should be w.transcribe(np.ones(16000)) with just one dimension; at least, that's what worked for me (the later example uses flatten(), so it is already done there).
  2. Might be mistaken on this one, but

Running transcription on a given Numpy array. This calls full from whisper.cpp. If num_proc is greater than 1, it will use full_parallel instead.

is simply not the case: in the code in __init__.py, transcribe always calls full_parallel.
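On the first point, a minimal check of the expected input shape (this mirrors the call that worked in the report; numpy is the only dependency):

```python
import numpy as np

# One second of audio at whisper's expected 16 kHz sample rate,
# flattened to a single dimension as the working example requires.
arr = np.ones(16000, dtype=np.float32)
assert arr.ndim == 1  # a 2-D array like np.ones((1, 16000)) would fail

# w.transcribe(arr)  # the call that worked in practice
```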

To reproduce

Copy paste example README.md code. It fails.

Expected behavior

It doesn't fail.

Environment

python: 3.9
platform: Ubuntu 20.04

bug: pip install fails in macOS

Describe the bug

When I do

pip3 install whispercpp

in my MacOs.

I get the following error :

   cmd_obj.run()
        File "<string>", line 28, in run
        File "<string>", line 23, in compile_ext
        File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/subprocess.py", line 424, in check_output
          return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
        File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/subprocess.py", line 528, in run
          raise CalledProcessError(retcode, process.args,
      subprocess.CalledProcessError: Command '['/private/var/folders/lp/xpmn7gz528l07lt294j7bqz00000gp/T/pip-install-5rafli_7/whispercpp_ab016005d8bc4a0b9a1a98eb17eca408/tools/bazel', 'run', '//:extensions']' returned non-zero exit status 1.
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
  ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects

To reproduce

Run the pip3 install command.

Expected behavior

Does not fail.

Environment

MacOS M2.
Python 3.9

Error when running on Debian ARM64

Hi,

I'm trying to run the code from a Debian ARM64 on AWS.

After installing using pip install whispercpp, I executed the following code:

from whispercpp import Whisper

print(f'STT Starting')
w = Whisper.from_pretrained("tiny")
w.transcribe_from_file("/etc/stt/audio.wav")

And it fails with this trace:

STT Starting
Traceback (most recent call last):
  File "/etc/stt/stt.py", line 4, in <module>
    w = Whisper.from_pretrained("tiny")
  File "/usr/local/lib/python3.9/dist-packages/whispercpp/__init__.py", line 105, in from_pretrained
    context = api.Context.from_file(
  File "/usr/local/lib/python3.9/dist-packages/whispercpp/utils.py", line 144, in __getattr__
    self._module = self._load()
  File "/usr/local/lib/python3.9/dist-packages/whispercpp/utils.py", line 122, in _load
    module = importlib.import_module(self.__name__)
  File "/usr/lib/python3.9/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1030, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
  File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 666, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 565, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1108, in create_module
  File "<frozen importlib._bootstrap>", line 228, in _call_with_frames_removed
ImportError: /usr/local/lib/python3.9/dist-packages/whispercpp/api_cpp2py_export.so: invalid ELF header

Am I missing something?

bug: Doesn't install on wsl2

Describe the bug

pip install git+https://github.com/aarnphm/whispercpp.git leads to errors and doesn't install

ERROR: An error occurred during the fetch of repository 'local_config_python':
Traceback (most recent call last):
File "/home/tt/.cache/bazel/_bazel_tt/eef295a781ab947c7038b674b37d49c9/external/pybind11_bazel/python_configure.bzl", line 409, column 36, in _python_autoconf_impl
_create_local_python_repository(repository_ctx)
File "/home/tt/.cache/bazel/_bazel_tt/eef295a781ab947c7038b674b37d49c9/external/pybind11_bazel/python_configure.bzl", line 355, column 33, in _create_local_python_repository
python_bin = _get_python_bin(repository_ctx)
File "/home/tt/.cache/bazel/_bazel_tt/eef295a781ab947c7038b674b37d49c9/external/pybind11_bazel/python_configure.bzl", line 173, column 10, in _get_python_bin
_fail("Cannot find python in PATH, please make sure " +
File "/home/tt/.cache/bazel/_bazel_tt/eef295a781ab947c7038b674b37d49c9/external/pybind11_bazel/python_configure.bzl", line 27, column 9, in _fail
fail("%sPython Configuration Error:%s %s\n" % (red, no_color, msg))
Error in fail: Python Configuration Error: Cannot find python in PATH, please make sure python is installed and add its directory in PATH, or --repo_env PYTHON_BIN_PATH='/something/else'.
PATH=/tmp/pip-build-env-5d2xcjbc/overlay/local/bin:/tmp/pip-build-env-5d2xcjbc/normal/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/libnvvp:/mnt/c/Windows/system32:/mnt/c/Windows:/mnt/c/Windows/System32/Wbem:/mnt/c/Windows/System32/WindowsPowerShell/v1.0/:/mnt/c/Windows/System32/OpenSSH/:/mnt/c/Program Files/Intel/WiFi/bin/:/mnt/c/Program Files/Common Files/Intel/WirelessCommon/:/mnt/c/Program Files/Git/cmd:/mnt/c/msys64/mingw64/bin:/mnt/c/msys64/usr/bin:/mnt/c/Program Files/ffmpeg:/mnt/c/Program Files/NVIDIA Corporation/Nsight Compute 2023.2.0/:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Users/yinch/AppData/Local/Programs/Python/Python311/Scripts/:/mnt/c/Users/yinch/AppData/Local/Programs/Python/Python311/:/mnt/c/Users/yinch/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/yinch/AppData/Local/Programs/Microsoft VS Code/bin:/snap/bin
ERROR: /tmp/pip-req-build-nk8q05cy/WORKSPACE:89:17: fetching python_configure rule //external:local_config_python: Traceback (most recent call last):
File "/home/tt/.cache/bazel/_bazel_tt/eef295a781ab947c7038b674b37d49c9/external/pybind11_bazel/python_configure.bzl", line 409, column 36, in _python_autoconf_impl
_create_local_python_repository(repository_ctx)
File "/home/tt/.cache/bazel/_bazel_tt/eef295a781ab947c7038b674b37d49c9/external/pybind11_bazel/python_configure.bzl", line 355, column 33, in _create_local_python_repository
python_bin = _get_python_bin(repository_ctx)
File "/home/tt/.cache/bazel/_bazel_tt/eef295a781ab947c7038b674b37d49c9/external/pybind11_bazel/python_configure.bzl", line 173, column 10, in _get_python_bin
_fail("Cannot find python in PATH, please make sure " +
File "/home/tt/.cache/bazel/_bazel_tt/eef295a781ab947c7038b674b37d49c9/external/pybind11_bazel/python_configure.bzl", line 27, column 9, in _fail
fail("%sPython Configuration Error:%s %s\n" % (red, no_color, msg))
Error in fail: Python Configuration Error: Cannot find python in PATH, please make sure python is installed and add its directory in PATH, or --repo_env PYTHON_BIN_PATH='/something/else'.
PATH=/tmp/pip-build-env-5d2xcjbc/overlay/local/bin:/tmp/pip-build-env-5d2xcjbc/normal/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/libnvvp:/mnt/c/Windows/system32:/mnt/c/Windows:/mnt/c/Windows/System32/Wbem:/mnt/c/Windows/System32/WindowsPowerShell/v1.0/:/mnt/c/Windows/System32/OpenSSH/:/mnt/c/Program Files/Intel/WiFi/bin/:/mnt/c/Program Files/Common Files/Intel/WirelessCommon/:/mnt/c/Program Files/Git/cmd:/mnt/c/msys64/mingw64/bin:/mnt/c/msys64/usr/bin:/mnt/c/Program Files/ffmpeg:/mnt/c/Program Files/NVIDIA Corporation/Nsight Compute 2023.2.0/:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Users/yinch/AppData/Local/Programs/Python/Python311/Scripts/:/mnt/c/Users/yinch/AppData/Local/Programs/Python/Python311/:/mnt/c/Users/yinch/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/yinch/AppData/Local/Programs/Microsoft VS Code/bin:/snap/bin
ERROR: /tmp/pip-req-build-nk8q05cy/BUILD.bazel:116:17: //:audio_cpp2py_export.so depends on @local_config_python//:python_headers in repository @local_config_python which failed to fetch. no such package '@local_config_python//': Python Configuration Error: Cannot find python in PATH, please make sure python is installed and add its directory in PATH, or --repo_env PYTHON_BIN_PATH='/something/else'.
PATH=/tmp/pip-build-env-5d2xcjbc/overlay/local/bin:/tmp/pip-build-env-5d2xcjbc/normal/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/usr/lib/wsl/lib:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/bin:/mnt/c/Program Files/NVIDIA GPU Computing Toolkit/CUDA/v12.2/libnvvp:/mnt/c/Windows/system32:/mnt/c/Windows:/mnt/c/Windows/System32/Wbem:/mnt/c/Windows/System32/WindowsPowerShell/v1.0/:/mnt/c/Windows/System32/OpenSSH/:/mnt/c/Program Files/Intel/WiFi/bin/:/mnt/c/Program Files/Common Files/Intel/WirelessCommon/:/mnt/c/Program Files/Git/cmd:/mnt/c/msys64/mingw64/bin:/mnt/c/msys64/usr/bin:/mnt/c/Program Files/ffmpeg:/mnt/c/Program Files/NVIDIA Corporation/Nsight Compute 2023.2.0/:/mnt/c/Program Files (x86)/NVIDIA Corporation/PhysX/Common:/mnt/c/Users/yinch/AppData/Local/Programs/Python/Python311/Scripts/:/mnt/c/Users/yinch/AppData/Local/Programs/Python/Python311/:/mnt/c/Users/yinch/AppData/Local/Microsoft/WindowsApps:/mnt/c/Users/yinch/AppData/Local/Programs/Microsoft VS Code/bin:/snap/bin
Analyzing: target //:extensions (53 packages loaded, 523 targets configured)
ERROR: Analysis of target '//:extensions' failed; build aborted:
INFO: Elapsed time: 29.841s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (53 packages loaded, 523 targets configured)
ERROR: Build failed. Not running target
Building pybind11 extension...
Traceback (most recent call last):
File "/home/tt/.local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
main()
File "/home/tt/.local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
json_out['return_val'] = hook(**hook_input['kwargs'])
File "/home/tt/.local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 251, in build_wheel
return _build_backend().build_wheel(wheel_directory, config_settings,
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 434, in build_wheel
return self._build_with_temp_dir(
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 419, in _build_with_temp_dir
self.run_setup()
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/build_meta.py", line 341, in run_setup
exec(code, locals())
File "<string>", line 45, in <module>
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/__init__.py", line 107, in setup
return distutils.core.setup(**attrs)
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 185, in setup
return run_commands(dist)
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/_distutils/core.py", line 201, in run_commands
dist.run_commands()
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 969, in run_commands
self.run_command(cmd)
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/dist.py", line 1233, in run_command
super().run_command(command)
File "/tmp/pip-build-env-5d2xcjbc/overlay/local/lib/python3.10/dist-packages/setuptools/_distutils/dist.py", line 988, in run_command
cmd_obj.run()
File "<string>", line 28, in run
File "<string>", line 23, in compile_ext
File "/usr/lib/python3.10/subprocess.py", line 421, in check_output
return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
File "/usr/lib/python3.10/subprocess.py", line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['/tmp/pip-req-build-nk8q05cy/tools/bazel', 'run', '//:extensions']' returned non-zero exit status 1.
error: subprocess-exited-with-error

× Building wheel for whispercpp (pyproject.toml) did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
full command: /usr/bin/python3 /home/tt/.local/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py build_wheel /tmp/tmp192izf51
cwd: /tmp/pip-req-build-nk8q05cy
Building wheel for whispercpp (pyproject.toml) ... error
ERROR: Failed building wheel for whispercpp
Failed to build whispercpp
ERROR: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects
Exception information:
Traceback (most recent call last):
File "/home/tt/.local/lib/python3.10/site-packages/pip/_internal/cli/base_command.py", line 180, in exc_logging_wrapper
status = run_func(*args)
File "/home/tt/.local/lib/python3.10/site-packages/pip/_internal/cli/req_command.py", line 248, in wrapper
return func(self, options, args)
File "/home/tt/.local/lib/python3.10/site-packages/pip/_internal/commands/install.py", line 429, in run
raise InstallationError(
pip._internal.exceptions.InstallationError: Could not build wheels for whispercpp, which is required to install pyproject.toml-based projects
Remote version of pip: 23.2.1
Local version of pip: 23.2.1
Was pip installed by pip? True
Removed build tracker: '/tmp/pip-build-tracker-he5jdcca'

To reproduce

install on wsl2 (ubuntu)
pip install git+https://github.com/aarnphm/whispercpp.git

Expected behavior

For the package to install

Environment

python 3.10.12
windows 11 wsl2

bug: Custom model not loading.

Describe the bug

Code:
!ls /content
Out: ggml-model.bin sample_data whisper whisper.cpp

Code:

from whispercpp import Whisper
w = Whisper.from_pretrained("/content/ggml-model.bin")

Out:

<ipython-input-16-daf3870535af> in <cell line: 4>()
from whispercpp import Whisper
w = Whisper.from_pretrained("/content/ggml-model.bin")

/usr/local/lib/python3.10/dist-packages/whispercpp/__init__.py in from_pretrained(model_name, basedir, no_state)

RuntimeError: '/content/ggml-model.bin' is not a valid preconverted model. Choose one of ['tiny.en', 'tiny', 'base.en', 'base', 'small.en', 'small', 'medium.en', 'medium', 'large-v1', 'large']

To reproduce

I downloaded the model and converted it using a script:

!python whisper.cpp/models/convert-h5-to-ggml.py /content/whisper-small whisper ./

Expected behavior

No response

Environment

python 3.10.6
Google Colab

How do I print out the timestamps along with the transcribed text?

Feature request

Hi,

Thanks for writing this out. I am using it in my project, and I am a beginner in Python.

I tried the sample code with a file, and it works, but I do not get the timestamps in the output, only the transcribed text.
I tried to do this:

    w = Whisper.from_pretrained(MODEL_NAME)
    w.params.token_timestamps = True
    try:
        y, _ = (
            ffmpeg.input(input_file, threads=0)
            .output("-", format=SAMPLE_FORMAT, acodec=SAMPLE_CODEC, ac=1)
            .run(
                cmd=["ffmpeg", "-nostdin"], capture_stdout=True, capture_stderr=True
            )
        )
    except ffmpeg.Error as e:
        raise RuntimeError(f"Failed to load audio: {e.stderr.decode()}") from e

    arr = np.frombuffer(y, np.int16).flatten().astype(np.float32) / 32768.0
    return w.transcribe(arr, num_proc=3)

It still does not work.
I would like to have timestamps in the output like the original Whisper:

[00:00:00.000 --> 00:00:07.000]   If you want to know how to say "hi" in Spanish the next time you greet someone, just say "ola".
[00:00:07.000 --> 00:00:18.000]   For a more casual way to say "hi", try "kipasa", which means "what's happening", or "kital", which means "w

How can I achieve this? Please let me know. Sorry if this is a noob question.
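whisper.cpp stores segment times in centiseconds (the C API's whisper_full_get_segment_t0/t1). Assuming the bindings expose matching getters on the context (an assumption worth checking with dir(w.context)), the bracketed layout can be rebuilt with a small formatter:

```python
def format_cs(cs):
    # Convert whisper.cpp centiseconds (units of 10 ms) to HH:MM:SS.mmm.
    ms = cs * 10
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1_000)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"

# Hypothetical usage -- the t0/t1 getter names are assumed, not verified:
# for i in range(w.context.full_n_segments()):
#     print(f"[{format_cs(w.context.full_get_segment_t0(i))} --> "
#           f"{format_cs(w.context.full_get_segment_t1(i))}]   "
#           f"{w.context.full_get_segment_text(i)}")

print(format_cs(700))  # -> 00:00:07.000
```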

Motivation

No response

Other

No response

bug: core dump where i try in ubuntu

Describe the bug

Python 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

from whispercpp import Whisper
w = Whisper.from_pretrained("tiny.en")
Illegal instruction (core dumped)

To reproduce

No response

Expected behavior

No response

Environment

Python 3.10.6 (main, May 29 2023, 11:10:38) [GCC 11.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.

Distributor ID: Ubuntu
Description: Ubuntu 22.04.1 LTS
Release: 22.04
Codename: jammy

bug: linux wheel Import Error

Describe the bug

The package installed from pypi causes the following erros:

Python 3.11.2 (main, Mar 23 2023, 17:12:29) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from whispercpp import Whisper
>>>
>>> w = Whisper.from_pretrained("tiny.en")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/whispercpp/__init__.py", line 105, in from_pretrained
    context = api.Context.from_file(
              ^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/whispercpp/utils.py", line 144, in __getattr__
    self._module = self._load()
                   ^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/whispercpp/utils.py", line 122, in _load
    module = importlib.import_module(self.__name__)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<frozen importlib._bootstrap>", line 1206, in _gcd_import
  File "<frozen importlib._bootstrap>", line 1178, in _find_and_load
  File "<frozen importlib._bootstrap>", line 1149, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 676, in _load_unlocked
  File "<frozen importlib._bootstrap>", line 573, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1233, in create_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
ImportError: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.32' not found (required by /usr/local/lib/python3.11/site-packages/whispercpp/api_cpp2py_export.so)

Then I rebuilt the package from the latest source, and the program crashed.

Python 3.11.2 (main, Mar 23 2023, 17:12:29) [GCC 10.2.1 20210110] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from whispercpp import Whisper
>>>
>>> w = Whisper.from_pretrained("tiny.en")
Illegal instruction (core dumped)

To reproduce

  1. docker run -i -t python:3 bash
  2. pip install whispercpp
root@495bbb253133:/# pip install whispercpp
Collecting whispercpp
  Downloading whispercpp-0.0.17-cp311-cp311-manylinux2014_x86_64.whl (1.5 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 2.5 MB/s eta 0:00:00
Installing collected packages: whispercpp
Successfully installed whispercpp-0.0.17
  3. Run the code from the README.
  4. Rebuild and reinstall the package using python3 -m build -w
  5. Run the code from the README again.

Expected behavior

No response

Environment

python: 3.11
platform: x86-64 Debian GNU/Linux 11 (bullseye)

perf: drastically slower when running inference

Describe the bug

Throws this error when I try and install from pypi.

WARNING: Generating metadata for package whispercpp produced metadata for project name unknown. Fix your #egg=whispercpp fragments.
Discarding https://files.pythonhosted.org/packages/1a/19/21264d0e12a94896a223a14fe4796bacfb988d7b65b4921ec3f90fe16f96/whispercpp-0.0.2.tar.gz#sha256=ebdfbd44c928d8a5ab88396d9ff7d63753256fbf4c4c4ca794d0899f88487f0e (from https://pypi.org/simple/whispercpp/) (requires-python:>=3.8): Requested unknown from https://files.pythonhosted.org/packages/1a/19/21264d0e12a94896a223a14fe4796bacfb988d7b65b4921ec3f90fe16f96/whispercpp-0.0.2.tar.gz#sha256=ebdfbd44c928d8a5ab88396d9ff7d63753256fbf4c4c4ca794d0899f88487f0e has inconsistent name: filename has 'whispercpp', but metadata has 'unknown'
ERROR: Could not find a version that satisfies the requirement whispercpp (from versions: 0.0.2)
ERROR: No matching distribution found for whispercpp

Installing from Git seems to work, but the message looks pretty odd.

Building wheels for collected packages: UNKNOWN
  Building wheel for UNKNOWN (pyproject.toml) ... done
  Created wheel for UNKNOWN: filename=UNKNOWN-0.0.0-py3-none-any.whl size=5112 sha256=9be610284ad3eaa982b8c41abcaba1d2c4044b2285da1144ffc2cc22ed94c43a
  Stored in directory: /tmp/pip-ephem-wheel-cache-mfo9ypjd/wheels/e9/b5/80/8b16e725a70d7c0404dc90e91c04cb329bd39b2bb8f5fdff2c
Successfully built UNKNOWN
Installing collected packages: UNKNOWN
Successfully installed UNKNOWN-0.0.0

When I try to import from whispercpp, it gives the error
ModuleNotFoundError: No module named 'whispercpp'

To reproduce

No response

Expected behavior

Should install correctly and allow import from whispercpp

Environment

Google Colab
Python 3.8.10 (default, Nov 14 2022, 12:59:47)

feat: support prompt param

Feature request

I didn't find a place to set the prompt parameter; am I missing something?

Motivation

No response

Other

No response

bug: doesn't build

Describe the bug

git submodule update --init --recursive
Submodule 'extern/pybind11' (https://github.com/pybind/pybind11.git) registered for path 'extern/pybind11'
Submodule 'extern/whispercpp' (https://github.com/ggerganov/whisper.cpp.git) registered for path 'extern/whispercpp'
Cloning into '/home/ivan/whispercpp/extern/pybind11'...
Cloning into '/home/ivan/whispercpp/extern/whispercpp'...
Submodule path 'extern/pybind11': checked out 'dc9b39596d986aeb061bd3debe52d30e2467dc48'
Submodule path 'extern/whispercpp': checked out 'bebf0da983632f47c8dff5cae5a578992c31ff26'
Submodule 'bindings/ios' (https://github.com/ggerganov/whisper.spm) registered for path 'extern/whispercpp/bindings/ios'
Cloning into '/home/ivan/whispercpp/extern/whispercpp/bindings/ios'...
fatal: remote error: upload-pack: not our ref c9d5095f0c64455b201f1cd0b547efcf093ee7c3
fatal: Fetched in submodule path 'extern/whispercpp/bindings/ios', but it did not contain c9d5095f0c64455b201f1cd0b547efcf093ee7c3. Direct fetching of that commit failed.
fatal:
fatal: Failed to recurse into submodule path 'extern/whispercpp'

To reproduce

git clone

git submodule update --init --recursive

(this prints the same submodule error shown above)

Expected behavior

build

Environment

Python 3.10.12
Ubuntu 22.04.3 LTS
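
A possible workaround, based on the log above: the failure comes from the nested `bindings/ios` submodule inside `extern/whispercpp`, whose pinned commit is no longer reachable on the remote. Assuming the Python build only needs the two direct submodules, initializing them without `--recursive` keeps git from descending into the broken nested one:

```shell
# Initialize only the direct submodules; omitting --recursive means git
# never descends into extern/whispercpp's own bindings/ios submodule,
# whose pinned commit c9d5095f can no longer be fetched upstream.
git submodule update --init extern/pybind11 extern/whispercpp
```

This is a sketch of a workaround, not a confirmed fix; if the build does require the iOS bindings, the pinned commit in `extern/whispercpp` would need to be updated instead.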
