Code Monkey home page Code Monkey logo

whisper-cpp-python's Introduction

whisper-cpp-python

GitHub Workflow Status (with branch) GitHub PyPI

whisper-cpp-python is a Python module inspired by llama-cpp-python that provides a Python interface to the whisper.cpp model. This module automatically parses the C++ header file of the project during building time, generating the corresponding Python bindings.

Installation

To install the module, you can use pip:

pip install whisper-cpp-python

Usage

To use the module, you need to create an instance of the Whisper class, passing the path to the model file as a parameter. Then, you can call the transcribe or translate method to transcribe or translate a given audio file.

High-level API

The high-level API provides a simple managed interface through the Wisper class.

Below is a short example demonstrating how to use the high-level API to transcribe an mp3:

>>> from whisper_cpp_python import Whisper
>>> whisper = Whisper(model_path="./models/ggml-tiny.bin")
>>> output = whisper.transcribe(open('samples/jfk.mp3'))
>>> print(output)
{'text': 'And so my fellow Americans ask not what your country can do for you, ask what you can do for your country.'}
>>> output = whisper.transcribe(open('samples/jfk.mp3'), response_format='verbose_json')
>>> print(output)
{
    'task': 'transcribe',
    'language': 'en',
    'duration': 11.0,
    'text': 'And so, my fellow Americans ask not what your country can do for you, ask what you can do for your country.',
    'segments': [{
        'id': 0,
        'seek': 0.0,
        'start': 0.0,
        'end': 10.98,
        'text': ' And so, my fellow Americans ask not what your country can do for you, ask what you can do for your country.',
        'tokens': [50364,
            400,
            370,
            11,
            452,
            7177,
            6280,
            1029,
            406,
            437,
            428,
            1941,
            393,
            360,
            337,
            291,
            11,
            1029,
            437,
            291,
            393,
            360,
            337,
            428,
            1941,
            13,
            50913],
       'temperature': 0.800000011920929,
       'avg_logprob': -0.3063158459133572,
       'compression_ratio': 2.4000000953674316,
       'no_speech_prob': 0.0,
       'transient': False
   }]
}

Low-level API

All interfaces provided by whisper.h are available in python. The following example show how to pass a custom progress_callback function to the model.

from whisper_cpp_python import Whisper
from whisper_cpp_python.whisper_cpp import whisper_progress_callback

def callback(ctx, state, i, p):
    print(i)

model = Whisper('../quantized_models/whisper/models/ggml-tiny.bin')
model.params.progress_callback = whisper_progress_callback(callback)

print(model.transcribe('vendor/whisper.cpp/samples/jfk.wav'))

Web Server

whisper-cpp-python offers a web server which aims to act as a drop-in replacement for the OpenAI API. This allows you to use whisper.cpp compatible models with any OpenAI compatible client (language libraries, services, etc).

To install the server package and get started:

pip install whisper-cpp-python[server]
python3 -m whisper_cpp_python.server --model models/ggml-tiny.bin

Navigate to http://localhost:8001/docs to see the OpenAPI documentation.

License

whisper-cpp-python is released under the MIT License. See LICENSE for details.

whisper-cpp-python's People

Contributors

carloscdias avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.