
whisper-node's Introduction

whisper-node


Node.js bindings for OpenAI's Whisper. Transcription done locally.

Features

  • Output transcripts to JSON (also .txt .srt .vtt)
  • Optimized for CPU (including Apple Silicon ARM)
  • Timestamp precision to single word

Installation

  1. Add the dependency to your project
npm install whisper-node
  2. Download a Whisper model of choice [OPTIONAL]
npx whisper-node download

Requirement for Windows: Install the make command from here.

Usage

import whisper from 'whisper-node';

const transcript = await whisper("example/sample.wav");

console.log(transcript); // output: [ {start,end,speech} ]

Output (JSON)

[
  {
    "start":  "00:00:14.310", // time stamp begin
    "end":    "00:00:16.480", // time stamp end
    "speech": "howdy"         // transcription
  }
]
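Since the result is a plain array of { start, end, speech } objects, it can be post-processed however you like. Here is a minimal sketch (an illustrative helper, not part of whisper-node's API) that renders the array as SRT subtitle text:

```javascript
// Hypothetical helper: convert whisper-node's [{ start, end, speech }] output
// into SRT subtitle text. SRT uses a comma before milliseconds, so the dot in
// each timestamp is swapped for a comma.
function toSrt(transcript) {
  return transcript
    .map((seg, i) =>
      [
        String(i + 1),
        `${seg.start.replace('.', ',')} --> ${seg.end.replace('.', ',')}`,
        seg.speech,
        '', // blank line between cues
      ].join('\n'))
    .join('\n');
}

// toSrt([{ start: "00:00:14.310", end: "00:00:16.480", speech: "howdy" }])
// → "1\n00:00:14,310 --> 00:00:16,480\nhowdy\n"
```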

Full Options List

import whisper from 'whisper-node';

const filePath = "example/sample.wav"; // required

const options = {
  modelName: "base.en",       // default
  // modelPath: "/custom/path/to/model.bin", // use a model in a custom directory (cannot be used along with 'modelName')
  whisperOptions: {
    language: 'auto',         // default (use 'auto' for auto-detect)
    gen_file_txt: false,      // outputs .txt file
    gen_file_subtitle: false, // outputs .srt file
    gen_file_vtt: false,      // outputs .vtt file
    word_timestamps: true     // timestamp for every word
    // timestamp_size: 0      // cannot be used along with word_timestamps: true
  }
}

const transcript = await whisper(filePath, options);

Input File Format

Files must be .wav with a 16 kHz sample rate.

Example: convert an .mp3 file with FFmpeg: ffmpeg -i input.mp3 -ar 16000 output.wav


Roadmap

  • Support projects not using Typescript
  • Allow custom directory for storing models
  • Config files as alternative to model download cli
  • Remove path, shelljs and prompt-sync package for browser, react-native expo, and webassembly compatibility
  • fluent-ffmpeg to automatically convert to 16 kHz .wav files as well as support separating audio from video
  • Pyannote diarization for speaker names
  • Implement WhisperX as optional alternative model for diarization and higher precision timestamps (as alternative to C++ version)
  • Add option for viewing detected language as described in Issue 16
  • Include TypeScript types in a d.ts file
  • Add support for language option
  • Add support for transcribing audio streams as already implemented in whisper.cpp

Modifying whisper-node

npm run dev - runs nodemon and tsc on '/src/test.ts'

npm run build - runs tsc, outputs to '/dist' and gives sh permission to 'dist/download.js'

Acknowledgements

whisper-node's People

Contributors

ariym, casperwarnich, kaivinc


whisper-node's Issues

I am getting this error when trying to download

Hi team,

I am getting the following error:
vij@DESKTOP-8J00BG1 MINGW64 /f/test/Extra
$ npx whisper-node download

Model Disk RAM
tiny 75 MB ~390 MB
tiny.en 75 MB ~390 MB
base 142 MB ~500 MB
base.en 142 MB ~500 MB
small 466 MB ~1.0 GB
small.en 466 MB ~1.0 GB
medium 1.5 GB ~2.6 GB
medium.en 1.5 GB ~2.6 GB
large-v1 2.9 GB ~4.7 GB
large 2.9 GB ~4.7 GB

[whisper-node] Enter model name (e.g. 'base.en') or 'cancel' to exit
(ENTER for base.en):
[whisper-node] Going with base.en
Downloading ggml model base.en...
Model base.en already exists. Skipping download.
[whisper-node] Attempting to compile model...
'make' is not recognized as an internal or external command,
operable program or batch file.

Let me know how to fix this

Thanks

Typescript Errors

When building my application with tsc, I get the following errors from this package:


9 export default async function whisper(filePath: string, whisperOptions?: object): Promise<TranscriptType[]> {
                                                                                    ~~~~~~~~~~~~~~~~~~~~~~~~~

node_modules/whisper-node/whisper.ts:4:46 - error TS7053: Element implicitly has an 'any' type because expression of type 'string' can't be used to index type '{ en_base: string; en_medium: string; large: string; }'.
  No index signature with a parameter of type 'string' was found on type '{ en_base: string; en_medium: string; large: string; }'.

4   `./main ${getFlags(options)} -m ./models/${models[model]} -f ${filePath} `;
                                               ~~~~~~~~~~~~~


Found 2 errors in 2 files.

Errors  Files
     1  node_modules/whisper-node/index.ts:9
     1  node_modules/whisper-node/whisper.ts:4
     
Additionally, the usage example in the README is inaccurate: the whisper function takes a string rather than an object, so the example does not work as intended.

ERROR INSTALL (npx whisper-node download)

[whisper-node] Enter model name (e.g. 'base.en') or 'cancel' to exit
(ENTER for base.en):
[whisper-node] Going with base.en
Downloading ggml model base.en...
Model base.en already exists. Skipping download.
[whisper-node] Attempting to compile model...
"cc" is not recognized as an internal or external command,
operable program or batch file.
"head" is not recognized as an internal or external command,
operable program or batch file.
I whisper.cpp build info:
I UNAME_S:
I UNAME_P:
I UNAME_M:
I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -pthread
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -pthread
I LDFLAGS:
I CC:
I CXX:

cc -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -pthread -c ggml.c -o ggml.o
process_begin: CreateProcess(NULL, uname -s, ...) failed.
process_begin: CreateProcess(NULL, uname -p, ...) failed.
process_begin: CreateProcess(NULL, uname -m, ...) failed.
process_begin: CreateProcess(NULL, which nvcc, ...) failed.
Makefile:299: recipe for target 'ggml.o' failed
process_begin: CreateProcess(NULL, cc -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -pthread -c ggml.c -o ggml.o, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [ggml.o] Error 2

error: failed to open 'test.wav' as WAV file

I ran into other bugs where it failed to download models, so I did those steps manually and placed the model in node_modules, but now I am facing this error:

[whisper-node] Problem: whisper_init_from_file: loading model from './models/ggml-tiny.en.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51864
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 384
whisper_model_load: n_audio_head = 6
whisper_model_load: n_audio_layer = 4
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 384
whisper_model_load: n_text_head = 6
whisper_model_load: n_text_layer = 4
whisper_model_load: n_mels = 80
whisper_model_load: f16 = 1
whisper_model_load: type = 1
whisper_model_load: adding 1607 extra tokens
whisper_model_load: mem_required = 390.00 MB
whisper_model_load: ggml ctx size = 73.58 MB
whisper_model_load: memory size = 11.41 MB
whisper_model_load: model size = 73.54 MB
error: failed to open 'test.wav' as WAV file

Problem: whisper.cpp not initialized in nextjs project

Problem

When using this library in a Nextjs project v14, and executing it in a server component I'm receiving this error
I already followed the README.md and executed the command: bunx whisper-node download

Error

cd: no such file or directory: project/.next/server/lib/whisper.cpp
[whisper-node] Problem. whisper.cpp not initialized. Current shelljs directory: project/.next/server/vendor-chunks
[whisper-node] Attempting to run 'make' command in /whisper directory...
[whisper-node] Problem. 'make' command failed. Please run 'make' command in /whisper directory. Current shelljs directory:  project/.next/server/vendor-chunks

Any thoughts about this? Does this library support running with nextjs?

Thanks!

Versions

  • Pop!_OS 22.04 LTS
  • Bun version 1.0.29
  • Next.js version 14.1.0

Does this support zh-CN? The output is []

My code:

import {whisper} from 'whisper-node';

console.log('whisper',whisper)
const transcript = await whisper("./mycloning.wav");

console.log(transcript); 

Why is the output []? Is Chinese not supported, or do I need to pass an option?


TypeError when using whisper-node library for audio transcription in NestJS project

I'm encountering a TypeError while attempting to transcribe audio from a WAV file using the whisper-node library in my NestJS project. Here's my code snippet:

import whisper from 'whisper-node';
import path from 'path';

const filePath = path.join(__dirname, '..', 'uploads', 'output.wav');
const options = {
  modelName: 'base.en',
  whisperOptions: {
    language: 'auto',
    gen_file_txt: false,
    gen_file_subtitle: false,
    gen_file_vtt: false,
    word_timestamps: true,
  },
};

const transcript = await whisper(filePath, options);

I've verified that the file path is correct, and it points to a valid WAV file. However, when I pass this file path to the whisper function, I encounter the following error:

TypeError: Cannot read properties of null (reading 'shift') at parseTranscript
I'm not sure what's causing this error or how to resolve it. Any insights or suggestions would be greatly appreciated. Thank you!

Importing whisper-node package changes process.cwd()

Importing whisper-node package changes process.cwd().
Version of whisper used to repro (includes a recent patch): "@distube/ytdl-core": "github:soya-daizu/ytdl-core#sig-patch"

const whisper = require('whisper-node');
audio_file = "example.wav"
audio_path = path.join(process.cwd(), audio_file);

leads to: [whisper-node] Transcribing: <project_dir>/node_modules/whisper-node/lib/whisper.cpp/<audio_file>

Whereas the desired behavior is for process.cwd() to be set to the <project_dir>. The following import ordering is a workaround:

audio_file = "example.wav"
audio_path = path.join(process.cwd(), audio_file );
const whisper = require('whisper-node');

which leads to: [whisper-node] Transcribing: <project_dir>/<audio_file>
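The import-ordering workaround above can be wrapped so the side effect cannot leak. This is a sketch of a guard for the cwd issue described in this report; withRestoredCwd is an illustrative helper, not part of whisper-node:

```javascript
// Capture the working directory, run an import, then restore the directory so
// any chdir performed during the import does not affect the rest of the app.
function withRestoredCwd(importer) {
  const saved = process.cwd();
  try {
    return importer();      // e.g. () => require('whisper-node')
  } finally {
    process.chdir(saved);   // undo any chdir the import performed
  }
}

// const whisper = withRestoredCwd(() => require('whisper-node'));
```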

Support for more options

Hi, could you add more options to the options param? Like language, for instance. When I try to transcribe audio in another language, the output is in English, but I'd like to keep it in the same language.

Thanks

Following tutorial try 2

Am I missing something?

node main.js

[whisper-node] Transcribing: audios/geladeira.wav

[whisper-node] Problem: whisper_init_from_file_with_params_no_state: loading model from './models/ggml-base.bin'
whisper_model_load: loading model
whisper_model_load: n_vocab = 51865
whisper_model_load: n_audio_ctx = 1500
whisper_model_load: n_audio_state = 512
whisper_model_load: n_audio_head = 8
whisper_model_load: n_audio_layer = 6
whisper_model_load: n_text_ctx = 448
whisper_model_load: n_text_state = 512
whisper_model_load: n_text_head = 8
whisper_model_load: n_text_layer = 6
whisper_model_load: n_mels = 80
whisper_model_load: ftype = 1
whisper_model_load: qntvr = 0
whisper_model_load: type = 2 (base)
whisper_model_load: adding 1608 extra tokens
whisper_model_load: n_langs = 99
whisper_backend_init: using Metal backend
ggml_metal_init: allocating
ggml_metal_init: found device: Intel Iris Pro Graphics
ggml_metal_init: found device: NVIDIA GeForce GT 750M
ggml_metal_init: picking default device: NVIDIA GeForce GT 750M
ggml_metal_init: default.metallib not found, loading from source
ggml_metal_init: loading '/Users/jefaokpta/Downloads/whisper-test/node_modules/whisper-node/lib/whisper.cpp/ggml-metal.metal'

TypeError: whisper is not a function

I run my code, but I get an error. This is the error:
const transcript = await whisper(filePath, options);
^

TypeError: whisper is not a function
at file:///Users/xx/test/test1.js:17:26
at ModuleJob.run (node:internal/modules/esm/module_job:192:25)

Initialization Error and ShellJS File Writing Issue

On Debian GNU/Linux 10 (buster), I encountered a problem while deploying a Node.js project on Render. It appears that the module is not initializing correctly there.

[whisper-node] Problem. whisper.cpp not initialized. Current shelljs directory:  /opt/render/project/src/node_modules/whisper-node/dist
[whisper-node] Attempting to run 'make' command in /whisper directory...
error caught in try catch block
node:fs:2352
    return binding.writeFileUtf8(
                   ^

Error [ShellJSInternalError]: ENOENT: no such file or directory, open '/tmp/shelljs_ec87588fed5a22cbb957'
    at Object.writeFileSync (node:fs:2352:20)
    at writeFileLockedDown (/opt/render/project/src/node_modules/shelljs/src/exec.js:61:8)
    at execSync (/opt/render/project/src/node_modules/shelljs/src/exec.js:66:3)
    at Object._exec (/opt/render/project/src/node_modules/shelljs/src/exec.js:223:12)
    at Object.exec (/opt/render/project/src/node_modules/shelljs/src/common.js:335:23)
    at Object.<anonymous> (/opt/render/project/src/node_modules/whisper-node/dist/shell.js:53:27)
    at Module._compile (node:internal/modules/cjs/loader:1376:14)
    at Module._extensions..js (node:internal/modules/cjs/loader:1435:10)
    at Module.load (node:internal/modules/cjs/loader:1207:32)
    at Module._load (node:internal/modules/cjs/loader:1023:12) {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: '/tmp/shelljs_ec87588fed5a22cbb957'
}

Running whisper-node on EC2 Instance

Hi there,

Apologies if this is not the right place to ask this question!

I've been playing around with this package and it's working great. I'd like to deploy my app to an AWS EC2 instance but I'm unsure about what sort of specs/configuration I'll need. I'm looking at deploying to an m5.xlarge EC2 instance with a deep learning AMI. Does this seem suitable? And is there any other configuration I should look into to get the best performance from this package?

Thanks in advance!

Rob

Problem. 'make' command failed.

Hello!
This is actually my first time using this, so I am not sure if I am doing it right, but I followed the steps, installed the package, and then downloaded the large model.
When the model was downloaded and being compiled, it threw this error:

Downloading ggml model large...
Done! Model large saved in G:\DiscordPaidBots\DiscordPaidBots\excalibura\node_modules\whisper-node\lib\whisper.cpp\models\ggml-large.bin
You can now use it like this:
main.exe -m G:\DiscordPaidBots\DiscordPaidBots\excalibura\node_modules\whisper-node\lib\whisper.cpp\models\ggml-large.bin -f G:\DiscordPaidBots\DiscordPaidBots\excalibura\node_modules\whisper-node\lib\whisper.cpp\samples\jfk.wav
[whisper-node] Attempting to compile model...
process_begin: CreateProcess(NULL, uname -s, ...) failed.
Makefile:2: pipe: No error
process_begin: CreateProcess(NULL, uname -p, ...) failed.
Makefile:6: pipe: No error
process_begin: CreateProcess(NULL, uname -m, ...) failed.
Makefile:10: pipe: No error
/usr/bin/bash: cc: command not found
I whisper.cpp build info: 
I UNAME_S:
I UNAME_P:
I UNAME_M:
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC
I LDFLAGS:
I CC:
I CXX:      g++.exe (GCC) 11.2.0

cc  -I.              -O3 -std=c11   -fPIC -mfma -mf16c -mavx -mavx2   -c ggml.c -o ggml.o
process_begin: CreateProcess(NULL, cc -I. -O3 -std=c11 -fPIC -mfma -mf16c -mavx -mavx2 -c ggml.c -o ggml.o, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [Makefile:176: ggml.o] Error 2

Then I tried to launch my nodejs app and it said:

[whisper-node] Problem. whisper.cpp not initialized. Current shelljs directory:  G:\DiscordPaidBots\DiscordPaidBots\excalibura\node_modules\whisper-node\dist
[whisper-node] Attempting to run 'make' command in /whisper directory...
[whisper-node] Problem. 'make' command failed. Please run 'make' command in /whisper directory. Current shelljs directory:  G:\DiscordPaidBots\DiscordPaidBots\excalibura\node_modules\whisper-node\dist

And when I tried to run the make command in node_modules/whisper-node/lib/whisper.cpp, as this was the only place where a Makefile was, it gave me this error:

process_begin: CreateProcess(NULL, uname -s, ...) failed.
Makefile:2: pipe: No error
process_begin: CreateProcess(NULL, uname -p, ...) failed.
Makefile:6: pipe: No error
process_begin: CreateProcess(NULL, uname -m, ...) failed.
Makefile:10: pipe: No error
I whisper.cpp build info:
I UNAME_S:
I UNAME_P:
I UNAME_M:
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC
I LDFLAGS:
I CC:       gcc.exe (GCC) 11.2.0
I CXX:      g++.exe (GCC) 11.2.0

g++ -I. -I./examples -O3 -std=c++11 -fPIC examples/main/main.cpp ggml.o whisper.o -o main
./main -h

usage: ./main [options] file0.wav file1.wav ...
options:
 -h,       --help           [default] show this help message and exit
  -t N,     --threads N      [4      ] number of threads to use during computation
  -p N,     --processors N   [1      ] number of processors to use during computation
  -ot N,    --offset-t N     [0      ] time offset in milliseconds
.
.
.
.
-f FNAME, --file FNAME     [       ] input WAV file path

Is there any fix for this? Or is there something I am doing incorrect?
I am using Windows btw.
NodeJS version 18.16.1
GNU make v4.4.1
gcc v11.2.0

I appreciate any kind of support.

Could not find a declaration file for module 'whisper-node'.

Following the instructions in the Readme, I get:

$ ts-node src/process_scenes.ts --node
Error preparing scenes: src/lib/Bark.ts:5:32 - error TS7016: Could not find a declaration file for module 'whisper-node'. '/home/arthur/dev/ai/manga/node_modules/whisper-node/dist/index.js' implicitly has an 'any' type.
  Try `npm i --save-dev @types/whisper-node` if it exists or add a new declaration (.d.ts) file containing `declare module 'whisper-node';`

5 import whisper            from 'whisper-node';
                                 ~~~~~~~~~~~~~~

I tried npm i --save-dev @types/whisper-node and got nothing more.

Any ideas?

Thanks.

PS: My tsconfig in case it's relevant:

{
  "compilerOptions": {

    "strict": true,
    "noImplicitAny": true,

    "target": "es6",
    "module": "commonjs",
    "moduleResolution": "node",
    "esModuleInterop": true,
    "baseUrl": ".",
    "paths": {
      "@/*": ["src/*"]
    },
  }
}

PS: doing:

// @ts-ignore
import whisper            from 'whisper-node';

solves the problem temporarily, but I'd really like a "clean" way out.

Then, when I am able to run it, I have another problem:

/ram/runner/bark-text-reader-2ksgcnjtjt6//example.wav
[whisper-node] Transcribing: /ram/runner/bark-text-reader-2ksgcnjtjt6//example.wav 

[whisper-node] No 'modelName' or 'modelPath' provided. Trying default model: base.en 

[whisper-node] Problem: TypeError: Cannot read properties of null (reading 'shift')
    at parseTranscript (/home/arthur/dev/ai/manga/node_modules/whisper-node/src/tsToArray.ts:12:9)
    at /home/arthur/dev/ai/manga/node_modules/whisper-node/src/index.ts:35:46
    at Generator.next (<anonymous>)
    at fulfilled (/home/arthur/dev/ai/manga/node_modules/whisper-node/dist/index.js:5:58)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
undefined

Can't download models with v1.0.0

After upgrading to v1.0.0, downloading models fails with the following error. However, v0.3.1 works fine

(ENTER for base.en): medium.en
Downloading ggml model medium.en from 'https://huggingface.co/datasets/ggerganov/whisper.cpp' ...
Failed to download ggml model medium.en

Removing First Line, But It's Not Empty

I've been seeing empty transcription results, and did some digging to find that you're trying to remove the first line assuming that it is empty. In my testing, it's never empty. Perhaps you could create a more elegant solution that looks for empty array elements after it's parsed and then remove those?

// 2. remove the first line, which is empty
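The suggestion above (filter empty entries instead of unconditionally removing the first line) could look like this sketch. The function name and the line format are illustrative assumptions, not whisper-node's actual internals:

```javascript
// Defensive parsing of whisper.cpp-style transcript output: guard against a
// null/non-string input (the 'shift' crash) and drop empty lines wherever
// they appear, instead of assuming the first line is always empty.
function parseTranscriptLines(raw) {
  if (typeof raw !== 'string') return []; // avoids "Cannot read properties of null"
  return raw
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0)    // remove empty elements after parsing
    .map((line) => {
      // assumed line shape: "[00:00:14.310 --> 00:00:16.480]  howdy"
      const m = line.match(/^\[(.+?) --> (.+?)\]\s*(.*)$/);
      return m ? { start: m[1], end: m[2], speech: m[3] } : null;
    })
    .filter(Boolean);
}
```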

ERROR WHEN INSTALLING WITH WINDOWS

I installed all the necessary dependencies to use with Windows and the same error continues to appear. I saw that several people reported the same error. Has anyone managed to solve it?

PS: I already installed this and there was still an error: https://gnuwin32.sourceforge.net/packages/make.htm

You can now use it like this:
main.exe -m C:\Developer\Pruebas\whipser-test\node_modules\whisper-node\lib\whisper.cpp\models\ggml-base.en.bin -f C:\Developer\Pruebas\whipser-test\node_modules\whisper-node\lib\whisper.cpp\samples\jfk.wav
[whisper-node] Attempting to compile model...
"cc" is not recognized as an internal or external command,
operable program or batch file.
"head" is not recognized as an internal or external command,
operable program or batch file.
I whisper.cpp build info:
I UNAME_S:
I UNAME_P:
I UNAME_M:
I CFLAGS: -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -pthread
I CXXFLAGS: -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -D_XOPEN_SOURCE=600 -pthread
I LDFLAGS:
I CC:
I CXX:

cc -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -pthread -c ggml.c -o ggml.o
process_begin: CreateProcess(NULL, uname -s, ...) failed.
process_begin: CreateProcess(NULL, uname -p, ...) failed.
process_begin: CreateProcess(NULL, uname -m, ...) failed.
process_begin: CreateProcess(NULL, which nvcc, ...) failed.
process_begin: CreateProcess(NULL, cc -I. -O3 -DNDEBUG -std=c11 -fPIC -D_XOPEN_SOURCE=600 -pthread -c ggml.c -o ggml.o, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [ggml.o] Error 2

TypeError when using whisper-node library for audio transcription in NestJS project

I'm encountering a TypeError while attempting to transcribe audio from a WAV file using the whisper-node library in my NestJS project. Here's my code snippet:

import whisper from 'whisper-node';
import path from 'path';

const filePath = path.join(__dirname, '..', 'uploads', 'output.wav');
const options = {
modelName: 'base.en',
whisperOptions: {
language: 'auto',
gen_file_txt: false,
gen_file_subtitle: false,
gen_file_vtt: false,
word_timestamps: true,
},
};

const transcript = await whisper(filePath, options);

Cannot read properties of null (reading 'shift')

[whisper-node] Problem: TypeError: Cannot read properties of null (reading 'shift') at parseTranscript (/app/node_modules/whisper-node/dist/tsToArray.js:7:11) at /app/node_modules/whisper-node/dist/index.js:36:57 at Generator.next (<anonymous>) at fulfilled (/app/node_modules/whisper-node/dist/index.js:5:58) at process.processTicksAndRejections (node:internal/process/task_queues:95:5)

How do I fix this, please?

Failing: Model compiling using npx whisper-node download

I'm getting this error in the compiling part of the npx whisper-node download command:

(ENTER for base.en): base
Downloading ggml model base...
Done! Model base saved in C:\Users\node_modules\whisper-node\lib\whisper.cpp\models\ggml-base.bin
You can now use it like this:
main.exe -m C:\Users\node_modules\whisper-node\lib\whisper.cpp\models\ggml-base.bin -f C:\Users\node_modules\whisper-node\lib\whisper.cpp\samples\jfk.wav
[whisper-node] Attempting to compile model...
process_begin: CreateProcess(NULL, uname -s, ...) failed.
Makefile:2: pipe: No error
process_begin: CreateProcess(NULL, uname -p, ...) failed.
Makefile:6: pipe: No error
process_begin: CreateProcess(NULL, uname -m, ...) failed.
Makefile:10: pipe: No error
'cc' is not recognized as an internal or external command,
operable program or batch file.
'head' is not recognized as an internal or external command,
operable program or batch file.
I whisper.cpp build info: 
I UNAME_S:
I UNAME_P:
I UNAME_M:
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC
I LDFLAGS:
I CC:
I CXX:

cc  -I.              -O3 -std=c11   -fPIC -mfma -mf16c -mavx -mavx2   -c ggml.c -o ggml.o
process_begin: CreateProcess(NULL, cc -I. -O3 -std=c11 -fPIC -mfma -mf16c -mavx -mavx2 -c ggml.c -o ggml.o, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [Makefile:176: ggml.o] Error 2

"npx whisper-node download" downloading a broken file

Running "npx whisper-node download" downloads a broken ggml-base.en.bin file containing

Invalid username or password.

This is the logs when I ran npx whisper-node download:

> npx whisper-node download

| Model     | Disk   | RAM     |
|-----------|--------|---------|
| tiny      |  75 MB | ~390 MB |
| tiny.en   |  75 MB | ~390 MB |
| base      | 142 MB | ~500 MB |
| base.en   | 142 MB | ~500 MB |
| small     | 466 MB | ~1.0 GB |
| small.en  | 466 MB | ~1.0 GB |
| medium    | 1.5 GB | ~2.6 GB |
| medium.en | 1.5 GB | ~2.6 GB |
| large-v1  | 2.9 GB | ~4.7 GB |
| large     | 2.9 GB | ~4.7 GB |


[whisper-node] Enter model name (e.g. 'base.en') or 'cancel' to exit
(ENTER for base.en): 
[whisper-node] Going with base.en
Downloading ggml model base.en from 'https://huggingface.co/datasets/ggerganov/whisper.cpp' ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100    29  100    29    0     0    207      0 --:--:-- --:--:-- --:--:--   216
Done! Model 'base.en' saved in 'models/ggml-base.en.bin'
You can now use it like this:

  $ ./main -m models/ggml-base.en.bin -f samples/jfk.wav

[whisper-node] Attempting to compile model...
sysctl: unknown oid 'hw.optional.arm64'
I whisper.cpp build info: 
I UNAME_S:  Darwin
I UNAME_P:  i386
I UNAME_M:  x86_64
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -pthread -mf16c -mfma -mavx -mavx2 -DGGML_USE_ACCELERATE
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC -pthread
I LDFLAGS:   -framework Accelerate
I CC:       Apple clang version 14.0.0 (clang-1400.0.29.202)
I CXX:      Apple clang version 14.0.0 (clang-1400.0.29.202)

make: Nothing to be done for `default'.

And this is the error logs for when I try to call whisper(...) with the broken model:

$ tsc && node ./build/test.js
[whisper-node] Transcribing: /Users/ruben/Desktop/audio.mp3 

[whisper-node] No 'modelName' or 'modelPath' provided. Trying default model: base.en 

[whisper-node] Problem: whisper_init_from_file: loading model from './models/ggml-base.en.bin'
whisper_model_load: loading model
whisper_model_load: invalid model data (bad magic)
whisper_init: failed to load model
error: failed to initialize whisper context

undefined
✨  Done in 3.66s.

After copying the model directly from the whisper.cpp repo into /node_modules/whisper-node/lib/whisper.cpp/models/, it worked.

Failed to download model

Hi there, thanks for making these node bindings!

Problem:

I was trying to add your project to mine and ran into an error after attempting to download the whisper model.

Using the command you provided, I got the following error:

npx whisper-node download-model base.en
npm ERR! could not determine executable to run

Steps used to recreate the error:

I followed the project's README.md

Install the npm package:

npm i whisper-node

up to date, audited 230 packages in 440ms

44 packages are looking for funding
  run `npm fund` for details

Download the model:

npx whisper-node download-model base.en
npm ERR! could not determine executable to run

npm ERR! A complete log of this run can be found in:
npm ERR!     /.npm/_logs/2023-03-12T17_22_13_443Z-debug-0.log

Not downloading the model

[whisper-node] Enter model name (e.g. 'base.en') or 'cancel' to exit
(ENTER for base.en):
[whisper-node] Going with base.en
Downloading ggml model base.en...
Invoke-WebRequest : Invalid username or password.
At line:1 char:1
+ Invoke-WebRequest -Uri https://huggingface.co/datasets/ggerganov/whis ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : InvalidOperation: (System.Net.HttpWebRequest:HttpWebRequest) [Invoke-WebRequest], WebException
    + FullyQualifiedErrorId : WebCmdletWebResponseException,Microsoft.PowerShell.Commands.InvokeWebRequestCommand
Failed to download ggml model base.en
Please try again later or download the original Whisper model files and convert them yourself.
[whisper-node] Attempting to compile model...
'make' is not recognized as an internal or external command,
operable program or batch file.

I followed the README but got this error

node main.js

[whisper-node] Transcribing: audios/elogio.wav

[whisper-node] Problem: TypeError: Cannot read properties of null (reading 'shift')
at parseTranscript (/home/jefaokpta/Downloads/whisper/node_modules/whisper-node/dist/tsToArray.js:7:11)
at /home/jefaokpta/Downloads/whisper/node_modules/whisper-node/dist/index.js:36:57
at Generator.next ()
at fulfilled (/home/jefaokpta/Downloads/whisper/node_modules/whisper-node/dist/index.js:5:58)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
undefined

Is there a way to use this with a regular .js file rather than this 'import' typescript style

I'm on nodejs 20. I don't want to use typescript, but I do want to use whisper-node in javascript

The import statement import whisper from 'whisper-node'; returns "Cannot use import statement outside of a module" when I add it to my .js file.

Is there a way to simply declare this dependency at the top with a require statement, like I do for all my other modules, so that I'm not declaring every function?

What do you suggest? I'd love to start using this.

I'd love to be able to simply add the dependency statement at the top and do my thing: create functions and call the functions of the dependency. Is that possible?

I'm looking to do something like this

const whisper = require('whisper-node');


(async () => {
    try {
        const transcription = await whisper.whisper("./audio/output.wav");

        console.log(transcription);
        process.exit(0);
    } catch (err) {
        console.log("ERROR", err);
        process.exit(1);
    }

})();

Whisper-Node Make Fails at Runtime

I get the following error at start:

[whisper-node] Problem. whisper.cpp not initialized. Current shelljs directory:  C:\Project\node_modules\whisper-node\dist
[whisper-node] Attempting to run 'make' command in /whisper directory...
[whisper-node] Problem. 'make' command failed. Please run 'make' command in /whisper directory. Current shelljs directory:  C:\Project\node_modules\whisper-node\dist

When using whisper-node, it blocks the rest of the requests in nest.js.

async getTranscript(filePath: string): Promise {
  const options = {
    modelName: 'base.en',
    whisperOptions: {
      language: 'auto',
      gen_file_txt: false,
      gen_file_subtitle: false,
      gen_file_vtt: false,
      word_timestamps: true,
    },
  };
  if (filePath) {
    return await whisper(filePath, options);
  }
  return null;
}

If I remove the line "return await whisper(filePath, options);" then everything works fine; after putting it back, no request works in that project.

What is the problem? How to fix it?

whisper-node sets rootDir to node_modules package when importing it

I filed the same issue on the TypeScript version's repo, but thought I might as well leave it here for future reference.

It's basically the same issue as the one linked above.

When the file is loaded with import whisper from 'whisper-node':

[INFO] Root directory: /home/wolf/develop/nodejs/okuuai/node_modules/whisper-node/lib/whisper.cpp // using process.cwd()
[INFO] Root directory path: /home/wolf/develop/nodejs/okuuai/src // using path

Without the import:

[INFO] Root directory: /home/wolf/develop/nodejs/okuuai
[INFO] Root directory path: /home/wolf/develop/nodejs/okuuai/src

It would be great to have some help on this, please.

[Mac - M chip] TypeError: Cannot read properties of null (reading 'shift') - tsToArray.js:7:11

Hello, I'm trying to use this library in a Node.js project.
I followed the steps to install the lib and downloaded the large-v1 model.
This is my code:
[screenshot of the calling code]

When I start my server and run this code,
I get this error in the console, and I don't understand why the lines variable is undefined:

 [Function: whisper]
[whisper-node] Transcribing: /Users/*/example/Je_sais_plus.wav 

Promise { <pending> }
[whisper-node] Problem: TypeError: Cannot read properties of null (reading 'shift')
    at parseTranscript (/Users/*/myproject/node_modules/whisper-node/dist/tsToArray.js:7:11)

Do you know how to fix this error?
Is it a problem with the whisper.cpp installation?
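From the stack trace, my reading (an assumption; I haven't checked the library's tsToArray source) is that a split or match over whisper.cpp's stdout returned null because the process produced no transcript lines, and the code then called .shift() on it. A defensive parser sketch, using the timestamp line format whisper.cpp normally prints; the function name and format are illustrative, not the library's actual code:

```javascript
// Hypothetical sketch of a transcript parser that guards against a null
// regex match instead of calling .shift() on it.
function parseTranscript(raw) {
  const segments = [];
  // whisper.cpp prints lines like: [00:00:14.310 --> 00:00:16.480]  howdy
  const re = /\[(\d{2}:\d{2}:\d{2}\.\d{3}) --> (\d{2}:\d{2}:\d{2}\.\d{3})\]\s+(.*)/;
  for (const line of raw.split('\n')) {
    const m = line.match(re); // null for any non-matching line
    if (!m) continue;         // guard: skip instead of crashing
    segments.push({ start: m[1], end: m[2], speech: m[3].trim() });
  }
  return segments;
}

console.log(parseTranscript('[00:00:14.310 --> 00:00:16.480]  howdy'));
```

If whisper.cpp exits without output (bad model path, unreadable wav), a guard like this turns the crash into an empty result you can report cleanly.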

Failing: npx whisper-node download with Makefile errors

When I run npx whisper-node download, I get the following error:

[whisper-node] Going with base.en
'.' is not recognized as an internal or external command,
operable program or batch file.
[whisper-node] Attempting to compile model...
process_begin: CreateProcess(NULL, uname -s, ...) failed.
Makefile:2: pipe: No error
process_begin: CreateProcess(NULL, uname -p, ...) failed.
Makefile:6: pipe: No error
process_begin: CreateProcess(NULL, uname -m, ...) failed.
Makefile:10: pipe: No error
'cc' is not recognized as an internal or external command,
operable program or batch file.
'head' is not recognized as an internal or external command,
operable program or batch file.
I whisper.cpp build info: 
I UNAME_S:
I UNAME_P:
I UNAME_M:
I CFLAGS:   -I.              -O3 -std=c11   -fPIC -mfma -mf16c -mavx -mavx2
I CXXFLAGS: -I. -I./examples -O3 -std=c++11 -fPIC
I LDFLAGS:
I CC:
I CXX:

cc  -I.              -O3 -std=c11   -fPIC -mfma -mf16c -mavx -mavx2   -c ggml.c -o ggml.o
process_begin: CreateProcess(NULL, cc -I. -O3 -std=c11 -fPIC -mfma -mf16c -mavx -mavx2 -c ggml.c -o ggml.o, ...) failed.
make (e=2): The system cannot find the file specified.
make: *** [Makefile:176: ggml.o] Error 2

error: failed to open 'test.wav' as WAV file

Running whisper-node always gives this error.

Sample code:

import { whisper } from 'whisper-node';

(async function () {
    try {
        const transcript = await whisper(`test.wav`);
        console.log('transcript res: ', transcript);
    } catch (e) {
        console.error('transcript error: ', e);
    }
})()

Stream and live recognition

Hello. First of all, thanks for this awesome project; it's exactly what I was looking for.

I have two questions:

  • Is it possible to pass an audio stream instead of a file name?
  • Is it possible to do live stream recognition, similar to what the original cpp version does?

Of course the first question is linked to the second, as you can understand.
