kaldi-native-fbank

Introduction

Kaldi-compatible online fbank feature extractor without external dependencies.

Tested on the following operating systems and architectures:

  • Linux
  • macOS
  • Windows
  • Android
  • x86
  • arm
  • aarch64

Usage

See the following CMake-based speech recognition (i.e., speech-to-text) projects for its usage:

They use kaldi-native-fbank to compute fbank features for real-time speech recognition.

Python APIs

First, install kaldi-native-fbank from source:

git clone https://github.com/csukuangfj/kaldi-native-fbank
cd kaldi-native-fbank
python3 setup.py install

or install it with pip:

pip install kaldi-native-fbank

To check that kaldi-native-fbank was installed successfully, run

python3 -c "import kaldi_native_fbank; print(kaldi_native_fbank.__version__)"

which should print the version you have installed.
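
The same check can be made from a script, for example in setup or CI code. The sketch below is only an illustration: `installed_version` is a hypothetical helper, not part of kaldi-native-fbank, and it relies solely on the `__version__` attribute used in the command above.

```python
import importlib


def installed_version(module_name="kaldi_native_fbank"):
    """Return the module's __version__, or None if it cannot be determined.

    `installed_version` is a hypothetical helper, not part of
    kaldi-native-fbank. None means the module could not be imported
    or does not expose a __version__ attribute.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("kaldi_native_fbank is not importable")
    else:
        print(f"kaldi_native_fbank {version}")
```

Returning `None` instead of raising lets the caller decide whether a missing package is an error or just a reason to skip an optional step.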

Please refer to https://github.com/csukuangfj/kaldi-native-fbank/blob/master/kaldi-native-fbank/python/tests/test_online_fbank.py for usage examples.

For easier reference, that file is reproduced below:

#!/usr/bin/env python3

import sys

# kaldifeat provides the reference features; skip the test if it is missing
try:
    import kaldifeat
except ImportError:
    print("Please install kaldifeat first")
    sys.exit(0)

import kaldi_native_fbank as knf
import torch


def main():
    sampling_rate = 16000
    # 10 seconds of random audio at 16 kHz
    samples = torch.randn(16000 * 10)

    # Reference features computed by kaldifeat
    opts = kaldifeat.FbankOptions()
    opts.frame_opts.dither = 0  # no dithering, so the output is deterministic
    opts.mel_opts.num_bins = 80
    opts.frame_opts.snip_edges = False
    opts.mel_opts.debug_mel = False

    online_fbank = kaldifeat.OnlineFbank(opts)

    online_fbank.accept_waveform(sampling_rate, samples)

    # The same options, this time for kaldi-native-fbank
    opts = knf.FbankOptions()
    opts.frame_opts.dither = 0
    opts.mel_opts.num_bins = 80
    opts.frame_opts.snip_edges = False
    opts.mel_opts.debug_mel = False

    fbank = knf.OnlineFbank(opts)
    # Here the samples are passed as a plain Python list of floats
    fbank.accept_waveform(sampling_rate, samples.tolist())

    # Both extractors must report the same number of frames, and every
    # frame must match within an absolute tolerance of 1e-3
    assert online_fbank.num_frames_ready == fbank.num_frames_ready
    for i in range(fbank.num_frames_ready):
        f1 = online_fbank.get_frame(i)
        f2 = torch.from_numpy(fbank.get_frame(i))
        assert torch.allclose(f1, f2, atol=1e-3), (i, (f1 - f2).abs().max())


if __name__ == "__main__":
    torch.manual_seed(20220825)
    main()
    print("success")
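
The test above passes the whole waveform in a single call. Because the extractor is an online one, a real-time caller would more likely feed audio in small chunks and read frames as they become ready. The following is a minimal sketch of that pattern, not code from the project. It assumes `accept_waveform` may be called repeatedly with successive chunks, and it uses only the calls shown above (`accept_waveform`, `num_frames_ready`, `get_frame`). `stream_frames` and `chunk_size` are hypothetical names introduced for illustration.

```python
import random


def stream_frames(fbank, samples, sampling_rate, chunk_size=1600):
    """Feed `samples` to `fbank` chunk by chunk; yield frames as they become ready.

    `fbank` is any object exposing accept_waveform(), num_frames_ready and
    get_frame(), such as knf.OnlineFbank. `stream_frames` and `chunk_size`
    are hypothetical names used only for this illustration.
    """
    next_frame = 0  # index of the first frame not yet yielded
    for start in range(0, len(samples), chunk_size):
        chunk = samples[start:start + chunk_size]
        fbank.accept_waveform(sampling_rate, chunk)
        # Drain every frame that the new chunk made available.
        while next_frame < fbank.num_frames_ready:
            yield fbank.get_frame(next_frame)
            next_frame += 1


if __name__ == "__main__":
    try:
        import kaldi_native_fbank as knf
    except ImportError:
        print("Please install kaldi-native-fbank first")
    else:
        opts = knf.FbankOptions()
        opts.frame_opts.dither = 0
        opts.mel_opts.num_bins = 80
        fbank = knf.OnlineFbank(opts)

        random.seed(20220825)
        # One second of synthetic audio at 16 kHz, as a plain list of floats.
        samples = [random.uniform(-1.0, 1.0) for _ in range(16000)]

        frames = list(stream_frames(fbank, samples, 16000))
        print(f"got {len(frames)} frames of dimension {len(frames[0])}")
```

Writing the helper as a generator keeps the caller in control: a recognizer can consume each frame as soon as it is yielded instead of waiting for the full utterance.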

Contributors

csukuangfj, haujetzhao
