kws's Introduction

Introduction

Keyword Spotting (KWS) is the task of detecting a pre-defined keyword or phrase in an audio file or an audio stream. The implemented algorithm uses a sliding Dynamic Time Warping (DTW) approach. Refer to this paper for a detailed explanation. You can also view my presentation here.
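To illustrate the general idea of sliding DTW, here is a minimal sketch. It is not the code in 'sliding_kws.py'. It assumes the keyword template and the utterance are already feature matrices of shape (frames, dims), that frames are compared with Euclidean distance, and that the window length equals the template length. The names `dtw_cost` and `sliding_dtw` and the `hop` parameter are hypothetical.

```python
import numpy as np

def dtw_cost(template, segment):
    """Length-normalised DTW alignment cost between two (frames, dims) arrays."""
    n, m = len(template), len(segment)
    # Pairwise Euclidean distance between every template frame and segment frame.
    dist = np.linalg.norm(template[:, None, :] - segment[None, :, :], axis=2)
    # acc[i, j] holds the cheapest cost of aligning the first i and j frames.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = dist[i - 1, j - 1] + min(
                acc[i - 1, j],      # template frame advances alone
                acc[i, j - 1],      # segment frame advances alone
                acc[i - 1, j - 1],  # both advance together
            )
    return acc[n, m] / (n + m)

def sliding_dtw(template, utterance, hop=1):
    """Slide a template-length window over the utterance.

    Returns (best_start_frame, best_cost); (-1, inf) if the utterance
    is shorter than the template.
    """
    win = len(template)
    best_start, best_cost = -1, np.inf
    for start in range(0, len(utterance) - win + 1, hop):
        cost = dtw_cost(template, utterance[start:start + win])
        if cost < best_cost:
            best_start, best_cost = start, cost
    return best_start, best_cost
```

A keyword would then be declared present when the best cost falls below a threshold tuned on held-out data. The threshold and the window length are design choices that this sketch leaves open.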

Datasets

  1. TIMIT is used for training a Neural Network which acts as a feature extractor.
  2. The Google Speech Commands dataset is used for testing the performance of the algorithm.
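The pre-trained model mentioned under Instructions is described as using ±4 context. One common reading, assumed here and not confirmed by this README, is that each input frame is concatenated with the 4 frames on either side before being fed to the network. The sketch below shows that stacking with NumPy. The function name `stack_context` and the edge-padding choice are hypothetical.

```python
import numpy as np

def stack_context(features, context=4):
    """Concatenate each frame with `context` frames on both sides.

    features: (num_frames, dims) array.
    Returns an array of shape (num_frames, (2 * context + 1) * dims).
    Edges are padded by repeating the first/last frame (an assumption).
    """
    padded = np.pad(features, ((context, context), (0, 0)), mode="edge")
    num_frames = features.shape[0]
    # Slice `offset` holds, for every frame t, the frame at t + offset - context.
    windows = [padded[offset:offset + num_frames] for offset in range(2 * context + 1)]
    return np.concatenate(windows, axis=1)
```

With 13-dimensional features and `context=4`, each stacked frame would have 9 × 13 = 117 dimensions.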

Instructions

  1. The following Python packages are required: numpy, matplotlib, pickle, torch, json, scipy, python_speech_features, yaml (pickle and json ship with the Python standard library)
  2. For relative paths to work smoothly, please adhere to the following directory structure:
KWS (parent directory)
├── speech (Google Speech Commands)
│	├── bed (example class)
│	├── ...
├── nn
│	├── TIMIT
│	├── TEST
│	├── TRAIN
│	├── models (where trained models are stored)
│	│	├── best.pth (a shallow pre-trained model with ±4 context is included)
│	│	├── (other models)
│	├── (python scripts and config file)
  3. 'dl_model.py' trains the Neural Network feature extractor, while 'sliding_kws.py' runs the actual experiments and dumps a JSON file containing the results.
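Before running either script, it can help to confirm that the required packages are importable. The snippet below is a small convenience sketch and not part of the repository. The helper name `missing_packages` is hypothetical, and it only checks that each module can be found, not its version.

```python
import importlib.util

# Import names for the packages listed in the instructions. `pickle` and `json`
# ship with Python; `yaml` is the import name of the PyYAML distribution.
REQUIRED = [
    "numpy", "matplotlib", "pickle", "torch",
    "json", "scipy", "python_speech_features", "yaml",
]

def missing_packages(names=REQUIRED):
    """Return the names that cannot be found by the current interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    missing = missing_packages()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```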

kws's People

Contributors

methi1999
