ProTK: A Prosody Toolkit

This is ProTK, a prosody toolkit developed to help create machine learning models for detection/classification of filled pauses in recorded speech. It is currently developed at the University of Minnesota-Twin Cities College of Pharmacy.

Authors

Current:

  • Jacob Okamoto (UMN Computer Science)
  • Serguei Pakhomov (UMN Pharmacy)

Advising:

  • Elizabeth Shriberg (Microsoft)
  • Andreas Stolcke (Microsoft)

Past:

  • Thomas Christie (UMN Cognitive Science)

Overview

ProTK is a toolkit developed to help create machine learning models of recorded speech. It has three primary components: a data ingest module, a feature extraction module, and an ARFF generation module. These three modules use an SQLite database to store and retrieve information in a structured intermediate format.

The workflow for ProTK's core functionality is simple: ingest analysis units from HTK recs or Praat TextGrids, extract features for each unit ingested, and output an ARFF file of the features extracted.
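As a rough illustration of that ingest, extract, output flow, here is a minimal sketch built on Python's standard `sqlite3` module. It is not ProTK's actual code: the table layout, the function names (`ingest`, `extract_features`, `to_arff`), and the single toy feature (unit duration) are all assumptions made for the example, and the units are passed in directly rather than parsed from HTK recs or TextGrids.

```python
import sqlite3


def ingest(conn, units):
    """Store analysis units (label, start time, end time) in the database.

    The schema here is hypothetical; ProTK's real schema may differ.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS units ("
        "id INTEGER PRIMARY KEY, label TEXT, "
        "t_start REAL, t_end REAL, duration REAL)"
    )
    conn.executemany(
        "INSERT INTO units (label, t_start, t_end) VALUES (?, ?, ?)", units
    )
    conn.commit()


def extract_features(conn):
    """Compute a toy feature (duration in seconds) for every ingested unit."""
    conn.execute("UPDATE units SET duration = t_end - t_start")
    conn.commit()


def to_arff(conn, relation="protk_sketch"):
    """Render the stored units and their features as ARFF text."""
    lines = [
        f"@RELATION {relation}",
        "",
        "@ATTRIBUTE label STRING",
        "@ATTRIBUTE duration NUMERIC",
        "",
        "@DATA",
    ]
    rows = conn.execute("SELECT label, duration FROM units ORDER BY id")
    for label, duration in rows:
        lines.append(f"'{label}',{duration:.3f}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    ingest(conn, [("uh", 0.0, 0.5), ("the", 0.5, 0.75)])
    extract_features(conn)
    print(to_arff(conn))
```

Keeping the three stages separate, with the database as the only shared state, means each stage can be rerun on its own, for example regenerating the ARFF file without ingesting again.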

New Features

The primary new features of the rewritten ProTK are:

  • Arbitrary Units of Analysis: ProTK supports the generation of arbitrary units of analysis, specifically frames of specified length (frame size) and overlap (window size).
  • Multi-Tier Targeting: ProTK can generate classification values (i.e., YES/NO truth values) by checking whether a unit of analysis in the output tier occurs within a specific kind of unit of analysis in another tier. For example, it can check whether a vowel occurs inside a filled pause.
  • Passthrough Features: additional metadata from TextGrid files can be passed through from ProTK’s ingest engine to the ARFF output as additional ARFF attributes.
  • Contextual Information: ProTK can output arbitrary-width context for each unit of analysis during ARFF generation. This places information about n preceding and following units with the current unit in the ARFF output.
  • Multiprocessing: ProTK takes advantage of multi-core processors when running Praat analysis. It runs as many Praat processes in parallel as the system reports processing cores.
  • High-Performance C Operations: the ProTK distribution includes a high-performance C ARFF generator for fast analysis of large datasets using a very small subset of features specific to filled-pause detection. This ARFF generator allowed us to process a large (200+ files) dataset in one hour instead of 20 or more hours.
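
Three of the features above (arbitrary frames, multi-tier targeting, and contextual information) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not ProTK's implementation: the function names are hypothetical, overlap is expressed as a `step` between frame starts, and "occurs within" is read as full containment of the unit in the target interval.

```python
from typing import List, Optional, Sequence, Tuple

Span = Tuple[float, float]           # (start, end) in seconds
Interval = Tuple[float, float, str]  # (start, end, label)


def make_frames(duration: float, frame_size: float, step: float) -> List[Span]:
    """Cut [0, duration] into fixed-length frames starting every `step` seconds.

    Frames overlap whenever step < frame_size (an assumed parameterisation).
    """
    frames = []
    i = 0
    # Small tolerance so floating-point rounding does not drop the last frame.
    while i * step + frame_size <= duration + 1e-9:
        start = i * step
        frames.append((start, start + frame_size))
        i += 1
    return frames


def target_value(unit: Span, target_tier: Sequence[Interval],
                 target_label: str) -> str:
    """Return "YES" if `unit` lies entirely inside a `target_label` interval."""
    start, end = unit
    for t_start, t_end, label in target_tier:
        if label == target_label and t_start <= start and end <= t_end:
            return "YES"
    return "NO"


def with_context(rows: Sequence, n: int, pad: Optional[object] = None) -> List[list]:
    """Pair each row with its n preceding and n following rows.

    Positions that fall off either end of the sequence are filled with `pad`.
    """
    padded = [pad] * n + list(rows) + [pad] * n
    return [padded[i:i + 2 * n + 1] for i in range(len(rows))]


if __name__ == "__main__":
    frames = make_frames(duration=1.0, frame_size=0.5, step=0.25)
    filled_pauses = [(0.2, 0.8, "FP")]  # hypothetical target tier
    labels = [target_value(f, filled_pauses, "FP") for f in frames]
    for window in with_context(labels, n=1):
        print(window)
```

In ARFF terms, each window produced by `with_context` would become one data row, with the values of the preceding and following units written as extra attributes alongside the current unit.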

Interspeech 2012

This software was presented at Interspeech 2012 in Portland, Oregon. Demonstration code is available at <https://github.com/oko/protk-demo>. Note that the demo code does not include the audio used for testing; it was tested against RIT/UPenn's TRAINS corpus.
