
vocal-vad's Introduction

Vocal-VAD in Music Scene

The best is yet to come.

A project implementing vocal VAD (Voice Activity Detection) for music, that is, detecting which parts of a track contain a singing voice.

Task

  • Input: a complete music track in WAV format
  • Output: the vocal probability for each 10 ms frame

Dataset

MUSDB18: contains 150 music tracks (mixtures) along with their isolated drums, bass, vocals, and other stems.

Methods

Feature Engineering

  • STE: Short-Time Energy, the energy of one frame of the audio signal.
  • ZCC: Zero-Crossing Count, the number of times the time-domain signal crosses zero within one frame.

In general, vocal fragments have high STE and low ZCC, while non-vocal fragments have low STE and high ZCC.

The STE and ZCC calculations are optimized in the implementation.
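The two features can be sketched with NumPy as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the function names `frame_signal`, `short_time_energy`, and `zero_crossing_count` are hypothetical, frames are taken as non-overlapping 10 ms windows, STE is taken as the sum of squared samples, and a sample equal to zero is treated as positive.

```python
import numpy as np


def frame_signal(signal, sample_rate, frame_ms=10):
    """Split a 1-D signal into non-overlapping frames of frame_ms milliseconds."""
    frame_len = sample_rate * frame_ms // 1000
    n_frames = len(signal) // frame_len  # the trailing partial frame is dropped
    trimmed = np.asarray(signal[: n_frames * frame_len], dtype=np.float64)
    return trimmed.reshape(n_frames, frame_len)


def short_time_energy(frames):
    """Sum of squared samples in each frame (one value per frame)."""
    return np.sum(frames ** 2, axis=1)


def zero_crossing_count(frames):
    """Number of sign changes between consecutive samples in each frame."""
    negative = np.signbit(frames)  # True for negative samples, False otherwise
    return np.sum(negative[:, 1:] != negative[:, :-1], axis=1)


if __name__ == "__main__":
    # Toy signal at 1 kHz, so each 10 ms frame holds 10 samples.
    toy = np.array([1.0, -1.0] * 5 + [0.5] * 10)
    frames = frame_signal(toy, sample_rate=1000)
    print(short_time_energy(frames))
    print(zero_crossing_count(frames))
```

Working on a 2-D array of frames lets both features be computed for the whole track in one vectorized pass instead of a Python loop over frames.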

Vocal Extraction

Spleeter is a U-Net-based source separation tool, implemented in TensorFlow, that extracts the vocal track from an audio file. It provides pre-trained models and can be used directly from the command line.
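As a rough sketch of how Spleeter might be driven from Python through its command line, the helper below builds a `spleeter separate` command for the pre-trained `spleeter:2stems` model (vocals plus accompaniment). The function names `build_spleeter_command` and `extract_vocals` are hypothetical and are not taken from `spleeter_process.py`. The argument layout assumes a recent Spleeter release where input files are positional; older releases expected the input after `-i`, so check `spleeter separate --help` for the installed version.

```python
import subprocess


def build_spleeter_command(input_path, output_dir, model="spleeter:2stems"):
    """Build the argument list for a Spleeter command-line separation run."""
    return ["spleeter", "separate", "-p", model, "-o", output_dir, input_path]


def extract_vocals(input_path, output_dir):
    """Run Spleeter on one file; raises CalledProcessError if the command fails."""
    subprocess.run(build_spleeter_command(input_path, output_dir), check=True)


if __name__ == "__main__":
    # Only prints the command; call extract_vocals() to actually run Spleeter.
    print(" ".join(build_spleeter_command("song.wav", "separated")))
```

With the two-stem model, Spleeter is expected to write `vocals.wav` and `accompaniment.wav` into a subdirectory of the output directory named after the input file.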

Experiment

The method reached an AUC of 0.88 against the ground truth, a relatively strong result.
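For reference, AUC can be read as the probability that a randomly chosen vocal frame receives a higher score than a randomly chosen non-vocal frame. The sketch below computes it from ranks, with ties given their average rank. It is an illustration only, not the contents of `AUC.py`; the function name `roc_auc` is hypothetical, and labels are assumed to be 0 (non-vocal) or 1 (vocal) per frame.

```python
def roc_auc(labels, scores):
    """Rank-based AUC for binary labels (0 or 1) and per-frame scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        # Find the run of tied scores starting at position i.
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank shared by the ties
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")

    # Mann-Whitney U statistic of the positives, normalized to [0, 1].
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

Sorting once keeps the cost at O(n log n), which matters because a single track yields thousands of 10 ms frames.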

Usage

  • Install dependencies:

    > pip install -r requirements.txt
    
  • audio_segments.py: segments audio files into 10 ms frames

  • spleeter_process.py: runs Spleeter automatically to extract the vocal tracks from the original audio files

  • data_process.py: processes the extracted vocal tracks and outputs the VAD result

  • AUC.py: compares the VAD result with the ground truth to produce the ROC curve and AUC value

vocal-vad's People

Contributors

tianhewu, vaceee
