
Comments (6)

jtkim-kaist commented on May 24, 2024

Sorry for the lack of a detailed specification of the training dataset.

It is very simple, though. You can see the data format by inspecting the files in /data/raw/train or /data/raw/valid.

The speech data should be a .wav file sampled at 16 kHz, and the label must be a .mat file containing a 1-dimensional array whose values are 1 (speech) or 0 (non-speech). To see this directly, please open the sample training data in /data/raw/train.
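As a rough illustration of the format described above, the following sketch checks one wav/label pair using only the Python standard library. The function name `check_pair` is hypothetical and not part of the repository. Reading the .mat file is left out: the label is assumed to be already loaded into a flat sequence (for instance via `scipy.io.loadmat`; the variable name stored inside the .mat file is not stated in this thread). The length check follows the maintainer's later reply that the label has one value per audio sample.

```python
import wave

EXPECTED_RATE = 16000  # 16 kHz, as stated in the comment above


def check_pair(wav_path, label):
    """Validate one wav/label pair and return the number of audio samples.

    `label` is assumed to be a flat (1-D) sequence of 0/1 values that was
    already read from the .mat file by some other means.
    """
    with wave.open(wav_path, "rb") as wf:
        rate = wf.getframerate()
        # For a mono file, the number of frames equals the number of samples.
        n_samples = wf.getnframes()

    if rate != EXPECTED_RATE:
        raise ValueError(f"expected {EXPECTED_RATE} Hz, got {rate} Hz")

    # Reject nested sequences: the label must have a single dimension.
    if any(isinstance(v, (list, tuple)) for v in label):
        raise ValueError("label must be 1-dimensional")

    # Only 1 (speech) and 0 (non-speech) are allowed.
    if any(v not in (0, 1) for v in label):
        raise ValueError("label values must be 0 or 1")

    # One label value per audio sample.
    if len(label) != n_samples:
        raise ValueError(
            f"label has {len(label)} values but the wav has {n_samples} samples"
        )
    return n_samples
```

Calling `check_pair("some_file.wav", label)` on each training pair before training would catch a wrong sampling rate or a mismatched label early; the file name here is only a placeholder.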

Thx!

from vad.

pankaj2701 commented on May 24, 2024

Thanks for the quick reply. I still have one question: when marking the labels, do we count overlapping frames or non-overlapping frames?


jtkim-kaist commented on May 24, 2024

You don't need to apply framing to the label. The required label is simply a sample-based label.

For example, if the speech signal has 10,000 samples, the label should also have 10,000 values.
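To make the sample-based idea concrete, here is a minimal sketch that builds such a label from speech segments given in seconds. The function `make_sample_labels` and the segment times are hypothetical, chosen only for illustration; how the repository's own sample labels were produced is not described in this thread.

```python
def make_sample_labels(num_samples, speech_segments, sample_rate=16000):
    """Return a list with one 0/1 value per audio sample.

    `speech_segments` is a list of (start_sec, end_sec) pairs marking speech.
    Everything outside those pairs is labeled 0 (non-speech).
    """
    labels = [0] * num_samples
    for start_sec, end_sec in speech_segments:
        # Convert times to sample indices and clamp them to the signal length.
        start = max(0, int(round(start_sec * sample_rate)))
        end = min(num_samples, int(round(end_sec * sample_rate)))
        for i in range(start, end):
            labels[i] = 1
    return labels


# A 10,000-sample signal with speech from 0.1 s to 0.3 s (made-up times).
labels = make_sample_labels(10000, [(0.1, 0.3)])
print(len(labels), sum(labels))  # prints "10000 3200"
```

No frame length or hop size appears anywhere in this sketch, which is the point: the label lines up with the raw samples, not with analysis frames.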

Please download our sample wav and label files and verify this.

Thx!


pankaj2701 commented on May 24, 2024

I am asking because I want to train the model on my own data, so I need to know how to prepare the training data.

I looked at the sample files, but it is not clear how the samples were labeled. Some samples have a value of 0 and some have a value of 1. I guess a value of 1 means that the corresponding sample is a speech sample, but I have not been able to visually correlate the sample numbers with the waveforms.


jtkim-kaist commented on May 24, 2024

Your guess is correct: 1 corresponds to speech and 0 corresponds to non-speech. The plot looks like this:

[Image: plot of a sample signal with its speech/non-speech label]

Note that if the speech data is noisy, it is hard to distinguish speech from non-speech visually in the 1-D signal domain.
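When the waveform alone is hard to read, one option is to turn the 0/1 label back into time ranges and listen to those ranges in an audio editor. The sketch below does that conversion; `label_to_segments` is a hypothetical helper, not something provided by the repository, and it assumes the 16 kHz sampling rate mentioned earlier in the thread.

```python
def label_to_segments(labels, sample_rate=16000):
    """Convert a per-sample 0/1 label into (start_sec, end_sec) speech ranges."""
    segments = []
    start = None  # index where the current run of 1s began, if any
    for i, value in enumerate(labels):
        if value == 1 and start is None:
            start = i  # a speech run begins here
        elif value != 1 and start is not None:
            segments.append((start / sample_rate, i / sample_rate))
            start = None  # the speech run ended just before index i
    if start is not None:
        # The label ends while still inside a speech run.
        segments.append((start / sample_rate, len(labels) / sample_rate))
    return segments


# One second of non-speech, half a second of speech, half a second of non-speech.
example = [0] * 16000 + [1] * 8000 + [0] * 8000
print(label_to_segments(example))  # prints [(1.0, 1.5)]
```

Each returned pair gives the start and end of one speech region in seconds, which is easier to compare against what you hear than raw sample indices.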


Chenny0808 commented on May 24, 2024

Hello, I have the same problem as you. Two questions:
(1) Do you know how to format the training data?
(2) I don't understand how a label in a .mat file corresponds to the wav file.

