voice-based-gender-recognition's Introduction

Voice-based-gender-recognition

Voice based gender recognition using:

  • The Free ST American English Corpus dataset (SLR45)
  • Mel-frequency cepstrum coefficients (MFCC)
  • Gaussian mixture models (GMM)

Dataset

The Free ST American English Corpus dataset (SLR45) can be found on SLR45. It is a free American English corpus by Surfingtech, containing utterances from 10 speakers (5 females and 5 males). Each speaker has about 350 utterances.

Theory

Voice features extraction

Mel-frequency cepstral coefficients (MFCCs) are used here, since they are among the most effective features for speaker verification. MFCCs are commonly derived as follows:

  1. Take the Fourier transform of (a windowed excerpt of) a signal.
  2. Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows.
  3. Take the logs of the powers at each of the mel frequencies.
  4. Take the discrete cosine transform of the list of mel log powers, as if it were a signal.
  5. The MFCCs are the amplitudes of the resulting spectrum.
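
The five steps above can be sketched for a single frame with NumPy alone. This is a minimal illustration, not the code used in this repository (which relies on python_speech_features); the function names, the Hamming window, the 26 filters, the 13 coefficients and the unnormalized DCT-II are all assumptions chosen for the example.

```python
import numpy as np


def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_filterbank(n_filters, nfft, rate):
    """Triangular overlapping filters spaced evenly on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(rate / 2.0), n_filters + 2)
    bins = np.floor((nfft + 1) * mel_to_hz(mel_points) / rate).astype(int)
    bank = np.zeros((n_filters, nfft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):      # rising edge of the triangle
            bank[i, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            bank[i, k] = (right - k) / (right - center)
    return bank


def frame_mfcc(frame, rate, n_filters=26, n_ceps=13, nfft=512):
    """MFCCs of one frame (illustrative parameter defaults)."""
    # Step 1: Fourier transform of a windowed excerpt, as a power spectrum.
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, nfft)) ** 2 / nfft
    # Step 2: map the powers onto the mel scale with triangular windows.
    mel_energies = mel_filterbank(n_filters, nfft, rate) @ power
    # Step 3: take the logs (floored to avoid log(0)).
    log_energies = np.log(np.maximum(mel_energies, 1e-10))
    # Step 4: discrete cosine transform (unnormalized DCT-II).
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    # Step 5: the MFCCs are the amplitudes of the resulting spectrum.
    return basis @ log_energies
```

For example, frame_mfcc(np.sin(2 * np.pi * 440.0 * np.arange(400) / 16000), 16000) returns 13 coefficients for a 25 ms frame of a 440 Hz tone sampled at 16 kHz.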

Gaussian Mixture Model

According to D. Reynolds in "Gaussian Mixture Models": A Gaussian Mixture Model (GMM) is a parametric probability density function represented as a weighted sum of Gaussian component densities. GMMs are commonly used as a parametric model of the probability distribution of continuous measurements or features in a biometric system, such as vocal-tract related spectral features in a speaker recognition system. GMM parameters are estimated from training data using the iterative Expectation-Maximization (EM) algorithm or Maximum A Posteriori (MAP) estimation from a well-trained prior model.
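
As a rough illustration of the "weighted sum of Gaussian component densities", the sketch below evaluates the log density of a diagonal-covariance GMM with NumPy and picks the gender whose model scores higher on average. The function names and the (weights, means, variances) tuple layout are assumptions made for this example; the repository itself trains its models with scikit-learn.

```python
import numpy as np


def gmm_log_likelihood(x, weights, means, variances):
    """Log density of each row of x under a diagonal-covariance GMM.

    weights: (k,), means: (k, d), variances: (k, d).
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))        # (n, d)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    diff = x[:, None, :] - means[None, :, :]              # (n, k, d)
    # Log of each Gaussian component density, per sample and component.
    log_gauss = -0.5 * (
        np.sum(diff ** 2 / variances, axis=2)
        + np.sum(np.log(2.0 * np.pi * variances), axis=1)
    )                                                     # (n, k)
    # Weighted sum of the components, computed stably in the log domain.
    weighted = np.log(weights) + log_gauss
    top = weighted.max(axis=1, keepdims=True)
    return (top + np.log(np.exp(weighted - top).sum(axis=1, keepdims=True))).ravel()


def classify(features, female_gmm, male_gmm):
    """Pick the model with the higher average log-likelihood."""
    female_score = gmm_log_likelihood(features, *female_gmm).mean()
    male_score = gmm_log_likelihood(features, *male_gmm).mean()
    return "female" if female_score > male_score else "male"
```

Here features stands for the MFCC vectors of one recording, one row per frame.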

Workflow graph


  • For a more detailed explanation, please refer to this blog that I have written.


Dependencies

This script requires the following modules/libraries:

Libs can be installed as follows:

pip install -r requirements.txt

Code & scripts

  • Run.py: This is the main script; it runs the whole cycle (Data management > Models training > Genders identification).
  • DataManager.py: This script is responsible for extracting and structuring the data.
  • ModelsTrainer.py: This script is responsible for training the Gaussian Mixture Models (GMM).
  • GenderIdentifier.py: This script is responsible for testing the system by identifying the genders of the testing set.
  • FeaturesExtractor.py: This script is responsible for extracting the MFCC features from the .wav files.

Results and discussion

  • The system reaches 95% accuracy in gender detection.
  • The code can be further optimized using multi-threading, acceleration libs and multi-processing.
  • The accuracy can be further improved using GMM normalization, i.e. a UBM-GMM system.
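
One possible shape for the multi-threading idea above, using only the standard library. The names extract_all and extract are hypothetical: extract stands for whatever per-file function computes the features, and nothing here is taken from the repository's code.

```python
from concurrent.futures import ThreadPoolExecutor


def extract_all(paths, extract, max_workers=4):
    """Apply `extract` to every path concurrently, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(extract, paths))
```

For CPU-bound feature extraction, swapping in ProcessPoolExecutor from the same module is the usual alternative; it requires extract to be a picklable, top-level function.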

voice-based-gender-recognition's Issues

SVM-GMM

Hello Mr. Ayoub,
Thank you for your great work.

I ran into some issues while running the SVM code. Could you help me, please?

image
image

I have one question regarding the SVM code: why are we training the data on .hmm?

And why did you use this?
image

Thank you in advance.

Unable to run Run.py

I downloaded the zip file of the repository and ran pip on requirements.txt.
There seems to be an incompatibility between the versions of scikit-learn and scipy, so in the end I get this message:

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for scipy
Running setup.py clean for scipy
error: subprocess-exited-with-error

× python setup.py clean did not run successfully.
│ exit code: 1
╰─> [11 lines of output]
/private/var/folders/1n/fvs3dd2d5gqghwcw8khb41_h0000gn/T/pip-install-wrqba4ts/scipy_d52b27f83b4d4bcbab2d2a119d86fdd9/setup.py:114: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp

  `setup.py clean` is not supported, use one of the following instead:
  
    - `git clean -xdf` (cleans all files)
    - `git clean -Xdf` (cleans all versioned files, doesn't touch
                        files that aren't checked into the git repo)
  
  Add `--force` to your command to use it anyway if you must (unsupported).
  
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed cleaning build dir for scipy
Failed to build scikit-learn scipy
ERROR: Could not build wheels for scikit-learn, scipy, which is required to install pyproject.toml-based projects

Here is the output I get when I then run Run.py:

python Run.py

Download dataset zip file

Mange and organize files

Files extraction was successfull ...
TrainingData was created ...
TestingData was created ...
TrainingData/females was created ...
TrainingData/males was created ...
TestingData/females was created ...
TestingData/males was created ...

Train gender models

Traceback (most recent call last):
  File "Code/ModelsTrainer.py", line 5, in <module>
    from sklearn.mixture import GMM
ImportError: cannot import name 'GMM' from 'sklearn.mixture' (/Users/yannis/miniconda3/lib/python3.8/site-packages/sklearn/mixture/__init__.py)

Identify genders

Traceback (most recent call last):
  File "Code/GenderIdentifier.py", line 5, in <module>
    from FeaturesExtractor import FeaturesExtractor
  File "/Users/yannis/Downloads/LIVRE/PYTHON/EXOS/01-4/Voice-based-gender-recognition-master/Code/FeaturesExtractor.py", line 4, in <module>
    from python_speech_features import mfcc
ModuleNotFoundError: No module named 'python_speech_features'

What am I doing wrong? How can I solve this problem?

Error when using high rates

Hi, I am currently working on the second phase of my experiment using your source code. Thank you :) I have just one question about the audio and rate in the features vector: I'm having trouble with sampling_rate when I try to substitute it with 44100. Can you please tell me where I'm going wrong?

Originally posted by @thxrgxxs in #4 (comment)

Thank you

Thanks for the great work.
The problem has been solved.

Issue With Score Computation

Hello,

In testing this implementation on some real-life recordings I took, I happened to get very negative scores for the file named "female2.wav". I'm wondering how this happened, and more specifically how the scoring algorithm works (the documentation for the score function appears to say that it is a log probability, but somehow we have positive values?). Any indication as to how this one .wav file could have generated negative scores while other similar ones generated positive ones would be greatly appreciated.
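
On the sign of the scores: a log-likelihood for continuous features is the log of a density value, not of a probability, and a density can exceed 1 when the distribution is narrow, so positive and negative values are both possible. The sketch below shows this for a single one-dimensional Gaussian; gaussian_log_pdf is a helper written for this illustration and is not part of the repository. Whether this explains the scores of one particular recording cannot be told from the numbers alone.

```python
import math


def gaussian_log_pdf(x, mean, std):
    """Log density of a one-dimensional Gaussian at x."""
    return -0.5 * math.log(2.0 * math.pi * std ** 2) - (x - mean) ** 2 / (2.0 * std ** 2)


narrow = gaussian_log_pdf(0.0, 0.0, 0.1)   # density above 1, so the log is positive
wide = gaussian_log_pdf(0.0, 0.0, 1.0)     # density below 1, so the log is negative
far = gaussian_log_pdf(1.0, 0.0, 0.1)      # far from the mean: strongly negative
```

A recording whose features fall far from every component of a model, as in the last line, gets a strongly negative score under that model.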

The question of model

Hi, thank you for sharing your code. I have recently been using it for my research, and a question came up in practice. I would appreciate your answer.
In the model training under Code, you used the Gaussian mixture model. But in the model training under nnCode, svmCode, and hmmCode, you used the Gaussian HMM. Why is it set up like this?

Real-time gender classification?

I see the python files rely on files for input.

Can I pipe a .wav file into a python script?

I'm exploring possibilities for classifying a voice in real time.

I got the program working fine. Very nice code. Thank you.
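
A minimal sketch of reading a piped .wav file with the standard library, assuming 16-bit PCM audio. read_wav is a hypothetical helper, not a function of this repository, and it only covers getting the samples into Python; real-time classification would additionally need chunked processing.

```python
import io
import struct
import wave


def read_wav(stream):
    """Return (sample_rate, samples) from a 16-bit PCM WAV byte stream."""
    with wave.open(stream, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit PCM audio")
        rate = wav.getframerate()
        n_channels = wav.getnchannels()
        raw = wav.readframes(wav.getnframes())
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    # Samples are interleaved per channel; keep the first channel only.
    return rate, samples[::n_channels]


# With piped input, e.g. `cat sample.wav | python script.py`, buffer stdin first:
#   import sys
#   rate, samples = read_wav(io.BytesIO(sys.stdin.buffer.read()))
```

Buffering standard input into io.BytesIO gives the wave module a seekable stream, at the cost of waiting for the whole file before processing starts.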

There was a problem running the code

I am trying to run this code, but there was a problem in FeaturesExtractor:
'from FeaturesExtractor import FeaturesExtractor'
How can I solve this problem in Python?
Which library is needed to resolve this issue?
The error is: unresolved reference FeaturesExtractor

Input of NN

Sorry for the inconvenience, I’m a little bit stuck with the NN code. Is the input for the NN the super vector?

[SOLVED] update to run

Thanks for the model. I did not install it from the requirements, so I got stuck with errors and fixed them.
To get it working with an updated sklearn, in Voice-based-gender-recognition\Code\ModelsTrainer.py change:

#from sklearn.mixture import GMM
from sklearn import mixture

#females_gmm = GMM(n_components = 16, n_iter = 200, covariance_type='diag', n_init = 3)
#males_gmm   = GMM(n_components = 16, n_iter = 200, covariance_type='diag', n_init = 3)
females_gmm = mixture.GaussianMixture(n_components = 16, max_iter = 200, covariance_type='diag', n_init = 3)
males_gmm   = mixture.GaussianMixture(n_components = 16, max_iter = 200, covariance_type='diag', n_init = 3)

Also, in FeaturesExtractor.py, to avoid warnings:

#nfft         = 512,
nfft         = 2400,
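
The nfft change above presumably matters because the FFT size should be at least the number of samples in one analysis window, which grows with the sampling rate. The helper below is a hedged sketch for picking such a value: min_nfft is a hypothetical name, the 25 ms default window is an assumption, and rounding up to a power of two is a common convention rather than a requirement (the 2400 used above is not a power of two).

```python
import math


def min_nfft(sample_rate, winlen=0.025):
    """Smallest power of two that holds one analysis window of samples."""
    frame_len = int(math.ceil(winlen * sample_rate))
    nfft = 1
    while nfft < frame_len:
        nfft *= 2
    return nfft
```

With a 25 ms window, this gives 512 at 16000 Hz and 2048 at both 44100 Hz and 48000 Hz.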

I also had to change GenderIdentifier.py a bit (I use Windows) when running it on its own:

expected_gender = file.split("\\")[1][:-1]
#expected_gender = file.split("/")[1][:-1]
#gender_identifier = GenderIdentifier("TestingData/females", "TestingData/males", "females.gmm", "males.gmm")
gender_identifier = GenderIdentifier("females", "males", "females.gmm", "males.gmm")

Finally, I saw a strange thing: the model identifies the genders the wrong way round:

--> TESTING        : 1073565748.wem.wav
+ FEMALE SCORE     : 0.487
+ MALE SCORE       : 0.8
+ EXPECTATION      : 1073565748.wem.wa
+ IDENTIFICATION   : male
----------------------------------------------------
--> TESTING        : 1073638869.wem.wav
+ FEMALE SCORE     : -0.127
+ MALE SCORE       : 0.432
+ EXPECTATION      : 1073638869.wem.wa
+ IDENTIFICATION   : male
----------------------------------------------------
--> TESTING        : 1073566498.wem.wav
+ FEMALE SCORE     : 0.661
+ MALE SCORE       : 0.229
+ EXPECTATION      : 1073566498.wem.wa
+ IDENTIFICATION   : female
----------------------------------------------------
*** Accuracy = 0.0% ***

True results: male -> female, female -> male.
Maybe a path is wrong somewhere?

OK, fixed the last one by just changing the arguments in GenderIdentifier.py as follows:
gender_identifier = GenderIdentifier("males", "females", "males.gmm","females.gmm")
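
A note on the path handling in this post: the EXPECTATION lines in the log above show a file name with its last character removed, which suggests that the hard-coded index after split() landed on the file name rather than on the gender folder. A separator-independent alternative is to take the name of the parent folder with pathlib, as in the sketch below. expected_gender is a hypothetical helper, and it assumes the files sit directly inside folders named "females" and "males".

```python
from pathlib import PureWindowsPath


def expected_gender(file_path):
    """Derive 'female' or 'male' from the parent folder ('females', 'males')."""
    # PureWindowsPath accepts both "\\" and "/" as separators, so the same
    # parsing works for paths produced on Windows and on POSIX systems.
    folder = PureWindowsPath(file_path).parent.name
    return folder[:-1]  # drop the trailing "s"
```

This avoids depending on how many folders precede the file, so the same code works whether the tester is given "TestingData/females" or just "females".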
