superkogito / voice-based-gender-recognition

:sound: :boy: :girl: Voice-based gender recognition using Mel-frequency cepstrum coefficients (MFCC) and Gaussian mixture models (GMM)

License: MIT License

Python 100.00%

gender-recognition gender-detection gender-classification gmm mfcc gender-recognition-by-voice voice mel-frequencies gaussian-mixture-models signal

voice-based-gender-recognition's Introduction

Voice-based-gender-recognition

Voice based gender recognition using:

The Free ST American English Corpus dataset (SLR45)
Mel-frequency cepstrum coefficients (MFCC)
Gaussian mixture models (GMM)

Dataset

The Free ST American English Corpus dataset (SLR45) is available from OpenSLR as SLR45. It is a free American English corpus by Surfingtech, containing utterances from 10 speakers (5 females and 5 males). Each speaker has about 350 utterances.

Theory

Voice features extraction

The Mel-Frequency Cepstrum Coefficients (MFCC) are used here, since they are among the most widely used features in speaker verification and tend to perform well there. MFCCs are commonly derived as follows:

Take the Fourier transform of (a windowed excerpt of) a signal.
Map the powers of the spectrum obtained above onto the mel scale, using triangular overlapping windows.
Take the logs of the powers at each of the mel frequencies.
Take the discrete cosine transform of the list of mel log powers, as if it were a signal.
The MFCCs are the amplitudes of the resulting spectrum.
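The five steps above can be sketched for a single frame using only numpy. This is a minimal illustration, not the code used by this repository (which relies on python_speech_features). The function names, the 26 filters, the 13 coefficients, the Hamming window and the unnormalized DCT-II are all assumptions chosen for the example.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, nfft, sample_rate):
    """Triangular overlapping filters, evenly spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((nfft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, nfft // 2 + 1))
    for i in range(n_filters):
        left, center, right = bins[i], bins[i + 1], bins[i + 2]
        for k in range(left, center):          # rising edge of the triangle
            fbank[i, k] = (k - left) / (center - left)
        for k in range(center, right):         # falling edge of the triangle
            fbank[i, k] = (right - k) / (right - center)
    return fbank

def mfcc_frame(frame, sample_rate, n_filters=26, n_ceps=13, nfft=512):
    """MFCCs of one frame (hypothetical helper, illustrative defaults)."""
    # 1. Fourier transform of a windowed excerpt, turned into a power spectrum.
    windowed = frame * np.hamming(len(frame))
    power = np.abs(np.fft.rfft(windowed, nfft)) ** 2 / nfft
    # 2. Map the powers onto the mel scale with triangular overlapping windows.
    mel_energies = mel_filterbank(n_filters, nfft, sample_rate) @ power
    # 3. Take the logs of the mel powers (small constant avoids log(0)).
    log_mel = np.log(mel_energies + 1e-10)
    # 4. Discrete cosine transform (unnormalized DCT-II) of the mel log powers.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.outer(np.arange(n_ceps), n + 0.5) / n_filters)
    # 5. The MFCCs are the amplitudes of the resulting spectrum.
    return basis @ log_mel

# Usage: one 25 ms frame of a 440 Hz tone sampled at 16 kHz.
sample_rate = 16000
t = np.arange(400) / sample_rate
coefficients = mfcc_frame(np.sin(2 * np.pi * 440.0 * t), sample_rate)
print(coefficients.shape)
```

A full extractor would slide this over overlapping frames and usually add steps such as pre-emphasis and liftering, which are left out here.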

Gaussian Mixture Model

According to D. Reynolds in "Gaussian Mixture Models": A Gaussian Mixture Model (GMM) is a parametric probability density function represented as a weighted sum of Gaussian component densities. GMMs are commonly used as a parametric model of the probability distribution of continuous measurements or features in a biometric system, such as vocal-tract related spectral features in a speaker recognition system. GMM parameters are estimated from training data using the iterative Expectation-Maximization (EM) algorithm or Maximum A Posteriori (MAP) estimation from a well-trained prior model.
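To make the "weighted sum of Gaussian component densities" concrete, here is a small numpy sketch that evaluates the log-density of a diagonal-covariance GMM. It only evaluates an already-known model and does not estimate the parameters with EM or MAP. The function name and argument layout are assumptions made for this example.

```python
import numpy as np

def gmm_log_density(x, weights, means, variances):
    """log p(x) with p(x) = sum_k w_k * N(x | mean_k, diag(variance_k)).

    x:         (n_samples, n_dims) feature vectors
    weights:   (n_components,) mixture weights summing to 1
    means:     (n_components, n_dims)
    variances: (n_components, n_dims) diagonal covariances
    """
    x = np.atleast_2d(x)
    diff = x[:, None, :] - means[None, :, :]
    # Log of each Gaussian's normalization constant.
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * variances), axis=1)
    # Log of each Gaussian's exponential term, per sample and component.
    log_expo = -0.5 * np.sum(diff ** 2 / variances[None, :, :], axis=2)
    log_weighted = np.log(weights)[None, :] + log_norm[None, :] + log_expo
    # Log-sum-exp over the components, for numerical stability.
    top = log_weighted.max(axis=1, keepdims=True)
    return (top + np.log(np.exp(log_weighted - top).sum(axis=1, keepdims=True))).ravel()

# Usage: two components in two dimensions, scored on two points.
weights = np.array([0.3, 0.7])
means = np.array([[0.0, 0.0], [3.0, 3.0]])
variances = np.array([[1.0, 1.0], [0.5, 0.5]])
points = np.array([[0.1, -0.2], [2.9, 3.1]])
print(gmm_log_density(points, weights, means, variances))
```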

Workflow graph

For a more detailed explanation, please refer to this blog that I have written.

Dependencies

This script requires the following modules/libraries:

numpy & scipy & scikit-learn & python_speech_features

Libs can be installed as follows:

pip install -r requirements.txt

Code & scripts

Run.py: This is the main script; it runs the whole cycle (Data management > Models training > Genders identification).
DataManager.py: This script is responsible for extracting and structuring the data.
ModelsTrainer.py: This script is responsible for training the Gaussian Mixture Models (GMM).
GenderIdentifier.py: This script is responsible for testing the system by identifying the genders of the testing set.
FeaturesExtractor.py: This script is responsible for extracting the MFCC features from the .wav files.

Results and discussion

The system achieves 95% accuracy in gender detection.
The code can be further optimized using multi-threading, acceleration libraries and multi-processing.
The accuracy can be further improved using GMM normalization, i.e. a GMM-UBM system.


voice-based-gender-recognition's Issues

SVM-GMM

Hello Mr. Ayoub,
Thank you for your great work.

I ran into some issues while running the SVM code. Could you help me, please?

I have one question regarding the SVM code: why are we training the data on .hmm?

And why did you use this?

Thank you in advance.

Unable to run Run.py

I downloaded the ZIP file of the repository and ran
pip on requirements.txt.
There seems to be an incompatibility between the versions of scikit-learn and scipy. In the end I get this message:

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for scipy
Running setup.py clean for scipy
error: subprocess-exited-with-error

× python setup.py clean did not run successfully.
│ exit code: 1
╰─> [11 lines of output]
/private/var/folders/1n/fvs3dd2d5gqghwcw8khb41_h0000gn/T/pip-install-wrqba4ts/scipy_d52b27f83b4d4bcbab2d2a119d86fdd9/setup.py:114: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp

  `setup.py clean` is not supported, use one of the following instead:
  
    - `git clean -xdf` (cleans all files)
    - `git clean -Xdf` (cleans all versioned files, doesn't touch
                        files that aren't checked into the git repo)
  
  Add `--force` to your command to use it anyway if you must (unsupported).
  
  [end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed cleaning build dir for scipy
Failed to build scikit-learn scipy
ERROR: Could not build wheels for scikit-learn, scipy, which is required to install pyproject.toml-based projects

Here is the output I get when I then run Run.py:

python Run.py

Download dataset zip file

Mange and organize files

Files extraction was successfull ...
TrainingData was created ...
TestingData was created ...
TrainingData/females was created ...
TrainingData/males was created ...
TestingData/females was created ...
TestingData/males was created ...

Train gender models

Traceback (most recent call last):
  File "Code/ModelsTrainer.py", line 5, in <module>
    from sklearn.mixture import GMM
ImportError: cannot import name 'GMM' from 'sklearn.mixture' (/Users/yannis/miniconda3/lib/python3.8/site-packages/sklearn/mixture/__init__.py)

Identify genders

Traceback (most recent call last):
  File "Code/GenderIdentifier.py", line 5, in <module>
    from FeaturesExtractor import FeaturesExtractor
  File "/Users/yannis/Downloads/LIVRE/PYTHON/EXOS/01-4/Voice-based-gender-recognition-master/Code/FeaturesExtractor.py", line 4, in <module>
    from python_speech_features import mfcc
ModuleNotFoundError: No module named 'python_speech_features'

What am I doing wrong? How can I solve this problem?
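A possible explanation for the first traceback, offered as a sketch rather than a confirmed fix: recent scikit-learn releases no longer ship sklearn.mixture.GMM, and its replacement, sklearn.mixture.GaussianMixture, takes max_iter where the old class took n_iter. The helper below is hypothetical and only translates the keyword names. The second traceback simply means python_speech_features was not installed, which is consistent with the failed pip run above.

```python
def to_gaussian_mixture_kwargs(old_kwargs):
    """Translate keyword arguments written for the removed sklearn.mixture.GMM
    into the names expected by sklearn.mixture.GaussianMixture.

    Only the n_iter -> max_iter rename is handled here; other arguments are
    passed through unchanged, which is an assumption of this sketch.
    """
    renamed = {"n_iter": "max_iter"}
    return {renamed.get(name, name): value for name, value in old_kwargs.items()}

old_kwargs = dict(n_components=16, n_iter=200, covariance_type="diag", n_init=3)
new_kwargs = to_gaussian_mixture_kwargs(old_kwargs)
print(new_kwargs)

# With scikit-learn installed, the model could then be created as:
#     from sklearn.mixture import GaussianMixture
#     females_gmm = GaussianMixture(**new_kwargs)
```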

Error when using high rates

Hi, I am currently working on the second phase of my experiment using your source code. Thank you :) I just have one question about the audio and rate in the features vector: I'm having trouble with sampling_rate when I try to substitute it with 44100. Can you please tell me where I'm going wrong?

Originally posted by @thxrgxxs in #4 (comment)
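The post does not include the error message, so the following is only a guess at one common cause. With a 25 ms analysis window (the python_speech_features default for winlen), a frame at 44100 Hz holds about 1102 samples, which is more than an FFT size of 512 can cover. The hypothetical helper below picks the smallest power-of-two FFT size that fits one window; the result would be passed as the nfft argument of the MFCC extractor.

```python
def smallest_nfft(sample_rate, winlen=0.025):
    """Smallest power of two that holds one analysis window of `winlen` seconds."""
    frame_length = int(round(sample_rate * winlen))
    nfft = 1
    while nfft < frame_length:
        nfft *= 2
    return nfft

for rate in (16000, 44100):
    print(rate, smallest_nfft(rate))
```

Any nfft at least as large as the frame length avoids truncating the frame. A power of two is just a conventional choice for FFT efficiency.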

Error when running ModelsTrainer

i keep getting this error when i run modelstrainer.py. can you please assist me?

Thank you

Thanx for the great work.
The problem has been solved.

Issue With Score Computation

Hello,

In testing this implementation on some real-life recordings I took, I happened to get very negative scores for the file named "female2.wav". I'm wondering how this happened, and more specifically how the scoring algorithm works (the documentation for the score function appears to say that it is a log probability, but somehow we have positive values?). Any indication as to how this one .wav file could have generated negative scores while other similar ones generated positive ones would be greatly appreciated.
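One point that may help here: for continuous features, a GMM score is the log of a probability density, not of a probability. Densities can exceed 1, so their logs can be positive, while a point far from every component gets a strongly negative value. The single-Gaussian sketch below illustrates both cases with made-up numbers; it does not reproduce the scoring code of this repository.

```python
import numpy as np

def gaussian_log_density(x, mean, std):
    """Log of the 1-D normal density N(x | mean, std**2)."""
    return -0.5 * np.log(2.0 * np.pi * std ** 2) - (x - mean) ** 2 / (2.0 * std ** 2)

# A narrow Gaussian evaluated at its mean: the density is above 1, so the log is positive.
narrow_at_mean = gaussian_log_density(0.0, mean=0.0, std=0.1)
# A unit-variance Gaussian at its mean: the density is below 1, so the log is negative.
wide_at_mean = gaussian_log_density(0.0, mean=0.0, std=1.0)
# The same narrow Gaussian far from its mean: very negative.
narrow_far_away = gaussian_log_density(3.0, mean=0.0, std=0.1)

print(narrow_at_mean, wide_at_mean, narrow_far_away)
```

Under this reading, a recording whose MFCCs fall far from what both models saw in training (different microphone, noise, level or sampling rate) could plausibly produce very negative scores for both genders.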

The question of model

Hi, thank you for sharing your code. I have recently been using it for my further research, and a question came up in practice. I would appreciate your answer.
For the model training in the Code folder, you used the Gaussian mixture model. But for the model training in the nnCode, svmCode, and hmmCode folders, you used the Gaussian HMM. Why is it set up like this?

Real-time gender classification?

I see the python files rely on files for input.

Can I pipe a .wav file into a python script?

I'm exploring possibilities for classifying a voice in real time.

I got the program working fine. Very nice code. Thank you.
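As a starting point for piping, the sketch below reads WAV bytes from any binary stream (for example sys.stdin.buffer) and yields fixed-length windows that could be handed to a feature extractor. It uses only the standard wave module and numpy. It is not truly real-time, since it buffers the whole stream before decoding, and it assumes 16-bit PCM. The function name and the one-second window are illustrative choices.

```python
import io
import wave

import numpy as np

def read_wav_windows(stream, window_seconds=1.0):
    """Yield (sample_rate, samples) windows from WAV bytes read from a binary stream."""
    # Buffer the stream first so the WAV reader gets a seekable object.
    buffered = io.BytesIO(stream.read())
    with wave.open(buffered, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        sample_rate = wav.getframerate()
        channels = wav.getnchannels()
        frames_per_window = int(sample_rate * window_seconds)
        while True:
            raw = wav.readframes(frames_per_window)
            if not raw:
                break
            # One row per frame, one column per channel.
            yield sample_rate, np.frombuffer(raw, dtype="<i2").reshape(-1, channels)

# Usage with a piped file (shell: cat recording.wav | python script.py):
#     import sys
#     for rate, window in read_wav_windows(sys.stdin.buffer):
#         ...  # extract features from `window` and score them
```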

how to test the network on new voices?

Hi,
thanks for your good work.
Could you please explain how to test the network on new data?
How can we test it on our own voices?

Thanks
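In outline, testing a new recording means extracting its MFCCs, scoring them against both trained models and picking the higher score. The sketch below shows only that last decision step. It assumes each model is any object with a score(features) method (a fitted scikit-learn mixture model would qualify) and that features is the MFCC matrix of one utterance. The function name is hypothetical and this is not the repository's GenderIdentifier class.

```python
import numpy as np

def identify_gender(features, females_model, males_model):
    """Return (label, female_score, male_score) for one utterance."""
    # np.sum handles both a per-frame array of log-likelihoods and a single float.
    female_score = float(np.sum(females_model.score(features)))
    male_score = float(np.sum(males_model.score(features)))
    label = "female" if female_score > male_score else "male"
    return label, female_score, male_score

class ConstantModel:
    """Stand-in for a trained model, used only to make the example runnable."""
    def __init__(self, value):
        self.value = value

    def score(self, features):
        return self.value

dummy_features = np.zeros((10, 13))
print(identify_gender(dummy_features, ConstantModel(-40.0), ConstantModel(-55.0)))
```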

The identification result is not accurate

I used more than 3,000 male voices and more than 3,000 female voices, and after training, most of the results identified male voices as female voices.

There was problem for running the code

I am trying to run this code, but there is a problem in FeaturesExtractor:
'from FeaturesExtractor import FeaturesExtractor'
How can I solve this problem in Python?
Which library is needed to resolve this issue?
The error is: unresolved reference FeaturesExtractor

#from sklearn.mixture import GMM
from sklearn import mixture

#females_gmm = GMM(n_components = 16, n_iter = 200, covariance_type='diag', n_init = 3)
#males_gmm   = GMM(n_components = 16, n_iter = 200, covariance_type='diag', n_init = 3)
females_gmm = mixture.GaussianMixture(n_components = 16, max_iter = 200, covariance_type='diag', n_init = 3)
males_gmm   = mixture.GaussianMixture(n_components = 16, max_iter = 200, covariance_type='diag', n_init = 3)

also in FeaturesExtractor.py to avoid warnings:

#nfft         = 512,
nfft         = 2400,

Also, I had to change GenderIdentifier.py a bit (I use Windows) when I use it alone:

expected_gender = file.split("\\")[1][:-1]
#expected_gender = file.split("/")[1][:-1]
#gender_identifier = GenderIdentifier("TestingData/females", "TestingData/males", "females.gmm", "males.gmm")
gender_identifier = GenderIdentifier("females", "males", "females.gmm", "males.gmm")

And finally I saw something strange. The model identifies upside-down:

--> TESTING        : 1073565748.wem.wav
+ FEMALE SCORE     : 0.487
+ MALE SCORE       : 0.8
+ EXPECTATION      : 1073565748.wem.wa
+ IDENTIFICATION   : male
----------------------------------------------------
--> TESTING        : 1073638869.wem.wav
+ FEMALE SCORE     : -0.127
+ MALE SCORE       : 0.432
+ EXPECTATION      : 1073638869.wem.wa
+ IDENTIFICATION   : male
----------------------------------------------------
--> TESTING        : 1073566498.wem.wav
+ FEMALE SCORE     : 0.661
+ MALE SCORE       : 0.229
+ EXPECTATION      : 1073566498.wem.wa
+ IDENTIFICATION   : female
----------------------------------------------------
*** Accuracy = 0.0% ***

True results: male -> female, female -> male.
Maybe a path is wrong somewhere?

OK, fixed the last one. I just changed *args in GenderIdentifier.py as follows:
gender_identifier = GenderIdentifier("males", "females", "males.gmm","females.gmm")
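One observation on the log above, offered as a possible reading rather than a confirmed diagnosis: the EXPECTATION field shows a truncated file name ("1073565748.wem.wa") instead of a gender. That is what file.split("\\")[1][:-1] returns when the path has only two components, such as "females\\1073565748.wem.wav", and a comparison against it can never match, which would give 0.0% accuracy regardless of the models. A separator-independent way to take the label from the parent folder could look like this (the helper name is hypothetical):

```python
from pathlib import PureWindowsPath

def expected_gender(file_path):
    """Gender label taken from the parent folder name ("females" -> "female")."""
    # PureWindowsPath treats both "/" and "\\" as separators on any platform.
    folder = PureWindowsPath(file_path).parent.name
    return folder[:-1] if folder.endswith("s") else folder

print(expected_gender("TestingData/females/1073565748.wem.wav"))
print(expected_gender("males\\1073638869.wem.wav"))
```

This assumes file names themselves never contain a backslash, which is a safe assumption on Windows but not guaranteed on other systems.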