MusicRJ

A Machine Learning-Audio Signal Processing Project (Ongoing)

To see the demo video, click here

Project Details

This is a Machine Learning-Audio Signal Processing project in which a real-time audio signal is classified as speech or music using a deep neural network (DNN) and a convolutional neural network (CNN). The long-term goal is to create an AI personal assistant that listens to audio streams and summarizes their content for the end user.

Dataset

The project uses the GTZAN music/speech collection dataset.

All the wav audio files should be extracted to the Data/Files folder.
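
As a quick sanity check after extraction, the sketch below lists the WAV files in Data/Files and reads their sample rate and duration using only the Python standard library. The helper name `summarize_wav_files` is hypothetical and not part of the project; it assumes the files are plain PCM WAV, which is what the `wave` module can read.

```python
import wave
from pathlib import Path


def summarize_wav_files(folder="Data/Files"):
    """Return (filename, sample_rate_hz, duration_s) for each .wav file in folder."""
    summaries = []
    for path in sorted(Path(folder).glob("*.wav")):
        # wave.open reads only the header here; no audio samples are loaded.
        with wave.open(str(path), "rb") as wav:
            rate = wav.getframerate()
            duration = wav.getnframes() / rate
        summaries.append((path.name, rate, duration))
    return summaries


# Prints one line per file; prints nothing if the folder is empty or missing.
for name, rate, duration in summarize_wav_files():
    print(f"{name}: {rate} Hz, {duration:.1f} s")
```

If the loop prints nothing, the archive was probably extracted to a different folder.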

Python Version

Python 3.9.12

Setting up virtual environment

Installing Virtual Environment

python -m pip install --user virtualenv

Creating New Virtual Environment

python -m venv envname

Activating Virtual Environment

source envname/bin/activate

Upgrade PIP

python -m pip install --upgrade pip

Installing Packages

python -m pip install -r requirements.txt
python -m pip install PyAudio

How to run

# Data preprocessing
python main.py -s p

# Model training
python main.py -s t

# Real-time demonstration
python main.py -s r
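
The three commands differ only in the value passed to `-s`. The following is a minimal sketch of how such a stage switch could be parsed with `argparse`. It is an illustration under assumptions, not the project's actual `main.py`: the `STAGES` mapping and the `parse_stage` helper are hypothetical names, and only the `-s` flag and the codes `p`, `t`, `r` come from the commands above.

```python
import argparse

# Stage codes taken from the commands above; the descriptions paraphrase them.
STAGES = {
    "p": "data preprocessing",
    "t": "model training",
    "r": "real-time demonstration",
}


def parse_stage(argv=None):
    """Parse the -s option and return the selected stage code."""
    parser = argparse.ArgumentParser(description="Stage selector (illustrative)")
    parser.add_argument(
        "-s",
        dest="stage",
        choices=sorted(STAGES),
        required=True,
        help="p: preprocess, t: train, r: real-time demo",
    )
    return parser.parse_args(argv).stage


# Equivalent to running: python main.py -s t
stage = parse_stage(["-s", "t"])
print(f"Selected stage: {STAGES[stage]}")
```

Restricting `-s` with `choices` makes an unknown stage code fail immediately with a usage message instead of silently doing nothing.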

Model 1 (Simple DNN) Architecture

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 32)                8224                                                              
 dense_1 (Dense)             (None, 64)                2112                                                             
 dense_2 (Dense)             (None, 128)               8320                                                                  
 dense_3 (Dense)             (None, 256)               33024                                                                 
 dense_4 (Dense)             (None, 512)               131584                                                                 
 dense_5 (Dense)             (None, 256)               131328                                                                
 dense_6 (Dense)             (None, 128)               32896                                                                 
 dropout (Dropout)           (None, 128)               0                                                                    
 dense_7 (Dense)             (None, 64)                8256                                                             
 dense_8 (Dense)             (None, 2)                 130                                                                    
=================================================================
Total params: 355,874
Trainable params: 355,874
Non-trainable params: 0
_________________________________________________________________
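
The parameter counts in the summary can be reproduced by hand: a fully connected layer with n inputs and m units has n * m weights plus m biases. The sketch below redoes that arithmetic in plain Python. The input width of 256 is not stated in the summary; it is inferred from the first layer (256 * 32 + 32 = 8224) and should be treated as an assumption.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


# Assumed input width, inferred from the first layer's 8224 parameters.
input_dim = 256
# Unit counts of dense ... dense_8 as listed in the summary.
# The Dropout layer has no parameters and does not change the width.
layer_units = [32, 64, 128, 256, 512, 256, 128, 64, 2]

counts = []
n_inputs = input_dim
for n_units in layer_units:
    counts.append(dense_params(n_inputs, n_units))
    n_inputs = n_units  # the output of one layer feeds the next

total = sum(counts)
print(counts)
print(total)
```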

Model 1 Train and validation loss graph

Model 2 (CNN) Architecture

Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape             Param #
=================================================================
 conv2d (Conv2D)                 (None, 101, 1290, 32)    320
 max_pooling2d (MaxPooling2D)    (None, 50, 645, 32)      0
 conv2d_1 (Conv2D)               (None, 48, 643, 64)      18496
 max_pooling2d_1 (MaxPooling2D)  (None, 24, 321, 64)      0
 conv2d_2 (Conv2D)               (None, 22, 319, 64)      36928
 flatten (Flatten)               (None, 449152)           0
 dense (Dense)                   (None, 64)               28745792
 dense_1 (Dense)                 (None, 2)                130
=================================================================
Total params: 28,801,666
Trainable params: 28,801,666
Non-trainable params: 0
_________________________________________________________________
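
The shapes and parameter counts in the CNN summary can also be traced by hand. The sketch below does so under assumptions that the summary does not state but that match every number in it: a single-channel 103 x 1292 input, 3 x 3 kernels with no padding and stride 1, and 2 x 2 max pooling. All helper names are illustrative.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters


def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units


def conv_out(size, kernel):
    """Output length of an unpadded, stride-1 convolution."""
    return size - kernel + 1


def pool_out(size, pool):
    """Output length of non-overlapping pooling (floor division)."""
    return size // pool


# Assumed input: height 103, width 1292, 1 channel (inferred, not stated).
h, w, c = 103, 1292, 1
total = 0
# (filters, followed by pooling?) for conv2d, conv2d_1, conv2d_2.
for filters, pool_after in [(32, True), (64, True), (64, False)]:
    total += conv2d_params(3, 3, c, filters)
    h, w, c = conv_out(h, 3), conv_out(w, 3), filters
    if pool_after:
        h, w = pool_out(h, 2), pool_out(w, 2)

flat = h * w * c                       # size of the Flatten output
first_dense = dense_params(flat, 64)   # the layer right after Flatten
total += first_dense + dense_params(64, 2)

print(flat, first_dense, total)
print(f"share of parameters in the first dense layer: {first_dense / total:.1%}")
```

Almost all of the parameters sit in the first dense layer after Flatten, so the flattened feature map, not the convolutions, determines the model size.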

Testing

python -m pytest --verbose

Results

Model	Accuracy	Precision	Recall	F1-score
DNN Model	0.9812	0.9980	0.9647	0.9810
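
The F1-score is the harmonic mean of precision and recall, so the last column can be cross-checked against the two before it. The sketch below recomputes it from the rounded table values. Because the inputs are rounded to four decimals, the result may differ from the reported figure in the last digit.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Rounded values from the DNN Model row of the table.
precision, recall = 0.9980, 0.9647
f1 = f1_score(precision, recall)
print(f"F1 recomputed from the table: {f1:.4f}")
```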
