MusicRJ

A Machine Learning-Audio Signal Processing Project (Ongoing)

Project Details

This is a machine learning and audio signal processing project that classifies a real-time audio signal as speech or music using a deep neural network (DNN) and a convolutional neural network (CNN). The long-term goal is to create an AI personal assistant that listens to audio streams and summarizes their content for the end user.

Block diagram

Dataset

The project uses the GTZAN music/speech collection dataset.

All the wav audio files should be extracted to the Data/Files folder.
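
As a quick check that the extraction step worked, the sketch below lists the WAV files in Data/Files and reports basic properties of each one. It is not part of the project: the helper names list_wav_files and describe_wav are hypothetical, and it assumes the files are plain PCM WAV files that Python's standard wave module can open.

```python
import wave
from pathlib import Path

DATA_DIR = Path("Data/Files")  # folder named in this README


def list_wav_files(data_dir):
    """Return all .wav files directly inside data_dir, sorted by name."""
    return sorted(p for p in Path(data_dir).iterdir() if p.suffix.lower() == ".wav")


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        return rate, wav.getnchannels(), wav.getnframes() / rate


if __name__ == "__main__":
    # Only runs when the dataset has actually been extracted.
    if DATA_DIR.is_dir():
        for wav_path in list_wav_files(DATA_DIR):
            print(wav_path.name, describe_wav(wav_path))
```

An empty listing usually means the archive was extracted into a nested subfolder rather than directly into Data/Files.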

Python Version

Python 3.9.12

Setting up virtual environment

Installing Virtual Environment

python -m pip install --user virtualenv

Creating New Virtual Environment

python -m venv envname

Activating Virtual Environment

source envname/bin/activate

Upgrading pip

python -m pip install --upgrade pip

Installing Packages

python -m pip install -r requirements.txt
python -m pip install PyAudio

How to run

# Data preprocessing
python main.py -s p

# Model training
python main.py -s t

# Real-time demonstration
python main.py -s r
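
The three commands above differ only in the value passed to -s. The project's main.py is not reproduced in this README, so the following is only a sketch of how such a stage switch could be parsed with argparse. The long option name --stage, the STAGES mapping, and the function names are assumptions made for illustration.

```python
import argparse

# Hypothetical mapping from the one-letter codes used above to stage names.
STAGES = {"p": "preprocess", "t": "train", "r": "realtime"}


def build_parser():
    """Build a parser that accepts -s with one of the codes p, t, or r."""
    parser = argparse.ArgumentParser(description="MusicRJ pipeline stage selector (sketch)")
    parser.add_argument(
        "-s", "--stage",  # "--stage" is an assumed long name
        choices=sorted(STAGES),
        required=True,
        help="p = data preprocessing, t = model training, r = real-time demonstration",
    )
    return parser


def parse_stage(argv):
    """Return the stage name selected by the given argument list."""
    args = build_parser().parse_args(argv)
    return STAGES[args.stage]


if __name__ == "__main__":
    import sys
    print("Selected stage:", parse_stage(sys.argv[1:]))
```

Restricting -s with choices means an unsupported code fails immediately with a usage message instead of silently doing nothing.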

Model 1 (Simple DNN) Architecture

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 dense (Dense)               (None, 32)                8224                                                              
 dense_1 (Dense)             (None, 64)                2112                                                             
 dense_2 (Dense)             (None, 128)               8320                                                                  
 dense_3 (Dense)             (None, 256)               33024                                                                 
 dense_4 (Dense)             (None, 512)               131584                                                                 
 dense_5 (Dense)             (None, 256)               131328                                                                
 dense_6 (Dense)             (None, 128)               32896                                                                 
 dropout (Dropout)           (None, 128)               0                                                                    
 dense_7 (Dense)             (None, 64)                8256                                                             
 dense_8 (Dense)             (None, 2)                 130                                                                    
=================================================================
Total params: 355,874
Trainable params: 355,874
Non-trainable params: 0
_________________________________________________________________
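
The parameter counts in this summary can be reproduced by hand: a dense layer with n_in inputs and n_out units has n_in * n_out weights plus n_out biases, and dropout adds none. The sketch below redoes that arithmetic. The input width of 256 is not stated in the README; it is inferred from the first layer's count (8224 = 256 * 32 + 32).

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


INPUT_DIM = 256  # inferred from the first layer: 8224 = 256 * 32 + 32
UNITS = [32, 64, 128, 256, 512, 256, 128, 64, 2]  # dense .. dense_8 from the summary


def dnn_param_counts(input_dim, units):
    """Return the per-layer parameter counts for a stack of dense layers."""
    counts = []
    n_in = input_dim
    for n_out in units:
        counts.append(dense_params(n_in, n_out))
        n_in = n_out  # dropout keeps the width unchanged, so it is skipped here
    return counts


if __name__ == "__main__":
    counts = dnn_param_counts(INPUT_DIM, UNITS)
    for units, count in zip(UNITS, counts):
        print(f"Dense({units}): {count}")
    print("Total:", sum(counts))
```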

dnn architecture

Model 1 Train and validation loss graph

Loss graph

Model 2 (CNN) Architecture

Model: "sequential"
_________________________________________________________________
 Layer (type)                    Output Shape            Param #
=================================================================
 conv2d (Conv2D)                 (None, 101, 1290, 32)   320
 max_pooling2d (MaxPooling2D)    (None, 50, 645, 32)     0
 conv2d_1 (Conv2D)               (None, 48, 643, 64)     18496
 max_pooling2d_1 (MaxPooling2D)  (None, 24, 321, 64)     0
 conv2d_2 (Conv2D)               (None, 22, 319, 64)     36928
 flatten (Flatten)               (None, 449152)          0
 dense (Dense)                   (None, 64)              28745792
 dense_1 (Dense)                 (None, 2)               130
=================================================================
Total params: 28,801,666
Trainable params: 28,801,666
Non-trainable params: 0
_________________________________________________________________
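
The shapes and parameter counts in this summary are consistent with 3x3 convolutions using "valid" padding and stride 1, 2x2 max pooling, and a single-channel input of shape (103, 1292, 1). None of these settings are stated in the README; they are inferred from the numbers, and the sketch below simply replays the arithmetic under those assumptions.

```python
def conv2d_valid(shape, filters, kernel=3):
    """Output shape and parameter count of a stride-1, valid-padding Conv2D."""
    h, w, c = shape
    out_shape = (h - kernel + 1, w - kernel + 1, filters)
    params = kernel * kernel * c * filters + filters  # weights + biases
    return out_shape, params


def max_pool(shape, pool=2):
    """Output shape of non-overlapping max pooling (no parameters)."""
    h, w, c = shape
    return (h // pool, w // pool, c)


def cnn_summary(input_shape=(103, 1292, 1)):  # input shape is inferred, not documented
    """Return (final conv shape, flattened size, total parameter count)."""
    total = 0
    shape, p = conv2d_valid(input_shape, 32)
    total += p
    shape = max_pool(shape)
    shape, p = conv2d_valid(shape, 64)
    total += p
    shape = max_pool(shape)
    shape, p = conv2d_valid(shape, 64)
    total += p
    flat = shape[0] * shape[1] * shape[2]
    total += flat * 64 + 64  # dense (64 units)
    total += 64 * 2 + 2      # dense_1 (2 units)
    return shape, flat, total


if __name__ == "__main__":
    print(cnn_summary())
```

Almost all of the parameters (28,745,792 of 28,801,666) sit in the first dense layer, because it is connected to all 449,152 flattened features.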

Testing

python -m pytest --verbose

Results

Model      Accuracy  Precision  Recall  F1-score
DNN Model  0.9812    0.9980     0.9647  0.9810
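
The F1-score is the harmonic mean of precision and recall, so it can be cross-checked from the other two columns. The sketch below does this with the rounded values from the table; because its inputs are already rounded to four decimals, the result matches the reported 0.9810 only to within about 0.0001. The function name f1_score is chosen here for illustration.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Rounded values taken from the results table above.
    print(f"F1 = {f1_score(0.9980, 0.9647):.4f}")
```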
