Deep learning based pitch detection

Abstract

The use of deep learning models for audio signal tasks has increased substantially over time. The success of these models, however, depends largely on the availability of labeled datasets. Generating training datasets from digital data has gained attention because it reduces the dependence on real-world data. However, making such digital data as realistic as the corresponding real-world data requires considerable human effort.

This work proposes an alternative prototype for pitch detection that is based on digital data. Because pitch detection depends on the relevant information in the signal, a synthesized audio signal is generated from MIDI data. Instead of extracting pitch values from the audio signal with conventional methods, a pitch tracker built on a deep neural network estimates them, even in the presence of noise. The proposed approach is an efficient way to meet this requirement. The evaluation on real-world data shows promising results. Finally, an analysis on real-world data identifies the criteria a dataset needs to meet for successful detection.

The suitability of the approach is demonstrated by training state-of-the-art neural network architectures for pitch detection. Monophonic recordings from the LMD (Lakh MIDI Dataset) are considered. The trained models are evaluated on the Bach datasets to check their performance on real-world data. The experiments show that synthetic data helps the model train and detect pitches when real-world data is passed to it. Instead of creating real-world datasets for statistical pitch detection methods, which can be complex, synthetic data can be used, provided the properties of the sound are taken into account.

A blog post about this project can be found here

The folder consists of 5 directories

dataset_creation:

	Contains scripts used to:
	Convert MIDI data to time domain signals
	Create the LSTM pickle file
	Create LSTM non-overlapping time steps
	Create LSTM overlapping time steps
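
The MIDI-to-signal conversion can be illustrated with a minimal sketch. This is not the repository's script: it assumes each note is rendered as a plain sine tone, and the sample rate, the function names midi_to_hz and synthesize, and the example melody are all hypothetical. Only the mapping from MIDI note number to frequency (A4 = note 69 = 440 Hz, equal temperament) is standard.

```python
import math

SAMPLE_RATE = 16000  # assumed sample rate in Hz; the README does not state one


def midi_to_hz(note):
    """Equal-temperament frequency of a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def synthesize(notes, sample_rate=SAMPLE_RATE):
    """Render (midi_note, duration_seconds) pairs as back-to-back sine tones.

    Each note restarts at phase zero, so there can be a small discontinuity
    at note boundaries; a real synthesizer would smooth this with an envelope.
    """
    signal = []
    for note, duration in notes:
        freq = midi_to_hz(note)
        n_samples = int(round(duration * sample_rate))
        signal.extend(
            math.sin(2.0 * math.pi * freq * n / sample_rate)
            for n in range(n_samples)
        )
    return signal


# Hypothetical monophonic melody: C4, E4, G4
melody = [(60, 0.25), (64, 0.25), (67, 0.5)]
samples = synthesize(melody)
```

Because the note numbers are known before synthesis, the pitch label for every sample comes for free, which is what makes MIDI-derived data attractive as a labeled training set.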

evaluation_matlabFiles:

	Contains scripts used to:
	Reconstruct the audio files
	Visualize the files created for real audio
	Visualize the files created for synthetic audio

test_files:

	Contains 2 directories:
	cnn, contains
		feed spectrogram as .png file
		feed spectrogram as .mat file
	lstm, contains
		file to create pickle file
		file to create timesteps
		test using trained LSTM network

trained_nets:

	overlapping (LSTM)
	non-overlapping (LSTM)
	cnn network (CNN)

training_files:

	CNN architecture
	LSTM architecture

Procedure:

CNN
	Create the dataset inputs and labels as .png files.
	Provide the path of the datasets to the CNN architecture file.
	Train the network.
	Test the network: create the image in Matlab (applying the CQT), save it, and pass the saved image through the CNN network.
LSTM
	Create the dataset inputs and labels using Matlab. Inputs are the spectrograms as .png files and labels are .mat files.
	Create a pickle of size 96xN by concatenating all spectrograms and .mat files (Python).
	Create the 96x216 timesteps using Python.
	Provide the path of the timestep inputs and labels to the LSTM architecture file.
	Train the network.
	Test the network by passing spectrograms:
		Create the 96xN matrix of spectrograms using pickle.
		Convert it to 96x216 timesteps.
		Pass the file to the LSTM test script.
		Save the network outputs.
		Visualize the network outputs using Matlab.
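
The pickle and timestep steps of the LSTM procedure can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: the function names concatenate and make_timesteps, the dummy spectrograms, the hop of 108 frames for the overlapping case, and the choice to drop trailing frames that do not fill a whole window are all hypothetical. Only the 96xN and 96x216 shapes come from the procedure above.

```python
import pickle

N_BINS = 96     # rows (frequency bins) per spectrogram, from the procedure
STEP_LEN = 216  # frames per timestep window, from the procedure


def concatenate(spectrograms):
    """Join spectrograms (each a list of N_BINS rows) along the time axis."""
    joined = [[] for _ in range(N_BINS)]
    for spec in spectrograms:
        if len(spec) != N_BINS:
            raise ValueError("expected %d rows, got %d" % (N_BINS, len(spec)))
        for row, new_values in zip(joined, spec):
            row.extend(new_values)
    return joined


def make_timesteps(matrix, hop):
    """Cut a 96xN matrix into 96x216 windows whose starts are hop frames apart.

    hop == STEP_LEN gives non-overlapping windows; a smaller hop gives
    overlapping ones. Trailing frames that do not fill a window are dropped
    (an assumption of this sketch).
    """
    n_frames = len(matrix[0])
    return [
        [row[start:start + STEP_LEN] for row in matrix]
        for start in range(0, n_frames - STEP_LEN + 1, hop)
    ]


# Two dummy spectrograms with 300 and 348 frames stand in for real data.
specs = [[[float(b)] * n for b in range(N_BINS)] for n in (300, 348)]
joined = concatenate(specs)                  # 96 x 648

blob = pickle.dumps(joined)                  # serialize the 96xN matrix
restored = pickle.loads(blob)                # load it back for windowing

non_overlapping = make_timesteps(restored, hop=STEP_LEN)
overlapping = make_timesteps(restored, hop=STEP_LEN // 2)  # assumed 50% overlap
```

The labels would be concatenated and windowed with the same start positions so that each input window stays aligned with its label window.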
