deeplearningsourceseparation's Introduction

Deep Learning For Monaural Source Separation

Demo

Webpage: https://sites.google.com/site/deeplearningsourceseparation/

Experiments

MIR-1K experiment (singing voice separation)

  1. Training code: codes/mir1k/train_mir1k_demo.m

  2. Demo

  • Download a trained model http://www.ifp.illinois.edu/~huang146/DNN_separation/model_400.mat
  • Put the model at codes/mir1k/demo and go to the folder
  • Run: codes/mir1k/demo/run_test_single_model.m

TIMIT experiment (speech separation)

  1. Training code: codes/timit/train_timit_demo.m and codes/timit/train_timit_demo_mini_clip.m

  2. Demo

  • Download a trained model http://www.ifp.illinois.edu/~huang146/DNN_separation/timit_model_70.mat
  • Put the model at codes/timit/demo and go to the folder
  • Run: codes/timit/demo/run_test_single_model.m

TSP experiment (speech separation)

  1. Training code: codes/TSP/train_TSP_demo_mini_clip.m

  2. Demo

  • Download a trained model http://www.ifp.illinois.edu/~huang146/DNN_separation/TSP_model_RNN1_win1_h300_l2_r0_64ms_1000000_softabs_linearout_RELU_logmel_trn0_c1e-10_c0.001_bsz100000_miter10_bf50_c0_d0_7650.mat
  • Put the model at codes/TSP/demo and go to the folder
  • Run the demo code at codes/TSP/demo/run_test_single_model.m

Denoising experiment

  1. Put the speakers FCJF0, FDAW0, FDML0, FECD0, FETB0, FJSP0, FKFB0, FMEM0, FSAH0, FSJK1, FSMA0, FTBR0, FVFB0, and FVMH0 from the original TIMIT data under codes/denoising/Data/timit/

  2. Training code: codes/denoising/train_denoising_demo.m

  3. Demo

  • Download a trained model http://www.ifp.illinois.edu/~huang146/DNN_separation/denoising_model_870.mat
  • Put the model at codes/denoising/demo and go to the folder
  • Run the demo code at codes/denoising/demo/run_test_single_model.m

Dependencies

  1. This package is adapted from rnn-speech-denoising.

  2. The software depends on Mark Schmidt's minFunc package for convex optimization.

  3. Additionally, we have included Mark Hasegawa-Johnson's HTK write and read functions that are used to handle the MFCC files.

  4. We use HTK for computing features (MFCC, logmel) (HCopy).

  5. We use signal processing functions from labrosa.

  6. We use the BSS Eval toolbox, versions 2.0 and 3.0, for evaluation.

  7. We use MIR-1K for the singing voice separation task.

  8. We use TSP for the speech separation task.

Work on your data:

  1. To try the code on your own data, follow the mir1k or TSP settings: put your data into codes/mir1k/Wavfile or codes/TSP/Data/ accordingly.

  2. Look at the unit test parameters in codes/mir1k/train_mir1k_demo.m and codes/TSP/train_TSP_demo_mini_clip.m (with minibatch L-BFGS and gradient clipping).

  3. Tune the parameters on the dev set and check the results.

Reference

  1. P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 23, no. 12, pp. 2136–2147, Dec. 2015.

  2. P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Singing-Voice Separation From Monaural Recordings Using Deep Recurrent Neural Networks," in International Society for Music Information Retrieval Conference (ISMIR), 2014.

  3. P.-S. Huang, M. Kim, M. Hasegawa-Johnson, P. Smaragdis, "Deep Learning for Monaural Speech Separation," in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2014.

Notes

The code was tested using MATLAB R2015a.

Related Implementations

source_separaton_ml_jeju

deeplearningsourceseparation's People

Contributors

posenhuang

deeplearningsourceseparation's Issues

Timit and TSP separation code - not working

Hi, the TIMIT speech separation model seems to run, since a bunch of output appears in the command window, but it does NOT appear to save the separated files.

The TSP model just does NOT run and errors out :(.

Would be very helpful if you could address these.

Thx!

Could it be used in real time?

Dear,
I want to use the method to solve howling in RTC (Real-Time Communication),
so could I separate the noise and voice frame by frame?

Any reference would be appreciated.

thx

drnn folder missing

Warning: Name is nonexistent or not a directory: ..\..\..\codes\denoising\drnn 
> In path (line 109)
  In addpath (line 88)
  In run_test_single_model (line 8) 

some errors

Excuse me. When I run the demos, there are some errors. It reports that there is no stft3 function, and there is an error using the horzcat function. The project is a little big, so it is challenging to find the bugs. I hope that you can give me some advice. Thank you very much.

how to control the gain

I notice that the gain is 1 by default; I guess this means the SNR is 0 dB. If I want to mix the waves at 5 dB, how should I set the parameter 'gain'?
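
A general note on the arithmetic, not taken from this repository's code: if a mixture is formed as signal + gain * interference, its SNR in dB is 10*log10(P_s / (gain^2 * P_n)), where P_s and P_n are the average powers of the two sources. A gain of 1 therefore gives 0 dB only when both sources have equal power. The minimal Python sketch below solves that relation for the gain. The helper names (`mean_power`, `gain_for_snr`, `snr_db`) and the convention that the gain multiplies the interfering source are assumptions made for illustration; how the 'gain' parameter is actually applied should be checked in the training script.

```python
import math

def mean_power(x):
    """Average power (mean of squared samples) of a sequence of floats."""
    return sum(v * v for v in x) / len(x)

def gain_for_snr(signal, interference, target_snr_db):
    """Gain for `interference` so that signal + gain * interference has the
    requested SNR in dB. Hypothetical helper, not part of the repository."""
    p_s = mean_power(signal)
    p_n = mean_power(interference)
    if p_s == 0 or p_n == 0:
        raise ValueError("both sources must have non-zero power")
    # SNR_dB = 10*log10(p_s / (g^2 * p_n))
    # =>  g = sqrt(p_s / (p_n * 10^(SNR_dB / 10)))
    return math.sqrt(p_s / (p_n * 10 ** (target_snr_db / 10)))

def snr_db(signal, interference, gain):
    """SNR in dB of signal + gain * interference (assumed mixing convention)."""
    return 10 * math.log10(
        mean_power(signal) / (gain ** 2 * mean_power(interference))
    )

if __name__ == "__main__":
    # Two toy sources with equal power, so gain 1 corresponds to 0 dB.
    s = [1.0, -1.0, 1.0, -1.0]
    n = [1.0, 1.0, -1.0, -1.0]
    g = gain_for_snr(s, n, 5.0)   # about 0.562 for equal-power sources
    print(g, snr_db(s, n, g))
```

Under these assumptions, raising the SNR by 5 dB for equal-power sources means scaling the interference by 10^(-5/20), roughly 0.562. If the script applies the gain to the target source instead, the reciprocal would be needed.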

Timit

Hi, I am a beginner in MATLAB.
I'm interested in the singing voice separation project,
so I cloned your project and tried to execute "train_timit_demo" in MATLAB.
However, I got:
Error in train_timit_demo (line 58)
eI.MFCCorlogMelorSpectrum=MFCCorlogMelorSpectrum;

I'm wondering how to set the MFCCorlogMelorSpectrum value when executing this code in terminal.
Thank you so much!

I found your answer about the same question for "train_mir1k_demo". Can I use the same values?

Denoising Demo not working

Hi, it seems like the code in the denoising folder is not up to date. First I received an error that the model file was not found so I renamed that file. Then I received an error that the formulate_data_test function is missing. If I add it (from the timit or TSP folder) I get yet another error:

Error using -
Matrix dimensions must agree.

Error in test_denoising_general_kl_bss3 (line 56)
output.source_noise= spectrum.mix-output.source_signal;

The previous error looked like this:

run_test_single_model
Warning: Name is nonexistent or not a directory: ..\..\..\codes\denoising\Data
In path (line 109)
In addpath (line 88)
In run_test_single_model (line 7)
Warning: Name is nonexistent or not a directory: ..\..\..\codes\denoising\drnn
In path (line 109)
In addpath (line 88)
In run_test_single_model (line 8)
Undefined function or variable 'formulate_data_test'.

Error in test_denoising_general_kl_bss3 (line 14)
formulate_data_test(mixture, eI, testmode);

Error in run_test_single_model (line 38)
output = test_denoising_general_kl_bss3(x', theta, eI, 'testall', 0);

I will try to fix it myself but if you have time I would appreciate your help. Thank you.

Script for Inferencing

Hi @posenhuang ,
Thanks for creating this really helpful repository. Could you please share an inference script so that I can test the results of your trained models on a custom dataset?

Thanks.

minFunc returns 'Step direction is illegal!'

Hi, posenhuang,
I used 3 hours of Mandarin speech to train the mir1k model. The SNRs are [-5, 0, 5, 10, 15, 20] dB, and the clean data is the infinite-SNR condition. The left channel of the clean data is filled with all zeros.
In minFunc.m, lines 963-967, a legal-number check is implemented, and I got the error 'Step direction is illegal!'. I think some of the data must be illegal, but I don't know where the problem is. Would you please help me solve this problem?
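
One possible cause, offered as an assumption rather than a diagnosis: an all-zero channel has zero energy, and any normalization or logarithm applied to it can yield NaN or Inf, which then propagates into the gradient and the step direction. A quick way to rule this out is to audit each channel before training. The Python sketch below illustrates such a check; `audit_channel` and its `eps` threshold are hypothetical and not part of the repository or of minFunc.

```python
import math

def audit_channel(samples, eps=1e-12):
    """Return a list of problems found in one channel of audio samples.

    Hypothetical helper for illustration only. It flags empty input,
    NaN/Inf samples, and channels whose total energy is (near) zero.
    """
    if not samples:
        return ["empty"]
    if any(not math.isfinite(v) for v in samples):
        # Energy is meaningless once NaN/Inf is present, so stop here.
        return ["non-finite samples"]
    problems = []
    energy = sum(v * v for v in samples)
    if energy <= eps:
        problems.append("zero energy")
    return problems

if __name__ == "__main__":
    channels = {
        "left": [0.0, 0.0, 0.0, 0.0],     # silent channel, as in the report
        "right": [0.1, -0.2, 0.05, 0.0],  # ordinary audio
    }
    for name, samples in channels.items():
        print(name, audit_channel(samples))
```

If a channel is flagged, replacing the silent source with very low-level noise or skipping normalization for it are two options worth trying; whether either matches the training code's expectations is untested here.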

How to train the model?

Dear,
How do I train the model for separating singing and voice?
Any advice or suggestion would be welcome.
Thx

TSP

Running this .m file requires some parameters. Could you give the ones used for the example model:
TSP_model_RNN1_win1_h300_l2_r0_64ms_1000000_softabs_linearout_RELU_logmel_trn0_c1e-10_c0.001_bsz100000_miter10_bf50_c0_d0_7650.mat

train_TSP_demo_mini_clip(context_win, hidden_units, num_layers, isdropout, isRNN, iscleanonly,...
circular_step , isinputL1, MFCCorlogMelorSpectrum, framerate, pos_neg_r, outputnonlinear, opt, act, ...
train_mode, const, const2, isGPU, batchsize, MaxIter, bfgs_iter, clip, lambda,...
data_mode)

Faced some errors after running run_test_single_model.m

Hi, I recently started using Matlab so I apologise in advance if this is a stupid question.

I am trying to work on the TIMIT dataset. When running the file run_test_single_model.m after downloading the pretrained model, I get the following errors:

Error using horzcat
Dimensions of arrays being concatenated are not consistent.

Error in stft2 (line 82)
x = [zeros( 1, sz+pd-hp, cl) s zeros( 1, sz+pd, cl)]';

Error in compute_features_stft2 (line 15)
spectrum_mix = scf * stft2( dmix, nFFT, hop, 0, wn);

Error in formulate_data_test (line 54)
[DATA, mixture_spectrum, eI]=compute_features_stft2(dmix, eI);

Error in test_timit_general_kl_recurrent (line 38)
[test_data_cell, target_ag, mixture_spectrum]=formulate_data_test(mixture, eI, testmode);

Error in run_test_single_model (line 41)
test_timit_general_kl_recurrent(eI.modelname, theta, eI, 'done', j);

Would be great if you could help me out, thank you!

denoising training?

Why does the denoising experiment have no training code available? I am very interested in this, and it would be awesome if you could provide additional code for training the denoising model. Thanks!

Is there a version in another language?

Dear,
Is there a version in any other language,
such as Python or C/C++?
Any advice or suggestion would be welcome!
Thx

The codes can't run

Your code has some bugs: the minFunc_2012 tools can’t support your code, and the TSP and denoising experiments can't run.

Training ERROR with "Not enough input arguments."

Hi, I am a beginner in MATLAB.
I'm interested in the singing voice separation project,
so I cloned your project and tried to execute "train_mir1k_demo" in MATLAB.
However, I got:
Error in train_mir1k_demo (line 58)
eI.MFCCorlogMelorSpectrum=MFCCorlogMelorSpectrum;

I'm wondering how to set the MFCCorlogMelorSpectrum value when executing this code in terminal.
ThankU!!

After training, where is the model saved?

Hi, posenhuang. Thank you for your wonderful work. I have tried to train the denoising demo with the TIMIT corpus. The training process seems to have succeeded; it took 5h:43m:44.3s. The problem is that I can't find the trained model. Could you please tell me where I can find it?

License is missing

Hi, we want to reference your implementation in a paper and need to know the software license of your work.

What is the license of this software? Maybe you can add a LICENSE.md file in your root folder.

Training (ERROR)

Hi, excuse me, I am a beginner in programming. I am working on an audio source separation project and am really interested in this code. If you can help me, I have two problems:
1. Error in the training code codes/mir1k/train_mir1k_demo.m:
Undefined function or variable "data_ag".

Error in formulate_data (line 223)
theoutputs = {data_ag, target_ag, mixture_ag};

Error in train_mir1k_demo (line 220)
[data_cell, targets_cell, mixture_spectrum]=formulate_data(train_files, eI, eI.train_mode); %0 -- chunk, 2--no chunk, 3- icassp

Error in tr (line 62)
train_mir1k_demo(context_win, hidden_units, num_layers, isdropout, ...
2. I have a problem installing HTK: I can't install it. Does the error above come from that? If so, how can I install it, given that I have downloaded the file?
