neural_srp's Introduction

Neural-SRP

This repository contains the code for the Neural-SRP paper, published in the Open Journal of Signal Processing (OJSP). Neural-SRP is a neural network-based multi-source tracking algorithm that incorporates architectural features from SRP-PHAT, an established model-based algorithm for sound source localization.

The repository contains the code used to train Neural-SRP under different scenarios, as well as the code used to evaluate the trained models on the LOCATA and TAU-NIGENS datasets.

Configuration

Parameters are controlled in the params.json file. Note that some parameters must be changed depending on the script being run, as detailed below. The most important parameters are:

model: the model to be trained. It can be neural_srp, doanet, cross3d or srp. doanet only works with the TAU-NIGENS dataset (train_multisource.py).
win_size: the window size in samples. Set to 640 when using the TAU-NIGENS dataset, and to 4096 when using the LOCATA dataset.
hop_size: the hop size, expressed as a fraction of the window size. Set to 0.5 when using the TAU-NIGENS dataset, and to 0.75 when using the LOCATA dataset.
model_checkpoint_path: the path to the pretrained model. It must be compatible with the model parameter.
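The dataset-dependent settings above can be sketched in Python as follows. This is a minimal sketch, not code from the repository: it assumes params.json is a flat JSON object with the keys listed above, the dataset labels "tau_nigens" and "locata" and the helper names configure and hop_in_samples are hypothetical, and reading hop_size as a fraction of win_size is an interpretation of the description above.

```python
import json

# Model names listed in the README.
ALLOWED_MODELS = {"neural_srp", "doanet", "cross3d", "srp"}

# Framing values quoted in the README; the dataset labels are hypothetical.
FRAMING = {
    "tau_nigens": {"win_size": 640, "hop_size": 0.5},
    "locata": {"win_size": 4096, "hop_size": 0.75},
}


def configure(params, dataset):
    """Return a copy of params with the framing values for the given dataset."""
    if params["model"] not in ALLOWED_MODELS:
        raise ValueError(f"unknown model: {params['model']}")
    if params["model"] == "doanet" and dataset != "tau_nigens":
        raise ValueError("doanet only works with the TAU-NIGENS dataset")
    return {**params, **FRAMING[dataset]}


def hop_in_samples(params):
    """Hop length in samples, ASSUMING hop_size is a fraction of win_size."""
    return int(params["win_size"] * params["hop_size"])


# Stand-in for the contents of params.json (assumed flat structure).
params = json.loads('{"model": "neural_srp", "win_size": 640, "hop_size": 0.5}')
locata_params = configure(params, "locata")
print(json.dumps(locata_params, indent=2))
print(hop_in_samples(locata_params))
```

Under these assumptions, the LOCATA settings would correspond to a hop of 3072 samples and the TAU-NIGENS settings to a hop of 320 samples.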

Main scripts

After setting the correct parameters in params.json, you can run any of the following scripts using python script_name.py:

visualize_locata.py: visualizes single-source tracking trajectories on the LOCATA dataset. Generates the results of Tables 3 and 4 in the paper.
visualize_tau.py: visualizes multi-source tracking trajectories on the TAU-NIGENS dataset. Generates the results of Table 5 in the paper.
train_singlesource.py: trains the single-source models using simulated data.
train_multisource.py: trains the multi-source models using the TAU-NIGENS dataset.
analyze_complexity.py: analyzes the complexity of the different models. Generates the results of Table 6 in the paper.

Pretrained models

You can find the pretrained models in the checkpoints folder.

Datasets

LOCATA: the LOCATA challenge dataset.
TAU-NIGENS: the TAU-NIGENS dataset. After downloading, run the preprocessing script with python -m datasets.preprocess_tau_nigens_dataset to generate the dataset in the correct format. Then set the variable path_tau_nigens_preprocessed in params.json to the path of the preprocessed output.

neural_srp's Issues

Problems encountered in reproducing the model

Hi, I've tried to reproduce Neural-SRP and encountered some problems:

When I preprocess the TAU-NIGENS dataset, the following error occurs. Changing the value of self._nb_unique_classes from 2 to 3 makes it go away. Does this mean that the samples in the dataset have at most 2 active sound sources?

tnb_classes[frame_ind, active_event] = 1
IndexError: index 2 is out of bounds for axis 1 with size 2

But when I tried to run the TAU visualization using neural-srp-multi.bin, it reported:

target_doas = target_doas.view( target_doas.shape[0], target_doas.shape[1], 3, max_nb_doas ).transpose(-1, -2)
RuntimeError: shape '[1, 50, 3, 2]' is invalid for input of size 500

Is it because I changed the value of self._nb_unique_classes?

When I tried loading doanet.bin for the TAU visualization, an error occurred:

How can I solve it?

Thank you in advance, looking forward to your reply!
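The arithmetic behind the shape error quoted above can be checked without the repository. The sketch below only uses the numbers in the error message (batch 1, 50 frames, 3 coordinates, max_nb_doas of 2, input size 500) and NumPy's reshape as a stand-in for the PyTorch view call; it does not reproduce the dataset or the model, and the variable names are illustrative.

```python
import numpy as np

# Numbers taken from the error message in the post above.
batch, frames, max_nb_doas = 1, 50, 2
input_size = 500

# Number of elements the requested shape [1, 50, 3, 2] would need.
requested_size = batch * frames * 3 * max_nb_doas

# Number of values the tensor actually holds per frame.
values_per_frame = input_size // (batch * frames)

target = np.zeros(input_size)
try:
    target.reshape(batch, frames, 3, max_nb_doas)
    reshape_ok = True
except ValueError:
    # NumPy, like torch.Tensor.view, refuses a shape whose element
    # count differs from the size of the input.
    reshape_ok = False

print(requested_size, values_per_frame, reshape_ok)
```

The requested shape needs 300 elements while the tensor holds 500, that is, 10 values per frame instead of 3 * max_nb_doas = 6. With max_nb_doas set to 3 the expected count would be 9, which does not match either, so the preprocessed labels may follow a different layout than the one the visualization script expects.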

Questions about the feature shape transform during normalization

Hello, after reading your code, I have some questions:

In GccExtractor, the feature matrix shape is

gcc_feat = np.zeros((nb_frames, self._nb_bins, n_output_channels))
gcc_feat = gcc_feat.transpose((0, 2, 1))

and, in Preprocessor::extract_all_feature, the feature is transformed as follows:

        feat = self._gcc_extractor(audio_in)
        nb_frames = feat.shape[1]
        feat = feat.transpose((0, 2, 1)).reshape((nb_frames, -1))

which does not make sense to me (I mean, nb_frames seems to refer to different things in the two places).

I can't figure it out. Could you help me with this? Is there something wrong in this version of the code? Thanks!
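The shapes in the two quoted snippets can be traced with a small NumPy sketch. It assumes the extractor returns the transposed array exactly as quoted above and uses made-up dimensions (5 frames, 4 bins, 3 channels); the actual repository code may contain additional steps not shown in the post.

```python
import numpy as np

# Illustrative dimensions, chosen to be distinct so each axis is identifiable.
nb_frames, nb_bins, n_output_channels = 5, 4, 3

# GccExtractor, as quoted: (frames, bins, channels) -> (frames, channels, bins).
gcc_feat = np.zeros((nb_frames, nb_bins, n_output_channels))
gcc_feat = gcc_feat.transpose((0, 2, 1))

# Preprocessor, as quoted: axis 1 is now the channel axis, not the frame axis.
feat = gcc_feat
frames_as_read = feat.shape[1]
flattened = feat.transpose((0, 2, 1)).reshape((frames_as_read, -1))

# Alternative that takes the frame count from axis 0 instead.
flattened_axis0 = feat.transpose((0, 2, 1)).reshape((feat.shape[0], -1))

print(gcc_feat.shape, frames_as_read, flattened.shape, flattened_axis0.shape)
```

With these dimensions, feat.shape[1] evaluates to 3 (the channel count), so the quoted reshape yields a (3, 20) array, whereas taking the frame count from axis 0 yields (5, 12), one row per frame.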

egrinstein / neural_srp