
bsseval's People

Contributors

aliutkus · faroit · hagenw · pyup-bot · stickler-ci


bsseval's Issues

Do not exclude segments where one/several estimates are all-zero

Currently, segments in which one or several estimates are all zero are excluded from the BSSEval computation:

https://github.com/sigsep/sigsep-mus-eval/blob/05d52e4962660417801b78aa82ac598dd8c7b25a/museval/metrics.py#L300

This leads to the effect that the SDR value, which is defined for the j-th instrument as

SDR_j = 10 \log_{10} \frac{\sum_{i,n} s_{ij}(n)^2}{\sum_{i,n} \left( s_{ij}(n) - \hat{s}_{ij}(n) \right)^2}

depends on the other estimates \hat{s}_{ik}(n) for k \neq j. Here is a quick example that shows the effect:

import musdb
import museval

import numpy as np


def estimate_and_evaluate1(track):
    """ Simple baseline system using mixture as estimate """
    estimates = {}
    estimates['vocals'] = 0.25 * track.audio
    estimates['accompaniment'] = 0.75 * track.audio

    scores = museval.eval_mus_track(track, estimates, output_dir='.')
    print('Score for `estimate_and_evaluate1`:')
    print(scores)

    return estimates


def estimate_and_evaluate2(track):
    """ Modified baseline system, which sets the second half of `vocals` to zero """
    estimates = {}
    estimates['vocals'] = 0.25 * track.audio
    estimates['accompaniment'] = 0.75 * track.audio

    estimates['vocals'] *= np.vstack((np.ones((track.audio.shape[0] // 2, 2)),
                                      np.zeros((track.audio.shape[0] - track.audio.shape[0] // 2, 2))))

    scores = museval.eval_mus_track(track, estimates, output_dir='.')
    print('Score for `estimate_and_evaluate2`:')
    print(scores)

    return estimates


def estimate_and_evaluate3(track):
    """ Modified baseline system, which sets the first half of `vocals` to zero """
    estimates = {}
    estimates['vocals'] = 0.25 * track.audio
    estimates['accompaniment'] = 0.75 * track.audio

    estimates['vocals'] *= np.vstack((np.zeros((track.audio.shape[0] // 2, 2)),
                                      np.ones((track.audio.shape[0] - track.audio.shape[0] // 2, 2))))

    scores = museval.eval_mus_track(track, estimates, output_dir='.')
    print('Score for `estimate_and_evaluate3`:')
    print(scores)

    return estimates


mus = musdb.DB(root_dir='/speech/db/mul/separ4/sisec/data2018/', is_wav=True)
mus.run(estimate_and_evaluate1, estimates_dir=".", tracks=[mus.load_mus_tracks(subsets='test')[0]])
mus.run(estimate_and_evaluate2, estimates_dir=".", tracks=[mus.load_mus_tracks(subsets='test')[0]])
mus.run(estimate_and_evaluate3, estimates_dir=".", tracks=[mus.load_mus_tracks(subsets='test')[0]])

estimate_and_evaluate* are three simple systems that all use the mixture as the estimate. Only the vocals estimate differs between the versions, yet the BSSEval values for accompaniment change as well:

$ python separ_and_evaluate.py
Score for `estimate_and_evaluate1`:
vocals              => SDR:-10.161dB, SIR:-16.848dB, ISR:2.421dB, SAR:28.828dB, 
accompaniment       => SDR:6.991dB, SIR:12.551dB, ISR:11.751dB, SAR:28.828dB, 
100%|██████████| 1/1 [01:25<00:00, 85.16s/it]

Score for `estimate_and_evaluate2`:
vocals              => SDR:-12.816dB, SIR:-15.727dB, ISR:0.177dB, SAR:-1.699dB, 
accompaniment       => SDR:7.181dB, SIR:14.078dB, ISR:11.783dB, SAR:27.795dB, 
100%|██████████| 1/1 [01:11<00:00, 71.51s/it]

Score for `estimate_and_evaluate3`:
vocals              => SDR:-7.410dB, SIR:-11.257dB, ISR:0.695dB, SAR:2.519dB, 
accompaniment       => SDR:6.783dB, SIR:10.938dB, ISR:11.722dB, SAR:29.830dB, 
100%|██████████| 1/1 [01:15<00:00, 75.29s/it]
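For reference, the formula above can also be computed directly. The following minimal numpy sketch (the sdr helper is hypothetical, not part of museval) makes explicit that SDR_j involves only the reference/estimate pair of the j-th instrument:

import numpy as np


def sdr(reference, estimate):
    """ SDR for a single instrument, computed directly from the formula above.

    `reference` and `estimate` are arrays of shape (samples, channels).
    Note that no other instrument's estimate enters the computation.
    """
    num = np.sum(reference ** 2)
    den = np.sum((reference - estimate) ** 2)
    return 10 * np.log10(num / den)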

@faroit @aliutkus What do you think? Should this be changed for a future version of BSSEval?

Use an efficient algorithm to find the best permutation

The current implementation of bss_eval, both in mir_eval and here, exhaustively tests all permutations of the sources, which has factorial cost. The optimal permutation can instead be found in polynomial time with a minimum-weight matching algorithm for bipartite graphs; scipy implements this as the function linear_sum_assignment.

I have made a PR to mir_eval with the required changes. In mir_eval the runtime difference only becomes noticeable for 9 or more sources, but it is very significant beyond that point. However, since computationally lighter versions of bss_eval/SI-SDR are also being implemented here, there is potentially room for improvement even with fewer sources.

I'd be happy to pitch in some code if required.
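For concreteness, here is a minimal sketch of the matching step using scipy (the cost matrix is illustrative; in bss_eval it would be, e.g., the negated pairwise SDR matrix that is already computed):

import numpy as np
from scipy.optimize import linear_sum_assignment


def best_permutation(cost):
    """ Return the estimate-to-reference assignment with minimum total cost.

    `cost[i, j]` is the cost of pairing estimate i with reference j,
    e.g. the negated SDR of that pairing. This replaces the exhaustive
    loop over all n! permutations.
    """
    row_ind, col_ind = linear_sum_assignment(cost)
    return col_ind, cost[row_ind, col_ind].sum()


# illustrative negated pairwise SDRs for three estimates vs. three references
cost = -np.array([[10.0, 2.0, 1.0],
                  [3.0, 9.0, 2.0],
                  [1.0, 2.0, 8.0]])
perm, total = best_permutation(cost)  # perm == array([0, 1, 2])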

Add regression tests

  • use museval as a test dependency to load the museval JSON files
  • later extend museval so that the JSON files include the metrics that were used (e.g. v4, or SI-SDR)
  • process selected musdb 7s tracks and add the resulting evaluation files here (a test sketch follows below)
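A minimal sketch of such a regression test (the file paths, the stored-JSON layout, and the tolerances are placeholder assumptions):

import json

import numpy as np
import museval


def test_metrics_regression():
    """ Recompute metrics on a stored musdb 7s excerpt and compare them
    against previously saved reference scores. """
    references = np.load('tests/data/references.npy')  # shape: (nsrc, nsamples, nchan)
    estimates = np.load('tests/data/estimates.npy')
    sdr, isr, sir, sar = museval.evaluate(references, estimates)

    with open('tests/data/expected_scores.json') as fp:
        expected = json.load(fp)

    np.testing.assert_allclose(sdr, expected['SDR'], rtol=1e-5)
    np.testing.assert_allclose(sir, expected['SIR'], rtol=1e-5)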

Refactoring

  • reduce the number of functions
  • add modules for the different bsseval versions and for other metrics such as SNR and SI-SDR (see the SI-SDR sketch below)
  • possibly look for ways to reduce the memory footprint
  • look for ways to speed up the computation
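As a possible starting point for an SI-SDR module, a minimal single-channel numpy sketch of the usual scale-invariant SDR definition (the function name and signature are assumptions):

import numpy as np


def si_sdr(reference, estimate, eps=1e-8):
    """ Scale-invariant SDR for 1-D single-channel signals.

    The estimate is projected onto the reference, which removes any
    global gain before the distortion ratio is computed.
    """
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    noise = estimate - target
    return 10 * np.log10((np.sum(target ** 2) + eps) / (np.sum(noise ** 2) + eps))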
