Hello! Awesome repository, thanks for making it available! I have tw

Tunable Parameters about speaker-diarization HOT 4 CLOSED

aalto-speech commented on August 11, 2024

Tunable Parameters

from speaker-diarization.

Comments (4)

antoniomo commented on August 11, 2024

But of course feel free to open another issue if you can work around that and still want to give this a shot :)


antoniomo commented on August 11, 2024

Hi!

Telephone conversations are usually of quite poor quality. However, you can try some things:

Change the sample rate to 16 kHz with sox to match the models. This doesn't enhance the quality, but at least it makes the tool work.
Also with sox, try to reduce the noise.
If the recording conditions are too dominant, everything will probably get clustered into speaker_1. Try making the speaker change detection more sensitive, for example with a lambda of 0.75:

call(['./spk-change-detection.py', vad_recipe, args.feapath,
        '-o', spkchange_recipe, '-m', 'gw', '-d', 'BIC', '-w', '1.0',
        '-st', '3.0', '-dws', '0.1', '-l', '0.75'])  # Lambda to 0.75

Try using GLR with a fine-tuned distance threshold for spk-clustering.py:

# Distance: GLR, threshold: 3000 (fine-tune for your data)
call(['./spk-clustering.py', spkchange_recipe, args.feapath,
        '-o', outfile, '-m', 'hi', '-d', 'GLR', '-t', '3000'])

You'll find that code in the wrapper script spk-diarization2.py; make the changes there :)
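For the two sox steps mentioned above (resampling and noise reduction), here is a minimal sketch of how the commands could be built from Python. It is not part of this package. The helper names, the file names, the `0.21` noise-reduction amount, and the idea that the first half second of the recording contains only background noise are all assumptions for illustration; adjust them to your data. It relies only on sox's standard `trim`, `noiseprof` and `noisered` effects and the `-r` output option.

```python
import subprocess


def noise_profile_cmd(src, profile, start=0.0, length=0.5):
    """Build a noise profile from a slice of src.

    ASSUMPTION: the slice [start, start + length) seconds contains only
    background noise (no speech). Pick a slice that fits your recording.
    """
    return ["sox", src, "-n", "trim", str(start), str(length),
            "noiseprof", profile]


def denoise_cmd(src, dst, profile, amount=0.21):
    """Apply noise reduction; `amount` (0 to 1) is an assumed starting value."""
    return ["sox", src, dst, "noisered", profile, str(amount)]


def resample_cmd(src, dst, rate=16000):
    """Resample to `rate` Hz (16 kHz to match the models)."""
    return ["sox", src, "-r", str(rate), dst]


def preprocess(src, dst, profile="noise.prof", tmp="denoised.wav",
               run=subprocess.check_call):
    """Run profile -> denoise -> resample and return the commands used.

    `run` is injectable so the commands can be inspected without sox.
    """
    cmds = [
        noise_profile_cmd(src, profile),
        denoise_cmd(src, tmp, profile),
        resample_cmd(tmp, dst),
    ]
    for cmd in cmds:
        run(cmd)
    return cmds
```

Calling `preprocess('call.wav', 'call_16k.wav')` would run the three sox commands in order (sox must be installed). Too high a noise-reduction amount tends to distort the speech, so it is worth listening to the result before running the diarization on it.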

I hope this helps. Let me know if the issue can be closed :)


amenegola commented on August 11, 2024

Hi! Awesome answer, thanks!

I tried implementing the suggestions, but they didn't work very well. Sometimes it really looks like it is working, but most of the time it is not. I understand how changing the lambda and the GLR threshold affects segmentation and clustering, but it seems that the problem lies elsewhere. Also, tuning the parameters until I get good results on one recording may not generalize to the rest.

You can close the issue. Thank you very much for the help!


antoniomo commented on August 11, 2024

Sorry I couldn't be of more help! You are totally right: those parameters are data-dependent, so if the recording conditions vary a lot across your dataset, this won't work too well, and you'll need something more advanced than what this package can provide :(

