This repo is only dedicated to the post-processing PCS.
Introduction
PCS-tools
SpeechMetrics-tools
Citation
References
"PCS is derived based on the critical band importance function and applied to modify the targets of the SE model."
"It can also be used as a post-processing (PP) method to further sharpen the structure of enhanced speech and suppress residual noise."
More details can be found here: http://arxiv.org/abs/2203.17152 (arXiv preprint; accepted by INTERSPEECH 2022)
Enhanced audio is generated by several baseline models, and post-processing PCS is then applied to their outputs.
The experimental results are as follows:
Some examples are shown below:
Post-processing PCS tools can be found in the /PCS folder.
You can use them to post-process enhanced audio with PCS directly.
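As a rough illustration of the idea behind post-processing PCS, the sketch below applies a per-frequency gain to the log1p-compressed magnitude of a complex spectrogram and keeps the phase unchanged. It is a minimal sketch under stated assumptions, not the code in /PCS: the function name `pcs_stretch` and the weight values are hypothetical, and the actual band weights should be taken from the paper or the repo. Computing the spectrogram (STFT) and converting back to a waveform (inverse STFT) are left to whatever audio library you already use.

```python
import numpy as np


def pcs_stretch(spec, band_weights):
    """Apply a per-frequency contrast stretch to a complex spectrogram.

    spec:         complex array, shape (n_freq, n_frames), e.g. an STFT
    band_weights: real array, shape (n_freq,), one gain per frequency bin
                  (hypothetical values here; not the repo's coefficients)
    """
    spec = np.asarray(spec, dtype=np.complex128)
    weights = np.asarray(band_weights, dtype=np.float64)
    if weights.shape != (spec.shape[0],):
        raise ValueError("band_weights needs one entry per frequency bin")

    magnitude = np.abs(spec)
    phase = np.angle(spec)

    # Compress the magnitude, scale each frequency bin, then undo the compression.
    compressed = np.log1p(magnitude)
    stretched = np.expm1(weights[:, None] * compressed)

    # Recombine the stretched magnitude with the untouched phase.
    return stretched * np.exp(1j * phase)


# Toy 5-bin spectrogram with made-up weights, only to show the call.
toy_weights = np.array([1.0, 1.1, 1.2, 1.3, 1.4])
rng = np.random.default_rng(0)
toy_spec = rng.standard_normal((5, 4)) + 1j * rng.standard_normal((5, 4))
toy_out = pcs_stretch(toy_spec, toy_weights)
```

A weight of 1 leaves a bin unchanged, while a weight above 1 raises the magnitude m to (1 + m)^w - 1, which boosts stronger components more than weaker ones and so increases contrast in that band.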
Speech metric scores were computed with /speech_metrics.
Online demo: https://lojoffy-pcs-online-demo-main-luu0rc.streamlitapp.com/
If you find the code useful in your research, please cite:
@article{chao2022perceptual,
title={Perceptual Contrast Stretching on Target Feature for Speech Enhancement},
author={Chao, Rong and Yu, Cheng and Fu, Szu-Wei and Lu, Xugang and Tsao, Yu},
journal={Proc. of INTERSPEECH},
year={2022}
}
arXiv: https://arxiv.org/pdf/1703.09452.pdf
wikipedia: https://en.wikipedia.org/wiki/Wiener_filter
arXiv: https://arxiv.org/pdf/2006.10296.pdf
arXiv: https://arxiv.org/pdf/1805.00579.pdf
arXiv: https://arxiv.org/pdf/2104.03538.pdf
From SpeechBrain: https://huggingface.co/speechbrain/metricgan-plus-voicebank
arXiv: https://arxiv.org/pdf/2104.13002.pdf
Reproduced and denoted as DPT*