2021-11-06: I have just updated the code structure to make it easier to understand. It may contain bugs for now; I will run some test training later.

2021-11-01: I will update the code later to make it easier to use.

VoiceFixer

VoiceFixer is a framework for general speech restoration. It targets the restoration of severely degraded speech and historical speech.

Materials

Usage

Environment (do this first)

# Download dataset and prepare running environment
git clone https://github.com/haoheliu/voicefixer_main.git
cd voicefixer_main
source init.sh 

VoiceFixer for general speech restoration

Here we take VF_UNet (VoiceFixer with a UNet as the analysis module) as an example.

  • Training
# pass in a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json # you can modify the configuration file to personalize your training

You can check out the logs directory for checkpoints, logs, and validation results.
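One way to personalize a run without editing the shipped configuration is to write a modified copy and pass that copy to `-c`. The sketch below assumes the configuration file is a JSON object at the top level; `write_config_variant` is a hypothetical helper, not part of this repository, and the key names it overrides must come from your own configuration file since this README does not document them.

```python
import json
from pathlib import Path


def write_config_variant(src, dst, overrides):
    """Copy a JSON config to dst, replacing only top-level keys that already exist.

    Hypothetical helper for illustration; it is not part of voicefixer_main.
    """
    config = json.loads(Path(src).read_text())
    unknown = [key for key in overrides if key not in config]
    if unknown:
        # Refuse to add keys the original config does not define,
        # so a typo in a key name fails loudly instead of being ignored.
        raise KeyError(f"keys not present in {src}: {unknown}")
    config.update(overrides)
    Path(dst).write_text(json.dumps(config, indent=4))
    return config
```

After writing the copy, pass its path to the training script in place of the original configuration file.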

  • Evaluation

Automatic evaluation generates a .csv file for all testsets.

For example, to evaluate on all testsets (the default):

python3 eval_gsr_voicefixer.py  \
                    --config  <path-to-the-config-file> \
                    --ckpt  <path-to-the-checkpoint> 

To evaluate only on the GSR testset:

python3 eval_gsr_voicefixer.py  \
                    --config  <path-to-the-config-file> \
                    --ckpt  <path-to-the-checkpoint> \
                    --testset  general_speech_restoration \
                    --description  general_speech_restoration_eval

The following testsets can be passed to --testset:

  • base: all testsets
  • clip: clipped speech, with clipping thresholds of 0.1, 0.25, and 0.5
  • reverb: reverberant speech
  • general_speech_restoration: speech containing all kinds of random distortions
  • enhancement: noisy speech
  • speech_super_resolution: low-resolution speech, with sampling rates of 2 kHz, 4 kHz, 8 kHz, 16 kHz, and 24 kHz
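To evaluate each testset in turn, the command can be assembled programmatically. The following is a minimal sketch, not part of the repository: `build_eval_command` and `TESTSETS` are hypothetical names, the flags are the ones shown in this README, and the `<testset>_eval` description string simply follows the pattern of the example above.

```python
import shlex

# Testset names exactly as listed in this README.
TESTSETS = [
    "base",
    "clip",
    "reverb",
    "general_speech_restoration",
    "enhancement",
    "speech_super_resolution",
]


def build_eval_command(config, ckpt, testset, limit_numbers=None):
    """Return the argument list for one eval_gsr_voicefixer.py run (hypothetical helper)."""
    if testset not in TESTSETS:
        raise ValueError(f"unknown testset: {testset}")
    cmd = [
        "python3", "eval_gsr_voicefixer.py",
        "--config", config,
        "--ckpt", ckpt,
        "--testset", testset,
        "--description", f"{testset}_eval",
    ]
    if limit_numbers is not None:
        # Optional: restrict the run to a few utterances.
        cmd += ["--limit_numbers", str(limit_numbers)]
    return cmd


if __name__ == "__main__":
    for name in TESTSETS:
        cmd = build_eval_command(
            "<path-to-the-config-file>", "<path-to-the-checkpoint>", name
        )
        # Print only; pass cmd to subprocess.run(cmd, check=True) to execute it.
        print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.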

To evaluate on a small portion of the data, e.g. 10 utterances, pass the number to the --limit_numbers argument.

python3 eval_gsr_voicefixer.py  \
                    --config  <path-to-the-config-file> \
                    --ckpt  <path-to-the-checkpoint> \
                    --limit_numbers 10 

Evaluation results will be presented in the exp_results folder.
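Since this README does not describe the layout inside exp_results, a quick way to locate the generated .csv files is to search the folder recursively. This is a sketch under that assumption; `list_result_files` is a hypothetical helper and makes no claim about the columns or naming of the files.

```python
from pathlib import Path


def list_result_files(root="exp_results"):
    """Return all .csv files under root, most recently modified first."""
    root = Path(root)
    if not root.is_dir():
        # No evaluation has been run yet, or the folder is elsewhere.
        return []
    files = [p for p in root.rglob("*.csv") if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for path in list_result_files():
        print(path)
```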

ResUNet for general speech restoration

  • Training
# pass in a configuration file to the training script
python3 train_gsr_voicefixer.py -c config/vctk_base_voicefixer_unet.json

You can check out the logs directory for checkpoints, logs, and validation results.

  • Evaluation (similar to VoiceFixer evaluation)
    python3 eval_ssr_unet.py  \
                        --config  <path-to-the-config-file> \
                        --ckpt  <path-to-the-checkpoint> \
                        --limit_numbers <int-test-only-on-a-few-utterance> \
                        --testset  <the-testset-you-want-to-use> \
                        --description  <describe-this-test>

ResUNet for single task speech restoration

  • Training

    • Denoising
    # pass in a configuration file to the training script
    python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_denoising.json
    • Dereverberation
    # pass in a configuration file to the training script
    python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_dereverberation.json
    • Super Resolution
    # pass in a configuration file to the training script
    python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_super_resolution.json
    • Declipping
    # pass in a configuration file to the training script
    python3 train_ssr_unet.py -c config/vctk_base_ssr_unet_declipping.json

You can check out the logs directory for checkpoints, logs, and validation results.
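The four single-task training commands differ only in the configuration file, so they can be generated from one table. A minimal sketch, assuming the configuration paths shown above; `SSR_CONFIGS` and `build_train_command` are hypothetical names, not part of the repository.

```python
import shlex

# Task name -> configuration file, copied from the commands in this README.
SSR_CONFIGS = {
    "denoising": "config/vctk_base_ssr_unet_denoising.json",
    "dereverberation": "config/vctk_base_ssr_unet_dereverberation.json",
    "super_resolution": "config/vctk_base_ssr_unet_super_resolution.json",
    "declipping": "config/vctk_base_ssr_unet_declipping.json",
}


def build_train_command(task):
    """Return the argument list that trains the single-task ResUNet for `task`."""
    if task not in SSR_CONFIGS:
        raise ValueError(f"unknown task: {task}")
    return ["python3", "train_ssr_unet.py", "-c", SSR_CONFIGS[task]]


if __name__ == "__main__":
    for task in SSR_CONFIGS:
        # Print only; pass the list to subprocess.run(..., check=True) to train.
        print(shlex.join(build_train_command(task)))
```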

  • Evaluation (similar to VoiceFixer evaluation)
    python3 eval_ssr_unet.py  \
                        --config  <path-to-the-config-file> \
                        --ckpt  <path-to-the-checkpoint> \
                        --limit_numbers <int-test-only-on-a-few-utterance> \
                        --testset  <the-testset-you-want-to-use> \
                        --description  <describe-this-test>

Citation

 @misc{liu2021voicefixer,   
     title={VoiceFixer: Toward General Speech Restoration With Neural Vocoder},   
     author={Haohe Liu and Qiuqiang Kong and Qiao Tian and Yan Zhao and DeLiang Wang and Chuanzeng Huang and Yuxuan Wang},  
     year={2021},  
     eprint={2109.13731},  
     archivePrefix={arXiv},  
     primaryClass={cs.SD}  
 }
