iter_vocoder's Introduction

Iterative Vocoders

This repository implements the iterative vocoders described in the paper "Beyond Griffin-Lim: Improved Iterative Phase Retrieval for Speech" by Tal Peer et al.

Paper link on arXiv

Note that these vocoder implementations are intended for academic use and are not optimized for speed.

The iterative algorithms implemented

  • Griffin-Lim Algorithm (GLA)
  • Fast Griffin-Lim (FGLA)
  • Relaxed Averaged Alternating Reflections (RAAR)
  • Difference Map (DiffMap)
  • Alternating Direction Method of Multipliers (ADMM)
  • Hybrid algorithms (any of the algorithms above can be combined)
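
For orientation, the basic Griffin-Lim iteration alternates between two steps: impose the target magnitude on the current estimate, then make the result consistent by applying an inverse STFT followed by a forward STFT. The sketch below is a minimal NumPy illustration of that idea, not this package's implementation. The helper names (`stft`, `istft`, `griffin_lim`, `spectral_error`), the random phase initialization, and the small STFT sizes are all assumptions chosen for the example.

```python
import numpy as np


def stft(x, n_fft=256, hop=64):
    """Hann-windowed STFT; returns an array of shape (n_frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft + 1)[:-1]  # periodic Hann window
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    return np.fft.rfft(frames, axis=1)


def istft(spec, n_fft=256, hop=64):
    """Least-squares inverse STFT via windowed overlap-add."""
    window = np.hanning(n_fft + 1)[:-1]
    n_frames = spec.shape[0]
    length = n_fft + hop * (n_frames - 1)
    x = np.zeros(length)
    norm = np.zeros(length)
    frames = np.fft.irfft(spec, n=n_fft, axis=1)
    for i in range(n_frames):
        sl = slice(i * hop, i * hop + n_fft)
        x[sl] += frames[i] * window
        norm[sl] += window ** 2
    return x / np.maximum(norm, 1e-8)  # guard samples where the window sum is zero


def griffin_lim(mag, n_iter=50, n_fft=256, hop=64, seed=0):
    """Estimate a signal whose STFT magnitude approximates `mag`."""
    rng = np.random.default_rng(seed)
    phase = np.exp(2j * np.pi * rng.random(mag.shape))  # random initial phase
    for _ in range(n_iter):
        x = istft(mag * phase, n_fft, hop)       # impose magnitude, go to time domain
        spec = stft(x, n_fft, hop)               # back to a consistent spectrogram
        phase = np.exp(1j * np.angle(spec))      # keep only the phase
    return istft(mag * phase, n_fft, hop)


def spectral_error(mag, x, n_fft=256, hop=64):
    """Relative error between the target magnitude and the magnitude of STFT(x)."""
    est = np.abs(stft(x, n_fft, hop))
    return np.linalg.norm(est - mag) / np.linalg.norm(mag)


if __name__ == "__main__":
    t = np.arange(4096) / 16000
    signal = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
    mag = np.abs(stft(signal))
    for n in (0, 10, 50):
        print(n, spectral_error(mag, griffin_lim(mag, n_iter=n)))
```

The other algorithms in the list change how the two steps are combined (for example, by adding a momentum or relaxation term); those variants are not shown here.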

How to install

Clone the repository

git clone https://github.com/ogunlao/iter_vocoder.git

Then, you can install the package

cd iter_vocoder
pip install -e .

Add the package path to your Python path

import sys
sys.path.append("[FULL_PATH_OR_DIR]/iter_vocoder/src/vocoder")
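
To confirm that the setup worked before running the examples, you can check whether the module can be located. This is a small sketch using only the standard library; `is_importable` is a hypothetical helper, and the module name `vocoder` is taken from the import statement in the Examples section.

```python
import importlib.util


def is_importable(module_name):
    """Return True if a top-level module can be located on the current sys.path."""
    return importlib.util.find_spec(module_name) is not None


# Module name assumed from the `from vocoder import ...` line in the Examples section
print("vocoder importable:", is_importable("vocoder"))
```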

Examples

# Run the following from [FULL_PATH_OR_DIR]/iter_vocoder/src
import librosa
from vocoder import (GriffinLim, FastGriffinLim, RAAR,
                     DiffMap, ADMM, HybridVocoder)

# STFT parameters
hop_length = 256
win_length = 1024
n_fft = 1024
window = "hann"
center = True

sampling_rate = 16000

# Load a spectrogram, e.g., extract one from an audio file
audio, sr = librosa.load("[FULL_PATH_OR_DIR]/iter_vocoder/sample_audios/sample_audio.wav",
                         sr=sampling_rate)
complex_spec = librosa.stft(y=audio,
                            hop_length=hop_length,
                            win_length=win_length,
                            n_fft=n_fft,
                            window=window,
                            center=center)

# librosa.magphase returns the magnitude and the unit-modulus complex phase factor
magspec, phase = librosa.magphase(complex_spec)

# To use Griffin-Lim

# a. Initialize the Griffin-Lim vocoder
gl_vocoder = GriffinLim(n_iter=20,
                        hop_length=hop_length,
                        win_length=win_length,
                        n_fft=n_fft,
                        window=window,
                        center=center)
# b. Call the vocode method on the magnitude spectrogram
gen_audio = gl_vocoder.vocode(magspec)

# c. Optionally, pass an initial phase to the vocoder
gen_audio = gl_vocoder.vocode(magspec, init_phase=phase)

# To chain several iterative vocoders (a hybrid vocoder)

# Parameters for each vocoder, e.g., run Fast Griffin-Lim for the first
# 60 iterations, then RAAR for the last 40, for a total of 100 iterations
param_dict = {
    "fgla": {"n_iter": 60},
    "raar": {"n_iter": 40},
}
# You can choose among "gla", "fgla", "admm", "diffmap" and "raar"

# Parameters shared by all vocoders, e.g., STFT parameters
stft_args = dict(
    hop_length=hop_length,
    win_length=win_length,
    window=window,
    center=center,
    n_fft=n_fft,
)

hybrid_voc = HybridVocoder(param_dict, stft_args)
gen_audio = hybrid_voc.vocode(magspec)

# You can also give an initial phase
gen_audio = hybrid_voc.vocode(magspec, init_phase=phase)
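
When building a hybrid configuration, it can help to check the stage names and the total iteration count before running it. The sketch below is a hypothetical helper, not part of the package. It assumes that every entry of the parameter dictionary has an `"n_iter"` key and that stages run in dictionary insertion order, as the comment in the example above suggests; the list of allowed names is copied from that example.

```python
ALLOWED_VOCODERS = ("gla", "fgla", "admm", "diffmap", "raar")


def summarize_schedule(param_dict, allowed=ALLOWED_VOCODERS):
    """Return ([(name, n_iter), ...], total_iterations) for a hybrid configuration."""
    unknown = [name for name in param_dict if name not in allowed]
    if unknown:
        raise ValueError(f"unknown vocoder name(s): {unknown}")
    # Python dicts preserve insertion order, so this mirrors the written order
    stages = [(name, params["n_iter"]) for name, params in param_dict.items()]
    total = sum(n_iter for _, n_iter in stages)
    return stages, total


example = {"fgla": {"n_iter": 60}, "raar": {"n_iter": 40}}
stages, total = summarize_schedule(example)
print(stages, total)
```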

Todo

  • Add tests
  • Compare implementation with results in paper

Contributors

  • Sewade Ogun
