⚠️ WARNING 2021-03-01 ⚠️

The Python version of the code is buggy - the overlap-add reconstruction is incorrect. I believe it only works by luck.

The MATLAB version works fine. I also recommend looking at Zen, another of my projects, written in C++ and CUDA: it is a very fast (160 µs!) implementation of the real-time HPSS idea, with correct overlap-add.
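As background on what "correct overlap-add" requires (this is not a diagnosis of the bug in the Python code), here is a minimal sketch that checks whether hop-shifted copies of a window sum to a constant. The window type, `n_fft = 1024`, and `hop = 256` are assumptions chosen for illustration, and `overlap_sum` is a hypothetical helper, not a function from this repository.

```python
import numpy as np
from scipy.signal import get_window


def overlap_sum(window, hop):
    """Sum all hop-shifted copies of `window` over one hop period."""
    total = np.zeros(hop)
    for start in range(0, len(window), hop):
        seg = window[start:start + hop]
        total[:len(seg)] += seg
    return total


n_fft, hop = 1024, 256  # assumed values, for illustration only
hann = get_window("hann", n_fft, fftbins=True)  # periodic Hann window

# If the sum is flat, overlap-added frames can be rescaled by one constant.
sums = overlap_sum(hann, hop)
is_constant = np.allclose(sums, sums[0])

# A hop that does not fit the window leaves ripple in the reconstruction.
bad_sums = overlap_sum(hann, 300)
bad_is_constant = np.allclose(bad_sums, bad_sums[0])
```

When a window is applied both before the FFT and after the inverse FFT, the same check applies to the product of the analysis and synthesis windows rather than to either window alone.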

Real-Time-HPSS

This repository contains a real-time implementation of the median-filtering HPSS algorithm [1], [2].

The original implementation computes the STFT (spectrogram) of the audio signal, derives harmonic and percussive masks from it, and applies those masks to the STFT. The ISTFT then yields the separated harmonic and percussive audio signals. By merging the STFT and ISTFT loops into a single sliding STFT, the separation can be done in real time.
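The sliding-STFT idea can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the code from this repository: the class name `SlidingHPSS`, the sqrt-Hann windows, the filter lengths, the soft masks, and the choice to apply a causal median to the newest frame are all assumptions made here. The sketch also assumes that `hop` divides `n_fft` and is at most half of it.

```python
import numpy as np
from scipy.ndimage import median_filter
from scipy.signal import get_window


class SlidingHPSS:
    """Hypothetical hop-by-hop HPSS sketch (not the hpss_rt package)."""

    def __init__(self, n_fft=1024, hop=256, l_harm=17, l_perc=17):
        assert n_fft % hop == 0 and hop <= n_fft // 2
        self.n_fft, self.hop, self.l_perc = n_fft, hop, l_perc
        # sqrt-Hann for analysis and synthesis: their product is a Hann
        # window, whose hop-shifted copies sum to a constant.
        self.win = np.sqrt(get_window("hann", n_fft, fftbins=True))
        self.ola_gain = np.sum(self.win ** 2) / hop  # that constant
        self.in_buf = np.zeros(n_fft)
        self.out_h = np.zeros(n_fft)
        self.out_p = np.zeros(n_fft)
        # Magnitudes of the last l_harm frames (rows: time, cols: frequency).
        self.mag_hist = np.zeros((l_harm, n_fft // 2 + 1))

    def process_next_hop(self, x_hop):
        # Slide the analysis buffer: drop the oldest hop, append the newest.
        self.in_buf = np.concatenate((self.in_buf[self.hop:], x_hop))
        spec = np.fft.rfft(self.in_buf * self.win)
        mag = np.abs(spec)
        self.mag_hist = np.vstack((self.mag_hist[1:], mag))

        # Harmonic estimate: median over time (window ends at this frame).
        harm = np.median(self.mag_hist, axis=0)
        # Percussive estimate: median over frequency within this frame.
        perc = median_filter(mag, size=self.l_perc, mode="reflect")

        # Soft masks that sum to 1 in every frequency bin.
        eps = 1e-12
        mask_h = (harm ** 2 + eps / 2) / (harm ** 2 + perc ** 2 + eps)
        mask_p = 1.0 - mask_h

        h_hop = self._overlap_add(self.out_h, spec * mask_h)
        p_hop = self._overlap_add(self.out_p, spec * mask_p)
        return h_hop, p_hop

    def _overlap_add(self, out_buf, spec):
        frame = np.fft.irfft(spec, n=self.n_fft) * self.win / self.ola_gain
        out_buf += frame
        done = out_buf[:self.hop].copy()  # no later frame touches these
        # Shift the accumulator left by one hop and clear the tail.
        out_buf[:] = np.concatenate((out_buf[self.hop:], np.zeros(self.hop)))
        return done


# Usage: feed a test signal one hop at a time.
rng = np.random.default_rng(0)
x = rng.standard_normal(256 * 40)
sep = SlidingHPSS()
h = np.zeros_like(x)
p = np.zeros_like(x)
for i in range(0, len(x), sep.hop):
    h[i:i + sep.hop], p[i:i + sep.hop] = sep.process_next_hop(x[i:i + sep.hop])
delay = sep.n_fft - sep.hop  # output lags the input by this many samples
```

Because the two masks sum to 1, `h + p` reproduces the input delayed by `n_fft - hop` samples, which is a convenient sanity check on the overlap-add bookkeeping.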

MATLAB implementation

There's a demo script which performs live, real-time HPSS on a microphone input stream (watch out for feedback: you should probably have your output and input devices in different rooms). PDFs of the report and presentation, built from the LaTeX sources in this repo, are available on the releases page. The spectrograms below were created with HPSSRtWav.m, which processes wav files in chunks to demonstrate the validity of real-time HPSS without the complications of microphone recordings.

Mixed spectrogram:

Real-time harmonic separation:

Real-time percussive separation:

Python demo

The file chunked_wav_example.py uses the hpss_rt package in the python subdirectory:

import matplotlib.pyplot as plt
import numpy
import scipy.io.wavfile

from hpss_rt import HPSSRT

fs, x = scipy.io.wavfile.read("mixed.wav")
hpss = HPSSRT(fs)

# zero-initialize the outputs: numpy.ndarray() leaves memory uninitialized,
# so the skipped last chunk would otherwise contain garbage
h = numpy.zeros(x.shape)
p = numpy.zeros(x.shape)

# feed the signal to the separator one hop at a time
x_ptr = 0
while x_ptr < len(x):
    chunk = x[x_ptr : x_ptr + hpss.hop]
    if len(chunk) != hpss.hop:
        # skip uneven/non-hop-sized last chunk
        break
    h_, p_ = hpss.process_next_hop(chunk)
    h[x_ptr : x_ptr + hpss.hop] = h_
    p[x_ptr : x_ptr + hpss.hop] = p_
    x_ptr += hpss.hop

scipy.io.wavfile.write("h_rt_sep_python.wav", fs, h)
scipy.io.wavfile.write("p_rt_sep_python.wav", fs, p)

# plot spectrograms of the mix and of both separated outputs
for path in ("mixed.wav", "h_rt_sep_python.wav", "p_rt_sep_python.wav"):
    fs, sig = scipy.io.wavfile.read(path)
    plt.specgram(sig, Fs=fs, NFFT=1024, noverlap=256)
    plt.show()

Mixed spectrogram:

Real-time harmonic separation:

Real-time percussive separation:

Project files

  • audio - audio clips used throughout the report and presentation to generate results, plots
  • images - plots, etc. for the report and presentation
  • latex - latex files for the report and presentation PDFs
  • matlab - matlab scripts
    • HPSS.m - median-filtering HPSS (with both the 2010 and 2014 techniques)
    • HPSSWav.m - a script that loads a wav file and applies HPSS.m
    • HPSSMicrophone.m - a real-time implementation that separates the microphone input and outputs either the percussive or the harmonic component
    • HPSSRtWav.m - a modification of HPSSMicrophone.m to test the real-time implementation with wav files
  • python (nb! buggy/suspicious STFT reconstruction) - python library + class, from hpss_rt import HPSSRT

About this project

Real-Time-HPSS is presented as my final project for MUMT 501, Winter 2020.
