lucianodato / libspecbleach

C library for audio noise reduction and other spectral effects

License: GNU Lesser General Public License v2.1

C 98.26% Meson 1.74%

c fft library noise-reduction noise-removal spectral stft spectral-processing broadband-noise-reduction non-stationary-noise-reduction


libspecbleach's Issues

Noise modeling options

Give the user the possibility of using the rolling mean or max functions for noise spectrum modeling
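A minimal sketch of what the two estimators could look like, assuming the noise profile is a per-bin array updated once per captured frame. "Rolling mean" is read here as a running (cumulative) average, which is an assumption, and the function names are hypothetical rather than part of libspecbleach.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: fold one frame into a running mean, bin by bin.
 * frames_seen is the number of frames already averaged into profile. */
void noise_profile_update_mean(float *profile, const float *frame,
                               size_t bins, unsigned frames_seen) {
    assert(profile && frame);
    for (size_t k = 0; k < bins; k++) {
        profile[k] += (frame[k] - profile[k]) / (float)(frames_seen + 1u);
    }
}

/* Hypothetical helper: keep the largest value seen so far in each bin. */
void noise_profile_update_max(float *profile, const float *frame,
                              size_t bins) {
    assert(profile && frame);
    for (size_t k = 0; k < bins; k++) {
        if (frame[k] > profile[k]) {
            profile[k] = frame[k];
        }
    }
}
```

The mean tracks the typical noise level, while the max gives a more conservative (louder) profile that removes more noise at the risk of removing more signal.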

There are no unit tests at the moment. To be a serious project, tests should definitely be added. I'm not sure which test framework should be used or is commonly used across open source C projects. Maybe writing a custom client with no framework is better in this case.

Memory issue, Windows

I have compiled this in a 64-bit Windows MFC DLL. It works superbly, but I believe the heap is being corrupted. I am so impressed with libspecbleach that I'm spending considerable time trying to track this down. Available via e-mail as [email protected].

Rework stft code

Needed for #11. Using a proper circular/ring buffer
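A minimal sketch of the kind of ring buffer meant here, assuming a fixed power-of-two capacity and single-threaded use. The type and function names are hypothetical and do not reflect the library's STFT code.

```c
#include <assert.h>
#include <stddef.h>

#define RING_CAPACITY 8u /* illustrative size; must be a power of two */

typedef struct {
    float data[RING_CAPACITY];
    size_t head; /* total number of samples ever written */
    size_t tail; /* total number of samples ever read */
} RingBuffer;

/* Number of samples currently stored. */
size_t ring_available(const RingBuffer *rb) {
    return rb->head - rb->tail;
}

/* Writes up to n samples; returns how many actually fit. */
size_t ring_write(RingBuffer *rb, const float *src, size_t n) {
    assert(rb && src);
    size_t free_slots = RING_CAPACITY - ring_available(rb);
    if (n > free_slots) {
        n = free_slots;
    }
    for (size_t i = 0; i < n; i++) {
        rb->data[(rb->head + i) & (RING_CAPACITY - 1u)] = src[i];
    }
    rb->head += n;
    return n;
}

/* Reads up to n samples in FIFO order; returns how many were read. */
size_t ring_read(RingBuffer *rb, float *dst, size_t n) {
    assert(rb && dst);
    size_t stored = ring_available(rb);
    if (n > stored) {
        n = stored;
    }
    for (size_t i = 0; i < n; i++) {
        dst[i] = rb->data[(rb->tail + i) & (RING_CAPACITY - 1u)];
    }
    rb->tail += n;
    return n;
}
```

Keeping head and tail as ever-increasing counters and masking only on access avoids the usual ambiguity between a full and an empty buffer.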

Project doesn't install anything

The 'install' target is typically expected to install the files.

Whitening factor in adaptive denoising?

Apologies if this is just a case of me not understanding the core technology, but why isn't whitening factor exported as part of the adaptive denoising API, when it is part of the learning denoising API? Coming from noise-repellent, and unless I'm a loony, whitening definitely worked just as well in adaptive mode there as in learning mode.

configure looks for the sndfile library and then the project doesn't use it

Library sndfile found: YES

The built binary isn't linked with it.

Smoothing

Would it be possible to explain the smoothing factor? Changing this in adaptive mode appears to make no difference. Many thanks, Simon.

How to use the example programs?

Hello!

I want to play with the library and found the example programs and compiled them. I am using the latest HEAD version of the library. When I pass a captured wav file to the denoiser_demo program, the output wav file sounds very mechanical and machine-like (like the sound of robots in old movies). I tried playing with the SpectralBleachParameters before it is passed to the load function. They do not seem to have any effect. Clearly, I am using it wrong.

Also towards the end of the denoiser_demo run, I get this error:

double free or corruption (!prev)
Aborted

Quality/Latency tradeoff control? AKA Variable frame size

AFAIK noise-repellent has a fixed latency.

I wonder if there could be a way to control the amount of delay somehow.

In some cases one could sacrifice some quality to get faster operation for semi-live usage like video live streaming or VoIP.

I have no idea if this is possible, so let me know.

Spectral Band Replication

To recover high frequencies after the reduction

Knee for spectral gates

This might be pretty useful for HF and transient conservation

Explain and implement different modes better

(Sorry for the somewhat strange-looking title.)

I just learned by accident that I can choose different values for "Learn noise profile". Neither the README.md nor the Wiki explain them, nor do they acknowledge their existence. How do they work? Some of them appear to be labeled in Spanish. For English, I would prefer simpler labels like

0: "Disabled"
1: "Average"
2: "Median"
3: "Maximum"

But I do not know whether that would be expressive enough since I do not understand the semantics. Please explain the meaning of the different values (0 to 3) in the Wiki.

While we are at it: From what I guess the different values are supposed to mean, does that mean that the way noise reduction works depends on the mode chosen during learning? If so, why not just learn for all of the three different modes simultaneously, and then let the user choose the mode to be used afterwards? That way, the user can try and compare different modes without having to try and make Noise Repellent relearn from the exact same noise profile repeatedly.

In the end, I think of something like this:

"Learn noise profile" should have only two values, "0" meaning "Disabled" and "1" meaning "Enabled".
There should be a new control "Mode" with the same four different values and labels as enumerated in the list above.
Since "Mode" can then be set to "0" meaning "Disabled", there is no need for a separate control "Enable" anymore. Thus, the currently present control "Enable" can be removed.

The main advantages I see are the following:

It is easier for the user to compare different modes by trial and error.
Every effect host I have ever seen already features a separate (binary or continuous) dry/wet control for each effect. Thus, removing the current "Enable" control simplifies the interface since it is probably redundant anyway.
Given the current implementation, what happens if the user switches "Learn noise profile" from a non-zero value to another non-zero value, e.g. from "2" to "1"? This kind of question (and confusion!) will not arise with the changes proposed here.

Of course, there are probably reasons as to why the controls are the way they currently are. Maybe what I imagine is too complex to implement or too difficult to use. What do you think?

P. S.: The control "Reduction strenght" (sic!) should probably read "Reduction strength".

Use meson subdir build scripts

To make it easier to add new files when needed.

Implement FFT partitioning to reduce latency

Hi,

I see the frame size is 20ms. If latency and CPU load are not a problem, is there an advantage in using a much bigger frame size, for example 160ms?
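For a sense of scale, the arithmetic behind this question can be sketched as follows, assuming the FFT length equals the frame length (no zero padding). The 48 kHz sample rate used in the note below is also an assumption, not a value taken from the library, and the function names are hypothetical.

```c
#include <assert.h>

/* Number of samples in a frame of frame_ms milliseconds. */
unsigned frame_samples(unsigned sample_rate, unsigned frame_ms) {
    assert(frame_ms > 0u);
    return sample_rate * frame_ms / 1000u;
}

/* Spacing between adjacent FFT bins in Hz, which is 1 / frame duration. */
double bin_spacing_hz(unsigned frame_ms) {
    assert(frame_ms > 0u);
    return 1000.0 / (double)frame_ms;
}
```

At 48 kHz, a 20 ms frame is 960 samples with bins 50 Hz apart, while a 160 ms frame is 7680 samples with bins 6.25 Hz apart. Longer frames resolve frequency more finely, but each frame also spans more time, so transients are smeared over a longer interval.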

Use of inline

Hi,

I'm using this library in a 64-bit Windows DLL with Visual Studio 2019, the DLL is a mix of C++ (my code) and C (LibSpecBleach).

The use of inline in general_utils.c and spectral_utils.c results in the functions not being available - the compiler removes them. A good C compiler will inline all suitable functions anyway.

Would it be possible to remove the use of inline? Every time I use a new download from here I have to edit these two files.
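One workaround that tends to behave the same across compilers, sketched under the assumption that the affected helpers are small enough to live in a shared header: in C99 a plain `inline` definition does not by itself provide an external definition, whereas `static inline` gives every translation unit its own private copy. The names `clampf_unit` and `scale_gain` are hypothetical, not functions from general_utils.c or spectral_utils.c.

```c
#include <assert.h>

/* Would live in a shared header. `static inline` means each file that
 * includes it gets a private copy, so no external symbol is required. */
static inline float clampf_unit(float x) {
    if (x < 0.0f) {
        return 0.0f;
    }
    if (x > 1.0f) {
        return 1.0f;
    }
    return x;
}

/* A caller in any source file that includes the header. */
float scale_gain(float gain, float amount) {
    assert(amount >= 0.0f);
    return clampf_unit(gain) * amount;
}
```

The other option is the one requested above: drop `inline` from the definitions in the .c files and leave the inlining decision to the optimizer.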

Better noise estimation for adaptive mode

Better noise estimation algorithms

Enhance transient detection using a combination of detection criteria

Spectral flux is not enough for low energy transients. Multiband detection using psycho-acoustic bands like bark or mel should work better. This is key in order to use multiresolution processing.
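For reference, a minimal sketch of half-wave rectified spectral flux restricted to a range of bins, so the same routine can be run per band. This is one common definition and may differ from what the library computes; the function name and the idea of passing band edges as bin indices are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Sum of positive magnitude increases between two consecutive frames,
 * over bins in the half-open range [first_bin, end_bin). */
float band_spectral_flux(const float *prev_mag, const float *cur_mag,
                         size_t first_bin, size_t end_bin) {
    assert(prev_mag && cur_mag && first_bin <= end_bin);
    float flux = 0.0f;
    for (size_t k = first_bin; k < end_bin; k++) {
        float diff = cur_mag[k] - prev_mag[k];
        if (diff > 0.0f) {
            flux += diff; /* only rising energy counts as an onset cue */
        }
    }
    return flux;
}
```

Calling it once over the whole spectrum gives the broadband flux. Calling it once per Bark or mel band, with band edges supplied by the caller, lets a small rise confined to one band be compared against that band's own threshold instead of being buried in the full-spectrum sum.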

Detect tonal (hum) content in noise print capture and enable reduction on the tonal content separately of the noisy content (hiss)

Use some form of peak detection in the spectrum to find hum-like bins and let the user reduce them separately (à la iZotope).
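A very simple form of such peak detection, as a sketch only: a bin is flagged when it is a local maximum and exceeds the average of its two neighbours by a chosen ratio. The criterion, the `ratio` parameter, and the function name are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Flags hum-like bins in a noise profile. is_tonal must hold `bins`
 * entries. Returns the number of bins flagged. */
size_t mark_tonal_bins(const float *profile, size_t bins, float ratio,
                       unsigned char *is_tonal) {
    assert(profile && is_tonal && ratio > 1.0f);
    size_t count = 0;
    for (size_t k = 0; k < bins; k++) {
        is_tonal[k] = 0;
    }
    /* The first and last bins have only one neighbour and are skipped. */
    for (size_t k = 1; k + 1 < bins; k++) {
        float neighbours = 0.5f * (profile[k - 1] + profile[k + 1]);
        if (profile[k] > profile[k - 1] && profile[k] > profile[k + 1] &&
            profile[k] > ratio * neighbours) {
            is_tonal[k] = 1;
            count++;
        }
    }
    return count;
}
```

The resulting mask could then select which reduction amount applies to each bin, one for the tonal bins and one for the remaining broadband noise.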

Better smoothing

Maybe something like what is proposed in "Spectral Subtraction with Adaptive Averaging of the Gain Function".
Time smoothing between the current and past fft_p2 (similar in effect to Ephraim and Malah) is done by applying a release envelope to the signal power spectrum. The best option would be to adaptively smooth the 2D spectral components, which would require a bigger buffer, as suggested by Lukin in "Suppression of Musical Noise Artifacts in Audio Noise Reduction by Adaptive 2D Filtering". Another candidate is a bilateral filter or non-local means + DFT (this is the best improvement, but it would need a major overhaul of the STFT).
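The release-envelope idea can be sketched like this, assuming one common form in which rising power is followed immediately and falling power decays gradually. The function name and the `release` coefficient in [0, 1) are hypothetical and not taken from the library.

```c
#include <assert.h>
#include <stddef.h>

/* Updates `smoothed` in place from the current power spectrum.
 * release close to 1 means a slow decay; 0 disables smoothing. */
void release_envelope(float *smoothed, const float *power, size_t bins,
                      float release) {
    assert(smoothed && power);
    assert(release >= 0.0f && release < 1.0f);
    for (size_t k = 0; k < bins; k++) {
        if (power[k] >= smoothed[k]) {
            smoothed[k] = power[k]; /* attack: follow increases at once */
        } else {
            /* release: blend the previous value with the current one */
            smoothed[k] = release * smoothed[k] + (1.0f - release) * power[k];
        }
    }
}
```

Because only the decay is slowed down, onsets stay sharp while isolated frame-to-frame dips, which are one source of musical noise, are filled in.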

Multi-resolution processing

As codecs do, selecting the right size for each frame would lead to better treatment of transients. A new multiband onset detector would be needed, and the processing cost would be much higher, but it could be optional.

Better psycho-acoustic model or rule to apply it

Right now Virag's method is used without the spectral flooring but with the adapted over-subtraction factors. Some suggestions are proposed in "A single channel speech enhancement technique exploiting human auditory masking properties" that can preserve low level details better.

Expose more key functionality through the API

Hi,

https://www.vocal.com/noise-reduction/musical-noise/ is an interesting read, it'll take me some days to fully understand it. Would be good to get the musicality removed, I think you effectively have this in libspecbleach, maybe some parameters should be exposed.

Look at https://www.youtube.com/watch?v=eRIqabVpltM to see libspecbleach in action, sometimes the musicality is pronounced.

BTW - this is not a complaint!

time for an ABI and SOVERSION?

Hi Luciano,

I was comparing Debian (and derivatives) to macOS the other day, and I was shocked to find that Debian doesn't seem to have any denoising plugins. I quickly found noise-repellent, which led me to its prerequisite libspecbleach, and I did preliminary work packaging it for Debian. Then I ran into a surprising issue that makes me ask:

Is libspecbleach ready to be packaged for a mainstream distribution? If so, is there any reason why it doesn't have an ABI and SOVERSION yet?

Do you think noise-repellent is ready for production use in vocals and voiceover, or would https://github.com/werman/noise-suppression-for-voice be better for now? My concern with noise-suppression-for-voice is that it only supports 48kHz, plus possible DFSG issues with the https://github.com/xiph/rnnoise neural net and/or generating the trained neural net, since I'm not an audio scientist with expensive gear (nor someone with the background to set up the data gathering in a methodologically sound way).

What do you recommend? 😄
Nicholas (sten in the Debian project)