athena-signal's Introduction

Athena-signal processing open source library

What is Athena-signal?

Athena-signal is an open-source implementation of speech signal processing algorithms. It aims to help researchers and engineers who want to use speech signal processing algorithms in their own projects. Athena-signal is mainly implemented in C and is called from Python.

Introduction of athena-signal modules

Currently, athena-signal is composed of the following modules: Acoustic Echo Cancellation (AEC), High Pass Filter (HPF), Direction Of Arrival (DOA), Minimum Variance Distortionless Response (MVDR) beamformer, Generalized Sidelobe Canceller (GSC), Voice Activity Detection (VAD), Noise Suppression (NS), and Automatic Gain Control (AGC).

Detailed description of each module

  • AEC: The core of the AEC algorithm includes time delay estimation, linear echo cancellation, double-talk detection, echo return loss estimation and residual echo suppression.
  • HPF: High-pass filtering is implemented using a cascaded IIR filter. The cut-off frequency is 200 Hz in this program. To generate a high-pass filter with a different cut-off frequency, you can replace the IIR filter coefficients and gains, for example with the help of the filter design toolbox in MATLAB.
  • DOA: The Capon algorithm is used to get the direction of the sound source. The core of the Capon algorithm is the Capon beamformer, also called MVDR. The Capon spectrum is estimated in the frequency domain from the Rxx matrix and the steering vector.
  • MVDR: This is a Minimum Variance Distortionless Response beamformer. You can set the steering vector (loc_phi), which indicates the distortionless response direction, with the help of DOA estimation. The Rnn matrix is estimated using the MCRA noise estimation method. The microphone array can be any shape as long as you set the coordinates of each microphone (mic_coord) beforehand. In a future edition, the steering vector will be estimated by DOA estimation. You can of course also set the steering vector using your own DOA estimation method.
  • GSC: This is a Generalized Sidelobe Canceller beamformer. It is composed of the Fixed Beamformer (FBF), Adaptive Blocking Matrix (ABM) and Adaptive Interference Canceller (AIC) modules.
  • VAD: Voice Activity Detection(VAD) function outputs the current frame speech state based on the result of the double-talk detection.
  • NS: Noise reduction algorithm is based on MCRA noise estimation method. Details can be found in "Noise Estimation by Minima Controlled Recursive Averaging for Robust Speech Enhancement" and "Noise Spectrum Estimation in Adverse Environments: Improved Minima Controlled Recursive Averaging"
  • AGC: Automatic Gain Control(AGC) determines the gain factor based on the current frame signal level and the target level so that the gain of the signal is kept within a reasonable range.
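To make the HPF description above more concrete, here is a minimal sketch of a cascaded IIR high-pass filter with a 200 Hz cut-off. It is not the library's implementation and does not use its coefficients: the biquad coefficients come from the widely used "Audio EQ Cookbook" high-pass formulas, and the 16 kHz sample rate, the number of sections (two) and the Q value are all assumptions made for illustration. The function names are hypothetical.

```python
import math


def highpass_biquad(cutoff_hz, fs_hz, q=1.0 / math.sqrt(2.0)):
    """Coefficients (b0, b1, b2, a1, a2) of one second-order high-pass
    section, normalized so that a0 == 1 (cookbook-style formulas)."""
    w0 = 2.0 * math.pi * cutoff_hz / fs_hz
    cos_w0 = math.cos(w0)
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = (1.0 + cos_w0) / (2.0 * a0)
    b1 = -(1.0 + cos_w0) / a0
    a1 = -2.0 * cos_w0 / a0
    a2 = (1.0 - alpha) / a0
    return b0, b1, b0, a1, a2  # b2 equals b0 for a high-pass section


def cascade_filter(samples, sections):
    """Run the signal through each biquad in turn (direct form I)."""
    out = [float(s) for s in samples]
    for b0, b1, b2, a1, a2 in sections:
        x1 = x2 = y1 = y2 = 0.0
        for n, x in enumerate(out):
            y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
            x2, x1 = x1, x
            y2, y1 = y1, y
            out[n] = y
    return out


def tone(freq_hz, fs_hz, n):
    """Unit-amplitude sine wave of n samples."""
    return [math.sin(2.0 * math.pi * freq_hz * i / fs_hz) for i in range(n)]


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))


FS = 16000                                    # assumed sample rate
SECTIONS = [highpass_biquad(200.0, FS)] * 2   # two cascaded sections (assumed)

# Compare a tone below the cut-off with one well above it,
# skipping the first half of the output to ignore the start-up transient.
low = rms(cascade_filter(tone(50.0, FS, 8000), SECTIONS)[4000:])
high = rms(cascade_filter(tone(2000.0, FS, 8000), SECTIONS)[4000:])
print(f"50 Hz rms: {low:.4f}, 2 kHz rms: {high:.4f}")
```

Changing the cut-off in this sketch only means calling highpass_biquad with a different frequency; in the C code the equivalent step is replacing the stored coefficients and gains.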

Each module has a separate switch that controls whether the module runs. In the code, all modules except AEC and NS are turned off by default.

Athena-signal operating instructions

  1. Set the switch of each module to determine whether it is enabled: when a switch is set to 1 the module runs, and when it is 0 the module is skipped. You generally have to set the number of microphones and reference channels manually, along with the coordinates of each microphone. The coordinates MUST be set when the MVDR, GSC or DOA module is enabled.
  2. You can set array_frm_len, the length of the read and write data arrays; it is 128 by default. ptr_input_data and ptr_ref_data store the microphone signal data and the reference signal data, respectively. The multi-channel microphone signals are stored in the form of parallel input, that is, the data of each channel is sequentially stored in ptr_input_data.
  3. Since MVDR requires the angle of incidence of the sound source, we set it to 90 by default. When the DOA module is enabled, the steering vector will be estimated by DOA estimation. MVDR supports ANY array setups, including circular array and linear array, as long as you set the coordinates of microphones mic_coord beforehand.
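The buffer layout described in step 2 can be sketched as follows. This is only one reading of "the data of each channel is sequentially stored": it assumes a channel-major layout in which one frame of channel 0 is followed by one frame of channel 1, and so on. The helper names pack_frame and channel_view are hypothetical and are not part of athena-signal.

```python
FRAME_LEN = 128  # default array_frm_len according to the text above


def pack_frame(channels, frame_index, frame_len=FRAME_LEN):
    """Concatenate one frame from every channel, channel 0 first.

    `channels` is a list of per-microphone sample lists.
    Assumption: channel-major layout (not confirmed by the source).
    """
    start = frame_index * frame_len
    buf = []
    for samples in channels:
        frame = samples[start:start + frame_len]
        if len(frame) != frame_len:
            raise ValueError("not enough samples for a full frame")
        buf.extend(frame)
    return buf


def channel_view(buf, ch_index, frame_len=FRAME_LEN):
    """Return the slice of the packed buffer that belongs to one channel."""
    return buf[ch_index * frame_len:(ch_index + 1) * frame_len]


# Two synthetic channels, two frames each.
mic0 = list(range(256))
mic1 = [-v for v in mic0]
buf = pack_frame([mic0, mic1], frame_index=1)
print(len(buf), channel_view(buf, 0)[0], channel_view(buf, 1)[0])
```

If the C code actually interleaves samples, only pack_frame and channel_view would need to change; the rest of a caller would stay the same.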

Requirements

  • Python3.x
  • swig
  • numpy
  • setuptools

Installation

Supported environments

  • Linux
  • MacOS
  • Windows

Install from source code

swig -python athena_signal/dios_signal.i
python setup.py bdist_wheel sdist

#For Linux or MacOS
pip install --ignore-installed  dist/athena_signal-0.1.0-*.whl

#For Windows
for /r dist %%i in (athena_signal-*.whl) do pip install --ignore-installed %%i

Execution of example

python examples/athena_signal_test.py

Configuration Settings [Options]

config(dictionary):
    --add_AEC         : If 1, do AEC on  signal.
                        (int, default = 1)
    --add_NS          : If 1, do NS on signal.
                        (int, default = 1)
    --add_AGC         : If 1, do AGC on signal.
                        (int, default = 0)
    --add_HPF         : If 1, do HPF on signal.
                        (int, default = 0)
    --add_BF          : If 1, do MVDR on signal.
                        (int, default = 0)
                      : If 2, do GSC on signal.
                        (int, default = 0)
    --add_DOA         : If 1, do DOA on signal.
                        (int, default = 0)
    --mic_num         : Number of microphones.
                        (int, default = 1)
    --ref_num         : Number of reference channels.
                        (int, default = 1)
mic_coord(array/list): 
    The coordinates of each microphone of the microphone array
    used in MVDR. (A float array/list of size [mic_num, 3] containing
    the three-dimensional coordinates of every microphone.)
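As an illustration of how the options above fit together, the following sketch merges user overrides with the documented defaults and checks the mic_coord shape before anything is passed on. build_config is a hypothetical helper, not part of athena-signal; the default values are copied from the option list above, and the rule that coordinates are required for MVDR, GSC or DOA comes from the operating instructions.

```python
# Defaults as listed in the option table above.
DEFAULTS = {
    "add_AEC": 1, "add_NS": 1, "add_AGC": 0, "add_HPF": 0,
    "add_BF": 0, "add_DOA": 0, "mic_num": 1, "ref_num": 1,
}


def build_config(overrides=None, mic_coord=None):
    """Return a full config dict; raise early on obvious mistakes.

    Hypothetical helper for illustration only.
    """
    config = dict(DEFAULTS)
    for key, value in (overrides or {}).items():
        if key not in DEFAULTS:
            raise KeyError(f"unknown option: {key}")
        config[key] = int(value)

    # add_BF: 0 = off, 1 = MVDR, 2 = GSC
    if config["add_BF"] not in (0, 1, 2):
        raise ValueError("add_BF must be 0, 1 (MVDR) or 2 (GSC)")

    # Coordinates are required when MVDR, GSC or DOA is enabled.
    if config["add_BF"] != 0 or config["add_DOA"] == 1:
        if mic_coord is None:
            raise ValueError("mic_coord is required for MVDR, GSC or DOA")
        if len(mic_coord) != config["mic_num"]:
            raise ValueError("mic_coord must have mic_num rows")
        if any(len(point) != 3 for point in mic_coord):
            raise ValueError("each microphone needs 3 coordinates")
    return config


cfg = build_config({"add_AEC": 0, "add_BF": 2, "mic_num": 2},
                   mic_coord=[[0.0, 0.0, 0.0], [0.05, 0.0, 0.0]])
print(cfg)
```

Validating in Python before calling into the C library is a design choice of this sketch; it gives a readable error instead of a failure inside the extension module.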

Usage

from athena_signal.dios_ssp_api import athena_signal_process

#Test AEC
input_file = ["examples/0841-0875_env7_sit1_male_in.pcm"]
ref_file = ["examples/0841-0875_env7_sit1_male_ref.pcm"]
out_file = ["examples/0841-0875_env7_sit1_male_out.pcm"]
config = {'add_AEC': 1, 'add_BF': 0}
athena_signal_process(input_file, out_file, ref_file, config)

# Test BF
input_file = ["examples/m0f60_5cm_1_mix.pcm",
              "examples/m0f60_5cm_2_mix.pcm",
              "examples/m0f60_5cm_3_mix.pcm",
              "examples/m0f60_5cm_4_mix.pcm",
              "examples/m0f60_5cm_5_mix.pcm",
              "examples/m0f60_5cm_6_mix.pcm"]
out_file = ["examples/m0f60_5cm_mvdr_out.pcm"]
config = {'add_AEC': 0, 'add_BF': 1, 'add_DOA': 1, 'mic_num': 6}
mic_coord = [[0.05, 0.0, 0.0],
             [0.025, 0.0433, 0.0],
             [-0.025, 0.0433, 0.0],
             [-0.05, 0.0, 0.0],
             [-0.025, -0.0433, 0.0],
             [0.025, -0.0433, 0.0]]
athena_signal_process(input_file, out_file, config=config, mic_coord=mic_coord)
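The mic_coord values in the BF example are consistent with six microphones spaced evenly on a circle of radius 0.05 m in the z = 0 plane. The sketch below regenerates them; circular_array is a hypothetical helper (not part of athena-signal), and the four-decimal rounding is chosen only to match the precision used in the example.

```python
import math


def circular_array(mic_num, radius_m, decimals=4):
    """Coordinates of mic_num microphones evenly spaced on a circle.

    Microphone 0 sits on the positive x axis; the rest follow
    counter-clockwise. All microphones lie in the z = 0 plane.
    """
    coords = []
    for k in range(mic_num):
        angle = 2.0 * math.pi * k / mic_num
        # "+ 0.0" turns a rounded -0.0 into 0.0 for cleaner output.
        x = round(radius_m * math.cos(angle), decimals) + 0.0
        y = round(radius_m * math.sin(angle), decimals) + 0.0
        coords.append([x, y, 0.0])
    return coords


mic_coord = circular_array(6, 0.05)
for row in mic_coord:
    print(row)
```

The resulting list can be passed as the mic_coord argument in place of the hand-written table.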

Contributing

Any contribution is welcome. All issues and pull requests are highly appreciated. If you have any questions when using athena-signal, or any ideas to make it better, please feel free to contact us.

Acknowledgement

Athena-signal is built with the help of some open-source repos such as WebRTC, speex, etc.

athena-signal's People

Contributors

songhui5561, tjadamlee

athena-signal's Issues

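Is looping over ref_num unnecessary when doing residual echo suppression?

In dios_ssp_aec_process_api, the two RES steps (lines 506-517 and lines 534-543) call dios_ssp_aec_res_process inside a loop over ref_num. This seems unnecessary because the inputs are the same on every iteration: est_echo already sums the contributions of all reference channels. In your RES design, the result should not depend on the number of reference channels.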

Questions about the subband code (2)

Is the subband implementation in the code based on Version 4 in the document below, with D=128, M=256, L=768? Is this an NPR filter bank?
http://www.ws.binghamton.edu/fowler/fowler%20personal%20page/EE521_files/IV-08%20Uniform%20DFT%20Filter%20Bank_2007.pdf

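In Version 3 of the document, the analysis stage uses window coefficients w[i] and the synthesis stage uses 1/w[i]. Why is Version 4 not done this way? Does it use a recovery coefficient instead?
How is subband_filter_coef generated? Can any common window function (for example Hanning, Hamming or Kaiser) be used, or does it have to be designed to satisfy some nonlinear NPR equations? Is there a reference document? I have read many documents and have not seen anyone do it the way you do; the ones I found all solve a nonlinear problem with a pile of formulas.

I watched the filter bank part of the video carefully, but it does not go very deep, and I have read many documents without fully understanding this part.

Thank you.

GSC output is missing high frequencies

First of all, thank you for your team's work. In testing, I found that the GSC output is missing high-frequency content. Is there a way to improve this?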

I want to change frame size to 10ms buffer, what should I do please?

Dear friends:
1. I notice that you currently only support an 8 ms processing buffer. I want to change to a 10 ms buffer. I tried changing some parameters, but the program crashes and the output is all zeros. What should I change for a 10 ms buffer?
2. I notice that you define subband_filter_coef[] in dios_ssp_share_subband.c, but I do not know how you obtained this filter. Could you share the algorithm? It is not just a Hanning or Hamming window. I think I have to change it if I want to switch to a 10 ms buffer, correct?

Thanks

'config = { "add_AGC": 1}' error

#Test AGC
input_file = ["agc_testaudio3.wav"]
out_file = ["agc_testaudio3_athenaout.wav"]
# config = { "add_AGC": 1, "add_NS":0, 'add_AEC': 0}
config = { "add_AGC": 1}
athena_signal_process(input_file, out_file, config)

I want to test the AGC algorithm in 'athena_signal_test.py', as shown above.
However, an error occurred:

[screenshot of the error message]

Any advice to solve this problem?
Looking forward to your reply.

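Questions about the subband code

  1. What is the purpose of the scale variable in struct objSubBand in dios_ssp_share_subband.h?
    Is it the recovery coefficient of the window function? The code in dios_ssp_share_subband.c uses a flat-top window. From what I have read, the amplitude recovery coefficient of a flat-top window is 1.110, which does not seem to match. Is this value related to the window function used?

Do not understand some of the MVDR code

I found that the C code does not match the formula. The function dios_ssp_mvdr_cal_weights_adpmvdr computes w, but I only see R^-1; I do not see vk, vk^H, etc. Does it use a different formula?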

DOA's angle is always 0 degree

Dear friends:
I tried your DOA feature, and every reported DOA is 0 degrees. I also tried my own data, with the same result. I think there may be a problem. Could you check your code, please?
Thanks

Segmentation fault (core dumped)

When I try to run the script in example/, it says segmentation fault. Can anyone help me out? Thanks a lot!
The entire output is like below:

ubuntu@ip-172-31-21-152:~/Judy/athena-signal/examples$ python3 athena_signal_test.py
#################################################
The configurations are: add_AEC: 1, add_NS: 1, add_AGC: 0, add_HPF: 0, add_BF: 0, add_DOA: 0
The number of microphones is: 1
The number of reference channels is: 1
#################################################
Segmentation fault (core dumped)

athena signal AEC

Hi All
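I found that the athena-signal AEC is similar to the WebRTC AEC. What are the differences between them, and which parts have been improved? Thanks.

DOA estimate tends to jump

Using data from a triangular 3-microphone array, the DOA estimate tends to jump between the correct direction and the diagonally opposite direction. For example, when the correct direction is 60 degrees and the source does not move, the estimate occasionally points to 240 degrees; that is, it tends to switch between angle a and (a+180). What is the cause? Thank you!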

Some questions about the MMSE gain

Thanks for sharing the noise reduction code. There are some things in the noise reduction module (in the file dios_ssp_ns_api.c) that confuse me.

In the following code from dios_ssp_ns_api.c (computing the MMSE gain), can you tell me the meaning of the variable "pSAP", or where I can find it in a research paper?

Thanks very much!

for (i = 0; i < srv->m_sp_size; ++i )
{
    vk = srv->m_sp_snr[i]*srv->m_gammak[i]/(1+srv->m_sp_snr[i]);
    j00 = first_modified_Bessel( 0, vk/2 );
    j11 = first_modified_Bessel( 1, vk/2 );
    tmpC = (float)exp( -0.5*vk );
    if (srv->m_gammak[i] < 1.0e-3)
    {
        tmpA = 0; // Limitation
    }
    else
    {
        tmpA = (float)sqrt(PI) / 2 * (float)pow( vk, 0.5 ) * tmpC / srv->m_gammak[i] ;
    }
    tmpB = (1+vk)*j00+vk*j11;
    evk = (float)exp( vk );
    Lambda = (1-0.3f) / 0.3f * evk / ( 1+srv->m_sp_snr[i] );
    pSAP = Lambda/(Lambda+1 );

    tmp = tmpA*tmpB*pSAP;
    srv->m_gain[i] = tmp;
}

Streaming interface

Is it possible to provide an interface to Alsa or Pulseaudio that will feed a multichannel pcm into pcm streams?

So you can stream direct from microphone and output for ref channel?

ImportError: cannot import name '_dios_signal' from 'athena_signal'

python3 examples/athena_signal_test.py succeeded
But in the interactive environment of Python3, the following error occurs

from athena_signal.dios_ssp_api import athena_signal_process
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/xxxx/athena-signal/athena_signal/__init__.py", line 18, in <module>
    from athena_signal import dios_ssp_api
  File "/Users/xxxx/athena-signal/athena_signal/dios_ssp_api.py", line 19, in <module>
    from athena_signal.dios_signal import dios_ssp_v1
  File "/Users/xxxx/athena-signal/athena_signal/dios_signal.py", line 13, in <module>
    from . import _dios_signal
ImportError: cannot import name '_dios_signal' from 'athena_signal' (/Users/xxxx/athena-signal/athena_signal/__init__.py)

sh pack.sh debian 10 Arm64

Installing collected packages: athena-signal
Successfully installed athena-signal-0.1.0
Traceback (most recent call last):
  File "/home/pi/athena-signal/athena_signal/dios_signal.py", line 14, in swig_import_helper
    return importlib.import_module(mname)
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
  File "<frozen importlib._bootstrap>", line 983, in _find_and_load
  File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
ModuleNotFoundError: No module named 'athena_signal._dios_signal'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "examples/athena_signal_test.py", line 19, in <module>
    from athena_signal.dios_ssp_api import athena_signal_process
  File "/home/pi/athena-signal/athena_signal/__init__.py", line 18, in <module>
    from athena_signal import dios_ssp_api
  File "/home/pi/athena-signal/athena_signal/dios_ssp_api.py", line 19, in <module>
    from athena_signal.dios_signal import dios_ssp_v1
  File "/home/pi/athena-signal/athena_signal/dios_signal.py", line 17, in <module>
    _dios_signal = swig_import_helper()
  File "/home/pi/athena-signal/athena_signal/dios_signal.py", line 16, in swig_import_helper
    return importlib.import_module('_dios_signal')
  File "/usr/lib/python3.7/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named '_dios_signal'

Is the "second noise estimator" procedure in noiselevel_process (dios_ssp_share_noiselevel.c) necessary?

At line 119 of dios_ssp_share_noiselevel.c, srv->noise_level_first is updated only when in_energy is low enough, so srv->noise_level_first is supposed to be lower than srv->noise_level_second, which is updated under all conditions.

But at line 151, in_energy is compared to 20.0f * srv->noise_level_second and 20.0f * srv->noise_level_first separately.

If srv->noise_level_first < srv->noise_level_second is always true, can this simply be written as
if ((in_energy > 20.0f * srv->noise_level_second) )

So what is the real intention of updating srv->noise_level_first when we already have srv->noise_level_second? Maybe there is a good trick I have not realized yet. Thanks.

Stereo AEC

Dear Athena:

  1. What is the purpose of supporting multiple reference channels? In what situation would they be used?
  2. Your code in dios_ssp_aec_api contains:
    for (i_mic = 0; i_mic < srv->mic_num; i_mic++){
        for (i_ref = 0; i_ref < srv->ref_num; i_ref++){
        }
    }
    If I set ref_num=2, can I use it for stereo AEC? Is that reasonable?

Some questions about ABM processing in the GSC module

Thanks for sharing the GSC processing code! There is some code in "dios_ssp_gsc_abm.c" that confuses me.

line 338 in the function "dios_ssp_gsc_gscabm_processonedatablock"
/* compute error signal in time-domain with circular convolution constraint e = [0 | new] */
for (i = 0; i < gscabm->fftsize / 2; i++)
{
    gscabm->e[gscabm->fftsize / 2 + i] = gscabm->xrefdline[i] - gscabm->ytmp[gscabm->fftsize / 2 + i];
}
After checking the code carefully, I found that "gscabm->e" is the output of the ABM module, "gscabm->xrefdline" comes from the fixed beamformer output, and "gscabm->ytmp" is the ABM filter output with the steering output as its input.

However, I think that "gscabm->xrefdline" should come from the steering output, and "gscabm->ytmp" should be the ABM filter output with the fixed beamformer output as its input. Maybe I got it wrong; please help me understand it. The reference paper is "A robust adaptive beamformer for microphone arrays with a blocking matrix using constrained adaptive filters".

Thanks very much!

Some questions about "dios_ssp_matrix_inv_process"

Hello, I tested the function "dios_ssp_matrix_inv_process" using the following code:

#include "dios_ssp_share_cinv.h"
int main(int argc, char **argv)
{
    float R[18] = {
        258919244, 0.000000000000000, -683065112, 0.000000000000000, 123292480, 0.000000000000000,
        -683065112, 0.000000000000000, 1802021921,0.000000000000000, -325262900, 0.000000000000000,
        123292480, 0.000000000000000, -325262900, 0.000000000000000, 58709685, 0.000000000000000,
    };
    float Rinv[18] = { 0.0 };
    void *handle = dios_ssp_matrix_inv_init(3);
    dios_ssp_matrix_inv_process(handle, R, Rinv);
    return 0;
}

However, the resulting inverse matrix differs from the one computed with the "inv" function in MATLAB. Is there anything we should pay special attention to? I ask because when I use the MVDR MATLAB code (identical to the C code except for the "inv" function), the final noise reduction performance is degraded.

Look forward to your reply, Thanks!

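What is the latency of the AEC algorithm itself?

Hi
I would like to ask what the latency of the AEC algorithm itself is. With the example audio file, I estimated that the near-end speech appears roughly 175 ms later after AEC processing. How can this 175 ms be shortened?

Thanks

How to evaluate the quality of the algorithms?

From the perspective of speech quality assessment, there are two kinds of objective evaluation methods: (1) quality assessment with a reference and (2) quality assessment without a reference.
PESQ is a method that requires a reference; it can probably only be used in the testing and simulation stage.
P.563 is a no-reference method; it should be usable in real scenarios, but I am not sure whether it is suitable for evaluating this kind of speech enhancement algorithm. Or are there other methods for evaluating front-end signal processing algorithms in real scenarios?
@songhui5561
Thank you!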

Want to know the algorithm of AEC and DTD

Hi dear athena developer,
I'm a junior learner of audio processing and I want to learn about AEC. Could you tell me which AEC and DTD algorithms athena-signal uses? Which papers can I refer to?

there is memory leak in ptr_doa->m_doa_fid

In the file
/kernels/dios_ssp_doa/dios_ssp_doa_api.c
line 125: ptr_doa->m_doa_fid = (int*)calloc(ptr_doa->m_angle_num, sizeof(int));

Only ptr_doa->m_angle_num elements are allocated, but the buffer is used as follows:

lines 169 to 172:
for(i = 0; i < ptr_doa->m_frq_bin_num; ++i)
{
    ptr_doa->m_doa_fid[i] = ptr_doa->m_low_fid + (i*ptr_doa->m_frq_sp*ptr_doa->m_fft_size)/ptr_doa->m_fs;
}

When ptr_doa->m_frq_bin_num is larger than ptr_doa->m_angle_num,
the loop writes past the end of the allocated buffer, which risks memory corruption.

there are bugs when add_BF==2

athena_signal/dios_signal.h
line 24
int dios_v1(int argc, char **argv, int *fe_switch, size_t m, float *mic_coord, size_t n, int mic_num, int ref_num, float loc_phi);

should be
int dios_ssp_v1(int argc, char **argv, int *fe_switch, size_t m, float *mic_coord, size_t n, int mic_num, int ref_num, float loc_phi);

athena_signal/dios_ssp_api.py
line 88
if self.feature_switch[4] == 1 and mic_coord is not None:

should be
if self.feature_switch[4] >= 1 and mic_coord is not None:

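References for the AEC

Hello, could you tell me which paper or papers the AEC implementation in this repo is based on? Thanks.

A detail I do not understand in MVDR beamforming

Hello, in dios_ssp_mvdr_header.c, when the frequency-domain result is passed through the IFFT, there is the following code:
ptr_mvdr->fft_in[0] = ptr_mvdr->m_mvdr_out_re[0];
ptr_mvdr->fft_in[ptr_mvdr->m_fft_size / 2] = ptr_mvdr->m_mvdr_out_re[ptr_mvdr->m_fft_size / 2];
for (i = 1; i < ptr_mvdr->m_fft_size / 2; i++)
{
    ptr_mvdr->fft_in[i] = ptr_mvdr->m_mvdr_out_re[i];
    ptr_mvdr->fft_in[ptr_mvdr->m_fft_size - i] = -ptr_mvdr->m_mvdr_out_im[i];
}
dios_ssp_share_irfft_process(ptr_mvdr->mvdr_fft, ptr_mvdr->fft_in, ptr_mvdr->m_win_data);
From the code, the first half of ptr_mvdr->fft_in holds only the real part m_mvdr_out_re, and the second half holds only the imaginary part -ptr_mvdr->m_mvdr_out_im. Why is it laid out this way? My understanding is that the second half normally holds the conjugate of the first half. I hope an expert can answer!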
