mtfaa-net's Introduction

Last update time: 07-20-2022

Hi, I'm Shimin Zhang (张是民)

  • 📕 Research interests: Speech Enhancement (including acoustic echo cancellation, noise suppression, target speaker extraction)
  • 📫 How to reach me: [email protected]

mtfaa-net's People

Contributors

echocatzh, jzi040941

mtfaa-net's Issues

question about the network

Thanks for your code. One thing still confuses me: the input to the U-Net structure is the magnitude produced by the phase encoder, but the U-Net outputs a two-stage mask, one stage being a magnitude mask and the other a combined phase and magnitude mask. Since no phase information is fed into the U-Net, how can it estimate a correct phase mask? Or does the phase encoder output, although it is a magnitude, still carry phase information?

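Some questions about the code

Hello! First of all, thank you very much for contributing this project code; it has been a great help to me in learning AEC. I am new to this field, so there are a few places in your code that I do not quite understand and would like to ask about. If you could find time in your busy schedule to reply, I would be very grateful!
My first question: the code has this comment on sigs: "sigs: list [B N] of len(sigs)". What do B and N refer to here? Do they refer to the three bands 0-8 kHz, 8-16 kHz, and 16-48 kHz? Or is N perhaps the length of the audio? In an earlier issue someone mentioned 3 channels; does that mean audio channels?
My second question: the model code contains the comment "# D / E ?". Does it mean choosing between deep noise suppression and echo cancellation?
I look forward to your reply!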

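A question about lookahead

The paper states that MTFAA-Net-Streaming has a lookahead of 40 ms, but I could not find where this is reflected in the code. The frame shift in the paper is 8 ms, so 5 future frames should be used. Where does this appear in the code?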
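
As a quick check of the arithmetic in this question: with an 8 ms frame shift, 40 ms of lookahead corresponds to 5 future frames. The sketch below only illustrates that calculation; `lookahead_frames` is a hypothetical helper and is not taken from the repository.

```python
def lookahead_frames(lookahead_ms: float, hop_ms: float) -> int:
    """Number of whole future frames covered by a given lookahead."""
    if hop_ms <= 0:
        raise ValueError("hop_ms must be positive")
    return int(lookahead_ms // hop_ms)


# Values quoted in the issue: 40 ms lookahead, 8 ms frame shift.
n_future = lookahead_frames(40, 8)
print(n_future)  # 5
```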

Training with custom data

Hi,

How can I train a model with my own dataset? Where can I find sample usage for training the model?

Thank you

Did you normalize signals when you calculate loss?

Hi, thanks for your great work.
I find that the loss hardly decreases when I train your MTFAA. I do not normalize the signals when I calculate the loss.
Maybe I should normalize the signals as in "Data augmentation and loss normalization for deep noise suppression". I would like to know how you calculate the loss.
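
One possible reading of "normalizing signals" here is to scale both the estimate and the target by the target's level before computing the loss, so that loud and quiet utterances contribute comparably. The following is a minimal sketch under that assumption, using a plain MSE and the target's standard deviation as the scale. It is not confirmed to be the repository author's loss or the exact scheme of the cited paper; `normalized_mse` is a hypothetical name.

```python
import numpy as np


def normalized_mse(estimate, target, eps=1e-8):
    """MSE after dividing both signals by the target's per-utterance std.

    estimate, target: arrays of shape [B, N] (batch, samples).
    The std-based scale is an assumption made for illustration only.
    """
    estimate = np.asarray(estimate, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    scale = np.std(target, axis=-1, keepdims=True) + eps
    return float(np.mean(((estimate - target) / scale) ** 2))


rng = np.random.default_rng(0)
target = rng.standard_normal((2, 1000))
estimate = target + 0.1 * rng.standard_normal((2, 1000))

# The loss is (almost) unchanged when both signals are attenuated together.
loud = normalized_mse(estimate, target)
quiet = normalized_mse(0.01 * estimate, 0.01 * target)
print(loud, quiet)
```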

real performance

Did you test the model's performance on real AEC data? Also, what are the FLOPs and the parameter count?

CUDA out of memory when using the network to train

Hello,

First of all, thank you for providing the implementation. It was very helpful for understanding the paper.

I have one question, though. When I try to train the network using 30-second 48 kHz audio, I always run into a CUDA out-of-memory error, even if the batch size is set to 1. Have you seen that in your experiments, or do you have any advice?

Anything will be greatly appreciated!
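
A common way to lower memory use with long recordings is to train on shorter random segments instead of the full 30 seconds. The sketch below cuts the same random window out of several time-aligned signals (for example microphone, far-end reference, and target). The 4-second segment length and the helper name `random_crop` are assumptions for illustration, not settings taken from the repository or the paper.

```python
import numpy as np


def random_crop(signals, segment_len, rng):
    """Cut the same random window out of several time-aligned 1-D signals."""
    n = min(len(s) for s in signals)
    if n <= segment_len:
        # Too short to crop: just trim all signals to the common length.
        return [np.asarray(s[:n]) for s in signals]
    start = int(rng.integers(0, n - segment_len + 1))
    return [np.asarray(s[start:start + segment_len]) for s in signals]


sr = 48000
rng = np.random.default_rng(0)
mic = np.arange(30 * sr, dtype=np.float32)   # stand-in for a 30 s recording
ref = mic + 1.0                              # stand-in for an aligned reference

mic_seg, ref_seg = random_crop([mic, ref], 4 * sr, rng)
print(mic_seg.shape, ref_seg.shape)
```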

erb.py reports an error: expected np.ndarray (got tuple)

The report is as follows:

  File "E:/code_paper/MTFAA-Net-main/erb.py", line 24, in __init__
    filter = th.from_numpy(filter).float()
TypeError: expected np.ndarray (got tuple)

The error occurred on line 24 of erb.py.
filter = th.from_numpy(filter).float()
"filter" is a tuple with two members.

My Python version is 3.9.12.
My spafe version is 0.2.0.
My torch version is 1.12.1.

I hope you can help me.
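
The traceback suggests that, with this spafe version, the filter-bank call used in erb.py returns a tuple rather than a bare array. A possible workaround is to unwrap the tuple before converting it to a tensor. The sketch below assumes the filter-bank matrix is the first element of that tuple, which should be checked against the installed spafe version; `as_filterbank_array` is a hypothetical helper and is not part of spafe or the repository.

```python
import numpy as np


def as_filterbank_array(result):
    """Return the filter-bank matrix whether `result` is an array or a tuple.

    Assumption: when a tuple is returned, the matrix is its first element.
    """
    if isinstance(result, tuple):
        result = result[0]
    return np.asarray(result, dtype=np.float32)


# Stand-in for a two-member tuple like the one described in the report.
fake_result = (np.ones((3, 5)), np.arange(3))
fb = as_filterbank_array(fake_result)
print(fb.shape, fb.dtype)
```

With such a helper, line 24 of erb.py could be changed to something like `filter = th.from_numpy(as_filterbank_array(filter))`.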

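On connecting the LAEC to the model

Hello, and thank you for open-sourcing such an excellent project. The paper says that "introducing additional conditional information from the LAEC can further improve the model's performance on the echo task. However, if the LAEC is simply connected to the model, the distortion introduced by the LAEC degrades the system's performance." I have a few questions:
1. The model takes three inputs, right: the mixed audio, the LAEC data, and the far-end data?
2. What does the additional conditional information applied to the LAEC output refer to? I do not quite understand this. Does connecting the LAEC directly to the model mean feeding the LAEC output to the model together with the mixed audio and the far-end data? I would appreciate your guidance.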

License

Hi @echocatzh

I think many people would like to use your awesome work for both commercial and non-commercial purposes.
It would be great for those users, and of course for you as well, if you could add an explicit license.
Would you be able to add a license file?
