Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)
Home Page: https://arxiv.org/abs/2306.00160
License: MIT License
avlit's Issues
The 'tests/test_avlit.py' file mentioned in the README is missing.
Thank you for the awesome work.
Is it possible to separate the input audio mixture without video frames (for example, by setting the video path to null)?
Thank you
Can you please share the script to generate NTCD-TIMIT and LRS3+WHAM! datasets?
Thanks
How did you implement the loss function?
In the model configuration I see:
video_encoder_checkpoint = "path/to/ae.ckpt",
What is it? Where can I get this ae.ckpt file?
Can you provide your pretrained model, please?