will-rice / denoisers
Simple PyTorch Denoisers for Waveform Audio
License: Apache License 2.0
I'm on macOS 13 with Python 3.11, ffmpeg 6.1.1, and torch and torchaudio 2.2.0. The audio file is almost 21 minutes long and is extracted from this video: https://www.youtube.com/watch?v=8Wdz1Tj5084. I tried both mp3 and wav versions. Your Gradio demo errors out on the file too.
This is the full error:
Traceback (most recent call last):
File "/Users/james/denoisers/test.py", line 20, in <module>
clean_chunk = model(audio_chunk[None]).audio
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/denoisers/modeling/waveunet/model.py", line 156, in forward
noise = self.model(inputs)
^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/denoisers/modeling/waveunet/model.py", line 234, in forward
out = self.in_conv(inputs)
^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 310, in forward
return self._conv_forward(input, self.weight, self.bias)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/james/.local/share/virtualenvs/denoisers-4WN3pNTX/lib/python3.11/site-packages/torch/nn/modules/conv.py", line 306, in _conv_forward
return F.conv1d(input, weight, bias, self.stride,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: Given groups=1, weight of size [24, 1, 15], expected input[1, 2, 163840] to have 1 channels, but got 2 channels instead
The code is taken from the project README.md:
import torch
import torchaudio
from denoisers import WaveUNetModel
from tqdm import tqdm
model = WaveUNetModel.from_pretrained("wrice/waveunet-vctk-24khz")
audio, sr = torchaudio.load("noisy_audio.wav")
if sr != model.config.sample_rate:
    audio = torchaudio.functional.resample(audio, sr, model.config.sample_rate)
chunk_size = model.config.max_length
print(model.config)
padding = abs(audio.size(-1) % chunk_size - chunk_size)
padded = torch.nn.functional.pad(audio, (0, padding))
clean = []
for i in tqdm(range(0, padded.shape[-1], chunk_size)):
    audio_chunk = padded[:, i:i + chunk_size]
    with torch.no_grad():
        clean_chunk = model(audio_chunk[None]).audio
    clean.append(clean_chunk.squeeze(0))
denoised = torch.concat(clean, 1)[:, :audio.shape[-1]]
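The RuntimeError above reports a weight of size [24, 1, 15] and an input of shape [1, 2, 163840], which suggests the model's first convolution takes one channel while the loaded file has two. A possible workaround, not confirmed by the maintainer, is to downmix to mono before chunking. The sketch below is a minimal illustration under that assumption: `to_mono` and `denoise_in_chunks` are hypothetical helper names, and `denoise_fn` stands in for something like `lambda x: model(x).audio`. It also swaps the README's padding expression for `(-n) % chunk_size`, because `abs(n % chunk_size - chunk_size)` adds a whole extra chunk when the length is already a multiple of `chunk_size` (harmless, since the tail is trimmed, but wasted work).

```python
import torch


def to_mono(audio: torch.Tensor) -> torch.Tensor:
    """Average a (channels, samples) tensor down to shape (1, samples)."""
    return audio.mean(dim=0, keepdim=True)


def denoise_in_chunks(audio, chunk_size, denoise_fn):
    """Apply denoise_fn to fixed-size chunks of a (1, samples) tensor.

    denoise_fn is a hypothetical stand-in for the model call; it is
    assumed to take and return a tensor of shape (1, 1, chunk_size).
    """
    num_samples = audio.size(-1)
    # Pad only up to the next multiple of chunk_size (0 if already aligned).
    padding = (-num_samples) % chunk_size
    padded = torch.nn.functional.pad(audio, (0, padding))
    clean = []
    for start in range(0, padded.size(-1), chunk_size):
        chunk = padded[:, start:start + chunk_size]
        with torch.no_grad():
            clean.append(denoise_fn(chunk[None]).squeeze(0))
    # Drop the padded tail so the output length matches the input.
    return torch.cat(clean, dim=1)[:, :num_samples]


stereo = torch.randn(2, 1000)   # stand-in for a two-channel file
mono = to_mono(stereo)          # one channel, same number of samples
out = denoise_in_chunks(mono, 256, lambda x: x)  # identity in place of the model
print(mono.shape, out.shape)
```

Averaging the channels discards stereo information; if that matters, another option is to run each channel through the same loop separately and stack the results.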
@will-rice, I just got this error. I attached my audio file:
[p232.wav.zip](https://github.com/will-rice/denoisers/files/14550543/p232.wav.zip)
code:
import torch
import torchaudio
from denoisers import WaveUNetModel
from tqdm import tqdm
model = WaveUNetModel.from_pretrained("wrice/waveunet-vctk-24khz")
audio, sr = torchaudio.load("/content/p2325.wav")
if sr != model.config.sample_rate:
    audio = torchaudio.functional.resample(audio, sr, model.config.sample_rate)
chunk_size = model.config.max_length
padding = abs(audio.size(-1) % chunk_size - chunk_size)
padded = torch.nn.functional.pad(audio, (0, padding))
clean = []
for i in tqdm(range(0, padded.shape[-1], chunk_size)):
    audio_chunk = padded[:, i:i + chunk_size]
    with torch.no_grad():
        clean_chunk = model(audio_chunk[None]).audio
    clean.append(clean_chunk.squeeze(0))
denoised = torch.concat(clean, 1)[:, :audio.shape[-1]]
error:
0%| | 0/1 [00:01<?, ?it/s]
AttributeError Traceback (most recent call last)
22 # a=model(audio_chunk[None])
23 # print(a)
---> 24 clean_chunk = model(audio_chunk[None]).output.audio
25 # clean_chunk = model(audio_chunk[None]).audio
26 clean.append(clean_chunk.squeeze(0))
AttributeError: 'WaveUNetModelOutputs' object has no attribute 'output'
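Note that the posted code reads `.audio` from the model output, while the line flagged in the traceback reads `.output.audio`. The error message says the output object has no `output` attribute, so the extra `.output` hop appears to be what fails. The sketch below reproduces that behaviour with a hypothetical stand-in class (`StandInOutputs` is not part of the library; only the `audio` field is assumed, based on the README snippet).

```python
from dataclasses import dataclass


@dataclass
class StandInOutputs:
    """Hypothetical stand-in for the model's output object.

    Only an `audio` field is assumed here, mirroring the README usage.
    """

    audio: list


def get_audio(outputs):
    # Read the field directly, as the README snippet does.
    return outputs.audio


outputs = StandInOutputs(audio=[0.0, 0.1])
print(get_audio(outputs))

try:
    # Mirrors the failing line: there is no intermediate `output` attribute.
    outputs.output.audio
except AttributeError as exc:
    print(exc)
```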
!pip install fastapi
!pip install kaleido
!pip install python-multipart
!pip install cohere
!pip install openai
!pip install tiktoken
ImportError Traceback (most recent call last)
in <cell line: 3>()
1 import torch
2 import torchaudio
----> 3 from denoisers import WaveUNetModel
4 from tqdm import tqdm
11 frames
/usr/local/lib/python3.10/dist-packages/pytorch_lightning/core/saving.py in
44 from omegaconf import OmegaConf
45 from omegaconf.dictconfig import DictConfig
---> 46 from omegaconf.errors import UnsupportedValueType, ValidationError
47
48 # the older shall be on the top
ImportError: cannot import name 'UnsupportedValueType' from 'omegaconf.errors' (/usr/local/lib/python3.10/dist-packages/omegaconf/errors.py)
I hope you can provide a fix, as I need immediate help denoising my audio clips for an ASR project.
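The ImportError above is raised inside pytorch_lightning while it imports names from `omegaconf.errors`, which suggests (an assumption, not something confirmed in this thread) that the installed omegaconf version does not match what that pytorch_lightning version expects. A first diagnostic step could be to list the installed versions of the packages involved. The sketch below uses only the standard library; `installed_versions` is a hypothetical helper name, and the package list is a guess at the relevant distributions.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names are assumptions based on the modules in the traceback.
report = installed_versions(["omegaconf", "pytorch-lightning", "denoisers"])
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Including this output in a bug report makes it easier for a maintainer to reproduce a dependency conflict in a shared environment such as Colab.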