playvoice / grad-svc

Diffusion Singing Voice Conversion based on Grad-TTS from Huawei

Home Page: https://huggingface.co/spaces/maxmax20160403/grad-svc

License: MIT License

Python 100.00%

diff-svc diffusion svc vits-svc voice-change grad-tts vits vits2 flow-matching

grad-svc's People

maxmax2016 splinter21 muruganr96 wendongj diiogofernands ishine youngjundev2 lokshaw-chau nactemha guangkechen awas666 techthiyanes seanbackstrom zhaopufeng kdrkdrkdr

grad-svc's Issues

training error

python gvc_trainer.py

Numbers of GPU : True
Initializing logger...
Initializing data loaders...
----------131----------
----------10----------
Initializing model...
/Users/workstation/Music/Grad-SVC/Grad-SVC/lib/python3.8/site-packages/torch/nn/utils/weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Number of encoder parameters = 16.99m
Number of decoder parameters = 16.87m
Start from Grad_SVC pretrain model: grad_pretrain/gvc.pretrain.pth
Initializing optimizer...
Logging test batch...
Traceback (most recent call last):
File "gvc_trainer.py", line 30, in <module>
train(hps, args.checkpoint_path)
File "/Users/workstation/Music/Grad-SVC/grad_extend/train.py", line 72, in train
logger.add_image(f'image_{i}/ground_truth', plot_tensor(mel.squeeze()),
File "/Users/workstation/Music/Grad-SVC/grad_extend/utils.py", line 59, in plot_tensor
data = save_figure_to_numpy(fig)
File "/Users/workstation/Music/Grad-SVC/grad_extend/utils.py", line 48, in save_figure_to_numpy
data = data.reshape(fig.canvas.get_width_height()[::-1] + (3,))
ValueError: cannot reshape array of size 4320000 into shape (300,1200,3)
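
The numbers in this error are informative: 4320000 is exactly four times 300 × 1200 × 3, which is what you would see if the canvas were rendered at twice the logical width and height (for example on a HiDPI display with an interactive Matplotlib backend; that cause is an assumption, not something the log confirms). A minimal sketch of the arithmetic, with `infer_pixel_ratio` as a hypothetical helper name:

```python
import math


def infer_pixel_ratio(buffer_size, width, height, channels=3):
    """Return the integer scale factor between the logical canvas size
    and the size of the pixel buffer that was actually rendered."""
    logical = width * height * channels
    if buffer_size % logical:
        raise ValueError("buffer size is not a multiple of the logical size")
    ratio = math.isqrt(buffer_size // logical)
    if ratio * ratio * logical != buffer_size:
        raise ValueError("buffer is not a uniformly scaled canvas")
    return ratio


# Values taken from the traceback above: logical canvas is 1200 x 300.
ratio = infer_pixel_ratio(4320000, width=1200, height=300)
print(ratio, (300 * ratio, 1200 * ratio, 3))  # 2 (600, 2400, 3)
```

If this is the cause, reshaping to the scaled shape, or rendering with a non-interactive backend such as Agg, would be the usual ways around it.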

Does SVS work with English lyrics?

Something wrong with the decoder

I adapted the Korean HuBERT model to Grad-SVC and trained the model, but the exported audio sounds wrong and the generated decoder image looks wrong.

What is the best version?

There are a variety of Grad-SVC versions (V3 CFM, V3 CFM RoPE, V2 96, etc.). Which one is the best?

Runtime error

This is the error that I get when trying to do the "Training file debugging" step.

`Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 122, in spawn_main
exitcode = _main(fd, parent_sentinel)
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 131, in _main
prepare(preparation_data)
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 246, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 297, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
^^^^^^^^^^^^^^^^^^^^^^^^^
File "<frozen runpy>", line 291, in run_path
File "<frozen runpy>", line 98, in _run_module_code
File "<frozen runpy>", line 88, in _run_code
File "A:\GradSVC\Grad-SVC-20230925-V3-CFM\prepare\preprocess_zzz.py", line 19, in <module>
for batch in tqdm(loader):
File "C:\Users\phill\AppData\Roaming\Python\Python311\site-packages\tqdm\std.py", line 1178, in __iter__
for obj in iterable:
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 442, in __iter__
return self._get_iterator()
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 388, in _get_iterator
return _MultiProcessingDataLoaderIter(self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 1043, in __init__
w.start()
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\context.py", line 336, in _Popen
return Popen(process_obj)
^^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 164, in get_preparation_data
_check_not_importing_main()
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\spawn.py", line 140, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

    To fix this issue, refer to the "Safe importing of main module"
    section in https://docs.python.org/3/library/multiprocessing.html

0%| | 0/5 [00:05<?, ?it/s]
Traceback (most recent call last):
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 1133, in _try_get_data
data = self._data_queue.get(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\multiprocessing\queues.py", line 114, in get
raise Empty
_queue.Empty

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "A:\GradSVC\Grad-SVC-20230925-V3-CFM\prepare\preprocess_zzz.py", line 19, in <module>
for batch in tqdm(loader):
File "C:\Users\phill\AppData\Roaming\Python\Python311\site-packages\tqdm\std.py", line 1178, in __iter__
for obj in iterable:
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 634, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 1329, in _next_data
idx, data = self._get_data()
^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 1295, in _get_data
success, data = self._try_get_data()
^^^^^^^^^^^^^^^^^^^^
File "C:\Users\phill\AppData\Local\Programs\Python\Python311\Lib\site-packages\torch\utils\data\dataloader.py", line 1146, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str)) from e
RuntimeError: DataLoader worker (pid(s) 14904) exited unexpectedly`

Please help!
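
The traceback shows the loop at line 19 of `preprocess_zzz.py` running at module level. On Windows, worker processes are started by re-importing the main script, so any code that starts workers has to sit behind the `if __name__ == '__main__':` guard that the error message itself recommends. The sketch below illustrates that idiom with the standard library only; `work`, `run`, and the command-line worker count are hypothetical stand-ins, not code from the repository.

```python
import multiprocessing as mp
import sys


def work(x):
    # Stand-in for per-item preprocessing. It lives at module level so
    # that spawned worker processes can import it by name.
    return x * x


def run(items, workers=0):
    """Process items in-process when workers == 0, else in a pool."""
    if workers == 0:
        return [work(x) for x in items]
    with mp.Pool(workers) as pool:
        return pool.map(work, items)


if __name__ == "__main__":
    # Everything that starts processes is reachable only from here, so a
    # re-import of this file by a child process does not start new workers.
    mp.freeze_support()
    has_count = len(sys.argv) > 1 and sys.argv[1].isdigit()
    n_workers = int(sys.argv[1]) if has_count else 0
    print(run(range(5), workers=n_workers))
```

Applied to the preprocessing script, the same shape would mean moving the `for batch in tqdm(loader):` loop into a function that is called only from the guarded block.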

Regarding the error "Fail to allocate bitmap" during the training process

I made modifications to the parameters in base.yaml:

full_epoch: Changed from 150 to 15000
batch_size: Changed from 8 to 24(18)
save_step: Changed from 10 to 100

My environment is:

Windows 10 22H2
CPU: 10850k
Memory: 64GiB
GPU: RTX 4090 64GiB
Pytorch 2.0.1+cu117
Python 3.8

During the training process, I encountered the following issues:

After roughly 1030 or more epochs, the following error message appears frequently:

Synthesis...
Fail to allocate bitmap

After roughly 90 or more epochs, the following error message appears occasionally:

Exception ignored in: <function Image.__del__ at 0x00000132B1459AF0>
Traceback (most recent call last):
  File "H:\Python389\lib\tkinter\__init__.py", line 4017, in __del__
    self.tk.call('image', 'delete', self.name)
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x00000132EA2C18B0>
Traceback (most recent call last):
  File "H:\Python389\lib\tkinter\__init__.py", line 363, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):
RuntimeError: main thread is not in main loop
Exception ignored in: <function Variable.__del__ at 0x00000132EA2C18B0>
Traceback (most recent call last):
  File "H:\Python389\lib\tkinter\__init__.py", line 363, in __del__
    if self._tk.getboolean(self._tk.call("info", "exists", self._name)):

I'm not sure what caused these issues. Do you have any suggestions on how to pinpoint them? Thanks.
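
The `tkinter` frames in these messages suggest that Matplotlib is running with a Tk-based interactive backend, and an error that only shows up after many epochs is consistent with figures piling up without being closed. Both points are inferences from the log, not confirmed causes. A minimal sketch of a plotting helper that selects the non-interactive Agg backend and always closes its figure; `plot_to_numpy` is a hypothetical name, not the project's function:

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, no Tk windows or bitmaps

import matplotlib.pyplot as plt
import numpy as np


def plot_to_numpy(mel):
    """Render a 2-D array as an image and return it as an RGB uint8 array."""
    fig, ax = plt.subplots(figsize=(12, 3))
    try:
        ax.imshow(mel, aspect="auto", origin="lower", interpolation="none")
        fig.canvas.draw()
        # Take the shape from the rendered buffer itself instead of
        # assuming it matches the logical canvas size.
        rgba = np.asarray(fig.canvas.buffer_rgba())
        return rgba[..., :3].copy()
    finally:
        plt.close(fig)  # release the figure even if plotting fails


image = plot_to_numpy(np.random.rand(100, 200))
print(image.shape, image.dtype)
```

Closing each figure in a `finally` block keeps the number of live figures at zero between logging steps, however long training runs.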

How was the Speaker Encoder trained?

As the title says, I would like to ask how the Speaker Encoder was trained. Is there any reference code?

A better alternative to Grad-TTS

Thanks for paying attention to better diffusion modeling techniques. Recently, Xue et al. proposed Multi-GradSpeech, which uses a Consistent Diffusion Model as the generative network and is reported to outperform Grad-TTS in both single- and multi-speaker scenarios. I believe this advantage can carry over to the SVC task, and I'd be happy to share the code if you'd like to try replacing Grad-TTS with Multi-GradSpeech.

Setting base.yaml

What is the difference between full and fast epochs? And what are test size, test step, and save step?

num_worker Issue

Traceback (most recent call last):
File "/Users/workstation/Music/Grad-SVC V2 96/gvc_trainer.py", line 30, in <module>
train(hps, args.checkpoint_path)
File "/Users/workstation/Music/Grad-SVC V2 96/grad_extend/train.py", line 120, in train
for batch in progress_bar:
File "/Users/workstation/Music/Grad-SVC V2 96/Grad-SVC V2 96/lib/python3.11/site-packages/tqdm/std.py", line 1182, in __iter__
for obj in iterable:
File "/Users/workstation/Music/Grad-SVC V2 96/Grad-SVC V2 96/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 631, in __next__
data = self._next_data()
^^^^^^^^^^^^^^^^^
File "/Users/workstation/Music/Grad-SVC V2 96/Grad-SVC V2 96/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1318, in _next_data
self._shutdown_workers()
File "/Users/workstation/Music/Grad-SVC V2 96/Grad-SVC V2 96/lib/python3.11/site-packages/torch/utils/data/dataloader.py", line 1443, in _shutdown_workers
w.join(timeout=_utils.MP_STATUS_CHECK_INTERVAL)
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/process.py", line 149, in join
res = self._popen.wait(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/popen_fork.py", line 40, in wait
if not wait([self.sentinel], timeout):
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/multiprocessing/connection.py", line 947, in wait
ready = selector.select(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^
File "/Library/Frameworks/Python.framework/Versions/3.11/lib/python3.11/selectors.py", line 415, in select
fd_event_list = self._selector.poll(timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/workstation/Music/Grad-SVC V2 96/Grad-SVC V2 96/lib/python3.11/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 53481) is killed by signal: Segmentation fault: 11.

Got this error while training. Would this error be resolved if I change num_workers=8 to num_workers=0 in ./grad_extend/train.py?
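
With `num_workers=0` a PyTorch `DataLoader` loads batches in the main process, so no worker process exists to be killed. If the segmentation fault comes from the data-loading code itself, though, it may then surface in the main process instead. To compare both settings without editing `train.py` each time, the worker count could be read from an environment variable. This is a sketch only: `resolve_num_workers` and the variable name `GVC_NUM_WORKERS` are hypothetical and not part of the project.

```python
import os


def resolve_num_workers(default=8, env_var="GVC_NUM_WORKERS"):
    """Return the DataLoader worker count, overridable via the environment.

    The default of 8 mirrors the value quoted in the question above.
    """
    raw = os.environ.get(env_var)
    if raw is None:
        return default
    value = int(raw)
    if value < 0:
        raise ValueError(f"{env_var} must be >= 0, got {value}")
    return value


# The result would be passed as DataLoader(..., num_workers=...).
print(resolve_num_workers())
```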

why skip_diff_train before fast_epochs

Hi,

Can you explain why diffusion training is skipped before the configured fast_epochs?
And how many epochs does diffusion training need?

Thanks!

What is the advantage of Grad-SVC compared to So-VITS-SVC?

Hi @MaxMax2016 Thank you for this wonderful project

What is the advantage of Grad-SVC compared to So-VITS-SVC?

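How much training data?

I would like to ask how much data was used to train the pretrained model.

Question about the electronic-sounding artifact

I would like to ask about the acoustic model that runs before the diffusion model, that is, the stage from HuBERT to mel. When the mel from this stage is sent directly to the vocoder, why does the output sound electronic? In principle, HuBERT already contains enough information, so why does the generated mel spectrogram still show so many parallel formants? Has the author tried replacing HuBERT with WavLM?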

Path error during inference

!python gvc_inference.py --model gvc.pth --spk ./data_gvc/singer/Sakura.spk.npy --wave test.wav --vec test.vec.npy --pit test.csv --shift 0
Error:
Traceback (most recent call last):
File "E:\Grad-SVC-20230930-V3-CFM\gvc_inference.py", line 215, in <module>
assert os.path.isfile(args.model_bigv)
AssertionError
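
The assertion that fails checks `args.model_bigv`, an argument the command above does not pass, so the script is presumably falling back to a default path that does not exist relative to the current working directory. Judging by the name, this is the vocoder checkpoint, but that is an assumption. A bare `assert` gives no hint about which file is missing; a small check like the hypothetical `require_file` below would name the path in the error:

```python
import os


def require_file(path, label):
    """Return path if it is an existing file, else raise a descriptive error."""
    if not path:
        raise FileNotFoundError(f"{label}: no path was given")
    if not os.path.isfile(path):
        raise FileNotFoundError(
            f"{label}: '{path}' not found "
            f"(resolved to '{os.path.abspath(path)}')"
        )
    return path


# Hypothetical usage in place of the bare assert:
#     require_file(args.model_bigv, "--model_bigv")
```

Printing the resolved absolute path makes it easy to see whether the script was started from a directory other than the one the default path expects.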

Error during training

For testing, I set full_epochs to 15, fast_epochs to 10, and save_step to 5.
At the end of the 11th epoch, the following error message appears and the training is terminated.

Traceback (most recent call last):
File "S:\VoiceChanger\Grad-SVC\gvc_trainer.py", line 30, in <module>
train(hps, args.checkpoint_path)
File "S:\VoiceChanger\Grad-SVC\grad_extend\train.py", line 127, in train
prior_loss, diff_loss, mel_loss, spk_loss = model.compute_loss(
File "S:\VoiceChanger\Grad-SVC\grad\model.py", line 132, in compute_loss
mel = slice_segments(mel, ids, out_size)
File "S:\VoiceChanger\Grad-SVC\grad\utils.py", line 82, in slice_segments
ret[i] = x[i, :, idx_str:idx_end]
RuntimeError: The expanded size of the tensor (200) must match the existing size (0) at non-singleton dimension 1. Target sizes: [100, 200]. Tensor sizes: [100, 0]
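
The message says the slice `x[i, :, idx_str:idx_end]` came back with 0 frames while 200 were expected, which happens when the start index lies at or beyond the end of the mel for that item. Why the index is out of range here is not visible from the traceback. As an illustration only, the NumPy sketch below shows a defensive variant that clamps the start index and zero-pads short segments; `slice_segments_safe` is a hypothetical name and this is not the project's implementation.

```python
import numpy as np


def slice_segments_safe(x, ids_str, segment_size):
    """Cut a fixed-size window from each item of x (batch, channels, frames).

    The start index is clamped so the window stays inside the item, and
    items shorter than segment_size are zero-padded on the right.
    """
    batch, channels, frames = x.shape
    ret = np.zeros((batch, channels, segment_size), dtype=x.dtype)
    last_valid_start = max(frames - segment_size, 0)
    for i in range(batch):
        start = int(min(max(ids_str[i], 0), last_valid_start))
        chunk = x[i, :, start:start + segment_size]
        ret[i, :, :chunk.shape[1]] = chunk
    return ret


# Shapes mirror the error above: 100 mel channels, 200-frame segments.
mel = np.ones((2, 100, 150), dtype=np.float32)
out = slice_segments_safe(mel, ids_str=[0, 400], segment_size=200)
print(out.shape)  # (2, 100, 200)
```

Clamping hides the symptom, so it is still worth checking how the start indices and lengths are computed once diffusion training begins after `fast_epochs`.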

Adjusting Hubert model

How can I use this HuBERT model with Grad-SVC?
https://huggingface.co/team-lucid/hubert-base-korean