Comments (3)
Yes, you can leave the multispeaker setting at true. I used the same inference code as in the Colab notebook: https://colab.research.google.com/github/yl4579/StyleTTS2/blob/main/Colab/StyleTTS2_Finetune_Demo.ipynb
I haven't really tested different max_len values, but try to increase it as much as you can while keeping the batch size at least 2, and also do the SLM adversarial training run if you can (this consumes a lot of RAM, though). I know the code is currently not very friendly to low-RAM GPUs because of the DP implementation. You can wait for a fixed DDP implementation for mixed-precision training.
from styletts2.
For 4, did you change multispeaker to true or false? The default is true, and the default settings do produce better results than you have. The only difference I can see is batch_size (from 16 to 4), but it shouldn't produce this big difference. max_len from 400 to 100 is probably the cause. This is what I got by finetuning with one hour of data: https://voca.ro/1aC4vr4jErDL using the default setting.
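To make the trade-off discussed in this thread concrete, here is a minimal Python sketch of deriving a low-memory fine-tuning config from the defaults mentioned above (batch_size 16, max_len 400, multispeaker true). The dictionary layout, the `model_params` nesting, the helper name `low_memory_config`, and the value max_len=200 are all assumptions made for illustration; they are not taken from the StyleTTS2 code, and nobody in the thread tested that value.

```python
import copy

# Hypothetical subset of a fine-tuning config. The key names and the nesting
# under "model_params" are assumptions; the values are the defaults quoted
# in this thread.
DEFAULT_CONFIG = {
    "batch_size": 16,
    "max_len": 400,
    "model_params": {"multispeaker": True},
}


def low_memory_config(base, batch_size, max_len):
    """Return a copy of `base` with a smaller batch_size and max_len.

    Enforces the advice from the thread that batch_size stays at least 2.
    """
    if batch_size < 2:
        raise ValueError("keep batch_size at least 2")
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    cfg = copy.deepcopy(base)  # do not mutate the defaults
    cfg["batch_size"] = batch_size
    cfg["max_len"] = max_len
    return cfg


# Trade batch size for longer training clips; 200 is only an illustrative value.
cfg = low_memory_config(DEFAULT_CONFIG, batch_size=2, max_len=200)
print(cfg)
```

The idea is to spend the available memory on max_len first and shrink batch_size down to its floor of 2, leaving multispeaker untouched.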
For 4, did you change multispeaker to true or false?
I fine-tuned the model with multispeaker:true and then tried inference with both true and false. It definitely works better with true; the example I attached was also generated with multispeaker:true. I didn't try fine-tuning with false, but I guess a model fine-tuned with true in the config should produce better results anyway. Is that correct?
max_len from 400 to 100 is probably the cause
Do you know what the minimal value for decent results is? Unfortunately, I cannot use 400, but maybe I could set it a bit higher than 100 if I reduce batch_size even more. Training speed is not a concern for me.
This is what I got by finetuning with one hour of data: https://voca.ro/1aC4vr4jErDL using the default setting.
Yes, that sounds much better. Could you please share inference parameters? Would be awesome if you still have alpha/beta values and the name of the reference clip, so I can compare my results using the same values.
Thanks!