How much data should we prepare to train our PLBERT if we are not English? about styletts2 HOT 7 CLOSED

yl4579 commented on September 5, 2024

How much data should we prepare to train our PLBERT if we are not English?

from styletts2.

Comments (7)

commented on September 5, 2024

what is your language?

yl4579 commented on September 5, 2024

It depends on how much speech data you have. If you only have 10-20 hours of data, like LJSpeech, then PL-BERT needs to be trained on millions of sentences to perform well. Otherwise, if you have thousands of hours of speech data, you probably don't even need a pretrained PL-BERT to begin with.

JohnHerry commented on September 5, 2024

Thanks, Is there any PL

what is your language?

For Mandarin and English mixed TTS

yl4579 commented on September 5, 2024

@JohnHerry What do you mean by mixed? A sentence that contains both English and Mandarin, or just a model trained on a mixture of English and Mandarin datasets?

JohnHerry commented on September 5, 2024

@JohnHerry What do you mean by mixed? A sentence that contains both English and Mandarin, or just a model trained on a mixture of English and Mandarin datasets?

@yl4579 Sorry for the unclear wording. I mean a single sentence that contains both English and Mandarin, e.g. "我们来讨论一下这个 paper idea" ("Let's discuss this paper idea"). The PLM should be based on Mandarin phonemes but also support English phonemes. I opened an issue in the phonemizer project and learned that its "language_switch" function is limited and that the output may contain errors when the input text is bilingual.

My other concern is the alphabet of the bilingual phonemes. As you said in StyleTTS issue #10, you tried mapping Mandarin Pinyin to IPA characters but it did not work. I guess this may be because the alphabet in your Pinyin-to-IPA mapping is not the same as the alphabet used by https://github.com/bootphon/phonemizer . If the PLM takes the output of phonemizer as its input, meaning the two share one phoneme alphabet, then your Pinyin-to-IPA mapping will not work. I could not find a way to get an IPA alphabet from the phonemizer project that supports cross-lingual phoneme transcription. Do you have any idea how to get such an alphabet?
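One possible workaround for the bilingual case is to split the sentence into single-language runs first, so that each run can go to its own grapheme-to-phoneme front end instead of relying on automatic language switching. The sketch below is only an illustration under assumptions: `split_by_script` is a hypothetical helper, the "zh"/"en" labels are placeholders rather than phonemizer language codes, and only CJK ideographs and Latin letters are handled.

```python
import re

# Hypothetical helper, not part of phonemizer or StyleTTS2.
# Matches either a run of CJK ideographs or a run of Latin words
# (letters, optionally joined by spaces, apostrophes, or hyphens).
_RUN = re.compile(r"[\u4e00-\u9fff]+|[A-Za-z][A-Za-z' -]*[A-Za-z]|[A-Za-z]")


def split_by_script(text):
    """Split code-switched text into (language_label, chunk) runs.

    Punctuation, digits, and whitespace between runs are dropped,
    which is a simplification made for this sketch.
    """
    runs = []
    for match in _RUN.finditer(text):
        chunk = match.group(0)
        # The first character decides the label: CJK ideograph or Latin.
        lang = "zh" if "\u4e00" <= chunk[0] <= "\u9fff" else "en"
        runs.append((lang, chunk))
    return runs


print(split_by_script("我们来讨论一下这个 paper idea"))
# [('zh', '我们来讨论一下这个'), ('en', 'paper idea')]
```

Each run could then be phonemized with the matching language setting and the results concatenated. Whether the two phoneme streams share a consistent IPA alphabet still has to be checked separately.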

yl4579 commented on September 5, 2024

@JohnHerry Actually, I have trained a multilingual model on English, Mandarin, and Japanese, and it works quite well with all the settings I provided in yl4579/StyleTTS#10. The PL-BERT was also pre-trained on the Wikipedia corpora of these three languages; I used word-level tokenizers for all languages and merged shared graphemes for Japanese and Chinese (for example, both /ɽˈʲokˈai/ and /lˈjaʊtɕˈjɛ/ correspond to the grapheme 了解). It is fully compatible with phonemizer, since the English IPA was generated by it.

As for your question about mixed-language sentences, I'm not sure how to get training data for this. The PL-BERT I pre-trained was trained on a mix of monolingual datasets, not on sentences that mix languages. However, if your speech data contains a lot of samples of this type, it would probably work with this pre-trained PL-BERT as well.
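To make the idea of merging shared graphemes concrete, here is a minimal sketch of a word-level vocabulary in which the same grapheme receives one token ID regardless of language. This is not the actual PL-BERT preprocessing code: the `SharedWordVocab` class, its method names, and the "ja"/"zh"/"en" labels are assumptions made for illustration, and the English transcription is only an example.

```python
from collections import defaultdict


class SharedWordVocab:
    """Hypothetical word-level vocabulary with graphemes shared across languages."""

    def __init__(self):
        self.token_to_id = {}
        # grapheme -> {language_label: phoneme string}
        self.pronunciations = defaultdict(dict)

    def add(self, grapheme, lang, phonemes):
        """Register a word and return its token ID.

        A grapheme seen before (in any language) keeps its existing ID,
        so only the pronunciation table grows.
        """
        token_id = self.token_to_id.setdefault(grapheme, len(self.token_to_id))
        self.pronunciations[grapheme][lang] = phonemes
        return token_id


vocab = SharedWordVocab()
ja_id = vocab.add("了解", "ja", "ɽˈʲokˈai")    # Japanese reading from the comment above
zh_id = vocab.add("了解", "zh", "lˈjaʊtɕˈjɛ")  # Mandarin reading from the comment above
en_id = vocab.add("paper", "en", "pˈeɪpɚ")    # illustrative transcription only

print(ja_id == zh_id)  # True: one token ID for the shared grapheme
print(en_id)           # 1
```

With this layout the word-level prediction target is shared between Japanese and Chinese, while the phoneme inputs stay language-specific.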

JohnHerry commented on September 5, 2024

Thanks for the information, it is helpful.
