Comments (25)
The models were made by others; you can find the ones I use here:
https://github.com/CjangCjengh/TTSModels
https://huggingface.co/spaces/zomehwh/vits-uma-genshin-honkai/tree/main/model
from vits-simple-api.
Thanks, I've got it running. The question now is: where do I enable GPU acceleration? Is there a config option?
You need to install CUDA and the GPU build of PyTorch; once they are installed, the GPU is used automatically.
I installed it and rebooted the machine, but it still doesn't show GPU acceleration when I run it again. I even disabled the integrated graphics, which didn't help. My GPU is an MX250; is there something I need to configure?
Verify that CUDA installed successfully; for an MX250 you need to install the matching version. Then check PyTorch: vits prints the PyTorch version at startup, and only versions of the form x.x.x+cu1xx (where x is a digit) can use CUDA.
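The startup check described above can also be scripted. A minimal sketch, assuming only the version-string convention mentioned (CUDA wheels carry a `+cuXYZ` suffix, CPU wheels `+cpu` or none); the helper name is hypothetical:

```python
def pytorch_build_supports_cuda(version: str) -> bool:
    """Return True if a torch version string like '2.0.1+cu118' is a CUDA build.

    CPU-only wheels report a '+cpu' suffix (or no local suffix at all);
    CUDA wheels carry '+cuXYZ', matching the 'x.x.x+cu1xx' pattern above.
    """
    # The local version segment follows the '+' separator, if present.
    _, _, local = version.partition("+")
    return local.startswith("cu")


# At runtime you would feed in the real version string, e.g.:
# import torch
# print(pytorch_build_supports_cuda(torch.__version__), torch.cuda.is_available())
```

If this returns False for your installed torch, reinstall PyTorch with the CUDA index URL for your CUDA version.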
Is there an English VITS model at the moment? I searched online for half a day and couldn't find one.
https://github.com/jaywalnut310/vits
The original repo has English models, but you need to modify the code slightly and install espeak separately to use them.
305cae8
Add the following two lines to the corresponding json file to use it:
"speakers": ["vctk"],
"symbols": ["_", ";", ":", ",", ".", "!", "?", "¡", "¿", "—", "…", "\"", "«", "»", "“", "”", " ", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "ɑ", "ɐ", "ɒ", "æ", "ɓ", "ʙ", "β", "ɔ", "ɕ", "ç", "ɗ", "ɖ", "ð", "ʤ", "ə", "ɘ", "ɚ", "ɛ", "ɜ", "ɝ", "ɞ", "ɟ", "ʄ", "ɡ", "ɠ", "ɢ", "ʛ", "ɦ", "ɧ", "ħ", "ɥ", "ʜ", "ɨ", "ɪ", "ʝ", "ɭ", "ɬ", "ɫ", "ɮ", "ʟ", "ɱ", "ɯ", "ɰ", "ŋ", "ɳ", "ɲ", "ɴ", "ø", "ɵ", "ɸ", "θ", "œ", "ɶ", "ʘ", "ɹ", "ɺ", "ɾ", "ɻ", "ʀ", "ʁ", "ɽ", "ʂ", "ʃ", "ʈ", "ʧ", "ʉ", "ʊ", "ʋ", "ⱱ", "ʌ", "ɣ", "ɤ", "ʍ", "χ", "ʎ", "ʏ", "ʑ", "ʐ", "ʒ", "ʔ", "ʡ", "ʕ", "ʢ", "ǀ", "ǁ", "ǂ", "ǃ", "ˈ", "ˌ", "ː", "ˑ", "ʼ", "ʴ", "ʰ", "ʱ", "ʲ", "ʷ", "ˠ", "ˤ", "˞", "↓", "↑", "→", "↗", "↘", "'", "̩", "'", "ᵻ"]
Thanks, I'll give it a try.
I read the original repo's README. Do I need to train on a dataset myself to get an English model? I couldn't find a model file in the repo:
"3.Download datasets
Download and extract the LJ Speech dataset, then rename or create a link to the dataset folder: ln -s /path/to/LJSpeech-1.1/wavs DUMMY1
For multi-speaker setting, download and extract the VCTK dataset, and downsample wav files to 22050 Hz. Then rename or create a link to the dataset folder: ln -s /path/to/VCTK-Corpus/downsampled_wavs DUMMY2
4.Build Monotonic Alignment Search and run preprocessing if you use your own datasets."
No need. The README provides download links for pretrained models; you can use them directly or continue training from them. The json files can be found in the repo's configs directory.
I first git pulled your latest code, then loaded the pretrained model and added your two lines to the model's json file. Running python app.py reports the following error:
(fort) E:\Fort\WechatBot\vits-simple-api>python app.py
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Nene_Nanami_Rong_Tang/1374_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Zero_no_tsukaima/1158_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/g/G_953000.pth' (iteration 630)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Voistock/547_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/ljs/ljs.pth' (iteration 0)
Traceback (most recent call last):
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 354, in _check_seekable
f.seek(f.tell())
AttributeError: 'NoneType' object has no attribute 'seek'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "E:\Fort\WechatBot\vits-simple-api\app.py", line 28, in <module>
tts = merge_model(app.config["MODEL_LIST"])
File "E:\Fort\WechatBot\vits-simple-api\utils\merge.py", line 55, in merge_model
obj = vits(model=i[0], config=i[1])
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 54, in __init__
self.load_model(model, model_)
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 61, in load_model
self.hubert = hubert_soft(model_)
File "E:\Fort\WechatBot\vits-simple-api\hubert_model.py", line 217, in hubert_soft
checkpoint = torch.load(path)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 791, in load
with _open_file_like(f, 'rb') as opened_file:
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 276, in _open_file_like
return _open_buffer_reader(name_or_buffer)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 261, in __init__
_check_seekable(buffer)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 357, in _check_seekable
raise_err_msg(["seek", "tell"], e)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 350, in raise_err_msg
raise type(e)(msg)
AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.
Also, his configs contain two json files for the ljs model; I used ljs_base.json.
They go at the same level as train, data, and model. You can refer to the other config.json files; I'll paste one here anyway. This one is modified from vctk_base.json:
{
"train": {
"log_interval": 200,
"eval_interval": 1000,
"seed": 1234,
"epochs": 10000,
"learning_rate": 2e-4,
"betas": [0.8, 0.99],
"eps": 1e-9,
"batch_size": 64,
"fp16_run": true,
"lr_decay": 0.999875,
"segment_size": 8192,
"init_lr_ratio": 1,
"warmup_epochs": 0,
"c_mel": 45,
"c_kl": 1.0
},
"data": {
"training_files":"filelists/vctk_audio_sid_text_train_filelist.txt.cleaned",
"validation_files":"filelists/vctk_audio_sid_text_val_filelist.txt.cleaned",
"text_cleaners":["english_cleaners2"],
"max_wav_value": 32768.0,
"sampling_rate": 22050,
"filter_length": 1024,
"hop_length": 256,
"win_length": 1024,
"n_mel_channels": 80,
"mel_fmin": 0.0,
"mel_fmax": null,
"add_blank": true,
"n_speakers": 109,
"cleaned_text": true
},
"model": {
"inter_channels": 192,
"hidden_channels": 192,
"filter_channels": 768,
"n_heads": 2,
"n_layers": 6,
"kernel_size": 3,
"p_dropout": 0.1,
"resblock": "1",
"resblock_kernel_sizes": [3,7,11],
"resblock_dilation_sizes": [[1,3,5], [1,3,5], [1,3,5]],
"upsample_rates": [8,8,2,2],
"upsample_initial_channel": 512,
"upsample_kernel_sizes": [16,16,4,4],
"n_layers_q": 3,
"use_spectral_norm": false,
"gin_channels": 256
},
"speakers": ["vctk"],
"symbols": ["_", ";", ":", ",", ".", "!", "?", "¡", "¿", "—", "…", "\"", "«", "»", "“", "”", " ", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "ɑ", "ɐ", "ɒ", "æ", "ɓ", "ʙ", "β", "ɔ", "ɕ", "ç", "ɗ", "ɖ", "ð", "ʤ", "ə", "ɘ", "ɚ", "ɛ", "ɜ", "ɝ", "ɞ", "ɟ", "ʄ", "ɡ", "ɠ", "ɢ", "ʛ", "ɦ", "ɧ", "ħ", "ɥ", "ʜ", "ɨ", "ɪ", "ʝ", "ɭ", "ɬ", "ɫ", "ɮ", "ʟ", "ɱ", "ɯ", "ɰ", "ŋ", "ɳ", "ɲ", "ɴ", "ø", "ɵ", "ɸ", "θ", "œ", "ɶ", "ʘ", "ɹ", "ɺ", "ɾ", "ɻ", "ʀ", "ʁ", "ɽ", "ʂ", "ʃ", "ʈ", "ʧ", "ʉ", "ʊ", "ʋ", "ⱱ", "ʌ", "ɣ", "ɤ", "ʍ", "χ", "ʎ", "ʏ", "ʑ", "ʐ", "ʒ", "ʔ", "ʡ", "ʕ", "ʢ", "ǀ", "ǁ", "ǂ", "ǃ", "ˈ", "ˌ", "ː", "ˑ", "ʼ", "ʴ", "ʰ", "ʱ", "ʲ", "ʷ", "ˠ", "ˤ", "˞", "↓", "↑", "→", "↗", "↘", "'", "̩", "'", "ᵻ"]
}
The service started successfully, but when I sent a voice request it failed with an error saying espeak is not installed:
(fort) E:\Fort\WechatBot\vits-simple-api>python app.py
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/ljs/ljs.pth' (iteration 0)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/vctk/vctk.pth' (iteration 0)
INFO:vits-simple-api:torch:2.0.1+cpu cuda_available:False
INFO:vits-simple-api:device:cpu device.type:cpu
INFO:vits-simple-api:Loaded 2 speakers
INFO:apscheduler.scheduler:Added job "clean_task" to job store "default"
DEBUG:apscheduler.scheduler:Looking for jobs to run
DEBUG:apscheduler.scheduler:Next wakeup is due at 2023-05-16 01:16:36.449880+08:00 (in 3599.999002 seconds)
 * Serving Flask app 'app'
 * Debug mode: off
INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:23456
 * Running on http://192.168.1.52:23456
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug:127.0.0.1 - - [16/May/2023 00:16:47] "GET /voice/speakers HTTP/1.1" 200 -
INFO:vits-simple-api:[VITS] id:0 format:wav lang:auto length:1.0 noise:0.667 noisew:0.8
INFO:vits-simple-api:[VITS] len:41 text:Good evening! How can I assist you today?
DEBUG:vits-simple-api:[EN]Good evening! How can I assist you today?[EN]
ERROR:app:Exception on /voice [POST]
Traceback (most recent call last):
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 2528, in wsgi_app
response = self.full_dispatch_request()
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1825, in full_dispatch_request
rv = self.handle_user_exception(e)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1799, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "E:\Fort\WechatBot\vits-simple-api\app.py", line 38, in check_api_key
return func(*args, **kwargs)
File "E:\Fort\WechatBot\vits-simple-api\app.py", line 113, in voice_vits_api
output = tts.vits_infer({"text": text,
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 435, in vits_infer
audio = voice_obj.get_audio(voice, auto_break=True)
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 205, in get_audio
self.get_infer_param(text=sentence, speaker_id=speaker_id, length=length, noise=noise,
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 131, in get_infer_param
stn_tst = self.get_cleaned_text(text, self.hps_ms, cleaned=cleaned)
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 71, in get_cleaned_text
text_norm = text_to_sequence(text, hps.symbols, hps.data.text_cleaners)
File "E:\Fort\WechatBot\vits-simple-api\text\__init__.py", line 17, in text_to_sequence
clean_text = _clean_text(text, cleaner_names)
File "E:\Fort\WechatBot\vits-simple-api\text\__init__.py", line 31, in _clean_text
text = cleaner(text)
File "E:\Fort\WechatBot\vits-simple-api\text\cleaners.py", line 62, in english_cleaners2
phonemes = phonemize(text, language='en-us', backend='espeak', strip=True, preserve_punctuation=True,
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\phonemize.py", line 206, in phonemize
phonemizer = BACKENDS[backend](
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\espeak\espeak.py", line 45, in __init__
super().__init__(
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\espeak\base.py", line 39, in __init__
super().__init__(
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\base.py", line 77, in init
raise RuntimeError( # pragma: nocover
RuntimeError: espeak not installed on your system
Setting the path to the espeak dll in config.py fixes this.
For example, on Windows the path is C:\Program Files\eSpeak NG\libespeak-ng.dll
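Independently of the repo's config.py entry, the phonemizer package can also pick up the library location from its PHONEMIZER_ESPEAK_LIBRARY environment variable. A minimal sketch, assuming the default Windows install location shown above (adjust the path for your machine):

```python
import os

# Point phonemizer at the eSpeak NG shared library before it is imported.
# PHONEMIZER_ESPEAK_LIBRARY is read by the phonemizer package; the path
# below assumes the default Windows install location of eSpeak NG.
os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = r"C:\Program Files\eSpeak NG\libespeak-ng.dll"

# Any subsequent `from phonemizer import phonemize` call will use this library.
```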
It looks like I downloaded the wrong version of espeak; there is no libespeak-ng.dll in the corresponding directory.
I searched online but couldn't find an eSpeak NG exe installer for Windows 10.
This installer works on Win10:
https://github.com/espeak-ng/espeak-ng/releases/download/1.51/espeak-ng-X64.msi
It works now: requests come in and audio is returned. I tried id 0 and id 1 and both are female voices. Are the ljs and vctk models both female? Where should I look for an English male voice?
Looking at the vctk json, there are 109 speakers. You can pad out the speaker names in the json with anything, e.g. "speakers": ["vctk1","vctk2","vctk3"],
and then pick from them. In my test, id 1 in this model is a male voice.
Following your suggestion, I've added vctk1, vctk2, and vctk3 to config.json. After starting the service, I can see the different ids on the page (http://127.0.0.1:23456/voice/speakers).
Does that mean the three vctk voices are loaded, and I just need to change the id in this repo's config.py to pick out the male voice?
Yes. In fact, you can specify the id in the request itself, so you don't need to edit config.py to switch speakers.
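For instance, the speaker id can be passed as a request parameter. A minimal standard-library sketch; the /voice endpoint and the text/id parameter names follow the request visible in the log above, but check the repo's API docs for the exact query parameters of your version:

```python
from urllib.parse import urlencode
from urllib.request import urlopen  # only needed for the commented-out call below


def build_voice_url(base: str, text: str, speaker_id: int) -> str:
    """Build a GET url for the /voice endpoint with an explicit speaker id."""
    query = urlencode({"text": text, "id": speaker_id})
    return f"{base}/voice?{query}"


url = build_voice_url("http://127.0.0.1:23456", "Good evening!", 1)
# with urlopen(url) as resp:      # uncomment against a running server
#     audio = resp.read()         # audio bytes synthesized with speaker id 1
```

Changing `speaker_id` per request switches the voice without touching config.py.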
Got it, thank you.
How do I configure and install espeak for a Docker deployment on Linux?
The Dockerfile should already include the command to install espeak-ng; you can run espeak-ng --version in the container's terminal to confirm it is installed. On Linux, installing espeak configures the environment automatically, so there's no need to set a dll path manually; it just works.
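That check can also be done from Python before the server starts, mirroring the terminal command above. A minimal sketch using only the standard library; the helper name is hypothetical:

```python
import shutil
import subprocess


def espeak_ng_available() -> bool:
    """Return True if the espeak-ng binary is on PATH (e.g. inside the container)."""
    return shutil.which("espeak-ng") is not None


if espeak_ng_available():
    # Print the installed version, equivalent to `espeak-ng --version` in a shell.
    result = subprocess.run(["espeak-ng", "--version"], capture_output=True, text=True)
    print(result.stdout.strip())
else:
    print("espeak-ng not found; install it in the image before starting the API.")
```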