Could you please share your models? (about artrajz/vits-simple-api)

Comments (25)

Artrajz commented on July 21, 2024

The models were made by other people. You can find the ones I use here:
https://github.com/CjangCjengh/TTSModels
https://huggingface.co/spaces/zomehwh/vits-uma-genshin-honkai/tree/main/model

from vits-simple-api.

jwister commented on July 21, 2024

Your email has been received, thank you.

jwister commented on July 21, 2024

Thanks, I have it up and running now. My question now is where to enable GPU acceleration. Is there a configuration option for it?

Artrajz commented on July 21, 2024

You need to install CUDA and the GPU build of PyTorch. Once both are installed, the GPU is used automatically.

jwister commented on July 21, 2024

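I installed them and rebooted, but when I run it again it still does not show GPU acceleration as enabled. I then disabled the integrated graphics, which did not help either. My graphics card is an MX250. Is there a setting I need to change somewhere?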

Artrajz commented on July 21, 2024

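Check whether CUDA was installed successfully; for the MX250 you probably need to find and install the matching version. Then check PyTorch: when vits starts, it prints the PyTorch version, and only a version of the form x.x.x+cu1xx (where x is a digit) can use CUDA.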
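
The version check described above can be sketched as follows. This is only an illustration: `is_cuda_build` is a hypothetical helper name, and the one assumption is that the build tag appears as a suffix such as `+cpu` or `+cu118` in `torch.__version__`, as in the startup log later in this thread.

```python
import re


def is_cuda_build(version: str) -> bool:
    """Return True if a PyTorch version string ends with a +cuNNN build tag."""
    return re.search(r"\+cu\d+$", version) is not None


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        print("torch version:", torch.__version__)
        print("looks like a CUDA build:", is_cuda_build(torch.__version__))
        # The runtime check is what finally decides whether the GPU is used.
        print("CUDA available:", torch.cuda.is_available())
```

Depending on where PyTorch was installed from, a build without a suffix may still support CUDA, so `torch.cuda.is_available()` is the more reliable signal of the two.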

cgnannan commented on July 21, 2024

Does vits currently have an English model? I searched online for a long time today and could not find one.

Artrajz commented on July 21, 2024

https://github.com/jaywalnut310/vits
The original repository has English models, but you need to modify the code slightly and install espeak separately to use them.

Artrajz commented on July 21, 2024

305cae8
Add the following two lines to the corresponding JSON file to use it:

"speakers": ["vctk"],
"symbols":  ["_", ";", ":", ",", ".", "!", "?", "¡", "¿", "—", "…", "\"", "«", "»", "“", "”", " ", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "ɑ", "ɐ", "ɒ", "æ", "ɓ", "ʙ", "β", "ɔ", "ɕ", "ç", "ɗ", "ɖ", "ð", "ʤ", "ə", "ɘ", "ɚ", "ɛ", "ɜ", "ɝ", "ɞ", "ɟ", "ʄ", "ɡ", "ɠ", "ɢ", "ʛ", "ɦ", "ɧ", "ħ", "ɥ", "ʜ", "ɨ", "ɪ", "ʝ", "ɭ", "ɬ", "ɫ", "ɮ", "ʟ", "ɱ", "ɯ", "ɰ", "ŋ", "ɳ", "ɲ", "ɴ", "ø", "ɵ", "ɸ", "θ", "œ", "ɶ", "ʘ", "ɹ", "ɺ", "ɾ", "ɻ", "ʀ", "ʁ", "ɽ", "ʂ", "ʃ", "ʈ", "ʧ", "ʉ", "ʊ", "ʋ", "ⱱ", "ʌ", "ɣ", "ɤ", "ʍ", "χ", "ʎ", "ʏ", "ʑ", "ʐ", "ʒ", "ʔ", "ʡ", "ʕ", "ʢ", "ǀ", "ǁ", "ǂ", "ǃ", "ˈ", "ˌ", "ː", "ˑ", "ʼ", "ʴ", "ʰ", "ʱ", "ʲ", "ʷ", "ˠ", "ˤ", "˞", "↓", "↑", "→", "↗", "↘", "'", "̩", "'", "ᵻ"]

cgnannan commented on July 21, 2024

Thanks, I'll give it a try.

cgnannan commented on July 21, 2024

I read the README of the original repository. Do I need to train on a dataset to get an English voice model? I could not find any model files in the repository folder.

"3.Download datasets
Download and extract the LJ Speech dataset, then rename or create a link to the dataset folder: ln -s /path/to/LJSpeech-1.1/wavs DUMMY1
For mult-speaker setting, download and extract the VCTK dataset, and downsample wav files to 22050 Hz. Then rename or create a link to the dataset folder: ln -s /path/to/VCTK-Corpus/downsampled_wavs DUMMY2

4.Build Monotonic Alignment Search and run preprocessing if you use your own datasets."

Artrajz commented on July 21, 2024

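No. The README provides download links for pretrained models, so you can use a pretrained model directly or continue training from it. The JSON files can be found in the repository's configs directory.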

cgnannan commented on July 21, 2024

I first ran git pull to get your latest code, then loaded the pretrained model and added your two lines to the model's JSON file. Running python app.py then produced the following error:

(fort) E:\Fort\WechatBot\vits-simple-api>python app.py
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Nene_Nanami_Rong_Tang/1374_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Zero_no_tsukaima/1158_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/g/G_953000.pth' (iteration 630)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/Voistock/547_epochs.pth' (iteration None)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/ljs/ljs.pth' (iteration 0)
Traceback (most recent call last):
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 354, in _check_seekable
f.seek(f.tell())
AttributeError: 'NoneType' object has no attribute 'seek'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "E:\Fort\WechatBot\vits-simple-api\app.py", line 28, in <module>
tts = merge_model(app.config["MODEL_LIST"])
File "E:\Fort\WechatBot\vits-simple-api\utils\merge.py", line 55, in merge_model
obj = vits(model=i[0], config=i[1])
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 54, in __init__
self.load_model(model, model_)
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 61, in load_model
self.hubert = hubert_soft(model_)
File "E:\Fort\WechatBot\vits-simple-api\hubert_model.py", line 217, in hubert_soft
checkpoint = torch.load(path)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 791, in load
with _open_file_like(f, 'rb') as opened_file:
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 276, in _open_file_like
return _open_buffer_reader(name_or_buffer)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 261, in __init__
_check_seekable(buffer)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 357, in _check_seekable
raise_err_msg(["seek", "tell"], e)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\torch\serialization.py", line 350, in raise_err_msg
raise type(e)(msg)
AttributeError: 'NoneType' object has no attribute 'seek'. You can only torch.load from a file that is seekable. Please pre-load the data into a buffer like io.BytesIO and try to load from it instead.

I put these two lines under "train". Did I put them in the wrong place?

Also, his configs directory has two JSON files for the ljs model; I used ljs_base.json.

MODEL_LIST has been updated accordingly as well.

Artrajz commented on July 21, 2024

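They go at the top level, alongside train, data, and model. You can refer to the other config.json files. I'll paste a copy here anyway; this one is a modified vctk_base.json.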

{
  "train": {
    "log_interval": 200,
    "eval_interval": 1000,
    "seed": 1234,
    "epochs": 10000,
    "learning_rate": 2e-4,
    "betas": [0.8, 0.99],
    "eps": 1e-9,
    "batch_size": 64,
    "fp16_run": true,
    "lr_decay": 0.999875,
    "segment_size": 8192,
    "init_lr_ratio": 1,
    "warmup_epochs": 0,
    "c_mel": 45,
    "c_kl": 1.0
  },
  "data": {
    "training_files":"filelists/vctk_audio_sid_text_train_filelist.txt.cleaned",
    "validation_files":"filelists/vctk_audio_sid_text_val_filelist.txt.cleaned",
    "text_cleaners":["english_cleaners2"],
    "max_wav_value": 32768.0,
    "sampling_rate": 22050,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
    "n_mel_channels": 80,
    "mel_fmin": 0.0,
    "mel_fmax": null,
    "add_blank": true,
    "n_speakers": 109,
    "cleaned_text": true
  },
  "model": {
    "inter_channels": 192,
    "hidden_channels": 192,
    "filter_channels": 768,
    "n_heads": 2,
    "n_layers": 6,
    "kernel_size": 3,
    "p_dropout": 0.1,
    "resblock": "1",
    "resblock_kernel_sizes": [3,7,11],
    "resblock_dilation_sizes": [[1,3,5], [1,3,5], [1,3,5]],
    "upsample_rates": [8,8,2,2],
    "upsample_initial_channel": 512,
    "upsample_kernel_sizes": [16,16,4,4],
    "n_layers_q": 3,
    "use_spectral_norm": false,
    "gin_channels": 256
  },
  "speakers": ["vctk"],
  "symbols":  ["_", ";", ":", ",", ".", "!", "?", "¡", "¿", "—", "…", "\"", "«", "»", "“", "”", " ", "A", "B", "C", "D", "E", "F", "G", "H", "I", "J", "K", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W", "X", "Y", "Z", "a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z", "ɑ", "ɐ", "ɒ", "æ", "ɓ", "ʙ", "β", "ɔ", "ɕ", "ç", "ɗ", "ɖ", "ð", "ʤ", "ə", "ɘ", "ɚ", "ɛ", "ɜ", "ɝ", "ɞ", "ɟ", "ʄ", "ɡ", "ɠ", "ɢ", "ʛ", "ɦ", "ɧ", "ħ", "ɥ", "ʜ", "ɨ", "ɪ", "ʝ", "ɭ", "ɬ", "ɫ", "ɮ", "ʟ", "ɱ", "ɯ", "ɰ", "ŋ", "ɳ", "ɲ", "ɴ", "ø", "ɵ", "ɸ", "θ", "œ", "ɶ", "ʘ", "ɹ", "ɺ", "ɾ", "ɻ", "ʀ", "ʁ", "ɽ", "ʂ", "ʃ", "ʈ", "ʧ", "ʉ", "ʊ", "ʋ", "ⱱ", "ʌ", "ɣ", "ɤ", "ʍ", "χ", "ʎ", "ʏ", "ʑ", "ʐ", "ʒ", "ʔ", "ʡ", "ʕ", "ʢ", "ǀ", "ǁ", "ǂ", "ǃ", "ˈ", "ˌ", "ː", "ˑ", "ʼ", "ʴ", "ʰ", "ʱ", "ʲ", "ʷ", "ˠ", "ˤ", "˞", "↓", "↑", "→", "↗", "↘", "'", "̩", "'", "ᵻ"]
}
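
To catch the placement mistake discussed above before starting the service, a small check along these lines may help. It is a sketch, not part of vits-simple-api: `check_config` is a hypothetical helper, and the list of expected top-level keys is taken from the config pasted here rather than from the project's loader.

```python
import json
import sys

# Top-level keys as they appear in the config pasted in this thread.
EXPECTED_TOP_LEVEL = ("train", "data", "model", "speakers", "symbols")


def check_config(cfg: dict) -> list:
    """Return a list of human-readable problems found in a loaded config."""
    problems = []
    for key in EXPECTED_TOP_LEVEL:
        if key not in cfg:
            problems.append(f"missing top-level key: {key}")
    # Flag "speakers"/"symbols" placed inside a section instead of beside it.
    for section in ("train", "data", "model"):
        nested = cfg.get(section)
        if isinstance(nested, dict):
            for key in ("speakers", "symbols"):
                if key in nested:
                    problems.append(
                        f"{key} is nested under {section}; move it to the top level"
                    )
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], encoding="utf-8") as f:
        for line in check_config(json.load(f)) or ["config layout looks fine"]:
            print(line)
```

Run it with the path to the model's JSON file as the only argument.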

cgnannan commented on July 21, 2024

The service started successfully. I sent a speech request, and the error says espeak is not installed:

(fort) E:\Fort\WechatBot\vits-simple-api>python app.py
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/ljs/ljs.pth' (iteration 0)
INFO:root:Loaded checkpoint 'E:\Fort\WechatBot\vits-simple-api/Model/vctk/vctk.pth' (iteration 0)
INFO:vits-simple-api:torch:2.0.1+cpu cuda_available:False
INFO:vits-simple-api:device:cpu device.type:cpu
INFO:vits-simple-api:Loaded 2 speakers
INFO:apscheduler.scheduler:Added job "clean_task" to job store "default"
DEBUG:apscheduler.scheduler:Looking for jobs to run
DEBUG:apscheduler.scheduler:Next wakeup is due at 2023-05-16 01:16:36.449880+08:00 (in 3599.999002 seconds)

Serving Flask app 'app'
Debug mode: off
INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
Running on all addresses (0.0.0.0)
Running on http://127.0.0.1:23456
Running on http://192.168.1.52:23456
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug:127.0.0.1 - - [16/May/2023 00:16:47] "GET /voice/speakers HTTP/1.1" 200 -
INFO:vits-simple-api:[VITS] id:0 format:wav lang:auto length:1.0 noise:0.667 noisew:0.8
INFO:vits-simple-api:[VITS] len:41 text：Good evening! How can I assist you today?
DEBUG:vits-simple-api:[EN]Good evening! How can I assist you today?[EN]
ERROR:app:Exception on /voice [POST]
Traceback (most recent call last):
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 2528, in wsgi_app
response = self.full_dispatch_request()
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1825, in full_dispatch_request
rv = self.handle_user_exception(e)
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\flask\app.py", line 1799, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "E:\Fort\WechatBot\vits-simple-api\app.py", line 38, in check_api_key
return func(*args, **kwargs)
File "E:\Fort\WechatBot\vits-simple-api\app.py", line 113, in voice_vits_api
output = tts.vits_infer({"text": text,
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 435, in vits_infer
audio = voice_obj.get_audio(voice, auto_break=True)
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 205, in get_audio
self.get_infer_param(text=sentence, speaker_id=speaker_id, length=length, noise=noise,
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 131, in get_infer_param
stn_tst = self.get_cleaned_text(text, self.hps_ms, cleaned=cleaned)
File "E:\Fort\WechatBot\vits-simple-api\voice.py", line 71, in get_cleaned_text
text_norm = text_to_sequence(text, hps.symbols, hps.data.text_cleaners)
File "E:\Fort\WechatBot\vits-simple-api\text\__init__.py", line 17, in text_to_sequence
clean_text = _clean_text(text, cleaner_names)
File "E:\Fort\WechatBot\vits-simple-api\text\__init__.py", line 31, in _clean_text
text = cleaner(text)
File "E:\Fort\WechatBot\vits-simple-api\text\cleaners.py", line 62, in english_cleaners2
phonemes = phonemize(text, language='en-us', backend='espeak', strip=True, preserve_punctuation=True,
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\phonemize.py", line 206, in phonemize
phonemizer = BACKENDS[backend](
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\espeak\espeak.py", line 45, in __init__
super().__init__(
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\espeak\base.py", line 39, in __init__
super().__init__(
File "E:\Fort\WechatBot\vits-simple-api\fort\lib\site-packages\phonemizer\backend\base.py", line 77, in __init__
raise RuntimeError( # pragma: nocover
RuntimeError: espeak not installed on your system

I have already installed espeak and launched it.

Artrajz commented on July 21, 2024

Filling in the path to the espeak DLL in config.py solves this.
On Windows, for example, the path is C:\Program Files\eSpeak NG\libespeak-ng.dll
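
For reference, the phonemizer package that raises the error in the traceback can also be pointed at the library through the `PHONEMIZER_ESPEAK_LIBRARY` environment variable. The sketch below is not the project's config.py: `configure_espeak` is a hypothetical helper, and the path is simply the example given above.

```python
import os


def configure_espeak(dll_path: str) -> bool:
    """Export the espeak library path for phonemizer if the file exists."""
    if not os.path.isfile(dll_path):
        return False
    os.environ["PHONEMIZER_ESPEAK_LIBRARY"] = dll_path
    return True


if __name__ == "__main__":
    path = r"C:\Program Files\eSpeak NG\libespeak-ng.dll"
    if configure_espeak(path):
        print("espeak library set to", path)
    else:
        print("file not found:", path)
```

The variable has to be set before the phonemizer backend is first created, so a call like this belongs at the very start of the program.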

cgnannan commented on July 21, 2024

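It seems I downloaded the wrong version of espeak: there is no libespeak-ng.dll in the corresponding directory.
I searched online and could not find an eSpeak NG exe installer for Windows 10.

I only found the Python dependency package and the GitHub repository.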

Artrajz commented on July 21, 2024

This installer works on Windows 10:
https://github.com/espeak-ng/espeak-ng/releases/download/1.51/espeak-ng-X64.msi

cgnannan commented on July 21, 2024

It works now: requests are received and audio is returned. I tried id 0 and id 1, and both are female voices. Are the ljs and vctk models both female voices? Where should I look for an English male voice?

Artrajz commented on July 21, 2024

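I checked, and the vctk JSON has 109 speakers. You can fill in the speaker names in the JSON with anything you like, for example "speakers": ["vctk1","vctk2","vctk3"], and then pick from them. When I tried this model, id 1 was a male voice.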
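
Typing 109 names by hand is tedious, so the list can be generated instead. This is a minimal sketch; `make_speaker_names` is a hypothetical helper, and the `vctk` prefix is just the placeholder naming used above.

```python
import json


def make_speaker_names(prefix: str, count: int) -> list:
    """Build placeholder names prefix1 .. prefixN for an unnamed speaker set."""
    return [f"{prefix}{i}" for i in range(1, count + 1)]


if __name__ == "__main__":
    # 109 matches "n_speakers" in the vctk config posted in this thread.
    names = make_speaker_names("vctk", 109)
    # Print a line that can be pasted into the model's JSON file.
    print('"speakers": ' + json.dumps(names) + ",")
```

The names are arbitrary labels; what matters for picking a voice is the id shown for each entry at /voice/speakers.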

cgnannan commented on July 21, 2024

Following your suggestion, I added vctk1, vctk2, and vctk3 to config.json. After starting the service, I can see the different ids on the page (http://127.0.0.1:23456/voice/speakers).

Does that mean the three vctk voices have been loaded successfully, and I only need to change the id in this repository's config.py to pick the male voice?

Artrajz commented on July 21, 2024

Yes. In fact, you can specify the id in the request itself, so you do not need to edit config.py to switch speakers.
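
As a rough illustration of choosing the speaker per request, the sketch below builds a request URL. The `/voice` path, the port, and the `text` and `id` parameter names are taken from the server log earlier in this thread; that the endpoint accepts them as a query string in your version is an assumption, and `build_voice_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def build_voice_url(base_url: str, text: str, speaker_id: int) -> str:
    """Return a /voice URL carrying the text and the chosen speaker id."""
    query = urlencode({"text": text, "id": speaker_id})
    return f"{base_url.rstrip('/')}/voice?{query}"


if __name__ == "__main__":
    print(build_voice_url("http://127.0.0.1:23456", "Good evening!", 1))
```

Changing only `speaker_id` between calls is enough to compare voices without touching config.py.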

cgnannan commented on July 21, 2024

Got it, thank you.

gzmasterpulse commented on July 21, 2024

How do I install and configure espeak for a Docker deployment in a Linux environment?

Artrajz commented on July 21, 2024

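I believe I already included the command that installs espeak-ng in the Docker setup. You can run espeak-ng --version in the container's terminal to confirm that it is installed. On Linux, installing espeak configures the environment variables automatically, so you do not need to set a DLL path by hand; it works as is.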
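
The same check can be done from Python inside the container. This is a minimal sketch using only the standard library; `espeak_ng_version` is a hypothetical helper that wraps the `espeak-ng --version` command mentioned above.

```python
import shutil
import subprocess


def espeak_ng_version():
    """Return the output of `espeak-ng --version`, or None if unavailable."""
    exe = shutil.which("espeak-ng")
    if exe is None:
        return None  # not on PATH, so most likely not installed
    result = subprocess.run(
        [exe, "--version"], capture_output=True, text=True, check=False
    )
    return result.stdout.strip() or None


if __name__ == "__main__":
    version = espeak_ng_version()
    print(version if version else "espeak-ng was not found on PATH")
```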
