artrajz / vits-simple-api
A simple VITS HTTP API, developed by extending Moegoe with additional features.
License: GNU Affero General Public License v3.0
As the title says.
Here is the log:
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:20:24] "GET /voice?text=%5BLENGTH=1.4%5D你好!有什么我可以为您做的吗?请注意,我只能通过文本输入与您交流,无法识别语音指令。&lang=zh&id=1&format=silk HTTP/1.1" 500 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:20:30] "POST /voice/speakers HTTP/1.1" 200 -
moegoe_1 | ERROR:app:Exception on /voice [GET]
moegoe_1 | Traceback (most recent call last):
moegoe_1 | File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 2528, in wsgi_app
moegoe_1 | response = self.full_dispatch_request()
moegoe_1 | File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
moegoe_1 | rv = self.handle_user_exception(e)
moegoe_1 | File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
moegoe_1 | rv = self.dispatch_request()
moegoe_1 | File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
moegoe_1 | return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
moegoe_1 | File "/app/app.py", line 51, in voice_api
moegoe_1 | speaker_id = int(request.args.get("id", app.config["ID"]))
moegoe_1 | KeyError: 'ID'
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:20:30] "GET /voice?text=%5BLENGTH=1.4%5D&lang=zh&id=1&format=silk HTTP/1.1" 500 -
moegoe_1 | ERROR:app:Exception on /voice [GET]
moegoe_1 | Traceback (most recent call last):
moegoe_1 | File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 2528, in wsgi_app
moegoe_1 | response = self.full_dispatch_request()
moegoe_1 | File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
moegoe_1 | rv = self.handle_user_exception(e)
moegoe_1 | File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
moegoe_1 | rv = self.dispatch_request()
moegoe_1 | File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
moegoe_1 | return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
moegoe_1 | File "/app/app.py", line 51, in voice_api
moegoe_1 | speaker_id = int(request.args.get("id", app.config["ID"]))
moegoe_1 | KeyError: 'ID'
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:20:35] "GET /voice?text=%5BLENGTH=1.4%5D你好!有什么我可以帮助您的吗?&lang=zh&id=1&format=silk HTTP/1.1" 500 -
moegoe_1 | * Serving Flask app 'app'
moegoe_1 | * Debug mode: off
moegoe_1 | INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
moegoe_1 | * Running on all addresses (0.0.0.0)
moegoe_1 | * Running on http://127.0.0.1:23457
moegoe_1 | * Running on http://172.21.0.2:23457
moegoe_1 | INFO:werkzeug:Press CTRL+C to quit
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:23:06] "POST /voice//speakers HTTP/1.1" 308 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:23:06] "POST /voice/speakers HTTP/1.1" 200 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:23:07] "POST /voice//speakers HTTP/1.1" 308 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:23:07] "POST /voice/speakers HTTP/1.1" 200 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:23:07] "GET /voice/?text=%5BLENGTH=1.4%5D&lang=zh&id=1&format=silk HTTP/1.1" 200 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:23:12] "GET /voice/?text=%5BLENGTH=1.4%5D这句话太长了,抱歉&lang=zh&id=1&format=silk HTTP/1.1" 200 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:26:46] "GET /voice/?text=%5BLENGTH=1.4%5D消息已收到!当前我还有条消息要回复,请您稍等。&lang=zh&id=1&format=silk HTTP/1.1" 200 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:27:12] "GET /voice/?text=%5BLENGTH=1.4%5D消息已收到!当前我还有条消息要回复,请您稍等。&lang=zh&id=1&format=silk HTTP/1.1" 200 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:27:27] "GET /voice/?text=%5BLENGTH=1.4%5D消息已收到!当前我还有条消息要回复,请您稍等。&lang=zh&id=1&format=silk HTTP/1.1" 200 -
moegoe_1 | * Serving Flask app 'app'
moegoe_1 | * Debug mode: off
moegoe_1 | INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
moegoe_1 | * Running on all addresses (0.0.0.0)
moegoe_1 | * Running on http://127.0.0.1:23457
moegoe_1 | * Running on http://172.21.0.2:23457
moegoe_1 | INFO:werkzeug:Press CTRL+C to quit
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:34:12] "POST /voice//speakers HTTP/1.1" 308 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:34:12] "POST /voice/speakers HTTP/1.1" 200 -
moegoe_1 | INFO:werkzeug:101.33.231.208 - - [07/Apr/2023 17:34:12] "GET /voice/?text=%5BLENGTH=1.4%5D你好!有什么我可以帮助你的吗?&lang=zh&id=1&format=silk HTTP/1.1" 200 -
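In the traceback above, KeyError: 'ID' is raised at request.args.get("id", app.config["ID"]) even though the request carries id=1. Python evaluates the default argument before get is called, so a config.py with no ID entry fails on every request. Below is a minimal sketch of a lookup that tolerates the missing key. It uses plain dictionaries in place of Flask's request.args and app.config; the helper name speaker_id_from and the fallback value 0 are assumptions for illustration, not the project's code.

```python
def speaker_id_from(args, config, fallback=0):
    """Return the speaker id from the query args, else the config, else a fallback."""
    raw = args.get("id")
    if raw is None:
        # Only consult the config when the request did not supply an id,
        # and use .get so that a config without "ID" does not raise KeyError.
        raw = config.get("ID", fallback)
    return int(raw)


# The failing request passed id=1 while the config had no "ID" entry.
print(speaker_id_from({"id": "1"}, {}))
```

Adding the default entries (ID, FORMAT, LANG and so on) to the mounted config.py should avoid the error as well, assuming the image in use expects them.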
wget http://127.0.0.1:23456/voice/speakers
This returns only the id and name of each speaker, and without property names, so it looks more like a map?
{"HuBert-VITS":[],"VITS":[{"0":"爱丽丝"},{"1":"日奈"},{"2":"星野"},{"3":"优香"}],"W2V2-VITS":[]}
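Until the response carries explicit property names, a client can reshape it itself. The sketch below assumes the response has exactly the shape shown above (model type mapped to a list of single-entry {id: name} objects); the function name normalize_speakers and the output keys type, id and name are choices made here, not part of the API.

```python
import json


def normalize_speakers(payload):
    """Flatten {"VITS": [{"0": "爱丽丝"}, ...]} into records with explicit keys."""
    records = []
    for model_type, speakers in payload.items():
        for entry in speakers:
            for speaker_id, name in entry.items():
                records.append({"type": model_type, "id": int(speaker_id), "name": name})
    return records


raw = '{"HuBert-VITS":[],"VITS":[{"0":"爱丽丝"},{"1":"日奈"},{"2":"星野"},{"3":"优香"}],"W2V2-VITS":[]}'
speakers = normalize_speakers(json.loads(raw))
for speaker in speakers:
    print(speaker)
```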
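When requesting /voice?lang=mix, the supplied text could be tagged with languages automatically.
For example: /voice?lang=mix&text=你好用日语说是こんにちは
The server could automatically tag it as [ZH]你好用日语说是[ZH][JA]こんにちは[JA]
If this could be done on the server side, clients would need much less code, and the project could be used directly as a plug-and-play API.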
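A rough idea of what such tagging could look like is sketched below. It is a deliberately naive heuristic based on Unicode ranges, not the project's implementation: any run containing kana is tagged [JA] and everything else [ZH]. The name tag_mixed is hypothetical.

```python
import re

# Hiragana and Katakana blocks. Han characters are shared by both languages,
# so this sketch treats anything that is not kana as Chinese.
KANA = re.compile(r"[\u3040-\u30ff]")


def tag_mixed(text):
    segments = []  # list of [lang, chunk] pairs, in order of appearance
    for ch in text:
        lang = "JA" if KANA.match(ch) else "ZH"
        if segments and segments[-1][0] == lang:
            segments[-1][1] += ch  # extend the current run
        else:
            segments.append([lang, ch])  # start a new run
    return "".join(f"[{lang}]{chunk}[{lang}]" for lang, chunk in segments)


print(tag_mixed("你好用日语说是こんにちは"))
# [ZH]你好用日语说是[ZH][JA]こんにちは[JA]
```

Japanese words written entirely in kanji would be tagged as Chinese by this heuristic, so a real implementation would likely need a proper language identifier.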
The following error occurs when deploying with Docker:
INFO:moegoe-simple-api:角色id:0
INFO:moegoe-simple-api:合成文本:[ZH]您好!有什么我可以帮助您的吗?[ZH]
ERROR:app:Exception on /voice [GET]
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 2528, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1825, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.9/site-packages/flask/app.py", line 1799, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "/app/app.py", line 80, in voice_api
output, file_type, fname = real_obj.generate(text=text,
File "/app/voice.py", line 100, in generate
stn_tst = self.get_text(text, self.hps_ms, cleaned=cleaned)
File "/app/voice.py", line 56, in get_text
text_norm = text_to_sequence(text, hps.symbols, hps.data.text_cleaners)
File "/app/text/__init__.py", line 17, in text_to_sequence
clean_text = _clean_text(text, cleaner_names)
File "/app/text/__init__.py", line 31, in _clean_text
text = cleaner(text)
File "/app/text/cleaners.py", line 118, in shanghainese_cleaners
from text.shanghainese import shanghainese_to_ipa
File "/app/text/shanghainese.py", line 6, in <module>
converter = opencc.OpenCC('zaonhe')
File "/usr/local/lib/python3.9/site-packages/opencc/__init__.py", line 43, in __init__
super(OpenCC, self).__init__(config)
RuntimeError: /usr/local/lib/python3.9/site-packages/opencc/clib/share/opencc/zaonhe.json not found or not accessible.
INFO:werkzeug:172.30.0.1 - - [10/Apr/2023 13:28:30] "GET /voice?text=您好!有什么我可以帮助您的吗?&lang=zh&id=0&format=silk&length=1.4 HTTP/1.1" 500 -
What is the cause?
Logging does not work when no model is loaded #36 (comment)
Can v1 models be used?
http://127.0.0.1:23456/voice/speakers returns {"HUBERT-VITS":[],"VITS":[],"W2V2-VITS":[]}. Does this indicate a failure?
http://127.0.0.1:23456/voice?text=我喜欢看**记录片
{"message":"id 0 does not exist","status":"error"}
DEBUG:vits-simple-api:[GD]君不见,黄河之水天上来,奔流到海不复回。君不见,高堂明镜悲白发,朝如青丝暮成雪[GD]
ERROR:app:Exception on /voice/w2v2-vits [POST]
Traceback (most recent call last):
File "C:\Users\lin85\AppData\Local\Programs\Python\Python310\lib\site-packages\flask\app.py", line 2528, in wsgi_app
response = self.full_dispatch_request()
File "C:\Users\lin85\AppData\Local\Programs\Python\Python310\lib\site-packages\flask\app.py", line 1825, in full_dispatch_request
rv = self.handle_user_exception(e)
File "C:\Users\lin85\AppData\Local\Programs\Python\Python310\lib\site-packages\flask\app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "C:\Users\lin85\AppData\Local\Programs\Python\Python310\lib\site-packages\flask\app.py", line 1799, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "G:\AI\vits接口\VITS\app.py", line 44, in check_api_key
return func(*args, **kwargs)
File "G:\AI\vits接口\VITS\app.py", line 239, in voice_w2v2_api
output = tts.w2v2_vits_infer({"text": text,
File "G:\AI\vits接口\VITS\voice.py", line 454, in w2v2_vits_infer
audio = voice_obj.get_audio(voice, auto_break=True)
File "G:\AI\vits接口\VITS\voice.py", line 216, in get_audio
self.get_infer_param(text=sentence, speaker_id=speaker_id, length=length, noise=noise,
File "G:\AI\vits接口\VITS\voice.py", line 119, in get_infer_param
stn_tst = self.get_cleaned_text(text, self.hps_ms, cleaned=cleaned)
File "G:\AI\vits接口\VITS\voice.py", line 59, in get_cleaned_text
text_norm = text_to_sequence(text, hps.symbols, hps.data.text_cleaners)
File "G:\AI\vits接口\VITS\text\__init__.py", line 17, in text_to_sequence
clean_text = _clean_text(text, cleaner_names)
File "G:\AI\vits接口\VITS\text\__init__.py", line 31, in _clean_text
text = cleaner(text)
File "G:\AI\vits接口\VITS\text\cleaners.py", line 241, in chinese_dialect_cleaners
text = re.sub(r'\[GD\](.*?)\[GD\]',
File "C:\Users\lin85\AppData\Local\Programs\Python\Python310\lib\re.py", line 209, in sub
return _compile(pattern, flags).sub(repl, string, count)
File "G:\AI\vits接口\VITS\text\cleaners.py", line 242, in <lambda>
lambda x: cantonese_to_ipa(x.group(1)) + ' ', text)
File "G:\AI\vits接口\VITS\text\cantonese.py", line 62, in cantonese_to_ipa
text = converter.convert(text).replace('-', '').replace('$', ' ')
File "C:\Users\lin85\AppData\Local\Programs\Python\Python310\lib\site-packages\opencc\__init__.py", line 87, in convert
retv_i = libopencc.opencc_convert_utf8(self._od, text, len(text))
OSError: exception: access violation reading 0xFFFFFFFFFFFFFFFF
INFO:werkzeug:127.0.0.1 - - [31/May/2023 10:51:14] "POST /voice/w2v2-vits HTTP/1.1" 500 -
DEBUG:urllib3.connectionpool:http://127.0.0.1:23456 "POST /voice/w2v2-vits HTTP/1.1" 500 265
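As a common format, mp3 can be opened inside many applications, and it is compressed. Some applications cannot open ogg internally, so the file has to be opened in an external player, which is a bit of a hassle. Thanks!
Running a VITS model on a server requires a fairly capable runtime environment, and a local computer cannot be kept running without ever shutting down. By comparison, the free tier offered by Hugging Face can easily run a VITS project.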
Deployment method: docker-compose
Below is my config.py
File contents:
import os
import sys
JSON_AS_ASCII = False
MAX_CONTENT_LENGTH = 5242880
# port
PORT = 23456
# absolute path
ABS_PATH = os.path.join(os.path.dirname(os.path.realpath(sys.argv[0])))
# upload path
UPLOAD_FOLDER = ABS_PATH + "/upload"
# cache path
CACHE_PATH = ABS_PATH + "/cache"
# zh ja ko en ...
LANGUAGE_AUTOMATIC_DETECT = ["zh","ja"]
#set to True to enable API Key authentication
API_KEY_ENABLED = False
# API_KEY is required for authentication
API_KEY = "api-key"
'''
For each model, the filling method is as follows
example:
MODEL_LIST = [
#VITS
[ABS_PATH+"/Model/Nene_Nanami_Rong_Tang/1374_epochs.pth", ABS_PATH+"/Model/Nene_Nanami_Rong_Tang/config.json"],
[ABS_PATH+"/Model/Zero_no_tsukaima/1158_epochs.pth", ABS_PATH+"/Model/Zero_no_tsukaima/config.json"],
[ABS_PATH+"/Model/g/G_953000.pth", ABS_PATH+"/Model/g/config.json"],
#HuBert-VITS
[ABS_PATH+"/Model/louise/360_epochs.pth", ABS_PATH+"/Model/louise/config.json", ABS_PATH+"/Model/louise/hubert-soft-0d54a1f4.pt"],
]
'''
# load multiple models
MODEL_LIST = [
[ABS_PATH+"/Model/Nene_Meguru_Yoshino_Mako_Myrasame_hoharu_Nanami/365_epochs.pth", ABS_PATH+"/Model/Nene_Meguru_Yoshino_Mako_Myrasame_hoharu_Nanami/config.json"],
[ABS_PATH+"/Model/to_love/1113_epochs.pth", ABS_PATH+"/Model/to_love/config.json"],
]
"""
default params
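The options below are the defaults used by the VITS GET method when a parameter is not specified
"""
# GET default speaker id
ID = 0
# GET default audio format; options: wav, ogg, silk
FORMAT = "wav"
# GET default language
LANG = "AUTO"
# GET default speech length, which effectively adjusts speed; the larger the value, the slower the speech
LENGTH = 1
# GET default noise
NOISE = 0.667
# GET default noise deviation
NOISEW = 0.8
# Long-text segmentation threshold; text will not be divided if max<=0
MAX = 50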
Requests to other endpoints, such as http://127.0.0.1:23456/voice/speakers, return normally, but a request to http://127.0.0.1:23456/voice?text=你好呀 produces audio that contains only breathing sounds.
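This project is great. I would like to ask where the SSML pauses are handled in the project; I could not find it. If I want to implement custom pauses in the original vits project, how should I go about it? Could you outline the approach? Thanks.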
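One common way to approach pauses, sketched here under assumptions and not taken from this project's source, is to split the SSML into text segments and break elements, synthesize each text segment separately, and insert the corresponding number of zero-valued samples between them. The function names and the 22050 Hz default sampling rate are hypothetical; the real rate would come from the model's config.json.

```python
import xml.etree.ElementTree as ET


def parse_break_time(value):
    """Convert an SSML time value such as "500ms" or "1.5s" to seconds."""
    value = value.strip()
    if value.endswith("ms"):  # check "ms" first, since it also ends with "s"
        return float(value[:-2]) / 1000.0
    if value.endswith("s"):
        return float(value[:-1])
    raise ValueError(f"unsupported time value: {value!r}")


def ssml_to_segments(ssml):
    """Return a list of ("text", str) and ("break", seconds) tuples in document order."""
    root = ET.fromstring(ssml)
    segments = []

    def add_text(text):
        if text and text.strip():
            segments.append(("text", text.strip()))

    add_text(root.text)
    for child in root:
        if child.tag == "break":
            segments.append(("break", parse_break_time(child.get("time", "0s"))))
        else:
            add_text("".join(child.itertext()))
        add_text(child.tail)  # text that follows the child element
    return segments


def silence_samples(seconds, sampling_rate=22050):
    """Number of zero samples to insert for a pause of the given length."""
    return int(round(seconds * sampling_rate))


segments = ssml_to_segments('<speak>你好<break time="500ms"/>世界</speak>')
print(segments)
```

In a VITS pipeline, each ("text", ...) segment would go through inference on its own, and each ("break", ...) would become an array of zeros of length silence_samples(seconds, rate) concatenated between the audio chunks.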
The terminal shows "GET /favicon.ico HTTP/1.1" 404 -
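After cloning the project onto the server, as the next step, do the model library and config.py both have to be placed under /path/to/.... at the server root? When app.py reads config.py, however, it does not specify ABS_PATH, so it should be loading the file directly from the same directory. Can it be found that way? It feels a bit counterintuitive, so I would like to confirm.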
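The config.py files pasted elsewhere on this page define ABS_PATH from sys.argv[0], which suggests that paths are resolved relative to the directory of the launched script and not the server root. The sketch below reuses that expression with the script path made a parameter; abs_path_for and model_entry are hypothetical helper names, and /srv/vits is only a placeholder directory.

```python
import os
import sys


def abs_path_for(script_path):
    # Same expression as ABS_PATH in config.py, with sys.argv[0] made a parameter.
    return os.path.join(os.path.dirname(os.path.realpath(script_path)))


def model_entry(abs_path, folder, checkpoint):
    # Builds one MODEL_LIST row in the style used by config.py.
    return [abs_path + "/Model/" + folder + "/" + checkpoint,
            abs_path + "/Model/" + folder + "/config.json"]


# When started as "python app.py", this is the directory containing app.py.
ABS_PATH = abs_path_for(sys.argv[0])
print(model_entry("/srv/vits", "g", "G_953000.pth"))
```

Under that assumption, a Model folder placed next to app.py is what the ABS_PATH-based entries point at, wherever the clone lives on disk.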
Could you add streaming? Is it possible to implement a streaming response?
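One possible shape for streaming, offered as a sketch and not as the project's design, is to split the text into sentences and yield the audio for each sentence as soon as it is ready. The synthesize callable below is a hypothetical stand-in for the model's inference function; in Flask, a generator like this can be passed to Response to produce a chunked reply.

```python
import re


def split_sentences(text):
    """Split after Chinese or Western sentence-ending punctuation, keeping the marks."""
    parts = re.split(r"(?<=[。!?!?.])", text)
    return [part for part in parts if part.strip()]


def stream_audio(text, synthesize):
    """Yield one audio chunk per sentence instead of waiting for the whole text."""
    for sentence in split_sentences(text):
        yield synthesize(sentence)


# Stand-in synthesizer for demonstration: returns the UTF-8 bytes of the sentence.
chunks = list(stream_audio("你好。今天天气不错!", lambda s: s.encode("utf-8")))
print(len(chunks))
```

Formats with a single header that declares the total length, such as wav, would need extra care, because the header is normally written before the length of later chunks is known.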
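I tested it and it works great! Thank you!
There seem to be three bugs, though (or perhaps I am using it incorrectly?)
1. Dialects do not work and fall back to Mandarin automatically. For example, with Cantonese, [GD]XXXXXXXX[GD] has no effect and is read as Mandarin;
2. Using the author's post.py, I found that SSML triggers this bug: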
Traceback (most recent call last):
File "1.py", line 5, in <module>
voice_ssml(smm);
File "D:\ai\vits\bmss_fy\vits-simple-api-windows\post.py", line 151, in voice_ssml
fname = re.findall("filename=(.+)", res.headers["Content-Disposition"])[0]
File "C:\Users\lin85\AppData\Local\Programs\Python\Python38\lib\site-packages\requests\structures.py", line 52, in __getitem__
return self._store[key.lower()][1]
KeyError: 'content-disposition'
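3. (Not a bug, haha) The original emotion feature can apply an emotion from a single npy file. Will support for a single npy file be added later?
Finally, thank you for your work.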
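The KeyError in item 2 comes from indexing res.headers["Content-Disposition"] on a response that does not carry that header, which is typically the case when the server answers with an error body in place of an audio file. Below is a defensive sketch of the filename lookup; filename_from_headers and the default name are assumptions, and a plain dict is used here where requests would supply a case-insensitive mapping.

```python
import re


def filename_from_headers(headers, default="output"):
    """Extract the filename from Content-Disposition, or return a default."""
    disposition = headers.get("Content-Disposition")
    if not disposition:
        return default  # header missing: likely an error response, not audio
    match = re.search(r'filename="?([^";]+)"?', disposition)
    return match.group(1) if match else default


print(filename_from_headers({"Content-Disposition": "attachment; filename=abc.wav"}))
print(filename_from_headers({}))
```

Checking res.status_code and printing res.text before parsing the header would also show the server's actual error message.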
An error occurs when running python app.py. The log is as follows:
root@ecsekei:~/vits# python3 app.py
torch:2.0.0+cu117 GPU_available:False
device:cpu device.type:cpu
Traceback (most recent call last):
File "/root/vits/app.py", line 25, in <module>
voice_obj, voice_speakers = merge_model(app.config["MODEL_LIST"])
File "/root/vits/utils/merge.py", line 53, in merge_model
obj = vits(model=i[0], config=i[1])
File "/root/vits/voice.py", line 54, in __init__
self.load_model(model, model_)
File "/root/vits/voice.py", line 57, in load_model
utils.load_checkpoint(model, self.net_g_ms)
File "/root/vits/utils/utils.py", line 43, in load_checkpoint
checkpoint_dict = load(checkpoint_path, map_location='cpu')
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 815, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/lib/python3.10/dist-packages/torch/serialization.py", line 1033, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
EOFError: Ran out of input
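EOFError: Ran out of input is what pickle raises when it is asked to load from an empty stream, so one thing worth ruling out is a zero-byte checkpoint, for example from an interrupted download or copy. The sketch below checks every path in a MODEL_LIST-style list before loading; find_unreadable_files is a hypothetical helper, and the temporary files exist only to make the example self-contained.

```python
import os
import tempfile


def find_unreadable_files(model_list):
    """Return (path, reason) for every listed file that is missing or empty."""
    problems = []
    for entry in model_list:
        for path in entry:
            if not os.path.isfile(path):
                problems.append((path, "missing"))
            elif os.path.getsize(path) == 0:
                problems.append((path, "empty"))
    return problems


with tempfile.TemporaryDirectory() as tmp:
    empty = os.path.join(tmp, "empty.pth")
    open(empty, "wb").close()  # zero-byte checkpoint
    config = os.path.join(tmp, "config.json")
    with open(config, "w", encoding="utf-8") as handle:
        handle.write("{}")
    gone = os.path.join(tmp, "gone.pth")  # never created
    report = find_unreadable_files([[empty, config], [gone, config]])

for path, reason in report:
    print(reason, path)
```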
It seems something goes wrong when installing packages:
(py30) zhonghaoli@MacBook-Pro-14-inch-2021 vits-simple-api % pip install -r requirements.txt
Defaulting to user installation because normal site-packages is not writeable
Collecting numba (from -r requirements.txt (line 1))
Using cached numba-0.57.0-cp310-cp310-macosx_11_0_arm64.whl (2.5 MB)
Collecting librosa (from -r requirements.txt (line 2))
Using cached librosa-0.10.0.post2-py3-none-any.whl (253 kB)
Collecting numpy==1.23.3 (from -r requirements.txt (line 3))
Using cached numpy-1.23.3-cp310-cp310-macosx_11_0_arm64.whl (13.3 MB)
Collecting scipy (from -r requirements.txt (line 4))
Using cached scipy-1.10.1-cp310-cp310-macosx_12_0_arm64.whl (28.8 MB)
Collecting torch (from -r requirements.txt (line 5))
Using cached torch-2.0.1-cp310-none-macosx_11_0_arm64.whl (55.8 MB)
Collecting unidecode (from -r requirements.txt (line 6))
Using cached Unidecode-1.3.6-py3-none-any.whl (235 kB)
Collecting openjtalk==0.3.0.dev2 (from -r requirements.txt (line 7))
Using cached openjtalk-0.3.0.dev2.tar.gz (24.9 MB)
Installing build dependencies ... done
Getting requirements to build wheel ... error
error: subprocess-exited-with-error
× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> [18 lines of output]
Traceback (most recent call last):
File "/Users/zhonghaoli/miniforge3/envs/py30/lib/python3.10/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in
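Hello, I have recently been testing speech synthesis options for a WeChat bot and ran into a problem when calling vits-simple-api, so I would like to ask for your help.
Pre-checks
1. I confirm that I am running the latest version of the code and have installed the required dependencies
Have you searched the issues for a similar problem?
2. I have searched the issues and found none related to my problem
Operating system?
Windows 10 Pro (21H2)
Python version?
python 3.10.8
Steps to reproduce 🕹
Open a terminal -> activate the environment (fort) -> python app.py -> send a voice message to the WeChat bot -> the request fails
Problem description 😯
When calling this repository's speech synthesis API from the WeChat bot project (https://github.com/zhayujie/chatgpt-on-wechat), it shows that speech synthesis failed.
Terminal log 📒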
127.0.0.1 - - [14/May/2023 13:12:40] "POST /voice HTTP/1.1" 400 -
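Hello,
vits-simple-api currently has no authentication, which means that an instance deployed on the public internet will respond to every request. Once the server URL is exposed, there is a risk of abuse. Please consider adding authentication along the lines of an API key. Thanks!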
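A minimal sketch of such a check follows, kept independent of any web framework. The parameter names api_key and enabled mirror the API_KEY and API_KEY_ENABLED settings that appear in a config.py pasted elsewhere on this page; the header name X-API-KEY and the function is_authorized are assumptions for illustration.

```python
import hmac


def is_authorized(headers, api_key, enabled=True, header_name="X-API-KEY"):
    """Return True when authentication is disabled or the supplied key matches."""
    if not enabled:
        return True
    supplied = headers.get(header_name, "")
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode("utf-8"), api_key.encode("utf-8"))


print(is_authorized({"X-API-KEY": "api-key"}, "api-key"))
print(is_authorized({}, "api-key"))
```

In a Flask app this check would typically live in a decorator or a before_request hook that returns a 401 response when it fails.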
My server runs CentOS 8. I tried to start the project:
python3 app.py
torch:2.0.1+cu117 GPU_available:False
device:cpu device.type:cpu
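At this point CPU usage and disk I/O spike, io_util quickly reaches its peak, and the server stops responding. Closing the terminal and logging in again does not help; the only option is to reboot the server. When it hung I glanced at the disk read rate, which was 285508 kb/s. Is this the models being read? I only loaded three pth models in total.
Is there a good way to deal with this? (I have tried many times and it hangs every time.)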
config.py
import os
import sys
JSON_AS_ASCII = False
MAX_CONTENT_LENGTH = 5242880
PORT = 23457
ABS_PATH = os.path.join(os.path.dirname(os.path.realpath(sys.argv[0])))
UPLOAD_FOLDER = ABS_PATH + "/upload"
CACHE_PATH = ABS_PATH + "/cache"
'''
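How to fill in VITS model paths: each line in MODEL_LIST is
[ABS_PATH+"/Model/{model folder}/{.pth model}", ABS_PATH+"/Model/{model folder}/config.json"],
Relative or absolute paths also work; because Windows and Linux write paths differently, the form above or an absolute path is the safest
Example: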
MODEL_LIST = [
#VITS
[ABS_PATH+"/Model/Nene_Nanami_Rong_Tang/1374_epochs.pth", ABS_PATH+"/Model/Nene_Nanami_Rong_Tang/config.json"],
[ABS_PATH+"/Model/Zero_no_tsukaima/1158_epochs.pth", ABS_PATH+"/Model/Zero_no_tsukaima/config.json"],
[ABS_PATH+"/Model/g/G_953000.pth", ABS_PATH+"/Model/g/config.json"],
#HuBert-VITS
[ABS_PATH+"/Model/louise/360_epochs.pth", ABS_PATH+"/Model/louise/config.json", ABS_PATH+"/Model/louise/hubert-soft-0d54a1f4.pt"],
]
'''
MODEL_LIST = [
[ABS_PATH+"/Model/g/1374_epochs.pth", ABS_PATH+"/Model/g/config.json"],
]
docker-compose.yaml
version: '3.4'
services:
  moegoe:
    image: artrajz/moegoe-simple-api:latest
    restart: always
    ports:
      - 23457:23457
    environment:
      LANG: 'C.UTF-8'
    volumes:
      - ./Model:/app/Model # mount the model folder
      - ./config.py:/app/config.py # mount the config file
After following the deployment steps, running python app.py at the end raises an error:
Traceback (most recent call last):
File "app.py", line 10, in <module>
from utils import clean_folder, merge_model
File "C:\Users\Administrator\Desktop\MoeGoe-Simple-API\utils.py", line 120, in <module>
def to_pcm(in_path: str) -> tuple[str, int]:
TypeError: 'type' object is not subscriptable
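Subscripting the built-in tuple in an annotation, as in tuple[str, int], is only supported from Python 3.9 onward (PEP 585); older interpreters raise exactly this TypeError when the def statement is evaluated. Upgrading Python is one option. Another is typing.Tuple, as in the sketch below, where describe_file is a hypothetical stand-in because the body of to_pcm is not shown in the report.

```python
from typing import Tuple


def describe_file(in_path: str) -> Tuple[str, int]:
    # Hypothetical stand-in body: the real to_pcm in utils.py is not shown in the report.
    # typing.Tuple works on Python versions older than 3.9, unlike tuple[str, int].
    return in_path, len(in_path)


print(describe_file("a.wav"))
```

Placing from __future__ import annotations at the top of the module is a further alternative on Python 3.7 and 3.8, since annotations are then no longer evaluated at definition time.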
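Sometimes I need to add an npy for a new emotion, and I have to go to MoeGoe first to generate it. I would still suggest adding emotion control from audio, since switching back and forth is tedious, haha, and it still depends on MoeGoe. If vits-simple-api could generate npy files itself, it would be fully self-contained and MoeGoe could be removed, haha.
It seems to be caused by an incompatible config file. Is there a solution?
After I moved "n_speakers" into data in the config file it runs, but an error occurs while generating the wav, and every generated file is 1 KB and unplayable.
I have two servers. The server where I deploy the chat-bot is too low-spec and its disk is too small. If this project is deployed on another server, how should I access it? Is there a concrete approach, or what should I search for? The question is rather basic, sorry for the trouble.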
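Since the service is a plain HTTP API, the chat-bot only needs the address of the machine running vits-simple-api in place of 127.0.0.1, provided the port is reachable through any firewall in between. The sketch below builds a /voice URL using the query parameters that appear in the logs on this page (text, id, lang, format); 192.0.2.10 is a placeholder address, and build_voice_url and fetch_voice are hypothetical helper names.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_voice_url(base_url, text, speaker_id=0, lang="zh", fmt="wav"):
    """Build a GET URL for the /voice endpoint on a remote host."""
    query = urlencode({"text": text, "id": speaker_id, "lang": lang, "format": fmt})
    return base_url.rstrip("/") + "/voice?" + query


def fetch_voice(base_url, text, out_path, **params):
    """Download the synthesized audio to out_path (performs a network request)."""
    with urlopen(build_voice_url(base_url, text, **params), timeout=60) as response:
        with open(out_path, "wb") as handle:
            handle.write(response.read())


# Placeholder address: replace with the IP or domain of the server running the API.
print(build_voice_url("http://192.0.2.10:23456/", "hi", speaker_id=1))
```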
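I found that by default it can only run inference on the CPU. Even with 48 cores it is still slow and does not use them fully. I also have a 3090, and using the GPU would greatly speed up inference.
The error output is as follows: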
[E 230613 19:09:03 fastlid:170] fastlid.set_languages is not a list
[JA]こんにちは[JA]
ERROR:app:Exception on /voice [POST]
Traceback (most recent call last):
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\py310\lib\site-packages\flask\app.py", line 2528, in wsgi_app
response = self.full_dispatch_request()
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\py310\lib\site-packages\flask\app.py", line 1825, in full_dispatch_request
rv = self.handle_user_exception(e)
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\py310\lib\site-packages\flask\app.py", line 1823, in full_dispatch_request
rv = self.dispatch_request()
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\py310\lib\site-packages\flask\app.py", line 1799, in dispatch_request
return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\app.py", line 37, in check_api_key
return func(*args, **kwargs)
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\app.py", line 136, in voice_api
output = real_obj.create_infer_task(text=text,
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\voice.py", line 211, in create_infer_task
self.get_infer_param(text=sentence, speaker_id=speaker_id, length=length, noise=noise,
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\voice.py", line 147, in get_infer_param
stn_tst = self.get_cleaned_text(text, self.hps_ms, cleaned=cleaned)
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\voice.py", line 71, in get_cleaned_text
text_norm = text_to_sequence(text, hps.symbols, hps.data.text_cleaners)
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\text\__init__.py", line 17, in text_to_sequence
clean_text = _clean_text(text, cleaner_names)
File "E:\aila\API\vits-simple-api-windows_3\vits-simple-api\text\__init__.py", line 28, in _clean_text
cleaner = getattr(cleaners, name)
AttributeError: module 'text.cleaners' has no attribute 'custom_cleaners'
INFO:werkzeug:127.0.0.1 - - [13/Jun/2023 19:09:03] "POST /voice HTTP/1.1" 500 -
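Some platforms (such as QQ) require the amr format in order to send audio. Other formats have to be transcoded to amr with ffmpeg, so it would be better to transcode automatically on the server and save the client that step.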
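Until such an option exists, the conversion can be scripted around ffmpeg. The sketch below assumes an ffmpeg binary on PATH that was built with an AMR-NB encoder; AMR-NB is an 8 kHz mono format, hence the -ar 8000 -ac 1 flags. The function names are hypothetical.

```python
import shutil
import subprocess


def build_ffmpeg_cmd(src, dst, sample_rate=None, channels=None):
    """Assemble an ffmpeg command line; the output format follows dst's extension."""
    cmd = ["ffmpeg", "-y", "-i", src]
    if sample_rate:
        cmd += ["-ar", str(sample_rate)]
    if channels:
        cmd += ["-ac", str(channels)]
    cmd.append(dst)
    return cmd


def to_amr(src, dst):
    """Transcode src to AMR-NB (8 kHz, mono). Requires ffmpeg with AMR support."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg not found on PATH")
    subprocess.run(build_ffmpeg_cmd(src, dst, sample_rate=8000, channels=1), check=True)


print(" ".join(build_ffmpeg_cmd("in.wav", "out.amr", sample_rate=8000, channels=1)))
```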
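I cannot deploy the Docker image with that script. I looked at its contents and there is nothing after docker compose pull. Has it not been developed yet?
I tried the following configuration: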
version: '3.4'
services:
  vits:
    image: vits-simple-api:latest
    restart: always
    ports:
      - 23456:23456
    environment:
      LANG: 'C.UTF-8'
      TZ: Asia/Shanghai #timezone
    volumes:
      - ./Model:/app/Model # mount the model folder
      - ./config.py:/app/config.py # mount the config file
      - /opt/cuda:/opt/cuda
      - ./cuda_test.py:/app/cuda_test.py
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [ gpu,utility ]
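But torch.cuda.is_available() is False.
nvidia-smi is available.
In the Dockerfile I removed RUN pip install torch --index-url https://download.pytorch.org/whl/cpu and changed it to RUN pip install torch.
After that I added cuda to PATH inside the container so that nvcc -v is also available, but that still did not solve the problem.
After running app.py, getattr raises an error saying "attribute name must be string".
The model paths in the config file are set as follows: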
MODEL_LIST = [
# VITS
[ABS_PATH + "/Model/Xin/G_32000.pth", ABS_PATH + "/Model/Xin/config.json"],
#[ABS_PATH + "/Model/Zero_no_tsukaima/1158_epochs.pth", ABS_PATH + "/Model/Zero_no_tsukaima/config.json"],
#[ABS_PATH + "/Model/g/G_953000.pth", ABS_PATH + "/Model/g/config.json"],
# HuBert-VITS (Need to configure HUBERT_SOFT_MODEL)
#[ABS_PATH + "/Model/louise/360_epochs.pth", ABS_PATH + "/Model/louise/config.json"],
# W2V2-VITS (Need to configure DIMENSIONAL_EMOTION_NPY)
#[ABS_PATH + "/Model/w2v2-vits/1026_epochs.pth", ABS_PATH + "/Model/w2v2-vits/config.json"],
]
When I tested this, it automatically identified some words as Japanese.
It still reproduces even when I wrap the text in [ZH].
The log is as follows:
DEBUG:vits-simple-api:[[EN]ZH][EN][ZH] 君不见,[ZH][JA]黄河之水天上来,[JA][ZH]奔流到海不复回。君不见,高堂明镜悲白发, 朝如青丝暮成雪[[ZH][EN]ZH][EN]
Hi. Thanks for your contribution to the project. I had an issue using your project with a trained model from vits-finetuning: it speaks the [JA] tag aloud in the generated audio file. Is there a way to prevent that? Thanks.
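With GPU acceleration, how long does it usually take to synthesize a short sentence of a dozen or so characters?
I send the bot
say 你好
and the backend reports an error: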
INFO:vits-simple-api:[VITS] len:2 text:你好
[E 230609 13:09:49 fastlid:48] IncompleteRead(451317 bytes read, 486696 more expected)
What is the problem?