visv2tts's Introduction

Vietnamese Voice Clone

Data Preparation

If you use custom data

  • Arrange your custom data in this format:

    • Create a folder: DATA

    • Subfolder: DATA/wavs, which contains the <audio_id>.wav files

    • DATA/train.txt and DATA/val.txt, where each line follows the format: <audio_id>transcript

  • If you don't have transcripts, see the wav2vec inference script
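
A quick consistency check can catch missing audio before training. The sketch below assumes each line of train.txt and val.txt starts with the audio id, followed by whitespace and the transcript (the layout of the VIVOS prompts.txt files); adjust the split if your files use a different separator. The function name find_missing_wavs is hypothetical and not part of this repo.

```python
from pathlib import Path


def find_missing_wavs(data_dir="DATA", lists=("train.txt", "val.txt")):
    """Return audio ids listed in the transcript files that lack a wav file.

    Assumption: each non-empty line is "<audio_id> <transcript>", split on
    the first run of whitespace.
    """
    data_dir = Path(data_dir)
    missing = []
    for name in lists:
        list_path = data_dir / name
        if not list_path.exists():
            continue  # skip transcript lists that have not been created yet
        with open(list_path, encoding="utf-8") as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                audio_id = line.split(maxsplit=1)[0]
                if not (data_dir / "wavs" / f"{audio_id}.wav").exists():
                    missing.append(audio_id)
    return missing


if __name__ == "__main__":
    for audio_id in find_missing_wavs():
        print("missing wav for", audio_id)
```

An empty result means every listed id has a matching file under DATA/wavs.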

If you want to try the VIVOS dataset

wget http://ailab.hcmus.edu.vn/assets/vivos.tar.gz
tar xzf vivos.tar.gz
mkdir -p DATA/wavs
cp -v vivos/*/waves/*/*.wav DATA/wavs
cat vivos/test/prompts.txt > DATA/val.txt
cat vivos/test/prompts.txt > DATA/train.txt
cat vivos/train/prompts.txt >> DATA/train.txt

Install environment

conda create -y -n viclone python=3.8
conda activate viclone
conda install cudatoolkit=11.3.1 cudnn=8.2.1
python -m pip install torch==1.12.0+cu116 torchvision==0.13.0+cu116 torchaudio==0.12.0 --extra-index-url https://download.pytorch.org/whl/cu116
python -m pip install -r requirements.txt
cd vits/monotonic_align
mkdir monotonic_align
python setup.py build_ext --inplace
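
To confirm that the build step produced a compiled extension, a small check like the following may help. It assumes the in-place build writes a core module (.so on Linux, .pyd on Windows) into vits/monotonic_align/monotonic_align, relative to the repo root; the function name find_built_extension is hypothetical.

```python
from pathlib import Path


def find_built_extension(package_dir="vits/monotonic_align/monotonic_align"):
    """Return compiled `core` extension files found in package_dir."""
    package_dir = Path(package_dir)
    if not package_dir.is_dir():
        return []
    # Keep only binary extension modules, not core.c or core.pyx sources.
    return sorted(
        p for p in package_dir.glob("core*")
        if p.suffix in {".so", ".pyd"}
    )


if __name__ == "__main__":
    built = find_built_extension()
    print("built extension:", built if built else "none found")
```

Run it from the repo root. If nothing is found, the build_ext command above most likely did not complete.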

Process data

python Step1_data_processing.py

Extract feature

python Step2_extract_feature.py

Train model

python train_ms.py -c configs/vivos.json -m vivos 

Demo

python app.py

Then open http://127.0.0.1:7860/ in a browser.
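
If the page does not load, it can help to check whether anything is listening on that port. This is a minimal sketch using only the standard library; the function name is_port_open is hypothetical, and 7860 is taken from the URL above.

```python
import socket


def is_port_open(host="127.0.0.1", port=7860, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    print("demo reachable:", is_port_open())
```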

visv2tts's People

Contributors

ak9250, alexpeattie, cclauss, cforcomputer, corentinj, lidalei, matheusfillipe, mathigatti, niwala, ramalamadingdong, rancoud, v-nhandt21, vanpelt


visv2tts's Issues

Question about the pretrained model?

Hello, I am currently deploying an internal broadcasting application that converts text to speech at my organization. Could I ask for the pretrained model of this voice-cloning model? I am looking for a suitable voice for broadcasting, but the open datasets available online do not have one, and training takes too much time given this application's deadline. The application is not being built for commercial purposes. What do you think?

Error at the Process data step

Hello, I installed this on Windows 11, and when I ran "Step1_data_processing.py" I got this error:

(viclone) PS E:\ViSV2TTS-master> python Step1_data_processing.py
0%| | 0/12420 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "Step1_data_processing.py", line 57, in <module>
    process_text()
  File "Step1_data_processing.py", line 15, in process_text
    phoneme = vi2IPA_split(script.lower(), "/")
  File "E:\ViSV2TTS-master\viphoneme\__init__.py", line 510, in vi2IPA_split
    TN= TTSnorm(text)
  File "C:\Users\sakur\AppData\Roaming\Python\Python38\site-packages\vinorm\__init__.py", line 10, in TTSnorm
    fw.write(text)
  File "C:\ProgramData\miniconda3\envs\viclone\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode character '\u1edf' in position 2: character maps to <undefined>

Could you please take a look?
Thanks
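
The last frame of the traceback is in encodings\cp1252.py, which suggests the text file was opened with the Windows default encoding, and cp1252 cannot represent 'ở' (U+1EDF). The sketch below reproduces that behaviour and shows the usual remedy of passing an explicit UTF-8 encoding. It is not the code of vinorm; the helper write_text is hypothetical.

```python
import os
import tempfile


def write_text(path, text, encoding):
    """Write text with an explicit encoding instead of the platform default."""
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


text = "ở đây"  # contains U+1EDF, the character named in the traceback
path = os.path.join(tempfile.mkdtemp(), "input.txt")

try:
    write_text(path, text, encoding="cp1252")  # mimics the Windows default
    cp1252_ok = True
except UnicodeEncodeError:
    cp1252_ok = False

write_text(path, text, encoding="utf-8")  # works regardless of platform
with open(path, encoding="utf-8") as f:
    round_trip = f.read()

print("cp1252 write succeeded:", cp1252_ok)
print("utf-8 round trip intact:", round_trip == text)
```

A possible workaround that avoids editing installed packages is Python's UTF-8 mode (python -X utf8 Step1_data_processing.py, or setting PYTHONUTF8=1), which makes open() default to UTF-8 on Windows. Whether that is enough for this project is untested here.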

Error on running app.py

Hello,
Thank you very much for sharing this great project.

I ran app.py but got an error. I tried modifying subprocess.check_call(Command, env=myenv, cwd=A), but that did not help.
I am running on Windows 10 x64.
I would appreciate any suggestion on how to fix this error.
Looking forward to your reply.

Traceback (most recent call last):
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\gradio\queueing.py", line 456, in call_prediction
    output = await route_utils.call_process_api(
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\gradio\route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\gradio\blocks.py", line 1522, in process_api
    result = await self.call_function(
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\gradio\blocks.py", line 1144, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\gradio\utils.py", line 674, in wrapper
    response = f(*args, **kwargs)
  File "C:/Colorizations/Voice/ViSV2TTS/app.py", line 98, in clonevoice
    outfile, text_norm = object.infer(text, speaker_source)
  File "C:/Colorizations/Voice/ViSV2TTS/app.py", line 63, in infer
    text_norm =TTSnorm(text)
  File "C:\ProgramData\miniconda3\envs\Voice\lib\site-packages\vinorm\__init__.py", line 28, in TTSnorm
    subprocess.check_call(Command, env=myenv, cwd=A)
  File "C:\ProgramData\miniconda3\envs\Voice\lib\subprocess.py", line 341, in call
    with Popen(*popenargs, **kwargs) as p:
  File "C:\ProgramData\miniconda3\envs\Voice\lib\subprocess.py", line 859, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\ProgramData\miniconda3\envs\Voice\lib\subprocess.py", line 1328, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
OSError: [WinError 193] %1 is not a valid Win32 application
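
WinError 193 is what Windows reports when CreateProcess is asked to launch a file that is not a Windows executable. One possibility, not confirmed in this thread, is that the file launched by TTSnorm is a binary built for another platform. The sketch below guesses a file's executable format from its leading bytes; the function name executable_format is hypothetical, and the path to pass is whatever file Command points to in your installation.

```python
def executable_format(path):
    """Guess the executable format of a file from its leading magic bytes."""
    with open(path, "rb") as f:
        magic = f.read(4)
    if magic == b"\x7fELF":
        return "ELF (Linux/Unix)"
    if magic[:2] == b"MZ":
        return "PE (Windows)"
    if magic[:2] == b"#!":
        return "script"
    return "unknown"
```

If the result is ELF, changing the subprocess arguments will not help on Windows; running the project on Linux or under WSL would be the more likely route.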

How do I train the model?

Could you give me detailed instructions on how to train?
If possible, could you also share the pretrained model so I can try it?

Problem occurred when I run app.py

Hi dev,
I tried to run your code on Linux with Python 3.8 installed, and I followed your installation guide.
However, I ran into a problem when I tried to run app.py.

ModuleNotFoundError:
No module named 'commons'
File "/root/ViSV2TTS/app.py", line 6, in <module>
import commons
ModuleNotFoundError: No module named 'commons'

Please help!
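
Python raises ModuleNotFoundError when no directory on sys.path contains the requested module. Assuming commons.py lives in the repo's vits/ directory (an assumption to verify in your checkout), one sketch is to add that directory to sys.path relative to the script rather than to the current working directory. The helper add_import_dir is hypothetical.

```python
import sys
from pathlib import Path


def add_import_dir(directory):
    """Prepend a directory to sys.path (once) so its modules can be imported."""
    directory = str(Path(directory).resolve())
    if directory not in sys.path:
        sys.path.insert(0, directory)
    return directory


# Hypothetical usage near the top of app.py, before `import commons`:
# add_import_dir(Path(__file__).resolve().parent / "vits")
```

Running python app.py from the repo root, rather than from another directory, is also worth checking first.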

Question about a Vietnamese voice-cloning demo

Hello, I am an IT student at Vietnam National University, Hanoi. I am interested in this topic because my goal is to clone voices for patients who have lost the ability to speak. Could you share a demo of the repo with me? Thank you.

DLL load failed

Hello,
I get this error when I run the code:
  File ".\app.py", line 10, in <module>
    from models import SynthesizerTrn
  File "vits\models.py", line 10, in <module>
    import monotonic_align
  File "vits\monotonic_align\__init__.py", line 3, in <module>
    from .monotonic_align.core import maximum_path_c
ImportError: DLL load failed while importing core: The parameter is incorrect.
