rvc-tts-pipeline's People

Contributors

jarodmica

rvc-tts-pipeline's Issues

ModuleNotFoundError: No module named 'vc_infer_pipeline'

When I try to run the code, I get the error below on this import:
from vc_infer_pipeline import VC

Output:
11 import logging
13 from multiprocessing import cpu_count
---> 14 from vc_infer_pipeline import VC
15 from lib.infer_pack.models import SynthesizerTrnMs256NSFsid, SynthesizerTrnMs256NSFsid_nono, SynthesizerTrnMs768NSFsid, SynthesizerTrnMs768NSFsid_nono
16 from lib.audio import load_audio

ModuleNotFoundError: No module named 'vc_infer_pipeline'

Also, VC is never used anywhere in the code, so I don't understand why it is imported.
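
One way to narrow this down is to check whether Python can locate the module at all before importing it. The sketch below assumes that `vc_infer_pipeline` is meant to be a top-level module and that it is simply missing from the import search path; that is an assumption, not something confirmed in this thread. The helper name `is_importable` is hypothetical.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """Return True if Python can locate a top-level module on sys.path."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "vc_infer_pipeline"
    if is_importable(name):
        print(f"{name} was found")
    else:
        # Listing the search path shows where Python actually looked.
        print(f"{name} was not found; directories searched:")
        for entry in sys.path:
            print("  ", entry)
```

If the module turns out to live in a directory that is not listed, adding that directory to `sys.path` (or installing the package that provides it) would be the next thing to try.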

Fails when calling API

Error: "Calling API with sentence: Hello there, my name is marine and I am excited to meet you!
tortoise failed, trying again"

Similar forks

Hi, Jarod!
Glad to see that you made it into a repo!

I can see that you also forked RVC (https://github.com/JarodMica/rvc), here's my fork that also makes it installable:
https://github.com/rsxdalv/Retrieval-based-Voice-Conversion-WebUI/tree/package

Additionally, I modified the file for the inference directly on the fork so that in my downstream repo I only need:

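import os
import tempfile

import torch
from scipy.io import wavfile  # assumed: wavfile here is scipy.io.wavfile

from rvc_beta import infer_batch_rvc
from rvc_beta.infer_batch_rvc import vc_single, get_vc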

def inject_hubert(hubert_model: torch.nn.Module):
    infer_batch_rvc.hubert_model = hubert_model


def infer_rvc(
    f0up_key,
    input_path,
    index_path,
    f0method,
    opt_path,
    model_path,
    index_rate,
):
    get_vc(model_path)
    wav_opt = vc_single(0, input_path, f0up_key, None, f0method, index_path, index_rate)
    out_path = os.path.join(opt_path, f"{os.path.basename(input_path)}.wav")
    wavfile.write(out_path, infer_batch_rvc.tgt_sr, wav_opt)
    return out_path


def run_rvc(
    f0up_key: str,
    original_audio_path: str,
    index_path: tempfile._TemporaryFileWrapper,
    f0method: str,
    model_path: tempfile._TemporaryFileWrapper,
    index_rate: float,
    device: str,
    is_half: bool,
    filter_radius: int,
    resample_sr: int,
    rms_mix_rate: float,
    protect: float,
):
    infer_batch_rvc.set_params_temp(
        _device=device,
        _is_half=is_half,
        _filter_radius=filter_radius,
        _resample_sr=resample_sr,
        _rms_mix_rate=rms_mix_rate,
        _protect=protect,
    )

    from rvc_beta.infer_batch_rvc import config

    if device == "cpu":  # Workaround for "slow_conv2d_cpu" not implemented for 'Half'
        config.is_half = is_half

    if infer_batch_rvc.hubert_model is None:
        get_and_load_hubert()

    opt_path = "./outputs-rvc/"
    return infer_rvc(
        f0method=f0method,
        f0up_key=f0up_key,
        input_path=original_audio_path,
        index_path=index_path.name,
        index_rate=index_rate,
        model_path=model_path.name,
        opt_path=opt_path,
    )

and here's the hubert loading:

from huggingface_hub import hf_hub_download
from rvc_beta import infer_batch_rvc


def get_and_load_hubert():
    model = hf_hub_download(
        repo_id="lj1995/VoiceConversionWebUI", filename="hubert_base.pt"
    )
    infer_batch_rvc.load_hubert(
        hubert_model_path=model,
    )

FileNotFoundError: [Errno 2] No such file or directory: 'trainset_preprocess_pipeline_print.py'

Dear @JarodMica, first of all, thanks for uploading videos about AI on YouTube and for packaging the RVC code into a single function. However, I am getting an error when running it. I only need to run the rvc_convert function, and I provided a .wav audio file generated by another TTS service.

Complete Error:
audio file path: /media/ravneet/Work SSD/RVC_TTS_Pipeline/ravneet_test_voice.wav
Traceback (most recent call last):
  File "/media/ravneet/Work SSD/RVC_TTS_Pipeline/test.py", line 6, in <module>
    rvc_convert(model_path="guanguanV1.pth",
  File "/media/ravneet/Work SSD/RVC_TTS_Pipeline/venv_pipeline/src/rvc-tts-pipe/rvc_infer.py", line 316, in rvc_convert
    config=Config(device,is_half)
  File "/media/ravneet/Work SSD/RVC_TTS_Pipeline/venv_pipeline/src/rvc-tts-pipe/rvc_infer.py", line 29, in __init__
    self.x_pad, self.x_query, self.x_center, self.x_max = self.device_config()
  File "/media/ravneet/Work SSD/RVC_TTS_Pipeline/venv_pipeline/src/rvc-tts-pipe/rvc_infer.py", line 63, in device_config
    with open("trainset_preprocess_pipeline_print.py", "r") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'trainset_preprocess_pipeline_print.py'

Here is my code:
from rvc_infer import rvc_convert

if __name__ == "__main__":
    audio_file_name = '/media/ravneet/Work SSD/RVC_TTS_Pipeline/ravneet_test_voice.wav'
    print(f"audio file path: {audio_file_name}")
    # play_audio(audio_file_name)
    rvc_convert(model_path="guanguanV1.pth",
                input_path=audio_file_name)

Please help me resolve this error. Many thanks in advance.
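
The traceback shows `open("trainset_preprocess_pipeline_print.py", "r")` being called with a relative path, so Python looks for the file in the current working directory. If the file exists somewhere else, one possible workaround is to switch into that directory only for the duration of the call. The sketch below is a minimal version of that idea; `working_directory` is a hypothetical helper, and the location of the missing file is not stated in this thread.

```python
import os
from contextlib import contextmanager


@contextmanager
def working_directory(path):
    """Temporarily change the current working directory, then restore it."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always go back, even if the wrapped code raises.
        os.chdir(previous)


# Hypothetical usage; rvc_dir is an assumed path that contains the missing file:
#
#     with working_directory(rvc_dir):
#         rvc_convert(model_path="guanguanV1.pth", input_path=audio_file_name)
```

Because the working directory changes inside the block, any relative paths passed to the wrapped call should be made absolute first.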

Failed to load audio: [WinError 2] The specified file can not be found

I am trying to convert with rvc_convert, but it fails with this exception:

`Traceback (most recent call last):
  File "f:\python\py\tts\venv\src\rvc\lib\audio.py", line 14, in load_audio
    ffmpeg.input(file, threads=0)
  File "F:\Python\py\TTS\venv\lib\site-packages\ffmpeg\_run.py", line 313, in run
    process = run_async(
  File "F:\Python\py\TTS\venv\lib\site-packages\ffmpeg\_run.py", line 284, in run_async
    return subprocess.Popen(
  File "C:\Users\Quick\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 966, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "C:\Users\Quick\AppData\Local\Programs\Python\Python310\lib\subprocess.py", line 1435, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "F:\Python\py\TTS\main.py", line 47, in <module>
    rvc_convert(model_path='eduard.pth',
  File "f:\python\py\tts\venv\src\rvc-tts-pipe\rvc_infer.py", line 323, in rvc_convert
    wav_opt=vc_single(0,input_path,f0_up_key,None,f0method,file_index,file_index2,index_rate,filter_radius,resample_sr,rms_mix_rate,protect)
  File "f:\python\py\tts\venv\src\rvc-tts-pipe\rvc_infer.py", line 160, in vc_single
    audio = load_audio(input_audio_path, 16000)
  File "f:\python\py\tts\venv\src\rvc\lib\audio.py", line 19, in load_audio
    raise RuntimeError(f"Failed to load audio: {e}")
RuntimeError: Failed to load audio: [WinError 2] The system cannot find the file specified`

I read that other people have had the same error and that I need to do something with ffmpeg, but nothing works for me. I am on Python 3.10.

I tried reinstalling ffmpeg, which did not work. I tried other Python versions, which did not work. I tried extracting ffmpeg.exe from this archive https://github.com/BtbN/FFmpeg-Builds/releases into the ffmpeg library folder, which did not work either.
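
In the first traceback the failure comes from `subprocess.Popen`, which suggests that the file Windows cannot find is the ffmpeg executable itself rather than the audio file. A quick way to check is to ask Python whether an `ffmpeg` executable is visible on PATH from the same environment that runs the script. This is a minimal sketch; `find_executable` is a hypothetical helper name.

```python
import shutil
from typing import Optional


def find_executable(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of an executable found on PATH, or None."""
    return shutil.which(name)


if __name__ == "__main__":
    path = find_executable("ffmpeg")
    if path is None:
        print("ffmpeg is not on PATH for this Python process")
    else:
        print(f"ffmpeg found at: {path}")
```

If this reports that ffmpeg is not on PATH, copying ffmpeg.exe into a Python package folder would not be expected to help; the folder containing ffmpeg.exe needs to be on PATH (or the executable placed in a directory that already is).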

Reloading the models/inference speed

Am I doing something wrong, or is the pipeline reloading all the models every time an inference is made? This makes inference slow. If there is a way to keep them loaded in memory, I would be grateful to know it.
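
A common way to avoid repeated loading is to cache the loaded object, keyed by its path, so that only the first request pays the loading cost. The sketch below illustrates the pattern only; `load_model_from_disk` is a hypothetical stand-in, not a function from this pipeline, and whether the pipeline's own loading code can be wrapped this way is not confirmed here.

```python
from functools import lru_cache


def load_model_from_disk(model_path: str) -> dict:
    """Hypothetical stand-in for an expensive model load."""
    print(f"loading {model_path} ...")
    return {"path": model_path}


@lru_cache(maxsize=4)
def get_model(model_path: str) -> dict:
    """Load a model once per path and reuse it on later calls."""
    return load_model_from_disk(model_path)


if __name__ == "__main__":
    first = get_model("voice.pth")   # triggers the load
    second = get_model("voice.pth")  # served from the cache
    print(first is second)
```

The `maxsize` value bounds how many distinct models stay in memory at once, which matters when each one occupies GPU memory.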

Testing Issues, Any Assistance Would Be Appreciated

I have not tried Tortoise with the same RVC models that give me trouble, so I don't know whether the issue is universal, but using XTTS with certain models produces a deep-voiced result. Pitch shifting with XTTS on this pipeline (male to female and vice versa) is also wonky and most of the time does not work as expected. I tried matching sample rates by converting the 24 kHz XTTS audio to 16 kHz (same result with librosa), and that doesn't fix it, so I don't think it's a sample-rate issue. I talked with some people in the XTTS Discord, and they said it could be the way rvc_convert synthesizes audio, because XTTS on its own gives the expected result just fine; the bug appears only when that result is fed into the pipeline.

EDIT: For reference, among the RVC models on AI Hub, the Donald Trump 600/300-epoch models work fine but the Joe Rogan 300-epoch model does not. That is just one example, but there are others.
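
For anyone checking the pitch settings, it can help to see what a given shift does to the fundamental frequency. The sketch below assumes the pitch parameter (named `f0_up_key` in the tracebacks on this page) is a transposition in semitones under equal temperament, so that each semitone multiplies the frequency by 2^(1/12). The helper names are hypothetical, and this is not a diagnosis of the deep-voice problem.

```python
def pitch_ratio(semitones: float) -> float:
    """Frequency multiplier for a shift of `semitones` (equal temperament)."""
    return 2.0 ** (semitones / 12.0)


def shift_f0(f0_hz, semitones: float):
    """Apply a semitone shift to a sequence of F0 values in Hz."""
    ratio = pitch_ratio(semitones)
    return [f * ratio for f in f0_hz]


if __name__ == "__main__":
    # +12 semitones is one octave up, -12 is one octave down.
    print(pitch_ratio(12), pitch_ratio(-12))
    print(shift_f0([110.0, 220.0], 12))
```

Under this assumption, a male-to-female conversion would typically use a positive shift and the reverse a negative one.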

Device not found

I'm trying to get this working, but when it goes to process using RVC I get this error:
Cuda or MPS not detected
Exception in thread Thread-4:
Traceback (most recent call last):
  File "C:\Users\M\AppData\Local\Programs\Python\Python39\lib\threading.py", line 980, in _bootstrap_inner
    self.run()
  File "C:\Users\M\AppData\Local\Programs\Python\Python39\lib\threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\M\PycharmProjects\pythonProject2\tortrvc.py", line 549, in handle_message
    processed_audio_path = rvc_convert(model_path=model_path, input_path=original_audio_path)
  File "c:\users\m\pycharmprojects\pythonproject2\venv\src\rvc-tts-pipe\rvc_infer.py", line 330, in rvc_convert
    config=Config(device,is_half)
NameError: name 'device' is not defined

I'm new to coding, so sorry if the issue is obvious.
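
The message "Cuda or MPS not detected" followed by the NameError suggests that `device` is only assigned when one of those backends is found; that is an inference from the traceback, not something verified against the source. A pattern that avoids the undefined name is to always fall back to the CPU. The sketch below keeps the decision in a pure function, `pick_device`, which is a hypothetical name.

```python
def pick_device(cuda_available: bool, mps_available: bool) -> str:
    """Choose a device string, falling back to the CPU when no GPU is found."""
    if cuda_available:
        return "cuda:0"
    if mps_available:
        return "mps"
    return "cpu"


# Hypothetical usage with PyTorch installed:
#
#     import torch
#     device = pick_device(
#         torch.cuda.is_available(),
#         torch.backends.mps.is_available(),
#     )
#     is_half = device != "cpu"  # half precision is generally avoided on CPU
```

If CUDA was expected to be available on this machine, it is also worth checking that the installed PyTorch build includes CUDA support.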

Error occurs when specifying custom output directory: PermissionError: [Errno 13] Permission denied: 'outputs\\rvc_out'

Hi! I am facing this error when trying to input a custom output_path_dir:

Traceback (most recent call last):
  File "C:\Users\Mech8\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "C:\Users\Mech8\AppData\Local\Programs\Python\Python310\lib\runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "c:\Users\Mech8\.vscode\extensions\ms-python.python-2023.20.0\pythonFiles\lib\python\debugpy\__main__.py", line 39, in <module>
    cli.main()
  File "c:\Users\Mech8\.vscode\extensions\ms-python.python-2023.20.0\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 430, in main
    run()
  File "c:\Users\Mech8\.vscode\extensions\ms-python.python-2023.20.0\pythonFiles\lib\python\debugpy/..\debugpy\server\cli.py", line 284, in run_file
    runpy.run_path(target, run_name="__main__")
  File "c:\Users\Mech8\.vscode\extensions\ms-python.python-2023.20.0\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 321, in run_path
    return _run_module_code(code, init_globals, run_name,
  File "c:\Users\Mech8\.vscode\extensions\ms-python.python-2023.20.0\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 135, in _run_module_code
    _run_code(code, mod_globals, init_globals,
  File "c:\Users\Mech8\.vscode\extensions\ms-python.python-2023.20.0\pythonFiles\lib\python\debugpy\_vendored\pydevd\_pydevd_bundle\pydevd_runpy.py", line 124, in _run_code
    exec(code, run_globals)
  File "F:\xtts-webui\rvc_client.py", line 4, in <module>
    rvc_convert(
  File "F:\xtts-webui\.venv\src\rvc-tts-pipe\rvc_infer.py", line 324, in rvc_convert
    wavfile.write(output_file_path, tgt_sr, wav_opt)
  File "f:\xtts-webui\.venv\lib\site-packages\scipy\io\wavfile.py", line 767, in write
    fid = open(filename, 'wb')
PermissionError: [Errno 13] Permission denied: 'outputs\\rvc_out'

This happens with both Windows-style and Unix-style paths (for example, 'outputs\rvc_out'), and adding the file name to the path doesn't help either. Just in case, I granted full permissions on the output directories, but this doesn't help. Could you please take a look?
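
In the traceback, `wavfile.write` ends up calling `open(filename, 'wb')` on 'outputs\\rvc_out', which looks like the directory itself rather than a file inside it. On Windows, opening an existing directory for writing raises PermissionError regardless of the permissions set on it. The sketch below shows the general pattern of building a file path inside an output directory; `build_output_file` and the default file name are hypothetical, and this is not the pipeline's actual fix.

```python
from pathlib import Path


def build_output_file(output_dir: str, file_name: str = "out.wav") -> Path:
    """Create the output directory if needed and return a file path inside it."""
    directory = Path(output_dir)
    directory.mkdir(parents=True, exist_ok=True)
    return directory / file_name


if __name__ == "__main__":
    target = build_output_file("outputs/rvc_out", "converted.wav")
    # The writer should receive this file path, never the directory path.
    print(target)
```

Using `pathlib` also sidesteps the difference between Windows-style and Unix-style separators.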
