
bark's Introduction

🚀 BARK INFINITY: A Voice Is A Sound and a Sound Is a Voice 🎶 🌈✨🚀

screenshot

Colab Notebooks: These need updating; if they don't work at the moment, they should be fixed soon.

Open In Colab Open In Colab

Bark on an AMD GPU?

Scroll down to Bark AMD (DirectML) MANUAL Install

🎉 Bark INFINITY NVIDIA Automatic Windows Installer 🎉

โš ๏ธ Note: make sure you fully extract the .zip file before running the .bat files.

Bark Install Prerequisites:

  1. Just the regular Windows NVIDIA drivers.
    1. You don't need anything else installed.
    2. You don't need Python.
    3. You don't need Pytorch.
    4. You don't need anything with CUDA in the name.
    5. In fact, other Python/CUDA things on your system could cause problems if they were not installed in an isolated environment such as conda or venv.
    6. To check, open a command line and type python --version or pip list and see what shows up. Ideally those commands do nothing, because that means nothing is installed on your base Windows system itself.
  2. (Optional But Recommended) The Windows Terminal https://apps.microsoft.com/store/detail/windows-terminal/9N0DX20HK701 -- Bark still has a lot of text output, and it looks nicer and is easier to read in the Windows Terminal. But you can also use the regular Windows Command Prompt. (Color text is coming back, so you will want this later.)

Bark Install Steps

  1. Download the latest zip file from the releases page: https://github.com/JonathanFly/bark-installer/releases
  2. Extract the zip file into a directory whose path contains no spaces. This is the folder where Bark will be installed.
  3. Click on INSTALL_BARK_INFINITY.bat (you do not need to be administrator)
  4. If the install finished with no errors, close that terminal window. Close any other open command line windows as well.
  5. Click START_BARK_INFINITY.bat for the GUI version, or COMMAND_LINE_BARK_INFINITY.bat for the command line version.

Bark Install Problems

  1. Windows permissions error: Check for antivirus or security settings that may be blocking the installer. It can also be random, and simply rerunning the installer fixes it.
  2. CondaSSLError, SSL Certificate Error: This is an odd one. If your country or government has an online ID card, you may have to uninstall the government SSL certificates.

Install Troubleshooting:

  1. Try TROUBLESHOOT_BARK_INFINITY.bat for some options.
  2. Still Not Working? Feel free to @ me on Bark Official Discord, username "Jonathan Fly" jonathanfly.

Bark Uninstall

  1. Delete the entire directory where you installed Bark.
    1. โš ๏ธ Any Bark samples you made will be in that bark folder, so be sure to save anything you want before deleting the bark folder.
  2. Delete the Bark models. The Bark models will be in your Huggingface Cache directory. The location is printed when you run Bark in the console as Bark Model Location: (or HF_HOME for some older installs). The default location on Windows is C:\Users\YourWindowsUserName\.cache\. (You can generally delete anything in there; it will be redownloaded if needed.)
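If you're not sure where the cache ended up on your machine, a small stdlib sketch can print the likely location. This assumes the standard Hugging Face layout; the helper name bark_model_cache is made up here, and the exact subfolder can differ between installs:

```python
import os
from pathlib import Path

def bark_model_cache() -> Path:
    """Default download location for the models: HF_HOME if set, else ~/.cache/huggingface."""
    return Path(os.environ.get("HF_HOME", str(Path.home() / ".cache" / "huggingface")))

print(bark_model_cache())
```

Compare the printed path against the Bark Model Location: line in the console before deleting anything.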

Where Can I Get Bark Voices?

  1. Search the Bark Discord for voices. The Bark Discord is abuzz with Suno's new music 'Chirp' model, but you can find many great voices posted in chat in the months since Bark was released.
  2. Bark Speaker Directory from rsxdalv: https://github.com/rsxdalv/bark-speaker-directory
  3. You can nag me in Discord and if I'm online I might have some handy.
  4. Wait: I keep meaning to organize a bunch of good voices and update this repo.

How to use .srt subtitles as bark input?

I added an extremely basic way to do this over here, but it might still be useful. All it does is basically remove the non-speech text from the .srt file so you can use it as a regular prompt. It doesn't add the new audio into a new video file or generate new subtitles, etc. Just inputs.

image

  1. Copy .SRT text file content and paste it into text prompt field.
  2. Run the SRT text transformations in order: 1, 2, 3, etc. You can decide whether or not to leave the [action] text in; sometimes things like [grunts] work in Bark, depending on the voice.
  3. Decide how you want to split the text:

Render an audio segment for each original subtitle: After step two, your text input field should have the dialog from each subtitle on its own line in the text prompt. To render each line as a segment, choose "advanced text splitting" and "Option 1" in the image: process chunks of: "line", group by: "line", start new clip when: "1". Because some subtitles have no dialog, the number of segments may not line up with the original subtitle file. If you want it to match you may want to skip step 5.

Render audio segments for the content, generally: Split the text the way you normally would in Bark. This ignores lines and counts by sentences, words, etc. For example, "Option 2" is a common split: process chunks of: "sentence", group by: "word", start new clip when: "20". That splits by sentences and adds new sentences until the total word count is over 20, then starts a new segment.
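The transformations above can be sketched in a few lines of Python. This is a minimal stand-in for what the UI buttons do, not the repo's actual code; srt_to_prompt and the sample subtitles are made up for illustration:

```python
import re

SRT_SAMPLE = """\
1
00:00:01,000 --> 00:00:03,500
Hello there. [grunts]

2
00:00:04,000 --> 00:00:06,000
General Kenobi!
"""

def srt_to_prompt(srt_text: str, keep_actions: bool = False) -> str:
    """Strip sequence numbers and timestamps from .srt text, keeping one dialog line per subtitle."""
    lines = []
    for line in srt_text.splitlines():
        line = line.strip()
        # drop blank lines, bare sequence numbers, and "start --> end" timestamp lines
        if not line or line.isdigit() or "-->" in line:
            continue
        if not keep_actions:
            # optionally remove [action] markers like [grunts]
            line = re.sub(r"\[[^\]]*\]", "", line).strip()
        if line:
            lines.append(line)
    return "\n".join(lines)

print(srt_to_prompt(SRT_SAMPLE))
```

With keep_actions=True the [grunts]-style markers stay in, matching the note above about some actions working in Bark.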

How To Make Bark Faster?

  1. Trade Off A Little Quality: Use just the small coarse Bark model and leave the other two models (text and fine) regular sized. You can do this in the Gradio UI or with a command line option.
  2. Try the new Huggingface Bark Implementation and see https://huggingface.co/blog/optimizing-bark with optimization suggestions.
  3. Try Pytorch nightly, if you know your way around the installation.
  4. Try optimizing numpy and related CPU libraries, because Bark may be somewhat bottlenecked on CPU. For example, Intel MKL may be worth a shot; see the next section.
  5. Linux (or WSL) seems generally a little faster than Windows, but I'm not sure if this is always true.
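On the first tip: upstream Suno Bark also honors a SUNO_USE_SMALL_MODELS environment variable read before the models load (Bark Infinity's own flag for making only the coarse model small may be named differently; check the UI or --help). A sketch:

```python
import os

# Must be set before `bark` is imported: the generation module reads it at import time.
# Note this switches all three models to small; the coarse-only tradeoff described
# above is exposed through the Bark Infinity UI/CLI instead.
os.environ["SUNO_USE_SMALL_MODELS"] = "1"

# from bark import preload_models  # then load and generate as usual
```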

Intel MKL Install: Maybe Faster?

This is not part of the installer because it can break things; it's here for people who are comfortable tweaking and rebuilding their Python conda environments.

  1. Type COMMAND_LINE_BARK_INFINITY.bat
  2. Type python bark_perform.py --run_numpy_benchmark True
  3. Note the benchmark times.
  4. Install MKL exe in Windows itself. https://www.intel.com/content/www/us/en/developer/tools/oneapi/onemkl-download.html
  5. Install MKL in Python. There are a few different ways to install it; one way you can try:
    1. Run COMMAND_LINE_BARK_INFINITY.bat
    2. Type:
      conda install -y mkl mkl-service mkl_fft libcblas liblapacke liblapack blas-devel mkl-include mkl_random mkl-devel mkl-include libblas=*=*mkl mkl-static intel-openmp blas=*=*mkl -c intel -c conda-forge --solver=libmamba
      
  6. Rerun steps 1 and 2. Is it faster? If so, some chance Bark may be too.
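As a rough stand-alone stand-in for the --run_numpy_benchmark comparison (matmul_benchmark is made up here, not the script's actual function), you can time a BLAS-bound matrix multiply before and after installing MKL:

```python
import time
import numpy as np

def matmul_benchmark(n: int = 1024, repeats: int = 3) -> float:
    """Return the best wall-clock time for an n x n matrix multiply."""
    a = np.random.rand(n, n)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        a @ a  # dominated by the BLAS library numpy is linked against
        best = min(best, time.perf_counter() - t0)
    return best

print(f"best of {3}: {matmul_benchmark():.3f}s")
# np.show_config() prints which BLAS numpy is linked against; look for 'mkl' in the output
```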

How To Make Bark Faster on CPU or Apple hardware?

  1. On Apple, try setting SUNO_ENABLE_MPS to True
  2. Watch this repo: https://github.com/PABannier/bark.cpp
  3. New: Intel BigDL just added Bark: intel-analytics/ipex-llm#9016

Who Am I?

Sometimes I post at twitter.com/jonathanfly

🎉 Bark AMD (DirectML) MANUAL Install 🎉

An expensive AMD card will be a lot faster than using the CPU, but a lot slower than a much less expensive NVIDIA GPU like a 3060. If you really know your way around Linux, some people have reportedly compiled custom ROCm PyTorch and say they got Bark working on AMD pretty decently, but there's no guide that I know of. This AMD install uses DirectML instead of ROCm: easier, but slower.

Bark AMD DirectML Instructions.

What is DirectML? https://learn.microsoft.com/en-us/windows/ai/directml/gpu-pytorch-windows

Install Miniconda. https://repo.anaconda.com/miniconda/Miniconda3-py310_23.3.1-0-Windows-x86_64.exe

Then go to the start menu and start a new "Anaconda Prompt", not the regular Windows command line.

conda update -y conda
conda update -y -n base conda
conda install -y -n base conda-libmamba-solver
conda create --name pydml_torch2 -y python=3.10.6
conda activate pydml_torch2

Make sure you see (pydml_torch2) in the corner of your prompt: (pydml_torch2) C:\Users\YourName

conda install -y pip git --solver=libmamba
conda update -y --all --solver=libmamba

pip install ffmpeg_downloader
ffdl install -U --add-path

Now quit out of the terminal and restart. We need ffmpeg in the path, which means you need to be able to type ffmpeg -version and have it work. If you close and restart, you should be able to do that.

So close the terminal; close all command line windows or terminals to be sure. Then go back to the start menu and start a new "Anaconda Prompt". This should be the same kind of prompt you used to start the install.

conda activate pydml_torch2

Make sure you see (pydml_torch2) in the corner again: (pydml_torch2) C:\Users\YourName etc. You always want to see (pydml_torch2) in the corner when installing and using Bark. If you don't see it from this point on, you are in the wrong conda environment and need to type conda activate pydml_torch2 again.

Now try typing

ffmpeg -version

Do you see ffmpeg 6.0? If it doesn't work you can keep going; you can still use .wav file outputs and fix it later.

Now the big install commands. These could take 5 to 15 minutes, and on a slow internet connection even hours, because they download multiple gigabytes. So if it looks like it's frozen, let it go. Check your task manager and see if it's downloading.

For testing torch 2.0, just some giant pip installs:

pip install torch==2.0.0 torchvision==0.15.1 torch-directml==0.2.0.dev230426 opencv-python torchvision==0.15.1 wget pygments numpy pandas tensorboard matplotlib tqdm pyyaml boto3 funcy torchaudio transformers pydub pathvalidate rich nltk chardet av hydra-core>=1.1 einops scipy num2words pywin32 ffmpeg ffmpeg-python sentencepiece spacy==3.5.2 librosa jsonschema pytorch_lightning==1.9.4

pip install encodec flashy>=0.0.1 audiolm_pytorch==1.1.4 demucs 

pip install universal-startfile hydra_colorlog julius soundfile==0.12.1 gradio>=3.35.2 rich_argparse flashy>=0.0.1 ffmpeg_downloader rich_argparse devtools vector_quantize_pytorch

pip install https://github.com/Sharrnah/fairseq/releases/download/v0.12.4/fairseq-0.12.4-cp310-cp310-win_amd64.whl 

First set the SUNO_USE_DIRECTML variable. This tells Bark to use DirectML. If this doesn't work, you can edit bark_infinity/config.py and set SUNO_USE_DIRECTML to True in the DEFAULTS section.

set SUNO_USE_DIRECTML=1
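For reference, boolean environment flags like this are typically parsed along these lines (an illustration of the pattern, not the exact code in config.py):

```python
import os

def env_flag(name: str) -> bool:
    # Treat "1"/"true" (any case) as on; unset or anything else as off.
    return os.environ.get(name, "").strip().lower() in ("1", "true")

os.environ["SUNO_USE_DIRECTML"] = "1"
print(env_flag("SUNO_USE_DIRECTML"))
```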

Download Bark:

git clone https://github.com/JonathanFly/bark.git
cd bark

Change to the AMD Test Version

Note: (you may be able to skip git checkout bark_amd_directml_test and use main branch, if not, you will be able to soon)

git checkout bark_amd_directml_test

Now try running it. Bark has to download all the models the first time it runs, so it might look frozen for a while. It's another 10 GB of files.

python bark_perform.py

When I tested this install, bark_perform.py seemed to freeze while downloading models without making progress. I don't know if it was a fluke, but I ran python bark_webui.py and it downloaded them fine.

Start the Bark UI

python bark_webui.py

Things that don't work:

  1. Voice Cloning (might work?)
  2. Top_k and top_p
  3. Probably more things I haven't tested.

Start Bark UI Later

  1. Click Anaconda Prompt in start menu
  2. conda activate pydml_torch2
  3. cd bark
  4. python bark_webui.py

โš ๏ธโฌ‡๏ธโฌ‡๏ธ Everything below this point is out of date - may have some useful info. โฌ‡๏ธโฌ‡๏ธ โš ๏ธ

Manual Windows Mamba Install

(Mamba is a fast version of conda. They should work the same whichever one you install; just change mamba to conda or vice versa.)

Pip and conda/mamba are two different ways of installing Bark Infinity. If you use Mamba, do not install anything else first. Don't install PyTorch, and do not install anything with 'CUDA' in the name. You don't need to look up a YouTube tutorial. Just type the commands. The only thing you need installed is the NVIDIA drivers.

Take note of which lines are for NVIDIA or CPU, or Linux or Windows.

There is one exception: on Windows, the Windows Terminal is a nice-to-have if you don't already have it installed https://apps.microsoft.com/store/detail/windows-terminal/9N0DX20HK701

You don't have to, but it may display the output from the bark commands better. When you start Anaconda Prompt (miniconda3) you can do it from the new Windows Terminal app: clicking on the down arrow next to the plus should let you pick Anaconda Prompt (miniconda3).

  1. Go here: https://github.com/conda-forge/miniforge#mambaforge

  2. Download a Python 3.10 Miniconda3 installer for your OS. Windows 64-bit, macOS, and Linux probably don't need a guide.
    a. Install the Mambaforge for your OS, not specifically Windows. OSX for OSX, etc.
    b. Don't install Mambaforge-pypy3. (It probably works fine, it is just not what I tested.) Install the one above it, just plain Mambaforge. Or you can use Conda; Mamba should be faster, but sometimes Conda may be more compatible.

  3. Install the Python 3.10 Miniconda3 exe. Then start the 'Miniforge Prompt' terminal, which is a new program it installed. You will always use this program for Bark.

  4. Start 'Miniforge Prompt'. Be careful not to start the regular Windows command line. (Unless you installed the new Terminal and know how to switch.) It should say "Anaconda Prompt (miniconda3)"

You should also see a terminal prompt that says "(base)".

Do not move forward until you see (base).

  1. Choose the place to install the Bark Infinity directory. You can also just leave it at the default. If you make a LOT of audio, think about a place with a lot of space.

When you start "Anaconda Prompt (miniconda3)" you will be in a directory, in Windows probably something like "C:\Users\YourName". It's okay to install there. Just remember where you put it; it will be in /bark. (If you already had bark-infinity installed and want to update instead of reinstalling, skip to the end.)

  1. Type the next commands exactly. Hit "Y" for yes where you need to:
mamba update -y mamba
mamba create --name bark-infinity python=3.10
mamba activate bark-infinity

## NVIDIA GPU ONLY
mamba install -y -k cuda ninja git pip -c nvidia/label/cuda-11.7.0 -c nvidia 
pip install torch==2.0.1+cu117 torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
## END NVIDIA GPU ONLY

##### CPU ONLY LINES START HERE (Also MacOS)
mamba install -y -k ninja git
pip install torch torchvision torchaudio
##### CPU ONLY LINES END HERE (Also MacOS)


## WINDOWS ONLY fairseq
pip install fairseq@https://github.com/Sharrnah/fairseq/releases/download/v0.12.4/fairseq-0.12.4-cp310-cp310-win_amd64.whl

## NON-WINDOWS fairseq
mamba install fairseq

pip install audiolm_pytorch==1.1.4 --no-deps 

git clone https://github.com/JonathanFly/bark.git
cd bark

pip install -r barki-allpip.txt --upgrade
ffdl install -U --add-path

Run Bark Infinity

Run command line version

python bark_perform.py

Run web ui version

python bark_webui.py

(If you see a warning that "No GPU being used. Careful, inference might be very slow!" after python bark_perform.py, then something may be wrong if you have a GPU. If you don't see that warning, the GPU is working.)

Start Bark Infinity At A Later Time

To restart later, start the Miniforge Prompt, not the regular prompt. Make sure you see (base). Then type commands to activate bark-infinity instead of base, like this:

mamba activate bark-infinity
cd bark
python bark_webui.py

Update Bark Infinity

git pull
pip install -r barki-allpip.txt --upgrade

🌟 Original Bark Infinity Launch README (Preserved Mostly For Amusement) 🌟

🌠 The Past: 🌠

Bark Infinity started as a humble 💻 command line wrapper, a CLI 💬. Built from simple keyword commands, it was a proof of concept 🧪, a glimmer of potential 💡.

🌟 The Present: 🌟

Bark Infinity evolved 🧬, expanding across dimensions 🌐. Infinite Length 🎵🔄, Infinite Voices 🔊🌈, and a true high point in human history: 🌐 Infinite Awkwardness 🕺. But for some people, the time-tested command line interface was not a good fit. Many couldn't even try Bark 😞, struggling with CUDA gods 🌩 and being left with cryptic error messages 🧐 and a chaotic computer 💾. Many people felt very… UN INFINITE.

🔜🚀 The Future: 🚀

🚀 (Non-emoji real answer: a node-based UI like ComfyUI, if Gradio 4.0 makes Audio a lot better.)

1. INFINITY VOICES 🔊🌈

Discover cool new voices and reuse them. Performers, musicians, sound effects, two party dialog scenes. Save and share them. Every audio clip saves a speaker.npz file with the voice. To reuse a voice, move the generated speaker.npz file (named the same as the .wav file) to the "prompts" directory inside "bark" where all the other .npz files are.
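Moving the file by hand works fine; a small helper along these lines does the same thing (install_voice and the default folder name are illustrative; check where your install actually keeps its .npz prompts):

```python
import shutil
from pathlib import Path

def install_voice(npz_path: str, prompts_dir: str = "bark/prompts") -> Path:
    """Copy a generated speaker .npz into the prompts folder so it appears as a reusable voice."""
    src = Path(npz_path)
    dst = Path(prompts_dir) / src.name
    dst.parent.mkdir(parents=True, exist_ok=True)  # create the prompts dir if missing
    shutil.copy2(src, dst)
    return dst
```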

🔊 With random celebrity appearances!

(I accidentally left a bunch of voices in the repo, some of them are pretty good. Use --history_prompt 'en_fiery' for the same voice as the audio sample right after this sentence.)

whoami.mp4

2. INFINITY LENGTH 🎵🔄

Any length prompt and audio clips. Sometimes the final result is seamless, sometimes it's stable (but usually not both!).

🎵 Now with Slowly Morphing Rick Rolls! Can you even spot the seams in the most earnest Rick Rolls you've ever heard in your life?

but_are_we_strangers_to_love_really.mp4

🕺 Confused Travolta Mode 🕺

Confused Travolta GIF confused_travolta

Can your text-to-speech model stammer and stall like a student answering a question about a book they didn't read? Bark can. That's the human touch. The semantic touch. You can almost feel the awkward silence through the screen.

💡 But Wait, There's More: Travolta Mode Isn't Just A Joke 💡

Are you tired of telling your TTS model what to say? Why not take a break and let your TTS model do the work for you? With enough patience and Confused Travolta Mode, Bark can finish your jokes for you.

almost_a_real_joke.mp4

Truly we live in the future. It might take 50 tries to get a joke, and it's probably an accident, but all 49 failures are also very amusing, so it's a win/win. (That's right, I set a single function flag to False in Bark and raved about the amazing new feature. Everything here is small potatoes really.)

reaching_for_the_words.mp4

Be sure to check out the official Suno repo README for updates as well

https://github.com/suno-ai/bark

bark's People

Contributors

afrogthatexists, alyxdow, gkucsko, jn-jairo, jonathanfly, kmfreyberg, marjan2k, mcamac, melmass, mikeyshulman, pansapiens, pleonard212, steinhaug, uetuluk, vaibhavs10, zygi


bark's Issues

Cloning own voice

Could someone describe to me how to clone my own voice?
I see .wav and .npz files here, but is there any way to add my own voice?

UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 189: character maps to <undefined>

 File "C:\Users\darkl\bark\bark_webui.py", line 172, in generate_audio_long_gradio
    trim_logs()
  File "C:\Users\darkl\bark\bark_webui.py", line 686, in trim_logs
    lines = f.readlines()
  File "C:\Users\darkl\mambaforge\envs\bark-infinity-oneclick\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8f in position 189: character maps to <undefined>
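For anyone hitting this: the traceback shows the log being read with Windows' default cp1252 codec. The usual fix is to open the file with an explicit encoding; a sketch (read_log is illustrative, not the repo's actual patch):

```python
def read_log(path: str) -> list[str]:
    # An explicit encoding stops Windows' cp1252 default from choking on stray bytes
    # like 0x8f; errors="replace" keeps a partially corrupt log readable.
    with open(path, encoding="utf-8", errors="replace") as f:
        return f.readlines()
```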

Installation stuck

Hello!
Thank you for making this application.

I find it very interesting, but I'm stuck with an inconvenient way to install it.
It would be great for someone like me who isn't familiar with Python to have a full installer or step-by-step installation instructions.
It's frustrating to figure out how to install python and pip and use them.
Thank you for your work.

is_bf16_supported

Traceback (most recent call last):
File "D:\5118\movielearning\testbark\test.py", line 1, in
from bark import SAMPLE_RATE, generate_audio
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\__init__.py", line 1, in
from .api import generate_audio, text_to_semantic, semantic_to_waveform
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\api.py", line 5, in
from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
File "C:\ProgramData\Anaconda3\envs\movielearning\lib\site-packages\bark\generation.py", line 24, in
torch.cuda.is_bf16_supported()
AttributeError: module 'torch.cuda' has no attribute 'is_bf16_supported'

a bit confusing instruction

So do I need to install both Mambaforge and Miniforge for Windows? Because when installing Mambaforge I see nothing about Miniforge.

Is there a way to do line breaks like with the GUI?

It seems that line breaks might help with song/rap structure -or it may have no effect, is very hard to tell!

Anyway, is there a way to do that with command line prompt?

(This is a great project, thank you so much!)

miniforge annoyances in regular windows version

I mostly test in WSL2 and just noticed miniforge in base Windows has some annoying bits. The console has errors, even though everything is still working. And you can't Ctrl-C it; it just sits there.

UnicodeEncodeError: 'charmap' codec can't encode characters in position 122-241:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 122-241: character maps to
*** You may need to add PYTHONIOENCODING=utf-8 to your environment ***

This error seems to randomly appear and does not go away. No idea what causes it. Sometimes the prompts work, and other times the same prompt triggers that error, and just continues to get worse after it starts.

WebUI won't open.

I pulled to the newest version of bark, installed the requirements using
pip install -r requirements-pip.txt
and tried running the webui. However, it errors out here:

Traceback (most recent call last):
  File "/home/rlt/bark/bark_webui.py", line 5, in <module>
    import gradio as gr
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/__init__.py", line 3, in <module>
    import gradio.components as components
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/components.py", line 55, in <module>
    from gradio import processing_utils, utils
  File "/home/rlt/.local/lib/python3.10/site-packages/gradio/utils.py", line 38, in <module>
    import matplotlib
  File "/home/rlt/.local/lib/python3.10/site-packages/matplotlib/__init__.py", line 107, in <module>
    from collections import MutableMapping
ImportError: cannot import name 'MutableMapping' from 'collections' (/usr/lib/python3.10/collections/__init__.py)

I am using Python 3.10.6. Any advice? Same result in miniconda.

Thanks in advance.

Feature: Troubleshooting Mambaforge Installation in One-Click PowerShell Script

Hi everyone,

I have been working on creating a one-click PowerShell script to install and set up Bark from this repo. The script is designed to download and install Mambaforge, Git, and other required packages, then clone the Bark repository, create a virtual environment, and activate it.

However, I'm experiencing issues with the Mambaforge installation step. Although the script downloads the Mambaforge installer and seemingly starts the installation, it appears that nothing actually gets installed. The script then continues to the next steps, which eventually fail due to the missing Mambaforge installation.

I am looking for assistance in identifying and resolving the issue with the Mambaforge installation. If you have experience with PowerShell scripting, any suggestions or insights would be greatly appreciated. Your input will help improve the script and make the Bark setup process smoother and more efficient for users.

Thank you for your help!

if (-not ([Security.Principal.WindowsPrincipal] [Security.Principal.WindowsIdentity]::GetCurrent()).IsInRole([Security.Principal.WindowsBuiltInRole] "Administrator")) {
    Write-Host "This script requires Administrator privileges. Please run the script as Administrator."
    return
}

function Check-Error {
    param (
        [string]$ErrorMessage
    )

    if ($LASTEXITCODE -ne 0) {
        throw $ErrorMessage
    }
}

function Test-CommandExists {
    param (
        [string]$Command
    )

    try {
        Get-Command $Command -ErrorAction Stop | Out-Null
        return $true
    } catch {
        return $false
    }
}

if (-not (Test-CommandExists "mamba")) {
    # Download Mambaforge
    $mambaForgeExe = "Mambaforge-Windows-x86_64.exe"
    if (-not (Test-Path $mambaForgeExe)) {
        Write-Host "Downloading Mambaforge..."
        try {
            Invoke-WebRequest -Uri "https://github.com/conda-forge/miniforge/releases/latest/download/$mambaForgeExe" -OutFile $mambaForgeExe
        } catch {
            throw "Failed to download Mambaforge: $_"
        }
    } else {
        Write-Host "Mambaforge installer already downloaded, skipping download."
    }

    # Install Mambaforge
    Write-Host "Installing Mambaforge..."
    try {
        Start-Process -FilePath ".\$mambaForgeExe" -ArgumentList "/S /D=%UserProfile%\miniforge" -Wait
    } catch {
        throw "Failed to install Mambaforge: $_"
    }

    # Add Mambaforge to Path
    $Env:Path = "$Env:UserProfile\miniforge\Scripts;$Env:Path"
} else {
    Write-Host "Mambaforge is already installed, skipping download and installation."
}

# Check if git is installed
if (-not (Test-CommandExists "git")) {
    $gitExe = "Git-2.40.1-64-bit.exe"
    if (-not (Test-Path $gitExe)) {
        Write-Host "Downloading Git..."
        Invoke-WebRequest -Uri "https://github.com/git-for-windows/git/releases/download/v2.40.1.windows.1/$gitExe" -OutFile $gitExe
    } else {
        Write-Host "Git installer already downloaded, skipping download."
    }

    Write-Host "Installing Git..."
    Start-Process -FilePath ".\$gitExe" -ArgumentList "/VERYSILENT" -Wait
} else {
    Write-Host "Git is already installed, skipping installation."
}

# Clone Bark repository
Write-Host "Cloning Bark repository..."
if (-not (Test-Path ".\bark")) {
    git clone https://github.com/JonathanFly/bark.git
    Check-Error "Failed to clone Bark repository"
} else {
    Write-Host "Bark repository already exists, skipping cloning."
}

# Change to Bark directory
Set-Location ".\bark"

# Create and activate environment
Write-Host "Creating and activating environment..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe env create -f environment-cuda.yml
Check-Error "Failed to create environment"

Write-Host "Activating environment..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe activate bark-infinity-oneclick
Check-Error "Failed to activate environment"

# Install additional packages
Write-Host "Installing additional packages..."
pip install encodec
pip install rich-argparse

# Uninstall and reinstall soundfile
Write-Host "Fixing soundfile installation..."
& $Env:UserProfile\miniforge\Scripts\mamba.exe uninstall pysoundfile
pip install soundfile

Write-Host "Installation complete. Please use the Miniforge Prompt to start Bark."

Here is the current PowerShell script: one_click_install.zip

C-drive installation? How to launch from non-C?

The BAT appears to have the expectation it will install on C: under username
@echo off
call %USERPROFILE%\mambaforge\Scripts\activate.bat bark-infinity-oneclick
python %USERPROFILE%\bark\bark_webui.py
pause

But I installed on E:\Bark
How do I modify this to properly run from E:?

Long text file change voices in the middle

I tried the new version. There is no option of --stable-voice in the commands...
Even though I selected en_speaker_1, the voice changed in the middle of the process. Can this be fixed? Sending the file.

of_finding_him_so_de-SPK-en_speaker_1.mp4

where is Ramshackle Gradio App and dev branch?

I want to generate long, consistent, and natural audio to make voice-overs for my YouTube videos: https://www.youtube.com/SECourses

I tried Tortoise TTS and even cloned a voice, but it wasn't natural and high quality enough

I need consistency and high quality

Reading the repo it says Ramshackle Gradio App

But can't find this dev branch or anything

I can volunteer to test and make a tutorial video

my discord : MonsterMMORPG#2198

requirements.txt

Thanks for making this wrapper!
Is it possible to include a requirements.txt file for this project?

OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.

D:\AI\bark>python bark_perform.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes... or can you? (and the next page, and the next page...)" --split_by_words 35
Traceback (most recent call last):
  File "D:\AI\bark\bark_perform.py", line 3, in <module>
    from bark import SAMPLE_RATE, generate_audio, preload_models
  File "D:\AI\bark\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "D:\AI\bark\bark\api.py", line 3, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "D:\AI\bark\bark\generation.py", line 7, in <module>
    from encodec import EncodecModel
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\encodec\__init__.py", line 12, in <module>
    from .model import EncodecModel
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\encodec\model.py", line 14, in <module>
    import torch
  File "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\__init__.py", line 133, in <module>
    raise err
OSError: [WinError 126] The specified module could not be found. Error loading "C:\Users\my user\AppData\Local\Programs\Python\Python310\lib\site-packages\torch\lib\cudnn_cnn_infer64_8.dll" or one of its dependencies.

I tried to reinstall torch but it still didn't work

Installing on MacOS -- No CUDA available?

I'm trying to install on MacOS 13.2.1. When I get to mamba env create -f environment-cuda.yml it gives me the error

Could not solve for environment specs
The following packages are incompatible
├─ cudatoolkit 11.8.0**  does not exist (perhaps a typo or a missing channel);
└─ pytorch-cuda 11.8**  is uninstallable because it requires
   └─ cuda 11.8.* , which does not exist (perhaps a missing channel).

On NVIDIA's website it says "NVIDIA® CUDA Toolkit 11.8 no longer supports development or running applications on macOS."

So am I out of luck? Does anyone know a way to get this running on MacOS?

Thanks!

problem with bark models during test run

Hi, when I try to run a test voice, Bark starts to download a few GB of models

python3 bark_perform.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes... or can you? (and the next page, and the next page...)" --split_by_words 35
Loading Bark models...

Finally it gives me:

No GPU being used. Careful, Inference might be extremely slow!
No GPU being used. Careful, Inference might be extremely slow!
No GPU being used. Careful, Inference might be extremely slow!
Downloading: "https://dl.fbaipublicfiles.com/encodec/v0/encodec_24khz-d7cc33bc.th" to /Users/paulinajaskulska/.cache/torch/hub/checkpoints/encodec_24khz-d7cc33bc.th
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1348, in do_open
    h.request(req.get_method(), req.selector, req.data, headers,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1282, in request
    self._send_request(method, url, body, headers, encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1328, in _send_request
    self.endheaders(body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1277, in endheaders
    self._send_output(message_body, encode_chunked=encode_chunked)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1037, in _send_output
    self.send(msg)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 975, in send
    self.connect()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/http/client.py", line 1454, in connect
    self.sock = self._context.wrap_socket(self.sock,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 513, in wrap_socket
    return self.sslsocket_class._create(
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1071, in _create
    self.do_handshake()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/ssl.py", line 1342, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark_perform.py", line 312, in <module>
    main(args)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark_perform.py", line 245, in main
    preload_models()
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 312, in preload_models
    _ = load_codec_model(use_gpu=use_gpu, force_reload=True)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 290, in load_codec_model
    model = _load_codec_model(device)
  File "/Users/paulinajaskulska/Desktop/tworczosc/bark/bark/generation.py", line 254, in _load_codec_model
    model = EncodecModel.encodec_model_24khz()
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/encodec/model.py", line 279, in encodec_model_24khz
    state_dict = EncodecModel._get_pretrained(checkpoint_name, repository)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/encodec/model.py", line 262, in _get_pretrained
    return torch.hub.load_state_dict_from_url(url, map_location='cpu', check_hash=True)  # type:ignore
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/hub.py", line 746, in load_state_dict_from_url
    download_url_to_file(url, cached_file, hash_prefix, progress=progress)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/site-packages/torch/hub.py", line 611, in download_url_to_file
    u = urlopen(req)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/Library/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: unable to get local issuer certificate (_ssl.c:997)>

what can I do about it?
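This is the usual macOS python.org-installer problem: that Python build can't find the system CA certificates, so every HTTPS download fails verification. A minimal workaround sketch, assuming the `certifi` package is available (the installer's bundled `Install Certificates.command` script does essentially the same thing):

```python
import ssl

# Workaround sketch (assumption: certifi is installed): point Python's default
# HTTPS context at certifi's CA bundle so the model download can verify TLS.
# Run this before preload_models().
try:
    import certifi
    ssl._create_default_https_context = (
        lambda *a, **kw: ssl.create_default_context(cafile=certifi.where())
    )
    patched = True
except ImportError:
    # Fallback for the python.org installer: run its bundled fixer script,
    # "/Applications/Python 3.10/Install Certificates.command"
    patched = False
print("certifi patch applied:", patched)
```

Note that `ssl._create_default_https_context` is a private attribute, so this is a stopgap, not a supported API; running the `Install Certificates.command` script is the cleaner permanent fix.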

Low VRAM or CUDA out of memory for BARK INFINITY Solved

python bark_perform.py --use_smaller_models --text_prompt "Hello, world testing this for Bark Infinity since it's giving CUDA out of memory error so to overcome this error just simply use --use_smaller_models as args" --split_by_words 35

In the web UI version of Bark and in suno-ai/bark you can set ["SUNO_USE_SMALL_MODELS"] = "True", but that isn't supported in this Infinity fork, so to overcome the CUDA out-of-memory error simply pass --use_smaller_models as an argument on the command line.
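For reference, in stock suno-ai/bark the small-model switch is an environment variable that must be set before the bark package is imported; a minimal sketch:

```python
import os

# suno-ai/bark reads this env var at import time, so it must be set BEFORE
# "from bark import preload_models" runs.
os.environ["SUNO_USE_SMALL_MODELS"] = "True"
print(os.environ["SUNO_USE_SMALL_MODELS"])  # -> True
```

In Bark Infinity the equivalent is the `--use_smaller_models` CLI flag shown above.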

RuntimeError: Unrecognized CachingAllocator option: garbage_collection_threshold

While attempting to run the script as written in the readme, I get:

Traceback (most recent call last):
  File "D:\tmp\bark\bark-inf\bark_perform.py", line 3, in <module>
    from bark import SAMPLE_RATE, generate_audio, preload_models
  File "D:\tmp\bark\bark-inf\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "D:\tmp\bark\bark-inf\bark\api.py", line 3, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "D:\tmp\bark\bark-inf\bark\generation.py", line 24, in <module>
    torch.cuda.is_bf16_supported()
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 92, in is_bf16_supported
    return torch.cuda.get_device_properties(torch.cuda.current_device()).major >= 8 and cuda_maj_decide
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 481, in current_device
    _lazy_init()
  File "D:\Python310\lib\site-packages\torch\cuda\__init__.py", line 216, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Unrecognized CachingAllocator option: garbage_collection_threshold

I have a GTX 1060 with 6 GB VRAM, Windows 10, and Python 3.10.6.
Oh, and PYTORCH_CUDA_ALLOC_CONF is garbage_collection_threshold:0.6,max_split_size_mb:128
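The `garbage_collection_threshold` allocator option was only added in newer PyTorch releases, so an older build rejects the entire `PYTORCH_CUDA_ALLOC_CONF` string at CUDA init. A hedged sketch of stripping just the unrecognized option before torch is imported (using the exact value reported above):

```python
import os

# The value reported in this issue; in practice, read whatever is already set.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "garbage_collection_threshold:0.6,max_split_size_mb:128"

conf = os.environ.get("PYTORCH_CUDA_ALLOC_CONF", "")
# Drop the option this torch build doesn't recognize; keep the rest.
kept = [o for o in conf.split(",") if o and not o.startswith("garbage_collection_threshold")]
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = ",".join(kept)
print(os.environ["PYTORCH_CUDA_ALLOC_CONF"])  # -> max_split_size_mb:128
```

Alternatively, unset the variable entirely (or upgrade PyTorch to a version that knows the option).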

torch.cuda.OutOfMemoryError: CUDA out of memory

I am getting the out-of-memory issue on my 8 GB card. I noticed you posted a comment about using --use_smaller_models. I installed the GUI version from the instructions and then updated the files with the ones in your repo. I am unsure where to put --use_smaller_models and would appreciate some help. Thank you.

new API interfaces for the new methods

As an integrator, I would like to have access to the same logic being used by the bark_perform.py script via bark.api.

Please relocate these methods to the API, and then import and use them from the perform script.

Additionally, the methods you have updated in the API have not had their docstrings updated, so a function confusingly documents returning the audio array when it now returns a tuple.

I can't get it to run on the GPU

This is what I get.

Loading Bark models... No GPU being used. Careful, Inference might be extremely slow!
I have a 2080 Ti and it works fine with SD and GPT models.

Any ideas?

GPU problems with CUDA detection, No module named 'encodec', and "No GPU being used. Careful" fixed

My 1080 Ti was not detected, while working fine with stable-diffusion and ChatGPT.

First I encountered "No module named 'encodec'".
Fixed by running: python -m pip install . (yes, include the .)

Second, the "No GPU being used. Careful, Inference might be extremely slow!" message:

type python and hit Enter
then type import torch and hit Enter
now type torch.cuda.is_available() and see if it prints True or False

If False, go to the PyTorch site and follow the steps for a manual reinstall.

If it still isn't working while everything else is (like in my case),
download Anaconda or Miniconda, create a clean environment, and start over.
That finally did the trick for me, though I still had to repeat the steps above.
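The interactive check described above can be condensed into one script; a small sketch that also degrades gracefully when PyTorch isn't importable at all:

```python
# Quick CUDA sanity check, equivalent to the interactive python/import torch steps.
try:
    import torch
    cuda_ok = torch.cuda.is_available()
    detail = torch.cuda.get_device_name(0) if cuda_ok else "CPU only"
except ImportError:
    cuda_ok, detail = False, "torch is not installed in this environment"
print("CUDA available:", cuda_ok, "-", detail)
```

If this prints `False` inside the Bark environment but `True` elsewhere, the environment has a CPU-only torch wheel and needs the reinstall described above.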

Bug in Gradio UI: Audio preview not updating with numbered file names

I have encountered a bug in the Gradio UI where the audio preview does not update if the file name has a number appended to it. For example, if the original file name is "audio.wav" and a new file is generated with the name "audio_1.wav", the UI only loads the original audio preview and does not update to the newly generated file.

To reproduce this issue, please follow these steps:

  1. Generate a prompt
  2. Generate another prompt
  3. Observe that the UI only displays the original audio preview and does not update to the newly generated file.

I have checked the console output and it is clear that the script is not correctly processing numbered files.

ModuleNotFoundError: No module named 'encodec'

What do I have to do to make JonathanFly/bark work? I already have the Bark web UI installed (one-click installer), copied the files from your repo into the bark folder, and installed soundfile via "pip install soundfile" from cmd.
I get the following error when I run python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3 in cmd:

Microsoft Windows [Version 10.0.22621.1555]
(c) Microsoft Corporation. Alle Rechte vorbehalten.

D:\AI\Bark_WebUI\bark>python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Traceback (most recent call last):
  File "D:\AI\Bark_WebUI\bark\bark_speak.py", line 3, in <module>
    from bark import SAMPLE_RATE, generate_audio, preload_models
  File "D:\AI\Bark_WebUI\bark\bark\__init__.py", line 1, in <module>
    from .api import generate_audio, text_to_semantic, semantic_to_waveform
  File "D:\AI\Bark_WebUI\bark\bark\api.py", line 5, in <module>
    from .generation import codec_decode, generate_coarse, generate_fine, generate_text_semantic
  File "D:\AI\Bark_WebUI\bark\bark\generation.py", line 7, in <module>
    from encodec import EncodecModel
ModuleNotFoundError: No module named 'encodec'

D:\AI\Bark_WebUI\bark>

summarizing the contributions

  1. Voices are randomly summoned when there is no history prompt; Bark Infinity lets you persist these voices for future use.
  2. Chunk arbitrarily long texts into smaller pieces, then generate each separately.
  3. Travolta mode: ignore the EOS token and keep on generating audio.

super cool stuff!

"The requested array has an inhomogeneous shape after 1 dimensions"

Windows 10, RTX 4090

  1. Git cloned repo
  2. missing encodec - install it via "pip install -U encodec"
  3. missing funcy - install it via "pip install -U funcy"
  4. missing scipy - install it via "pip install -U scipy"
  5. then the following:

E:\Magazyn\Grafika\AI\Text2Voice\bark>python bark_speak.py --text_prompt "It is a mistake to think you can solve any major problems just with potatoes." --history_prompt en_speaker_3
Loading Bark models...
Models loaded.
Estimated time: 6.00 seconds.
Generating: It is a mistake to think you can solve any major problems just with potatoes.
Using speaker: en_speaker_3
history_prompt in gen: en_speaker_3
en_speaker_3
aa
100%|โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ| 100/100 [00:12<00:00, 8.20it/s]
100%|โ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆโ–ˆ| 31/31 [00:35<00:00, 1.15s/it]
Traceback (most recent call last):
  File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 175, in <module>
    main(args)
  File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 157, in main
    gen_and_save_audio(prompt, history_prompt, text_temp, waveform_temp, filename, output_dir)
  File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 97, in gen_and_save_audio
    save_audio_to_file(filename, audio_array, SAMPLE_RATE, output_dir=output_dir)
  File "E:\Magazyn\Grafika\AI\Text2Voice\bark\bark_speak.py", line 63, in save_audio_to_file
    sf.write(filepath, audio_array, sample_rate, format=format, subtype=subtype)
  File "C:\Users\jurandfantom\miniconda3\lib\site-packages\soundfile.py", line 338, in write
    data = np.asarray(data)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (2,) + inhomogeneous part.

E:\Magazyn\Grafika\AI\Text2Voice\bark>

Minor bugs

A few minor bugs, nothing game-breaking!

  • When there is a : in the text prompt, such as "TV AD: Blah blah text here", then the script attempts to save the files with an unsanitized filename. On Windows, : is an illegal character in the filename, which causes no files to be saved out, only a broken 0kb file with the name up to the :. Might be best to strip out everything that isn't regular text.

  • When executing the same prompt twice, any already-existing speaker file is overwritten; only the wave file gets a _1 appended to its filename. It would be nice if this numbering also applied to the speaker files.
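A minimal sketch of the kind of sanitizer the first bug calls for. The function name is hypothetical; the character class is the set Windows forbids in filenames:

```python
import re

# Characters Windows forbids in filenames (hypothetical helper, not Bark's code)
WINDOWS_ILLEGAL = r'[<>:"/\\|?*]'

def sanitize_filename(name: str, replacement: str = "_") -> str:
    """Replace characters that are illegal in Windows filenames."""
    return re.sub(WINDOWS_ILLEGAL, replacement, name).strip()

print(sanitize_filename("TV AD: Blah blah text here"))  # -> TV AD_ Blah blah text here
```

Running every generated filename through something like this before saving would avoid the broken 0 kB files.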

made batch file for autorun and probably an issue with --history_prompt

I made a quick batch file to run the script with args. On the weekend I will make a GUI with tkinter so that anyone can copy-paste their prompt and run the program with their selected speaker and settings. If anyone wants to take over, I will provide the files I have made.

And many thanks to OP for making such awesome AI work with larger text.

BTW: I noticed one thing. When it splits the text, the first segment prints --history_prompt in gen: the chosen speaker, but subsequent generations print --history_prompt in gen: none. Could that be fixed?
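A hypothetical sketch of the fix for the history-prompt issue: thread the same speaker through every text chunk instead of only the first one. `generate_long` and `fake_generate` are illustration-only names; in Bark the generator would be the real `generate_audio`:

```python
# Hypothetical sketch: keep one speaker across all chunks of a long text.
# The generator is injected so the sketch is self-contained and runnable.
def generate_long(chunks, history_prompt, generate_audio):
    segments = []
    for chunk in chunks:
        # Every chunk gets the chosen speaker, so later segments
        # don't fall back to history_prompt=None.
        segments.append(generate_audio(chunk, history_prompt=history_prompt))
    return segments

# Stand-in generator that just records which speaker each chunk received.
fake_generate = lambda text, history_prompt=None: (history_prompt, text)
print(generate_long(["part one", "part two"], "en_speaker_3", fake_generate))
# -> [('en_speaker_3', 'part one'), ('en_speaker_3', 'part two')]
```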

cuDNN Version Incompatibility

Hi there, getting an error just before my audio file is generated:

RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 7, 0) but found runtime version (8, 6, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.

Any help on this is possible? Thank you!

Full log before it breaks -

--Segment 1/1: est. 0.40s
test
Loading text model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\text_2.pt to cpu
_load_model model loaded: 312.3M params, 1.269 loss                      generation.py:840
Loading coarse model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\coarse_2.pt to cpu
_load_model model loaded: 314.4M params, 2.901 loss                      generation.py:840
Loading fine model from C:\Users\[HIDDEN]\.cache\suno\bark_v0\fine_2.pt to cpu
_load_model model loaded: 302.1M params, 2.079 loss                      generation.py:840
Traceback (most recent call last):
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\routes.py", line 394, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\blocks.py", line 1075, in process_api
    result = await self.call_function(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\blocks.py", line 884, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\to_thread.py", line 28, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(func, *args, cancellable=cancellable,
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\_backends\_asyncio.py", line 818, in run_sync_in_worker_thread
    return await future
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\anyio\_backends\_asyncio.py", line 754, in run
    result = context.run(func, *args)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\gradio\helpers.py", line 587, in tracked_fn
    response = fn(*args)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_webui.py", line 205, in generate_audio_long_gradio
    full_generation_segments, audio_arr_segments, final_filename_will_be = api.generate_audio_long_from_gradio(**kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 467, in generate_audio_long_from_gradio
    full_generation_segments, audio_arr_segments, final_filename_will_be = generate_audio_long(**kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 577, in generate_audio_long
    full_generation, audio_arr = generate_audio_barki(text=segment_text, **kwargs)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\api.py", line 434, in generate_audio_barki
    audio_arr = codec_decode(fine_tokens)
  File "C:\Users\[HIDDEN]\Documents\Projects\bark\bark_infinity\generation.py", line 747, in codec_decode
    model.to(models_devices["codec"])
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 1145, in to
    return self._apply(convert)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\module.py", line 797, in _apply
    module._apply(fn)
  [Previous line repeated 1 more time]
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 202, in _apply
    self._init_flat_weights()
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 139, in _init_flat_weights
    self.flatten_parameters()
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\nn\modules\rnn.py", line 169, in flatten_parameters
    not torch.backends.cudnn.is_acceptable(fw.data)):
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\backends\cudnn\__init__.py", line 97, in is_acceptable
    if not _init():
  File "C:\Users\[HIDDEN]\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\backends\cudnn\__init__.py", line 60, in _init
    raise RuntimeError(base_error_msg)
RuntimeError: cuDNN version incompatibility: PyTorch was compiled against (8, 7, 0) but found runtime version (8, 6, 0). PyTorch already comes bundled with cuDNN. One option to resolving this error is to ensure PyTorch can find the bundled cuDNN.

Request for Multiple Pass Quality Assurance for Voice Generation

I would like to request the implementation of a multiple pass quality assurance process for the voice generation program. The aim of this process is to regenerate any output audio segments that do not meet a specified quality standard.

According to ChatGPT, a quality assessment function can be created to evaluate the output audio segment. This function can use a combination of signal processing and machine learning techniques to analyze the audio and determine if it is distorted or of poor quality. Some useful metrics for this assessment include signal-to-noise ratio (SNR), total harmonic distortion (THD), and perceptual evaluation of speech quality (PESQ).

I believe that implementing this multiple pass quality assurance process will significantly improve the overall quality of the generated voice output.

Error when run "python bark_webui.py"

Traceback (most recent call last):
  File "/home/xxx/sound/bark/bark_webui.py", line 350, in <module>
    with gr.Blocks(theme=default_theme, css=bark_console_style) as demo:
  File "/home/xxx/.local/lib/python3.11/site-packages/gradio/blocks.py", line 1285, in __exit__
    self.config = self.get_config_file()
                  ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/.local/lib/python3.11/site-packages/gradio/blocks.py", line 1261, in get_config_file
    "input": list(block.input_api_info()),  # type: ignore
             ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/xxx/anaconda3/envs/bark/lib/python3.11/site-packages/gradio_client/serializing.py", line 40, in input_api_info
    return (api_info["serialized_input"][0], api_info["serialized_input"][1])
            ~~~~~~~~^^^^^^^^^^^^^^^^^^^^
KeyError: 'serialized_input'

PyTorch Stream Reader failed reading zip archive: failed finding central directory

Multiple attempts to generate audio resulted in this runtime error (relevant image attached).
I followed the mamba install on Windows 10, running on an RTX 3090. The only change was moving the bark clone from the C: drive to the G: drive (the place for my AI-related stuff).
It also occurs when I try to pre-load models. I double-checked: the models are located at their supposedly correct location:
C:\Users\Admin\.cache\suno\bark_v0

Is this mamba thing really required?

Actually, I just created a virtual environment, did a git clone, and installed the pip requirements.
It works fine so far, and the web UI also starts. However, it says voice generation only works on the CPU, so I guess the mamba thing is for CUDA.
Automatic1111 seems to work without mamba?

RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory

Loading Bark models...
Loading text model from C:\Users\Ariana\.cache\suno\bark_v0\text_2.pt to cpu
Loading coarse model from C:\Users\Ariana\.cache\suno\bark_v0\coarse_2.pt to cpu
Traceback (most recent call last):
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_perform.py", line 128, in <module>
    main(namespace_args)
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_perform.py", line 95, in main
    generation.preload_models(args.text_use_gpu, args.text_use_small, args.coarse_use_gpu, args.coarse_use_small, args.fine_use_gpu, args.fine_use_small, args.codec_use_gpu, args.force_reload)
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 888, in preload_models
    _ = load_model(
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 781, in load_model
    model = _load_model_f(ckpt_path, device)
  File "C:\Users\Ariana\Desktop\Bark_App\New_Bark\bark\bark\bark_infinity\generation.py", line 813, in _load_model
    checkpoint = torch.load(ckpt_path, map_location=device)
  File "C:\Users\Ariana\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\serialization.py", line 797, in load
    with _open_zipfile_reader(opened_file) as opened_zipfile:
  File "C:\Users\Ariana\mambaforge\envs\bark-infinity-oneclick\lib\site-packages\torch\serialization.py", line 283, in __init__
    super().__init__(torch._C.PyTorchFileReader(name_or_buffer))
RuntimeError: PytorchStreamReader failed reading zip archive: failed finding central directory
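"Failed finding central directory" usually means a truncated or corrupted checkpoint download: the `.pt` file is a zip archive and its index is missing. Setting the cached checkpoints aside forces a clean re-download on the next run. A sketch with a hypothetical helper, demonstrated on a throwaway directory rather than the real cache:

```python
import tempfile
from pathlib import Path

def clear_bark_cache(cache_dir: Path) -> list[str]:
    """Move cached .pt checkpoints aside so Bark re-downloads them (hypothetical helper)."""
    moved = []
    for ckpt in sorted(cache_dir.glob("*.pt")):
        # Rename instead of deleting, so nothing is lost if the files were fine.
        ckpt.rename(ckpt.with_name(ckpt.name + ".bak"))
        moved.append(ckpt.name)
    return moved

# The real cache, per the log above, is Path.home() / ".cache" / "suno" / "bark_v0".
# Demo on a temp directory so running this snippet never touches real models:
demo = Path(tempfile.mkdtemp())
(demo / "text_2.pt").write_bytes(b"corrupted download")
print(clear_bark_cache(demo))  # -> ['text_2.pt']
```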

I can't use certain formatting in prompts!

When I attempt to use the symbol "โ™ช" in the prompt to indicate a singing voice, it just gives me an error. Do I need to format it in a specific way (besides this -> โ™ช I'm the king of the jungle โ™ช), or is there a setting I need to check first? (Typing [singing] seems to sometimes work, making the voice sing what comes after it. Is that intentional?)

Also, using speakers like in (Man: Hi ... Woman: Hello) isn't consistent at all. Is there a setting I need to adjust for it to work?
