
languageleapai's People

Contributors

0xiaohei0, edqe14, hpacleb, sociallyineptweeb, stancho44, totoluto


languageleapai's Issues

Request Timeout

I see too many Request Timeout errors. How do I fix this? The translation is also often wrong.

Type Error

Umm, I already installed everything in requirements.txt and filled in the .env file in Google Colab, but I'm still getting an error.
image-1

japanese audio to english subs on live, can it be done?

Let's say I play a Japanese video in VLC or a web browser. I need a way to send the audio to Whisper so it can translate it live, not one MP3 file after another. For Japanese audio, I want to see English subtitles instead of Japanese ones, either in my terminal or as you described in your video. I think it can be done, but I don't know how. Sorry for my bad English.
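One common way to approximate live subtitles is to cut the incoming audio into short windows and send each window to the transcription service as soon as it is full. The sketch below shows only the windowing step on raw PCM bytes. The function name `iter_pcm_chunks` is hypothetical, and capturing system audio (for example through a loopback device) and posting each chunk to Whisper are assumed to happen elsewhere.

```python
from typing import Iterator


def iter_pcm_chunks(pcm: bytes, sample_rate: int, sample_width: int,
                    channels: int, seconds: float) -> Iterator[bytes]:
    """Yield consecutive fixed-duration slices of raw PCM audio."""
    bytes_per_frame = sample_width * channels
    # Never use a step of zero frames, even for very small durations.
    frames_per_chunk = max(1, int(sample_rate * seconds))
    step = frames_per_chunk * bytes_per_frame
    for start in range(0, len(pcm), step):
        # The final slice may be shorter than the others.
        yield pcm[start:start + step]
```

Shorter windows lower the delay but give the model less context, so the window length is a trade-off to tune by ear.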

DeepL

I just want to use the subtitler. Do I still need to use DeepL?
I ask because DeepL doesn't support cards from my country.

Application limitations

Regarding this:

Application limitations

Some applications, such as Valorant, for some reason do not allow open mic for team voice chat, so LanguageLeapAI will not work in these cases unless you hold down the push-to-talk button whenever you want your teammates to hear the Text-to-Speech. However, Valorant does have open mic for party voice chat, so there should be no issue if it's used with your party members.

We could hold down the push-to-talk button through code while the audio is playing. The keyboard module, which is already imported, should be able to do this (https://github.com/boppreh/keyboard#keyboardpresshotkey).
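A minimal sketch of that idea, assuming playback is a blocking call. The helper name `play_with_push_to_talk` and the `lead_in` delay are hypothetical. The press and release functions are passed in so the helper can be tried without a game running.

```python
import time
from typing import Callable


def play_with_push_to_talk(play: Callable[[], None], key: str,
                           press: Callable[[str], None],
                           release: Callable[[str], None],
                           lead_in: float = 0.1) -> None:
    """Hold `key` down for the duration of a blocking `play()` call."""
    press(key)
    try:
        time.sleep(lead_in)  # give the application a moment to open the mic
        play()
    finally:
        release(key)  # always let go of the key, even if playback raises
```

With the keyboard library this would be called as `play_with_push_to_talk(play, 'v', keyboard.press, keyboard.release)`, where `'v'` stands for whatever push-to-talk key is bound in the game.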

About the Google Colab

Even though this is free, it has a usage quota. Once you reach the limit, you can no longer use the GPU, and you'll just have to use another account :D

have a problem with getenv()

Problem: line 9, in <module>: BASE_URL = getenv('WHISPER_BASE_URL')

TypeError: 'str' object is not callable

Code:

1)from os import getenv
2)from pathlib import Path
3)
4)import requests
5)from dotenv import load_dotenv
6)
7)load_dotenv()
8)
9)BASE_URL = getenv('WHISPER_BASE_URL')
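This TypeError means that the name `getenv` refers to a string when line 9 runs. One possible cause, not visible in the snippet above, is that the name was reassigned somewhere, for example in a notebook cell that ran earlier. The sketch below reproduces the message and shows a way to read the setting that does not depend on that name. `require_env` is a hypothetical helper.

```python
import os

# Reproduce the error: once a name is rebound to a string, calling it fails.
getenv = "http://localhost:9000"  # hypothetical accidental reassignment
try:
    getenv("WHISPER_BASE_URL")
except TypeError as error:
    message = str(error)  # 'str' object is not callable


def require_env(name: str) -> str:
    """Return the value of environment variable `name` or raise a clear error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set; check your .env file")
    return value
```

In Colab, restarting the runtime clears any stale names left over from earlier cells.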

ModuleNotFoundError

➜ ~ python3.8 subtitltler.py
Traceback (most recent call last):
  File "subtitltler.py", line 11, in <module>
    from modules.audio_translate import translate_audio
ModuleNotFoundError: No module named 'modules.audio_translate'; 'modules' is not a package
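The message "'modules' is not a package" usually means Python found something named `modules` that is a plain module rather than the project's `modules` directory, for example a stray `modules.py` or an unrelated installed module. A small diagnostic sketch (the helper name `describe_module` is hypothetical) shows what Python would actually import:

```python
import importlib.util


def describe_module(name: str) -> dict:
    """Report where a top-level module comes from and whether it is a package."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "is_package": False}
    return {
        "found": True,
        "origin": spec.origin,  # file path the import would load
        # Only packages have a list of locations to search for submodules.
        "is_package": spec.submodule_search_locations is not None,
    }
```

Running `describe_module("modules")` from the same directory as the script shows which file is shadowing the package, if any.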

A question about the get_audio_device_ids.py file

C:\Users\Terence\Downloads\LanguageLeapAI-main\LanguageLeapAI-main\src\modules>python get_audio_device_ids.py
Traceback (most recent call last):
  File "C:\Users\Terence\Downloads\LanguageLeapAI-main\LanguageLeapAI-main\src\modules\get_audio_device_ids.py", line 1, in <module>
    import speech_recognition as sr
ModuleNotFoundError: No module named 'speech_recognition'

Hello everyone, regarding the error above: am I supposed to run the file from the command prompt?
I am new to programming, so I'm confused by all of this.
Any replies would be appreciated o7
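The traceback above says the `speech_recognition` module is not installed in the Python that ran the script. The name used in `import` is not always the name used with `pip install`; for this library the pip name is `SpeechRecognition`. A minimal sketch for listing what is missing, where `PIP_NAMES` and `missing_packages` are hypothetical names:

```python
import importlib.util

# Import name -> name to pass to `pip install`, where the two differ.
PIP_NAMES = {"speech_recognition": "SpeechRecognition"}


def missing_packages(import_names):
    """Return the pip names of the modules that cannot be imported."""
    missing = []
    for name in import_names:
        if importlib.util.find_spec(name) is None:
            missing.append(PIP_NAMES.get(name, name))
    return missing
```

Installing from the project's requirements.txt with the same interpreter that runs the script is the usual way to fill in everything at once.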

What does this mean?

image
I already asked ChatGPT, and I'm stressed out from searching the source code for that .py file. Even pip install doesn't work.

amd gpu support?

I got the following error while trying to start up the Docker container:

Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy' nvidia-container-cli: initialization error: WSL environment detected but no adapters were found: unknown

On another PC, this issue was solved by installing the Nvidia GPU driver, but I have an AMD GPU on my main machine 💀. Is there any way to change the Docker requirements so that AMD GPUs work?

Request timeout for subtitler.py

image

I tested with short and long voice clips; the long ones just return a request timeout.

Edit: Actually, it's not about the duration. It now occurs often with both short and long clips.
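If the timeouts are intermittent, retrying with a growing delay can hide the occasional slow response. This is a generic sketch, not the project's code: `call_with_retries` is a hypothetical helper, and the request itself is passed in as a function.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_retries(request: Callable[[], T], attempts: int = 3,
                      base_delay: float = 1.0,
                      retry_on: Tuple[Type[BaseException], ...] = (TimeoutError,),
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call `request`, retrying with exponential backoff on the given errors."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return request()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1x, 2x, 4x, ... the base delay
    raise AssertionError("unreachable")
```

With the requests library, `retry_on` would be `(requests.exceptions.Timeout,)` and `request` a lambda that posts the audio with an explicit `timeout=` argument.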

voice_translator.py stopping without error after attempting to send API calls to Whisper/DeepL/VoiceVox

Issue

I have been experimenting for hours without luck. Whenever I run voice_translator.py, it seems to stop the script after attempting to request data. It doesn't output anything either. Very rarely it decides to work, but 99% of the time I am unable to get it to work.

I ran Whisper and VoiceVox through Docker before, but thinking that somehow that was an issue despite having the resources, I'm hosting them currently on Colab. I've attempted using different Python versions from 3.8 to 3.11.

I've documented all my testing below. I hope this helps pinpoint something. Maybe I'm overlooking something very apparent or maybe it's a specific issue on my environment, though I'm just running on Windows 10.

Tests with voice_translator.py

I've had the best results via Docker, where it's able to at least transcribe. But even that is unfortunately inconsistent. I've logged the contents of the variables in voice_translator.py, which show that it doesn't get past DeepL.
image

I even went as far as omitting certain steps:

Skipping transcription
No result from DeepL, script stops without throwing error.
image

Skipping translation
Here it seems to get responses from both audio_query and synthesis calls, but stops before playing the audio. Also no error thrown.
image

Tests with module scripts

However, the calls succeed when I run the module scripts manually. The result is the same with either Docker or Colab.
voicevox.py (I even tried translating here)
image
asr.py
image

Extra info

For some reason, I'm not able to see the logs in Docker. But when I switched to Colab, it logged bad requests whenever transcription failed while running voice_translator.py. The 200 responses were from the times I manually ran asr.py.
image

Unable to reach whisper

I'm getting a problem while running voice_translator.py. It says:
image
even though the variables are correct, I think.
image
image
image

How to set this up?

Can someone please write a step-by-step guide? I just want to use the voice talk option. I can't figure out what to do with the pip and cp commands, or what to do with Docker. It's so confusing.

Couldn't find `CABLE Output (VB-Audio Virtual Cable)` in application settings.

According to the Audio Routing setup guide, after setting up VoiceMeeter Banana, I have to set the input device to CABLE Output (VB-Audio Virtual Cable) in the application settings. The problem is, I couldn't find it anywhere. All I found were VoiceMeeter Output (VB-Audio VoiceMeeter VAIO) and VoiceMeeter AUX Output (VB-Audio VoiceMeeter AUX VAIO).

image

Here is my VoiceMeeter Banana setup.

image

Thank you.

Can't get a response from VoiceVox in Colab

I managed to get the project to work. However, I can't seem to get a response from VoiceVox in Colab.
image
the additional print statements are added like this:
image

I tried adding the port number to the URL and got a connection refused error.
image

I ended up hosting VoiceVox on my local machine, and it works.
image
Is this due to the Antivirus Web Protection you mentioned?
If so, how should I configure my firewall? I tried to allow inbound connections like this, but the error stays the same:
image

argument out of range voice_translator.py / request timeout subtitler.py

I've followed the instructions in the audio and env files and installed everything listed in the README, but when I try to launch the scripts I get "argument out of range" for voice_translator.py and "Request timeout" for subtitler.py. I tried playing with the values in the .env file, but nothing changed. Any help?

Google colab

Okay, the "server" created in Google Colab is working now, but it returns a 500 error:

image

Do you think the error occurs because I'm not using the Colab GPU?

discord not detecting audio

I have the whole program configured correctly, but Discord and OBS do not detect the sound through the cable output.

image

image

This openai WHISPER model has some serious problem

image
(ldo) G:\PRINT G0\LanguageLeapAI-main\LanguageLeapAI-main\src>python subtitler.py
Request timeout
Request timeout
Request timeout
Request timeout

not working at all

(ldo) G:\PRINT G0\LanguageLeapAI-main\LanguageLeapAI-main\src>python voice_translator.py
English: copy.
Japanese: をコピーします。
This is what I got when I said something completely different, not "copy".

(ldo) G:\PRINT G0\LanguageLeapAI-main\LanguageLeapAI-main\src>python voice_translator.py
English: Thank you.
Japanese: ありがとうございます。
I just said "hey" and it returned "Thank you."

(ldo) G:\PRINT G0\LanguageLeapAI-main\LanguageLeapAI-main\src>python voice_translator.py
English: Are you?
Japanese: あなたはそうですか?
I said "how are you" and it returned "Are you?"

And the VoiceVox part just doesn't work at all. Colab tells me that if I'm not using the GPU on the VoiceVox instance, I should just shut it down.

DeepL alternatives?

I can't sign up for a free account due to being in an unsupported country so I was wondering if there are other alternatives?

Very strange bug

Finally, the subtitler works, but I can't hear anything from my PC.

When I use the voice translator, it worked once and translated my voice, but afterwards every running application starts to crash, until Windows itself crashes and the screen goes blank.
When Windows doesn't crash, I can't stop the Docker container, and Docker Desktop won't stop running.

I called a friend for help and he couldn't help me. It's the weirdest thing that has happened to me, and I have no idea how to fix it.
It happens every time I launch the voice translator.

struct.error: argument out of range for voice_translator.py

I can hold down the key just fine, but when I let go I get this massive error. I am running the program using Docker, and I have already double-checked that the local service URLs are correct.

Exception in thread Thread-2 (process):
Traceback (most recent call last):
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\threading.py", line 1016, in _bootstrap_inner
    self.run()
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\threading.py", line 953, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\site-packages\keyboard\_generic.py", line 58, in process
    if self.pre_process_event(event):
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\site-packages\keyboard\__init__.py", line 213, in pre_process_event
    key_hook(event)
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\site-packages\keyboard\__init__.py", line 516, in <lambda>
    return hook_key(key, lambda e: e.event_type == KEY_DOWN or callback(e), suppress=suppress)
  File "G:\LanguageLeapAI\src\voice_translator.py", line 58, in on_release_key
    wf.writeframes(b''.join(frames))
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\wave.py", line 437, in writeframes
    self.writeframesraw(data)
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\wave.py", line 426, in writeframesraw
    self._ensure_header_written(len(data))
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\wave.py", line 467, in _ensure_header_written
    self._write_header(datasize)
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\wave.py", line 479, in _write_header
    self._file.write(struct.pack('<L4s4sLHHLLHH4s',
struct.error: argument out of range
Exception ignored in: <function Wave_write.__del__ at 0x0000017E045BF7F0>
Traceback (most recent call last):
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\wave.py", line 326, in __del__
    self.close()
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\wave.py", line 444, in close
    self._ensure_header_written(0)
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\wave.py", line 467, in _ensure_header_written
    self._write_header(datasize)
  File "C:\Users\kylez\AppData\Local\Programs\Python\Python310\lib\wave.py", line 479, in _write_header
    self._file.write(struct.pack('<L4s4sLHHLLHH4s',
struct.error: argument out of range
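The traceback ends in `struct.pack('<L4s4sLHHLLHH4s', ...)`, which writes the WAV header with fixed-width unsigned fields, so the error means one of the recording parameters or the data length does not fit its field. The traceback does not say which one. The sketch below checks the values before handing them to `wave`, so the failure names the offending parameter. `write_wav` is a hypothetical helper, not the project's function.

```python
import io
import wave


def write_wav(target, frames: bytes, channels: int, sample_width: int,
              sample_rate: int) -> None:
    """Write PCM frames as WAV after checking the values fit the header."""
    if not 1 <= channels <= 0xFFFF:
        raise ValueError(f"channels out of range: {channels}")
    if not 1 <= sample_width <= 4:
        raise ValueError(f"sample width out of range: {sample_width}")
    if not 1 <= sample_rate <= 0xFFFFFFFF:
        raise ValueError(f"sample rate out of range: {sample_rate}")
    if channels * sample_width > 0xFFFF:
        raise ValueError("block align does not fit in 16 bits")
    if channels * sample_width * sample_rate > 0xFFFFFFFF:
        raise ValueError("byte rate does not fit in 32 bits")
    if len(frames) + 36 > 0xFFFFFFFF:
        raise ValueError("recording is too long for a WAV file")
    with wave.open(target, "wb") as wf:
        wf.setnchannels(channels)
        wf.setsampwidth(sample_width)
        wf.setframerate(sample_rate)
        wf.writeframes(frames)
```

Printing the channel count and sample rate reported by the selected microphone right before recording is a quick way to see whether the device ID in the .env points at the intended device.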

How do I get voice IDs?

I have a question, if I want to change the voice to a male voice, how do I get the voice IDs to obtain the voice I want for the value VOICE_ID in the .env?
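As far as I know, the VOICEVOX engine exposes a `GET /speakers` endpoint that returns each speaker together with a list of styles, and each style's `id` is the number that goes into VOICE_ID. The response shape used below is an assumption to verify against the engine's own `/docs` page, and `fetch_speakers` and `list_voice_ids` are hypothetical helper names.

```python
import json
from urllib.request import urlopen


def list_voice_ids(speakers: list) -> list:
    """Flatten the speaker list into (speaker name, style name, style id)."""
    rows = []
    for speaker in speakers:
        for style in speaker.get("styles", []):
            rows.append((speaker["name"], style["name"], style["id"]))
    return rows


def fetch_speakers(base_url: str) -> list:
    """Download the speaker list from a running engine (assumed endpoint)."""
    with urlopen(base_url.rstrip("/") + "/speakers", timeout=10) as response:
        return json.load(response)
```

For a local engine this would be used as `list_voice_ids(fetch_speakers("http://127.0.0.1:50021"))`, after which the voices can be auditioned one by one to pick a male voice.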

Problems with Voicevox

Sooo, here I am opening an issue because I've spent about two hours trying to fix this by myself and I'm tired of it now. Hopefully somebody can help.

I initially ran everything in Google Colab and was constantly getting timeouts from my VoiceVox Colab instance. I tested the translation in the asr.py file, and DeepL and Whisper are playing nicely; that part works just fine.
I get a translation of what I said printed in the terminal, and then the problems start.
My VoiceVox instance doesn't print anything to the Colab console and doesn't react to the test in voicevox.py or when I try to actually use it in voice_translator.py.

The next thing I did was try to run just VoiceVox locally, because maybe Colab was playing up. I quickly noticed that it doesn't work with AMD GPUs, so I pulled the CPU image from Docker Hub and ran it manually:

  1. docker pull voicevox/voicevox_engine:cpu-ubuntu20.04-latest
  2. docker run --rm -it -p '127.0.0.1:50021:50021' voicevox/voicevox_engine:cpu-ubuntu20.04-latest

And now some things were happening (I was getting output from VoiceVox, yay!) but others still weren't.
Now every time I say something, I get the following errors in the console in which VoiceVox is running:

INFO:     Started server process [1]
INFO:     Waiting for application startup.
reading /tmp/tmp5wmph5yu ... 57
emitting double-array: 100% |###########################################|

done!
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:50021 (Press CTRL+C to quit)
INFO:     172.17.0.1:48314 - "POST //audio_query?text=%E3%82%80%E3%81%8B%E3%81%97%E3%81%82%E3%82%8B%E3%81%A8%E3%81%93%E3%82%8D%E3%81%AB%E3%80%81%E3%82%B8%E3%83%A3%E3%83%83%E3%82%AF%E3%81%A8%E3%81%84%E3%81%86%E7%94%B7%E3%81%AE%E5%AD%90%E3%81%8C%E3%81%84%E3%81%BE%E3%81%97%E3%81%9F%E3%80%82%E3%82%B8%E3%83%A3%E3%83%83%E3%82%AF%E3%81%AF%E3%81%8A%E6%AF%8D%E3%81%95%E3%82%93%E3%81%A8%E4%B8%80%E7%B7%92%E3%81%AB%E4%BD%8F%E3%82%93%E3%81%A7%E3%81%84%E3%81%BE%E3%81%97%E3%81%9F%E3%80%82&speaker=15 HTTP/1.1" 404 Not Found
INFO:     172.17.0.1:48316 - "POST //synthesis?speaker=15 HTTP/1.1" 404 Not Found
^CINFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [1]

and these from the voicevox.py file:

Voicevox attempting to speak now...
Exception in thread Thread-1 (play_voice):
Exception in thread Traceback (most recent call last):
  File "C:\Python310\lib\threading.py", line 1009, in _bootstrap_inner
Thread-2 (play_voice):
Traceback (most recent call last):
  File "C:\Python310\lib\threading.py", line 1009, in _bootstrap_inner
    self.run()
  File "C:\Python310\lib\threading.py", line 946, in run
    self.run()
  File "C:\Python310\lib\threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "C:\*\LanguageLeapAI-main\src\modules\voicevox.py", line 29, in play_voice
    self._target(*self._args, **self._kwargs)
  File "C:\*\LanguageLeapAI-main\src\modules\voicevox.py", line 29, in play_voice
    data, fs = sf.read(VOICEVOX_WAV_PATH, dtype='float32')
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 285, in read
    data, fs = sf.read(VOICEVOX_WAV_PATH, dtype='float32')
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 285, in read
    with SoundFile(file, 'r', samplerate, channels,
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 658, in __init__
    with SoundFile(file, 'r', samplerate, channels,
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 1216, in _open
    self._file = self._open(file, mode_int, closefd)
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 1216, in _open
    raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening 'C:\\*\\LanguageLeapAI-main\\src\\audio\\voicevox.wav': Format not recognised.
    raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening 'C:\\*\\LanguageLeapAI-main\\src\\audio\\voicevox.wav': Format not recognised.

Process finished with exit code 0

as well as these from the voice_translator.py: (as you can see, the translation part works)

English: This is a test sentence.
Japanese: これはテスト文です。
Exception in thread Exception in thread Thread-4 (play_voice):
Traceback (most recent call last):
  File "C:\Python310\lib\threading.py", line 1009, in _bootstrap_inner
Thread-3 (play_voice):
Traceback (most recent call last):
  File "C:\Python310\lib\threading.py", line 1009, in _bootstrap_inner
    self.run()    self.run()
  File "C:\Python310\lib\threading.py", line 946, in run

  File "C:\Python310\lib\threading.py", line 946, in run
    self._target(*self._args, **self._kwargs)
  File "C:\*\LanguageLeapAI-main\src\modules\voicevox.py", line 29, in play_voice
    self._target(*self._args, **self._kwargs)
  File "C:\*\LanguageLeapAI-main\src\modules\voicevox.py", line 29, in play_voice
    data, fs = sf.read(VOICEVOX_WAV_PATH, dtype='float32')
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 285, in read
    data, fs = sf.read(VOICEVOX_WAV_PATH, dtype='float32')
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 285, in read
    with SoundFile(file, 'r', samplerate, channels,
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 658, in __init__
    with SoundFile(file, 'r', samplerate, channels,
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 658, in __init__
    self._file = self._open(file, mode_int, closefd)
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 1216, in _open
    self._file = self._open(file, mode_int, closefd)
  File "C:\*\LanguageLeapAI-main\lib\site-packages\soundfile.py", line 1216, in _open
    raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening 'C:\\*\\LanguageLeapAI-main\\src\\audio\\voicevox.wav': Format not recognised.
    raise LibsndfileError(err, prefix="Error opening {0!r}: ".format(self.name))
soundfile.LibsndfileError: Error opening 'C:\\*\\LanguageLeapAI-main\\src\\audio\\voicevox.wav': Format not recognised.
Closing voice translator.

Process finished with exit code 0

So yeah, that's about what i've done so far.

Help would be most appreciated!
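The engine log above shows `POST //audio_query` and `POST //synthesis` with a double slash, both answered with 404. One plausible explanation, which is an assumption since the .env is not shown, is that the base URL ends in `/` and the code appends `/audio_query` to it. The 404 response body would then be saved as voicevox.wav, which would match the "Format not recognised" error. A sketch of both checks, with hypothetical helper names:

```python
def build_url(base_url: str, path: str) -> str:
    """Join a base URL and a path with exactly one slash between them."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def looks_like_wav(data: bytes) -> bool:
    """Check for the RIFF/WAVE signature before saving a response as .wav."""
    return len(data) >= 12 and data[:4] == b"RIFF" and data[8:12] == b"WAVE"
```

Removing the trailing slash from the VoiceVox URL in the .env would be the quickest way to test this theory.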

[Feature] Showing transcribed and translated text to the screen

Why?

Games like Valorant only allow push to talk, which means I have to set one keybind (v) for Speech To Text and another keybind (n) to actually talk in the game.

Every time I say something, I want to know whether the transcribed and translated text is correct before I send the voice to other players. If they were shown on screen, the way Subtitler does it, the tool would be easier to use.

I'm not familiar with Python, so I couldn't get it to work.

I'm looking forward to your ideas and decisions.

Thanks
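A minimal sketch of the requested confirmation step, assuming the transcribed and translated strings are already available before speech synthesis starts. `confirm_before_speaking` is a hypothetical function; it prints to the terminal rather than drawing an on-screen overlay like Subtitler.

```python
from typing import Callable


def confirm_before_speaking(english: str, japanese: str,
                            ask: Callable[[str], str] = input) -> bool:
    """Show both texts and return True only if the user approves sending."""
    print(f"English:  {english}")
    print(f"Japanese: {japanese}")
    answer = ask("Send this to voice chat? [y/N] ")
    # Anything other than an explicit yes counts as "do not send".
    return answer.strip().lower() in ("y", "yes")
```

Defaulting to "no" means a mistyped key never sends a wrong translation to other players.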

Voicemeter Setup

Before I try using this, I want to make sure that my VoiceMeeter is set up correctly, so that it doesn't destroy my audio and bluescreen my PC.
image

Voice sounds incredibly robotic

Hello,

I was curious whether you had a potential fix to keep the voice from sounding incredibly robotic. I resorted to tweaking the settings below, but for whatever reason the audio clips at points (I have a good mic, so I don't think that is the reason) and just sounds off compared to your initial video.

VOICE_ID=20

# Type: float
SPEED_SCALE=1.0
VOLUME_SCALE=3.0
INTONATION_SCALE=1.5
PRE_PHONEME_LENGTH=1.0
POST_PHONEME_LENGTH=1.0
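One possible cause of the clipping is VOLUME_SCALE=3.0, which asks the engine to triple the amplitude, and the two phoneme lengths of 1.0 add a full second of silence before and after each phrase. The sketch below shows how such settings would be applied to a VOICEVOX audio query before synthesis, with a cap on volume. The field names follow the engine's audio query JSON as I understand it and should be checked against the engine's `/docs` page. `apply_voice_settings` and `max_volume` are hypothetical.

```python
def apply_voice_settings(query: dict, speed: float = 1.0, volume: float = 1.0,
                         intonation: float = 1.0, pre_silence: float = 0.1,
                         post_silence: float = 0.1,
                         max_volume: float = 2.0) -> dict:
    """Return a copy of an audio query with tuned, volume-capped settings."""
    tuned = dict(query)  # leave the caller's dictionary untouched
    tuned["speedScale"] = speed
    tuned["volumeScale"] = min(volume, max_volume)  # cap to reduce clipping
    tuned["intonationScale"] = intonation
    tuned["prePhonemeLength"] = pre_silence    # seconds of leading silence
    tuned["postPhonemeLength"] = post_silence  # seconds of trailing silence
    return tuned
```

Raising the volume in VoiceMeeter or the voice chat application instead of in the synthesizer is another way to get a louder voice without distorting the waveform.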

whisper return {"detail":"Not Found"}

As written in the title, the response of the request to '{BASE_URL}/asr?task=transcribe&language=en&output=json' in transcribe() is {"detail":"Not Found"}, is there a solution?

Additionally, when the transcribe() function ends, the program terminates. Is it possible to prevent it?

I don't even know why this is
