akras14 / speech-to-text
Example transcribing audio file (speech) to text with Google Cloud Speech API and Python
Hey akras,
Does enable_automatic_punctuation=True work?
I tried your code with several config options, including enable_automatic_punctuation, and none of them seem to take effect. Can you help? Thanks :)
I have imported this project but am not able to run it on Android.
How can I do that?
I am a beginner and could not find anything on Google. The WAV docs are in Turkish. I don't know if that is related; it might be.
Thank you for your time.
Ali
`Traceback (most recent call last):
  File "fast.py", line 28, in <module>
    all_text = pool.map(transcribe, enumerate(files))
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 364, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 768, in get
    raise self.value
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\multiprocessing\pool.py", line 48, in mapstar
    return list(map(*args))
  File "fast.py", line 21, in transcribe
    text = r.recognize_google_cloud(audio, credentials_json=GOOGLE_CLOUD_SPEECH_CREDENTIALS)
  File "C:\Users\ASUS-25\AppData\Local\Programs\Python\Python38-32\lib\site-packages\speech_recognition\__init__.py", line 937, in recognize_google_cloud
    if "results" not in response or len(response["results"]) == 0: raise UnknownValueError()
speech_recognition.UnknownValueError
C:\Users\ASUS-25\Google Drive\Work\speech-to-text-master>`
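The last frame of the traceback shows UnknownValueError being raised when the response contains no results, and because it is raised inside pool.map, one such chunk aborts the whole run. A possible workaround is to catch that error per file and return an empty string. The sketch below is only an illustration: UnknownValueError here is a local stand-in for speech_recognition.UnknownValueError, and fake_transcribe and transcribe_or_blank are hypothetical names, not functions from fast.py.

```python
class UnknownValueError(Exception):
    """Stand-in for speech_recognition.UnknownValueError (assumption, so this runs without the library)."""


def fake_transcribe(item):
    """Hypothetical replacement for transcribe() in fast.py; fails on 'silent' chunks."""
    idx, name = item
    if "silence" in name:
        raise UnknownValueError()
    return f"text of {name}"


def transcribe_or_blank(item, transcribe=fake_transcribe):
    """Return the transcript, or an empty string when no speech was recognized."""
    try:
        return transcribe(item)
    except UnknownValueError:
        return ""


files = ["chunk-01.wav", "silence.wav", "chunk-03.wav"]
# Built-in map is used here to keep the sketch simple; fast.py uses pool.map.
all_text = list(map(transcribe_or_blank, enumerate(files)))
print(all_text)
```

With the real library, the except clause would name speech_recognition.UnknownValueError instead of the stand-in class.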
I have used the Google Cloud Speech-to-Text API, which is working well, but I need to show the speaker just above each line. Suppose I have an audio file with 4 people speaking. I want each person's label to appear right before their text, like this:
Person1:
Here is the text of person1.
Person2:
Here is the text of person2.
Person1:
Here is another line of text from person1.
Person3:
Here is the text of person3.
Can anyone let me know how I can get the speaker along with the text using the Google API?
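As far as I know, Google Cloud Speech-to-Text has a speaker diarization option that tags each recognized word with a speaker number; whether it can be reached through the wrapper used in this repo is not confirmed anywhere in this thread. Assuming you already have the words with their speaker tags, the layout asked for above could be produced roughly like this. The (word, speaker_tag) pairs and the function name format_by_speaker are made up for illustration.

```python
from itertools import groupby


def format_by_speaker(words):
    """Group consecutive words by speaker tag and label each turn.

    words: iterable of (word, speaker_tag) pairs in spoken order
    (a hypothetical input shape, not an API response object).
    """
    lines = []
    for tag, group in groupby(words, key=lambda pair: pair[1]):
        lines.append(f"Person{tag}:")
        lines.append(" ".join(word for word, _ in group))
    return "\n".join(lines)


sample = [("Hello", 1), ("there", 1), ("Hi", 2), ("Thanks", 1)]
print(format_by_speaker(sample))
```

groupby only merges neighbouring items, so a speaker who talks again later gets a new label, which matches the Person1 / Person2 / Person1 pattern in the question.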
Has anyone encountered a ValueError even though the audio file is a PCM WAV? Any idea how to solve it?
ValueError: Audio file could not be read as PCM WAV, AIFF/AIFF-C, or Native FLAC.
I ran fast.py with some sample WAV files and it worked perfectly! But when I tested it with audio files I collected from a website, I got the ValueError above, even though the soxi command reports the files as 16-bit PCM.
I then re-ran the sample WAV files that had previously worked, but received the same error message.
Audio files I collected from a website
I downloaded Amazon's audio (https://www.youtube.com/watch?v=CxK1VhtJlNQ), converted it to a WAV file with a 16 kHz sample rate and 1 channel, and split it into small pieces with py-webrtcvad.
soxi chunk-02.wav
Input File : 'chunk-02.wav'
Channels : 1
Sample Rate : 16000
Precision : 16-bit
Duration : 00:00:03.03 = 48480 samples ~ 227.25 CDDA sectors
File Size : 97.0k
Bit Rate : 256k
Sample Encoding: 16-bit Signed Integer PCM
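One way to double-check a file independently of soxi is to open it with Python's standard wave module and print what it sees. This is just a diagnostic sketch, not a fix: describe_wav is a hypothetical helper, and the demo file written below only mimics the parameters from the soxi output (1 channel, 16-bit, 16000 Hz, 48480 samples).

```python
import os
import tempfile
import wave


def describe_wav(path):
    """Return basic header info, or an 'error' entry if wave cannot parse the file."""
    try:
        with wave.open(path, "rb") as wav:
            return {
                "channels": wav.getnchannels(),
                "sample_width_bytes": wav.getsampwidth(),
                "sample_rate": wav.getframerate(),
                "frames": wav.getnframes(),
            }
    except (wave.Error, EOFError) as exc:
        return {"error": str(exc) or type(exc).__name__}


# Write a silent demo file with the same parameters soxi reported.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.wav")
with wave.open(demo_path, "wb") as out:
    out.setnchannels(1)
    out.setsampwidth(2)      # 2 bytes = 16-bit samples
    out.setframerate(16000)
    out.writeframes(b"\x00\x00" * 48480)

info = describe_wav(demo_path)
print(info)
```

If describe_wav returns an error for a chunk that soxi accepts, the two tools disagree about the header, which would at least narrow down where the ValueError comes from.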
I have set up your library locally and it works like a charm, thank you for the good work! I am trying to integrate it with PHP but could not get it to produce results in that case. The script is saved in a folder named speech_to_text and I am trying to execute it using PHP's shell_exec. The code I run is $output = shell_exec("/usr/bin/python3 /var/www/html/speech_to_text/slow.py $directory"); and I have modified the slow.py file in the following way:
https://gist.github.com/khanof89/1c97f178dace3712991d114f95a3da2c
the following is the output I get:
foldername /var/www/html/podcasts-manage/storage/episode-2-of-the-awesome-mypodcast-a5dc/
['genevieve1.wav']
for f in tqdm /var/www/html/podcasts-manage/storage/episode-2-of-the-awesome-mypodcast-a5dc/genevieve1.wav
name /var/www/html/podcasts-manage/storage/episode-2-of-the-awesome-mypodcast-a5dc/genevieve1.wav
inside source
done source
credentials {
"type": "service_account",
"project_id": "api-project-11111111111",
"private_key_id": "private_key_id_goes_here",
"private_key": "-----BEGIN PRIVATE KEY-----\nMY_PRIVATE_KEY_GOES_HERE\n-----END PRIVATE KEY-----\n",
"client_email": "[email protected]",
"client_id": "1111111111111",
"auth_uri": "https://accounts.google.com/o/oauth2/auth",
"token_uri": "https://accounts.google.com/o/oauth2/token",
"auth_provider_x509_cert_url": "https://www.googleapis.com/oauth2/v1/certs",
"client_x509_cert_url": "https://www.googleapis.com/robot/v1/metadata/x509/analytics%40api-project-792103813257.iam.gserviceaccount.com"
}
exception in text=r.recognize
Because I am posting this as an issue, I have replaced many values from my Google credentials file, but in my actual file they are intact. Please help.
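The output above only says that an exception happened, not which one, so a first step could be to print the exception type and traceback instead of a fixed message. A minimal sketch, assuming the modified slow.py wraps the recognize call in a try/except; run_and_report is a hypothetical helper, not part of the repo or the gist.

```python
import traceback


def run_and_report(label, func, *args, **kwargs):
    """Call func and return (result, error_text); error_text is None on success."""
    try:
        return func(*args, **kwargs), None
    except Exception:
        error_text = f"exception in {label}:\n{traceback.format_exc()}"
        # Printed to stdout because PHP's shell_exec returns the command's stdout.
        print(error_text)
        return None, error_text


def failing_step():
    """Stand-in for the recognize call that fails under PHP (assumption)."""
    raise ValueError("example failure")


result, err = run_and_report("text=r.recognize", failing_step)
```

Once the real exception is visible in $output, it should be easier to tell whether the problem is credentials, file permissions for the web server user, or something else.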