deepgram / deepgram-python-sdk Goto Github PK

View Code? Open in Web Editor NEW

181.0 181.0 49.0 7.93 MB

Official Python SDK for Deepgram's automated speech recognition APIs.

Home Page: https://developers.deepgram.com

License: MIT License

Python 97.14% Dockerfile 0.28% Makefile 0.54% Shell 2.04%

asr automated-speech-recognition deepgram hacktoberfest python speech-recognition

deepgram-python-sdk's People

Stargazers

Watchers

Forkers

samgdf ambergeldar natalierutgers moumeneb1 riconghost krazykaigh yottalogical chryslana dmater01 arry043 pogzyb sudiptog81 dotaadarsh shirgoldbird rangemercautomotors nicolinebutler brianhillis-dg paparaha4 oxxio jjmaldonis jkroll-deepgram saarthshah bangado holyr00d b2k-tortechnik-de dvonthenen olushile blakehawkins nikhilesh-b artchess universaldevgenius amartyajha techtide mirceapasoi vineetverma100 tomprimozic lyonsno roerohan lyg5597122 young17 huaqi-fr odoochain fburnelli avibathula brandonwheat joe-nano vemulapalli01 run2ai-m

deepgram-python-sdk's Issues

False language detection

Detected language is false

It works fine for language detection in audio file
but sometimes it detects false language of the audio file..
I've multiple english audio file but for some of them it detects language as hindi and russian for the audio files..

sync_prerecorded doesn't return...

What is the current behavior?

Calling sync_prerecorded sometimes doesn't return, even if the request is processed on dashboard

DEEPGRAM_TRANSCRIPT_API_OPTIONS = {
    "model": "general",
    "tier": "enhanced",
    "utterances": True,
    "punctuate": True,
    "smart_format": True,
    "paragraphs": True,
    "diarize": True,
    "language": "en",
}
deepgram = Deepgram(API_KEY)
with open("test.mp3", "rb") as fp:
    source = {"buffer": fp, "mimetype": "audio/wav"}
    print("printed")
    res = deepgram.transcription.sync_prerecorded(
        source, DEEPGRAM_TRANSCRIPT_API_OPTIONS
    )
    print("not printed")

Expected behavior

sync_prerecorded should return or throw exception...

Please tell us about your environment

Operating System/Version: Windows 10 Pro
Language: Python 3.10.11

Other information

I'm sending .mp3 files up to 500 MB

code is error when i am giving the path of audio file in real time transcription code

code is error when i am giving the path of audio file in real time transcription code.

Idling transcriber

What is the current behavior?

I try to transcribe some audio with the help of LiveTranscription and sometimes, the program idles.

⚠️ I am not really transcribing audio live and I think the issue is related to that. I want to test the performance of DG's streaming API on recorded data. What I have been doing so far is sending audio chunks without trying to emulate live timing (I just send them one after the other, i.e. faster than real time).

From what I have seen, the issue might be related to the LiveTranscription object not receiving the closing response from the server after sending the finish message. Because I send the data faster than real time, their is quite some gap between the moment when I call the finish() method and when the server actually processes it. And this seems to lead to the websocket being closed (I can see this in the debug log) before the client can get/process the closing signal from the server. I am not familiar with websockets so not completely sure here.

Steps to reproduce

Try to send recorded audio data without waiting between two chunks to simulate real time conditions.

Expected behavior

Transcription should proceed normally.

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

Operating System/Version: Ubuntu 22.04
Language: Python

Exception: DG: server rejected WebSocket connection: HTTP 401 (OS:MAC)

import pyaudio
from deepgram import Deepgram
import deepgram

client = Deepgram("api key")

Set the sample rate and chunk size for the audio stream

SAMPLE_RATE = 16000
CHUNK_SIZE = 1024

Create a new PyAudio object

audio = pyaudio.PyAudio()
audio.get_device_info_by_index(0)

Open the audio stream using PyAudio

stream = audio.open(
format=pyaudio.paInt16,
channels=1,
rate=SAMPLE_RATE,
input=True,
frames_per_buffer=CHUNK_SIZE
)

async def getaudiotranscript(data):
deepgramLive = await client.transcription.live({'punctuate': True, 'interim_results': False})
deepgramLive.registerHandler(deepgramLive.event.CLOSE, lambda c: print(f'Connection closed with code {c}.'))
deepgramLive.send(data)
result = client.transcription(data)
print(result.text)

Loop until the user stops the program

while True:
# Read a chunk of audio from the stream
data = stream.read(CHUNK_SIZE)

# Send the audio chunk to Deepgram for recognition
reposnse = await getaudiotranscript(data)



# Print the recognized text to the console

Close the audio stream

stream.close()

Version bump automation

Proposed changes

We need the version to update automatically with the CI/CD pipeline.

Context

We want to keep the version in sync without having to manually type in the new version every time there is an update.

Possible Implementation

Possibly a github action

Other information

This may be updated in the setup.py file, or it may be updated in the entry point file (init.py)

Manage endpoints updated to support Tokens

Manage endpoints that will likely change:

request - https://developers.deepgram.com/reference/get-request
requests - https://developers.deepgram.com/reference/get-all-requests
usage - https://developers.deepgram.com/reference/summarize-usage
fields - https://developers.deepgram.com/reference/get-fields"

calling sync_prerecorded fails with ssl.SSLEOFError: EOF occurred in violation of protocol (_ssl.c:2396)

What is the current behavior?

    dg_client = Deepgram(DEEPGRAM_API_KEY)

    with open(PATH_TO_FILE, 'rb') as audio:
        source = {'buffer': audio, 'mimetype': MIMETYPE}
        options = { "punctuate": True, "model": "base"}
        response = dg_client.transcription.sync_prerecorded(source, options)

What's happening that seems wrong?

fails with

Traceback (most recent call last):
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/_utils.py", line 139, in attempt
    with urllib.request.urlopen(req) as resp:
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 216, in urlopen
    return opener.open(url, data, timeout)
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 519, in open
    response = self._open(req, data)
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 536, in _open
    result = self._call_chain(self.handle_open, protocol, protocol +
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1391, in https_open
    return self.do_open(http.client.HTTPSConnection, req,
  File "/opt/homebrew/Cellar/[email protected]/3.10.9/Frameworks/Python.framework/Versions/3.10/lib/python3.10/urllib/request.py", line 1351, in do_open
    raise URLError(err)
urllib.error.URLError: <urlopen error EOF occurred in violation of protocol (_ssl.c:2396)>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Users/mikhailkuznetcov/Developer/speech-rec-python/deepgram_version.py", line 36, in <module>
    main()
  File "/Users/mikhailkuznetcov/Developer/speech-rec-python/deepgram_version.py", line 28, in main
    response = dg_client.transcription.sync_prerecorded(source, options)
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/transcription.py", line 355, in sync_prerecorded
    return SyncPrerecordedTranscription(
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/transcription.py", line 111, in __call__
    return _sync_request(
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/_utils.py", line 153, in _sync_request
    return attempt()
  File "/opt/homebrew/lib/python3.10/site-packages/deepgram/_utils.py", line 148, in attempt
    raise (Exception(f'DG: {exc}') if exc.status < 500 else exc)
AttributeError: 'URLError' object has no attribute 'status'

Steps to reproduce

To make it faster to diagnose the root problem. Tell us how can we reproduce the bug.

Expected behavior

transcribe the file

What would you expect to happen when following the steps above?

Please tell us about your environment

Python:

We want to make sure the problem isn't specific to your operating system or programming language.

Operating System/Version: Mac M1
Language: Python

Other information

Anything else we should know? (e.g. detailed explanation, stack-traces, related issues, suggestions how to fix, links for us to have context, eg. stack overflow, codepen, etc)

Balance deduction in summary report

Proposed changes

Summarize Usage api returns following details
end,start,requests,hours

but we dont understand the daily how much we spend ?
so please add how much money we get charged for that day for total requests or total minutes

Access in Cloud Server

Whether Deepgram with its api will be supported in the cloud server

Include webvtt and srt formatting

Proposed changes

Provide methods to transform a Deepgram prerecorded transcription response into WebVTT or SRT captions
We've decided to do this as a separate package

Context

This feature was available in V2 as to_WebVTT() or .to_SRT(). We need to continue to provide this feature in V3

Possible Implementation

See V2 implementation: https://github.com/deepgram/deepgram-python-sdk/blob/main/deepgram/extra.py

Take this V2 implementation and create a standalone python package to create captions. The standalone package can be a dependency in this SDK so users can use it from the SDK, or they can use it independently.

Other information

Update SDK for pre-Python v3 or notate it is for v3+

Received some feedback on the SDK. If not overly complicated, we should think about making it accessible to more versions of Python. If we are using something that requires 3+, we should denote that in the README.md

Hey for the Python SDK
We should specify it works for Python 3+
Had a prospect just run into an issue running it on 2.9
So I recommend trying on 3.6+, and they said that worked
I think it’s due to an from __future __ import annotations

Support Persian(Farsi) language

Proposed changes

Provide a detailed description of the change or addition you are proposing

Adding Persian(Farsi) recognition model.

Context

Why is this change important to you? How would you use it? How can it benefit other users?

Possible Implementation

Not obligatory, but suggest an idea for implementing addition or change

Other information

Anything else we should know? (e.g. detailed explanation, related issues, links for us to have context, eg. stack overflow, codepen, etc)

┆Issue is synchronized with this Asana task by Unito

Intent (Listen) supported

new param intents is supported for intent recognition for /listen requests.

LiveTranscription

What is the current behavior?

I keep seeing the following error emitted by my server

Task exception was never retrieved
future: <Task finished name='Task-192191' coro=<LiveTranscription._start() done, defined at /Users/sishaar/.pyenv/versions/3.11.4/lib/python3.11/site-packages/deepgram/transcription.py:178> exception=ConnectionClosedOK(Close(code=1000, reason=''), Close(code=1000, reason=''), True)>
Traceback (most recent call last):
  File "/Users/sishaar/.pyenv/versions/3.11.4/lib/python3.11/site-packages/deepgram/transcription.py", line 222, in _start
    await self._socket.send(body)
  File "/Users/sishaar/.pyenv/versions/3.11.4/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 635, in send
    await self.ensure_open()
  File "/Users/sishaar/.pyenv/versions/3.11.4/lib/python3.11/site-packages/websockets/legacy/protocol.py", line 935, in ensure_open
    raise self.connection_closed_exc()
websockets.exceptions.ConnectionClosedOK: received 1000 (OK); then sent 1000 (OK)

Here is the file I use to manage my Deepgram connections - am I not registering/handling errors correctly?

class _DeepgramConnectionPool:
  def __init__(self):
    self.connections: LiveTranscription = []
    self.deepgram_hosted = Deepgram('<elided>')
    self.deepgram_onprem = Deepgram({'api_key': '', 'api_url': '<elided>'})

  async def get_connection(self, config: dict = {}):
    # default to use hosted deepgram
    self.deepgram = self.deepgram_hosted
    if config.get("enable_to_use_deepgram_on_premise") == "true":
      self.deepgram = self.deepgram_onprem
    try:
      deepgram_config = config.get('country_code_configs', {})
      model = deepgram_config.get("model", '<elided>')
      tier = 'enhanced' if model == 'nooks' else deepgram_config.get("tier", 'base')
      version = 'v1' if model == 'nooks' else deepgram_config.get('version', 'latest')

      async def _get_connection():
        return await self.deepgram.transcription.live(
          encoding='mulaw',
          model=model,
          tier=tier,
          version=version,
          sample_rate=8000,
          punctuate=True,
          interim_results=True,
          language='en-US',
          times=False,
        )

      connection: LiveTranscription = await Retry(_get_connection, delay_s=0.1, factor=1.2)
      connection.register_handler(LiveTranscriptionEvent.ERROR, lambda error: logger.error('Error during connection', exception=error))
      return connection
    except ConnectionClosedOK:
      logger.log('Deepgram websocket connection closed')
    except Exception as e:
      logger.error('Error creating connection: ', exception=e)
      raise e

DeepgramConnectionPool = _DeepgramConnectionPool()

Later on, I close the connection with the following piece of code

    async def close_deepgram(self):
        self.logger.debug("Closing Deepgram Connection")
        if self.deepgram:
            await self.deepgram.finish()
            self.logger.log('Closed Deepgram Connection')

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

**Operating System/Version: MacOS
**Language: Python

Python API | Option to add username, secret and key

What is the current behavior?

We consume DG in our production. I have been provided with username, key and secret for testing some features for prod build.

What's happening that seems wrong?

I read the docs, but I don't see examples where I can given key, secret and username. The instantiation example contains api_key and api_url in the params dictionary. By just giving key and url, it throws unautorized exception.

Add the PyTest Testing Framework to the Python SDK

Proposed changes

Add the PyTest testing tool to the Python SDK. Once it's added write a test for the the PrerecordedTranscription class and the async def prerecorded function inside the Transcription class, in the transcription.py file. Make sure your tests run and pass locally before committing them. When all your tests pass, please do a pull request and assign @geekchick as a reviewer.

Context

Currently, the Python SDK doesn't have any tests, and these are important. Adding PyTest would be an appreciated contribution so we can have test coverage. This would allow developers using the Python SDK to have a better experience as well as to help our Deepgram developers catch defects before releasing to production.

Why is this change important to you? How would you use it? How can it benefit other users?

This change is significant because we can finally have test coverage.

Topics (Read) supported

new param topics added for topic detection for /read requests.

Deepgram timeouts

Exception: DG: 408, message='Request Timeout', url=URL('https://api.deepgram.com/v1/listen?smart_format=true&utterances=true&summarize=v2&model=video&diarize=true')

Worked fine till a few hours ago.

AttributeError: ‘URLError’ object has no attribute ‘status’

What is the current behavior?

Using the following code, we get an error. This is because Deepgram Nova doesn't support Spanish. However, the error relates to parsing the response incorrectly. Which appears to be a bug.

(not the actual code from the ticket)

from deepgram import Deepgram
import json
import os

DEEPGRAM_API_KEY = os.getenv('DEEPGRAM_API_KEY') 

PATH_TO_FILE = 'test.mp3'
MIMETYPE = 'audio/mp3'

def main():
    dg_client = Deepgram(DEEPGRAM_API_KEY)
    
    with open(PATH_TO_FILE, 'rb') as audio:
        source = {'buffer': audio, 'mimetype': MIMETYPE}
        options = {"utterances":True,"diarize":True,"utterances":True, "model": "nova", "language": "en-US","paragraphs":True,"smart_format":True }

// ...

        response = dg_client.transcription.sync_prerecorded(source, options)
        print(response['results'])

main()

The error:

Traceback (most recent call last):

// ...

AttributeError: ‘URLError’ object has no attribute ‘status’

Remove MIME type for Batch

Proposed changes

Our ASR API never requires a MIME type unless you are making a fetch request with a JSON body, in which case it must be application/json.

Context

Possible Implementation

Other information

Second initialization of LiveTranscription doesn't send data to deepgram

What is the current behavior?

I have an interaction with a stream of audio that should be transcripted. It's not like a podcast where it is a single large stream, rather it's more conversational where there could be a sentence and a pause of 30 seconds or more and then another sentence. I figure the best thing to do is use a unique instance for each interaction.

The problem is that subsequent instances of the LiveTranscription object don't seem to send data. The first time through the code it works fine but when deepgramLive is initialized a second time, the TRANSCRIPT_RECEIVED handler is never fired until eventually it closes, that is, the CLOSE handler is called.

One thing to note is if I don't reinitialize deepgramLive, the stream can pick up subsequent phrases.

Steps to reproduce

@app.websocket("/ws/test")
async def test(websocket: WebSocket):
    await websocket.accept()
    async def receive_deepgram_transcript(msg):
        nonlocal transcript
        if msg.get("is_final"):
            transcript = (
                msg.get("channel", {})
                .get("alternatives", [{}])[0]
                .get("transcript", "")
            )
            if transcript != "":
                print(transcript)

    deepgramLive = None
    transcript = None
    while True:
        while not transcript:
            if not deepgramLive:
                deepgramLive = await dg_client.transcription.live({ 'punctuate': True, 'interim_results': False, 'language': 'en-US' })
                deepgramLive.registerHandler(deepgramLive.event.CLOSE, lambda c: print(f'Connection closed with code {c}.'))
                deepgramLive.registerHandler(deepgramLive.event.TRANSCRIPT_RECEIVED, receive_deepgram_transcript)
            data = await websocket.receive()
            deepgramLive.send(data['bytes'])
        else:
            print("closing")
            await deepgramLive.finish()
            deepgramLive = None

Not able to install deepgram-sdk on Mac

Unable to install deepgram-sdk and import it

I've tried running the following commands
pip install deepgram-sdk
as well as
pip3 install deepgram-sdk
to install it using Python3

I even checked the libraries installed by typing:
pip list
and
pip3 list

And I was able to see that deepgram-sdk 0.3.0 was installed on both.
But I am still not able to import it.

Please help

ssl.SSLCertVerificationError for batch and streaming starter code of deepgram

What is the current behavior?

What's happening that seems wrong?

Steps to reproduce

To make it faster to diagnose the root problem. Tell us how can we reproduce the bug.

Expected behavior

What would you expect to happen when following the steps above?

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

Operating System/Version: Windows 10
Language: [all | TypeScript | Python | PHP | etc]
Browser: Chrome

Other information

Anything else we should know? (e.g. detailed explanation, stack-traces, related issues, suggestions how to fix, links for us to have context, eg. stack overflow, codepen, etc)

Could not open socket

What is the current behavior?

When running we get the error "Could not open socket"

Steps to reproduce

 Just run the line:
 deepgramLive = await deepgram.transcription.live(PARAMS)

To make it faster to diagnose the root problem. Tell us how can we reproduce the bug.

Expected behavior

  We should get more details on why the socket could not be opened. 
  Side Note: I suspect this might be due to HTTPs issues, is there a way to ignore SSL errors?

What would you expect to happen when following the steps above?
Error details to help debug the core issue

Please tell us about your environment

 Python 3.10 on MacOs

We want to make sure the problem isn't specific to your operating system or programming language.

Operating System/Version: Windows 10
Language: [all | TypeScript | Python | PHP | etc]
Browser: Chrome

Other information

Anything else we should know? (e.g. detailed explanation, stack-traces, related issues, suggestions how to fix, links for us to have context, eg. stack overflow, codepen, etc)

Import Error

What is the current behavior?

from deepgram import Deepgram gives below error:
SyntaxError: future feature annotations is not defined

Steps to reproduce

Python version 3.6.8
pip install deepgram-sdk
from deepgram import Deepgram

Expected behavior

Successful import of Deepgram class

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

Operating System/Version: Ubuntu 18.04
Language: python 3.6.8

Topics (Listen) supported

old param detect_topics should be kept for backwards compatibility cases.
new param topics added for topic detection for /listen requests.

For captions, use punctuated_word if it exists in the response

Proposed changes

Currently, the to_caption function uses "word" instead of "punctuated_word" to create the captions even when "punctuated_word" exist in the response. I would like that to be changed so "punctuated_word" is used if it exists.

Context

My proposed change is actually the existing behavior of the toSRT function in deepgram-node-sdk.

Typing in _type.py is out of date with what is actually returned from API

What is the current behavior?

the types in _types.py don't actually represent what is returned from the API. Specifically I am using the Paragraphs feature, and the API returns paragraphs.paragraphs.speaker which I would like to access, but the typing does not list the speaker as a field (which is optionally there if you specify diarization).

Steps to reproduce

Use the Python SDK
Specify Smart formatting and Diarization
Transcribe something
Use typing
Try access the the speaker field from within the paragraphs and your typing will complain because the SDK doesn't actually contain the possible fields returned from the API.

Expected behavior

I would expect to see speaker: Optional[int] here

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

Operating System/Version: macOS 12.5.1
Language: Python
Browser: Chrome

Other information

deepgram-sdk==2.11.0

N best fails for batch transcription

What is the current behavior?

I need to get N best as many as possible, time stamps and scores but when I ask for more then 2 N best it fails it is intermittent even with the same audio file and a request for the same number of n best.

I am using Windows 10 64 bit 21H2 19044.2965 Windows Feature Experience Pack 1000.19041.1000.0

this is a little python file and I am using a wave file I recorded that lasts just a few seconds

What's happening that seems wrong?

I get a bunch of error messages and it crashes

Traceback (most recent call last):
File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\Lib\Main.py", line 47, in
main()
File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\Lib\Main.py", line 42, in main
response = deepgram.transcription.sync_prerecorded(source, {'punctuate': True, 'alternatives': 5})
File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram\transcription.py", line 355, in sync_prerecorded
return SyncPrerecordedTranscription(
File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram\transcription.py", line 111, in call
return _sync_request(
File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram_utils.py", line 156, in _sync_request
raise exc # stream is now invalid as payload
File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram_utils.py", line 153, in _sync_request
return attempt()
File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram_utils.py", line 148, in attempt
raise (Exception(f'DG: {exc}') if exc.status < 500 else exc)
File "C:\Users\Andre\source\repos\Quintessence\Python\Deepgram_Batch\lib\site-packages\deepgram_utils.py", line 139, in attempt
with urllib.request.urlopen(req) as resp:
File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 214, in urlopen
return opener.open(url, data, timeout)
File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 523, in open
response = meth(req, response)
File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 632, in http_response
response = self.parent.error(
File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 561, in error
return self._call_chain(*args)
File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 494, in _call_chain
result = func(*args)
File "C:\Users\Andre\AppData\Local\Programs\Python\Python39\lib\urllib\request.py", line 641, in http_error_default
raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 503: Service Unavailable

Process finished with exit code 1

Steps to reproduce

run this program after you add code to set:

DEEPGRAM_API_KEY =

PATH_TO_FILE =

from deepgram import Deepgram
import json
import os

def main():
# Initializes the Deepgram SDK
deepgram = Deepgram(DEEPGRAM_API_KEY)
# Open the audio file
with open(PATH_TO_FILE, 'rb') as audio:
# ...or replace mimetype as appropriate
source = {'buffer': audio, 'mimetype': 'audio/wav'}
response = deepgram.transcription.sync_prerecorded(source, {'punctuate': True, 'alternatives': 5})
print(json.dumps(response, indent=4))

main()

if you reduce 'alternatives': 5 to 2 it will work
if you reduce it to 3 or 4 it may run some times but not other times

Deepgram Python speech Live. call does not start coroutine for sending

What is the current behavior?

deepgram.transcription.live()

Code in some cases ignoring send() call.
So recognition does not start up to finish() call or resulting timeout.

Steps to reproduce

If the main data sending loop is in the coroutine too.

Expected behavior

Expected to be working like in the example

Deepgram Python speech Live. call does not start coroutine for sending

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

Operating System/Version: Linux, any
Language: Python
Browser: n/a

Other information

After extensive debugging was found that at def __call__ there is line

asyncio.create_task(self._start())

Which creates a coroutine, but it never actually start if the main loop is also coroutine and no asyncio.sleep or similar was called. If the coroutine is not suspended or waiting, a new task is not starting. create_task() is not thread.start !

Suggested change: after that line add
await asyncio.sleep(0)

Or change the code sample for streaming to do deepgramLive.send() via await

Error when importing deepgram (unhashable type: 'list')

Current behavior

I'm trying to test the deepgram-sdk. However, I keep on encountering this issue:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\mycomputer\.virtualenvs\Python_Stuff-EaH8Zb-h\lib\site-packages\deepgram\__init__.py", line 2, in <module>
    from ._types import Options
  File "C:\Users\mycomputer\.virtualenvs\Python_Stuff-EaH8Zb-h\lib\site-packages\deepgram\_types.py", line 172, in <module>
    EventHandler = Union[Callable[[Any], None], Callable[[Any], Awaitable[None]]]
  File "c:\users\mycomputer\appdata\local\programs\python\python39\lib\typing.py", line 243, in inner
    return func(*args, **kwds)
  File "c:\users\mycomputer\appdata\local\programs\python\python39\lib\typing.py", line 316, in __getitem__
    return self._getitem(self, parameters)
  File "c:\users\mycomputer\appdata\local\programs\python\python39\lib\typing.py", line 421, in Union
    parameters = _remove_dups_flatten(parameters)
  File "c:\users\mycomputer\appdata\local\programs\python\python39\lib\typing.py", line 215, in _remove_dups_flatten
    all_params = set(params)
TypeError: unhashable type: 'list'

Steps to reproduce

Open terminal and run pipenv shell in an empty directory (or any directory of your choice)
Run pipenv install deepgram-sdk
Run python then type from deepgram import Deepgram or import deepgram. Press Enter afterwards.
The error should show by now

My environment

Operating System/Version: Windows 10
Language: Python 3.9.0
Virtual Environment used: Pipenv

Intent (Read) supported

new param intents is supported for intent recognition for /read requests.

Passing keywords parameter fails to url encode

What's happening that seems wrong?

Pass keywords parameter as a list of strings.

        response = await dg_client.transcription.prerecorded(
            source,
            {
                "punctuate": True,
                "diarize": True,
                "numerals": True,
                "utterances": True,
                "keywords": [
                    "10-Q",
                    "non-GAAP",
                    "GAAP",
                    "CRM",

ValueError: not enough values to unpack (expected 2, got 1)

The data looks like

[('keywords', '10-Q')]
[('keywords', 'non-GAAP')]

and it should look like

('keywords', '10-Q')
('keywords', 'non-GAAP')

So an extra array is ending up around each item.

Cannot import Deepgram

Hi, I've installed deepgram via "pip install deepgram-sdk"

The installation is successful, and the deepgram-sdk appears in my Python Lib/site-packages/ folder with the init.py file inside.

After the successful install, I attempt to import to my script with "from deepgram import Deepgram"

But Eclipse keeps showing "Unresolved import: Deepgram"

Not sure exactly why this is happening, as this is the first time something like this has happened to me. Any help would be greeatly appreciated.

Update the setup.py file

Proposed changes

The setup.py file is out of date. It needs to be updated for this new version release.

Context

We have changed some of the dependencies, such as using httpx instead of aiohttp to make http requests. The author is also out of date. Basically, this needs a possible rewrite due to the significant number of changes we are making for v3.

Possible Implementation

Research what is needed in a setup.py file and make appropriate changes.

Other information

Need to consider the version - this needs to stay in sync as the SDK is updated.

Wrong speaker detection/ Wrong labeling of speaker

Wrong speaker detection/ Wrong labeling of speaker when I am trying to transcribe the mp4 video file. I am using python sdk. I am using below settings:

'tier':'enhanced',
'punctuate': True,'
diarize':True,'
utterances':True,'
utt_split':0.3
I am attaching the expected output file, actual output file
current_whatsapp.docx
expected_whatsapp.docx

Can someone solve my problem?

Below is the json response of transcription:

{'metadata': {'transaction_key': 'deprecated', 'request_id': '2a124439-6333-4b0f-9ac7-0063b303e6ba', 'sha256': '2af7b928fe91cfca4b51126b54d19c69af3b5c39db65da7d9d87d01e74faf7ca', 'created': '2022-11-10T05:27:39.785Z', 'duration': 59.94669, 'channels': 1, 'models': ['125125fb-e391-458e-a227-a60d6426f5d6'], 'model_info': {'125125fb-e391-458e-a227-a60d6426f5d6': {'name': 'general-enhanced', 'version': '2022-05-18.0', 'tier': 'enhanced'}}}, 'results': {'channels': [{'alternatives': [{'transcript': "Hello, Kamiji. How are you? I'm fine. And you? Fine. So tell me about yourself. myself. I'm a software engineer. I'm working in a global IT app as project manager. Okay. And I have more than ten year experience in PHP. During my experience, I worked on PHP. So which projects are you working on? Right now I'm working on? Right now I'm working on single project. That is it. that is ASR means speech recognition on that. Which challenges are you facing? Right now, I'm facing some challenges related to the speaker changes. Like, when I transcribe the video into text format. Sometimes the speak speaker labeling are wrong. Okay. I'm disconnecting the", 'confidence': 0.94970703, 'words': [{'word': 'hello', 'start': 1.1992188, 'end': 1.5195312, 'confidence': 0.75268555, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'Hello,'}, {'word': 'kamiji', 'start': 1.5195312, 'end': 1.9990234, 'confidence': 0.6906738, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'Kamiji.'}, {'word': 'how', 'start': 1.9990234, 'end': 2.1582031, 'confidence': 0.9946289, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'How'}, {'word': 'are', 'start': 2.1582031, 'end': 2.3183594, 'confidence': 0.98095703, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'are'}, {'word': 'you', 'start': 2.3183594, 'end': 2.8183594, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'you?'}, {'word': "i'm", 'start': 3.0390625, 'end': 3.2773438, 'confidence': 0.79833984, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'fine', 'start': 3.2773438, 'end': 3.5976562, 'confidence': 0.78344727, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'fine.'}, {'word': 'and', 'start': 3.5976562, 'end': 3.9179688, 'confidence': 0.5410156, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'And'}, {'word': 'you', 'start': 3.9179688, 'end': 4.4179688, 'confidence': 0.8601074, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'you?'}, {'word': 'fine', 'start': 4.5585938, 'end': 5.0585938, 'confidence': 0.9968262, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'Fine.'}, {'word': 'so', 'start': 5.3554688, 'end': 5.5976562, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'So'}, {'word': 'tell', 'start': 5.5976562, 'end': 5.8359375, 'confidence': 0.93359375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'tell'}, {'word': 'me', 'start': 5.8359375, 'end': 5.9960938, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'me'}, {'word': 'about', 'start': 5.9960938, 'end': 6.2382812, 'confidence': 0.9980469, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'about'}, {'word': 'yourself', 'start': 6.2382812, 'end': 6.7382812, 'confidence': 0.7758789, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'yourself.'}, {'word': 'myself', 'start': 7.6835938, 'end': 8.183594, 'confidence': 0.61417645, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'myself.'}, {'word': "i'm", 'start': 8.8828125, 'end': 9.0859375, 'confidence': 0.85961914, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'a', 'start': 9.0859375, 'end': 9.203125, 'confidence': 0.99609375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'a'}, {'word': 'software', 'start': 9.203125, 'end': 9.640625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'software'}, {'word': 'engineer', 'start': 9.640625, 'end': 10.140625, 'confidence': 0.9729004, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'engineer.'}, {'word': "i'm", 'start': 10.2421875, 'end': 10.484375, 'confidence': 0.8195801, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 10.484375, 'end': 10.84375, 'confidence': 0.94970703, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'working'}, {'word': 'in', 'start': 10.84375, 'end': 11.125, 'confidence': 0.99609375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'in'}, {'word': 'a', 'start': 11.125, 'end': 11.203125, 'confidence': 0.6328125, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'a'}, {'word': 'global', 'start': 11.203125, 'end': 11.640625, 'confidence': 0.98828125, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'global'}, {'word': 'it', 'start': 11.640625, 'end': 12.0, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'IT'}, {'word': 'app', 'start': 12.0, 'end': 12.203125, 'confidence': 0.5722656, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'app'}, {'word': 'as', 'start': 12.203125, 'end': 12.703125, 'confidence': 0.9086914, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'as'}, {'word': 'project', 'start': 13.640625, 'end': 14.0, 'confidence': 0.6201172, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'project'}, {'word': 'manager', 'start': 14.0, 'end': 14.5, 'confidence': 0.9519043, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'manager.'}, {'word': 'okay', 'start': 15.2265625, 'end': 15.546875, 'confidence': 0.7788086, 'speaker': 0, 'speaker_confidence': 0.041602314, 'punctuated_word': 'Okay.'}, {'word': 'and', 'start': 15.546875, 'end': 16.046875, 'confidence': 0.79345703, 'speaker': 0, 'speaker_confidence': 0.041602314, 'punctuated_word': 'And'}, {'word': 'i', 'start': 17.09375, 'end': 17.21875, 'confidence': 0.95654297, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'I'}, {'word': 'have', 'start': 17.21875, 'end': 17.46875, 'confidence': 0.99560547, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'have'}, {'word': 'more', 'start': 17.46875, 'end': 17.65625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'more'}, {'word': 'than', 'start': 17.65625, 'end': 17.90625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'than'}, {'word': 'ten', 'start': 17.90625, 'end': 18.09375, 'confidence': 0.9941406, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'ten'}, {'word': 'year', 'start': 18.09375, 'end': 18.34375, 'confidence': 0.9902344, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'year'}, {'word': 'experience', 'start': 18.34375, 'end': 18.84375, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'experience'}, {'word': 'in', 'start': 18.90625, 'end': 19.140625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'in'}, {'word': 'php', 'start': 19.140625, 'end': 19.640625, 'confidence': 0.90063477, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'PHP.'}, {'word': 'during', 'start': 20.421875, 'end': 20.703125, 'confidence': 0.9951172, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'During'}, {'word': 'my', 'start': 20.703125, 'end': 20.90625, 'confidence': 0.9506836, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'my'}, {'word': 'experience', 'start': 20.90625, 'end': 21.40625, 'confidence': 0.96850586, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'experience,'}, {'word': 'i', 'start': 21.46875, 'end': 21.625, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'I'}, {'word': 'worked', 'start': 21.625, 'end': 21.90625, 'confidence': 0.4501953, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'worked'}, {'word': 'on', 'start': 21.90625, 'end': 22.0625, 'confidence': 0.9453125, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'on'}, {'word': 'php', 'start': 22.0625, 'end': 22.5625, 'confidence': 0.28125, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'PHP.'}, {'word': 'so', 'start': 25.96875, 'end': 26.140625, 'confidence': 0.13635254, 'speaker': 0, 'speaker_confidence': 0.019748092, 'punctuated_word': 'So'}, {'word': 'which', 'start': 26.140625, 'end': 26.328125, 'confidence': 0.9736328, 'speaker': 0, 'speaker_confidence': 0.019748092, 'punctuated_word': 'which'}, {'word': 'projects', 'start': 26.328125, 'end': 26.78125, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'projects'}, {'word': 'are', 'start': 26.78125, 'end': 26.890625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'are'}, {'word': 'you', 'start': 26.890625, 'end': 27.015625, 'confidence': 0.9824219, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'you'}, {'word': 'working', 'start': 27.015625, 'end': 27.296875, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'working'}, {'word': 'on', 'start': 27.296875, 'end': 27.796875, 'confidence': 0.8391113, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'on?'}, {'word': 'right', 'start': 28.375, 'end': 28.65625, 'confidence': 0.86621094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 28.65625, 'end': 28.703125, 'confidence': 0.99316406, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'now'}, {'word': "i'm", 'start': 28.703125, 'end': 28.75, 'confidence': 0.77368164, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 28.75, 'end': 28.796875, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'working'}, {'word': 'on', 'start': 28.796875, 'end': 28.84375, 'confidence': 0.75878906, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on?'}, {'word': 'right', 'start': 28.84375, 'end': 28.890625, 'confidence': 0.62402344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 28.890625, 'end': 28.9375, 'confidence': 0.99121094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'now'}, {'word': "i'm", 'start': 28.9375, 'end': 29.171875, 'confidence': 0.76293945, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 29.171875, 'end': 29.53125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'working'}, {'word': 'on', 'start': 29.53125, 'end': 30.03125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on'}, {'word': 'single', 'start': 30.65625, 'end': 31.015625, 'confidence': 0.74316406, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'single'}, {'word': 'project', 'start': 31.015625, 'end': 31.5, 'confidence': 0.7504883, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'project.'}, {'word': 'that', 'start': 31.5, 'end': 31.65625, 'confidence': 0.8725586, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'That'}, {'word': 'is', 'start': 31.65625, 'end': 31.84375, 'confidence': 0.95654297, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'is'}, {'word': 'it', 'start': 31.84375, 'end': 32.34375, 'confidence': 0.8676758, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'it.'}, {'word': 'that', 'start': 32.96875, 'end': 33.21875, 'confidence': 0.38134766, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'that'}, {'word': 'is', 'start': 33.21875, 'end': 33.71875, 'confidence': 0.89990234, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'is'}, {'word': 'asr', 'start': 34.0625, 'end': 34.5625, 'confidence': 0.87939453, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'ASR'}, {'word': 'means', 'start': 35.4375, 'end': 35.9375, 'confidence': 0.6923828, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'means'}, {'word': 'speech', 'start': 35.9375, 'end': 36.21875, 'confidence': 0.52246094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'speech'}, {'word': 'recognition', 'start': 36.21875, 'end': 36.71875, 'confidence': 0.92529297, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'recognition'}, {'word': 'on', 'start': 37.71875, 'end': 37.84375, 'confidence': 0.5708008, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on'}, {'word': 'that', 'start': 37.84375, 'end': 38.34375, 'confidence': 0.9433594, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'that.'}, {'word': 'which', 'start': 38.625, 'end': 38.875, 'confidence': 0.9770508, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'Which'}, {'word': 'challenges', 'start': 38.875, 'end': 39.34375, 'confidence': 0.8046875, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'challenges'}, {'word': 'are', 'start': 39.34375, 'end': 39.5, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'are'}, {'word': 'you', 'start': 39.5, 'end': 39.625, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'you'}, {'word': 'facing', 'start': 39.625, 'end': 40.125, 'confidence': 0.99658203, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'facing?'}, {'word': 'right', 'start': 41.71875, 'end': 41.96875, 'confidence': 0.63623047, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 41.96875, 'end': 42.15625, 'confidence': 0.81274414, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'now,'}, {'word': "i'm", 'start': 42.15625, 'end': 42.5625, 'confidence': 0.98950195, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': "I'm"}, {'word': 'facing', 'start': 42.5625, 'end': 43.0, 'confidence': 0.9868164, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'facing'}, {'word': 'some', 'start': 43.0, 'end': 43.25, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'some'}, {'word': 'challenges', 'start': 43.25, 'end': 43.75, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'challenges'}, {'word': 'related', 'start': 44.03125, 'end': 44.53125, 'confidence': 0.98583984, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'related'}, {'word': 'to', 'start': 44.53125, 'end': 44.625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'to'}, {'word': 'the', 'start': 44.625, 'end': 44.84375, 'confidence': 0.9716797, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'the'}, {'word': 'speaker', 'start': 44.84375, 'end': 45.1875, 'confidence': 0.9902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'speaker'}, {'word': 'changes', 'start': 45.1875, 'end': 45.6875, 'confidence': 0.9050293, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'changes.'}, {'word': 'like', 'start': 46.34375, 'end': 46.84375, 'confidence': 0.9001465, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'Like,'}, {'word': 'when', 'start': 47.0625, 'end': 47.3125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'when'}, {'word': 'i', 'start': 47.3125, 'end': 47.8125, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'I'}, {'word': 'transcribe', 'start': 48.5625, 'end': 49.0625, 'confidence': 0.9313965, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'transcribe'}, {'word': 'the', 'start': 49.3125, 'end': 49.56, 'confidence': 0.9580078, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'the'}, {'word': 'video', 'start': 49.8125, 'end': 50.3125, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'video'}, {'word': 'into', 'start': 50.59375, 'end': 51.0, 'confidence': 0.91308594, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'into'}, {'word': 'text', 'start': 51.0, 'end': 51.3125, 'confidence': 0.7885742, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'text'}, {'word': 'format', 'start': 51.3125, 'end': 51.8125, 'confidence': 0.78125, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'format.'}, {'word': 'sometimes', 'start': 52.6875, 'end': 53.1875, 'confidence': 0.9589844, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'Sometimes'}, {'word': 'the', 'start': 53.1875, 'end': 53.375, 'confidence': 0.49975586, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'the'}, {'word': 'speak', 'start': 53.375, 'end': 53.625, 'confidence': 0.84521484, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'speak'}, {'word': 'speaker', 'start': 53.84375, 'end': 54.25, 'confidence': 0.71191406, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'speaker'}, {'word': 'labeling', 'start': 54.25, 'end': 54.71875, 'confidence': 0.9682617, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'labeling'}, {'word': 'are', 'start': 54.71875, 'end': 54.96875, 'confidence': 0.48291016, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'are'}, {'word': 'wrong', 'start': 54.96875, 'end': 55.46875, 'confidence': 0.9030762, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'wrong.'}, {'word': 'okay', 'start': 56.40625, 'end': 56.90625, 'confidence': 0.77368164, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'Okay.'}, {'word': "i'm", 'start': 58.0625, 'end': 58.40625, 'confidence': 0.95996094, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': "I'm"}, {'word': 'disconnecting', 'start': 58.40625, 'end': 58.90625, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'disconnecting'}, {'word': 'the', 'start': 59.1875, 'end': 59.6875, 'confidence': 0.88964844, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'the'}]}]}], 'utterances': [{'start': 1.1992188, 'end': 6.7382812, 'confidence': 0.87348634, 'channel': 0, 'transcript': "Hello, Kamiji. How are you? I'm fine. And you? Fine. So tell me about yourself.", 'words': [{'word': 'hello', 'start': 1.1992188, 'end': 1.5195312, 'confidence': 0.75268555, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'Hello,'}, {'word': 'kamiji', 'start': 1.5195312, 'end': 1.9990234, 'confidence': 0.6906738, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'Kamiji.'}, {'word': 'how', 'start': 1.9990234, 'end': 2.1582031, 'confidence': 0.9946289, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'How'}, {'word': 'are', 'start': 2.1582031, 'end': 2.3183594, 'confidence': 0.98095703, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'are'}, {'word': 'you', 'start': 2.3183594, 'end': 2.8183594, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.06817669, 'punctuated_word': 'you?'}, {'word': "i'm", 'start': 3.0390625, 'end': 3.2773438, 'confidence': 0.79833984, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'fine', 'start': 3.2773438, 'end': 3.5976562, 'confidence': 0.78344727, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'fine.'}, {'word': 'and', 'start': 3.5976562, 'end': 3.9179688, 'confidence': 0.5410156, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'And'}, {'word': 'you', 'start': 3.9179688, 'end': 4.4179688, 'confidence': 0.8601074, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'you?'}, {'word': 'fine', 'start': 4.5585938, 'end': 5.0585938, 'confidence': 0.9968262, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'Fine.'}, {'word': 'so', 'start': 5.3554688, 'end': 5.5976562, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'So'}, {'word': 'tell', 'start': 5.5976562, 'end': 5.8359375, 'confidence': 0.93359375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'tell'}, {'word': 'me', 'start': 5.8359375, 'end': 5.9960938, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'me'}, {'word': 'about', 'start': 5.9960938, 'end': 6.2382812, 'confidence': 0.9980469, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'about'}, {'word': 'yourself', 'start': 6.2382812, 'end': 6.7382812, 'confidence': 0.7758789, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'yourself.'}], 'speaker': 0, 'id': 'efde132f-d302-4900-b368-901c67ad5c72'}, {'start': 7.6835938, 'end': 8.183594, 'confidence': 0.61417645, 'channel': 0, 'transcript': 'myself.', 'words': [{'word': 'myself', 'start': 7.6835938, 'end': 8.183594, 'confidence': 0.61417645, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'myself.'}], 'speaker': 0, 'id': '61f3c5d5-e229-4520-9e0f-f4abf924173c'}, {'start': 8.8828125, 'end': 12.703125, 'confidence': 0.89109296, 'channel': 0, 'transcript': "I'm a software engineer. I'm working in a global IT app as", 'words': [{'word': "i'm", 'start': 8.8828125, 'end': 9.0859375, 'confidence': 0.85961914, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'a', 'start': 9.0859375, 'end': 9.203125, 'confidence': 0.99609375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'a'}, {'word': 'software', 'start': 9.203125, 'end': 9.640625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'software'}, {'word': 'engineer', 'start': 9.640625, 'end': 10.140625, 'confidence': 0.9729004, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'engineer.'}, {'word': "i'm", 'start': 10.2421875, 'end': 10.484375, 'confidence': 0.8195801, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 10.484375, 'end': 10.84375, 'confidence': 0.94970703, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'working'}, {'word': 'in', 'start': 10.84375, 'end': 11.125, 'confidence': 0.99609375, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'in'}, {'word': 'a', 'start': 11.125, 'end': 11.203125, 'confidence': 0.6328125, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'a'}, {'word': 'global', 'start': 11.203125, 'end': 11.640625, 'confidence': 0.98828125, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'global'}, {'word': 'it', 'start': 11.640625, 'end': 12.0, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'IT'}, {'word': 'app', 'start': 12.0, 'end': 12.203125, 'confidence': 0.5722656, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'app'}, {'word': 'as', 'start': 12.203125, 'end': 12.703125, 'confidence': 0.9086914, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'as'}], 'speaker': 0, 'id': '7030b852-6e2b-4271-be1f-c000668762b0'}, {'start': 13.640625, 'end': 14.5, 'confidence': 0.78601074, 'channel': 0, 'transcript': 'project manager.', 'words': [{'word': 'project', 'start': 13.640625, 'end': 14.0, 'confidence': 0.6201172, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'project'}, {'word': 'manager', 'start': 14.0, 'end': 14.5, 'confidence': 0.9519043, 'speaker': 0, 'speaker_confidence': 0.64976156, 'punctuated_word': 'manager.'}], 'speaker': 0, 'id': '0433d626-1ab4-4e51-a7a0-e346801b2720'}, {'start': 15.2265625, 'end': 16.046875, 'confidence': 0.7861328, 'channel': 0, 'transcript': 'Okay. And', 'words': [{'word': 'okay', 'start': 15.2265625, 'end': 15.546875, 'confidence': 0.7788086, 'speaker': 0, 'speaker_confidence': 0.041602314, 'punctuated_word': 'Okay.'}, {'word': 'and', 'start': 15.546875, 'end': 16.046875, 'confidence': 0.79345703, 'speaker': 0, 'speaker_confidence': 0.041602314, 'punctuated_word': 'And'}], 'speaker': 0, 'id': 'f9754b97-95c0-4cfe-813b-4ed088bf4df3'}, {'start': 17.09375, 'end': 19.640625, 'confidence': 0.98147243, 'channel': 0, 'transcript': 'I have more than ten year experience in PHP.', 'words': [{'word': 'i', 'start': 17.09375, 'end': 17.21875, 'confidence': 0.95654297, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'I'}, {'word': 'have', 'start': 17.21875, 'end': 17.46875, 'confidence': 0.99560547, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'have'}, {'word': 'more', 'start': 17.46875, 'end': 17.65625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'more'}, {'word': 'than', 'start': 17.65625, 'end': 17.90625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'than'}, {'word': 'ten', 'start': 17.90625, 'end': 18.09375, 'confidence': 0.9941406, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'ten'}, {'word': 'year', 'start': 18.09375, 'end': 18.34375, 'confidence': 0.9902344, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'year'}, {'word': 'experience', 'start': 18.34375, 'end': 18.84375, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'experience'}, {'word': 'in', 'start': 18.90625, 'end': 19.140625, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'in'}, {'word': 'php', 'start': 19.140625, 'end': 19.640625, 'confidence': 0.90063477, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'PHP.'}], 'speaker': 0, 'id': '2555d72a-3eb6-4b03-9417-424c452d7780'}, {'start': 20.421875, 'end': 22.5625, 'confidence': 0.7983747, 'channel': 0, 'transcript': 'During my experience, I worked on PHP.', 'words': [{'word': 'during', 'start': 20.421875, 'end': 20.703125, 'confidence': 0.9951172, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'During'}, {'word': 'my', 'start': 20.703125, 'end': 20.90625, 'confidence': 0.9506836, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'my'}, {'word': 'experience', 'start': 20.90625, 'end': 21.40625, 'confidence': 0.96850586, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'experience,'}, {'word': 'i', 'start': 21.46875, 'end': 21.625, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'I'}, {'word': 'worked', 'start': 21.625, 'end': 21.90625, 'confidence': 0.4501953, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'worked'}, {'word': 'on', 'start': 21.90625, 'end': 22.0625, 'confidence': 0.9453125, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'on'}, {'word': 'php', 'start': 22.0625, 'end': 22.5625, 'confidence': 0.28125, 'speaker': 0, 'speaker_confidence': 0.56206894, 'punctuated_word': 'PHP.'}], 'speaker': 0, 'id': '61a1bffa-9c00-41e5-a0cb-d060f3e2d4ca'}, {'start': 25.96875, 'end': 27.796875, 'confidence': 0.8465925, 'channel': 0, 'transcript': 'So which projects are you working on?', 'words': [{'word': 'so', 'start': 25.96875, 'end': 26.140625, 'confidence': 0.13635254, 'speaker': 0, 'speaker_confidence': 0.019748092, 'punctuated_word': 'So'}, {'word': 'which', 'start': 26.140625, 'end': 26.328125, 'confidence': 0.9736328, 'speaker': 0, 'speaker_confidence': 0.019748092, 'punctuated_word': 'which'}, {'word': 'projects', 'start': 26.328125, 'end': 26.78125, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'projects'}, {'word': 'are', 'start': 26.78125, 'end': 26.890625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'are'}, {'word': 'you', 'start': 26.890625, 'end': 27.015625, 'confidence': 0.9824219, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'you'}, {'word': 'working', 'start': 27.015625, 'end': 27.296875, 'confidence': 0.9975586, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'working'}, {'word': 'on', 'start': 27.296875, 'end': 27.796875, 'confidence': 0.8391113, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'on?'}], 'speaker': 0, 'id': 'f9004e9d-9cde-4a64-b032-4070a409bfb6'}, {'start': 28.375, 'end': 30.03125, 'confidence': 0.87666017, 'channel': 0, 'transcript': "Right now I'm working on? Right now I'm working on", 'words': [{'word': 'right', 'start': 28.375, 'end': 28.65625, 'confidence': 0.86621094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 28.65625, 'end': 28.703125, 'confidence': 0.99316406, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'now'}, {'word': "i'm", 'start': 28.703125, 'end': 28.75, 'confidence': 0.77368164, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 28.75, 'end': 28.796875, 'confidence': 0.99853516, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'working'}, {'word': 'on', 'start': 28.796875, 'end': 28.84375, 'confidence': 0.75878906, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on?'}, {'word': 'right', 'start': 28.84375, 'end': 28.890625, 'confidence': 0.62402344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 28.890625, 'end': 28.9375, 'confidence': 0.99121094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'now'}, {'word': "i'm", 'start': 28.9375, 'end': 29.171875, 'confidence': 0.76293945, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': "I'm"}, {'word': 'working', 'start': 29.171875, 'end': 29.53125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'working'}, {'word': 'on', 'start': 29.53125, 'end': 30.03125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on'}], 'speaker': 0, 'id': 'f6d42af6-2873-4bac-8083-4179fcb9d676'}, {'start': 30.65625, 'end': 32.34375, 'confidence': 0.83808595, 'channel': 0, 'transcript': 'single project. That is it.', 'words': [{'word': 'single', 'start': 30.65625, 'end': 31.015625, 'confidence': 0.74316406, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'single'}, {'word': 'project', 'start': 31.015625, 'end': 31.5, 'confidence': 0.7504883, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'project.'}, {'word': 'that', 'start': 31.5, 'end': 31.65625, 'confidence': 0.8725586, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'That'}, {'word': 'is', 'start': 31.65625, 'end': 31.84375, 'confidence': 0.95654297, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'is'}, {'word': 'it', 'start': 31.84375, 'end': 32.34375, 'confidence': 0.8676758, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'it.'}], 'speaker': 0, 'id': 'dec794ef-efb7-47ac-84f7-0a69a179acf9'}, {'start': 32.96875, 'end': 33.71875, 'confidence': 0.640625, 'channel': 0, 'transcript': 'that is', 'words': [{'word': 'that', 'start': 32.96875, 'end': 33.21875, 'confidence': 0.38134766, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'that'}, {'word': 'is', 'start': 33.21875, 'end': 33.71875, 'confidence': 0.89990234, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'is'}], 'speaker': 0, 'id': '1fe6444e-bdde-4f77-9b86-39bf61f0f322'}, {'start': 34.0625, 'end': 34.5625, 'confidence': 0.87939453, 'channel': 0, 'transcript': 'ASR', 'words': [{'word': 'asr', 'start': 34.0625, 'end': 34.5625, 'confidence': 0.87939453, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'ASR'}], 'speaker': 0, 'id': '6cf98d9a-55f4-4683-9b0e-30edc887eb7a'}, {'start': 35.4375, 'end': 36.71875, 'confidence': 0.7133789, 'channel': 0, 'transcript': 'means speech recognition', 'words': [{'word': 'means', 'start': 35.4375, 'end': 35.9375, 'confidence': 0.6923828, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'means'}, {'word': 'speech', 'start': 35.9375, 'end': 36.21875, 'confidence': 0.52246094, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'speech'}, {'word': 'recognition', 'start': 36.21875, 'end': 36.71875, 'confidence': 0.92529297, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'recognition'}], 'speaker': 0, 'id': '17cb8e5d-03e2-481d-b19c-4d145e37502d'}, {'start': 37.71875, 'end': 40.125, 'confidence': 0.8987165, 'channel': 0, 'transcript': 'on that. Which challenges are you facing?', 'words': [{'word': 'on', 'start': 37.71875, 'end': 37.84375, 'confidence': 0.5708008, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'on'}, {'word': 'that', 'start': 37.84375, 'end': 38.34375, 'confidence': 0.9433594, 'speaker': 0, 'speaker_confidence': 0.61906564, 'punctuated_word': 'that.'}, {'word': 'which', 'start': 38.625, 'end': 38.875, 'confidence': 0.9770508, 'speaker': 0, 'speaker_confidence': 0.0, 'punctuated_word': 'Which'}, {'word': 'challenges', 'start': 38.875, 'end': 39.34375, 'confidence': 0.8046875, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'challenges'}, {'word': 'are', 'start': 39.34375, 'end': 39.5, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'are'}, {'word': 'you', 'start': 39.5, 'end': 39.625, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'you'}, {'word': 'facing', 'start': 39.625, 'end': 40.125, 'confidence': 0.99658203, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'facing?'}], 'speaker': 0, 'id': '40604b98-dace-46af-b209-809840722b07'}, {'start': 41.71875, 'end': 45.6875, 'confidence': 0.9342374, 'channel': 0, 'transcript': "Right now, I'm facing some challenges related to the speaker changes.", 'words': [{'word': 'right', 'start': 41.71875, 'end': 41.96875, 'confidence': 0.63623047, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'Right'}, {'word': 'now', 'start': 41.96875, 'end': 42.15625, 'confidence': 0.81274414, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'now,'}, {'word': "i'm", 'start': 42.15625, 'end': 42.5625, 'confidence': 0.98950195, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': "I'm"}, {'word': 'facing', 'start': 42.5625, 'end': 43.0, 'confidence': 0.9868164, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'facing'}, {'word': 'some', 'start': 43.0, 'end': 43.25, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'some'}, {'word': 'challenges', 'start': 43.25, 'end': 43.75, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'challenges'}, {'word': 'related', 'start': 44.03125, 'end': 44.53125, 'confidence': 0.98583984, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'related'}, {'word': 'to', 'start': 44.53125, 'end': 44.625, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'to'}, {'word': 'the', 'start': 44.625, 'end': 44.84375, 'confidence': 0.9716797, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'the'}, {'word': 'speaker', 'start': 44.84375, 'end': 45.1875, 'confidence': 0.9902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'speaker'}, {'word': 'changes', 'start': 45.1875, 'end': 45.6875, 'confidence': 0.9050293, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'changes.'}], 'speaker': 0, 'id': 'a005d2b7-9517-488f-82d4-9d66d3187e6d'}, {'start': 46.34375, 'end': 47.8125, 'confidence': 0.96622723, 'channel': 0, 'transcript': 'Like, when I', 'words': [{'word': 'like', 'start': 46.34375, 'end': 46.84375, 'confidence': 0.9001465, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'Like,'}, {'word': 'when', 'start': 47.0625, 'end': 47.3125, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.57842153, 'punctuated_word': 'when'}, {'word': 'i', 'start': 47.3125, 'end': 47.8125, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'I'}], 'speaker': 0, 'id': 'ed305b95-e17c-4da2-a11f-263615a605ad'}, {'start': 48.5625, 'end': 51.8125, 'confidence': 0.8953044, 'channel': 0, 'transcript': 'transcribe the video into text format.', 'words': [{'word': 'transcribe', 'start': 48.5625, 'end': 49.0625, 'confidence': 0.9313965, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'transcribe'}, {'word': 'the', 'start': 49.3125, 'end': 49.56, 'confidence': 0.9580078, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'the'}, {'word': 'video', 'start': 49.8125, 'end': 50.3125, 'confidence': 0.9995117, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'video'}, {'word': 'into', 'start': 50.59375, 'end': 51.0, 'confidence': 0.91308594, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'into'}, {'word': 'text', 'start': 51.0, 'end': 51.3125, 'confidence': 0.7885742, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'text'}, {'word': 'format', 'start': 51.3125, 'end': 51.8125, 'confidence': 0.78125, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'format.'}], 'speaker': 0, 'id': 'f521d234-ad31-4ea6-9443-16e74f40b1cc'}, {'start': 52.6875, 'end': 55.46875, 'confidence': 0.7671596, 'channel': 0, 'transcript': 'Sometimes the speak speaker labeling are wrong.', 'words': [{'word': 'sometimes', 'start': 52.6875, 'end': 53.1875, 'confidence': 0.9589844, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'Sometimes'}, {'word': 'the', 'start': 53.1875, 'end': 53.375, 'confidence': 0.49975586, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'the'}, {'word': 'speak', 'start': 53.375, 'end': 53.625, 'confidence': 0.84521484, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'speak'}, {'word': 'speaker', 'start': 53.84375, 'end': 54.25, 'confidence': 0.71191406, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'speaker'}, {'word': 'labeling', 'start': 54.25, 'end': 54.71875, 'confidence': 0.9682617, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'labeling'}, {'word': 'are', 'start': 54.71875, 'end': 54.96875, 'confidence': 0.48291016, 'speaker': 0, 'speaker_confidence': 0.5727463, 'punctuated_word': 'are'}, {'word': 'wrong', 'start': 54.96875, 'end': 55.46875, 'confidence': 0.9030762, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'wrong.'}], 'speaker': 0, 'id': '87a5b8f7-069c-4ffd-ac46-0ac464cdc7e0'}, {'start': 56.40625, 'end': 56.90625, 'confidence': 0.77368164, 'channel': 0, 'transcript': 'Okay.', 'words': [{'word': 'okay', 'start': 56.40625, 'end': 56.90625, 'confidence': 0.77368164, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'Okay.'}], 'speaker': 0, 'id': 'f70db353-b977-4cda-94f8-819dfdb53505'}, {'start': 58.0625, 'end': 59.6875, 'confidence': 0.94954425, 'channel': 0, 'transcript': "I'm disconnecting the", 'words': [{'word': "i'm", 'start': 58.0625, 'end': 58.40625, 'confidence': 0.95996094, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': "I'm"}, {'word': 'disconnecting', 'start': 58.40625, 'end': 58.90625, 'confidence': 0.99902344, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'disconnecting'}, {'word': 'the', 'start': 59.1875, 'end': 59.6875, 'confidence': 0.88964844, 'speaker': 0, 'speaker_confidence': 0.098423064, 'punctuated_word': 'the'}], 'speaker': 0, 'id': '2697a274-547d-4486-abb3-9b5e7229a40c'}]}}

remove this

Add Docstrings in the Python SDK

Proposed changes

Add docstrings in the Python SDK for all the classes and functions in the deepgram -> transcription.py file. Please add @geekchick as a reviewer on your pull request.

Context

This change is important so it can help other developers using the SDK to quickly understand what's happening in a class or function, therefore decreasing development time and helping junior developers get up and running.

ssl.SSLCertVerificationError for batch and streaming starter code of deepgram

ssl.SSLCertVerificationError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1129)

this error appears every time I try to run deepgram batch or streaming starter code

Sentiment (Listen) supported

old param analyze_sentiment should be kept for backwards compatibility cases.
new param sentiment added for sentiment analysis for /listen requests.

Getting error in pre-recorded audio

from deepgram import Deepgram
import asyncio
import json

# Your Deepgram API Key
apiKey = "******"

# Name and extension of the file you downloaded (e.g., sample.wav)
PATH_TO_FILE = 'Voice_sample_1.m4a'


async def transcribe_audio(audio_file):
    # Initialize the deepgram SDK
    dg_client = Deepgram(apiKey)
    # Open the audio file
    with open(audio_file, 'rb') as audio:
        # Replace mimetype as appropriate
        source = {'buffer': audio, 'mimetype': 'audio/m4a'}
        response = await dg_client.transcription.prerecorded(source, options={"punctuate": True})
        print(json.dumps(response, indent=4))
        return json.dumps(response, indent=4)


transcribedData = asyncio.run(transcribe_audio(PATH_TO_FILE))
with open("trancribedData.json", 'w') as jsonfile:
    jsonfile.write(transcribedData)
print("Transcribed data ready!")
jsonFile = open("trancribedData.json")
fileData = json.load(jsonFile)
print(fileData["results"]["channels"][0]["alternatives"][0]["transcript"])
filedata1 = fileData["results"]["channels"]
print(filedata1[0]["alternatives"][0]["transcript"])

I'm getting this after executing the above code. Please help me what I did wrong?

raise ClientConnectorCertificateError(req.connection_key, exc) from exc
aiohttp.client_exceptions.ClientConnectorCertificateError: Cannot connect to host api.deepgram.com:443 ssl:True [SSLCertVerificationError: (1, '[SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed: certificate has expired (_ssl.c:1091)')]

Add discord badge to the top of the README

Add a discord shield/badge to the top of the repo's README, linked to our community Discord (https://discord.gg/xWRaCDBtW4).

Name: Python SDK

Exception on import python3.9

What is the current behavior?

from deepgram import Deepgram results in an exception

TypeError: unhashable type: 'list'
Most relevant lines:

File ".../python3.9/site-packages/deepgram/_types.py", line 171, in <module>
    EventHandler = Union[Callable[[Any], None], Callable[[Any], Awaitable[None]]]

Steps to reproduce

Python3.9, pip install, from deepgram import Deepgram

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.
Ubuntu 21.04

Other information

Is solved by changing
EventHandler = Union[Callable[[Any], None], Callable[[Any], Awaitable[None]]]
into
EventHandler = Union[Callable[Any, None], Callable[Any, Awaitable[None]]]

in deepgram/_types.py

Full trace:
Traceback (most recent call last):
File "/home/script/tmi_archive/manage.py", line 22, in
main()
File "/home/script/tmi_archive/manage.py", line 18, in main
execute_from_command_line(sys.argv)
File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/django/core/management/init.py", line 425, in execute_from_command_line
utility.execute()
File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/django/core/management/init.py", line 419, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/django/core/management/init.py", line 263, in fetch_command
klass = load_command_class(app_name, subcommand)
File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/django/core/management/init.py", line 39, in load_command_class
module = import_module('%s.management.commands.%s' % (app_name, name))
File "/home/opt/anaconda/lib/python3.9/importlib/init.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "", line 1030, in _gcd_import
File "", line 1007, in _find_and_load
File "", line 986, in _find_and_load_unlocked
File "", line 680, in _load_unlocked
File "", line 790, in exec_module
File "", line 228, in _call_with_frames_removed
File "/home/script/tmi_archive/talks/management/commands/transcribe-deepgram.py", line 2, in
from deepgram import Deepgram
File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/deepgram/init.py", line 2, in
from ._types import Options
File "/home/simon/.local/share/virtualenvs/tmi_archive-Ytjv44Ey/lib/python3.9/site-packages/deepgram/_types.py", line 171, in
EventHandler = Union[Callable[[Any], None], Callable[[Any], Awaitable[None]]]
File "/home/opt/anaconda/lib/python3.9/typing.py", line 243, in inner
return func(*args, **kwds)
File "/home/opt/anaconda/lib/python3.9/typing.py", line 316, in getitem
return self._getitem(self, parameters)
File "/home/opt/anaconda/lib/python3.9/typing.py", line 421, in Union
parameters = _remove_dups_flatten(parameters)
File "/home/opt/anaconda/lib/python3.9/typing.py", line 215, in _remove_dups_flatten
all_params = set(params)
TypeError: unhashable type: 'list'

Summarize (Read) supported

new param summarize added for summarization for /read requests.

Use BlackFormatter for VSCode

Proposed changes

It has been recommended that we add a code formatter to the project:

Black is a common one for Python. It would be nice to run this through a code formatter all in one go, so we don't end up needing to make handfuls of formatting fixes which can clutter PR diffs later.

Context

This will help to keep the project formatted consistently no matter who is contributing.

Possible Implementation

Black

`ValueError: I/O operation on closed file`

What is the current behavior?

A ValueError: I/O operation on closed file is raised when calling dg_client.transcription.prerecorded and the following conditions satisfy

The passed buffer is a stream object (e.g. file)
The request failed for some reason (e.g. bad token)

Such an error is misleading. It is a side effect described in expected behaviour.

What's happening that seems wrong?

The SDK automatically reattempts requests if they fail. However, stream objects (often) cannot be re-read from the beginning.

Steps to reproduce

To make it faster to diagnose the root problem. Tell us how can we reproduce the bug.

DEEPGRAM_API_KEY = "BAD_TOKEN"
dg_client = Deepgram(DEEPGRAM_API_KEY)
with open("some_file_that_exists.wav", "rb") as f:
    await dg_client.transcription.prerecorded({"buffer": f, "mimetype": "audio/wav"})

Expected behavior

If the buffer is a stream object, the SDK should not automatically retry the request because streams cannot be directly restarted from the beginning. Retrying the request will cause ValueError: I/O operation on closed file exception because the stream is fully consumed and hence closed after the first attempt. Instead, the request should be only made at most once. If it fails, the real exception is thrown (e.g. in case of bad token, an Unauthorized exception should be thrown).

What would you expect to happen when following the steps above?

Please tell us about your environment

We want to make sure the problem isn't specific to your operating system or programming language.

Operating System/Version: Not relevant
Language: Python
Browser: Not relevant

Other information

Anything else we should know? (e.g. detailed explanation, stack-traces, related issues, suggestions how to fix, links for us to have context, eg. stack overflow, codepen, etc)

Suggested fix

In _request defined in _utils.py of the SDK, there is a retry logic.

async def _request(
    path: str, options: Options,
    method: str = 'GET', payload: Payload = None,
    headers: Optional[Mapping[str, str]] = None
) -> Any:
    # ...
    tries = RETRY_COUNT
    while tries > 0:
        try:
            return await attempt()
        except Exception as exc:
            print(exc)
            tries -= 1
            continue

To fix the problem, check the type of payload. If it is stream-like, only try once by assigning 1 to tries.

Unable to use DeepGram Python Sdk on Windows

I am trying to transcript a file using the DeepGram Python SDK, but I am getting an ssl certification error. This error occurs when I try to run a transcription of a file on Windows. I expected it to return a string of the audio. I am using a python 3.10 virtual environment.

Operating System/Version: Windows 10
Language: Python
Browser: Chrome

Static Type Checking via mypy - PEP 561 support

Proposed changes

I'm new to python so excuse me if I'm missing something obvious.

You've carefully added type hints to the SDK, amazing. When I add a static type checker to my project, like mypy, it can't seem to find them. According to mypy's documentation, packages are supposed to declare their compatibility with a py.typed file in the package directory. See here: https://mypy.readthedocs.io/en/stable/installed_packages.html#creating-pep-561-compatible-packages

Context

Static type checking dramatically reduces developer errors. You've already created types, adding compatibility to PEP 561 would allow us to use static checkers

Possible Implementation

See above

Other information

Am I missing anything? Another way to run a static type checker perhaps?

Sentiment (Read) supported

new param sentiment added for sentiment analysis for /read requests.