freeswitch-modules's Introduction

freeswitch-modules

A collection of Freeswitch modules intended for use with a jambonz programmable voice platform deployment.

Licensing

This software is available under a dual-licensing scheme. For specific use in a standalone jambonz deployment, the MIT License applies. For all other uses, the software is licensed for use under the AGPL Version 3.0 license.

Please review COPYING for detailed information in order to assess whether your intended usage meets the specific conditions that allow for usage under the MIT License.

Contributing to the software

If you wish to contribute changes, please review our rules governing contributions.

freeswitch-modules's People

Contributors

ajgolledge, davehorton, vdharashive, xquanluu

freeswitch-modules's Issues

real time diarization with mod_azure_transcribe

I am using mod_azure_transcribe to transcribe a call, but currently I only get transcription results from one channel.
Does mod_azure_transcribe support real-time diarization?

Unable to load mod_aws_transcribe

Hello, thank you for providing these libraries. I have been able to install and use mod_google_transcribe successfully, but I can't get mod_aws_transcribe to load. The error is below.

2024-06-25 16:16:54.926542 73.60% [CRIT] switch_loadable_module.c:1750 Error Loading module /usr/local/freeswitch/mod/mod_aws_transcribe.so
/usr/local/freeswitch/mod/mod_aws_transcribe.so: undefined symbol: aws_symmetric_cipher_destroy

I have lifted the relevant parts from the configure.ac and Makefile.am files in your packer repository. I can't use the ansible/packer script because I have a custom build process. Can you provide some pointers on how to solve this? Thank you.

Error compiling mod_aws_transcribe, Conflicting cJSON struct definition

The mod_aws_transcribe module fails to compile on a vanilla FreeSWITCH 1.10.11 installation due to a conflict between two different definitions of the cJSON struct. The conflict arises between the version included in FreeSWITCH's own JSON library (switch_cJSON.h) and the one embedded within the AWS SDK (aws/core/external/cjson/cJSON.h).

making install mod_aws_transcribe
make[2]: Entering directory '/usr/local/src/freeswitch-1.10.11.-release/src/mod/applications/mod_aws_transcribe'
  CXX      mod_aws_transcribe_la-aws_transcribe_glue.lo
In file included from /usr/local/include/aws/core/utils/json/JsonSerializer.h:14,
                 from /usr/local/include/aws/core/client/AWSError.h:12,
                 from /usr/local/include/aws/core/endpoint/internal/AWSEndpointAttribute.h:9,
                 from /usr/local/include/aws/core/auth/signer/AWSAuthV4Signer.h:17,
                 from /usr/local/include/aws/core/auth/AWSAuthSigner.h:11,
                 from /usr/local/include/aws/core/AmazonWebServiceRequest.h:11,
                 from /usr/local/include/aws/core/client/AWSUrlPresigner.h:9,
                 from /usr/local/include/aws/core/client/AWSClient.h:12,
                 from /usr/local/include/aws/core/monitoring/MonitoringManager.h:11,
                 from /usr/local/include/aws/core/Aws.h:13,
                 from aws_transcribe_glue.cpp:14:
/usr/local/include/aws/core/external/cjson/cJSON.h:114:16: error: redefinition of ‘struct cJSON’
  114 | typedef struct cJSON
      |                ^~~~~
In file included from ../../../../src/include/switch_json.h:43,
                 from ../../../../src/include/switch_types.h:43,
                 from ../../../../src/include/switch.h:113,
                 from aws_transcribe_glue.cpp:3:
../../../../src/include/switch_cJSON.h:95:16: note: previous definition of ‘struct cJSON’
   95 | typedef struct cJSON
      |                ^~~~~
In file included from /usr/local/include/aws/core/utils/json/JsonSerializer.h:14,
                 from /usr/local/include/aws/core/client/AWSError.h:12,
                 from /usr/local/include/aws/core/endpoint/internal/AWSEndpointAttribute.h:9,
                 from /usr/local/include/aws/core/auth/signer/AWSAuthV4Signer.h:17,
                 from /usr/local/include/aws/core/auth/AWSAuthSigner.h:11,
                 from /usr/local/include/aws/core/AmazonWebServiceRequest.h:11,
                 from /usr/local/include/aws/core/client/AWSUrlPresigner.h:9,
                 from /usr/local/include/aws/core/client/AWSClient.h:12,
                 from /usr/local/include/aws/core/monitoring/MonitoringManager.h:11,
                 from /usr/local/include/aws/core/Aws.h:13,
                 from aws_transcribe_glue.cpp:14:
/usr/local/include/aws/core/external/cjson/cJSON.h:134:3: error: conflicting declaration ‘typedef int cJSON’
  134 | } cJSON;
      |   ^~~~~
In file included from ../../../../src/include/switch_json.h:43,
                 from ../../../../src/include/switch_types.h:43,
                 from ../../../../src/include/switch.h:113,
                 from aws_transcribe_glue.cpp:3:
../../../../src/include/switch_cJSON.h:115:3: note: previous declaration as ‘typedef struct cJSON cJSON’
  115 | } cJSON;
      |   ^~~~~
make[2]: *** [Makefile:791: mod_aws_transcribe_la-aws_transcribe_glue.lo] Error 1
make[2]: Leaving directory '/usr/local/src/freeswitch-1.10.11.-release/src/mod/applications/mod_aws_transcribe'
make[1]: *** [Makefile:732: mod_aws_transcribe-install] Error 1
make[1]: Leaving directory '/usr/local/src/freeswitch-1.10.11.-release/src/mod'
make: *** [Makefile:4338: mod_aws_transcribe-install] Error 2

Steps to Reproduce:

Start with a fresh installation of FreeSWITCH 1.10.11.
Attempt to build and install the mod_aws_transcribe module using the standard procedure (e.g., make mod_aws_transcribe-install).
Observe the compilation failure with the errors mentioned above.

Deepgram STT fails on "returning event deepgram_transcribe::transcription, from finished recognition session"

Hi,

I'm running into some issues where deepgram STT is sometimes disconnecting and failing to transcribe on a gather. I have attached the freeswitch debug logs.

The high level summary of the issue is:

We are running a longer call (~3m). Things are working well.

We have a DG request (1cd3f494-9c17-4fd4-afc7-96231e7b9bd2) that completes successfully. We then hit trouble on our next gather action:

####
#### Kicked over to DG...
#### Note that we get a response to the last request ID 1cd3f494-9c17-4fd4-afc7-96231e7b9bd2 and this kills things
####

"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.643117 96.53% [DEBUG] dg_transcribe_glue.cpp:404 (56) no resampling needed for this call"
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.643117 96.53% [DEBUG] dg_transcribe_glue.cpp:407 (56) fork_data_init"
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.643117 96.53% [DEBUG] dg_transcribe_glue.cpp:479 connecting now"
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.643117 96.53% [DEBUG] dg_transcribe_glue.cpp:481 connection in progress"
"2024-04-12 21:38:50.643117 96.53% [DEBUG] mod_deepgram_transcribe.c:41 Got SWITCH_ABC_TYPE_INIT."
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.643117 96.53% [DEBUG] switch_core_media_bug.c:976 Attaching BUG to sofia/drachtio_mrf/[email protected]:5060"
"2024-04-12 21:38:50.643117 96.53% [DEBUG] mod_deepgram_transcribe.c:97 added media bug for dg transcribe"
"2024-04-12 21:38:50.663122 96.53% [DEBUG] mod_deepgram_transcribe.c:27 responseHandler returning event deepgram_transcribe::transcription, from finished recognition session"
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.663122 96.53% [DEBUG] dg_transcribe_glue.cpp:310 deepgram message (gather_deepgram_transcribe): {\"type\":\"Results\",\"channel_index\":[0,1],\"duration\":0.52993727,\"start\":4.59,\"is_final\":true,\"speech_final\":true,\"channel\":{\"alternatives\":[{\"transcript\":\"\",\"confidence\":0.0,\"words\":[]}]},\"metadata\":{\"request_id\":\"1cd3f494-9c17-4fd4-afc7-96231e7b9bd2\",\"model_info\":{\"name\":\"2-phonecall-nova\",\"version\":\"2024-02-07.20824\",\"arch\":\"nova-2\"},\"model_uuid\":\"7e3b5bdf-85ed-4fd2-9f7a-7721bbcad97b\"},\"from_finalize\":false}"
"2024-04-12 21:38:50.723124 96.53% [DEBUG] mod_deepgram_transcribe.c:27 responseHandler returning event deepgram_transcribe::transcription, from finished recognition session"
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.723124 96.53% [DEBUG] dg_transcribe_glue.cpp:310 deepgram message (gather_deepgram_transcribe): {\"type\":\"Metadata\",\"transaction_key\":\"deprecated\",\"request_id\":\"1cd3f494-9c17-4fd4-afc7-96231e7b9bd2\",\"sha256\":\"b44d8f3dcbb4af0ed7376a756155a854c652b301074f0376ab475877256b4072\",\"created\":\"2024-04-12T21:38:44.725Z\",\"duration\":5.1199374,\"channels\":1,\"models\":[\"7e3b5bdf-85ed-4fd2-9f7a-7721bbcad97b\"],\"model_info\":{\"7e3b5bdf-85ed-4fd2-9f7a-7721bbcad97b\":{\"name\":\"2-phonecall-nova\",\"version\":\"2024-02-07.20824\",\"arch\":\"nova-2\"}}}"
"2024-04-12 21:38:50.723124 96.53% [DEBUG] mod_deepgram_transcribe.c:27 responseHandler returning event deepgram_transcribe::disconnect, from finished recognition session"
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.723124 96.53% [DEBUG] dg_transcribe_glue.cpp:297 connection (gather_deepgram_transcribe) dropped from far end"
"2024-04-12 21:38:50.723124 96.53% [DEBUG] dg_transcribe_glue.cpp:97 852fd82e-72e9-4a0b-8559-d62a8dc3241f (55) got remote close"
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:38:50.883114 96.53% [INFO] dg_transcribe_glue.cpp:280 connection (gather_deepgram_transcribe) successful"


####
#### DG Socket Closed
####


"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:39:02.883139 95.77% [DEBUG] dg_transcribe_glue.cpp:297 connection (gather_deepgram_transcribe) dropped from far end"


####
#### Playback ended 
####

"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:39:08.563127 95.83% [DEBUG] switch_ivr_play_say.c:2010 done playing file /tmp/tts-84106434-ae05-4f13-93fb-e3994bfdba66af5844e8-b7be-417c-a6cb-84352ff7b0b6:190d09833e3fc02d22f656f0e526edbce60ae5c1.r8"


####
#### No input timer fires 
####

"2024-04-12 21:39:18.643123 96.60% [INFO] mod_deepgram_transcribe.c:146 stop transcribing gather_deepgram_transcribe"
"2024-04-12 21:39:18.643123 96.60% [INFO] mod_deepgram_transcribe.c:110 Received user command command to stop transcribe."
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:39:18.643123 96.60% [DEBUG] dg_transcribe_glue.cpp:495 (56) dg_transcribe_session_stop"
"2024-04-12 21:39:18.643123 96.60% [DEBUG] mod_deepgram_transcribe.c:47 Got SWITCH_ABC_TYPE_CLOSE."
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:39:18.643123 96.60% [DEBUG] dg_transcribe_glue.cpp:489 dg_transcribe_session_stop: no bug - websocket conection already closed"
"2024-04-12 21:39:18.643123 96.60% [DEBUG] mod_deepgram_transcribe.c:49 Finished SWITCH_ABC_TYPE_CLOSE."
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:39:18.643123 96.60% [DEBUG] switch_core_media_bug.c:1326 Removing BUG from sofia/drachtio_mrf/[email protected]:5060"
"2024-04-12 21:39:18.643123 96.60% [INFO] dg_transcribe_glue.cpp:103 852fd82e-72e9-4a0b-8559-d62a8dc3241f (56) destroy_tech_pvt"
"852fd82e-72e9-4a0b-8559-d62a8dc3241f 2024-04-12 21:39:18.643123 96.60% [DEBUG] dg_transcribe_glue.cpp:510 (56) dg_transcribe_session_stop"
"2024-04-12 21:39:18.643123 96.60% [INFO] mod_deepgram_transcribe.c:112 stopped transcribe."

The next transcript/gather action works correctly.

This does not happen all the time, but we are seeing a handful of these in testing. We were about to go live on DG today when we caught this. I suspect that once we run at volume we will hit more of them.

I have attached logs. Let me know if there is more info you need.

We are running version 1.0.8 / commit b4f3a41. Rest of the jambonz stack is v0.8.5-22.

dg-afternoon-issue.txt

AWS Transcribe - Enable "Custom language model" feature

Custom language models are available in the AWS API but not in jambonz. We currently use a custom vocabulary, but its limits are a problem for us: the maximum vocabulary size is 50 KB.

A custom language model would allow us to train on gigabytes of data targeting our specific vertical.

 "recognizer" : {
   "vendor" : "aws",
   "language" : "en-GB",
   "vocabularyName" : "custom_verbs",
   ...
   "languageModel" : "custom_language"
}

bug: jb_transcribe_glue connection dropped from far end

If a custom transcribe Node application is hosted behind nginx, and nginx has its idle session timeout set to 60 seconds, the mod_jambonz_transcribe connection is dropped after 60 seconds. Either ping/pong or reconnection needs to be implemented.

This happens when the call is kept on mute or on hold.

2024-07-28 06:26:56.455481 99.07% [DEBUG] jb_transcribe_glue.cpp:198 connection dropped from far end
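
One way the ping/pong suggestion could be approached is with a small idle timer that tells the sender when a keepalive is due. This is only a sketch: KeepAliveTimer is a hypothetical helper, not part of mod_jambonz_transcribe, and the 30-second interval used in the usage note is an assumption chosen to stay under the 60-second nginx idle timeout described above.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper (not from the module): tracks the last time anything
// was sent on the websocket and reports when a keepalive ping is due.
class KeepAliveTimer {
public:
  using Clock = std::chrono::steady_clock;

  KeepAliveTimer(std::chrono::seconds interval, Clock::time_point now)
      : m_interval(interval), m_last_activity(now) {
    assert(interval.count() > 0);
  }

  // Call whenever an audio frame or a ping has been sent.
  void on_activity(Clock::time_point now) { m_last_activity = now; }

  // True once the connection has been idle for at least the interval.
  bool should_ping(Clock::time_point now) const {
    return now - m_last_activity >= m_interval;
  }

private:
  std::chrono::seconds m_interval;
  Clock::time_point m_last_activity;
};
```

A caller would construct the timer with, say, std::chrono::seconds(30), check should_ping() periodically, send a websocket ping when it returns true, and then call on_activity().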

mod_audio_fork supports play audio stream

Add a new feature to mod_audio_fork: support for bi-directional audio streaming. Currently the far end can send us base64-encoded audio, which we save to a file and then play. We should keep that feature, but add a new option where the far end sends us linear 16-bit PCM audio and we stream it out immediately, using the same SBF_WRITE_REPLACE approach we use in mod_dub.
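
The streaming option could be built around a buffer that accepts PCM chunks of any size from the network side and hands out fixed-size frames to the media side. The sketch below assumes that design. PlayoutBuffer is a hypothetical class, not code from mod_audio_fork or mod_dub, and it pads with silence on underrun, which is one possible policy among several.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>

// Hypothetical playout buffer for linear 16-bit PCM samples.
class PlayoutBuffer {
public:
  // Called from the network thread with a chunk of any length.
  void push(const int16_t* samples, std::size_t count) {
    assert(samples != nullptr || count == 0);
    std::lock_guard<std::mutex> lock(m_mutex);
    m_samples.insert(m_samples.end(), samples, samples + count);
  }

  // Called from the media thread once per frame. Always fills `out` with
  // `frame_samples` samples, padding with silence when not enough audio has
  // arrived. Returns the number of real (non-padding) samples written.
  std::size_t pop_frame(int16_t* out, std::size_t frame_samples) {
    std::lock_guard<std::mutex> lock(m_mutex);
    const std::size_t n = std::min(frame_samples, m_samples.size());
    const auto last = m_samples.begin() + static_cast<std::ptrdiff_t>(n);
    std::copy(m_samples.begin(), last, out);
    m_samples.erase(m_samples.begin(), last);
    std::fill(out + n, out + frame_samples, int16_t{0});
    return n;
  }

private:
  std::mutex m_mutex;
  std::deque<int16_t> m_samples;
};
```

In this sketch the frame returned by pop_frame() is what would be handed to the write-replace path on each media callback; how that hand-off is wired into FreeSWITCH is left out.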

mod_azure_tts randomly having issues

Issue 1
0cc94686-ab60-406e-b78c-7c64ddda41ea 2024-06-05 06:15:58.137289 71.90% [DEBUG] switch_ivr_play_say.c:2826 Speaking text: {session-uuid=0cc94686-ab60-406e-b78c-7c64ddda41ea,api_key=redacted,language=en-US,vendor=microsoft,voice=en-US-AriaNeural,write_cache_file=1,region=eastus}One more question before we schedule your service.
2024-06-05 06:15:58.137289 71.90% [DEBUG] azure_glue.cpp:175 azure_speech_feed_tts SynthesisStarted
2024-06-05 06:15:58.137289 71.90% [DEBUG] azure_glue.cpp:43 mod_azure_tts: Exception in start_synthesis Exception with an error code: 0x21 (SPXERR_INVALID_HANDLE)

Issue 2
0cc94686-ab60-406e-b78c-7c64ddda41ea 2024-06-05 06:15:57.957290 71.90% [DEBUG] switch_ivr_play_say.c:2826 Speaking text: {session-uuid=0cc94686-ab60-406e-b78c-7c64ddda41ea,api_key=redacted,language=en-US,vendor=microsoft,voice=en-US-AriaNeural,write_cache_file=1,region=eastus2}One more question before we schedule your service.
2024-06-05 06:15:57.957290 71.90% [DEBUG] azure_glue.cpp:22 start_synthesis calling
2024-06-05 06:15:57.957290 71.90% [DEBUG] azure_glue.cpp:175 azure_speech_feed_tts SynthesisStarted
2024-06-05 06:15:58.037293 71.90% [ERR] azure_glue.cpp:35 Error synthesizing text 2 with error string: Connection was closed by the remote host. Error code: 1007. Error details: USP state: TurnStarted. Received audio size: 0 bytes..

mod_google_transcribe does not signal grpc internal error

The grpc_read_thread function in mod_google_transcribe/google_glue_v1.cpp and mod_google_transcribe/google_glue_v2.cpp can exit without the FreeSWITCH client being aware of it. We saw an example of this when an internal gRPC error occurred. This is what the logs looked like:

2024-03-14 21:34:20.675689 99.83% [DEBUG] google_glue_v2.cpp:281 grpc_read_thread: got 1 responses
2024-03-14 21:35:41.755685 99.87% [DEBUG] google_glue_v2.cpp:296 grpc_read_thread: finish() status Internal error encountered. (13)

The nature of the error is still unclear and we have not been able to reproduce it. However, the problem can occur with any status code returned from the streamer->finish() call in grpc_read_thread other than 11 in google_glue_v1.cpp and 10 in google_glue_v2.cpp. In the case of the internal error shown in the log output above, the status code was 13, so no event was fired and the client received no indication that grpc_read_thread was no longer running. The only indication of a problem was in the FreeSWITCH logs.
The client should receive some kind of event in a case like this. Although the example above is rare and difficult to reproduce, the same thing could happen when other error status codes are returned from streamer->finish().
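
The behavior requested above could look roughly like the following, where every non-OK status produces some event instead of only the one code that is already handled. This is a sketch under assumptions: report_finish_status, EventSink, and both event names are hypothetical and do not come from mod_google_transcribe. The status is passed in as a plain integer so the example does not depend on gRPC headers.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical event names; the module's real event names may differ.
static const char* const EVENT_HANDLED = "example::handled_status";
static const char* const EVENT_ERROR = "example::error";

// Stand-in for whatever delivers an event to the FreeSWITCH client.
using EventSink =
    std::function<void(const std::string& event, const std::string& body)>;

// Reports the result of the finish() call. `handled_code` is the single
// status the glue code already reacts to (11 in v1, 10 in v2, per the report
// above). Returns true if an event was sent.
bool report_finish_status(int code, const std::string& message,
                          int handled_code, const EventSink& sink) {
  assert(static_cast<bool>(sink));
  if (code == 0) return false;  // status 0 is OK: nothing to report
  const std::string body = std::to_string(code) + ": " + message;
  sink(code == handled_code ? EVENT_HANDLED : EVENT_ERROR, body);
  return true;
}
```

With this shape, the status 13 from the log above would reach the client as an error event carrying the code and message, instead of appearing only in the FreeSWITCH logs.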

Microsoft Speech detection no_speech_detected even on some transcript generated

We noticed that for short utterances such as "yes", "no", or "yeah", Microsoft returns no_speech_detected. As a result, jambonz does not send any transcript to the bot and the gather times out, even though the NBest list contains candidate transcripts.

2024-02-07 01:22:39.525528 84.63% [DEBUG] azure_transcribe_glue.cpp:177 GStreamer onRecognitionEvent reason 0 results: {"Id":"b5d353ed9b754f8aa5ef4936dbaa4c29","RecognitionStatus":"InitialSilenceTimeout","Offset":0,"Duration":158400000,"Channel":0,"DisplayText":"","NBest":[{"Confidence":0.0,"Lexical":"","ITN":"","MaskedITN":"","Display":""},{"Confidence":0.050563347,"Lexical":"no","ITN":"no","MaskedITN":"no","Display":"no","Words":[{"Word":"no","Offset":157300000,"Duration":800000,"Confidence":0.050563347}]},{"Confidence":0.05015686,"Lexical":"yes","ITN":"yes","MaskedITN":"yes","Display":"yes","Words":[{"Word":"yes","Offset":157300000,"Duration":1100000,"Confidence":0.05015686}]},{"Confidence":0.05075142,"Lexical":"please","ITN":"please","MaskedITN":"please","Display":"please","Words":[{"Word":"please","Offset":156100000,"Duration":2300000,"Confidence":0.05075142}]},{"Confidence":0.050710723,"Lexical":"to","ITN":"to","MaskedITN":"to","Display":"to","Words":[{"Word":"to","Offset":157300000,"Duration":800000,"Confidence":0.050710723}]}]},

2024-02-07 01:22:39.525528 84.63% [INFO] mod_azure_transcribe.c:20 responseHandler event azure_transcribe::no_speech_detected, body {"Id":"b5d353ed9b754f8aa5ef4936dbaa4c29","RecognitionStatus":"InitialSilenceTimeout","Offset":0,"Duration":158400000,"Channel":0,"DisplayText":"","NBest":[{"Confidence":0.0,"Lexical":"","ITN":"","MaskedITN":"","Display":""},{"Confidence":0.050563347,"Lexical":"no","ITN":"no","MaskedITN":"no","Display":"no","Words":[{"Word":"no","Offset":157300000,"Duration":800000,"Confidence":0.050563347}]},{"Confidence":0.05015686,"Lexical":"yes","ITN":"yes","MaskedITN":"yes","Display":"yes","Words":[{"Word":"yes","Offset":157300000,"Duration":1100000,"Confidence":0.05015686}]},{"Confidence":0.05075142,"Lexical":"please","ITN":"please","MaskedITN":"please","Display":"please","Words":[{"Word":"please","Offset":156100000,"Duration":2300000,"Confidence":0.05075142}]},{"Confidence":0.050710723,"Lexical":"to","ITN":"to","MaskedITN":"to","Display":"to","Words":[{"Word":"to","Offset":157300000,"Duration":800000,"Confidence":0.050710723}]}]}.

2024-02-07 01:22:39.525528 84.63% [DEBUG] mod_azure_transcribe.c:26 responseHandler returning event azure_transcribe::no_speech_detected, from finished recognition session
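
If the NBest list were used as a fallback when RecognitionStatus is InitialSilenceTimeout, the selection step might look like the sketch below. The Alternative struct and best_alternative function are hypothetical and assume the JSON has already been parsed. Note that every candidate in the logged event has a confidence near 0.05, so a minimum-confidence threshold (an assumed parameter here) decides whether anything is returned at all.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, already-parsed view of one NBest entry.
struct Alternative {
  double confidence;
  std::string display;
};

// Returns the highest-confidence alternative that has non-empty text and
// meets `min_confidence`, or nullptr if none qualifies.
const Alternative* best_alternative(const std::vector<Alternative>& nbest,
                                    double min_confidence) {
  assert(min_confidence >= 0.0);
  const Alternative* best = nullptr;
  for (const Alternative& alt : nbest) {
    if (alt.display.empty() || alt.confidence < min_confidence) continue;
    if (best == nullptr || alt.confidence > best->confidence) best = &alt;
  }
  return best;
}
```

Applied to the event above with a threshold of 0, this picks "please" (0.05075142), which is barely ahead of "to", "no", and "yes". With any threshold above roughly 0.051 it returns nothing, so confidence alone may not be a reliable basis for recovering these short utterances.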

Unable to install mod_audio_fork

Freeswitch Version: 1.10.12

When I run make mod_audio_fork-install, I get the error below. I'm not sure what can be done. Please help.

libtool: warning: relinking 'mod_audio_fork.la'
/usr/local/freeswitch/libtool: line 1733: ../build/config/install-sh: No such file or directory
make[3]: *** [../../../../build/modmake.rules:144: /usr/local/freeswitch/mod/mod_audio_fork.la] Error 127
make[3]: Leaving directory '/usr/local/freeswitch/src/mod/applications/mod_audio_fork'
make[2]: *** [../../../../build/modmake.rules:95: install] Error 1
make[2]: Leaving directory '/usr/local/freeswitch/src/mod/applications/mod_audio_fork'
make[1]: *** [Makefile:728: mod_audio_fork-install] Error 1
make[1]: Leaving directory '/usr/local/freeswitch/src/mod'
make: *** [Makefile:4328: mod_audio_fork-install] Error 2

Error Loading Module mod_audio_fork

FreeSWITCH Version: 1.10.5
Description:
I am encountering an error when trying to load the mod_audio_fork module. The error message is as follows:

[CRIT] switch_loadable_module.c:1785 Error Loading module /usr/local/freeswitch/mod/mod_audio_fork.so
/usr/local/freeswitch/mod/mod_audio_fork.so: undefined symbol: fork_session_stop_play

Expected Behavior:
The mod_audio_fork module should load without any errors.

Actual Behavior:
An error occurs, indicating an undefined symbol: fork_session_stop_play.

Additional Information:
Any assistance or guidance on resolving this issue would be greatly appreciated.

Thank you.

Google speech to text en-IN does not work through transcribe

When we set en-IN as the language for Google ASR, it does not return transcriptions; en-US works. Our assumption is that the enhanced model is set to true by default in transcribe. This can be tested by using transcribe with the language set to en-IN.

Module not loading elevenlabs_tts on Redhat

2024-03-08 11:19:37.103359 99.00% [INFO] switch_stun.c:896 External ip address detected using STUN: 44.198.66.145
2024-03-08 11:19:37.163356 99.00% [INFO] switch_stun.c:896 External ip address detected using STUN: 44.198.66.145
2024-03-08 11:19:37.203356 99.00% [INFO] switch_time.c:1433 Timezone reloaded 1750 definitions
2024-03-08 11:19:37.203356 99.00% [CRIT] switch_loadable_module.c:1754 Error Loading module /usr/local/freeswitch/mod/mod_elevenlabs_tts.so
/usr/local/freeswitch/mod/mod_elevenlabs_tts.so: undefined symbol: _ZN5boost6system15system_categoryEv
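
The undefined symbol in the log is a mangled C++ name, and decoding it shows which library is missing. The sketch below uses abi::__cxa_demangle from the GCC/Clang C++ runtime. The demangle wrapper is a hypothetical helper written for this example.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <string>

// Returns the human-readable form of an Itanium-ABI mangled symbol, or the
// input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
  assert(mangled != nullptr);
  int status = 0;
  char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || out == nullptr) return mangled;
  std::string result(out);
  std::free(out);  // the buffer is allocated with malloc by the runtime
  return result;
}
```

For the symbol above, demangle("_ZN5boost6system15system_categoryEv") yields boost::system::system_category(). That suggests, though the log alone does not confirm it, that the module was built against Boost.System but the matching shared library is not being linked or found at load time.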

Memory allocated using "malloc" is released using "delete []" in mod_audio_fork

I'm grateful for this module.

The method of allocating memory for m_recv_buf in mod_audio_fork's AudioPipe is "malloc", but the memory is released using "delete []" in AudioPipe destructor.

ap->m_recv_buf = (uint8_t*) malloc(ap->m_recv_buf_len);

if (m_recv_buf) delete [] m_recv_buf;
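
Releasing malloc'd memory with delete [] is undefined behavior in C++, so the two calls need to come from the same allocation family. The sketch below shows two consistent options. Both classes are hypothetical stand-ins for the buffer handling in AudioPipe, not the module's actual code.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Option 1: keep malloc, and pair it with free.
class RecvBufferMalloc {
public:
  explicit RecvBufferMalloc(std::size_t len)
      : m_recv_buf(static_cast<uint8_t*>(std::malloc(len))),
        m_recv_buf_len(m_recv_buf ? len : 0) {}
  ~RecvBufferMalloc() { std::free(m_recv_buf); }  // free(nullptr) is a no-op

  // Non-copyable: a copy would free the same pointer twice.
  RecvBufferMalloc(const RecvBufferMalloc&) = delete;
  RecvBufferMalloc& operator=(const RecvBufferMalloc&) = delete;

  uint8_t* data() { return m_recv_buf; }
  std::size_t size() const { return m_recv_buf_len; }

private:
  uint8_t* m_recv_buf;
  std::size_t m_recv_buf_len;
};

// Option 2: let std::vector own the storage, so no manual release is needed.
class RecvBufferVector {
public:
  explicit RecvBufferVector(std::size_t len) : m_recv_buf(len) {}

  uint8_t* data() { return m_recv_buf.data(); }
  std::size_t size() const { return m_recv_buf.size(); }

private:
  std::vector<uint8_t> m_recv_buf;
};
```

The first option is the smaller change to the lines quoted above. The second removes the need for a hand-written destructor.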

Deepgram module latency issue

Hi. First off, thank you for building and maintaining this amazing package. Our company delivers real-time voice capabilities to millions of customers of large enterprises in India, and we want to use the Deepgram module in production.
We use Lua scripting and have integrated the module with our FreeSWITCH setup successfully. However, we are seeing a latency of at least 3 seconds between when the user stops speaking and when FreeSWITCH receives the response from the Deepgram servers. Additionally, in the initial phase of the call we see a number of "discarding empty deepgram transcript" messages on the console. Logs are attached below. The speaker started speaking at 20:31:40 and spoke for approximately 2 seconds. The server sent the transcription at around 20:31:46, which means a latency of about 4 seconds. We would like to reduce the latency to well below 1 second. Could you please help us with this?

2024-03-07 20:31:40.443757 99.93% [INFO] mod_deepgram_transcribe.c:154 start transcribing hi interim
2024-03-07 20:31:40.443757 99.93% [DEBUG] dg_transcribe_glue.cpp:336 path: /v1/listen?tier=enhanced&model=general&language=hi&smart_format=true&no_delay=true&punctuate=true&interim_results=true&encoding=linear16&sample_rate=8000
2024-03-07 20:31:40.443757 99.93% [DEBUG] dg_transcribe_glue.cpp:377 (64) no resampling needed for this call
2024-03-07 20:31:40.443757 99.93% [DEBUG] dg_transcribe_glue.cpp:380 (64) fork_data_init
2024-03-07 20:31:40.443757 99.93% [DEBUG] dg_transcribe_glue.cpp:452 connecting now
2024-03-07 20:31:40.443757 99.93% [DEBUG] dg_transcribe_glue.cpp:454 connection in progress
2024-03-07 20:31:40.443757 99.93% [DEBUG] mod_deepgram_transcribe.c:41 Got SWITCH_ABC_TYPE_INIT.
2024-03-07 20:31:40.443757 99.93% [DEBUG] switch_core_media_bug.c:976 Attaching BUG to sofia/tata-profile/09429515176
2024-03-07 20:31:40.443757 99.93% [DEBUG] mod_deepgram_transcribe.c:98 added media bug for dg transcribe
2024-03-07 20:31:40.443757 99.93% [DEBUG] switch_cpp.cpp:86 bound to CUSTOM deepgram_transcribe::transcription
2024-03-07 20:31:40.463756 99.93% [DEBUG] switch_core_io.c:448 Setting BUG Codec PCMU:0
2024-03-07 20:31:41.783759 99.93% [INFO] dg_transcribe_glue.cpp:279 connection successful
2024-03-07 20:31:43.203760 99.93% [DEBUG] dg_transcribe_glue.cpp:305 discarding empty deepgram transcript
2024-03-07 20:31:44.103758 99.93% [DEBUG] dg_transcribe_glue.cpp:305 discarding empty deepgram transcript
2024-03-07 20:31:44.303757 99.93% [DEBUG] dg_transcribe_glue.cpp:305 discarding empty deepgram transcript
2024-03-07 20:31:45.283757 99.93% [DEBUG] dg_transcribe_glue.cpp:305 discarding empty deepgram transcript
2024-03-07 20:31:46.463765 99.90% [DEBUG] dg_transcribe_glue.cpp:309 deepgram message: {"type":"Results","channel_index":[0,1],"duration":2.0189373,"start":2.26,"is_final":false,"speech_final":false,"channel":{"alternatives":[{"transcript":"हां बोलो","confidence":0.9941406,"words":[{"word":"हां","start":2.26,"end":2.76,"confidence":0.84228516,"punctuated_word":"हां"},{"word":"बोलो","start":3.2694688,"end":3.7694688,"confidence":0.9941406,"punctuated_word":"बोलो"}]}]},"metadata":{"request_id":"b0a43754-4336-4821-81ec-2eeda6d5bb4d","model_info":{"name":"general-enhanced","version":"1982-11-21.24709","arch":"polaris"},"model_uuid":"83d5f3ce-9a16-4eb9-affe-83bd9ad10f69"}}
2024-03-07 20:31:47.443757 99.90% [DEBUG] dg_transcribe_glue.cpp:309 deepgram message: {"type":"Results","channel_index":[0,1],"duration":3.0189373,"start":2.26,"is_final":false,"speech_final":false,"channel":{"alternatives":[{"transcript":"हां बोलो क्या काम है?","confidence":0.89453125,"words":[{"word":"हां","start":3.7792401,"end":3.97914,"confidence":0.78222656,"punctuated_word":"हां"},{"word":"बोलो","start":3.97914,"end":4.33896,"confidence":0.9980469,"punctuated_word":"बोलो"},{"word":"क्या","start":4.33896,"end":4.5388603,"confidence":0.8520508,"punctuated_word":"क्या"},{"word":"काम","start":4.5388603,"end":4.81872,"confidence":0.9995117,"punctuated_word":"काम"},{"word":"है","start":4.81872,"end":5.2789373,"confidence":0.89453125,"punctuated_word":"है?"}]}]},"metadata":{"request_id":"b0a43754-4336-4821-81ec-2eeda6d5bb4d","model_info":{"name":"general-enhanced","version":"1982-11-21.24709","arch":"polaris"},"model_uuid":"83d5f3ce-9a16-4eb9-affe-83bd9ad10f69"}}
2024-03-07 20:31:47.623757 99.90% [DEBUG] dg_transcribe_glue.cpp:309 deepgram message: {"type":"Results","channel_index":[0,1],"duration":3.2500002,"start":2.26,"is_final":true,"speech_final":true,"channel":{"alternatives":[{"transcript":"हां बोलो क्या काम है?","confidence":0.9238281,"words":[{"word":"हां","start":3.7792401,"end":3.97914,"confidence":0.78222656,"punctuated_word":"हां"},{"word":"बोलो","start":3.97914,"end":4.33896,"confidence":0.99853516,"punctuated_word":"बोलो"},{"word":"क्या","start":4.33896,"end":4.5388603,"confidence":0.8540039,"punctuated_word":"क्या"},{"word":"काम","start":4.5388603,"end":4.81872,"confidence":0.9995117,"punctuated_word":"काम"},{"word":"है","start":4.81872,"end":5.31872,"confidence":0.9238281,"punctuated_word":"है?"}]}]},"metadata":{"request_id":"b0a43754-4336-4821-81ec-2eeda6d5bb4d","model_info":{"name":"general-enhanced","version":"1982-11-21.24709","arch":"polaris"},"model_uuid":"83d5f3ce-9a16-4eb9-affe-83bd9ad10f69"}}
