
cognitive-services-speech-sdk's Introduction

page_type: sample
languages: cpp, csharp, java, javascript, nodejs, objc, python, swift
name: Microsoft Cognitive Services Speech SDK Samples
description: Learn how to use the Microsoft Cognitive Services Speech SDK to add speech-enabled features to your apps.
products: azure, azure-cognitive-services

Sample Repository for the Microsoft Cognitive Services Speech SDK

This project hosts the samples for the Microsoft Cognitive Services Speech SDK. To find out more about the Microsoft Cognitive Services Speech SDK itself, please visit the SDK documentation site.

News

Please check here for release notes and older releases.

Features

This repository hosts samples that help you get started with several features of the SDK. In addition, more complex scenarios are included to give you a head start on using speech technology in your application.

We tested the samples with the latest released version of the SDK on Windows 10, Linux (on supported Linux distributions and target architectures), Android devices (API 23: Android 6.0 Marshmallow or higher), Mac x64 (OS version 10.14 or higher), Mac M1 arm64 (OS version 11.0 or higher), and iOS 11.4 devices.

Getting Started

The SDK documentation has extensive sections about getting started, setting up the SDK, and acquiring the required subscription keys. You will need subscription keys to run the samples on your machines; you should therefore follow the instructions on these pages before continuing.

Get the samples

  • The easiest way to use these samples without using Git is to download the current version as a ZIP file.

    • On Windows, before you unzip the archive, right-click it, select Properties, and then select Unblock.
    • Be sure to unzip the entire archive, and not just individual samples.
  • Clone this sample repository using a Git client.

Build and run the samples

Note: the samples make use of the Microsoft Cognitive Services Speech SDK. By downloading the Microsoft Cognitive Services Speech SDK, you acknowledge its license; see the Speech SDK license agreement.

Please see the description of each individual sample for instructions on how to build and run it.

Related GitHub repositories

Speech recognition quickstarts

The following quickstarts demonstrate how to perform one-shot speech recognition using a microphone. If you want to build them from scratch, please follow the quickstart or basics articles on our documentation page.

| Quickstart | Platform | Description |
| --- | --- | --- |
| Quickstart C++ for Linux | Linux | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart C++ for Windows | Windows | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart C++ for macOS | macOS | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart C# .NET for Windows | Windows | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart C# .NET Core | Windows, Linux, macOS | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart C# UWP for Windows | Windows | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart C# Unity (Windows or Android) | Windows, Android | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart for Android | Android | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart Java JRE | Windows, Linux, macOS | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart JavaScript | Web | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart Node.js | Node.js | Demonstrates one-shot speech recognition from a file. |
| Quickstart Python | Windows, Linux, macOS | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart Objective-C iOS | iOS | Demonstrates one-shot speech recognition from a file with recorded speech. |
| Quickstart Swift iOS | iOS | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart Objective-C macOS | macOS | Demonstrates one-shot speech recognition from a microphone. |
| Quickstart Swift macOS | macOS | Demonstrates one-shot speech recognition from a microphone. |
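
At their core, the microphone-based quickstarts above all follow the same pattern. A minimal C# sketch, with placeholder subscription key and region:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class RecognitionQuickstartSketch
{
    static async Task Main()
    {
        // Placeholders: use your own subscription key and service region (e.g. "westus").
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // With no AudioConfig, the recognizer listens on the default microphone.
        using var recognizer = new SpeechRecognizer(config);

        Console.WriteLine("Say something...");
        var result = await recognizer.RecognizeOnceAsync();  // one-shot recognition

        if (result.Reason == ResultReason.RecognizedSpeech)
            Console.WriteLine($"RECOGNIZED: {result.Text}");
        else
            Console.WriteLine($"Result: {result.Reason}");
    }
}
```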

Speech translation quickstarts

The following quickstarts demonstrate how to perform one-shot speech translation using a microphone. If you want to build them from scratch, please follow the quickstart or basics articles on our documentation page.

| Quickstart | Platform | Description |
| --- | --- | --- |
| Quickstart C++ for Windows | Windows | Demonstrates one-shot speech translation/transcription from a microphone. |
| Quickstart C# .NET Framework for Windows | Windows | Demonstrates one-shot speech translation/transcription from a microphone. |
| Quickstart C# .NET Core | Windows, Linux, macOS | Demonstrates one-shot speech translation/transcription from a microphone. |
| Quickstart C# UWP for Windows | Windows | Demonstrates one-shot speech translation/transcription from a microphone. |
| Quickstart Java JRE | Windows, Linux, macOS | Demonstrates one-shot speech translation/transcription from a microphone. |
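
The translation quickstarts follow the same shape, adding a translation config and one or more target languages. A minimal C# sketch, again with placeholder key and region:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Translation;

class TranslationQuickstartSketch
{
    static async Task Main()
    {
        // Placeholders: your own key and region.
        var config = SpeechTranslationConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        config.SpeechRecognitionLanguage = "en-US";
        config.AddTargetLanguage("de");  // one or more target languages

        using var recognizer = new TranslationRecognizer(config);
        Console.WriteLine("Say something...");
        var result = await recognizer.RecognizeOnceAsync();

        if (result.Reason == ResultReason.TranslatedSpeech)
        {
            Console.WriteLine($"RECOGNIZED: {result.Text}");
            foreach (var entry in result.Translations)
                Console.WriteLine($"TRANSLATED [{entry.Key}]: {entry.Value}");
        }
    }
}
```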

Speech synthesis quickstarts

The following quickstarts demonstrate how to perform one-shot speech synthesis to a speaker. If you want to build them from scratch, please follow the quickstart or basics articles on our documentation page.

| Quickstart | Platform | Description |
| --- | --- | --- |
| Quickstart C++ for Linux | Linux | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart C++ for Windows | Windows | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart C++ for macOS | macOS | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart C# .NET for Windows | Windows | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart C# UWP for Windows | Windows | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart C# .NET Core | Windows, Linux | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart for C# Unity (Windows or Android) | Windows, Android | Demonstrates one-shot speech synthesis to a synthesis result and then rendering to the default speaker. |
| Quickstart for Android | Android | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart Java JRE | Windows, Linux, macOS | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart Python | Windows, Linux, macOS | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart Objective-C iOS | iOS | Demonstrates one-shot speech synthesis to a synthesis result and then rendering to the default speaker. |
| Quickstart Swift iOS | iOS | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart Objective-C macOS | macOS | Demonstrates one-shot speech synthesis to the default speaker. |
| Quickstart Swift macOS | macOS | Demonstrates one-shot speech synthesis to the default speaker. |
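
The synthesis quickstarts reverse the direction: text goes in, audio comes out on the default speaker. A minimal C# sketch with placeholder key and region (SpeechSynthesizer requires SDK 1.4.0 or later):

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class SynthesisQuickstartSketch
{
    static async Task Main()
    {
        // Placeholders: your own key and region.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // With no AudioConfig, audio is rendered to the default speaker.
        using var synthesizer = new SpeechSynthesizer(config);
        var result = await synthesizer.SpeakTextAsync("Hello, world!");

        if (result.Reason == ResultReason.SynthesizingAudioCompleted)
            Console.WriteLine("Speech synthesized to the default speaker.");
    }
}
```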

Voice Assistant quickstarts

The following quickstarts demonstrate how to create a custom Voice Assistant. The applications will connect to a previously authored bot configured to use the Direct Line Speech channel, send a voice request, and return a voice response activity (if configured). If you want to build these quickstarts from scratch, please follow the quickstart or basics articles on our documentation page.

See also Azure-Samples/Cognitive-Services-Voice-Assistant for full Voice Assistant samples and tools.


| Quickstart | Platform | Description |
| --- | --- | --- |
| Quickstart Java JRE | Windows, Linux, macOS | Demonstrates speech recognition through the DialogServiceConnector and receiving activity responses. |
| Quickstart C# UWP for Windows | Windows | Demonstrates speech recognition through the DialogServiceConnector and receiving activity responses. |
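
In outline, these quickstarts drive a bot over Direct Line Speech as in the hedged C# sketch below. BotFrameworkConfig assumes a recent 1.x SDK (earlier releases used a different config type); key and region are placeholders:

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using Microsoft.CognitiveServices.Speech.Dialog;

class VoiceAssistantSketch
{
    static async Task Main()
    {
        // Placeholders: the Speech resource key/region of a bot registered
        // with the Direct Line Speech channel.
        var config = BotFrameworkConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        using var audioConfig = AudioConfig.FromDefaultMicrophoneInput();
        using var connector = new DialogServiceConnector(config, audioConfig);

        connector.Recognized += (s, e) =>
            Console.WriteLine($"RECOGNIZED: {e.Result.Text}");
        connector.ActivityReceived += (s, e) =>
            Console.WriteLine($"ACTIVITY: {e.Activity}");  // bot response as JSON

        await connector.ConnectAsync();
        await connector.ListenOnceAsync();  // one voice request/response turn

        // Sketch only: crude wait for the bot's response activity to arrive.
        await Task.Delay(TimeSpan.FromSeconds(5));
    }
}
```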

Samples

The following samples demonstrate additional capabilities of the Speech SDK, such as additional modes of speech recognition, as well as intent recognition and translation. Voice Assistant samples can be found in a separate GitHub repo.

| Sample | Platform | Description |
| --- | --- | --- |
| C++ Console app for Windows | Windows | Demonstrates speech recognition, speech synthesis, intent recognition, conversation transcription and translation |
| C++ Speech Recognition from MP3/Opus file (Linux only) | Linux | Demonstrates speech recognition from an MP3/Opus file |
| C# Console app for .NET Framework on Windows | Windows | Demonstrates speech recognition, speech synthesis, intent recognition, and translation |
| C# Console app for .NET Core (Windows or Linux) | Windows, Linux, macOS | Demonstrates speech recognition, speech synthesis, intent recognition, and translation |
| Java Console app for JRE | Windows, Linux, macOS | Demonstrates speech recognition, speech synthesis, intent recognition, and translation |
| Python Console app | Windows, Linux, macOS | Demonstrates speech recognition, speech synthesis, intent recognition, and translation |
| Speech-to-text UWP sample | Windows | Demonstrates speech recognition |
| Text-to-speech UWP sample | Windows | Demonstrates speech synthesis |
| Speech recognition sample for Android | Android | Demonstrates speech and intent recognition |
| Speech recognition, synthesis, and translation sample for the browser, using JavaScript | Web | Demonstrates speech recognition, intent recognition, and translation |
| Speech recognition and translation sample using JavaScript and Node.js | Node.js | Demonstrates speech recognition, intent recognition, and translation |
| Speech recognition sample for iOS using a connection object | iOS | Demonstrates speech recognition |
| Extended speech recognition sample for iOS | iOS | Demonstrates speech recognition using streams etc. |
| Speech synthesis sample for iOS | iOS | Demonstrates speech synthesis using streams etc. |
| C# UWP DialogServiceConnector sample for Windows | Windows | Demonstrates speech recognition through the DialogServiceConnector and receiving activity responses. |
| C# Unity sample for Windows or Android | Windows, Android | Demonstrates speech recognition, intent recognition, and translation for Unity |
| C# Unity SpeechBotConnector sample for Windows or Android | Windows, Android | Demonstrates speech recognition through the SpeechBotConnector and receiving activity responses. |
| C#, C++ and Java DialogServiceConnector samples | Windows, Linux, Android | Additional samples and tools to help you build an application that uses the Speech SDK's DialogServiceConnector for voice communication with your Bot-Framework Bot or Custom Command web application. |

Samples for using the Speech Service REST API (no Speech SDK installation required):

| Sample | Description |
| --- | --- |
| Batch transcription | Demonstrates usage of batch transcription from different programming languages |
| Batch synthesis | Demonstrates usage of batch synthesis from different programming languages |
| Custom voice | Demonstrates usage of custom voice from different programming languages |
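
As a rough illustration of the REST pattern these samples use, creating a batch transcription is a single authenticated POST followed by polling. A hedged C# sketch; the v3.0 endpoint shape, request body, and audio URL below are assumptions drawn from the service documentation and may differ by API version:

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;

class BatchTranscriptionSketch
{
    static async Task Main()
    {
        using var http = new HttpClient();
        http.DefaultRequestHeaders.Add("Ocp-Apim-Subscription-Key", "YourSubscriptionKey");

        // Assumed v3.0 endpoint; the region must match your Speech resource.
        var url = "https://westus.api.cognitive.microsoft.com/speechtotext/v3.0/transcriptions";
        var body = @"{
            ""contentUrls"": [""https://example.com/audio.wav""],
            ""locale"": ""en-US"",
            ""displayName"": ""My batch transcription""
        }";

        var response = await http.PostAsync(url,
            new StringContent(body, Encoding.UTF8, "application/json"));

        // On success the service returns 201 Created with the transcription's
        // URL in the Location header; poll that URL until the job completes.
        Console.WriteLine($"{(int)response.StatusCode}: {response.Headers.Location}");
    }
}
```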

Tools

| Tool | Platform | Description |
| --- | --- | --- |
| Enumerate audio devices | C++, Windows | Shows how to get the Device ID of all connected microphones and loudspeakers. The Device ID is required if you want to listen via a non-default microphone (speech recognition) or play to a non-default loudspeaker (text-to-speech) using the Speech SDK. |
| Enumerate audio devices | C# .NET Framework, Windows | Same as above. |
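
The gist of these tools, sketched in C#: enumerate the capture endpoints, then hand the chosen ID to the SDK. The sketch below uses the NAudio NuGet package for enumeration, which is an assumption for brevity; the actual tools call the Windows Core Audio APIs directly:

```csharp
using System;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;
using NAudio.CoreAudioApi;  // NuGet: NAudio (assumption; not used by the actual tools)

class EnumerateDevicesSketch
{
    static void Main()
    {
        var enumerator = new MMDeviceEnumerator();
        foreach (var device in enumerator.EnumerateAudioEndPoints(DataFlow.Capture, DeviceState.Active))
        {
            // The ID printed here is what AudioConfig.FromMicrophoneInput expects.
            Console.WriteLine($"{device.FriendlyName}: {device.ID}");
        }

        // Example: bind a recognizer to a specific (non-default) microphone.
        // Placeholders: your own key, region, and a device ID from the list above.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        using var audioConfig = AudioConfig.FromMicrophoneInput("{deviceIdFromListAbove}");
        using var recognizer = new SpeechRecognizer(config, audioConfig);
    }
}
```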

Sample data for Custom Speech

Resources

cognitive-services-speech-sdk's People

Contributors

bexxx, brandom-msft, brianmouncer, chlandsi, chschrae, dargilco, dependabot[bot], eric-urban, glecaros, glharper, henryvandervegte, jinshan1979, l17813182030, mahilleb-msft, microsoftopensource, oscholz, pankopon, panosperiorellis, rhurey, richardsunms, szhaomsft, v-demjoh, v-jaswel, v-wangtong, wangkenpu, wiazur, wolfma61, yinhew, yulin-li, zhouwangzw


cognitive-services-speech-sdk's Issues

Android speech result returns canceled recognized text: <>

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Update the subscription key and region, then press the start button.

Any log messages given by the failure

N/A

Expected/desired behavior

Expect voice recognition to work.

OS and Version?

Android 7.0 API 24

Versions

0.6

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Could not figure out the usage of continuous speech recognition

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run continuous speech recognition with StartContinuousRecognitionAsync(). I know that when a stream is bound to the recognizer, the read() method returning 0 means the stream is over. But how can I know that the recognition result received is complete too? Could there be multiple final results during the recognition process? Can I identify which one is the real last result?

Also, I'm not very clear about the OnSessionEvent behaviour: which actions trigger the session start and stop events? Could anyone give a solid example? Thanks!
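
For reference, under the current C# SDK API (which differs from the 0.x factory API this issue was filed against), the continuous-recognition event flow looks roughly like the sketch below; the key and region are placeholders. Every Recognized event is a final result, so there can be many of them; SessionStopped or Canceled is what signals that no further results will arrive.

```csharp
using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;

class ContinuousRecognitionSketch
{
    static async Task Main()
    {
        // Placeholders: your own subscription key and service region.
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
        using var recognizer = new SpeechRecognizer(config);
        var stopped = new TaskCompletionSource<int>();

        // Fires once per recognized phrase; there can be many "final" results.
        recognizer.Recognized += (s, e) =>
            Console.WriteLine($"RECOGNIZED: {e.Result.Text}");

        // SessionStopped (or Canceled) signals that the last result has arrived.
        recognizer.SessionStopped += (s, e) => stopped.TrySetResult(0);
        recognizer.Canceled += (s, e) => stopped.TrySetResult(0);

        await recognizer.StartContinuousRecognitionAsync();
        await stopped.Task;  // completes when input is exhausted or recognition stops
        await recognizer.StopContinuousRecognitionAsync();
    }
}
```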

Any log messages given by the failure

N/A

Expected/desired behavior

N/A

OS and Version?

Windows 10

Versions

N/A

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Connection failure occurs on a network using a proxy

This app runs well on a network that doesn't use a proxy.
However, it outputs a 'connection error' on a network that uses a proxy.
I want to run this application on a network that uses a proxy.
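
For reference: later 1.x releases of the Speech SDK added an explicit proxy setting on SpeechConfig; it is not available in the SDK version this issue was filed against. A minimal C# sketch, with placeholder key, region, and proxy details:

```csharp
using System;
using Microsoft.CognitiveServices.Speech;

// Placeholders: substitute your own key, region, and proxy details.
var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");
config.SetProxy("proxy.example.com", 8080, "proxyUserName", "proxyPassword");

using var recognizer = new SpeechRecognizer(config);
var result = await recognizer.RecognizeOnceAsync();
Console.WriteLine(result.Text);
```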

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run application, select the option 1

Any log messages given by the failure

 1. Speech recognition with microphone input.
 2. Speech recognition in the specified language.
 3. Speech recognition with file input.
 4. Speech recognition using customized model.
 5. Speech continuous recognition using events.
 6. Translation with microphone input.
 7. Translation with file input.
 Your choice (0: Stop.): 1
 Say something...
 There was an error. Status:Canceled, Reason:Connection failed (no connection to the remote host).
 
 Recognition done. Your Choice (0: Stop):

Expected/desired behavior

No Connection failed occurs.

OS and Version?

Windows 8.1 Pro

Versions

cognitive-services-speech-sdk/Windows/csharp_samples/csharp_samples.sln (042626c : Commits on May 25, 20)

Mention any other details that might be useful

none

Connection failure occurs when using this SDK in China

This app worked well before; the Azure dashboard shows that the last successful call from this app to the Speech service in China was on September 21. But this app has been unable to work in China since September 25.

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run cognitive-services-speech-sdk/quickstart/csharp-dotnet-windows application in China.

Any log messages given by the failure

Say something...
CANCELED: Reason=Error
CANCELED: ErrorDetails=Connection failed (no connection to the remote host). Internal error: 8. Error details: 998. Please check network connection, firewall setting, and the region name used to create speech factory.
CANCELED: Did you update the subscription info?

Expected/desired behavior

No Connection failed occurs.

OS and Version?

Windows 7

Versions

v1.0.0

Mention any other details that might be useful

none

CrisModelId

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Hello!
I am trying to start the example from the SpeechToText-WPF folder.
It has several settings, and one setting is called CrisModelId. This value is assigned to customRecognizer.DeploymentId.
What is it? And where can I find it?
I tried to find it in the JSON from resources.azure.com, but I couldn't.

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 10 pro

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Recognize Intent failed with Canceled (android)

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Press the Recognize intent button (with a properly configured subscription key and app ID) and try to start intent recognition.

Any log messages given by the failure

Final result received: Intent failed with Canceled. Did you enter your Language Understanding subscription? WebSocket Upgrade failed with an authentication error (403). Please check the subscription key or the authorization token, and the region name., intent:

Expected/desired behavior

I expect an utterance evaluation as it happens via REST

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). Other.
Android studio 3.1 debug on physical device (android 6)

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Speaker recognition on Azure home page does not work

I'm trying out this demo and it is not working. The spinner is just stuck, and when I inspect the page there are JS errors.

[Screenshot: screen shot 2018-09-09 at 18 48 12]

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). Other.

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Adapting own AudioInputStream for Linux SDK.

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Trying to implement my own AudioInputStream and connect it to the speech API factory.

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). Other.
ubuntu 16.04

Versions

Mention any other details that might be useful

Hi,

Could you let me know in more detail how to connect my own AudioInputStream for the Linux SDK? (Sample code would be best.)

I'm trying to adapt my own AudioInputStream in a Linux C++ environment. I tried to implement a MyAudioInputStream class as defined in speechapi_cxx_audioinputstream.h, and to use InitSpeechApi_AudioInputStreamAdapterFromAudioInputStream() in speechapi_cxx_factory.h to apply the MyAudioInputStream class. At that point I needed to derive from the SpeechApi_AudioInputStreamAdapter struct, which seems to be defined in the SDK as a default one. Because of that, I couldn't find any point on this already-implemented adapter at which to connect MyAudioInputStream to the SDK, and I eventually got stuck.

Please let me know if I am digging the wrong way, or how to fix this problem.

Thank you in advance.
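
For reference: in later SDK releases the adapter-based pattern described above was replaced by push/pull audio input streams. A hedged C# sketch of the push-stream pattern (the C++ API is analogous; the key, region, and WAV file name are placeholders):

```csharp
using System;
using System.IO;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class PushStreamSketch
{
    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // Create a push stream and bind it to the recognizer via AudioConfig.
        using var pushStream = AudioInputStream.CreatePushStream();
        using var audioConfig = AudioConfig.FromStreamInput(pushStream);
        using var recognizer = new SpeechRecognizer(config, audioConfig);

        // Feed audio bytes (16 kHz, 16-bit, mono PCM by default) from any source.
        byte[] buffer = File.ReadAllBytes("whatstheweather.wav");  // placeholder source
        pushStream.Write(buffer);
        pushStream.Close();  // signal end of stream

        var result = await recognizer.RecognizeOnceAsync();
        Console.WriteLine(result.Text);
    }
}
```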


Thanks! We'll be in touch soon.

.NET Core sample for speech synthesis (Text-To-Speech)

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Expected/desired behavior

A .NET Core sample of using the speech synthesis feature of the Speech API. It is currently a nightmare to try to use the documentation to figure this out, as it just includes an API reference with no explanation of the best way to then play the audio file returned by the service (there is no System.Media library in .NET Core, so the .NET Framework example from the Bing Speech documentation can't be used).
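
Such a sample became feasible once speech synthesis was added to the SDK itself in release 1.4.0, which can render to the default speaker without System.Media. A hedged C# sketch; key, region, and file name are placeholders:

```csharp
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

class CoreSynthesisSketch
{
    static async Task Main()
    {
        var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourServiceRegion");

        // Render to the default speaker (no System.Media required)...
        using (var speaker = new SpeechSynthesizer(config))
            await speaker.SpeakTextAsync("Hello from .NET Core!");

        // ...or write a WAV file instead, e.g. on a headless server.
        using var fileOutput = AudioConfig.FromWavFileOutput("greeting.wav");
        using (var toFile = new SpeechSynthesizer(config, fileOutput))
            await toFile.SpeakTextAsync("Hello from .NET Core!");
    }
}
```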


Thanks! We'll be in touch soon.

SDK does not work with .net framework

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Compile program using .NET Framework

Any log messages given by the failure

An application error occurred. Please contact the adminstrator with the following information:

The type initializer for 'Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE' threw an exception.

Stack Trace:
at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SpeechFactory_FromSubscription(String jarg1, String jarg2)
at Microsoft.CognitiveServices.Speech.Internal.SpeechFactory.FromSubscription(String subscription, String region)
at Microsoft.CognitiveServices.Speech.SpeechFactory.FromSubscription(String subscriptionKey, String region)

at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state)
at System.Threading.ThreadHelper.ThreadStart()

Expected/desired behavior

Not throw the above error when using speech

OS and Version?

Windows 10.

Versions

Mention any other details that might be useful

We need to be able to use this SDK with the .NET Framework because we have a lot of code that can't easily be ported to .NET Core.

Thanks for your help


Thanks! We'll be in touch soon.

Error SPXERR_MIC_NOT_AVAILABLE when deployed to Windows 2016 VM

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Deploy the sample code in a .NET Core web application to an Azure Windows Server 2016 VM.
Access it through a web browser and initiate speech to text.

Any log messages given by the failure

AggregateException: One or more errors occurred. (Exception with an error code: 0xe (SPXERR_MIC_NOT_AVAILABLE)

Expected/desired behavior

OS and Version?

Windows Server 2016 VM

Versions

Mention any other details that might be useful

The same code works on my development machine (Windows 10, VS 2017 IDE), but when deployed I encounter the above-mentioned error.


Thanks! We'll be in touch soon.

Unity Android project crashes (now with logs from Android 9)

Describe the bug
I am invoking speech recognition from a Unity app on Android. When I call recognizeOnceAsync(), the app immediately dies with no crash report in the Android logs other than "app has died". I have successfully built and run the Android Sample apps using my subscription key and those are working.

I would appreciate any help you may be able to provide.

To Reproduce
Steps to reproduce the behavior:
This Java method is being run on the Android UI thread (I had been running it on the Unity thread and thought that might be an issue, so I tried the UI thread; no difference observed). I've added copious logging via Android Log calls. There are also callbacks to Unity showing that the connection back to Unity is working.

    public void StartRecognitionUI()
    {

        ///////////////////////////////////////////////////
        // recognize with intermediate results
        ///////////////////////////////////////////////////

        Log.d("Unity:SpeechEngine", "StartRecognitionUI");
        try {
            final SpeechConfig speechConfig = SpeechConfig.fromSubscription(SpeechSubscriptionKey, SpeechRegion);
            final AudioConfig audioInput = AudioConfig.fromDefaultMicrophoneInput();
            Log.d("Unity:SpeechEngine", "Got audioInput");
            final SpeechRecognizer reco = new SpeechRecognizer(speechConfig, audioInput);
            Log.d("Unity:SpeechEngine", "Got Recgonizer");

            reco.recognizing.addEventListener(new EventHandler<SpeechRecognitionEventArgs>() {
                public void onEvent(Object o, SpeechRecognitionEventArgs speechRecognitionResultEventArgs)
                {
                    Log.d("Unity:SpeechEngine", "onEvent");
                    try {
                        final String s = speechRecognitionResultEventArgs.getResult().getText();
                        Log.d("Unity:SpeechEngine", "onEvent hypothesis: " + s);
                        _speechCallbacks.Log("Intermediate result received: " + s);
                        _speechCallbacks.HypothesisReceived(s);
                        Log.d("Unity:SpeechEngine", "Callbacks done");
                    } catch (Exception ex) {
                        Log.d("Unity:SpeechEngine", "Exception during hypothesis Event handler\n" + formatException(ex));
                    }
                }
            });

            _speechCallbacks.Log("Starting Speech Recognition");
            Log.d("Unity:SpeechEngine", "Starting Speech Recognition");

            final Future<SpeechRecognitionResult> task = reco.recognizeOnceAsync();

// **** NOTE: This is where the app dies. The next Log call is never made. ****
// **** I am making a similar comment in the logs posted below ****

            Log.d("Unity:SpeechEngine", "Got Future task, isDone:" + task.isDone());
            setOnTaskCompletedListener(task, new OnTaskCompletedListener<SpeechRecognitionResult>() {
                public void onCompleted(SpeechRecognitionResult result)
                {
                    Log.d("Unity:SpeechEngine", "onCompleted");

                    try {
                        final String s = result.getText();
                        Log.d("Unity:SpeechEngine", "result: " + s);
                        reco.close();
                        Log.d("Unity:SpeechEngine", "reco.closed");
                        _speechCallbacks.Log("Recognizer returned: " + s);
                        _speechCallbacks.ResultReceived(s);
                        Log.d("Unity:SpeechEngine", "Callbacks done");
                    } catch (Exception ex) {
                        Log.d("Unity:SpeechEngine", "Exception during hypothesis Event handler\n" + formatException(ex));
                    }
                }
            });
            Log.d("Unity:SpeechEngine", "TaskCompletedListener started");
            
        } catch (Exception ex) {
            _speechCallbacks.LogError(formatException(ex));
        }
    }

Expected behavior
I expect that the speech recognition should start instead of crashing

Version of the Cognitive Services Speech SDK
client-sdk-1.0.0.aar (Android)

Platform, Operating System, and Programming Language

  • OS: Android Google Pixel 2 running Oreo 8.1
  • Hardware - ARM. (compiling for ARMv7)

Additional context
This is the logcat output from pushing the button to start recognition through to after the crash. There are some "NOTE ****" comments added by me in the log. There is a consistent exception logged after the app has died.

2018-10-15 19:23:28.086 16832-16856/johnse.test.speech D/Unity:SpeechEngine: StartRecognition
2018-10-15 19:23:28.087 16832-16832/johnse.test.speech D/Unity:SpeechEngine: StartRecognitionUI
2018-10-15 19:23:28.119 16832-16832/johnse.test.speech D/Unity:SpeechEngine: Got audioInput
2018-10-15 19:23:28.124 16832-16832/johnse.test.speech D/Unity:SpeechEngine: Got Recgonizer
2018-10-15 19:23:28.131 16832-16832/johnse.test.speech I/Unity: Starting Speech Recognition
     
    (Filename: ./Runtime/Export/Debug.bindings.h Line: 43)
2018-10-15 19:23:28.131 16832-16832/johnse.test.speech D/Unity:SpeechEngine: Starting Speech Recognition

NOTE **** This is where recognizeOnceAsync() is called and the app dies.

2018-10-15 19:23:28.379 1102-1205/? W/InputDispatcher: channel '1921ccf johnse.test.speech/johnse.test.speech.UnityPlayerActivity (server)' ~ Consumer closed input channel or an error occurred.  events=0x9
2018-10-15 19:23:28.379 1102-1205/? E/InputDispatcher: channel '1921ccf johnse.test.speech/johnse.test.speech.UnityPlayerActivity (server)' ~ Channel is unrecoverably broken and will be disposed!
2018-10-15 19:23:28.380 607-627/? E/SurfaceFlinger: Failed to find layer (SurfaceView - johnse.test.speech/johnse.test.speech.UnityPlayerActivity#0) in layer parent (no-parent).
2018-10-15 19:23:28.380 1102-1429/? I/MediaFocusControl: AudioFocus  removeFocusStackEntryOnDeath(): removing entry for android.os.BinderProxy@e68ef1c
2018-10-15 19:23:28.380 607-828/? E/SurfaceFlinger: Failed to find layer (Background for - SurfaceView - johnse.test.speech/johnse.test.speech.UnityPlayerActivity#0) in layer parent (no-parent).
    
    --------- beginning of system
2018-10-15 19:23:28.380 1102-1120/? I/ActivityManager: Process johnse.test.speech (pid 16832) has died: fore TOP 
2018-10-15 19:23:28.380 1102-1130/? W/zygote64: kill(-16832, 9) failed: No such process
2018-10-15 19:23:28.380 1102-1130/? I/zygote64: Successfully killed process cgroup uid 10127 pid 16832 in 0ms
2018-10-15 19:23:28.381 1102-1429/? I/WindowManager: WIN DEATH: Window{1921ccf u0 johnse.test.speech/johnse.test.speech.UnityPlayerActivity}
2018-10-15 19:23:28.381 1102-1429/? W/InputDispatcher: Attempted to unregister already unregistered input channel '1921ccf johnse.test.speech/johnse.test.speech.UnityPlayerActivity (server)'
2018-10-15 19:23:28.381 710-710/? I/Zygote: Process 16832 exited cleanly (1)
2018-10-15 19:23:28.385 1102-1120/? W/ActivityManager: Force removing ActivityRecord{2954ccd u0 johnse.test.speech/.UnityPlayerActivity t24}: app died, no saved state
2018-10-15 19:23:28.392 1102-1135/? W/ActivityManager: setHasOverlayUi called on unknown pid: 16832
2018-10-15 19:23:28.424 607-607/? D/SurfaceFlinger: duplicate layer name: changing com.google.android.apps.nexuslauncher/com.google.android.apps.nexuslauncher.NexusLauncherActivity to com.google.android.apps.nexuslauncher/com.google.android.apps.nexuslauncher.NexusLauncherActivity#1
2018-10-15 19:23:28.426 1324-1324/? I/Elmyra/ElmyraService: Unblocked; current action: SetupWizardAction
2018-10-15 19:23:28.433 755-798/? I/ASH: @ 436987.500: ccb_proximityInit: PROX initialized w/initData
2018-10-15 19:23:28.437 31613-31613/? W/SessionLifecycleManager: Handover failed. Creating new session controller.
2018-10-15 19:23:28.441 31613-31613/? I/OptInState: There is a new client and it does not support opt-in. Dropping request.
2018-10-15 19:23:28.444 1401-1401/? I/WallpaperService: engine resumed
2018-10-15 19:23:28.451 755-798/? D/CHRE: @ 436987.500: +: id 10, otherClientPresent 1, mode 3
2018-10-15 19:23:28.452 724-1252/? D/ACDB-LOADER: ACDB -> ACDB_CMD_GET_AUDPROC_GAIN_DEP_STEP_TABLE, vol index 0
2018-10-15 19:23:28.452 724-1252/? E/volume_listener: check_and_set_gain_dep_cal: Failed to set gain dep cal level
2018-10-15 19:23:28.452 1102-1120/? I/WifiService: getConnectionInfo uid=10046
2018-10-15 19:23:28.453 724-7254/? E/volume_listener: check_and_set_gain_dep_cal: Failed to set gain dep cal level
2018-10-15 19:23:28.453 1102-1120/? I/WifiService: getWifiEnabledState uid=10046
2018-10-15 19:23:28.453 31613-16919/? W/Settings: Setting mobile_data has moved from android.provider.Settings.Secure to android.provider.Settings.Global.
2018-10-15 19:23:28.453 755-798/? D/CHRE: @ 436987.500: [ImuCal] Dynamic sensor configuration: high-performance.
2018-10-15 19:23:28.453 755-798/? D/CHRE: @ 436987.500: sensorType 9 allowed 1: mergedMode 3, otherClientPresent 1
2018-10-15 19:23:28.453 755-798/? D/CHRE: @ 436987.500: sensorType 11 allowed 1: mergedMode 3, otherClientPresent 1
2018-10-15 19:23:28.454 755-798/? D/CHRE: @ 436987.500: sensorType 13 allowed 0: mergedMode 3, otherClientPresent 0
2018-10-15 19:23:28.454 1102-1446/? I/WifiService: getConnectionInfo uid=10046
2018-10-15 19:23:28.455 1102-1446/? I/WifiService: getWifiEnabledState uid=10046
2018-10-15 19:23:28.455 31613-16918/? W/Settings: Setting mobile_data has moved from android.provider.Settings.Secure to android.provider.Settings.Global.
2018-10-15 19:23:28.462 31613-31613/? W/CRChildManagerHelper: Trying to restore children after changes have been made to the child manager
2018-10-15 19:23:28.462 31613-31613/? W/CRChildManagerHelper: Trying to restore children after changes have been made to the child manager
2018-10-15 19:23:28.463 31613-16250/? E/EntrySyncManager: Cannot determine account name: drop request

NOTE **** This exception does seem to always occur, but it seems more related to Google Services.
I will note that this phone does not have a SIM and does not have a user logged in. However, it is running with guest access on MSFTGUEST WiFi. ****

2018-10-15 19:23:28.464 31613-16250/? E/NowController: Failed to access data from EntryProvider to update foreground actions.
    java.util.concurrent.ExecutionException: java.lang.RuntimeException: Could not complete scheduled request to refresh entries. ClientErrorCode: 3
        at com.google.common.util.concurrent.d.da(SourceFile:85)
        at com.google.common.util.concurrent.d.get(SourceFile:23)
        at com.google.common.util.concurrent.l.get(SourceFile:2)
        at com.google.android.apps.gsa.staticplugins.nowstream.b.a.aj.bas(SourceFile:15)
        at com.google.android.apps.gsa.staticplugins.nowstream.b.a.aj.doInBackground(SourceFile:40)
        at com.google.android.apps.gsa.shared.util.concurrent.d.call(SourceFile:2)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:457)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at com.google.android.apps.gsa.shared.util.concurrent.a.ag.run(Unknown Source:4)
        at com.google.android.apps.gsa.shared.util.concurrent.a.bo.run(SourceFile:4)
        at com.google.android.apps.gsa.shared.util.concurrent.a.bo.run(SourceFile:4)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
        at java.lang.Thread.run(Thread.java:764)
        at com.google.android.apps.gsa.shared.util.concurrent.a.ak.run(SourceFile:6)
     Caused by: java.lang.RuntimeException: Could not complete scheduled request to refresh entries. ClientErrorCode: 3
        at com.google.android.apps.gsa.sidekick.main.i.u.ak(SourceFile:6)
        at com.google.common.util.concurrent.q.P(SourceFile:7)
        at com.google.common.util.concurrent.p.run(SourceFile:37)
        at com.google.common.util.concurrent.br.execute(SourceFile:3)
        at com.google.common.util.concurrent.d.a(SourceFile:264)
        at com.google.common.util.concurrent.d.addListener(SourceFile:134)
        at com.google.common.util.concurrent.p.b(SourceFile:3)
        at com.google.common.util.concurrent.ax.a(SourceFile:18)
        at com.google.android.apps.gsa.shared.util.concurrent.h.a(SourceFile:10)
        at com.google.android.apps.gsa.shared.util.concurrent.h.a(SourceFile:7)
        at com.google.android.apps.gsa.sidekick.main.i.r.a(SourceFile:15)
        at com.google.android.apps.gsa.sidekick.main.i.s.ak(SourceFile:8)
        at com.google.common.util.concurrent.q.P(SourceFile:7)
        at com.google.common.util.concurrent.p.run(SourceFile:37)
        at com.google.common.util.concurrent.br.execute(SourceFile:3)
        at com.google.common.util.concurrent.d.a(SourceFile:264)
        at com.google.common.util.concurrent.d.addListener(SourceFile:134)
        at com.google.common.util.concurrent.p.b(SourceFile:3)
        at com.google.common.util.concurrent.ax.a(SourceFile:18)
        at com.google.android.apps.gsa.shared.util.concurrent.h.a(SourceFile:10)
        at com.google.android.apps.gsa.shared.util.concurrent.h.a(SourceFile:7)
        at com.google.android.apps.gsa.sidekick.main.i.r.a(SourceFile:10)
        at com.google.android.apps.gsa.staticplugins.nowstream.b.a.aj.bas(SourceFile:14)
        at com.google.android.apps.gsa.staticplugins.nowstream.b.a.aj.doInBackground(SourceFile:40) 
        at com.google.android.apps.gsa.shared.util.concurrent.d.call(SourceFile:2) 
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:457) 
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
        at com.google.android.apps.gsa.shared.util.concurrent.a.ag.run(Unknown Source:4) 
        at com.google.android.apps.gsa.shared.util.concurrent.a.bo.run(SourceFile:4) 
        at com.google.android.apps.gsa.shared.util.concurrent.a.bo.run(SourceFile:4) 
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162) 
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636) 
        at java.lang.Thread.run(Thread.java:764) 
        at com.google.android.apps.gsa.shared.util.concurrent.a.ak.run(SourceFile:6) 
2018-10-15 19:23:28.464 31613-16601/? W/LocationOracle: No location history returned by ContextManager
2018-10-15 19:23:28.466 31613-31613/? I/MicroDetectionWorker: #updateMicroDetector [detectionMode: [mDetectionMode: [1]]]
2018-10-15 19:23:28.466 31613-31613/? I/MicroDetectionWorker: #startMicroDetector [speakerMode: 0]
2018-10-15 19:23:28.469 31613-15793/? I/MicroRecognitionRunner: Starting detection.
2018-10-15 19:23:28.469 31613-16919/? I/MicrophoneInputStream: mic_starting com.google.android.apps.gsa.speech.audio.af@367cf51
2018-10-15 19:23:28.470 724-7254/? W/DeviceHAL: Device 0xe8ab4000 open_input_stream: Invalid argument
2018-10-15 19:23:28.473 807-16921/? I/AudioFlinger: AudioFlinger's thread 0xeaa94180 tid=16921 ready to run
2018-10-15 19:23:28.478 31613-16919/? I/MicrophoneInputStream: mic_started com.google.android.apps.gsa.speech.audio.af@367cf51
2018-10-15 19:23:28.479 18317-16851/? W/ctxmgr: [AclManager]No 3 for (accnt=account#-517948760#, com.google.android.gms(10019):UserVelocityProducer, vrsn=14366028, 0, 3pPkg = null ,  3pMdlId = null ,  pid = 18317). Was: 3 for 1, account#-517948760#
2018-10-15 19:23:28.481 31613-31613/? I/MicroDetectionWorker: onReady
2018-10-15 19:23:28.481 755-798/? I/ASH: @ 436987.531: ccb_proximityHandle: PRX_STATE_BEGIN
2018-10-15 19:23:28.482 755-798/? I/ASH: @ 436987.531: ccb_proximityHandle: initial state farAway
2018-10-15 19:23:28.484 724-16922/? D/sound_trigger_platform: platform_stdev_check_and_update_concurrency: concurrency active 0, tx 1, rx 0, concurrency session_allowed 0
2018-10-15 19:23:28.484 724-16922/? D/audio_hw_primary: enable_snd_device: snd_device(75: voice-rec-mic)
2018-10-15 19:23:28.484 724-16922/? D/audio_route: Apply path: voice-rec-mic
2018-10-15 19:23:28.484 2244-2249/? E/ANDR-PERF-RESOURCEQS: Failed to apply optimization [2, 0]
2018-10-15 19:23:28.489 31613-31613/? W/CRChildManagerHelper: Trying to restore children after changes have been made to the child manager
2018-10-15 19:23:28.489 31613-31613/? W/ErrorReporter: reportError [type: 29, code: 917507]: null
2018-10-15 19:23:28.489 31613-31613/? W/StreamController: Failed to get the restored children from restoreChildren.
2018-10-15 19:23:28.496 724-16922/? D/ACDB-LOADER: ACDB -> send_audio_cal, acdb_id = 41, path = 1, app id = 0x11132, sample rate = 48000
2018-10-15 19:23:28.496 724-16922/? D/ACDB-LOADER: ACDB -> ACDB_CMD_GET_AUDPROC_GAIN_DEP_STEP_TABLE, vol index 0
2018-10-15 19:23:28.496 724-16922/? E/ACDB-LOADER: Error: ACDB AudProc vol returned = -19
2018-10-15 19:23:28.496 724-16922/? D/ACDB-LOADER: ACDB -> GET_AFE_TOPOLOGY_ID for adcd_id 41, Topology Id 112fb
2018-10-15 19:23:28.496 724-16922/? D/ACDB-LOADER: Error: ACDB AFE returned = -19
2018-10-15 19:23:28.496 724-16922/? D/audio_hw_primary: enable_audio_route: usecase(9) apply and update mixer path: audio-record
2018-10-15 19:23:28.496 724-16922/? D/audio_route: Apply path: audio-record
2018-10-15 19:23:28.508 18317-18317/? I/GeofencerStateMachine: removeGeofences: removeRequest=RemoveGeofencingRequest[REMOVE_BY_PENDING_INTENT pendingIntent=PendingIntent[creatorPackage=com.google.android.gms], packageName=null]
2018-10-15 19:23:28.509 18317-18317/? I/GeofencerStateMachine: removeGeofences: removeRequest=RemoveGeofencingRequest[REMOVE_BY_PENDING_INTENT pendingIntent=PendingIntent[creatorPackage=com.google.android.gms], packageName=null]
2018-10-15 19:23:28.510 18317-16925/? I/PlaceInferenceEngine: [anon] Changed inference mode: 0
2018-10-15 19:23:28.526 18317-16851/? E/ctxmgr: [ProducerStatusImpl]updateStateForNewContextData: inactive, contextName=7
2018-10-15 19:23:28.549 18317-16925/? I/Places: ?: PlacesBleScanner start() with priority 2
2018-10-15 19:23:28.550 18317-16925/? I/PlaceInferenceEngine: [anon] Changed inference mode: 1
2018-10-15 19:23:28.555 18317-18317/? I/BeaconBle: Client requested scan, settings=BleSettings [scanMode=ZERO_POWER, callbackType=ALL_MATCHES, reportDelayMillis=0, 1 filters, 0 clients, callingClientName=Places]
2018-10-15 19:23:28.561 18317-16925/? I/Places: Converted 0 out of 47 WiFi scans
2018-10-15 19:23:28.561 18317-18317/? I/BeaconBle: ZERO_POWER is disabled.
2018-10-15 19:23:28.561 18317-18317/? I/BeaconBle: 'L' hardware scan: scan stopped, no powered clients
2018-10-15 19:23:28.561 1102-1720/? I/WifiService: getConnectionInfo uid=10019
2018-10-15 19:23:28.565 18317-16925/? I/PlaceInferenceEngine: [anon] Changed inference mode: 1
2018-10-15 19:23:28.566 31613-31613/? I/MicroDetectionWorker: onReady
2018-10-15 19:23:28.583 18317-16924/? I/PlaceInferenceEngine: No beacon scan available - ignoring candidates.

  • Error messages, log information, stack trace, ...
  • If you report an error for a specific service interaction, please report the SessionId and time (incl. timezone) of the reported incidents. The SessionId is reported in all call-backs/events you receive.
  • Any other additional information

Speech SDK for .NET Core does not run on macOS

As .NET Core is cross-platform, it is expected that the SDK supports all of these platforms as well.

This issue is for a:

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run the .NET Core Tutorial on macOS

Any log messages given by the failure

Unhandled Exception: System.AggregateException: One or more errors occurred. (The type initializer for 'Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE' threw an exception.) ---> System.TypeInitializationException: The type initializer for 'Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE' threw an exception. ---> System.TypeInitializationException: The type initializer for 'SWIGExceptionHelper' threw an exception. ---> System.DllNotFoundException: Unable to load shared library 'Microsoft.CognitiveServices.Speech.csharp.bindings.dll' or one of its dependencies. In order to help diagnose loading problems, consider setting the DYLD_PRINT_LIBRARIES environment variable: dlopen(libMicrosoft.CognitiveServices.Speech.csharp.bindings.dll, 1): image not found
   at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SWIGExceptionHelper.SWIGRegisterExceptionCallbacks_carbon_csharp(ExceptionDelegate applicationDelegate, ExceptionDelegate arithmeticDelegate, ExceptionDelegate divideByZeroDelegate, ExceptionDelegate indexOutOfRangeDelegate, ExceptionDelegate invalidCastDelegate, ExceptionDelegate invalidOperationDelegate, ExceptionDelegate ioDelegate, ExceptionDelegate nullReferenceDelegate, ExceptionDelegate outOfMemoryDelegate, ExceptionDelegate overflowDelegate, ExceptionDelegate systemExceptionDelegate)
   at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SWIGExceptionHelper..cctor()
   --- End of inner exception stack trace ---
   at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SWIGExceptionHelper..ctor()
   at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE..cctor()
   --- End of inner exception stack trace ---
   at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SpeechConfig_FromSubscription(String jarg1, String jarg2)
   at Microsoft.CognitiveServices.Speech.Internal.SpeechConfig.FromSubscription(String subscription, String region)
   at Microsoft.CognitiveServices.Speech.SpeechConfig.FromSubscription(String subscriptionKey, String region)
   at doetnettest.Program.RecognizeSpeechAsync() in /Users/robinmanuelthiel/Desktop/doetnettest/Program.cs:line 13
   --- End of inner exception stack trace ---
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at doetnettest.Program.Main() in /Users/robinmanuelthiel/Desktop/doetnettest/Program.cs:line 50

OS and Version?

macOS Mojave

Versions

1.0.0

Mention any other details that might be useful


Thanks! We'll be in touch soon.

iOS (Swift) sample with the new Speech SDK

Please update the sample for iOS (Swift) with the new Speech SDK.

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Swift / macOS / Xcode

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

SDK for Xamarin Android and Xamarin iOS

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

NaN

Any log messages given by the failure

NaN

Expected/desired behavior

NaN

OS and Version?

iOS 12

Versions

Any Version

Mention any other details that might be useful

Is there any SDK for Xamarin iOS and Xamarin Android? It seems it is not possible to use the C# version for them at the moment (the only available C# version is for Windows).

https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/


System.AggregateException: 'One or more errors occurred.'

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

I simply downloaded the repo, didn't change any code, and tried to build the solution in the "Windows/csharp_samples" directory, following the instructions provided on the webpage.

Any log messages given by the failure

System.AggregateException
  HResult=0x80131500
  Message=One or more errors occurred.
  Source=mscorlib
  StackTrace:
   at System.Threading.Tasks.Task.ThrowIfExceptional(Boolean includeTaskCanceledExceptions)
   at System.Threading.Tasks.Task.Wait(Int32 millisecondsTimeout, CancellationToken cancellationToken)
   at System.Threading.Tasks.Task.Wait()
   at MicrosoftSpeechSDKSamples.Program.Main(String[] args) in C:\Users\bwang\Downloads\cognitive-services-speech-sdk-master\Windows\csharp_samples\Program.cs:line 43

Inner Exception 1:
ApplicationException: Exception with an error code: 0x8
[CALL STACK]

    > CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - SpeechFactory_CreateSpeechRecognizer_With_FileInput

    - CSharp_MicrosoftfCognitiveServicesfSpeechfInternal_IntentRecognitionEventArgs_SWIGUpcast___

    - CSharp_MicrosoftfCognitiveServicesfSpeechfInternal_ICognitiveServicesSpeechFactory_CreateSpeechRecognizerWithFileInput__SWIG_0___

    - 00007FFB7C3B7030 (SymFromAddr() error: The specified module could not be found.)

Screenshot: http://i63.tinypic.com/2zp59hz.png

Expected/desired behavior

OS and Version?

Windows 10 Enterprise.

Versions

Version 10.0.16299 Build 16299

Mention any other details that might be useful

http://i68.tinypic.com/125mmme.png

Samples fail on Windows 10 RS5

Initially posted this in the Windows Dev MVP alias but surfacing it here for visibility.

Configuration

Windows 10 RS5, Build 17751
.NET Core 2.1 console app with simple sample

Behavior

On start, it immediately throws an exception on factory.CreateSpeechRecognizer(). (Does this SDK work on RS5? The final RS5 bits are on the horizon, so I am a little shocked here.)

I do not get the same exception on RS4.

(85): 18ms SPX_THROW_HR_IF: (0x00e) = 0xe
(85): 166ms SPX_RETURN_ON_FAIL: hr = 0x47967820
SPX_THROW_ON_FAIL: ::SpeechFactory_CreateSpeechRecognizer_With_Defaults(m_hfactory, &hreco) = 0x47967820

System.ApplicationException: 'Exception with an error code: 0xe
[CALL STACK]

    > CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - Session_SetParameter_String

    - CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - CreateModuleObject

    - SpeechFactory_CreateSpeechRecognizer_With_Defaults

    - CSharp_MicrosoftfCognitiveServicesfSpeechfInternal_FactoryParameterCollectionBase_SWIGUpcast___

    - CSharp_MicrosoftfCognitiveServicesfSpeechfInternal_ICognitiveServicesSpeechFactory_CreateSpeechRecognizer__SWIG_0___

    - 00007FFCE42D6F8F (SymFromAddr() error: The specified module could not be found.)

Issue while integrating with chatbot Direct Line - instance.speechIsAvailable is not a function

Hi,
I have been using the Bing Speech API for TTS and STT functionality within the chat bot Direct Line channel, and it has been working fine where we pass the speechOptions with speech recognizers and synthesizers.

However, I am unable to find corresponding code for the new Azure Speech Service. I have created a speech instance in the Azure West US region and have the keys ready, but I am getting an error of "instance.speechIsAvailable is not a function" on page load itself.
Details:
Chat bot: Microsoft Bot Framework V3
Chat bot channel: Direct Line
Speech Service tier: Free
Speech Service regions tried: westus and eastus

Please refer to the screenshots and kindly guide me on how to proceed.
[Screenshots: js fail, js_code]
The main requirement is to add speech capabilities into the chat bot Direct Line channel, leveraging the Azure Speech Service.

C++ sample app crash on Linux/Ubuntu

Hi,
I compiled the SpeechSDK/quickstart/cpp-linux sample under Ubuntu, but when I start it, I get the following error:

root@JOE-LAPTOP:/home/formater/SpeechSDK/quickstart/cpp-linux# ./helloworld
Say something...
(642): 51ms SPX_THROW_HR_IF: (0x015) = 0x15
(64): 55ms SPX_RETURN_ON_FAIL: hr = 0x580018b0
(64): 56ms SPX_REPORT_ON_FAIL: hr = Recognizer_RecognizeAsync_WaitFor(hasync, (4294967295U), phresult) = 0x580018b0
(64): 57ms SPX_RETURN_ON_FAIL: hr = 0x580018b0
(529): 58ms
SPX_TRACE_VERBOSE: [CALL STACK]

[0x7fe465523b08] +0x63b08
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92d3a) [0x7fe4651c2d3a]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92d95) [0x7fe4651c2d95]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0x92d2e) [0x7fe4651c2d2e]
[0x7fe466025922] +0x25922
[0x7fe46601ebe9] +0x1ebe9
[0x7fe46600f669] +0xf669
[0x7fe46600f968] +0xf968
[0x7fe464b31b97] __libc_start_main +0xe7
[0x7fe46600f22a] +0xf22a

I thought maybe there was a problem with microphone access, so I changed to CreateSpeechRecognizerWithFileInput, but I got the same error during execution.

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Ubuntu

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

ARM support / sample of how to implement on ARM IoT device

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

I am developing a UWP app for a Raspberry Pi 3 running Windows IoT Core. ARM is not supported by the SDK. Is there a sample that can be shared of any way to implement this on an IoT ARM device? This is a pretty big use case for speech to text (voice assistants etc.).

Getting exception when running console application in .net core 2.1 from a Mac

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Create a Console Application with .netcore2.1
  2. Call: Speechfactory.FromSubscription("key", "region")
  3. The app will throw an exception about not being able to find a dll

Any log messages given by the failure

Unhandled Exception: System.AggregateException: One or more errors occurred. (The type initializer for 'Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE' threw an exception.) ---> System.TypeInitializationException: The type initializer for 'Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE' threw an exception. ---> System.TypeInitializationException: The type initializer for 'SWIGExceptionHelper' threw an exception. ---> System.DllNotFoundException: Unable to load shared library 'Microsoft.CognitiveServices.Speech.csharp.bindings.dll' or one of its dependencies. In order to help diagnose loading problems, consider setting the DYLD_PRINT_LIBRARIES environment variable: dlopen(libMicrosoft.CognitiveServices.Speech.csharp.bindings.dll, 1): image not found
at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SWIGExceptionHelper.SWIGRegisterExceptionCallbacks_carbon_csharp(ExceptionDelegate applicationDelegate, ExceptionDelegate arithmeticDelegate, ExceptionDelegate divideByZeroDelegate, ExceptionDelegate indexOutOfRangeDelegate, ExceptionDelegate invalidCastDelegate, ExceptionDelegate invalidOperationDelegate, ExceptionDelegate ioDelegate, ExceptionDelegate nullReferenceDelegate, ExceptionDelegate outOfMemoryDelegate, ExceptionDelegate overflowDelegate, ExceptionDelegate systemExceptionDelegate)
at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SWIGExceptionHelper..cctor()
--- End of inner exception stack trace ---
at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SWIGExceptionHelper..ctor()
at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE..cctor()
--- End of inner exception stack trace ---
at Microsoft.CognitiveServices.Speech.Internal.carbon_csharpPINVOKE.SpeechFactory_FromSubscription(String jarg1, String jarg2)
at Microsoft.CognitiveServices.Speech.Internal.SpeechFactory.FromSubscription(String subscription, String region)
at Sample.ConsoleApp.Program.RecognizeSpeechAsync() in //Program.cs:line 14

Expected/desired behavior

Not get the exception

OS and Version?

macOS 10.13.6 - Visual Studio for Mac 7.6.3

Versions

0.6

Is there a way to specify the recognition mode for the Android SDK?

As per the subject line.

There is an introduction at this link:
https://docs.microsoft.com/en-us/azure/cognitive-services/speech/concepts#recognition-modes

There are three modes of recognition: interactive, conversation, and dictation. The recognition mode adjusts speech recognition based on how the users are likely to speak. Choose the appropriate recognition mode for your application.

These modes are applicable when you directly use the REST or WebSocket protocol. The client libraries use different parameters to specify recognition mode. For more information, see the client library of your choice.

But I cannot find a way to specify the recognition mode. Can anyone help? Thanks a lot!
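
For what it's worth, the recognition mode is encoded in the service endpoint path (interactive, conversation, or dictation), so one workaround is to create the factory from an explicit endpoint. A hedged sketch in C# - assuming your SDK version exposes a FromEndpoint factory method; the same pattern should carry over to the Java API:

using System;
using Microsoft.CognitiveServices.Speech;

// The mode segment of the endpoint path ("dictation" here) selects the recognition mode.
var endpoint = new Uri(
    "wss://westus.stt.speech.microsoft.com/speech/recognition/dictation/cognitiveservices/v1");
var factory = SpeechFactory.FromEndpoint(endpoint, "YourSubscriptionKey");
using (var recognizer = factory.CreateSpeechRecognizer())
{
    var result = await recognizer.RecognizeAsync();
    Console.WriteLine(result.Text);
}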

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). Other.

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Connection failure for speech-text

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run the C# .NET helloworld or speechtotext-wpf sample

Any log messages given by the failure

--- Start speech recognition using microphone in en-US language ----

Speech recognition: Session started event: SessionId: 25364c6a974548209d8f153d5cd5b1bd..
--- recognition canceled ---
CancellationReason: Error. ErrorDetails: Connection failed (no connection to the remote host). Internal error: 8. Error details: 998. Please check network connection, firewall setting, and the region name used to create speech factory..

Expected/desired behavior

speech to text result

OS and Version?

Windows 7

Versions

v1.0.0

Mention any other details that might be useful

I have tried the F0 tier as well as the S0 tier. On the F0 I have tried West US and on the S0 I have tried East US. Nothing works and I get the same messages.


Thanks! We'll be in touch soon.

en-ZA language does not seem to work

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

When trying to pass in a non-default BCP-47 language code, I get an error. If, however, I explicitly pass in en-US, then all is good.
Where can I find the list of supported BCP-47 languages, and how do I adjust my subscription to support different language codes?

Minimal steps to reproduce

using (var recognizer = factory.CreateSpeechRecognizerWithFileInput(@"file.wav", @"en-za"))

Any log messages given by the failure

An error occurred. Status: Canceled, FailureReason: WebSocket Upgrade failed with a bad request (400). Please check the language name and deployment id, and ensure the deployment id (if used) is correctly associated with the provided subscription key.

Expected/desired behavior

No error

OS and Version?

Windows 10

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Using the SDK throws java.lang.ClassNotFoundException: com.microsoft.cognitiveservices.speech.RecognitionResult

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

java -d64 -cp ./target/SpeechSDKDemo-0.0.1-SNAPSHOT.jar:./target/dependency/client-sdk-1.0.0.jar com.microsoft.cognitiveservices.speech.samples.console.Main

Choose option 5: "Speech recognition with audio stream"

Any log messages given by the failure

Exception in thread "main" java.lang.NoClassDefFoundError: com/microsoft/cognitiveservices/speech/RecognitionResult                         
        at com.microsoft.cognitiveservices.speech.samples.console.Main.main(Main.java:47)                                                   
Caused by: java.lang.ClassNotFoundException: com.microsoft.cognitiveservices.speech.RecognitionResult                                       
        at java.net.URLClassLoader.findClass(URLClassLoader.java:381)                                                                       
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        ... 1 more  

Expected/desired behavior

I get the transcription of my audio file

OS and Version?

Linux:

Linux mypc 4.18.6-arch1-1-ARCH #1 SMP PREEMPT Wed Sep 5 11:54:09 UTC 2018 x86_64 GNU/Linux

Versions

$ java -version                                                                                             
openjdk version "1.8.0_181"                                                                                                                 
OpenJDK Runtime Environment (build 1.8.0_181-b13)                                                                                           
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)

I'm using the current latest version from this repo 703d4b0111f4b61f09d776295bd1c5aff7674df5.

Mention any other details that might be useful

I've seen issue #46, but I don't think that applies here because I'm running a 64-bit JVM.

I tried this in an Ubuntu Docker container (following the installation steps from the README) and also in an Ubuntu 16.04 VM (using the same installation steps).

Got any plans to provide a library for ARM?

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

Do you have any plans to release an ARM (32/64) library, e.g. for the Raspberry Pi?
If you do, when would it be released? ;)

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). Other.

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Crashed when doing continuous speech recognition from a file

Speech-to-text from the microphone works fine, but it crashes when doing speech recognition from a file.

I copied the code and downloaded a wav file from
https://docs.microsoft.com/en-US/azure/cognitive-services/speech-service/how-to-recognize-speech-java,
and my subscriptionKey and serviceRegion are correct.

The errors are as below:

java.lang.RuntimeException: Exception with an error code: 0x8 (SPXERR_FILE_OPEN_FAILED)
[CALL STACK]

  # 0: 0x7bf49b72f8  _ZN9Microsoft17CognitiveServices6Speech4Impl17CSpxWavFileReader4OpenEPKw
  # 1: 0x7bf49b9924  _ZN9Microsoft17CognitiveServices6Speech4Impl15CSpxWavFilePump10EnsureFileEPKw
  # 2: 0x7bf49ba02c  _ZThn64_N9Microsoft17CognitiveServices6Speech4Impl15CSpxWavFilePump4OpenEPKw
  # 3: 0x7bf499e8c4  _ZN9Microsoft17CognitiveServices6Speech4Impl22CSpxAudioStreamSession12InitFromFileEPKw
  # 4: 0x7bf4975b7c  _ZN9Microsoft17CognitiveServices6Speech4Impl20CSpxSpeechApiFactory31InitSessionFromAudioInputConfigENSt6__ndk110shared_ptrINS2_11ISpxSessionEEENS5_INS2_15ISpxAudioConfigEEE
  # 5: 0x7bf4974450  _ZN9Microsoft17CognitiveServices6Speech4Impl20CSpxSpeechApiFactory34CreateRecognizerFromConfigInternalEPKcS5_S5_NS1_12OutputFormatENSt6__ndk110shared_ptrINS2_15ISpxAudioConfigEEE
  # 6: 0x7bf4974b80  _ZThn48_N9Microsoft17CognitiveServices6Speech4Impl20CSpxSpeechApiFactory32CreateSpeechRecognizerFromConfigEPKcNS1_12OutputFormatENSt6__ndk110shared_ptrINS2_15ISpxAudioConfigEEE
  # 7: 0x7bf4955fd0  recognizer_create_speech_recognizer_from_config
  # 8: 0x7bf4d5be4c  _ZN9Microsoft17CognitiveServices6Speech16SpeechRecognizer10FromConfigENSt6__ndk110shared_ptrINS1_12SpeechConfigEEENS4_INS1_5Audio11AudioConfigEEE
  # 9: 0x7bf4d4082c  Java_com_microsoft_cognitiveservices_speech_internal_carbon_1javaJNI_SpeechRecognizer_1FromConfig_1_1SWIG_10
  #10: 0x7c1211c794  ???
  #11: 0x7c1211346c  ???
  #12: 0x7c1211feb4  _ZN3art9ArtMethod6InvokeEPNS_6ThreadEPjjPNS_6JValueEPKc
  #13: 0x7c122cdbf4  _ZN3art11interpreter34ArtInterpreterToCompiledCodeBridgeEPNS_6ThreadEPNS_9ArtMethodEPKNS_7DexFile8CodeItemEPNS_11ShadowFrameEPNS_6JValueE
  #14: 0x7c122c7cb0  _ZN3art11interpreter6DoCallILb1ELb0EEEbPNS_9ArtMethodEPNS_6ThreadERNS_11ShadowFrameEPKNS_11InstructionEtPNS_6JValueE
  #15: 0x7c12596ae0  MterpInvokeStaticRange
  #16: 0x7c12106018  ExecuteMterpImpl
  #17: 0x7c12106018  ExecuteMterpImpl
  #18: 0x7c12106018  ExecuteMterpImpl
  #19: 0x7c12106018  ExecuteMterpImpl
  #20: 0x7c12106018  ExecuteMterpImpl
  #21: 0x7c12106018  ExecuteMterpImpl
  #22: 0x7c12106018  ExecuteMterpImpl
  #23: 0x7c12106018  ExecuteMterpImpl
  #24: 0x7c12106018  ExecuteMterpImpl
  #25: 0x7c12106018  ExecuteMterpImpl
  #26: 0x7c12106018  ExecuteMterpImpl
  #27: 0x7c12106018  ExecuteMterpImpl


    at com.microsoft.cognitiveservices.speech.internal.carbon_javaJNI.SpeechRecognizer_FromConfig__SWIG_0(Native Method)
    at com.microsoft.cognitiveservices.speech.internal.SpeechRecognizer.FromConfig(SpeechRecognizer.java:41)
    at com.microsoft.cognitiveservices.speech.SpeechRecognizer.<init>(SpeechRecognizer.java:88)
    at com.microsoft.cognitiveservices.speech.samples.sdkdemo.MainActivity.testFile(MainActivity.java:437)
    at com.microsoft.cognitiveservices.speech.samples.sdkdemo.MainActivity$1.onClick(MainActivity.java:230)
    at android.view.View.performClick(View.java:5646)
    at android.view.View$PerformClick.run(View.java:22473)
    at android.os.Handler.handleCallback(Handler.java:761)
    at android.os.Handler.dispatchMessage(Handler.java:98)
    at android.os.Looper.loop(Looper.java:156)
    at android.app.ActivityThread.main(ActivityThread.java:6523)
    at java.lang.reflect.Method.invoke(Native Method)
    at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:942)
    at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:832)

Phone info:
HUAWEI Honor 8, Android 7.0

Transcription of continuous stream?

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  • create SpeechFactory
  • create SpeechRecognizer
  • try to continuously send byte[]
  • not available...

Expected/desired behavior

Similar to the ProjectOxford Speech Recognition SDK (DataClient):

_speechClient = SpeechRecognitionServiceFactory.CreateDataClient(SpeechRecognitionMode.LongDictation, "en-us", "key");
_speechClient.SendAudio(audioChunk, audioChunk.Length); // audioChunk is byte[]

OS and Version?

Windows 10

Versions

Any :)

Mention any other details that might be useful

Is there a way to continuously stream audio to the speech service and get back a text transcript? Not a file, but a series of byte arrays for more than an hour (live speech transcription). I'm currently solving it with the old SDK and WebSockets, but I'm looking for ways to do it with this SDK (since there are limitations).
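
For comparison, here is a hedged sketch of how this could look with the newer 1.x C# API, which replaced SpeechFactory with SpeechConfig and added push-stream audio input (the exact names are based on the released 1.x surface):

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

static async Task StreamAsync(Func<Task<byte[]>> nextAudioChunk)
{
    var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourRegion");

    // Push stream: the app writes raw PCM chunks as they arrive (16 kHz, 16-bit, mono by default).
    using (var pushStream = AudioInputStream.CreatePushStream())
    using (var audioConfig = AudioConfig.FromStreamInput(pushStream))
    using (var recognizer = new SpeechRecognizer(config, audioConfig))
    {
        recognizer.Recognized += (s, e) => Console.WriteLine(e.Result.Text);

        await recognizer.StartContinuousRecognitionAsync();
        byte[] chunk;
        while ((chunk = await nextAudioChunk()) != null)
        {
            pushStream.Write(chunk); // analogous to the old SendAudio(audioChunk, length)
        }
        pushStream.Close();
        await recognizer.StopContinuousRecognitionAsync();
    }
}

Here nextAudioChunk is a hypothetical callback that yields the next byte[] of audio and returns null at end of stream.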

The application closes without any message while trying to use option 3 ("Speech recognition with file input").

Please provide us with the following information:

This issue is for a: (mark with an x)

- [X] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Run the application, select option 3

Any log messages given by the failure

No. The application closes without any message or trace.

Expected/desired behavior

Receive the text

OS and Version?

Windows 10 Pro

Versions

1803

Mention any other details that might be useful

The audio file is attached here.
20180531-134348_994114391_15361167-k_ENTMAY18-all.wav.zip

Gradle always downloads the .aar file (not Android!)

Hey there,

I want to include the Microsoft Speech SDK in my Gradle project. It should run on Linux.

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  • Create a gradle project
  • Set the repository and dependency like this (converted from Maven to Gradle)
repositories {
    maven {
        url "https://csspeechstorage.blob.core.windows.net/maven"
    }
}
dependencies {
    implementation "com.microsoft.cognitiveservices.speech:client-sdk:1.0.1"
}
  • Try to use sample code
  • --> Package com.microsoft.cognitiveservices.speech is not available

Any log messages given by the failure

Could not find com.microsoft.cognitiveservices.speech

Expected/desired behavior

The .jar file is downloaded instead of the .aar file

OS and Version?

Ubuntu 18.04

Versions

1.0.1

Mention any other details that might be useful

It seems like Gradle always downloads the .aar file.

Thank you very much! :)
Best regards

.net-core WebSocket Upgrade failed with a bad request (400)

All of a sudden I experience the following error:
WebSocket Upgrade failed with a bad request (400). Please check the language name and deployment id, and ensure the deployment id (if used) is correctly associated with the provided subscription key.

I provided the right subscription key and also the right location. Here's the code I'm using:

var stream = new MemoryStream(audioData);
stream.WriteWavHeader(false, 1, 16, 16000, audioData.Length / 2);

var factory = SpeechFactory.FromSubscription("key", "northeurope");
using (var recognizer = factory.CreateSpeechRecognizerWithStream(new BinaryAudioStreamReader(
    new AudioInputStreamFormat
    {
        FormatTag = 1,
        BitsPerSample = 16,
        BlockAlign = 2,
        Channels = 1,
        SamplesPerSec = 16000,
        AvgBytesPerSec = 32000
    }, stream), language))
{
    // only for single shot recognition like command or query. For long-running recognition, use
    // StartContinuousRecognitionAsync() instead.
    var recognitionResult = await recognizer.RecognizeAsync();

    ...
}

Am I doing something wrong? It worked two weeks ago or so.

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

see above

Any log messages given by the failure

WebSocket Upgrade failed with a bad request (400). Please check the language name and deployment id, and ensure the deployment id (if used) is correctly associated with the provided subscription key.

Expected/desired behavior

Everything works fine and I get my speech recognition result

OS and Version?

Windows 10 (on Parallels AND also on Azure App Service)

Versions

SDK .NET Core versions 0.5.0 (Communication Error 400) and 0.6.0 (see error message above)

UWP support for translations

With the 1.0 release, UWP apps throw an error when initializing the recognizer for translations, saying it is not supported in WinRT. Speech recognition works fine; it is just translation that throws the error. Did I miss this feature regression in the docs?

Speech Recognition with Intents is Incomplete

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [X] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

NA

Any log messages given by the failure

NA

Expected/desired behavior

When using the IntentRecognizer, it returns IntentRecognitionResult. The IntentRecognitionResult only has the intents and not the entities. I have found that I can add Microsoft.Bot.Builder.Ai.LUIS and related packages. I then have to cast this in a very confusing way.

var result = await recognizer.RecognizeAsync().ConfigureAwait(false);
var luisJson = result.Properties.Get(ResultPropertyKind.LanguageUnderstandingJson);
var luisResult = JsonConvert.DeserializeObject(luisJson);

Why would dependencies on the BotBuilder framework be needed? The BotBuilder framework is huge and not directly related. The predecessor of this project (ProjectOxford) seemed to have at least a Payload property for storing the JSON. In addition, it would be nice to have a LuisResult, or an extension that is not dependent on the whole BotBuilder framework, to get a complete intent object with entities, scoring, etc.
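
As a stopgap, the raw LUIS JSON can be parsed directly without any BotBuilder dependency. A sketch using Newtonsoft.Json; the topScoringIntent and entities field names follow the standard LUIS response shape, not an SDK type:

using System;
using Newtonsoft.Json.Linq;

var result = await recognizer.RecognizeAsync().ConfigureAwait(false);
var luisJson = result.Properties.Get(ResultPropertyKind.LanguageUnderstandingJson);

// The LUIS response carries a topScoringIntent object and an entities array.
var luis = JObject.Parse(luisJson);
Console.WriteLine($"intent: {(string)luis["topScoringIntent"]?["intent"]}" +
                  $" (score {(double?)luis["topScoringIntent"]?["score"]})");
foreach (var entity in luis["entities"] ?? new JArray())
{
    Console.WriteLine($"entity: {(string)entity["type"]} = {(string)entity["entity"]}");
}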

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). Other.
Windows 10

Versions

Version 0.5.0

Mention any other details that might be useful


Thanks! We'll be in touch soon.

missing .so native library for Android deployment

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Download/Clone the repo at https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/quickstart/java-android
  2. Build and Run the quick-start app according to the instructions provided
  3. Run on Pixel API 26 emulator

Any log messages given by the failure

The app crashes on start with the log:
08-22 22:08:33.601 20360-20360/com.microsoft.cognitiveservices.speech.samples.sdkdemo E/AndroidRuntime: FATAL EXCEPTION: main
Process: com.microsoft.cognitiveservices.speech.samples.sdkdemo, PID: 20360
java.lang.UnsatisfiedLinkError: dalvik.system.PathClassLoader[DexPathList
... truncated..
couldn't find "libMicrosoft.CognitiveServices.Speech.java.bindings.so"
at java.lang.Runtime.loadLibrary0(Runtime.java:1011)
at java.lang.System.loadLibrary(System.java:1657)
at com.microsoft.cognitiveservices.speech.SpeechFactory.(SpeechFactory.java:52)
at com.microsoft.cognitiveservices.speech.SpeechFactory.configureNativePlatformBindingWithDefaultCertificate(SpeechFactory.java:81)
08-22 22:08:33.603 20360-20360/com.microsoft.cognitiveservices.speech.samples.sdkdemo

Expected/desired behavior

OS and Version?

Android SDK API 26
Android Studio 3.1.4

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Possible memory leaks in C++ for Linux

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Unfortunately, I don't have a reliable way to reproduce the complete issue with a simple example. There are two issues as far as I can tell. One of them can be reproduced every time, but the second one occurs only rarely with the following sample; however, in my real code I encounter the issue every time.

#include <cstring>
#include <thread>

#include <speechapi_cxx.h>

using namespace Microsoft::CognitiveServices::Speech;

class Stream : public AudioInputStream {
public:
    size_t GetFormat(_AudioInputStreamFormatCPP* format, size_t size) {
        if (size >= sizeof(_AudioInputStreamFormatCPP)) {
            format->FormatTag = 1;  // == PCM
            format->Channels = 1;
            format->SamplesPerSec = 16000;
            format->BitsPerSample = 16;
            format->BlockAlign = 2;  // == BitsPerSample / 8
            format->AvgBytesPerSec = 32000;  // == SamplesPerSec * Channels * (BitsPerSample / 8)
        }

        return sizeof(_AudioInputStreamFormatCPP);
    }

    size_t Read(char* dataBuffer, size_t size) {
        memset(dataBuffer, 0, size);
        return size;
    }

    void Close() {

    }
};

void test() {
    auto factory = SpeechFactory::FromSubscription(L"...", L"...");
    
    auto stream = std::make_shared<Stream>();
    auto recognizer = factory->CreateSpeechRecognizerWithStream(stream);
    
    recognizer->IntermediateResult += [](const SpeechRecognitionEventArgs& args) {};
    recognizer->FinalResult += [](const SpeechRecognitionEventArgs& args) {};
    recognizer->Canceled += [](const SpeechRecognitionEventArgs& args) {};
    
    recognizer->StartContinuousRecognitionAsync();
    
    std::this_thread::sleep_for(std::chrono::seconds(10));
}

int main() {
    test();
    return 0;
}

Any log messages given by the failure

$ valgrind --leak-check=full ./test 
==8327== Memcheck, a memory error detector
==8327== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==8327== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==8327== Command: ./test
==8327== 
==8327== 
==8327== HEAP SUMMARY:
==8327==     in use at exit: 173,615 bytes in 3,313 blocks
==8327==   total heap usage: 7,471 allocs, 4,158 frees, 659,368 bytes allocated
==8327== 
==8327== 88 (16 direct, 72 indirect) bytes in 1 blocks are definitely lost in loss record 586 of 661
==8327==    at 0x4C2E0EF: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8327==    by 0x4E8D80D: SpeechFactory_CreateSpeechRecognizer_With_Stream (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x40EF3F: Microsoft::CognitiveServices::Speech::SpeechFactory::InternalCognitiveServicesSpeechFactory::CreateSpeechRecognizerWithStream(Microsoft::CognitiveServices::Speech::SpeechFactory::InternalCognitiveServicesSpeechFactory::SpeechApi_AudioInputStreamAdapter*) (speechapi_cxx_factory.h:412)
==8327==    by 0x40EEDA: Microsoft::CognitiveServices::Speech::SpeechFactory::InternalCognitiveServicesSpeechFactory::CreateSpeechRecognizerWithStream(std::shared_ptr<Microsoft::CognitiveServices::Speech::AudioInputStream> const&) (speechapi_cxx_factory.h:406)
==8327==    by 0x406AAA: test() (test.cpp:37)
==8327==    by 0x406D37: main (test.cpp:49)
==8327== 
==8327== 415 (24 direct, 391 indirect) bytes in 1 blocks are definitely lost in loss record 615 of 661
==8327==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8327==    by 0x4F0B0DB: json_value_init_object (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EF83CC: populate_event_timestamp(void**, char const*, char const*, char const*) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EF94FB: inband_event_timestamp_populate (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EEBED6: Microsoft::CognitiveServices::Speech::USP::Connection::Impl::QueueAudioSegment(unsigned char const*, unsigned long) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EBDA95: Microsoft::CognitiveServices::Speech::Impl::CSpxUspRecoEngineAdapter::UspWriteFormat(Microsoft::CognitiveServices::Speech::Impl::WAVEFORMATEX*) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EC13D5: Microsoft::CognitiveServices::Speech::Impl::CSpxUspRecoEngineAdapter::ProcessAudio(std::shared_ptr<unsigned char>, unsigned int) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EC9AB7: Microsoft::CognitiveServices::Speech::Impl::CSpxAudioStreamSession::ProcessAudioDataNow(std::shared_ptr<unsigned char>, unsigned int) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4ED35EA: Microsoft::CognitiveServices::Speech::Impl::CSpxAudioStreamSession::ProcessAudio(std::shared_ptr<unsigned char>, unsigned int) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EE0809: Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump::PumpThread(std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump>, std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::ISpxAudioProcessor>) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EE129D: std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump::*)(std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump>, std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::ISpxAudioProcessor>)> (Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump*, std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump>, std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::ISpxAudioProcessor>)> >::_M_run() (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x520BC7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==8327== 
==8327== 426 (24 direct, 402 indirect) bytes in 1 blocks are definitely lost in loss record 616 of 661
==8327==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8327==    by 0x4F0B0DB: json_value_init_object (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EF90E4: inband_event_key_value_populate (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EED4E4: Microsoft::CognitiveServices::Speech::USP::Connection::Impl::Connect() (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EE8032: Microsoft::CognitiveServices::Speech::USP::Connection::Connection(Microsoft::CognitiveServices::Speech::USP::Client const&) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EE8101: Microsoft::CognitiveServices::Speech::USP::Client::Connect() (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EC18B4: Microsoft::CognitiveServices::Speech::Impl::CSpxUspRecoEngineAdapter::UspInitialize() (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EC2ACB: Microsoft::CognitiveServices::Speech::Impl::CSpxUspRecoEngineAdapter::SetFormat(Microsoft::CognitiveServices::Speech::Impl::WAVEFORMATEX*) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4ECAA72: Microsoft::CognitiveServices::Speech::Impl::CSpxAudioStreamSession::SetFormat(Microsoft::CognitiveServices::Speech::Impl::WAVEFORMATEX*) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EE054B: Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump::PumpThread(std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump>, std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::ISpxAudioProcessor>) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EE129D: std::thread::_Impl<std::_Bind_simple<std::_Mem_fn<void (Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump::*)(std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump>, std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::ISpxAudioProcessor>)> (Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump*, std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::CSpxStreamPump>, std::shared_ptr<Microsoft::CognitiveServices::Speech::Impl::ISpxAudioProcessor>)> >::_M_run() (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x520BC7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==8327== 
==8327== 475 (24 direct, 451 indirect) bytes in 1 blocks are definitely lost in loss record 619 of 661
==8327==    at 0x4C2DB8F: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==8327==    by 0x4F0B0DB: json_value_init_object (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EF83CC: populate_event_timestamp(void**, char const*, char const*, char const*) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EF8D7D: inband_connection_telemetry (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EFCF6B: TransportDoWork (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EEE70D: Microsoft::CognitiveServices::Speech::USP::Connection::Impl::WorkThread(std::weak_ptr<Microsoft::CognitiveServices::Speech::USP::Connection::Impl>) (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x4EF4BF2: std::thread::_Impl<std::_Bind_simple<void (*(std::shared_ptr<Microsoft::CognitiveServices::Speech::USP::Connection::Impl>))(std::weak_ptr<Microsoft::CognitiveServices::Speech::USP::Connection::Impl>)> >::_M_run() (in ./lib/x64/libMicrosoft.CognitiveServices.Speech.core.so)
==8327==    by 0x520BC7F: ??? (in /usr/lib/x86_64-linux-gnu/libstdc++.so.6.0.21)
==8327==    by 0x56F26B9: start_thread (pthread_create.c:333)
==8327==    by 0x5A0F41C: clone (clone.S:109)
==8327== 
==8327== LEAK SUMMARY:
==8327==    definitely lost: 88 bytes in 4 blocks
==8327==    indirectly lost: 1,316 bytes in 32 blocks
==8327==      possibly lost: 0 bytes in 0 blocks
==8327==    still reachable: 172,211 bytes in 3,277 blocks
==8327==         suppressed: 0 bytes in 0 blocks
==8327== Reachable blocks (those to which a pointer was found) are not shown.
==8327== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==8327== 
==8327== For counts of detected and suppressed errors, rerun with: -v
==8327== ERROR SUMMARY: 4 errors from 4 contexts (suppressed: 0 from 0)

Expected/desired behavior

The two issues I see are:

  1. The first leak reported by valgrind occurs every time when using the stream-based speech recognition.
  2. The 3 other leaks reported by valgrind occur rarely in the sample, so my best guess is that there is a race-condition somewhere when the recognition handle is closed.

OS and Version?

Linux ... 4.4.0-131-generic #157-Ubuntu SMP Thu Jul 12 15:51:36 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Versions

0.5.0

Mention any other details that might be useful


Thanks! We'll be in touch soon.

access violation in AddIntent

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

auto factory = SpeechFactory::FromSubscription(L"xxxx", L"westeurope");
auto recognizer = factory->CreateIntentRecognizer(L"en-US");
auto model = Intent::LanguageUnderstandingModel::FromAppId(L"xxxx");
recognizer->AddIntent(L"id1", model); // <- access violation in speechapi_cxx_intent_trigger.h line 47

But

recognizer->AddIntent(L"dummy", model, L"dummy"); // this works fine

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 10 and Ubuntu 16.04

Versions

Speech SDK 0.5 and 0.6

Mention any other details that might be useful


Thanks! We'll be in touch soon.

Speech to text conversion seems to handle max 15~16 seconds of audio

Please provide us with the following information:

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Select any wav audio file longer than 16 seconds. I tried converting files using both option 3 and option 6, but only up to 15~16 seconds of audio is recognized and converted into text. The rest of the audio is ignored. There is no error message.

Any log messages given by the failure

Expected/desired behavior

Entire audio file should be converted.

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). Other.
Windows 10 pro

Versions

Mention any other details that might be useful

The REST API documentation does mention that it handles a maximum of 15 seconds, but longer audio is supposed to be handled by the SDK. I am using the SDK.
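
For context, a single-shot recognize call returns only the first recognized utterance; long files need continuous recognition. A hedged sketch against the 1.x C# API ("YourSubscriptionKey" and "YourRegion" are placeholders):

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Microsoft.CognitiveServices.Speech.Audio;

static async Task RecognizeLongFileAsync(string wavPath)
{
    var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourRegion");
    using (var audioInput = AudioConfig.FromWavFileInput(wavPath))
    using (var recognizer = new SpeechRecognizer(config, audioInput))
    {
        var done = new TaskCompletionSource<int>();

        // Fires once per recognized utterance, so long audio yields many results.
        recognizer.Recognized += (s, e) => Console.WriteLine(e.Result.Text);
        recognizer.SessionStopped += (s, e) => done.TrySetResult(0);
        recognizer.Canceled += (s, e) => done.TrySetResult(0);

        await recognizer.StartContinuousRecognitionAsync();
        await done.Task; // wait until the whole file has been processed
        await recognizer.StopContinuousRecognitionAsync();
    }
}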


Thanks! We'll be in touch soon.

Could not create speech recognizer

I was trying to run the sample app and to use SPXTranslationRecognizer. However, I can't instantiate an instance of this class.

Steps to reproduce:

SPXSpeechTranslationConfiguration *config = [[SPXSpeechTranslationConfiguration alloc] initWithSubscription:speechKey region:serviceRegion];

self.recognizer = [[SPXTranslationRecognizer alloc] init:config];

When I instantiate SPXTranslationRecognizer, I get an exception:

Exception caught when creating SPXTranslationRecognizer in core.

Recognition failed with Canceled. Did you enter your subscription? WebSocket Upgrade failed with an authentication error (401). Please check the subscription key or the authorization token, and the region name.

I pulled down this repo and built it in Android Studio.

When I hit the recognize button I get this error:

"Recognition failed with Canceled. Did you enter your subscription?
WebSocket Upgrade failed with an authentication error (401). Please check the subscription key or the authorization token, and the region name."

I was able to verify my subscription key as valid by using PowerShell to test a POST (see below):
$FetchTokenHeader = @{
'Content-type'='application/x-www-form-urlencoded'
'Content-Length'= '0'
'Ocp-Apim-Subscription-Key' = 'my key here'
}
$OAuthToken = Invoke-RestMethod -Method POST -Uri https://api.cognitive.microsoft.com/sts/v1.0/issueToken -Headers $FetchTokenHeader
$OAuthToken

The test was successful and I got an auth token. So I know that is correct.

But I could not find what region my resource was in; it just said global.

I tried every one of the regions listed on the Azure website, including global, to no avail.

1.) How do I definitively determine the region of my Azure Bing resource?
2.) Why can I not get around this error with a valid subscription key and the valid region? I tested all regions; one of them had to be correct.

FYI - Once I can get the reco working on this, I will be porting this code to Xamarin.

That usually means doing a binding library for the .aar and referencing it that way.

Please advise if this is the easiest path. It seems that the NuGet package out there was geared to C# and not necessarily Xamarin.

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [ ] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Build and deploy to a Samsung S2 tablet
Click the recognize button
View the error

Any log messages given by the failure

09-19 15:38:24.108 27054-27054/? E/Zygote: v2
09-19 15:38:24.109 27054-27054/? I/libpersona: KNOX_SDCARD checking this for 10519
KNOX_SDCARD not a persona
09-19 15:38:24.109 27054-27054/? E/Zygote: accessInfo : 0
09-19 15:38:24.110 27054-27054/? W/SELinux: SELinux selinux_android_compute_policy_index : Policy Index[2], Con:u:r:zygote:s0 RAM:SEPF_SECMOBILE_7.0_0010, [-1 -1 -1 -1 0 1]
09-19 15:38:24.111 27054-27054/? I/SELinux: SELinux: seapp_context_lookup: seinfo=untrusted, level=s0:c512,c768, pkgname=com.microsoft.cognitiveservices.speech.samples.sdkdemo
09-19 15:38:24.113 27054-27054/? I/art: Late-enabling -Xcheck:jni
09-19 15:38:24.149 27054-27054/? D/TimaKeyStoreProvider: TimaKeyStore is not enabled: cannot add TimaSignature Service and generateKeyPair Service
09-19 15:38:24.277 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo I/InstantRun: starting instant run server: is main process
09-19 15:38:24.341 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo W/art: Before Android 4.1, method android.graphics.PorterDuffColorFilter android.support.graphics.drawable.VectorDrawableCompat.updateTintFilter(android.graphics.PorterDuffColorFilter, android.content.res.ColorStateList, android.graphics.PorterDuff$Mode) would have incorrectly overridden the package-private method in android.graphics.drawable.Drawable
09-19 15:38:24.412 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/TextView: setTypeface with style : 0
09-19 15:38:24.413 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/TextView: setTypeface with style : 0
09-19 15:38:24.441 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/TextView: setTypeface with style : 0
09-19 15:38:24.444 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/TextView: setTypeface with style : 0
09-19 15:38:24.446 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/TextView: setTypeface with style : 0
09-19 15:38:24.447 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/TextView: setTypeface with style : 0
09-19 15:38:24.449 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/TextView: setTypeface with style : 0
09-19 15:38:24.531 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/ViewRootImpl@8692e94[MainActivity]: ThreadedRenderer.create() translucent=false
09-19 15:38:24.536 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/InputTransport: Input channel constructed: fd=76
09-19 15:38:24.536 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/ViewRootImpl@8692e94[MainActivity]: setView = DecorView@b721d3d[MainActivity] touchMode=true
09-19 15:38:24.558 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/ViewRootImpl@8692e94[MainActivity]: dispatchAttachedToWindow
09-19 15:38:24.605 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/ViewRootImpl@8692e94[MainActivity]: Relayout returned: oldFrame=[0,0][0,0] newFrame=[0,0][1536,2048] result=0x27 surface={isValid=true 547617089024} surfaceGenerationChanged=true
mHardwareRenderer.initialize() mSurface={isValid=true 547617089024} hwInitialized=true
09-19 15:38:24.605 27054-27072/com.microsoft.cognitiveservices.speech.samples.sdkdemo I/Adreno: QUALCOMM build : e488ad0, Ie9c95840c4
Build Date : 10/11/17
OpenGL ES Shader Compiler Version: XE031.09.00.03
Local Branch :
Remote Branch : refs/tags/AU_LINUX_ANDROID_LA.BR.1.3.6.C1_RB1.07.00.00.267.008
Remote Branch : NONE
Reconstruct Branch : NOTHING
09-19 15:38:24.610 27054-27072/com.microsoft.cognitiveservices.speech.samples.sdkdemo I/OpenGLRenderer: Initialized EGL, version 1.4
09-19 15:38:24.610 27054-27072/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/OpenGLRenderer: Swap behavior 1
09-19 15:38:24.635 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo W/art: Before Android 4.1, method int android.support.v7.widget.DropDownListView.lookForSelectablePosition(int, boolean) would have incorrectly overridden the package-private method in android.widget.ListView
09-19 15:38:24.659 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/ViewRootImpl@8692e94[MainActivity]: MSG_RESIZED_REPORT: frame=Rect(0, 0 - 1536, 2048) ci=Rect(0, 48 - 0, 0) vi=Rect(0, 48 - 0, 0) or=1
MSG_WINDOW_FOCUS_CHANGED 1
mHardwareRenderer.initializeIfNeeded()#2 mSurface={isValid=true 547617089024}
09-19 15:38:24.660 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo V/InputMethodManager: Starting input: tba=android.view.inputmethod.EditorInfo@8224765 nm : com.microsoft.cognitiveservices.speech.samples.sdkdemo ic=null
09-19 15:38:24.660 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo I/InputMethodManager: [IMM] startInputInner - mService.startInputOrWindowGainedFocus
09-19 15:38:24.668 27054-27068/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/InputTransport: Input channel constructed: fd=79
09-19 15:38:24.681 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/ViewRootImpl@8692e94[MainActivity]: Relayout returned: oldFrame=[0,0][1536,2048] newFrame=[0,0][1536,2048] result=0x1 surface={isValid=true 547617089024} surfaceGenerationChanged=false
09-19 15:38:24.687 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo V/InputMethodManager: Starting input: tba=android.view.inputmethod.EditorInfo@5775c3a nm : com.microsoft.cognitiveservices.speech.samples.sdkdemo ic=null
09-19 15:38:29.008 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/ViewRootImpl@8692e94[MainActivity]: ViewPostImeInputStage processPointer 0
09-19 15:38:29.015 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo V/BoostFramework: mAcquireFunc method = public int com.qualcomm.qti.Performance.perfLockAcquire(int,int[])
mReleaseFunc method = public int com.qualcomm.qti.Performance.perfLockRelease()
mAcquireTouchFunc method = public int com.qualcomm.qti.Performance.perfLockAcquireTouch(android.view.MotionEvent,android.util.DisplayMetrics,int,int[])
mIOPStart method = public int com.qualcomm.qti.Performance.perfIOPrefetchStart(int,java.lang.String)
mIOPStop method = public int com.qualcomm.qti.Performance.perfIOPrefetchStop()
09-19 15:38:29.020 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo V/BoostFramework: BoostFramework() : mPerf = com.qualcomm.qti.Performance@36786f4
09-19 15:38:29.107 27054-27054/com.microsoft.cognitiveservices.speech.samples.sdkdemo D/ViewRootImpl@8692e94[MainActivity]: ViewPostImeInputStage processPointer 1
09-19 15:38:30.401 27054-27080/com.microsoft.cognitiveservices.speech.samples.sdkdemo I/reco 1: Recognizer returned: Recognition failed with Canceled. Did you enter your subscription?
WebSocket Upgrade failed with an authentication error (401). Please check the subscription key or the authorization token, and the region name.

Expected/desired behavior

speech to text recognition would occur

OS and Version?

Windows 10 running Android Studio 3.1.4 => deploying to Samsung T713 S2 tablet running Android 7.0

Versions

Mention any other details that might be useful


Thanks! We'll be in touch soon.

.NET Standard Support

This issue is for a: (mark with an x)

- [x] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

  1. Start up a .NET Core 2 or above console application
  2. Install the Microsoft.CognitiveServices.Speech nuget package
  3. Package installs with no libraries referenced.

Any log messages given by the failure

None

Expected/desired behavior

  1. Start up a .NET Core 2 or above console application
  2. Install the Microsoft.CognitiveServices.Speech nuget package
  3. Package installs setting up the required references (bringing in the Windows Compatibility Pack if need be)

OS and Version?

Windows 10

Versions

0.4.0

Mention any other details that might be useful

It would be great if we could get some .NET Standard support working in .NET Core - even if it only works on Windows, we can at least throw it in a Windows Server Docker container 👍

[iOS] SPXPushAudioInputStream doesn't work

I wrote a class named MSAudioStream that inherits from SPXPushAudioInputStream
and call the method -(void)write:(NSData*)data on a thread,
like this: [self write:data];
The data is read from the microphone, and I'm sure the data is OK because I saved it to a WAV file (16 kHz, 16-bit) and the WAV file is fine.

Here is part of my code:

MSAudioStream* stream = [ MSAudioStream new];
SPXAudioConfiguration* audioCfg = [[ SPXAudioConfiguration alloc] initWithStreamInput: stream ];
SPXSpeechConfiguration* speechCfg = [[ SPXSpeechConfiguration alloc] initWithSub.....

{Microsoft.CognitiveServices.Speech.SpeechFactory.FactoryParametersImpl} 'factory.EndpointURL' threw an exception of type 'System.UriFormatException'

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [ ] feature request
- [x] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Downloaded the Sample for Quickstart: Recognize speech in C# under .NET Core on Windows using the Speech SDK - https://docs.microsoft.com/en-us/azure/cognitive-services/speech-service/quickstart-csharp-dotnetcore-windows
Pasted the SubscriptionKey and Region as follows and followed the sample

// Creates an instance of a speech factory with specified
// subscription key and service region. Replace with your own subscription key
// and service region (e.g., "westus").

var factory = SpeechFactory.FromSubscription($"{SUBKEY}", "westus");

Also tried replacing the region with https://westus.api.cognitive.microsoft.com/sts/v1.0 as provided in the Free Service Activation. I get the following error on the factory:

factory.EndpointURL = 'factory.EndpointURL' threw an exception of type 'System.UriFormatException' (System.Uri {System.UriFormatException}), Message = "Invalid URI: The URI is empty."
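
For illustration, the region argument of FromSubscription is the short region identifier, not a URL; a full endpoint URL would instead go through a separate FromEndpoint overload (a sketch, assuming this SDK version exposes FromEndpoint):

// Correct: short region name, not the token-service URL.
var factory = SpeechFactory.FromSubscription("YourSubscriptionKey", "westus");

// If a full endpoint is really needed, it must be a complete, well-formed URI.
var factory2 = SpeechFactory.FromEndpoint(
    new Uri("wss://westus.stt.speech.microsoft.com/speech/recognition/interactive/cognitiveservices/v1"),
    "YourSubscriptionKey");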

Any log messages given by the failure

factory.EndpointURL threw an exception of type 'System.UriFormatException': "Invalid URI: The URI is empty."

Expected/desired behavior

Activate Microphone and Process Utterance

OS and Version?

Windows 10 Build 10.0.17134

Versions

.NET Framework 4.6.1

Mention any other details that might be useful


Thanks! We'll be in touch soon.

How can I get the confidence score from the speech result?

Please provide us with the following information:

This issue is for a: (mark with an x)

- [ ] bug report -> please search issues before submitting
- [x] feature request
- [ ] documentation issue or request
- [ ] regression (a behavior that used to work and stopped in a new release)

Minimal steps to reproduce

Any log messages given by the failure

Expected/desired behavior

OS and Version?

Windows 7, 8 or 10. Linux (which distribution). Other.

Versions

Mention any other details that might be useful

During this Speech SDK port, I couldn't find any confidence score that I could use to decide whether to retry the ASR.
On the result class, I've only found: ErrorDetails, Properties, Reason, ResultId, Text.

Is there any way to get the confidence score, or to get the result as an NBest list?
(I used to get a confidence score from the former Bing ASR API.)

Please let me know if you have answers or plans.
Thank you in advance, Good day! ;)
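
One possible route, sketched against the 1.x C# API as an assumption: request detailed output and read the NBest list (each alternative carries a Confidence field) from the raw JSON result.

using System;
using System.Threading.Tasks;
using Microsoft.CognitiveServices.Speech;
using Newtonsoft.Json.Linq;

static async Task ShowConfidenceAsync()
{
    var config = SpeechConfig.FromSubscription("YourSubscriptionKey", "YourRegion");
    config.OutputFormat = OutputFormat.Detailed; // ask the service for NBest + confidence

    using (var recognizer = new SpeechRecognizer(config))
    {
        var result = await recognizer.RecognizeOnceAsync();

        // The detailed response is exposed as raw JSON; NBest entries carry a Confidence field.
        var json = result.Properties.GetProperty(PropertyId.SpeechServiceResponse_JsonResult);
        var best = JObject.Parse(json)["NBest"]?[0];
        Console.WriteLine($"{(string)best?["Display"]} (confidence {(double?)best?["Confidence"]})");
    }
}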


Thanks! We'll be in touch soon.
