picovoice / rhino

On-device Speech-to-Intent engine powered by deep learning

Home Page: https://picovoice.ai/

License: Apache License 2.0

Languages: C 3.60%, Python 16.08%, Java 9.13%, Swift 7.86%, C# 12.01%, Ruby 0.60%, TypeScript 14.12%, Shell 0.49%, Dart 6.94%, Go 10.62%, Rust 10.88%, JavaScript 7.67%
Topics: natural-language-understanding, voice-recognition, nlu, spoken-language-understanding, voice-assistant, voice-ui, voice-user-interface, speech-recognition, voice-commands, voice-control

rhino's Introduction

Rhino


Made in Vancouver, Canada by Picovoice


Rhino is Picovoice's Speech-to-Intent engine. It directly infers intent from spoken commands within a given context of interest, in real-time. For example, given a spoken command:

Can I have a small double-shot espresso?

Rhino infers what the user wants and emits the following inference result:

{
  "isUnderstood": "true",
  "intent": "orderBeverage",
  "slots": {
    "beverage": "espresso",
    "size": "small",
    "numberOfShots": "2"
  }
}

Rhino is:

  • using deep neural networks trained in real-world environments.
  • compact and computationally-efficient. It is perfect for IoT.
  • cross-platform:
    • Arm Cortex-M, STM32, Arduino, and i.MX RT
    • Raspberry Pi, NVIDIA Jetson Nano, and BeagleBone
    • Android and iOS
    • Chrome, Safari, Firefox, and Edge
    • Linux (x86_64), macOS (x86_64, arm64), and Windows (x86_64)
  • self-service. Developers can train custom contexts using Picovoice Console.


Use Cases

Rhino is the right choice if the domain of voice interactions is specific (limited).

  • If you want to create voice experiences similar to Alexa or Google, see the Picovoice platform.
  • If you need to recognize a few static (always listening) voice commands, see Porcupine.

Try It Out

Rhino in Action

Language Support

Performance

A comparison between the accuracy of Rhino and major cloud-based alternatives is provided here.

Terminology

Rhino infers the user's intent from spoken commands within a domain of interest. We refer to such a specialized domain as a Context. A context can be thought of as a set of voice commands, each mapped to an intent:

turnLightOff:
  - Turn off the lights in the office
  - Turn off all lights
setLightColor:
  - Set the kitchen lights to blue

In the examples above, each voice command is called an Expression. Expressions are what we expect the user to utter to interact with our voice application.

Consider the expression:

Turn off the lights in the office

What we require from Rhino is:

  1. To infer the intent (turnLightOff)
  2. To record the specific details from the utterance, in this case the location (office)

We can capture these details using slots by updating the expression:

turnLightOff:
  - Turn off the lights in the $location:lightLocation.

$location:lightLocation means that we expect a variable of type location to occur, and we want to capture its value in a variable named lightLocation. We call such a variable a Slot. Slots give us the ability to capture details of spoken commands. Each slot type is defined as a set of phrases. For example:

lightLocation:
  - "attic"
  - "balcony"
  - "basement"
  - "bathroom"
  - "bedroom"
  - "entrance"
  - "kitchen"
  - "living room"
  - ...

You can create custom contexts using the Picovoice Console.

To learn the complete expression syntax of Rhino, see the Speech-to-Intent Syntax Cheat Sheet.
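Putting the pieces together, a complete lighting context pairs each intent's expressions with the slot phrase lists they reference. The sketch below is illustrative only: it generalizes the setLightColor example with a hypothetical lightColor slot, and the phrase lists are abbreviated:

turnLightOff:
  - Turn off the lights in the $location:lightLocation
  - Turn off all lights
setLightColor:
  - Set the $location:lightLocation lights to $color:lightColor

lightLocation:
  - "kitchen"
  - "living room"
  - "bedroom"
lightColor:
  - "blue"
  - "purple"
  - "red"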

Demos

If using SSH, clone the repository with:

git clone --recurse-submodules git@github.com:Picovoice/rhino.git

If using HTTPS, clone the repository with:

git clone --recurse-submodules https://github.com/Picovoice/rhino.git

Python Demos

Install the demo package:

sudo pip3 install pvrhinodemo

With a working microphone connected to your device run the following in the terminal:

rhino_demo_mic --access_key ${ACCESS_KEY} --context_path ${CONTEXT_PATH}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${CONTEXT_PATH} with either a context file created using Picovoice Console or one within the repository.

For more information about Python demos, go to demo/python.

.NET Demos

The Rhino .NET demo is a command-line application that lets you choose between running Rhino on an audio file or on real-time microphone input.

Make sure there is a working microphone connected to your device. From demo/dotnet/RhinoDemo run the following in the terminal:

dotnet run -c MicDemo.Release -- --access_key ${ACCESS_KEY} --context_path ${CONTEXT_FILE_PATH}

Replace ${ACCESS_KEY} with your Picovoice AccessKey and ${CONTEXT_FILE_PATH} with either a context file created using Picovoice Console or one within the repository.

For more information about .NET demos, go to demo/dotnet.

Java Demos

The Rhino Java demo is a command-line application that lets you choose between running Rhino on an audio file or on real-time microphone input.

To try the real-time demo, make sure there is a working microphone connected to your device. Then invoke the following commands from the terminal:

cd demo/java
./gradlew build
cd build/libs
java -jar rhino-mic-demo.jar -a ${ACCESS_KEY} -c ${CONTEXT_FILE_PATH}

Replace ${CONTEXT_FILE_PATH} with either a context file created using Picovoice Console or one within the repository.

For more information about Java demos go to demo/java.

Go Demos

The demo requires cgo, which on Windows may mean that you need to install a gcc compiler like Mingw to build it properly.

From demo/go run the following command from the terminal to build and run the mic demo:

go run micdemo/rhino_mic_demo.go -access_key ${ACCESS_KEY} -context_path ${CONTEXT_FILE_PATH}

Replace ${ACCESS_KEY} with your Picovoice AccessKey and ${CONTEXT_FILE_PATH} with either a context file created using Picovoice Console or one from the Rhino GitHub repository.

For more information about Go demos go to demo/go.

Unity Demos

To run the Rhino Unity demo, import the Rhino Unity package into your project, open the RhinoDemo scene, and hit play. To run on other platforms or in the player, go to File > Build Settings, choose your platform, and hit the Build and Run button. To browse the demo source, go to demo/unity.

Flutter Demos

To run the Rhino demo on Android or iOS with Flutter, you must have the Flutter SDK installed on your system. Once installed, you can run flutter doctor to determine any other missing requirements for your relevant platform. Once your environment has been set up, launch a simulator or connect an Android/iOS device.

Run the prepare_demo script with a language code to set up the demo in the language of your choice (e.g. de -> German, ko -> Korean). To see a list of available languages, run prepare_demo without a language code.

cd demo/flutter
dart scripts/prepare_demo.dart ${LANGUAGE}

Run the following command to build and deploy the demo to your device:

cd demo/flutter
flutter run

Once the demo app has started, press the start button and utter a command within the context to start inference. To see more details about the current context, press the Context Info button in the top right corner of the app.

React Native Demos

To run the React Native Rhino demo app you will first need to set up your React Native environment. For this, please refer to React Native's documentation. Once your environment has been set up, navigate to demo/react-native to run the following commands:

For Android:

yarn android-install    # sets up environment
yarn android-run        # builds and deploys to Android

For iOS:

yarn ios-install        # sets up environment
yarn ios-run            # builds and deploys to iOS

Both demos use a smart lighting context, which can understand commands such as:

Turn off the lights.

or

Set the lights in the living room to purple.

Android Demos

Using Android Studio, open demo/android/Activity as an Android project and then run the application.

Once the demo app has started, press the Start button and speak a command from the context to start inference. To see more details about the current context, press the Show Context button in the top right corner of the app.

For more information about Android demo, go to demo/android.

iOS Demos

To run the application demo:

  1. From the demo directory run:

     pod install

  2. Open RhinoDemo.xcworkspace in Xcode.

  3. Replace let accessKey = "${YOUR_ACCESS_KEY_HERE}" in the file ContentView.swift with your AccessKey.

  4. Go to Product > Scheme and select the scheme for the language you would like to demo (e.g. arDemo -> Arabic Demo, deDemo -> German Demo).

  5. Run the demo with a simulator or connected iOS device.

  6. Once the demo app has started, press the Start button to infer audio within the context. To see more details about the current context, press the Context Info button in the top right corner of the app.

For more information about iOS demo, go to demo/ios.

Web Demos

Vanilla JavaScript and HTML

From demo/web use yarn or npm to install the dependencies, and the start script with a language code to start a local web server hosting the demo in the language of your choice (e.g. pl -> Polish, ko -> Korean). To see a list of available languages, run start without a language code.

yarn
yarn start ${LANGUAGE}

(or)

npm install
npm run start ${LANGUAGE}

Open http://localhost:5000 in your browser to try the demo.

Angular Demos

From demo/angular use yarn or npm to install the dependencies, and the start script with a language code to start a local web server hosting the demo in the language of your choice (e.g. pl -> Polish, ko -> Korean). To see a list of available languages, run start without a language code.

yarn
yarn start ${LANGUAGE}

(or)

npm install
npm run start ${LANGUAGE}

Open http://localhost:4200 in your browser to try the demo.

React Demos

From demo/react use yarn or npm to install the dependencies, and the start script with a language code to start a local web server hosting the demo in the language of your choice (e.g. pl -> Polish, ko -> Korean). To see a list of available languages, run start without a language code.

yarn
yarn start ${LANGUAGE}

(or)

npm install
npm run start ${LANGUAGE}

Open http://localhost:3000 in your browser to try the demo.

Vue Demos

From demo/vue use yarn or npm to install the dependencies, and the start script with a language code to start a local web server hosting the demo in the language of your choice (e.g. pl -> Polish, ko -> Korean). To see a list of available languages, run start without a language code.

yarn
yarn start ${LANGUAGE}

(or)

npm install
npm run start ${LANGUAGE}

The command-line output will provide you with a localhost link and port to open in your browser.

Node.js Demos

Install the demo package:

yarn global add @picovoice/rhino-node-demo

With a working microphone connected to your device, run the following in the terminal:

rhn-mic-demo --access_key ${ACCESS_KEY} --context_path ${CONTEXT_FILE_PATH}

Replace ${CONTEXT_FILE_PATH} with either a context file created using Picovoice Console or one within the repository.

For more information about Node.js demos go to demo/nodejs.

Rust Demos

This demo opens an audio stream from a microphone and performs inference on spoken commands. From demo/rust/micdemo run the following:

cargo run --release -- --access_key ${ACCESS_KEY} --context_path ${CONTEXT_FILE_PATH}

Replace ${CONTEXT_FILE_PATH} with either a context file created using Picovoice Console or one within the repository.

For more information about Rust demos go to demo/rust.

C Demos

The C demo requires CMake version 3.4 or higher.

Windows Requires MinGW to build the demo.

Microphone Demo

At the root of the repository, build with:

cmake -S demo/c/. -B demo/c/build && cmake --build demo/c/build --target rhino_demo_mic
Linux (x86_64), macOS (x86_64, arm64), Raspberry Pi, BeagleBone, and Jetson

List input audio devices with:

./demo/c/build/rhino_demo_mic --show_audio_devices

Run the demo using:

./demo/c/build/rhino_demo_mic -l ${RHINO_LIBRARY_PATH} -m lib/common/rhino_params.pv \
-c resources/contexts/${PLATFORM}/smart_lighting_${PLATFORM}.rhn \
-d ${AUDIO_DEVICE_INDEX} -a ${ACCESS_KEY}

Replace ${RHINO_LIBRARY_PATH} with the path to the appropriate library available under lib, ${PLATFORM} with the name of the platform you are running on (linux, raspberry-pi, mac, beaglebone, or jetson), ${AUDIO_DEVICE_INDEX} with the index of your audio device, and ${ACCESS_KEY} with your Picovoice AccessKey.

Windows

List input audio devices with:

.\\demo\\c\\build\\rhino_demo_mic.exe --show_audio_devices

Run the demo using:

.\\demo\\c\\build\\rhino_demo_mic.exe -l lib/windows/amd64/libpv_rhino.dll -m lib/common/rhino_params.pv -c resources/contexts/windows/smart_lighting_windows.rhn -d ${AUDIO_DEVICE_INDEX} -a ${ACCESS_KEY}

Replace ${AUDIO_DEVICE_INDEX} with the index of your audio device and ${ACCESS_KEY} with your Picovoice AccessKey.

The demo opens an audio stream and infers your intent from spoken commands in the context of a smart lighting system. For example, you can say:

"Turn on the lights in the bedroom."

File Demo

At the root of the repository, build with:

cmake -S demo/c/. -B demo/c/build && cmake --build demo/c/build --target rhino_demo_file
Linux (x86_64), macOS (x86_64, arm64), Raspberry Pi, BeagleBone, and Jetson

Run the demo using:

./demo/c/build/rhino_demo_file -l ${LIBRARY_PATH} -m lib/common/rhino_params.pv \
-c resources/contexts/${PLATFORM}/coffee_maker_${PLATFORM}.rhn -w resources/audio_samples/test_within_context.wav \
-a ${ACCESS_KEY}

Replace ${LIBRARY_PATH} with the path to the appropriate library available under lib, ${PLATFORM} with the name of the platform you are running on (linux, raspberry-pi, mac, beaglebone, or jetson), and ${ACCESS_KEY} with your Picovoice AccessKey.

Windows

Run the demo using:

.\\demo\\c\\build\\rhino_demo_file.exe -l lib/windows/amd64/libpv_rhino.dll -m lib/common/rhino_params.pv -c resources/contexts/windows/coffee_maker_windows.rhn -w resources/audio_samples/test_within_context.wav -a ${ACCESS_KEY}

Replace ${ACCESS_KEY} with your Picovoice AccessKey.

The demo opens up the WAV file and infers the intent in the context of a coffee-maker system.

For more information about C demos go to demo/c.

SDKs

Python

Install the Python SDK:

pip3 install pvrhino

The SDK exposes a factory method to create instances of the engine:

import pvrhino

access_key = "${ACCESS_KEY}" # AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

handle = pvrhino.create(access_key=access_key, context_path='/absolute/path/to/context')

Where context_path is the absolute path to a Speech-to-Intent context created using Picovoice Console, or to one of the default contexts available in Rhino's GitHub repository.

When initialized, the required sample rate can be obtained using rhino.sample_rate. The expected frame length (number of audio samples in an input array) is provided by rhino.frame_length. The object can be used to infer intent from spoken commands as below:

def get_next_audio_frame():
    pass

while True:
    is_finalized = handle.process(get_next_audio_frame())

    if is_finalized:
        inference = handle.get_inference()
        if not inference.is_understood:
            # add code to handle unsupported commands
            pass
        else:
            intent = inference.intent
            slots = inference.slots
            # add code to take action based on inferred intent and slot values

Finally, when done be sure to explicitly release the resources using handle.delete().
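As a concrete sketch (not one of the official demos), the following reads a single-channel, 16-bit PCM WAV file recorded at the engine's sample rate and feeds it to Rhino frame by frame; the context and audio paths are placeholders:

import struct
import wave

import pvrhino

handle = pvrhino.create(
    access_key="${ACCESS_KEY}",
    context_path='/absolute/path/to/context.rhn')

# Hypothetical audio file; it must be single-channel, 16-bit PCM at handle.sample_rate.
with wave.open('/absolute/path/to/command.wav', 'rb') as wav:
    assert wav.getframerate() == handle.sample_rate
    assert wav.getnchannels() == 1
    assert wav.getsampwidth() == 2

    while True:
        pcm = wav.readframes(handle.frame_length)
        if len(pcm) < 2 * handle.frame_length:
            break
        # Unpack the raw bytes into a tuple of 16-bit samples.
        frame = struct.unpack('<%dh' % handle.frame_length, pcm)
        if handle.process(frame):
            inference = handle.get_inference()
            if inference.is_understood:
                print(inference.intent, inference.slots)
            else:
                print('command not understood')
            break

handle.delete()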

.NET

Install the .NET SDK using NuGet or the dotnet CLI:

dotnet add package Rhino

The SDK exposes a factory method to create instances of the engine as below:

using Pv;

const string accessKey = "${ACCESS_KEY}";
string contextPath = "/absolute/path/to/context.rhn";
Rhino handle = Rhino.Create(accessKey, contextPath);

When initialized, the valid sample rate is given by handle.SampleRate. The expected frame length (number of audio samples in an input array) is handle.FrameLength. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

short[] GetNextAudioFrame()
{
    // .. get audioFrame
    return audioFrame;
}

while(true)
{
    bool isFinalized = handle.Process(GetNextAudioFrame());
    if(isFinalized)
    {
        Inference inference = handle.GetInference();
        if(inference.IsUnderstood)
        {
            string intent = inference.Intent;
            Dictionary<string, string> slots = inference.Slots;
            // .. code to take action based on inferred intent and slot values
        }
        else
        {
            // .. code to handle unsupported commands
        }
    }
}

Rhino will have its resources freed by the garbage collector, but to have resources freed immediately after use, wrap it in a using statement:

using(Rhino handle = Rhino.Create(accessKey, contextPath))
{
    // .. Rhino usage here
}

Java

The Rhino Java binding is available from the Maven Central Repository at ai.picovoice:rhino-java:${version}.

The SDK exposes a Builder that allows you to create an instance of the engine:

import ai.picovoice.rhino.*;

final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

try{
    Rhino handle = new Rhino.Builder()
                    .setAccessKey(accessKey)
                    .setContextPath("/absolute/path/to/context")
                    .build();
} catch (RhinoException e) { }

When initialized, the valid sample rate is given by handle.getSampleRate(). The expected frame length (number of audio samples in an input array) is handle.getFrameLength(). The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.

short[] getNextAudioFrame(){
    // .. get audioFrame
    return audioFrame;
}

while(true) {
    boolean isFinalized = handle.process(getNextAudioFrame());
    if(isFinalized){
        RhinoInference inference = handle.getInference();
        if(inference.getIsUnderstood()){
            String intent = inference.getIntent();
            Map<String, String> slots = inference.getSlots();
            // .. code to take action based on inferred intent and slot values
        } else {
            // .. code to handle unsupported commands
        }
    }
}

Once you are done with Rhino, ensure you release its resources explicitly:

handle.delete();

Go

To install the Rhino Go module to your project, use the command:

go get github.com/Picovoice/rhino/binding/go

To create an instance of the engine with default parameters, pass an AccessKey and a path to a Rhino context file (.rhn) to the NewRhino function and then make a call to .Init().

import . "github.com/Picovoice/rhino/binding/go/v2"

const accessKey string = "${ACCESS_KEY}" // obtained from Picovoice Console (https://console.picovoice.ai/)

rhino := NewRhino(accessKey, "/path/to/context/file.rhn")
err := rhino.Init()
if err != nil {
    // handle error
}

Once initialized, you can start passing in frames of audio for processing. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio. The sample rate required by the engine is given by SampleRate, and the number of samples per frame is given by FrameLength.

To feed audio into Rhino, use the Process function in your capture loop. You must have called Init() before calling Process.

func getNextFrameAudio() []int16 {
    // get audio frame
}

for {
    isFinalized, err := rhino.Process(getNextFrameAudio())
    if err != nil {
        // handle process error
    }

    if isFinalized {
        inference, err := rhino.GetInference()
        if err != nil {
            // handle inference error
        }

        if inference.IsUnderstood {
            intent := inference.Intent
            slots := inference.Slots
            // add code to take action based on inferred intent and slot values
        } else {
            // add code to handle unsupported commands
        }
    }
}

When finished with the engine, resources have to be released explicitly.

rhino.Delete()

Unity

Import the Rhino Unity Package into your Unity project.

The SDK provides two APIs:

High-Level API

RhinoManager provides a high-level API that takes care of audio recording. This class is the quickest way to get started.

Using the constructor RhinoManager.Create will create an instance of the RhinoManager using the provided context file.

using Pv.Unity;

string accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

try
{
    RhinoManager _rhinoManager = RhinoManager.Create(
                                    accessKey,
                                    "/path/to/context/file.rhn",
                                    (inference) => {});
}
catch (Exception ex)
{
    // handle rhino init error
}

Once you have instantiated a RhinoManager, you can start audio capture and intent inference by calling:

_rhinoManager.Process();

Audio capture stops and Rhino resets once an inference result is returned via the inference callback. To start capturing audio and inferring again, call .Process() once more.

Once the app is done with using an instance of RhinoManager, you can explicitly release the audio resources, and the resources allocated to Rhino:

_rhinoManager.Delete();

There is no need to deal with audio capture to enable intent inference with RhinoManager. This is because it uses our unity-voice-processor Unity package to capture frames of audio and automatically pass them to the inference engine.

Low-Level API

Rhino provides low-level access to the inference engine for those who want to incorporate speech-to-intent into an already existing audio processing pipeline.

To create an instance of Rhino, use the .Create static constructor with a context file:

using Pv.Unity;

string accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

try
{
    Rhino _rhino = Rhino.Create(accessKey, "path/to/context/file.rhn");
}
catch (RhinoException ex)
{
    // handle rhino init error
}

To feed Rhino your audio, send frames of audio to its Process function until it has made an inference:

short[] GetNextAudioFrame()
{
    // .. get audioFrame
    return audioFrame;
}

try
{
    bool isFinalized = _rhino.Process(GetNextAudioFrame());
    if(isFinalized)
    {
        Inference inference = _rhino.GetInference();
        if(inference.IsUnderstood)
        {
            string intent = inference.Intent;
            Dictionary<string, string> slots = inference.Slots;
            // .. code to take action based on inferred intent and slot values
        }
        else
        {
            // .. code to handle unsupported commands
        }
    }
}
catch (RhinoException ex)
{
    Debug.LogError(ex.ToString());
}

For Process to work correctly, the audio data must be in the audio format required by Picovoice.

Rhino implements the IDisposable interface, so you can use Rhino in a using block. If you don't use a using block, resources will be released by the garbage collector automatically, or you can explicitly release the resources like so:

_rhino.Dispose();

Flutter

Add the Rhino Flutter plugin to your pubspec.yaml.

dependencies:
  rhino_flutter: ^<version>

The SDK provides two APIs:

High-Level API

RhinoManager provides a high-level API that takes care of audio recording. This class is the quickest way to get started.

The constructor RhinoManager.create will create an instance of the RhinoManager using a context file that you pass to it.

import 'package:rhino_flutter/rhino_manager.dart';
import 'package:rhino_flutter/rhino_error.dart';

final String accessKey = "{ACCESS_KEY}";  // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

void createRhinoManager() async {
    try{
        _rhinoManager = await RhinoManager.create(
            accessKey,
            "/path/to/context/file.rhn",
            _inferenceCallback);
    } on RhinoException catch (err) {
        // handle rhino init error
    }
}

The inferenceCallback parameter is the function you want to execute when Rhino makes an inference. The function should accept a RhinoInference instance that represents the inference result:

void _inferenceCallback(RhinoInference inference) {
    if (inference.isUnderstood!) {
        String intent = inference.intent!;
        Map<String, String> slots = inference.slots!;
        // add code to take action based on inferred intent and slot values
    } else {
        // add code to handle unsupported commands
    }
}

Once you have instantiated a RhinoManager, you can start audio capture and intent inference using the .process() method. Audio capture stops and Rhino resets once an inference result is returned via the inference callback.

try {
    await _rhinoManager.process();
} on RhinoException catch (ex) { }

Once your app is done using RhinoManager, be sure to explicitly release the resources allocated for it:

_rhinoManager.delete();

Our flutter_voice_processor Flutter plugin captures frames of audio and automatically passes them to the Speech-to-Intent engine.

Low-Level API

Rhino provides low-level access to the inference engine for those who want to incorporate speech-to-intent into an already existing audio processing pipeline.

Rhino is created by passing a context file to its static constructor create:

import 'package:rhino_flutter/rhino_manager.dart';
import 'package:rhino_flutter/rhino_error.dart';

final String accessKey = "{ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

void createRhino() async {
    try {
        _rhino = await Rhino.create(accessKey, '/path/to/context/file.rhn');
    } on RhinoException catch (err) {
        // handle rhino init error
    }
}

To deliver audio to the engine, send audio frames to its process function. Each call to process returns a RhinoInference instance with the following fields:

  • isFinalized - true if Rhino has made an inference, false otherwise
  • isUnderstood - null if isFinalized is false, otherwise true if Rhino understood what it heard based on the context or false if it did not
  • intent - null if isUnderstood is not true, otherwise the name of the intent that was inferred
  • slots - null if isUnderstood is not true, otherwise the dictionary of slot keys and values that were inferred

List<int> buffer = getAudioFrame();

try {
    RhinoInference inference = await _rhino.process(buffer);
    if(inference.isFinalized) {
        if(inference.isUnderstood!) {
            String intent = inference.intent!;
            Map<String, String> slots = inference.slots!;
            // add code to take action based on inferred intent and slot values
        }
    }
} on RhinoException catch (error) {
    // handle error
}

// once you are done
this._rhino.delete();

React Native

Install @picovoice/react-native-voice-processor and @picovoice/rhino-react-native. The SDK provides two APIs:

High-Level API

RhinoManager provides a high-level API that takes care of audio recording. This class is the quickest way to get started.

The constructor RhinoManager.create will create an instance of a RhinoManager using a context file that you pass to it.

const accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

async createRhinoManager(){
    try{
        this._rhinoManager = await RhinoManager.create(
            accessKey,
            '/path/to/context/file.rhn',
            inferenceCallback);
    } catch (err) {
        // handle error
    }
}

Once you have instantiated a RhinoManager, you can start/stop audio capture and intent inference by calling .process(). Upon receiving an inference callback, audio capture will stop automatically and Rhino will reset. To restart it you must call .process() again.

let didStart = await this._rhinoManager.process();

When you are done using Rhino, you must explicitly release resources:

this._rhinoManager.delete();

@picovoice/react-native-voice-processor handles audio capture and RhinoManager passes frames to the inference engine for you.

Low-Level API

Rhino provides low-level access to the inference engine for those who want to incorporate speech-to-intent into an already existing audio processing pipeline.

Rhino is created by passing a context file to its static constructor create:

const accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

async createRhino(){
    try{
        this._rhino = await Rhino.create(accessKey, '/path/to/context/file.rhn');
    } catch (err) {
        // handle error
    }
}

To deliver audio to the engine, pass it audio frames using the process function. The RhinoInference result returned from process has up to four fields:

  • isFinalized - true if Rhino has made an inference, false otherwise
  • isUnderstood - null if isFinalized is false, otherwise true if Rhino understood what it heard based on the context or false if it did not
  • intent - null if isUnderstood is not true, otherwise the name of the intent that was inferred
  • slots - null if isUnderstood is not true, otherwise the dictionary of slot keys and values that were inferred

let buffer = getAudioFrame();
try {
    let inference = await this._rhino.process(buffer);
    // use result
    // ..
} catch (e) {
    // handle error
}

// once you are done
this._rhino.delete();

Android

To include the package in your Android project, ensure you have included mavenCentral() in your top-level build.gradle file and then add the following to your app's build.gradle:

dependencies {
    implementation 'ai.picovoice:rhino-android:${LATEST_VERSION}'
}

There are two possibilities for integrating Rhino into an Android application: the High-level API and the Low-level API.

High-Level API

RhinoManager provides a high-level API for integrating Rhino into Android applications. It manages all activities related to creating an input audio stream, feeding it into Rhino, and invoking a user-provided inference callback. Context files (.rhn) should be placed under the Android project assets folder (src/main/assets/).

final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
final String contextPath = "/path/to/context.rhn"; // path relative to 'assets' folder

try {
    RhinoManager rhinoManager = new RhinoManager.Builder()
                        .setAccessKey(accessKey)
                        .setContextPath(contextPath)
                        .setSensitivity(0.35f)
                        .build(appContext, new RhinoManagerCallback() {
                            @Override
                            public void invoke(RhinoInference inference) {
                                if (inference.getIsUnderstood()) {
                                    final String intent = inference.getIntent();
                                    final Map<String, String> slots = inference.getSlots();
                                    // add code to take action based on inferred intent and slot values
                                }
                                else {
                                    // add code to handle unsupported commands
                                }
                            }
                        });
} catch (RhinoException e) { }

The appContext parameter is the Android application context - this is used to extract Rhino resources from the APK. Sensitivity is the parameter that enables developers to trade miss rate for false alarm. It is a floating-point number within [0, 1]. A higher sensitivity reduces miss rate at cost of increased false alarm rate.

When initialized, input audio can be processed using manager.process(). When done, be sure to release the resources using manager.delete().

Low-Level API

Rhino provides a binding for Android using JNI. It can be initialized using:

import ai.picovoice.rhino.*;

final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

try {
    Rhino handle = new Rhino.Builder()
                        .setAccessKey(accessKey)
                        .setContextPath("/path/to/context.rhn")
                        .build(appContext);
} catch (RhinoException e) { }

Once initialized, handle can be used for intent inference:

private short[] getNextAudioFrame();

while (!handle.process(getNextAudioFrame()));

final RhinoInference inference = handle.getInference();
if (inference.getIsUnderstood()) {
    // logic to perform an action given the intent object.
} else {
    // logic for handling out of context or unrecognized command
}

Finally, prior to exiting the application be sure to release resources acquired:

handle.delete();

iOS

The Rhino iOS binding is available via CocoaPods. To import it into your iOS project, add the following line to your Podfile and run pod install:

pod 'Rhino-iOS'

There are two approaches for integrating Rhino into an iOS application: The high-level API and the low-level API.

High-Level API

RhinoManager provides a high-level API for integrating Rhino into iOS applications. It manages all activities related to creating an input audio stream, feeding it to the engine, and invoking a user-provided inference callback.

import Rhino

let accessKey = "${ACCESS_KEY}" // Obtained from Picovoice Console (https://console.picovoice.ai)
do {
    let manager = try RhinoManager(
        accessKey: accessKey,
        contextPath: "/path/to/context/file.rhn",
        modelPath: "/path/to/model/file.pv",
        sensitivity: 0.35,
        onInferenceCallback: { inference in
                if inference.isUnderstood {
                    let intent:String = inference.intent
                    let slots:Dictionary<String,String> = inference.slots
                    // use inference results
                }
            })
} catch { }

Sensitivity is the parameter that enables developers to trade miss rate for false alarm. It is a floating-point number within [0, 1]. A higher sensitivity reduces miss rate at cost of increased false alarm rate.

When initialized, input audio can be processed using manager.process(). When done, be sure to release the resources using manager.delete().

Low-Level API

Rhino provides low-level access to the Speech-to-Intent engine for those who want to incorporate intent inference into an already existing audio processing pipeline.

import Rhino

let accessKey = "${ACCESS_KEY}" // Obtained from Picovoice Console (https://console.picovoice.ai)
do {
    let handle = try Rhino(
      accessKey: accessKey,
      contextPath: "/path/to/context/file.rhn")
} catch { }

Once initialized, handle can be used for intent inference:

func getNextAudioFrame() -> [Int16] {
    // .. get audioFrame
    return audioFrame
}

while true {
    do {
        let isFinalized = try handle.process(getNextAudioFrame())
        if isFinalized {
            let inference = try handle.getInference()
            if inference.isUnderstood {
                let intent:String = inference.intent
                let slots:Dictionary<String, String> = inference.slots
                // add code to take action based on inferred intent and slot values
            }
        }
    } catch { }
}

Finally, prior to exiting the application be sure to release resources acquired:

handle.delete()

Web

Rhino is available on modern web browsers (i.e. not Internet Explorer) via WebAssembly. Microphone audio is handled via the Web Audio API and is abstracted by the WebVoiceProcessor, which also handles downsampling to the correct format. Rhino is provided pre-packaged as a Web Worker.

Vanilla JavaScript and HTML (CDN Script Tag)

<!DOCTYPE html>
<html lang="en">
  <head>
    <script src="https://unpkg.com/@picovoice/rhino-web/dist/iife/index.js"></script>
    <script src="https://unpkg.com/@picovoice/web-voice-processor/dist/iife/index.js"></script>
    <script type="application/javascript">
      const RHINO_CONTEXT_BASE64 = /* Base64 representation of `.rhn` context file  */;
      const RHINO_MODEL_BASE64 = /* Base64 representation of the `.pv` model file */;

      let rhino = null;

      function rhinoInferenceCallback(inference) {
        if (inference.isFinalized) {
          console.log(`Inference detected: ${JSON.stringify(inference)}`);
          WebVoiceProcessor.WebVoiceProcessor.unsubscribe(rhino);
          document.getElementById("push-to-talk").disabled = false;
          console.log("Press the 'Push to Talk' button to speak again.");
        }
      }

      async function startRhino() {
        console.log("Rhino is loading. Please wait...");
        rhino = await RhinoWeb.RhinoWorker.create(
            "${ACCESS_KEY}",  // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
            { base64: RHINO_CONTEXT_BASE64 },
            rhinoInferenceCallback,
            { base64: RHINO_MODEL_BASE64 }
        );

        console.log("Rhino worker ready!");
        document.getElementById("push-to-talk").disabled = false;
        console.log("Press the 'Push to Talk' button to talk.");
      }

      document.addEventListener("DOMContentLoaded", function () {
        document.getElementById("push-to-talk").onclick = function (event) {
          if (rhino) {
            console.log("Rhino is listening for your commands ...");
            this.disabled = true;
            WebVoiceProcessor.WebVoiceProcessor.subscribe(rhino);
          }
        };
      });
    </script>
  </head>
  <body>
    <button id="push-to-talk">Push to Talk</button>
  </body>
</html>

Vanilla JavaScript and HTML (ES Modules)

yarn add @picovoice/rhino-web @picovoice/web-voice-processor

(or)

npm install @picovoice/rhino-web @picovoice/web-voice-processor

import { WebVoiceProcessor } from "@picovoice/web-voice-processor";
import { RhinoWorker } from "@picovoice/rhino-web";

const RHN_CONTEXT_BASE64 = /* Base64 representation of a `.rhn` context file */
const RHINO_MODEL_BASE64 = /* Base64 representation of the `.pv` model file*/;

let rhino = null

function rhinoInferenceCallback(inference) {
  if (inference.isFinalized) {
    console.log(`Rhino inference: ${JSON.stringify(inference)}`);
    WebVoiceProcessor.unsubscribe(rhino);
  }
}

async function startRhino() {
  // Create a Rhino Worker to listen for commands in the specified context
  rhino = await RhinoWorker.create(
    "${ACCESS_KEY}",  // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
    { base64: RHN_CONTEXT_BASE64 },
    rhinoInferenceCallback,
    { base64: RHINO_MODEL_BASE64 }
  );
}

// Start a voice interaction:
// WebVoiceProcessor will request microphone permission.
// n.b. This promise will reject if the user refuses permission! Make sure you handle that possibility.
function pushToTalk() {
  if (rhino) {
    WebVoiceProcessor.subscribe(rhino);
  }
}

startRhino()

...

// Finished with Rhino? Release the WebVoiceProcessor and the worker.
if (done) {
  WebVoiceProcessor.unsubscribe(rhino);
  rhino.release()
  rhino.terminate()
}

Angular

yarn add @picovoice/rhino-angular @picovoice/web-voice-processor

(or)

npm install @picovoice/rhino-angular @picovoice/web-voice-processor

import { Subscription } from "rxjs";
import { RhinoService } from "@picovoice/rhino-angular";
import rhinoParams from "${PATH_TO_RHINO_PARAMS_BASE64}";
import rhinoContext from "${PATH_TO_RHINO_CONTEXT_BASE64}";

constructor(private rhinoService: RhinoService) {
  this.contextInfoDetection = rhinoService.contextInfo$.subscribe(
    contextInfo => {
      console.log(contextInfo);
    });
  this.inferenceDetection = rhinoService.inference$.subscribe(
    inference => {
      console.log(inference);
    });
  this.isLoadedDetection = rhinoService.isLoaded$.subscribe(
    isLoaded => {
      console.log(isLoaded);
    });
  this.isListeningDetection = rhinoService.isListening$.subscribe(
    isListening => {
      console.log(isListening);
    });
  this.errorDetection = rhinoService.error$.subscribe(
    error => {
      console.error(error);
    });
}

async ngOnInit() {
  await this.rhinoService.init(
    ${ACCESS_KEY},
    { base64: rhinoContext },
    { base64: rhinoParams },
  )
}

async process() {
  await this.rhinoService.process();
}

ngOnDestroy() {
  this.contextInfoDetection.unsubscribe();
  this.inferenceDetection.unsubscribe();
  this.isLoadedDetection.unsubscribe();
  this.isListeningDetection.unsubscribe();
  this.errorDetection.unsubscribe();
  this.rhinoService.release();
}

React

yarn add @picovoice/rhino-react @picovoice/web-voice-processor

(or)

npm install @picovoice/rhino-react @picovoice/web-voice-processor

import React, { useEffect } from 'react';
import { useRhino } from '@picovoice/rhino-react';

const RHINO_CONTEXT_BASE64 = /* Base64 representation of a Rhino context (.rhn) for WASM, omitted for brevity */
const RHN_MODEL_BASE64 = /* Base64 representation of a Rhino parameter model (.pv), omitted for brevity */

function VoiceWidget(props) {
  const {
    inference,
    contextInfo,
    isLoaded,
    isListening,
    error,
    init,
    process,
    release,
  } = useRhino();

  useEffect(() => {
    if (!isLoaded) {
      init(
        "${ACCESS_KEY}", // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
        { base64: RHINO_CONTEXT_BASE64 },
        { base64: RHN_MODEL_BASE64 }
      );
    }
  }, [isLoaded])

  return (
    <div className="voice-widget">
      <button onClick={() => process()} disabled={isListening || !isLoaded || error !== null}>
        Process
      </button>
      <p>{JSON.stringify(inference)}</p>
    </div>
  );
}

Vue

yarn add @picovoice/rhino-vue @picovoice/web-voice-processor

(or)

npm install @picovoice/rhino-vue @picovoice/web-voice-processor

<script lang='ts'>
import { useRhino } from '@picovoice/rhino-vue';

import rhinoParams from "${PATH_TO_RHINO_PARAMS_BASE64}";
import rhinoContext from "${PATH_TO_RHINO_CONTEXT_BASE64}";

export default {
  data() {
    const {
      state,
      init,
      process,
      release
    } = useRhino();

    init(
      ${ACCESS_KEY},
      { base64: rhinoContext },
      { base64: rhinoParams },
    );

    return {
      state,
      process,
      release
    }
  },
  watch: {
    "state.inference": function(inference) {
      if (inference !== null) {
        console.log(inference)
      }
    },
    "state.contextInfo": function(contextInfo) {
      if (contextInfo !== null) {
        console.log(contextInfo)
      }
    },
    "state.isLoaded": function(isLoaded) {
      console.log(isLoaded)
    },
    "state.isListening": function(isListening) {
      console.log(isListening)
    },
    "state.error": function(error) {
      console.error(error)
    },
  },
  onBeforeDestroy() {
    this.release();
  },
};
</script>

Node.js

Install the Node.js SDK:

yarn add @picovoice/rhino-node

Create instances of the Rhino class by specifying the path to the context file:

const Rhino = require("@picovoice/rhino-node");
const accessKey = "${ACCESS_KEY}" // Obtained from the Picovoice Console (https://console.picovoice.ai/)
let handle = new Rhino(accessKey, "/path/to/context/file.rhn");

When instantiated, handle can process audio via its .process method:

let getNextAudioFrame = function() {
    ...
};

let isFinalized = false;
while (!isFinalized) {
  isFinalized = handle.process(getNextAudioFrame());
  if (isFinalized) {
    let inference = handle.getInference();
    // Insert inference event callback
  }
}

When done, be sure to release resources acquired by WebAssembly using release():

handle.release();

Rust

First you will need Rust and Cargo installed on your system.

To add the Rhino library to your app, add pv_rhino to your app's Cargo.toml manifest:

[dependencies]
pv_rhino = "*"

To create an instance of the engine, first create a RhinoBuilder instance with the configuration parameters for the Speech-to-Intent engine and then make a call to .init():

use rhino::{Rhino, RhinoBuilder};

let access_key = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

let rhino: Rhino = RhinoBuilder::new(access_key, "/path/to/context/file.rhn").init().expect("Unable to create Rhino");

To feed audio into Rhino, use the process function in your capture loop:

fn next_audio_frame() -> Vec<i16> {
    // get audio frame
}

loop {
    if let Ok(is_finalized) = rhino.process(&next_audio_frame()) {
        if is_finalized {
            if let Ok(inference) = rhino.get_inference() {
                if inference.is_understood {
                    let intent = inference.intent.unwrap();
                    let slots = inference.slots;
                    // add code to take action based on inferred intent and slot values
                } else {
                    // add code to handle unsupported commands
                }
            }
        }
    }
}

C

Rhino is implemented in ANSI C and therefore can be directly linked to C applications. The pv_rhino.h header file contains relevant information. An instance of the Rhino object can be constructed as follows:

const char *access_key = "${ACCESS_KEY}" // obtained from the Picovoice Console (https://console.picovoice.ai/)
const char *model_path = ... // Available at lib/common/rhino_params.pv
const char *context_path = ... // absolute path to context file for the domain of interest
const float sensitivity = 0.5f;
bool require_endpoint = false;

pv_rhino_t *rhino = NULL;
const pv_status_t status = pv_rhino_init(access_key, model_path, context_path, sensitivity, require_endpoint, &rhino);
if (status != PV_STATUS_SUCCESS) {
    // add error handling code
}

Now the handle can be used to infer intent from an incoming audio stream. Rhino accepts single channel, 16-bit PCM audio. The sample rate can be retrieved using pv_sample_rate(). Finally, Rhino accepts input audio in consecutive chunks (frames); the length of each frame can be retrieved using pv_rhino_frame_length().

extern const int16_t *get_next_audio_frame(void);

while (true) {
    const int16_t *pcm = get_next_audio_frame();

    bool is_finalized = false;
    pv_status_t status = pv_rhino_process(rhino, pcm, &is_finalized);
    if (status != PV_STATUS_SUCCESS) {
        // add error handling code
    }

    if (is_finalized) {
        bool is_understood = false;
        status = pv_rhino_is_understood(rhino, &is_understood);
        if (status != PV_STATUS_SUCCESS) {
            // add error handling code
        }

        if (is_understood) {
            const char *intent = NULL;
            int32_t num_slots = 0;
            const char **slots = NULL;
            const char **values = NULL;
            status = pv_rhino_get_intent(rhino, &intent, &num_slots, &slots, &values);
            if (status != PV_STATUS_SUCCESS) {
                // add error handling code
            }

            // add code to take action based on inferred intent and slot values

            pv_rhino_free_slots_and_values(rhino, slots, values);
        } else {
            // add code to handle unsupported commands
        }

        pv_rhino_reset(rhino);
    }
}

When done, remember to release the resources acquired.

pv_rhino_delete(rhino);

Releases

v3.0.0 - October 26th, 2023

  • Improvements to error reporting
  • Upgrades to authorization and authentication system
  • Added reset() function to API
  • Various bug fixes and improvements
  • Node min support bumped to 16
  • Unity editor min support bumped to 2021
  • Patches to .NET support

v2.2.0 - April 12th, 2023

  • Added language support for Arabic, Dutch, Hindi, Mandarin, Polish, Russian, Swedish and Vietnamese
  • Added support for .NET 7.0 and fixed support for .NET Standard 2.0
  • iOS minimum support moved to 11.0
  • Improved stability and performance

v2.1.0 - January 20th, 2022

  • Added macOS arm64 support for Java and Unity SDKs
  • Support added for non-English built-in slots
  • Support for Macros added
  • Various bug fixes and improvements

v2.0.0 - November 25th, 2021

  • Improved accuracy
  • Added Rust SDK
  • macOS arm64 support
  • Added NodeJS support for Windows, NVIDIA Jetson Nano, and BeagleBone
  • Added .NET support for NVIDIA Jetson Nano and BeagleBone
  • Runtime optimization

v1.6.0 - December 2nd, 2020

  • Added support for React Native
  • Added support for Java
  • Added support for .NET
  • Added support for NodeJS

v1.5.0 - June 4th, 2020

  • Accuracy improvements

v1.4.0 - April 13th, 2020

  • Accuracy improvements
  • Builtin slots

v1.3.0 - February 13th, 2020

  • Accuracy improvements
  • Runtime optimizations
  • Added support for Raspberry Pi 4
  • Added support for JavaScript
  • Added support for iOS
  • Updated documentation

v1.2.0 - April 26, 2019

  • Accuracy improvements
  • Runtime optimizations

v1.1.0 - December 23rd, 2018

  • Accuracy improvements
  • Open-sourced Raspberry Pi build

v1.0.0 - November 2nd, 2018

  • Initial Release

FAQ

You can find the FAQ here.

rhino's People

Contributors

albho, bejager, danielhass, dbartle, dependabot[bot], dominikbuenger, erismik, hellow554, iliadrm, kenarsa, ksyeo1010, laves, mehrdadfeller, millasml, mrrostam, reinzor, volkerlieber


rhino's Issues

PvArgumentError when running the File Demo command using Node JS

Hi,

NodeJS: v12.13.0
I am using Node JS global packages. I have installed : npm install -g @picovoice/rhino-node-demo

When I run the following command to transcribe: rhn-file-demo --context_path resources/contexts/linux/coffee_maker_mac.rhn --input_audio_file_path resources/audio_samples/test_within_context.wav

I am getting the following error.

throw new PvArgumentError(
^

ReferenceError: PvArgumentError is not defined
at fileDemo (/home/jav/.nvm/versions/node/v12.13.0/lib/node_modules/@picovoice/rhino-node-demo/file.js:63:5)
at Object.<anonymous> (/home/jav/.nvm/versions/node/v12.13.0/lib/node_modules/@picovoice/rhino-node-demo/file.js:132:1)

Thanks in advance

Feature request: demo apps for 64bit ARM64/AARCH64

Hi,
we are using a plain Ubuntu Xenial linux distro both on a RPi3+ board and another Cortex-A53 custom board and, in both cases, we use a 64bit version of the OS.
Is there any demo available that may run on our platforms?
The Python executable in our systems is a 64bit binary, hence it fails to load the shared libraries that you provide.
Thanks,

Roberto

Libraries for armhf and arm64

Hi,

I am working on a ROS wrapper for Porcupine and Rhino. The ROS build farm builds for Linux on the armhf, arm64, and amd64 architectures. For now I am only using this library, which causes the arm64 and armhf builds to fail. What libraries should I use for the armhf and arm64 Linux builds? Or aren't these available?

Thanks!

Rhino C code on Rpi unable to parse the wav file

Hello Alireza,
I am experimenting with Rhino and tried it on an RPi-3. It takes input from the test_within_context.wav file in audio_samples and returns the detected intent. I recorded a few more audio files in the same format and expected Rhino to understand the intents, but it only produced slots and slot-value output for one of the audio files. The other three WAV files are not understood by Rhino despite being recorded in the same environment as the one that is understood. What might be the error?

Comparing it with Snips: Snips is able to understand my voice commands spoken in the same environment. So, it makes me believe that my voice commands and noise should not be an issue.
Please guide.

Thanks!

List of intents that the demo supports?

What is the list of intents that the demo files support?

I tried to recreate the working audio file using Audacity, but the only way I could get the demo to match my voice was to include silence at the beginning and the end of the WAV file. Is this expected behaviour?

Tony V

date and time as built-in slots

Hi,

I am using rhino to create a model for scheduling a helper. Based on a task and a timing, it should extract what that task is and the date and time as intents. I am having trouble with this and cannot find any resources online to guide me through adding time and date as slots, which normally would be regular expressions. Any advice/guidance?

Thanks

Rhino Issue: Rhino issue

I am running the Picovoice Rhino demo for React Native. I have given it a context that contains 30-40 words; it is a 3-4 line sentence that I inserted as my context.
What is happening is that when I say the exact same sentence that is in the context I inserted, it sometimes shows "isUnderstood": false. Even when I recite the exact same sentence, it sometimes cannot find the intent. Why is this happening, and how can I overcome this challenge?


Also, can you tell me how Rhino Speech-to-Intent actually works? Is it first converting the speech to text and then using NLU to find the intent?

Rhino Documentation Issue

What is the URL of the doc?

https://github.com/Picovoice/rhino

What's the nature of the issue? (e.g. steps do not work, typos/grammar/spelling, etc., out of date)

The note here about self-service ("self-service. Developers can train custom models using Picovoice Console") paired with the 30-day expiration shown in the Console makes it unclear to me what is possible in the free plan. Is the free model only available to be downloaded in that 30-day window, or does it actually expire in use every 30 days and need to be retrained and reinstalled? I don't seem to see any other options to create custom models outside of the Console at this time, so I am not sure if that is a possibility either.

How to design the model?

Or more specifically: what trade-offs are there to consider?

Hi there! As a minimal examples to illustrate my questions:

  1. Are two intents "lightsOn" (expression: "turn lights on") and "lightsOff" (expression: "turn lights off") cheaper in terms of performance than one intent "switchLight" with expression "turn lights $state:state" with slot "state" having the elements "on" and "off"?

  2. How about the equivalent, but less intuitive option of a single intent "switchLight" with expressions "$dummy:on lights on" and "$dummy:off lights of" with the slot "dummy" having just the one element "turn"? This is admittedly a bad example, but I think the general idea to just have an expression put a dummy value into a specifically named slot could come in handy sometimes - unless it's always better to create a separate intent for some reason...

  3. Is it helpful to define sort of sub-slots (e.g. have a slot with all the days and a separate one with just the workdays) and use the more specific one where the other options are not valid? Or just put the general slot and filter the invalid results later, in your application, to avoid cluttering the model?

  4. Do I put everything into a single model or does it make sense to have multiple smaller models and just let Rhino listen for the one that is expected/allowed in the current situation? If neither performance nor colliding expressions are an issue, a single model might be easier to have, but it's a bit hard to manage in the console because you cannot re-order elements (at least not as far as I have seen).

And while I'm here, a question regarding licensing: what do you mean by # Voice Interactions (per month): 1000 on the pricing page? And can I at least switch between devices as I am allowed just one? Or would it even be acceptable to run the software on multiple computers as long as they are all my machines, located in different rooms of my home? (might be easier than having to send the data from all microphones to a single instance)

Chaining rhino inferences together

Hi, I'm trying to do a series of chained Rhino inferences triggered by Porcupine. For example:

"Porcupine" -> "turn the lights on in the living room" -> "turn the lights in the living room blue" -> "turn off the lights in the kitchen" -> timeout -> Porcupine listening

I'm struggling with an error in AudioRecord start() with status code -38. My guess is that two things are trying to use the audio input at the same time?

Currently in the RhinoManager callback, depending on inference results, I'm setting an observable integer that corresponds to whether rhino or porcupine should be running. The chaining works a few times before I get this status code. Any suggestions?

Train more Expressions

How can I train many Expressions in an intent? In the docs, only one Expression is trained by pressing the mic button and talking, but when an intent has more Expressions, what should I say?

React-native - Continuous processing

Is your feature request related to a problem? Please describe.
At the moment, when using RhinoManager, as soon as an intent is processed, the VoiceProcessor is stopped and we have to launch it again (via manager.process() with a delay). So to listen continuously, the recording has to start and stop and start again, etc.

Describe the solution you'd like
It would be great to have an option to listen continuously, so that after an intent it does not call await this._voiceProcessor.stop();
It would be just a small option on the process method.

I'm willing to post a PR if needed.
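Until such an option exists, the behaviour being asked for can be approximated with the low-level Python binding by re-arming the engine after each inference instead of stopping the recorder. This is only a minimal sketch: it assumes the v1.x rhino.py API (process(), is_understood(), get_intent(), reset()), and the audio source and callback are hypothetical.

    from rhino import Rhino  # the repo's Python binding; exact import path may differ per version

    def listen_forever(rhino, read_next_frame, on_inference):
        # read_next_frame: hypothetical callable returning one frame of 16-bit PCM samples
        # on_inference:    hypothetical callable taking (intent, slots)
        while True:
            pcm = read_next_frame()
            if rhino.process(pcm):            # True once an inference has been finalized
                if rhino.is_understood():
                    intent, slots = rhino.get_intent()
                    on_inference(intent, slots)
                rhino.reset()                 # re-arm the engine and keep listening

    rhino = Rhino(
        library_path="libpv_rhino.so",        # platform-specific shared library
        model_path="rhino_params.pv",
        context_path="my_context.rhn")        # context trained in Picovoice Console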

Rhino Issue: not able to find intent

I am running the Picovoice Rhino demo for React Native. I have given it a context containing 30-40 words; it is a 3-4 line sentence that I inserted as my context.

(Image attached.)

What is happening is that when I say the exact same sentence that is in the context I inserted, it sometimes shows "isUnderstood": false. Even when I recite the exact same sentence, it sometimes cannot find the intent. Why is this happening, and how can I overcome this challenge?

(Screenshot of the React Native Rhino demo attached.)

Also, can you tell me how Rhino's speech-to-intent actually works? Does it first convert the speech to text and then use NLU to find the intent?

permanently train Rhino inference models

Is your feature request related to a problem? Please describe.
I am working on permanently installing a voice-controlled component for our research lab, and would like to avoid re-training and re-downloading the Rhino inference model file every 30 days.

Describe the solution you'd like
Is there a way to permanently train the Rhino inference model (so it only needs to be downloaded once)?

Describe alternatives you've considered
If we can't permanently train the Rhino file, I'm wondering if there is a way to integrate the porcupine wake word with the Picovoice Cheetah speech-to-text engine? (again to avoid re-downloading every month)


Set explicit slot phrase values

Hi,

I want to keep the code that processes the intents as generic as possible. As described in #61, I am controlling Spotify via its API. Right now I have commands like music play or music pause, and then I am using spotifyApi[intent.slots.command] to call the corresponding method on the Spotify API. However, the command to skip to the next song is skipToNext, while in the voice command I'd like to say music next.

Would it be possible to configure slot phrase values for each phrase defined in a slot? In the best case, I could even define whether it's a string, a boolean, or a number (e.g. for the shuffle command, where I'd like to convert "on" to true and "off" to false), and the same is true for volume levels (music volume low or similar, which I'd like to translate into an integer). If a separate slot phrase value is available, it could then be exposed as intent.slot_values.xyz. That way, I could have slot names like param_1, param_2 that I automatically pass into the Spotify API functions when present in an intent.

The advantage is that I'd only have to change the Rhino config in the Console, and not touch my intent processing code, to add a new Spotify API voice call. Hope that makes sense.

Jonas
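For reference, the application-side mapping this request would replace can be sketched in a few lines of Python (names such as spotify_api and the slot names are hypothetical; this is not part of Rhino):

    # Map spoken slot phrases to API method names and typed arguments, so adding
    # a new voice command only means extending these tables.
    COMMAND_TO_METHOD = {
        "play": "play",
        "pause": "pause",
        "next": "skipToNext",
        "previous": "skipToPrevious",
    }
    STATE_TO_BOOL = {"on": True, "off": False}

    def handle_intent(spotify_api, slots):
        method = COMMAND_TO_METHOD.get(slots["command"], slots["command"])
        args = []
        if "state" in slots:
            args.append(STATE_TO_BOOL[slots["state"]])
        if "volumelevel" in slots:
            args.append(int(slots["volumelevel"]))
        getattr(spotify_api, method)(*args)  # e.g. spotify_api.skipToNext()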

Optional slots?

Is your feature request related to a problem? Please describe.
I'm super impressed by your product. Really amazing offering, and great experience.

My question: is it possible to make a slot optional for matching, i.e. allow the expression to be matched even if the slot was not matched? My example:

[open, extend, slide out] (the, a) $zone:zone (hydraulic, hydraulics, slide out, slide outs, room, section, slide, slides, cylinder, cylinders, areas, areas) (completely, entirely, fully, to the max, hundred percent, to the limit, as much as possible)

So I have only one meaningful $zone slot here, but I also have sets of words that I don't care much about, though they can appear in the command. Because I have a bunch of intents and commands reusing those sets, I'd love to turn them into slots for reuse, while basically ignoring them later. The problem is that I'd like the command to work even if they are not matched.

Describe the solution you'd like
($zone:zone) type syntax for optional slots

Describe alternatives you've considered
I can use multiple separate commands with or without slots, but there's combinatorial complexity to support them all. I can also keep listing them as-is, which makes the commands hard to parse visually.

Rhino Issue: Cannot open Intent Editor in Picovoice Console


Expected behaviour

Clicking on a newly created Rhino Context (using the Empty Template) should display the Intent Editor.

Actual behaviour

The browser displays an empty page, with the following console errors:
TypeError: can't convert null to object
Uncaught (in promise) TypeError: can't convert null to object

Steps to reproduce the behaviour

Create a new Rhino Context with the Empty Template and click on it to progress to the Intent Editor.
(Using Firefox Browser)

handle non-ASCII chars when returning inference results.

Hello,

After testing the French language, the detection works fine, but the string returned for slots is "tricky" when the word contains symbols such as 'é', 'è', etc.
For example, for the word "éteindre", the returned slot string is "éteindre", which is not user-friendly!
I don't know how you manage this; I propose replacing any letter with such a symbol by its basic Latin letter, for example:
'é', 'è' -> e
à -> a
etc ..

PS: I tried to put the text "eteindre" instead of "éteindre" in the rhino console but the dictionary rejected the first one :-(
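Until something like this is supported in the engine, the proposed accent-stripping can be done on the application side. A minimal Python sketch of the idea (not a change in Rhino itself):

    import unicodedata

    def to_basic_latin(text: str) -> str:
        # Decompose accented characters ("é" -> "e" + combining accent) and drop the accents.
        decomposed = unicodedata.normalize("NFD", text)
        return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

    print(to_basic_latin("éteindre"))  # -> "eteindre"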

Processing crashes on some commands

For some context: I'm writing a Java program (running on Windows) that combines Porcupine and Rhino, i.e., listens for a wake word and passes to Rhino to interpret a single command. I tried to test it with the smart lighting demo context and an extended model that is based on the smart lighting one; both show the same problem.

When I say "Turn light in the kitchen to blue.", the program crashes somewhere inside the Rhino.process(short[] pcm) method and only the message "Process finished with exit code -1073740940 (0xC0000374)". No Java Exception/Error is thrown, so I think it has to be in the native code. The odd part is that similar commands like "Turn light in the kitchen to green." or "Turn light in the bathroom to blue." work fine and are interpreted correctly...

Do you have any hints as to what could cause this issue or is there any more useful information I should provide?

Libpv_rhino.dll - The specified module could not be found.

I am attempting to use Rhino in custom Python code on Windows. I am able to successfully run the rhino_demo_mic.py, but when trying a minimal custom example:

    _rhino_library_path = "bin/pvrhino/libpv_rhino.dll"
    _rhino_model_file_path = "bin/pvrhino/rhino_params.pv"
    turn_on_context_file_path = "bin/pvrhino/model_2020-08-22_v1.5.0.rhn"

    toggle_display = Rhino(
        library_path=_rhino_library_path,
        model_path=_rhino_model_file_path,
        context_path=turn_on_context_file_path)

It throws the following error:

  File "C:\Projects\xxx\xxx\xxx\lib\Rhino\rhino.py", line 61, in __init__
    library = cdll.LoadLibrary(library_path)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python37\lib\ctypes\__init__.py", line 442, in LoadLibrary
    return self._dlltype(name)
  File "C:\Users\xxx\AppData\Local\Programs\Python\Python37\lib\ctypes\__init__.py", line 364, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: [WinError 126] The specified module could not be found
bin/pvrhino/libpv_rhino.dll

Process finished with exit code 1

It is getting past Rhino's check for the file's existence:

        if not os.path.exists(library_path):
            raise IOError("couldn't find Rhino's library at '%s'" % library_path)

But fails immediately after that.

I have copied the rhino_params.pv and libpv_rhino.dll files from the repository to make sure that I am up to date and using the same files that were used during the rhino_demo_mic.py demo.

Any suggestions would be appreciated.
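One thing worth checking (a guess, not a confirmed root cause): [WinError 126] on a DLL that demonstrably exists usually means either that a dependency of the DLL could not be found or that the relative, forward-slash path confuses the loader. A quick first test is to load it via an absolute path:

    import os
    from ctypes import cdll

    # Resolve the relative, forward-slash path to an absolute Windows path
    # before handing it to the loader.
    rhino_library_path = os.path.abspath("bin/pvrhino/libpv_rhino.dll")
    library = cdll.LoadLibrary(rhino_library_path)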

Possible to have a generic slot?

Hi, I'm very interested in using this package for a Flutter app I'm working on.

I'm wondering if it is possible to have a slot that works somewhat like a wildcard. For example, a "food" slot is naturally difficult to make as listing all foods in a slot isn't really feasible.

I tried to use the built-in $pv.Alphabetic slot but that seems to pick up individual letters instead of whole words.

Some examples to demonstrate exactly what I'd like to do:

How many calories are in $pv.TwoDigitInteger:quantity servings of $food:food?
What pairs well with $food:food_a and $food:food_b?

Problem while invoking Rhino from a foreground service

Hi Picovoice,
I am raising again an issue originally described at #153 as it is vital for my application.

Expected behaviour

I am using the Picovoice SDK and Rhino SDK for Android. The Picovoice SDK works very well for me on a foreground service and I use the Porcupine wake word engine to invoke Rhino.

But I also need to use the Rhino SDK alone to implement a dialogue in a foreground service.
The scenario looks like this:

  1. Start service with Picovoice SDK and send the app in the background (or lock the phone)
  2. Use Porcupine to start recording audio
  3. Speak and get the intent back from Rhino
  4. Based on the intent coming from Rhino, I want to invoke Rhino again (programmatically, without wake word, using rhinoManager.process();) so the user can speak again and give a second command/instruction

Actual behaviour

  1. I can successfully start service with Picovoice SDK and send the app in the background (lock the phone)
  2. I can successfully use Porcupine to start recording audio
  3. I can successfully speak and get the intent back from Rhino
  4. After the intent is returned from Picovoice SDK, I am not able to invoke Rhino only, so the user can speak again and give a second command/instruction (without using the wake word engine) and I get the error below:

E/IAudioFlinger: createRecord returned error -1
E/AudioRecord: createRecord_l(1225279496): AudioFlinger could not create record track, status: -1
E/AudioRecord-JNI: Error creating AudioRecord instance: initialization check failed with status -1.
E/android.media.AudioRecord: Error code -20 when initializing native AudioRecord object.

Steps to reproduce the behaviour

The easiest ways to reproduce would be to either use my scenario above or you can also try the steps below:

  1. Create an Android foreground service with a Rhino instance
  2. Set a timer (2 mins for example) and after the time elapses, the Rhino engine should get invoked automatically while the app is still in the background (using rhinoManager.process();)
  3. Send the app that contains the service in the background (or lock the phone)
  4. After 2 minutes Rhino engine should get invoked, the user speaks and Rhino returns the intent

Important note: The Rhino engine works when the app is not in the background; I get the error only when the app is running in the background or the phone is locked.

Hope this explains my problem. Let me know if you need more info.

Many thanks
Adhurim

Rhino Issue: issue when French language is used

Hello,

I got a notification that different languages are now supported in rev 2.0.0 which is very nice (Issue #160). Thank you a lot for that.
However, when I tried to do a test with the French language on an STM32F769-Disco board, I got the following error:
[ERROR] Keyword files (.PPN) and model file (.PV) should belong to the same language. Keyword files belongs to 'fr' while model file belongs to 'en'.
This is what I did:

  • In the console, and in Language menu I've selected French language.
  • Created a new context.
  • Filled the context with different commands in French.
  • Saved and trained the model.
  • Tested the command with the PC microphone in the console: the commands are detected perfectly.
  • Downloaded the latest version of Picovoice from github and in the demo under \demo\mcu\stm32f769\stm32f769i-disco, copied the new generated context array in CONTEXT_ARRAY in pv_params.h in French section.
  • Activated French language in the project preprocessor by replacing the flag PV_LANGUAGE_ENGLISH by PV_LANGUAGE_FRENCH then compiled the project.
  • Loaded the project, and at the beginning of the trace, I got the previous error!

Am I missing something?
Thank you for your help.

Define keyword to separate multiple intents in one voice command

Is your feature request related to a problem? Please describe.
Sometimes I want to give multiple commands to Rhino without having to say the wake word again each time.

Describe the solution you'd like
I guess the title says it all: allow the user to define a keyword for a Rhino context that can be used to separate multiple intents in one voice command. Could be a keyword that is then forbidden to be used in any intent or slot, to make it easier?

Describe alternatives you've considered
No idea.

Additional context
As I have Audio feedback when processing intents, currently I always have to wait for that to finish before I can say the wake word again or it would interfere with the voice command.

Rhino Issue:

I am running the Picovoice Rhino demo for React Native. I have given it a context containing 30-40 words; it is a 3-4 line sentence that I inserted as my context.

What is happening is that when I say the exact same sentence that is in the context I inserted, it sometimes shows "isUnderstood": false. Even when I recite the exact same sentence, it sometimes cannot find the intent. Why is this happening, and how can I overcome this challenge?
Also, can you tell me how Rhino's speech-to-intent actually works? Does it first convert the speech to text and then use NLU to find the intent?

Where/How does rhino do silence detection

Hello,

I'm playing with the demos and I'm very impressed so far. One thing I am missing, though, is how to do silence / break-in-speech detection. You must be doing this in the Rhino coffee machine demo?

Thanks

Rhino Issue: Underscores not supported in slot names

As a way to overcome #133, I've built a YAML preprocessor that enables optional slots and adds a syntax to define aliases for slot values.

I noticed that slot names are very restrictive; they do not even allow underscores.

Expected behaviour

Allow underscores in slot names

Actual behaviour

"Slot command__close regex /^([A-Za-z]){1,32}$/"

Steps to reproduce the behaviour

Use underscore in slot name

Edit: Maybe this is not a bug, but I don't know how to change the label now.

[question] Can the context/model be trained on the end device?

Hi, we are thinking of trialling Rhino on our custom Android devices. Can you tell us if it is possible to retrain the models on the devices so they are personalized for the end users, please?
For example, if we wanted to design an intent which would let the user call the contacts on their device i.e. something like "Call [contact_on_device]", would it be possible to inject all the contact names on the user's device in the model so that it can recognize them?
Thank you.

Model used

What is the deep learning model used in the examples? I would like to run the C example on a SparkFun Edge, which is powered by an Ambiq Apollo3 Blue chip (featuring a Cortex-M4 MCU), but I'm not sure it would work right away.
However, I thought I could train the model myself and deploy it on my SparkFun Edge if I knew the model's architecture and the dataset it was trained on. Actually, if you trained it using TensorFlow, a model checkpoint would be sufficient, since the SparkFun Edge is meant to run TensorFlow Lite models.

Rhino Issue: React Native using voice to control video playback

Expected behaviour

When triggering play/pause of video component, picovoice should continue listening

Actual behaviour

Currently, after several commands, Rhino encounters an error (call stack available in Xcode):
#6 0x000000010baadc2e in PvRhino.process(handle:pcm:resolver:rejecter:) at /Users/john.bowden/development/rnpvtest/node_modules/@picovoice/rhino-react-native/ios/Rhino.swift:102

Steps to reproduce the behaviour

Init a React Native project with Picovoice and @react-native-community/voice, then use Picovoice with a wake word and intent to toggle playback of a video.

I have a private repo that I can share to reproduce this; I can invite someone if given a GitHub user.

Note: we also have a corporate account, can provide details separately

non-English support for STM32 on GitHub

I'm a newbie in the Picovoice world and I don't know if this is the right place to ask a question, because I didn't find a forum for Picovoice.
Anyway, I created a Personal account to get started with the Picovoice solution on an STM32 device (the available demo) and it looks very cool.
My issue is that when I select the language French, I get this error:
[ERROR] context file belongs to 'fr' while model file belongs to 'en'
I already selected the French language in the Rhino interface before generating the model!
Does the Personal account support languages other than English?
If yes, what is the issue, and how can I test French with the current STM32 demo?
Thank you.

Rhino Issue: Audio error when running on a background service

Hi Picovoice,

I have been playing with Porcupine and Rhino on Android and recently ran into a situation where Rhino shows an error when the app runs in the background (foreground service). It works well when the app is not in the background.
Any idea if I am doing something wrong, or is this a bug?

This is the message it shows:
E/IAudioFlinger: createRecord returned error -1
E/AudioRecord: createRecord_l(1225279496): AudioFlinger could not create record track, status: -1
E/AudioRecord-JNI: Error creating AudioRecord instance: initialization check failed with status -1.
E/android.media.AudioRecord: Error code -20 when initializing native AudioRecord object.

Rhino Issue:


Expected behaviour

I expect to build the iOS project with the custom Rhino context file.

Actual behaviour

The project cannot find my custom Rhino file, and the Terminal states that it cannot find the model file.
The error messages are shown below:

  • porcupineError(Porcupine.PorcupineError.invalidArgument(message: "Model file at does not exist at '' "))

Steps to reproduce the behaviour

Please guide me how to solve this issue.


Add support for React Native

If you are developing a cross-platform mobile application with React Native, please comment on this issue. The more requests we get, the faster we'd support this platform.

We are also looking for technical contributors who can help us expand our SDK support for a variety of platforms. If you are an expert in React Native or other platforms listed in the link below, please do not hesitate to reach out to us via the link below:

https://forms.gle/7B58sj87QT8gbwrr8

.NET demo has missing dependency

Hello,

I tried to run your .NET demo on Win10 v1909 FR.

dotnet run -c MicDemo.Release -- --context_path "C:\Users\Username\Desktop\model.rhn"

It crashes with

Listening
System.DllNotFoundException: Could not load the dll 'openal32.dll' (this load is intercepted, specified in DllImport as 'AL').
   at OpenTK.Audio.OpenAL.ALLoader.ImportResolver(String libraryName, Assembly assembly, Nullable`1 searchPath)
   at System.Runtime.InteropServices.NativeLibrary.LoadLibraryCallbackStub(String libraryName, Assembly assembly, Boolean hasDllImportSearchPathFlags, UInt32 dllImportSearchPathFlags)
   at OpenTK.Audio.OpenAL.ALC.CaptureOpenDevice(String devicename, Int32 frequency, ALFormat format, Int32 buffersize)
   at RhinoDemo.MicDemo.RunDemo(String contextPath, String modelPath, Single sensitivity, Nullable`1 audioDeviceIndex, String outputPath) in E:\rhino\demo\dotnet\RhinoDemo\MicDemo.cs:line 83
   at RhinoDemo.MicDemo.Main(String[] args) in E:\rhino\demo\dotnet\RhinoDemo\MicDemo.cs:line 296

Seems related to opentk/opentk#1169

Define hardcoded/prefilled slots

Hi,

I'm controlling my Spotify streaming via Picovoice. I have expressions in the form music $musiccommand:command where the command can be something like 'play', 'pause', 'next', 'previous'.

Now some commands need extra parameters, like 'shuffle' (on/off), 'volume' (double digit integer). I want to use the same intent for that, and the code for that intent expects the command slot to be there. Therefore, for now, I also define the expressions music $musiccommand:command $pv.SingleDigitInteger:volumelevel and music $musiccommand:command $state:state.

Now I could also say music play off which does not make sense. So what I would like to do is define an expression like music shuffle:command $state:state or music volume:command $pv.SingleDigitInteger:volumelevel, basically defining hardcoded values for the slots instead of using a list of possible values.

Of course I could define extra slots for shuffle/volume, but this way it would make things much easier.

Generally, Picovoice with Respeaker on Raspi - awesome!

Rhino using French on Flutter

Dear Picovoice team,

I am trying to run Picovoice in French, but I am having trouble loading a French custom model (created using the Console).
I get an error loading the model, without further info. Should I add a French context somewhere?
I only added the .rhn model to my assets.

pico_voice % flutter doctor
Doctor summary (to see all details, run flutter doctor -v):
[✓] Flutter (Channel stable, 2.2.0, on macOS 11.2.3 20D91 darwin-x64, locale fr-FR)
[✓] Android toolchain - develop for Android devices (Android SDK version 30.0.3)
[✓] Xcode - develop for iOS and macOS
[✓] Chrome - develop for the web
[✓] Android Studio (version 4.1)
[✓] Connected device (2 available)

• No issues found!

An Observatory debugger and profiler on iPhone 12 Pro Max is available at: http://127.0.0.1:54807/P1AkUQBJ8Oc=/
flutter: Failed to initialize Rhino: Failed to initialize Rhino.
The Flutter DevTools debugger and profiler on iPhone 12 Pro Max is available at:
http://127.0.0.1:9101?uri=http%3A%2F%2F127.0.0.1%3A54807%2FP1AkUQBJ8Oc%3D%2F
[VERBOSE-2:ui_dart_state.cc(199)] Unhandled Exception: LateInitializationError: Field '_rhinoManager@606092562' has not been initialized.

Using custom wake words and custom contexts

Hello, we just created custom wake words and custom contexts and trained models using the account we have in the Picovoice Console. We would like to use these instead of the default ones supplied with the Python demo. I see that the following variables need to be customized:

  • rhino_library_path=args.rhino_library_path
  • rhino_model_file_path=args.rhino_model_file_path,
  • rhino_context_file_path=args.rhino_context_file_path,
  • porcupine_library_path=args.porcupine_library_path,
  • porcupine_model_file_path=args.porcupine_model_file_path,
  • porcupine_keyword_file_path=args.porcupine_keyword_file_path

We are able to figure out rhino_library_path, rhino_model_file_path, porcupine_model_file_path, and porcupine_library_path. But what about the keyword file and the context file? How do we get those files for the custom keyword and context we trained?

Thanks
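For what it's worth, the keyword (.ppn) and context (.rhn) files should be the artifacts downloaded from Picovoice Console after training, so the remaining two variables just point at those downloads (paths below are hypothetical):

    # Hypothetical locations of the files exported from Picovoice Console
    porcupine_keyword_file_path = "resources/my_wake_word.ppn"  # custom wake word
    rhino_context_file_path = "resources/my_context.rhn"        # custom context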

Add Raspberry Pi support to Go binding


Expected behaviour

demo should compile

Actual behaviour

It doesn't compile.
---- OUTPUT
$ go run micdemo/rhino_mic_demo.go

github.com/gen2brain/malgo

In file included from miniaudio.c:4:
miniaudio.h: In function ‘ma_device_data_loop_wakeup__alsa’:
miniaudio.h:20930:9: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
20930 | write(pDevice->alsa.wakeupfdCapture, &t, sizeof(t));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
miniaudio.h:20933:9: warning: ignoring return value of ‘write’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
20933 | write(pDevice->alsa.wakeupfdPlayback, &t, sizeof(t));
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
miniaudio.h: In function ‘ma_device_wait__alsa’:
miniaudio.h:20760:13: warning: ignoring return value of ‘read’ declared with attribute ‘warn_unused_result’ [-Wunused-result]
20760 | read(pPollDescriptors[0].fd, &t, sizeof(t)); /* <-- Important that we read here so that the next write() does not block. */
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

github.com/Picovoice/rhino/binding/go

/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:142:5: could not determine kind of name for C.bool
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:148:12: could not determine kind of name for C.pv_rhino_is_understood_wrapper
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:138:12: could not determine kind of name for C.pv_rhino_process_wrapper
cgo:
gcc errors for preamble:
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:34:67: error: unknown type name 'bool'
34 | typedef int32_t (*pv_rhino_process_func)(void *, const int16_t *, bool *);
| ^~~~
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:36:77: error: unknown type name 'bool'
36 | int32_t pv_rhino_process_wrapper(void *f, void *object, const int16_t *pcm, bool *is_finalized) {
| ^~~~
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:40:62: error: unknown type name 'bool'
40 | typedef int32_t (*pv_rhino_is_understood_func)(const void *, bool *);
| ^~~~
/home/ubuntu/go/pkg/mod/github.com/!picovoice/rhino/binding/[email protected]/rhino_posix.go:42:69: error: unknown type name 'bool'
42 | int32_t pv_rhino_is_understood_wrapper(void *f, const void *object, bool *is_understood) {
| ^~~~

Steps to reproduce the behaviour

Try to compile on a Raspberry Pi 4.


$ uname -a
Linux raspi4 5.8.0-1024-raspi #27-Ubuntu SMP PREEMPT Thu May 6 10:07:12 UTC 2021 aarch64 aarch64 aarch64 GNU/Linux

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 20.10
Release: 20.10
Codename: groovy

$ go version
go version go1.16.4 linux/arm64
