
Stream Assist

Home Assistant custom component that allows you to turn almost any camera and almost any speaker into a local voice assistant.

The component uses:

  • Stream integration for receiving audio from the camera (RTSP/HTTP/RTMP) and automatically transcoding the audio codec into a format suitable for Speech-to-Text (STT)
  • Assist pipeline integration for running: Speech-to-Text (STT) => Natural Language Processing (NLP) => Text-to-Speech (TTS)
  • Almost any Media player for playing the audio response from Text-to-Speech (TTS)

The Assist pipeline can use local or cloud services for each stage; the local options (openWakeWord, Whisper, Piper) are configured below.

Video instruction from fixtSE: "Local Voice Assistant: Using your Cameras & Speakers in HA"

Installation

HACS > Integrations > 3 dots (upper right corner) > Custom repositories > URL: AlexxIT/StreamAssist, Category: Integration > Add > wait > Stream Assist > Install

Or manually copy the stream_assist folder from the latest release to the /config/custom_components folder.

Configuration

Config wake word detection (WAKE)

  1. Add wake word detection Add-on:
    Settings > Add-ons > Add-on Store > openWakeWord > Install
  2. Config WAKE Add-on:
    openWakeWord > Configuration
  3. Add WAKE Integration:
    Settings > Integrations > openWakeWord > Configure

Config local Speech-to-Text (STT)

  1. Add local Speech-to-Text Add-on
    Settings > Add-ons > Add-on Store > Whisper > Install
  2. Config STT Add-on:
    Whisper > Configuration
  3. Add STT Integration:
    Settings > Integrations > Whisper > Configure

Config local Text-to-Speech (TTS)

  1. Add local Text-to-Speech Add-on
    Settings > Add-ons > Add-on Store > Piper > Install
  2. Config TTS Add-on:
    Piper > Configuration
  3. Add TTS Integration:
    Settings > Integrations > Piper > Configure

Config local Voice assistant (INTENT)

  1. Config Voice assistant:
    Settings > Voice assistants > Home Assistant > Select: STT, TTS and WAKE

Config Stream Assist

  1. Add Stream Assist Integration
    Settings > Integrations > Add Integration > Stream Assist
  2. Config Stream Assist Integration
    Settings > Integrations > Stream Assist > Configure

You can select either a camera entity_id or a stream URL as the audio (MIC) source.

You can select a Voice Assistant Pipeline for the recognition process: WAKE => STT => NLP => TTS. By default, the component will use the default pipeline. You can create several Pipelines with different settings, and several Stream Assist components with different settings.

You can select one or multiple Media players (SND) to output the audio response. If your camera supports two-way audio, you can use the WebRTC Camera custom integration to add it as a Media player.

You can set STT start media to play a "beep" after WAKE detection (e.g. media-source://media_source/local/beep.mp3).

Using

The component has a MIC switch and multiple sensors: WAKE, STT, INTENT and TTS. There may be fewer sensors, depending on the Pipeline settings.

The sensor attributes contain a lot of useful information about the results of each step of the assistant.
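
For example, here is a minimal automation sketch that forwards the recognized text to a notification. The sensor entity name and the stt_output attribute are assumptions; check your own sensor's attributes in Developer Tools first:

# A minimal sketch (entity and attribute names are assumptions):
# show the recognized speech as a notification whenever the STT
# sensor changes state.
automation:
  - alias: "Log recognized speech"
    trigger:
      - platform: state
        entity_id: sensor.stream_assist_stt  # assumed entity name
    action:
      - service: notify.persistent_notification
        data:
          message: "{{ state_attr('sensor.stream_assist_stt', 'stt_output') }}"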

You can also view the pipelines running history in the Home Assistant interface:

  • Settings > Voice assistants > Pipeline > 3 dots > Debug

Service

You can run the pipeline as a service. Almost all settings are optional, but they allow you to achieve customizations that are not possible in Hass by default. In the example below, values shown as None are unset defaults and those keys can be omitted.

service: stream_assist.run
data:
  stream_source: rtsp://...
  camera_entity_id: camera.xxx
  player_entity_id: media_player.xxx
  stt_start_media: media-source://media_source/local/beep.mp3
  pipeline_id: abcdefg...
  assist:
    start_stage: wake_word  # wake_word, stt, intent, tts
    end_stage: tts
    pipeline:
      conversation_language: en
      conversation_engine: homeassistant
      language: en
      name: Home Assistant
      stt_engine: stt.faster_whisper
      stt_language: en
      tts_engine: tts.google_en_com
      tts_language: en
      tts_voice: None
      wake_word_entity: wake_word.openwakeword
      wake_word_id: None
    wake_word_settings: { timeout: 5 }
    audio_settings:
      noise_suppression_level: None
      auto_gain_dbfs: None
      volume_multiplier: None
    conversation_id: None
    device_id: None
    intent_input: None
    tts_audio_output: None  # None, wav, mp3
    tts_input: None
  stream:
    file: ...
    options: {}
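
For a minimal call, most of the options above can be dropped. A sketch (entity IDs are placeholders) that runs the default pipeline on a camera and plays the answer on one speaker:

service: stream_assist.run
data:
  camera_entity_id: camera.front_door     # placeholder entity
  player_entity_id: media_player.kitchen  # placeholder entity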

Tips

  1. Recommended settings for Whisper:

    • Model: small-int8 or medium-int8
    • Beam size: 5
  2. You can add a remote Whisper/Piper installation from another server:

    • First server: Settings > Add-ons > Whisper/Piper > Configuration > Network > Select port
    • Second server: Settings > Integrations > Add integration > Wyoming Protocol > Select: first server IP, add-on port
  3. You can use the Google Translate integration instead of Piper, which supports many languages for TTS (see the sketch after this list).

  4. If your environment does not allow you to install add-ons, you can install the Faster Whisper custom integration for local STT.
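
For tip 3, a minimal configuration.yaml sketch for the Google Translate TTS platform (the language value is an assumption; pick your own):

# configuration.yaml — minimal Google Translate TTS setup
tts:
  - platform: google_translate
    language: "en"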


Issues

[New Feature] - Create two extra entities with question and response text

Hi @AlexxIT, thanks for a really great plugin.

Could you create additional entities that would store the last processed query and response, e.g. "Last question" and "Last response" for each configured stream? This would really help in streamlining and customizing text and response processing.

PS. It would also be really nice to be able to set the end-of-sentence recognition mode (e.g. Default, Aggressive). I noticed that when there is slight noise in the background, the text keeps being processed even after I finish speaking a sentence; the process waits for complete silence before it stops processing the sound.

Wyoming-whisper errors and no timeout on STT error

When using the Wyoming-whisper docker image I get the following in the logs, and the STT sensor stays in 'processing' for a long time (I believe it never ends):

wyoming-whisper           | ERROR:asyncio:Task exception was never retrieved
wyoming-whisper           | future: <Task finished name='Task-11' coro=<AsyncEventHandler.run() done, defined at /usr/local/lib/python3.9/dist-packages/wyoming/server.py:26> exception=ValueError("can't extend empty axis 0 using modes other than 'constant' or 'empty'")>
wyoming-whisper           | Traceback (most recent call last):
wyoming-whisper           |   File "/usr/local/lib/python3.9/dist-packages/wyoming/server.py", line 32, in run
wyoming-whisper           |     if not (await self.handle_event(event)):
wyoming-whisper           |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/handler.py", line 61, in handle_event
wyoming-whisper           |     segments, _info = self.model.transcribe(
wyoming-whisper           |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/transcribe.py", line 124, in transcribe
wyoming-whisper           |     features = self.feature_extractor(audio)
wyoming-whisper           |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/feature_extractor.py", line 152, in __call__
wyoming-whisper           |     frames = self.fram_wave(waveform)
wyoming-whisper           |   File "/usr/local/lib/python3.9/dist-packages/wyoming_faster_whisper/faster_whisper/feature_extractor.py", line 98, in fram_wave
wyoming-whisper           |     frame = np.pad(frame, pad_width=padd_width, mode="reflect")
wyoming-whisper           |   File "<__array_function__ internals>", line 200, in pad
wyoming-whisper           |   File "/usr/local/lib/python3.9/dist-packages/numpy/lib/arraypad.py", line 815, in pad
wyoming-whisper           |     raise ValueError(
wyoming-whisper           | ValueError: can't extend empty axis 0 using modes other than 'constant' or 'empty'

Question about Android audio only

First off, this is super impressive and very close to what I want. I'm working on replacing my Echo Show devices with an Android tablet. fixtSE made a video featuring your StreamAssist, and in it he demos it with Android. I'd absolutely love to use this if it could take an Android device's audio as input. I am already using the camera on the device with Fully Kiosk for motion detection, and I'm pretty sure the IP camera software and Fully Kiosk can't use the same camera device.

Is it possible to use StreamAssist with only the mic on the tablet, leaving the camera available to Fully Kiosk?

Thanks!
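
One possible approach, sketched with the documented stream_source option (the RTSP URL and entity ID are placeholders): point Stream Assist at an audio-only RTSP stream served by a microphone-streaming app, leaving the camera free for Fully Kiosk.

service: stream_assist.run
data:
  stream_source: rtsp://192.168.1.50:8554/mic  # placeholder URL from a mic-streaming app
  player_entity_id: media_player.tablet        # placeholder entity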

Error on configuration / attempt to use

First, thank you for this integration! This is exactly what I wished for when I first saw the wake word article for HA assist.

I added my camera via entity. After clicking configure, I get the following error (duplicate of #6):

Logger: aiohttp.server
Source: /usr/local/lib/python3.11/site-packages/aiohttp/web_protocol.py:403
First occurred: 22:39:15 (1 occurrences)
Last logged: 22:39:15

Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/aiohttp/web_protocol.py", line 433, in _handle_request
    resp = await request_handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/web_app.py", line 504, in _handle
    resp = await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/web_middlewares.py", line 117, in impl
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/security_filter.py", line 85, in security_filter_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/forwarded.py", line 100, in forwarded_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/request_context.py", line 28, in request_context_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/auth.py", line 236, in auth_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/headers.py", line 31, in headers_middleware
    response = await handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/view.py", line 148, in handle
    result = await handler(request, **request.match_info)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/decorators.py", line 63, in with_admin
    return await func(self, request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/config/config_entries.py", line 213, in post
    return await super().post(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/data_validator.py", line 72, in wrapper
    result = await method(view, request, data, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/data_entry_flow.py", line 71, in post
    result = await self._flow_mgr.async_init(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/data_entry_flow.py", line 265, in async_init
    result = await self._async_handle_step(flow, flow.init_step, data)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/data_entry_flow.py", line 394, in _async_handle_step
    result: FlowResult = await getattr(flow, method)(user_input)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/stream_assist/config_flow.py", line 58, in async_step_init
    defaults.setdefault("vad_mode", VAD.vad_mode)
                                    ^^^^^^^^^^^^
AttributeError: type object 'VoiceCommandSegmenter' has no attribute 'vad_mode'

When I turn on the switch for the first time, I get the following error:

Error doing job: Task exception was never retrieved
Traceback (most recent call last):
  File "/config/custom_components/stream_assist/switch.py", line 152, in async_process_audio_stream
    async for _ in self.audio_stream(self.close):
  File "/config/custom_components/stream_assist/switch.py", line 110, in audio_stream
    if not self.vad.process(chunk):
           ^^^^^^^^^^^^^^^^^^^^^^^
TypeError: VoiceCommandSegmenter.process() missing 1 required positional argument: 'is_speech'

VAD then changes to standby and does not do anything.

Won't Work with Tapo C110 & C200

Hello, the possibility of using the camera stream is fantastic, but unfortunately it does not work for me. I tested via the Tapo integration (models C110 & 2x C200) and via the restream from Frigate with audio in AAC; it only works via Android IP camera. How can I debug this?
Note that the video stream of these cameras in Home Assistant does have sound.

Maybe someone can help me.

STT, target is not populated

Hello, thanks for taking the time to create this.

I have run into one issue: when the response comes back and TTS is triggered to play on a media device, it does not play. I have configured it for a media player, but when I check the logs, I see that the command sent had no target devices, even though I selected one. Is it correct to expect the device to be listed here?

Thanks!

(screenshot)

STT sometimes hangs for 7-10 sec

Sometimes STT is lightning-fast, but half of the time it hangs in the "start" state for up to 10 seconds after I finish talking...
Overall, it's a great integration, thank you!

Add delay until STT start media finishes playing

Hello. Great job. I was waiting for the wake word feature in Stream Assist and I'm glad you managed to do it. My problem is that for "STT start media" I want to use personalized random answers like "yes, I'm listening", "how can I assist you", etc., and because VAD is too aggressive, it also records part of the answer ("yes, I'm listening"), which is why it gives an error response that it did not understand the request. I tried an automation so that when the wake word is detected it turns off the microphone switch for a second and then turns it on again, but it doesn't start listening again. Can you make it possible to set a delay between wake word detection and the start of STT listening?

Error: "Config flow could not be loaded"

When I try to configure the integration entity, I get the error "Config flow could not be loaded":

(screenshot)

The logs show the error AttributeError: type object 'VoiceCommandSegmenter' has no attribute 'vad_mode':

Logger: aiohttp.server
Source: /usr/local/lib/python3.11/site-packages/aiohttp/web_protocol.py:403
First occurred: 10:06:19 PM (7 occurrences)
Last logged: 10:26:31 PM

Error handling request
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/aiohttp/web_protocol.py", line 433, in _handle_request
    resp = await request_handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/web_app.py", line 504, in _handle
    resp = await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/aiohttp/web_middlewares.py", line 117, in impl
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/security_filter.py", line 85, in security_filter_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/forwarded.py", line 227, in forwarded_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/request_context.py", line 28, in request_context_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/ban.py", line 80, in ban_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/auth.py", line 236, in auth_middleware
    return await handler(request)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/headers.py", line 31, in headers_middleware
    response = await handler(request)
               ^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/view.py", line 148, in handle
    result = await handler(request, **request.match_info)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/decorators.py", line 63, in with_admin
    return await func(self, request, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/config/config_entries.py", line 213, in post
    return await super().post(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/components/http/data_validator.py", line 72, in wrapper
    result = await method(view, request, data, *args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/helpers/data_entry_flow.py", line 71, in post
    result = await self._flow_mgr.async_init(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/data_entry_flow.py", line 265, in async_init
    result = await self._async_handle_step(flow, flow.init_step, data)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/src/homeassistant/homeassistant/data_entry_flow.py", line 394, in _async_handle_step
    result: FlowResult = await getattr(flow, method)(user_input)
                         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/config/custom_components/stream_assist/config_flow.py", line 58, in async_step_init
    defaults.setdefault("vad_mode", VAD.vad_mode)
                                    ^^^^^^^^^^^^
AttributeError: type object 'VoiceCommandSegmenter' has no attribute 'vad_mode'

I've tried selecting the entity from the list, and I've tried pasting in the go2rtc camera name (screenshots of the integration dialog and the go2rtc config).

Here are my default HA Voice Pipeline settings (screenshot).

Thanks for the help, @AlexxIT!

No TTS audio playing over Sonos media player

I am getting the following error when trying to use a Sonos media player. The beep.wav plays after the wake word and commands are executed, but I get no TTS audio through my Sonos AMP / ceiling speakers.

ERROR (SyncWorker_45) [homeassistant.components.sonos.media_player] Sonos does not support media type "audio/mpeg"

Thanks for all you do.
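
A possible workaround (an untested sketch, not confirmed in this thread) is to request WAV output through the documented tts_audio_output service option, since the error suggests Sonos rejects MP3 ("audio/mpeg"):

service: stream_assist.run
data:
  camera_entity_id: camera.living_room      # placeholder entity
  player_entity_id: media_player.sonos_amp  # placeholder entity
  assist:
    tts_audio_output: wav  # documented values: None, wav, mp3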

Add visual custom responses with gifs and tts

A while ago I managed to get custom visual responses to play on an Android tablet using an ESP32 satellite.
This is a small demo.
Since AlexxIT managed to introduce wake word detection in the Stream Assist integration, I thought of replacing the ESP32 satellite I was using in this project with the Stream Assist integration and the Rtpmic app on an Android tablet.
I added custom TTS responses to "STT start media", plus two GIFs for speaking and listening, to be played with a Browser Mod popup.
The advantage is that the custom responses are very easy to adapt for each language using your favorite TTS service.
Because I made several changes to the Stream Assist integration code, before making a pull request I chose to fork AlexxIT's repository and publish it with my changes on my GitHub page, along with the necessary instructions.
@AlexxIT, please take a look at this modified repository if you have time and tell me your opinion: whether you want to add these changes to your integration, or prefer that they stay only in my modified repository.

Add area into Assist context

All the satellites HA has presented (including Wyoming-satellite and the ESP32-S3 satellite) have something called "area awareness". I guess it's just the device's area being provided to the Assist pipeline along with the STT data.

In its current state, StreamAssist has no area awareness. Could you please look into it?
