markparker5 / stark

S.T.A.R.K. - Speech And Text Algorithmic Recognition Kit

License: Other

Python 99.95% HTML 0.05%

cross-platform framework natural-language natural-language-processing natural-language-understanding python python3 speech-processing speech-recognition voice voice-assistant voice-commands voice-control voice-interface voice-recognition

stark's Introduction

👋 Hi, I’m Mark Parker, full-stack software engineer

MajorDom

I'm making majordom.io - a smart home of the future that is really smart; a brand new open-source smart home ecosystem, designed for privacy, autonomy, and seamless integration across a wide range of devices. The platform combines easy plug-n-play functionality with intelligent automations and a really smart voice assistant.

My other open source projects

Voice Interfaces

STARK - (python) Speech And Text Algorithmic Recognition Kit. Modern, advanced, and fast framework for creating natural language (especially voice) interfaces. Like FastAPI, but with speech instead of http.
STARK-PLACE - S.T.A.R.K. Platform Library and Community Extensions

iOS/swift Development

AnyObservableObject - Protocol-friendly equivalents to SwiftUI's property wrappers but without compile-time type restrictions. Use protocols in views without generics!
XCodeLocalize - One CLI command to find and translate all .strings files in an Xcode project to all languages.
SwiftyTranslate - Swift wrapper for public Google Translate. Absolutely free, no API keys needed.
TwiTreads - An abandoned idea to combine several socials into one platform. An example of clean SwiftUI architecture: MVVMP+SOLID+DI

Python Development

python-app-architecture-demo - clean architecture example for the article
system-sounds - list and play available system sound files (cross-platform)
rpi-networking - Control Wi-Fi, hotspot, and hostname of a Raspberry Pi (created for Raspbian, but can also work with other Linux systems)
raspi-gpio - RPi.GPIO and spidev wrapper with mocks for development on any platform
rpi-reactive-gpio - Syntax sugar for controlling RPi.GPIO with reactive design.

Dr. House AI

HouseMDAI - AI diagnostician in your phone. A PoC of a startup, built at a hackathon in two days. SwiftUI app, FastAPI backend, OpenAI API, business models, and custdev.

Articles about the project:

Gists

https://gist.github.com/MarkParker5

Blog

Check my blog at markparker.me/blog or mirrors at medium, dev.to, or hashnode. You can subscribe to articles on these platforms. Additional notifications will also be on Twitter and Threads. More relevant links can be found at markparker.me

stark's People

Forkers

luisxcosta artemii1 vladimirshleyev

stark's Issues

Create Bug-Report and Feature-Request Templates

RuntimeError on main basic example

You get this error if you run the basic sample from the documentation.
Traceback:

`Traceback (most recent call last):
  File "/home/seeker/tmp/./sovetnik.py", line 11, in <module>
    synthesizer = SileroSpeechSynthesizer(model_url='https://models.silero.ai/models/tts/ru/v4_ru.pt')
  File "/home/seeker/.local/lib/python3.10/site-packages/stark/interfaces/silero.py", line 37, in __init__
    self.model = torch.package.PackageImporter(local_file).load_pickle('tts_models', 'model')
  File "/home/seeker/.local/lib/python3.10/site-packages/torch/package/package_importer.py", line 271, in load_pickle
    result = unpickler.load()
  File "/usr/lib/python3.10/pickle.py", line 1213, in load
    dispatch[key[0]](self)
  File "/usr/lib/python3.10/pickle.py", line 1254, in load_binpersid
    self.append(self.persistent_load(pid))
  File "/home/seeker/.local/lib/python3.10/site-packages/torch/package/package_importer.py", line 249, in persistent_load
    loaded_reduces[reduce_id] = func(self, *args)
  File "/home/seeker/.local/lib/python3.10/site-packages/torch/jit/_script.py", line 372, in unpackage_script_module
    cpp_module = torch._C._import_ir_module_from_package(
RuntimeError:
Unknown builtin op: aten::scaled_dot_product_attention.
Here are some suggestions:
    aten::_scaled_dot_product_attention

The original call is:
  File ".data/ts_code/code/__torch__/torch/nn/functional.py", line 489
    _114 = [bsz, num_heads, src_len0, head_dim]
    v8 = torch.view(v6, _114)
    attn_output5 = torch.scaled_dot_product_attention(q3, k8, v8, attn_mask16, dropout_p0, is_causal)
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _115 = torch.permute(attn_output5, [2, 0, 1, 3])
    _116 = torch.contiguous(_115)
'multi_head_attention_forward' is being compiled since it was called from 'MultiheadAttention.forward'
Serialized File ".data/ts_code/code/__torch__/torch/nn/modules/activation.py", line 44
    _6 = "The fast path was not hit because {}"
    _7 = "MultiheadAttention does not support NestedTensor outside of its fast path. "
    _8 = __torch__.torch.nn.functional.multi_head_attention_forward
         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ <--- HERE
    _9 = uninitialized(Tuple[Tensor, Tensor])
    _10 = uninitialized(Optional[Tensor])
`
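
The report does not state which PyTorch version was installed. The suggested op name `aten::_scaled_dot_product_attention` is consistent with a PyTorch older than 2.0, since the public `scaled_dot_product_attention` op arrived in 2.0, so checking the installed version is a plausible first diagnostic. The following is a minimal sketch under that assumption, not a confirmed fix; the helper names `major_minor` and `has_public_sdpa` are hypothetical.

```python
import re


def major_minor(version: str) -> tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.1.0+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def has_public_sdpa(torch_version: str) -> bool:
    """Return True if this PyTorch version is 2.0 or newer.

    Assumption: the public scaled_dot_product_attention op is available
    from PyTorch 2.0 onward; older releases only have a private variant.
    """
    return major_minor(torch_version) >= (2, 0)


# Usage with a real installation (requires torch to be importable):
#   import torch
#   print(torch.__version__, has_public_sdpa(torch.__version__))
```

If the check returns `False`, upgrading PyTorch to a 2.x release may let the `v4_ru.pt` model load, though that is untested against this report.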

[SOLVED] RuntimeError: quantized engine QNNPACK is not supported

When I tried to run the example from the documentation, I got this error:

RuntimeError: quantized engine QNNPACK is not supported

import anyio
from stark import run, CommandsManager, Response
from stark.interfaces.vosk import VoskSpeechRecognizer

from stark.interfaces.silero import SileroSpeechSynthesizer

VOSK_MODEL_URL = "https://alphacephei.com/vosk/models/vosk-model-small-ru-0.22.zip"
SILERO_MODEL_URL = "https://models.silero.ai/models/tts/ru/v4_ru.pt"

recognizer = VoskSpeechRecognizer(model_url=VOSK_MODEL_URL)
synthesizer = SileroSpeechSynthesizer(model_url=SILERO_MODEL_URL)

manager = CommandsManager()

@manager.new('hello')
async def hello_command() -> Response:
    text = voice = 'Hello, world!'
    return Response(text=text, voice=voice)

async def main():
    await run(manager, recognizer, synthesizer)

if __name__ == '__main__':
    anyio.run(main)

OS: Windows 11
CPU: Intel(R) Core(TM) i7-8550U CPU @ 1.80GHz 1.99 GHz
GPU: Intel(R) UHD Graphics 620 / NVIDIA GeForce GTX 1050
stark-engine: 4.0.5
torch: 2.1.0
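
The thread is marked solved, but the resolution is not recorded here. One way to investigate this kind of error is to list the quantized engines the installed PyTorch build supports via `torch.backends.quantized.supported_engines` and see whether `qnnpack` is among them. The sketch below is only a diagnostic aid under that assumption; `pick_quantized_engine` and its preference order are hypothetical, and a model that requires QNNPACK may still fail with another engine.

```python
from typing import Iterable, Optional, Sequence


def pick_quantized_engine(
    supported: Iterable[str],
    preferred: Sequence[str] = ("qnnpack", "fbgemm", "x86", "onednn"),
) -> Optional[str]:
    """Return the first preferred engine present in `supported`, or None.

    `preferred` is an illustrative ordering, not a recommendation from
    the STARK or Silero projects.
    """
    available = {name.lower() for name in supported}
    for name in preferred:
        if name in available:
            return name
    return None


# Usage with a real installation (requires torch to be importable):
#   import torch
#   engines = torch.backends.quantized.supported_engines
#   print("supported:", engines)
#   print("would pick:", pick_quantized_engine(engines))
```

If `qnnpack` is missing from the supported list, the error message is expected on that build, and the printed list shows which engines are available instead.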