
make-a-smart-speaker's Introduction

To make a smart speaker


This is a collection of resources for making a smart speaker. The goal is an open source smart speaker for daily use, and I believe we have enough resources to build one. Let's do it. I just created a project called Smart Speaker from Scratch on Hackaday. Check it out to see what we can make.

A simplified flowchart of a smart speaker looks like this:

+---+   +----------------+   +---+   +---+   +---+
|Mic|-->|Audio Processing|-->|KWS|-->|STT|-->|NLU|
+---+   +----------------+   +---+   +---+   +-+-+
                                               |
                                               |
+-------+   +---+   +----------------------+   |
|Speaker|<--|TTS|<--|Knowledge/Skill/Action|<--+
+-------+   +---+   +----------------------+
  • Audio Processing includes Acoustic Echo Cancellation (AEC), Beamforming, Noise Suppression (NS), etc.
  • Keyword Spotting (KWS) detects a keyword (such as OK Google, Hey Siri) to start a conversation.
  • Speech To Text (STT) converts the recorded speech into raw text.
  • Natural Language Understanding (NLU) converts raw text into structured data.
  • Knowledge/Skill/Action - Knowledge base and plugins (Alexa Skill, Google Action) to provide an answer.
  • Text To Speech (TTS) converts the text answer into speech for the speaker.
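
The NLU and Knowledge/Skill/Action stages above can be sketched in a few lines of Python. This is only a toy illustration, not how any of the projects listed below work: the `Intent` class, the regex patterns, and the two skills are all hypothetical names invented for this example, and the input is assumed to be text that an STT engine has already produced.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Intent:
    """Structured data produced by the NLU stage (hypothetical format)."""
    name: str
    slots: dict = field(default_factory=dict)


# Hypothetical rule-based patterns; real NLU engines use trained models.
PATTERNS = [
    ("set_timer", re.compile(r"set a timer for (?P<minutes>\d+) minutes?")),
    ("get_weather", re.compile(r"weather in (?P<city>[a-z ]+)")),
]


def understand(text: str) -> Intent:
    """NLU: convert raw text into an Intent with named slots."""
    normalized = text.lower().strip()
    for name, pattern in PATTERNS:
        match = pattern.search(normalized)
        if match:
            return Intent(name, match.groupdict())
    return Intent("unknown")


def set_timer(slots: dict) -> str:
    """Skill: pretend to start a timer and describe what was done."""
    return f"Timer set for {slots['minutes']} minutes."


def get_weather(slots: dict) -> str:
    """Skill: placeholder, since no weather service is wired in."""
    return f"I cannot look up the weather in {slots['city'].title()} yet."


# Skill registry: maps an intent name to the function that answers it.
SKILLS = {"set_timer": set_timer, "get_weather": get_weather}


def answer(text: str) -> str:
    """Run NLU, then dispatch to a skill; the result would be sent to TTS."""
    intent = understand(text)
    skill = SKILLS.get(intent.name)
    if skill is None:
        return "Sorry, I did not understand that."
    return skill(intent.slots)


if __name__ == "__main__":
    print(answer("Set a timer for 5 minutes"))
```

In a real speaker, the string passed to `answer` would come from the STT engine after the keyword is detected, and the returned string would be handed to the TTS engine.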

KWS + STT + NLU + Skill + TTS

Active open source projects

  • Snips ⭐ - the first 100% on-device and private-by-design open-source Voice AI platform
  • Mycroft ⭐ - a hackable open source voice assistant
  • SEPIA 🤖 - Highly customizable, open-source, cross-platform voice assistant and VUI framework (HTML + Java + x)
  • Kalliope - a framework that helps you create your own personal assistant, similar to Mycroft (both are written in Python)
  • dingdang robot - a 🇨🇳 voice interaction robot based on Jasper and built with Raspberry Pi

SDK

KWS

  • Mycroft Precise - A lightweight, simple-to-use, RNN wake word listener
  • Snowboy - DNN based hotword and wake word detection toolkit
  • Honk - PyTorch reimplementation of Google's TensorFlow CNNs for keyword spotting
  • ML-KWS-For-MCU - Possibly the most promising option for resource-constrained devices such as ARM Cortex-M7 microcontrollers
  • Porcupine - Lightweight, cross-platform engine to build custom wake words in seconds

STT

NLU

TTS

  • Mimic - Mycroft's TTS engine, based on CMU's Flite (Festival Lite)
  • MaryTTS - an open-source, multilingual text-to-speech synthesis system written in pure Java
  • espeak-ng - an open source speech synthesizer that supports 99 languages and accents.
  • ekho - Chinese text-to-speech engine
  • WaveNet, Tacotron 2

Audio Processing

  • Acoustic Echo Cancellation

    • SpeexDSP and its Python binding speexdsp-python
    • EC - Echo Cancellation Daemon based on SpeexDSP AEC for Raspberry Pi or other devices running Linux.
  • Direction Of Arrival (DOA) - The most commonly used DOA algorithm is GCC-PHAT

    • tdoa
    • odas - ODAS stands for Open embeddeD Audition System. It is a library for sound source localization, tracking, separation and post-filtering. ODAS is written entirely in C for portability and is optimized to run on low-cost embedded hardware. ODAS is free and open source.
  • Beamforming

  • Voice Activity Detection

  • Noise Suppression
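
GCC-PHAT, mentioned under Direction Of Arrival above, estimates the time delay between two microphone signals by whitening their cross-spectrum and looking for the peak of the resulting cross-correlation. The following is a minimal NumPy sketch of that idea, not the code of any project listed here. The function names, the 343 m/s speed of sound, and the far-field two-microphone geometry used to turn the delay into an angle are assumptions made for this example.

```python
import numpy as np


def gcc_phat(sig, ref, fs, max_tau=None):
    """Estimate the delay (in seconds) of `sig` relative to `ref`.

    A positive result means `sig` arrives later than `ref`.
    """
    n = sig.shape[0] + ref.shape[0]  # zero-pad to avoid circular wrap-around
    sig_fft = np.fft.rfft(sig, n=n)
    ref_fft = np.fft.rfft(ref, n=n)

    # Cross-spectrum with the PHAT weighting: keep the phase, drop the magnitude.
    cross = sig_fft * np.conj(ref_fft)
    cross /= np.abs(cross) + 1e-12  # small constant avoids division by zero

    cc = np.fft.irfft(cross, n=n)

    # Limit the search to physically possible delays if max_tau is given.
    max_shift = n // 2
    if max_tau is not None:
        max_shift = max(1, min(int(fs * max_tau), max_shift))

    # Reorder so that index 0 corresponds to a lag of -max_shift samples.
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / float(fs)


def tdoa_to_angle(tau, mic_distance, speed_of_sound=343.0):
    """Convert a delay into an arrival angle in degrees (far-field assumption)."""
    ratio = np.clip(tau * speed_of_sound / mic_distance, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))


if __name__ == "__main__":
    fs = 16000
    rng = np.random.default_rng(0)
    ref = rng.standard_normal(1024)
    sig = np.concatenate((np.zeros(5), ref[:-5]))  # ref delayed by 5 samples
    tau = gcc_phat(sig, ref, fs)
    print(round(tau * fs), tdoa_to_angle(tau, mic_distance=0.2))
```

Passing `max_tau = mic_distance / speed_of_sound` restricts the peak search to delays that the microphone spacing allows, which helps reject spurious peaks in noisy recordings.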

Audio I/O

make-a-smart-speaker's People

Contributors

fq-selbach, maelp, xiongyihui
