audio-transcription

This project is intended to become a basic otter.ai clone, for speech-to-text in the browser and persisting transcriptions.

Design

Currently the 'AI' is just the Web Speech API, and transcriptions are persisted to MinIO object storage.

For each transcription in progress, the Web Speech API generates both 'interim results', which are mutable, and 'final results', which are immutable. Currently, interim results are submitted every 5 seconds, and final results are submitted when recording stops. The Web Speech API is not perfect, so it may make sense to apply additional corrections in the backend using both the interim and final results.
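The interim/final handling described above can be modelled roughly as follows. This is a language-neutral sketch written in Python rather than the project's actual browser code: the class name `TranscriptionSession`, its methods, and the choice to submit "finalized text plus the current interim guess" as the interim snapshot are all assumptions made for illustration. Only the 5-second interval and the mutable/immutable distinction come from the description.

```python
from dataclasses import dataclass, field
from typing import List, Optional

INTERIM_INTERVAL_S = 5.0  # from the text: interim results are submitted every 5 seconds


@dataclass
class TranscriptionSession:
    """Hypothetical model of how one recording session tracks results."""

    final_segments: List[str] = field(default_factory=list)  # append-only (immutable results)
    interim_text: str = ""            # overwritten whenever the recognizer revises its guess
    last_interim_submit: float = 0.0  # time of the last interim submission, in seconds

    def on_result(self, text: str, is_final: bool) -> None:
        """Record a recognizer result; a final result replaces the pending interim text."""
        if is_final:
            self.final_segments.append(text)
            self.interim_text = ""
        else:
            self.interim_text = text

    def due_interim(self, now: float) -> Optional[str]:
        """Return a snapshot to submit if the interval has elapsed, else None."""
        if now - self.last_interim_submit < INTERIM_INTERVAL_S:
            return None
        self.last_interim_submit = now
        # Assumption: the snapshot is all finalized text plus the current interim guess.
        return " ".join(self.final_segments + [self.interim_text]).strip()

    def stop(self) -> str:
        """Return the final transcript, submitted once when recording stops."""
        return " ".join(self.final_segments).strip()
```

Passing `now` in explicitly, instead of reading a clock inside the class, keeps the timing logic easy to test.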

Results are submitted to the backend as HTTP POST requests. It may be sensible to change this to persistent connections, such as WebSockets.
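As a rough illustration of such a submission, the sketch below builds (but does not send) a POST request using only the Python standard library, for example for a script or test client. The URL path `/transcriptions/<kind>` and the JSON field names `session_id` and `text` are hypothetical; the project's real routes and payload shape are not documented here.

```python
import json
import urllib.request


def build_submit_request(base_url: str, kind: str, session_id: str, text: str) -> urllib.request.Request:
    """Build a POST request carrying one transcription result.

    The route and payload fields are assumptions, not the project's actual API.
    """
    if kind not in ("interim", "final"):
        raise ValueError("kind must be 'interim' or 'final'")
    body = json.dumps({"session_id": session_id, "text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/transcriptions/{kind}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request could then be sent with `urllib.request.urlopen(req)` once the backend is running.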

The system consists of a backend server (Flask) that writes persistence requests to a task queue (RabbitMQ); Celery workers consume these requests and write to MinIO.
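The flow from app to queue to workers to storage can be sketched with in-memory stand-ins. This is not the project's code and uses none of Flask, RabbitMQ, Celery or MinIO: a `queue.Queue` stands in for the message queue, threads stand in for the workers, and a dict of dicts stands in for the buckets. The task tuple layout `(bucket, key, text)` and the function names are assumptions.

```python
import queue
import threading

BUCKETS = ("interim", "final")  # the two bucket names mentioned in this README


def worker(tasks: queue.Queue, store: dict, lock: threading.Lock) -> None:
    """Stand-in for a Celery worker: take a task and write it to 'object storage'."""
    while True:
        task = tasks.get()
        if task is None:  # sentinel telling this worker to shut down
            tasks.task_done()
            break
        bucket, key, text = task
        with lock:
            store[bucket][key] = text
        tasks.task_done()


def run_pipeline(submissions, num_workers: int = 2) -> dict:
    """Enqueue each submission (the app's role) and let workers persist it."""
    tasks: queue.Queue = queue.Queue()
    store = {bucket: {} for bucket in BUCKETS}
    lock = threading.Lock()
    threads = [
        threading.Thread(target=worker, args=(tasks, store, lock))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()
    for submission in submissions:
        tasks.put(submission)  # what the app would do on each POST
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    tasks.join()
    for t in threads:
        t.join()
    return store
```

Putting a queue between the app and storage lets the app acknowledge a request without waiting for the write to finish, which is presumably the motivation for the design described above.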

There are 5 container components:

  • web
  • app
  • worker
  • rabbitmq
  • minio
     
    Object storage
       /      \
      W1      Wk
       \      /
          MQ
          |
        |app|
          |
        |web|

Build and Run

Build and run with docker-compose

docker-compose up --build

Authenticate to the MinIO object storage server using the access key and secret key set as environment variables in docker-compose.yml. The MinIO UI should be available at http://localhost:8080.

You should then be able to start audio transcription in the browser at http://localhost and see saved transcriptions in the 'final' and 'interim' buckets through the MinIO UI.

Or, if you don't care about saving transcriptions, launch just the frontend, which is hosted on GitHub Pages.

audio-transcription's People

Contributors

redwrasse

Forkers

mishanefedov
