TTS-API

Text to speech REST API for multiple TTS engines.

You can send text to be converted into audio, using different TTS engines and sound effects. The result is either played on the audio device of the machine running the API or returned to you as an audio file.

Setup

First, you should install the supported TTS engines:

GoogleSpeech setup

apt install python3 sox libsox-fmt-mp3
pip install google_speech

gTTS setup

apt install python3 sox libsox-fmt-mp3
pip install gTTS

Festival setup

apt install festival festvox-ellpc11k

eSpeak setup

apt install espeak

You also need to install nodejs and npm. Then run npm install followed by npm start. The API should now be running at http://localhost:3000.

Alternatively, you can use the pedroetb/tts-api Docker image, which already has all dependencies configured.

Setup using Docker

The only requirement is to have Docker installed. Then, you can run:

docker run --rm -d --name tts-api --device /dev/snd -p 3000:3000 pedroetb/tts-api

The API will be running and accessible at http://localhost:3000.
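
The --device /dev/snd option passes the host's sound devices to the container, and Docker typically refuses to start a container when that path does not exist on the host. The sketch below prints a docker run command that includes the option only when the device path is present. tts_docker_cmd is a hypothetical helper, not part of tts-api, and it assumes (unverified here) that file-returning voices such as gtts_file still work without a sound device.

```shell
# Hypothetical helper (not part of tts-api): print a docker run command,
# adding --device only when the sound device path exists on this host.
tts_docker_cmd() (
  snd=${1:-/dev/snd}
  device_opt=
  if [ -e "$snd" ]; then
    device_opt="--device $snd "
  fi
  printf 'docker run --rm -d --name tts-api %s-p 3000:3000 pedroetb/tts-api\n' "$device_opt"
)

# Show the command for this host; review it before running it.
tts_docker_cmd
```

The helper only prints the command, so nothing is started until you run the printed line yourself.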

Alternatively, you can deploy it in a Docker Swarm cluster using docker compose (already included in recent Docker versions) and Docker Swarm (create the Swarm cluster first):

cd deploy

# Deploy Caddy service
env $(grep -v '^[#| ]' .env | xargs) \
 TRAEFIK_DOMAIN=change.me \
 docker stack deploy \
 -c compose.caddy.yaml \
 tts-api

# Run TTS-API container
docker compose \
 -f compose.tts-api.yaml \
 -p tts-api \
 up -d

The service is prepared to be reverse-proxied with Traefik and is accessible at the tts.${TRAEFIK_DOMAIN} domain. How to run Traefik is not described here; check its official site.

The proxy needs a little help from Caddy, because Docker Swarm does not support the devices configuration (required for sound output) and Traefik cannot work with Docker containers and Docker Swarm services at the same time. For that reason, only the Caddy service is exposed through Traefik, and the tts-api container is reachable only through the reverse proxy provided by Caddy (in the same way that Traefik reverse-proxies to Caddy).

The Docker container and the Swarm service can run on different hosts, because they communicate through a Docker overlay network. Run the tts-api Docker container on a host that has speakers, so you can listen to the speech.

Don't forget to edit the TRAEFIK_DOMAIN environment variable (change.me in the command above) before deploying.
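
The env $(grep -v '^[#| ]' .env | xargs) prefix in the deploy command turns the entries of .env into environment variables for docker stack deploy. The sketch below shows what that expansion produces. It uses a made-up .env file in a temporary directory, and the EXAMPLE_* names and the example.org domain are placeholders, not values from this project.

```shell
# Made-up .env content, written to a temp dir so no real file is touched.
workdir=$(mktemp -d)
cat > "$workdir/.env" <<'EOF'
# comment lines are skipped
EXAMPLE_PORT=3000
EXAMPLE_NAME=tts-api
EOF

# Same filter as in the deploy command: drop lines that start with '#', '|'
# or a space, then join the remaining KEY=VALUE pairs onto one line.
env_args=$(grep -v '^[#| ]' "$workdir/.env" | xargs)
echo "$env_args"

# env puts those pairs, plus the explicit TRAEFIK_DOMAIN, into the
# environment of the command that follows (here a stand-in for docker).
domain=$(env $env_args TRAEFIK_DOMAIN=example.org sh -c 'echo "tts.$TRAEFIK_DOMAIN"')
echo "$domain"

rm -rf "$workdir"
```

Because the expansion is unquoted, a value that contains spaces would be split into several words, so this pattern suits simple KEY=VALUE entries only.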

Usage

Once running, the API accepts POST requests at http://localhost:3000. You can use your favourite REST client to send a request, or use the built-in form.

Both modes (playing or downloading audio) are available through different voice codes: for example, google_speech plays the audio and gtts_file returns an audio file. Select one according to your needs.

Built-in form

Open http://localhost:3000 in your browser, fill in the form, and submit it. That's all.

Send POST request

You can send a POST request to http://localhost:3000 following this scheme:

  • Headers
    • Content-Type: application/json
  • Body
    • { "voice": "google_speech", "textToSpeech": "hello world", "language": "en", "speed": "1" }

For example, using curl:

# Play audio
curl http://localhost:3000 \
 -d '{ "voice": "google_speech", "textToSpeech": "hello world", "language": "en", "speed": "1" }' \
 -H 'Content-Type: application/json'

# Download audio file
curl http://localhost:3000 \
 -d '{ "voice": "gtts_file", "textToSpeech": "hello world", "language": "en", "speed": "1" }' \
 -H 'Content-Type: application/json' \
 -o 'output.mp3'
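
If you send requests often, the two curl calls above can be wrapped in small shell functions. This is a minimal sketch: tts_payload, tts_request and the TTS_API_URL variable are hypothetical names, not part of tts-api, and the JSON escaping covers only backslashes and double quotes.

```shell
# Hypothetical helpers (names are not part of tts-api).
# TTS_API_URL defaults to the address used throughout this README.
TTS_API_URL=${TTS_API_URL:-http://localhost:3000}

# Print the JSON body for a request: voice, text, language, speed.
# Only backslashes and double quotes in the text are escaped; newlines
# and other control characters are not handled in this sketch.
tts_payload() (
  voice=$1
  text=$(printf '%s' "$2" | sed 's/\\/\\\\/g; s/"/\\"/g')
  language=${3:-en}
  speed=${4:-1}
  printf '{ "voice": "%s", "textToSpeech": "%s", "language": "%s", "speed": "%s" }\n' \
    "$voice" "$text" "$language" "$speed"
)

# POST a JSON body to the API; extra arguments are passed through to curl.
tts_request() {
  body=$1
  shift
  curl --fail --silent --show-error "$TTS_API_URL" \
    -H 'Content-Type: application/json' \
    -d "$body" "$@"
}

# Usage (requires a running API, so these lines are left commented out):
#   tts_request "$(tts_payload google_speech 'hello world')"
#   tts_request "$(tts_payload gtts_file 'hello world')" -o output.mp3
```

Keeping the payload builder separate from the request makes it easy to inspect the JSON before anything is sent.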

Available TTS engines

GoogleSpeech engine

Google Speech is a simple multiplatform command line tool to read text using Google Translate TTS (Text To Speech) API.

You need to be online to communicate with Google servers.

Learn more at https://github.com/desbma/GoogleSpeech

gTTS engine

Google Text-to-Speech (gTTS) is a Python library and CLI tool to interface with Google Translate's text-to-speech API.

You need to be online to communicate with Google servers.

Learn more at https://github.com/pndurette/gTTS

Festival engine

Festival is a free software multilingual speech synthesis workbench that runs on multiple platforms, offering black-box text to speech as well as an open architecture for research in speech synthesis.

It works offline.

Learn more at http://www.cstr.ed.ac.uk/projects/festival/ and http://festvox.org/festival/

eSpeak engine

eSpeak is a compact open source software speech synthesizer for English and other languages, for Linux and Windows.

It works offline.

Learn more at http://espeak.sourceforge.net/

License

This project is released under the MIT License.

tts-api's People

Contributors

pedroetb

tts-api's Issues

using it from a web site?

Hello,

I just came across your tts-api on Docker Hub and tested it out on my local system.

I found that the curl calls do not seem to work on my Debian 10 system. That may just be the device setting, which I can test later.

The question that I have is this.

I was able to start up the container and use my web browser for a local interface:

http://localhost:3000

I would like to use this from a website on a web server that I have set up, and I need to see how to interface with it the same way your local test page does, so that I can send text to it in the background.

Can you please advise on how I might be able to do this?
Thanks in advance
