TTS for LOD

A synthetic voice for the Luxembourgish Online Dictionary (LOD)

References: my earlier public text-to-speech projects hosted on GitHub and Hugging Face

New high-quality dataset

To train a high-quality Luxembourgish TTS voice, the ZLS (Zenter fir d'Lëtzebuerger Sprooch) assembled an outstanding Luxembourgish dataset of 39,836 audio samples with matching transcriptions, recorded in studio quality by Max Kuborn. After the first training run I discovered that the dataset included several hundred files with a female voice. I cleaned the dataset and retrained the model with 32,000 male samples. My Wiki-Page Dataset provides detailed information about this corpus.
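The cleaning step can be sketched as follows. This is a minimal illustration, not the script used for this project: it assumes an LJSpeech-style metadata file with pipe-separated `file_id|transcription` lines, and the file IDs, the sample sentences and the `excluded_ids` set are hypothetical.

```python
from pathlib import Path


def filter_metadata(lines, excluded_ids):
    """Keep only metadata lines whose file ID is not in excluded_ids.

    Assumes each line looks like "file_id|transcription" (LJSpeech style).
    """
    kept = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        file_id = line.split("|", 1)[0]
        if file_id not in excluded_ids:
            kept.append(line)
    return kept


def clean_metadata_file(src, dst, excluded_ids):
    """Read src, drop excluded samples, write the remaining lines to dst.

    Returns the number of lines read and the number of lines kept.
    """
    lines = Path(src).read_text(encoding="utf-8").splitlines()
    kept = filter_metadata(lines, excluded_ids)
    Path(dst).write_text("\n".join(kept) + "\n", encoding="utf-8")
    return len(lines), len(kept)


# Hypothetical sample data, for illustration only
sample = [
    "lod_0001|Moien, wéi geet et?",
    "lod_0002|Gudden Owend.",
    "lod_0003|Äddi a merci.",
]
cleaned = filter_metadata(sample, excluded_ids={"lod_0002"})
print(len(cleaned))
```

The list of files to exclude still has to come from somewhere, for example from listening checks or a speaker-classification pass; the sketch only covers the bookkeeping once that list exists.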

Processing environment

The main hardware required for TTS training is an NVIDIA graphics card. My previous TTS development system was set up two years ago on a Linux (Ubuntu) desktop with an NVIDIA RTX 2070 card. When I started the training with the new dataset, I was disappointed to find that my old scripts no longer ran without errors, probably because automatic updates of some Python modules had made them incompatible with my original configuration. After some frustrating attempts to fix the errors, I decided to restart from scratch and set up a new development system on a Windows 11 laptop with an NVIDIA RTX 3060 card. My Wiki-Page Processing Environment shows how to install all the required software.
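Before installing anything, a quick look at the local setup can save time. The following is a minimal sketch using only the Python standard library; the function name `describe_environment` is hypothetical, and finding `nvidia-smi` on the PATH only suggests that an NVIDIA driver is installed, it does not prove that training will work.

```python
import platform
import shutil
import sys


def describe_environment():
    """Collect basic facts about the local setup."""
    return {
        "python": sys.version.split()[0],
        "os": platform.system(),
        # nvidia-smi ships with the NVIDIA driver; its presence on PATH
        # hints at (but does not guarantee) a usable GPU.
        "nvidia_smi": shutil.which("nvidia-smi"),
    }


env = describe_environment()
for key, value in env.items():
    print(f"{key}: {value}")
```

Once PyTorch is installed, `torch.cuda.is_available()` gives a more direct answer about whether the GPU can be used.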

VITS TTS model

The choice of the Coqui-TTS-VITS model is explained in my Wiki-Page TTS-Model.

Installation of the Coqui-TTS Tools

Creating the required development environment on a personal computer takes some time and can be frustrating. I published a short guide on the Wiki-Page Coqui-TTS Tools.

Training script

The training script is the heart of the project. All details are provided at Training Script.

Training process

It's important to understand the Training Process.

Evaluation

Continuously evaluating the training progress is the hard part. I will extend the Evaluation Guide as soon as possible.

Tensorboard

I will add more figures to my Wiki-Page Tensorboard.

Best Model

These details will be provided at the end of the training in the Wiki-Page Best Model.

Inference

When the model is ready, it can be used to synthesize Luxembourgish texts. The scripts to run the synthesis and the access to public demo spaces will be explained in the Inference Guide.
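Until the Inference Guide is available, here is a rough sketch of how a trained checkpoint could be called through the `tts` command-line tool that comes with Coqui-TTS. The helper `build_tts_command` and all file names (`best_model.pth`, `config.json`, `moien.wav`) are hypothetical; the final scripts of this project may look different.

```python
import shlex


def build_tts_command(text, model_path, config_path, out_path):
    """Build the argument list for the Coqui-TTS `tts` command-line tool."""
    return [
        "tts",
        "--text", text,
        "--model_path", model_path,
        "--config_path", config_path,
        "--out_path", out_path,
    ]


cmd = build_tts_command(
    text="Moien, wéi geet et?",
    model_path="best_model.pth",  # hypothetical checkpoint file name
    config_path="config.json",    # hypothetical config file name
    out_path="moien.wav",         # hypothetical output file name
)

# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmd))
# To actually run the synthesis: subprocess.run(cmd, check=True)
```

Building the command as a list rather than as a single string avoids quoting problems with apostrophes and special characters, which are frequent in Luxembourgish text.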

Contributors

mbarnig
