Code Monkey home page Code Monkey logo

speaker_adapted_tts's Introduction

Making a TTS model with 1 minute of speech samples within 10 minutes

Seeing my implementaions of Tacotron and DCTTS, many people have asked me "How large speech dataset is needed for neural TTS?" or "Can you make a TTS model with X hour(s)/minute(s) of training data?" I'm fully aware of the importance of those questions. When you plan a service using TTS, it is not always likely to get lots of speech samples. I would like to give an answer. I really do. But unfortunately I have no answer. The only thing I know is that I could train a model successfully with five hours of speech samples I extracted from Kate Winslet's audiobook. I haven't tried less data than that. I could try it, but I actually I have a better idea. Since I have a decent model trained with the LJ Speech Dataset for several days, why don't I use it? After all, we all have different voices, but the way we speak English is not totally different.

In the above two repos, I trained TTS models using all the speech samples of my two favorite celebrities, Nick Offerman and Kate Winslet, from scratch. This time, I use only one minute of the speech samples. The following are the synthesized samples after 10 minutes of fine-tuning training. Do you think they sound like them?

Check here to see the model details, source code and the pretrained model which served as a seed.

speaker_adapted_tts's People

Contributors

kyubyong avatar

Watchers

mjunaidi avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.