datasets-cmu_wilderness's Introduction

              "Building Voices in Festival"
                 Alan W Black, Kevin Lenzo
                 and see ACKNOWLEDGEMENTS
                  http://www.festvox.org

For full details about voice building see the document itself

http://festvox.org/bsv/

The included documentation, scripts and examples should be sufficient for an interested person to build their own synthetic voices, in currently supported languages or in new languages, in the University of Edinburgh's Festival Speech Synthesis System. The quality of the result depends heavily on the time and skill of the builder. For English it may be possible to build a new voice in a couple of days' work; a new language may take months or years. It should be noted that even the best voices in Festival (or any other speech synthesis system for that matter) are still nowhere near perfect quality.

This distribution includes

Support for designing, recording and autolabelling statistical parametric
    synthesis voices
Support for designing, recording and autolabelling diphone databases
Support for designing, recording and autolabelling unit selection dbs
Building simple limited domain synthesis engines
Support for building rule driven and data driven prosody models
   (duration, intonation and phrasing)
Support for building rule driven and data driven text analysis
Lexicon and building Letter to Sound rule support
Predefined scripts for building new US (and UK) English voices
Predefined scripts for building grapheme(++) voices for any language
Scripts for designing and selecting prompts to record for
   arbitrary languages

New in 2.8

https://github.com/festvox/festival/
Grapheme built voices can be converted to .flitevox files for android
Database size reduction for random forest clustergen voices
Random Forests for F0 prediction too
18 English voices, and 13 Indic voices

New in 2.7

Random forest models building for spectrum and duration in clustergen
Grapheme based synthesizers (with specific support for large number
  of unicode writing systems)
Clustergen state and stop value optimization
Wavesurfer label support
SPAM F0 support
Phrase break support
Support for SPTK's mgc parameterization

New in 2.3

Support for cygwin tools under Windows
Substantially improved CLUSTERGEN support with mlpg and mlsb

WARNING

This is not a pointy/clicky plug and play program to build new voices. It is a set of instructions, with discussion of the problems involved, and an attempt to document the expertise we have gained in building other voices. Although we have tried to automate the task as much as possible, this is no substitute for careful correction and understanding of the processes involved. There are significant pointers into the literature throughout the document that allow for more detailed study and further reading.

REQUIREMENTS

A Unix Machine

Although there is nothing inherently Unix about the scripts, no
attempt has yet been made to port them to other platforms.

Festival and Speech Tools

This uses speech tools programs and festival itself at various
stages in building voices as well as (of course) for the final
voices.  Festival and the Edinburgh Speech Tools are available from

   http://www.cstr.ed.ac.uk/projects/festival/
   
or

   http://www.festvox.org/festival

or

   https://github.com/festvox
   
It is recommended that you compile your own versions of these,
as you will need the libraries and include files to build some
programs in festvox.

Wavesurfer

To display waveforms, spectrograms and phoneme labels.

Patience and understanding

Building a new voice is a lot of work, and something will probably
go wrong, which may require the repetition of some long, boring and
tedious process.  Even with lots of care a new voice still might
just not work.  In distributing this document we hope to increase the
basic knowledge of synthesis out there and hopefully find people
who can improve on this, making the process easier and more reliable
in the future.

INSTALLATION

You must have the Edinburgh Speech Tools and Festival installed before you can build the tools in the festvox distribution.

Unpack festvox-2.8-release.tar.gz or clone it from github

git clone https://github.com/festvox/festvox
cd festvox
./configure
make

The configuration tries to find your version of the Edinburgh Speech Tools and uses its configuration to set the compiler type etc., so you must have that installed. If configure fails, try explicitly setting your ESTDIR environment variable to point to your compiled version of the Speech Tools.

Pre-generated versions of the document in HTML and PostScript are distributed in the html/ directory.

If you need to build the document itself, you will need a working version of the docbook tools, which may (or may not) already be installed on your system.

To build the documentation

cd docbook
make doc

Note that even if the documentation build fails you can still use all the scripts and programs.

To use the scripts and programs in the festvox distribution, each user is expected to have the environment variables ESTDIR and FESTVOXDIR set. For example, if you use bash, zsh, ksh or sh (the example also sets FLITEDIR and SPTKDIR):

export ESTDIR=/home/awb/projects/speech_tools
export FESTVOXDIR=/home/awb/projects/festvox
export FLITEDIR=/home/awb/projects/flite
export SPTKDIR=/home/awb/projects/SPTK

Or if you use csh or tcsh

setenv ESTDIR /home/awb/projects/speech_tools
setenv FESTVOXDIR /home/awb/projects/festvox
setenv FLITEDIR /home/awb/projects/flite
setenv SPTKDIR /home/awb/projects/SPTK

Remember to set these to where your installations are, not ours.
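
As a quick sanity check before running any of the scripts, the settings above can be verified with a small shell function. This is a minimal sketch and not part of the festvox distribution: the function name check_festvox_env is hypothetical, and it only tests that each variable names an existing directory, not that the installation found there is complete.

```shell
# Hypothetical helper (not shipped with festvox): report, for each variable,
# whether it is set and points to an existing directory.
# Returns non-zero if any variable is unset or is not a directory.
check_festvox_env() {
    rc=0
    for name in ESTDIR FESTVOXDIR FLITEDIR SPTKDIR; do
        # Indirect lookup of the variable called $name (plain sh has no ${!name}).
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "UNSET   $name"
            rc=1
        elif [ ! -d "$value" ]; then
            echo "MISSING $name=$value"
            rc=1
        else
            echo "OK      $name=$value"
        fi
    done
    return $rc
}
```

Calling check_festvox_env after the export lines prints one line per variable, so an unset or mistyped path shows up before a long build is started.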

datasets-cmu_wilderness's Issues

Failed dependencies in alignment

Hi, when I tried to run

nohup ./bin/do_found fast_make_align indices/NANTTV.tar.gz &

it returns an error in the nohup.out file like this:

FAILED dependencies
NOTFOUND Edinburgh Speech Tools at
NOTFOUND Festival at /../festival
NOTFOUND SPTK at
NOTFOUND FestVox Tools at
NOTFOUND Flite at
FOUND ffmpeg at /usr/bin/ffmpeg
FOUND sox at /usr/bin/sox
FOUND html2text at /usr/bin/html2text

even though I had already run make dependencies and it succeeded.

Any help would be very much appreciated.

Download part of the script no longer working

I just found this amazing resource and its associated paper a few days ago, but unfortunately, as is mentioned in another issue from 3 years ago, it seems like the bible.is website has changed, causing the download part of the script to not work anymore.

I tried to adapt the script to the new website. The URLs can be fixed by simply changing http://listen.bible.is/... to http://live.bible.is/bible/..., which can be achieved by changing

$this get_from_bibleis `cat $languageid/starturl`

to

url_id=`cat $languageid/starturl | cut -c 24-`
$this get_from_bibleis `echo 'http://live.bible.is/bible/'$url_id`
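
The same prefix swap can also be written without counting characters, which may be less fragile than cut -c 24-. This is only a sketch: the function name rewrite_bibleis_url and the sample path are hypothetical, and the two prefixes are taken from the description above, not checked against the current website.

```shell
# Hypothetical helper: replace the old listen.bible.is prefix with the
# live.bible.is/bible/ prefix, leaving any other URL untouched.
rewrite_bibleis_url() {
    old_prefix='http://listen.bible.is/'
    new_prefix='http://live.bible.is/bible/'
    case "$1" in
        "$old_prefix"*)
            rest=${1#"$old_prefix"}          # strip the old prefix only
            printf '%s\n' "$new_prefix$rest"
            ;;
        *)
            printf '%s\n' "$1"               # not an old-style URL
            ;;
    esac
}

# Example with a made-up path:
rewrite_bibleis_url 'http://listen.bible.is/XXXXXX/Matt/1'
# prints http://live.bible.is/bible/XXXXXX/Matt/1
```

Matching on the prefix rather than on a character offset means a start URL that already uses the new host is passed through unchanged.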

However, the parsing of the webpage to get the languageid, languagename and, most importantly, audioUrl is also broken, and I am unable to fix it. When inspecting the website, there is still an audio player with a source URL, which can be downloaded, but I don't know how to parse this and all of the other needed bits out of the HTML response.

It would be really unfortunate to let all the amazing work on the alignments go to waste just because the download changed. Is there any chance this gets updated?

Resources to know more about initial alignment

Hi,
Thanks for creating this rich dataset. Is it possible to provide more details on how the initial alignments are made and on the techniques used to create the pronunciation lexicon for all languages?

License issues?

Awesome resources!

But I wonder: if I compile the languages, will there be a license issue if I want to publish them freely on Android?

Incorrect hard links' destination under phone_alignments

The link destination for the hard links under 'make_phone_alignments' is incorrect ('/bin/do_found' line: 1184). It should be creating the links within a folder called 'wav/' under 'v_ph_aligns', but neither does the command have the closing forward-slash, nor has the 'wav' directory been created before the command.
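
A minimal sketch of the kind of fix this report seems to be asking for, assuming the intent is to hard-link every .wav file into a wav/ directory under the alignment directory. The function name link_wavs and its two arguments are hypothetical; the actual command at the cited line of bin/do_found is not reproduced here.

```shell
# Hypothetical helper: hard-link all .wav files from a source directory
# into <dest_dir>/wav/, creating that directory first.
link_wavs() {
    src_dir=$1
    dest_dir=$2
    mkdir -p "$dest_dir/wav" || return 1     # make sure the target exists
    for f in "$src_dir"/*.wav; do
        [ -e "$f" ] || continue              # the glob matched nothing
        # The trailing slash makes ln treat the target as a directory.
        ln -f "$f" "$dest_dir/wav/" || return 1
    done
    return 0
}
```

For example, link_wavs v/wav v_ph_aligns would populate v_ph_aligns/wav/ (directory names here are illustrative). Hard links only work within one filesystem, so both directories are assumed to live on the same volume.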

Failed to download.

I found that the URLs have changed, so downloading fails.

Does anyone have ideas for fixing it, so that this powerful project can still be used?

Audio and text files

I am a bit puzzled because it seems that what you have provided is only the alignment, without the audio and text files, unless I am missing something here.

I wonder how I can get the actual data before the alignment, meaning a folder with split audio files and, say, a CSV file where the text for each segment is written.

Is there any way to retrieve this from the script you have provided here?

Thank you again for your work and providing the data.

missing high-resource languages

Hi, thanks for this great corpus! However, I noticed that some high-resource languages such as German, Korean and Greek are not included. Do you have plans to add them to the collection?

Not able to download for NANTTV data

Hi,
I tried downloading the NANTTV data using nohup ./bin/do_found fast_make_align indices/NANTTV.tar.gz &
but ran into this issue:

ls: cannot access 'wav/*.wav': No such file or directory
ln: failed to access 'ps/wav/*.wav': No such file or directory
sox wav/B01___01_Matthew_____NANTTVN2DA.wav v/wav/B01___01_Matthew_____NANTTVN2DA_00001.wav trim 0 2.78
sox FAIL formats: can't open input file `wav/B01___01_Matthew_____NANTTVN2DA.wav': No such file or directory
sox wav/B01___01_Matthew_____NANTTVN2DA.wav v/wav/B01___01_Matthew_____NANTTVN2DA_00003.wav trim 5.58 5.5275
sox FAIL formats: can't open input file `wav/B01___01_Matthew_____NANTTVN2DA.wav': No such file or directory
sox wav/B01___01_Matthew_____NANTTVN2DA.wav v/wav/B01___01_Matthew_____NANTTVN2DA_00004.wav trim 11.1075 7.41
sox FAIL formats: can't open input file `wav/B01___01_Matthew_____NANTTVN2DA.wav': No such file or directory
sox wav/B01___01_Matthew_____NANTTVN2DA.wav v/wav/B01___01_Matthew_____NANTTVN2DA_00005.wav trim 18.5175 8.6225
.
.
.
grep: data/B01___01_Matthew_____NANTTVN2DA.data: No such file or directory
grep: data/B01___01_Matthew_____NANTTVN2DA.data: No such file or directory
grep: data/B01___01_Matthew_____NANTTVN2DA.data: No such file or directory
grep: data/B01___01_Matthew_____NANTTVN2DA.data: No such file or directory
grep: data/B01___01_Matthew_____NANTTVN2DA.data: No such file or directory
grep: data/B01___01_Matthew_____NANTTVN2DA.data: No such file or directory
grep: data/B01___01_Matthew_____NANTTVN2DA.data: No such file or directory
.
.
.
ln: failed to access 'v/wav/).wav': No such file or directory
ln: failed to access 'v/wav/).wav': No such file or directory
ln: failed to access 'v/wav/).wav': No such file or directory
ln: failed to access 'v/wav/).wav': No such file or directory
ln: failed to access 'v/wav/).wav': No such file or directory
ln: failed to access 'v/wav/).wav': No such file or directory
ln: failed to access 'v/wav/).wav': No such file or directory
.
.
.

Could anyone please help?
