datasets-cmu_dog's Introduction

              "Building Voices in Festival"
                Alan W Black, Kevin Lenzo
                 and see ACKNOWLEDGEMENTS
                  http://www.festvox.org

For full details about voice building see the document itself

http://festvox.org/bsv/

The included documentation, scripts and examples should be sufficient for an interested person to build their own synthetic voices, in currently supported languages or in new languages, for the University of Edinburgh's Festival Speech Synthesis System. The quality of the result depends greatly on the time and skill of the builder. For English it may be possible to build a new voice in a couple of days' work; a new language may take months or years to build. It should be noted that even the best voices in Festival (or any other speech synthesis system for that matter) are still nowhere near perfect quality.

This distribution includes

Support for designing, recording and autolabelling statistical parametric
    synthesis voices
Support for designing, recording and autolabelling diphone databases
Support for designing, recording and autolabelling unit selection dbs
Building simple limited domain synthesis engines
Support for building rule driven and data driven prosody models
   (duration, intonation and phrasing)
Support for building rule driven and data driven text analysis
Lexicon support and support for building Letter to Sound rules
Predefined scripts for building new US (and UK) English voices
Predefined scripts for building grapheme(++) voices for any language
Scripts for designing and selecting prompts to record for
   arbitrary languages

New in 2.8

https://github.com/festvox/festival/
Grapheme built voices can be converted to .flitevox files for android
Database size reduction for random forest clustergen voices
Random Forests for F0 prediction too
18 English voices, and 13 Indic voices

New in 2.7

Random forest model building for spectrum and duration in clustergen
Grapheme based synthesizers (with specific support for a large number
  of Unicode writing systems)
Clustergen state and stop value optimization
Wavesurfer label support
SPAM F0 support
Phrase break support
Support for SPTK's mgc parameterization

New in 2.3

Support for cygwin tools under Windows
Substantially improved CLUSTERGEN support with mlpg and mlsb

WARNING

This is not a pointy/clicky plug and play program to build new voices. It is a set of instructions with discussion of the problems, and an attempt to document the expertise we have gained in building other voices. Although we have tried to automate the task as much as possible, this is no substitute for careful correction and understanding of the processes involved. There are significant pointers into the literature throughout the document that allow for more detailed study and further reading.

REQUIREMENTS

A Unix Machine

Although there is nothing inherently Unix about the scripts, no
attempt has yet been made to port this to other platforms.

Festival and Speech Tools

This uses Speech Tools programs and Festival itself at various
stages in building voices as well as (of course) for the final
voices.  Festival and the Edinburgh Speech Tools are available from

   http://www.cstr.ed.ac.uk/projects/festival/
   
or

   http://www.festvox.org/festival

or

   https://github.com/festvox
   
It is recommended that you compile your own versions of these
as you will need the libraries and include files to build some
programs in this festvox.

Wavesurfer

To display waveforms, spectrograms and phoneme labels.

Patience and understanding

Building a new voice is a lot of work, and something will probably
go wrong, which may require repeating some long, boring and
tedious process.  Even with lots of care a new voice still might
just not work.  In distributing this document we hope to increase the
basic knowledge of synthesis out there and hopefully find people
who can improve on this, making the process easier and more reliable
in the future.

INSTALLATION

You must have the Edinburgh Speech Tools and Festival installed before you can build the tools in the festvox distribution.

Unpack festvox-2.8-release.tar.gz or clone it from github

git clone https://github.com/festvox/festvox
cd festvox
./configure
make

The configuration tries to find your version of the Edinburgh Speech Tools and uses its configuration to set the compiler type etc., so you must have that installed. If configure fails, try explicitly setting your ESTDIR environment variable to point to your compiled version of the Speech Tools.

Pre-generated versions of the document in HTML and PostScript are distributed in the html/ directory.

If you need to build the document itself, you will need a working version of the docbook tools, which may (or may not) already be installed on your system

To build the documentation:

cd docbook
make doc

Note that even if the documentation build fails you can still use all the scripts and programs.

To use the scripts and programs in the festvox distribution, each user is expected to have the environment variables ESTDIR and FESTVOXDIR set, for example as follows (if you use bash, zsh, ksh or sh):

export ESTDIR=/home/awb/projects/speech_tools
export FESTVOXDIR=/home/awb/projects/festvox
export FLITEDIR=/home/awb/projects/flite
export SPTKDIR=/home/awb/projects/SPTK

Or if you use csh or tcsh

setenv ESTDIR /home/awb/projects/speech_tools
setenv FESTVOXDIR /home/awb/projects/festvox
setenv FLITEDIR /home/awb/projects/flite
setenv SPTKDIR /home/awb/projects/SPTK

Remember to set these to where your installations are, not ours.
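
A quick way to confirm the variables are set before running any of the scripts can be sketched as below. The function name check_festvox_env is hypothetical and not part of the festvox distribution. It checks only the two variables the text above names as expected (ESTDIR and FESTVOXDIR), and it assumes an sh-compatible shell.

```shell
# check_festvox_env: hypothetical helper, not shipped with festvox.
# Prints a message for each expected variable that is unset or that
# does not name an existing directory; returns non-zero if any fail.
check_festvox_env() {
    rc=0
    for name in ESTDIR FESTVOXDIR; do
        # Look up the value of the variable whose name is in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            rc=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"
            rc=1
        fi
    done
    return $rc
}
```

Running check_festvox_env after the export lines prints nothing and returns 0 when both variables point at existing directories.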

datasets-cmu_dog's People

Contributors

festvox, henryzhou0333, shrimai

datasets-cmu_dog's Issues

Duplicate conversations across train/valid/test

Hi CMU_DoG dataset authors,

While doing some analysis of the dataset, we noticed that there were a few conversations duplicated between train/valid/test. For example, see

https://github.com/festvox/datasets-CMU_DoG/blob/master/Conversations/test/7747dbdeaeb5c9082abe54c0231fcbf1d9907d38.json

and

https://github.com/festvox/datasets-CMU_DoG/blob/master/Conversations/valid/7747dbdeaeb5c9082abe54c0231fcbf1d9907d38.json

which are the same conversation, except that one is in test and one is in valid (see the URLs).

The following are the 110 conversations that show up as duplicated. None of them overlap between all 3, but they overlap between different combinations of train/test, train/valid, and test/valid.

7747dbdeaeb5c9082abe54c0231fcbf1d9907d38
a7db00cab02b9513fa1d7172d35573e6f23630ab
017f651588118f8794349b3c9bd027c63d4226cc
01e70e7c454d15f3408e516cce788a4e8b24694a
02a3b61b613f7d2ba733881c4b732bbbad73ff7b
04d985b10ce191de275f9c4d1f9f4d809478b707
07a0a2126e37ea5ed83748483e5a2deed2bc120a
088b88b115140214c3e1b3d955c772a69613211c
09bbda4742d7603907c9e6ed27466d6745dffe31
0ecc4fb0efcc2362ecbe89cb2c3d6fc3012508f2
0f8ab05299aeed625e0304069ebcb3908d9430b4
0ff4a418d41d7755b992acd4b48c16ce536f90c1
116c5d7e7dd946a6eed95ff7838230656876761f
1359558ae032c547fac59406d33a449f6a338960
1381a18b60a35681a78620dc9479b5f019c72bb0
13b8d82a55192c0e48afe06caf1b398840fac5b5
16ea8e6ad0f90cc30fccde2106163305501bd1f7
17f5a0806eeddf1c865d255246194cc6c16ceab2
1a5666222c56bd5ca757ceebd3a0425757063b83
1bc2cda069820df5e8c1a50e9f687a21fe557c4e
20703fb140627f1bdfffa8d22f45dc9b70284327
20dc13f012d2ff943880f1f7b2a1364cc8805b76
26d5f0381a0b415201d84caff5bbe31a6746286d
2a7e8fcd644d0b4e38a9919d67898a05f4efcdd1
2f7ae28c28b287014f857391254d62251334ea2c
3001cb92b91e7ccad98b533f6998ada0bf8d2935
30a0fed0f23bae0b04110f3cb25e7a960a12ba21
30f5ee0dbb86b2ad38b8ff648a261f8c759730f7
33815de08497bb9e2dbb5d1799fc9e6747f153b4
34a28dcaca5730a4f2046f509315d802616a2ca9
36183c3c2f123e5dc3d8ac126d02a118d1fe1f75
3a823ace51029edf277e620c039045deb9b8afb7
3c9e09be88afdd52fd96538ec0cbaae6667f8117
42397f53285165f64e51b932334ce24cbd73c992
4665d98463d996b96c5436f331e361b51014adb0
4abe690df9089bb3673ff1233c3acbf53a29ff76
4c6ef11a1411d94af7abfe56547bed5139f80df2
549e28fd4e2f48c5a8919e87e883b388937973a4
56c4f87acf58a8d2454a6a814a0d463f6100502c
574a0d6263e6bcf555b6121a20cecd4aa35ba331
5d1a22f369b4d4edf10522f0dce98ba3fe4ba7b8
5e59f8bcb3f6b14c0ab462c2fba0f393e8dfa153
60d21f582f3707b37e616ee859e4ed08a814f918
62059e7d014539546fc8af24f72b9c67a2cb50be
6205d01434cd688417331a822c3206ed96abaeda
623c99e0a14e05afc92e0ef717955f6db5e9540f
637e22cb9527ca9dc29f45a8ac63934889c46bf1
64096861b9834df2eb307aa585b929bf1d047147
6483a0c5154147823d9dd06d45204a43e1c84c68
653b3d41abc2a261b6b52cb055721e56c44ffb9b
6876ce7a8fda3da5f889de7647037a1c200d5f6f
6bfbef62ca380281cc074eb69be092f315239083
6e4d4181d03021379822e2fc05df6c7294652bfe
70a119f7dcc93b5503eb2a9bf2fa8bc81e1b20fb
726ff4eb9abddd3963f06cd6ae980cf9370ce283
78f08dac1ce14021c8b3159c3d0c81cc55771ff6
7df0e0d70733082d5f18738086f92599750861f2
80f367e76c4e3c7dcc8a1004fdcd261b5a2f13ce
8281f7c60a2cabab0231a08554a57900a4ea3a49
84fd06134e332f4cdc8db3f868d5828d4cacb0f3
852b189a5da869ae04ff5c6de05d8a8bb51ef126
855d496060a757e64bc6a7d267432e91b5e61ad3
8d6c8ba49709839220139c9db1fe1a217a8ec298
9edc25e2b1a930aa2eb6e70332ea3cbf4862f583
a283e5ab4fea6aac663a12ed377b3005cf316c2e
a56156a210ca64e0ece5d4bb70eddb6702ba33d8
a592659327e51d654fcddd2763b63b5072620c6e
af408e2d131f34d20779c44ae83b39a02646f9a8
af52c68e6fb510a5192017a30c432e4e77f6043c
af8108609c557500687c6665afdae4835c7aff74
af8b3886d7f2b22b0880fd26135b05226c2138bf
afd8d2f054f068cf17e9dd2f70424114810d9e33
b0a42297255fb0abfca1e72004a1320fb0d8bac2
b301692297dda701377420861b26b3840a2d41dd
b30491ded83cc6aac65d905a01f8dbecb51bc60d
b7909659ab1157476bb62ba0998910f431792d9d
b890a435f85d9385d8092a75caa53a0af35f345d
bb460128bb7440368fcf78389e1d1bd7ef32cad4
bb60032589c61c8f866b88831f5171b1a026c7ed
bc3e8e30e47e3192f7ae41ff4bca8f2de221cb20
bcdd24739c3da8bd147e9ac52581f59370ff6722
bdd812d5582b2fa57d833678780ed23b649ebcc5
be1a6cdb20115526c9b11aae15b7a20af66d3319
be534aad2c42a8ca1c8a0a0c0dcaa5ee061dcda1
c0c0b679ea13cce1ddfd674b6bd9bba07e81c421
c3c6d8d44a5c79576344f41268b049812764fb9f
c896c54da7fd56ae14f66387a58f91b250ddea71
ca84f2086537b5cbe4c7ae68b4b30e6b8539dbe2
cdb1dd3bf587a66afe44ae1c80837389dfc5d528
d0f5e4da2a4115eef595310121a74762f89ff959
d1484f4c978275de43f27f72c5dca18b42e1ea1c
d2baca41b51ed0dbe5a6b88f2087a05d7fe3c081
d3d7c3ddca11ed89a8d4474eea6bc7ef7bab84d5
d4e9d17dc51fdbf4e8fba842b0007781b99660c2
d84c1088c8de7faf0fb73ee1d725aab8e6881946
d95df65e15b78100b4bea60c39532a1ff80ebc3d
da8546a7c874693a4083147ab86ac8921a9e38d5
db3ab055421f0e8b81987db3e46f9100c69e36b4
e791c0984063fcb07ea69a3d074a3eb52c033f52
e83a9ec6538046cc00f58a00cb3582de6c660def
e88e40589b4bebedad9b2bca7d174a8d894635ba
e8a57b826a25f1a163b771c14be4be0a2a46cb44
ed9a1995fff41fc6accf7e816633e1c0b5a19905
f080bc4443de70e3bf9441a98bbbde4224a1dcbb
f1bb15df63c0520602f3787f6c1541a1df6a1753
f2e29d17eb5d85a21b5704c59ffb20ad92ecdd83
f5a754e4d923d271de4e7f6ddc4ecd39c280ca4c
f86f196f304d52050e1a3786ca6b49c9a6e13d9e
f8d9ed8a56714098567c10107c29d73fd2fe805a
fd698fb98d1eb6436d2e5f2155d1332f494ebecc
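
For anyone who wants to reproduce a check like this, here is a minimal sketch. It assumes the splits live in train, valid and test subdirectories of one root directory (as the Conversations/test and Conversations/valid URLs above suggest), and it treats two conversations as duplicates when their file names match; it does not compare file contents. The function name find_split_duplicates is hypothetical.

```shell
# find_split_duplicates: hypothetical helper.
# Prints each conversation ID (file name without .json) that occurs
# in more than one of the train/valid/test directories under $1.
find_split_duplicates() {
    root=$1
    for split in train valid test; do
        for f in "$root/$split"/*.json; do
            # Skip the literal pattern left behind when a split is empty.
            if [ -e "$f" ]; then
                basename "$f" .json
            fi
        done
    done | sort | uniq -d
}
```

From the top of a checkout, this would be called as find_split_duplicates Conversations. It lists each repeated ID once, without saying which splits it appears in.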
