Comments (23)

boredomed commented on August 19, 2024

You have to recreate the files in $FLITEDIR/lang/cmulex using your new dictionary.
Just follow the steps described here:
https://boredomed.wordpress.com/2019/03/07/festvox-to-flite-tts-conversion/
Feel free to ask if you have any questions.

from flite.

attitudechunfeng commented on August 19, 2024

One more question: does the "lexicon.out" file represent the new lexicon file, and does its content follow the format "word pronunciation"?

boredomed commented on August 19, 2024

Yes, lexicon.out is your new or extended lexicon, and it has to be built in the format of words with their pronunciation, i.e. their phonetic representation. You can use festival/lib/dicts/cmu/cmudict.out as a reference.
But you can make it simpler by skipping the 0, 1 stress (emphasis) markers if they are not required.
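
Dropping the stress markers can be sketched in a few lines of Python. This is only an illustration, assuming the stress is written as a single trailing digit on a phone (for example "aa1" or "er0"); the function name and the sample phones are made up for the example and are not part of festvox or flite.

```python
import re

# Assumption: stress is a single trailing digit on a phone, e.g. "aa1", "er0".
STRESS_DIGIT = re.compile(r"\d$")


def strip_stress(phones):
    """Return the phones with any trailing stress digit removed."""
    return [STRESS_DIGIT.sub("", phone) for phone in phones]


# Illustrative phone list, not taken from a real build.
print(strip_stress(["aa1", "b", "er0", "g"]))  # ['aa', 'b', 'er', 'g']
```

If the lexicon is simplified this way, the allowables file has to use the same unstressed phone names.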

attitudechunfeng commented on August 19, 2024

Thanks for your quick reply. Your answer has helped me solve the problem, so I'm closing this issue.

attitudechunfeng commented on August 19, 2024

Sorry, I've run into another problem while trying to extend the dict. The tutorial says "Also place your lexicon and allowables files in this ‘lex’ directory", so exactly which files should I put into the lex directory? The .out file, the allowables file, and anything more? And how can I generate these files?

boredomed commented on August 19, 2024

Just 2 files: the lexicon and the allowables.
You have to place the lexicon.out file in the 'lex' directory. In your case that is the lexicon you have extended; the format is described in the 'Lexicon format' section of the tutorial.
If you are using English, you can take allowables.scm from festival/lib/dicts/allowables.scm. Otherwise, you have to use your own allowables file, the one on which you based the word pronunciations in your lexicon.

attitudechunfeng commented on August 19, 2024

I've tried it as guided, but it doesn't work. The error output is as follows:

cat: alllets.out: No such file or directory
cat: allphones.out: No such file or directory
cat: let2phones.out: No such file or directory
Find probabilities of letter-phone pairs
/festival/src/main/festival: argument for "--heap" not an int
Type -h for help on options.
Align letter-phone pairs with best path
/festival/src/main/festival: argument for "--heap" not an int
Type -h for help on options.
Build letter to phone CART trees
awk: fatal: cannot open file lex.feats' for reading (No such file or directory) awk: fatal: cannot open file lex.feats' for reading (No such file or directory)
awk: fatal: cannot open file lex.feats' for reading (No such file or directory) awk: fatal: cannot open file lex.feats' for reading (No such file or directory)
awk: fatal: cannot open file lex.feats' for reading (No such file or directory) awk: fatal: cannot open file lex.feats' for reading (No such file or directory)
awk: fatal: cannot open file lex.feats' for reading (No such file or directory) awk: fatal: cannot open file lex.feats' for reading (No such file or directory)
awk: fatal: cannot open file lex.feats' for reading (No such file or directory) awk: fatal: cannot open file lex.feats' for reading (No such file or directory)
awk: cmd. line:1: fatal: cannot open file lex.feats' for reading (No such file or directory) awk: fatal: cannot open file lex.feats' for reading (No such file or directory)
Build complete model
/festival/src/main/festival: argument for "--heap" not an int
Type -h for help on options.
cp: cannot stat 'lts_scratch/lex_lts_rules.scm': No such file or directory
Test model
/festival/src/main/festival: argument for "--heap" not an int
Type -h for help on options.
with ALL data -- no held out test set
I also found that allowables.scm has been replaced. What may cause this problem?

boredomed commented on August 19, 2024

The issue is in the build_lts file, in which the heap value is not declared.
In build_lts, remove '--heap HEAP' from the cummulate, build, align, and merge if statements, and then run it again.
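
As a rough sketch of that edit, the Python snippet below removes every '--heap <value>' pair from a piece of script text. It is an assumption-laden illustration, not part of festvox: the function name is invented, and the sample line is hypothetical, so the variable name in your copy of build_lts may differ.

```python
import re

# Matches "--heap" together with the single argument that follows it.
HEAP_OPTION = re.compile(r"\s+--heap\s+\S+")


def drop_heap_option(script_text):
    """Return script_text with every '--heap <value>' pair removed."""
    return HEAP_OPTION.sub("", script_text)


# Hypothetical line; the real variable name in build_lts may differ.
line = "$FESTIVAL -b --heap $SIODHEAPSIZE allowables.scm"
print(drop_heap_option(line))  # $FESTIVAL -b allowables.scm
```

Keep a backup copy of build_lts before applying a change like this to the whole file.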

attitudechunfeng commented on August 19, 2024

I removed '--heap HEAP' and re-ran it, but it seems slow. How long will the process take to finish?

boredomed commented on August 19, 2024

It should not take long.
Are you using the allowables from the CMU dict?
If yes, did you comment out the line that makes the allowables in the build_lts file under the 'lts' argument?
If not, do that; the issue may be that it is wrongly remaking the allowables file.

attitudechunfeng commented on August 19, 2024

I've got it, and the process now finishes successfully. Thanks very much.

attitudechunfeng commented on August 19, 2024

I've run through the whole process. However, the result seems wrong. With the new dictionary, not only the new word but also the original words come out wrong. What may be the reason? By the way, when I run "build_lts test", almost all words fail. Is this normal?

boredomed commented on August 19, 2024

Are 0 and 1 added to the phonemes in the lexicon that is created in lts_scratch?
If yes, does your allowables file also contain the same phonemes, i.e. with 0 and 1?
The issue may be that the phonemes in your allowables do not match those in the lexicon, so the alignments failed.
Try removing the 0s and 1s from the lexicon in lts_scratch if they are not in the allowables, or vice versa,
and do the steps again.
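
A quick way to look for such a mismatch is to list the phones used in the lexicon entries that never appear in the allowables. The sketch below assumes entries are laid out as `( ("a" "a" "k" "e" "r") nil (aa1 k er0 ))`; the function names and the unstressed phone set are hypothetical and only serve the illustration.

```python
import re

# Assumed entry layout: ( ("a" "a" "k" "e" "r") nil (aa1 k er0 ))
ENTRY = re.compile(r'\(\s*\((.*?)\)\s+\S+\s+\((.*?)\)\s*\)')


def parse_entry(line):
    """Split one lexicon entry into (letters, phones), or return None."""
    match = ENTRY.search(line)
    if match is None:
        return None
    letters = re.findall(r'"([^"]*)"', match.group(1))
    return letters, match.group(2).split()


def phones_missing_from_allowables(lines, allowed_phones):
    """Collect phones used in the entries but absent from allowed_phones."""
    allowed = set(allowed_phones)
    missing = set()
    for line in lines:
        parsed = parse_entry(line)
        if parsed is not None:
            missing.update(set(parsed[1]) - allowed)
    return missing


entries = ['( ("a" "a" "k" "e" "r") nil (aa1 k er0 ))']
# Hypothetical allowables that contain only unstressed phones.
print(sorted(phones_missing_from_allowables(entries, {"aa", "k", "er"})))
# ['aa1', 'er0']
```

An empty result does not prove that the alignment will succeed; it only rules out phone names that the allowables do not know at all.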

attitudechunfeng commented on August 19, 2024

I've checked "lts_scratch/lex_entries.out" and "allowables.scm"; some fragments are shown below:

( ("a" "a" "a") nil (t r ih1 p ax0 l ey1 ))
( ("a" "a" "b" "e" "r" "g") nil (aa1 b er0 g ))
( ("a" "a" "c" "h" "e" "n") nil (aa1 k ax0 n ))
( ("a" "a" "k" "e" "r") nil (aa1 k er0 ))
( ("a" "a" "l" "s" "e" "t" "h") nil (aa1 l s eh0 th ))
( ("a" "a" "m" "o" "d" "t") nil (aa1 m ax0 t ))
( ("a" "a" "n" "c" "o" "r") nil (aa1 n k ao1 r ))
( ("a" "a" "r" "d" "e" "m" "a") nil (aa0 r d eh1 m ax0 ))
( ("a" "a" "r" "d" "v" "a" "r" "k") nil (aa1 r d v aa1 r k ))
( ("a" "a" "r" "o" "n") nil (eh1 r ax0 n ))

(require 'lts_build)
(set! allowables
'((a epsilon aa aa1 aa0
ax ax1 ax0
eh eh1 eh0
ah ah1 ah0
ae ae1 ae0
ey ey1 ey0
ay ay1 ay0
er er1 er0

I think they are consistent. What do you think?

boredomed commented on August 19, 2024

Check the log files for the alignment failures and look at the entries that failed.
Take a word that failed and check its phones: are they actually right and falsely unaligned, or is something else wrong?
Then match them against the phones in the allowables.
This can help you find the bug.

attitudechunfeng commented on August 19, 2024

However, I only added one new word to the original festival cmudict and used the same allowables as cmudict, but the new results came out wrong. It's strange to me.

boredomed commented on August 19, 2024

attitudechunfeng commented on August 19, 2024

I have commented out the line "./build_lts make_allowables_smt" in build_lex and replaced the data_raw file with the compressed data, as mentioned in the tutorial.

ZhenheZhang commented on August 19, 2024

Hi there,

I'm following your conversation to expand the dictionary in my system as well. My objective is to upgrade the default CMUdict-0.4 to 0.7b in flite-2.2.
I have commented out the line "./build_lts make_allowables_smt" in build_lex, but I have not removed '--heap HEAP', as I did not get the error output shown above.
My problem is that the step "./build_lex lts" takes a long time: 2 days already. I've attached my server info below. Can you help explain how long it should be expected to take? Thanks.
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 64
On-line CPU(s) list: 0-63
Thread(s) per core: 2
Core(s) per socket: 8
Socket(s): 4
NUMA node(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 62
Model name: Intel(R) Xeon(R) CPU E7-4820 v2 @ 2.00GHz
Stepping: 7
CPU MHz: 2000.000
BogoMIPS: 4002.65

Btw, this tutorial is greatly helpful.
https://boredomed.wordpress.com/2019/03/07/festvox-to-flite-tts-conversion/

Regards,
Zhenhe

boredomed commented on August 19, 2024

Hi,
Instead of running the step ./build_lex lts directly, run the individual commands in it, which are as follows:
./build_lts cummulate
./build_lts align
./build_lts build
./build_lts merge
Check which of these steps takes the longest, so you can pinpoint the issue.
Also, don't run the last command in it, ./build_lts test; maybe it's the one taking the longest, and it is not compulsory.
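
The per-step timing can be sketched with a small Python helper. This is a minimal illustration, not part of festvox: `time_commands` and `STEPS` are names made up here, and the demo call uses a harmless stand-in command so that the snippet runs anywhere.

```python
import subprocess
import sys
import time

# Step names as listed above; meant to be run from the 'lex' directory.
STEPS = [["./build_lts", step] for step in ("cummulate", "align", "build", "merge")]


def time_commands(commands):
    """Run each command in order and return (command, exit code, seconds)."""
    timings = []
    for command in commands:
        start = time.perf_counter()
        result = subprocess.run(command)
        elapsed = time.perf_counter() - start
        timings.append((" ".join(command), result.returncode, elapsed))
    return timings


# Stand-in command for the demo; replace the list with STEPS for a real run.
for name, code, seconds in time_commands([[sys.executable, "-c", "pass"]]):
    print(f"{name}: exit={code} time={seconds:.2f}s")
```

Calling `time_commands(STEPS)` from the 'lex' directory would then report how long each of the four steps took and whether it exited cleanly.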

ZhenheZhang commented on August 19, 2024

It's ./build_lts align that takes the longest. What's worse, alignment failed for several words, like:
align failed: (("a" "a" "a") nil (t r ih1 p ax0 l ey1))
What do you think could be the cause?
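
One possible explanation, offered as an assumption rather than something confirmed in this thread: the entry "aaa" is pronounced as "triple a", so it carries 7 phones for only 3 letters. If the aligner lets each letter account for at most a small, fixed number of phones (the limit of 2 below is a hypothetical default, not a documented value), such an entry cannot align whatever the allowables contain. A minimal check along those lines:

```python
def can_possibly_align(letters, phones, max_phones_per_letter=2):
    """Necessary (not sufficient) check: enough letters to absorb the phones.

    max_phones_per_letter is an assumed limit, not a documented festvox value.
    """
    return len(phones) <= max_phones_per_letter * len(letters)


letters = ["a", "a", "a"]
phones = ["t", "r", "ih1", "p", "ax0", "l", "ey1"]
print(can_possibly_align(letters, phones))  # False: 7 phones > 2 * 3 letters
```

Entries that fail this check are typically abbreviations or spelled-out forms, and a handful of them failing is a different situation from almost every word failing.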
And regarding the time consumption, is it related to the HEAP as well?
What is this command doing in festival? Can you briefly explain the fundamentals?
$FESTIVAL -b --heap $SIODHEAPSIZE allowables.scm lts_scratch/lex-pl-tablesp.scm

Thanks,
Zhenhe

bringtree commented on August 19, 2024

Yes, I also found that "./build_lts align" is too slow.
It spends too much time generating "/lex/lts_scratch/lex.align", and it is a single-threaded program.

@boredomed @ZhenheZhang Could you share the generated files with me?

bringtree commented on August 19, 2024

Oh, I found that the allowables file generated by ./build_lts make_allowables_smt is very large.
