
[hts-users:01851] Re: HTS voice built with Festvox 2.1 - HHed failure during duration model clustering


I've tried a similar thing and when I try to run the voice in Festival I get a bus error.

Here is my "script":

I copied nitech_us_slt_arctic_hts to hts_demo_cmu_arctic_slt and changed the .scm scripts in the festvox directory accordingly. Then I replaced the files in the hts directory with the ones obtained from training HTS-demo_cmu_arctic_slt.
Then I ran:
x2x +sf < lf0.win2 > lf0_dyn.win
x2x +sf < lf0.win3 > lf0_acc.win
x2x +sf < mgc.win2 > mcep_dyn.win
x2x +sf < mgc.win3 > mcep_acc.win

swab +f mgc.pdf > mcep.pdf
(I also copied mcep.pdf to p_mcep.pdf, as there seems to be no difference between the two and the .scm file uses only p_mcep.)
swab +f duration.pdf > dur.pdf
swab +f lf0.pdf > lf0_new.pdf

cp tree-mgc.inf trees-mcep.inf
cp tree-lf0.inf trees-lf0.inf
cp tree-dur.inf trees-dur.inf
cp label.feats feat.list
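
For comparison, the ASCII-to-binary step for the window files can be sketched without SPTK. This is a minimal sketch, not a documented Festival procedure. It assumes each HTS-demo .win file is whitespace-separated ASCII whose first number is the coefficient count (e.g. "3 -0.5 0.0 0.5") and that the target is raw 32-bit floats. Whether the count should be kept and which byte order Festival expects are assumptions, exposed here as parameters. ascii_win_to_floats is a hypothetical helper name.

```python
import struct


def ascii_win_to_floats(text, keep_count=False, byteorder="<"):
    """Convert an ASCII window definition to raw 32-bit floats.

    Assumption: `text` looks like "3 -0.5 0.0 0.5", i.e. a leading
    coefficient count followed by that many coefficients.
    """
    fields = text.split()
    count = int(fields[0])
    coeffs = [float(f) for f in fields[1:]]
    if len(coeffs) != count:
        raise ValueError(f"expected {count} coefficients, got {len(coeffs)}")
    # Optionally keep the count as a leading float (assumption, see above).
    values = ([float(count)] if keep_count else []) + coeffs
    return struct.pack(f"{byteorder}{len(values)}f", *values)


if __name__ == "__main__":
    raw = ascii_win_to_floats("3 -0.5 0.0 0.5")
    print(len(raw), struct.unpack("<3f", raw))
```

Passing byteorder=">" writes big-endian floats instead, which may matter if the voice files were produced on a machine with a different byte order.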

I looked at the new mcep file using dmp +i and it seems very similar to the old mcep file that was there. Any other hints here?
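
The swab +f step reverses the bytes of each 4-byte value, and a dump shows the resulting numbers. A minimal Python sketch of the same two operations follows. Reading the .pdf files as plain streams of 32-bit floats is an assumption (nothing in this thread confirms the file layout), and the function names are illustrative.

```python
import struct


def swap_4byte_words(data):
    """Reverse the byte order of every 4-byte word in `data`."""
    if len(data) % 4:
        raise ValueError("length is not a multiple of 4 bytes")
    return b"".join(data[i:i + 4][::-1] for i in range(0, len(data), 4))


def read_floats(data, byteorder="<"):
    """Interpret raw bytes as 32-bit floats in the given byte order."""
    n = len(data) // 4
    return struct.unpack(f"{byteorder}{n}f", data[:4 * n])


if __name__ == "__main__":
    big = struct.pack(">3f", 1.0, -0.5, 0.25)
    little = swap_4byte_words(big)
    print(read_floats(little, "<"))  # values as written
    print(read_floats(big, "<"))     # same bytes read in the wrong order
```

Dumping both the old and the new file in both byte orders makes it easier to see whether they really share a layout, rather than only looking similar in one view.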

Thanks, Esther

On Dec 12, 2008, at 1:09 AM, Daniel Tihelka wrote:

Hello again,

I just tried to use HTS to build the SLT voice (from the file
HTS-demo_CMU-ARCTIC-SLT.tar.bz2). make works perfectly without any visible problems, the synthetic examples in ..../gen/qst001/ver1/hts_engine sound as I would expect, and the files in ..../voices/qst001/ver1/ were generated.

One note: the -w option is not supported in sox 14.1.0 (and probably not in some slightly earlier versions either); use -2 instead where sox is called from Training.pl.


Now the question is how to "map" the generated files into a Festival voice. I tried the following mapping, but without success: the voice was loaded correctly, but the SayText("Hallo, this is the first try") command consumed about 1.5 GB of memory and failed with the message WALLOC: failed to malloc 671088648 bytes.
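
One observation on the number in that error message, offered as a guess rather than a diagnosis: 671088648 is 0x28000008, and 0x28000000 is what the 32-bit integer 40 becomes when its four bytes are reversed. That pattern would be consistent with a small count being read in the wrong byte order, but nothing in this thread confirms it. The arithmetic can be checked as follows:

```python
import struct

requested = 671088648          # size from the WALLOC message
print(hex(requested))          # 0x28000008

# Mask off the low byte (8) and reverse the remaining 4-byte word.
high_part = requested & 0xFFFFFF00
swapped = struct.unpack("<I", struct.pack(">I", high_part))[0]
print(high_part, swapped)      # 671088640 40
```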

I just copied the files in lib/voices/us/voice_name/festvox (in Festival) from the SLT voice package cmu_us_slt_arctic-0.95-release.tar.bz2. For the files in lib/voices/us/voice_name/hts I used the following mapping (in the order
required by Festival --> provided by HTS's Training.pl):

- duration.pdf		--> voices/qst001/ver1/dur.pdf
 simply renamed

- feat.list			--> voices/qst001/ver1/label.feats
 simply renamed

- lf0_dyn.win		--> voices/qst001/ver1/lf0.win2
 the 3 floats need to be converted to binary form, e.g. with 'x2x' from SPTK
 (I have read this somewhere, but I cannot find it anymore ...)

- lf0_acc.win		--> voices/qst001/ver1/lf0.win3
 the 3 floats need to be converted to binary form, e.g. with 'x2x' from SPTK

- lf0.pdf			--> voices/qst001/ver1/lf0.pdf
 used without changes

- mcep_dyn.win	--> voices/qst001/ver1/mgc.win2
 the 3 floats need to be converted to binary form, e.g. with 'x2x' from SPTK

- mcep_acc.win	--> voices/qst001/ver1/mgc.win3
 the 3 floats need to be converted to binary form, e.g. with 'x2x' from SPTK

- mcep.pdf		--> voices/qst001/ver1/mgc.pdf
 simply renamed

- p_mcep.pdf		--> voices/qst001/ver1/????
 does not exist in HTS, but is required by Festival (if the 'mgc.pdf' file
 should be used for this, what about 'mcep.pdf' then?)
 I just linked it to mgc.pdf for the test ...

- trees-dur.inf		--> voices/qst001/ver1/tree-dur.inf
 simply renamed

- trees-lf0.inf		--> voices/qst001/ver1/tree-lf0.inf
 simply renamed

- trees-mcep.inf		--> voices/qst001/ver1/tree-mgc.inf
 simply renamed
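
The renaming part of the mapping above can be written down as a small table-driven copy. This is a sketch under the assumptions Daniel states, not a verified recipe (he reports that the resulting voice did not work). The dictionary mirrors his list, p_mcep.pdf is tentatively pointed at mgc.pdf as in his test, and the window files are left out because they need conversion rather than copying. install_hts_files and both directory arguments are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path

# Festival file name -> file produced by HTS's Training.pl (from the list above).
FESTIVAL_FROM_HTS = {
    "duration.pdf": "dur.pdf",
    "feat.list": "label.feats",
    "lf0.pdf": "lf0.pdf",
    "mcep.pdf": "mgc.pdf",
    "p_mcep.pdf": "mgc.pdf",  # tentative: no HTS counterpart is known
    "trees-dur.inf": "tree-dur.inf",
    "trees-lf0.inf": "tree-lf0.inf",
    "trees-mcep.inf": "tree-mgc.inf",
}


def install_hts_files(hts_dir, festival_hts_dir):
    """Copy HTS outputs into a Festival voice's hts directory under new names.

    Returns the Festival file names whose HTS source file was missing.
    """
    src, dst = Path(hts_dir), Path(festival_hts_dir)
    dst.mkdir(parents=True, exist_ok=True)
    missing = []
    for festival_name, hts_name in FESTIVAL_FROM_HTS.items():
        source = src / hts_name
        if source.is_file():
            shutil.copyfile(source, dst / festival_name)
        else:
            missing.append(festival_name)
    return missing


if __name__ == "__main__":
    # Demo on a throwaway directory holding a single fake HTS output.
    tmp = Path(tempfile.mkdtemp())
    (tmp / "ver1").mkdir()
    (tmp / "ver1" / "mgc.pdf").write_bytes(b"\0" * 4)
    print(install_hts_files(tmp / "ver1", tmp / "hts"))
```

Keeping the mapping in one table makes it easy to change a single entry (for example the source of p_mcep.pdf) once the correct correspondence is known.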


As I said, it did not work. Unfortunately.

So I would like to ask you for some additional hints. HTS training works perfectly, but how do I convert the result into a Festival voice? How was the voice package
cmu_us_slt_arctic-0.95-release.tar.bz2 built?


Thank you very much. Best regards,
Dan


On Wednesday 03 of December 2008, Esther Klabbers wrote:
I wish there were more information available on how to use the files
generated by HTS-demo in Festival. I have been able to run HTS-demo
for the SLT voice, but when I compare the output .win, .inf and .pdf
files, they are different from the ones distributed in the
nitech_cmu_slt_arctic_hts/hts directory.
- For one, the standard HTS-demo script produces mgc.pdf instead of
mcep.pdf. Are these the same file under a different name, or are they
different? If they are different, how can you change the Scheme files
for Festival to work with the different format?
- The .win files are ASCII in HTS-demo but something else in the
Festival voice. How do you convert these?
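
To check which of the two forms a given .win file is in, one can test whether its bytes parse as whitespace-separated ASCII numbers. A minimal sketch: is_ascii_numbers is a hypothetical helper, and the binary layout used in the example (three raw 32-bit floats) is an assumption rather than a documented Festival format.

```python
import struct


def is_ascii_numbers(data):
    """Return True if `data` is non-empty ASCII text made only of numbers."""
    try:
        fields = data.decode("ascii").split()
        for field in fields:
            float(field)
    except (UnicodeDecodeError, ValueError):
        return False
    return bool(fields)


if __name__ == "__main__":
    print(is_ascii_numbers(b"3 -0.5 0.0 0.5\n"))
    print(is_ascii_numbers(struct.pack("<3f", -0.5, 0.0, 0.5)))
```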

Thanks, Esther Klabbers


Esther Klabbers
Assistant Professor,
Center for Spoken Language Understanding (CSLU),
Division of Biomedical Computer Science (BMCS)
Oregon Health & Science University (OHSU)

20000 NW Walker Road / Beaverton, OR 97006
Office: +1-503-748-3005 / Fax: +1-503-748-1306
http://www.cslu.ogi.edu/people/klabbers

