
[hts-users:01819] Re: HTS voice built with Festvox 2.1 - HHed failure during duration model clustering


I wish there were more information available on how to use the files generated by HTS-demo in Festival. I have been able to run HTS-demo for the SLT voice, but when I compare the output .win, .inf and .pdf files, they differ from the ones distributed in the nitech_cmu_slt_arctic_hts/hts directory.

- For one, the standard HTS-demo script produces mgc.pdf instead of mcep.pdf. Are these the same file under a different name, or are they different? If they are different, how can you change the Scheme files for Festival to work with the different format?
- The .win files are ASCII in HTS-demo but something else in the Festival voice. How do you convert these?

Thanks, Esther Klabbers

On Dec 3, 2008, at 6:47 AM, Heiga Zen (Byung Ha CHUN) wrote:

Hi,

Daniel Tihelka wrote:

First of all, I have found a small bug on 64-bit platforms: in src/hts_build/data/mkdata.pl (written by Heiga Zen), line 114, the first two items stored in the tmp.head file are formatted with the -al option of the x2x command. It writes two long numbers; on a 32-bit platform they are 4 bytes long, as expected by HTK. However, when SPTK (I use version 3.1) is compiled on a 64-bit platform, the numbers are 8 bytes long and HTK (also the 64-bit version) of course cannot read them. A quick workaround is to use the -ai switch; it should work on both 32-bit and 64-bit platforms.

It doesn't happen in the current HTS-demo. The HTS-demo no longer uses SPTK to append HTK headers; it uses addhtkheader.pl instead. You can find the following statement in addhtkheader.pl:

# number of frames in long
$NFRAME = pack("l", $nframe);

It packs the number of frames with "l". In Perl, packing a value with "l" stores it as a signed 32-bit integer, so the result doesn't depend on whether the platform is 32-bit or 64-bit.
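To illustrate the point about fixed-width packing, here is a minimal Python sketch. It is not the code from addhtkheader.pl; it only mirrors the idea using Python's struct module. The helper name htk_header, the example values, and the choice of big-endian byte order are assumptions made for illustration (Perl's "l" uses the machine's native byte order). The header layout assumed here is the usual 12-byte HTK one: two 4-byte integers followed by two 2-byte integers.

```python
import struct

def htk_header(n_frames, frame_period, bytes_per_frame, parm_kind):
    """Build a 12-byte HTK-style header (hypothetical helper for illustration)."""
    # ">" selects big-endian (an assumption) with standard sizes:
    # "i" is always 4 bytes and "h" is always 2 bytes, on any platform.
    return struct.pack(">iihh", n_frames, frame_period, bytes_per_frame, parm_kind)

# Illustrative values only: 100 frames, 5 ms frame shift (in 100 ns units),
# 75 four-byte floats per frame, parameter kind 9 (USER).
header = htk_header(100, 50000, 75 * 4, 9)
print(len(header))  # 12

# With a standard-size prefix, "l" is always 4 bytes; native "@l" follows
# the platform's C long, which is what made the -al output platform-dependent.
print(struct.calcsize(">l"), struct.calcsize("@l"))
```

The second print shows the difference between the two behaviours discussed above: the first number is fixed, while the second depends on the machine the script runs on.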

What troubles me, however, is that the Training.pl script from src/hts_build/ does not build the HTS voice correctly (checked on CMU ARCTIC AWB 0.90 and CMU US SLT ARCTIC 0.95; the same type of failure occurs on both). No changes were made to either the scripts or the voices!

Festvox's training script is out of date. I wrote it about 6 years ago, and I don't recommend using it. It cannot use the latest technologies and fixes.

What may be wrong? I do not expect the 64-bit platform to be the problem. All scripts and voice files were used without any change (except the -x switch in the build script and the -ai switch for x2x, which, I think, are unlikely to cause the error). And the awb and slt HTS voices are available for Festival (and working), so they were built somehow, and I suppose by the build_hts script. Is the script up to date (the last date there is May 2003)? Or is there another preferred way to build an HTS voice for Festival?

Please do not use the training script included in Festvox.

Any advice will be really appreciated. May I give you any additional information?

Please use the latest one released on the HTS website.

Regards,

Heiga ZEN (Byung Ha CHUN)

--
--------------------------
Heiga ZEN (Byung Ha CHUN)
Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975


Esther Klabbers
Assistant Professor,
Center for Spoken Language Understanding (CSLU),
Division of Biomedical Computer Science (BMCS)
Oregon Health & Science University (OHSU)

20000 NW Walker Road / Beaverton, OR 97006
Office: +1-503-748-3005 / Fax: +1-503-748-1306
http://www.cslu.ogi.edu/people/klabbers


Follow-Ups
[hts-users:01824] Re: HTS voice built with Festvox 2.1 - HHed failure during duration model clustering, Daniel Tihelka
References
[hts-users:01817] HTS voice built with Festvox 2.1 - HHed failure during duration model clustering, Daniel Tihelka
[hts-users:01818] Re: HTS voice built with Festvox 2.1 - HHed failure during duration model clustering, Heiga Zen (Byung Ha CHUN)