
[hts-users:00046] Re: hts_engine binary format


What is the purpose of the hts_engine conversion?  What exactly does it
change in the HTK-formatted MMF that is so important?  Skipping that step
does not seem to make a difference when the speech signals are
synthesized.

I need to know because I haven't been including that step in a multi-speaker
recognition/synthesis application, and the synthesized speech so far has
been unintelligible.  I want to make sure that this isn't the reason, and I
would also like to know what hts_engine actually is.

Thanks!

----------------------------------------------------------------
Dr. Elliot Moore II, Post Doc
Center for Signal and Image Processing
Georgia Institute of Technology
Electrical and Computer Engineering
email: emoore@xxxxxxxxxxxxxx (ECE)
em80@xxxxxxxxxxxxxxx (GT)
WWW: users.ece.gatech.edu/~emoore


----- Original Message -----
From: "Heiga Zen / Byung-Ha Chun" <zen@xxxxxxxxxxxxxxxx>
To: <hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, May 25, 2004 6:44 AM
Subject: [hts-users:00045] Re: hts_engine binary format


> Hello Matthew,
>
> On Thu, 20 May 2004 11:30:49 -0400 (EDT)
> Matthew Lee <mattlee@xxxxxxxxxxxxxx> wrote:
>
> > I have a question concerning the HTS demo script.
> > There is a section labelled, "convert HTK macro model file format into
> > hts_engine model binary format."
> > This is performed using HHEd with the script commands LT, CT, and CM.
> > I am trying to implement a model without tree clustering, but I cannot
> > get this conversion script to work.
> > I have tried just using the CM command with HHEd, but I get errors
> > saying that no trees have been loaded.
>
> Sorry, the current hts_engine assumes decision-tree stream clustered HMMs.
> However, if you want to use monophone HMMs for your acoustic model,
> I think you can convert monophone HMMs into hts_engine format by using
> the following procedure:
>
> 0. Train monophone HMMs.
> 1. Prepare questions about the current phoneme.
> 2. Set the variables $mdl{mcep}, $mdl{logF0}, and $mdl{duration} in the
>    training script to "".
>    This means that the trees grow until no question improves the model
>    likelihood.
> 3. Construct trees and cluster HMM streams.
> 4. Perform embedded training.
> 5. Convert the HMMs into the hts_engine acoustic model format by using
>    HHEd with the script commands LT, CT, and CM.
>
> Best regards,
>
> Heiga Zen / Byung-Ha Chun
>
> --
>  ------------------------------------------------
>   Heiga Zen     (in Japanese pronunciation)
>   Byung-Ha Chun (in Korean pronunciation)
>
>   Department of Computer Science and Engineering
>   Graduate School of Engineering
>   Nagoya Institute of Technology
>   Japan
>
>   e-mail: zen@xxxxxxxxxxxxxxxx
>      web: http://kt-lab.ics.nitech.ac.jp/~zen
>  ------------------------------------------------
>
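
Step 1 of the quoted procedure (questions about the current phoneme) could
be sketched roughly as below. This is only an illustration, not part of the
HTS demo: the phoneme list and the function name are hypothetical, and the
default pattern assumes model names of the form left-current+right. For
plain monophone names, passing template="{p}" may be closer to what is
needed. The QS lines follow the usual HHEd question syntax, QS "name"
{pattern}.

```python
# Hypothetical phoneme inventory; replace it with the phone set of your corpus.
PHONEMES = ["a", "i", "u", "e", "o", "k", "s", "t", "n", "sil"]


def current_phoneme_questions(phonemes, template="*-{p}+*"):
    """Return one HHEd QS line per phoneme, asking about the current phoneme.

    `template` is the pattern applied to model names. The default assumes
    names shaped like left-current+right (an assumption, not taken from the
    original post).
    """
    lines = []
    for p in phonemes:
        pattern = template.format(p=p)
        lines.append('QS "C-%s" {%s}' % (p, pattern))
    return lines


if __name__ == "__main__":
    # Print the question set; redirect the output to a question file.
    for line in current_phoneme_questions(PHONEMES):
        print(line)
```

With only these questions available, each tree can split on nothing but the
identity of the current phoneme, which appears to be the point of the
procedure: the clustered models stay equivalent to monophones while still
producing the trees that the conversion step expects.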

