[hts-users:00046] Re: hts_engine binary format
- Subject: [hts-users:00046] Re: hts_engine binary format
- From: "Elliot Moore" <em80@xxxxxxxxxxxxxxx>
- Date: Tue, 8 Jun 2004 11:24:21 -0400
What is the purpose of the hts_engine conversion? What exactly is it
changing in the HTK-formatted MMF that is so important? It doesn't seem to
make a difference if that step is skipped and the speech signals are
synthesized.
I need to know because I haven't been including that step in a multi-speaker
recognition/synthesis application and the synthesized speech so far has been
unintelligible. I want to make sure that this isn't the reason and I want
to know what the hts_engine is anyway.
Thanks!
----------------------------------------------------------------
Dr. Elliot Moore II, Post Doc
Center for Signal and Image Processing
Georgia Institute of Technology
Electrical and Computer Engineering
email: emoore@xxxxxxxxxxxxxx (ECE)
em80@xxxxxxxxxxxxxxx (GT)
WWW: users.ece.gatech.edu/~emoore
----- Original Message -----
From: "Heiga Zen / Byung-Ha Chun" <zen@xxxxxxxxxxxxxxxx>
To: <hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, May 25, 2004 6:44 AM
Subject: [hts-users:00045] Re: hts_engine binary format
> Hello Matthew,
>
> On Thu, 20 May 2004 11:30:49 -0400 (EDT)
> Matthew Lee <mattlee@xxxxxxxxxxxxxx> wrote:
>
> > I have a question concerning the HTS demo script.
> > There is a section labelled, "convert HTK macro model file format into
> > hts_engine model binary format."
> > This is performed using HHEd with the script commands LT, CT, and CM.
> > I am trying to implement a model without tree clustering, but I cannot
> > get this conversion script to work.
> > I have tried just using the CM command with HHEd, but I get errors
> > saying that no trees have been loaded.
>
> Sorry, the current hts_engine assumes decision-tree stream-clustered HMMs.
> However, if you want to use monophone HMMs for your acoustic model,
> I think you can convert monophone HMMs into hts_engine format by using
> the following procedure:
>
> 0. Train monophone HMMs
> 1. Prepare questions about the current phoneme.
> 2. Set the variables $mdl{mcep}, $mdl{logF0}, and $mdl{duration} in the
>    training script to "".
>    This means that the trees grow until no question improves the model
>    likelihood.
> 3. Construct trees and cluster HMM streams.
> 4. Embedded training
> 5. Convert HMMs into hts_engine acoustic model format by using HHEd with
>    the script commands LT, CT and CM.
>
> Best regards,
>
> Heiga Zen / Byung-Ha Chun
>
> --
> ------------------------------------------------
> Heiga Zen (in Japanese pronunciation)
> Byung-Ha Chun (in Korean pronunciation)
>
> Department of Computer Science and Engineering
> Graduate School of Engineering
> Nagoya Institute of Technology
> Japan
>
> e-mail: zen@xxxxxxxxxxxxxxxx
> web: http://kt-lab.ics.nitech.ac.jp/~zen
> ------------------------------------------------
>
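As a rough illustration of step 5 in the quoted message, the sketch below builds the text of an HHEd edit script containing the LT, CT and CM commands. It is only a sketch under stated assumptions: the helper name `make_convert_script`, every file and directory name, and the argument form of each command (one path per command) are hypothetical and not taken from this thread, so check them against the HHEd documentation shipped with your HTS patch.

```python
# Sketch only: builds the text of an HHEd edit script that uses the
# LT, CT and CM commands named in the quoted message.
# ASSUMPTION: each command takes a single path argument.
# ASSUMPTION: all file and directory names are hypothetical placeholders.


def make_convert_script(tree_file: str, tree_out_dir: str, pdf_out_dir: str) -> str:
    """Return edit-script text with one command per line."""
    lines = [
        f"LT {tree_file}",     # load trees first; CM alone fails with "no trees loaded"
        f"CT {tree_out_dir}",  # assumed: write the trees in hts_engine format
        f"CM {pdf_out_dir}",   # assumed: write the models in hts_engine format
    ]
    return "\n".join(lines) + "\n"


script = make_convert_script("trees/mcep.inf", "voices/trees", "voices/pdfs")
print(script)
```

The generated text would be saved to a file and passed to HHEd as its edit script. LT is placed first because, as reported earlier in the thread, running CM by itself produces an error saying that no trees have been loaded.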
- Follow-Ups
- [hts-users:00047] Re: hts_engine binary format, Heiga Zen / Byung-Ha Chun
- References
- [hts-users:00044] hts_engine binary format, Matthew Lee
- [hts-users:00045] Re: hts_engine binary format, Heiga Zen / Byung-Ha Chun