
[hts-users:02202] Re: More questions to HSMM


燕鹏举 wrote:

> 1. Can HERest still output a duration model in HMM mode? (As in the paper T.
> Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura. Duration
> modeling for HMM-based speech synthesis. In Proc. ICSLP, pages 29-32,
> 1998.)


> I tried omitting the initial duration model and using -g in order to
> output a duration model in HMM mode; however, all the resulting duration
> pdfs are N(0, 1).

I guess you didn't specify the proper flags with the -u option.  If you want to
output state-duration distributions, you should run HERest with

-u tmvwdmv

where "dmv" means that the means and variances of the duration models are
estimated.

> 2. In the HSMM mode, how can I provide a good initial duration model to
> HERest (when I have to "flat-start" because I have no initial boundary
> labeling)? HCompV can get global mean and variance for cmp only but not for
> duration.

One possibility is to initialize your duration models using an external
database that has manual (or automatic) segmentations.  If you want to run
completely flat-start training, please initialize all means and variances of
the state-duration distributions to fixed values.  If you use a large variance
to initialize your duration models, they become non-informative, so the initial
state assignments are obtained effectively without using the state-duration
models.
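
To illustrate why a large initial variance makes a duration model
non-informative, here is a minimal Python sketch.  It is not HTS code; the mean
(5 frames), the two variances, and the range of candidate durations are
hypothetical values chosen only for illustration.  It compares how much a
Gaussian duration log-density varies across candidate durations.

```python
import math

def gauss_logpdf(d, mean, var):
    """Log density of a 1-D Gaussian state-duration model at duration d (frames)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (d - mean) ** 2 / var)

def logpdf_spread(mean, var, durations):
    """Largest difference in log density over the candidate durations."""
    scores = [gauss_logpdf(d, mean, var) for d in durations]
    return max(scores) - min(scores)

# Hypothetical settings: candidate durations of 1..20 frames, mean of 5 frames.
durations = range(1, 21)
sharp = logpdf_spread(5.0, 1.0, durations)    # small variance: informative
flat = logpdf_spread(5.0, 1.0e4, durations)   # large variance: nearly flat

print(f"log-density spread with var=1:     {sharp:.3f}")
print(f"log-density spread with var=10000: {flat:.5f}")
```

With the large variance, every candidate duration receives almost the same
score, so the duration term has hardly any influence on which state alignment
is preferred.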

> 3. Do the parameter generation algorithms of Case 1/2/3 correspond to HMGenS
> -c 0/1/2, respectively? 

No.  The case 1 algorithm, the Cholesky-based one, is -c 0.  The case 3
algorithm, the EM-based one, corresponds to both -c 1 and -c 2.  If you run
HMGenS with -c 1, the case 3 algorithm is run but only the "mixture components"
are regarded as hidden variables, i.e., the state sequence is fixed.  If you
run HMGenS with -c 2, both the "mixture components" and the "state assignments"
are regarded as hidden variables.  The case 2 algorithm has not been integrated
into HTS.

# Only the case 2 algorithm was available when I was a bachelor student, but I
# replaced it with the case 1 algorithm when I became a master student.
# If you want to look at the source code of the case 2 algorithm, please check
# the source code of SPTK's mlpg command.  It's based on the case 2 algorithm.
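
For readers unfamiliar with the Cholesky-based (case 1) approach, the idea can
be sketched in plain Python as follows.  This is not the HMGenS or hts_engine
implementation.  It assumes a single 1-D feature stream, diagonal variances, a
simple delta window delta_t = 0.5 * (c[t+1] - c[t-1]) with out-of-range frames
treated as zero, and a fixed state sequence that has already produced per-frame
means and variances.  All function names are hypothetical.  It solves
(W' P W) c = W' P mu for the static trajectory c, where P holds the inverse
variances.

```python
import math

def build_window_matrix(T):
    """Rows alternate static/delta per frame so that o = W c (2T x T)."""
    W = []
    for t in range(T):
        static = [0.0] * T
        static[t] = 1.0
        delta = [0.0] * T          # assumed window: 0.5 * (c[t+1] - c[t-1])
        if t > 0:
            delta[t - 1] = -0.5
        if t < T - 1:
            delta[t + 1] = 0.5
        W.append(static)
        W.append(delta)
    return W

def cholesky(A):
    """Lower-triangular L with A = L L^T (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_cholesky(L, b):
    """Solve L L^T x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def generate_trajectory(means, variances):
    """means/variances: [static_0, delta_0, static_1, delta_1, ...], length 2T."""
    T = len(means) // 2
    W = build_window_matrix(T)
    prec = [1.0 / v for v in variances]
    rows = range(2 * T)
    A = [[sum(W[r][i] * prec[r] * W[r][j] for r in rows) for j in range(T)]
         for i in range(T)]
    b = [sum(W[r][i] * prec[r] * means[r] for r in rows) for i in range(T)]
    return solve_cholesky(cholesky(A), b)

# Hypothetical input: static means step from 0 to 1, delta means are all 0.
step_means = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0]
step_vars = [0.1, 0.05] * 4
print(generate_trajectory(step_means, step_vars))
```

The sketch uses a dense matrix for clarity.  W' P W is band-diagonal, so a
practical implementation would normally exploit the band structure rather than
store and factorize the full matrix.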

> What about hts_engine?

Only the case 1 algorithm is implemented in hts_engine.

Best regards,

Heiga ZEN (Byung Ha CHUN)

Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975

