[hts-users:02201] Re: More questions to HSMM


Thank you Oura-san and ZEN-san.

Here are some more related questions about HSMM.

1. Can HERest still output a duration model in HMM mode? (As in the paper:
T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura. Duration
modeling for HMM-based speech synthesis. In Proc. ICSLP, pages 29-32,
1998.)

I tried running HERest without an initial duration model and with -g, so
that it would output a duration model in HMM mode, but all the resulting
duration pdfs are N(0, 1).

2. In HSMM mode, how can I provide a good initial duration model to HERest
when I have to "flat-start" because I have no initial boundary labels?
HCompV computes the global mean and variance for cmp only, not for
duration.

3. Do the parameter generation algorithms of Case 1/2/3 correspond to HMGenS
-c 0/1/2, respectively? What about hts_engine?

Thanks a lot.

Pengju.

-----Original Message-----
From: Heiga ZEN (Byung Ha CHUN) [mailto:heiga.zen@xxxxxxxxxxxxxxxxx]
Sent: August 31, 2009 18:22
To: hts-users@xxxxxxxxxxxxxxx
Subject: [hts-users:02200] Re: TRANSP of monophone reestimation

Hi,

燕鹏举 wrote (2009/08/31 11:06):

> While studying the standard HTS demo process (Training.pl), I found that
> the TRANSP during monophone re-estimation isn't updated appropriately.
> 
> Before the $ERST0 step, the transition matrices contain non-zero
> probabilities for the current state and the immediately following state,
> for example
> 
> <TRANSP> 7
>  0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
>  0.000000e+00 4.952075e-01 5.047925e-01 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
>  0.000000e+00 0.000000e+00 7.473042e-01 2.526958e-01 0.000000e+00 0.000000e+00 0.000000e+00
>  0.000000e+00 0.000000e+00 0.000000e+00 7.416168e-01 2.583832e-01 0.000000e+00 0.000000e+00
>  0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 9.256996e-01 7.430040e-02 0.000000e+00
>  0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 5.919777e-01 4.080223e-01
>  0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
> 
> But after $ERST0 (HERest on monophones), the transition probability to the
> current state is 0, for example
> 
> <TRANSP> 7
>  0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
>  0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
>  0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
>  0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00 0.000000e+00
>  0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00 0.000000e+00
>  0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 1.000000e+00
>  0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
> 
> You can check the file monophone.mmf.embedded.gz in your own cmp folder.
> 
> Can anybody give me a hint? Thanks.

Because it's an HSMM rather than an HMM.  The self-transition
probabilities of an HSMM are always 0.

HERest supports both HMM & HSMM training.  However, HInit & HRest
support HMM training only.  When HERest loads both the state-output
distribution mmf and the state-duration distribution mmf, it
automatically converts the HMM into an HSMM.  Therefore, until the
first HERest run, the state-output distribution mmf is an HMM.  Once
you update the state-output & state-duration distribution mmfs, they
become an HSMM.
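
To make the difference concrete, here is a small Python sketch (an
illustration only, not HTS or HERest code).  In an HMM, a self-transition
probability a implies a geometric state duration with mean 1/(1-a) and
variance a/(1-a)^2.  In an HSMM the duration comes from the separate
state-duration distribution, so the self-transitions are 0.  The names
HMM_TRANSP, implied_durations and drop_self_loops are made up for this
example, the matrix is the first TRANSP block quoted above, and
drop_self_loops only mimics the change seen in the matrix; it is not how
HERest estimates its duration model.

```python
# First <TRANSP> block from the quoted mail (HMM, before the first HERest run).
HMM_TRANSP = [
    [0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 4.952075e-01, 5.047925e-01, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 7.473042e-01, 2.526958e-01, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 7.416168e-01, 2.583832e-01, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 9.256996e-01, 7.430040e-02, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 5.919777e-01, 4.080223e-01],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]


def implied_durations(transp):
    """Return (mean, variance) in frames of the geometric duration implied
    by each emitting state's self-transition probability.

    The first and last states are the non-emitting entry/exit states and
    are skipped.  Assumes every self-transition probability is below 1.
    """
    stats = []
    for i in range(1, len(transp) - 1):
        a = transp[i][i]
        stats.append((1.0 / (1.0 - a), a / (1.0 - a) ** 2))
    return stats


def drop_self_loops(transp):
    """Zero the diagonal and renormalise each row, mimicking the HSMM-style
    matrix in which a state is always left after its duration has elapsed."""
    out = []
    for i, row in enumerate(transp):
        new_row = [0.0 if j == i else p for j, p in enumerate(row)]
        total = sum(new_row)
        # Rows with no outgoing probability (the exit state) stay all zero.
        out.append([p / total for p in new_row] if total > 0 else new_row)
    return out


if __name__ == "__main__":
    for state, (mean, var) in enumerate(implied_durations(HMM_TRANSP), start=2):
        print(f"state {state}: mean {mean:.2f} frames, variance {var:.2f}")
    for row in drop_self_loops(HMM_TRANSP):
        print(" ".join(f"{p:.6e}" for p in row))
```

Applied to the first matrix, drop_self_loops gives a matrix of the same
form as the second one quoted above (a single 1 to the next state in
every row except the last).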

Regards,

Heiga ZEN (Byung Ha CHUN)

-- 
Heiga ZEN (Byung Ha CHUN)
Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975

Follow-Ups
[hts-users:02202] Re: More questions to HSMM, Heiga ZEN (Byung Ha CHUN)
References
[hts-users:02198] TRANSP of monophone reestimation, 燕鹏举
[hts-users:02200] Re: TRANSP of monophone reestimation, Heiga ZEN (Byung Ha CHUN)