[hts-users:02202] Re: More questions to HSMM
> 1. Can HERest still output duration model in a HMM mode? (As in paper T.
> Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura. Duration
> modeling for HMM-based speech synthesis. In Proc. ICSLP, pages 29-32,
> I tried not to provide an initial duration model and use -g in order to
> output a duration model in a HMM mode, however all the resultant duration
> pdfs are N(0, 1).
I guess you didn't specify a proper flag with the -u option. If you want to output
state-duration distributions, you should run HERest with "dmv" included in the
-u flags. "dmv" means that the mean and variance of the duration models are estimated.
> 2. In the HSMM mode, how can I provide a good initial duration model to
> HERest (when I have to "flat-start" because I have no initial boundary
> labeling)? HCompV can get global mean and variance for cmp only but not for
One possibility is to initialize your duration models using external databases
that have manual (or automatic) segmentations. If you want to run completely
flat-start training, please initialize the means and variances of all state-duration
distributions to fixed values. If you use a large variance to initialize your
duration models, they become non-informative. Therefore, the initial state
assignments are effectively obtained without using the state-duration models.
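
To see why a large initial variance makes a duration model non-informative, here
is a minimal sketch. It assumes Gaussian state-duration pdfs; the mean of 5
frames and the variances 1 and 1e4 are hypothetical values chosen only for
illustration, not HTS defaults.

```python
import math


def gaussian_log_pdf(d, mean, var):
    """Log density of a Gaussian duration pdf evaluated at duration d (in frames)."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (d - mean) ** 2 / var)


def log_prob_spread(mean, var, durations):
    """Gap between the best and worst duration log-probability.

    A small gap means the duration model barely prefers any duration over
    another, so it has almost no influence on the state alignment.
    """
    scores = [gaussian_log_pdf(d, mean, var) for d in durations]
    return max(scores) - min(scores)


durations = range(1, 21)  # candidate state durations: 1 to 20 frames

# Hypothetical initial values: same mean, small versus very large variance.
sharp_spread = log_prob_spread(mean=5.0, var=1.0, durations=durations)
flat_spread = log_prob_spread(mean=5.0, var=1.0e4, durations=durations)

print("spread with var=1   :", sharp_spread)
print("spread with var=1e4 :", flat_spread)
```

With the small variance the duration term can change a path score by more than
100 in the log domain, whereas with the large variance the difference over the
same range is about 0.01, so the output distributions decide the alignment.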
> 3. Do the parameter generation algorithms of Case 1/2/3 correspond to HMGenS
> -c 0/1/2, respectively?
No. The case 1 algorithm, the Cholesky-based one, is -c 0. The case 3 algorithm,
the EM-based one, is both -c 1 and -c 2. If you run HMGenS with -c 1, the case 3
algorithm is run but only the "mixture components" are regarded as hidden variables,
i.e., the state sequence is fixed. If you run HMGenS with -c 2, however, both the
"mixture components" and the "state assignments" are regarded as hidden variables.
The case 2 algorithm has not been integrated into HTS.
# Only the case 2 algorithm was available when I was a bachelor student, but I
# replaced it with the case 1 algorithm when I became a master's student.
# If you want to look at the source code of the case 2 algorithm, please check
# the source code of SPTK's mlpg command. It's based on the case 2 algorithm.
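
For readers unfamiliar with the case 1 algorithm, the following is a minimal
sketch of Cholesky-based parameter generation. It is not the HMGenS or
hts_engine code. It assumes a fixed state sequence, one feature dimension,
diagonal covariances, and a single delta window 0.5 * (c[t+1] - c[t-1]) that is
truncated at the utterance edges. The function names are hypothetical. The
trajectory c solves (W' P W) c = W' P mu, where W stacks the static and delta
windows, P holds the inverse variances, and mu holds the per-frame means.

```python
import numpy as np


def build_window_matrix(num_frames):
    """Return W of shape (2T, T): row 2t is the static window, row 2t+1 the delta window."""
    T = num_frames
    W = np.zeros((2 * T, T))
    for t in range(T):
        W[2 * t, t] = 1.0  # static feature: c[t]
        # delta feature: 0.5 * (c[t+1] - c[t-1]); out-of-range frames are dropped
        if t > 0:
            W[2 * t + 1, t - 1] = -0.5
        if t < T - 1:
            W[2 * t + 1, t + 1] = 0.5
    return W


def generate_trajectory(means, variances):
    """Solve (W' P W) c = W' P mu via a Cholesky factorization.

    means, variances: array-likes of shape (T, 2) with [static, delta] per frame.
    Returns the smooth static trajectory c of shape (T,).
    """
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    T = means.shape[0]
    W = build_window_matrix(T)
    precision = 1.0 / variances.reshape(-1)  # diagonal of P, frame-major order
    mu = means.reshape(-1)
    R = W.T @ (precision[:, None] * W)  # W' P W, symmetric positive definite
    r = W.T @ (precision * mu)          # W' P mu
    L = np.linalg.cholesky(R)           # R = L L'
    y = np.linalg.solve(L, r)           # forward substitution step
    return np.linalg.solve(L.T, y)      # backward substitution step


# Step-like static means with zero delta means: the result is a smoothed step.
step_means = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
step_vars = [[0.5, 0.5]] * 4
print(generate_trajectory(step_means, step_vars))
```

A production implementation would exploit the banded structure of W' P W
instead of forming dense matrices; the dense version is used here only to keep
the sketch short.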
> What about hts_engine?
Only the case 1 algorithm is implemented in hts_engine.
Heiga ZEN (Byung Ha CHUN)
Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975