[hts-users:02206] Re: More questions to HSMM
Hi ZEN-san,
Thanks for the detailed answers; they helped a lot in understanding HTS.
The problem with HERest -g remains, though: I still always get N(0,1) for the
duration pdfs.
I did use -u tmvwdmv. Starting from the HERest parameters in step $ERST0, I
just removed the -N and -R parameters and added a -g parameter. See the
following command:
    HERest -A -C .../trn.cnf -D -T 1 -S .../train.scp -I .../mono.mlf -m 1 \
        -u tmvwdmv -w 3 -t 1500 100 5000 -H .../cmp/monophone.mmf -M .../cmp \
        -g .../dur/monophone.mmf .../mono.list
Is there anything incorrect, or anything I missed?
Thank you very much.
Pengju.
-----Original Message-----
From: Heiga ZEN (Byung Ha CHUN) [mailto:heiga.zen@xxxxxxxxxxxxxxxxx]
Sent: September 1, 2009 23:31
To: hts-users@xxxxxxxxxxxxxxx
Subject: [hts-users:02202] Re: More questions to HSMM
Hi,
燕鹏举 wrote:
> 1. Can HERest still output a duration model in an HMM mode? (As in the paper
> T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura. Duration
> modeling for HMM-based speech synthesis. In Proc. ICSLP, pages 29-32,
> 1998.)
Yes.
> I tried not to provide an initial duration model and use -g in order to
> output a duration model in a HMM mode, however all the resultant duration
> pdfs are N(0, 1).
I guess you didn't specify the proper flags with the -u option. If you want to
output state-duration distributions, you should run HERest with
    -u tmvwdmv
           ^^^
"dmv" means: estimate the means and variances of the duration models.
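To make the flag string easier to read, here is a minimal sketch that splits it at the `d`. It rests on an assumption that is not spelled out in this thread: that the letters before `d` apply to the acoustic (cmp) models, that the letters after it apply to the duration models, and that `t`, `m`, `v`, `w` carry their usual HTK meanings. The function name `split_update_flags` is hypothetical and is not part of HTS.

```python
def split_update_flags(flags: str) -> dict:
    """Split an HERest-style -u flag string at the first 'd'.

    Assumption: flags before 'd' target the acoustic models and flags
    after 'd' target the duration models.
    """
    # Assumed single-letter meanings (usual HTK convention).
    names = {
        "t": "transitions",
        "m": "means",
        "v": "variances",
        "w": "mixture weights",
    }
    acoustic, sep, duration = flags.partition("d")
    return {
        "acoustic": [names.get(c, c) for c in acoustic],
        # Without a 'd' in the string, nothing is requested for durations.
        "duration": [names.get(c, c) for c in duration] if sep else [],
    }


print(split_update_flags("tmvwdmv"))
print(split_update_flags("tmvw"))
```

Under this reading, a string without `d` requests no duration update at all, which would be one way to end up with untouched duration pdfs.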
> 2. In the HSMM mode, how can I provide a good initial duration model to
> HERest (when I have to "flat-start" because I have no initial boundary
> labeling)? HCompV can get the global mean and variance for cmp only, but not
> for duration.
One possibility is to initialize your duration models using external databases
that have manual (or automatic) segmentations. If you want to run completely
flat-start training, please initialize all means and variances of the
state-duration distributions to fixed values. If you initialize your duration
models with a large variance, they become non-informative, so the initial
state assignments are obtained without using the state-duration models.
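The effect of a large initial variance can be illustrated with a small numerical sketch. It assumes each state duration is scored by a single Gaussian log-density, and all numbers (mean of 5 frames, variances of 1 and 10000, candidate durations of 5 and 20 frames) are made up for illustration; none of this is HTS code.

```python
import math


def log_gauss(d: float, mean: float, var: float) -> float:
    """Log-density of a 1-D Gaussian N(mean, var) evaluated at duration d."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (d - mean) ** 2 / var)


def log_ratio(d1: float, d2: float, mean: float, var: float) -> float:
    """How strongly the duration model prefers d1 over d2 (in log domain)."""
    return log_gauss(d1, mean, var) - log_gauss(d2, mean, var)


mean = 5.0  # hypothetical initial mean, in frames
for var in (1.0, 1.0e4):  # small vs. large hypothetical initial variance
    print(f"var={var:g}: log p(5) - log p(20) = {log_ratio(5, 20, mean, var):.5f}")
```

With the small variance the model prefers 5 frames over 20 by about 112.5 in the log domain, while with the large variance the difference shrinks to about 0.011, so the duration term barely influences which state assignment is chosen.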
> 3. Do the parameter generation algorithms of Case 1/2/3 correspond to HMGenS
> -c 0/1/2, respectively?
No. The case 1 algorithm, the Cholesky-based one, is -c 0. The case 3
algorithm, the EM-based one, is both -c 1 and -c 2. If you run HMGenS with
-c 1, the case 3 algorithm is run but only "mixture components" are regarded
as hidden variables, i.e., the state sequence is fixed. However, if you run
HMGenS with -c 2, both "mixture components" and "state assignments" are
regarded as hidden variables. The case 2 algorithm has not been integrated
into HTS.
# Only the case 2 algorithm was available when I was a bachelor student, but I
# replaced it with the case 1 algorithm when I became a master student.
# If you want to look at the source code of the case 2 algorithm, please check
# the source code of SPTK's mlpg command. It's based on the case 2 algorithm.
> What about hts_engine?
Only the case 1 algorithm is implemented in hts_engine.
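The mapping described in this answer can be summarized as a small lookup table. This is only a restatement of the reply as data; the names `HMGENS_C_OPTION`, `OTHER_TOOLS` and `describe` are hypothetical and do not exist in HTS.

```python
# HMGenS -c value -> parameter generation algorithm, as stated in the reply.
HMGENS_C_OPTION = {
    0: {"case": 1, "method": "Cholesky-based", "hidden": ()},
    1: {"case": 3, "method": "EM-based", "hidden": ("mixture components",)},
    2: {
        "case": 3,
        "method": "EM-based",
        "hidden": ("mixture components", "state assignments"),
    },
}

# Other tools mentioned in the reply and the case they are based on.
OTHER_TOOLS = {"hts_engine": 1, "SPTK mlpg": 2}


def describe(c: int) -> str:
    """Return a one-line description of what HMGenS -c <c> runs."""
    entry = HMGENS_C_OPTION[c]
    hidden = ", ".join(entry["hidden"]) or "none"
    return (
        f"-c {c}: case {entry['case']} ({entry['method']}), "
        f"hidden variables: {hidden}"
    )


for c in sorted(HMGENS_C_OPTION):
    print(describe(c))
```

Note that no `-c` value maps to case 2, which matches the statement that the case 2 algorithm is not part of HTS.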
Best regards,
Heiga ZEN (Byung Ha CHUN)
--
--------------------------
Heiga ZEN (Byung Ha CHUN)
Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975