
[hts-users:02206] Re: More questions to HSMM


Hi ZEN-san,

Thanks for the detailed answers, that helped a lot in understanding HTS.

The problem with HERest -g persists, though: I still always get N(0,1) for the
duration pdfs.

I did use -u tmvwdmv. Starting from the HERest parameters in step $ERST0, I
only removed the -N and -R parameters and added a -g parameter. This is the
command:

HERest -A -C .../trn.cnf -D -T 1 -S .../train.scp -I .../mono.mlf -m 1 -u
tmvwdmv -w 3 -t 1500 100 5000 -H .../cmp/monophone.mmf -M .../cmp -g
.../dur/monophone.mmf .../mono.list

Is anything incorrect here, or did I miss something?

Thank you very much.

Pengju.

-----Original Message-----
From: Heiga ZEN (Byung Ha CHUN) [mailto:heiga.zen@xxxxxxxxxxxxxxxxx]
Sent: September 1, 2009 23:31
To: hts-users@xxxxxxxxxxxxxxx
Subject: [hts-users:02202] Re: More questions to HSMM

Hi,

燕鹏举 wrote:

> 1. Can HERest still output a duration model in HMM mode? (As in the paper T.
> Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura. Duration
> modeling for HMM-based speech synthesis. In Proc. ICSLP, pages 29-32,
> 1998.)

Yes.

> I tried not providing an initial duration model and using -g to output a
> duration model in HMM mode; however, all the resulting duration pdfs are
> N(0, 1).

I guess you didn't specify a proper flag with the -u option.  If you want to
output state-duration distributions, you should run HERest with

-u tmvwdmv
       ^^^
"dmv" means that the means and variances of the duration models are estimated.
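
One way to read the flag string is sketched below in Python. It assumes, based only on the explanation above, that the letters before "d" apply to the acoustic (cmp) models and the letters after "d" apply to the duration models. The helper name split_update_flags is hypothetical and is not part of HTS; the usage message of your HERest build is the authority on the exact semantics.

```python
def split_update_flags(flags):
    """Split a -u flag string at the first 'd'.

    Assumption (not taken from the HTS sources): letters before 'd' refer to
    the acoustic models, letters after 'd' refer to the duration models.
    """
    acoustic, _sep, duration = flags.partition("d")
    return acoustic, duration


acoustic_flags, duration_flags = split_update_flags("tmvwdmv")
print("acoustic:", acoustic_flags)
print("duration:", duration_flags)
```

Under that assumption, a flag string without "d" leaves the duration part empty, which would be consistent with the duration pdfs never being re-estimated.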

> 2. In the HSMM mode, how can I provide a good initial duration model to
> HERest (when I have to "flat-start" because I have no initial boundary
> labeling)? HCompV can get the global mean and variance for cmp only, but not
> for duration.

One possibility is to initialize your duration models using external databases
that have manual (or automatic) segmentations.  If you want to run completely
flat-start training, please initialize all means and variances of the
state-duration distributions with fixed values.  If you use a large variance to
initialize your duration models, they become non-informative, so the initial
state assignments are effectively obtained without using the state-duration
models.
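
The effect of a large initial variance can be illustrated with a small Python sketch. It assumes a single 1-D Gaussian per state duration; the mean of 5 frames, the two variances, and the range of candidate durations are made-up values for illustration, not recommended settings.

```python
import math


def gaussian_log_pdf(d, mean, var):
    """Log density of a 1-D Gaussian N(mean, var) evaluated at duration d."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (d - mean) ** 2 / var)


def log_pdf_spread(mean, var, durations):
    """Gap between the highest and lowest log density over the durations."""
    scores = [gaussian_log_pdf(d, mean, var) for d in durations]
    return max(scores) - min(scores)


# Hypothetical candidate state durations, in frames.
durations = range(1, 21)

# A narrow pdf strongly prefers durations near its mean.
narrow = log_pdf_spread(mean=5.0, var=1.0, durations=durations)

# A very wide pdf scores every candidate duration almost equally.
wide = log_pdf_spread(mean=5.0, var=1.0e4, durations=durations)

print("spread with var=1:   ", narrow)
print("spread with var=1e4: ", wide)
```

With the wide pdf, the duration term adds nearly the same score to every candidate duration, so it barely influences which state alignment is preferred.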

> 3. Do the parameter generation algorithms of Case 1/2/3 correspond to
> HMGenS -c 0/1/2, respectively?

No.  The case 1 algorithm, the Cholesky-based one, is -c 0.  The case 3
algorithm, the EM-based one, is both -c 1 and -c 2.  If you run HMGenS with
-c 1, the case 3 algorithm is run but only the "mixture components" are
regarded as hidden variables, i.e., the state sequence is fixed.  However, if
you run HMGenS with -c 2, both the "mixture components" and the "state
assignments" are regarded as hidden variables.  The case 2 algorithm has not
been integrated into HTS.

# Only the case 2 algorithm was available when I was a bachelor student, but I
# replaced it with the case 1 algorithm when I became a master student.
# If you want to look at the source code of the case 2 algorithm, please check
# the source code of SPTK's mlpg command.  It's based on the case 2 algorithm.
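
For readers unfamiliar with the Cholesky-based (case 1) idea, here is a minimal Python sketch. It is not the HMGenS or hts_engine code. It assumes a fixed state sequence, one 1-D stream with static and delta features, diagonal variances, a delta window of (-0.5, 0, 0.5), and simplified handling at the utterance edges. It solves (W' U^-1 W) c = W' U^-1 mu for the static trajectory c using a dense Cholesky factorization; real implementations exploit the band structure of the matrix instead.

```python
import math


def build_window_matrix(num_frames):
    """Rows alternate static and delta: o = W c, delta(t) = 0.5*(c[t+1]-c[t-1])."""
    rows = []
    for t in range(num_frames):
        static = [0.0] * num_frames
        static[t] = 1.0
        delta = [0.0] * num_frames
        if t > 0:
            delta[t - 1] = -0.5
        if t < num_frames - 1:
            delta[t + 1] = 0.5
        rows.extend([static, delta])
    return rows


def cholesky(a):
    """Return lower-triangular L with L L' = a (a symmetric positive definite)."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def solve_cholesky(low, b):
    """Solve (L L') x = b by forward and then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(low[i][k] * y[k] for k in range(i))) / low[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(low[k][i] * x[k] for k in range(i + 1, n))) / low[i][i]
    return x


def generate_trajectory(means, variances):
    """means/variances are interleaved: [static_0, delta_0, static_1, delta_1, ...]."""
    num_frames = len(means) // 2
    w = build_window_matrix(num_frames)
    rows = range(2 * num_frames)
    # Normal equations: A = W' U^-1 W and b = W' U^-1 mu, with diagonal U.
    a = [[sum(w[r][i] * w[r][j] / variances[r] for r in rows)
          for j in range(num_frames)] for i in range(num_frames)]
    b = [sum(w[r][i] * means[r] / variances[r] for r in rows)
         for i in range(num_frames)]
    return solve_cholesky(cholesky(a), b)


# Hypothetical 4-frame example: static means step from 0 to 1, delta means are 0.
means = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0, 1.0, 0.0]
smooth = generate_trajectory(means, [1.0] * 8)
loose = generate_trajectory(means, [1.0, 1.0e9] * 4)
print(smooth)
print(loose)
```

With unit variances the delta constraints smooth the step in the static means, while a very large delta variance makes the dynamic features irrelevant and the trajectory falls back to the static means.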

> What about hts_engine?

Only the case 1 algorithm is implemented in hts_engine.

Best regards,

Heiga ZEN (Byung Ha CHUN)

-- 
--------------------------
Heiga ZEN (Byung Ha CHUN)
Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975
