
[hts-users:03127] Re: Bug in the setting of the state durations


Hi,

I am aware of the problem.
Several solutions (including yours) are being considered.
Thank you.

Regards,
Keiichiro Oura


2011/12/7 Javier Latorre-Chimoto <javier.latorrechimoto@xxxxxxxxx>:
> Hi,
> We have found a bug in the algorithm that sets the state durations, both in
> HTS and in hts-engine.
> Basically the problem is due to the rounding of the number of frames and the
> carry-over of the non-assigned frames. The problem happens when the phone
> duration is given and also when the speech rate differs from one.
> I think that the main reason for it is that Yoshimura's equation is based on
> Gaussians, but the duration has a lower bound of 1 frame. We propose to
> modify the algorithm to assign the state durations as follows:
> 1) For a set of states Q, find the number of available frames based on the
> assigned model duration and/or the defined speaking rate
> 2) Compute rho and set an initial duration for each state based on
> Yoshimura's equation: (d[s] = mean[s]+rho*var[s])
> 3) Compute the difference between the available frames computed in step 1)
> and the frames assigned in step 2).
> 4) Find the state for which modifying the assigned duration by one frame
> would give the highest log-likelihood, and add 1 frame to (or remove 1 frame
> from) its assigned duration
> 5) Go back to step 3) until the absolute value of the difference between
> available and assigned frames is less than 1.
> 6) Carry over the remaining difference for the next block of states (Note
> that this difference is always less than 1 frame)
>
> I attach two files, with the proposed modification: one for HTS
> (HGen.c:SetStateDurations) and the other one for hts-engine
> (HTS_sstream.c:HTS_set_duration).
> The one for HTS works for labels in HTK format but it might require some
> modifications for other label formats.
> For the modification of hts-engine please note that in our internal version
> vari is the variance, not its inverse.
>
> Kind regards,
>
> -Javier Latorre
>
> Toshiba Research Europe Limited, Cambridge, UK
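
The six steps proposed in the quoted message can be sketched roughly as
follows. This is only an illustration in Python, not the attached HGen.c or
HTS_sstream.c patches. It assumes the usual form of Yoshimura's equation,
d[s] = mean[s] + rho * var[s] with rho = (T - sum(mean)) / sum(var), Gaussian
state-duration models, and variances given as variances (not inverses). The
function name assign_state_durations and its parameters are hypothetical.

```python
import math


def assign_state_durations(means, variances, target_frames, carry=0.0):
    """Hypothetical sketch of the proposed procedure (not the HTS code).

    means, variances: per-state Gaussian duration parameters, in frames.
    target_frames:    frames available for this block of states.
    carry:            fractional frames carried over from the previous block.
    Returns (integer durations, remaining difference to carry over).
    """
    # Step 1: number of available frames, including the carried-over part.
    available = target_frames + carry

    # Step 2: initial durations from d[s] = mean[s] + rho * var[s],
    # rounded to whole frames and clamped to the lower bound of 1 frame.
    rho = (available - sum(means)) / sum(variances)
    durations = [max(1, round(m + rho * v)) for m, v in zip(means, variances)]

    # Step 3: difference between available and assigned frames.
    diff = available - sum(durations)

    # Steps 4 and 5: move one frame at a time to the state where the
    # change costs the least (or gains the most) log-likelihood.
    while abs(diff) >= 1.0:
        step = 1 if diff > 0 else -1
        best, best_gain = None, -math.inf
        for s, (m, v, d) in enumerate(zip(means, variances, durations)):
            if d + step < 1:
                continue  # never go below 1 frame
            # Change in Gaussian log-likelihood when d becomes d + step.
            gain = ((d - m) ** 2 - (d + step - m) ** 2) / (2.0 * v)
            if gain > best_gain:
                best, best_gain = s, gain
        if best is None:
            break  # every state is already at 1 frame; nothing to remove
        durations[best] += step
        diff -= step

    # Step 6: the remainder is carried over to the next block of states.
    return durations, diff


durations, carry = assign_state_durations([2.4, 2.4, 2.4], [1.0, 2.0, 4.0], 8.0)
print(durations, carry)  # [2, 3, 3] 0.0
```

In this example plain rounding assigns 9 frames where only 8 are available,
and the loop removes the extra frame from the state with the smallest
variance, where the correction moves the duration closer to its mean at the
largest likelihood gain. Note that the remainder stays below 1 frame only as
long as the loop is not stopped by the 1-frame lower bound.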

References
[hts-users:03126] Bug in the setting of the state durations, Javier Latorre-Chimoto