
[hts-users:00939] Re: questions ?


Tamer Fares wrote:


Hi all,
I am wondering why log(F0), delta-log(F0), and delta-delta-log(F0) are modeled as multiple streams, while the spectral coefficients use only one stream?

This is because F0 can switch between two states: unvoiced (where there is no F0 value) and voiced (where there is an F0 value, with a Gaussian distribution).

The windows used to compute the deltas mean that delta-F0 can be in a different voiced/unvoiced state from the F0 value itself. For example, if F0 switches from voiced to unvoiced at a particular frame, then delta-F0 will have to stay in the 'voiced' state a few frames longer so that the deltas can be computed back at the frame where the switch occurred. Delta-delta-F0 behaves similarly with respect to delta-F0. So all three of them need their own streams, so that they can be in different voiced/unvoiced states at any given moment.

Simon
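
As a rough illustration of the point above, here is a minimal Python sketch showing that the voiced/unvoiced pattern of log(F0), delta and delta-delta can disagree near a voicing boundary. Everything in it is an assumption made for illustration and is not taken from HTS itself: the window coefficients, the helper names apply_window and voiced_mask, the use of NaN to mark "no F0 value", computing delta-delta by applying the delta window to the delta track, and the convention that a dynamic feature is defined only when every frame under its window is voiced. With a different boundary convention the direction of the mismatch may change, but the three tracks still do not switch at the same frames.

```python
import math

NAN = math.nan  # assumed marker for "unvoiced: no F0 value"

def apply_window(x, window):
    """Apply a centred window to x, repeating the edge frames as padding.

    Any NaN under the window makes the result NaN, so the output is
    'unvoiced' wherever the window touches an unvoiced frame.
    """
    half = len(window) // 2
    padded = [x[0]] * half + list(x) + [x[-1]] * half
    return [
        sum(w * v for w, v in zip(window, padded[t:t + len(window)]))
        for t in range(len(x))
    ]

def voiced_mask(x):
    """True where a value exists (voiced), False where it is NaN (unvoiced)."""
    return [not math.isnan(v) for v in x]

# Hypothetical log(F0) track: unvoiced, then five voiced frames, then unvoiced.
log_f0 = [NAN, NAN, 5.0, 5.1, 5.2, 5.1, 5.0, NAN, NAN]

delta_window = [-0.5, 0.0, 0.5]                  # assumed 3-frame delta window
delta = apply_window(log_f0, delta_window)       # delta of the static track
delta_delta = apply_window(delta, delta_window)  # delta of the delta track

static_voiced = voiced_mask(log_f0)
delta_voiced = voiced_mask(delta)
delta_delta_voiced = voiced_mask(delta_delta)

for name, mask in [("static", static_voiced),
                   ("delta", delta_voiced),
                   ("delta-delta", delta_delta_voiced)]:
    print(f"{name:12s}", "".join("V" if v else "-" for v in mask))
```

Under these assumptions the static track is voiced on five frames, the delta track on three, and the delta-delta track on one, so a single shared voiced/unvoiced decision could not describe all three at every frame.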

