
[hts-users:00975] Re: Discountuniuty in the speech synthesis!


Hi,

marc sobhy wrote (2007/11/21 17:54):

If the parameter vector at frame t is determined independently of the preceding and succeeding frames, the speech parameter sequence O that maximizes P(O | Q, lambda, T) is obtained as a *sequence of mean vectors of substates*. This causes *discontinuity* in the synthesized speech, which degrades its quality. I am wondering why the speech parameter sequence O can be a sequence of mean vectors?

Because, for a given state sequence Q, P(O|Q,lambda,T) becomes a Gaussian distribution.
The observation sequence that maximizes the output probability of a Gaussian distribution is its mean vector.
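
To spell this step out, here is a short sketch of the argument. It assumes a single Gaussian output distribution per state, a fixed state sequence Q = (q_1, ..., q_T), and no dynamic-feature constraints; the symbols \mu_{q_t} and \Sigma_{q_t} (mean and covariance of state q_t) are notation introduced here for illustration, not taken from the original question.

```latex
\begin{align}
P(O \mid Q, \lambda, T)
  &= \prod_{t=1}^{T} \mathcal{N}\!\left(o_t ;\, \mu_{q_t}, \Sigma_{q_t}\right)
   = \mathcal{N}\!\left(O ;\, \mu_Q, \Sigma_Q\right), \\
\mu_Q &= \left[\mu_{q_1}^{\top}, \mu_{q_2}^{\top}, \dots, \mu_{q_T}^{\top}\right]^{\top},
\qquad
\Sigma_Q = \operatorname{diag}\!\left(\Sigma_{q_1}, \Sigma_{q_2}, \dots, \Sigma_{q_T}\right), \\
\hat{O} &= \arg\max_{O} \, \mathcal{N}\!\left(O ;\, \mu_Q, \Sigma_Q\right) = \mu_Q .
\end{align}
```

Under these assumptions, each frame o_t is simply set to the mean of the state it belongs to.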

Why does this cause discontinuity in the synthesized speech?

Because the mean vector of the above Gaussian is stepwise: it stays constant within each HMM state and jumps at the transition to the next state.
This causes spectral and F0 discontinuities at the boundaries of HMM states.
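
As a toy illustration of the stepwise mean, the following sketch builds a one-dimensional trajectory from per-state means and durations. The values in state_means and state_durations and the helper names are hypothetical, chosen only for this example; this is not HTS code.

```python
# Hypothetical 1-D state means (e.g. log F0) and state durations in frames.
state_means = [5.0, 5.4, 5.1]
state_durations = [3, 4, 2]


def stepwise_trajectory(means, durations):
    """Repeat each state's mean for as many frames as the state lasts."""
    trajectory = []
    for mean, n_frames in zip(means, durations):
        trajectory.extend([mean] * n_frames)
    return trajectory


def frame_deltas(trajectory):
    """Difference between each frame and the one before it."""
    return [b - a for a, b in zip(trajectory, trajectory[1:])]


trajectory = stepwise_trajectory(state_means, state_durations)
deltas = frame_deltas(trajectory)

# Frames (0-indexed) whose value differs from the previous frame.
jump_frames = [t + 1 for t, d in enumerate(deltas) if d != 0.0]

print(trajectory)
print(jump_frames)
```

The trajectory is flat inside each state and changes only at the first frame of a new state, which is where the discontinuities appear.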

Regards,

Heiga ZEN (Byung Ha CHUN)

--
------------------------------------------------
Heiga ZEN     (in Japanese pronunciation)
Byung Ha CHUN (in Korean pronunciation)

Department of Computer Science and Engineering
Nagoya Institute of Technology
Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan

http://www.sp.nitech.ac.jp/~zen
------------------------------------------------
