[hts-users:00975] Re: Discontinuity in the speech synthesis!
Hi,
marc sobhy wrote (2007/11/21 17:54):
If the parameter vector at frame t is determined independently of the
preceding and succeeding frames, the speech parameter sequence O that
maximizes P(O | Q, lambda, T) is obtained as a *sequence of mean vectors
of substates*. This causes *discontinuity* in the synthesized speech,
which degrades its quality.
I am wondering: why is the speech parameter sequence O a sequence of mean vectors?
Because, for a given state sequence Q, P(O|Q,lambda,T) becomes a Gaussian distribution.
The observation sequence that maximizes a Gaussian density is its mean vector.
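As a sketch of this step: assuming each frame o_t is emitted independently from the Gaussian of the state q_t occupied at time t, the joint density factorizes and each factor is maximized at its own mean. The symbols \mu_{q_t} and \Sigma_{q_t} (mean and covariance of state q_t) are notation introduced here for illustration.

```latex
\begin{align}
P(O \mid Q, \lambda, T) &= \prod_{t=1}^{T} \mathcal{N}\!\left(o_t ;\, \mu_{q_t}, \Sigma_{q_t}\right), \\
\hat{O} = \arg\max_{O} P(O \mid Q, \lambda, T) &= \left(\mu_{q_1}, \mu_{q_2}, \ldots, \mu_{q_T}\right).
\end{align}
```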
Why does this cause discontinuity in the synthesized speech?
Because the mean vector of the above Gaussian is stepwise: it stays constant within each state and jumps at each state transition.
This causes spectral and F0 discontinuities at the boundaries between HMM states.
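The stepwise behaviour can be illustrated with a small one-dimensional sketch. The state means and durations below are made-up values, and the helper names `ml_trajectory` and `jump_frames` are hypothetical; this is not HTS code, only a toy showing that repeating each state mean for its duration gives a trajectory that changes only at state boundaries.

```python
# Toy 1-D example with three HMM states (all values are assumptions).
state_means = [1.0, 3.0, 2.0]   # assumed mean of each state's Gaussian
state_durations = [3, 2, 4]     # assumed number of frames spent in each state


def ml_trajectory(means, durations):
    """Frame-independent ML output: repeat each state mean for its duration."""
    trajectory = []
    for mean, duration in zip(means, durations):
        trajectory.extend([mean] * duration)
    return trajectory


def jump_frames(trajectory):
    """Return frame indices t where the value differs from frame t - 1."""
    return [t for t in range(1, len(trajectory))
            if trajectory[t] != trajectory[t - 1]]


trajectory = ml_trajectory(state_means, state_durations)
jumps = jump_frames(trajectory)
print(trajectory)  # piecewise-constant sequence of state means
print(jumps)       # frames at which a new state begins
```

With these values the trajectory is flat inside each state and the only changes occur at frames 3 and 5, which are exactly the state boundaries.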
Regards,
Heiga ZEN (Byung Ha CHUN)
--
------------------------------------------------
Heiga ZEN (in Japanese pronunciation)
Byung Ha CHUN (in Korean pronunciation)
Department of Computer Science and Engineering
Nagoya Institute of Technology
Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan
http://www.sp.nitech.ac.jp/~zen
------------------------------------------------