
[hts-users:00203] RE: [hts-users:00202] Reply: [hts-users:00201] Re: emotional speech synthesis using hts


Hi, 杨鸿武

I don't understand what you mean by "CART".
Could you explain?

Also, do you think it is feasible to synthesize emotional speech through the prosody model in HTS?

Thank you,
liulei
2006.2.24

From: 杨鸿武 <yang-hw03@xxxxxxxxxxxxxxxxxxxxx>
Reply-To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
To: <hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx>
Subject: [hts-users:00202] Reply: [hts-users:00201] Re: emotional speech
synthesis using hts
Date: Fri, 24 Feb 2006 09:29:57 +0800

Hi,
I believe that PSOLA-based speech synthesis is very different from HTS. I
think CART itself serves as the prosody model in HTS.
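
A minimal sketch of what "CART as the prosody model" could look like: HTS clusters context-dependent models with decision trees, so a tree can map the linguistic context of a label to shared Gaussian statistics for F0 (and, with separate trees, for spectrum and duration). The `Node` class, the two yes/no questions and every number below are hypothetical, chosen only for illustration; real HTS trees are built automatically from training data.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Node:
    # Internal nodes carry a yes/no question about the linguistic context.
    # Leaves carry the Gaussian statistics shared by all contexts reaching them.
    question: Optional[Callable[[dict], bool]] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    mean_log_f0: Optional[float] = None
    var_log_f0: Optional[float] = None


def find_leaf(node: Node, context: dict) -> Node:
    """Walk the tree from the root until a leaf (a node without a question)."""
    while node.question is not None:
        node = node.yes if node.question(context) else node.no
    return node


# Hypothetical toy tree: all questions and values are made up.
tree = Node(
    question=lambda c: c["stressed"],
    yes=Node(
        question=lambda c: c["phrase_final"],
        yes=Node(mean_log_f0=4.9, var_log_f0=0.02),
        no=Node(mean_log_f0=5.2, var_log_f0=0.03),
    ),
    no=Node(mean_log_f0=5.0, var_log_f0=0.02),
)

leaf = find_leaf(tree, {"stressed": True, "phrase_final": False})
print(leaf.mean_log_f0, leaf.var_log_f0)
```

In this picture the text analyzer only has to supply the context (stress, position in phrase, and so on); the F0 values themselves come from the statistics stored at the tree leaves.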

-----Original Message-----
From: 刘 磊 [mailto:liulei_198216@xxxxxxxxxxx]
Sent: February 24, 2006 8:27
To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
Subject: [hts-users:00201] Re: emotional speech synthesis using hts

Hi, Heiga ZEN

Thanks for your help.

I have read some papers about "PSOLA".
I found that speech synthesis with "PSOLA" needs a "text analyzer" and a
"prosody model".
The "prosody model" predicts prosody (F0, duration) from the text that will
be synthesized.

We all know that HTS uses Festival as its "text analyzer".
Does Festival also have a function for predicting prosody?

HTS uses MSD-HMM to model F0.
If Festival can predict prosody (F0),
how do the two work together?

thank you

liulei
2006.2.24


>From: "Heiga ZEN (Byung Ha CHUN)" <zen@xxxxxxxxxxxxxxxx>
>Reply-To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
>To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
>Subject: [hts-users:00200] Re: emotional speech synthesis using hts
>Date: Wed, 22 Feb 2006 23:11:47 +0900
>
>Hi,
>
>liulei_198216@xxxxxxxxxxx wrote:
>
>>I have read some papers about emotional speech synthesis, and now I
>>know that HTS uses "model interpolation" and "speaker adaptation"
>>to synthesize emotional speech and speech with various styles.
>
>Yes.
>
>>As for "speaker adaptation", I understand that to synthesize
>>speech with various styles, we must convert speech features
>>including spectrum, F0, and duration.
>
>Actually we do not need to convert the speech features themselves; we
>need to convert the "statistics" (model parameters) of these features.
>
>>But when I want to synthesize emotional speech, is it necessary to
>>convert the spectrum?
>>Is it enough to get emotional speech by converting F0
>>and duration?
>
>In my opinion, converting the spectrum will help to synthesize emotional
>speech.
>
>>In addition, how can I get real-time emotional conversion, for
>>example from sad to happy?
>>Certainly we can use "model interpolation" and "speaker
>>adaptation", but they need time in the training part.
>
>For speaker interpolation you have to prepare a number of models
>using sufficient emotional speech samples.
>Recording speech and training HMMs may take some time.
>On the other hand, for speaker adaptation you only need one set of
>HMMs trained using neutral speech and a few emotional speech samples
>for adaptation.
>Speaker (emotion) adaptation can be done off-line, so synthesizing
>emotional speech does not require any additional time.
>You can also use adapted models for interpolation.
>
>>Does anyone know about "speech synthesis driven by emotional function"?
>
>I don't know what "emotional function" is.
>In the HMM-based speech synthesis system with MLLR-based speaker
>(emotion) adaptation, I think the linear transformation matrices for
>the means and variances of the HMMs can be viewed as the functions that
>represent the relationship between neutral and emotional speech.
>
>Best regards,
>
>Heiga Zen (Byung Ha Chun)
>
>--
>  ------------------------------------------------
>   Heiga ZEN     (in Japanese pronunciation)
>   Byung-Ha CHUN (in Korean pronunciation)
>
>   Department of Computer Science and Engineering
>   Graduate School of Engineering
>   Nagoya Institute of Technology
>   Japan
>
>   e-mail: zen@xxxxxxxxxxxxxxxx
>      web: http://kt-lab.ics.nitech.ac.jp/~zen
>  ------------------------------------------------
>
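
The remarks above about converting model statistics rather than features, and about interpolating adapted models, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `mllr_adapt_mean` and `interpolate_means` are hypothetical, the matrix `A`, bias `b` and the mean vector are made-up numbers, and only the means are handled (a full MLLR setup would also transform variances and would estimate the transform from adaptation data).

```python
def mllr_adapt_mean(A, b, mu):
    """Apply an affine transform to a Gaussian mean: returns A * mu + b.

    A is a square matrix (list of rows); b and mu are vectors (lists).
    """
    return [sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
            for row, b_i in zip(A, b)]


def interpolate_means(mu_a, mu_b, weight_b):
    """Linear interpolation of two mean vectors: (1 - w) * mu_a + w * mu_b."""
    return [(1.0 - weight_b) * x + weight_b * y for x, y in zip(mu_a, mu_b)]


# Hypothetical mean of one state trained on neutral speech.
neutral_mean = [5.0, 1.0]

# Hypothetical transform; in practice it would be estimated off-line
# from a few emotional utterances.
A = [[1.0, 0.0],
     [0.0, 2.0]]
b = [0.5, 0.0]

# The model statistics are converted, not the speech features.
emotional_mean = mllr_adapt_mean(A, b, neutral_mean)

# The adapted model can in turn be interpolated with the neutral one.
halfway = interpolate_means(neutral_mean, emotional_mean, 0.5)

print(emotional_mean)  # [5.5, 2.0]
print(halfway)         # [5.25, 1.5]
```

Because the transform is applied to the model off-line, synthesis from the adapted or interpolated means costs the same as synthesis from the neutral model, which matches the point made in the reply.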

Follow-Ups
[hts-users:00204] Re: emotional speech synthesis using hts, Nicholas Volk
References
[hts-users:00202] Reply: [hts-users:00201] Re: emotional speech synthesis using hts, 杨鸿武