From: "Heiga ZEN (Byung Ha CHUN)" <zen@xxxxxxxxxxxxxxxx>
Reply-To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
Subject: [hts-users:00200] Re: emotional speech synthesis using hts
Date: Wed, 22 Feb 2006 23:11:47 +0900
Hi,
liulei_198216@xxxxxxxxxxx wrote:
> I have read some papers about emotional speech synthesis, and now I
> know that HTS uses "model interpolation" and "speaker adaptation"
> to synthesize emotional speech and speech in various styles.
Yes.
> About "speaker adaptation": does it mean that to synthesize
> speech in various styles, we must convert speech features
> including spectrum, F0, and duration?
Actually, we do not need to convert the speech features themselves;
we need to convert the "statistics" (model parameters) of these features.
> But when I want to synthesize emotional speech, is it necessary to
> convert the spectrum? Is it enough to obtain emotional speech by
> converting only F0 and duration?
In my opinion, converting the spectrum will help to synthesize
emotional speech.
> In addition, how can I get a real-time emotional conversion, for
> example from sad to happy?
Certainly we can use "model interpolation" and "speaker adaptation,"
but both require time in the training stage.
For speaker interpolation you have to prepare a number of models
trained on sufficient emotional speech samples; recording speech
and training HMMs may take some time.
On the other hand, for speaker adaptation you only need one set of
HMMs trained using neutral speech and a few emotional speech samples
for adaptation.
Speaker (emotion) adaptation can be done off-line, so synthesizing
emotional speech does not require any additional time.
You can also use adapted models for interpolation.
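As a rough illustration of the interpolation idea (this is a toy sketch, not HTS code; the function name, the two-dimensional "sad"/"happy" Gaussians, and the per-stream weights are all made up), interpolating between emotion-dependent models amounts to taking weighted combinations of their Gaussian statistics. A simple scheme, assuming diagonal covariances and weights that sum to one:

```python
import numpy as np

def interpolate_gaussians(means, variances, weights):
    """Interpolate single-Gaussian output distributions.

    The interpolated mean is the weighted sum of the component
    means, and the interpolated variance the weighted sum of the
    component variances (one simple interpolation scheme among
    several; weights are normalised to sum to one).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise the interpolation weights
    mean = sum(wi * m for wi, m in zip(w, means))
    var = sum(wi * v for wi, v in zip(w, variances))
    return mean, var

# Toy "sad" and "happy" models (numbers are illustrative only)
mu_sad, var_sad = np.array([1.0, 2.0]), np.array([0.5, 0.5])
mu_happy, var_happy = np.array([3.0, 0.0]), np.array([1.0, 1.0])

# Morph gradually from sad to happy by sweeping the weight
for a in (0.0, 0.5, 1.0):
    mu, var = interpolate_gaussians([mu_sad, mu_happy],
                                    [var_sad, var_happy],
                                    [1.0 - a, a])
```

Since the models are prepared beforehand, only the weights need to change at synthesis time, which is what makes the sad-to-happy morphing cheap once training is done.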
> Does anyone know about "speech synthesis driven by an emotional
> function"?
I don't know what "emotional function" is.
In the HMM-based speech synthesis system with MLLR-based speaker
(emotion) adaptation, I think the linear transformation matrices for
the means and variances of the HMMs can be viewed as functions
representing the relationship between neutral and emotional speech.
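To make the "linear transformation as a function" view concrete (a minimal sketch, not the actual MLLR estimation in HTK/HTS; the matrix A, bias b, and the neutral mean are invented numbers, not estimated values), an MLLR mean transform maps a neutral-speech mean to an emotional one as mu' = A mu + b:

```python
import numpy as np

def mllr_adapt_mean(mu, A, b):
    """Apply an MLLR-style mean transform: mu' = A @ mu + b.

    In practice A (d x d) and b (d,) would be estimated from a
    small amount of emotional adaptation data; here they are
    fixed toy values for illustration.
    """
    return A @ mu + b

# Toy neutral-speech mean and an invented transform
mu_neutral = np.array([1.0, 2.0, 3.0])
A = np.eye(3) * 1.1              # slight scaling
b = np.array([0.2, -0.1, 0.0])   # small bias
mu_emotional = mllr_adapt_mean(mu_neutral, A, b)
```

The same pair (A, b), estimated once off-line from a few emotional samples, can then be applied to every Gaussian mean in the neutral model set, which is why adaptation needs far less emotional data than training emotion-dependent models from scratch.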
Best regards,
Heiga Zen (Byung Ha Chun)
--
------------------------------------------------
Heiga ZEN (in Japanese pronunciation)
Byung-Ha CHUN (in Korean pronunciation)
Department of Computer Science and Engineering
Graduate School of Engineering
Nagoya Institute of Technology
Japan
e-mail: zen@xxxxxxxxxxxxxxxx
web: http://kt-lab.ics.nitech.ac.jp/~zen
------------------------------------------------