Hi,
Xavi Gonzalvo wrote:
Perhaps measuring the RMSE of the F0 could show whether the contour
predicted with GV reduces the error compared with the conventional
method.
Note that the generation algorithm considering GV usually causes
LARGER errors than the conventional one. This tendency is observed in
both mel-cepstrum and F0 generation.
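To make the point concrete, here is a minimal Python sketch of the two
quantities being compared. It assumes F0 contours are given as lists of
Hz values with 0 marking unvoiced frames, and that GV is taken as the
per-utterance variance of log F0. The function names and the toy
contours are invented for illustration and are not taken from HTS or
from the papers.

```python
import math

def f0_rmse(ref, gen):
    """RMSE in Hz over frames that are voiced (> 0) in both contours."""
    pairs = [(r, g) for r, g in zip(ref, gen) if r > 0 and g > 0]
    if not pairs:
        raise ValueError("no frames voiced in both contours")
    return math.sqrt(sum((r - g) ** 2 for r, g in pairs) / len(pairs))

def global_variance(f0):
    """Variance of log F0 over the voiced frames of one utterance.

    Using log F0 is an assumption of this sketch.
    """
    logs = [math.log(x) for x in f0 if x > 0]
    if not logs:
        raise ValueError("no voiced frames")
    mean = sum(logs) / len(logs)
    return sum((x - mean) ** 2 for x in logs) / len(logs)

# Toy contours (made-up numbers, 0 = unvoiced frame).
natural = [100.0, 140.0, 0.0, 180.0, 140.0]
flat    = [130.0, 140.0, 0.0, 150.0, 140.0]  # over-smoothed, small range
wide    = [140.0, 180.0, 0.0, 140.0, 100.0]  # natural range, misplaced peak

for name, contour in (("flat", flat), ("wide", wide)):
    print(name,
          "RMSE = %.2f Hz" % f0_rmse(natural, contour),
          "GV = %.4f" % global_variance(contour))
print("natural GV = %.4f" % global_variance(natural))
```

In this toy case the flat contour has the smaller RMSE, while the wide
contour has the same GV as the natural one but a larger RMSE. So a
lower RMSE does not by itself say that the contour has a natural
dynamic range.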
The HMM likelihood of a parameter trajectory generated by the
conventional algorithm is much larger than that of a natural one. This
implies that maximizing the HMM likelihood alone is not necessarily
what we want. This point is described in the following papers:
T. Toda, A.W. Black, and K. Tokuda, ICASSP2005.
T. Toda and K. Tokuda, IEICE, May 2007 (to appear).
Note that the ICASSP paper describes only mel-cepstrum generation in
voice conversion, but we see similar results for mel-cepstrum and F0
generation in HTS as well. Those HTS results are described in the
IEICE paper.
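For readers without the papers at hand, the idea can be written roughly
as follows. This is a sketch, not a quotation from the papers, and the
notation is assumed here: C is the static feature sequence, W the window
matrix that appends dynamic features, q the state sequence, \lambda and
\lambda_v the HMM and GV models, \omega a weight, and T the number of
frames in the utterance.

```latex
% Conventional generation: maximize the HMM likelihood only
\hat{C} = \arg\max_{C} \log P(WC \mid q, \lambda)

% GV of one feature dimension d over an utterance of T frames
v(d) = \frac{1}{T} \sum_{t=1}^{T} \bigl( c_t(d) - \bar{c}(d) \bigr)^2,
\qquad \bar{c}(d) = \frac{1}{T} \sum_{t=1}^{T} c_t(d)

% Generation considering GV: weighted HMM likelihood plus a GV term
\hat{C} = \arg\max_{C} \bigl\{ \omega \log P(WC \mid q, \lambda)
          + \log P(v(C) \mid \lambda_v) \bigr\}
```

Because the GV term pulls the trajectory away from the HMM-likelihood
maximum, the generated parameters can sit further from the state means,
which is consistent with the larger frame-wise errors mentioned above.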
Thanks,
Tomoki Toda
Nara Institute of Science and Technology
E-mail: tomoki@xxxxxxxxxxx
TEL: +81-743-72-5282
FAX: +81-743-72-5289
----- Original Message -----
From: "Xavi Gonzalvo" <gonzalvo@xxxxxxxxxxxxx>
To: <hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx>
Sent: Tuesday, April 17, 2007 6:27 PM
Subject: [hts-users:00645] Re: F0 contours become flat
Hi,
I've been reading about GV as explained in your Interspeech 2005
article, and I notice that the MOS improves when GV is applied to the
mel-cepstrum, but the perceptual tests don't show an improvement when
it is applied to F0 (as explained in the paper, perhaps because of
some data errors).
Perhaps measuring the RMSE of the F0 could show whether the contour
predicted with GV reduces the error compared with the conventional
method.
Greetings
-----Original Message-----
From: Heiga ZEN (Byung Ha CHUN) [mailto:zen@xxxxxxxxxxxxxxxx]
Sent: Tuesday, March 13, 2007 23:49
To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
Subject: [hts-users:00577] Re: F0 contours become flat
Hi,
Xavi Gonzalvo wrote:
Let me add that we carried out some tests on Spanish, comparing the
F0 contour from HTS 1.1.1 with the one obtained from a CBR (Case-Based
Reasoning) algorithm. The HMM contours were clearly flatter than the
CBR ones, especially for short interrogative phrases.
In our internal version we use GV.
It can reduce this problem.
Regards,
Heiga ZEN (Byung Ha CHUN)
--
------------------------------------------------
Heiga ZEN (in Japanese pronunciation)
Byung Ha CHUN (in Korean pronunciation)
Department of Computer Science and Engineering
Nagoya Institute of Technology
Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan
http://www.sp.nitech.ac.jp/~zen
------------------------------------------------