When I built a Korean HTS voice, I had the same problem you have, and I
think it is a systemic one. When the least-squares solution is computed
(via Cholesky decomposition), the result was stable only for sufficiently
long sentences. When I tried to synthesize short sentences (one-word
sentences only), I found lf0, duration, and spectrum to be very unstable and
unnatural, especially at the initial and final positions. The fact that
almost all of the training sentences were long made it worse.

So first of all, I retrained the HMMs after adding short sentences to the
training DB (one-word and two-word sentences, about 500 sentences (10%)).
I also modified the synthesis modules so that, if the input sentence is
shorter than some threshold, the duration and lf0 mean values are forcibly
stabilized by adjusting them through interpolation, and the duration and
lf0 variation is made smaller. These methods were worth considering, at
least for Korean.

YoungHo Han / Engineer /
Senior Researcher, Core Lab.
INFINITY TELECOM
Gurogu, Gurodong, Seoul, KOREA
O) +82-2-565-8808  F) +82-2-6675-8811  M) +82-10-3936-2191
Pursuing Infinite Innovations

From:
somayeh bagherbeygi [mailto:sb_4715@xxxxxxxxx]