When I built a Korean HTS system, I had the same problem you have.

I thought it was a systemic problem. That is, when calculating the least-squares solution (Cholesky decomposition), we get stable results only for adequately long sentences.

When I tried to synthesize short sentences (one-word sentences only), I found lf0, duration, and spectrum to be very unstable and unnatural, especially at the initial and final positions.

The fact that almost all of the training sentences were long made it worse. So, first of all, I retrained the HMMs after adding short sentences to the training DB (1-word and 2-word sentences, about 500 of them (10%)). I also modified the synthesis modules so that if the input sentence is shorter than some threshold, the duration and lf0 mean values are forcibly stabilized by adjusting them through interpolation, and the duration and lf0 variation is made smaller.
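As a rough illustration of the synthesis-side adjustment (not the actual code from the Korean system), the idea can be sketched as below. The function name, the word-count threshold `max_words`, the weight `alpha`, and the choice to interpolate toward the utterance average are all assumptions made for this example. For lf0, it would be applied to voiced states only.

```python
from typing import List


def stabilize_short_utterance(
    values: List[float],
    n_words: int,
    max_words: int = 2,   # hypothetical threshold for "short" input
    alpha: float = 0.5,   # hypothetical weight: 1.0 = unchanged, 0.0 = flat
) -> List[float]:
    """Pull per-state values (durations or lf0 means) toward their average.

    Inputs longer than max_words are returned unchanged. For short inputs,
    each value is linearly interpolated between the utterance average and
    its original value, which reduces the variation across states.
    """
    if n_words > max_words or not values:
        return list(values)
    mean = sum(values) / len(values)
    return [mean + alpha * (v - mean) for v in values]


# One-word input: state durations (in frames) are pulled toward their mean of 6.0.
durations = [3.0, 9.0, 6.0]
print(stabilize_short_utterance(durations, n_words=1))  # [4.5, 7.5, 6.0]
```

Here `alpha` and `max_words` would need tuning by listening tests for each language and voice.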

These methods were worth considering, at least for Korean.


I have built an HTS database for my language (Persian). I trained it on 500 utterances and 500 raw files. I didn't have any problems during training. I have a problem when I test short sentences (for example, 1 word (subject) or 2 words (subject and verb)). Their quality is worse than that of longer sentences. When I test the same word in a longer phrase with 3 or 4 words, the quality is better. Would you please help me?

Best Regards


