Hi all,
Thank you for all your answers.
I am not using STRAIGHT, nor was I using global variance in that experiment. I have also checked that none of my state durations is zero.
Thanks to your help, though, I was able to find out what my problem is: it turns out that some LSP coefficients end up greater than Pi, which makes the synthesis filter unstable.
I think I can find some ad-hoc solution at the vocoding stage.
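For anyone hitting the same issue, here is a minimal sketch of what such an ad-hoc fix could look like. It assumes the LSPs are expressed in radians and that a frame is usable only when 0 < w_1 < w_2 < ... < w_p < Pi. The function names and the min_gap value are hypothetical choices for illustration and are not part of any existing toolkit.

```python
import math


def is_stable(lsp, min_gap=0.0):
    """Return True if 0 < w_1 < ... < w_p < pi with more than min_gap spacing."""
    bounds = [0.0] + list(lsp) + [math.pi]
    return all(b - a > min_gap for a, b in zip(bounds, bounds[1:]))


def stabilize_lsp(lsp, min_gap=1e-3):
    """Ad-hoc repair of one frame: sort, then push values back inside (0, pi).

    min_gap is an assumed minimum distance (in radians) between neighbours
    and from the two boundaries; tune it for your own data.
    """
    w = sorted(lsp)
    p = len(w)
    if (p + 1) * min_gap >= math.pi:
        raise ValueError("min_gap is too large for this LSP order")
    # Forward pass: enforce the lower bound and the minimum spacing.
    prev = 0.0
    for i in range(p):
        w[i] = max(w[i], prev + min_gap)
        prev = w[i]
    # Backward pass: enforce the upper bound and the minimum spacing.
    nxt = math.pi
    for i in range(p - 1, -1, -1):
        w[i] = min(w[i], nxt - min_gap)
        nxt = w[i]
    return w
```

Applied frame by frame just before the synthesis filter, this leaves well-behaved frames untouched and only moves the coefficients that break the ordering. It does not address why the generated values drift past Pi in the first place.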
Thanks again for your help.
Geoffrey
Hi,
In that paper, we used STRAIGHT to extract spectra and then extracted 39th-order LSPs for each frame. Are you using STRAIGHT? Also, if GV is used, synthesis filters obtained from generated LSPs sometimes become unstable (mcep achieved the best MOS in that experiment when GV was used).
Geoffrey Wilfart wrote:
Thank you for your answer. I have already used both, and never had any issues when using mel-cepstrum parameters.
I wanted to use generalized LPC-LSP parameters on the mel scale, as they are reported to give the best MOS in:
H. Zen, T. Toda, K. Tokuda, "The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006", IEICE Trans. on Information and Systems, 2006.
Best regards,
Heiga ZEN (Byung Ha CHUN)
--
Heiga ZEN (Byung Ha CHUN)
Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975