Hi Keiichiro,
Thank you for your answer. I have already used both, and never had any issues when using mel-cepstrum parameters.
I wanted to use generalized LPC-LSP parameters on the mel scale, as they are reported to give the best MOS in:
H. Zen, T. Toda, K. Tokuda, "The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006", IEICE Trans. on Information and Systems, 2006.
Thanks,
Geoffrey
Hi, Geoffrey
To separate a duration problem from an LSP stability problem, you should try using mel-cepstrum parameters for the spectrum.
Regards,
Keiichiro Oura
Geoffrey Wilfart wrote:
Dear all,
I have run an experiment training a model on 24th-order LSP coefficients + LF0, with alpha=0.42 and gamma=3.
The training went well, and I'm able to synthesize some sentences.
However, the models seem sensitive to the duration model: I have encountered problems when synthesizing speech from the training labels, with imposed segmentation, or using a third-party duration model.
I have tried using lspcheck to check stability, without success.
Do you have any idea why this is?
Thank you,
Geoffrey
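
For readers following the thread, the stability condition under discussion can be illustrated with a small sketch. It assumes each generated frame is a list of LSP frequencies in radians with the gain term already removed, and it tests the usual ordering property 0 < w1 < ... < wM < pi. The function names and the min_gap parameter are hypothetical; this is not the implementation of SPTK's lspcheck, only an illustration of the kind of check it is meant to perform.

```python
import math

def lsp_is_stable(lsp, min_gap=0.0):
    """Return True if the LSP frequencies (radians, gain excluded) are
    strictly increasing inside (0, pi), with more than min_gap between
    neighbours and between the outer values and the interval ends."""
    prev = 0.0
    for w in lsp:
        if w - prev <= min_gap:
            return False
        prev = w
    return math.pi - prev > min_gap

def first_unstable_frame(frames, min_gap=0.0):
    """Return the index of the first frame that fails the check, or None."""
    for i, frame in enumerate(frames):
        if not lsp_is_stable(frame, min_gap):
            return i
    return None
```

Running first_unstable_frame over the generated LSP trajectories for a sentence synthesized with imposed durations would indicate whether the artifacts coincide with frames that violate the ordering property, which may help to tell a duration problem apart from a stability problem.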