I am getting unstable synthesized speech and would like to know how to debug it.
I adopted the scripts and settings from the "Speaker dependent training with STRAIGHT demo" (for an English voice) and replaced the data with Mandarin speech (the Mandarin data from Blizzard Challenge 2009).
However, the synthesized speech is not stable: some utterances sound good, while others contain artifacts (pops, white noise, or even silence).
I also ran the same data through the "Speaker dependent training demo" (without STRAIGHT) and had no problems. So I don't know where the problem lies (is it STRAIGHT?) or how to solve it.
I would very much appreciate your help. Have a nice day!
PS: I used HTS 2.1 and STRAIGHT version V40_006b (no errors during the training procedure).