
[hts-users:00437] Re: training error


Heiga ZEN (Byung Ha CHUN) wrote:
Alexis Moinet wrote:
 Processing Data: sentence_0016_00.cmp; Label sentence_0016_00.lab
 Unable to traverse 530 states in 482 frames
 WARNING [-7324]  StepBack: Bad data or over pruning
 in /home/user/work/HTS-2.0/htk-3.4/HTKTools/HERest
"Unable to traverse 530 states in 482 frames" means that the sentence HMM corresponding to sentence_0016_00.lab contains 530 states but sentence_0016_00.cmp consists of 482 frames.
Therefore, HERest cannot run the forward-backward algorithm on this utterance.
Please check whether sentence_0016_00.lab (context-dependent) is correct.

Well, I understand this, but the lab file seems correct.
There are 63 phonemes in it (including silences), so how does HERest choose the number of states to be 530? I guess it depends on the duration model trained in the previous training steps ("monophone" and "tie transition probabilities")?
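
To make the numbers concrete, here is a minimal sketch of the constraint behind the message. It assumes left-to-right HMMs without skip transitions, so that every emitting state must consume at least one frame, and it assumes 5 emitting states per model (an assumption on my side, not something read from the model files). The function names are made up for illustration and are not part of HTK or HTS.

```python
def min_frames_required(num_models: int, emitting_states_per_model: int) -> int:
    """Fewest frames that can be aligned to a chain of left-to-right models
    when each emitting state must be occupied for at least one frame."""
    return num_models * emitting_states_per_model


def can_traverse(num_states: int, num_frames: int) -> bool:
    """True if the frame count is large enough to visit every state once."""
    return num_frames >= num_states


if __name__ == "__main__":
    # Numbers reported by HERest for sentence_0016_00.
    print(can_traverse(530, 482))

    # 63 labels at an assumed 5 emitting states each.
    print(min_frames_required(63, 5))
```

Under these assumptions, 63 models would need only 315 frames, which fits in 482, so the 530 reported by HERest does not follow from the label count alone.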

The point is: if we train on sentence_0001 through sentence_0020 only, we get no WARNING [-7324] (so sentence_0016 is fine) and we can synthesize (very) poor speech. If we train on sentence_0001 through sentence_0300, we get WARNING [-7324] for sentence_0016 (and for some other sentences as well), and training eventually stops because of a further error, most probably caused by the warning.

So could it be that the HMMs computed during monophone training, which are used to initialize the fullcontext HMMs, cause the "530 states" problem?

By the way, we use 22 kHz sound files. Does that matter? (You seem to use 16 kHz for HTS-demo.)

Hi Alexis,

# How are you doing :-)

# Fine, thanks. What about you? :-)

Best regards,

Alexis




References
[hts-users:00433] training error, Alexis Moinet
[hts-users:00434] Re: training error, Heiga ZEN (Byung Ha CHUN)