[hts-users:04521] Re: too few observation sequences

On Tue, May 2, 2017 at 3:48 PM David Tofu <david.tofu@xxxxxxxxxxxx> wrote:

We ran this command with an elevated trace level (-T 31):

/proj/tts/hts-2.3/htk/HTKTools/HInit -A -C /proj/tts/voices/babel/amharic_female/configs/qst001/ver1/trn.cnf -D -T 31 -S /proj/tts/voices/babel/amharic_female/data/scp/train.scp -m 1 -u tmvw -w 5000 -H /proj/tts/voices/babel/amharic_female/models/qst001/ver1/cmp/init.mmf -M /proj/tts/voices/babel/amharic_female/models/qst001/ver1/cmp/HInit -I /proj/tts/voices/babel/amharic_female/data/labels/mono.mlf -l dw -o dw /proj/tts/voices/babel/amharic_female/proto/qst001/ver1/state-5_stream-4_mgc-105_lf0-3.prt

It turns out that HInit loads no observations because it ignores every instance of the phoneme, like this:

0 observations loaded from /proj/tts/voices/babel/amharic_female/data/cmp/BABEL_OP3_307_51611_20140423_232011_inLine_120.cmp
seg dw 31170101.000000->31328700.000000 ignored
seg dw 31328700.000000->31485200.000000 ignored
0 observations loaded from /proj/tts/voices/babel/amharic_female/data/cmp/BABEL_OP3_307_51611_20140423_232011_inLine_132.cmp

Do you have any idea why this would happen? The frame length is set to 1200 in the Makefile in the data/ dir.

Thank you!

David
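
As a rough sanity check on the trace above, the sketch below converts the two ignored dw segments from HTK label time units (100 ns each) into durations and approximate frame counts. The frame shift is not stated in this thread; the 5 ms value (50000 HTK units) used here is an assumption, as are the function and variable names. The 5-frame threshold is the figure mentioned in the original question quoted below, not a confirmed HInit rule.

```python
# HTK label times are expressed in units of 100 ns.
HTK_UNITS_PER_MS = 10_000

# ASSUMPTION: 5 ms frame shift (50000 HTK units). The thread does not state it.
ASSUMED_FRAME_SHIFT_HTK = 50_000

# Threshold quoted in the thread ("more than 5 frames"); not a verified HInit rule.
MIN_FRAMES_FROM_THREAD = 5


def approx_frames(start, end, frame_shift_htk=ASSUMED_FRAME_SHIFT_HTK):
    """Approximate number of whole frames between two HTK label times."""
    return int((end - start) // frame_shift_htk)


# The two segments reported as "ignored" in the HInit trace.
ignored_segments = [(31170101, 31328700), (31328700, 31485200)]

for start, end in ignored_segments:
    dur_ms = (end - start) / HTK_UNITS_PER_MS
    frames = approx_frames(start, end)
    verdict = "short" if frames < MIN_FRAMES_FROM_THREAD else "ok"
    print(f"{start}->{end}: {dur_ms:.2f} ms, ~{frames} frames ({verdict})")
```

Under the assumed 5 ms shift, both segments last about 16 ms and come out at roughly 3 frames each, which is below the 5-frame figure. With a different frame shift the counts change accordingly, so this only suggests where to look.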

On Tue, May 2, 2017 at 3:05 AM, Heiga ZEN (Byung Ha CHUN) <heigazen@xxxxxxxxxx> wrote:
First, let's increase HInit's trace level (-T) to get finer-grained logs. They will tell you more about why it failed.

One possibility is that HInit internally failed to run Viterbi alignment over all 'dw' segments.

Heiga

On Tue, May 2, 2017 at 12:54 AM Erica Cooper <ecooper@xxxxxxxxxxxxxxx> wrote:
Hi hts-users,

We are training a new voice and running into this error:

ERROR [+2121] HInit: Too Few Observation Sequences [0]
FATAL ERROR - Terminating program /proj/tts/hts-2.3/htk/HTKTools/HInit
Error in /proj/tts/hts-2.3/htk/HTKTools/HInit -A -C /proj/tts/voices/babel/amharic_female/configs/qst001/ver1/trn.cnf -D -T 1 -S /proj/tts/voices/babel/amharic_female/data/scp/train.scp -m 1 -u tmvw -w 5000 -H /proj/tts/voices/babel/amharic_female/models/qst001/ver1/cmp/init.mmf -M /proj/tts/voices/babel/amharic_female/models/qst001/ver1/cmp/HInit -I /proj/tts/voices/babel/amharic_female/data/labels/mono.mlf -l dw -o dw /proj/tts/voices/babel/amharic_female/proto/qst001/ver1/state-5_stream-4_mgc-105_lf0-3.prt

We understand that this error can occur when there are too few examples of the phoneme, or when the examples do not each contain enough frames. However, we have more than 3 examples of this phoneme in our data (the minimum specified in the HTK book), and they all contain more than 5 frames (which sounds like the minimum required, according to other posts on this list). Is there any other reason why this error may occur?

Thanks,
Erica
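
One way to double-check the "more than 5 frames" claim is to scan the master label file for every segment of the phoneme and list the short ones. The sketch below is a minimal example, assuming the MLF uses the usual "start end label" lines with times in 100 ns units and a 5 ms frame shift (50000 units). The frame shift, the function name, and the sample labels are all hypothetical and are not taken from this thread's data.

```python
def short_segments(mlf_lines, label, min_frames=5, frame_shift_htk=50_000):
    """Return (total, short) for one label in an HTK master label file.

    total: number of segments carrying `label`
    short: list of (utterance, start, end, frames) with frames < min_frames
    ASSUMPTION: frame_shift_htk defaults to 5 ms; adjust to the real setup.
    """
    current = None
    total = 0
    short = []
    for raw in mlf_lines:
        line = raw.strip()
        if not line or line == "#!MLF!#":
            continue  # skip blank lines and the MLF header
        if line.startswith('"'):
            current = line.strip('"')  # start of a new utterance entry
            continue
        if line == ".":
            current = None  # end of the current utterance entry
            continue
        parts = line.split()
        if len(parts) < 3 or parts[2] != label:
            continue
        start, end = int(float(parts[0])), int(float(parts[1]))
        total += 1
        frames = (end - start) // frame_shift_htk
        if frames < min_frames:
            short.append((current, start, end, frames))
    return total, short


# Hypothetical in-memory MLF, for illustration only.
sample = """#!MLF!#
"*/utt_a.lab"
0 1000000 sil
1000000 1158599 dw
1158599 2000000 a
.
"*/utt_b.lab"
0 500000 dw
500000 900000 sil
.
""".splitlines()

total, short = short_segments(sample, "dw")
print(f"{total} 'dw' segments, {len(short)} shorter than 5 frames")
for utt, start, end, frames in short:
    print(f"  {utt}: {start}->{end} (~{frames} frames)")
```

For a real file, the same function could be called on an open file handle, for example `short_segments(open(path), "dw")`, where `path` points at the mono.mlf passed to HInit with -I.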

--
---------------------------------------
Heiga ZEN (in Japanese)
Byung Ha CHUN (in Korean)
<heigazen@xxxxxxxxxx>