[hts-users:04520] Re: too few observation sequences
- Subject: [hts-users:04520] Re: too few observation sequences
- From: "Heiga ZEN (Byung Ha CHUN)" <heigazen@xxxxxxxxxx>
- Date: Tue, 02 May 2017 07:05:21 +0000
- Cc: David Tofu <david.tofu@xxxxxxxxxxxx>
First, let's increase HInit's trace level (-T) so that it outputs more fine-grained logs. They should tell you in more detail why it failed.
One possibility is that HInit internally failed to run Viterbi alignment over all 'dw' segments.
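To cross-check the label side while looking at the trace, a rough sketch like the one below can list how many frames each 'dw' segment spans. It is plain Python, not part of HTK or HTS. It assumes HTK-style label lines of the form "start end name" with times in 100 ns units, and a 5 ms frame shift (adjust UNITS_PER_FRAME if your setup differs). The function name and the sample lines are made up for illustration. If mono.mlf only redirects to a directory of .lab files, feed the lines of those files to the function instead.

```python
UNITS_PER_FRAME = 50_000  # assumption: 5 ms frame shift, label times in 100 ns units


def frames_per_segment(lines, label, units_per_frame=UNITS_PER_FRAME):
    """Return the length in frames of every segment labelled `label`.

    Lines that are not "start end name" entries (the #!MLF!# header,
    quoted file patterns, the "." terminator) are skipped.
    """
    lengths = []
    for raw in lines:
        parts = raw.split()
        if len(parts) < 3:
            continue
        if not (parts[0].isdigit() and parts[1].isdigit()):
            continue
        if parts[2] != label:
            continue
        start, end = int(parts[0]), int(parts[1])
        lengths.append((end - start) // units_per_frame)
    return lengths


# Hypothetical label lines, only to show the call.
sample = [
    "#!MLF!#",
    '"*/utt1.lab"',
    "0 250000 sil",
    "250000 400000 dw",
    "400000 900000 a",
    ".",
]
print(frames_per_segment(sample, "dw"))
```

Any segment that comes out shorter than the number of emitting states in the prototype is worth a closer look, since the alignment may not be able to pass through all states within that segment.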
We are training a new voice and running into this error:
ERROR [+2121] HInit: Too Few Observation Sequences 
FATAL ERROR - Terminating program /proj/tts/hts-2.3/htk/HTKTools/HInit
Error in /proj/tts/hts-2.3/htk/HTKTools/HInit -A -C /proj/tts/voices/babel/amharic_female/configs/qst001/ver1/trn.cnf -D -T 1 -S /proj/tts/voices/babel/amharic_female/data/scp/train.scp -m 1 -u tmvw -w 5000 -H /proj/tts/voices/babel/amharic_female/models/qst001/ver1/cmp/init.mmf -M /proj/tts/voices/babel/amharic_female/models/qst001/ver1/cmp/HInit -I /proj/tts/voices/babel/amharic_female/data/labels/mono.mlf -l dw -o dw /proj/tts/voices/babel/amharic_female/proto/qst001/ver1/state-5_stream-4_mgc-105_lf0-3.prt
We understand that this error may happen because of too few examples of the phoneme, or because they don't each contain enough frames, but we have more than 3 examples of this phoneme in our data (the minimum specified in the HTK book) and they all contain more than 5 frames (which sounds like the minimum required according to other posts on this list) -- is there any other reason why this error may occur?
---------------------------------------
Heiga ZEN (in Japanese)
Byung Ha CHUN (in Korean)
<heigazen@xxxxxxxxxx>
- [hts-users:04519] too few observation sequences, Erica Cooper