
[hts-users:00800] catalan hts voices: only silence is generated (sorry ... complete email now)

Subject: [hts-users:00800] catalan hts voices: only silence is generated (sorry ... complete email now)
From: Antonio Bonafonte <antonio.bonafonte@xxxxxxx>
Date: Sun, 23 Sep 2007 00:00:45 +0200
Delivered-to: hts-users@xxxxxxxxxxxxxxx


(sorry for previous incomplete email)

Dear all,
I am trying to build Catalan voices using HTS.
I have downloaded the HTS demo: in English, as it is, it works perfectly.

In Catalan, the log file is very similar to the English one. The output
wav files have the appropriate duration, but the amplitude does not
represent speech: it is almost zero, in some files with a small pulse.
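The "almost zero amplitude" symptom can be quantified per file. The following is a minimal sketch, assuming the generated files are 16-bit PCM WAV; the function name `amplitude_stats` is hypothetical and not part of HTS or hts_engine.

```python
import array
import math
import sys
import wave


def amplitude_stats(path):
    """Return (duration_s, peak, rms) for a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        rate = wav.getframerate()
        n_frames = wav.getnframes()
        frames = wav.readframes(n_frames)
    samples = array.array("h")  # signed 16-bit integers
    samples.frombytes(frames)
    if sys.byteorder == "big":
        samples.byteswap()  # WAV sample data is little-endian
    if not samples:
        return 0.0, 0, 0.0
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    return n_frames / rate, peak, rms


if __name__ == "__main__":
    # Usage: python amplitude_stats.py file1.wav file2.wav ...
    for path in sys.argv[1:]:
        duration, peak, rms = amplitude_stats(path)
        print(f"{path}: {duration:.2f} s, peak={peak}, rms={rms:.1f}")
```

A file with the right duration but a peak of only a few units (out of 32767) would match the behaviour described above.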



Do you have any suggestions?

The whole log file can be found at
http://gps-tsc.upc.es/veu/festcat/ext/hts_ona.log

Compared with the English demo, I think the Catalan utterances are longer (each file is a short paragraph that takes twice the time of the arctic ones). We use a different phone set and different questions.


As I said, the log files look similar. However, I have seen
three things that may be related to the problem.

***** (1) *****
In some cases, the log probability has a different sign.
For instance, for the first model in HInit, with the original demo I get:


811 Observation Sequences Loaded
Starting Estimation Process
Iteration 1: Average LogP =  1482.33728
Iteration 2: Average LogP =  1522.82434  Change =    40.48701
Iteration 3: Average LogP =  1526.38599  Change =     3.56160
Iteration 4: Average LogP =  1527.31726  Change =     0.93121
...

And for the Catalan run:
1856 Observation Sequences Loaded
Starting Estimation Process
Iteration 1: Average LogP = -1101.82544
Iteration 2: Average LogP =  -492.18414  Change =   609.64130
Iteration 3: Average LogP =  -467.79330  Change =    24.39084
Iteration 4: Average LogP =  -459.21719  Change =     8.57610

Is this normal?
At the end of the whole process, both log files show similar LogP values.
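To compare the two runs iteration by iteration, the "Average LogP" lines can be pulled out of each log. This is a minimal sketch that assumes the line format shown in the excerpts above; `average_logp` is a hypothetical helper, not an HTK tool.

```python
import re

# Matches lines such as "Iteration 2: Average LogP =  -492.18414  Change = ..."
LOGP_RE = re.compile(
    r"Iteration\s+(\d+):\s+Average LogP\s*=\s*(-?\d+(?:\.\d+)?)"
)


def average_logp(log_text):
    """Return a list of (iteration, average_logp) pairs found in log_text."""
    return [(int(i), float(v)) for i, v in LOGP_RE.findall(log_text)]


# Two lines copied from the Catalan run quoted above.
sample = """\
Iteration 1: Average LogP = -1101.82544
Iteration 2: Average LogP =  -492.18414  Change =   609.64130
"""

for iteration, logp in average_logp(sample):
    print(iteration, logp)
```

Running the same function over the English and the Catalan log would give two lists that can be compared side by side.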

Another issue: two warnings appear in the Catalan log:

***** (2) *****
First warning:

============================================================================
Start tree-based context clustering (dur) at Sat Sep 22 08:43:18 CEST 2007
============================================================================

HHEd -A -B -C configs/trn.cnf -D -T 1 -p -i -H models/qst001/ver1/dur/re_clustered.mmf -m -a 1.0 -w models/qst001/ver1/dur/re_clustered.mmf edfiles/qst001/ver1/dur/cxc_dur.hed.untied data/lists/full.list

../..
TR 3
 Adjusting trace level

// construct decision trees
TB 0.00 dur_s2_  {*.state[2].stream[1-1]}
 Tree based clustering based on MDL criterion, threshold=5.303107e+01

WARNING [-2638] TypicalState: No typical state for dur_s2_633 in htk/bin/HHEd
WARNING [-2638] TypicalState: No typical state for dur_s2_631 in htk/bin/HHEd
WARNING [-2638] TypicalState: No typical state for dur_s2_628 in htk/bin/HHEd


.... (same warning repeated ~100 times)
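Rather than counting repeated warnings by eye, they can be tallied by warning code and reporting function. This is a minimal sketch that assumes the "WARNING [code] Function:" format seen in the log excerpts; `count_warnings` is a hypothetical helper.

```python
import re
from collections import Counter

# Captures the numeric code and the function name of each warning.
# findall() is used so that several warnings on one physical line
# are still counted separately.
WARNING_RE = re.compile(r"WARNING \[(-?\d+)\]\s+(\w+):")


def count_warnings(log_text):
    """Return a Counter keyed by (code, function) for all warnings found."""
    return Counter(WARNING_RE.findall(log_text))


sample = (
    "WARNING [-2638] TypicalState: No typical state for dur_s2_633 in htk/bin/HHEd\n"
    "WARNING [-2638] TypicalState: No typical state for dur_s2_631 in htk/bin/HHEd\n"
    "WARNING [-2638] TypicalState: No typical state for dur_s2_628 in htk/bin/HHEd\n"
)

for (code, function), n in count_warnings(sample).items():
    print(f"{function} [{code}]: {n}")
```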

***** (3) *****
This is the second warning:
===================================================================================================
Start increasing the number of mixture components (1mix -> 2mix) at Sat Sep 22 08:51:26 CEST 2007
===================================================================================================

HHEd -A -B -C configs/trn.cnf -D -T 1 -p -i -H models/qst001/ver1/cmp/re_clustered.mmf -w models/qst001/ver1/cmp/re_clustered.mmf.2mix edfiles/qst001/ver1/cmp/upm.hed data/lists/full.list

 ../..

// increase the number of mixtures per stream
MU +1
 Mixup by 1 components per stream
 MU: Number of mixes increased from 434 to 868

MU +1 {}
 Mixup by 1 components per stream

WARNING [-2637] HeaviestMix: mix 1 in L^s-p+ax=rr@1_3/A:1_1_4/B:0-1-3@1-1&4-4#2-1$2-2!1-0;1-1|ax/C:0+1+2/D:content_1/E:in+1@3+3&3+2#1+1/F:content_1/G:1_1/H:7=5@4=3|NONE/I:8=4/J:47+27-6 has v.small gConst [-326.653839] in HHEd
WARNING [-2637] HeaviestMix: mix 1 in e1^n-p+i1=t@1_3/A:1_1_3/B:1-1-3@1-1&9-1#5-1$5-1!1-0;1-0|i1/C:0+0+2/D:content_2/E:content+1@5+1&5+0#1+0/F:content_4/G:9_3/H:9=5@2=3|NONE/I:9=2/J:66+33-4 has v.small gConst [-326.653839] in HHEd
 MU: Number of mixes increased from 259 to 518

MU +1 {}
 Mixup by 1 components per stream
 MU: Number of mixes increased from 175 to 350
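For reference, a minimal sketch of what a very negative gConst implies, assuming the HTK Book's definition of gConst for a diagonal-covariance Gaussian, log((2*pi)^n * |Sigma|). The function names and the 3-dimensional example are hypothetical, and the variances are illustrative values, not taken from these models.

```python
import math


def gconst_diag(variances):
    """gConst for a diagonal covariance: n*log(2*pi) + sum(log(var_i))."""
    n = len(variances)
    return n * math.log(2.0 * math.pi) + sum(math.log(v) for v in variances)


def log_density_at_mean(variances):
    """Gaussian log density evaluated at its mean, i.e. -0.5 * gConst."""
    return -0.5 * gconst_diag(variances)


# Hypothetical 3-dimensional stream with progressively smaller variances.
for var in (1.0, 1e-2, 1e-4):
    variances = [var] * 3
    print(var, gconst_diag(variances), log_density_at_mean(variances))
```

Under this definition, gConst becomes more negative as the variances shrink, and the log density at the mean becomes positive, so the sign of a log probability depends on the scale of the features.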

My first guess was that the clustering does not work (maybe incorrect questions, etc.) and all the units are mapped to pause, but this seems not to be the case: I have looked at the states selected for each unit (in hts_engine generation), and the (clustered) state is different for each unit.



Thank you in advance for any suggestion.
Antonio
