
[hts-users:00800] catalan hts voices: only silence is generated (sorry ... complete email now)



(sorry for previous incomplete email)

Dear all,
I am trying to build Catalan voices using HTS.
I have downloaded the HTS demo: in English, as it is, it works perfectly.

In Catalan, the log file is very similar to the English one. The output
wav files have the appropriate duration, but the amplitude does not represent speech: it is almost zero, with a small pulse in some files.
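
One way to put a number on "almost zero" would be to measure the peak and RMS level of each generated file. The following is only a minimal sketch, assuming the output is 16-bit PCM wav; the function name wav_levels is hypothetical and not part of the HTS demo.

```python
import math
import struct
import wave

def wav_levels(path):
    """Return (peak, rms) of a 16-bit PCM wav file, normalised to [0, 1].

    Hypothetical helper: assumes 16-bit little-endian PCM samples.
    All channels are pooled together, which is enough for a silence check.
    """
    with wave.open(path, "rb") as w:
        if w.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = w.readframes(w.getnframes())
    n = len(raw) // 2
    if n == 0:
        return 0.0, 0.0
    samples = struct.unpack("<%dh" % n, raw[:2 * n])
    peak = max(abs(s) for s in samples) / 32768.0
    rms = math.sqrt(sum(s * s for s in samples) / n) / 32768.0
    return peak, rms
```

Running this over the English and the Catalan output directories and comparing the RMS values would show whether every Catalan file is silent or only some of them.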


Do you have any suggestion?

The whole log file can be found in
http://gps-tsc.upc.es/veu/festcat/ext/hts_ona.log


Compared with the English demo, I think the Catalan utterances are longer (each file is a short paragraph that takes twice as long as the Arctic ones). We use a different phone set and different questions.

As I said, the log files look similar. However, I have seen
three things that may be related to the problem.

***** (1) *****
In some cases, the log probabilities have a different sign.
For instance, in the first model of HInit, for the original demo I get:


811 Observation Sequences Loaded
Starting Estimation Process
Iteration 1: Average LogP =  1482.33728
Iteration 2: Average LogP =  1522.82434  Change =    40.48701
Iteration 3: Average LogP =  1526.38599  Change =     3.56160
Iteration 4: Average LogP =  1527.31726  Change =     0.93121
...

And for the Catalan run:
1856 Observation Sequences Loaded
Starting Estimation Process
Iteration 1: Average LogP = -1101.82544
Iteration 2: Average LogP =  -492.18414  Change =   609.64130
Iteration 3: Average LogP =  -467.79330  Change =    24.39084
Iteration 4: Average LogP =  -459.21719  Change =     8.57610

Is this normal?
At the end of the whole process, both log files show similar LogP values.
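
To compare the two runs iteration by iteration, the Average LogP values can be extracted from both log files. This is a minimal sketch, assuming the line format shown in the excerpts above; the function name average_logps is hypothetical.

```python
import re

# Matches lines such as "Iteration 2: Average LogP =  -492.18414  Change = ..."
_LOGP_RE = re.compile(r"Iteration\s+(\d+):\s+Average LogP\s*=\s*(-?\d+(?:\.\d+)?)")

def average_logps(log_text):
    """Return a list of (iteration, average_logp) pairs found in log_text."""
    return [(int(m.group(1)), float(m.group(2)))
            for m in _LOGP_RE.finditer(log_text)]
```

Applied to the text of each log, this gives two lists that can be compared or plotted side by side.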

Another issue: two warnings appear in the Catalan log:

***** (2) *****
First warning:

============================================================================
Start tree-based context clustering (dur) at Sat Sep 22 08:43:18 CEST 2007
============================================================================
HHEd -A -B -C configs/trn.cnf -D -T 1 -p -i -H models/qst001/ver1/dur/re_clustered.mmf -m -a 1.0 -w models/qst001/ver1/dur/re_clustered.mmf edfiles/qst001/ver1/dur/cxc_dur.hed.untied data/lists/full.list
../..
TR 3
 Adjusting trace level

// construct decision trees
TB 0.00 dur_s2_  {*.state[2].stream[1-1]}
 Tree based clustering based on MDL criterion, threshold=5.303107e+01
WARNING [-2638] TypicalState: No typical state for dur_s2_633 in htk/bin/HHEd
WARNING [-2638] TypicalState: No typical state for dur_s2_631 in htk/bin/HHEd
WARNING [-2638] TypicalState: No typical state for dur_s2_628 in htk/bin/HHEd

.... (same warning repeated ~100 times)

***** (3) *****
This is the second warning:
===================================================================================================
Start increasing the number of mixture components (1mix -> 2mix) at Sat Sep 22 08:51:26 CEST 2007
===================================================================================================

HHEd -A -B -C configs/trn.cnf -D -T 1 -p -i -H models/qst001/ver1/cmp/re_clustered.mmf -w models/qst001/ver1/cmp/re_clustered.mmf.2mix edfiles/qst001/ver1/cmp/upm.hed data/lists/full.list
 ../..

// increase the number of mixtures per stream
MU +1
 Mixup by 1 components per stream
 MU: Number of mixes increased from 434 to 868

MU +1 {}
 Mixup by 1 components per stream
WARNING [-2637] HeaviestMix: mix 1 in L^s-p+ax=rr@1_3/A:1_1_4/B:0-1-3@1-1&4-4#2-1$2-2!1-0;1-1|ax/C:0+1+2/D:content_1/E:in+1@3+3&3+2#1+1/F:content_1/G:1_1/\
H:7=5@4=3|NONE/I:8=4/J:47+27-6 has v.small gConst [-326.653839] in HHEd
WARNING [-2637] HeaviestMix: mix 1 in e1^n-p+i1=t@1_3/A:1_1_3/B:1-1-3@1-1&9-1#5-1$5-1!1-0;1-0|i1/C:0+0+2/D:content_2/E:content+1@5+1&5+0#1+0/F:content_4/G\
:9_3/H:9=5@2=3|NONE/I:9=2/J:66+33-4 has v.small gConst [-326.653839] in HHEd
 MU: Number of mixes increased from 259 to 518

MU +1 {}
 Mixup by 1 components per stream
 MU: Number of mixes increased from 175 to 350
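
For reference, and assuming the HTK Book definition gConst = log((2*pi)^n * |Sigma|), the value for a diagonal-covariance Gaussian is n*log(2*pi) plus the sum of the log variances, so a strongly negative gConst corresponds to very small variances. The following is a minimal sketch under that assumption; the function name gconst_diag is hypothetical and it is not taken from the HTK sources.

```python
import math

def gconst_diag(variances):
    """gConst for a diagonal-covariance Gaussian (assumed HTK definition):
    n * log(2*pi) + sum(log(variance_i)).
    """
    n = len(variances)
    return n * math.log(2.0 * math.pi) + sum(math.log(v) for v in variances)
```

With unit variances the result is positive and grows with the dimension; it only becomes strongly negative when many variances are far below 1.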



My first guess was that the clustering does not work (maybe incorrect questions, etc.) and all the units are mapped to pause, but this seems not to be the case: I have looked at the states selected for each unit (in hts_engine generation), and the (clustered) state is different for each unit.


Thank you in advance for any suggestion.
Antonio

Follow-Ups
[hts-users:00804] Re: catalan hts voices: only silence is generated (sorry ... complete email now), Nickolay V. Shmyrev