[hts-users:04536] HTS adaptation failed in HTS-demo_CMU-ARCTIC-ADAPT, Error +7231


Hi all!

I am trying to use HTS-demo_CMU-ARCTIC-ADAPT to run HMM adaptation and synthesis. I replaced the cmp files, labels and question set with my own.
There are 4 training speakers and one target speaker to adapt to. The training files are named "speaker_*.lab".
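
For readers checking the file naming: the demo's $spkrPat setting (listed in the configuration further down) is an HTK-style mask used to derive a speaker name from each file path. The Perl sketch below illustrates how such a mask is usually described as working, assuming that `%` matches one character that becomes part of the speaker name, `?` matches one character, and `*` matches any run of characters. The helper names mask_to_regex and speaker_from_path and the example paths are hypothetical, and this is not HTK's actual matching code.

```perl
use strict;
use warnings;

# Translate an HTK-style mask into a Perl regex.
# Assumed semantics: '%' = one character, captured as part of the speaker
# name; '?' = one character; '*' = zero or more characters.
sub mask_to_regex {
    my ($mask) = @_;
    my $re = '';
    for my $ch (split //, $mask) {
        if    ($ch eq '%') { $re .= '(.)'; }
        elsif ($ch eq '?') { $re .= '.';   }
        elsif ($ch eq '*') { $re .= '.*';  }
        else               { $re .= quotemeta($ch); }
    }
    return qr/^$re$/;
}

# Return the speaker name extracted by the mask, or undef when the mask
# contains no '%' or does not match the path.
sub speaker_from_path {
    my ($mask, $path) = @_;
    return undef unless $mask =~ /%/;
    my @chars = ($path =~ mask_to_regex($mask));
    return @chars ? join('', @chars) : undef;
}

# Hypothetical paths, only to exercise the mask from the configuration.
my $mask = '*/%%%_*';
for my $path ('data/raw/slt_arctic_a0001.lab', 'data/raw/speaker_a0001.lab') {
    my $spk = speaker_from_path($mask, $path);
    print "$path -> ", (defined $spk ? $spk : '(no match)'), "\n";
}
```

Under these assumptions, a mask with three `%` characters only matches when exactly three characters sit between the last `/` and the `_`, so it is worth confirming that the speaker prefix of each file name has the length the mask expects. Whether this is related to error +7231 is not established here.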

However, I get the error "+7231 Attempt to form list from different types h and m" when building regression-class trees for adaptation:

...
Splitting Node 157, score 3.206303e-01
(Stream=4, vSize=1)
Splitting Node 156, score 3.027475e-01
(Stream=4, vSize=1)
Splitting Node 147, score 2.953010e-01
(Stream=4, vSize=1)
Splitting Node 138, score 2.912575e-01
(Stream=4, vSize=1)
Splitting Node 151, score 2.727024e-01
(Stream=4, vSize=1)
No more nodes to split..
stream 1: #terminals=1
stream 2: #terminals=32
stream 3: #terminals=32
stream 4: #terminals=32
  ERROR [+7231]  ChkType: Attempt to form list from different types h and m
 FATAL ERROR - Terminating program /home/disk1/tools/HTS-2.3_for_HTK-3.4.1/bin/HHEd
Error in /home/disk1/tools/HTS-2.3_for_HTK-3.4.1/bin/HHEd      -A    -C /home/disk2/usar/hts_workspace/hts-adaptation/configs/qst001/ver1/trn.cnf -D -T 1 -p -i  -C /home/disk2/usar/hts_workspace/hts-adaptation/configs/qst001/ver1/dec_cmp.cnf -H /home/disk2/usar/hts_workspace/hts-adaptation/models/qst001/ver1/cmp/re_clustered.mmf -M /home/disk2/usar/hts_workspace/hts-adaptation/models/qst001/ver1/cmp/regTrees /home/disk2/usar/hts_workspace/hts-adaptation/edfiles/qst001/ver1/cmp/reg.hed /home/disk2/usar/hts_workspace/hts-adaptation/data/lists/full.list



It looks like the regression tree is built successfully, but HHEd crashes in the final phase.
Some of my settings are listed below; all other settings are the same as in the demo:

# Settings ==============================
$usestraight = '0';

@SET        = ('cmp','dur');
if(!$usestraight){
    @cmp        = ('mgc','lf0');
}



%strb = ('mgc' => '1',     # stream start
         'lf0' => '2',
         'bap' => '5',
         'dur' => '1');

%stre = ('mgc' => '1',     # stream end
         'lf0' => '4',
         'bap' => '5',
         'dur' => '7');

%ordr = ('mgc' => '41',     # feature order
         'lf0' => '1',
         'bap' => '25',
         'dur' => '7');


# Speech Analysis/Synthesis Setting ==============
# speech analysis
$sr = 16000;   # sampling rate (Hz)
$fs = 80; # frame period (point)
$fw = 0.0;   # frequency warping
$gm = 1;      # pole/zero representation weight
$lg = 1;     # use log gain instead of linear gain



# Speaker adaptation Setting ============
$spkrPat = "\"*/%%%_*\"";       # speaker name pattern

# regression classes
%dect = ('mgc' => '500.0',    # occupancy thresholds for regression classes (dec)
         'lf0' => '100.0',    # set thresholds in less than adpt and satt
         'bap' => '100.0',
         'dur' => '100.0');

$nClass  = 32;                            # number of regression classes (reg)

# transforms
%nblk = ('mgc' => '3',       # number of blocks for transforms
         'lf0' => '1',
         'bap' => '3',
         'dur' => '1');

%band = ('mgc' => '41',       # band width for transforms
         'lf0' => '1',
         'bap' => '25',
         'dur' => '0');

$bias{'cmp'} = 'TRUE';               # use bias term for MLLRMEAN/CMLLR
$bias{'dur'} = 'TRUE';
$tran        = 'feat';             # transformation kind (mean -> MLLRMEAN, cov -> MLLRCOV, or feat -> CMLLR)

# adaptation
%adpt = ('mgc' => '500.0',       # occupancy thresholds for adaptation
         'lf0' => '100.0',
         'bap' => '100.0',
         'dur' => '100.0');

$tknd{'adp'}   = 'dec';            # tree kind (dec -> decision tree or reg -> regression tree (k-means))
$dcov          = 'FALSE';             # use diagonal covariance transform for MLLRMEAN
$usemaplr      = 'TRUE';            # use MAPLR adaptation for MLLRMEAN/CMLLR
$usevblr       = 'FALSE';             # use VBLR adaptation for MLLRMEAN
$sprior        = 'TRUE';  # use structural prior for MAPLR/VBLR with regression class tree
$priorscale    = 1.0;            # hyper-parameter for SMAPLR adaptation
$nAdapt        = 3;              # number of iterations to reestimate adaptation xforms
$addMAP        = 1;                # apply additional MAP estimation after MLLR adaptation
$maptau{'cmp'} = 50.0;             # hyper-parameters for MAP adaptation
$maptau{'dur'} = 50.0;

# speaker adaptive training
%satt = ('mgc' => '10000.0',    # occupancy thresholds for adaptive training
         'lf0' => '2000.0',
         'bap' => '2000.0',
         'dur' => '5000.0');

$tknd{'sat'} = 'dec';           # tree kind (dec -> decision tree or reg -> regression tree (k-means))
$nSAT        = 3;                  # number of SAT iterations


# Modeling/Generation Setting ==============
# modeling
$nState      = 7;        # number of states



# Switch ================================
$MKEMV = 1; # preparing environments
$HCMPV = 1; # computing a global variance
$IN_RE = 1; # initialization & reestimation
$MMMMF = 1; # making a monophone mmf
$ERST0 = 1; # embedded reestimation (monophone)
$MN2FL = 1; # copying monophone mmf to fullcontext one
$ERST1 = 1; # embedded reestimation (fullcontext)
$CXCL1 = 1; # tree-based context clustering
$ERST2 = 1; # embedded reestimation (clustered)
$UNTIE = 1; # untying the parameter sharing structure
$ERST3 = 1; # embedded reestimation (untied)
$CXCL2 = 1; # tree-based context clustering
$ERST4 = 1; # embedded reestimation (re-clustered)
$FALGN = 0; # forced alignment for no-silent GV
$MCDGV = 0; # making global variance
$MKUNG = 0; # making unseen models (GV)
$MKUN1 = 1; # making unseen models (speaker independent)
$PGEN1 = 1; # generating speech parameter sequences (speaker independent)
$WGEN1 = 1; # synthesizing waveforms (speaker independent)
$REGTR = 1; # building regression-class trees for adaptation  <-- ERROR OCCURRED HERE
$ADPT1 = 1; # speaker adaptation (speaker independent)
$PGEN2 = 1; # generating speech parameter sequences (speaker adapted)
$WGEN2 = 1; # synthesizing waveforms (speaker adapted)
$SPKAT = 1; # speaker adaptive training (SAT)
$MKUN2 = 1; # making unseen models (SAT)
$PGEN3 = 1; # generating speech parameter sequences (SAT)
$WGEN3 = 1; # synthesizing waveforms (SAT)
$ADPT2 = 1; # speaker adaptation (SAT)
$PGEN4 = 1; # generate speech parameter sequences (SAT+adaptation)
$WGEN4 = 1; # synthesizing waveforms (SAT+adaptation)
$CONVM = 1; # converting mmfs to the hts_engine file format
$ENGIN = 1; # synthesizing waveforms using hts_engine
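
As a quick consistency check on the stream settings above, the following Perl sketch maps each stream index to the feature type that owns it, using the %strb and %stre values from this message. The helper name stream_owners is hypothetical and is not part of the demo scripts.

```perl
use strict;
use warnings;

# Values copied from the configuration in this message.
my @cmp  = ('mgc', 'lf0');                        # $usestraight = 0, so 'bap' is unused
my %strb = ('mgc' => 1, 'lf0' => 2, 'bap' => 5);  # stream start
my %stre = ('mgc' => 1, 'lf0' => 4, 'bap' => 5);  # stream end

# Map every stream index to the feature type that owns it,
# and stop if two feature types claim the same stream.
sub stream_owners {
    my ($types, $start, $end) = @_;
    my %owner;
    for my $type (@$types) {
        for my $s ($start->{$type} .. $end->{$type}) {
            die "stream $s claimed by $owner{$s} and $type\n" if exists $owner{$s};
            $owner{$s} = $type;
        }
    }
    return %owner;
}

my %owner = stream_owners(\@cmp, \%strb, \%stre);
for my $s (sort { $a <=> $b } keys %owner) {
    print "stream $s: $owner{$s}\n";
}
```

With these values the sketch reports four streams: stream 1 for mgc and streams 2 to 4 for lf0. That agrees with the four streams in the HHEd log, including the "Stream=4, vSize=1" lines and the lf0 order of 1, so the stream layout itself looks consistent with the log.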


Can someone help me with this problem?  I don't know what the cause is. 


Best regards