*Subject*: [hts-users:03187] Re: Problem of building regression tree (average voice model)
*From*: Keiichiro Oura <uratec@xxxxxxxxxxxxxxx>
*Date*: Tue, 28 Feb 2012 10:01:17 +0900
*Cc*: uratec <uratec@xxxxxxxxxxxx>

Hi,

Please check stream[3] in your data before beginning the training.
For example, the HList command can be used.

Regards,
Keiichiro Oura

2012/2/28 li jay <lij.acd@xxxxxxxxx>:
> Hi,
>
> Thank you for your help. I conducted some experiments. I have speech
> data from 20 speakers, so I built 20 models, each using the sentences of
> only one speaker (SD models). I found that the re_clustered.mmf files of
> some speakers' models were abnormal. They looked like this:
>
> 745 <STREAM> 3
> 746 <NUMMIXES> 2
> 747 <MIXTURE> 1 9.436758e-01
> 748 <MEAN> 1
> 749 -9.250264e-03
> 750 <VARIANCE> 1
> 751 1.051578e+12
>
> 784 <STREAM> 3
> 785 <NUMMIXES> 2
> 786 <MIXTURE> 1 3.212932e-01
> 787 <MEAN> 1
> 788 -1.123511e-02
> 789 <VARIANCE> 1
> 790 1.051578e+12
>
> 979 <STREAM> 3
> 980 <NUMMIXES> 2
> 981 <MIXTURE> 1 8.911794e-01
> 982 <MEAN> 1
> 983 6.913134e-03
> 984 <VARIANCE> 1
> 985 1.051578e+12
>
> The variance of stream 3 was too big and, strangely, the variances were
> all the same. Could it be a bug in HTS?
> Besides, I tried including these abnormal speakers' speech data in the
> training data. Once I include it, an error occurs when building the
> regression tree, but without it I can build the regression tree
> successfully. I think the abnormal data made the training of the models
> unsuccessful, which caused the regression tree building to fail. But
> there is something strange: I can still generate voice using the models
> even though the variance of stream 3 is too big.
>
> Additionally, I plotted the lf0 of those abnormal speakers, and there
> seemed to be nothing wrong with it; the plots looked like the lf0 plots
> of the normal speakers. I am not sure which extracted lf0 files caused
> the abnormality, because I used the same method and tools to extract
> them as for the normal files. Is it possible that some values are not
> supported by HTS because they are too big or too small?
>
> Regards,
> Jay
>
> 2012/2/24 Heiga ZEN (Byung Ha CHUN) <heigazen@xxxxxxxxxx>
>
>> Hi,
>>
>> It is unlikely that having 20 speakers caused this problem. I expect
>> that some data files are corrupted, i.e., extracting speech features
>> from the waveforms failed. Please check whether your data is OK or not.
>>
>> Regards,
>>
>> Heiga
>>
>> 2012/02/24 9:23 "li jay" <lij.acd@xxxxxxxxx>:
>>
>>> Hi,
>>>
>>> Thank you for telling me this. Compared to the delta f0 of the speaker
>>> dependent model I trained previously, it is too big (the current model
>>> I'm training is an average voice model). I thought it was because I
>>> used speech sentences of 20 speakers, which resulted in a big delta f0.
>>> I used the same configuration and option settings as for speaker
>>> dependent model training and replaced the training data with the data
>>> of 20 speakers. Is it correct to train an average voice model without
>>> modifying any configuration or option?
>>>
>>> Regards,
>>> Jay
>>>
>>> 2012/2/24 Keiichiro Oura <uratec@xxxxxxxxxxxxxxx>
>>>>
>>>> Hi,
>>>>
>>>> It seems that delta lf0 is not trained correctly.
>>>>
>>>> <VARIANCE> 1
>>>> 2.078870e+12
>>>>
>>>> The variance is too big.
>>>> You should check the delta f0 sequence in the training data.
>>>>
>>>> Regards,
>>>> Keiichiro Oura
>>>>
>>>> 2012/2/24 li jay <lij.acd@xxxxxxxxx>:
>>>> > Hi,
>>>> >
>>>> > The following is part of re_clustered.mmf:
>>>> >
>>>> > 170535 ~p "lf0_s2_523-3"
>>>> > 170536 <STREAM> 3
>>>> > 170537 <NUMMIXES> 2
>>>> > 170538 <MIXTURE> 1 1.582146e-01
>>>> > 170539 <MEAN> 1
>>>> > 170540 -9.624681e-03
>>>> > 170541 <VARIANCE> 1
>>>> > 170542 2.078870e+12
>>>> > 170543 <GCONST> 3.020072e+01
>>>> > 170544 <MIXTURE> 2 8.417839e-01
>>>> > 170545 <MEAN> 0
>>>> > 170546 <VARIANCE> 0
>>>> > 170547 <GCONST> 0.000000e+00
>>>> > 170548 ~p "lf0_s2_523-4"
>>>> > 170549 <STREAM> 4
>>>> > 170550 <NUMMIXES> 2
>>>> > 170551 <MIXTURE> 1 1.582144e-01
>>>> > 170552 <MEAN> 1
>>>> > 170553 -1.753086e-03
>>>> > 170554 <VARIANCE> 1
>>>> > 170555 8.353170e-04
>>>> > 170556 <GCONST> -5.249823e+00
>>>> > 170557 <MIXTURE> 2 8.417841e-01
>>>> > ...
>>>> >
>>>> > 260387 ~h "CH_dz`-CH_U+sp/T:x_4_4_x_4/WS:1_6_6/CS:2_7_8/CW:2_1_2/PS:5_19_23/PW:5_1_5/PC:2_1_2"
>>>> > 260388 <BEGINHMM>
>>>> > 260389 <NUMSTATES> 7
>>>> > 260390 <STATE> 2
>>>> > 260391 <STREAM> 1
>>>> > 260392 ~p "mgc_s2_21"
>>>> > 260393 <STREAM> 2
>>>> > 260394 ~p "lf0_s2_523-2"
>>>> > 260395 <STREAM> 3
>>>> > 260396 ~p "lf0_s2_523-3"
>>>> > 260397 <STREAM> 4
>>>> > 260398 ~p "lf0_s2_523-4"
>>>> > 260399 <STATE> 3
>>>> > 260400 <STREAM> 1
>>>> > ...
>>>> >
>>>> > There seems to be no error in stream[3]. What could affect the
>>>> > building process of the regression tree?
>>>> >
>>>> > Regards,
>>>> > Jay
>>>> >
>>>> > 2012/2/24 Keiichiro Oura <uratec@xxxxxxxxxxxxxxx>
>>>> >>
>>>> >> Hi,
>>>> >>
>>>> >> The distributions are in .../cmp/re_clustered.mmf
>>>> >>
>>>> >> Regards,
>>>> >> Keiichiro Oura
>>>> >>
>>>> >> 2012/2/24 li jay <lij.acd@xxxxxxxxx>:
>>>> >> > Hi,
>>>> >> >
>>>> >> > The following is part of the log file when I tried to build the
>>>> >> > regression tree:
>>>> >> >
>>>> >> > Splitting Node 32763, score 9.997541e+09
>>>> >> > (Stream=3, vSize=1)
>>>> >> > Splitting Node 32765, score 9.997541e+09
>>>> >> > (Stream=3, vSize=1)
>>>> >> > Splitting Node 32767, score 9.997541e+09
>>>> >> > (Stream=3, vSize=1)
>>>> >> > Splitting Node -32767, score 9.997541e+09
>>>> >> > (Stream=3, vSize=1)
>>>> >> > Splitting Node -32765, score 9.997541e+09
>>>> >> > (Stream=3, vSize=1)
>>>> >> > Splitting Node -32763, score 9.997541e+09
>>>> >> >
>>>> >> > The index seems to have gone negative because an overflow
>>>> >> > occurred.
>>>> >> > Could you tell me in which file I can check the distribution of
>>>> >> > stream[3]?
>>>> >> > Thank you for your help.
>>>> >> >
>>>> >> > Regards,
>>>> >> > Jay
>>>> >> >
>>>> >> > 2012/2/23 Keiichiro Oura <uratec@xxxxxxxxxxxxxxx>
>>>> >> >>
>>>> >> >> Hi,
>>>> >> >>
>>>> >> >> This value is the node index in "reg.tree".
>>>> >> >>
>>>> >> >> printf("Splitting Node %d, score %e\n", r->nodeIndex, score);
>>>> >> >>
>>>> >> >> The index is always a positive value.
>>>> >> >> I don't know why the split is in a loop...
>>>> >> >> Could you check the distribution of stream[3]?
>>>> >> >>
>>>> >> >> Regards,
>>>> >> >> Keiichiro Oura
>>>> >> >>
>>>> >> >> 2012/2/23 li jay <lij.acd@xxxxxxxxx>:
>>>> >> >> > Thank you for your reply.
>>>> >> >> >
>>>> >> >> > I've tried HTS-2.2, replacing the HTS commands with the ones of
>>>> >> >> > HTS-2.2. The result was exactly the same as the one of HTS-2.1.1.
>>>> >> >> > The score stayed the same, and the node split endlessly.
>>>> >> >> >
>>>> >> >> > What do you mean by 'Splitting Node' is negative value? Do you
>>>> >> >> > mean the score?
>>>> >> >> >
>>>> >> >> > Regards,
>>>> >> >> > Jay
>>>> >> >> >
>>>> >> >> > 2012/2/22 Keiichiro Oura <uratec@xxxxxxxxxxxxxxx>
>>>> >> >> >>
>>>> >> >> >> Hi,
>>>> >> >> >>
>>>> >> >> >> Could you try HTS-2.2?
>>>> >> >> >> I don't know why 'Splitting Node' is negative value.
>>>> >> >> >>
>>>> >> >> >> Regards,
>>>> >> >> >> Keiichiro Oura
>>>> >> >> >>
>>>> >> >> >> 2012/2/22 li jay <lij.acd@xxxxxxxxx>:
>>>> >> >> >> > Hi,
>>>> >> >> >> >
>>>> >> >> >> > I've been trying to build a regression tree for speaker
>>>> >> >> >> > adaptation. I am using HTS 2.1.1. I've trained an average
>>>> >> >> >> > voice model from 4000 sentences (about 2.5 hrs) of 20
>>>> >> >> >> > speakers. Generating voice using the average voice model
>>>> >> >> >> > was successful.
>>>> >> >> >> > I wanted to apply speaker adaptation to this average voice
>>>> >> >> >> > model, so I tried to build a regression tree with the
>>>> >> >> >> > command below:
>>>> >> >> >> >
>>>> >> >> >> > /usr/local/HTS-2.1.1/bin/HHEd -A -B \
>>>> >> >> >> >   -C /home/jay/TTS/try/AST_female_20_speakers_2/configs/trn.cnf \
>>>> >> >> >> >   -D -T 1 -p -i \
>>>> >> >> >> >   -H /home/jay/TTS/try/AST_female_20_speakers_2/models/cmp/re_clustered.mmf \
>>>> >> >> >> >   -M /home/jay/TTS/try/AST_female_20_speakers_2/models/cmp/regTrees \
>>>> >> >> >> >   /home/jay/TTS/try/AST_female_20_speakers_2/edfiles/cmp/reg.hed \
>>>> >> >> >> >   /home/jay/TTS/try/AST_female_20_speakers_2/data/lists/full.list
>>>> >> >> >> >
>>>> >> >> >> > The problem was that the splitting of nodes did not finish.
>>>> >> >> >> > It seemed to be in a loop, and the score stayed the same,
>>>> >> >> >> > so the HHEd command never stopped.
>>>> >> >> >> > The log file shows the following:
>>>> >> >> >> >
>>>> >> >> >> > HTK Configuration Parameters[10]
>>>> >> >> >> > Module/Tool Parameter Value
>>>> >> >> >> > # MINDUR 5
>>>> >> >> >> > # MAXSTDDEVCOEF 10
>>>> >> >> >> > # APPLYDURVARFLOOR TRUE
>>>> >> >> >> > # DURVARFLOORPERCENTILE 1.000000
>>>> >> >> >> > # SHRINKOCCTHRESH Vector 4 500.0 100.0 100.0 100.0
>>>> >> >> >> > # VFLOORSCALESTR Vector 4 0.01 0.01 0.01 0.01
>>>> >> >> >> > # MINLEAFOCC 0
>>>> >> >> >> > # NATURALWRITEORDER TRUE
>>>> >> >> >> > # NATURALREADORDER TRUE
>>>> >> >> >> > # APPLYVFLOOR TRUE
>>>> >> >> >> >
>>>> >> >> >> > // construct regression class tree
>>>> >> >> >> > RC 32 reg
>>>> >> >> >> > Building regression tree with 32 terminals (4 streams)
>>>> >> >> >> > Creating regression class tree with ident reg.tree and baseclass reg.base
>>>> >> >> >> > Splitting Node 1, score 1.000000e+10
>>>> >> >> >> > (Stream splitting)
>>>> >> >> >> > Splitting Node 3, score 1.000000e+10
>>>> >> >> >> > (Stream splitting)
>>>> >> >> >> > Splitting Node 5, score 1.000000e+10
>>>> >> >> >> > (Stream splitting)
>>>> >> >> >> > Splitting Node 7, score 1.000000e+10
>>>> >> >> >> > (MSD splitting)
>>>> >> >> >> > Splitting Node 6, score 1.000000e+10
>>>> >> >> >> > (MSD splitting)
>>>> >> >> >> > Splitting Node 10, score 8.998759e+10
>>>> >> >> >> > (Stream=3, vSize=1)
>>>> >> >> >> > Splitting Node 13, score 2.999760e+10
>>>> >> >> >> > (Stream=3, vSize=1)
>>>> >> >> >> > Splitting Node 4, score 1.000000e+10
>>>> >> >> >> > (MSD splitting)
>>>> >> >> >> > Splitting Node 15, score 9.997541e+09
>>>> >> >> >> > (Stream=3, vSize=1)
>>>> >> >> >> > Splitting Node 19, score 9.997541e+09
>>>> >> >> >> > (Stream=3, vSize=1)
>>>> >> >> >> > Splitting Node 21, score 9.997541e+09
>>>> >> >> >> > (Stream=3, vSize=1)
>>>> >> >> >> > ...
>>>> >> >> >> > ...
>>>> >> >> >> > ...
>>>> >> >> >> > Splitting Node -16495, score 9.997541e+09
>>>> >> >> >> > (Stream=3, vSize=1)
>>>> >> >> >> > Splitting Node -16493, score 9.997541e+09
>>>> >> >> >> > (Stream=3, vSize=1)
>>>> >> >> >> > Splitting Node -16491, score 9.997541e+09
>>>> >> >> >> > (Stream=3, vSize=1)
>>>> >> >> >> >
>>>> >> >> >> > Could you help me with this problem? My questions are:
>>>> >> >> >> > 1: What could be the reason for this endless node
>>>> >> >> >> > splitting?
>>>> >> >> >> > 2: Could it be a problem with the average voice modeling?
>>>> >> >> >> > Is there any option to enable average voice modeling? I
>>>> >> >> >> > trained the average voice model just like a speaker
>>>> >> >> >> > dependent model, with the same scripts, except that the
>>>> >> >> >> > training data came from different people.
>>>> >> >> >> >
>>>> >> >> >> > Thank you.
>>>> >> >> >> >
>>>> >> >> >> > Regards,
>>>> >> >> >> > Jay
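
As a rough illustration of the check suggested at the top of this message (looking at the stream[3] values in the training data before training), here is a minimal Python sketch. It assumes the per-frame values of one stream have already been dumped into a plain list of floats, for example from HList output. The unvoiced marker `-1.0e10`, the `limit` bound and the name `summarize_stream` are assumptions made for this illustration and are not confirmed anywhere in this thread.

```python
import math

# Assumed marker for unvoiced frames; not confirmed in this thread.
UNVOICED = -1.0e10


def summarize_stream(values, limit=10.0, unvoiced=UNVOICED):
    """Return basic statistics for one stream of one utterance.

    values: per-frame values of the stream (e.g. delta lf0), as floats.
    limit:  hypothetical sanity bound on the magnitude of voiced values.
    """
    # Frames equal to the marker are treated as unvoiced and skipped.
    voiced = [v for v in values if v != unvoiced]
    # Flag NaN/inf and values whose magnitude exceeds the bound.
    suspicious = [v for v in voiced
                  if not math.isfinite(v) or abs(v) > limit]
    finite = [v for v in voiced if math.isfinite(v)]
    n = len(finite)
    mean = sum(finite) / n if n else 0.0
    var = sum((v - mean) ** 2 for v in finite) / n if n else 0.0
    return {
        "frames": len(values),
        "voiced": len(voiced),
        "mean": mean,
        "variance": var,
        "suspicious": suspicious,
    }


if __name__ == "__main__":
    # Made-up values: a plausible delta track plus one corrupted frame.
    frames = [0.01, -0.02, UNVOICED, 0.00, 1.0e6, -0.01]
    report = summarize_stream(frames)
    print("voiced frames:", report["voiced"])
    print("suspicious values:", report["suspicious"])
    print("variance: %e" % report["variance"])
```

Running something like this per file and per speaker would point to the utterances whose values inflate the variance, which plotting lf0 alone may not reveal when only a few frames are affected.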

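
The jump from node 32767 to node -32767 in the quoted log is what a 16-bit signed counter would show when it is increased by 2 past its maximum. The sketch below only reproduces that arithmetic in Python. That HHEd keeps the node index in a 16-bit signed integer is an assumption inferred from the log, not something confirmed in this thread, and `wrap_int16` and `node_indices` are hypothetical helper names.

```python
def wrap_int16(n):
    """Interpret an integer as a 16-bit two's-complement signed value."""
    return (n + 0x8000) % 0x10000 - 0x8000


def node_indices(start, count, step=2):
    """Indices a 16-bit signed counter would show, starting at `start`."""
    return [wrap_int16(start + i * step) for i in range(count)]


if __name__ == "__main__":
    # Same starting point as the quoted log excerpt.
    print(node_indices(32763, 6))
    # [32763, 32765, 32767, -32767, -32765, -32763]
```

If the assumption holds, the negative indices are a symptom of the endless splitting rather than its cause: the tree has simply grown past 32767 nodes.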
