
[hts-users:04576] Re: Re: HTS Demo with Arabic langauge


Hello Quang Bui Tan


The HTS demo needs gen labels for training. I tried to generate labels with the Festival and Festvox tools, but the resulting label file looked like this:


#

0.07 125 pau

0.145 125 pau

0.215 125 f

0.245 125 ao

0.26 125 r

0.285 125 dh

0.3 125 ax

0.315 125 t

0.33 125 w

0.345 125 eh

0.36 125 n

0.63 125 t

0.645 125 iy

0.79 125 ax

0.805 125 th

1.065 125 t

1.08 125 ay

1.155 125 m

1.17 125 dh

1.275 125 ae

1.325 125 t

1.46 125 iy

1.5 125 v

1.565 125 n

1.635 125 ih

1.755 125 ng

1.765 125 pau

1.78 125 dh

1.825 125 ax

2.035 125 t

2.05 125 uw

2.145 125 m

2.255 125 eh

2.345 125 n

2.49 125 sh

2.505 125 uh

2.62 125 k

2.635 125 hh

2.895 125 ae

2.995 125 n

3.01 125 d

3.135 125 z

3.195 125 pau


This looks like a mono label, not like the gen labels in the HTS demo:

x^x-pau+ae=l@x_x/A:0_0_0/B:x-x-x@x-x&x-x#x-x$x-x!x-x;x-x|x/C:1+1+2/D:0_0/E:x+x@x+x&x+x#x+x/F:content_2/G:0_0/H:x=x^1=10|0/I:19=12/J:79+57-10
x^pau-ae+l=ax@1_2/A:0_0_0/B:1-1-2@1-2&1-19#1-10$1-5!0-2;0-8|ae/C:0+0+2/D:0_0/E:content+2@1+12&1+6#0+2/F:aux_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
pau^ae-l+ax=s@2_1/A:0_0_0/B:1-1-2@1-2&1-19#1-10$1-5!0-2;0-8|ae/C:0+0+2/D:0_0/E:content+2@1+12&1+6#0+2/F:aux_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
ae^l-ax+s=w@1_2/A:1_1_2/B:0-0-2@2-1&2-18#1-10$1-5!1-1;1-7|ax/C:1+0+3/D:0_0/E:content+2@1+12&1+6#0+2/F:aux_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
l^ax-s+w=aa@2_1/A:1_1_2/B:0-0-2@2-1&2-18#1-10$1-5!1-1;1-7|ax/C:1+0+3/D:0_0/E:content+2@1+12&1+6#0+2/F:aux_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
ax^s-w+aa=z@1_3/A:0_0_2/B:1-0-3@1-1&3-17#1-9$1-5!2-2;2-6|aa/C:0+0+3/D:content_2/E:aux+1@2+11&2+6#1+1/F:content_3/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
s^w-aa+z=b@2_2/A:0_0_2/B:1-0-3@1-1&3-17#1-9$1-5!2-2;2-6|aa/C:0+0+3/D:content_2/E:aux+1@2+11&2+6#1+1/F:content_3/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
w^aa-z+b=ih@3_1/A:0_0_2/B:1-0-3@1-1&3-17#1-9$1-5!2-2;2-6|aa/C:0+0+3/D:content_2/E:aux+1@2+11&2+6#1+1/F:content_3/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
aa^z-b+ih=g@1_3/A:1_0_3/B:0-0-3@1-3&4-16#2-9$1-5!1-1;3-5|ih/C:1+0+2/D:aux_1/E:content+3@3+10&2+5#2+2/F:to_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
z^b-ih+g=ih@2_2/A:1_0_3/B:0-0-3@1-3&4-16#2-9$1-5!1-1;3-5|ih/C:1+0+2/D:aux_1/E:content+3@3+10&2+5#2+2/F:to_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
b^ih-g+ih=n@3_1/A:1_0_3/B:0-0-3@1-3&4-16#2-9$1-5!1-1;3-5|ih/C:1+0+2/D:aux_1/E:content+3@3+10&2+5#2+2/F:to_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
ih^g-ih+n=ih@1_2/A:0_0_3/B:1-0-2@2-2&5-15#2-8$1-5!2-3;4-4|ih/C:0+0+2/D:aux_1/E:content+3@3+10&2+5#2+2/F:to_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
g^ih-n+ih=ng@2_1/A:0_0_3/B:1-0-2@2-2&5-15#2-8$1-5!2-3;4-4|ih/C:0+0+2/D:aux_1/E:content+3@3+10&2+5#2+2/F:to_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
ih^n-ih+ng=t@1_2/A:1_0_2/B:0-0-2@3-1&6-14#3-8$1-5!1-2;5-3|ih/C:0+0+2/D:aux_1/E:content+3@3+10&2+5#2+2/F:to_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
n^ih-ng+t=ax@2_1/A:1_0_2/B:0-0-2@3-1&6-14#3-8$1-5!1-2;5-3|ih/C:0+0+2/D:aux_1/E:content+3@3+10&2+5#2+2/F:to_1/G:0_0/H:19=12^1=10|L-H%/I:3=3/J:79+57-10
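To make the difference between the two listings concrete, here is a minimal sketch that pulls the centre phone out of full-context label lines. It assumes only the quinphone prefix visible in the lines above (p1^p2-p3+p4=p5@...); the function name central_phone is made up for illustration and is not part of HTS or Festival.

```shell
#!/bin/sh
# Print the centre phone (p3) of each full-context label line read from stdin.
# Assumes the prefix layout p1^p2-p3+p4=p5@... seen in the HTS-demo labels above.
# central_phone is a hypothetical helper name, not an HTS tool.
central_phone() {
    # First drop any leading "start end " time columns, then keep what lies
    # between the first '-' and the following '+'.
    sed -e 's/^.* //' -e 's/^[^-]*-\([^+]*\)+.*$/\1/'
}

# Example: two of the label lines quoted above, shortened after the A: field.
printf '%s\n' \
    'x^x-pau+ae=l@x_x/A:0_0_0' \
    'ae^l-ax+s=w@1_2/A:1_1_2' | central_phone
```

Applied to a whole gen label file, this gives the plain phone sequence, which is the only information the Festival output listed earlier carries besides the times.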


So, how do I generate the gen labels? Is there a tool or program that creates them?



From: Quang Bui Tan <langmaninternet@xxxxxxxxx>
Sent: 23 Safar 1439 AH, 04:55 PM
To: hts-users@xxxxxxxxxxxxxxx
Subject: [hts-users:04568] Re: HTS Demo with Arabic langauge
 
You do not need utt files if you use my makefile.

2017-11-12 20:54 GMT+07:00 Quang Bui Tan <langmaninternet@xxxxxxxxx>:
You can try my makefile: http://hts.sp.nitech.ac.jp/hts-users/spool/2017/msg00001.html


and set 

# Use preload mono and full  
USEPRELOADMONOFULL = 1

# Use preload question  
USEPRELOADQUESTION = 1

2017-11-12 20:52 GMT+07:00 Quang Bui Tan <langmaninternet@xxxxxxxxx>:
You must set values for these variables:
DATASET=???
SPEAKER=???


For example:

./configure --with-fest-search-path=/home/quang/HTS/festival/examples \
                 --with-sptk-search-path=/home/quang/HTS/Tools/bin \
                 --with-hts-search-path=/home/quang/HTS/Tools/bin \
                 --with-hts-engine-search-path=/home/quang/HTS/Tools/bin \
                 --with-matlab-search-path=/home/quang/HTS/Tools/Matlab/bin \
                 --with-straight-path=/home/quang/HTS/Tools/STRAIGHTV40pcode \
LOWERF0=160 UPPERF0=360 DATASET=QBT SPEAKER=ChungHC QNAME=ChungHC     
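A quick way to check that DATASET and SPEAKER agree with the file names is to count what the demo's globs would match. The sketch below assumes the data Makefile builds its patterns as DIR/DATASET_SPEAKER_*.EXT, which is what the quoted log further down suggests (raw/cmu_us_arctic_slt_*.raw); count_matches is a made-up helper name, and QBT/ChungHC are simply the values from the configure line above.

```shell
#!/bin/sh
# Count the files that a glob of the form DIR/DATASET_SPEAKER_*.EXT matches.
# Arguments: directory, DATASET, SPEAKER, extension.
# count_matches is a hypothetical helper, not part of the HTS demo.
count_matches() {
    dir=$1
    prefix="$2_$3_"
    ext=$4
    n=0
    for f in "$dir/$prefix"*."$ext"; do
        # An unmatched glob stays literal in sh, so test that the file exists.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example: run inside the demo's data/ directory.
for pair in raw:raw labels/mono:lab labels/full:lab; do
    dir=${pair%%:*}
    ext=${pair##*:}
    echo "$dir: $(count_matches "$dir" QBT ChungHC "$ext") file(s)"
done
```

If any directory reports zero files, either the files need renaming to the DATASET_SPEAKER_ prefix or the configure values need to change.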






2017-11-12 20:45 GMT+07:00 Nora Qm <NoraQm@xxxxxxxxxx>:

Hello


I am a beginner in the speech synthesis field. I trained the HTS demo with English, and it ran and worked very well. Now I need to train HTS with the Arabic language. I already have my own data: questions, labels (mono, full), text, and raw audio. When I tried to train, this error occurred:



# Extracting features from raw audio

mkdir -p mgc lf0 bap

SAMPKHZ=`echo 48000    | /usr/local/bin/x2x +af | /usr/local/bin/sopr -m 0.001 | /usr/local/bin/x2x +fa`; \

for raw in raw/cmu_us_arctic_slt_*.raw; do \

base=`basename ${raw} .raw`; \

min=`/usr/local/bin/x2x +sf ${raw} | /usr/local/bin/minmax | /usr/local/bin/x2x +fa | head -n 1`; \

max=`/usr/local/bin/x2x +sf ${raw} | /usr/local/bin/minmax | /usr/local/bin/x2x +fa | tail -n 1`; \

if [ -s ${raw} -a ${min} -gt -32768 -a ${max} -lt 32767 ]; then \

echo "Extracting features from ${raw}"; \

if [ 0 -eq 0 ]; then \

/usr/local/bin/x2x +sf ${raw} | /usr/local/bin/pitch -H 280     -L 110     -p 240  -s ${SAMPKHZ} -o 2 > lf0/${base}.lf0; \

if [ 0       -eq 0 ]; then \

/usr/local/bin/x2x +sf ${raw} | \

/usr/local/bin/frame -l 1200    -p 240  | \

/usr/local/bin/window -l 1200    -L 2048      -w 1  -n 1   | \

/usr/local/bin/mcep -a 0.55    -m 34    -l 2048      -e 1.0E-08 > mgc/${base}.mgc; \

else \

if [ 1      -eq 1 ]; then \

GAINOPT="-L"; \

fi; \

/usr/local/bin/x2x +sf ${raw} | \

/usr/local/bin/frame -l 1200    -p 240  | \

/usr/local/bin/window -l 1200    -L 2048      -w 1  -n 1   | \

/usr/local/bin/mcep -a 0.55    -c 0       -m 34    -l 2048      -e 1.0E-08 -o 4 | \

/usr/local/bin/lpc2lsp -m 34    -s ${SAMPKHZ} ${GAINOPT} -n 2048      -p 8 -d 1.0E-08 > mgc/${base}.mgc; \

fi; \

if [ -n "`/usr/local/bin/nan lf0/${base}.lf0`" ]; then \

echo " Failed to extract features from ${raw}"; \

rm -f lf0/${base}.lf0; \

fi; \

if [ -n "`/usr/local/bin/nan mgc/${base}.mgc`" ]; then \

echo " Failed to extract features from ${raw}"; \

rm -f mgc/${base}.mgc; \

fi; \

else \

FRAMESHIFTMS=`echo 240  | /usr/local/bin/x2x +af | /usr/local/bin/sopr -m 1000 -d 48000    | /usr/local/bin/x2x +fa`; \

/usr/local/bin/raw2wav -s ${SAMPKHZ} -d . ${raw}; \

echo "path(path,'');"                    >  ${base}.m; \

echo "prm.F0frameUpdateInterval=${FRAMESHIFTMS};"  >> ${base}.m; \

echo "prm.F0searchUpperBound=280    ;"           >> ${base}.m; \

echo "prm.F0searchLowerBound=110    ;"           >> ${base}.m; \

echo "prm.spectralUpdateInterval=${FRAMESHIFTMS};" >> ${base}.m; \

echo "[x,fs]=wavread('${base}.wav');"              >> ${base}.m; \

echo "[f0,ap] = exstraightsource(x,fs,prm);"        >> ${base}.m; \

echo "[sp] = exstraightspec(x,f0,fs,prm);"          >> ${base}.m; \

echo "ap = ap';"                                    >> ${base}.m; \

echo "sp = sp';"                                    >> ${base}.m; \

echo "sp = sp*32768.0;"                             >> ${base}.m; \

echo "save '${base}.f0' f0 -ascii;"                >> ${base}.m; \

echo "save '${base}.ap' ap -ascii;"                >> ${base}.m; \

echo "save '${base}.sp' sp -ascii;"                >> ${base}.m; \

echo "quit;"                                        >> ${base}.m; \

: -nodisplay -nosplash -nojvm < ${base}.m; \

if [ -s ${base}.f0 ]; then \

/usr/local/bin/x2x +af ${base}.f0 | /usr/local/bin/sopr -magic 0.0 -LN -MAGIC -1.0E+10 > lf0/${base}.lf0; \

if [ -n "`/usr/local/bin/nan lf0/${base}.lf0`" ]; then \

echo " Failed to extract features from ${raw}"; \

rm -f lf0/${base}.lf0; \

fi; \

fi; \

if [ -s ${base}.sp ]; then \

if [ 0       -eq 0 ]; then \

/usr/local/bin/x2x +af ${base}.sp | \

/usr/local/bin/mcep -a 0.55    -m 34    -l 2048 -e 1.0E-08 -j 0 -f 0.0 -q 3 > mgc/${base}.mgc; \

else \

if [ 1      -eq 1 ]; then \

GAINOPT="-L"; \

fi; \

/usr/local/bin/x2x +af ${base}.sp | \

/usr/local/bin/mcep -a 0.55    -c 0       -m 34    -l 2048 -e 1.0E-08 -j 0 -f 0.0 -q 3 -o 4 | \

/usr/local/bin/lpc2lsp -m 34    -s ${SAMPKHZ} ${GAINOPT} -n 2048 -p 8 -d 1.0E-08 > mgc/${base}.mgc; \

fi; \

if [ -n "`/usr/local/bin/nan mgc/${base}.mgc`" ]; then \

echo " Failed to extract features from ${raw}"; \

rm -f mgc/${base}.mgc; \

fi; \

fi; \

if [ -s ${base}.ap ]; then \

/usr/local/bin/x2x +af ${base}.ap | \

/usr/local/bin/mcep -a 0.55    -m 24    -l 2048 -e 1.0E-08 -j 0 -f 0.0 -q 1 > bap/${base}.bap; \

if [ -n "`/usr/local/bin/nan bap/${base}.bap`" ]; then \

echo " Failed to extract features from ${raw}"; \

rm -f bap/${base}.bap; \

fi; \

fi; \

rm -f ${base}.m ${base}.wav ${base}.f0 ${base}.ap ${base}.sp; \

fi; \

fi; \

done

Cannot open file raw/cmu_us_arctic_slt_*.raw!

Cannot open file raw/cmu_us_arctic_slt_*.raw!

# Composing training data files from extracted features

mkdir -p cmp

for raw in raw/cmu_us_arctic_slt_*.raw; do \

base=`basename ${raw} .raw`; \

echo "Composing training data for ${base}"; \

if [ 0 -eq 0 ]; then \

MGCDIM=`expr 34    + 1`; \

LF0DIM=1; \

MGCWINDIM=`expr 3 \* ${MGCDIM}`; \

LF0WINDIM=`expr 3 \* ${LF0DIM}`; \

BYTEPERFRAME=`expr 4 \* \( ${MGCWINDIM} + ${LF0WINDIM} \)`; \

if [ -s mgc/${base}.mgc -a -s lf0/${base}.lf0 ]; then \

MGCWINS=""; \

i=1; \

while [ ${i} -le 3 ]; do \

eval MGCWINS=\"${MGCWINS} win/mgc.win${i}\"; \

i=`expr ${i} + 1`; \

done; \

/usr/bin/perl scripts/window.pl ${MGCDIM} mgc/${base}.mgc ${MGCWINS} > tmp.mgc; \

LF0WINS=""; \

i=1; \

while [ ${i} -le 3 ]; do \

eval LF0WINS=\"${LF0WINS} win/lf0.win${i}\"; \

i=`expr ${i} + 1`; \

done; \

/usr/bin/perl scripts/window.pl ${LF0DIM} lf0/${base}.lf0 ${LF0WINS} > tmp.lf0; \

/usr/local/bin/merge +f -s 0 -l ${LF0WINDIM} -L ${MGCWINDIM} tmp.mgc < tmp.lf0                 > tmp.cmp; \

/usr/bin/perl scripts/addhtkheader.pl 48000    240  ${BYTEPERFRAME} 9 tmp.cmp > cmp/${base}.cmp; \

rm -f tmp.mgc tmp.lf0 tmp.cmp; \

fi; \

else \

MGCDIM=`expr 34    + 1`; \

LF0DIM=1; \

BAPDIM=`expr 24    + 1`; \

MGCWINDIM=`expr 3 \* ${MGCDIM}`; \

LF0WINDIM=`expr 3 \* ${LF0DIM}`; \

BAPWINDIM=`expr 3 \* ${BAPDIM}`; \

MGCLF0WINDIM=`expr ${MGCWINDIM} + ${LF0WINDIM}`; \

BYTEPERFRAME=`expr 4 \* \( ${MGCWINDIM} + ${LF0WINDIM} + ${BAPWINDIM} \)`; \

if [ -s mgc/${base}.mgc -a -s lf0/${base}.lf0 -a -s bap/${base}.bap ]; then \

MGCWINS=""; \

i=1; \

while [ ${i} -le 3 ]; do \

eval MGCWINS=\"${MGCWINS} win/mgc.win${i}\"; \

i=`expr ${i} + 1`; \

done; \

/usr/bin/perl scripts/window.pl ${MGCDIM} mgc/${base}.mgc ${MGCWINS} > tmp.mgc; \

LF0WINS=""; \

i=1; \

while [ ${i} -le 3 ]; do \

eval LF0WINS=\"${LF0WINS} win/lf0.win${i}\"; \

i=`expr ${i} + 1`; \

done; \

/usr/bin/perl scripts/window.pl ${LF0DIM} lf0/${base}.lf0 ${LF0WINS} > tmp.lf0; \

BAPWINS=""; \

i=1; \

while [ ${i} -le 3 ]; do \

eval BAPWINS=\"${BAPWINS} win/bap.win${i}\"; \

i=`expr ${i} + 1`; \

done; \

/usr/bin/perl scripts/window.pl ${BAPDIM} bap/${base}.bap ${BAPWINS} > tmp.bap; \

/usr/local/bin/merge +f -s 0 -l ${LF0WINDIM} -L ${MGCWINDIM}    tmp.mgc     < tmp.lf0          > tmp.mgc+lf0; \

/usr/local/bin/merge +f -s 0 -l ${BAPWINDIM} -L ${MGCLF0WINDIM} tmp.mgc+lf0 < tmp.bap          > tmp.cmp; \

/usr/bin/perl scripts/addhtkheader.pl 48000    240  ${BYTEPERFRAME} 9 tmp.cmp > cmp/${base}.cmp; \

rm -f tmp.mgc tmp.lf0 tmp.bap tmp.mgc+lf0 tmp.cmp; \

fi; \

fi; \

done

Composing training data for cmu_us_arctic_slt_*

# Extracting monophone and fullcontext labels

mkdir -p labels/mono

mkdir -p labels/full

if [ 1 -eq 1 ]; then \

for utt in utts/cmu_us_arctic_slt_*.utt; do \

base=`basename ${utt} .utt`; \

if [ -s ${utt} ]; then \

echo "Extracting labels from ${utt}"; \

/usr/local/TTS_System/FestivalTTS_2/festival/examples/dumpfeats -eval scripts/extra_feats.scm -relation Segment -feats scripts/label.feats -output tmp.feats ${utt}; \

fi; \

if [ -s tmp.feats ]; then \

awk -f scripts/label-full.awk tmp.feats > labels/full/${base}.lab; \

awk -f scripts/label-mono.awk tmp.feats > labels/mono/${base}.lab; \

rm -f tmp.feats; \

fi; \

done; \

else \

for txt in txt/cmu_us_arctic_slt_*.txt; do \

base=`basename ${txt} .txt`; \

if [ -s ${txt} ]; then \

echo "Extracting labels from ${txt}"; \

/usr/bin/perl scripts/normtext.pl ${txt} > tmp.txt; \

/usr/local/TTS_System/FestivalTTS_2/festival/examples/text2utt tmp.txt > tmp.utt; \

/usr/local/TTS_System/FestivalTTS_2/festival/examples/dumpfeats -eval scripts/extra_feats.scm -relation Segment -feats scripts/label.feats -output tmp.feats tmp.utt; \

rm -f tmp.txt tmp.utt; \

fi; \

if [ -s tmp.feats ]; then \

awk -f scripts/label-full.awk tmp.feats > labels/full/${base}.lab; \

awk -f scripts/label-mono.awk tmp.feats > labels/mono/${base}.lab; \

rm -f tmp.feats; \

fi; \

done; \

fi

# Generating monophone and fullcontext Master Label Files (MLF)

echo "#!MLF!#" > labels/mono.mlf

echo "\"*/cmu_us_arctic_slt_*.lab\" -> \"/usr/local/TTS_System/HTS-demo_ARABIC/data/labels/mono\"" >> labels/mono.mlf

echo "#!MLF!#" > labels/full.mlf

echo "\"*/cmu_us_arctic_slt_*.lab\" -> \"/usr/local/TTS_System/HTS-demo_ARABIC/data/labels/full\"" >> labels/full.mlf

# Generating a fullcontext model list file

mkdir -p lists

rm -f tmp

for lab in labels/full/cmu_us_arctic_slt_*.lab; do \

if [ -s ${lab} -a -s labels/mono/`basename ${lab}` -a -s cmp/`basename ${lab} .lab`.cmp ]; then \

sed -e "s/.* //g" ${lab} >> tmp; \

fi \

done

sort -u tmp > lists/full.list

sort: No such file or directory

make[1]: *** [list] Error 2

make: *** [data] Error 2
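One possible reading of the log above: none of the cmu_us_arctic_slt_* patterns matched a file, so each loop ran once with the literal pattern, nothing was appended to tmp, and the final sort had no input file. The sketch below reproduces that pattern in an empty scratch directory. It assumes a POSIX sh (or bash without nullglob) and is only an illustration, not the demo's Makefile.

```shell
#!/bin/sh
# Reproduce the list-building pattern from the log in an empty scratch directory.
work=$(mktemp -d) && cd "$work" || exit 1
mkdir -p labels/full

rm -f tmp
for lab in labels/full/cmu_us_arctic_slt_*.lab; do
    # With no matching files, ${lab} is the literal pattern and -s is false.
    if [ -s "${lab}" ]; then
        sed -e "s/.* //g" "${lab}" >> tmp
    fi
done

# tmp was never created, so "sort -u tmp" would have no file to read.
if [ -e tmp ]; then
    sort -u tmp
else
    echo "no label files matched: ${lab}"
fi
```

On this reading the failure comes from the file name prefix rather than from missing utterance files.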




I don't know why this happened. Is it because of the utterance files and the gen labels, which I did not use in training?


I read the Festival manual about creating utterance files, but the steps were not clear. What are the steps, and which files are required to create an utterance file?
















References
[hts-users:04565] HTS Demo with Arabic langauge, Nora Qm
[hts-users:04566] Re: HTS Demo with Arabic langauge, Quang Bui Tan
[hts-users:04567] Re: HTS Demo with Arabic langauge, Quang Bui Tan
[hts-users:04568] Re: HTS Demo with Arabic langauge, Quang Bui Tan