
[hts-users:02275] Question about the method of creating my own *.utt file for synthesis part in HTS-demo


Hello, all.

I would like to check whether my method is correct.
I am trying to create my own *.utt files for the synthesis part of HTS-demo. (I think I know how to create my own labels for the training part: I created the label files and then built *.utt files from them.)
For the synthesis part, I ran Festival as follows:

    $ESTDIR/.../festival/bin/festival -b festvox/build_clunits.scm '(build_prompts "etc/txt.done.data")'

My txt.done.data file looks like this:

( utt1 "DON DE ESTA EL BANYO." )
( utt2 "QUANTO QUESTA." )

This command created *.utt files like the one at the end of this message (only the utt2 example is shown).
My next step is to run dumpfeats and utt2lab.sh on these files and use the resulting HTS-format labels for the rest of the process.
Am I on the right path?

I am a little concerned that my own *.utt files (for both the training part and the synthesis part) do not show any stress information: every syllable has stress 0, as you can see in the utt2 file below. I am working on a Spanish voice.
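
To confirm the all-zero stress across many files rather than by eye, a small script along these lines might help. It is only a sketch: it assumes the ASCII utterance layout shown at the end of this message (a Stream_Items section with "key value ;" fields), the function names parse_stream_items and syllable_stress are made up for illustration, and the simple split on ";" would break on a feature value that itself contains a semicolon.

```python
def parse_stream_items(utt_text):
    """Return one feature dict per line of the Stream_Items section."""
    items = []
    in_items = False
    for raw in utt_text.splitlines():
        line = raw.strip()
        if line == "Stream_Items":
            in_items = True
        elif line == "End_of_Stream_Items":
            break
        elif in_items and line:
            # Drop the leading item index, then read "key value" fields.
            _, _, rest = line.partition(" ")
            feats = {}
            for field in rest.split(";"):
                key, _, value = field.strip().partition(" ")
                if key:
                    feats[key] = value.strip()
            items.append(feats)
    return items


def syllable_stress(utt_text):
    """Collect the stress value of every item named 'syl'."""
    return [
        int(item["stress"])
        for item in parse_stream_items(utt_text)
        if item.get("name") == "syl" and "stress" in item
    ]


# A few lines copied from the utt2 file below, used as a self-contained sample.
SAMPLE = """Stream_Items
5 id _3 ; name QUANTO ; pbreak NB ; pos nil ;
7 id _7 ; name syl ; stress 0 ;
8 id _12 ; name syl ; stress 0 ;
11 id _22 ; name pau ; dur_factor 0 ; end 0.2 ;
End_of_Stream_Items
"""

stress = syllable_stress(SAMPLE)
print(stress, "all unstressed:", not any(stress))
```

To run it on a real file, the text of the file could be passed in place of SAMPLE, for example syllable_stress(open("utt2.utt").read()), where the file name is only a placeholder.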

Thank you for your time.

Best,
Jangwon


EST_File utterance
DataType ascii
version 2
EST_Header_End
Features max_id 25 ; type Text ; iform "\"QUANTO QUESTA.\"" ;
Stream_Items
1 id _1 ; name QUANTO ; whitespace "" ; prepunctuation "" ;
2 id _2 ; name QUESTA ; punc . ; whitespace " " ; prepunctuation "" ;
3 id _4 ; name QUESTA ; pbreak B ; pos nil ;
4 id _5 ; name . ; pbreak B ; pos punc ;
5 id _3 ; name QUANTO ; pbreak NB ; pos nil ;
6 id _6 ; name B ;
7 id _7 ; name syl ; stress 0 ;
8 id _12 ; name syl ; stress 0 ;
9 id _15 ; name syl ; stress 0 ;
10 id _19 ; name syl ; stress 0 ;
11 id _22 ; name pau ; dur_factor 0 ; end 0.2 ;
12 id _8 ; name k ; dur_factor -0.171832 ; end 0.276862 ;
13 id _9 ; name uW ; dur_factor -0.460147 ; end 0.325884 ;
14 id _10 ; name a ; dur_factor -0.542042 ; end 0.370089 ;
15 id _11 ; name n ; dur_factor 0.0418376 ; end 0.439065 ;
16 id _13 ; name t ; dur_factor -0.772612 ; end 0.485402 ;
17 id _14 ; name o ; dur_factor -0.413674 ; end 0.530719 ;
18 id _16 ; name k ; dur_factor 0.439377 ; end 0.626335 ;
19 id _17 ; name e ; dur_factor 0.126522 ; end 0.683121 ;
20 id _18 ; name s ; dur_factor 0.252152 ; end 0.800286 ;
21 id _20 ; name t ; dur_factor 0.465378 ; end 0.875206 ;
22 id _21 ; name a ; dur_factor 2.5611 ; end 1.07215 ;
23 id _23 ; name pau ; dur_factor 0 ; end 1.27215 ;
24 id _25 ; f0 110 ; pos 1.07215 ;
25 id _24 ; f0 130 ; pos 0.2 ;
End_of_Stream_Items
Relations
Relation Token ; ()
4 4 0 0 0 3
3 3 2 0 4 0
2 2 0 3 0 1
5 5 1 0 0 0
1 1 0 5 2 0
End_of_Relation
Relation Word ; ()
2 3 0 0 0 1
1 5 0 0 2 0
End_of_Relation
Relation Phrase ; ()
3 3 0 0 0 2
2 5 1 0 3 0
1 6 0 2 0 0
End_of_Relation
Relation Syllable ; ()
4 10 0 0 0 3
3 9 0 0 4 2
2 8 0 0 3 1
1 7 0 0 2 0
End_of_Relation
Relation Segment ; ()
13 23 0 0 0 12
12 22 0 0 13 11
11 21 0 0 12 10
10 20 0 0 11 9
9 19 0 0 10 8
8 18 0 0 9 7
7 17 0 0 8 6
6 16 0 0 7 5
5 15 0 0 6 4
4 14 0 0 5 3
3 13 0 0 4 2
2 12 0 0 3 1
1 11 0 0 2 0
End_of_Relation
Relation SylStructure ; ()
3 4 0 0 0 2
7 22 0 0 0 6
6 21 5 0 7 0
5 10 0 6 0 4
10 20 0 0 0 9
9 19 0 0 10 8
8 18 4 0 9 0
4 9 2 8 5 0
2 3 0 4 3 1
14 17 0 0 0 13
13 16 12 0 14 0
12 8 0 13 0 11
18 15 0 0 0 17
17 14 0 0 18 16
16 13 0 0 17 15
15 12 11 0 16 0
11 7 1 15 12 0
1 5 0 11 2 0
End_of_Relation
Relation IntEvent ; ()
End_of_Relation
Relation Intonation ; ()
End_of_Relation
Relation Target ; ()
3 24 2 0 0 0
2 22 0 3 0 1
4 25 1 0 0 0
1 11 0 4 2 0
End_of_Relation
End_of_Relations
End_of_Utterance