Hi,
I'm facing the same problem as you.
I want to build an HTS voice for a new language. I have run the HTS demo successfully, but now I don't know how to build the utterance files.
I read some of the discussions on this mailing list, and they said that I have to prepare some files to build them (.segment, .syllable, .word, .phrase, .IntEvent, and target).
My question is: how do I build those files? I've searched everywhere (including the Festival manual) but can't find an appropriate manual for building them.
For example, one of the mailing list discussions stated that the following files must be prepared.
Segment
segment labels with (near) correct boundaries, in the phone set of your language.
File1.Segment
#
0.583812 121 pau
0.665375 121 D
0.738812 121 e
0.871687 121 f
0.918312 121 t
1.04994 121 E
1.09619 121 r
1.197 121 a
......
......
Syllable
Syllables, with stress marking (if appropriate) whose
boundaries are closely aligned with the segment boundaries.
File1.Syllable
#
0.738812 121 D.e ; stress 0 ;
1.04994 121 f.t.E ; stress 1 ;
1.197 121 r.a ; stress 0 ;
.....
.....
Word
Words with boundaries aligned (close) to the syllables and segments. By words we mean the things which can be looked up in a lexicon thus "1986" would not be considered a word and should be rendered as three words "nineteen eighty six".
File1.Word
#
1.197 121 DeftEra ; wordlab "1"
.......
......
Phrase
A name and marking for the end of each prosodic phrase.
File1.Phrase
#
1.197 77 2
.....
......
IntEvent
Intonation labels aligned to a syllable (either within the syllable boundary or explicitly naming the syllable they should align to). If using ToBI (or some derivative) these would be standard ToBI labels, while in something like Tilt these would be "a" and "b" marking accents and labels.
File1.IntEvent
#
1.197 77 L*+H
1.77462 77 L*+H
2.24356 77 L*+H
....
....
Target
The mean F0 value in Hertz at the mid-point of each segment in the utterance.
But I don't know how to build them. For example, in the first row of the .segment file, what does "0.583812 121 pau" mean?
I guess the last column is the phoneme, but what about the first (0.583812) and the second (121)? And what is the format of the other files?
I would really appreciate your help.
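To make the question concrete, this is roughly how I am reading these files at the moment. It is only a sketch of my guess: I assume every data line has three whitespace-separated fields, and that everything up to and including the "#" line is header. The function name and field names are my own placeholders; I do not know what the first two fields actually mean.

```python
def parse_label_lines(lines):
    """Parse lines of the form '<number> <number> <label text>'.

    Returns a list of (first, second, label) tuples. The meaning of
    'first' and 'second' is exactly what I am asking about; the names
    are placeholders, not anything taken from the Festival manual.
    """
    entries = []
    in_body = False
    for raw in lines:
        line = raw.strip()
        if not in_body:
            # Assumption: everything up to and including '#' is header.
            if line == "#":
                in_body = True
            continue
        parts = line.split(None, 2)
        if len(parts) < 3:
            # Skip blank, incomplete, or filler lines such as '......'.
            continue
        first, second, label = parts
        entries.append((float(first), int(second), label.strip()))
    return entries


sample = [
    "#",
    "0.583812 121 pau",
    "0.665375 121 D",
    "0.738812 121 D.e ; stress 0 ;",
    "......",
]
for entry in parse_label_lines(sample):
    print(entry)
```

If this reading is wrong, for example if the fields are not simply separated by whitespace, please correct me.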
Thanks,
-Clara
From: Narendra Naidu Lolugu <narendranaidu.l@xxxxxxxxx>
To: hts-users@xxxxxxxxxxxxxxx
Sent: Fri, October 1, 2010 2:32:17 PM
Subject: [hts-users:02615] help needed in building HTS voice for HINDI language
Hi,
I am interested in building an HTS voice for the Hindi language.
I have done the following steps so far. I need your help in building the HTS voice.
1> Installed festvox.
2> Installed speech_tools.
3> Installed festival.
I cannot find sufficient information on how to build an HTS voice for a new language.
I am using flite+htsengine.
Thanks in advance.
Best Regards
-Narendra