[hts-users:04539] Re: Preparing data for HTS

Hi,

The utt are used as an input for festival to generate the labels. Therefore, you do not need festival if you have another way to generate the labels and the questions files.

What other ways are there to generate the labels besides Festival? I have the text corresponding to each recording in romanized form, like this:

> NIILAM NAE SAALGIRAA PAR HAYDD SAYSMOOGIRAAF ASVAD_D QURAYSHII KAE MAAT_D_HAE PAR AYNTT_HAN OR 7AM KII AAT_DISHIIN RO MEHSUUS KII

and I have marked its phoneme boundaries like this:

> ##N II L A M##N AE##S AA L G I R AA##P A R##H AY DD##S AY S M OO G I R AA F##A S V A D_D##Q U R AY SH II##K AE##M AA T_D_H AE##P A R##AY N TT_H A N##O R##7 A M##K II##AA T_D I SH IIN##R O##M E H S UU S##K II##<
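A transcription marked up this way can be split into words and phones with a few lines of Python. This is only a minimal sketch: it assumes that `##` separates words, that spaces separate phones within a word, and that the trailing `<` marks the end of the utterance. None of these conventions is confirmed in the thread, and the function name `parse_marked_utterance` is hypothetical.

```python
def parse_marked_utterance(line):
    """Split a '##'-delimited transcription into words (lists of phone symbols).

    Assumptions (not confirmed by the thread):
      - '##' marks a word boundary
      - phones inside a word are separated by spaces
      - a lone '<' marks the end of the utterance
    """
    words = []
    for chunk in line.split("##"):
        chunk = chunk.strip()
        if not chunk or chunk == "<":
            continue  # skip empty pieces and the assumed end marker
        words.append(chunk.split())
    return words


# First three words of the example above, followed by the end marker.
sample = "##N II L A M##N AE##S AA L G I R AA##<"
words = parse_marked_utterance(sample)

# Flatten to a single phone sequence, e.g. as input for monophone labels.
phones = [p for w in words for p in w]
```

The word-level grouping is kept because word and syllable positions are typical candidates for the contextual features of full-context labels.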

I have seen the "lab_format.pdf" file in the HTS demo, but the label format differs for every demo provided on the HTS website.

On Sun, Jul 16, 2017 at 12:54 PM, Sébastien Le Maguer <slemaguer@xxxxxxxxxxxxxxxxxxxx> wrote:

Hello,

What you need for HTS is:
- the questions
- the labels (full & mono)
- the audio

The utt are used as an input for festival to generate the labels. Therefore, you do not need festival if you have another way to generate the labels and the questions files.

There is no universal recipe for generating the question and label files; it depends on your language. If it is a phone-based language, you should at least use a triphone context in order to capture coarticulation. For the remaining descriptive features, it depends on what you can obtain and what is important for your language.
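As a rough illustration of the triphone context mentioned above, the sketch below builds HTK-style `left-center+right` labels from a phone sequence and writes one matching `QS` line for a question file. It is a sketch under assumptions, not the format of any particular HTS demo: the `sil` padding symbol, the helper names `triphone_labels` and `center_question`, and the "Vowel" class containing `A` and `II` are all chosen for illustration only.

```python
def triphone_labels(phones, pad="sil"):
    """Return one 'left-center+right' label per phone.

    The utterance is padded with `pad` on both sides (assumed silence symbol),
    so the first and last phones also get a full context.
    """
    seq = [pad] + list(phones) + [pad]
    return [f"{seq[i - 1]}-{seq[i]}+{seq[i + 1]}" for i in range(1, len(seq) - 1)]


def center_question(name, phone_set):
    """Build a QS line that asks whether the center phone is in `phone_set`.

    The '*-X+*' patterns match the 'left-center+right' layout used above.
    """
    patterns = ",".join(f"*-{p}+*" for p in sorted(phone_set))
    return f'QS "C-{name}" {{{patterns}}}'


# First word of the example utterance in this thread.
labels = triphone_labels(["N", "II", "L", "A", "M"])

# Hypothetical phone class; the real classes depend on the language.
question = center_question("Vowel", {"A", "II"})
```

If further features are added to the labels (syllable, word, or phrase positions, for example), each one needs its own delimiter in the label string and corresponding patterns in the question file, so the two files have to be designed together.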

Kind regards
Sébastien

On Sun, Jul 16 2017 (07:15), Atlas Khan <atlaskhan90@xxxxxxxxx> wrote:

Hi,

I am working on speech synthesis for a language that has no support of
any kind in Festival. Its phonemes and lexicon differ from those of English.
I have recordings in *raw* format. As far as I know, I need the following
types of data for speech synthesis with HTS.

1. questions
2. labels (full and mono)
3. utt

I would like to ask how I can prepare the questions and labels for a language
whose lexicon and phonemes differ from those of English. If I need Festival to
generate that data, then how can I do so for a language that Festival does
not support?

Regards,

Atlas Khan


================================================================================
Dr. Sébastien Le Maguer
Postdoctorate researcher
Co-chair of SYNSIG (https://synsig.org/index.php/Main_Page)

Saarland University
Campus C7.4 - room 2.03
D-66123 Saarbrücken
Germany

phone : +49-681-302-70030
Mail: slemaguer@xxxxxxxxxxxxxxxxxxxx
website : http://www.coli.uni-saarland.de/~slemaguer/
================================================================================