[hts-users:04106] Re: Festival options to generate utts for synthesis

Hi,

As I understand it, you are calling HMGenS directly with a label file that contains unseen models. If that is correct, you first have to obtain the appropriate models from the decision trees. If you are using the demo setup, the simplest way to do this is to follow these steps:

1. Put your label files into the data/labels/gen folder.

2. Run make in the data directory to rebuild the scp and list files (make scp; make list).

3. Enable only the steps from "make unseen models" through synthesis, for the generation type you want (1mix, stc, 2mix).

4. Run "make voice" in the root directory of the demo.

Otherwise, you will need to adapt the scripts to your needs.
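Steps 1, 2 and 4 above can be scripted. The following Python sketch is only an illustration: the helper names (stage_labels, rebuild_commands, run_all) are hypothetical, the directory layout is assumed to be the one described in the steps, and step 3 still has to be done by hand before anything is executed. By default it only prints the commands.

```python
import shutil
import subprocess
from pathlib import Path


def stage_labels(label_files, demo_root):
    """Step 1: copy label files into <demo_root>/data/labels/gen."""
    gen_dir = Path(demo_root) / "data" / "labels" / "gen"
    gen_dir.mkdir(parents=True, exist_ok=True)
    # shutil.copy returns the path of the copied file inside gen_dir.
    return [Path(shutil.copy(src, gen_dir)) for src in label_files]


def rebuild_commands(demo_root):
    """Steps 2 and 4 as (argv, working_directory) pairs.

    Step 3 (enabling only the unseen-model and synthesis steps) is not
    covered here and must be done manually before running "make voice".
    """
    root = Path(demo_root)
    return [
        (["make", "scp"], root / "data"),
        (["make", "list"], root / "data"),
        (["make", "voice"], root),
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv, cwd in commands:
        print(f"(cd {cwd} && {' '.join(argv)})")
        if not dry_run:
            subprocess.run(argv, cwd=cwd, check=True)
```

Calling run_all(rebuild_commands("path/to/demo")) prints the three commands without running them; pass dry_run=False only once the configuration from step 3 is in place.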

Kind regards,

Sébastien

From: "Praneeth Kurpad" <praneeth.kurpad@xxxxxxxxx>
To: hts-users@xxxxxxxxxxxxxxx
Sent: Wednesday, 27 August 2014 10:42:46
Subject: [hts-users:04104] Re: Festival options to generate utts for synthesis

Hi,

I generated prompt utterances for the cmu-arctic database using festvox do_build build_prompts. With these, the HMMs train successfully and wave files are generated for the labels provided in the gen/labels folder, but when I try to synthesize waveforms for new text, I get the same error, i.e. the label cannot be found in the duration model.

Is this because it is looking for the exact same label files in the models/tiedlist?

Is there any way to invoke HMGenS so that it looks for an approximate match and tries to synthesize the waveform anyway? The label format contains a lot of information (number of phrases, words, and syllables in the utterance, etc.), and all the possible combinations simply cannot be listed exhaustively.

Can anyone please suggest any alternatives?
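To see which labels of a new utterance are actually missing from the model list, the two files can be compared directly. The Python sketch below is a minimal illustration, not part of HTS: the function names are hypothetical, and it assumes that each label line is either "label" or "start end label" and that each line of the model list begins with the model name. The labels in the example are shortened for readability.

```python
def label_names(lab_lines):
    """Return the full-context label from each non-empty label line.

    Assumes a line is either "label" or "start end label", so the label
    is always the last whitespace-separated field.
    """
    names = []
    for line in lab_lines:
        fields = line.split()
        if fields:
            names.append(fields[-1])
    return names


def listed_models(list_lines):
    """Return the model names in a model list (first token of each line)."""
    return {line.split()[0] for line in list_lines if line.strip()}


def unseen_labels(lab_lines, list_lines):
    """Return the labels, in order, that do not appear in the model list."""
    known = listed_models(list_lines)
    return [name for name in label_names(lab_lines) if name not in known]


# Shortened, made-up example data.
lab = ["0 50000 x^x-pau+hh=ax@x_x", "50000 100000 x^pau-hh+ax=l@1_2"]
tied = ["x^pau-hh+ax=l@1_2", "some_logical_name some_physical_name"]
print(unseen_labels(lab, tied))
```

Any label reported here is an unseen model, which is the situation the "Cannot find duration model" error describes.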

Regards

Praneeth

On Thu, Aug 21, 2014 at 3:51 PM, Blaise Potard <bpotard@xxxxxxxxx> wrote:

Hi,

If you had generated your training utterances yourself with festival, you probably wouldn't have this problem, so I assume the training utts files were already provided to you. Please provide more information regarding the audio database you are using for training.

You will need to find out how the training utts were originally generated (almost certainly with festival) and use the corresponding festival options so that the utts you generate are compatible. You might simply need to select one of the existing voices to set the right options. It seems from your script above that you generate your utts with whatever default settings your festival installation uses, which is almost certainly not going to work.

Depending on the actual front-end (phone set, lexicon...) used, the format of the utts can vary considerably.

If you tell us what database you are using (arctic, maybe?), someone here might be able to provide you with some information on the right options to use.

Hope that helps,
Blaise

2014-08-21 11:24 GMT+02:00 Praneeth Kurpad <praneeth.kurpad@xxxxxxxxx>:

Hi,

I have been trying to build an English text-to-speech synthesis system using HTS and festival. I want to generate a speech waveform from text, rather than from label files as is done in HTS-demo. I used festival to generate utts with the following code:

(let ((utt1 (SynthText "Ah, indeed")))  ; synthesize the text with the currently selected voice
  (utt.save utt1 "asdf.utt"))           ; write the utterance structure to asdf.utt

and used part of the script in HTS-demo/data/Makefile to create labels from the festival-generated utterances.

But when I try to synthesize it, I get the following error:

***********************************************

ERROR [+9935] Generator: Cannot find duration model x^x-pau+hh=ax@x_x/A:0_0_0/B:x-x-x@x-x&x-x#x-x$x-x!x-x;x-x|x/C:0+0+2/D:0_0/E:x+x@x+x&x+x#x+x/F:content_2/G:0_0/H:x=x^1=1|0/I:2=1/J:2+1-1 in current list

************************************************
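When comparing a label like the one in this error with the labels used for training, it helps to split it into its parts. The Python sketch below is only an illustration with hypothetical names; it assumes the phone context is written as p1^p2-p3+p4=p5@ at the start of the label and that the remaining fields are introduced by /A: through /J:, as in the message above.

```python
import re

# Pulls the label out of an HMGenS "Cannot find duration model" message.
ERROR_RE = re.compile(r"Cannot find duration model (\S+) in current list")
# Phone context at the start of the label: p1^p2-p3+p4=p5@...
PHONES_RE = re.compile(r"^(.+?)\^(.+?)-(.+?)\+(.+?)=(.+?)@")
# Remaining context fields, each introduced by "/<letter>:".
FIELD_RE = re.compile(r"/([A-J]):([^/]*)")


def parse_missing_label(error_line):
    """Return the label, its five phones, and its /A: to /J: fields.

    Returns None if the line is not a "Cannot find duration model" error.
    """
    match = ERROR_RE.search(error_line)
    if match is None:
        return None
    label = match.group(1)
    phones = PHONES_RE.match(label)
    return {
        "label": label,
        "phones": phones.groups() if phones else None,
        "fields": dict(FIELD_RE.findall(label)),
    }


error = (
    "ERROR [+9935] Generator: Cannot find duration model "
    "x^x-pau+hh=ax@x_x/A:0_0_0/B:x-x-x@x-x&x-x#x-x$x-x!x-x;x-x|x"
    "/C:0+0+2/D:0_0/E:x+x@x+x&x+x#x+x/F:content_2/G:0_0"
    "/H:x=x^1=1|0/I:2=1/J:2+1-1 in current list"
)
info = parse_missing_label(error)
print(info["phones"])
print(info["fields"]["F"])
```

Running the same function over a label taken from the training data makes it easy to see which fields differ in format between the two sets of utterances.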

I have verified that the files listed in the HMGenS command options exist at the corresponding paths.

(Please note that there are considerable differences between the festival-generated utterances and the training utterances present in data/utts/.)

I am able to synthesize speech successfully from the .utt files given in data/utts, which suggests an error in the utt generation part.

I tried various suggested solutions, such as setting a different default phone set and downloading and building an HTS voice for festival, but nothing seems to fix the error.

Could someone please tell me what settings must be used in festival to generate .utt files that can be used to synthesize the speech waveform for any text?

Regards