
[hts-users:04100] Re: Festival options to generate utts for synthesis


Hi,

Yes, I need information about what settings were used to generate the training utts provided in HTS-demo. I am using the arctic database, and the training utterances came with HTS-demo itself.

Also, I tried using different voices (kal_diphone, rab_diphone, cmu_hts, etc.), but they all generate the same utt file. I was able to figure out that the phone set used is radio, which is what I am using as well.

Thank you,

Praneeth


On Thu, Aug 21, 2014 at 3:51 PM, Blaise Potard <bpotard@xxxxxxxxx> wrote:
Hi,

If you had generated your training utterances yourself with festival, you probably wouldn't have this problem, so I assume the training utts files were already provided to you. Please provide more information regarding the audio database you are using for training.

You will need to find out how the training utts were originally generated (almost certainly with festival) and try to use the appropriate festival options to generate compatible utts. You might simply need to select one of the existing voices to set the right options. It seems from your script above that you generate your utts with whatever default settings your festival installation uses, which is almost certainly not going to work.

Depending on the actual front-end (phone set, lexicon...) used, the format of the utts can vary considerably.

If you tell us what database you are using (arctic, maybe?), someone here might be able to provide you with some information on the right options to use.

Hope that helps,
Blaise


2014-08-21 11:24 GMT+02:00 Praneeth Kurpad <praneeth.kurpad@xxxxxxxxx>:

Hi, 

I have been trying to build a text-to-speech synthesis system (English) using HTS and festival. I want to generate a speech waveform from text, rather than from label files as done in HTS-demo. I used festival to generate utts using the following code:

(let ((utt1 (SynthText "Ah, indeed")))
    (utt.save utt1 "asdf.utt")
)

and used part of the script in HTS-demo/data/Makefile to create labels from the festival-generated utterances.

But when I try to synthesize it, it gives the following error:

***********************************************
ERROR [+9935]  Generator: Cannot find duration model x^x-pau+hh=ax@x_x/A:0_0_0/B:x-x-x@x-x&x-x#x-x$x-x!x-x;x-x|x/C:0+0+2/D:0_0/E:x+x@x+x&x+x#x+x/F:content_2/G:0_0/H:x=x^1=1|0/I:2=1/J:2+1-1 in current list
************************************************
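For readers decoding the model name in this error: in the HTS full-context label format, the leading part of the name encodes the quinphone context as p1^p2-p3+p4=p5@, so the missing model here is for a "pau" followed by "hh" and "ax". The following is a minimal Python sketch, not part of HTS or festival, for pulling out that context and for checking which label names are absent from a model list. The function names are hypothetical, and it assumes the model list has already been read into memory as one model name per entry.

```python
import re

# Quinphone prefix of an HTS full-context label: p1^p2-p3+p4=p5@...
QUINPHONE = re.compile(
    r"^(?P<ll>[^\^]+)\^(?P<l>[^-]+)-(?P<c>[^+]+)\+(?P<r>[^=]+)=(?P<rr>[^@]+)@"
)


def quinphone(label):
    """Return the five phones (ll, l, c, r, rr) encoded at the start of a label."""
    match = QUINPHONE.match(label)
    if match is None:
        raise ValueError("not a full-context label: %r" % label)
    return match.groupdict()


def missing_models(labels, model_names):
    """Return the labels that do not appear in the given model names (hypothetical helper)."""
    known = set(model_names)
    return [label for label in labels if label not in known]
```

Running missing_models over the labels produced from a festival-generated utt and over the labels from data/utts would show whether only the new labels fall outside the list.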

I have verified that the files and lists given in the HMGenS command options exist at the corresponding paths.

(Please note there are considerable differences between the festival-generated utterances and the training utterances present in data/utts/.)

I am able to successfully synthesize speech for the .utt files given in data/utts, which suggests an error in the utt generation part.

I tried various suggested solutions, but nothing seems to rectify the error. These included setting a different default phone set, and downloading and building an HTS voice for festival.

Could someone please guide me as to what settings must be used in festival to generate .utt files that can be used to synthesize the speech waveform for any text?

Regards



References
[hts-users:04098] Festival options to generate utts for synthesis, Praneeth Kurpad
[hts-users:04099] Re: Festival options to generate utts for synthesis, Blaise Potard