
[hts-users:04430] Re: HTS demo 'alice' labels


Hey guys. 

I am trying to train a voice for my native Malawian language, Chichewa.  I've managed to achieve forced alignment (aligned.mlf) of monophone models using HTK.  My questions are:

How do I generate .lab and/or .utt files similar to those in the HTS demos?  Do I have to use text2utt?
If so, then what was the point of going through the process with HTK?  (I am new to NLP and was advised by my professors to do this step first; I was told HTS takes input from HTK.)
How do I then use the (hopefully) synthesized WAVs and the .htsvoice file to make a TTS sample, whether with Festival, Flite, or any other system?  (My rough understanding of this last step is sketched below.)
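To make sure I'm asking the right thing, here is my rough understanding of the synthesis step, written as a small Python wrapper around hts_engine.  The file and directory names are placeholders, and I'm going from the hts_engine_API README for the flags, so please correct me if this is wrong:

import subprocess
from pathlib import Path

# Run hts_engine on each full-context label file with the trained
# .htsvoice model, writing one wav per label file.
# (Names below are placeholders; -m / -ow flags as I understand them
# from the hts_engine_API README.)
VOICE = Path("chichewa.htsvoice")     # placeholder voice file name
LABEL_DIR = Path("labels/full")       # full-context .lab files
OUT_DIR = Path("wav_out")
OUT_DIR.mkdir(exist_ok=True)

for lab in sorted(LABEL_DIR.glob("*.lab")):
    wav = OUT_DIR / (lab.stem + ".wav")
    subprocess.run(
        ["hts_engine", "-m", str(VOICE), "-ow", str(wav), str(lab)],
        check=True,
    )

Is this roughly how the .htsvoice file is meant to be used, or does one normally go through Festival instead?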

Your help will be very much appreciated.

Attached is my aligned.mlf file.
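Would something like the following be the right way to turn that MLF into per-utterance .lab files?  (A rough sketch on my part, assuming the standard HTK MLF layout; the output directory name is just a placeholder.)

from pathlib import Path

def split_mlf(mlf_path, out_dir):
    """Split an HTK master label file into per-utterance .lab files.

    Assumes the standard MLF layout: an "#!MLF!#" header, then blocks
    starting with a quoted "*/name.lab" path and ending with a lone ".".
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    current, lines = None, []
    for raw in Path(mlf_path).read_text().splitlines():
        line = raw.strip()
        if line == "#!MLF!#":
            continue
        if line.startswith('"') and line.endswith('"'):
            current = Path(line.strip('"')).name   # e.g. utt001.lab
            lines = []
        elif line == "." and current is not None:
            (out_dir / current).write_text("\n".join(lines) + "\n")
            current = None
        elif current is not None:
            lines.append(line)                     # "start end phone"

split_mlf("aligned.mlf", "labels/mono")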

Regards
Jeremiah



On Wed, Dec 7, 2016 at 3:16 AM, Rasmus Dall <R.Dall@xxxxxxxxxxxx> wrote:
There may have been ever so slight changes since they were generated 10 years ago.

I'd recommend regenerating them yourself using the same front-end setup (even if that is also festival) you use for your training data!

- Rasmus


Quoting "Heiga ZEN (Byung Ha CHUN)" <heigazen@xxxxxxxxxx> on Tue, 06 Dec 2016 17:12:22 +0000:

AFAIK there was no change at that time.

Heiga


On Tue, Dec 6, 2016 at 4:32 PM Erica Cooper <ecooper@xxxxxxxxxxxxxxx> wrote:

Thanks, Heiga!  So the Festival HTS voice was just using the default US
English frontend from Festival, or do you know of any changes that were
made?

On Mon, Dec 5, 2016 at 2:29 PM, Heiga ZEN (Byung Ha CHUN) <heigazen@xxxxxxxxxx> wrote:

I think I generated these labels about 10 years ago...

IIRC, they were generated directly from the Festival HTS voice.  At that time,
Festival HTS voices dumped labels to a tmp file and then ran hts_engine to
synthesize speech.  I think I copied those tmp label files.

I am not sure whether it still works this way.  It may be fully
integrated now, with no tmp file created.

Heiga

On Mon, Dec 5, 2016 at 19:12, Erica Cooper <ecooper@xxxxxxxxxxxxxxx> wrote:

Hi HTS-users,

I was wondering if anyone knew anything about how the 'alice' test labels
in the demo were created.  Searching through the list archives, it appears
that Festival was used; however, I have noticed on a few occasions that,
with different voices trained on different types of data, .wav files
synthesized from the 'alice' labels just tend to sound better than
synthesis from test labels we have created ourselves.  We are using
Festival's 'text2utt' to get utts out of a txt.done.data-formatted file,
and then using the steps in the data/Makefile to convert from utt -> lab
(a rough sketch of what we do per utterance is below).  Were the 'alice'
labels perhaps created in some different way, or with some different
settings in Festival?
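The per-utterance conversion, roughly (the dumpfeats arguments and awk
script names are from the demo's data/Makefile as I remember it, and the
file names are just examples):

import subprocess
from pathlib import Path

SCRIPTS = Path("data/scripts")   # extra_feats.scm, label.feats, label-full.awk

def utt_to_full_lab(utt_file, out_lab):
    """Dump Festival Segment features from a .utt and reformat them as an
    HTS full-context label, following the demo's data/Makefile steps."""
    tmp = Path("tmp.feats")
    # dumpfeats comes with Festival/festvox
    subprocess.run(
        ["dumpfeats",
         "-eval", str(SCRIPTS / "extra_feats.scm"),
         "-relation", "Segment",
         "-feats", str(SCRIPTS / "label.feats"),
         "-output", str(tmp),
         str(utt_file)],
        check=True,
    )
    # reformat the dumped features into a full-context label
    with open(out_lab, "w") as lab:
        subprocess.run(
            ["awk", "-f", str(SCRIPTS / "label-full.awk"), str(tmp)],
            stdout=lab, check=True,
        )
    tmp.unlink()

utt_to_full_lab("utts/alice01.utt", "labels/full/alice01.lab")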

Thanks,
Erica

--
---------------------------------------
Heiga ZEN (in Japanese)
Byung Ha CHUN (in Korean)
<heigazen@xxxxxxxxxx>









Attachment: aligned.mlf
Description: Binary data


Follow-Ups
[hts-users:04431] Re: HTS demo 'alice' labels, Rasmus Dall
References
[hts-users:04425] HTS demo 'alice' labels, Erica Cooper
[hts-users:04426] Re: HTS demo 'alice' labels, Heiga ZEN (Byung Ha CHUN)
[hts-users:04427] Re: HTS demo 'alice' labels, Erica Cooper
[hts-users:04428] Re: HTS demo 'alice' labels, Heiga ZEN (Byung Ha CHUN)
[hts-users:04429] Re: HTS demo 'alice' labels, Rasmus Dall