
[hts-users:03838] Re: Quality from demos


Hello Dietmar!
Thank you so much for your reply.
I recently changed the questions folder to fit my phone set, and the results have improved a lot:

https://dl.dropboxusercontent.com/u/81143637/nueva.wav

I think the problem now is the pitch of the synthesized wave. I've tried male and female voices, and in both cases the result sounds lower in pitch than the original voice. I ran the demo script cmu_us_arctic_slt as it comes, and the result was perfect.

I also saw this in the log file: "x2x : warning: input data is over the range of type 'short'!" I read about it on the mailing list, but I didn't understand whether I have to change something in the scripts.
Thanks in advance for your attention,

Marvin. 


2013/8/7 Dietmar Schabus <schabus@xxxxxx>
Hello Marvin,

I think it's due to the questions.

If the questions do not match your phone set, the decision tree based clustering cannot produce reasonable results (everything will be answered "no").
I think the easiest way to obtain questions is to write a simple script that generates them from a set of classes (like "vowel" or "fricative" etc.) where the phones (of your phone set) that belong to each class are listed.
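
A minimal sketch of such a generator script, in Python. The phone classes below are hypothetical placeholders, not a real Spanish phone set, and the context patterns assume the quinphone part of the full-context labels looks like LL^L-C+R=RR@... as in the English demo; both would need to be adapted to your own phone set and label format.

```python
# Hypothetical phone classes; replace with the classes of your own phone set.
PHONE_CLASSES = {
    "Vowel": ["a", "e", "i", "o", "u"],
    "Nasal": ["m", "n"],
    "Fricative": ["f", "s", "x"],
}

# Assumed label layout: LL^L-C+R=RR@...
# Each entry is (question-name prefix, pattern with {} where the phone goes).
POSITIONS = [
    ("LL", "{}^*"),
    ("L", "*^{}-*"),
    ("C", "*-{}+*"),
    ("R", "*+{}=*"),
    ("RR", "*={}@*"),
]


def make_questions(classes):
    """Return one QS line per (context position, phone class) pair."""
    lines = []
    for prefix, pattern in POSITIONS:
        for name, phones in classes.items():
            # One pattern per phone in the class, joined with commas.
            patterns = ",".join(pattern.format(p) for p in phones)
            lines.append(f'QS "{prefix}-{name}" {{{patterns}}}')
    return lines


if __name__ == "__main__":
    print("\n".join(make_questions(PHONE_CLASSES)))
```

The output can be redirected to a file and merged with the remaining (non-phone) questions. This only covers phone-identity questions; questions about syllable, word, and phrase context are left as they are.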

Regards,
Dietmar



On 2013-08-06 22:06, Marvin Coto wrote:
Hello!

I'm trying to use the demo scripts of HTS 2.2 (the speaker-dependent
training demos in English and Portuguese) for a voice in Spanish.
I replaced the raw files with ones from my Spanish database (184 files made from
good quality recordings), and likewise the utt files (generated with Festival
2.1 from the text transcription of the audio files). Even though I
haven't changed the Questions folder, I was expecting a little more
quality from the beginning.
An example of the result, synthesized in Festival, can be heard here:

https://dl.dropboxusercontent.com/u/81143637/ConEstoico.wav

It is supposed to say "Con estoico respeto a la justicia adyacente guardo
sus flechas". I'm sure nobody needs to speak Spanish to realize
something is going very wrong. From that audio file, does anybody have
any idea where the problem could be? Maybe the raw files, the way I'm
generating the utt files, or could it be the questions folder?

In the last case, do I have to change the questions manually, or does that
have to be done in HTK?

Thanks in advance,

Marvin.




--
----------------------------------------------------------------------
Dietmar Schabus | Researcher
phone +43 1 5052830-48 | fax -99 | schabus@xxxxxx | http://userver.ftw.at/~schabus/

FTW Telecommunications Research Center Vienna
Donau-City-Straße 1/3 | 1220 Vienna | Austria | www.ftw.at


