
[hts-users:01226] Questions concerning the HTS_engine


Dear List members,

I'm currently setting up some quality tests (intelligibility/naturalness) for different speech synthesis techniques.

For the moment I'll only be looking at the set
   {spectral features, MLSA filter, excitation signal, LPC vocoder}
and leaving aside any kind of HMM modeling.


That said, I was hoping you could help me with some of the choices I have to make, which still remain unclear to me after going through the documentation and the mailing list archives a couple of times.


1) I've seen that one can get the cepstral features with the HCopy command.
Is there any way of getting the MLSA filter and applying it using only the available commands, or would I have to write a program myself using the HTS_engine API?

1.1) Or should I rather be looking at something like SPTK?


2) Supposing I get past point 1, is it possible to choose between a mixed excitation model and a simple pulse one?


3) Is there any work comparing the quality of the synthesized speech using different parameters (number of coefficients, etc.)? Is there an optimal configuration?



Hope I'm not making anyone repeat themselves.
Thanks in advance,

Miguel Vaz

-----------------
Dept. Industrial Electronics
Universidade do Minho
Guimaraes
Portugal

Follow-Ups
[hts-users:01248] Re: Questions concerning the HTS_engine, Heiga ZEN (Byung Ha CHUN)