[hts-users:01226] Questions concerning the HTS_engine
Dear List members,
I'm currently setting up some quality tests (intelligibility/
naturalness) for different speech synthesis techniques.
For the moment I'll only be looking at the set
{spectral features, MLSA filter, excitation signal, LPC vocoder}
and leaving aside any kind of HMM modeling.
That said, I was hoping you could help me with some of the choices I
have to make which, after going through the documentation and the
mailing list archives a couple of times, still remain unclear to me.
1) I've seen one can get the cepstral features with the HCopy command.
Is there any way of obtaining the MLSA filter and applying it using only
the available commands, or would I have to write a program myself using
the HTS_engine API?
1.1) Or should I rather be looking at something like SPTK?
2) Supposing I get past point 1, is it possible to choose between a mixed
excitation model and a simple pulse one?
3) Is there any work comparing the quality of the synthesized speech
under different parameter settings (number of coefficients, etc.)? Is
there any optimal configuration?
Hope I'm not making anyone repeat himself.
Thx in advance,
Miguel Vaz
-----------------
Dept. Industrial Electronics
Universidade do Minho
Guimaraes
Portugal
Follow-Ups:
- [hts-users:01248] Re: Questions concerning the HTS_engine, Heiga ZEN (Byung Ha CHUN)