
[hts-users:00959] Re: LSFs (was: Re: Questions about certain concepts)



  Hi;

Simon King wrote:
> it is better to work in other representations in which it is simpler
> to guarantee a stable filter (e.g., LSFs are one such common
> representation that is used widely in speech coding and synthesis).

Has anyone actually tried, and published results on, using LSFs for
HMM-based speech synthesis or HMM-based speech recognition?
How do they perform in terms of alignment quality and likelihood
convergence during HMM training?

In theory, the cepstrum should perform best with HMMs because
the Gaussian log-likelihood implements a Mahalanobis
distance in the cepstral domain, and this distance can be related
to a perceptually relevant spectral distortion measure; see:
    A.H. Gray and J.D. Markel
    "Distance measures for speech processing"
    IEEE Trans. Acoustics, Speech and Signal Processing
    Vol. ASSP-24, no. 5, Oct. 1976
This calls into question the performance of LSFs, for which the
distance implemented by the log-likelihood has a different meaning.
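
To make the argument above concrete, here is a small numerical sketch
(not taken from the Gray and Markel paper; the function names and test
values are purely illustrative). It assumes a diagonal-covariance Gaussian
and a real, symmetric cepstrum c_0..c_N of the log-magnitude spectrum,
so that log|S(w)| = c_0 + 2 * sum_n c_n cos(n w). Under these assumptions
the log-likelihood is, up to constants, minus half the squared Mahalanobis
distance, and a weighted Euclidean distance between cepstra equals the
mean squared log-spectral difference (Parseval). The exact normalisation
in the paper may differ.

```python
import math


def diag_gaussian_loglik(x, mean, var):
    """Return (log-likelihood, squared Mahalanobis distance) of x under a
    Gaussian with diagonal covariance (variances given in `var`)."""
    dim = len(x)
    log_det = sum(math.log(v) for v in var)
    maha_sq = sum((xi - mi) ** 2 / vi for xi, mi, vi in zip(x, mean, var))
    loglik = -0.5 * (dim * math.log(2.0 * math.pi) + log_det + maha_sq)
    return loglik, maha_sq


def log_spectrum(cep, omega):
    """Log-magnitude spectrum at frequency omega, assuming a real symmetric
    cepstrum: c_0 + 2 * sum_{n>=1} c_n * cos(n * omega)."""
    return cep[0] + 2.0 * sum(
        c * math.cos(n * omega) for n, c in enumerate(cep) if n > 0
    )


def mean_sq_log_spectral_distance(cep_a, cep_b, num_bins=256):
    """Mean squared difference of the two log spectra over num_bins
    uniformly spaced frequencies in [0, 2*pi)."""
    total = 0.0
    for k in range(num_bins):
        omega = 2.0 * math.pi * k / num_bins
        diff = log_spectrum(cep_a, omega) - log_spectrum(cep_b, omega)
        total += diff * diff
    return total / num_bins


def cepstral_distance_sq(cep_a, cep_b):
    """Parseval form of the same distance, computed on the cepstra:
    (dc_0)^2 + 2 * sum_{n>=1} (dc_n)^2."""
    d0 = cep_a[0] - cep_b[0]
    return d0 * d0 + 2.0 * sum(
        (a - b) ** 2 for a, b in zip(cep_a[1:], cep_b[1:])
    )


if __name__ == "__main__":
    # Illustrative cepstral vectors (made-up values).
    cep_a = [0.5, 1.0, -0.3, 0.1]
    cep_b = [0.2, 0.8, -0.1, 0.0]
    print("spectral domain :", mean_sq_log_spectral_distance(cep_a, cep_b))
    print("cepstral domain :", cepstral_distance_sq(cep_a, cep_b))
    # With unit variances the Mahalanobis term is a plain Euclidean
    # distance between the cepstra.
    loglik, maha_sq = diag_gaussian_loglik(cep_a, cep_b, [1.0] * len(cep_a))
    print("Mahalanobis^2   :", maha_sq, " log-likelihood:", loglik)
```

With non-unit variances the Mahalanobis term becomes a weighted cepstral
distance, so the link to log-spectral distortion is only approximate.
No such Parseval-type identity ties a Euclidean distance between LSF
vectors to a log-spectral distance, which is the concern raised above.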

Alternatively, besides Hemptinne's master's thesis, has anyone tried
a spectral envelope + OLA (overlap-add) paradigm rather than a
source + filter paradigm for the vocoding component?

Thanks;
                                             -*- Sacha K. -*-
--
Dr.Sacha Krstulovic - Research Engineer
Toshiba Research Europe Limited
Cambridge Research Laboratory
Speech Technology Group
208 Science Park, Milton Road
Cambridge CB4 0GZ - United Kingdom
Tel:    +44 1223 436 978
Fax:    +44 1223 436 909
E-mail: sacha@xxxxxxxxxxxxxxxxx





Follow-Ups
[hts-users:00961] Re: LSFs, Heiga ZEN (Byung Ha CHUN)
References
[hts-users:00941] Questions about certain concepts, marc sobhy
[hts-users:00942] Re: Questions about certain concepts, Heiga ZEN (Byung Ha CHUN)
[hts-users:00944] Re: Questions about certain concepts, Tamer Fares
[hts-users:00945] Re: Questions about certain concepts, Simon King