
[hts-users:02003] Re: Adaptation


Hi,

Probably I should answer this question.

Both SMAPLR and CSMAPLR were implemented for HTS2.1.
I don't remember which one we used for HTS2.0.1 by default.
But you can utilize MLLRMEAN, MLLRVAR, and CMLLR in HTS2.0 series.
In HTS2.1, you can use MLLRMEAN, MLLRVAR (unconstrained MLLR),
CMLLR (constrained MLLR), SMAPLR, and CSMAPLR.

In order to use SMAPLR or CSMAPLR, please use
the following configuration. I assume the models used
are HSMMs rather than HMMs.

--- SMAPLR ---

ADAPTKIND    = TREE
DURADAPTKIND = TREE
TRANSKIND    = MLLRMEAN
USEBIAS      = TRUE
DURTRANSKIND = MLLRMEAN
DURUSEBIAS   = TRUE
USESMAP      = TRUE
MLLRDIAGCOV  = FALSE
SMAPSIGMA    = 1.0

--- CSMAPLR ---

ADAPTKIND    = TREE
DURADAPTKIND = TREE
TRANSKIND    = CMLLR
USEBIAS      = TRUE
DURTRANSKIND = CMLLR
DURUSEBIAS   = TRUE
USESMAP      = TRUE
SMAPSIGMA    = 1.0

SMAPSIGMA is a hyper-parameter that adjusts the influence of the
prior transforms. Setting it to 0 is equivalent to ML estimation.
The larger the value, the more weight the prior transforms have
relative to the ML transforms.
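
To illustrate the behaviour described above, here is a toy sketch in Python. It is not the HTS implementation: it estimates only a scalar bias, assumes unit observation variance, and uses a hypothetical prior weight `tau` that plays the role SMAPSIGMA plays in the description (0 gives the ML estimate, larger values favour the prior). All numbers are made up.

```python
def map_bias(residuals, prior_bias, tau):
    """MAP estimate of a scalar bias under a Gaussian prior.

    residuals  : adaptation observations minus the model mean
    prior_bias : bias suggested by the prior (e.g. a parent node's transform)
    tau        : hypothetical prior weight; 0 reduces to the ML estimate
    Assumes unit observation variance and prior variance 1/tau.
    """
    n = len(residuals)
    return (sum(residuals) + tau * prior_bias) / (n + tau)


residuals = [0.9, 1.1, 1.0, 1.2]  # made-up adaptation data
prior_bias = 0.0                  # made-up prior value

ml = map_bias(residuals, prior_bias, tau=0.0)         # plain sample mean
mild = map_bias(residuals, prior_bias, tau=1.0)       # between ML and prior
strong = map_bias(residuals, prior_bias, tau=1000.0)  # close to the prior

print(ml, mild, strong)
```

With `tau = 0` the result is the sample mean of the residuals; as `tau` grows the estimate moves monotonically towards `prior_bias`.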

Personally, I define adaptation techniques that transform observation
features as feature-space adaptation, and techniques that transform model
parameters (e.g. mean or variance) as model-space adaptation.
The two views are equivalent, however, for some linear transformation
techniques such as CMLLR, SEMIT, HLDA, and so on.
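
The equivalence for CMLLR can be checked numerically. The sketch below is a one-dimensional illustration under one common convention (assumed here, not taken from the HTS source): the model-space form maps the mean to a'*mu - b' and the variance to a'^2 * var, while the feature-space form maps the observation to a*o + b with a = 1/a', b = b'/a', and multiplies the likelihood by the Jacobian |a|. The function and variable names are made up for the example.

```python
import math


def gauss(x, mean, var):
    """Density of a 1-D Gaussian N(x; mean, var)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def model_space(o, mu, var, a_m, b_m):
    """Transform the model: the same a_m scales both mean and variance."""
    return gauss(o, a_m * mu - b_m, a_m * a_m * var)


def feature_space(o, mu, var, a_m, b_m):
    """Transform the observation instead, and include the Jacobian |a|."""
    a = 1.0 / a_m
    b = b_m / a_m
    return abs(a) * gauss(a * o + b, mu, var)


# Made-up model parameters and transform.
mu, var = 0.5, 2.0
a_m, b_m = 1.7, -0.3

for o in (-1.0, 0.0, 0.8, 3.2):
    lhs = model_space(o, mu, var, a_m, b_m)
    rhs = feature_space(o, mu, var, a_m, b_m)
    print(o, math.isclose(lhs, rhs, rel_tol=1e-12))
```

The two likelihoods agree only because the mean and variance share the same transform; with separate transforms (as in MLLRMEAN plus MLLRVAR) no such feature-space rewriting exists in general.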

See Mark Gales' latest book, available on his webpage:
  The Application of Hidden Markov Models in Speech Recognition
  Mark Gales and Steve Young
This book answers your question.

Regards,
Dr. Junichi Yamagishi
University of Edinburgh





On 1 Jun 2009, at 14:38, Javi Palenzuela wrote:

Hi all,

Reading the README file of HTS2.1RC1 and the mailing list, it seems
that the algorithm used in that last release of HTS is CSMAPLR. To use
it, config file must have TRANSKIND = CMLLR. Is that right?

HTS2.0.1, was it using CMLLR?

Finally, there is something I'm a bit confused about. What
kind of transformation is being used, feature-based or model-based? CMLLR is
described in the HTKBook as a feature adaptation, unlike MLLRMEAN
adaptation, which is described as a model adaptation. In some papers,
such as

Junichi Yamagishi, Katsumi Ogata, Yuji Nakano, Juri Isogai, Takao
Kobayashi, "HSMM-BASED MODEL ADAPTATION ALGORITHMS FOR
AVERAGE-VOICE-BASED SPEECH SYNTHESIS", 2006

CMLLR is described as a model adaptation.

In the last paper about SMAPLR,

Junichi Yamagishi, Takao Kobayashi, Yuji Nakano, Katsumi Ogata, and
Juri Isogai, "Analysis of Speaker Adaptation Algorithms for HMM-Based
Speech Synthesis and a constrained SMAPLR Adaptation Algorithm", IEEE
TRANSACTIONS ON AUDIO, SPEECH, AND LANGUAGE PROCESSING, 2009

it is described that the feature adaptation in CMLLR is equivalent to
that of model transforms.

Thank you






References
[hts-users:02002] Adaptation, Javi Palenzuela