Dear Najeeb,

Matt is correct on this point.
As I understand it, part of the reason for using this "natural" parameter representation instead of the usual N(mu, sigma^2) representation is that it simplifies the implementation when solving Eq. (15) in [1]. To be exact, solving that equation requires the inverse of the variance rather than the variance itself, and the mean divided by the variance rather than the mean.
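
Written out for a single dimension, the relation described above looks as follows. The symbols p and \eta are only illustrative names for the two stored quantities; they are not taken from [1] or from the HTS source.

```latex
% Stored ("natural") parameters for a Gaussian N(\mu, \sigma^2):
p = \frac{1}{\sigma^2}, \qquad \eta = \frac{\mu}{\sigma^2}

% Converting back to mean and variance:
\sigma^2 = \frac{1}{p}, \qquad \mu = \frac{\eta}{p}

% Worked example: \mu = 2, \sigma^2 = 0.25
% gives p = 4 and \eta = 8; converting back, 1/4 = 0.25 and 8/4 = 2.
```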

To convert back to mean and variance, you can simply modify the code in the function WriteParms() in the HMGenS.c source file according to the mathematical relation above. These are the lines of code to modify:

    /* output pdfs */
    if (outPdf) {
       WriteVector(pdffp, pst->mseq[pst->t], inBinary);
       if (pst->fullCov)
          WriteTriMat(pdffp, pst->vseq[pst->t].inv, inBinary);
       else
          WriteVector(pdffp, pst->vseq[pst->t].var, inBinary);
    }
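
As a rough illustration of the conversion for the diagonal-covariance case, here is a minimal standalone sketch. It is not the actual HMGenS.c patch: the function name natural_to_moment() and its arguments are hypothetical, and it works on plain 0-indexed double arrays rather than HTK's own vector type, so it would need adapting before being called from WriteParms().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of HTS): convert diagonal-covariance
 * "natural" parameters back to ordinary mean and variance.
 *
 *   nat_mean[i] = mean[i] / var[i]   (mean divided by variance)
 *   prec[i]     = 1.0 / var[i]       (inverse variance)
 *
 * Returns 0 on success, or -1 if any inverse variance is not strictly
 * positive, in which case the output arrays are left untouched.
 */
int natural_to_moment(const double *nat_mean, const double *prec, size_t dim,
                      double *mean, double *var)
{
    size_t i;

    assert(nat_mean != NULL && prec != NULL && mean != NULL && var != NULL);

    /* Validate first so that nothing is written on failure. */
    for (i = 0; i < dim; i++) {
        if (!(prec[i] > 0.0))
            return -1;
    }

    for (i = 0; i < dim; i++) {
        var[i]  = 1.0 / prec[i];        /* sigma^2 = 1 / p          */
        mean[i] = nat_mean[i] * var[i]; /* mu = (mu / sigma^2) * sigma^2 */
    }
    return 0;
}
```

For the full-covariance branch an element-wise division is not enough: the stored mean vector would have to be multiplied by the covariance matrix, i.e. the inverse of the stored matrix, which this sketch does not cover.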

I have done this before.

Yang Wang

[1] K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, and T. Kitamura, "Speech parameter generation algorithms for HMM-based speech synthesis," in ICASSP, 2000, pp. 1315-1318.