
[hts-users:03321] About the prototype definition of the MSD-HMM


Hi,

I have a question about the definition of the prototype of the MSD-HMM.
My monophone.mmf in the model directory looks like this:
~h "mo"
<BEGINHMM>
<NUMSTATES> 7
<STATE> 2

<STREAM> 1
<MEAN> 120
...
<VARIANCE> 120
...
<GCONST> -7.524412e+02

<STREAM> 2
<NUMMIXES> 2
<MIXTURE> 1 5.459577e-01
<MEAN> 1
 4.907650e+00
<VARIANCE> 1
 2.217153e-02
<GCONST> -1.971069e+00
<MIXTURE> 2 4.540337e-01
<MEAN> 0
<VARIANCE> 0
<GCONST> 0.000000e+00

<STREAM> 3
...

<STREAM> 4
...

I want to use the monophone models to calculate the KL divergence
between the frequency distributions. However, there is both voiced and
unvoiced speech. I think streams 2, 3, and 4 are the frequency streams,
and that in each of them mixture 1 is the distribution for voiced
speech and mixture 2 is the distribution for unvoiced speech, each
with its own mixture weight. Is my interpretation correct?
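
To make the intended calculation concrete, here is a minimal sketch in
Python. It assumes the interpretation above holds, which is exactly what
is being asked and is not confirmed: mixture 1 is a one-dimensional
Gaussian for voiced frames and mixture 2 is a zero-dimensional unvoiced
component. Under that assumption the divergence for one state and stream
can be looked at in two parts: the closed-form KL divergence between the
two voiced Gaussians, and the KL divergence between the two
voiced/unvoiced weight pairs. The function names and all values for the
second model are hypothetical; only the first model's numbers come from
the stream 2 listing above.

```python
import math


def kl_gaussian_1d(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two univariate Gaussians, in nats."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q
                  - 1.0)


def kl_voicing(w_voiced_p, w_unvoiced_p, w_voiced_q, w_unvoiced_q):
    """KL(p || q) between two voiced/unvoiced weight pairs, in nats.

    Each pair is renormalised first, because weights printed in an MMF
    may not sum to exactly 1. All weights must be strictly positive.
    """
    p_sum = w_voiced_p + w_unvoiced_p
    q_sum = w_voiced_q + w_unvoiced_q
    p = (w_voiced_p / p_sum, w_unvoiced_p / p_sum)
    q = (w_voiced_q / q_sum, w_unvoiced_q / q_sum)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Model A: state 2, stream 2 of the listing above.
a_mean, a_var = 4.907650e+00, 2.217153e-02
a_w_voiced, a_w_unvoiced = 5.459577e-01, 4.540337e-01

# Model B: hypothetical values, for illustration only.
b_mean, b_var = 5.10, 3.00e-02
b_w_voiced, b_w_unvoiced = 0.70, 0.30

print("KL between voiced Gaussians:",
      kl_gaussian_1d(a_mean, a_var, b_mean, b_var))
print("KL between voicing weights: ",
      kl_voicing(a_w_voiced, a_w_unvoiced, b_w_voiced, b_w_unvoiced))
```

How the two parts should be combined into a single figure, for example
by weighting the Gaussian term with the voiced weight, depends on how
the divergence over the mixed voiced/unvoiced space is defined, so the
sketch leaves them separate.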

-- 
Lisa Kwan
lisakwan1102(at)gmail.com
Advanced Speech Technology Lab, ASTL

Follow-Ups
[hts-users:03322] Re: About the prototype definition of the MSD-HMM, 那兴宇