[hts-users:03324] Re: About the prototype definition of the MSD-HMM
- From: Kwan Lisa <lisakwan1102@xxxxxxxxx>
- Date: Wed, 6 Jun 2012 02:45:20 +0800
- Delivered-to: hts-users@xxxxxxxxxxxxxxx
Hi,
I don't understand why the distributions of the frequency and its
first- and second-order dynamics are placed in streams 2, 3, and 4
respectively, while the distributions of the spectrum and its first-
and second-order dynamics are all placed in stream 1. What if I put
all of the frequency features in stream 2 and treat them as
3-dimensional data, like the spectral data?
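
For the KL-divergence calculation mentioned in my earlier message (quoted below), here is a minimal sketch of what I have in mind for one state of stream 2. It assumes each stream has exactly one 1-dimensional Gaussian (voiced) and one 0-dimensional mixture (unvoiced), as in the quoted monophone.mmf, and that the divergence can be split into a discrete part over the voiced/unvoiced weights plus a Gaussian part scaled by the voiced weight. The names `MsdStream`, `gaussian_kl`, and `msd_kl` are my own, not part of HTS, and the second model's parameters are made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class MsdStream:
    """One state of a 1-D MSD stream: a voiced Gaussian plus an unvoiced weight."""
    w_voiced: float    # weight of mixture 1 (1-dimensional space)
    mean: float        # mean of the voiced Gaussian
    var: float         # variance of the voiced Gaussian
    w_unvoiced: float  # weight of mixture 2 (0-dimensional space)


def gaussian_kl(mean_p, var_p, mean_q, var_q):
    """Closed-form KL(N_p || N_q) between two 1-D Gaussians."""
    return 0.5 * (math.log(var_q / var_p)
                  + (var_p + (mean_p - mean_q) ** 2) / var_q
                  - 1.0)


def msd_kl(p, q):
    """KL(p || q) between two MSD stream distributions.

    Assumes all weights are strictly positive, so every log is defined.
    """
    # Discrete part: divergence between the voiced/unvoiced weights.
    discrete = (p.w_voiced * math.log(p.w_voiced / q.w_voiced)
                + p.w_unvoiced * math.log(p.w_unvoiced / q.w_unvoiced))
    # Continuous part: Gaussian divergence, counted only when voiced.
    continuous = p.w_voiced * gaussian_kl(p.mean, p.var, q.mean, q.var)
    return discrete + continuous


# Stream 2 values copied from the quoted monophone.mmf.
state2 = MsdStream(w_voiced=5.459577e-01, mean=4.907650e+00,
                   var=2.217153e-02, w_unvoiced=4.540337e-01)
# Hypothetical second model, invented for this example only.
other = MsdStream(w_voiced=0.7, mean=5.0, var=0.03, w_unvoiced=0.3)

print(msd_kl(state2, other))
```

If this decomposition is valid, I suppose streams 3 and 4 could be handled the same way and the per-stream values combined, but I am not sure that is the right treatment.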
2012/6/5 那兴宇 <nxy-yzqs@xxxxxxx>:
> Hi,
>
> Yes, you are right.
> Streams 3 and 4 hold the distributions of the first- and second-order dynamics, respectively.
>
> --
> Xingyu Na (那兴宇)
> Beijing Institute of Technology
> naxy(at)bit.edu.cn
> asr.naxingyu(at)gmail.com
> naxingyu at {facebook, twitter, linkedin}
>
>
> At 2012-06-05 02:43:21,"Kwan Lisa" <lisakwan1102@xxxxxxxxx> wrote:
>>Hi,
>>
>>I have a question about the definition of the MSD-HMM prototype.
>>My monophone.mmf in the model directory is like:
>>~h "mo"
>><BEGINHMM>
>><NUMSTATES> 7
>><STATE> 2
>>
>><STREAM> 1
>><MEAN> 120
>>...
>><VARIANCE> 120
>>...
>><GCONST> -7.524412e+02
>>
>><STREAM> 2
>><NUMMIXES> 2
>><MIXTURE> 1 5.459577e-01
>><MEAN> 1
>> 4.907650e+00
>><VARIANCE> 1
>> 2.217153e-02
>><GCONST> -1.971069e+00
>><MIXTURE> 2 4.540337e-01
>><MEAN> 0
>><VARIANCE> 0
>><GCONST> 0.000000e+00
>>
>><STREAM> 3
>>...
>>
>><STREAM> 4
>>...
>>
>>I want to use the monophone model to calculate the KL divergence
>>between the frequency distributions. However, there is both voiced
>>and unvoiced speech. I think streams 2, 3, and 4 are the frequency
>>streams, and in each stream mixture 1 represents the weighted
>>distribution of the voiced speech and mixture 2 represents the weight
>>of the unvoiced speech. Is my interpretation correct?
>>
>>--
>>Lisa Kwan
>>lisakwan1102(at)gmail.com
>>Advanced Speech Technology Lab, ASTL
>>
>
>
>
--
Lisa Kwan
lisakwan1102(at)gmail.com
Advanced Speech Technology Lab, ASTL
- Follow-Ups:
  - [hts-users:03325] Re: About the prototype definition of the MSD-HMM, 王洋
  - [hts-users:03327] Re: About the prototype definition of the MSD-HMM, 那兴宇
- References:
  - [hts-users:03321] About the prototype definition of the MSD-HMM, Kwan Lisa
  - [hts-users:03322] Re: About the prototype definition of the MSD-HMM, 那兴宇