
[hts-users:01424] Regarding the new IT command of HHEd of HTS-2.1RC2


Hi, all,

I tried the new IT command in HHEd from HTS-2.1RC2. From its brief description, I presume that "IT" computes the means, variances and weights of all leaf nodes of a given decision tree from the data of a given full-context model. That is, in the setup below,

    +---fullcontext.mmf---+     +------------+     +---clustered.mmf(A)
    |         Questions---+-->--+ TB of HHEd +-->--+
    |   statistic files---+     +------------+     +---decision tree
    |   ---+-----------                                ------+------
    |      |                                                 |
    V      |    +--------------------<-----------------------+
    |      |    |
    |      |    +---->----+     +------------+
    |      +--------->----+-->--+ IT of HHEd +-->--clustered.mmf(B)
    +---------------->----+     +------------+

clustered.mmf(A) should be exactly the same as clustered.mmf(B).
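
To make the presumption concrete, here is a minimal Python sketch of one way a leaf's mean and variance could be pooled from the full-context models assigned to it, weighting each model by its occupancy count. This is only an illustration of the presumed behaviour, not HHEd's actual code; the function name pool_leaf and the toy numbers are hypothetical.

```python
def pool_leaf(members):
    """Pool diagonal Gaussians into one leaf Gaussian.

    members: list of (occupancy, mean_vector, variance_vector) tuples,
    one per full-context model assigned to the leaf (assumed layout).
    """
    total_occ = sum(occ for occ, _, _ in members)
    dim = len(members[0][1])
    # Occupancy-weighted average of the member means.
    mean = [sum(occ * mu[d] for occ, mu, _ in members) / total_occ
            for d in range(dim)]
    # Occupancy-weighted second moment minus the squared pooled mean.
    var = [sum(occ * (v[d] + mu[d] ** 2) for occ, mu, v in members) / total_occ
           - mean[d] ** 2
           for d in range(dim)]
    return mean, var


# Toy one-dimensional example with made-up values.
toy_members = [(1.0, [0.0], [1.0]), (3.0, [4.0], [1.0])]
toy_mean, toy_var = pool_leaf(toy_members)
print(toy_mean, toy_var)
```

If both TB and IT pooled the same statistics in this way over the same tree, the resulting leaf parameters would be expected to agree.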

Is my presumption correct? If so, I have found something strange.

Here is a tied stream from clustered.mmf(A):
~p "spectrum_s6_22"
<STREAM> 1
<MEAN> 123
 1.047655e-002 1.504280e-002 2.348183e-002 3.310003e-002 4.486487e-002 5.667774e-002 6.780868e-002 7.683331e-002 8.563469e-002 9.706887e-002 1.103918e-001 1.234748e-001 1.396191e-001 1.510170e-001 1.614375e-001 ....
<VARIANCE> 123
 7.492220e-007 8.066245e-007 5.884825e-006 1.223310e-005 1.242518e-005 7.765059e-006 1.244235e-005 2.299086e-005 2.221467e-005 1.530542e-005 5.468271e-006 8.743534e-006 1.323156e-005 1.424249e-005 3.373104e-005 ....
<GCONST> -1.209281e+003

and the same tied stream from clustered.mmf(B):
~p "spectrum_s6_22"
<STREAM> 1
<MEAN> 123
 1.047094e-002 1.496241e-002 2.337582e-002 3.277307e-002 4.448711e-002 5.656610e-002 6.779103e-002 7.714669e-002 8.584835e-002 9.745022e-002 1.106722e-001 1.235035e-001 1.397249e-001 1.508762e-001 1.608496e-001 ....
<VARIANCE> 123
 8.766731e-007 1.213980e-006 7.213330e-006 1.387043e-005 1.408452e-005 7.693285e-006 1.230850e-005 2.457094e-005 2.624969e-005 1.566426e-005 6.887655e-006 9.689571e-006 1.397587e-005 1.570964e-005 3.065937e-005 ....
<GCONST> -1.199125e+003

The means and variances of the two tied streams are not equal. In particular, the variance differences in some dimensions are not negligible. I don't know why this happens.
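
To quantify the mismatch, the following Python sketch computes the per-dimension relative difference between the two streams. It uses only the first 15 of the 123 dimensions, since those are the values listed above; the helper name rel_diff is hypothetical.

```python
# First 15 dimensions of "spectrum_s6_22" as listed above.
mean_a = [1.047655e-002, 1.504280e-002, 2.348183e-002, 3.310003e-002,
          4.486487e-002, 5.667774e-002, 6.780868e-002, 7.683331e-002,
          8.563469e-002, 9.706887e-002, 1.103918e-001, 1.234748e-001,
          1.396191e-001, 1.510170e-001, 1.614375e-001]
mean_b = [1.047094e-002, 1.496241e-002, 2.337582e-002, 3.277307e-002,
          4.448711e-002, 5.656610e-002, 6.779103e-002, 7.714669e-002,
          8.584835e-002, 9.745022e-002, 1.106722e-001, 1.235035e-001,
          1.397249e-001, 1.508762e-001, 1.608496e-001]
var_a = [7.492220e-007, 8.066245e-007, 5.884825e-006, 1.223310e-005,
         1.242518e-005, 7.765059e-006, 1.244235e-005, 2.299086e-005,
         2.221467e-005, 1.530542e-005, 5.468271e-006, 8.743534e-006,
         1.323156e-005, 1.424249e-005, 3.373104e-005]
var_b = [8.766731e-007, 1.213980e-006, 7.213330e-006, 1.387043e-005,
         1.408452e-005, 7.693285e-006, 1.230850e-005, 2.457094e-005,
         2.624969e-005, 1.566426e-005, 6.887655e-006, 9.689571e-006,
         1.397587e-005, 1.570964e-005, 3.065937e-005]


def rel_diff(a, b):
    """Relative difference (b - a) / a for each dimension."""
    return [(y - x) / x for x, y in zip(a, b)]


mean_rel = rel_diff(mean_a, mean_b)
var_rel = rel_diff(var_a, var_b)

# Report the largest absolute relative difference and where it occurs.
worst_mean = max(range(len(mean_rel)), key=lambda i: abs(mean_rel[i]))
worst_var = max(range(len(var_rel)), key=lambda i: abs(var_rel[i]))
print("largest mean difference: dim %d, %.2f%%"
      % (worst_mean + 1, 100 * mean_rel[worst_mean]))
print("largest variance difference: dim %d, %.2f%%"
      % (worst_var + 1, 100 * var_rel[worst_var]))
```

On these 15 dimensions the means agree to within about 1%, while the variance of the second dimension differs by roughly 50%.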

The command I used is:
HHEd_2.1RC2.exe -A -B -C train.conf -D -T 1 -s -i -H fullcontext_cmp.mmf -w clustered_cmp_IT.mmf IT.hed full.list

IT.hed contained:
TR 2
LS "D:/HHEd-IT/stats_cmp.sts"
LT "D:/HHEd-IT/trees_lnF0.inf"
LT "D:/HHEd-IT/trees_spectrum.inf"
IT "D:/HHEd-IT/trees_IT.inf"

Many thanks!

Best regards,
Hui LIANG




Follow-Ups
[hts-users:01425] Re: Regarding the new IT command of HHEd of HTS-2.1RC2, Heiga ZEN (Byung Ha CHUN)
[hts-users:01427] Re: Regarding the new IT command of HHEd of HTS-2.1RC2, Heiga ZEN (Byung Ha CHUN)