From: "Hui LIANG (@Idiap)" <hui.liang@xxxxxxxx>
Date: 16 July 2009 10:44:01 BST
To: Junichi Yamagishi <jyamagis@xxxxxxxxxxxx>
Subject: So influential stream weight?
Hello, Yamagishi-san,
My question arises from some odd occupancy values reported by HERest
while it was estimating speaker-specific CMLLR transforms. In my
experiment, each bndap stream had *two* mixture components. For the
moment I used only one adaptation utterance, which contains 1765 frames.
When the stream weight of bndap was set to 0.0, HERest reported the
following occupancy values during transform estimation:
Node 6 (stream=5, vsize=15, occ=0.000000)
Node 14 (stream=5, vsize=15, occ=0.000000)
Node 9069 (stream=5, vsize=15, occ=0.000000)
Node 9095 (stream=5, vsize=15, occ=0.000000)
Node 9107 (stream=5, vsize=15, occ=0.000000)
Node 9132 (stream=5, vsize=15, occ=0.000000)
....
Node 6 here was the root node of my bndap regression tree. Since the
occupancy is 0.0, I got no transforms for bndap.
I also tried 0.8, 1.0 and 1.5 as the bndap stream weight. I found the
occupancy value of Node 6 varied greatly. See:
bndap stream weight: 0.8
Node 6 (stream=5, vsize=15, occ=3.425499)
bndap stream weight: 1.0
Node 6 (stream=5, vsize=15, occ=1764.913008)
bndap stream weight: 1.5
Node 6 (stream=5, vsize=15, occ=12001236446949118)
However, if there was only *one* mixture component in each bndap stream,
the occupancy value was almost constant. See:
bndap stream weight: 0.0
Node 6 (stream=5, vsize=15, occ=1764.810253)
bndap stream weight: 0.8
Node 6 (stream=5, vsize=15, occ=1764.853866)
bndap stream weight: 1.0
Node 6 (stream=5, vsize=15, occ=1764.738418)
bndap stream weight: 1.5
Node 6 (stream=5, vsize=15, occ=1764.805865)
I am not sure whether these odd values are due to something hard-coded
in HTS. As I understand it, a stream weight should not be this
influential.
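
To make that expectation concrete, here is a minimal sketch in Python.
It is not HTS or HERest code. It assumes the usual HTK-style state
likelihood, where each stream's mixture likelihood is raised to its
stream weight, and it replaces forward-backward with a per-frame
normalisation over a few hypothetical states. The function names, the
toy means and the mixture weights are all invented for illustration.
Under these assumptions, the occupancy summed over all states and
mixture components of a stream should come out close to the number of
frames, whatever the stream weight is.

```python
import math
import random


def gauss_pdf(x, mean, var=1.0):
    """Density of a 1-D Gaussian (toy stand-in for a stream's Gaussian)."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def root_occupancy(obs_a, obs_b, weight_b, means_a, means_b, mix_weights):
    """Sum of component occupancies in toy stream B over all frames.

    Assumption: state likelihood = p_A(x_a) * [sum_m c_m N_m(x_b)] ** weight_b,
    and state posteriors are obtained by normalising over states per frame
    (no transitions, unlike real forward-backward).
    """
    total = 0.0
    for xa, xb in zip(obs_a, obs_b):
        state_liks = []
        comp_liks = []
        for j in range(len(means_a)):
            comps = [c * gauss_pdf(xb, m) for c, m in zip(mix_weights[j], means_b[j])]
            lik_b = sum(comps)
            # The stream weight acts as an exponent on the stream likelihood.
            state_liks.append(gauss_pdf(xa, means_a[j]) * lik_b ** weight_b)
            comp_liks.append(comps)
        norm = sum(state_liks)
        for j, comps in enumerate(comp_liks):
            state_post = state_liks[j] / norm
            lik_b = sum(comps)
            # Component posterior = state posterior * within-stream share.
            total += sum(state_post * c / lik_b for c in comps)
    return total


# Hypothetical toy model: 3 states, stream B has 2 mixture components each.
MEANS_A = [-1.0, 0.0, 1.0]
MEANS_B = [[-1.0, 1.0], [0.0, 2.0], [-2.0, 0.5]]
MIX_WEIGHTS = [[0.5, 0.5], [0.3, 0.7], [0.6, 0.4]]

random.seed(0)
N_FRAMES = 200
OBS_A = [random.gauss(0.0, 1.0) for _ in range(N_FRAMES)]
OBS_B = [random.gauss(0.0, 1.0) for _ in range(N_FRAMES)]

for w in (0.0, 0.8, 1.0, 1.5):
    occ = root_occupancy(OBS_A, OBS_B, w, MEANS_A, MEANS_B, MIX_WEIGHTS)
    print(f"stream weight {w}: total occupancy {occ:.6f}")
```

In this sketch the stream weight changes how the occupancy is shared
between states, but not the total, because the state posteriors sum to
one in every frame and the component shares sum to one within each
state. That matches the one-mixture results above, not the two-mixture
ones.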
Thank you very much in advance!
Hui LIANG