
[hts-users:02380] Re: Question about speaker adaptation


Hi,

On 22 Jan 2010, at 10:35, Daniel wrote:
I use SMAPLR for adaptation; this method uses a context decision tree to do the clustering. The demo script has "occupancy thresholds for adaptation"
for mgc, lf0 and duration separately.
What is the relationship between the adaptation data and these thresholds? I only know that when the threshold increases, the number of transform matrices decreases.

That simply means you can control the number of transforms
for those features separately.

If you use context-clustering decision trees for tying
linear transforms, the relation between the threshold and
the number of transforms is an exponential function (in theory).
See the attached threthold-vs-numtransform.tiff.
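
To make the threshold's effect concrete, here is a toy sketch in Python. It is not the HTS implementation: the tree shape, the 60/40 occupancy split, and the rule "one transform per node whose occupancy reaches the threshold" are all assumptions chosen only to illustrate why a larger threshold yields fewer transforms.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """A node of a toy binary tree; occupancy is the amount of adaptation data it sees."""
    occupancy: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_tree(occupancy: float, depth: int) -> Node:
    """Build a hypothetical tree that splits occupancy 60/40 at every level."""
    if depth == 0:
        return Node(occupancy)
    return Node(
        occupancy,
        build_tree(occupancy * 0.6, depth - 1),
        build_tree(occupancy * 0.4, depth - 1),
    )


def count_transforms(node: Optional[Node], threshold: float) -> int:
    """Count nodes with enough data for their own transform (assumed rule).

    Children never hold more data than their parent, so the descent can
    stop as soon as a node falls below the threshold.
    """
    if node is None or node.occupancy < threshold:
        return 0
    return (
        1
        + count_transforms(node.left, threshold)
        + count_transforms(node.right, threshold)
    )


if __name__ == "__main__":
    tree = build_tree(10000.0, depth=6)  # made-up total occupancy
    for threshold in (100.0, 500.0, 1000.0, 2000.0):
        print(threshold, count_transforms(tree, threshold))
```

With separate trees and thresholds for mgc, lf0 and duration, the same count would be made once per feature stream.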

So, if you look at the inverse of the threshold instead, the
number of transforms becomes easy to predict
(see invthrethold-vs-numtransform.tiff).
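
One possible way to use this in practice is sketched below in Python. It assumes that the number of transforms is roughly linear in 1/threshold over the range of interest, which is an assumption made for this sketch and should be checked against your own plots. The function names and the numbers in the example are hypothetical placeholders, not measurements from any real system.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_num_transforms(thresholds, counts, new_threshold):
    """Predict the transform count at new_threshold from a few trial runs.

    The fit is done against the inverse of the threshold (assumed to be
    approximately linear), then evaluated at 1 / new_threshold.
    """
    inverse = [1.0 / t for t in thresholds]
    slope, intercept = fit_line(inverse, counts)
    return slope / new_threshold + intercept


if __name__ == "__main__":
    # Placeholder values: replace with (threshold, number of transforms)
    # pairs observed in your own adaptation runs.
    thresholds = [500.0, 1000.0, 2000.0, 4000.0]
    counts = [82, 41, 21, 11]
    print(predict_num_transforms(thresholds, counts, 1500.0))
```

Running a handful of trial thresholds per stream and fitting them this way gives a rough starting point; it does not say which threshold is best.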

And how do I find the optimum threshold?

It depends entirely on your decision tree structure,
the amount of data, the average voice models and their dimensionality,
speaker characteristics, language (e.g. # of phonemes),
recording conditions, etc.

Regards,
Dr. Junichi Yamagishi
CSTR



Daniel Wu

TIFF image

TIFF image

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

References
[hts-users:02379] Question about speaker adaptation, Daniel