[hts-users:02380] Re: Question about speaker adaptation
Hi,
On 22 Jan 2010, at 10:35, Daniel wrote:
I use SMAPLR for adaptation; this method uses context decision trees
to do the clustering.
The demo script has "occupancy thresholds for adaptation"
for mgc, lf0 and duration separately.
What is the relationship between the adaptation data and these
thresholds?
I only know that when the threshold increases, the number of
transform matrices decreases.
Having separate thresholds simply means you can control the number
of transforms for each of those features separately.
If you use context-clustering decision trees for tying of
linear transforms, the relation between the threshold and
the number of transforms is an exponential function (in theory).
See threthold-vs-numtransform.tiff
So, if you look at the inverse of the threshold instead, the number
of transforms becomes easy to predict (see invthrethold-vs-numtransform.tiff)
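
To make the threshold/transform-count relation concrete, here is a rough
Python sketch. It is not the HTS code; it assumes a simplified tying rule
in which a tree node gets its own transform when its occupancy reaches the
threshold and it is either a leaf or has a child that falls below the
threshold. The names Node, build_balanced_tree and count_transforms, and
the leaf occupancies, are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Toy decision-tree node; occupancy stands in for the adaptation-data count."""
    occupancy: float
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build_balanced_tree(leaf_occupancies: List[float]) -> Node:
    """Pair nodes bottom-up; a parent's occupancy is the sum of its children."""
    nodes = [Node(occupancy=o) for o in leaf_occupancies]
    while len(nodes) > 1:
        paired = []
        for i in range(0, len(nodes) - 1, 2):
            l, r = nodes[i], nodes[i + 1]
            paired.append(Node(l.occupancy + r.occupancy, l, r))
        if len(nodes) % 2:  # odd node out is carried up unchanged
            paired.append(nodes[-1])
        nodes = paired
    return nodes[0]


def count_transforms(node: Node, threshold: float) -> int:
    """Count nodes that get their own transform under the assumed rule."""
    if node.occupancy < threshold:
        return 0  # too little data here; an ancestor's transform is shared
    children = [c for c in (node.left, node.right) if c is not None]
    if not children:
        return 1  # leaf with enough data
    below = sum(count_transforms(c, threshold) for c in children)
    # This node needs its own transform only to cover a starved child.
    needs_own = any(c.occupancy < threshold for c in children)
    return below + (1 if needs_own else 0)


# Hypothetical leaf occupancies, chosen only for the demonstration.
root = build_balanced_tree([100, 200, 300, 400, 500, 600, 700, 800])
for threshold in (50, 350, 1000, 3000):
    print(threshold, 1.0 / threshold, count_transforms(root, threshold))
```

With these toy occupancies the counts are 8, 7, 3 and 1 for thresholds
50, 350, 1000 and 3000, so raising the threshold reduces the number of
transforms. Running the same sweep per stream (mgc, lf0, duration) and
plotting the count against 1/threshold gives the kind of curve mentioned
above.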
And how do I find the optimum threshold?
It totally depends on your decision tree structure,
the amount of data, average voice models, their dimension,
speaker characteristics, language (e.g. number of phonemes),
recording conditions, etc.
Regards,
Dr. Junichi Yamagishi
CSTR
Daniel
Wu