
[hts-users:02380] Re: Question about speaker adaptation

From: Junichi Yamagishi <jyamagis@xxxxxxxxxxxx>
Date: Fri, 22 Jan 2010 13:42:45 +0000
Cc: Junichi Yamagishi <jyamagis@xxxxxxxxxxxx>
Delivered-to: hts-users@xxxxxxxxxxxxxxx

Hi,

On 22 Jan 2010, at 10:35, Daniel wrote:

I use SMAPLR for adaptation. This method uses context decision trees to do the clustering. The demo script has “occupancy thresholds for adaptation”
for mgc, lf0 and duration separately.
What is the relationship between the adaptation data and these thresholds? I only know that when the threshold increases, the number of transform matrices decreases.


That simply means you can control the number of transforms
for those features separately.

If you use context-clustering decision trees for tying of
linear transforms, the relation between the threshold and
the number of transforms is an exponential function (in theory).
See threthold-vs-numtransform.tiff

So, if you look at the inverse of the threshold instead, the
number of transforms is easier to predict
(see invthrethold-vs-numtransform.tiff)
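
To make the threshold/transform relation concrete, here is a rough toy
sketch in Python. It is not HTS code. It assumes a balanced binary tree
whose leaves all have the same occupancy, that each leaf takes its
transform from the lowest node on its path whose occupancy reaches the
threshold, and that the root always provides a global fallback
transform. The names Node, build_balanced_tree and count_transforms are
made up for this illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    occ: float                                   # occupancy seen at this node
    children: List["Node"] = field(default_factory=list)


def build_balanced_tree(leaf_occs):
    """Pair nodes bottom-up; an internal node's occupancy is the sum of its children."""
    level = [Node(occ) for occ in leaf_occs]
    while len(level) > 1:
        level = [
            Node(sum(k.occ for k in level[i:i + 2]), level[i:i + 2])
            for i in range(0, len(level), 2)
        ]
    return level[0]


def count_transforms(root, threshold):
    """Count the distinct nodes that supply a transform to at least one leaf.

    Assumption: a leaf uses the lowest node on its path with
    occ >= threshold, and falls back to the root otherwise.
    """
    used = set()

    def visit(node, lowest_ok):
        if node.occ >= threshold:
            lowest_ok = node
        if not node.children:
            used.add(id(lowest_ok))
            return
        for child in node.children:
            visit(child, lowest_ok)

    visit(root, root)
    return len(used)


if __name__ == "__main__":
    root = build_balanced_tree([100.0] * 8)      # toy occupancies, not real data
    for thr in (100, 200, 400, 800):
        print(thr, 1.0 / thr, count_transforms(root, thr))
```

In this toy setup the thresholds 100, 200, 400 and 800 give 8, 4, 2 and
1 transforms, so the count scales with the inverse of the threshold.
Real trees are unbalanced and real occupancies are uneven, so treat this
only as an intuition for why looking at the inverse is convenient.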

And how do I find the optimum threshold?


It totally depends on your decision tree structure,
the amount of data, average voice models, their dimension,
speaker characteristics, language (e.g. number of phonemes),
recording conditions, etc.

Regards,
Dr. Junichi Yamagishi
CSTR

DanielWu

