Hi Everyone,
I'm cross-posting this to both groups as it's a
basic question but may have different implications for different
synthesizers.
The question is: how can one guarantee sufficient and
balanced phonetic coverage in a prompt list built for statistical parametric/HMM
synthesis? This is a variant of the sparse data problem, but I gather that
sparse data is less of an issue for HMM-type synthesizers than for concatenative
synthesizers.
I also gather that there are algorithms that can
compensate for sparse data in the statistical HMM case. Is this
true?
The issue comes up for me in multilingual
synthesis, where I might not have sufficient knowledge of the target
language.
Thank you.
Ronnie
Please point me to any relevant papers on the
topic.