[hts-users:00765] Re: Questions on training and flat pitch pattern
Hi,
Lee Sillon wrote (2007/08/08 0:10):
> 1. If the training corpus is larger (for example, 10,000 sentences), the
> training process crashes because of memory consumption (about 2 GB).
> Could this be solved?
I guess tree-based clustering (HHEd) consumes a huge amount of memory for larger training data.
You can use the low-memory implementation of tree-based clustering by specifying the -r option.
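As a sketch, the option would be added to the HHEd call in the context-clustering step of the training scripts. The file names below (config, MMF, question/edit file, model list) are hypothetical placeholders modeled on the HTS demo layout, and the exact form of -r may differ between HTS versions:

```shell
# Context clustering with HHEd, low-memory variant.
# -r : low-memory tree-based clustering (as suggested in this message;
#      check your HHEd usage output for the exact syntax)
# All file names here are illustrative placeholders.
HHEd -A -B -C config.cnf -D -T 1 -r \
     -H models/fullcontext.mmf -w models/clustered.mmf \
     cxc.hed fullcontext.list
```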
> 2. How many sentences are enough to train a good model? Are 1000 sentences enough?
I think 1000 sentences (about 1 hour?) is not enough.
In last year's Blizzard Challenge 2006 we used 5 hours of speech, and this year we used 7 hours.
> 3. In some papers, POS is considered because it influences the pitch pattern
> a lot. But in the demo it is ignored. Is that one of the reasons for the
> flat pitch pattern?
I don't think so, because broad POS has more effect on F0 patterns than detailed POS.
In Japanese, we checked the effect of broad and detailed POS and found that even broad POS had only a small effect on the final synthesized speech quality.
So the current demo uses only broad POS (gPOS in Festival).
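To illustrate the broad/detailed distinction, here is a minimal sketch of a gPOS-style lookup. The class names and word sets are rough illustrations in the spirit of Festival's gpos feature, not its actual data:

```python
# Broad "guessed POS" (gPOS) lookup, loosely modeled on Festival's gpos
# feature: a few closed-class function-word categories, with every other
# word falling back to a single "content" class.
# The word sets below are illustrative, not Festival's actual lists.
GPOS_CLASSES = {
    "det": {"the", "a", "an", "this", "that"},
    "aux": {"is", "are", "was", "were", "be", "have", "has"},
    "cc":  {"and", "but", "or"},
    "in":  {"in", "on", "at", "of", "to", "with"},
}

def gpos(word: str) -> str:
    """Return the broad POS class for a word, or 'content'."""
    w = word.lower()
    for cls, words in GPOS_CLASSES.items():
        if w in words:
            return cls
    return "content"

# A detailed tagger would instead distinguish NN/NNS/VBD/JJ/..., which
# greatly enlarges the context set seen by decision-tree clustering.
print([gpos(w) for w in "the cat sat on the mat".split()])
# → ['det', 'content', 'content', 'in', 'det', 'content']
```

With only a handful of broad classes, the POS context adds very few questions to the clustering trees, which is why the demo can use it (or drop it) with little impact on quality.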
Best regards,
Heiga ZEN (Byung Ha CHUN)
--
------------------------------------------------
Heiga ZEN (in Japanese pronunciation)
Byung Ha CHUN (in Korean pronunciation)
Department of Computer Science and Engineering
Nagoya Institute of Technology
Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan
http://www.sp.nitech.ac.jp/~zen
------------------------------------------------
- Follow-Ups
  - [hts-users:00766] Re: Questions on training and flat pitch pattern, Junichi Yamagishi
  - [hts-users:00780] Inceasing Number of mixtrure componenets, Tamer Fares
- References
  - [hts-users:00764] Questions on training and flat pitch pattern, Lee Sillon