
[hts-users:02165] Re: HInit in Trainning.pl

Subject: [hts-users:02165] Re: HInit in Trainning.pl
From: "Nickolay V. Shmyrev" <nshmyrev@xxxxxxxxx>
Date: Wed, 12 Aug 2009 01:16:38 +0400
Delivered-to: hts-users@xxxxxxxxxxxxxxx

On Sun, 09/08/2009 at 12:35 +0300, Stas wrote:
> Javi,
> 
> Thank you for your response. 
> 
> How much data (~1000 samples, ~10000 samples, more) may be
> sufficient in order to have robust models without HINIT, HEREST? 
> Can you please provide me more information?

What is "robust"? An average label-boundary error of 10 ms, 11 ms,
12 ms? :)

Really, you either have labels, which is good for TTS, or you don't have
them (or don't want to create them), which is often several percent worse
in MOS terms. It takes a lot of time to hand-label the database, so it's
up to you to decide whether you need it.

Another solution would be to segment a small part of the database by hand
(say, 30 utterances), bootstrap models from that part, and force-align the
rest of the database to get better accuracy.
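
If you go that way, one rough check on whether the force-aligned labels
are good enough is to hand-label a few extra utterances and measure the
average boundary error against the aligner's output. Below is a minimal
Python sketch of that measurement, not part of the HTS scripts. It assumes
HTK-style label lines ("start end label", times in 100 ns units) and that
both files contain the same label sequence; the function names and the
sample labels are made up for illustration.

```python
# Sketch: mean absolute boundary error between two HTK-style labelings.
# Assumption: each line is "start end label" with times in 100 ns units.

def parse_htk_labels(text):
    """Return a list of (start, end, label) tuples from HTK-style label text."""
    segments = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        segments.append((int(fields[0]), int(fields[1]), fields[2]))
    return segments


def mean_boundary_error_ms(reference, aligned):
    """Mean absolute difference of internal segment boundaries, in ms."""
    if [s[2] for s in reference] != [s[2] for s in aligned]:
        raise ValueError("label sequences differ")
    # Skip the last segment: its end time is just the end of the utterance.
    diffs = [abs(r[1] - a[1]) for r, a in zip(reference[:-1], aligned[:-1])]
    if not diffs:
        return 0.0
    return sum(diffs) / len(diffs) / 10_000.0  # 100 ns units -> milliseconds


# Hypothetical hand labels and forced-alignment output for one utterance.
hand = "0 1000000 sil\n1000000 2500000 a\n2500000 4000000 sil\n"
auto = "0 1100000 sil\n1100000 2400000 a\n2400000 4000000 sil\n"

error = mean_boundary_error_ms(parse_htk_labels(hand), parse_htk_labels(auto))
print(error)  # both internal boundaries are off by 10 ms, so this prints 10.0
```

Averaging this over the held-out utterances gives a number you can compare
for different sizes of the bootstrap set.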


