
[hts-users:02900] Re: f0 extraction problem


On 21/06/11 15:56, Matt Shannon wrote:
> During the initial training phase these small inconsistencies are
> swamped by a huge inconsistency with the provided label files, which
> quite often (169 of 1132 utts) are short by 9 frames or more, and are
> sometimes short by as much as 57 frames (cmu_us_arctic_slt_b0340).

I investigated a bit further, and thought I'd share what I found in case
anyone finds it interesting!  The 'lab' directory of the original ARCTIC
corpus (http://festvox.org/cmu_arctic/dbs_slt.html) contains lab files
which are basically the correct length (out by at most 1 or 2 frames).
Presumably these lab files were then converted to utt files (in
'festival/utts') using festvox.  However, when the last two phones in
the lab file are both 'pau', the utt file drops the final 'pau', making
the utt file too short.  These utt files are in turn used to generate
the HTS lab files.  Therefore the 169 HTS label files that are much too
short are each missing a final 'pau' phone.
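
For anyone who wants to check their own copy, here is a minimal Python
sketch (not part of festvox or HTS) that flags lab files whose last two
phones are both 'pau'.  It assumes the lab files use an xlabel-style
layout: an optional header ending in a line containing only '#',
followed by one "end_time colour label" entry per line.  The function
names are made up for illustration.

```python
def read_phones(lab_text):
    """Return the phone labels found in the text of one lab file.

    Assumed layout (not confirmed by the post): an optional header
    terminated by a line containing only '#', then one entry per line
    of the form 'end_time colour label'.
    """
    lines = lab_text.splitlines()
    # Skip everything up to and including the header terminator, if any.
    for i, line in enumerate(lines):
        if line.strip() == "#":
            lines = lines[i + 1:]
            break
    phones = []
    for line in lines:
        fields = line.split()
        if len(fields) >= 3:
            phones.append(fields[2])
    return phones


def ends_with_double_pau(phones):
    """True if the last two phones are both 'pau'."""
    return len(phones) >= 2 and phones[-1] == "pau" and phones[-2] == "pau"
```

Running ends_with_double_pau(read_phones(text)) over the text of each
file in the 'lab' directory should list the utterances that are
candidates for a dropped final 'pau'.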

Matt
