
[hts-users:00080] f0 extraction using pda on raw sound files


Hi all,

I tried to extract f0 contours using the pda utility from the Edinburgh Speech Tools, but I'm not sure which command-line parameters work best with HTS. For instance, what low/high frequency boundaries should be set when extracting f0 for a male voice? Also, are there any other parameters (e.g. the voiced/unvoiced threshold) that could affect HTS performance?

I created some f0 files using this utility and packed them as binary little-endian floats, but I receive the error "ViterbiAlign: No path found in 8'th segment" when training reaches "sil" (silence). I have successfully trained using the exact same data, but with f0 contours taken from the KTH "Snack" f0 extraction tool. The problem then is that many segments in sentence-final position become partially unvoiced, though there is no evidence for this in the training data.
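For reference, the packing step described above can be sketched roughly as follows. This is only an illustration and not necessarily how the files in question were produced: it assumes headerless files of 32-bit floats with one value per frame, and the function names pack_f0 and unpack_f0 are hypothetical.

```python
import struct

def pack_f0(values, path):
    """Write f0 values as headerless little-endian 32-bit floats.

    Assumption: one float per analysis frame and no file header.
    """
    with open(path, "wb") as fh:
        fh.write(struct.pack("<%df" % len(values), *values))

def unpack_f0(path):
    """Read a file written by pack_f0 back into a list of floats."""
    with open(path, "rb") as fh:
        data = fh.read()
    if len(data) % 4 != 0:
        # A size that is not a multiple of 4 bytes suggests a header,
        # a different sample type, or a truncated file.
        raise ValueError("file size is not a multiple of 4 bytes")
    return list(struct.unpack("<%df" % (len(data) // 4), data))
```

Reading a file back with unpack_f0 and comparing its length with the number of frames in the matching spectral parameter file is a cheap way to rule out a packing or byte-order mistake before looking at the extraction settings.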

Am I right in believing this might be a threshold problem for the voiced/unvoiced parts of speech? Or could it be that some frames have been truncated from the f0 files created with KTH Snack?
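Both hypotheses can be checked directly on the f0 tracks. The sketch below compares two tracks for the same utterance (for example, one from pda and one from Snack) by frame count and by fraction of voiced frames. It assumes each track is a plain list of per-frame f0 values in which unvoiced frames are stored as 0; that convention and the function names are assumptions, not something confirmed for either tool.

```python
def voiced_fraction(f0, threshold=0.0):
    """Fraction of frames whose f0 value is above the threshold.

    Assumption: unvoiced frames are stored as 0 (or any value <= threshold).
    """
    if not f0:
        return 0.0
    return sum(1 for v in f0 if v > threshold) / len(f0)

def compare_tracks(track_a, track_b):
    """Summarise frame-count and voicing differences between two f0 tracks."""
    return {
        "frames_a": len(track_a),
        "frames_b": len(track_b),
        # A non-zero difference points to truncated or padded frames.
        "frame_diff": len(track_a) - len(track_b),
        # A large gap here points to different voicing decisions.
        "voiced_a": voiced_fraction(track_a),
        "voiced_b": voiced_fraction(track_b),
    }
```

A non-zero frame_diff would suggest a truncation or frame-shift mismatch, while equal frame counts with clearly different voiced fractions would point toward the voicing decision instead.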


Follow-Ups
[hts-users:00081] Re: f0 extraction using pda on raw sound files, Heiga Zen (Byung-Ha Chun)