
[hts-users:00018] ESPS get_f0


Hello everyone,

I have a working Festival unit selection voice for Finnish and I was
thinking of building an HTS voice out of it.
(So basically I have the utts and the raws (wavs, to be exact).)

After an initial look at the HTS-demo_CMU-ARCTIC-AWB,
which I am trying to imitate, there are a few questions that puzzle me...

1) The size of *.raw files is always divisible by 32000. Why?
I don't think this is the case in the original ARCTIC files.
Is this necessary and do you just append silence to the end of the
files?
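
To make question 1 concrete: if the *.raw files are headerless 16 kHz,
16-bit mono audio (an assumption, not something confirmed here), then
32000 bytes is exactly one second, so a size divisible by 32000 would mean
a whole number of seconds. A minimal sketch for checking a directory of
raw files; the function names and the sample format are assumptions:

```python
import os

# Assumed sample format: 16000 samples/s * 2 bytes/sample, mono, no header.
BYTES_PER_SECOND = 16000 * 2


def size_report(path, block=BYTES_PER_SECOND):
    """Return (size in bytes, whole blocks, leftover bytes) for one file."""
    size = os.path.getsize(path)
    return size, size // block, size % block


def non_multiples(directory, block=BYTES_PER_SECOND):
    """List (name, size, leftover) for *.raw files whose size is not a
    multiple of `block` bytes."""
    result = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".raw"):
            size, _, leftover = size_report(os.path.join(directory, name), block)
            if leftover:
                result.append((name, size, leftover))
    return result
```

An empty list from non_multiples() on the demo's raw directory would
confirm the observation; running it on the original recordings would show
whether they differ.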

2) I got ESPS from KTH's web site, but it's notoriously hard to install
(at least on Linux and Cygwin/Windows). Does anyone know of precompiled
packages or some other way to get ESPS's get_f0 working?
Or is there a way to convert data from some other F0 extractor to
ESPS's *.f0 files?
For example, Festival's Edinburgh Speech Tools comes with a tool called pda
which also extracts an F0 contour. It might not be as good as get_f0
(the algorithm is different), but it would probably suffice.
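
The kind of conversion I have in mind could look roughly like the sketch
below. It assumes the *.f0 data can be represented as plain text with four
values per frame (F0, voicing probability, RMS, autocorrelation peak); that
layout is an assumption and should be checked against the demo's own files.
The function names are hypothetical, the RMS and peak columns are filled
with placeholder zeros, and the other extractor is assumed to report 0 for
unvoiced frames at the same frame shift:

```python
def to_f0_rows(f0_values, voiced_prob=1.0):
    """Turn a list of per-frame F0 values (0 = unvoiced) into four-field rows.

    Fields (assumed layout): F0, voicing probability, RMS, autocorrelation
    peak. RMS and peak are placeholders (0.0) because another F0 extractor
    would not provide them.
    """
    rows = []
    for f0 in f0_values:
        voiced = f0 > 0
        rows.append((
            float(f0) if voiced else 0.0,
            voiced_prob if voiced else 0.0,
            0.0,  # placeholder RMS
            0.0,  # placeholder autocorrelation peak
        ))
    return rows


def format_rows(rows):
    """Render the rows as one whitespace-separated line per frame."""
    return "\n".join("%.3f %.1f %.3f %.3f" % row for row in rows)
```

For example, format_rows(to_f0_rows([120.5, 0, 98])) gives three lines, the
middle one all zeros for the unvoiced frame. Whether the training scripts
use anything beyond the F0 column is something I do not know.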

best regards,

  Nicholas Volk / Bitlips





Follow-Ups
[hts-users:00019] Re: ESPS get_f0, Heiga ZEN