[hts-users:00078] changing sample rate


Hi all,

If I understand correctly, the duration/f0/melcep models do not depend on
the sample rate. So theoretically one could use the same models for, say,
16000 and 8000 samples/second.

The main point is to speed things up.
The other ways I know of to make things go faster are
reducing $numstate, reducing MCEP_ORDER, and
modifying %lambda in the training script.
Other ideas would be appreciated.

I'm trying to use models which were built at sample rate 16000 to produce
speech at sample rate 8000. I've redefined FPERIOD and RATE
accordingly. Should something else be changed as well?
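
To make "accordingly" concrete, here is a minimal sketch of the arithmetic,
assuming FPERIOD is the frame shift in samples and that the frame shift is
5 ms (the 5 ms value and the helper name are assumptions for illustration,
not taken from the scripts):

```python
def frame_period_samples(rate_hz, shift_ms=5.0):
    """Return the frame shift in samples for a given sample rate.

    shift_ms=5.0 is an assumed frame shift, used only for illustration.
    """
    samples = rate_hz * shift_ms / 1000.0
    if samples != int(samples):
        raise ValueError("frame shift is not a whole number of samples")
    return int(samples)


# Halving the sample rate halves the frame period in samples,
# so the frame shift in milliseconds stays the same.
for rate in (16000, 8000):
    print("RATE =", rate, "FPERIOD =", frame_period_samples(rate))
```

Keeping the frame shift constant in time would explain why durations and
the F0 contour carry over unchanged.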

Everything seems to work fine for durations and F0 contour,
but the speech quality is awful. Changing "vocal tract length" from
0.42 to 0.2 makes it somewhat more human-like, though it does not
resemble the underlying voice even then.

I've also built a version from 8K samples, but it was also bad,
probably about the same as described above.
(16K voices are perfectly fine.)

(I also did a brute-force resampling hack which took every second sample at
16K, thus yielding 8K.
It's relatively fine, with the exception of sibilants.)
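
One plausible explanation for the sibilants is aliasing: taking every second
sample without low-pass filtering first folds everything above 4 kHz back
into the 0-4 kHz band, and sibilants carry a lot of energy up there. The
sketch below compares the naive approach with a filtered one on a 6 kHz test
tone. The filter (a 63-tap Hamming-windowed sinc) and all function names are
my own choices for illustration, not anything from the HTS scripts:

```python
import math


def naive_decimate(x):
    """Keep every second sample, with no anti-aliasing filter."""
    return x[::2]


def lowpass_fir(num_taps=63, cutoff=0.23):
    """Hamming-windowed sinc low-pass filter.

    cutoff is a fraction of the input sample rate; 0.23 sits a little
    below the new Nyquist frequency (0.25) for 2:1 decimation.
    """
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        if k == 0:
            sinc = 2 * cutoff
        else:
            sinc = math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(sinc * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # normalise to unity gain at DC


def filtered_decimate(x, taps=None):
    """Low-pass filter, then keep every second sample."""
    taps = taps or lowpass_fir()
    half = len(taps) // 2
    out = []
    for i in range(0, len(x), 2):  # only compute the samples we keep
        acc = 0.0
        for j, t in enumerate(taps):
            idx = i + half - j
            if 0 <= idx < len(x):
                acc += t * x[idx]
        out.append(acc)
    return out


def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))


# A 6 kHz tone at 16 kHz cannot be represented at 8 kHz.
tone = [math.sin(2 * math.pi * 6000 * n / 16000) for n in range(1600)]
naive = naive_decimate(tone)
clean = filtered_decimate(tone)
print("naive RMS:   ", rms(naive))
print("filtered RMS:", rms(clean[40:-40]))  # skip the filter edge effects
```

With the naive version the tone survives at full strength, folded down to
2 kHz; with the filter it is almost entirely removed.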

best regards,
  Nicholas Volk

Follow-Ups
[hts-users:00079] Re: changing sample rate, Keiichi Tokuda