[hts-users:03015] Error speed synthesized speech while using 16K data with HTS-2.2

Dear all,

I recently switched my HTS project from HTS-2.01 to HTS-2.2, in order to use the English

speaker-dependent training demo from the HTS-2.2 project.

I installed HTS-2.2_for_HTK-3.4.1 without any trouble, and also changed my HTS_Engine to version 1.05.

The whole training process went smoothly, and the synthesized speech sounded good.

But when I changed the waveform data to cmu-bdl (16 kHz), I got very bad synthesized speech.

The voice sounds broken, and the speed of the speech is also weird.

I changed the feature extraction parameters in data/Makefile as follows:

SAMPFREQ = 16000 # 48000 Sampling frequency (48kHz)

FRAMELEN = 400 # 1200 Frame length in point (1200 = 48000 * 0.025)

FRAMESHIFT = 80 # 240 Frame shift in point (240 = 48000 * 0.005)

WINDOWTYPE = 1 # Window type -> 0: Blackman 1: Hamming 2: Hanning

NORMALIZE = 1 # Normalization -> 0: none 1: by power 2: by magnitude

FFTLEN = 1024 # FFT length in point

FREQWARP = 0.42 # 0.55 # frequency warping factor

GAMMA = 0 # pole/zero weight for mel-generalized cepstral (MGC) analysis

MGCORDER = 24 # order of MGC analysis

LNGAIN = 1 # use logarithmic gain rather than linear gain

LOWERF0 = 40 # lower limit for f0 extraction (Hz)

UPPERF0 = 400 # upper limit for f0 extraction (Hz)

NOISEMASK = 50 # standard deviation of white noise to mask noises in f0 extraction
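
The comments above give the 48 kHz values as 1200 = 48000 * 0.025 and 240 = 48000 * 0.005, i.e. a 25 ms window and a 5 ms shift. A minimal Python sketch to double-check the 16 kHz values under that assumption (the function name is hypothetical and not part of HTS):

```python
def frame_params(sample_rate_hz, window_ms=25, shift_ms=5):
    """Return (frame length, frame shift) in points for a given sampling rate.

    Assumes the 25 ms window and 5 ms shift implied by the Makefile comments.
    Integer arithmetic is used so the results are exact point counts.
    """
    frame_len = sample_rate_hz * window_ms // 1000
    frame_shift = sample_rate_hz * shift_ms // 1000
    return frame_len, frame_shift


print(frame_params(48000))  # original demo setting
print(frame_params(16000))  # cmu-bdl setting
```

For 16000 Hz this gives 400 and 80, which matches the FRAMELEN and FRAMESHIFT values listed above.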

and the training parameters in scripts/Config.pm:

# Speech Analysis/Synthesis Setting ==============

# speech analysis

$sr = 16000; #48000; # sampling rate (Hz)

$fs = 80; #240; # frame period (point)

$fw = 0.42; #0.55; # frequency warping

$gm = 0; # pole/zero representation weight

$lg = 1; # use log gain instead of linear gain

$fr = $fs/$sr; # frame period (sec)

# speech synthesis

$pf = 1.4; # postfiltering factor

$fl = 4096; # length of impulse response

$co = 2047; # order of cepstrum to approximate mel-generalized cepstrum
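
Since the same analysis settings appear in both data/Makefile and scripts/Config.pm, one thing worth ruling out is a mismatch between the two. A rough Python sketch of that check, with the values copied by hand from the listings in this post (the dictionary layout, the pairing of names, and the function names are only for illustration; nothing here parses the actual HTS files):

```python
# Values copied by hand from the two listings in this post.
makefile = {"SAMPFREQ": 16000, "FRAMESHIFT": 80, "FREQWARP": 0.42, "GAMMA": 0, "LNGAIN": 1}
config_pm = {"sr": 16000, "fs": 80, "fw": 0.42, "gm": 0, "lg": 1}

# Assumed pairing, based on the comments next to each variable.
PAIRS = [
    ("SAMPFREQ", "sr"),
    ("FRAMESHIFT", "fs"),
    ("FREQWARP", "fw"),
    ("GAMMA", "gm"),
    ("LNGAIN", "lg"),
]


def find_mismatches(mk, cfg):
    """Return the (Makefile name, Config.pm name) pairs whose values differ."""
    return [(m, c) for m, c in PAIRS if mk[m] != cfg[c]]


def frame_period_sec(fs, sr):
    """Mirror of $fr = $fs/$sr in Config.pm: frame period in seconds."""
    return fs / sr


print(find_mismatches(makefile, config_pm))
print(frame_period_sec(config_pm["fs"], config_pm["sr"]))
```

With the values posted here the two files agree, and the frame period works out to 5 ms for both the 48 kHz and the 16 kHz settings.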

The rest of the training parameters remain the same, but I cannot get a correct result from training.

Could anyone tell me where I might have gone wrong?

Thanks in advance!

Sincerely,

Mandy
