Subject: [hts-users:00500] saturation in the raw
From: Alexis Moinet <alexis.moinet@xxxxxxxxxx>
Date: Thu, 25 Jan 2007 16:13:36 +0100

Hi mailing list,

From time to time (not very often), some samples x of the speech synthesized by hts_engine (the xs in vocoder.cpp) are greater than 32767 (x > 1) or less than -32768 (x < -1), which means that the line "xs = (short) x;" in vocoder.cpp causes some sort of saturation.

We tried synthesizing the same sentence with two different training results: one training used about 1000 sentences and the other about 3000 sentences.

With the 1000-sentence training we got no saturation, but with the 3000-sentence training we got saturation for one state (10 frames, or 50 ms) of one phoneme (5 states). It was a "small" saturation (< 5 %).

We compared the features (means and variances for cepstrum and f0) of the models selected by hts_engine for the 1000- and 3000-sentence trainings, and they are roughly the same.

To correct this, we simply weighted all the "x" by 0.75 before computing xs, but we don't think this is an optimal solution because it reduces the dynamic range of the signal. (Another solution is to write the raw output as float or as 32-bit integers, but we don't find that practical.)
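
For illustration only, here is a minimal sketch in C of the weighting described above, combined with an explicit clip before the cast. It is not the actual hts_engine code: the function names, the gain parameter, and the clipping step are assumptions made for this example; only the factor 0.75 and the cast to short come from the description above.

```c
#include <stddef.h>

/* Hypothetical helper, not part of hts_engine: convert one synthesized
 * sample to 16-bit PCM after applying a gain. Values outside the 16-bit
 * range are clipped explicitly, because converting an out-of-range
 * double to short is undefined behavior in C. */
short sample_to_pcm16(double x, double gain)
{
    double y = x * gain;               /* e.g. gain = 0.75 as in the workaround */

    if (y > 32767.0)
        return 32767;                  /* clip positive overshoot */
    if (y < -32768.0)
        return -32768;                 /* clip negative overshoot */
    return (short) y;                  /* in range, so the cast is well defined */
}

/* Hypothetical helper: count how many of the n samples in x would still
 * fall outside the 16-bit range after applying the given gain. */
size_t count_out_of_range(const double *x, size_t n, double gain)
{
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        double y = x[i] * gain;
        if (y > 32767.0 || y < -32768.0)
            count++;
    }
    return count;
}
```

Calling count_out_of_range with a gain of 1.0 on the synthesized samples would show how many of them exceed the range, which might help to choose a gain closer to 1 than 0.75 when the overshoot is small.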


Has anyone encountered this problem before?

Thanks

Alexis Moinet

PhD student
FPMs - TCTS Lab

Follow-Ups: [hts-users:00503] Re: saturation in the raw, Nicholas Volk
