
[hts-users:00503] Re: saturation in the raw


I have seen this when using high values of beta but never when
beta is set to 0.0.

br,
 Nicholas


> Hi mailing list,
>
> From time to time (not very frequently), some samples x of the speech
> synthesized by hts_engine (the values cast to xs in vocoder.cpp) are
> greater than 32767 (x > 1) or less than -32768 (x < -1), which means
> that the line "xs = (short) x;" in vocoder.cpp causes some sort of
> saturation.
>
> We tried to synthesize the same sentence with two different training
> results: one trained on about 1000 sentences and the other on 3000
> sentences.
>
> With the 1000-sentence training we got no saturation, but with the
> 3000-sentence training we got saturation in one state (10 frames, or
> 50 ms) of one phoneme (5 states). It was a "small" saturation
> (< 5 %).
>
> We compared the features (means and variances for cepstrum and f0) of
> the models selected by hts_engine for the 1000- and 3000-sentence
> trainings, and they are roughly the same.
>
> To correct this, we simply scaled every "x" by 0.75 before computing
> xs, but we think this is not an optimal solution because it reduces
> the dynamic range of the signal. (Another solution is to write the raw
> file as float or as 32-bit integers, but we do not find that practical.)
>
> Has anyone encountered this problem before?
>
> Thanks
>
> Alexis Moinet
>
> PhD student
> FPMs - TCTS Lab
>
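
For illustration only, here is a minimal C++ sketch of the two workarounds
discussed in the quoted message: clamping before the cast to short, and
scaling by a gain. It is not taken from hts_engine. The names
to_int16_saturated and peak_gain are hypothetical, and the sketch assumes
the samples are doubles on a 16-bit scale. Note that in C++ converting an
out-of-range double to short is undefined behavior, so an explicit clamp
makes the result predictable. The gain here is derived from the measured
peak rather than fixed at 0.75, so utterances that do not overflow are
left unscaled.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical helper: clamp a sample to the 16-bit range, then round.
// (The original "xs = (short) x;" truncates; rounding is a choice made here.)
inline std::int16_t to_int16_saturated(double x) {
    const double lo = std::numeric_limits<std::int16_t>::min();  // -32768
    const double hi = std::numeric_limits<std::int16_t>::max();  //  32767
    const double clamped = std::max(lo, std::min(hi, x));
    return static_cast<std::int16_t>(std::lround(clamped));
}

// Hypothetical helper: gain that brings the largest absolute sample down
// to `limit`. Returns 1.0 when nothing exceeds the limit, so signals that
// already fit are not attenuated.
inline double peak_gain(const std::vector<double>& samples,
                        double limit = 32767.0) {
    double peak = 0.0;
    for (double s : samples) {
        peak = std::max(peak, std::fabs(s));
    }
    return peak > limit ? limit / peak : 1.0;
}
```

Computing the gain from the peak requires the whole utterance to be
available before writing, which may not suit a streaming setup; in that
case the clamp alone at least bounds the damage to the overflowing
samples.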



References
[hts-users:00500] saturation in the raw, Alexis Moinet