
[hts-users:04133] Re: Dealing with "metallic" sounding of some consonants in HTS


Hi Momop,
 
Sure: I changed it in the synthesis parameters (the -u option if you use the hts_engine API). I probably should have called it the "voiced/unvoiced threshold" to be clear.
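
For illustration, here is a minimal sketch of how such a call could be assembled from Python. It only builds the argument list and does not run anything. The file names are hypothetical placeholders, and the -m / -ow options assume a recent hts_engine build that loads a single .htsvoice file; check the usage message of your own binary.

```python
def build_hts_engine_command(voice, label, wav_out, uv_threshold=0.5):
    """Return the argv list for one hts_engine synthesis run (sketch only)."""
    if not 0.0 <= uv_threshold <= 1.0:
        raise ValueError("voiced/unvoiced threshold must lie in [0, 1]")
    return [
        "hts_engine",
        "-m", voice,               # .htsvoice file (hypothetical path)
        "-ow", wav_out,            # output waveform file
        "-u", str(uv_threshold),   # voiced/unvoiced (MSD) threshold
        label,                     # full-context label file
    ]


# Hypothetical file names, with the 0.95 threshold discussed in this thread.
cmd = build_hts_engine_command("voice.htsvoice", "utt.lab", "utt.wav", 0.95)
print(" ".join(cmd))
```

The list can be passed to subprocess.run as is once the paths point at real files.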
 
Ilya
 
02.10.2014, 05:53, "Momop Momop" <momop540@xxxxxxxxx>:
 
Ilya,
 
Can you please tell me where you changed the MSD threshold?
 
Thank you
momop

On Wednesday, October 1, 2014 1:18 PM, Ilya Edrenkin <ilia@xxxxxxxxxxxxxxx> wrote:


Hi,
 
Has anyone run into "metallic" or "ringing" sounding consonants in HTS, such as "z" and "th"?
 
A couple of examples are attached: the English "this is the zombie" from the cmu-slt voice, and a Russian one with the text "zoya zabrala zebru". In both cases "z" sounds strange in almost the same way, although the training databases are disjoint and even the training systems differ (the English voice comes from MaryTTS, the Russian one is built on hts-2.3alpha).
 
The training data itself shows no such effect for "z" phones. Postfiltering and altering the GV weights do not seem to help. Setting the MSD threshold as high as 0.95 does help: the "metallic" sound of "z" disappears, but the vowels are of course distorted into whispering.
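
To make the trade-off concrete, here is a toy sketch of the voicing decision. It assumes a frame is treated as voiced when its MSD voiced probability exceeds the threshold, and the probabilities are made up for illustration, not taken from any of the voices above.

```python
def voiced_flags(voiced_probs, threshold=0.5):
    """Mark a frame as voiced when its MSD voiced probability exceeds the threshold."""
    return [p > threshold for p in voiced_probs]


# Made-up per-frame probabilities: a weakly voiced "z" followed by a vowel.
probs = [0.10, 0.55, 0.90, 0.97, 0.80, 0.30]

default_flags = voiced_flags(probs, 0.5)    # most frames stay voiced
strict_flags = voiced_flags(probs, 0.95)    # only the most confident frame stays voiced

print(default_flags)
print(strict_flags)
```

With the strict threshold, frames that were confidently voiced are also excited with noise, which would be consistent with the whispering vowels.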
 
The main problem seems to lie in the 2.5-6 kHz frequency band. It is visible in the spectrogram (attached; positions 0.2, 0.5, 1.0). Applying a steep band-reject filter (almost removing this band) does help. Is there a cleaner way to deal with it? Perhaps tuning the MLSA filter or adjusting the mel-cepstral feature extraction parameters could help?
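
For reference, a band-reject filter of this kind can be sketched as a Hamming-windowed-sinc FIR design in plain Python. This is not the exact filter used for the attached samples; the 16 kHz sampling rate and the 201-tap length are assumptions chosen for illustration.

```python
import math


def lowpass_taps(cutoff_hz, sample_rate, num_taps):
    """Hamming-windowed sinc low-pass FIR taps (num_taps must be odd)."""
    fc = cutoff_hz / sample_rate              # normalised cutoff, cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * fc if k == 0 else math.sin(2 * math.pi * fc * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps


def bandstop_taps(low_hz, high_hz, sample_rate, num_taps=201):
    """Band-reject taps: unit impulse minus the band-pass between low_hz and high_hz."""
    low = lowpass_taps(low_hz, sample_rate, num_taps)
    high = lowpass_taps(high_hz, sample_rate, num_taps)
    taps = [l - h for l, h in zip(low, high)]   # minus (high - low), the band-pass
    taps[(num_taps - 1) // 2] += 1.0            # add the unit impulse
    return taps


def gain_at(taps, freq_hz, sample_rate):
    """Magnitude of the filter's frequency response at freq_hz."""
    w = 2 * math.pi * freq_hz / sample_rate
    re = sum(h * math.cos(w * n) for n, h in enumerate(taps))
    im = sum(h * math.sin(w * n) for n, h in enumerate(taps))
    return math.hypot(re, im)


def apply_fir(signal, taps):
    """Filter a list of samples, compensating for the filter's group delay."""
    mid = (len(taps) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for n, h in enumerate(taps):
            j = i + mid - n
            if 0 <= j < len(signal):
                acc += h * signal[j]
        out.append(acc)
    return out


SAMPLE_RATE = 16000                              # assumed sampling rate
taps = bandstop_taps(2500.0, 6000.0, SAMPLE_RATE)
for f in (1000, 4000, 7000):
    print(f, "Hz gain:", round(gain_at(taps, f, SAMPLE_RATE), 4))
```

Such a filter removes the band for every phone, not only for "z", which is why it feels more like a workaround than a fix.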
 
Thank you for any advice!
 
Regards,
Ilya



References
[hts-users:04132] Dealing with "metallic" sounding of some consonants in HTS, Ilya Edrenkin