
[hts-users:04318] Style Adaptation Issue with HTS-ARCTIC_ADAPT demo


Hi All,
I have used the speaker adaptation demo package to implement "style adaptation" for the anger emotion, following the paper by Tachibana, Yamagishi, Masuko and Kobayashi, "A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features". Training completed and I obtained a style-adapted voice; however, it does not sound as emotional as the voice I get when a simple HSMM-based TTS system is built. Furthermore, the synthesized speech contains sudden shrieking sounds and muffled passages in which some phones get swallowed, which makes it really unpleasant to listen to.

I think this may be due to the FRAMESHIFT and FRAMELEN I am using, as they differ from the values given in the paper. Do these settings affect or disturb synthesis? Can anyone explain why muffled or shrieking sounds are present in the synthesized speech? How can I improve the system to strengthen the effect of the style (anger/joy) on natural speech? My configuration is as follows:

./configure DATASET=uet_ur TRAINSPKR=mar_neu ADAPTSPKR=mar_ang ADAPTHEAD=b0 ALLSPKR='mar_neu mar_ang' F0_RANGES='mar_neu 70 625 mar_ang 77 635' GAMMA=0 FREQWARP=0.55 FRAMELEN=1440 FRAMESHIFT=288 SAMPFREQ=48000 TRANSKIND=mean MGCOCCTHRESH=1000.0 LF0OCCTHRESH=150.0 ADDMAP=0 USESMAP=FALSE --with-tcl-search-path=/usr/bin --with-fest-search-path=/home/ammarah/Festival_SpeechTools/festival/examples  --with-sptk-search-path=/usr/local/SPTK/bin --with-hts-search-path=/usr/local/HTS-2.2beta/bin --with-hts-engine-search-path=/usr/local/bin
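
To make the frame settings above easier to compare with values quoted in milliseconds, here is a minimal Python sketch that converts FRAMELEN and FRAMESHIFT (given in samples) to milliseconds at SAMPFREQ=48000, and back again. The helper names `samples_to_ms` and `ms_to_samples` are hypothetical, and the 25 ms / 5 ms targets are placeholders only, not values taken from the paper; substitute whatever the paper specifies.

```python
def samples_to_ms(n_samples: int, samp_freq_hz: int) -> float:
    """Convert a length in samples to milliseconds."""
    return 1000.0 * n_samples / samp_freq_hz


def ms_to_samples(duration_ms: float, samp_freq_hz: int) -> int:
    """Convert a duration in milliseconds to the nearest whole number of samples."""
    return round(duration_ms * samp_freq_hz / 1000.0)


# Values from the ./configure line above.
SAMPFREQ = 48000
FRAMELEN = 1440
FRAMESHIFT = 288

frame_len_ms = samples_to_ms(FRAMELEN, SAMPFREQ)      # 30.0 ms window
frame_shift_ms = samples_to_ms(FRAMESHIFT, SAMPFREQ)  # 6.0 ms shift

# Placeholder targets (assumption, NOT from the paper): 25 ms window, 5 ms shift.
target_len = ms_to_samples(25.0, SAMPFREQ)    # 1200 samples
target_shift = ms_to_samples(5.0, SAMPFREQ)   # 240 samples

print(f"current: FRAMELEN={frame_len_ms} ms, FRAMESHIFT={frame_shift_ms} ms")
print(f"example target: FRAMELEN={target_len}, FRAMESHIFT={target_shift}")
```

With the configuration above, this gives a 30 ms analysis window and a 6 ms frame shift, so any comparison with the paper should be made against those figures.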

Regards,
Ammarah.

Follow-Ups
[hts-users:04319] Re: Style Adaptation Issue with HTS-ARCTIC_ADAPT demo, Keiichiro Oura