
[hts-users:02180] Re: problem with volume after adaptation

Subject: [hts-users:02180] Re: problem with volume after adaptation
From: Junichi Yamagishi <jyamagis@xxxxxxxxxxxx>
Date: Wed, 19 Aug 2009 12:05:37 +0100
Cc: Junichi Yamagishi <jyamagis@xxxxxxxxxxxx>, toth.b@xxxxxxxxxxx
Delivered-to: hts-users@xxxxxxxxxxxxxxx

Hi,

Two tips.

Before you convert the float values to short values for the raw waveforms
using x2x,

$MGLSADF -m ".($ordr{'mgc'}-1)." -p $fs -a $fw -g $gm $mgc | "
"$X2X +fs | "
"$SOX -c 1 -s -w -t raw -r $sr - -c 1 -s -w -t wav -r $sr $gendir/$base.wav";


you may modify their amplitude using sopr if necessary:

$MGLSADF -m ".($ordr{'mgc'}-1)." -p $fs -a $fw -g $gm $mgc | "
"$SOPR -d 2 | "
"$X2X +fs | "
"$SOX -c 1 -s -w -t raw -r $sr - -c 1 -s -w -t wav -r $sr $gendir/$base.wav";


Sometimes the sample values exceed the 16-bit maximum of 32767 due to the
nature of Gaussian distributions.
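
As a rough illustration of this overshoot check, the Python sketch below (standard library only) finds the peak of a float waveform and the smallest integer divisor that brings it inside the 16-bit range. The function name safe_divisor and the sample values are made up for illustration; the resulting divisor would play the role of the value given to sopr -d in the pipeline above.

```python
import math

INT16_MAX = 32767  # largest value a signed 16-bit sample can hold


def safe_divisor(samples, limit=INT16_MAX):
    """Return the smallest integer divisor d >= 1 with max(|x|) / d <= limit."""
    peak = max((abs(x) for x in samples), default=0.0)
    return max(1, math.ceil(peak / limit))


# Hypothetical float samples as they might come out of the synthesis filter.
samples = [120.0, -15000.5, 40000.0, -65000.0]
d = safe_divisor(samples)
scaled = [x / d for x in samples]
print(d, max(abs(x) for x in scaled) <= INT16_MAX)
```

A fixed divisor lowers the volume of every utterance, so it is probably best treated as a diagnostic rather than a cure for the underlying C0 problem.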

If you still face the power or amplitude issue even after forcibly
modifying the amplitude, you should compare the mean vectors of the GV
models (especially the GV for the C0 term) for the target speaker with
those for other speakers.

If the target speaker has a higher GV value for the C0 term than the others,
you would need to rethink how to better normalize amplitude between
speakers and within each speaker.
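
To make the comparison concrete, the Python sketch below computes, for each speaker, the mean over utterances of the within-utterance variance of C0. This assumes the GV mean for C0 summarises raw population variances; some setups model the log variance instead. All speaker names and numbers are hypothetical, and reading the C0 trajectories from the mgc files is left out.

```python
from statistics import mean, pvariance


def gv_mean_c0(utterances):
    """Mean over utterances of the within-utterance variance of C0."""
    return mean(pvariance(c0) for c0 in utterances)


# Hypothetical C0 trajectories (one list per utterance) for each speaker.
c0_by_speaker = {
    "target": [[1.0, 5.0, 9.0], [2.0, 6.0, 10.0]],
    "spk_a": [[3.0, 5.0, 7.0], [4.0, 5.0, 6.0]],
    "spk_b": [[2.0, 5.0, 8.0], [4.0, 6.0, 8.0]],
}

gv = {spk: gv_mean_c0(utts) for spk, utts in c0_by_speaker.items()}
others = mean(v for spk, v in gv.items() if spk != "target")
# A ratio well above 1 suggests the target speaker's C0 GV is larger than the rest.
print(gv["target"], others, gv["target"] / others)
```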

If the target speaker has almost the same GV values as the others but
the issue persists, your adaptation fails to transform the C0 terms.
Please adjust the tuning parameters for adaptation such as SPLITTHRESH
and SMAPSIGMA. You may use separate block transforms for them,
e.g. for 40 static features: HADAPT:BLOCKSIZE = "IntVec 6 1 39 1 39 1 39"
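
The block sizes in that example can be derived mechanically. The Python sketch below assumes three windows (static, delta, delta-delta), which is what the six blocks appear to imply, and puts C0 in its own block for each window. The helper name blocksize_config is hypothetical.

```python
def blocksize_config(n_static, n_windows=3):
    """Build an HADAPT:BLOCKSIZE line with C0 in its own block per window."""
    blocks = []
    for _ in range(n_windows):
        blocks += [1, n_static - 1]  # C0 block, then the remaining coefficients
    vec = " ".join(str(b) for b in blocks)
    return f'HADAPT:BLOCKSIZE = "IntVec {len(blocks)} {vec}"'


line = blocksize_config(40)
print(line)  # HADAPT:BLOCKSIZE = "IntVec 6 1 39 1 39 1 39"
```

The block sizes sum to 120, i.e. 40 static features times three windows.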

Regards,
Junichi


On 18 Aug 2009, at 22:56, Tóth Bálint wrote:

Dear Junichi Yamagishi,

Thank you very much for your answer.
I’ve checked, and HTS calculates GV models from all speakers (including the average voice speakers and the target speaker). When GV is calculated from all speakers there are many more overshoots, but even when GV is calculated only from the target speaker, the synthesized voice (SAT+adaptation) still overshoots often.
I normalized all the audio data in the same way.
In the case of SI+adaptation (with the same training and adaptation data) overshoots are quite rare in the synthesized speech, but there are still some.
Any help is highly appreciated.

Best Regards,
Balint Toth


Junichi Yamagishi wrote:
Hi,
Did you calculate GV models on adaptation data for the target speaker?
Sometimes GV models calculated for the average voice are too big for some
speakers. (This would be crucial for the log gain case.)
I don't remember how HTS-demo calculates this, but please check this first.
Then it might be good to normalize the amplitude level of the adaptation data to that of the training data to avoid a bad transformation of the C0/gain terms.
Regards,
Junichi Yamagishi
CSTR

On 12 Aug 2009, at 22:22, Tóth Bálint wrote:
Hi,
I am trying to adapt HTS to a new voice. The SAT average voice is ok: http://alpha.tmit.bme.hu/~toth.b/hts_samples/SAT.wav
but after adaptation the volume overshoots: http://alpha.tmit.bme.hu/~toth.b/hts_samples/SAT_dec_feat3.wav
The volume of the adaptation data is normal; there are no overshoots. The adaptation of other voices works well.
Can you please help me find out what the problem might be?

Thanks in advance!

Best Regards,
Balint Toth



