> I have tried tuning many of the parameters, but the spikes are still
> present in the synthesized file.
>
> Analysis shows that the spikes occur when a vowel phone is followed by a
> nasal sound. Can I change anything in the context-dependent modeling to
> avoid these spikes?
>
> Thanks,
> Raghavendra.
>
>
>
> On Fri, Oct 19, 2012 at 8:01 PM, Keiichiro Oura <uratec@xxxxxxxxxxxxxxx> wrote:
>>
>> Hi,
>>
>> You can try to change many settings for GV in the HTS demo scripts.
>>
>> MAXGVITER   maximum number of iterations of GV-based parameter
>>             generation algorithm (default=50)
>> GVEPSILON   convergence factor for GV iteration (default=0.0001)
>> MINEUCNORM  minimum Euclid norm for GV iteration (default=0.01)
>> STEPINIT    initial step size (default=1.0)
>> STEPINC     step size acceleration factor (default=1.2)
>> STEPDEC     step size deceleration factor (default=0.5)
>> HMMWEIGHT   weight for HMM output prob. (default=1.0)
>> GVWEIGHT    weight for GV output prob. (default=1.0)
>> OPTKIND     optimization method (STEEPEST, NEWTON, or LBFGS)
>>             (default=NEWTON)
>> NOSILGV     turn on GV without silent and pause phoneme
>>             (0:off or 1:on, default=1)
>> CDGV        turn on context-dependent GV (0:off or 1:on, default=1)
>>
>> Regards,
>> Keiichiro Oura
>>
>>
>> 2012/10/19 Veera Raghavendra <raghavendra@xxxxxxxxxx>:
>> > Dear All,
>> >
>> > I found a difference in quality with and without GV.
>> >
>> > If GV is turned on, the voice quality is very clear but there are sudden
>> > spikes. These spikes destroy the synthesis quality.
>> >
>> > If GV is turned off, there are no spikes but the content is not clear.
>> >
>> > Do I need to change any settings in HSMMAlign?
>> >
>> > Thanks,
>> > Raghavendra.
>>
>
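
For readers unfamiliar with how the weight and step-size settings quoted above interact, here is a toy sketch in Python. It is not the HTS implementation: it ignores dynamic features, uses a single parameter stream, and only does steepest ascent. The function name gv_generate, the objective, and the argument names are assumptions made for illustration, loosely mirroring MAXGVITER, GVEPSILON, STEPINIT, STEPINC, STEPDEC, HMMWEIGHT and GVWEIGHT.

```python
import numpy as np

def gv_generate(mu, gv_mean, gv_var=1.0, hmm_weight=1.0, gv_weight=1.0,
                max_gv_iter=50, gv_epsilon=1e-4,
                step_init=1.0, step_inc=1.2, step_dec=0.5):
    """Toy steepest-ascent loop (hypothetical, not the HTS code)."""
    mu = np.asarray(mu, dtype=float)
    n = len(mu)

    def objective(c):
        # HMM term: keep the trajectory close to the model means.
        hmm = -0.5 * np.sum((c - mu) ** 2) / n
        # GV term: pull the trajectory variance toward the target variance.
        gv = -0.5 * (np.var(c) - gv_mean) ** 2 / gv_var
        return hmm_weight * hmm + gv_weight * gv

    def gradient(c):
        d_hmm = -(c - mu) / n
        d_var = 2.0 * (c - c.mean()) / n          # d var(c) / d c
        d_gv = -(np.var(c) - gv_mean) / gv_var * d_var
        return hmm_weight * d_hmm + gv_weight * d_gv

    c = mu.copy()
    step = step_init
    best = objective(c)
    for _ in range(max_gv_iter):
        candidate = c + step * gradient(c)
        value = objective(candidate)
        if value > best:
            gain = value - best
            c, best = candidate, value
            step *= step_inc                      # accepted: accelerate
            if gain < gv_epsilon:                 # converged
                break
        else:
            step *= step_dec                      # rejected: back off
    return c

mu = np.array([0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0])
with_gv = gv_generate(mu, gv_mean=2.0, gv_weight=1.0)
without_gv = gv_generate(mu, gv_mean=2.0, gv_weight=0.0)
print(np.var(with_gv), np.var(without_gv))
```

In this toy, setting gv_weight to 0 leaves the trajectory at the model means, while a positive gv_weight stretches it toward the target variance. Whether lowering GVWEIGHT, or changing the step sizes, actually removes the vowel-nasal spikes in a real voice has to be checked by listening.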