
[hts-users:03741] Re: Problems in the HTS 2.3alpha implementation of the LSP postfilter


Whoops, sorry: the only problem is that mgc2mgc is *really* slow with -M 4095(!), because $fl is set to 4096 for some reason. I see the default value in HTS is now 576, the same as in hts_engine, and 384 for HTS_EMBEDDED. Never mind then :)
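
For anyone reading this later, here is a small sketch of the arithmetic behind that: the original postfiltering_lsp line passes ( $fl - 1 ) to mgc2mgc as -M, so the $fl setting directly determines the output order. The function name and the labels are mine, purely for illustration; only the numbers come from this thread.

```python
# Sketch only: relates the $fl setting in Training.pl to the "-M" value
# that the original (unpatched) postfiltering_lsp line hands to mgc2mgc.

def mgc2mgc_output_order(fl: int) -> int:
    """Return the value passed as -M, i.e. $fl - 1 in the original Perl line."""
    if fl < 1:
        raise ValueError("fl must be a positive integer")
    return fl - 1

# $fl values mentioned in this thread (labels are descriptive, not official names).
FL_VALUES = {
    "value found in my setup": 4096,
    "HTS / hts_engine default": 576,
    "HTS_EMBEDDED": 384,
}

if __name__ == "__main__":
    for label, fl in FL_VALUES.items():
        print(f"{label}: fl={fl} -> -M {mgc2mgc_output_order(fl)}")
```

With $fl at 4096, mgc2mgc is asked for roughly seven times as many output coefficients per frame as with 576, which fits the slowdown I was seeing.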


On 25 Apr 2013, at 3:39 PM, Pieter Scholtz <pieterscholtz@xxxxxxxxx> wrote:

> Hi
> 
> I have found a few problems in the Perl implementation of postfiltering_lsp in the Training.pl script of the HTS 2.3alpha SLT demo:
> http://hts.sp.nitech.ac.jp/archives/2.3alpha/HTS-demo_CMU-ARCTIC-SLT.tar.bz2
> 
> I noticed my machine spending an unusual amount of time on the SPTK-based synthesis in the $WGEN1 branch, especially in the mgc2mgc routine, and found two instances of the following error in postfiltering_lsp:
> 
> -   $line .= "$MGC2MGC -m " . ( $ordr{'mgc'} - 1 ) . " -a $fw -c $gm -n -u -M " . ( $fl - 1 ) . " -A 0.0 -G 1.0 | ";
> +   $line .= "$MGC2MGC -m " . ( $ordr{'mgc'} - 1 ) . " -a $fw -c $gm -n -u -M " . ( $ordr{'mgc'} - 1 ) . " -A $fw -C $gm | ";
> 
> I have attached a patch to fix these two instances. However, the patch does not resolve all the issues. I now get the following errors:
> 
> Synthesizing a speech waveform from arctic_a0005.mgc and arctic_a0005.lf0...[No. 1] is unstable frame
> [No. 2] is unstable frame
> [No. 3] is unstable frame
> [No. 4] is unstable frame
> ...
> for all frames and then a stream of
> x2x : warning: input data is over the range of type 'short'!
> x2x : warning: input data is over the range of type 'short'!
> x2x : warning: input data is over the range of type 'short'!
> x2x : warning: input data is over the range of type 'short'!
> 
> The resultant synthesised waveforms are just noise (and I have applied this fix too: http://hts.sp.nitech.ac.jp/hts-users/spool/2012/msg00407.html). If the postfiltering routine is bypassed, the synthesised waveforms sound fine. The hts_engine-based synthesis in the $ENGIN branch, on the other hand, works perfectly. I also tested the voice manually with hts_engine using various postfiltering coefficients and everything sounds as expected, without any errors or warnings.
> 
> I replaced the packaged 48 kHz waveforms with the original 32 kHz versions from http://festvox.org/cmu_arctic/dbs_slt.html, with the appropriate settings and FRAMESHIFT=160. The vocoder configuration I am using is MGCORDER=24 GAMMA=2 FREQWARP=0.55 LNGAIN=1 PSTFILTER_LSP=0.7.
> 
> I am using SPTK 3.6, hts_engine 1.07 and HTS 2.3alpha compiled to 64-bit binaries on OS X 10.8.3.
> 
> I plan on testing with GAMMA=1 FREQWARP=0.0 soon and will report back.
> 
> Regards
> Pieter
> 
> <Training.pl.diff>
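
As a sanity check on the sampling-rate change quoted above, the following sketch works out the frame period implied by FRAMESHIFT=160 at 32 kHz. It assumes FRAMESHIFT is given in samples, which the quoted values suggest but the message does not state outright; the helper names are hypothetical.

```python
# Sketch only: converts between a frame shift in samples and a frame period
# in milliseconds. Assumes FRAMESHIFT is expressed in samples.

def frame_period_ms(frameshift_samples: int, sample_rate_hz: int) -> float:
    """Frame period in milliseconds for a given shift and sampling rate."""
    return 1000.0 * frameshift_samples / sample_rate_hz

def frameshift_for(period_ms: float, sample_rate_hz: int) -> int:
    """Frame shift in samples that gives the requested period at this rate."""
    return round(period_ms * sample_rate_hz / 1000.0)

if __name__ == "__main__":
    period = frame_period_ms(160, 32000)
    print(f"FRAMESHIFT=160 at 32 kHz -> {period} ms per frame")
    print(f"Same period at 48 kHz -> FRAMESHIFT={frameshift_for(period, 48000)}")
```

Under that assumption, 160 samples at 32 kHz is a 5 ms frame period, and the same period at 48 kHz would correspond to 240 samples.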


Follow-Ups
[hts-users:03742] Re: Problems in the HTS 2.3alpha implementation of the LSP postfilter, Pieter Scholtz
References
[hts-users:03740] Problems in the HTS 2.3alpha implementation of the LSP postfilter, Pieter Scholtz