Hi,

I have found a few problems in the Perl implementation of postfiltering_lsp in the Training.pl script of the HTS 2.3alpha SLT demo:

http://hts.sp.nitech.ac.jp/archives/2.3alpha/HTS-demo_CMU-ARCTIC-SLT.tar.bz2

I noticed my machine spending an unusual amount of time on the SPTK-based synthesis in the $WGEN1 branch, especially in the mgc2mgc routine, and found two instances of the following error in postfiltering_lsp:

- $line .= "$MGC2MGC -m " . ( $ordr{'mgc'} - 1 ) . " -a $fw -c $gm -n -u -M " . ( $fl - 1 ) . " -A 0.0 -G 1.0 | ";
+ $line .= "$MGC2MGC -m " . ( $ordr{'mgc'} - 1 ) . " -a $fw -c $gm -n -u -M " . ( $ordr{'mgc'} - 1 ) . " -A $fw -C $gm | ";

I have attached a patch that fixes these two instances. However, the patch does not resolve all the issues. I now get the following errors:

Synthesizing a speech waveform from arctic_a0005.mgc and arctic_a0005.lf0...[No. 1] is unstable frame
[No. 2] is unstable frame
[No. 3] is unstable frame
[No. 4] is unstable frame
...

for all frames, followed by a stream of

x2x : warning: input data is over the range of type 'short'!
x2x : warning: input data is over the range of type 'short'!
x2x : warning: input data is over the range of type 'short'!
x2x : warning: input data is over the range of type 'short'!

The resultant synthesised waveforms are just noise (and I have applied this fix too: http://hts.sp.nitech.ac.jp/hts-users/spool/2012/msg00407.html). If the postfiltering routine is bypassed, the synthesised waveforms sound fine.

The hts_engine-based synthesis in the $ENGIN branch, on the other hand, works perfectly. I also tested the voice manually with hts_engine using various postfiltering coefficients, and everything sounds as expected, without any errors or warnings.

I replaced the packaged 48 kHz waveforms with the original 32 kHz versions from http://festvox.org/cmu_arctic/dbs_slt.html, with the appropriate settings and FRAMESHIFT=160.
The vocoder configuration I am using is MGCORDER=24, GAMMA=2, FREQWARP=0.55, LNGAIN=1, PSTFILTER_LSP=0.7. I am using SPTK 3.6, hts_engine 1.07 and HTS 2.3alpha, compiled to 64-bit binaries on OS X 10.8.3.

I plan on testing with GAMMA=1 FREQWARP=0.0 soon and will report back.

Regards
Pieter

Attachment: Training.pl.diff