You can try the STRAIGHT spectrum in place of the FFT.
I had assumed that the SLT samples were upsampled from 16 kHz to 48 kHz and still captured a good spectrum between 8 kHz and 24 kHz.
I am using the following configuration for 16 kHz data. Could you please suggest any other changes required when building with 16 kHz samples?
On Thu, Sep 27, 2012 at 7:49 AM, 那兴宇 <nxy-yzqs@xxxxxxx>
The huge difference in quality between the 16 kHz and 48 kHz SLT voices comes from the 48 kHz dataset itself. 48 kHz audio carries more spectral information and therefore requires more features and model parameters. There is no point in upsampling 16 kHz audio to 48 kHz unless you can recover the spectrum between 8000 Hz and 24000 Hz, which a 16 kHz recording never captured.
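To illustrate the point about upsampling, here is a minimal pure-Python sketch. It is not what SoX does internally: the test tones (1 kHz and 6 kHz), the 121-tap Hamming-windowed sinc filter, and all function names are assumptions chosen for illustration. It upsamples a 16 kHz signal to 48 kHz and measures how much energy ends up above 8 kHz.

```python
import cmath
import math

SRC_RATE = 16000               # original sampling rate (Hz)
DST_RATE = 48000               # target sampling rate (Hz)
FACTOR = DST_RATE // SRC_RATE  # integer upsampling factor (3)


def make_tone_mix(rate, freqs, n):
    """Return n samples of a sum of unit-amplitude sines at the given rate."""
    return [sum(math.sin(2 * math.pi * f * i / rate) for f in freqs)
            for i in range(n)]


def lowpass_taps(cutoff_hz, rate, num_taps):
    """Hamming-windowed sinc low-pass FIR, normalised to unity DC gain."""
    mid = (num_taps - 1) / 2
    fc = cutoff_hz / rate  # cutoff in cycles per sample
    taps = []
    for k in range(num_taps):
        x = k - mid
        ideal = 2 * fc if x == 0 else math.sin(2 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * k / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def upsample(signal, factor, taps):
    """Zero-stuff by `factor`, then low-pass filter (full convolution)."""
    stuffed = []
    for s in signal:
        stuffed.append(s * factor)  # gain compensates for the inserted zeros
        stuffed.extend([0.0] * (factor - 1))
    out = []
    for n in range(len(stuffed) + len(taps) - 1):
        acc = 0.0
        for k, t in enumerate(taps):
            if 0 <= n - k < len(stuffed):
                acc += t * stuffed[n - k]
        out.append(acc)
    return out


def band_energy_fraction(signal, rate, lo_hz, hi_hz):
    """Fraction of one-sided DFT energy with lo_hz < frequency <= hi_hz."""
    n = len(signal)
    total = band = 0.0
    for b in range(n // 2 + 1):
        coeff = sum(s * cmath.exp(-2j * math.pi * b * i / n)
                    for i, s in enumerate(signal))
        power = abs(coeff) ** 2
        total += power
        if lo_hz < b * rate / n <= hi_hz:
            band += power
    return band / total


if __name__ == "__main__":
    taps = lowpass_taps(SRC_RATE / 2, DST_RATE, 121)
    src = make_tone_mix(SRC_RATE, [1000, 6000], 160)
    up = upsample(src, FACTOR, taps)
    frac = band_energy_fraction(up, DST_RATE, SRC_RATE / 2, DST_RATE / 2)
    print("fraction of energy above 8 kHz after upsampling:", frac)
```

The resampled file has a 48 kHz sampling rate, but the band from 8 kHz to 24 kHz stays essentially empty, because the low-pass filter only removes the spectral images and has nothing real to put there.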
You can use the 16 kHz demo or reconfigure the HTS 2.2 demo with your own settings.
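As a rough guide to which analysis settings scale with the sampling rate, the sketch below derives them from a 25 ms window and a 5 ms shift. Both values are assumptions, as are the key names, which are only meant to resemble the variables the demo's configure script accepts (check `./configure --help` for the real ones). The frequency-warping values are the ones commonly quoted for a mel-scale approximation and should also be verified against your setup.

```python
# Hypothetical helper: derives rate-dependent analysis settings.
# Key names and defaults are assumptions, not the demo's confirmed options.

# Warping factors commonly quoted for mel-scale approximation (assumption).
MEL_ALPHA = {16000: 0.42, 48000: 0.55}


def next_power_of_two(n):
    """Smallest power of two that is >= n."""
    p = 1
    while p < n:
        p *= 2
    return p


def analysis_settings(sample_rate_hz, window_ms=25, shift_ms=5):
    """Frame length/shift in samples, FFT size, and warping factor."""
    frame_len = sample_rate_hz * window_ms // 1000
    frame_shift = sample_rate_hz * shift_ms // 1000
    return {
        "SAMPFREQ": sample_rate_hz,
        "FRAMELEN": frame_len,
        "FRAMESHIFT": frame_shift,
        "FFTLEN": next_power_of_two(frame_len),
        "FREQWARP": MEL_ALPHA.get(sample_rate_hz),  # None if rate not listed
    }


if __name__ == "__main__":
    for rate in (16000, 48000):
        print(analysis_settings(rate))
```

Settings that do not follow from simple arithmetic, such as the mel-cepstral order, are left out on purpose; the 16 kHz demo is the safer reference for those.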
Xingyu Na (那兴宇)
Beijing Institute of Technology
At 2012-09-27 02:02:09, "Veera Raghavendra" <raghavendra@xxxxxxxxxx
I was trying to build an HTS voice using version 2.2, but my recordings are at 16 kHz and HTS 2.2 seems to be customized for 48 kHz samples. When I upsample the 16 kHz files to 48 kHz using SoX, the spectrum still extends only up to 8 kHz. How should I upsample the recordings to 48 kHz?
PS: I perceived a huge difference in quality between the 16 kHz and 48 kHz voices built with SLT.