Hello all,
I have two questions about using the "HTS-demo_CMU-ARCTIC-SLT-wodata.tar.bz2" demo to build our own voice.
First, our 16 kHz training ".raw" files were converted from 48 kHz ".wav" files that we recorded in a very quiet studio; the recordings sound like "http://pan.baidu.com/s/1eQYQGSE". We then labeled these recordings by hand. After generating the ".lab" files, we trained on them with HTS. The resulting voice, "http://pan.baidu.com/s/1hqLVyHu", contains a lot of noise and does not sound clear.
Which step could have gone wrong, and how can I fix it?
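One thing worth double-checking (this is a guess, not a diagnosis from your files): the demo expects the ".raw" files to be headerless 16-bit signed little-endian mono PCM at the configured sampling rate. If the 44-byte RIFF header from the converted ".wav" files is left in place, or the files are stereo or a different bit depth, training data gets corrupted and the synthesized voice comes out noisy. A minimal sketch of a conversion that strips the header, using only Python's standard library (file names here are just examples):

```python
import math
import struct
import wave

def wav_to_raw(wav_path, raw_path):
    """Strip the RIFF header: HTS training expects headerless
    16-bit signed little-endian mono PCM (.raw), not .wav."""
    with wave.open(wav_path, "rb") as w:
        assert w.getsampwidth() == 2, "expected 16-bit samples"
        assert w.getnchannels() == 1, "expected mono audio"
        frames = w.readframes(w.getnframes())
    with open(raw_path, "wb") as f:
        f.write(frames)

# Demo: write a 1-second 440 Hz test tone at 16 kHz, then convert it.
rate = 16000
samples = [int(10000 * math.sin(2 * math.pi * 440 * t / rate))
           for t in range(rate)]
with wave.open("tone.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)
    w.setframerate(rate)
    w.writeframes(struct.pack("<%dh" % len(samples), *samples))
wav_to_raw("tone.wav", "tone.raw")
```

Tools like sox can do the same resampling and header stripping in one step; the point is only that the ".raw" payload must be bare samples, and that the rate must match what the demo's configuration assumes (16 kHz here).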
Second, in this demo we can synthesize waveforms using 1mix, 2mix, and stc, but when we use hts_engine to generate ".wav" files, the output sounds like "http://pan.baidu.com/s/1kUf2heB". A few phones are faintly audible, but it is not the voice I expected.
We used hts_engine API 1.00, as recommended in the README file of the "HTS-demo_CMU-ARCTIC-SLT-wodata.tar.bz2" demo.
What causes this problem, and how can I fix it?
Thanks very much.