[hts-users:00517] HTS question about creating a new voice for the Catalan language

Hello,

I'm trying to build a Catalan voice for Festival based on the HTS system, but the speech quality of the HTS model is far from perfect. Even though I have experimented with some of the model parameters, I can't tell what the problem is, since it works quite well with the ARCTIC database.

I have uploaded some demo files: a sample of the original speech (not included in the training set) [LINK: http://gps-tsc.upc.es/veu/festcat/tiki-download_file.php?fileId=3], a sample synthesized using the clunits module in Festival [LINK: http://gps-tsc.upc.es/veu/festcat/tiki-download_file.php?fileId=5], and a sample from the HTS model that I built based on the demo http://hts.ics.nitech.ac.jp/release/HTS-demo_CMU-ARCTIC-SLT.tar.bz2 [LINK: http://gps-tsc.upc.es/veu/festcat/tiki-download_file.php?fileId=4]. I used the *.win files from the http://hts.ics.nitech.ac.jp/?plugin=attach&refer=Download&openfile=festvox_nitech_us_awb_arctic_hts.tar.bz2 voice, and the *.inf and *.pdf files generated with my script, found under the voices directory.

Some information about the training database: 5500 seconds of speech (roughly an hour and a half), labeled using sphinxtrain (there is some misalignment ...), with a new phoneset defined for the Catalan language. We are currently building a bigger, hand-labeled database and are using this one as a temporary solution.
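
In case it helps to quantify the misalignment, one rough check would be to scan the labels for implausibly short or long segments. The sketch below assumes the labels have been converted to HTK-style lines (`start end label`, with times in 100 ns units); the function name `scan_labels` and both thresholds are hypothetical, chosen only for illustration, and are not part of HTS or sphinxtrain.

```python
HTK_UNITS_PER_SECOND = 10_000_000  # HTK label times are in 100 ns units


def scan_labels(lines, min_sec=0.01, max_sec=0.5):
    """Return (total_seconds, suspects) for HTK-style label lines.

    min_sec and max_sec are illustrative guesses, not tuned values;
    long pauses will also be flagged by max_sec.
    """
    total = 0.0
    suspects = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        start, end, label = int(parts[0]), int(parts[1]), parts[2]
        duration = (end - start) / HTK_UNITS_PER_SECOND
        total += duration
        if duration < min_sec or duration > max_sec:
            suspects.append((label, duration))
    return total, suspects


if __name__ == "__main__":
    # Made-up label lines, only to show the expected input shape.
    example = [
        "0 2500000 pau",
        "2500000 2550000 b",  # 5 ms segment, shorter than min_sec
        "2550000 3500000 o",
    ]
    total, suspects = scan_labels(example)
    print(f"total: {total:.3f} s")
    for label, duration in suspects:
        print(f"suspect segment {label!r}: {duration * 1000:.1f} ms")
```

Running something like this over every label file would give the total labeled duration and a list of segments worth inspecting by hand.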

I would also like to ask whether the HTS module uses the duration, f0model, intonation and phrasing modules defined in the festvox directory when synthesizing the voice.

If you need more information, do not hesitate to ask. Thank you very much!

Best,

Oriol
