
[hts-users:01724] Re: about utterance


paminy wrote:
Hi everyone,
I want to use the questions in HTS to create the CART using other speech data, and I have to use the utterance of the speech data. But I only have the phoneme labels; how can I get the whole utterance? Does it need manual labeling?

You have data with phoneme labels, but you want to get "full context" labels including prosodic contextual factors - correct?

The normal way to obtain such labels is to predict them from the text, using a TTS front end. If you don't have a front end for your language, you could manually label the data - however, without a front end, you will not be able to automatically synthesise new sentences.
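
To make the gap concrete, here is a rough sketch of the part of a full context label that can be derived from phoneme labels alone: the quinphone (two left, two right) phonetic context. It assumes HTK-style label lines of the form "start end phone" and uses "x" as a placeholder for positions outside the utterance; the function names and the example segment times are made up for illustration. The prosodic fields (syllable, word and phrase positions, stress and so on) are deliberately left out, because those are exactly what a front end or manual labelling would have to supply.

```python
from typing import List, Tuple

PAD = "x"  # assumed placeholder for context positions outside the utterance


def read_htk_labels(text: str) -> List[Tuple[int, int, str]]:
    """Parse 'start end phone' lines (HTK-style label file contents)."""
    segments = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        segments.append((int(parts[0]), int(parts[1]), parts[2]))
    return segments


def quinphone_contexts(phones: List[str]) -> List[str]:
    """Build 'll^l-c+r=rr' strings, padding the utterance edges with PAD."""
    padded = [PAD, PAD] + phones + [PAD, PAD]
    return [
        f"{padded[i]}^{padded[i + 1]}-{padded[i + 2]}+{padded[i + 3]}={padded[i + 4]}"
        for i in range(len(phones))
    ]


# Hypothetical phoneme-level labels for one short utterance.
example = """\
0 2500000 sil
2500000 3300000 hh
3300000 4100000 ax
4100000 5200000 l
5200000 6400000 ow
6400000 9000000 sil
"""

segments = read_htk_labels(example)
contexts = quinphone_contexts([phone for _, _, phone in segments])
for (start, end, _), ctx in zip(segments, contexts):
    print(start, end, ctx)
```

Everything after the quinphone part of each label still has to come from somewhere else, which is why phoneme labels on their own are not enough for the usual HTS question set.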

If you are working on English, then Festival provides a front end that will predict the labels you need. For other languages, you will need to find a suitable front end from somewhere else.

If you are in this situation:

- you have full context labels for some data (e.g., ARCTIC)
- you have trained an average voice model on that data
- you want to adapt that model to a target speaker using some new data
- you only have phonetic labels for the new data

then read this paper for a simple solution that appears to work quite well:

Unsupervised Adaptation for HMM-Based Speech Synthesis. Simon King, Keiichi Tokuda, Heiga Zen, Junichi Yamagishi. Proc. Interspeech 2008, Brisbane, Australia. September 2008.

I think I can send you a personal copy of this, if you don't have access to those proceedings.

Simon

--
www.cstr.ed.ac.uk

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.
