I have tried the HTS-demo and am now analyzing it so that I can apply the same or a similar process to Korean speech synthesis.
I chose the Japanese version of the demo because of the close similarity between Japanese and Korean.
However, I could not find out how the supplied full-context labels were created. There is no explanation of the format or the generation rules at all.
I found that the English version (CMU_ARCTIC) includes a document titled "An example of context-dependent label format for HMM-based speech synthesis in English". However, that format does not look the same as the Japanese one.
Could anyone give me a hint or point me to documentation about the Japanese label format?
Thanks in advance.
Voice Interface Lab., KAIST,