[hts-users:04508] Speech Synthesis for New Language

From: Atlas Khan <atlaskhan90@xxxxxxxxx>

Date: Sat, 18 Mar 2017 11:49:34 +0500

Hi,

I want to do speech synthesis for the Urdu language using HTS. I have run the demo script for English (HTS-demo_CMU-ARCTIC-SLT.tar.bz2), and it runs fine. I have also explored all the files in the data directory of that demo. It contains the following types of files:

questions: list of all contexts and properties, in the format used for tree-based context clustering
labels: phone label files for each xxx.raw, with and without their context and properties
raw: the recordings, I think
txt: text of the corresponding recordings
utts: utterance files

What I want to ask is which of these types of data I need to prepare for the Urdu language. I have a speech corpus and its text labels. Kindly also tell me how to prepare this data.
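
For reference, a minimal sketch of one way to check that every recording in a corpus has a matching text file before preparing anything else. It assumes a layout like the one listed above, with recordings in a raw subdirectory and texts in a txt subdirectory that share base names. The function name missing_transcripts and its directory arguments are hypothetical and not part of HTS.

```python
from pathlib import Path


def missing_transcripts(data_dir, audio_sub="raw", text_sub="txt"):
    """Return the sorted base names of recordings that have no text file.

    Assumed layout (not defined by HTS itself):
    <data_dir>/<audio_sub>/*.raw and <data_dir>/<text_sub>/*.txt,
    paired by file name without extension.
    """
    data_dir = Path(data_dir)
    # Collect base names (file name without extension) on each side.
    audio = {p.stem for p in (data_dir / audio_sub).glob("*.raw")}
    text = {p.stem for p in (data_dir / text_sub).glob("*.txt")}
    # Recordings whose base name has no counterpart among the texts.
    return sorted(audio - text)


if __name__ == "__main__":
    # "data" is a placeholder path for the corpus directory.
    for name in missing_transcripts("data"):
        print("no text for recording:", name)
```

An empty result only means the file names pair up; it says nothing about whether the labels or utterance files are correct.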