[hts-users:02576] Is it OK to use a few and long speech files for HTS training?
- From: jangwon kim <jangwonjkim@xxxxxxxxx>
- Date: Mon, 2 Aug 2010 19:01:01 -0700
- Delivered-to: hts-users@xxxxxxxxxxxxxxx
Dear all.
I am using 19 audio-book files (about 8 minutes per file, roughly 160 minutes in total) spoken by a single speaker, who is a native speaker of American English.
The audio files have been converted to raw format.
The utt files were created with Festival.
I tried HTS versions 2.1.1 and 2.1 (both show the same problem).
The problem I have now is the following.
In the initialization and reestimation steps, the number of observation sequences does not match the number of occurrences of the corresponding phoneme in the audio-book data that I am using.
I compared this with what I got by running the HTS demo with the cmu_us_arctic_slt database.
With the cmu_us_arctic_slt database, the number of occurrences of a given phoneme (e.g. aa, oy) was the same as the number of observation sequences loaded for the initialization and reestimation steps.
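For reference, the per-phoneme counts on the label side can be obtained with a small script like the one below. This is only a rough sketch, not part of HTS: it assumes HTK-style label files in which the last whitespace-separated field of each line is the label, and that in full-context labels the center phoneme sits between the first '-' and the following '+'. The function names and the directory argument are hypothetical.

```python
import re
from collections import Counter
from pathlib import Path

# Assumption: in a full-context label the center phoneme is the text
# between the first '-' and the '+' that follows it (e.g. "x^sil-aa+oy=...").
CENTER = re.compile(r"-([^-+]+)\+")


def center_phone(label):
    """Return the center phoneme of a full-context label, or the label itself."""
    match = CENTER.search(label)
    return match.group(1) if match else label


def count_phones(lines):
    """Count phoneme occurrences in an iterable of label-file lines."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # Assumption: the label is the last field ("start end label" or "label").
        counts[center_phone(fields[-1])] += 1
    return counts


def count_phones_in_dir(lab_dir):
    """Sum the phoneme counts over every *.lab file in a directory."""
    total = Counter()
    for path in sorted(Path(lab_dir).glob("*.lab")):
        with open(path) as f:
            total += count_phones(f)
    return total
```

Comparing the count for a phoneme such as aa against the number of observation sequences reported in the initialization log shows how large the mismatch is for each phoneme.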
I am wondering whether this problem is caused by the size of the utt files (they are very large and long, so the *.lab files are also large and long).
Or could it be caused by the file names that I chose? (I use 01-02.utt, 01-03.utt, ... 01-10.utt, 02-01.utt, ..., 02.10.utt)
I would appreciate any advice on this problem.
Thank you in advance.
jangwon