Hi,
About question 1:
I have solved it.
The cause was that there was no newline ("enter") at the end of the full_lab file.
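A check along these lines could catch the same problem in other label files before the lists are generated. It is only a sketch: the function name missing_final_newline and the "labels/full/*.lab" pattern in the usage note are placeholders chosen for illustration, not part of HTS.

```python
import os

def missing_final_newline(path):
    """Return True if a non-empty file does not end with a newline byte."""
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        if f.tell() == 0:
            return False  # an empty file has no last line to terminate
        f.seek(-1, os.SEEK_END)  # step back to the final byte
        return f.read(1) != b"\n"
```

To scan a whole directory, something like `[p for p in glob.glob("labels/full/*.lab") if missing_final_newline(p)]` would list the files to fix (the path is hypothetical; use wherever your label files live).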
About questions 2 and 3:
I used "pda" to extract f0 for the voices in the CMU ARCTIC database
and compared the results with the f0 included in the CMU ARCTIC database,
but there is a great difference between them.
The command I used was:
pda $name.wav -L -S 0.025 -length 0.005 -fmax 200 -fmin 80 | x2x +af > $name.f0
I will try other methods.
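To see what kind of difference there is, the two f0 tracks could be compared frame by frame, as in the rough sketch below. It assumes both tracks are raw little-endian 32-bit floats (what "x2x +af" writes on a little-endian machine), one value per frame, with 0 marking an unvoiced frame; the database files may use another format and would then need converting first. The names read_f0 and compare_f0 and the 0.2-octave tolerance are my own choices for illustration.

```python
import math
import struct

def read_f0(path):
    """Read a raw float32 f0 track (assumed little-endian, 0.0 = unvoiced)."""
    with open(path, "rb") as f:
        data = f.read()
    n = len(data) // 4  # number of complete 4-byte floats
    return list(struct.unpack("<%df" % n, data[: n * 4]))

def compare_f0(ref, test, tol=0.2):
    """Classify each frame pair over the common length of the two tracks.

    tol is the allowed deviation in octaves (an arbitrary illustrative value).
    """
    n = min(len(ref), len(test))
    stats = {"len_ref": len(ref), "len_test": len(test),
             "voicing_mismatch": 0, "close": 0,
             "halved": 0, "doubled": 0, "other": 0}
    for r, t in zip(ref[:n], test[:n]):
        if (r > 0) != (t > 0):
            stats["voicing_mismatch"] += 1  # one track voiced, the other not
            continue
        if r <= 0:
            continue  # both unvoiced: nothing to compare
        octaves = math.log2(t / r)  # 0 = same pitch, -1 = half, +1 = double
        if abs(octaves) <= tol:
            stats["close"] += 1
        elif abs(octaves + 1) <= tol:
            stats["halved"] += 1
        elif abs(octaves - 1) <= tol:
            stats["doubled"] += 1
        else:
            stats["other"] += 1
    return stats
```

If len_ref and len_test differ by a large factor, the two tracks were probably extracted with different frame shifts, and the frame-by-frame counts are not meaningful until that is fixed. Many "halved" or "doubled" frames would point to the f0 search range instead.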
Thank you!
Best regards
From: "Heiga ZEN (Byung Ha CHUN)" <zen@xxxxxxxxxxxxxxxx>
Reply-To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
Subject: [hts-users:00368] Re: my speech
Date: Wed, 12 Jul 2006 18:35:42 +0900
>Hi
>
>liu lei wrote:
>
>>I have used HTS to generate Chinese speech. I use syllables as the units
>>for my HMM models, and during synthesis
>>I ran into some problems.
>>1. About the list
>> In the "lists" directory, full.list always contains some
>>wrong entries.
>>For example:
>>sil-liu+l/A:.....
>>iu-lei+sil/A:.....
>>sil-da+l/A.......a-lian+l/A..........
>>sil-da+l/A.........
>>In the example,
>>there is no newline ("enter") between "sil-da+l/A:.." and "a-lian+l/A";
>>they are on the same line, although they are different syllable models.
>>I used the "makefile" in "HTS-demo_NIT-ATR503-M001"
>>without any modification.
>
>They are automatically generated from your full.mlf, so please check
>whether your mlf includes such lines.
>
>>2. Some puzzles about f0
>>I use tcl/snack to get f0 values for HTS. I wrote a Tcl script
>>and set framelength=0.025, the value used for mel-cepstral analysis.
>>But when I use these f0 values to generate speech, the results are very bad.
>>I found that, for mel-cepstral extraction, HTS needs
>>$sampfreq = 16000; $framelength = 0.025; $frameshift = 0.005;
>>$windowtype = 0; $normtype = 1; $FFTLength = 512;
>>$freqwarp = 0.42; $mceporder = MCEPORDER; but my f0
>>extraction only uses the frame length,
>>which I set to the same value as the mel-cepstral frame length.
>>I would like to know:
>>is it necessary to set the frame shift and other options when extracting f0?
>
>I think the frame shift for f0 extraction has to be equal to that of
>mel-cepstral analysis.
>You should also optimize the f0 search range to avoid half/double pitch errors.
>
>>3. About my speech
>>I guess it is the f0 that makes my speech unclear,
>>so I tried both "pda" and "tcl/snack" to get f0,
>>but I did not get a better result.
>>Can any other factors affect the articulation of the speech?
>
>Have you tried extracting f0s from the CMU ARCTIC databases
>(HTS-demo) using your pda/get_f0 and training HTS with them?
>I think comparing HTS systems trained on the f0s included in the database
>with ones trained on f0s extracted by your tools will show whether the
>f0 extraction method causes your problem or not.
>
>Regards,
>
>Heiga Zen (Byung Ha Chun)
>
>--
>------------------------------------------------
>Heiga ZEN (in Japanese pronunciation)
>Byung Ha CHUN (in Korean pronunciation)
>
>Department of Computer Science and Engineering
>Nagoya Institute of Technology
>Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan
>
>http://kt-lab.ics.nitech.ac.jp/~zen
>------------------------------------------------
>