
[hts-users:00184] Re: adding new language


Hi,

I now have the F0 files with a 5 ms frame length, but it is still throwing the
same error: no path found in beta pass.
The format is float, little endian, and the range is correct (80-200 for
voiced, 0 for unvoiced regions).
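
A quick way to double-check those three properties (float, little endian,
sensible range) is sketched below. It is only an illustration: the function
name summarize_f0 and the default 80-200 Hz bounds are assumptions taken from
the values mentioned in this thread, not part of HTS or SPTK.

```python
import struct


def summarize_f0(path, fmin=80.0, fmax=200.0):
    """Summarize a raw F0 file assumed to hold little-endian 32-bit floats."""
    with open(path, "rb") as fh:
        data = fh.read()
    if len(data) % 4 != 0:
        raise ValueError("file size is not a multiple of 4 bytes")

    # "<f" = little-endian 32-bit float, one value per frame.
    values = [v for (v,) in struct.iter_unpack("<f", data)]

    # 0 marks an unvoiced frame; everything else should be a plausible F0.
    voiced = [v for v in values if v != 0.0]
    # The comparison is also False for NaN, so NaN counts as suspicious.
    suspicious = [v for v in voiced if not (fmin <= v <= fmax)]

    return {
        "frames": len(values),
        "voiced": len(voiced),
        "unvoiced": len(values) - len(voiced),
        "suspicious": len(suspicious),
    }
```

A non-zero "suspicious" count (values such as 2.8e+31 or 309242) would suggest
the bytes are being read with the wrong type or byte order.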

What value should be used for the frame shift length?

The mcep generation uses a 25 ms frame length and a 5 ms frame shift,
but it doesn't work with these settings either.
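
To make those settings concrete, here is a small sketch that converts them to
samples and counts the frames in a raw float file, so the F0 and mcep files
for one utterance can be compared. Everything specific here is an assumption
for illustration: the 16 kHz sampling rate, the helper names, and the idea
that a frame-count mismatch is worth ruling out (nothing in this thread
confirms it is the cause of the error).

```python
import os


def ms_to_samples(ms, sample_rate_hz):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(ms * sample_rate_hz / 1000.0))


def frame_count(path, values_per_frame, bytes_per_value=4):
    """Count frames in a raw file of 32-bit floats.

    values_per_frame is 1 for an F0 file; for an mcep file it is the
    number of coefficients stored per frame (depends on your setup).
    """
    size = os.path.getsize(path)
    frame_bytes = values_per_frame * bytes_per_value
    if size % frame_bytes != 0:
        raise ValueError("file size is not a whole number of frames")
    return size // frame_bytes


# Assuming 16 kHz audio: 25 ms window and 5 ms shift, in samples.
window = ms_to_samples(25, 16000)
shift = ms_to_samples(5, 16000)
```

With both files in hand, frame_count("foo.f0", 1) can be compared against
frame_count("foo.mcep", n), where n is the number of values per mcep frame.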

regards,

Tomas Sykora

> Hi,
>
> Few comments...
>
> The original F0 values in the AWB example were extracted by
> ESPS's get_f0. See section "Pitch and formant extraction " in
> http://www.speech.kth.se/snack/tutorial.html for details.
>
> PDA works as well. I wasn't happy with unvoiced results in the creaky
> sections in my speech data, though.
> There are other F0 extractors as well, like Praat and tempo,
> which may or may not be better.
>
> You are using F0 range 80-200 in your PDA example, which are
> probably reasonable values for an average male voice.
> Note, however, that HTS typically uses 5 ms frames,
> not the 10 ms frames that are the default in PDA.
> See the -length option.
>
> As for the unexpected output:
> I'm lazy and not particularly good with endianisms or PHP...
> I'd use something like
>
> pda foo.wav -L -length 0.005 -fmin 80 -fmax 200 | x2x +af > foo.f0
>
> Try whether it works for you.
>
> (Notice also that PDA seems to print an extra space at the start of the
> output.)
>
> Good luck,
>
>   Nicholas
>
>> Well I have checked the f0 files in the original hts demo and the values
>> were like:
>>
>> 145.683
>> 145.81
>> 146.363
>> 143.789
>> 144.369
>> 144.981
>> 146.285
>> 147.798
>> 148.425
>> 148.662
>> 152.237
>> 153.206
>> 153.75
>> 154.09
>> 154.149
>> 151.575
>> 148.836
>> 147.957
>> 147.801
>> 144.776
>> 141.303
>> 140.924
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>>
>>
>> my values of f0 (using different versions of speech tools pda):
>>
>> 2.81973e+31
>> 0.00610188
>> 219.974
>> 3.78405e-23
>> 309242
>> -2.27633e-20
>> 1.42612e+30
>> -2.2167e-07
>> -4.93042e-22
>> 4.14523e+06
>> 0
>> 0
>> -2.88618e+13
>> -2.53785e+28
>> 1.45001e+13
>> 3.42235e+13
>> 1.86944e-22
>> -1.39997e-29
>> -11710.6
>> 0
>> 0
>> 0
>>
>>
>> The SPTK config log file says "not big endian".
>>
>> What is the usual range of values of f0?
>>
>> How do I create "normal" F0 files?
>>
>> Thanks.
>>
>> Regards,
>>
>> Tomas Sykora
>> ------------------------------
>>
>> my script for f0 extraction:
>>
>> #!/usr/bin/php
>> <?
>>
>> if ($handle = opendir('wav')) {
>>    echo "Directory handle: $handle\n";
>>    echo "Files:\n";
>>
>>    while (false !== ($file = readdir($handle))) {
>>        if(!((strcmp($file,'.') == NULL) || (strcmp($file,'..') == NULL))) {
>>                 system("/home/tomx/fei/speech_tools/speech_tools/bin/pda wav/".$file." -o f0/".$file.".f0 -fmin 80 -fmax 200 -L");
>>                 system("/home/tomx/fei/SPTK-3.0/src/bin/x2x/x2x +af f0/".$file.".f0 > f0_float/".$file.".f0");
>>                 system("/home/tomx/fei/SPTK-3.0/src/bin/x2x/x2x +fa f0_float/".$file.".f0 > f0_human/".$file.".f0");
>>                 $i++;
>>         }
>>    }
>>
>> closedir($handle);
>> }
>>
>> ?>
>>
>>
>
>
>


-- 
all your base are belong to us

