
[hts-users:03368] Re: About getf0 and cmp data file?


Hi,

It's strange.
In my test, the 1082 mgc frames were generated correctly.
The test commands are as follows.

# the number of samples = 86560 = 1082 frames * 80 points
% x2x +sa in.raw | wc -l
86560

# 432800 = 1082 frames * 400 points
% x2x +sf in.raw | frame -l 400 -p 80 | x2x +fa | wc -l
432800

# 553984 = 1082 frames * 512 points
% x2x +sf in.raw | frame -l 400 -p 80 | window -l 400 -L 512 -w 1 -n 1 \
    | x2x +fa | wc -l
553984

# 22722 = 1082 frames * 21 coefficients (order 20 plus c0)
% x2x +sf in.raw | frame -l 400 -p 80 | window -l 400 -L 512 -w 1 -n 1 \
    | mcep -a 0.42 -m 20 -l 512 | x2x +fa | wc -l
22722
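
The expected counts above follow from simple arithmetic on the analysis settings. A minimal sketch, assuming the sample count is an exact multiple of FRAMESHIFT (as it is here) and using illustrative variable names that are not part of SPTK:

```shell
#!/bin/sh
# Expected `wc -l` counts for the four test commands above.
# Assumption: SAMPLES is an exact multiple of FRAMESHIFT.
SAMPLES=86560     # x2x +sa in.raw | wc -l
FRAMESHIFT=80     # frame -p
FRAMELEN=400      # frame -l, window -l
FFTLEN=512        # window -L, mcep -l
MGCORDER=20       # mcep -m

FRAMES=$(( SAMPLES / FRAMESHIFT ))
FRAMED=$(( FRAMES * FRAMELEN ))        # values after frame
WINDOWED=$(( FRAMES * FFTLEN ))        # values after window (zero-padded to FFTLEN)
MGC=$(( FRAMES * (MGCORDER + 1) ))     # values after mcep (c0..c20)

echo "frames:        $FRAMES"
echo "after frame:   $FRAMED"
echo "after window:  $WINDOWED"
echo "after mcep:    $MGC"
```

With the settings from this thread it prints 1082, 432800, 553984 and 22722, which match the `wc -l` results above.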

Please check your SPTK commands and the SPTK version carefully.
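
One way to compare frame counts of existing files without SPTK is to divide the file size by the size of one frame. This is only a sketch: it assumes the parameter files are headerless 4-byte floats, and `count_frames` is a hypothetical helper, not an SPTK command.

```shell
#!/bin/sh
# Number of frames in a headerless float32 parameter file (assumed format).
#   $1 = file name
#   $2 = number of values per frame (e.g. 21 for MGCORDER=20, 1 for lf0)
count_frames() {
    bytes=$(wc -c < "$1")
    echo $(( bytes / (4 * $2) ))
}

# Example calls (file names are placeholders):
#   count_frames yourfile.mgc 21
#   count_frames yourfile.lf0 1
```

If the two calls report different numbers, the mismatch comes from the analysis commands rather than from the merge step.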

Regards,
Keiichiro Oura


2012/6/25 Leo Arthur <bin007.zhao@xxxxxxxxxxx>:
> The 3104 frames belong to a different raw file. Sorry, my description was unclear. Let me restate the problem:
>    The raw file with 86560 samples that I checked in my last mail also has this problem. The lf0 file has the same number of frames as the raw file (1082 frames), but the mgc file has 1080 frames, the same number as the cmp file. So my question is: why does the mgc file have two fewer frames than the raw file?
>
> On 2012-6-25 at 14:23, "Keiichiro Oura" <uratec@xxxxxxxxxxxxxxx> wrote:
>
>> Hi,
>>
>> When the number of samples is 86560, the length of the raw file is 5.41 sec.
>> So the number of frames should be 1082 (= 86560 points / FRAMESHIFT).
>>
>> But you said your framed raw file has 3104 frames.
>> Is this correct?
>>
>> Regards,
>> Keiichiro Oura
>>
>>
>> 2012/6/22 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
>>>
>>>    The number of samples is 86560, the length of the file is 5410ms, and
>>> the sample rate is 16k. So what is the problem?
>>>
>>>> Date: Tue, 19 Jun 2012 13:55:53 +0900
>>>> From: uratec@xxxxxxxxxxxxxxx
>>>> Subject: [hts-users:03358] Re: About getf0 and cmp data file?
>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>> CC: uratec@xxxxxxxxxxxx
>>>>
>>>> Hi,
>>>>
>>>> Please tell me the number of samples of your raw file.
>>>> It can be checked as follows.
>>>>
>>>> x2x +sa yourfile.raw | wc -l
>>>>
>>>> Regards,
>>>> Keiichiro Oura
>>>>
>>>>
>>>> 2012/6/16 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
>>>>>
>>>>> Yes, I installed SPTK-3.5. My training data has 4000 utterances. I have
>>>>> not checked every one of them, but I sampled 100 utterances, and all of
>>>>> them have the same problem: the mgc file has two fewer frames than the
>>>>> raw file. The raw files are 16k sample rate, 16-bit PCM, little-endian.
>>>>> The analysis settings are:
>>>>> FRAMELEN=400 FRAMESHIFT=80 FFTLEN=512 WINDOWTYPE=1 FREQWARP=0.42
>>>>> GAMMA=0 MGCORDER=20
>>>>>> Date: Thu, 14 Jun 2012 00:55:28 +0900
>>>>>> From: uratec@xxxxxxxxxxxxxxx
>>>>>> Subject: [hts-users:03347] Re: About getf0 and cmp data file?
>>>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>>>> CC: uratec@xxxxxxxxxxxx
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Do you use SPTK-3.5?
>>>>>> Anyway, please let me know the number of *samples* of your raw file.
>>>>>>
>>>>>> Regards,
>>>>>> Keiichiro Oura
>>>>>>
>>>>>>
>>>>>> 2012/6/9 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
>>>>>>> Thank you very much for the detailed answer. Here are further questions:
>>>>>>>
>>>>>>> 1. You assume the F0 is 200Hz, but in practice the F0 can vary from
>>>>>>> 60Hz to 400Hz. So even if a 5ms waveform is added to the head, the
>>>>>>> first frame cannot be guaranteed to sit at the start of the raw file.
>>>>>>> Is that right?
>>>>>>>
>>>>>>> 2. According to the attached illustration, the mgc file should have the
>>>>>>> same number of frames as the raw file, but in my test the mgc file has
>>>>>>> two fewer frames than the raw file. Is that abnormal?
>>>>>>>
>>>>>>>> Date: Sat, 9 Jun 2012 17:43:40 +0900
>>>>>>>> From: tokuda@xxxxxxxxxxxx
>>>>>>>> Subject: [hts-users:03344] Re: [hts-users:03342] RE: [hts-users:03341]
>>>>>>>> Re: About getf0 and cmp data file?
>>>>>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>>>>>> CC: tokuda@nitech.ac.jp
>>>>>>>>
>>>>>>>>> 1. Yes, I guess so, but I am not sure about the Snack algorithm for
>>>>>>>>> extracting pitch, so I wonder why 5ms/25ms and not another length?
>>>>>>>>
>>>>>>>> We add 5ms to the file head in order to adjust the positions of the
>>>>>>>> "frame center", assuming the extracted F0 is always 200Hz.
>>>>>>>>
>>>>>>>> See the attached illustration.
>>>>>>>>
>>>>>>>> Keiichi
>>>>>>>>
>>>>>>>>
>>>>>>>> 2012/6/9 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
>>>>>>>>> 1. Yes, I guess so, but I am not sure about the Snack algorithm for
>>>>>>>>> extracting pitch, so I wonder why 5ms/25ms and not another length?
>>>>>>>>>
>>>>>>>>> 2. My raw file has 3104 frames at a 16k sample rate, with
>>>>>>>>> FRAMELEN=400 FRAMESHIFT=80 FFTLEN=512 WINDOWTYPE=1 FREQWARP=0.42
>>>>>>>>> GAMMA=0 MGCORDER=20. After mgc analysis, the mgc file has two fewer
>>>>>>>>> frames (3102 frames), while the lf0 file has the same number of
>>>>>>>>> frames (3104) as the original raw speech file, maybe because of the
>>>>>>>>> added head/tail. At the stage where the SPTK merge tool composes the
>>>>>>>>> cmp file, the mgc and lf0 files are merged, so the final cmp file
>>>>>>>>> has the same number of frames as the mgc file, not the lf0 file, and
>>>>>>>>> the two extra lf0 frames at the tail are discarded directly.
>>>>>>>>> So I am confused and want to know the detailed reason. Is this
>>>>>>>>> reasonable?
>>>>>>>>>> Date: Sat, 9 Jun 2012 00:08:44 +0900
>>>>>>>>>> From: uratec@xxxxxxxxxxxxxxx
>>>>>>>>>> Subject: [hts-users:03341] Re: About getf0 and cmp data file?
>>>>>>>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>>>>>>>> CC: uratec@xxxxxxxxxxxx
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>> 1. Why is a 5ms/25ms waveform added to the head/tail? Why not another
>>>>>>>>>>> length, like a 10ms head or a 5ms tail? I guess there must be some
>>>>>>>>>>> special reason. I read getf0.tcl: the default minpitch/maxpitch for
>>>>>>>>>>> speech is 60/400Hz, so if the autocorrelation method is used to
>>>>>>>>>>> calculate the pitch of the first frame, at least (1/60)s = 16.7ms
>>>>>>>>>>> of waveform should be added to the head. But the numbers are
>>>>>>>>>>> 5ms/25ms, so can you explain in more detail?
>>>>>>>>>>
>>>>>>>>>> Do you mean that the lower F0 limit affects the number of generated
>>>>>>>>>> F0 frames?
>>>>>>>>>> When I changed the lower F0 limit, the number of generated F0 frames
>>>>>>>>>> did not change.
>>>>>>>>>>
>>>>>>>>>>> 2. I checked the *.mgc file; it always has two fewer frames than the
>>>>>>>>>>> *.raw file. So my question is: are the two discarded frames at the
>>>>>>>>>>> head or at the tail of the raw file?
>>>>>>>>>>
>>>>>>>>>> It's strange.
>>>>>>>>>> Please let me know the number of samples of your raw file, the
>>>>>>>>>> command lines of the mgc analysis, and the number of generated mgc
>>>>>>>>>> frames.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Keiichiro Oura
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2012/6/8 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
>>>>>>>>>>> Thank you for your answer, but I have some further questions about
>>>>>>>>>>> the details of the lf0 and mgc parameter extraction.
>>>>>>>>>>> 1. Why is a 5ms/25ms waveform added to the head/tail? Why not another
>>>>>>>>>>> length, like a 10ms head or a 5ms tail? I guess there must be some
>>>>>>>>>>> special reason. I read getf0.tcl: the default minpitch/maxpitch for
>>>>>>>>>>> speech is 60/400Hz, so if the autocorrelation method is used to
>>>>>>>>>>> calculate the pitch of the first frame, at least (1/60)s = 16.7ms
>>>>>>>>>>> of waveform should be added to the head. But the numbers are
>>>>>>>>>>> 5ms/25ms, so can you explain in more detail?
>>>>>>>>>>>
>>>>>>>>>>> 2. I checked the *.mgc file; it always has two fewer frames than the
>>>>>>>>>>> *.raw file. So my question is: are the two discarded frames at the
>>>>>>>>>>> head or at the tail of the raw file?
>>>>>>>>>>>
>>>>>>>>>>>> Date: Fri, 8 Jun 2012 09:01:56 +0900
>>>>>>>>>>>> From: uratec@xxxxxxxxxxxxxxx
>>>>>>>>>>>> Subject: [hts-users:03339] Re: About getf0 and cmp data file?
>>>>>>>>>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>>>>>>>>>> CC: uratec@xxxxxxxxxxxx
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> The number of frames generated by Snack (in ActiveTcl) is often
>>>>>>>>>>>> lower than the number of frames generated by SPTK.
>>>>>>>>>>>> The internal frame calculation is different between Snack and SPTK.
>>>>>>>>>>>> Therefore, 5ms/25ms waveforms are added to the head/tail of the
>>>>>>>>>>>> utterance before f0 analysis in the HTS demo script.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Keiichiro Oura
>>>>>>>>>>>>
>>>>>>>>>>>> 2012/6/7 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>> When I prepared the cmp data for training, I found that the *.lf0
>>>>>>>>>>>>> file always has two more frames than the *.mgc file, even though
>>>>>>>>>>>>> they are extracted from the same *.raw file. So I checked
>>>>>>>>>>>>> ./data/makefile and found that 0.5ms of head data and 25ms of tail
>>>>>>>>>>>>> data are added to the *.raw file. I guess that may be the reason,
>>>>>>>>>>>>> but I do not know why this is done. Why add the head and the tail?
>>>>>>>>>>>>> Why 0.5ms and 25ms, and not other lengths? Does anyone know?
>>>>>>>>>>>>> When composing the lf0 and mgc with the SPTK merge tool, how are
>>>>>>>>>>>>> the extra lf0 frames handled? Are they discarded directly?
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>>

References
[hts-users:03338] About getf0 and cmp data file?, ArthurLeo
[hts-users:03339] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03340] RE: [hts-users:03339] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03341] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03344] Re: [hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?, Keiichi Tokuda
[hts-users:03345] RE: [hts-users:03344] Re: [hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03347] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03357] RE: [hts-users:03347] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03358] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03361] RE: [hts-users:03358] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03364] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03367] Re: Re: About getf0 and cmp data file?, Leo Arthur