[hts-users:03379] Re: [hts-users:03367] Re: Re: About getf0 and cmp data file?


Hi

On 25/06/12 13:52, Leo Arthur wrote:
> The 3104-frame file is a different raw file. Sorry for the unclear description; let me restate the problem:
>      The 86560-sample raw file I checked in my last mail also has this problem. The lf0 file has the same number of frames as the raw file (1082 frames), but the mgc file has 1080 frames, the same number as the cmp file. So my question is: why does the mgc file have two fewer frames than the raw file?

The .mgc files usually have fewer frames because the SPTK program
"frame" stops segmenting the input when EOF is detected. If you want
the exact number of frames, appending zeros (or white noise) to the
end of the .raw file should solve the problem. For example, when
window_function_size=5*frame_shift (as with your FRAMELEN=400 and
FRAMESHIFT=80), appending 2*frame_shift zeros should work.
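
As a rough check of the arithmetic, here is a minimal Python sketch. The
counting rule in frames_until_eof() is only an assumed model (first window
centered on sample 0, then one frame per complete frame_shift read before
EOF); it is not taken from the SPTK source, but it does reproduce the
counts reported in this thread. The function names are hypothetical, and
the sample count of the 3104-frame file is assumed to be 3104*FRAMESHIFT.

```python
# Assumed model of the frame counts discussed in this thread.
# It is NOT the SPTK implementation; all names here are hypothetical.

def expected_frames(n_samples: int, frame_shift: int) -> int:
    """Frame count one would expect: samples / FRAMESHIFT, rounded up."""
    return -(-n_samples // frame_shift)

def frames_until_eof(n_samples: int, frame_len: int, frame_shift: int) -> int:
    """Frames produced if segmentation stops at the first incomplete read.

    Assumption: the first window is centered on sample 0, so it consumes
    ceil(frame_len / 2) samples; every later frame needs a further
    complete frame_shift of samples.
    """
    if n_samples <= 0:
        return 0
    first_read = (frame_len + 1) // 2
    return 1 + max(0, n_samples - first_read) // frame_shift

def pad_raw(pcm: bytes, n_zero_samples: int, bytes_per_sample: int = 2) -> bytes:
    """Append zero-valued samples to raw PCM data (16-bit by default)."""
    return pcm + b"\x00" * (n_zero_samples * bytes_per_sample)

if __name__ == "__main__":
    FRAMELEN, FRAMESHIFT = 400, 80
    # 86560 is from the thread; 3104 * FRAMESHIFT is an assumed sample count.
    for n in (86560, 3104 * FRAMESHIFT):
        before = frames_until_eof(n, FRAMELEN, FRAMESHIFT)
        after = frames_until_eof(n + 2 * FRAMESHIFT, FRAMELEN, FRAMESHIFT)
        print(n, expected_frames(n, FRAMESHIFT), before, after)
    # prints:
    # 86560 1082 1080 1082
    # 248320 3104 3102 3104
```

Under this model, each file loses two frames at the tail, and padding with
2*frame_shift zero samples restores the expected count. To apply it to a
real file, pad_raw() would be called on the bytes of the .raw file before
running the analysis.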

Ranniery


> 
> On 2012-6-25, at 14:23, "Keiichiro Oura"<uratec@xxxxxxxxxxxxxxx> wrote:
> 
>> Hi,
>>
>> When the number of samples is 86560, the length of the raw file is 5.41 sec.
>> So, the number of frames should be 1082 (= 86560 samples / FRAMESHIFT).
>>
>> But you said your framed raw file has 3104 frames.
>> Is this correct?
>>
>> Regards,
>> Keiichiro Oura
>>
>>
>> 2012/6/22 ArthurLeo<bin007.zhao@xxxxxxxxxxx>:
>>>
>>>     The number of samples is 86560, the length of the file is 5410ms, and
>>> the sample rate is 16k. So what is the problem?
>>>
>>>> Date: Tue, 19 Jun 2012 13:55:53 +0900
>>>> From: uratec@xxxxxxxxxxxxxxx
>>>> Subject: [hts-users:03358] Re: About getf0 and cmp data file?
>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>> CC: uratec@xxxxxxxxxxxx
>>>>
>>>> Hi,
>>>>
>>>> Please tell me the number of samples in your raw file.
>>>> It can be checked as follows.
>>>>
>>>> x2x +sa yourfile.raw | wc -l
>>>>
>>>> Regards,
>>>> Keiichiro Oura
>>>>
>>>>
>>>> 2012/6/16 ArthurLeo<bin007.zhao@xxxxxxxxxxx>:
>>>>>
>>>>> Yes, I installed SPTK-3.5. My training data has 4000 utterances. I have
>>>>> not checked every one of them, but I sampled 100 utterances, and all of
>>>>> them have the same problem: the mgc file has two fewer frames than the
>>>>> raw file. The raw files are 16 kHz sample rate, 16-bit PCM, little-endian
>>>>> format. The analysis settings are as follows:
>>>>> FRAMELEN=400 FRAMESHIFT=80 FFTLEN=512 WINDOWTYPE=1 FREQWARP=0.42
>>>>> GAMMA=0 MGCORDER=20
>>>>>> Date: Thu, 14 Jun 2012 00:55:28 +0900
>>>>>> From: uratec@xxxxxxxxxxxxxxx
>>>>>> Subject: [hts-users:03347] Re: About getf0 and cmp data file?
>>>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>>>> CC: uratec@xxxxxxxxxxxx
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Do you use SPTK-3.5?
>>>>>> Anyway, please let me know the number of *samples* of your raw file.
>>>>>>
>>>>>> Regards,
>>>>>> Keiichiro Oura
>>>>>>
>>>>>>
>>>>>> 2012/6/9 ArthurLeo<bin007.zhao@xxxxxxxxxxx>:
>>>>>>> Thank you very much for the detailed answer. Here are my further
>>>>>>> questions:
>>>>>>>
>>>>>>> 1. You assume the F0 is 200Hz, but in practice the F0 generally varies
>>>>>>> from 60Hz to 400Hz. So even if a 5ms waveform is added to the head,
>>>>>>> the first frame cannot be guaranteed to be located at the start time
>>>>>>> of the raw file, can it?
>>>>>>>
>>>>>>> 2. According to the attached illustration, the mgc file should have the
>>>>>>> same number of frames as the raw file, but in my test the mgc file has
>>>>>>> two fewer frames than the raw file. Is that abnormal?
>>>>>>>
>>>>>>>> Date: Sat, 9 Jun 2012 17:43:40 +0900
>>>>>>>> From: tokuda@xxxxxxxxxxxx
>>>>>>>> Subject: [hts-users:03344] Re: [hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?
>>>>>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>>>>>> CC: tokuda@xxxxxxxxxxxx
>>>>>>>>
>>>>>>>>> 1. Yes, I guess so, but I am not sure about the Snack pitch extraction
>>>>>>>>> algorithm, so I wonder why 5ms/25ms and not some other length?
>>>>>>>>
>>>>>>>> We add 5ms to the file head in order to adjust the positions of the
>>>>>>>> "frame center", assuming the extracted F0 is always 200Hz.
>>>>>>>>
>>>>>>>> See the attached illustration.
>>>>>>>>
>>>>>>>> Keiichi
>>>>>>>>
>>>>>>>>
>>>>>>>> 2012/6/9 ArthurLeo<bin007.zhao@xxxxxxxxxxx>:
>>>>>>>>> 1. Yes, I guess so, but I am not sure about the Snack pitch extraction
>>>>>>>>> algorithm, so I wonder why 5ms/25ms and not some other length?
>>>>>>>>>
>>>>>>>>> 2. My raw file has 3104 frames at a 16 kHz sample rate, with
>>>>>>>>> FRAMELEN=400 FRAMESHIFT=80 FFTLEN=512 WINDOWTYPE=1 FREQWARP=0.42
>>>>>>>>> GAMMA=0 MGCORDER=20. After mgc analysis, the mgc file has two fewer
>>>>>>>>> frames (3102 frames), while the lf0 file has the same number of
>>>>>>>>> frames (3104) as the original raw speech file, maybe because of the
>>>>>>>>> added head/tail. At the stage where the SPTK merge tool composes the
>>>>>>>>> cmp file, the mgc file is combined with the lf0 file, so the final
>>>>>>>>> cmp file has the same number of frames as the mgc file, not the lf0
>>>>>>>>> file, and the two extra lf0 frames at the tail are discarded
>>>>>>>>> directly.
>>>>>>>>> So I was confused and want to know the detailed reason. Is this
>>>>>>>>> reasonable?
>>>>>>>>>> Date: Sat, 9 Jun 2012 00:08:44 +0900
>>>>>>>>>> From: uratec@xxxxxxxxxxxxxxx
>>>>>>>>>> Subject: [hts-users:03341] Re: About getf0 and cmp data file?
>>>>>>>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>>>>>>>> CC: uratec@xxxxxxxxxxxx
>>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>>> 1. Why is a 5ms/25ms waveform added to the head/tail? Why not another
>>>>>>>>>>> length, like a 10ms head or a 5ms tail? I guess there must be some
>>>>>>>>>>> special reason. I read getf0.tcl: the default minpitch/maxpitch for
>>>>>>>>>>> speech is 60/400Hz, so if the autocorrelation method is used to
>>>>>>>>>>> calculate the pitch of the first frame, at least (1/60)s=16.7ms of
>>>>>>>>>>> waveform should be added to the head. But the numbers are 5ms/25ms,
>>>>>>>>>>> so can you explain in more detail?
>>>>>>>>>>
>>>>>>>>>> Do you mean that the lower F0 limit affects the number of generated
>>>>>>>>>> F0 frames?
>>>>>>>>>> When I changed the lower F0 limit, the number of generated F0 frames
>>>>>>>>>> did not change.
>>>>>>>>>>
>>>>>>>>>>> 2. I checked the *.mgc file; it always has two fewer frames than the
>>>>>>>>>>> *.raw file. So my question is: are the two discarded frames the
>>>>>>>>>>> first two frames of the raw file or the last ones?
>>>>>>>>>>
>>>>>>>>>> It's strange.
>>>>>>>>>> Please let me know the number of samples of your raw file, the command
>>>>>>>>>> lines of the mgc analysis, and the number of generated mgc frames,
>>>>>>>>>> respectively.
>>>>>>>>>>
>>>>>>>>>> Regards,
>>>>>>>>>> Keiichiro Oura
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2012/6/8 ArthurLeo<bin007.zhao@xxxxxxxxxxx>:
>>>>>>>>>>> Thank you for your answer, but I have some further questions about
>>>>>>>>>>> the details of the lf0 and mgc parameter extraction.
>>>>>>>>>>> 1. Why is a 5ms/25ms waveform added to the head/tail? Why not another
>>>>>>>>>>> length, like a 10ms head or a 5ms tail? I guess there must be some
>>>>>>>>>>> special reason. I read getf0.tcl: the default minpitch/maxpitch for
>>>>>>>>>>> speech is 60/400Hz, so if the autocorrelation method is used to
>>>>>>>>>>> calculate the pitch of the first frame, at least (1/60)s=16.7ms of
>>>>>>>>>>> waveform should be added to the head. But the numbers are 5ms/25ms,
>>>>>>>>>>> so can you explain in more detail?
>>>>>>>>>>>
>>>>>>>>>>> 2. I checked the *.mgc file; it always has two fewer frames than the
>>>>>>>>>>> *.raw file. So my question is: are the two discarded frames the
>>>>>>>>>>> first two frames of the raw file or the last ones?
>>>>>>>>>>>
>>>>>>>>>>>> Date: Fri, 8 Jun 2012 09:01:56 +0900
>>>>>>>>>>>> From: uratec@xxxxxxxxxxxxxxx
>>>>>>>>>>>> Subject: [hts-users:03339] Re: About getf0 and cmp data file?
>>>>>>>>>>>> To: hts-users@xxxxxxxxxxxxxxx
>>>>>>>>>>>> CC: uratec@xxxxxxxxxxxx
>>>>>>>>>>>>
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> The number of frames generated by Snack (in ActiveTcl) is often lower
>>>>>>>>>>>> than the number of frames generated by SPTK.
>>>>>>>>>>>> The internal frame calculation is different between Snack and SPTK.
>>>>>>>>>>>> Therefore, 5ms/25ms waveforms are added to the head/tail of the
>>>>>>>>>>>> utterance before f0 analysis in the HTS demo script.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> Keiichiro Oura
>>>>>>>>>>>>
>>>>>>>>>>>> 2012/6/7 ArthurLeo<bin007.zhao@xxxxxxxxxxx>:
>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>> When I prepared the cmp data for training, I found that the *.lf0
>>>>>>>>>>>>> file always has two more frames than the *.mgc file, although they
>>>>>>>>>>>>> are extracted from the same *.raw file. So I checked
>>>>>>>>>>>>> ./data/makefile and found that 0.5ms of head data and 25ms of tail
>>>>>>>>>>>>> data are added to the *.raw file. I guess that may be the reason,
>>>>>>>>>>>>> but I do not know why this is done. Why add the head and the tail?
>>>>>>>>>>>>> Why 0.5ms and 25ms, not other lengths? Does anyone know? And when
>>>>>>>>>>>>> composing the lf0 and mgc with the SPTK merge tool, how are the
>>>>>>>>>>>>> extra lf0 frames dealt with? Are they discarded directly?
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>>
>>


-- 
----------------------------
Ranniery Maia
Speech Technology Group
Toshiba Research Europe LTD
Tel: +44 1223 436974


References
[hts-users:03338] About getf0 and cmp data file?, ArthurLeo
[hts-users:03339] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03340] RE: [hts-users:03339] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03341] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03344] Re: [hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?, Keiichi Tokuda
[hts-users:03345] RE: [hts-users:03344] Re: [hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03347] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03357] RE: [hts-users:03347] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03358] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03361] RE: [hts-users:03358] Re: About getf0 and cmp data file?, ArthurLeo
[hts-users:03364] Re: About getf0 and cmp data file?, Keiichiro Oura
[hts-users:03367] Re: Re: About getf0 and cmp data file?, Leo Arthur