[hts-users:03343] Re: About getf0 and cmp data file?
- Subject: [hts-users:03343] Re: About getf0 and cmp data file?
- From: Keiichiro Oura <uratec@xxxxxxxxxxxxxxx>
- Date: Sat, 9 Jun 2012 15:35:15 +0900
- Cc: uratec <uratec@xxxxxxxxxxxx>
Hi,
> 1. Yes, I guess so, but I am not sure about the Snack pitch-extraction
> algorithm, so I wonder why 5ms/25ms was chosen and not some other length?
When you changed the lower F0 limit, did the number of generated F0
frames change?
Anyway, 5ms/25ms are used to *reduce* the difference between Snack and SPTK.
I think the difference could be eliminated completely if the Snack source
code were modified, but I haven't tried that yet.
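For reference, a minimal sketch of the unit conversion only (not taken from the HTS demo scripts). It assumes the 16 kHz sampling rate and the FRAMESHIFT=80 / FRAMELEN=400 settings quoted below; the function name is made up for illustration. With those settings, the 5ms head and 25ms tail come out as one frame shift and one frame length in samples. Whether that is the design rationale is not stated in this thread.

```python
def ms_to_samples(duration_ms, sample_rate_hz):
    """Convert a duration in milliseconds to a whole number of samples."""
    return int(round(duration_ms * sample_rate_hz / 1000.0))


# Settings quoted in this thread (assumed to apply to the padding step too).
SAMPLE_RATE = 16000
FRAMESHIFT = 80
FRAMELEN = 400

head_pad = ms_to_samples(5, SAMPLE_RATE)   # padding added before the utterance
tail_pad = ms_to_samples(25, SAMPLE_RATE)  # padding added after the utterance

print("head padding:", head_pad, "samples; frame shift:", FRAMESHIFT)
print("tail padding:", tail_pad, "samples; frame length:", FRAMELEN)
```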
> 2. My raw file has 3104 frames at a 16 kHz sampling rate, with FRAMELEN=400
> FRAMESHIFT=80 FFTLEN=512 WINDOWTYPE=1 FREQWARP=0.42 GAMMA=0 MGCORDER=20.
> After mgc analysis, the mgc file has two fewer frames (3102 frames), while
> the lf0 file has the same number of frames (3104) as the original raw
> speech file, maybe because of the added head/tail. When the script uses
> the SPTK merge tool to compose the cmp file, the mgc file is merged with
> the lf0 file, so the final cmp file has the same number of frames as the
> mgc file, not the lf0 file, and the two extra lf0 frames at the tail are
> discarded directly.
> So I am confused and want to know the detailed reason. Is this reasonable?
Let me know the number of *samples* of your raw file.
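One way to get these numbers is from the file sizes. The following is a minimal sketch under stated assumptions, not part of the HTS demo: the raw file is assumed to be headerless 16-bit PCM, the mgc file is assumed to hold MGCORDER+1 float32 values per frame, and the lf0 file one float32 value per frame. The function names and paths are hypothetical.

```python
import os


def raw_sample_count(num_bytes, bytes_per_sample=2):
    """Samples in a headerless PCM file (assumption: 16-bit samples)."""
    return num_bytes // bytes_per_sample


def frame_count(num_bytes, values_per_frame, bytes_per_value=4):
    """Frames in a parameter file (assumption: float32 values)."""
    return num_bytes // (values_per_frame * bytes_per_value)


def report(raw_path, mgc_path, lf0_path, mgc_order=20):
    """Return sample and frame counts derived from the three file sizes."""
    return {
        "raw_samples": raw_sample_count(os.path.getsize(raw_path)),
        # Assumption: an order-M mgc frame stores M + 1 coefficients.
        "mgc_frames": frame_count(os.path.getsize(mgc_path), mgc_order + 1),
        # Assumption: lf0 stores a single value per frame.
        "lf0_frames": frame_count(os.path.getsize(lf0_path), 1),
    }
```

For example, `report("a.raw", "a.mgc", "a.lf0")` with your own file names would give the three counts side by side.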
Regards,
Keiichiro Oura
2012/6/9 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
> 1. Yes, I guess so, but I am not sure about the Snack pitch-extraction
> algorithm, so I wonder why 5ms/25ms was chosen and not some other length?
>
> 2. My raw file has 3104 frames at a 16 kHz sampling rate, with FRAMELEN=400
> FRAMESHIFT=80 FFTLEN=512 WINDOWTYPE=1 FREQWARP=0.42 GAMMA=0 MGCORDER=20.
> After mgc analysis, the mgc file has two fewer frames (3102 frames), while
> the lf0 file has the same number of frames (3104) as the original raw
> speech file, maybe because of the added head/tail. When the script uses
> the SPTK merge tool to compose the cmp file, the mgc file is merged with
> the lf0 file, so the final cmp file has the same number of frames as the
> mgc file, not the lf0 file, and the two extra lf0 frames at the tail are
> discarded directly.
> So I am confused and want to know the detailed reason. Is this reasonable?
>> Date: Sat, 9 Jun 2012 00:08:44 +0900
>> From: uratec@xxxxxxxxxxxxxxx
>> Subject: [hts-users:03341] Re: About getf0 and cmp data file?
>> To: hts-users@xxxxxxxxxxxxxxx
>> CC: uratec@xxxxxxxxxxxx
>>
>> Hi,
>>
>> > 1. Why is a 5ms/25ms waveform added to the head/tail, and not some other
>> > length, like a 10ms head or a 5ms tail? I guess there must be a special
>> > reason. I read getf0.tcl: the default minpitch/maxpitch for speech is
>> > 60/400Hz, so if the autocorrelation method is used to calculate the pitch
>> > of the first frame, at least (1/60)s=16.7ms of waveform should be added
>> > at the head, but the actual numbers are 5ms/25ms. Can you explain in more
>> > detail?
>>
>> Do you mean that the lower F0 limit affects the number of generated F0
>> frames?
>> When I changed the lower F0 limit, the number of generated F0 frames
>> did not change.
>>
>> > 2. I checked the *.mgc file; it always has two fewer frames than the
>> > *.raw file, so my question is: are the two discarded frames from the
>> > head of the raw file or from its tail?
>>
>> That's strange.
>> Please let me know the number of samples of your raw file, the command
>> lines of the mgc analysis, and the number of generated mgc frames.
>>
>> Regards,
>> Keiichiro Oura
>>
>>
>> 2012/6/8 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
>> > Thank you for your answer, but I have some further questions about the
>> > details of the lf0 and mgc parameter extraction.
>> > 1. Why is a 5ms/25ms waveform added to the head/tail, and not some other
>> > length, like a 10ms head or a 5ms tail? I guess there must be a special
>> > reason. I read getf0.tcl: the default minpitch/maxpitch for speech is
>> > 60/400Hz, so if the autocorrelation method is used to calculate the pitch
>> > of the first frame, at least (1/60)s=16.7ms of waveform should be added
>> > at the head, but the actual numbers are 5ms/25ms. Can you explain in more
>> > detail?
>> >
>> > 2. I checked the *.mgc file; it always has two fewer frames than the
>> > *.raw file, so my question is: are the two discarded frames from the
>> > head of the raw file or from its tail?
>> >
>> >> Date: Fri, 8 Jun 2012 09:01:56 +0900
>> >> From: uratec@xxxxxxxxxxxxxxx
>> >> Subject: [hts-users:03339] Re: About getf0 and cmp data file?
>> >> To: hts-users@xxxxxxxxxxxxxxx
>> >> CC: uratec@xxxxxxxxxxxx
>> >>
>> >> Hi,
>> >>
>> >> The number of frames generated by Snack (in ActiveTcl) is often lower
>> >> than the number of frames generated by SPTK.
>> >> The internal frame calculation differs between Snack and SPTK.
>> >> Therefore, 5ms/25ms of waveform is added to the head/tail of the utterance
>> >> before F0 analysis in the HTS demo script.
>> >>
>> >> Regards,
>> >> Keiichiro Oura
>> >>
>> >>
>> >> 2012/6/7 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
>> >> > Hi all,
>> >> > When I prepared the cmp data for training, I found that the *.lf0 file
>> >> > always has two more frames than the *.mgc file, although they are
>> >> > extracted from the same *.raw file. So I checked ./data/makefile and
>> >> > found that 0.5ms of head data and 25ms of tail data are added to the
>> >> > *.raw file. I guess this may be the reason, but I do not know why it
>> >> > is done. Why add the head and the tail? Why 0.5ms and 25ms and not
>> >> > other values? Does anyone know about it? When composing the lf0 and
>> >> > mgc with the SPTK merge tool, how are the extra lf0 frames handled?
>> >> > Are they discarded directly?
>> >>
>>
- References
-
- [hts-users:03338] About getf0 and cmp data file?, ArthurLeo
- [hts-users:03339] Re: About getf0 and cmp data file?, Keiichiro Oura
- [hts-users:03340] RE: [hts-users:03339] Re: About getf0 and cmp data file?, ArthurLeo
- [hts-users:03341] Re: About getf0 and cmp data file?, Keiichiro Oura
- [hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?, ArthurLeo