Thank you for your answer, but I have some further questions about the details of the lf0 and mgc parameter extraction.
1. Why is a 5ms/25ms waveform added to the head/tail? Why not some other length, such as a 10ms head or a 5ms tail? I guess there must be a specific reason. I read getf0.tcl: the default minpitch/maxpitch for speech is 60/400Hz, so if the autocorrelation method is used to calculate the pitch of the first frame, at least (1/60)s = 16.7ms of waveform should be added to the head. But the actual numbers are 5ms/25ms, so could you explain in more detail?

2. I checked the *.mgc file; it always has two fewer frames than the *.raw file. So my question is: are the two discarded frames at the head of the raw file or at the tail?
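
To make the numbers in both questions concrete, here is a minimal Python sketch of the arithmetic. It is only an illustration, not how Snack or SPTK compute frames internally. The 60Hz minpitch and the 5ms/25ms padding come from this thread; the 16kHz sampling rate, the float32 storage, and the 25 values per mgc frame are assumptions that should be replaced with the values from your own setup. The helper names are hypothetical.

```python
import os


def longest_period_ms(min_f0_hz: float) -> float:
    """Longest pitch period, in milliseconds, for a given minimum F0."""
    return 1000.0 / min_f0_hz


def ms_to_samples(duration_ms: float, sample_rate_hz: int) -> int:
    """Convert a duration in milliseconds to a number of samples."""
    return round(duration_ms * sample_rate_hz / 1000.0)


def count_frames(path: str, values_per_frame: int, bytes_per_value: int = 4) -> int:
    """Count frames in a headerless binary file of fixed-size frames.

    Assumes each value takes bytes_per_value bytes (4 for float32).
    """
    size = os.path.getsize(path)
    frame_bytes = values_per_frame * bytes_per_value
    if size % frame_bytes != 0:
        raise ValueError(f"{path}: size {size} is not a multiple of {frame_bytes}")
    return size // frame_bytes


if __name__ == "__main__":
    sample_rate_hz = 16000  # assumption: replace with your recording rate
    print("longest period at 60Hz (ms):", longest_period_ms(60.0))
    print("head padding (samples):", ms_to_samples(5, sample_rate_hz))
    print("tail padding (samples):", ms_to_samples(25, sample_rate_hz))
    # Hypothetical file names; one value per lf0 frame, and an assumed
    # 25 values per mgc frame (mgc order 24 plus the 0th coefficient).
    for path, dim in (("example.lf0", 1), ("example.mgc", 25)):
        if os.path.exists(path):
            print(path, "frames:", count_frames(path, dim))
```

Running count_frames on the *.lf0 and *.mgc files of the same utterance shows the frame difference directly, which is how I arrived at the mismatch described in question 2.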

> Date: Fri, 8 Jun 2012 09:01:56 +0900
> From: uratec@xxxxxxxxxxxxxxx
> Subject: [hts-users:03339] Re: About getf0 and cmp data file?
> To: hts-users@xxxxxxxxxxxxxxx
> CC: uratec@xxxxxxxxxxxx
>
> Hi,
>
> The number of frames generated by Snack (in ActiveTcl) is often lower
> than the number of frames generated by SPTK.
> The internal frame calculation is different between Snack and SPTK.
> Therefore, 5ms/25ms waveform are added to head/tail of the utterance
> before f0 analysis in the HTS demo script.
>
> Regards,
> Keiichiro Oura
>
>
> 2012/6/7 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
> > Hi all,
> > When I prepared the cmp data for training, I found that the *.lf0 file
> > always have two more frames than the *.mgc file, and they are extracted from
> > the same *.raw file. So I checked the ./data/makefile,and I found the *.raw
> > file were added the 0.5ms head data and the 25ms tail data,I guess it maybe
> > the reason,but I don not know why do this?why add the head and the tail? why
> > 0.5ms and 25ms,not others? Can some one known about it? when composing the
> > lf0 and mgc with the SPTK tool merge function,how deal with the the extra
> > lf0 frame? discarded directly?