The number of samples is 86560, the length of the file is 5410 ms, and the sample rate is 16 kHz. So what is the problem?

> Date: Tue, 19 Jun 2012 13:55:53 +0900
> From: uratec@xxxxxxxxxxxxxxx
> Subject: [hts-users:03358] Re: About getf0 and cmp data file?
> To: hts-users@xxxxxxxxxxxxxxx
> CC: uratec@xxxxxxxxxxxx
>
> Hi,
>
> Please tell me the number of samples in your raw file.
> It can be checked as follows.
>
> x2x +sa yourfile.raw | wc -l
>
> Regards,
> Keiichiro Oura
>
>
> 2012/6/16 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
> >
> > Yes, I installed SPTK-3.5. My training data has 4000 utterances. I have
> > not checked every one of them, but I sampled 100 utterances, and all of
> > them have the same problem: the mgc file has two fewer frames than the
> > raw file. The raw files are 16 kHz sample rate, 16-bit PCM, little-endian.
> > The command details are below:
> > FRAMELEN=400 FRAMESHIFT=80 FFTLEN=512 WINDOWTYPE=1 FREQWARP=0.42 GAMMA=0
> > MGCORDER=20
> >> Date: Thu, 14 Jun 2012 00:55:28 +0900
> >> From: uratec@xxxxxxxxxxxxxxx
> >> Subject: [hts-users:03347] Re: About getf0 and cmp data file?
> >> To: hts-users@xxxxxxxxxxxxxxx
> >> CC: uratec@xxxxxxxxxxxx
> >>
> >> Hi,
> >>
> >> Do you use SPTK-3.5?
> >> Anyway, please let me know the number of *samples* of your raw file.
> >>
> >> Regards,
> >> Keiichiro Oura
> >>
> >>
> >> 2012/6/9 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
> >> > Thank you very much for the detailed answer. Here are my further
> >> > questions:
> >> > 1. Now assuming the F0 is 200Hz: actually, the F0 can generally vary
> >> > from 60Hz to 400Hz, so even if a 5ms waveform is added to the head,
> >> > the first frame still cannot be guaranteed to sit at the start time
> >> > of the raw file. Is that right?
> >> >
> >> > 2. From the attached illustration, the mgc file should have the same
> >> > number of frames as the raw file, but in my test the mgc file has two
> >> > fewer frames than the raw file. Is that abnormal?
> >> >
> >> >> Date: Sat, 9 Jun 2012 17:43:40 +0900
> >> >> From: tokuda@xxxxxxxxxxxx
> >> >> Subject: [hts-users:03344] Re: [hts-users:03342] RE: [hts-users:03341] Re: About getf0 and cmp data file?
> >> >> To: hts-users@xxxxxxxxxxxxxxx
> >> >> CC: tokuda@nitech.ac.jp
> >> >>
> >> >> > 1. Yes, I guess, but I am not sure about the Snack algorithm for
> >> >> > getting pitch, so I wonder why 5ms/25ms and not some other length?
> >> >>
> >> >> We add 5ms to the file head in order to adjust the positions of the
> >> >> "frame center", assuming the extracted F0 is always 200Hz.
> >> >>
> >> >> See the attached illustration.
> >> >>
> >> >> Keiichi
> >> >>
> >> >>
> >> >> 2012/6/9 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
> >> >> > 1. Yes, I guess, but I am not sure about the Snack algorithm for
> >> >> > getting pitch, so I wonder why 5ms/25ms and not some other length?
> >> >> >
> >> >> > 2. My raw file has 3104 frames, 16 kHz sample rate, FRAMELEN=400
> >> >> > FRAMESHIFT=80 FFTLEN=512 WINDOWTYPE=1 FREQWARP=0.42 GAMMA=0
> >> >> > MGCORDER=20, but after mgc analysis the mgc file has two fewer
> >> >> > frames (3102 frames), and the lf0 file has the same number of
> >> >> > frames (3104) as the original speech raw file, maybe because of
> >> >> > the added head/tail. When the stage that uses the SPTK merge tool
> >> >> > composes the cmp file, the mgc file is merged with the lf0 file,
> >> >> > so the final cmp file has the same number of frames as the mgc
> >> >> > file, not the lf0 file, and the two extra frames at the tail of
> >> >> > the lf0 file are discarded directly.
> >> >> > So I was confused and want to know the detailed reason. Is it
> >> >> > reasonable?
> >> >> >> Date: Sat, 9 Jun 2012 00:08:44 +0900
> >> >> >> From: uratec@xxxxxxxxxxxxxxx
> >> >> >> Subject: [hts-users:03341] Re: About getf0 and cmp data file?
> >> >> >> To: hts-users@xxxxxxxxxxxxxxx
> >> >> >> CC: uratec@xxxxxxxxxxxx
> >> >> >>
> >> >> >> Hi,
> >> >> >>
> >> >> >> > 1. Why add a 5ms/25ms waveform to the head/tail? Why not other
> >> >> >> > lengths, like a 10ms head or a 5ms tail? I guess there must be
> >> >> >> > some special reason. I read getf0.tcl: the default
> >> >> >> > minpitch/maxpitch for speech is 60/400Hz, so if the
> >> >> >> > autocorrelation method is generally used to calculate the pitch
> >> >> >> > of the first frame, at least (1/60)s=16.7ms of waveform should
> >> >> >> > be added to the head, but now the numbers are 5ms/25ms, so can
> >> >> >> > you explain in more detail?
> >> >> >>
> >> >> >> Do you mean that the lower F0 limit affects the number of
> >> >> >> generated F0 frames?
> >> >> >> When I changed the lower F0 limit, the number of generated F0
> >> >> >> frames did not change.
> >> >> >>
> >> >> >> > 2. I checked the *.mgc file; it always has two fewer frames than
> >> >> >> > the *.raw file, so my question is: are the two discarded frames
> >> >> >> > the two head frames of the raw file or the tail frames?
> >> >> >>
> >> >> >> It's strange.
> >> >> >> Please let me know the number of samples of your raw file, the
> >> >> >> command lines of the mgc analysis, and the number of generated mgc
> >> >> >> frames, respectively.
> >> >> >>
> >> >> >> Regards,
> >> >> >> Keiichiro Oura
> >> >> >>
> >> >> >>
> >> >> >> 2012/6/8 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
> >> >> >> > Thank you for your answer, but I have some further questions
> >> >> >> > about the details of the lf0 and mgc parameter extraction.
> >> >> >> > 1. Why add a 5ms/25ms waveform to the head/tail? Why not other
> >> >> >> > lengths, like a 10ms head or a 5ms tail?
> >> >> >> > I guess there must be some special reason. I read getf0.tcl:
> >> >> >> > the default minpitch/maxpitch for speech is 60/400Hz, so if the
> >> >> >> > autocorrelation method is generally used to calculate the pitch
> >> >> >> > of the first frame, at least (1/60)s=16.7ms of waveform should
> >> >> >> > be added to the head, but now the numbers are 5ms/25ms, so can
> >> >> >> > you explain in more detail?
> >> >> >> >
> >> >> >> > 2. I checked the *.mgc file; it always has two fewer frames than
> >> >> >> > the *.raw file, so my question is: are the two discarded frames
> >> >> >> > the two head frames of the raw file or the tail frames?
> >> >> >> >
> >> >> >> >> Date: Fri, 8 Jun 2012 09:01:56 +0900
> >> >> >> >> From: uratec@xxxxxxxxxxxxxxx
> >> >> >> >> Subject: [hts-users:03339] Re: About getf0 and cmp data file?
> >> >> >> >> To: hts-users@xxxxxxxxxxxxxxx
> >> >> >> >> CC: uratec@xxxxxxxxxxxx
> >> >> >> >>
> >> >> >> >> Hi,
> >> >> >> >>
> >> >> >> >> The number of frames generated by Snack (in ActiveTcl) is often
> >> >> >> >> lower than the number of frames generated by SPTK.
> >> >> >> >> The internal frame calculation is different between Snack and
> >> >> >> >> SPTK.
> >> >> >> >> Therefore, 5ms/25ms waveforms are added to the head/tail of the
> >> >> >> >> utterance before f0 analysis in the HTS demo script.
> >> >> >> >>
> >> >> >> >> Regards,
> >> >> >> >> Keiichiro Oura
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> 2012/6/7 ArthurLeo <bin007.zhao@xxxxxxxxxxx>:
> >> >> >> >> > Hi all,
> >> >> >> >> > When I prepared the cmp data for training, I found that the
> >> >> >> >> > *.lf0 file always has two more frames than the *.mgc file,
> >> >> >> >> > even though they are extracted from the same *.raw file. So I
> >> >> >> >> > checked ./data/makefile, and I found that 0.5ms of head data
> >> >> >> >> > and 25ms of tail data are added to the *.raw file. I guess
> >> >> >> >> > that may be the reason, but I do not know why this is done.
> >> >> >> >> > Why add the head and the tail? Why 0.5ms and 25ms and not
> >> >> >> >> > other values? Does anyone know about it? When composing the
> >> >> >> >> > lf0 and mgc with the SPTK merge tool, how are the extra lf0
> >> >> >> >> > frames handled? Are they discarded directly?
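
The figures quoted in this thread can be cross-checked with a little arithmetic. The Python sketch below is only an illustration: the function names are hypothetical, and the two frame-count formulas are common textbook conventions, not a statement of how SPTK or Snack count frames internally (the thread itself notes that their internal calculations differ).

```python
def duration_ms(num_samples, sample_rate_hz):
    """Length of a mono signal in milliseconds."""
    return num_samples * 1000.0 / sample_rate_hz


def frames_by_shift(num_samples, frame_shift):
    """Assumed convention: one frame per shift, edges padded as needed."""
    return num_samples // frame_shift


def frames_without_padding(num_samples, frame_len, frame_shift):
    """Assumed convention: only frames that fit entirely inside the signal."""
    if num_samples < frame_len:
        return 0
    return (num_samples - frame_len) // frame_shift + 1


def pitch_period_ms(f0_hz):
    """One pitch period in milliseconds for a given F0."""
    return 1000.0 / f0_hz


if __name__ == "__main__":
    # Values quoted in the thread: 86560 samples, 16 kHz,
    # FRAMELEN=400, FRAMESHIFT=80.
    print(duration_ms(86560, 16000))               # 5410.0
    print(frames_by_shift(86560, 80))              # 1082
    print(frames_without_padding(86560, 400, 80))  # 1078
    # A 200Hz F0 has a 5ms period; a 60Hz lower limit has about 16.7ms.
    print(pitch_period_ms(200.0))                  # 5.0
    print(round(pitch_period_ms(60.0), 1))         # 16.7
```

The first value agrees with the 5410 ms reported at the top of the thread. The two frame counts differ only in how the signal edges are treated, which is one plausible reason why two tools can report slightly different numbers of frames for the same file.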