[hts-users:00858] Re: get wav file information


Hi,

askhar@xxxxxxxxxx wrote (2007/10/17 20:16):

If I want to get wav file information, for example framelen, frameshift,
windowtype, etc., are there any tools in SPTK?

Those are not properties of the wav file.
They are *configuration variables* for spectral analysis and F0 extraction.
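
Of the variables mentioned, only the sampling frequency can be read from the wav file itself. As a rough sketch (using Python's standard wave module rather than SPTK; the function name wav_header_info and its dictionary keys are made up for illustration), the header fields of a PCM wav file could be inspected like this:

```python
import wave


def wav_header_info(path):
    """Return basic header fields of a PCM wav file (hypothetical helper)."""
    with wave.open(path, "rb") as wav:
        return {
            "sampfreq_hz": wav.getframerate(),       # compare with SAMPFREQ
            "channels": wav.getnchannels(),
            "bytes_per_sample": wav.getsampwidth(),
            "num_samples": wav.getnframes(),         # samples per channel
        }
```

Frame length, frame shift, window type and the rest do not appear in the header, because they are chosen at analysis time.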

In the demo script, I found some variables in the configure file, such as
framelen, frameshift, windowtype, normalize, fftlen, freqwarp, mcporder, lowerf0, upperf0,
and sampfreq. Could you explain these variables? Thanks.

Run

% ./configure --help

It should print messages like the following:

 ...
 SPEAKER     speaker name (default=slt)
 DATASET     dataset (default=cmu_us_arctic)
 VER         version number of this setting (default=1)
 QNUM        question set number (default='001')
 FRAMELEN    Frame length in point (default=400)
 FRAMESHIFT  Frame shift in point (default=80)
 WINDOWTYPE  Window type -> 0: Blackman 1: Hamming 2: Hanning (default=1)
 NORMALIZE   Normalization -> 0: none 1: by power 2: by magnitude (default=1)
 FFTLEN      FFT length in point (default=512)
 FREQWARP    Frequency warping factor (default=0.42)
 GAMMA       Pole/Zero weight factor (0: mel-cepstral analysis 1: LPC
             analysis 2,3,...,N: mel-generalized cepstral (MGC) analysis)
             (default=0)
 MGCLSP      Use MGC-LSPs instead of MGC coefficients (default=0)
 MGCORDER    Order of MGC analysis (default=24 for cepstral form, default=12
             for LSP form)
 LNGAIN      Use logarithmic gain instead of linear gain (default=0)
 LOWERF0     Lower limit for F0 extraction in Hz (default=80)
 UPPERF0     Upper limit for F0 extraction in Hz (default=350)
 PSTFILTER   Postfiltering factor (default=1.4)
 IMPLEN      Length of impulse response (default=4096)
 SAMPFREQ    Sampling frequency in Hz (default=16000)
 NMGCWIN     number of delta windows for MGC coefficients (default=3)
 NLF0WIN     number of delta windows for log F0 values (default=3)
 NSTATE      number of HMM states (default=5)
 NITER       number of iterations of embedded training (default=5)
 WFLOOR      mixture weight flooring scale (default=3)

Please read them.
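
Note that FRAMELEN and FRAMESHIFT are given in points (samples), so their duration depends on SAMPFREQ. A small sketch of the arithmetic with the default values is below; the helper names are made up for illustration, and the frame count assumes no padding at the edges, which the actual tools may handle differently:

```python
# Defaults taken from the ./configure --help output above.
SAMPFREQ = 16000   # Hz
FRAMELEN = 400     # points (samples)
FRAMESHIFT = 80    # points (samples)


def points_to_ms(points, sampfreq):
    """Convert a length in samples to milliseconds."""
    return 1000.0 * points / sampfreq


def num_frames(num_samples, framelen, frameshift):
    """Count full frames that fit without padding (an assumption)."""
    if num_samples < framelen:
        return 0
    return 1 + (num_samples - framelen) // frameshift


frame_ms = points_to_ms(FRAMELEN, SAMPFREQ)      # 25.0 ms window
shift_ms = points_to_ms(FRAMESHIFT, SAMPFREQ)    # 5.0 ms hop
frames_in_one_second = num_frames(SAMPFREQ, FRAMELEN, FRAMESHIFT)
print(frame_ms, shift_ms, frames_in_one_second)
```

So the defaults correspond to a 25 ms window shifted by 5 ms. If you change SAMPFREQ, FRAMELEN and FRAMESHIFT have to be changed as well to keep the same durations.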

Regards,

Heiga ZEN (Byung Ha CHUN)

--
------------------------------------------------
Heiga ZEN     (in Japanese pronunciation)
Byung Ha CHUN (in Korean pronunciation)

Department of Computer Science and Engineering
Nagoya Institute of Technology
Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan

http://www.sp.nitech.ac.jp/~zen
------------------------------------------------

References
[hts-users:00856] get wav file information, 艾斯卡尔