
[hts-users:01191] Re: How does HTK/HTS knows if the frame is voiced/unvoiced?


Hi Antonio,

HTS is based on a statistical method.  I think HTS does not explicitly
care whether a frame is voiced or unvoiced.  The two cases can only be
distinguished by the feature values themselves.  The observations are
simply processed frame by frame, with a 5 ms shift.
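
As a rough illustration of telling frames apart by the feature alone,
here is a minimal Python sketch.  It assumes that unvoiced frames carry
either negative infinity or a very large negative lf0 value standing in
for log(0).  The threshold and the function names are hypothetical and
are not taken from the HTK/HTS sources.

```python
import math

# Assumption: any lf0 value below this threshold stands in for log(0),
# i.e. the frame is unvoiced. The number is illustrative only.
UNVOICED_THRESHOLD = -1.0e9


def is_voiced(lf0_value):
    """Return True if the lf0 value looks like a real log-F0 value."""
    # Negative infinity and NaN fail isfinite(); large negative
    # sentinels fail the threshold comparison.
    return math.isfinite(lf0_value) and lf0_value > UNVOICED_THRESHOLD


def voicing_flags(lf0_track):
    """Map one lf0 value per frame to a voiced/unvoiced flag per frame."""
    return [is_voiced(value) for value in lf0_track]


# Four frames: voiced, unvoiced (sentinel), unvoiced (-inf), voiced.
frames = [math.log(120.0), -1.0e10, float("-inf"), math.log(95.0)]
flags = voicing_flags(frames)
```

In this sketch the voicing decision is derived from the lf0 value
itself, so nothing extra has to be stored per frame.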

Regards,

Thomas

On Fri, Feb 29, 2008 at 3:52 AM, Antonio Bonafonte
<antonio.bonafonte@xxxxxxx> wrote:
>
>  Dear all,
>
>  I am trying to understand how HTK/HTS knows which spaces apply to a frame.
>
>  In Yoshimura's thesis, pages 22-28, it seems that each observation
>  includes the parameters (e.g. mcp, lf0) and also the spaces that
>  apply to the frame. He calls this the space selector, S(o_t).
>
>  I would expect that, for every frame (assuming there are no delta
>  features), we code the S(o) information:
>     voiced frames => mcp and lf0
>     unvoiced frames => mcp
>
>  However, in the cmp files, the only information that signals whether
>  a frame is voiced is the lf0 value itself: for unvoiced frames the
>  value is log(0).
>
>  My question is: how does HTK/HTS identify the indexes of the observation?
>  Does it detect the log(0) value?
>  I have seen that the prototype file (proto/*) describes which
>  streams are or are not MSD, but I still don't know how each particular
>  frame is identified.
>
>  I am asking this question because we want to use different spectral
>  parameters (and different vector sizes)
>  for voiced and unvoiced frames.
>
>
>  Thank you in advance for your time.
>  Antonio
>
>
>

References
[hts-users:01190] How does HTK/HTS knows if the frame is voiced/unvoiced?, Antonio Bonafonte