
[hts-users:01143] Re: HTS2.1beta-HVite


Hi,

Sacha Krstulovic wrote (2008/01/25 19:56):

> If one wanted to align some unseen acoustic observations
> with some HSMM states, would the conversion suggested
> by Simon with the use of HVite be the best way to achieve this,
> or would one have to hack HERest to export the affiliation
> between the states and the cumulated frames with best occupancy,
> or would there just be another better solution?

I think the best way is to use the Viterbi algorithm for HSMMs.
We have proposed a decoding method for HSMMs based on weighted finite-state transducers (WFSTs).
The WFST framework is very flexible, so a decoder for HSMMs can be developed easily on top of it.

Please refer to the following paper for details:

Keiichiro Oura, Heiga Zen, Yoshihiko Nankaku, Akinobu Lee, and Keiichi Tokuda, "Hidden semi-Markov model based speech recognition system using weighted finite-state transducer," Proc. of ICASSP 2006, pp. 33-34, Toulouse, France, May 2006.

Regards,

Heiga ZEN (Byung Ha CHUN)

--
------------------------------------------------
Heiga ZEN     (in Japanese pronunciation)
Byung Ha CHUN (in Korean pronunciation)

Department of Computer Science and Engineering
Nagoya Institute of Technology
Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan

http://www.sp.nitech.ac.jp/~zen
------------------------------------------------

References
[hts-users:01135] HTS2.1beta-HVite, zhizheng wu
[hts-users:01136] Re: HTS2.1beta-HVite, Heiga ZEN (Byung Ha CHUN)
[hts-users:01137] Re: HTS2.1beta-HVite, zhizheng wu
[hts-users:01138] Re: HTS2.1beta-HVite, Heiga ZEN (Byung Ha CHUN)
[hts-users:01140] Re: HTS2.1beta-HVite, Heiga ZEN (Byung Ha CHUN)
[hts-users:01141] Re: HTS2.1beta-HVite, Simon King
[hts-users:01142] Re: HTS2.1beta-HVite, Sacha Krstulovic