History source of Publications(No. 45) - HMM/DNN-based speech synthesis system (HTS)

* Publications [#z45e85ef]

This page aims to collect HTS-related publications. ~
If you would like to add your publications to this page, please [[contact us>Contact]].

#contents

** Basic core techniques [#l9b04238]
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, T. Kitamura, '''Speech parameter generation algorithms for HMM-based speech synthesis,''' Proc. of ICASSP, pp.1315-1318, June 2000. &ref(tokuda_icassp2000.pdf,,"pdf");
- K. Tokuda, T. Masuko, N. Miyazaki, T. Kobayashi, '''Multi-space probability distribution HMM,''' IEICE Trans. Inf. & Syst., vol.E85-D, no.3, pp.455-464, March 2002. &ref(tokuda_ieice_e85-d_3_455-464_2002.pdf,,"pdf");
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, '''Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis,''' Proc. of Eurospeech, pp.2347-2350, Sept. 1999. &ref(yoshimura_eurospeech1999.pdf,,"pdf"); &ref(http://hts.sp.nitech.ac.jp/publications/dur-correct/,,"correction");
- T. Yoshimura, '''Simultaneous modeling of phonetic and prosodic parameters, and characteristic conversion for HMM-based text-to-speech systems,''' Ph.D thesis, Nagoya Institute of Technology, Jan. 2002. &ref(thesis_yossie.pdf,,pdf);
- K. Tokuda, H. Zen, A.W. Black, '''An HMM-based speech synthesis system applied to English,''' Proc. of 2002 IEEE SSW, Sept. 2002. &ref(tokuda_TTSworkshop2002.pdf,,"pdf");
- K. Tokuda, H. Zen, A.W. Black, '''An HMM-based approach to multilingual speech synthesis,''' Text to speech synthesis: New paradigms and advances, S. Narayanan, A. Alwan (Eds.), Prentice Hall, 2004. &ref(http://www.phptr.com/title/013145661X,,link);
- A.W. Black, H. Zen, K. Tokuda, '''Statistical parametric speech synthesis,''' Proc. of ICASSP, pp.1229-1232, Apr. 2007. &ref(http://www.sp.nitech.ac.jp/~zen/english/index.php?Publications%2FInternational%20conferences#n1133c96,,link);
- H. Zen, T. Nose, J. Yamagishi, S. Sako, T. Masuko, A.W. Black, K. Tokuda, '''The HMM-based speech synthesis system version 2.0,''' Proc. of ISCA SSW6, Bonn, Germany, Aug. 2007. &ref(http://www.sp.nitech.ac.jp/~zen/english/index.php?Publications%2FInternational%20conferences#n1133c96,,link);

** Acoustic modeling [#v94663df]
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, '''Hidden semi-Markov model based speech synthesis,''' Proc. of ICSLP 2004, vol.II, pp.1397-1400, Oct. 2004. &ref(http://www.sp.nitech.ac.jp/~zen/english/index.php?Publications%2FInternational%20conferences#t1b7f6e2,,link);
- H. Zen, K. Tokuda, T. Kitamura, '''An introduction of trajectory model into HMM-based speech synthesis,''' Proc. of 5th ISCA Speech Synthesis Workshop, June 2004. &ref(http://www.sp.nitech.ac.jp/~zen/english/index.php?Publications%2FInternational%20conferences#t1b7f6e2,,link);
- J. Yamagishi, K. Onishi, T. Masuko, T. Kobayashi, '''Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis,''' IEICE Trans. on Inf. & Syst., vol.E88-D, no.3, pp.503-509, March 2005. &ref(http://search.ieice.org/bin/summary.php?id=e88-d_3_502&category=D&lang=E&year=2005&abst=&auth=1,,link);
- Y.-J. Wu, R.-H. Wang, '''Minimum generation error training for HMM-based speech synthesis,''' Proc. of ICASSP, pp.89-92, 2006. &ref(http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?isnumber=34757&arnumber=1659964&count=320&index=29,,link);
- Y. Nankaku, H. Zen, K. Tokuda, T. Kitamura, T. Masuko, '''A Bayesian approach to HMM-based speech synthesis,''' Tech. rep. of IEICE, vol.103, pp.193-77, 2003 (in Japanese).
- H. Zen, '''Implementing an HSMM-based speech synthesis system using an efficient forward-backward algorithm,''' Technical Report of Nagoya Institute of Technology, TR-SP-0001, Dec. 2007. &ref(http://www.sp.nitech.ac.jp/~zen/english/index.php?Publications%2FTechnical%20reports#n13009a5,,link);

** Speaker adaptation [#sea319db]
- J. Yamagishi, T. Kobayashi, '''Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training,''' IEICE Trans. on Inf. & Syst., vol.E90-D, no.2, pp.533-543, Feb. 2007. &ref(http://search.ieice.org/bin/summary.php?id=e90-d_2_533&category=D&year=2007&lang=E&abst=,,link); 
- M. Tachibana, J. Yamagishi, T. Masuko, T. Kobayashi, '''A Style Adaptation Technique for Speech Synthesis Using HSMM and Suprasegmental Features,''' IEICE Trans. on Inf. & Syst., vol.E89-D, no.3, pp.1092-1099, March 2006. &ref(http://search.ieice.org/bin/summary.php?id=e89-d_3_1092&category=D&lang=E&year=2006&abst=&auth=1,,link);
- J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''A training method of average voice model for HMM-based speech synthesis,''' IEICE Trans. on Fundamentals, vol.E86-A, no.8, pp.1956-1963, Aug. 2003. &ref(http://search.ieice.org/bin/summary.php?id=e86-a_8_1956&category=A&lang=E&year=2003&abst=&auth=1,,link);
- J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''A context clustering technique for average voice models,''' IEICE Trans. Inf. & Syst., vol.E86-D, no.3, pp.534-542, March 2003. &ref(http://search.ieice.org/bin/summary.php?id=e86-d_3_534&category=D&lang=E&year=2003&abst=&auth=1,,link);
- M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''Text-to-speech synthesis with arbitrary speaker's voice from average voice,''' Proc. of Eurospeech, pp.345-348, Sept. 2001. &ref(http://www.isca-speech.org/archive/eurospeech_2001/e01_0345.html,,link);
- M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR,''' Proc. of ICASSP, pp.805-808, May 2001. &ref(http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=941037,,link);
- M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''Speaker adaptation for HMM-based speech synthesis system using MLLR,''' Proc. ESCA/COCOSDA Workshop on Speech Synthesis, pp.273-276, Nov. 1998. &ref(http://citeseer.ist.psu.edu/288066.html,,link); 
- T. Masuko, K. Tokuda, T. Kobayashi, S. Imai, '''Voice characteristics conversion for HMM-based speech synthesis system,''' Proc. of ICASSP, pp.1611-1614, Apr. 1997. &ref(http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?tp=&arnumber=598807&isnumber=13191,,link);
- T. Masuko, K. Tokuda, T. Kobayashi, S. Imai, '''HMM-based speech synthesis with various voice characteristics,''' Proc. of ASA and ASJ 3rd Joint Meeting, pp.1043-1046, Dec. 1996. 
- J. Yamagishi, T. Masuko, T. Kobayashi, '''HMM-based expressive speech synthesis -- Towards TTS with arbitrary speaking styles and emotions,''' Proc. of Special Workshop in Maui (SWIM), Jan. 2004. &ref(http://www.kbys.ip.titech.ac.jp/yamagishi/pdf/SWIM.pdf,,pdf);
- J. Yamagishi, '''Average-Voice-Based Speech Synthesis,''' Ph.D thesis, Tokyo Institute of Technology, March 2006. &ref(http://homepages.inf.ed.ac.uk/jyamagis/,,link);
- L. Qin, Y.-J. Wu, Z.-H. Ling, R.-H. Wang, '''Improving the performance of HMM-based voice conversion using context clustering decision tree and appropriate regression matrix,''' Proc. of Interspeech, pp.2250-2253, Sept. 2006. &ref(http://mail.ustc.edu.cn/~qinlong/publication.htm,,link); &ref(http://mail.ustc.edu.cn/~qinlong/demo.htm,,samples);
- J. Yamagishi, T. Kobayashi, S. Renals, S. King, H. Zen, T. Toda, K. Tokuda,  '''Improved Average-Voice-based Speech Synthesis using Gender-Mixed Modeling and A Parameter Generation Algorithm considering GV,''' Proc. ISCA SSW6, Aug. 2007. 

** Speaker interpolation [#j34fcdb8]
- T. Yoshimura, T. Masuko, K. Tokuda, T. Kobayashi, T. Kitamura, '''Speaker interpolation in HMM-based speech synthesis system,''' Proc. of Eurospeech, pp.2523-2526, Sept. 1997. &ref(http://www.sp.nitech.ac.jp/~zen/yossie/,,link);
- T. Yoshimura, T. Masuko, K. Tokuda, T. Kobayashi, T. Kitamura, '''Speaker interpolation for HMM-based speech synthesis system,''' J. Acoust. Soc. Jpn. (E), vol.21, no.4, pp.199-206, 2000. &ref(http://ci.nii.ac.jp/naid/110003106260/en/,,link);
- M. Tachibana, J. Yamagishi, T. Masuko, T. Kobayashi, '''Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing,''' IEICE Trans. Inf. & Syst., vol.E88-D, no.11, pp.2484-2491, Nov. 2005. &ref(http://search.ieice.org/bin/summary.php?id=e88-d_11_2484&category=D&lang=E&year=2005&abst=&auth=1,,link);

** Eigenvoices/Multiple regressions [#p68fa620]
- K. Shichiri, A. Sawabe, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, '''Eigenvoices for HMM-based speech synthesis,''' Proc. of ICSLP, pp.1269-1272, Sept. 2002. &ref(http://www.sp.nitech.ac.jp/~demo/synthesis_demo_2001/eigenvoice2/eigendemo.html,,demo);
- M. Tachibana, T. Nose, J. Yamagishi, T. Kobayashi, '''A technique for controlling voice quality of synthetic speech using multiple regression HSMM,''' Proc. Interspeech, pp.2438-2441, Sept. 2006.
- T. Nose, J. Yamagishi, T. Kobayashi, '''A style control technique for speech synthesis using multiple regression HSMM,''' Proc. Interspeech, pp.1324-1327, Sept. 2006. &ref(http://www.kbys.ip.titech.ac.jp/demo/stylectrl/MRHSMM/index.html,,samples);
- T. Nose, J. Yamagishi, T. Masuko, T. Kobayashi,  '''A Style Control Technique for HMM-based Expressive Speech Synthesis,''' IEICE Trans. Inf. & Syst., vol.E90-D, no.9, pp.1406-1413, Sept. 2007. &ref(http://search.ieice.org/bin/summary.php?id=e90-d_9_1406&category=D&lang=E&year=2007&abst=,,link);

** Excitation model [#x7dafa44]
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, '''Mixed excitation for HMM-based speech synthesis,''' Proc. of Eurospeech, pp.2259-2262, Sept. 2001.
- S.-J. Kim, M.-S. Hahn, '''Two-band excitation for HMM-based speech synthesis,''' IEICE Trans. Inf. & Syst., vol.E90-D, no.1, pp.378-381, Jan. 2007. &ref(http://search.ieice.org/bin/summary.php?id=e90-d_1_378&category=D&year=2007&lang=E&abst=,,link);
- C. Hemptinne, '''Integration of the harmonic plus noise model (HNM) into the hidden Markov model-based speech synthesis system (HTS),''' Master thesis, IDIAP Research Institute, June 2006. &ref(http://www.idiap.ch/publications/hemptinne-idiap-rr-06-69.bib.abs.html,,link); 
- R. Maia, T. Toda, H. Zen, Y. Nankaku, K. Tokuda, '''An excitation model for HMM-based speech synthesis based on residual modeling,''' Proc. ISCA SSW6, Aug. 2007. &ref(http://www.slc.atr.jp/%7Ermaia/demo.html,,samples);
- X. Gonzalvo, J.C. Socoro, I. Iriondo, C. Monzo, E. Martinez, '''Linguistic and mixed excitation improvements on a HMM-based speech synthesis for Castilian Spanish,''' Proc. ISCA SSW6, Aug. 2007. &ref(http://serpens.salleurl.edu/intranet/pdf/328.pdf,,link);


** Blizzard Challenge [#n664e716]
- H. Zen, T. Toda, M. Nakamura, K. Tokuda, '''Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005,''' IEICE Trans. Inf. & Syst., vol.E90-D, no.1, pp.325-333, Jan. 2007. &ref(http://search.ieice.org/bin/summary.php?id=e90-d_1_325&category=D&year=2007&lang=E&abst=,,link);
- H. Zen, T. Toda, K. Tokuda, '''The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006,''' Proc. of Blizzard Challenge 2006 workshop, Sept. 2006. &ref(http://festvox.org/blizzard/blizzard2006.html,,link);
- H. Zen, T. Toda, '''An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005,''' Proc. of Interspeech 2005 (Eurospeech), pp.93-96, Sept. 2005. &ref(http://festvox.org/blizzard/blizzard2005.html,,link);
- J. Yamagishi, H. Zen, T. Toda, K. Tokuda, '''Speaker-Independent HMM-based Speech Synthesis System - HTS-2007 System for the Blizzard Challenge 2007,''' Proc. of Blizzard Challenge 2007 workshop, Aug. 2007. &ref(http://festvox.org/blizzard/bc2007/blizzard_2007/blz3_008.html,,link);
- Z.-H. Ling, Y.-J. Wu, Y.-P. Wang, L. Qin, R.-H. Wang, '''USTC system for Blizzard Challenge 2006 an improved HMM-based speech synthesis method,''' Proc. of Blizzard Challenge 2006 workshop, Sept. 2006. &ref(http://festvox.org/blizzard/blizzard2006.html,,link); &ref(http://mail.ustc.edu.cn/~qinlong/demo.htm,,samples);
- Z.-H. Ling, L. Qin, H. Lu, Y. Gao, L.-R. Dai, R.-H. Wang, Y. Jiang, Z.-W. Zhao, J.-H. Yang, J. Chen, G.-P. Hu, '''The USTC and iFlytek Speech Synthesis Systems for Blizzard Challenge 2007,''' Proc. of Blizzard Challenge 2007 workshop, Aug. 2007. &ref(http://festvox.org/blizzard/bc2007/blizzard_2007/blz3_017.html,,link);
- J. Yamagishi, T. Nose, H. Zen, T. Toda, K. Tokuda, '''Performance Evaluation of The Speaker-Independent HMM-based Speech Synthesis System "HTS-2007" for the Blizzard Challenge 2007,''' Proc. of ICASSP, Apr. 2008. &ref(http://homepages.inf.ed.ac.uk/jyamagis/icassp/Lecture.pdf,,paper); &ref(http://homepages.inf.ed.ac.uk/jyamagis/icassp/Lecture.mov,,lecture);

** Practical implementation [#if7a8974]
- S.-J. Kim, J.-J. Kim, M.-S. Hahn, '''HMM-based Korean speech synthesis system for hand-held devices,''' IEEE Trans. Consumer Electronics, vol.52, no.4, pp.1384-1390, Nov. 2006. &ref(http://ieeexplore.ieee.org/iel5/30/4050031/04050071.pdf?tp=&arnumber=4050071&isnumber=4050031,,link);

** Multilingual [#lde10388]
- R. Maia, H. Zen, K. Tokuda, T. Kitamura, F.G. Resende Jr., '''Towards the development of a Brazilian Portuguese text-to-speech system based on HMM,''' Proc. of Eurospeech, pp.2465-2468, Sept. 2003.
- O. Abdel-Hamid, S. Abdou, M. Rashwan, '''Improving Arabic HMM based speech synthesis quality,''' Proc. of Interspeech, pp.1332-1335, 2006.
- Y. Qian, F. Soong, Y. Chen, M. Chu, '''An HMM-based Mandarin Chinese text-to-speech system,''' Proc. of ISCSLP, Dec. 2006.
- S.-J. Kim, J.-J. Kim, M.-S. Hahn, '''Implementation and evaluation of an HMM-based Korean speech synthesis system,''' IEICE Trans. Inf. & Syst., vol. E89-D, no.3, pp.1116-1119, 2006. &ref(http://search.ieice.org/bin/summary.php?id=e89-d_3_1116&category=D&year=2006&lang=E&abst=,,link);
- C. Weiss, R. Maia, K. Tokuda, W. Hess, '''Low resource HMM-based speech synthesis applied to German,''' Proc. of ESSP, 2005.
- C. Plahl, '''Sprachsynthese mit Hidden Markov Modellen,''' Master thesis, Bielefeld University, 2005 (in German). &ref(https://www.techfak.uni-bielefeld.de/ags/ai/publications/master-theses.html#2005,,link);
- M. Barros, R. Maia, K. Tokuda, D. Freitas, F.G. Resende Jr., '''HMM-based European Portuguese speech synthesis,''' Proc. of Interspeech, pp.2581-2584, 2005.
- A. Lundgren, '''An HMM-based text-to-speech system applied to Swedish,''' Master thesis, Royal Institute of Technology (KTH), 2005. &ref(http://www.speech.kth.se/publications/masterprojects/,,link);
- T. Ojala, '''Auditory quality evaluation of present Finnish text-to-speech systems,''' Master thesis, Helsinki University of Technology, 2006. &ref(http://lib.tkk.fi/Dipl/list.html#2006,,link);
- M. Vainio, A. Suni, P. Sirjola, '''Developing a Finnish concept-to-speech system,''' Proc. of 2nd Baltic conference on Human Language Technologies, pp.201-206, 2005. &ref(http://www.ioc.ee/hlt2005/HLT2005.pdf,,link);
- B. Vesnicer, F. Mihelic, '''Evaluation of the Slovenian HMM-based speech synthesis system,''' Proc. of TSD, pp.513-520, 2004. &ref(http://www.springerlink.com/content/tgkj1x1nw7y8pecg/,,link);
- M. Homayounpour, S. Mehdi, '''Farsi speech synthesis using hidden Markov model and decision trees,''' The CSI Journal on Computer Science and Engineering, vol.2, no.1&3(a), 2004 (in Farsi). &ref(http://cs.ipm.ac.ir/jcse/Vol2No1_3.asp,,link);
- J. Latorre, K. Iwano, S. Furui, '''New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer,''' Speech Communication, vol.48, no.10, pp.1227-1242, Oct. 2006. &ref(http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V1C-4K7X861-1&_user=10&_coverDate=10%2F31%2F2006&_alid=514662865&_rdoc=1&_fmt=summary&_orig=search&_cdi=5671&_sort=d&_docanchor=&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=8c3cf5fa605ff3ac0b730bea83bd7bb3,,link);
- J. Latorre, K. Iwano, S. Furui, '''Polyglot synthesis using a mixture of monolingual corpora,''' Proc. of ICASSP, pp.1-4, 2005.
- S. Martincic-Ipsic, I. Ipsic, '''Croatian HMM-based speech synthesis,''' Journal of Computing and Information Technology, vol.14, no.4, pp.307-313, 2006. &ref(http://cit.zesoi.fer.hr/browsePaper.php?issue=27&seq=6&paper=953,,link);
- X. Gonzalvo, I. Iriondo, J. Socoro, F. Alias, C. Monzo, '''HMM-based Spanish speech synthesis using CBR as F0 estimator,''' ISCA Tutorial and Research Workshop on Non Linear Speech Processing (NOLISP07), 2007. &ref(http://www.salle.url.edu/~st06375/en/personal.php#publicaciones,,link);
- S. Chomphan, T. Kobayashi, '''Implementation and Evaluation of an HMM-based Thai Speech Synthesis System,''' Proc. of Interspeech, 2007. &ref(http://www.kbys.ip.titech.ac.jp/demo/thai/index.html,,samples);
- S. Krstulovic, A. Hunecke, M. Schroeder, '''An HMM-Based Speech Synthesis System applied to German and its Adaptation to a Limited Set of Expressive Football Announcements,''' Proc. of Interspeech, 2007. &ref(http://www.dfki.de/~schroed/publications.html,,link);

** Singing voice synthesis [#qe0579fe]
- K. Saino, H. Zen, Y. Nankaku, A. Lee, K. Tokuda, '''HMM-based singing voice synthesis system,''' Proc. Interspeech, pp.1141-1144, Sept. 2006. &ref(http://www.sp.nitech.ac.jp/~k-saino/music/,,samples);

** Application of HTS [#qe0579fe]
*** Hybrid approaches [#v3966256]
- H. Kawai, T. Toda, J. Yamagishi, T. Hirai, J. Ni, N. Nishizawa, M. Tsuzaki, K. Tokuda, '''XIMERA: a concatenative speech synthesis system with large scale corpora,''' IEICE Trans. J89-D-II, no.12, pp.2688-2698, Dec. 2006 (in Japanese). &ref(http://search.ieice.org/bin/summary.php?id=j89-d_12_2688&category=D&year=2006&lang=J&abst=&auth=1,,link); &ref(http://ximera.atr.jp/index_j.html,,demo); &ref(http://www.ssw5.org/papers/1057.pdf,,link);
- T. Hirai, J. Yamagishi, S. Tenpaku, '''Utilization of an HMM-Based Feature Generation Module in 5 ms Segment Concatenative Speech Synthesis,''' Proc. ISCA SSW6, Aug. 2007.
*** Motion synthesis [#db18fe62]
- K. Mori, Y. Nankaku, C. Miyajima, K. Tokuda, and T. Kitamura, '''Motion generation for Japanese finger language based on hidden Markov models,''' Proc. FIT, pp.569-570, 2005.
- N. Niwase, J. Yamagishi, T. Kobayashi, '''Human Walking Motion Synthesis with Desired Pace and Stride Length Based on HSMM,''' IEICE Trans. Inf. & Syst., vol.E88-D, no.11, pp.2492-2499, Nov. 2005. &ref(http://search.ieice.org/bin/summary.php?id=e88-d_11_2492&category=D&lang=E&year=2005&abst=&auth=1,,link); &ref(http://www.kbys.ip.titech.ac.jp/demo/walking/index.html,,sample);
- G. Hofer, H. Shimodaira, J. Yamagishi, '''Speech driven Head Motion Synthesis based on a Trajectory Model,''' Proc. SIGGRAPH2007 Poster, 2007. &ref(http://homepages.inf.ed.ac.uk/jyamagis/Demo-html/demo.html,,sample);
- O. Govokhina, G. Bailly, G. Breton, and P. Bagshaw, '''TDA: a new trainable trajectory formation system for facial animation,''' Proc. Interspeech, pp.1274247, 2006.
*** Handwriting recognition [#se7dd65c]
- L. Ma, Y.J. Wu, P. Liu, and F. Soong, '''A MSD-HMM approach to pen trajectory modeling for online handwriting recognition,''' Proc. ICDAR, pp.128-132, 2007.
*** Audio-visual [#f41f9ada]
- M. Tamura, S. Kondo, T. Masuko, and T. Kobayashi, '''Text-to-audiovisual speech synthesis based on parameter generation from HMM,''' Proc. Eurospeech, pp.959-962, 1999.
- S. Sako, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, '''HMM-based text-to-audio-visual speech synthesis,''' Proc. ICSLP, pp.25-28, 2000.
- T. Ishikawa, Y. Sawada, H. Zen, Y. Nankaku, C. Miyajima, K. Tokuda, and T. Kitamura, '''Audio-visual large vocabulary continuous speech recognition based on early integration,''' Proc. FIT, pp.203-204, 2002.
*** ASR [#kf7dde8e]
- R. Terashima, T. Yoshimura, T. Wakita, K. Tokuda, and T. Kitamura, '''An evaluation method of ASR performance by HMM-based speech synthesis,''' Proc. Spring Meeting of ASJ, pp.159-160, 2003.
- K. Emoto, H. Zen, K. Tokuda, and T. Kitamura, '''Accent type recognition for automatic prosodic labeling,''' Proc. Autumn Meeting of ASJ, pp.225-226, 2003.
- H.L. Wang, Y. Qian, F. Soong, J.L. Zhou, and J.Q. Han, '''A multi-space distribution (MSD) approach to speech recognition of tonal languages,''' Proc. of Interspeech, pp.125-128, 2006.
- L. Zhang, C. Huang, M. Chu, F. Soong, X. Zhang, and Y. Chen, '''Automatic detection of tone mispronunciation in Mandarin,''' Proc. ISCSLP, pp.590-601, 2006.
- K. Tanaka, S. Kuroiwa, S. Tsuge, and F. Ren, '''An acoustic model adaptation using HMM-based speech synthesis,''' Proc. NLPKE, pp.368-373, 2003.
- M. Ishihara, C. Miyajima, N. Kitaoka, K. Itou, and K. Takeda, '''An approach for training acoustic models based on the vocabulary of the target speech recognition task,''' Proc. Spring Meeting of ASJ, pp.153-154, 2007.
*** Feature mapping [#u25f5d5e]
- K. Richmond, '''A trajectory mixture density network for the acoustic-articulatory inversion mapping,''' Proc. of Interspeech, pp.577-580, 2006.
*** Speech coding [#qe56bab1]
- T. Hoshiya, S. Sako, H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, and T. Kitamura, '''Improving the performance of HMM-based very low bitrate speech coding,''' Proc. ICASSP, pp.800-803, 2003.