History source of Publications(No. 30) - HMM/DNN-based speech synthesis system (HTS)

* Publications [#z45e85ef]

This page aims to collect HTS-related publications. ~
If you would like to add your publications to this page, please [[contact us>Contact]].

#contents

** Basic core techniques [#l9b04238]
- K. Tokuda, T. Yoshimura, T. Masuko, T. Kobayashi, T. Kitamura, '''Speech parameter generation algorithms for HMM-based speech synthesis,''' Proc. of ICASSP, pp.1315-1318, June 2000. &ref(tokuda_icassp2000.pdf,,"pdf");
- K. Tokuda, T. Masuko, N. Miyazaki, T. Kobayashi, '''Multi-space probability distribution HMM,''' IEICE Trans. Inf. & Syst., vol.E85-D, no.3, pp.455-464, March 2002. &ref(tokuda_ieice_e85-d_3_455-464_2002.pdf,,"pdf");
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, '''Simultaneous modeling of spectrum, pitch and duration in HMM-based speech synthesis,''' Proc. of Eurospeech, pp.2347-2350, Sept. 1999. &ref(yoshimura_eurospeech1999.pdf,,"pdf"); &ref(http://hts.sp.nitech.ac.jp/publications/dur-correct/,,"correction");
- T. Yoshimura, '''Simultaneous modeling of phonetic and prosodic parameters, and characteristic conversion for HMM-based text-to-speech systems,''' Ph.D thesis, Nagoya Institute of Technology, Jan. 2002. &ref(thesis_yossie.pdf,,pdf);
- K. Tokuda, H. Zen, A.W. Black, '''An HMM-based speech synthesis system applied to English,''' Proc. of 2002 IEEE SSW, Sept. 2002. &ref(tokuda_TTSworkshop2002.pdf,,"pdf");
- K. Tokuda, H. Zen, A.W. Black, '''An HMM-based approach to multilingual speech synthesis,''' Text to speech synthesis: New paradigms and advances, S. Narayanan, A. Alwan (Eds.), Prentice Hall, 2004. &ref(http://www.phptr.com/title/013145661X,,link);
- A.W. Black, H. Zen, K. Tokuda, '''Statistical parametric speech synthesis,''' Proc. of ICASSP, pp.1229-1232, Apr. 2007. &ref(http://www.sp.nitech.ac.jp/~zen/english/index.php?International%20conferences#n1133c96,,link);

** Acoustic modeling [#v94663df]
- H. Zen, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, '''Hidden semi-Markov model based speech synthesis,''' Proc. of ICSLP 2004, vol.II, pp.1397-1400, Oct. 2004. &ref(http://www.sp.nitech.ac.jp/~zen/english/index.php?International%20conferences,,link);
- H. Zen, K. Tokuda, T. Kitamura, '''An introduction of trajectory model into HMM-based speech synthesis,''' Proc. of 5th ISCA Speech Synthesis Workshop, June 2004. &ref(http://www.sp.nitech.ac.jp/~zen/english/index.php?International%20conferences,,link);
- J. Yamagishi, K. Onishi, T. Masuko, T. Kobayashi, '''Acoustic modeling of speaking styles and emotional expressions in HMM-based speech synthesis,''' IEICE Trans. on Inf. & Syst., vol.E88-D, no.3, pp.502-509, March 2005. &ref(http://search.ieice.org/bin/summary.php?id=e88-d_3_502&category=D&lang=E&year=2005&abst=&auth=1,,link);
- Y.-J. Wu, R.-H. Wang, '''Minimum generation error training for HMM-based speech synthesis,''' Proc. of ICASSP, pp.89-92, 2006. &ref(http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?isnumber=34757&arnumber=1659964&count=320&index=29,,link);
- Y. Nankaku, H. Zen, K. Tokuda, T. Kitamura, T. Masuko, '''A Bayesian approach to HMM-based speech synthesis,''' Tech. rep. of IEICE, vol.103, pp.193-77, 2003 (in Japanese).

** Speaker adaptation [#sea319db]
- J. Yamagishi, T. Kobayashi, '''Average-voice-based speech synthesis using HSMM-based speaker adaptation and adaptive training,''' IEICE Trans. on Inf. & Syst., vol.E90-D, no.2, pp.533-543, Feb. 2007. &ref(http://search.ieice.org/bin/summary.php?id=e90-d_2_533&category=D&year=2007&lang=E&abst=,,link); 
- J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''A training method of average voice model for HMM-based speech synthesis,''' IEICE Trans. on Fundamentals, vol.E86-A, no.8, pp.1956-1963, Aug. 2003. &ref(http://search.ieice.org/bin/summary.php?id=e86-a_8_1956&category=A&lang=E&year=2003&abst=&auth=1,,link);
- J. Yamagishi, M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''A context clustering technique for average voice models,''' IEICE Trans. Inf. & Syst., vol.E86-D, no.3, pp.534-542, March 2003. &ref(http://search.ieice.org/bin/summary.php?id=e86-d_3_534&category=D&lang=E&year=2003&abst=&auth=1,,link);
- M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''Text-to-speech synthesis with arbitrary speaker's voice from average voice,''' Proc. of Eurospeech, pp.345-348, Sept. 2001. &ref(http://www.isca-speech.org/archive/eurospeech_2001/e01_0345.html,,link);
- M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''Adaptation of pitch and spectrum for HMM-based speech synthesis using MLLR,''' Proc. of ICASSP, pp.805-808, May 2001. &ref(http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=941037,,link);
- M. Tamura, T. Masuko, K. Tokuda, T. Kobayashi, '''Speaker adaptation for HMM-based speech synthesis system using MLLR,''' Proc. ESCA/COCOSDA Workshop on Speech Synthesis, pp.273-276, Nov. 1998. 
- T. Masuko, K. Tokuda, T. Kobayashi, S. Imai, '''Voice characteristics conversion for HMM-based speech synthesis system,''' Proc. of ICASSP, pp.1611-1614, Apr. 1997.
- T. Masuko, K. Tokuda, T. Kobayashi, S. Imai, '''HMM-based speech synthesis with various voice characteristics,''' Proc. of ASA and ASJ 3rd Joint Meeting, pp.1043-1046, Dec. 1996. 
- J. Yamagishi, '''Average-Voice-Based Speech Synthesis,''' Ph.D thesis, Tokyo Institute of Technology, March 2006. &ref(http://www.kbys.ip.titech.ac.jp/yamagishi/,,link);

** Speaker interpolation [#j34fcdb8]
- T. Yoshimura, T. Masuko, K. Tokuda, T. Kobayashi, T. Kitamura, '''Speaker interpolation in HMM-based speech synthesis system,''' Proc. of Eurospeech, pp.2523-2526, Sept. 1997. &ref(http://www.sp.nitech.ac.jp/~yossie/,,link);
- T. Yoshimura, T. Masuko, K. Tokuda, T. Kobayashi, T. Kitamura, '''Speaker interpolation for HMM-based speech synthesis system,''' J. Acoust. Soc. Jpn. (E), vol.21, no.4, pp.199-206, 2000. 
- M. Tachibana, J. Yamagishi, T. Masuko, T. Kobayashi, '''Speech synthesis with various emotional expressions and speaking styles by style interpolation and morphing,''' IEICE Trans. Inf. & Syst., vol.E88-D, no.11, pp.2484-2491, Nov. 2005. &ref(http://search.ieice.org/bin/summary.php?id=e88-d_11_2484&category=D&lang=E&year=2005&abst=&auth=1,,link);

** Eigenvoices [#p68fa620]
- K. Shichiri, A. Sawabe, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, '''Eigenvoices for HMM-based speech synthesis,''' Proc. of ICSLP, pp.1269-1272, Sept. 2002. 

** Excitation model [#x7dafa44]
- T. Yoshimura, K. Tokuda, T. Masuko, T. Kobayashi, T. Kitamura, '''Mixed excitation for HMM-based speech synthesis,''' Proc. of Eurospeech, pp.2259-2262, Sept. 2001.
- S.-J. Kim, M.-S. Hahn, '''Two-band excitation for HMM-based speech synthesis,''' IEICE Trans. Inf. & Syst., vol.E90-D, no.1, pp.378-381, Jan. 2007. &ref(http://search.ieice.org/bin/summary.php?id=e90-d_1_378&category=D&year=2007&lang=E&abst=,,link);
- C. Hemptinne, '''Integration of the harmonic plus noise model (HNM) into the hidden Markov model-based speech synthesis system (HTS),''' Master thesis, IDIAP Research Institute, June 2006. &ref(http://www.idiap.ch/publications/hemptinne-idiap-rr-06-69.bib.abs.html,,link); 
- R. Maia, T. Toda, H. Zen, Y. Nankaku, K. Tokuda, '''Mixed excitation for HMM-based speech synthesis based on state-dependent filtering,''' Proc. of Spring Meeting of ASJ, vol. I, pp. 199-200, 1-8-4, March 2007.

** Blizzard Challenge [#n664e716]
- H. Zen, T. Toda, M. Nakamura, K. Tokuda, '''Details of Nitech HMM-based speech synthesis system for the Blizzard Challenge 2005,''' IEICE Trans. Inf. & Syst., vol.E90-D, no.1, pp.325-333, Jan. 2007. &ref(http://search.ieice.org/bin/summary.php?id=e90-d_1_325&category=D&year=2007&lang=E&abst=,,link);
- H. Zen, T. Toda, K. Tokuda, '''The Nitech-NAIST HMM-based speech synthesis system for the Blizzard Challenge 2006,''' Proc. of Blizzard Challenge 2006 workshop, Sept. 2006. &ref(http://festvox.org/blizzard/blizzard2006.html,,link);
- H. Zen, T. Toda, '''An overview of Nitech HMM-based speech synthesis system for Blizzard Challenge 2005,''' Proc. of Interspeech2005 (Eurospeech), pp.93-96, Sept. 2005. &ref(http://festvox.org/blizzard/blizzard2005.html,,link);
- Z.-H. Ling, Y.-J. Wu, Y.-P. Wang, L. Qin, R.-H. Wang, '''USTC system for Blizzard Challenge 2006: an improved HMM-based speech synthesis method,''' Proc. of Blizzard Challenge 2006 workshop, Sept. 2006. &ref(http://festvox.org/blizzard/blizzard2006.html,,link);

** Multilingual [#lde10388]
- R. Maia, H. Zen, K. Tokuda, T. Kitamura, F.G. Resende Jr., '''Towards the development of a Brazilian Portuguese text-to-speech system based on HMM,''' Proc. of Eurospeech, pp.2465-2468, Sept. 2003.
- O. Abdel-Hamid, S. Abdou, M. Rashwan, '''Improving Arabic HMM based speech synthesis quality,''' Proc. of Interspeech, pp.1332-1335, 2006.
- Y. Qian, F. Soong, Y. Chen, M. Chu, '''An HMM-based Mandarin Chinese text-to-speech system,''' Proc. of ISCSLP, Dec. 2006.
- S.-J. Kim, J.-J. Kim, M.-S. Hahn, '''Implementation and evaluation of an HMM-based Korean speech synthesis system,''' IEICE Trans. Inf. & Syst., vol. E89-D, no.3, pp.1116-1119, 2006. &ref(http://search.ieice.org/bin/summary.php?id=e89-d_3_1116&category=D&year=2006&lang=E&abst=,,link);
- C. Weiss, R. Maia, K. Tokuda, W. Hess, '''Low resource HMM-based speech synthesis applied to German,''' Proc. of ESSP, 2005.
- C. Plahl, '''Sprachsynthese mit Hidden Markov Modellen,''' Master thesis, Bielefeld University, 2005 (in German). &ref(https://www.techfak.uni-bielefeld.de/ags/ai/publications/master-theses.html#2005,,link);
- M. Barros, R. Maia, K. Tokuda, D. Freitas, F.G. Resende Jr., '''HMM-based European Portuguese speech synthesis,''' Proc. of Interspeech, pp.2581-2584, 2005.
- A. Lundgren, '''An HMM-based text-to-speech system applied to Swedish,''' Master thesis, Royal Institute of Technology (KTH), 2005. &ref(http://www.speech.kth.se/publications/masterprojects/,,link);
- T. Ojala, '''Auditory quality evaluation of present Finnish text-to-speech systems,''' Master thesis, Helsinki University of Technology, 2006. &ref(http://lib.tkk.fi/Dipl/list.html#2006,,link);
- M. Vainio, A. Suni, P. Sirjola, '''Developing a Finnish concept-to-speech system,''' Proc. of 2nd Baltic conference on Human Language Technologies, pp.201-206, 2005. &ref(http://www.ioc.ee/hlt2005/HLT2005.pdf,,link);
- B. Vesnicer, F. Mihelic, '''Evaluation of the Slovenian HMM-based speech synthesis system,''' Proc. of TSD, pp.513-520, 2004. &ref(http://www.springerlink.com/content/tgkj1x1nw7y8pecg/,,link);
- M. Homayounpour, S. Mehdi, '''Farsi speech synthesis using hidden Markov model and decision trees,''' The CSI Journal on Computer Science and Engineering, vol.2, no.1&3(a), 2004 (in Farsi). &ref(http://cs.ipm.ac.ir/jcse/Vol2No1_3.asp,,link);
- J. Latorre, K. Iwano, S. Furui, '''New approach to the polyglot speech generation by means of an HMM-based speaker adaptable synthesizer,''' Speech Communication, vol.48, no.10, pp.1227-1242, Oct. 2006. &ref(http://www.sciencedirect.com/science?_ob=ArticleURL&_udi=B6V1C-4K7X861-1&_user=10&_coverDate=10%2F31%2F2006&_alid=514662865&_rdoc=1&_fmt=summary&_orig=search&_cdi=5671&_sort=d&_docanchor=&view=c&_acct=C000050221&_version=1&_urlVersion=0&_userid=10&md5=8c3cf5fa605ff3ac0b730bea83bd7bb3,,link);
- J. Latorre, K. Iwano, S. Furui, '''Polyglot synthesis using a mixture of monolingual corpora,''' Proc. of ICASSP, pp.1-4, 2005.
- S. Martincic-Ipsic, I. Ipsic, '''Croatian HMM-based speech synthesis,''' Journal of Computing and Information Technology, vol.14, no.4, pp.307-313, 2006. &ref(http://cit.zesoi.fer.hr/browsePaper.php?issue=27&seq=6&paper=953,,link);
- X. Gonzalvo, I. Iriondo, J. Socoró, F. Alías, C. Monzo, '''HMM-based Spanish speech synthesis using CBR as F0 estimator,''' ISCA Tutorial and Research Workshop on Non Linear Speech Processing (NOLISP07), 2007. &ref(http://www.salle.url.edu/~st06375/en/personal.php#publicaciones,,link); &ref(http://www.salle.url.edu/~gonzalvo/hmm/,,samples);