================================================================
          The HMM-based Speech Synthesis System (HTS)
             version 2.1RC1 release March 24, 2008

The HMM-Based Speech Synthesis System (HTS)
(http://hts.sp.nitech.ac.jp/) has been developed by the HTS working
group (see "Who we are" below) and others (see "Acknowledgments" in
the separate file). The training part of HTS was implemented as a
modified version of the Hidden Markov Model Toolkit (HTK)
(http://htk.eng.cam.ac.uk/). The major modifications we made to HTK
are listed below:

  - Context clustering based on the MDL criterion (instead of the
    ML criterion)
  - Stream-dependent context clustering
  - Multi-space probability distribution as the state output
    probability (for pitch pattern modeling)
  - State duration modeling and clustering

Publications on the techniques and algorithms used in HTS can be
found at

  http://hts.sp.nitech.ac.jp/?Publications

The current version does not include a text analyzer, but the
Festival Speech Synthesis System (http://www.festvox.org/festival/)
can be used as one.

Since version 1.1, HTS has included a small run-time synthesis
engine called hts_engine (less than 2 Mbytes including HMMs).
Because the synthesis engine can run without the HTK library, it is
suitable for use with Festival and other applications.

This distribution comes with demo scripts for training
speaker-dependent and speaker-adaptation systems using the "CMU
ARCTIC databases" (http://www.festvox.org/cmu_arctic/). These demo
scripts generate "voices" for Festival.

Six HTS voices for Festival 1.95 and 1.96, trained on the CMU ARCTIC
databases, are also released with HTS version 2.1RC1. Each HTS voice
consists of HMMs trained by the demo script and can be used as a
"voice" in the Festival Speech Synthesis System without any other
HTS tools.

*** Notes for Japanese speech synthesis ***

A demo script using the Nitech database for speech synthesis,
"Nitech Jp ATR503 m001", is also provided for training Japanese
voices. Voices trained by the demo script can be used with
GalateaTalk, the speech synthesis module of an open-source toolkit
for anthropomorphic spoken dialogue agents developed in the Galatea
project (http://hil.t.u-tokyo.ac.jp/~galatea/), without any other
HTS tools. An HTS voice for GalateaTalk trained by the demo script
is also released with HTS version 2.1RC1.

****************************************************************
                 What's new in version 2.1RC1
****************************************************************

 * Many bug fixes.
 * CSMAPLR adaptation.
 * HHEd MU command supports '*2' style mixing up.
 * Mixture-level occupancy threshold in mixing up.

****************************************************************
                            Copying
****************************************************************

The basic core system of HTS version 2.1RC1 is released as patch
code for HTK version 3.4. The patch code is released under an
MIT-style license, without commercial restrictions. Note, however,
that once you apply the patch to the HTK source code, you must obey
the license of HTK.

Although the patch code is free, we offer no warranties and no
maintenance. We will continue to endeavor to fix bugs and answer
queries when we can, but we are not in a position to guarantee it.
We will consider consultancy if desired; please contact us for
details.

If you are using HTS version 2.1RC1 in a commercial environment,
even though no license is required, we would be grateful if you let
us know, as it helps us justify our work to our various sponsors. We
also strongly encourage you to

 * reference the use of HTS in any publications that use the
   software
 * report all bugs that are found, with bug fixes where possible.

The current copyright on the core system is

/* --------------------------------------------------------------- */
/* The HMM-Based Speech Synthesis System (HTS) */
/* HTS Working Group */
/* */
/* Department of Computer Science */
/* Nagoya Institute of Technology */
/* and */
/* Interdisciplinary Graduate School of Science and Engineering */
/* Tokyo Institute of Technology */
/* */
/* Copyright (c) 2001-2008 */
/* All Rights Reserved. */
/* */
/* Permission is hereby granted, free of charge, to use and */
/* distribute this software in the form of patch code to HTK and */
/* its documentation without restriction, including without */
/* limitation the rights to use, copy, modify, merge, publish, */
/* distribute, sublicense, and/or sell copies of this work, and to */
/* permit persons to whom this work is furnished to do so, subject */
/* to the following conditions: */
/* */
/* 1. Once you apply the HTS patch to HTK, you must obey the */
/* license of HTK. */
/* */
/* 2. The source code must retain the above copyright notice, */
/* this list of conditions and the following disclaimer. */
/* */
/* 3. Any modifications to the source code must be clearly */
/* marked as such. */
/* */
/* NAGOYA INSTITUTE OF TECHNOLOGY, TOKYO INSTITUTE OF TECHNOLOGY, */
/* HTS WORKING GROUP, AND THE CONTRIBUTORS TO THIS WORK DISCLAIM */
/* ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL */
/* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT */
/* SHALL NAGOYA INSTITUTE OF TECHNOLOGY, TOKYO INSTITUTE OF */
/* TECHNOLOGY, HTS WORKING GROUP, NOR THE CONTRIBUTORS BE LIABLE */
/* FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY */
/* DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, */
/* WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTUOUS */
/* ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR */
/* PERFORMANCE OF THIS SOFTWARE. */
/* */
/* --------------------------------------------------------------- */

or

/* --------------------------------------------------------------- */
/* The HMM-Based Speech Synthesis System (HTS) */
/* HTS Working Group */
/* */
/* Department of Computer Science */
/* Nagoya Institute of Technology */
/* and */
/* Interdisciplinary Graduate School of Science and Engineering */
/* Tokyo Institute of Technology */
/* */
/* Copyright (c) 2001-2008 */
/* */
/* The Centre for Speech Technology Research */
/* University of Edinburgh */
/* */
/* Copyright (c) 2008 */
/* */
/* All Rights Reserved. */
/* */
/* Permission is hereby granted, free of charge, to use and */
/* distribute this software in the form of patch code to HTK and */
/* its documentation without restriction, including without */
/* limitation the rights to use, copy, modify, merge, publish, */
/* distribute, sublicense, and/or sell copies of this work, and to */
/* permit persons to whom this work is furnished to do so, subject */
/* to the following conditions: */
/* */
/* 1. Once you apply the HTS patch to HTK, you must obey the */
/* license of HTK. */
/* */
/* 2. The source code must retain the above copyright notice, */
/* this list of conditions and the following disclaimer. */
/* */
/* 3. Any modifications to the source code must be clearly */
/* marked as such. */
/* */
/* NAGOYA INSTITUTE OF TECHNOLOGY, TOKYO INSTITUTE OF TECHNOLOGY, */
/* UNIVERSITY OF EDINBURGH, HTS WORKING GROUP, AND THE CONTRIBUTORS */
/* TO THIS WORK DISCLAIM ALL WARRANTIES WITH REGARD TO THIS */
/* SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY */
/* AND FITNESS, IN NO EVENT SHALL NAGOYA INSTITUTE OF TECHNOLOGY, */
/* TOKYO INSTITUTE OF TECHNOLOGY, UNIVERSITY OF EDINBURGH, */
/* HTS WORKING GROUP, NOR THE CONTRIBUTORS BE LIABLE FOR ANY */
/* SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES */
/* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, */
/* WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTUOUS */
/* ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR */
/* PERFORMANCE OF THIS SOFTWARE. */
/* */
/* --------------------------------------------------------------- */

Several tools in HTS version 2.1RC1 are independent of HTK (though
most of them use the HTK library). The copyright of these tools is

/* --------------------------------------------------------------- */
/* The HMM-Based Speech Synthesis System (HTS) */
/* HTS Working Group */
/* */
/* Department of Computer Science */
/* Nagoya Institute of Technology */
/* and */
/* Interdisciplinary Graduate School of Science and Engineering */
/* Tokyo Institute of Technology */
/* */
/* Copyright (c) 2001-2008 */
/* All Rights Reserved. */
/* */
/* Permission is hereby granted, free of charge, to use and */
/* distribute this software and its documentation without */
/* restriction, including without limitation the rights to use, */
/* copy, modify, merge, publish, distribute, sublicense, and/or */
/* sell copies of this work, and to permit persons to whom this */
/* work is furnished to do so, subject to the following conditions: */
/* */
/* 1. The source code must retain the above copyright notice, */
/* this list of conditions and the following disclaimer. */
/* */
/* 2. Any modifications to the source code must be clearly */
/* marked as such. */
/* */
/* 3. Redistributions in binary form must reproduce the above */
/* copyright notice, this list of conditions and the */
/* following disclaimer in the documentation and/or other */
/* materials provided with the distribution. Otherwise, one */
/* must contact the HTS working group. */
/* */
/* NAGOYA INSTITUTE OF TECHNOLOGY, TOKYO INSTITUTE OF TECHNOLOGY, */
/* HTS WORKING GROUP, AND THE CONTRIBUTORS TO THIS WORK DISCLAIM */
/* ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL */
/* IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS, IN NO EVENT */
/* SHALL NAGOYA INSTITUTE OF TECHNOLOGY, TOKYO INSTITUTE OF */
/* TECHNOLOGY, HTS WORKING GROUP, NOR THE CONTRIBUTORS BE LIABLE */
/* FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY */
/* DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, */
/* WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTUOUS */
/* ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR */
/* PERFORMANCE OF THIS SOFTWARE. */
/* */
/* --------------------------------------------------------------- */

or

/* --------------------------------------------------------------- */
/* The HMM-Based Speech Synthesis System (HTS) */
/* HTS Working Group */
/* */
/* Department of Computer Science */
/* Nagoya Institute of Technology */
/* and */
/* Interdisciplinary Graduate School of Science and Engineering */
/* Tokyo Institute of Technology */
/* */
/* Copyright (c) 2001-2008 */
/* */
/* The Centre for Speech Technology Research */
/* University of Edinburgh */
/* */
/* Copyright (c) 2008 */
/* */
/* All Rights Reserved. */
/* */
/* Permission is hereby granted, free of charge, to use and */
/* distribute this software and its documentation without */
/* restriction, including without limitation the rights to use, */
/* copy, modify, merge, publish, distribute, sublicense, and/or */
/* sell copies of this work, and to permit persons to whom this */
/* work is furnished to do so, subject to the following conditions: */
/* */
/* 1. The source code must retain the above copyright notice, */
/* this list of conditions and the following disclaimer. */
/* */
/* 2. Any modifications to the source code must be clearly */
/* marked as such. */
/* */
/* 3. Redistributions in binary form must reproduce the above */
/* copyright notice, this list of conditions and the */
/* following disclaimer in the documentation and/or other */
/* materials provided with the distribution. Otherwise, one */
/* must contact the HTS working group. */
/* */
/* */
/* NAGOYA INSTITUTE OF TECHNOLOGY, TOKYO INSTITUTE OF TECHNOLOGY, */
/* UNIVERSITY OF EDINBURGH, HTS WORKING GROUP, AND THE CONTRIBUTORS */
/* TO THIS WORK DISCLAIM ALL WARRANTIES WITH REGARD TO THIS */
/* SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY */
/* AND FITNESS, IN NO EVENT SHALL NAGOYA INSTITUTE OF TECHNOLOGY, */
/* TOKYO INSTITUTE OF TECHNOLOGY, UNIVERSITY OF EDINBURGH, */
/* HTS WORKING GROUP, NOR THE CONTRIBUTORS BE LIABLE FOR ANY */
/* SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES */
/* WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, */
/* WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTUOUS */
/* ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR */
/* PERFORMANCE OF THIS SOFTWARE. */
/* */
/* --------------------------------------------------------------- */

****************************************************************
                          Installation
****************************************************************

Please expand HTS-2.1RC1_for_HTK-3.4.tar.bz2 and see the extracted
file "INSTALL".
Note that HTS requires HTK.

****************************************************************
                           Who we are
****************************************************************

The HTS working group is a voluntary group for developing the
HMM-Based Speech Synthesis System. Current members are

 Keiichi Tokuda      http://www.sp.nitech.ac.jp/~tokuda/
 (Principal Designer)
 Heiga Zen           http://www.sp.nitech.ac.jp/~zen/
 (Main Maintainer)
 Junichi Yamagishi   http://homepages.inf.ed.ac.uk/jyamagis/
 Alan W. Black       http://www.cs.cmu.edu/~awb/
 Takashi Masuko
 Shinji Sako         http://www.mmsp.nitech.ac.jp/~sako/
 Tomoki Toda         http://spalab.naist.jp/~tomoki/index_e.html
 Takashi Nose
 Keiichiro Oura      http://www.sp.nitech.ac.jp/~uratec/

and the membership changes over time.

The current formal contact address of the HTS working group and a
mailing list for HTS users can be found at
http://hts.sp.nitech.ac.jp/
================================================================