History source of Home(No. 55) - HMM/DNN-based speech synthesis system (HTS)

* Welcome! [#k4f3be02]
> The [[HMM-based Speech Synthesis System (HTS)>http://hts.sp.nitech.ac.jp/]] has been developed by the HTS working group and others (see [[Who we are]] and [[Acknowledgments]]).  The training part of HTS has been implemented as a modified version of [[HTK:http://htk.eng.cam.ac.uk/]] and is released in the form of patch code to HTK.  The patch code is released under a free software license.  However, it should be noted that &color(red){once you apply the patch to HTK, you must obey the [[license of HTK:http://htk.eng.cam.ac.uk/docs/license.shtml]].};
Related publications about the techniques and algorithms used in HTS can be
found [[here>Publications]].

> HTS version 2.1 includes hidden semi-Markov model (HSMM) training/adaptation/synthesis, a speech parameter generation algorithm considering global variance (GV), SMAPLR/CSMAPLR adaptation, and other minor new features.  Many bugs in HTS version 2.0.1 were also fixed.  Version 1.0 of the hts_engine API, the API for the runtime synthesis module, was also released.  Because hts_engine can run without the HTK library, users can develop their own open or proprietary software based on hts_engine.  HTS and the hts_engine API do not include any text analyzers, but the [[Festival Speech Synthesis System:http://www.festvox.org/festival/]], the [[DFKI MARY Text-to-Speech System:http://mary.dfki.de/]], or other text analyzers can be used with HTS.  This distribution includes demo scripts for training speaker-dependent and speaker-adaptive systems using the [[CMU ARCTIC database:http://www.festvox.org/cmu_arctic/]] (English).
Six HTS voices for Festival 1.96 have also been released.  They use the hts_engine module included in Festival.  Each HTS voice can be used without any other HTS tools.

> For training Japanese voices, a demo script using the Nitech database is also provided.  Japanese voices trained by the demo script can be used on [[GalateaTalk:http://hil.t.u-tokyo.ac.jp/~galatea/]], which is a speech synthesis module of an open-source toolkit for anthropomorphic spoken dialogue agents developed in the [[Galatea project:http://hil.t.u-tokyo.ac.jp/~galatea/]].  An HTS voice for Galatea trained by the demo script has also been released.

* News! [#ve28e7f9]
- ''July 31, 2008''
> The hts_engine website moved to http://hts-engine.sourceforge.net/~
hts_engine API version 1.01 and Flite+hts_engine version 0.90 were released.

- ''July 14, 2008''
> [[Keiichiro Oura:http://www.sp.nitech.ac.jp/~uratec/]] took over as maintainer of HTS from [[Heiga Zen:http://www.sp.nitech.ac.jp/~zen/]].

- ''June 27, 2008''
> HTS version 2.1 and hts_engine API version 1.0 were released.~
Their new features are
- HTS-2.1
-- Many bug fixes
-- Released under the [[New and Simplified BSD license:http://www.opensource.org/]]
-- Simple documentation
-- 64-bit compile support
-- MAXSTRLEN (max length of strings), SMAX (max # of streams), and PAT_LEN (max length of patterns) can be set through the configure script, for example
 ./configure MAXSTRLEN=1024 SMAX=20
-- HFB:
--- HSMM training and adaptation
-- HAdapt:
--- SMAPLR/CSMAPLR adaptation
-- HGen:
--- Speech parameter generation algorithm considering GV
--- Random generation of state transitions, state durations, and mixture components (by configuration variable RNDFLAGS)
-- HMGenS:
--- Speech parameter generation from HSMMs
-- HHEd:
--- Add DM command to delete existing macros
--- Add IT command to impose pre-built trees in clustering
--- Add JM command to merge different models on state or stream levels
--- MU command supports '*2' style mixing up
--- MU command supports mixture-level occupancy threshold in mixing up (by configuration variable MINMIXOCC)
- hts_engine API-1.0:
-- Released under the [[New and Simplified BSD license:http://www.opensource.org/]]
-- Support LSP-type parameters including LSP, mel-LSP, and MGC-LSP
-- Speech parameter generation algorithm considering GV

- ''June 13, 2008''
> HTS version 2.1RC2 and hts_engine API version 0.99 were released to the hts-users ML members.~
See [[here:http://hts.sp.nitech.ac.jp/hts-users/spool/2008/msg00336.html]] for details.

// - ''May 27, 2008''
// > HTS voice building tools for the MARY platform were released with [[DFKI MARY 3.6.0:http://mary.dfki.de/Download/mary-3-6-0-released]].
// 
// - ''March 24, 2008''
// > HTS version 2.1RC1 and hts_engine API version 0.96 were released to the hts-users ML members. See [[here:http://hts.sp.nitech.ac.jp/hts-users/spool/2008/msg00175.html]] for details.

// - ''January 15, 2008''
// > HTS version 2.1beta and hts_engine API version 0.95 were released to the hts-users ML members.

// - ''December 7, 2007''
// > hts_engine was ported to Java and included in [[DFKI MARY 3.5:http://mary.dfki.de/Download/mary-3-5-0-released]]. 

// - ''November 1, 2007''
// > HTS version 2.1alpha was released to the hts-users ML members.
// - ''October 1, 2007''
// > HTS version 2.0.1 and hts_engine_API version 0.9 were released.~
// The new features are
// - Many bug fixes.
// - Band structure for linear transforms.
// - Stream-dependent variance flooring scales.
// - State duration model mmf structure is changed.  In the previous versions we used a multi-variate Gaussian PDF to represent state duration PDFs of an HMM.  However, from this version we use multi-stream structure.  This is very important for the future HSMM support.
// - Demo scripts support LSP-type parameters for spectral representation in addition to cepstral ones.
// - API-style implementation of hts_engine.  Old stand-alone hts_engine will be thrown away.

// - ''September 20, 2007''
// > HTS version 2.0.1RC1 was released to the hts-users ML members.

// - ''September 18, 2007''
// > HTS version 2.0.1RC1 was released to the internal working group members.