Diff of History - HMM/DNN-based speech synthesis system (HTS)

 * History [#af7558b8]
 #contents
 
 ** 2015 [#j9bcc490]
 
 > ''December 25, 2015''
 >> HTS version 2.3 was released.~
 Its new features are
 - HERest:
 -- Add VBLR adaptation.
 - HMGenS:
 -- Add DAEM-based parameter generation.
 -- Support DP search to determine state duration when the model alignments are given.
 - HInit, HRest, HRest:
 -- Support parallel mode.
 - HHEd:
 -- Speed up context-clustering by calculating differences between answers to current and previous questions.
 -- Add untying weights function in HHEd.
 - Demo scripts:
 -- Add modulation spectrum-based postfilter.
 -- Support text files instead of utt files for general English database.
 -- Turn off spectrum normalization in STRAIGHT.
 -- Add LSP postfilter.
 -- Support mel-cepstrum based aperiodic measure generated by STRAIGHT.
 -- Support new HTS voice format for hts_engine API.
 -- Integrate normal demo and STRAIGHT demo.
 
 ** 2014 [#w71b4b73]
 > ''December 25, 2014''
 >> HTS version 2.3 beta was released to the hts-users ML members.
 
 ** 2013 [#sca5a2a6]
 > ''May 1, 2013''
 >> A tutorial about HMM-based speech synthesis was published in the Proceedings of the IEEE: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=6495700
 
 ** 2012 [#k2361095]
 > ''December 25, 2012''
 >> HTS version 2.3 alpha was released to the hts-users ML members.
 
 ** 2011 [#j9420e9d]
 
 > ''July 7, 2011''
 >> HTS version 2.2 was released.~
 Its new features are
 - HERest:
 -- Support DAEM algorithm in parameter estimation step.
 - HHEd:
 -- Support KLD-based state-mapping and cross-lingual speaker adaptation.
 -- Context-clustering can be started in the middle of the tree building.
 - HMgeTool:
 -- Add ECD-based MGE training command, HMgeTool.
 - HSMMAlign:
 -- Add stand-alone HSMM based forced-alignment command, HSMMAlign.
 - Demo scripts:
 -- Change sampling frequency from 16kHz to 48kHz.
 -- Support bark critical-band based aperiodic measure.
 -- Change the speaker of the Brazilian Portuguese demo and the singer of the Japanese song demo.
 - Slides:
 -- Release slides as a tutorial of HMM-based speech synthesis.
 
 > ''March 3, 2011''
 >> HTS version 2.2 beta was released to the hts-users ML members.
 ** 2010 [#tde9d64b]
 > ''December 25, 2010''
 >> HTS version 2.2 alpha was released to the hts-users ML members.
 > ''May 14, 2010''
 >> HTS version 2.1.1 was released.~
 Its new features are
 - Based on HTK-3.4.1
 - Many bug fixes
 - HFst:
 -- WFST converter for forced-alignment of HSMM
 - HMGenS:
 -- Initial GV weight for parameter generation
 -- Model-level alignments given from label of singing voice to determine note-level durations
 - HHEd:
 -- Memory reduction options for context-clustering
 - Demo scripts:
 -- Context-dependent GV without silent and pause phoneme
 -- Demo using the Nitech Japanese database for singing voice synthesis
 
 ** 2009 [#f3d6c52f]
 
 > ''December 25, 2009''
 >> HTS version 2.1.1 beta was released to the hts-users ML members.
 > ''August 27, 2009''
 >> The first HTS meeting was held at [[Interspeech 2009:http://www.interspeech2009.org/conference/]].
 > ''May 22, 2009''
 >> HTS-Demo for Brazilian Portuguese was released.
 > ''March 16, 2009''
 >> Prof. Keiichi Tokuda & Dr. Heiga Zen have a [[tutorial about HMM-based speech synthesis>Tutorial]] at [[Interspeech 2009:http://www.interspeech2009.org/conference/]].  
 
 ** 2008 [#qd958615]
 
 > ''July 31, 2008''
 >> hts_engine website moved to http://hts-engine.sourceforge.net/~
 hts_engine API version 1.01 and Flite+hts_engine version 0.90 were released.
 > ''July 14, 2008''
 >> [[Keiichiro Oura:http://www.sp.nitech.ac.jp/~uratec/]] took over as maintainer of HTS from [[Heiga Zen:http://www.sp.nitech.ac.jp/~zen/]].
 > ''June 27, 2008''
 >> HTS version 2.1 and hts_engine API version 1.0 were released.~
 Their new features are
 - HTS-2.1
 -- Many bug fixes
 -- Released under the New and Simplified BSD license
 -- Simple documentation
 -- 64-bit compile support
 -- MAXSTRLEN (max length of strings), SMAX (max # of streams), and PAT_LEN (max length of patterns) can be set through configure script like
  ./configure MAXSTRLEN=1024 SMAX=20
 -- HFB:
 --- HSMM training and adaptation
 -- HAdapt:
 --- SMAPLR/CSMAPLR adaptation
 -- HGen:
 --- Speech parameter generation algorithm considering GV
 --- Random generation of state transitions, state durations, and mixture components (by configuration variable RNDFLAGS)
 -- HMGenS:
 --- Speech parameter generation from HSMMs
 -- HHEd:
 --- Add DM command to delete existing macros
 --- Add IT command to impose pre-built trees in clustering
 --- Add JM command to merge different models on state or stream levels
 --- MU command supports '*2' style mixing up
 --- MU command supports mixture-level occupancy threshold in mixing up (by configuration variable MINMIXOCC)
 - hts_engine API-1.0:
 -- Released under the New and Simplified BSD license
 -- Support LSP-type parameters including LSP, mel-LSP, and MGC-LSP
 -- Speech parameter generation algorithm considering GV
 
 > ''June 13, 2008''
 >> HTS version 2.1RC2 and hts_engine API version 0.99 were released to the hts-users ML members.
 > ''May 27, 2008''
 >> HTS voice building tools for the MARY platform were released with [[DFKI MARY 3.6.0:http://mary.dfki.de/Download/mary-3-6-0-released]].
 > ''March 24, 2008''
 >> HTS version 2.1RC1 and hts_engine API version 0.96 were released to the hts-users ML members.
 > ''January 15, 2008''
 >> HTS version 2.1beta and hts_engine_API version 0.95 were released to the hts-users ML members.
 
 ** 2007 [#afa4796d]
 > ''December 7, 2007''
 >> hts_engine was ported to Java and included in [[DFKI MARY 3.5:http://mary.dfki.de/Download/mary-3-5-0-released]]. 
 > ''November 1, 2007''
 >> HTS version 2.1alpha was released to the hts-users ML members.
 > ''October 1, 2007''
 >> HTS version 2.0.1 and hts_engine_API version 0.9 were released.~
 The new features are
 - Many bug fixes.
 - Band structure for linear transforms.
 - Stream-dependent variance flooring scales.
 - The MMF structure of the state duration model has changed. Previous versions used a multivariate Gaussian PDF to represent the state duration PDFs of an HMM; from this version on, a multi-stream structure is used. This is very important for future HSMM support.
 - Demo scripts support LSP-type parameters for spectral representation in addition to cepstral ones.
 - API-style implementation of hts_engine. The old stand-alone hts_engine will be discontinued.
 
 > ''September 20, 2007''
 >> HTS version 2.0.1RC1 was released to the hts-users ML members.
 > ''September 18, 2007''
 >> HTS version 2.0.1RC1 was released to the internal working group members.
 > ''March 20, 2007''
 >> Mail address of hts-users ML changed to hts-users@sp.nitech.ac.jp
 > ''March 7, 2007''
 >> HTS website moved to http://hts.sp.nitech.ac.jp/
 
 ** 2006 [#z5d7dda6]
 >''December 29, 2006''
 >> HTS version 2.0 was '''finally''' released :-)~
 The new features are
 - Based on [[HTK-3.4>http://htk.eng.cam.ac.uk/download.shtml]].
 - Compilation without [[SPTK>http://www.sp.nitech.ac.jp/~tokuda/SPTK/index.html]].
 - The [[license of HTS>License]] was slightly modified.
 - Support gcc4.
 - Thousands of fixed bugs.
 - HRest can generate state duration densities (-g option).
 - Model boundaries can be given to HERest (-e option).~
 Only a part of the model boundaries (e.g., pause positions) may be specified.
 - Reduced-memory implementation of context clustering in HHEd (-r option).
 - Each decision tree can have a name with regular expression (-p option).~
  TB 000 {(*-a+*, *-i+*, *-u+*).state[2]}
  TB 000 {(*-sil+*, *-pau+*).state[3]}
 As a result, two different trees can be constructed for consonants and vowels, respectively.
 - The interface of HMGenS has been switched from ''HHEd-style'' to ''HERest-style''.
 - Flexible model structures in HMGenS (in the previous version, the first stream is assumed as mcep, and the others are assumed as log F0).  Non-left-to-right models and full covariance matrices for state output pdfs can also be used.
 - EM-based parameter generation algorithm (-c option), i.e., mixtures of Gaussians can be used.
 -- -c 0: Cholesky decomposition
 -- -c 1: EM (with fixed state sequence)
 -- -c 2: EM (phone boundaries can be given with -e option)
 - Random generation algorithm is also supported (set config. variable RNDPG = TRUE).
 - Speaker adaptation, adaptive training, and semi-tied covariance transforms are supported for multi-stream HMMs/MSD-HMMs. 
 -- MLLRMEAN, MLLRCOV, and CMLLR-based adaptation.
 -- CMLLR-based adaptive training.
 -- Decision trees for context clustering can be used to define regression classes for adaptation.
 -- HMGenS can read MLLRMEAN, MLLRCOV, CMLLR, and SEMIT transforms for adaptation.
 - MAP adaptation is also supported.
 - Performance improvements in hts_engine.
 - Miscellaneous changes.
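
 The TB example in the list above selects models by wildcard patterns such as *-a+*. The following is only a rough sketch of how such patterns pick out model names, assuming that * behaves like a shell-style glob (matches any run of characters). The model names and the select helper are hypothetical and are not part of HTS or HHEd; Python's fnmatch is used purely for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical triphone-style model names (left-center+right); not taken from HTS.
MODEL_NAMES = ["k-a+t", "s-i+n", "t-u+k", "a-k+a", "x-sil+k", "a-pau+k"]

# The two pattern groups that appear in the TB example.
VOWEL_PATTERNS = ["*-a+*", "*-i+*", "*-u+*"]
PAUSE_PATTERNS = ["*-sil+*", "*-pau+*"]


def select(names, patterns):
    """Return the names matched by at least one glob pattern, in input order."""
    return [n for n in names if any(fnmatchcase(n, p) for p in patterns)]


vowel_models = select(MODEL_NAMES, VOWEL_PATTERNS)
pause_models = select(MODEL_NAMES, PAUSE_PATTERNS)
print(vowel_models)
print(pause_models)
```

 Under this assumption, each pattern group selects a disjoint set of models, so a separately named tree can be built for each group; a name such as a-k+a, whose center phone is in neither group, is selected by neither.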
 
 >''December 4, 2006''
 >> HTS version 2.0RC3 was released to members of the mailing list.
 >''July 1, 2006''
 >> HTS version 2.0RC2 was released to members of the mailing list.
 >''March 3, 2006''
 >> HTS version 2.0RC1 was released to members of the mailing list.
 >''February 15, 2006''
 >> HTS version 2.0RC0 was released to the internal working group. 
 
 ** 2003 [#ld26312a]
 >''December 26, 2003''
 >> HTS version 1.1.1 was released. The new features were
 - Based on HTK-3.2.1
 - Demo script for ARCTIC database
 - Demo script for an original database (Japanese)
 - Variance flooring in demo script
 - Postfiltering in hts-engine
 - Many fixed bugs
 
 >''October 14, 2003''
 >> New HTS voices trained by ARCTIC databases were released.
 >''June 11, 2003''
 >> HTS version 1.1b was released.
 >''May 9, 2003''
 >> HTS version 1.1 was released. The new features were
 - A small synthesis engine (to be called from Festival).
 - HMM file format converter for the engine.
 - Many fixed bugs (Thanks for reporting them).
 - Accompanied by HTS voices for Festival.
 
 >''January 21, 2003''
 >> Minor revision was made to HTS version 1.0. 
 
 ** 2002 [#s0e2a8a8]
 >''December 25, 2002''
 >> HTS version 1.0 was released.
 //>''September:''
 //>> The first paper about eigenvoice for HMM-based speech synthesis was //appeared in ICSLP'02.
 //
 //** 1999 [#v67c04e5]
 //> ''September:''
 //>> The first paper about the current HTS framework was appeared in //Eurospeech'99. 
 //> ''March:''
 //>> The first paper about MSD-HMM was appeared in ICASSP'99.
 //
 //** 1998 [#v760e959]
 //> ''November:''
 //>> The first paper about speaker adaptation for the HMM-based speech synthesis //using MLLR was appeared in ESCA/COCOSDA workshop on speech synthesis.
 //
 //** 1997 [#v95967aa]
 //> ''September:''
 //>> The first paper about speaker interpolation for the HMM-based speech //synthesis was appeared in Eurospeech'97.
 //
 ** 1995 [#v5285cba]
 > ''May 9, 1995''
 >> The first paper about the speech parameter generation algorithm appeared in ICASSP'95.