The HMM-based Speech Synthesis System (HTS) has been developed by the HTS working group and others (see Who we are and Acknowledgments). The training part of HTS is implemented as a modified version of HTK and is released in the form of patch code for HTK. The patch code is released under a free software license. Note, however, that once you apply the patch to HTK, you must obey the HTK license. Related publications about the techniques and algorithms used in HTS can be found here.

HTS version 2.2 includes a deterministic annealing EM (DAEM) algorithm in the parameter estimation step, KLD-based state-mapping and cross-lingual speaker adaptation, minimum generation error (MGE) training, and other minor new features. Many bugs in HTS version 2.1.1 were also fixed. HTS does not include any text analyzers, but the Festival Speech Synthesis System (English, Spanish, etc.), the DFKI MARY Text-to-Speech System (German, English, etc.), Flite+hts_engine (English), Open JTalk (Japanese), or other text analyzers can be used with HTS. The HTS slides are also released as a tutorial on HMM-based speech synthesis.

This distribution includes demo scripts for training speaker-dependent and speaker-adaptive systems using the CMU ARCTIC database (English). For training other voices, demo scripts using the Nitech database (Brazilian Portuguese, Japanese, and Japanese song) are also released.


  • December 25, 2014

    HTS version 2.3 beta was released to the hts-users ML members.

  • December 25, 2012

    HTS version 2.3 alpha was released to the hts-users ML members.

  • July 7, 2011

    HTS version 2.2 was released.
    Its new features are:

    • HERest:
      • Support DAEM algorithm in parameter estimation step.
    • HHEd:
      • Support KLD-based state-mapping and cross-lingual speaker adaptation.
      • Context-clustering can be started in the middle of the tree building.
    • HMgeTool:
      • Add ECD-based MGE training command, HMgeTool.
    • HSMMAlign:
      • Add stand-alone HSMM based forced-alignment command, HSMMAlign.
    • Demo scripts:
      • Change sampling frequency from 16 kHz to 48 kHz.
      • Support Bark critical-band-based aperiodicity measure.
      • Change the speaker of the Brazilian Portuguese demo and the singer of the Japanese song demo.
    • Slides:
      • Release slides as a tutorial of HMM-based speech synthesis.

