Hi,

Hui LIANG (Tienshyang) wrote:
Could anyone recommend a paper which elaborates on the process of converting decision trees to a regression tree (i.e., how the DR command of HHEd works)? I have found one paper in the references of the HTS Manual written by Dr. Heiga Zen on 27th June 2008: "J. Yamagishi, M. Tachibana, T. Masuko, and T. Kobayashi. Speaking style adaptation using context clustering decision tree for HMM-based speech synthesis, in Proc. of ICASSP, pages 5–8, 2004." However, it gives few details. Presumably, DR rewrites the decision tree files in the file format of regression trees, and then some pruning seems to be involved. I am not quite sure my understanding is correct.
There is no reference because converting decision trees to a regression class tree is straightforward: it just rewrites the decision trees in the HTK regression tree format. Pruning is then done to reduce the number of leaf nodes of the regression tree, because decision trees sometimes have tens of thousands of leaf nodes, which is too many for a regression tree. Pruning is based on occupancy counts. By loading the stats file, we know the number of training samples assigned to each leaf node of the regression tree. Pruning continues until the number of training samples at each leaf node reaches a threshold, which you can control through the configuration variable SHRINKOCCTHRESH. For example, if the threshold is 10000.0, pruning is done until all leaf nodes of the regression tree have an occupancy count of at least 10000.0.
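To make the idea concrete, here is a minimal Python sketch of occupancy-based pruning on a binary tree. It is only an illustration of the principle described above, not the actual HHEd code: the `Node` class, the `prune` and `leaves` functions, the bottom-up merge order, and the example counts are all assumptions made for this sketch. In HHEd the counts would come from the stats file, and the order in which nodes are merged may differ.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical binary tree node; occ is the occupancy count of a leaf."""
    name: str
    occ: float = 0.0
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None


def total_occ(node: Node) -> float:
    """Sum the occupancy counts of all leaves below (or at) this node."""
    if node.is_leaf():
        return node.occ
    return total_occ(node.left) + total_occ(node.right)


def leaves(node: Node) -> List[Node]:
    """Return the leaf nodes in left-to-right order."""
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)


def prune(node: Node, threshold: float) -> Node:
    """Bottom-up pruning: if a child is a leaf whose occupancy is below the
    threshold, collapse this node into a single leaf holding the summed
    occupancy of its whole subtree. (Assumed strategy, for illustration.)"""
    if node.is_leaf():
        return node
    prune(node.left, threshold)
    prune(node.right, threshold)
    if any(c.is_leaf() and c.occ < threshold for c in (node.left, node.right)):
        node.occ = total_occ(node)
        node.left = node.right = None
    return node


# Example with made-up counts and a threshold of 10000.0.
root = Node(
    "root",
    left=Node("A", left=Node("a1", 12000.0), right=Node("a2", 3000.0)),
    right=Node("B", left=Node("b1", 15000.0), right=Node("b2", 11000.0)),
)
prune(root, 10000.0)
print([(n.name, n.occ) for n in leaves(root)])
# prints [('A', 15000.0), ('b1', 15000.0), ('b2', 11000.0)]
```

In this example the leaf a2 falls below the threshold, so its parent A is collapsed into a single leaf, while the subtree under B is left untouched. Note that in this sketch the root can still end up below the threshold if the whole tree has too little data.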
Best regards,
Heiga ZEN (Byung Ha CHUN)

--
Heiga ZEN (Byung Ha CHUN)
Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975