
[hts-users:03947] Re: MDL vs ML


Thank you very much again. I've read the paper. I see that MDL produces a kind of optimum, and Table 1 shows that it outperforms ML, although I think ML with cross-validation might find a better optimum.

I am still interested in the perceptual effects of MDL compared to ML in HMM-TTS systems. Unfortunately, I didn't find any papers on that topic.

Best Regards,
Balint Toth




On 2013.11.20 at 17:58, Xavi Gonzalvo wrote:
With the number of questions we have in the HMM-TTS system, it's interesting to know when to stop the splitting.
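As an illustration of how a description-length penalty can act as a stopping rule, here is a minimal Python sketch. It is not the HTS implementation: the one-dimensional Gaussian leaves, the two parameters per leaf, the toy contexts, and the names `log_lik`, `best_split`, and `grow` are all assumptions made for this example. A node is split only if the likelihood gain exceeds the penalty for the extra leaf.

```python
import math

def log_lik(values):
    """Log-likelihood of values under a Gaussian with ML mean and variance."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((v - mean) ** 2 for v in values) / n, 1e-8)  # variance floor
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def best_split(items, questions, n_total, params_per_leaf=2):
    """Return (question_name, delta) with the most negative change in
    description length, or None if no question reduces it.
    items is a list of (context, value) pairs (hypothetical toy format)."""
    parent = log_lik([v for _, v in items])
    # One split adds one leaf, i.e. params_per_leaf extra parameters.
    penalty = 0.5 * params_per_leaf * math.log(n_total)
    best = None
    for name, question in questions.items():
        yes = [v for c, v in items if question(c)]
        no = [v for c, v in items if not question(c)]
        if not yes or not no:
            continue  # this question does not divide the node
        delta = -(log_lik(yes) + log_lik(no) - parent) + penalty
        if delta < 0 and (best is None or delta < best[1]):
            best = (name, delta)
    return best

def grow(items, questions, n_total):
    """Greedily split until no question lowers the description length."""
    split = best_split(items, questions, n_total)
    if split is None:
        return [items]  # stop: this node becomes a leaf
    question = questions[split[0]]
    yes = [it for it in items if question(it[0])]
    no = [it for it in items if not question(it[0])]
    return grow(yes, questions, n_total) + grow(no, questions, n_total)

# Toy data: the value depends on "vowel" but not on "stressed".
items = [
    ({"vowel": vowel, "stressed": stressed}, (10.0 if vowel else 0.0) + offset)
    for vowel in (True, False)
    for stressed in (True, False)
    for offset in (-1.0, -0.5, 0.0, 0.5, 1.0)
]
questions = {
    "is_vowel": lambda c: c["vowel"],
    "is_stressed": lambda c: c["stressed"],
}
leaves = grow(items, questions, len(items))
print(len(leaves))
```

In this toy setup the tree splits on `is_vowel` and then stops, because splitting on `is_stressed` gives no likelihood gain and so cannot pay for the penalty.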


Referenced by Junichi Yamagishi in "An Introduction to HMM-Based Speech Synthesis".


2013/11/20 Tóth Bálint <toth.b@xxxxxxxxxxx>
Thank you very much for your answer.

From a practical point of view, why was this modification applied to HTS? HTK 'still' uses ML.

Maybe the higher number of contextual features is the reason?
Would applying ML in HTS cause a severe degradation in synthesized speech quality?

Best Regards,
Balint Toth

On 2013.11.19 at 21:55, Xavi Gonzalvo wrote:
The MDL criterion is an effective way to select the optimal probabilistic model from among various models. When used for decision tree clustering in HMM-TTS, the likelihood term of the ML criterion is kept, but MDL adds a penalty for larger trees.

Suppose we are given a sequence of N data points x = {x1, . . . , xN}. As an estimation problem, we could say that we are looking for the model that has generated this data. In other words, we try to estimate a vector of parameters θ = [θ1, . . . , θL] of a statistical model Pθ(x) for the data x. The MDL criterion selects the statistical model with the minimum description length for the given data. The description length Dj(x) for data x of an underlying probabilistic model j is given by:

D_j(x) = -\log P_{\hat{\theta}^{(j)}}(x) + \frac{L_j}{2} \log N

where:
• θ̂(j) represents the ML estimate of model j for the vector of parameters θ.

• Lj is the number of parameters of θ̂(j) in probabilistic model j.

One of the advantages of the MDL criterion is that the second term defined in the equation works as a penalty imposed for employing a large model size. So, as a model becomes more complex, the value of the first term decreases and that of the second term increases.
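To make the trade-off between the two terms concrete, here is a small Python sketch. It assumes the standard two-term description length (negative log-likelihood at the ML estimate plus half the number of parameters times log N), one-dimensional Gaussian clusters with two parameters each, and made-up data; the function names are hypothetical.

```python
import math

def gaussian_log_likelihood(xs):
    """Log-likelihood of xs under a Gaussian with ML mean and variance."""
    n = len(xs)
    mean = sum(xs) / n
    var = max(sum((x - mean) ** 2 for x in xs) / n, 1e-8)  # variance floor
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)

def description_length(clusters):
    """D = -log P + (L / 2) * log N, with 2 parameters (mean, variance)
    per cluster and N the total number of data points."""
    n_total = sum(len(c) for c in clusters)
    log_lik = sum(gaussian_log_likelihood(c) for c in clusters)
    n_params = 2 * len(clusters)
    return -log_lik + 0.5 * n_params * math.log(n_total)

a = [-1.0, -0.5, 0.0, 0.5, 1.0]
b_far = [9.0, 9.5, 10.0, 10.5, 11.0]   # clearly different from a
b_same = [-1.0, -0.5, 0.0, 0.5, 1.0]   # same distribution as a

# Well-separated data: the likelihood gain outweighs the penalty.
print(description_length([a, b_far]) < description_length([a + b_far]))

# Identical data: no likelihood gain, so the larger model only pays the penalty.
print(description_length([a, b_same]) > description_length([a + b_same]))
```

Both comparisons print `True`: the two-cluster model has the shorter description length only when the extra parameters actually improve the fit.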


See:

Rissanen, J. (1984). Universal coding, information, prediction, and estimation. IEEE Transactions on Information Theory, 30:629–636.




2013/11/19 Tóth Bálint <toth.b@xxxxxxxxxxx>
Dear All,

Is there a reason why MDL (Minimum Description Length) is preferred over ML (Maximum Likelihood) for building decision trees in HMM-TTS?

Thank you for your answer in advance!

Best Regards,
Balint Toth




--
Xavi.









--
Xavi.

