
[hts-users:02433] Serious bug in HTK (and thus HTS)


Hi all,

As you may know from the HTK-Users/HTK-Developers mailing lists, I found a serious bug in HTK, in HUtil.c:ConvLogWt() and HUtil.c:ConvExpWt(). It also affects HTS. Currently these functions read as follows:

 void ConvLogWt(HMMSet *hset)
 {
    HMMScanState hss;

    if (hset->hsKind == DISCRETEHS || hset->hsKind == TIEDHS || hset->logWt == TRUE)
       return;
    NewHMMScan(hset, &hss);
    while (GoNextMix(&hss,FALSE))
       hss.me->weight = MixLogWeight(hset,hss.me->weight);
    EndHMMScan(&hss);
    hset->logWt = TRUE;
 }

 void ConvExpWt(HMMSet *hset)
 {
    HMMScanState hss;

    if (hset->hsKind == DISCRETEHS || hset->hsKind == TIEDHS  || hset->logWt == FALSE)
       return;
    NewHMMScan(hset, &hss);
    while (GoNextMix(&hss,FALSE))
       hss.me->weight = exp(hss.me->weight);
    EndHMMScan(&hss);
    hset->logWt = FALSE;
 }

The problem is that if your model set has a mixture-level parameter tying structure, these functions do not work
properly, because GoNextMix() skips mixture components that have already been seen (touched):

 Boolean GoNextMix(HMMScanState *hss, Boolean noSkip)
 {
    Boolean ok = TRUE;

    if (hss->isCont || (hss->hset->hsKind == TIEDHS)) {
       while (IsSeen(hss->mp->nUse) && ok){
          if (hss->m < hss->M) {
             ++hss->m;
             if (hss->isCont) {
                ++hss->me;
                hss->mp = hss->me->mpdf;
             } else {
                hss->mp = hss->hset->tmRecs[hss->s].mixes[hss->m];
                hss->me = NULL;
             }
          } else if (noSkip)
             return FALSE;
          else
             ok = GoNextStream(hss,FALSE);
       }
       if (ok) {
          Touch(&hss->mp->nUse);
          return TRUE;
       }
    } else { /* There are no components in a DISCRETEHS system - use GoNextStream instead */
       HError(7231,"GoNextMix: Cannot specify mixture components unless continuous");
    }
    hss->me = NULL;
    return FALSE;
 }

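Even if a mixture component has been seen (touched), its mixture weight may still be unseen (untouched) when the model set has a mixture-level tying structure, because the weight is stored in each mixture element (hss.me->weight) while the component itself (hss.me->mpdf) is shared. Therefore, the current ConvLogWt() and ConvExpWt() convert only the "first-seen" mixture weights; the remaining mixture weights stay in their original domain, although hset->logWt is switched as if every weight had been converted. This causes a serious inconsistency.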

My fixes for HTS are as follows:

--- HUtil.c  2009-12-11 11:53:00.000000000 +0000
+++ HUtil.c  2010-04-05 11:05:30.858955000 +0000
@@ -548,13 +548,17 @@
 /* EXPORT->ConvLogWt Converts all mixture weights into log-weights. */
 void ConvLogWt(HMMSet *hset)
 {
+   int m;
    HMMScanState hss;

    if (hset->hsKind == DISCRETEHS || hset->hsKind == TIEDHS || hset->logWt == TRUE)
       return;
    NewHMMScan(hset, &hss);
-   while (GoNextMix(&hss,FALSE))
-      hss.me->weight = MixLogWeight(hset,hss.me->weight);
+   while (GoNextStream(&hss,FALSE)) {
+      for (m=1; m<=hss.M; m++) {
+         hss.sti->spdf.cpdf[m].weight = MixLogWeight(hset,hss.sti->spdf.cpdf[m].weight);
+      }
+   }
    EndHMMScan(&hss);
    hset->logWt = TRUE;
 }
@@ -562,13 +566,17 @@
 /* EXPORT->ConvExpWt Converts all mixture log-weights into weights. */
 void ConvExpWt(HMMSet *hset)
 {
+   int m;
    HMMScanState hss;

    if (hset->hsKind == DISCRETEHS || hset->hsKind == TIEDHS  || hset->logWt == FALSE)
       return;
    NewHMMScan(hset, &hss);
-   while (GoNextMix(&hss,FALSE))
-      hss.me->weight = exp(hss.me->weight);
+   while (GoNextStream(&hss,FALSE)) {
+      for (m=1; m<=hss.M; m++) {
+         hss.sti->spdf.cpdf[m].weight = MixWeight(hset,hss.sti->spdf.cpdf[m].weight);
+      }
+   }
    EndHMMScan(&hss);
    hset->logWt = FALSE;
 }

I checked the effect of this bug with an HMM set (speaker-dependent, with a mixture-level tying structure but not TIEDHS) and found that the log likelihood changed significantly:

Before fixing this bug:

 Reestimation complete - average log prob per frame = 6.154159e+01
 Reestimation complete - average log prob per frame = 6.247933e+01
 Reestimation complete - average log prob per frame = 6.267500e+01
 Reestimation complete - average log prob per frame = 6.276860e+01
 Reestimation complete - average log prob per frame = 6.283732e+01

After fixing this bug:

 Reestimation complete - average log prob per frame = 6.109281e+01
 Reestimation complete - average log prob per frame = 6.198481e+01
 Reestimation complete - average log prob per frame = 6.214393e+01
 Reestimation complete - average log prob per frame = 6.222932e+01
 Reestimation complete - average log prob per frame = 6.229997e+01

As you can see, the likelihood of the model on the training data decreased (by roughly 0.5 per frame at each iteration). ConvLogWt() and ConvExpWt() are called from HERest and HMMIRest, so if you are using these programs with model sets that have a mixture-level tying structure, you are affected by this bug.

Note that standard systems (state-level or stream-level tying structure) are not affected by this bug.

Best regards,

Heiga ZEN (Byung Ha CHUN)

--
--------------------------
Heiga ZEN (Byung Ha CHUN)
Speech Technology Group
Cambridge Research Lab
Toshiba Research Europe
phone: +44 1223 436975

