Hello,
I am not sure which version of the HTS demo or of STRAIGHT you are using, since the demo does not normally sound like this. Also, when you say you use exactly the same data for hts_engine and STRAIGHT synthesis, do you mean you are not using mixed excitation at all?
In any case, the 1mix / 2mix / stc setups will produce different parameters from what hts_engine generates, so for a fair comparison you are probably better off dumping the filter / excitation coefficients from hts_engine with the -om / -of options, and then running the STRAIGHT synthesis from those generated coefficients.
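To make the dump step concrete, here is a rough Python sketch, not a tested recipe. It assumes a recent hts_engine that takes a single voice file via -m (older releases take separate tree / pdf options instead), and that the dumped files are raw native-endian 32-bit floats. The helper names and file paths are made up for illustration; only -om / -of come from the suggestion above.

```python
import array

def build_hts_engine_cmd(voice, label, spec_out, lf0_out, binary="hts_engine"):
    """Build an hts_engine command line that dumps generated parameters.

    -om / -of write the generated spectrum and log F0 streams.
    -m <voice> is an ASSUMPTION (single-file voices in newer hts_engine);
    adapt the model options to whatever your version expects.
    """
    return [binary, "-m", voice, "-om", spec_out, "-of", lf0_out, label]

def read_float32_frames(path, dim):
    """Read a raw float32 file and split it into frames of `dim` values.

    ASSUMPTION: the file holds native-endian 32-bit floats with no header.
    """
    data = array.array("f")
    with open(path, "rb") as f:
        data.frombytes(f.read())
    if len(data) % dim != 0:
        raise ValueError("file length is not a multiple of the frame dimension")
    return [list(data[i:i + dim]) for i in range(0, len(data), dim)]

# Hypothetical usage (paths are placeholders):
#   import subprocess
#   subprocess.run(build_hts_engine_cmd("voice.htsvoice", "utt.lab",
#                                       "utt.mgc", "utt.lf0"), check=True)
#   lf0 = read_float32_frames("utt.lf0", 1)
```

Checking the frame counts of the dumped streams against each other before handing them to STRAIGHT is a cheap way to catch a wrong dimension or a truncated file.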
If you still have problems when synthesizing from the parameters generated by hts_engine, then you are probably using a bad version of STRAIGHT.
If that synthesis sounds fine, then something most likely went wrong during model training or, as Rasmus mentioned, during feature extraction.
Regards,
Blaise