[hts-users:01538] hts_engine : forcing durations with -vp
Hi HTS developers,
It looks like when using the -vp option with hts_engine (from hts_engine
API 1.0), the resulting synthetic speech does not completely respect the
durations from the input label file (although the 'duration_remain'
variable in HTS_sstream.c does reduce the error compared to the old
hts_engine).
For example, in hts-demo, for cmu_us_arctic_slt_a0001, the total number
of frames should be 669, but the synthetic speech has only 654 frames.
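This is due to the 'frame' value in the HTS_LabelString structure being
an 'int', so presumably the fractional part of each label's frame count
is dropped and the loss adds up over the utterance.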
I then simply tried replacing it with a 'double' and modifying
HTS_Label.c accordingly (HTS_Label_load_from_string(),
HTS_Label_load_from_string_list(), HTS_Label_set_frame() and
HTS_Label_get_frame(), the latter now returning a double instead of an
int).
This seems to work, since I now get the correct number of frames at
synthesis time.
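As a rough illustration of the effect (a standalone sketch, not
hts_engine code: the function names and the example durations are
made up, and it assumes the loss comes from truncating each label's
fractional frame count to an int):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, NOT part of the hts_engine API.
 * Total frame count when each label's duration (in frames) is
 * truncated to an int independently, as an 'int frame' field would do. */
int total_frames_truncated(const double *dur, size_t n)
{
    int total = 0;
    assert(dur != NULL);
    for (size_t i = 0; i < n; i++)
        total += (int) dur[i];      /* fractional part is lost per label */
    return total;
}

/* Hypothetical helper, NOT part of the hts_engine API.
 * Total frame count when fractional frames are kept as doubles and
 * rounded only once, at the end. */
int total_frames_exact(const double *dur, size_t n)
{
    double total = 0.0;
    assert(dur != NULL);
    for (size_t i = 0; i < n; i++)
        total += dur[i];            /* no per-label loss */
    return (int) (total + 0.5);     /* round to nearest frame once */
}
```

With made-up durations of 10.5, 20.75, 5.25 and 7.5 frames, the
truncating version gives 42 frames while the exact one gives 44, so a
loss of up to one frame per label would be enough to explain a gap of
15 frames over a whole utterance.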
However, I have no idea whether this might silently break something in
the API. Could you tell me whether it is a safe change or not?
Many thanks,
Alexis
Follow-Ups:
- [hts-users:01562] Re: hts_engine : forcing durations with -vp, Oura Keiichiro