
[hts-users:01538] hts_engine : forcing durations with -vp


Hi HTS developers,

It looks like when using the -vp option with hts_engine (from hts_engine API 1.0), the resulting synthetic speech does not completely respect the durations from the input label file. The 'duration_remain' variable (in HTS_sstream.c) does reduce the error compared to the old hts_engine, but it does not eliminate it.

For example, in hts-demo, for cmu_us_arctic_slt_a0001, the total number of frames should be 669, but the synthetic speech has only 654 frames, i.e. it is 15 frames short.

This is due to the 'frame' value in the HTS_LabelString structure being an 'int', so any fractional part of a frame position is lost.

I then simply tried replacing it with a 'double' and modifying HTS_Label.c accordingly (HTS_Label_load_from_string(), HTS_Label_load_from_string_list(), HTS_Label_set_frame() and HTS_Label_get_frame(), the latter now returning a double instead of an int).

This seems to work, since I now get the correct number of frames at synthesis time.
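
To illustrate the effect outside of hts_engine, here is a minimal C sketch. It is not the actual hts_engine code: the function names and the per-segment fractional durations are assumptions made purely for illustration. It compares truncating each segment's duration to an int with carrying the leftover fraction forward to the next segment (similar in spirit to 'duration_remain', as far as I understand it).

```c
/* Illustration only: these helpers are hypothetical and are not part of
 * the hts_engine API. Durations are given in (fractional) frames. */

/* Truncate each segment duration to an int, as happens when the frame
 * value is stored in an 'int'. Up to one frame is lost per segment. */
static int total_frames_truncated(const double *dur, int n)
{
    int total = 0;
    for (int i = 0; i < n; i++)
        total += (int) dur[i];          /* fractional part is dropped */
    return total;
}

/* Keep durations as doubles, round to the nearest frame, and carry the
 * rounding error into the next segment so the total stays on target.
 * Assumes all durations are non-negative. */
static int total_frames_carried(const double *dur, int n)
{
    int total = 0;
    double remain = 0.0;                /* error carried from earlier segments */
    for (int i = 0; i < n; i++) {
        double want = dur[i] + remain;  /* target including carried error */
        int frames = (int) (want + 0.5);/* round to nearest frame */
        remain = want - frames;         /* leftover for the next segment */
        total += frames;
    }
    return total;
}
```

With segment durations of 10.5, 7.75, 3.25 and 12.5 frames (34 frames in total), the truncating version yields 32 frames, while the carrying version yields 34.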

However, I have no idea whether this might silently break something in the API. Could you tell me whether it is a safe change or not?

Many thanks,

Alexis

Follow-Ups
[hts-users:01562] Re: hts_engine : forcing durations with -vp, Oura Keiichiro