I have encountered an odd (and extensive) slowdown in the production of
SIL when comparing the stand-alone hts_engine code with the same hts_engine
code embedded in an application. With the embedded hts_engine, the modeling
of SIL seems to extend the duration of silence abnormally whenever SIL
appears in the tri-phone set. I also observe that the quality of the audio
produced from an HTS 2.1 voice seems to be somewhat degraded. The same
tri-phones produce reasonable audio using the stand-alone engine.

Background: I have embedded the hts_engine code into a wrapper class in an
application. The wrapper initializes the hts_engine by executing the hts
main(argc, argv) function and sets all relevant parameters. The code to get
audio and the code to clean up the structs are stripped out and put into two
other functions (getAudioForString() and hts_cleanup()). The main() function's
local automatic structure variables are moved to file scope so they are
visible in the two new functions. The only other changes to the code are
associated with reading the phonemes from a string object rather than from
stdin in HTS_GetToken. I created an overloaded version (sHTS_GetToken()) that
takes an istringstream instead of a FILE*. The fget() calls are replaced by istringstream::get()
calls. There are no other code changes.

Is there some timing issue involved when the hts_engine is embedded
into another application that does not appear in normal use? If so, what is
the timing issue, and is there a way around it?

Dr. Raymond T. Tillman
L-3 Communications, Inc., Link Simulation & Training
Raymond.T.Tillman@xxxxxxxxxx
Raymond.Tillman@xxxxxxxxxxxxxxxx
US Air Force Research Labs
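
For reference, here is a minimal sketch of the kind of token-reader change
described above. It is not the actual hts_engine source: get_token and its
signature are hypothetical stand-ins for HTS_GetToken and sHTS_GetToken, and
the sketch assumes a character-at-a-time reader (fgetc-style) that splits
tokens on whitespace. The point it illustrates is that both versions keep the
character in an integer type so that end-of-input can be told apart from a
valid character.

```cpp
#include <cctype>
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical FILE*-based reader (stand-in for the original token function).
// Skips leading whitespace, then collects characters up to the next
// whitespace or end of file. Returns false when no token is left.
bool get_token(std::FILE* fp, std::string& token) {
    token.clear();
    int c = std::fgetc(fp);  // int, not char, so EOF stays distinguishable
    while (c != EOF && std::isspace(c)) {
        c = std::fgetc(fp);
    }
    while (c != EOF && !std::isspace(c)) {
        token.push_back(static_cast<char>(c));
        c = std::fgetc(fp);
    }
    return !token.empty();
}

// Hypothetical overload that reads from an in-memory string stream instead.
// The logic is identical; only the character source changes.
bool get_token(std::istringstream& in, std::string& token) {
    using traits = std::istringstream::traits_type;
    token.clear();
    traits::int_type c = in.get();  // int_type, so eof() stays distinguishable
    while (!traits::eq_int_type(c, traits::eof()) && std::isspace(c)) {
        c = in.get();
    }
    while (!traits::eq_int_type(c, traits::eof()) && !std::isspace(c)) {
        token.push_back(traits::to_char_type(c));
        c = in.get();
    }
    return !token.empty();
}
```

If the two readers return the same tokens for the same input, the label
sequence reaching the engine should be identical in both builds, which would
point away from the token reader as the source of the difference.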