
[hts-users:02063] Audio-Visual speech synthesis


Dear Sir/Madam,
 
I am trying to synthesise visual speech with HTS by replacing the contents of the generated MGC files with facial parameters. I would appreciate it if anyone who has done something similar could share their experience. My specific question concerns the format of the MGC files: the dmp command in SPTK displays their contents, but only as a single column of values. I need to know the actual file layout so that I can write the facial parameters in the same format.
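
To make the question concrete, here is a minimal sketch of the layout I would guess at, not a confirmed description of what HTS expects. It assumes the MGC files are headerless streams of 32-bit floats, with each frame's coefficients stored consecutively and the frames simply concatenated, which would explain why dmp shows a single column. The little-endian byte order, the function names and the frame dimension are all assumptions made for illustration.

```python
import struct

FLOAT_SIZE = 4  # bytes per 32-bit float (assumed sample type)


def write_frames(path, frames):
    """Write frames as a headerless stream of little-endian float32 values.

    `frames` is a list of equal-length lists of floats, one list per frame.
    The byte order is an assumption; a tool may use the machine's native order.
    """
    with open(path, "wb") as f:
        for frame in frames:
            f.write(struct.pack("<%df" % len(frame), *frame))


def read_frames(path, dim):
    """Read a headerless float32 stream and regroup it into frames of `dim` values."""
    with open(path, "rb") as f:
        data = f.read()
    n_values = len(data) // FLOAT_SIZE
    if len(data) % FLOAT_SIZE != 0 or n_values % dim != 0:
        raise ValueError("file size is not a whole number of %d-value frames" % dim)
    values = struct.unpack("<%df" % n_values, data)
    # The flat "single column" of values becomes one row per frame.
    return [list(values[i:i + dim]) for i in range(0, n_values, dim)]
```

If this guess is right, facial parameters could be written with write_frames using one list per video frame, and the file size in bytes divided by 4 times the frame dimension would give the number of frames.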
 
Secondly, I need to know the sampling period that the HTS scripts use when calculating the MGCs, and how to modify it to match the sampling period that I have used.
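
As an illustration of the quantities involved, the sketch below converts a parameter frame rate into a period in seconds, in audio samples, and in the 100 ns time units that HTK uses. The function names are hypothetical, and the 25 frames per second and 16 kHz figures in the example are assumed values, not my actual settings and not known HTS defaults.

```python
def frame_period_seconds(frames_per_second):
    """Period between successive parameter frames, in seconds."""
    return 1.0 / frames_per_second


def period_in_samples(period_s, audio_rate_hz):
    """The same period expressed as a number of audio samples (frame shift)."""
    return round(period_s * audio_rate_hz)


def period_in_htk_units(period_s):
    """The same period expressed in HTK time units of 100 ns."""
    return round(period_s * 1e7)


if __name__ == "__main__":
    # Assumed example: facial parameters at 25 frames per second, audio at 16 kHz.
    period = frame_period_seconds(25)
    print(period_in_samples(period, 16000), period_in_htk_units(period))
```

Whatever the actual values are, the frame period of the facial parameters and the frame period assumed during training and in the label files would presumably have to agree.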
 
I plan to train the system on MGC files that contain the facial parameters, then generate facial parameters in the same MGC format at synthesis time and use those files, rather than the wav files, as my final output. Is there anything else I would need to modify for this approach to work?
 
 
Yours sincerely
 
Girish Malkarnenkar

Follow-Ups
[hts-users:02064] Re: Audio-Visual speech synthesis, Simon King
[hts-users:02066] Re: Audio-Visual speech synthesis, Heiga ZEN (Byung Ha CHUN)