[hts-users:02063] Audio-Visual speech synthesis

Subject: [hts-users:02063] Audio-Visual speech synthesis

From: Girish Malkarnenkar <girish1m@xxxxxxxxx>

Date: Fri, 3 Jul 2009 10:05:32 +0200

Dear Sir/Madam,

I am trying to synthesise visual speech with HTS by replacing the contents of the generated MGC files with facial parameters. I would appreciate it if anyone who has done something similar could share their experience. My specific question concerns the format of the MGC files. The dmp command in SPTK displays their contents, but only as a single column of values. I need to know the actual layout so that I can insert the facial parameters in the same format.
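As an illustration of the layout in question, the sketch below assumes that an MGC file is a headerless stream of little-endian 32-bit floats written frame by frame, which would explain why dmp shows a single column. The helper names read_frames and write_frames, and the number of values per frame (dim), are hypothetical and should be checked against the actual analysis settings.

```python
import struct


def read_frames(path, dim):
    """Read a headerless float32 file and group its values into frames.

    Assumption: the file is a flat sequence of little-endian 32-bit floats,
    with `dim` consecutive values per frame.
    """
    with open(path, "rb") as f:
        data = f.read()
    if len(data) % (4 * dim) != 0:
        raise ValueError("file size is not a whole number of frames")
    count = len(data) // 4
    values = struct.unpack("<%df" % count, data)
    return [list(values[i:i + dim]) for i in range(0, count, dim)]


def write_frames(path, frames):
    """Write frames back as a flat little-endian float32 stream."""
    with open(path, "wb") as f:
        for frame in frames:
            f.write(struct.pack("<%df" % len(frame), *frame))
```

Under these assumptions, facial parameters could be written with write_frames using the same number of values per frame as the training configuration expects, and generated files could be read back with read_frames.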

Secondly, I need to know the sampling period that the HTS code uses when calculating the MGCs, and how to modify it to match the sampling period I have used for my data.
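The relationship between a frame shift and a sampling period can be sketched as follows. The function names are hypothetical, and the numbers in the comments (a shift of 80 samples at 16000 Hz) are example values only, not confirmed HTS settings; the real values would have to be read from the training scripts.

```python
def frame_period_ms(frame_shift_samples, sample_rate_hz):
    """Frame period in milliseconds for a given frame shift in audio samples."""
    return 1000.0 * frame_shift_samples / sample_rate_hz


def frame_shift_for_rate(frames_per_second, sample_rate_hz):
    """Frame shift (in audio samples) giving one acoustic frame per data frame.

    Raises ValueError when the data rate does not divide the audio rate evenly,
    in which case the data would need resampling instead.
    """
    shift = sample_rate_hz / frames_per_second
    if shift != int(shift):
        raise ValueError("data rate does not divide the audio sampling rate")
    return int(shift)


# Example values only: an 80-sample shift at 16000 Hz is a 5 ms frame period.
example_period = frame_period_ms(80, 16000)
# Facial data sampled at 100 frames per second would need a 160-sample shift.
example_shift = frame_shift_for_rate(100, 16000)
```

If the two rates cannot be matched this way, the alternative is to resample the facial parameters to the frame rate used for the acoustic features.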

I plan to train the system on MGC files that contain facial parameters, then generate facial parameters in the same MGC format at synthesis time and use those generated MGC files, rather than the wav files, as my final output. Is there anything else I would need to modify for this approach to work?

Yours sincerely,

Girish Malkarnenkar