
[hts-users:03453] Re: Additional stream question


I am working on joint audio-visual speech synthesis. It depends on exactly what you are starting from, but basically the procedure is something like:

1) Start from a working audio-only training procedure
2) Make sure that the feature extraction step creates .cmp files that contain values for your new stream (with deltas and delta-deltas) in addition to the audio features
3) Add your new stream to the protos (functions make_proto etc)
4) Go through all steps of the training, wherever there is something stream-specific going on, add code for your new stream
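For step 2, the essential idea is that each frame in the .cmp file must carry the new stream's static values plus their deltas and delta-deltas, appended after the audio features. Here is a minimal sketch in plain Python; the window coefficients ([-0.5, 0, 0.5] and [1, -2, 1]) are the commonly used first- and second-order regression windows, but you should check them against your own .win files before training:

```python
def apply_window(frames, win):
    """Convolve each feature dimension with a regression window.
    frames: list of per-frame feature lists; win: odd-length coefficients."""
    half = len(win) // 2
    n = len(frames)
    out = []
    for t in range(n):
        acc = [0.0] * len(frames[0])
        for k, w in enumerate(win):
            # Clamp indices at the utterance boundaries.
            idx = min(max(t + k - half, 0), n - 1)
            for d, v in enumerate(frames[idx]):
                acc[d] += w * v
        out.append(acc)
    return out

def add_stream(cmp_frames, new_frames):
    """Concatenate static + delta + delta-delta of the new stream
    onto each existing .cmp frame (frame counts must match)."""
    assert len(cmp_frames) == len(new_frames)
    delta  = apply_window(new_frames, [-0.5, 0.0, 0.5])
    delta2 = apply_window(new_frames, [1.0, -2.0, 1.0])
    return [c + s + d + dd
            for c, s, d, dd in zip(cmp_frames, new_frames, delta, delta2)]

# Toy example: 3 frames of existing audio features plus a 1-D new stream.
cmp = [[0.1], [0.2], [0.3]]
ema = [[1.0], [2.0], [4.0]]
combined = add_stream(cmp, ema)
```

The real .cmp files are binary (HTK parameter file format with a 12-byte header), so this only illustrates the frame layout, not the file I/O.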

I think for me, that was about it.
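To make step 3 concrete, the header of the generated prototype has to grow by one stream. The fragment below is only an illustration with made-up dimensions (25 mel-cepstral coefficients x 3 windows = 75, three 1-dimensional MSD streams for log F0, and a hypothetical 4-dimensional EMA stream x 3 windows = 12 as non-MSD stream 5); your actual sizes and stream count will differ:

```
~o <VecSize> 90 <USER> <DIAGC>
<MSDInfo> 5 0 1 1 1 0
<StreamInfo> 5 75 1 1 1 12
```

Whatever script builds your proto (make_proto or similar) needs to emit these lines consistently with the .cmp layout, and the corresponding stream weights and window files must be updated to match.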

Good luck ;)

Dipl.-Ing. Dietmar Schabus | Researcher
phone +43 1 5052830-48 | fax -99 | schabus@xxxxxx | http://userver.ftw.at/~schabus/

FTW Telecommunications Research Center Vienna
Donau-City-Straße 1/3 | 1220 Vienna | Austria | www.ftw.at

From: Tomasz Kuczmarski [faqster@xxxxxxxxx]
Sent: Monday, 05 November 2012 16:29
To: hts-users@xxxxxxxxxxxxxxx
Subject: [hts-users:03452] Additional stream question

Dear all,

I am trying to perform integrated training and synthesis of speech
acoustics and articulatory data. I have a speech corpus with
accompanying 2D EMA data.
I am guessing that I should introduce an additional data stream into my
models. The idea is clear to me in principle; however, I do not really
know where to start.

What are the parts of code I should modify? Has anyone here done
anything similar (multimodal synthesis for example) and is willing to
share the basic technical details with me?

If I knew the general scheme of data/code preparation, I could work out
the rest myself.

Thank you in advance,

Tomasz Kuczmarski

[hts-users:03454] Re: AW: Additional stream question, Tomasz Kuczmarski
[hts-users:03452] Additional stream question, Tomasz Kuczmarski