[hts-users:03453] AW: Additional stream question
- Subject: [hts-users:03453] AW: Additional stream question
- From: "Schabus, Dietmar" <Schabus@xxxxxx>
- Date: Mon, 5 Nov 2012 16:31:04 +0000
- Thread-topic: [hts-users:03452] Additional stream question
I am working on joint audio-visual speech synthesis. The details depend on what exactly you are starting from, but the basic procedure is roughly:
1) Start from a working audio-only training procedure
2) Make sure that the feature extraction step creates .cmp files that contain the values for your new stream (with deltas and delta-deltas) in addition to the audio features
3) Add your new stream to the protos (the make_proto functions etc.)
4) Go through all steps of the training and, wherever something stream-specific is going on, add code for your new stream
I think for me, that was about it.
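To make step 2 a bit more concrete, here is a minimal sketch in Python of how per-frame observation vectors with an extra stream might be assembled. It is not taken from the HTS scripts: the function names are made up for illustration, the window coefficients (-0.5, 0, 0.5) and (1, -2, 1) are assumed, and so is the edge handling (repeating the first and last frame). The returned widths are the kind of per-stream sizes that would then have to agree with the stream definitions in the protos (step 3).

```python
# Hypothetical sketch: append a new stream (e.g. visual or EMA features)
# to existing audio features, each with deltas and delta-deltas.
# Window coefficients and edge handling are assumptions, not HTS code.

DELTA_WIN = (-0.5, 0.0, 0.5)   # assumed first-order (delta) window
ACCEL_WIN = (1.0, -2.0, 1.0)   # assumed second-order (delta-delta) window


def apply_window(frames, win):
    """Apply a 3-tap window along time; edge frames are repeated (assumption)."""
    n = len(frames)
    out = []
    for t in range(n):
        prev = frames[max(t - 1, 0)]
        cur = frames[t]
        nxt = frames[min(t + 1, n - 1)]
        out.append([win[0] * p + win[1] * c + win[2] * x
                    for p, c, x in zip(prev, cur, nxt)])
    return out


def with_dynamics(frames):
    """Return per-frame vectors laid out as static + delta + delta-delta."""
    delta = apply_window(frames, DELTA_WIN)
    accel = apply_window(frames, ACCEL_WIN)
    return [list(s) + d + a for s, d, a in zip(frames, delta, accel)]


def build_observations(audio_frames, extra_frames):
    """Concatenate audio and extra-stream vectors frame by frame.

    Returns (observations, stream_widths). Both inputs must have the
    same number of frames, i.e. the same frame shift.
    """
    if len(audio_frames) != len(extra_frames):
        raise ValueError("audio and extra stream must have the same frame count")
    audio = with_dynamics(audio_frames)
    extra = with_dynamics(extra_frames)
    observations = [a + e for a, e in zip(audio, extra)]
    widths = [len(audio[0]), len(extra[0])] if observations else []
    return observations, widths


if __name__ == "__main__":
    audio = [[1.0], [2.0], [4.0]]                     # 1-dim audio feature
    extra = [[0.0, 10.0], [1.0, 10.0], [2.0, 10.0]]   # 2-dim new stream
    obs, widths = build_observations(audio, extra)
    print(widths)
    print(obs[1])
```

The important point is only that the new stream is resampled to the same frame rate as the audio features before it is appended, so that every frame in the .cmp file has the same total width.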
Good luck ;)
Dipl.-Ing. Dietmar Schabus | Researcher
phone +43 1 5052830-48 | fax -99 | schabus@xxxxxx | http://userver.ftw.at/~schabus/
FTW Telecommunications Research Center Vienna
Donau-City-Straße 1/3 | 1220 Vienna | Austria | www.ftw.at
From: Tomasz Kuczmarski [faqster@xxxxxxxxx]
Sent: Monday, 5 November 2012 16:29
Subject: [hts-users:03452] Additional stream question
I am trying to perform integrated training and synthesis of speech
acoustics and articulatory data. I have a speech corpus with
accompanying 2D EMA data.
I am guessing that I should introduce an additional data stream to my
models. This idea is pretty clear to me. However, I do not really know
how to start.
What are the parts of code I should modify? Has anyone here done
anything similar (multimodal synthesis for example) and is willing to
share the basic technical details with me?
If I knew the general scheme of data/code preparation, I could
work the rest out myself.
Thank you in advance,