
[hts-users:04419] WaveNet: A Generative Model for Raw Audio

Subject: [hts-users:04419] WaveNet: A Generative Model for Raw Audio
From: "Heiga ZEN (Byung Ha CHUN)" <heigazen@xxxxxxxxxx>
Date: Thu, 08 Sep 2016 19:06:14 +0000

Hi all,

DeepMind researchers and I have developed a new generative model for audio signals named "WaveNet". We can draw a waveform sample by sample from this model. When conditioned on linguistic features derived from text, it can be used for text-to-speech. It already significantly outperforms existing concatenative and parametric TTS systems. You can find the results, speech samples, and paper in DeepMind's blog post:

https://deepmind.com/blog/wavenet-generative-model-raw-audio/
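
As a rough illustration of what drawing a waveform sample by sample means, here is a minimal Python sketch. It is not the WaveNet implementation: `toy_model`, `generate`, the 256-level quantization, and the integer conditioning values are all hypothetical stand-ins chosen for this example. A real model would be a trained network, and the conditioning would be linguistic features rather than a single number.

```python
import random

NUM_LEVELS = 256  # assumption: amplitudes quantized to 256 discrete levels


def toy_model(history, cond):
    """Hypothetical stand-in for a trained network.

    Returns a probability distribution over the next quantized sample,
    given all previously generated samples and one conditioning value.
    """
    prev = history[-1] if history else NUM_LEVELS // 2
    center = min(NUM_LEVELS - 1, max(0, prev + cond))
    # Unnormalized weights peaked around `center`, then normalized.
    weights = [1.0 / (1.0 + abs(level - center)) for level in range(NUM_LEVELS)]
    total = sum(weights)
    return [w / total for w in weights]


def generate(model, conditioning, seed=0):
    """Draw one sample per conditioning value, feeding each draw back in."""
    rng = random.Random(seed)
    waveform = []
    for cond in conditioning:
        probs = model(waveform, cond)  # distribution over the next sample
        sample = rng.choices(range(NUM_LEVELS), weights=probs, k=1)[0]
        waveform.append(sample)  # the new sample becomes part of the history
    return waveform


waveform = generate(toy_model, conditioning=[0] * 100)
```

The point of the sketch is the loop: each new sample is drawn from a distribution that depends on everything generated so far plus the conditioning input, so generation is inherently sequential.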

I believe that this is a milestone in statistical parametric speech synthesis :-)

I'm looking forward to hearing your feedback.

Cheers,

Heiga

---------------------------------------

Heiga ZEN (in Japanese)
Byung Ha CHUN (in Korean)
<heigazen@xxxxxxxxxx>
