[hts-users:04593] Re: Description of DNN Predicted output
- From: Takenori Yoshimura <takenori.yoshimura24@xxxxxxxxx>
- Date: Wed, 31 Jan 2018 17:59:46 +0900
- Delivered-to: hts-users@xxxxxxxxxxxxxxx
Hi,
The DNN output consists of the following features, in this order:
35: MGC
35: 1st-order derivatives of MGC
35: 2nd-order derivatives of MGC
1: voiced/unvoiced symbol
1: log f0
1: 1st-order derivative of log f0
1: 2nd-order derivative of log f0
log f0 is at position 107 (counting from 1): it follows the 105 MGC values and the voiced/unvoiced symbol.
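In case it helps, here is a minimal sketch of how one predicted frame could be split into these streams. It is not code from HTS: the names LAYOUT, build_slices and split_frame are made up for illustration, and it assumes that the features appear in the order listed above and that position 107 is counted from 1 (so index 106 when counting from 0).

```python
# Layout of the 109-dimensional DNN output, in the order given in this message.
# Each entry is (stream name, number of values). The names are hypothetical.
LAYOUT = [
    ("mgc", 35),
    ("mgc_delta", 35),
    ("mgc_delta2", 35),
    ("vuv", 1),
    ("lf0", 1),
    ("lf0_delta", 1),
    ("lf0_delta2", 1),
]


def build_slices(layout):
    """Return ({name: 0-based slice}, total dimension) for the given layout."""
    slices = {}
    start = 0
    for name, width in layout:
        slices[name] = slice(start, start + width)
        start += width
    return slices, start


SLICES, TOTAL_DIM = build_slices(LAYOUT)


def split_frame(frame):
    """Split one output frame (a sequence of TOTAL_DIM values) into streams."""
    if len(frame) != TOTAL_DIM:
        raise ValueError(f"expected {TOTAL_DIM} values, got {len(frame)}")
    return {name: frame[s] for name, s in SLICES.items()}


if __name__ == "__main__":
    # Dummy frame whose values equal their own 1-based position.
    frame = list(range(1, TOTAL_DIM + 1))
    parts = split_frame(frame)
    print("total dimension:", TOTAL_DIM)
    print("1-based position of log f0:", SLICES["lf0"].start + 1)
    print("log f0 value in dummy frame:", parts["lf0"][0])
```

With a real output file you would apply split_frame to each frame in turn. How the frames are stored on disk is not covered in this message, so check that separately.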
2018-01-31 17:23 GMT+09:00 Aayush Kumar Tyagi <aayush16081@xxxxxxxxxxx>:
> Hi,
>
> I am using DNN-based speech synthesis (USEDNN=1).
> I guess the output of DNN is a combination of spectral and excitation
> parameters.
> The number of output units of DNN is 109.
> Can someone help me understand what these 109 points correspond to? For example,
> how many of these are MGC points, and which one is log f0?
> I am particularly interested in the position of the fundamental frequency in
> the predicted output.
>
> Please correct me if I am completely wrong.
>
> Thanks a lot
> Aayush Tyagi
>
--
Nagoya Institute of Technology
Tokuda and Nankaku Laboratory
Takenori Yoshimura
takenori@xxxxxxxxxxxxxxx
- References
- [hts-users:04592] Description of DNN Predicted output, Aayush Kumar Tyagi