HMM/DNN-based Speech Synthesis System (HTS) - Tutorial
* Tutorial on HMM-based speech synthesis [#o7f5587e]
> ''Tutorial at Interspeech 2009'' &ref(http://www.inters...
''"Fundamentals and recent advances in HMM-based speech s...
'''Keiichi Tokuda (Nagoya Institute of Technology)'''~
'''Heiga Zen (Toshiba Europe Research Ltd. Cambridge Rese...
> ''Introduction''~
Over the last ten years, the quality of speech synthesis ...
> In recent years, a kind of statistical parametric speec...
- Original speaker's voice characteristics can easily be ...
- Using a very small amount of adaptation speech data, vo...
From these features, the HMM-based speech synthesis appro...
> In this tutorial, the system architecture is outlined, ...
> Main target audience includes
- students who are going to work on speech synthesis
- researchers interested in statistical parametric speech...
- developers who want to integrate HMM-based speech synth...
> ''Presentation outline''
- Overview
-- Corpus-Based Speech Synthesis
-- Unit selection vs statistical parametric speech synthe...
- Basics
-- Vocoding techniques
--- Source-filter model
--- LP/LSP analysis
--- mel-cepstral analysis
--- MGC analysis
-- Speech Parameter generation
--- Definition of Hidden Markov model (HMM)
--- Speech Parameter Generation from HMM with dynamic fea...
--- Determination of State Durations
--- Solution for The Problem
-- F0 pattern modeling
--- Observation of F0
--- MSD-HMM for F0 Modeling
-- Model structure
--- HMM topology
--- Context-dependent modeling
--- Decision tree-based context clustering
--- MDL criterion
- Relation to the unit selection approach
-- Comparison between Two Approaches
-- Hybrid approaches
--- Target prediction
--- Unit smoothing
--- Mixing natural and generated units
- Trajectory modeling
-- Derivation of trajectory HMM
-- Relationship between trajectory HMM & HMM-based speech...
- Recent improvements and evaluation
-- STRAIGHT
-- Statistical mixed excitation
-- HSMM
-- MGE training
-- GV-based parameter generation algorithm
-- Blizzard Challenge
- Flexibility of the approach
-- Speaker adaptation (mimicking voices)
--- MLLR, MAP, SAT, etc.
-- Speaker Interpolation (mixing voices)
-- Eigenvoices (producing voices)
-- Multiple-regression (controlling voices)
-- Multilingual speech synthesis
-- Singing voice synthesis
- Applications
-- Audio-visual speech synthesis
-- Human motion synthesis and others
-- Hand-writing recognition
-- Small-footprint synthesizer for mobile devices
- Software
-- SPTK, HTS, hts_engine, ARCTIC, etc.
- Summary
> ''Short Bios:''~
''Keiichi Tokuda'' received the Dr.Eng. degree from Tokyo...
> ''Heiga Zen'' received the Dr.Eng. degree in computer s...