[hts-users:03019] Re: Error speed synthesized speech while using 16K data with HTS-2.2
- From: Keiichiro Oura <uratec@xxxxxxxxxxxxxxx>
- Date: Thu, 8 Sep 2011 23:04:04 +0900
- Cc: uratec <uratec@xxxxxxxxxxxx>
- Delivered-to: hts-users@xxxxxxxxxxxxxxx
Hi,
How did you prepare 16kHz *.raw files?
Regards,
Keiichiro Oura
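
One quick way to answer the question above is to check how long a headerless *.raw file would play at each candidate rate. The following is only a rough sketch: it assumes mono, 16-bit, headerless PCM, and the helper names (`raw_duration_seconds`, `report`) are made up for illustration and are not part of HTS or SPTK.

```python
import os


def raw_duration_seconds(num_bytes, sample_rate, bytes_per_sample=2):
    """Duration of headerless mono PCM data of the given size.

    Assumes 16-bit samples by default (2 bytes per sample).
    """
    return num_bytes / (bytes_per_sample * sample_rate)


def report(path, rates=(16000, 48000)):
    """Print the duration a raw file would have at each candidate rate."""
    size = os.path.getsize(path)
    if size % 2:
        # An odd byte count cannot be pure 16-bit PCM.
        print(f"{path}: odd size ({size} bytes), not plain 16-bit data")
    for rate in rates:
        duration = raw_duration_seconds(size, rate)
        print(f"{path}: {duration:.2f} s if sampled at {rate} Hz")
```

Calling `report()` on one file under data/raw and comparing the printed durations with the known length of the recording shows which rate the file is actually stored at.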
2011/9/7 Yu-Chieh Chen <tobysworld@xxxxxxxxxxx>:
> Dear Keiichiro,
> The data in /raw are already prepared as 16 kHz, 16-bit, but I get bad
> results. Is there any chance that some setting is incorrect?
> Sincerely,
> Mandy
>> Date: Wed, 7 Sep 2011 12:52:11 +0900
>> From: uratec@xxxxxxxxxxxxxxx
>> Subject: [hts-users:03016] Re: Error speed synthesized speech while using
>> 16K data with HTS-2.2
>> To: hts-users@xxxxxxxxxxxxxxx
>> CC: uratec@xxxxxxxxxxxx
>>
>> Hi,
>>
>> data/raw/*.raw should be down-sampled from 48kHz to 16kHz.
>>
>> x2x +sf < 48kHz_16bit.raw | \
>> ds -s 32 | \
>> ds -s 21 | \
>> x2x +fs > 16kHz_16bit.raw
>>
>> Regards,
>> Keiichiro Oura
>>
>>
>> 2011/9/7 Yu-Chieh Chen <tobysworld@xxxxxxxxxxx>:
>> > Dear all,
>> > I recently switched my HTS project from HTS-2.01 to HTS-2.2, using
>> > the English speaker-dependent training demo from the HTS-2.2 project.
>> > I installed HTS-2.2_for_HTK-3.4.1 without any trouble, and also changed
>> > my HTS_Engine to 1.05.
>> > The whole training process went smoothly, and the synthesized speech
>> > sounded good.
>> > But when I changed the wave data to cmu-bdl (16 kHz), I got very bad
>> > synthesized speech.
>> > The voice sounds broken, and the speed of the speech is also wrong.
>> > I changed the feature extraction parameters in data/Makefile as follows:
>> > SAMPFREQ = 16000 # 48000 Sampling frequency (48kHz)
>> > FRAMELEN = 400 # 1200 Frame length in point (1200 = 48000 * 0.025)
>> > FRAMESHIFT = 80 # 240 Frame shift in point (240 = 48000 * 0.005)
>> > WINDOWTYPE = 1 # Window type -> 0: Blackman 1: Hamming 2: Hanning
>> > NORMALIZE = 1 # Normalization -> 0: none 1: by power 2: by magnitude
>> > FFTLEN = 1024 # FFT length in point
>> > FREQWARP = 0.42 # 0.55 # frequency warping factor
>> > GAMMA = 0 # pole/zero weight for mel-generalized cepstral (MGC) analysis
>> > MGCORDER = 24 # order of MGC analysis
>> > LNGAIN = 1 # use logarithmic gain rather than linear gain
>> > LOWERF0 = 40 # lower limit for f0 extraction (Hz)
>> > UPPERF0 = 400 # upper limit for f0 extraction (Hz)
>> > NOISEMASK = 50 # standard deviation of white noise to mask noises in f0 extraction
>> >
>> > and the training parameters in scripts/Config.pm as follows:
>> > # Speech Analysis/Synthesis Setting ==============
>> > # speech analysis
>> > $sr = 16000; #48000; # sampling rate (Hz)
>> > $fs = 80; #240; # frame period (point)
>> > $fw = 0.42; #0.55; # frequency warping
>> > $gm = 0; # pole/zero representation weight
>> > $lg = 1; # use log gain instead of linear gain
>> > $fr = $fs/$sr; # frame period (sec)
>> > # speech synthesis
>> > $pf = 1.4; # postfiltering factor
>> > $fl = 4096; # length of impulse response
>> > $co = 2047; # order of cepstrum to approximate mel-generalized cepstrum
>> > The rest of the training parameters remain the same, but I cannot get a
>> > correct result from training.
>> > Could anyone tell me where I might have gone wrong?
>> > Thanks in advance!
>> > Sincerely,
>> > Mandy
>>
>
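
The analysis settings quoted above follow from the sampling rate: the comments in data/Makefile give the frame length as 25 ms (1200 = 48000 * 0.025) and the frame shift as 5 ms (240 = 48000 * 0.005). The sketch below only redoes that arithmetic for both rates. The function name `frame_settings` is hypothetical and the 25 ms / 5 ms defaults are taken from those comments.

```python
def frame_settings(sample_rate, window_sec=0.025, shift_sec=0.005):
    """Return (frame length, frame shift) in points for a sampling rate.

    The defaults mirror the 25 ms window and 5 ms shift implied by the
    comments in the quoted data/Makefile.
    """
    frame_len = round(sample_rate * window_sec)
    frame_shift = round(sample_rate * shift_sec)
    return frame_len, frame_shift


for rate in (48000, 16000):
    frame_len, frame_shift = frame_settings(rate)
    # Frame period in seconds, as in $fr = $fs/$sr from Config.pm.
    frame_period = frame_shift / rate
    print(rate, frame_len, frame_shift, frame_period)
```

For 16000 Hz this gives 400 and 80, matching the FRAMELEN, FRAMESHIFT and $fs values in the quoted message, so those particular settings are consistent with 16 kHz input.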