[hts-users:02192] Re: Problem about unstable synthesized voices (using Speaker dependent training with STRAIGHT demo)

From: 北科大-廖元甫 <yfliao@xxxxxxxxxxx>
Date: Thu, 27 Aug 2009 05:35:29 +0800
Delivered-to: hts-users@xxxxxxxxxxxxxxx

Dear Kei,

Thank you very much for your help. I have changed the frame length from 1024 to 2048 (for V40_007c).

However, I have run into another problem with STRAIGHT V40_007c: I can extract the features but cannot re-synthesize the waveform. After calling exstraightsource() and exstraightspec() in MATLAB, the call to exstraightsynth() fails with "Index exceeds matrix dimensions". The same MATLAB script works fine with STRAIGHT V40_006b.

The other problem concerns MATLAB itself. It seems unable to handle a long script: after processing about 1000 waveforms, it hangs without any output. If I divide the script into several smaller scripts (750 files in my case), it works. This happens with both V40_006b and V40_007c.
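
The splitting workaround described above could be automated along these lines. This is only a sketch under stated assumptions: the per-utterance MATLAB script is assumed to be generated by an external tool, the helper names `batches` and `render_script` and the `extract_NNN.m` file names are hypothetical, and the emitted MATLAB statements are copied from the script in this message with the re-synthesis step left out. It does not explain why MATLAB hangs; it only keeps each generated script at or below the batch size that was reported to work.

```python
from typing import Iterator, List, Sequence


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items` holding at most `size` entries each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def render_script(basenames: Sequence[str]) -> str:
    """Return MATLAB source that extracts f0/ap/sp for the given utterances.

    The statements mirror the script quoted in this thread (hypothetical
    reduction: the exstraightsynth/wavwrite step is omitted).
    """
    lines: List[str] = []
    for base in basenames:
        lines += [
            f"fprintf(1,'Processing wav/{base}.wav%c',10);",
            f"[x,fs]=wavread('wav/{base}.wav');",
            "x = x * 32767;",
            "prm.F0frameUpdateInterval=5;",
            "prm.F0searchUpperBound=350;",
            "prm.F0searchLowerBound=80;",
            "prm.spectralUpdateInterval=5;",
            "[f0, ap] = exstraightsource(x,fs,prm);",
            "[sp] = exstraightspec(x, f0, fs, prm);",
            "ap = ap';",
            "sp = sp';",
            f"save 'f0/{base}.f0' f0 -ascii;",
            f"save 'ap/{base}.ap' ap -ascii;",
            f"save 'sp/{base}.sp' sp -ascii;",
            "clear all;",
        ]
    # End the MATLAB session so the next batch starts in a fresh process.
    lines.append("quit;")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical corpus of 2000 utterances named 00000001 ... 00002000.
    names = [f"{i:08d}" for i in range(1, 2001)]
    for n, part in enumerate(batches(names, 750)):
        script = render_script(part)
        print(f"extract_{n:03d}.m: {len(part)} files, {len(script.splitlines())} lines")
```

Each rendered string would be written to its own .m file and run in a separate MATLAB session, so that whatever state accumulates over roughly 1000 files is discarded between batches.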

I know those problems are only related to STRAIGHT and MATLAB, but maybe somebody here can help me.

Thanks for your help and have a nice day!

Best Regards,
Yuan-Fu

Here is my MATLAB script for feature extraction and re-synthesis:

fprintf(1,'Processing wav/00000002.wav%c',10);
[x,fs]=wavread('wav/00000002.wav');
x = x * 32767;
prm.F0frameUpdateInterval=5;
prm.F0searchUpperBound=350   ;
prm.F0searchLowerBound=80    ;
prm.spectralUpdateInterval=5;
[f0, ap] = exstraightsource(x,fs,prm);
[sp] = exstraightspec(x, f0, fs, prm);
[sy] = exstraightsynth(f0, sp, ap, fs, prm);
wavwrite((sy)/max(abs(sy))*0.5,fs,'sy/00000002.sy.wav');
ap = ap';
sp = sp';
save 'f0/00000002.f0' f0 -ascii;
save 'ap/00000002.ap' ap -ascii;
save 'sp/00000002.sp' sp -ascii;
clear all;
close all;
ans

Here is the error log:

>> >> >> >> >> >> >> >> >> >> Processing wav/00000002.wav
>> >> >> >> >> >> >> >> >> ??? Index exceeds matrix dimensions.

Error in ==> straightSynthTB07ca at 290
    wnz=aprm(round(idcv(:)),round(ii)); % 06/May/2001 This is correct!

Error in ==> exstraightsynth at 59
[sy,statusReport] =straightSynthTB07ca(n3sgram,f0raw,shiftm,fs, ...

>> ??? Undefined function or variable 'sy'.

2009/8/26 Kei Hashimoto <bonanza@xxxxxxxxxxxxxxx>

Dear all,

The default settings of STRAIGHT version V40_007c differ from those of
earlier versions.
So, in data/Makefile.in, we need to change the frame length for the
spectral parameter extraction from 1024 to 2048.

- $(MGCEP) -a $(FREQWARP) $${GAMMAOPT} -m $(MGCORDER) -l 1024 -j 0 -f 0.0 -q 3 > mgc/$${base}.mgc;
+ $(MGCEP) -a $(FREQWARP) $${GAMMAOPT} -m $(MGCORDER) -l 2048 -j 0 -f 0.0 -q 3 > mgc/$${base}.mgc;

After changing the frame length, I got good voices.
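
A quick way to check which -l value matches a given set of STRAIGHT spectra is sketched below. As far as the SPTK documentation describes it, mgcep with -q 3 reads amplitude spectra of l/2+1 points per frame, so the frame length follows from the number of bins per frame. The function names are hypothetical, the ASCII layout (one frame per row) is assumed from the MATLAB script in this thread, which transposes sp before saving, and the power-of-two check reflects the usual FFT-length requirement rather than anything stated in this thread.

```python
def bins_in_ascii_row(row: str) -> int:
    """Count the spectral bins in one whitespace-separated row of an ASCII .sp file."""
    return len(row.split())


def mgcep_frame_length(bins_per_frame: int) -> int:
    """Return the -l value for which l/2 + 1 equals `bins_per_frame`."""
    if bins_per_frame < 2:
        raise ValueError("a spectrum frame needs at least 2 bins")
    frame_length = 2 * (bins_per_frame - 1)
    # Assumption: the FFT length must be a power of two.
    if frame_length & (frame_length - 1):
        raise ValueError(f"{bins_per_frame} bins does not give a power-of-two length")
    return frame_length


if __name__ == "__main__":
    for bins in (513, 1025):
        print(bins, "bins per frame -> -l", mgcep_frame_length(bins))
```

If the frames extracted with a given STRAIGHT version have 1025 bins, the matching value is 2048; 513 bins correspond to 1024.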

Best Regards,
Kei Hashimoto

Dear Evan,
I have tried several versions, but I can't get stable results with any of them. In particular, V40_007c didn't work at all in my case. How did you make it work?
Have a nice day!
Best Regards,
Yuan-Fu

2009/8/11 evan huang <vinsanity1101@xxxxxxxxx>

Hello everyone,

I have also encountered this problem. In fact, I cannot get reliable
results from the STRAIGHT demo for English.
The STRAIGHT version is V40_007c. Which version is preferred for the
STRAIGHT demo?
Thanks in advance for any comments.

Best Regards,

On Sun, Aug 9, 2009 at 8:15 AM, <yfliao@xxxxxxxxxxx> wrote:
>
> Dear All,
>
> I have a problem with unstable synthesized voices and would like to know
> how to debug it.
>
> Basically, I adopted the scripts/settings from the "Speaker dependent
> training with STRAIGHT demo" (for an English voice) and changed the data
> to Mandarin speech (in fact, the Mandarin data from Blizzard Challenge 2009).
>
> But the synthesized voices are not very stable: for some utterances I get
> good voices, but for others I get unstable voices (pops, white noise or
> even silence).
>
> I also ran the same data through the "Speaker dependent training demo"
> (without STRAIGHT) and had no problems. So I don't know where the problem
> is (STRAIGHT?) or how to solve it.
>
> I appreciate your help very much. Have a nice day!
>
> Best Regards,
> Yuan-Fu
>
> PS: I used HTS 2.1 and STRAIGHT version V40_006b (no errors during the
> training procedure)
>

--
-------------------------------
Nagoya Institute of Technology
Tokuda and Lee lab.
Kei Hashimoto
bonanza@xxxxxxxxxxxxxxx
-------------------------------
