
[hts-users:02192] Re: Problem about unstable synthesized voices (using Speaker dependent training with STRAIGHT demo)

Dear Kei,

Thanks very much for your help. I have changed the frame length from 1024 to 2048 (for V40_007c).

However, I have run into another problem with STRAIGHT V40_007c: I can extract the features, but I can't re-synthesize the waveform. That is, after calling exstraightsource() and exstraightspec() in MATLAB, the call to exstraightsynth() fails with "Index exceeds matrix dimensions" instead of returning the re-synthesized waveform. The same MATLAB script works fine with STRAIGHT V40_006b.

Another problem is with MATLAB itself. It seems that MATLAB can't handle a long script: after processing about 1000 waveforms, it hangs without any output. If I divide the script into several smaller scripts (750 files each in my case), it works. This is true for both V40_006b and V40_007c.
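
The workaround described above amounts to splitting the file list into fixed-size batches and running each batch in a fresh MATLAB session. The following is only a rough sketch of the splitting step, written in Python rather than MATLAB; the function name split_into_batches and the eight-digit basenames are hypothetical, and the batch size of 750 is simply the figure quoted above, not a known MATLAB limit.

```python
def split_into_batches(items, batch_size=750):
    """Return consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


if __name__ == "__main__":
    # Hypothetical basenames in the style of wav/00000002.wav.
    basenames = [f"{n:08d}" for n in range(1, 1001)]
    for index, batch in enumerate(split_into_batches(basenames), start=1):
        # Each batch would be written to its own script and run separately.
        print(f"batch {index}: {len(batch)} files, {batch[0]} .. {batch[-1]}")
```

Each batch can then be turned into its own script, so that no single MATLAB run has to process the whole corpus.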

I know these problems relate only to STRAIGHT and MATLAB, but maybe somebody here can help me.

Thanks for your help and have a nice day!

Best Regards,

Here is my MATLAB script for feature extraction and re-synthesis:

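fprintf(1,'Processing wav/00000002.wav%c',10);
% Assumption: the posted script does not show how x and fs are loaded;
% a call such as wavread is presumed here.
[x, fs] = wavread('wav/00000002.wav');
x = x * 32767;                          % scale to the 16-bit sample range
prm.F0searchUpperBound = 350;           % F0 search range upper bound (Hz)
prm.F0searchLowerBound = 80;            % F0 search range lower bound (Hz)
[f0, ap] = exstraightsource(x, fs, prm);      % F0 and aperiodicity
[sp] = exstraightspec(x, f0, fs, prm);        % spectral envelope
[sy] = exstraightsynth(f0, sp, ap, fs, prm);  % re-synthesis (fails with V40_007c)
ap = ap';                               % transpose before saving
sp = sp';
save 'f0/00000002.f0' f0 -ascii;
save 'ap/00000002.ap' ap -ascii;
save 'sp/00000002.sp' sp -ascii;
clear all;
close all;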

Here is the error log:

>> >> >> >> >> >> >> >> >> >> Processing wav/00000002.wav
>> >> >> >> >> >> >> >> >> ??? Index exceeds matrix dimensions.

Error in ==> straightSynthTB07ca at 290
    wnz=aprm(round(idcv(:)),round(ii));  % 06/May/2001 This is correct!

Error in ==> exstraightsynth at 59
[sy,statusReport] =straightSynthTB07ca(n3sgram,f0raw,shiftm,fs, ...
>> ??? Undefined function or variable 'sy'.

2009/8/26 Kei Hashimoto <bonanza@xxxxxxxxxxxxxxx>
Dear all,

The default setting of STRAIGHT version V40_007c has changed from earlier versions.
So, in data/Makefile.in, we need to change the frame length for the
spectral parameter extraction from 1024 to 2048.

- $(MGCEP) -a $(FREQWARP) $${GAMMAOPT} -m $(MGCORDER) -l 1024 -j 0 -f 0.0 -q 3 > mgc/$${base}.mgc;
+ $(MGCEP) -a $(FREQWARP) $${GAMMAOPT} -m $(MGCORDER) -l 2048 -j 0 -f 0.0 -q 3 > mgc/$${base}.mgc;

After changing the frame length, I got good voices.
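
For anyone who prefers to script the change rather than edit data/Makefile.in by hand, the following is a minimal sketch in Python. The function name patch_frame_length is hypothetical, and the sketch assumes the value to change appears as "-l 1024" on the same line as $(MGCEP), as in the diff above; it leaves every other line untouched.

```python
import re


def patch_frame_length(makefile_text, old=1024, new=2048):
    """Replace '-l <old>' with '-l <new>' only on lines that invoke $(MGCEP)."""
    pattern = re.compile(r"(-l\s+)" + str(old) + r"\b")
    patched = []
    for line in makefile_text.splitlines(keepends=True):
        if "$(MGCEP)" in line:
            # \g<1> keeps the original '-l' and its spacing.
            line = pattern.sub(r"\g<1>" + str(new), line)
        patched.append(line)
    return "".join(patched)


if __name__ == "__main__":
    sample = ("\t$(MGCEP) -a $(FREQWARP) $${GAMMAOPT} -m $(MGCORDER) "
              "-l 1024 -j 0 -f 0.0 -q 3 > mgc/$${base}.mgc;\n")
    print(patch_frame_length(sample), end="")
```

Reading the file, passing its contents through the function, and writing the result back is left to the caller, so the original can be backed up first.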

Best Regards,
Kei Hashimoto

Dear Evan,
 I have tried several versions, but I can't get stable results with any of them. In my case especially, V40_007c didn't work at all. How did you make it work?
 Have a nice day!
 Best Regards,

2009/8/11 evan huang <vinsanity1101@xxxxxxxxx>

   Hello everyone,

   I have also encountered this problem. In fact, I cannot get reliable
   results from the STRAIGHT demo for English.
   My STRAIGHT version is V40_007c. Which version is preferred for the
   STRAIGHT demo?
   Thanks in advance for any comments.

   Best Regards,

   On Sun, Aug 9, 2009 at 8:15 AM, <yfliao@xxxxxxxxxxx> wrote:
    > Dear All,
    > I have a problem with unstable synthesized voices and would like to
    > know how to debug it.
    > Basically, I adopted the scripts/settings from the "Speaker dependent
    > training with STRAIGHT demo" (for English voice), and changed the data
    > to speech (in fact, Mandarin data from Blizzard Challenge 2009).
    > But the synthesized voices do not seem very stable. I mean that for
    > some utterances I get good voices, but for others I get unstable
    > voices (pops, white noise or even silence).
    > I also ran the same data through the "Speaker dependent training demo"
    > (without STRAIGHT) and had no problems. So I don't know where the
    > problem is (from STRAIGHT?) or how to solve it.
    > I would very much appreciate your help. Have a nice day!
    > Best Regards,
    > Yuan-Fu
    > PS: I used HTS 2.1 and STRAIGHT version V40_006b (no errors during
    > the procedure)

Nagoya Institute of Technology
Tokuda and Lee lab.
Kei Hashimoto

[hts-users:02150] Problem about unstable synthesized voices (using Speaker dependent training with STRAIGHT demo), yfliao
[hts-users:02162] Re: Problem about unstable synthesized voices (using Speaker dependent training with STRAIGHT demo), evan huang
[hts-users:02177] Re: Problem about unstable synthesized voices (using Speaker dependent training with STRAIGHT demo), 北科大-廖元甫
[hts-users:02189] Re: Problem about unstable synthesized voices (using Speaker dependent training with STRAIGHT demo), Kei Hashimoto