
[hts-users:04400] Re: synthesis with STRAIGHT

Subject: [hts-users:04400] Re: synthesis with STRAIGHT
From: Rasmus Dall <R.Dall@xxxxxxxxxxxx>
Date: Tue, 29 Mar 2016 10:33:04 +0100
Delivered-to: hts-users@xxxxxxxxxxxxxxx

Hi,

> The corpus we are training on is from 3 different speakers and we have
> LOWERF0 and UPPERF0 set to 110 and 280 respectively, same as in the SLT
> demo, for all speakers, for both STRAIGHT and hts-engine.

Whether or not this is actually the issue, you should see a quality
improvement by tuning these parameters. This is the input to the F0
extractor, which will assume anything outwith this range is wrong. Try
and analyse your speakers individually by extracting F0 from a large
number of utterances using a wide F0 range, then narrow it down until
it seems right.
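
A minimal sketch of the narrowing step described above, in Python. It
assumes you have already extracted F0 (in Hz) for each speaker with a
deliberately wide search range and that unvoiced frames are written as 0.
The function names, the percentile cut-offs, and the safety margin are
illustrative choices, not part of HTS, SPTK, or STRAIGHT.

```python
import math

UNVOICED = 0.0  # assumption: the extractor writes 0 Hz for unvoiced frames


def percentile(sorted_vals, q):
    """Linearly interpolated percentile of a pre-sorted list, q in [0, 100]."""
    if not sorted_vals:
        raise ValueError("no voiced frames to analyse")
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def suggest_f0_range(f0_tracks, low_q=1.0, high_q=99.0, margin=0.1):
    """Suggest (lower, upper) F0 bounds in Hz for one speaker.

    f0_tracks: iterable of per-utterance F0 sequences in Hz.
    low_q/high_q: percentiles used to ignore octave errors and outliers.
    margin: fractional slack added on both sides of the percentile range.
    """
    voiced = sorted(v for track in f0_tracks for v in track if v > UNVOICED)
    lower = percentile(voiced, low_q) * (1.0 - margin)
    upper = percentile(voiced, high_q) * (1.0 + margin)
    return math.floor(lower), math.ceil(upper)
```

Run it once per speaker and treat the result only as a starting point for
LOWERF0 and UPPERF0; it is still worth inspecting a few F0 tracks by eye
and by ear before settling on the final values.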

> Are there any
> other parameters to change, in particular for voiced/unvoiced?

I'm not entirely sure how this works if you are using STRAIGHT. But if
you are not using STRAIGHT and you use SPTK, as is the default, then in
the SPTK/bin folder you should be able to find the pitch.c file where
you can see the options, -T for RAPT and -t for SWIPE.

These options should be passed on line 117 in the makefile of the data
folder in HTS-demo.


- Rasmus

Quoting Erica Cooper <ecooper@xxxxxxxxxxxxxxx> on Mon, 28 Mar 2016
13:18:45 -0400:

Thanks, all, for the helpful replies!

The hts-engine sample was trained using hts-2.2.  The STRAIGHT samples were
trained using hts-2.3 and STRAIGHT V40.  When I say we are using the same
data for both, I just mean the same corpus -- the STRAIGHT training
includes the 'bap' features.

The corpus we are training on is from 3 different speakers and we have
LOWERF0 and UPPERF0 set to 110 and 280 respectively, same as in the SLT
demo, for all speakers, for both STRAIGHT and hts-engine.  It sounds like
most likely these ought to be changed, which I will try.  Are there any
other parameters to change, in particular for voiced/unvoiced?  I checked
both data/Makefile and scripts/Config.pm and didn't see anything that
looked relevant.

Best,
Erica


On Mon, Mar 28, 2016 at 9:32 AM, Blaise Potard <bpotard@xxxxxxxxx> wrote:

Hello,

I am not entirely sure which version of the HTS demo or STRAIGHT you are
using, as I don't think the demo normally sounds like this. Regardless,
when you say you use the exact same data for hts_engine and STRAIGHT
synthesis, you mean you are not using mixed excitation at all?

In any case, 1mix / 2mix / stc will produce different parameters from what
hts-engine is generating, so if you want to have a fair comparison, you are
probably better off dumping the filter / excitation feature coefficients
from hts_engine using the -om / -of parameters, and do the synthesis from
the generated coefficients using STRAIGHT.
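
Once the parameters are dumped, a quick sanity check on the log-F0 stream
is to see how many frames are marked voiced. The sketch below is only an
illustration: it assumes the dumped file is headerless little-endian
float32 and that unvoiced frames are stored as a very large negative
value (around -1e10, as in HTS-style lf0 files). The function names are
hypothetical, so verify the format against your own dump first.

```python
import struct

# Assumption: unvoiced frames are stored as about -1e10 in the lf0 stream,
# so anything above this threshold is treated as a voiced log-F0 value.
LZERO_THRESHOLD = -1.0e9


def read_float32(path):
    """Read a headerless little-endian float32 file into a list of floats."""
    with open(path, "rb") as fh:
        data = fh.read()
    count = len(data) // 4  # ignore any trailing partial value
    return list(struct.unpack("<%df" % count, data[:count * 4]))


def voiced_fraction(lf0):
    """Return the fraction of frames whose log-F0 is marked as voiced."""
    if not lf0:
        raise ValueError("empty lf0 sequence")
    voiced = sum(1 for v in lf0 if v > LZERO_THRESHOLD)
    return voiced / len(lf0)
```

Comparing this fraction between the two pipelines for the same utterance
can hint at whether the voiced/unvoiced decisions differ, but it does not
replace listening to the resynthesised audio.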

If you still have problems with the synthesis using the parameters
generated by HTS-engine, then probably you are using a bad version of
STRAIGHT.

If you don't have problems with the synthesis, then it is likely something
wrong happened during model training, or, maybe as Rasmus mentioned, during
the feature extraction.

Regards,
Blaise

2016-03-25 14:21 GMT+00:00 Erica Cooper <ecooper@xxxxxxxxxxxxxxx>:

Hi,

We've started using STRAIGHT for synthesis, and we've found that for our
data, it sounds worse than synthesis with hts-engine, despite the STRAIGHT
SLT demo voice sounding very nice.  We are using the exact same data with
both STRAIGHT and hts-engine synthesis, but the STRAIGHT-synthesized
utterances sound 'hoarse.'  1mix, 2mix, and stc are all not so good.  I was
wondering whether there is any advice for which parameters might be changed
to solve this.

original hts-engine voice:
http://www.cs.columbia.edu/~ecooper/audio/eng_alice01.wav
STRAIGHT 1mix:
http://www.cs.columbia.edu/~ecooper/audio/1mix_alice01.wav
STRAIGHT 2mix:
http://www.cs.columbia.edu/~ecooper/audio/2mix_alice01.wav
STRAIGHT stc:
http://www.cs.columbia.edu/~ecooper/audio/stc_alice01.wav

Thanks,
Erica





Follow-Ups
: [hts-users:04401] Re: synthesis with STRAIGHT, Erica Cooper

References
: [hts-users:04392] synthesis with STRAIGHT, Erica Cooper; [hts-users:04395] Re: synthesis with STRAIGHT, Blaise Potard; [hts-users:04398] Re: synthesis with STRAIGHT, Erica Cooper
