
[hts-users:04329] Re: bad voice output for test sentences

Subject: [hts-users:04329] Re: bad voice output for test sentences
From: Keiichiro Oura <uratec@xxxxxxxxxxxx>
Date: Thu, 19 Nov 2015 10:10:25 +0900
Cc: Keiichiro Oura <uratec@xxxxxxxxxxxx>
Delivered-to: hts-users@xxxxxxxxxxxxxxx

Hi,

I suspect that the training data were recorded at a 16 kHz sampling rate
and then up-sampled before being used for training.
There seem to be two solutions.

1. Run the training scripts with the 16 kHz setting and 16 kHz data.

2. Add -E -60.0 to the mcep/mgc command in data/Makefile, then run the
   training scripts with the 48 kHz setting and the up-sampled data.
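
One way to check the diagnosis above before picking an option: audio that was up-sampled from 16 kHz carries almost no energy above 8 kHz (the Nyquist frequency of the original recording), and that empty band is presumably what the -E floor in option 2 compensates for. The following is a minimal Python sketch of such a check, not part of HTS or SPTK. The function names (read_mono_16bit, high_band_ratio, tones), the 512-sample frame, and the 8 kHz cutoff are illustrative assumptions, and the reader only handles mono 16-bit PCM.

```python
import math
import struct
import wave


def read_mono_16bit(path, num_samples=512, offset=0):
    """Read num_samples frames from a mono 16-bit PCM WAV file, starting at offset."""
    with wave.open(path, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles mono 16-bit PCM")
        rate = wav.getframerate()
        wav.setpos(offset)
        raw = wav.readframes(num_samples)
    samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
    return list(samples), rate


def high_band_ratio(samples, sample_rate, cutoff_hz=8000.0):
    """Return the fraction of spectral energy at or above cutoff_hz in one frame."""
    n = len(samples)
    if n < 2:
        return 0.0
    # Hann window to limit spectral leakage between bands.
    windowed = [s * (0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(samples)]
    low = high = 0.0
    # Naive DFT over the positive-frequency bins (DC and Nyquist excluded).
    for k in range(1, n // 2):
        w = 2.0 * math.pi * k / n
        re = sum(x * math.cos(w * i) for i, x in enumerate(windowed))
        im = sum(x * math.sin(w * i) for i, x in enumerate(windowed))
        power = re * re + im * im
        if k * sample_rate / n >= cutoff_hz:
            high += power
        else:
            low += power
    total = low + high
    return high / total if total > 0.0 else 0.0


def tones(freqs_hz, sample_rate, n=512):
    """Synthetic test signal: a sum of unit-amplitude sine tones."""
    return [sum(math.sin(2.0 * math.pi * f * i / sample_rate) for f in freqs_hz)
            for i in range(n)]
```

For a real file, something like `samples, rate = read_mono_16bit("some.wav", offset=24000)` followed by `high_band_ratio(samples, rate)` could be used; the file name and offset here are placeholders. A ratio very close to zero at a nominal 48 kHz rate would be consistent with up-sampled 16 kHz material. Real speech varies from frame to frame, so it is worth checking several voiced frames rather than a single one.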

Regards,
Keiichiro Oura




2015-11-19 4:51 GMT+09:00 Erica Cooper <ecooper@xxxxxxxxxxxxxxx>:
> Hi,
>
> I checked and I had mistakenly synthesized that test audio from the mono
> labels rather than full.
> Using hts-engine to synthesize from the full labels still produces bad
> output, though; it is mostly silence:
> http://www.cs.columbia.edu/~ecooper/audio/nat_0001-2.wav
> Here is the fullcontext label that was used:
> http://www.cs.columbia.edu/~ecooper/audio/nat_0001.lab
> I don't know whether it makes a difference, but the training data used for
> this voice has been 'monotonized' (constant lf0 for voiced regions).  We
> have already done this for another voice trained the same way on different
> data and all synthesized test utterances came out properly.
>
> Thanks,
> Erica
>
>
> On Wed, Nov 18, 2015 at 10:09 AM, Keiichiro Oura <uratec@xxxxxxxxxxxxxxx>
> wrote:
>>
>> Hi,
>>
>> Let me see the test labels.
>>
>> Regards,
>> Keiichiro Oura
>>
>>
>>
>> 2015-11-18 23:43 GMT+09:00 Erica Cooper <ecooper@xxxxxxxxxxxxxxx>:
>> > Hi all,
>> >
>> > I have trained a voice using the HTS demo script and my own data.  I have
>> > also added my own test sentences to gen.scp.  The 'alice' sentences come
>> > out fine; however, my own test sentences come out with no audible speech.
>> > They sound like this (hts_engine):
>> >
>> > http://www.cs.columbia.edu/~ecooper/audio/nat_0001.wav
>> >
>> > I would expect that there may be something wrong with my test sentence
>> > labels; however, I have already been using them with dozens of other
>> > voices with no problems.  If anyone has any ideas or pointers as to what
>> > might be causing this and how to fix it, it would be greatly appreciated.
>> >
>> > Thanks,
>> > Erica
>>
>>
>

Follow-Ups
: [hts-users:04334] Re: bad voice output for test sentences, Erica Cooper

References
: [hts-users:04325] bad voice output for test sentences, Erica Cooper; [hts-users:04326] Re: bad voice output for test sentences, Keiichiro Oura; [hts-users:04327] Re: bad voice output for test sentences, Erica Cooper
