[hts-users:03847] Re: Quality from demos

Subject: [hts-users:03847] Re: Quality from demos
From: Dietmar Schabus <schabus@xxxxxx>
Date: Mon, 19 Aug 2013 09:59:35 +0200
Delivered-to: hts-users@xxxxxxxxxxxxxxx

How do you make .raw audio files from wave files? You need to convert to raw files at 48 kHz, 16 bit (2-byte short), signed integer, little-endian, single channel.

Like this (bash syntax):

for wavfile in /your/wav/dir/*.wav
do

sox "$wavfile" -V1 -t raw -r 48000 -b 16 -e signed-integer -L -c 1 "/someplace/HTS-demo_CMU-ARCTIC-SLT/data/raw/$(basename "$wavfile" .wav).raw"

done

See man sox for details.
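
Because .raw files carry no header, a wrong sampling rate cannot be read back from the file itself. One rough sanity check is to compute the duration implied by the file size and compare it with the length of the original recording. The sketch below assumes the format described above (16-bit mono); the function name raw_duration_ms is made up for illustration.

```shell
# Rough duration (in milliseconds) implied by a headerless .raw file,
# assuming 16-bit (2-byte) mono samples at the given rate (default 48000 Hz).
raw_duration_ms() {
    rawfile=$1
    rate=${2:-48000}
    bytes=$(wc -c < "$rawfile")
    # 2 bytes per sample, one channel
    echo $(( bytes * 1000 / (2 * rate) ))
}

# Example call (path is a placeholder):
#   raw_duration_ms /someplace/HTS-demo_CMU-ARCTIC-SLT/data/raw/yourfile.raw
```

If the result is about a third of the real utterance length, the data is probably at 16 kHz rather than 48 kHz.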

Your file sounds like your .raws are at 16 kHz. You can also use 16 kHz data, but then you need to specify that in the SAMPFREQ environment variable before running configure. Run ./configure --help for more info.
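
When the sampling rate changes, settings that are expressed in samples usually have to change with it. As a sketch only: the helper below derives a frame length and frame shift from the rate, assuming a 25 ms window and a 5 ms shift. The variable names FRAMELEN and FRAMESHIFT and those two durations are assumptions here, not taken from this thread, so check them against the output of ./configure --help; other analysis settings may depend on the rate as well.

```shell
# Print configure-style settings for a given sampling rate.
# ASSUMPTIONS: 25 ms analysis window, 5 ms frame shift, and the variable
# names FRAMELEN / FRAMESHIFT (verify with ./configure --help).
hts_frame_settings() {
    rate=$1
    echo "SAMPFREQ=$rate FRAMELEN=$(( rate * 25 / 1000 )) FRAMESHIFT=$(( rate * 5 / 1000 ))"
}

# Example: hts_frame_settings 16000
```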


Best, Dietmar


On 2013-08-10 19:45, Marvin Coto wrote:

Hello Dietmar!
Thank you so much for your reply.
I recently changed the questions folder to fit my phone set, and the
results have improved a lot:

https://dl.dropboxusercontent.com/u/81143637/nueva.wav

The problem now, I think, is the pitch of the synthesized wave. I've tried
with male and female voices, and in both cases the results sound
lower-pitched than the original voice. I ran the demo script
cmu_us_arctic_slt as it comes, and the result was perfect.

I also saw this in the log file: "x2x : warning: input data is over the
range of type 'short'!" I read about it on the mailing list but didn't
understand whether I have to change something in the scripts.
Thanks in advance for your attention,

Marvin.


2013/8/7 Dietmar Schabus <schabus@xxxxxx>

    Hello Marvin,

    I think it's due to the questions.

    If the questions do not match your phone set, the decision tree
    based clustering cannot produce reasonable results (everything will
    be answered "no").
    I think the easiest way to obtain questions is to write a simple
    script that generates them from a set of classes (like "vowel" or
    "fricative" etc.) where the phones (of your phone set) that belong
    to each class are listed.

    Regards,
    Dietmar
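
The class-based generation suggested in the message above could be sketched roughly as follows. This is an illustration, not the script Dietmar had in mind: the function name make_questions is hypothetical, and the context patterns (LL, L, C, R, RR with the delimiters ^ - + = @) are assumed to match the full-context label format of the English demo, so adjust them to your own labels.

```shell
# Emit HTS-style QS lines for one phone class in five context positions.
# Usage: make_questions CLASS phone1 phone2 ...
# ASSUMPTION: labels look like p1^p2-p3+p4=p5@..., as in the English demo.
make_questions() {
    class=$1; shift
    for ctx in 'LL:%s^*' 'L:*^%s-*' 'C:*-%s+*' 'R:*+%s=*' 'RR:*=%s@*'; do
        prefix=${ctx%%:*}     # position name, e.g. "C"
        pattern=${ctx#*:}     # printf template, e.g. "*-%s+*"
        list=""
        for phone in "$@"; do
            item=$(printf "$pattern" "$phone")
            list="${list:+$list,}$item"
        done
        printf 'QS "%s-%s" {%s}\n' "$prefix" "$class" "$list"
    done
}

# Example: make_questions Vowel a e i o u
```

Calling it once per class (vowels, fricatives, nasals and so on) with the phones of your own phone set produces one block of questions per class.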



    On 2013-08-06 22:06, Marvin Coto wrote:

        Hello!

        I'm trying to use the demo scripts of HTS 2.2 (the speaker-dependent
        training demos in English and Portuguese) for a voice in Spanish.
        I replaced the raw files with those from my Spanish database (184 files
        made from good-quality recordings) and the utt files (generated with
        Festival 2.1 from the text transcriptions of the audio files). Even
        though I haven't changed the questions folder, I was expecting a little
        more quality from the beginning.
        An example of the result, synthesized in Festival, can be heard here:

        https://dl.dropboxusercontent.com/u/81143637/ConEstoico.wav

        It is supposed to say "Con estoico respeto a la justicia adyacente
        guardo sus flechas". I'm sure nobody needs to speak Spanish to realize
        something is going very wrong. From that audio file, does anybody have
        any idea where the problem could be? Maybe the raw files, the way I'm
        generating the utt files, or could it be the questions folder?

        In the last case, do I have to change the questions manually, or
        does that have to be done in HTK?

        Thanks in advance,

        Marvin.


--
----------------------------------------------------------------------
Dipl.-Ing. Dietmar Schabus | Researcher

phone +43 1 5052830-48 | fax -99 | schabus@xxxxxx | http://userver.ftw.at/~schabus/


FTW Telecommunications Research Center Vienna
Donau-City-Straße 1/3 | 1220 Vienna | Austria | www.ftw.at

Follow-Ups
: [hts-users:03851] Re: Quality from demos, Marvin Coto

References
: [hts-users:03833] Quality from demos, Marvin Coto; [hts-users:03834] Re: Quality from demos, Dietmar Schabus; [hts-users:03838] Re: Quality from demos, Marvin Coto
