
[hts-users:04292] Re: Vocoding natural speech


Hi Erica,

I find the easiest way is to use SPTK's wav2raw command (http://sp-tk.sourceforge.net/).

This has worked successfully for me.

The command I use is:

SPTKDIR/bin/wav2raw -N -L -d OUTDIR INFILE

You may need to change some of the options to fit your data, though (e.g. if it is not stereo data).
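
Whichever converter is used, a quick sanity check on the result can help. The sketch below is not part of SPTK or the HTS demo; the function names has_riff_header and raw_sample_count are made up for illustration, and it assumes the .raw files are meant to be headerless 16-bit samples. It only tests whether a file still begins with the "RIFF" tag of a WAV container (in which case the header bytes would be read as audio samples) and reports how many 16-bit samples the file holds.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a supposedly headerless .raw file.
# Assumption: samples are 16-bit (2 bytes each).

# Succeeds (exit status 0) if the file starts with the ASCII tag "RIFF",
# i.e. it still carries a WAV header and is not truly raw.
has_riff_header() {
    # Read the first four bytes; drop NUL bytes so the shell can hold them.
    magic=$(dd if="$1" bs=4 count=1 2>/dev/null | tr -d '\0')
    [ "$magic" = "RIFF" ]
}

# Prints the number of 16-bit samples, computed from the file size in bytes.
raw_sample_count() {
    bytes=$(wc -c < "$1")
    echo $((bytes / 2))
}

# Example usage (the path is a placeholder):
#   if has_riff_header raw/example.raw; then
#       echo "raw/example.raw still has a WAV header"
#   fi
#   raw_sample_count raw/example.raw
```

Dividing the sample count by the expected sampling rate gives the duration, which can be compared against the original .wav to spot a wrong rate or channel count.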

- Rasmus

Quoting Erica Cooper <ecooper@xxxxxxxxxxxxxxx> on Mon, 13 Jul 2015 10:58:52 -0400:

Hi,

We are trying to vocode natural speech to use as a comparison in a
naturalness test.  I have been following parts of the HTS demo scripts to
do this.  In particular, first converting .wav to .raw, then doing 'make
lf0' and 'make mgc', and finally using the 'gen_wave' function in
Training.pl.

The problem is that I get many instances of the warning "x2x : warning:
input data is over the range of type 'short'!", and the output audio then
has a lot of loud squeaks and pops.  I've isolated the problem to the .wav
to .raw conversion: when I vocode starting from the .raw files that came
with the SLT demo, instead of converting them from .wav myself, the output
sounds fine.

I've tried two methods to convert .wav to .raw: first, this one that was in
the README in the demo:

ch_wave -c 0 -F 32000 -otype raw in.wav | x2x +sf | interpolate -p 2 -d |
ds -s 43 | x2x +fs > out.raw

and this one, that I found in an earlier thread on this mailing list:

for wav in ./wav/*.wav
do
  raw=./raw/`basename $wav .wav`.raw
  sox -c 1 -s -w -t wav -r 16000 $wav -c 1 -s -w -t wav -r 48000 $raw
done

and for both of these, the audio comes out sounding bad.

So, does anyone know what might be causing this, or how the .raw files in
the SLT demo were generated from .wav, or know of any other way to vocode
natural speech in a .wav file?

Thanks very much,
Erica







References
[hts-users:04291] Vocoding natural speech, Erica Cooper