[hts-users:04022] Re: Building Sinsy voice


Thank you for your quick reply!

Just to verify: is it expected to see pitch errors of roughly 100 to 200 cents relative to the input MusicXML on some notes when using the .htsvoice trained with the smaller public-domain set?
http://www.dtic.upf.edu/~mblaauw/sinsy_f0_synth.png
http://www.dtic.upf.edu/~mblaauw/sinsy_song031.wav
http://www.dtic.upf.edu/~mblaauw/sinsy_song070.wav
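
For reference, a deviation like this can be quantified with the standard cents formula, 1200 * log2(f / f_target). The following is only a minimal sketch and not the script used to produce the plot above: the function names, the A4 = 440 Hz tuning, and the convention that unvoiced frames carry an F0 of 0 are all assumptions made for illustration.

```python
import math
import statistics

A4_HZ = 440.0  # assumed tuning reference; MIDI note 69 = A4


def midi_to_hz(midi_note):
    """Convert a MIDI note number to frequency in Hz (equal temperament)."""
    return A4_HZ * 2.0 ** ((midi_note - 69) / 12.0)


def cents_error(f0_hz, target_hz):
    """Signed deviation of f0_hz from target_hz in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f0_hz / target_hz)


def median_cents_error(f0_frames, target_midi):
    """Median cents error over the voiced frames of one note.

    Frames with f0 <= 0 are treated as unvoiced and skipped (an assumption
    about how the F0 track encodes unvoiced regions). Returns None if the
    note has no voiced frames.
    """
    target_hz = midi_to_hz(target_midi)
    voiced = [cents_error(f, target_hz) for f in f0_frames if f > 0]
    return statistics.median(voiced) if voiced else None
```

With these definitions, a note that is synthesized one semitone above the pitch written in the score gives a median error of about +100 cents, and a whole tone gives about +200 cents, which is the range I am asking about.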

I checked the pitch of one of the files in the training set, and it does not look obviously wrong.
http://www.dtic.upf.edu/~mblaauw/sinsy_f0_train.png

Regards,
Merlijn



Sunday, March 02, 2014 4:17 PM
Hi,

I was wondering: is building the "HTS-demo_NIT-SONG070-F001" training demo
supposed to result in the same voice as the
"hts_voice_nitech_jp_song070_f001-0.90" downloadable pre-built binary for
Sinsy?

No.
One of the main differences is the size of the training data.
Only 31 songs (32 min., *public domain*) are included in the demo scripts.
On the other hand, the HTS voice at http://sinsy.sourceforge.net was
trained on 70 songs (72 min.).

Regards,
Keiichiro Oura


2014-03-02 23:22 GMT+09:00 Merlijn Blaauw <merlijn.blaauw@xxxxxxx>:
Hello,

I was wondering: is building the "HTS-demo_NIT-SONG070-F001" training demo
supposed to result in the same voice as the
"hts_voice_nitech_jp_song070_f001-0.90" downloadable pre-built binary for
Sinsy?

I tried to build the demo, but the file size of the resulting .htsvoice file
and the synthesis results are very different from the pre-built voice.
In particular, pitch (and breath sounds) seem to be modeled very poorly;
timbre seems more or less OK (although it is hard to tell).
The "gen" phrases synthesized as part of the training script also do not
sound very good.

I'm using the following software: HTS 2.3alpha, HTS-demo_NIT-SONG070-F001
from HTS 2.3alpha (slightly modified to fix a raw2wav sample rate issue), SPTK
3.7, sinsy 0.90, hts_engine 1.08.

Thank you very much.
Merlijn


References
[hts-users:04019] Building Sinsy voice, Merlijn Blaauw
[hts-users:04020] Re: Building Sinsy voice, Keiichiro Oura