
[hts-users:00194] Re: diffrences between merge and cat


Hi Heiga ZEN,

I greatly appreciate your help and Sangjin Kim's.

I now understand the speech synthesis process as follows:

1. The HTS voice, which includes hts_engine, is registered by \festvox\cstr_us_ked_timit_hts.scm. That file tells Festival where the HTS voice is located. For example, the following command changes the current voice:
 festival> (voice_cstr_us_ked_timit_hts)

2. \festvox\cstr_us_ked_timit_hts.scm loads \hts\hts.scm, and
\hts\hts.scm calls \hts\htsvoice.pl, which uses hts_engine to synthesize speech.

3. \hts\hts.scm also defines some variables through which we can modify speech features, such as F0.


In \hts\htsvoice.pl, I found that hts_engine synthesizes speech using .pdf files.
Does hts_engine generate the parameters from the .pdf files, or do the .pdf files already contain parameters in a format that can be used directly to synthesize speech?

In the \festvox directory there are some other files besides cstr_us_ked_timit_hts.scm, such as:

 cstr_us_ked_timit_hts_phoneset.scm
 cstr_us_ked_timit_hts_tokenizer.scm
 cstr_us_ked_timit_hts_tagger.scm
 cstr_us_ked_timit_hts_lexicon.scm
 cstr_us_ked_timit_hts_phrasing.scm
 cstr_us_ked_timit_hts_intonation.scm
 cstr_us_ked_timit_hts_duration.scm
 cstr_us_ked_timit_hts_f0model.scm
 cstr_us_ked_timit_hts_other.scm

I have not found what role they play in the synthesis process.
Why does the "HTS voices for Festival" demo provide them?

Regards,
liulei



From: "Heiga ZEN (Byung Ha CHUN)" <zen@xxxxxxxxxxxxxxxx>
Reply-To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
To: hts-users@xxxxxxxxxxxxxxxxxxxxxxxxx
Subject: [hts-users:00193] Re: diffrences between merge and cat
Date: Wed, 11 Jan 2006 02:45:15 +0900

Hi,

liulei_198216@xxxxxxxxxxx wrote:

But I am puzzled about how Festival works together with hts_engine.

I have read the Perl script and found that hts_engine reads .lab files from /tmp/tmp.lab to synthesize speech.

When we use Festival to input text from the command line, we get real-time speech.
My question is:
how does Festival trigger hts_engine to synthesize speech after the .lab files are generated by Festival?

I have also read the Festival manual and found that it synthesizes speech by concatenating units,
but how does hts_engine work with Festival?

1. Festival extracts utterance information from the input text.
2. The utterance information is saved as "utt.feats" in the working directory.
3. Festival calls htsvoice.pl.
4. In htsvoice.pl, the utterance information is converted to the corresponding context-dependent label sequence.
5. hts_engine is called from htsvoice.pl.
6. A waveform is synthesized and saved in raw audio format.
7. A RIFF header is added to the raw audio by sox.
8. The resulting wav file is loaded by Festival.
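
To illustrate steps 6 and 7, here is a minimal Python sketch of what wrapping headerless audio in a RIFF/WAVE container involves. It is not the method htsvoice.pl uses (that script calls sox); it uses Python's standard wave module instead. The function name raw_to_wav_bytes and the audio format (16 kHz, mono, 16-bit samples) are assumptions made for the example, so check the actual values against htsvoice.pl.

```python
import io
import struct
import wave


def raw_to_wav_bytes(raw_pcm: bytes, sample_rate: int = 16000,
                     channels: int = 1, sample_width: int = 2) -> bytes:
    """Wrap headerless PCM samples in a RIFF/WAVE container.

    The defaults (16 kHz, mono, 2 bytes per sample) are assumptions for
    this sketch, not values taken from the HTS voice scripts.
    """
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample
        wav.setframerate(sample_rate)
        wav.writeframes(raw_pcm)        # header is written before the samples
    return buf.getvalue()


# Stand-in for the synthesizer's raw output: 100 samples of 16-bit silence.
raw = struct.pack("<100h", *([0] * 100))
wav_bytes = raw_to_wav_bytes(raw)
```

The header is written at the start of the file, ahead of the samples, which is why a player or Festival can read the sample rate and sample size before decoding the audio.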

Please see hts.scm, which is included in our HTS voices.

In the next Festival release (2.0?), hts_engine will be integrated as a function of Festival.

Regards,

Heiga Zen (Byung Ha CHUN)

--
 ------------------------------------------------
  Heiga ZEN     (in Japanese pronunciation)
  Byung Ha CHUN (in Korean pronunciation)

  Department of Computer Science and Engineering
  Graduate School of Engineering
  Nagoya Institute of Technology
  Japan

  e-mail: zen@xxxxxxxxxxxxxxxx
     web: http://kt-lab.ics.nitech.ac.jp/~zen
 ------------------------------------------------

