
[hts-users:00320] Re: some questions about hts_engine and HTS_demo


Hi,
Thank you.
I have found the format of the .pdf files, as follows:
 
mcp file:

# header
4byte integer, dim. of feature vector for spectrum part (ex. 75)
4byte integer, #leaf nodes in 1st state
4byte integer, #leaf nodes in 2nd state
...
4byte integer, #leaf nodes in 5th state

# probability distributions (Gaussian)
4byte float,  1st dim. of mean vector at first leaf node
...
4byte float, 75th dim. of mean vector at first leaf node
4byte float,  1st diag. element of covariance matrix at first leaf node
...
4byte float, 75th diag. element of covariance matrix at first leaf node
.....
4byte float,  1st dim. of mean vector at last leaf node
...
4byte float, 75th dim. of mean vector at last leaf node
4byte float,  1st diag. element of covariance matrix at last leaf node
...
4byte float, 75th diag. element of covariance matrix at last leaf node
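
A minimal Python sketch of a reader for the layout listed above may make it easier to follow. It assumes big-endian data (as stated in the quoted reply below), five states, and that the leaf nodes are stored state by state in header order; that ordering is an assumption and is not confirmed in this thread. The function name parse_mcp_pdf and the synthetic test data are hypothetical.

```python
import struct

def parse_mcp_pdf(data, num_states=5):
    """Parse bytes laid out as in the listing above (big-endian assumed)."""
    # Header: vector dimension, then one leaf-node count per state.
    header = struct.unpack_from(f">{1 + num_states}i", data, 0)
    dim, leaf_counts = header[0], list(header[1:])
    offset = 4 * (1 + num_states)
    states = []
    for n_leaves in leaf_counts:  # assumption: leaves are grouped by state
        leaves = []
        for _ in range(n_leaves):
            # Each leaf: dim means followed by dim diagonal covariance values.
            values = struct.unpack_from(f">{2 * dim}f", data, offset)
            offset += 4 * 2 * dim
            leaves.append({"mean": list(values[:dim]),
                           "var": list(values[dim:])})
        states.append(leaves)
    return dim, leaf_counts, states

# Tiny synthetic example (not real HTS data): dim 2, one leaf per state.
_header = struct.pack(">6i", 2, 1, 1, 1, 1, 1)
_body = b"".join(struct.pack(">4f", 0.5, 1.5, 2.0, 4.0) for _ in range(5))
dim, leaf_counts, states = parse_mcp_pdf(_header + _body)
print(dim, leaf_counts, states[0][0])
```

For a real file, the bytes would come from something like open("mcp.pdf", "rb").read().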

I used the following commands to inspect the mcp.pdf file in the HTS-demo:
 
swab +f mcp.pdf | dmp +i | less (to review the header part)
swab +f mcp.pdf | dmp +f | less (to review the distributions)
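
For comparison, here is a rough Python equivalent of the two pipelines above. It is only a sketch: it assumes the file holds big-endian 4-byte words, and it returns one (index, value) pair per word, which mirrors the index-then-value line layout that dmp prints. The helper name dump_words is hypothetical.

```python
import struct

def dump_words(data, as_float=False):
    """Return (index, value) pairs for big-endian 4-byte words."""
    count = len(data) // 4
    # 'i' reads each word as an integer, 'f' as a float.
    fmt = f">{count}{'f' if as_float else 'i'}"
    return list(enumerate(struct.unpack(fmt, data[:4 * count])))

# Synthetic two-word example (not taken from a real mcp.pdf).
sample = struct.pack(">2i", 75, 120)
for index, value in dump_words(sample):
    print(index, value)
```

Note that reading the whole file with one type, as each pipeline does, decodes the integer header and the float distributions alike, so the values are only meaningful in the part of the file that really has that type.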
 
 
The "mcp.pdf" in the HTS-demo looks like this:
 
header
0        12313123
1        12121231
..................
 
pdf
0      0.123131
................
 
I think the first column is a serial number.
What does the second column mean?
I cannot see the relation between the format and its contents.
 
Another question:
I also used dmp +f to open the ".mcep" and ".pit" files that are used to generate speech.
Here is the ".pit" file:
 
0    123.123123
1    324.234244
.........................
 
But I found that the two files contain different numbers of values:
there are maybe 14781 values in the mcep file but 777 in the pit file.
 
If "MCEPORDER = 18", does it mean there are 36 mel-cepstral coefficients (static and dynamic) and 2 F0 values (static and dynamic),
so that the number of F0 values * 18 = the number of mel-cepstral coefficients?
 
Finally, if I want to modify some values in the .pit file to change the characteristics of the speech,
what tool can I use to do that?
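
One possible approach, sketched here in Python rather than with any particular tool, is to read the file as raw 4-byte floats, change the values, and write them back. This assumes the .pit file is a headerless sequence of floats in little-endian byte order and that a value of 0 marks an unvoiced frame; neither point is confirmed in this thread, and whether the values are pitch periods or frequencies is not stated either. The function name scale_pitch is hypothetical.

```python
import struct

def scale_pitch(raw, factor, byte_order="<"):
    """Scale every non-zero float in raw pitch data by factor."""
    count = len(raw) // 4
    values = struct.unpack(f"{byte_order}{count}f", raw[:4 * count])
    # Assumption: 0 marks an unvoiced frame, so it is left unchanged.
    scaled = [v * factor if v > 0.0 else v for v in values]
    return struct.pack(f"{byte_order}{count}f", *scaled)

# Synthetic three-frame example (not real .pit data).
raw = struct.pack("<3f", 100.0, 0.0, 50.0)
out = scale_pitch(raw, 0.5)
print(struct.unpack("<3f", out))
```

With a real file, raw would be read with open(path, "rb").read() and the result written to a new file, which could then be passed to excite in place of the original.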
 
Best regards
 
2006/5/24, Heiga ZEN (Byung Ha CHUN) <zen@xxxxxxxxxxxxxxxx>:
Hi,

lei liu wrote:

> I have read the "training.pl" in the HTS-demo, and found that it uses
> excite, mlsadf and x2x in the SPTK to generate voice.

Yes.

> But I cannot open the .mcep and .pit files.

You can open them using SPTK (dmp command).
They are saved under gen/ directory.

> And how can the ".pdf" files in the voices be opened?

They are in hts_engine format (binary int/float, big endian).
You can find the definition of these file formats in the hts-users mailing list archive.

> Another question:
> HTS-demo generates .inf files and .pdf files for hts_engine before
> generating unseen models.
> Can hts_engine generate them using the .inf and .pdf files?

Yes.
For given unseen models, hts_engine traverses decision trees (inf files) and finds the corresponding leaf nodes.
Using statistics of the found nodes, it generates mcep and f0 sequences and synthesizes a speech waveform.
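
The traversal described above can be illustrated with a toy Python sketch. It is not the hts_engine implementation and does not read the .inf format: the tree structure, the wildcard patterns, and the labels below are all hypothetical, chosen only to show how answering a yes/no question at each node leads to a leaf whose statistics would then be used.

```python
from fnmatch import fnmatchcase

# Hypothetical structure: a leaf is a string naming a pdf entry;
# an internal node is a tuple (pattern, no_branch, yes_branch).
def find_leaf(node, label):
    """Walk the tree until a leaf is reached for the given context label."""
    while not isinstance(node, str):
        pattern, no_branch, yes_branch = node
        # The "question" at this node: does the label match the pattern?
        node = yes_branch if fnmatchcase(label, pattern) else no_branch
    return node

# Toy tree with two questions about the centre phone of the label.
toy_tree = ("*-a+*", ("*-k+*", "leaf_3", "leaf_2"), "leaf_1")

for label in ("k-a+t", "a-k+i", "s-i+l"):
    print(label, find_leaf(toy_tree, label))
```

Because every label reaches some leaf, a model that never appeared in training still gets a distribution, which is what allows synthesis for unseen contexts.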

Regards,

Heiga Zen (Byung Ha Chun)

--
------------------------------------------------
Heiga ZEN     (in Japanese pronunciation)
Byung Ha CHUN (in Korean pronunciation)

Department of Computer Science and Engineering
Nagoya Institute of Technology
Gokiso-cho, Showa-ku, Nagoya 466-8555 Japan

http://kt-lab.ics.nitech.ac.jp/~zen
------------------------------------------------



Follow-Ups
[hts-users:00321] Re: some questions about hts_engine and HTS_demo, Heiga ZEN (Byung Ha CHUN)
References
[hts-users:00318] some questions about hts_engine and HTS_demo, lei liu
[hts-users:00319] Re: some questions about hts_engine and HTS_demo, Heiga ZEN (Byung Ha CHUN)