15th International Congress of Phonetic Sciences (ICPhS-15), Barcelona, Spain
Corpus-based text-to-speech (TTS) has been actively studied as a way to
bring synthesized speech closer to human-like naturalness. However, the
application of TTS remains very restricted because of its large database size.
In this paper, to address this problem, we propose two modifications
of the LBG clustering algorithm (split k-means). The first modification
introduces a terminating threshold on the total cost: the number of selected
inventories can be smaller than the target number of clusters if the total
cost has been reduced enough to end the iteration process. The second
modification takes into account the frequency information of unit instances,
which is obtained while synthesizing a large text corpus. To incorporate this
frequency information, we propose a modified version of the MinMax cost
function commonly used in selecting centroids.
The perceptual test results show that our algorithm
performs well, greatly reducing the DB size
while maintaining good speech quality.
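
The idea of ending LBG splitting early can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `lbg_with_threshold`, its parameters, the squared Euclidean cost, and the additive split perturbation `eps` are all assumptions made for illustration. The optional `weights` argument stands in for unit-instance frequencies through a simple frequency-weighted distortion, which is a swapped-in simplification and not the modified MinMax cost function proposed in the paper.

```python
import numpy as np

def lbg_with_threshold(X, target_k, cost_threshold, weights=None,
                       eps=1e-3, n_iter=20):
    """Split k-means (LBG) that stops once the total cost is small enough.

    Hypothetical helper: returns (centroids, labels, total_cost). The number
    of centroids may be smaller than target_k if the threshold is reached.
    """
    X = np.asarray(X, dtype=float)
    # Assumption: positive per-instance weights (e.g. usage frequencies).
    w = np.ones(len(X)) if weights is None else np.asarray(weights, dtype=float)

    # Start from a single centroid: the (weighted) mean of all instances.
    centroids = np.average(X, axis=0, weights=w)[None, :]

    while True:
        # Lloyd refinement for the current number of centroids.
        for _ in range(n_iter):
            d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
            labels = d.argmin(axis=1)
            for j in range(len(centroids)):
                mask = labels == j
                if mask.any():  # an empty cluster keeps its old centroid
                    centroids[j] = np.average(X[mask], axis=0, weights=w[mask])

        # Total (weighted) cost after refinement.
        d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        total_cost = float((w * d.min(axis=1)).sum())

        # Stop early on the cost threshold, or when target_k is reached.
        if total_cost <= cost_threshold or len(centroids) >= target_k:
            return centroids, labels, total_cost

        # Split every centroid into two perturbed copies, capped at target_k.
        centroids = np.concatenate([centroids + eps, centroids - eps])[:target_k]


# Toy data: two tight groups of two points each.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]])
centroids, labels, cost = lbg_with_threshold(X_demo, target_k=4, cost_threshold=1.5)
full_centroids, _, _ = lbg_with_threshold(X_demo, target_k=4, cost_threshold=0.0)
```

On this toy data the first call stops with fewer centroids than `target_k` because the threshold is met after one split, whereas a threshold of zero forces splitting to continue up to the target size.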
Bibliographic reference. Kim, Jinyoung / Chun, Youngha / Lee, Joohun / Choi, Seungho (2003): "Modified LBG clustering algorithms for small unit inventory in corpus-based TTS system", In ICPhS-15, 2561-2564.