15th International Congress of Phonetic Sciences (ICPhS-15)
This paper introduces an efficient search algorithm for finding a minimum sentence set with which to collect a speech database for speech study. The minimum set should be small in text size, to reduce the collection cost, and should cover all the phonetic units of interest. The method selects the lowest-cost sentence from a sub-corpus consisting of all the sentences that contain at least one token of the least frequent unit in the whole corpus. Compared with conventional greedy search algorithms, the method obtained a smaller sentence set in significantly less computation time.
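The selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not define the cost function or say how the procedure iterates, so the sketch makes several assumptions: each sentence is given as a list of phonetic-unit labels, the cost of a sentence is its length divided by the number of still-uncovered units it contains, and the least frequent unit is re-chosen among the units not yet covered after each selection. The name `select_sentences` is hypothetical.

```python
from collections import Counter


def select_sentences(corpus):
    """Return indices of selected sentences that together cover every unit.

    corpus: list of sentences, each a list of phonetic-unit labels.
    """
    # Token frequency of each unit over the whole corpus.
    freq = Counter(unit for sentence in corpus for unit in sentence)
    uncovered = set(freq)
    chosen = []

    while uncovered:
        # Least frequent unit that is still uncovered
        # (assumption: ties are broken by label, for determinism).
        rare = min(uncovered, key=lambda u: (freq[u], u))

        # Sub-corpus: only the sentences containing that unit.
        candidates = [i for i, s in enumerate(corpus) if rare in s]

        # Assumed cost: sentence length per newly covered unit.
        # Every candidate contains `rare`, so the divisor is at least 1.
        def cost(i):
            newly_covered = uncovered & set(corpus[i])
            return len(corpus[i]) / len(newly_covered)

        best = min(candidates, key=lambda i: (cost(i), i))
        chosen.append(best)
        uncovered -= set(corpus[best])

    return chosen


if __name__ == "__main__":
    toy_corpus = [
        ["a", "b"],
        ["a", "b", "c"],
        ["c", "d"],
        ["a"],
        ["d", "e", "a"],
    ]
    print(select_sentences(toy_corpus))
```

Under these assumptions, each iteration scores only the sentences that contain the rarest uncovered unit rather than the whole corpus. This is one plausible reading of where the reported saving in computation time comes from.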
Bibliographic reference. Zhang, Jin-Song / Nakamura, Satoshi (2003): "An efficient algorithm to search for a minimum sentence set for collecting speech database", In ICPhS-15, 3145-3148.