15th International Congress of Phonetic Sciences (ICPhS-15)
This paper addresses three interrelated, broad questions: (i) What type of phonetic knowledge is used in text-to-speech synthesis? (ii) What good does it do? (iii) What future phonetics research does synthesis need? We argue that, depending on the specific architecture and aim of the system (i.e., open domain or closed domain), text-to-speech synthesis systems can incorporate a great variety of facts about human language. These facts do not necessarily take the form of manually crafted rule systems. Such rule systems have often been faulted for fragility, which in turn has been used as an argument for doing away with the incorporation of phonetic knowledge and using machine learning instead. The key benefit of incorporating phonetic knowledge is its domain independence, which lessens the dependence of a system's performance on the non-generalizable peculiarities of a training corpus. Moreover, current phonetic knowledge is not enough: improving speech synthesis quality requires answers to many general phonetic questions, some examples of which will be provided. For this, however, closer cooperation is needed between the speech technology and phonetics communities.
Bibliographic reference. Santen, Jan P. H. van (2003): "The role of phonetics in synthesis", In ICPhS-15, 55-58.