14th International Congress of Phonetic Sciences (ICPhS-14), San Francisco, CA, USA
ProSynth is an approach to speech synthesis which takes a rich linguistic structure as central to the generation of natural-sounding speech [1]. This paper outlines the model of temporal interpretation that ProSynth uses to generate polysyllabic utterances, and the phonological structures used to drive the synthesis. We start from the assumption that the speech signal is informationally rich, and that this acoustic richness reflects linguistic structural richness. The primary timing unit is the syllable, situated within a prosodic hierarchy. Two mechanisms are used for timing: (1) syllables are joined by overlaying one on another; (2) syllables are temporally compressed to produce the correct rhythmical effects.
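The two timing mechanisms can be illustrated with a minimal sketch. It is not the ProSynth implementation: the `Syllable` class, the `timeline` function, the overlap amounts, and the compression factors are all hypothetical names and values chosen for illustration. The sketch assumes that overlay can be modelled as starting a syllable before the previous one ends, and that compression can be modelled as scaling a syllable's duration by a factor no greater than 1.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Syllable:
    """Hypothetical syllable record; all values are illustrative only."""
    label: str
    duration_ms: float        # assumed uncompressed duration
    overlap_ms: float = 0.0   # assumed amount overlaid on the previous syllable


def timeline(syllables: List[Syllable],
             factors: List[float]) -> List[Tuple[str, float, float]]:
    """Return (label, start_ms, end_ms) for each syllable.

    Mechanism (1), overlay: each syllable starts `overlap_ms` before the
    end of the previous one (never earlier than time 0).
    Mechanism (2), compression: each duration is scaled by its factor.
    """
    if len(syllables) != len(factors):
        raise ValueError("need one compression factor per syllable")
    spans = []
    prev_end = 0.0
    for syl, factor in zip(syllables, factors):
        if not 0.0 < factor <= 1.0:
            raise ValueError("compression factor must be in (0, 1]")
        start = max(0.0, prev_end - syl.overlap_ms)  # overlay
        end = start + syl.duration_ms * factor       # compression
        spans.append((syl.label, start, end))
        prev_end = end
    return spans


# Two made-up syllables: the second is overlaid by 40 ms and halved in duration.
demo = [Syllable("syl1", 200.0), Syllable("syl2", 180.0, overlap_ms=40.0)]
for label, start, end in timeline(demo, [1.0, 0.5]):
    print(f"{label}: {start:.0f}-{end:.0f} ms")
```

In this sketch the second syllable begins at 160 ms, inside the first, and lasts 90 ms rather than 180 ms. How ProSynth actually determines the degree of overlay and compression from the prosodic hierarchy is described in the paper, not here.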
Bibliographic reference. Ogden, Richard / Local, John / Carter, Paul (1999): "Temporal interpretation in ProSynth, a prosodic speech synthesis system", In ICPhS-14, 1059-1062.