15th International Congress of Phonetic Sciences (ICPhS-15)
In this paper, we investigate how intonation is used to confirm a word
in English. This intonation type is challenging to model, as it mixes
narrow focus and question with variations based on accent location,
phrasing and speaking rate.
We build a model that predicts the intonation from the text, using an extremely simple intonational phonology. Some of the model's parameters can be interpreted as detailed descriptions of accent shapes, and others as prosodic strengths that carry phrasing information. The RMS deviation is 21 Hz or 1.7 semitones, a result comparable to that of machine learning methods, but with far fewer parameters that need to be learned.
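The abstract reports its error figure in both Hz and semitones. As a minimal sketch of how such a figure could be computed, assuming paired lists of predicted and observed F0 values in Hz, the Python below measures the RMS deviation on both scales. The function names and the sample-wise pairing are illustrative assumptions and are not taken from the paper, which does not describe its evaluation procedure here.

```python
import math


def rms_deviation_hz(predicted, observed):
    """RMS difference between two equal-length F0 tracks, in Hz.

    Hypothetical helper: the pairing of samples is an assumption.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("tracks must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def rms_deviation_semitones(predicted, observed):
    """RMS difference between two equal-length F0 tracks, in semitones.

    The interval between two frequencies is 12 * log2(f1 / f2) semitones,
    so no separate reference frequency is needed. All values must be > 0.
    """
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("tracks must be non-empty and of equal length")
    squared = [(12.0 * math.log2(p / o)) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))
```

Because the semitone scale is logarithmic, the same error in Hz corresponds to a larger semitone error at low F0 than at high F0, which is one reason to report both numbers.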
Furthermore, the model handles both fast and slow speech with the same set of parameters in a principled way. The model incorporates some aspects of muscle dynamics, and its ability to predict F0 at different speaking rates is confirmation that an articulatory approach to F0 modeling is appropriate.
Bibliographic reference. Shih, Chilin / Kochanski, Greg (2003): "Modeling intonation: asking for confirmation in English", In ICPhS-15, 551-554.