14th International Congress of Phonetic Sciences (ICPhS-14)

San Francisco, CA, USA
August 1-7, 1999


Speech Synthesis Using A Physiological Articulatory Model with Feature-Based Rules

Jianwu Dang (1,2), Jiping Sun (2), Li Deng (2), Kiyoshi Honda (1)

(1) ATR Human Information Processing Research Labs, Kyoto, Japan
(2) Univ. of Waterloo, Waterloo, ON, Canada

A 3-D computational model of the speech articulators has been developed for human-mimetic speech synthesis. The model geometry was derived from volumetric MRI data collected from one male speaker. A multipoint control strategy was developed for the model, involving three control points on the articulators: the tongue tip, the tongue dorsum, and the jaw. To control these points independently in the geometric space of the vocal tract, a set of weight coefficients is defined for each muscle at each control point. A dynamic muscle workspace is proposed to predict the muscle force vectors required to move a control point to an arbitrary position. Muscle activation signals are generated via the dynamic workspace and fed to the muscles to drive the model. To develop a speech synthesis system based on the physiological model, this study explores a set of feature-based phonological rules that provide temporally overlapping articulatory targets from a given sequence of phonetic segments. Examples of sounds synthesized with the model are given.
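
To make the multipoint control idea concrete, the sketch below shows one way a desired displacement of a control point could be mapped onto weighted muscle force directions to obtain activation levels. The muscle names, force directions, weight values, and the rectified-projection rule are illustrative assumptions for this sketch, not the parameters or algorithm used in the paper.

    # Illustrative sketch only: directions, weights, and the projection rule
    # are assumptions, not the authors' implementation.
    import numpy as np

    # Hypothetical 2-D force directions (midsagittal plane) for a few tongue
    # muscles acting on one control point, e.g. the tongue dorsum.
    MUSCLE_DIRECTIONS = {
        "genioglossus_posterior": np.array([0.6, 0.8]),    # forward / upward
        "genioglossus_anterior":  np.array([-0.2, -0.98]), # downward
        "styloglossus":           np.array([-0.7, 0.7]),   # backward / upward
        "hyoglossus":             np.array([-0.5, -0.85]), # backward / downward
    }

    # Hypothetical weight coefficients for this control point: how strongly
    # each muscle may contribute to moving this particular point.
    WEIGHTS = {
        "genioglossus_posterior": 1.0,
        "genioglossus_anterior":  0.8,
        "styloglossus":           0.9,
        "hyoglossus":             0.7,
    }

    def muscle_activations(displacement):
        """Map a desired control-point displacement (mm) to non-negative
        activation levels by projecting it onto each weighted force direction."""
        activations = {}
        for name, direction in MUSCLE_DIRECTIONS.items():
            unit = direction / np.linalg.norm(direction)
            # Only muscles pulling toward the target receive activation.
            drive = max(0.0, float(np.dot(displacement, unit)))
            activations[name] = WEIGHTS[name] * drive
        return activations

    # Example: move the tongue dorsum 5 mm forward and 3 mm upward.
    print(muscle_activations(np.array([5.0, 3.0])))
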
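The feature-based rules are described as turning a phone sequence into temporally overlapping articulatory targets. The following minimal sketch illustrates that idea as a data structure; the feature inventory, segment durations, and overlap value are hypothetical and stand in for the paper's actual phonological rules.

    # Illustrative sketch only: the feature table, durations, and overlap rule
    # are assumptions, not the paper's feature-based phonological rules.
    from dataclasses import dataclass

    @dataclass
    class ArticulatoryTarget:
        feature: str   # e.g. "tongue_tip_closure", "lip_rounding"
        value: float   # hypothetical target value for a control point
        start: float   # onset time in seconds
        end: float     # offset time in seconds

    def targets_for_segments(segments, dur=0.12, overlap=0.04):
        """Turn a phone sequence into feature targets whose time spans extend
        into the next segment, so that articulatory gestures overlap."""
        feature_table = {
            "t": [("tongue_tip_closure", 1.0)],
            "a": [("tongue_dorsum_low", 1.0), ("jaw_open", 0.8)],
            "u": [("tongue_dorsum_back", 1.0), ("lip_rounding", 0.9)],
        }
        targets = []
        for i, seg in enumerate(segments):
            t0 = i * dur
            for feat, val in feature_table.get(seg, []):
                targets.append(ArticulatoryTarget(feat, val, t0, t0 + dur + overlap))
        return targets

    for tgt in targets_for_segments(["t", "a"]):
        print(tgt)
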


Bibliographic reference.  Dang, Jianwu / Sun, Jiping / Deng, Li / Honda, Kiyoshi (1999): "Speech synthesis using a physiological articulatory model with feature-based rules", In ICPhS-14, 2267-2270.