14th International Congress of Phonetic Sciences (ICPhS-14)

San Francisco, CA, USA
August 1-7, 1999


Using Speech Acoustics to Drive Facial Motion

Hani Yehia (1), Takaaki Kuratate (2), Eric Vatikiotis-Bateson (2)

(1) Universidade Federal de Minas Gerais, Dept. Eng. Eletronica, Brazil
(2) ATR Human Information Processing Research Laboratories, Japan

This paper describes and evaluates a method for estimating facial motion during speech from the speech acoustics. It is a statistical method based on simultaneous measurements of facial motion and speech acoustics. Experiments were carried out for one American English speaker and one Japanese speaker. Facial motion is characterized by the 3D positions of markers placed on the face and tracked at 60 frames/s. The speech acoustics are characterized by line spectrum pair (LSP) parameters. The method rests on two points: (i) given appropriate constraints, the vocal-tract shape can be estimated from the speech acoustics; and (ii) most facial motion is a consequence of vocal-tract motion. Marker positions and LSP parameters were collected during several utterances and used to train artificial neural networks, which were then evaluated with test data. In the results obtained, approximately 85% of the facial motion variance was determined from the speech acoustics.
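The acoustics-to-motion mapping described above can be illustrated with a short, hypothetical sketch. The code below is not the authors' implementation; it simply trains a small feed-forward network to map LSP feature vectors to 3D marker coordinates and reports the fraction of motion variance recovered on held-out frames. The frame count, LSP order, marker count, network size, and the random placeholder data are all assumptions made for illustration.

    import numpy as np
    from sklearn.neural_network import MLPRegressor
    from sklearn.metrics import r2_score

    # Placeholder dimensions (assumed, for illustration only).
    N_FRAMES  = 6000   # e.g. a few minutes of data tracked at 60 frames/s
    N_LSP     = 10     # order of the LSP analysis
    N_MARKERS = 18     # face markers; each contributes x, y, z coordinates

    # Random placeholder data standing in for the simultaneously recorded
    # LSP vectors (time-aligned to the 60 Hz marker track) and marker positions.
    lsp     = np.random.randn(N_FRAMES, N_LSP)
    markers = np.random.randn(N_FRAMES, 3 * N_MARKERS)

    # Simple 80/20 split into training and test frames.
    split = int(0.8 * N_FRAMES)
    X_train, X_test = lsp[:split], lsp[split:]
    Y_train, Y_test = markers[:split], markers[split:]

    # A small feed-forward network mapping acoustics to facial motion.
    net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
    net.fit(X_train, Y_train)

    # Variance-weighted R^2: the fraction of facial-motion variance
    # recovered from the acoustics on the held-out frames.
    print("variance explained:", r2_score(Y_test, net.predict(X_test),
                                          multioutput="variance_weighted"))

With the random placeholder data the reported score is of course meaningless; on the real, time-aligned recordings the paper reports that roughly 85% of the facial motion variance is recovered from the acoustics.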


Bibliographic reference. Yehia, Hani / Kuratate, Takaaki / Vatikiotis-Bateson, Eric (1999): "Using speech acoustics to drive facial motion", in ICPhS-14, 631-634.