15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003

Extensible Infrastructure for a 3D Face and Vocal-Tract Model

Florian Vogt, S. Sidney Fels, Bryan Gick, Carol Jaeger, Ian Wilson

University of British Columbia, Canada

We describe an architecture for a combined 3D face and vocal-tract animation simulator for articulatory speech synthesis. The architecture provides five main modules: 1. a simulator engine, 2. a 3D geometry module, 3. a graphical user interface (GUI) module, 4. a synthesis engine, and 5. a numerics engine. Elements of the model are specified as nodes placed hierarchically in a scene graph. The simulator engine traverses the nodes of the scene graph to create the animation and drive the articulatory synthesis.
   The structure of the architecture is motivated in part by the recognition that many researchers have worked extensively on separate aspects of vocal tract modelling, face modelling, and articulation-based speech synthesis. Our architecture is meant to make it easy to combine models of different structures and levels of detail from different research groups, providing a testbed for articulatory-based speech research and production. Our ultimate aim is a fully functioning 3D vocal tract model that uses aeroacoustic models to produce speech.
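
The scene-graph traversal described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class `Node`, the functions `traverse`, `build_scene`, and `step`, the node names, and the parent-before-child visiting order do not come from the paper and are not the authors' implementation.

```python
from typing import Callable, List, Optional


class Node:
    """Hypothetical scene-graph node (illustrative, not the paper's API)."""

    def __init__(self, name: str,
                 update: Optional[Callable[[float], None]] = None) -> None:
        self.name = name
        self.update = update          # optional per-step callback
        self.children: List["Node"] = []

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child


def traverse(node: Node, dt: float, visited: List[str]) -> None:
    """Depth-first traversal: update the parent, then each child in order."""
    if node.update is not None:
        node.update(dt)
    visited.append(node.name)
    for child in node.children:
        traverse(child, dt, visited)


def build_scene() -> Node:
    """Build a small, assumed hierarchy of face and vocal-tract elements."""
    root = Node("head")
    face = root.add(Node("face"))
    face.add(Node("lips"))
    tract = root.add(Node("vocal_tract"))
    tract.add(Node("tongue"))
    tract.add(Node("jaw"))
    return root


def step(root: Node, dt: float = 0.01) -> List[str]:
    """Run one simulation step and return the order in which nodes were visited."""
    order: List[str] = []
    traverse(root, dt, order)
    return order
```

Calling `step(build_scene())` visits the nodes parent first, so a sub-model from one research group could in principle be swapped in by replacing a single subtree, which is the kind of extensibility the abstract aims for.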

Bibliographic reference.  Vogt, Florian / Fels, S. Sidney / Gick, Bryan / Jaeger, Carol / Wilson, Ian (2003): "Extensible infrastructure for a 3d face and vocal-tract model", In ICPhS-15, 2345-2348.