15th International Congress of Phonetic Sciences (ICPhS-15)
We describe an architecture for a combined 3D face and vocal tract
animation simulator for articulatory speech synthesis. The architecture
provides five main modules: 1. a simulator engine, 2. a 3D geometry
module, 3. a graphical user interface (GUI) module, 4. a synthesis engine,
and 5. a numerics engine. Elements of the model are specified using
nodes placed hierarchically in a scene graph. Traversal of the nodes
in the scene graph by the simulator engine creates the animation and
drives the articulatory synthesis.
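
As a rough illustration of the scene-graph idea described above, the following minimal Python sketch shows a simulator engine visiting hierarchically placed nodes once per time step. It is not the authors' implementation: the class names (`Node`, `JawNode`, `SimulatorEngine`), the `update` method, the node names, and the toy jaw-opening rule are all assumptions made up for this example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical scene-graph node; subclasses override update()."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it, so graphs can be built inline."""
        self.children.append(child)
        return child

    def update(self, t: float) -> None:
        """Advance this node's state to simulation time t (no-op by default)."""


@dataclass
class JawNode(Node):
    """Toy articulator: opening grows with time and is capped at 1.0."""
    opening: float = 0.0

    def update(self, t: float) -> None:
        self.opening = min(1.0, t)


class SimulatorEngine:
    """Hypothetical engine that drives the model by traversing the graph."""

    def __init__(self, root: Node) -> None:
        self.root = root
        self.time = 0.0

    def step(self, dt: float) -> List[str]:
        """Advance time by dt and update every node, parents before children."""
        self.time += dt
        visited: List[str] = []
        self._traverse(self.root, visited)
        return visited

    def _traverse(self, node: Node, visited: List[str]) -> None:
        node.update(self.time)
        visited.append(node.name)
        for child in node.children:
            self._traverse(child, visited)


def demo():
    """Build a small assumed hierarchy and run one simulation step."""
    root = Node("head")
    root.add(Node("face"))
    tract = root.add(Node("vocal_tract"))
    jaw = tract.add(JawNode("jaw"))
    engine = SimulatorEngine(root)
    order = engine.step(0.25)
    return order, jaw
```

In this sketch, a model from another research group would be added by attaching a new `Node` subclass to the graph, leaving the engine unchanged. A synthesis module could be driven in the same traversal, for example by reading articulator state such as `jaw.opening` after each step.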
Part of the motivation for the architecture's structure is the recognition that many researchers have worked extensively on separate aspects of vocal tract modelling, face modelling, and articulation-based speech synthesis. Our architecture is meant to make it easy to combine models of different structures and levels of detail from different research groups, providing a testbed for articulatory-based speech research and production. Our ultimate aim is a fully functioning 3D vocal tract model that uses aeroacoustic models to produce speech.
Bibliographic reference. Vogt, Florian / Fels, S. Sidney / Gick, Bryan / Jaeger, Carol / Wilson, Ian (2003): "Extensible infrastructure for a 3d face and vocal-tract model", In ICPhS-15, 2345-2348.