15th International Congress of Phonetic Sciences (ICPhS-15), Barcelona, Spain
We describe an architecture for a combined 3D face and vocal tract
animation simulator for articulatory speech synthesis. The architecture
provides five main modules: (1) a simulator engine, (2) a 3D geometry
module, (3) a graphical user interface (GUI) module, (4) a synthesis engine,
and (5) a numerics engine. Elements of the model are specified using
nodes placed hierarchically in a scene graph. Traversal of the nodes
in the scene graph by the simulator engine creates the animation and
drives the articulatory synthesis.
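
As a rough illustration of the scene-graph idea described above, the following minimal Python sketch shows nodes arranged hierarchically and a simulator loop that traverses them once per time step. It is not the authors' implementation: the names `Node`, `traverse`, `build_scene`, and `simulate`, the node labels, and the time-step value are all hypothetical, and the `update` method merely records the visit where a real node would update geometry or feed the synthesis engine.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical scene-graph node (illustrative only, not from the paper)."""
    name: str
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a child node and return it so calls can be chained."""
        self.children.append(child)
        return child

    def update(self, t: float, log: List[str]) -> None:
        # Placeholder: a real node might deform a mesh or update an
        # articulatory parameter here. We only record the visit.
        log.append(f"{t:.2f}:{self.name}")


def traverse(node: Node, t: float, log: List[str]) -> None:
    """Visit a node, then its children, depth-first, for one time step."""
    node.update(t, log)
    for child in node.children:
        traverse(child, t, log)


def build_scene() -> Node:
    """Build a small assumed hierarchy: head -> face, vocal tract -> parts."""
    root = Node("head")
    face = root.add(Node("face"))
    face.add(Node("lips"))
    tract = root.add(Node("vocal_tract"))
    tract.add(Node("tongue"))
    tract.add(Node("jaw"))
    return root


def simulate(root: Node, steps: int, dt: float) -> List[str]:
    """Stand-in for the simulator engine: traverse the graph every step."""
    log: List[str] = []
    for i in range(steps):
        traverse(root, i * dt, log)
    return log


if __name__ == "__main__":
    for entry in simulate(build_scene(), steps=2, dt=0.01):
        print(entry)
```

In this sketch a model from another research group would be added by attaching its own node subtree to the graph, which is one plausible reading of how a hierarchical scene graph supports combining models of differing detail.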
Part of the motivation for the
structure of the architecture is the recognition that many researchers
have worked extensively on separate aspects of vocal tract modelling,
face modelling, and articulatory speech synthesis. Our architecture is
meant to make it easy to combine models of different structures and
levels of detail from different research groups, providing a testbed
for articulatory-based speech research and production. Our ultimate aim
is to have a fully functioning
3D vocal tract model that uses aeroacoustic models to produce speech.
Bibliographic reference. Vogt, Florian / Fels, S. Sidney / Gick, Bryan / Jaeger, Carol / Wilson, Ian (2003): "Extensible infrastructure for a 3d face and vocal-tract model", In ICPhS-15, 2345-2348.