14th International Congress of Phonetic Sciences (ICPhS-14)

San Francisco, CA, USA
August 1-7, 1999

Prosodic Feature Evaluation: Brute Force or Well Designed?

Anton Batliner, Jan Buckow, Richard Huber, Volker Warnke, Elmar Nöth, Heinrich Niemann

University of Erlangen-Nuremberg, Chair for Pattern Recognition, Erlangen, Germany

In this paper we want to bridge the gap between phonetic/phonological theory on the one hand and automatic speech processing on the other. As material, we use a subset of the German VERBMOBIL database that is annotated with prosodic boundary and accent information. We computed a large prosodic feature vector: 276 features for a context window of up to five words, modelling duration, energy, tempo, pauses, and linguistic information on the word level. Linear Discriminant Analysis (LDA) was used to minimize the number of features without too much loss in classification performance. The number of features could be reduced drastically, from 276 to 11 for boundaries and to 6 for accents, while the overall classification rate dropped by only some two to three percent. We discuss the 'surviving' relevant features as well as the limitations of this approach.
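As an illustration of the general idea of reducing a feature vector with LDA, the following is a minimal sketch, not the authors' procedure. The abstract does not say how the features were selected, so the greedy forward selection used here is an assumption, as are the synthetic data, the two-class setup, and all function names (`fit_lda`, `accuracy`, `forward_select`), which are hypothetical.

```python
import numpy as np


def fit_lda(X, y):
    """Fit a two-class LDA; return the weight vector w and the offset b."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance, lightly regularised for stability.
    S0 = np.atleast_2d(np.cov(X0, rowvar=False)) * (len(X0) - 1)
    S1 = np.atleast_2d(np.cov(X1, rowvar=False)) * (len(X1) - 1)
    Sw = (S0 + S1) / (len(X) - 2) + 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, m1 - m0)
    b = -0.5 * w @ (m0 + m1)  # assumption: equal class priors
    return w, b


def accuracy(X, y, w, b):
    """Fraction of samples whose LDA decision matches the label."""
    return float(np.mean((X @ w + b > 0).astype(int) == y))


def forward_select(X_tr, y_tr, X_va, y_va, k):
    """Greedily add the feature that most improves validation accuracy."""
    selected, remaining = [], list(range(X_tr.shape[1]))
    for _ in range(k):
        scores = []
        for j in remaining:
            cols = selected + [j]
            w, b = fit_lda(X_tr[:, cols], y_tr)
            scores.append(accuracy(X_va[:, cols], y_va, w, b))
        best = remaining[int(np.argmax(scores))]
        selected.append(best)
        remaining.remove(best)
    return selected


# Synthetic stand-in data (assumption): 30 features, only 3 informative.
rng = np.random.default_rng(0)
n, d = 4000, 30
y = rng.integers(0, 2, n)
X = rng.normal(size=(n, d))
X[:, [2, 7, 11]] += 1.5 * y[:, None]

# First half for training, second half for validation.
X_tr, y_tr, X_va, y_va = X[: n // 2], y[: n // 2], X[n // 2 :], y[n // 2 :]

selected = forward_select(X_tr, y_tr, X_va, y_va, k=3)

w_full, b_full = fit_lda(X_tr, y_tr)
acc_full = accuracy(X_va, y_va, w_full, b_full)

w_red, b_red = fit_lda(X_tr[:, selected], y_tr)
acc_reduced = accuracy(X_va[:, selected], y_va, w_red, b_red)

print("selected features:", sorted(selected))
print("accuracy, all features:", acc_full)
print("accuracy, reduced set:", acc_reduced)
```

On data like this, a small subset can be compared against the full vector by looking at `acc_full` and `acc_reduced`. In a real setting the selection and the final evaluation should use separate held-out data, since choosing features on the validation set makes its accuracy optimistic.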


Bibliographic reference.  Batliner, Anton / Buckow, Jan / Huber, Richard / Warnke, Volker / Nöth, Elmar / Niemann, Heinrich (1999): "Prosodic feature evaluation: brute force or well designed?", In ICPhS-14, 2315-2318.