14th International Congress of Phonetic Sciences (ICPhS-14)

San Francisco, CA, USA
August 1-7, 1999

Evaluating Representations of Segment Level Dynamics in Acoustic-Phonetic Mapping

Dave Davies, J. Bruce Millar

Computer Sciences Laboratory, Research School of Information Sciences and Engineering, Australian National University, Australia

The time domain of phonetic events is examined with a view to proposing alternatives to two conventions: the regular clock-timed representation of the acoustic analysis of speech, and the use of simple time derivatives to represent temporal information in the acoustic vector normally fed to automatic speech recognition systems. A novel approach to incorporating temporal information within sequential acoustic vectors is introduced. Basic acoustic parameters derived using a source-synchronous analysis technique are combined with a coded representation of their temporal environment. The concept of similarity length is described and elaborated in various forms that can be applied to the description of sequences of speech up to the phonetic segment level. A phoneme-recognition-based evaluation criterion is developed to assess how efficiently such acoustic vectors represent the acoustic-phonetic mapping. Experiments that apply this analysis and evaluation to the acoustic representation of stop consonants are described. Results are presented as rankings of individual stop consonants against all other phonemes, obtained using simple acoustic parameters with and without the addition of temporal information.
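For readers unfamiliar with the baseline mentioned above, the following is a minimal sketch of the "simple time derivatives" commonly appended to clock-timed acoustic vectors, using the widely used regression (delta) formula. It illustrates the conventional practice the abstract contrasts itself with, not the authors' similarity-length method, which the abstract does not specify. The function names `delta` and `append_deltas`, the default `window` of 2 frames, and the edge handling (repeating the first and last frame) are assumptions made for illustration.

```python
def delta(frames, window=2):
    """Regression-based time derivative of a sequence of feature vectors.

    frames: list of equal-length lists of floats, one per analysis frame.
    window: number of frames on each side used in the regression (assumed).
    Edges are handled by repeating the first and last frame.
    """
    if not frames:
        return []
    denom = 2 * sum(n * n for n in range(1, window + 1))
    last = len(frames) - 1
    dim = len(frames[0])
    out = []
    for t in range(len(frames)):
        d = [0.0] * dim
        for n in range(1, window + 1):
            ahead = frames[min(t + n, last)]   # clamp at the final frame
            behind = frames[max(t - n, 0)]     # clamp at the first frame
            for k in range(dim):
                d[k] += n * (ahead[k] - behind[k])
        out.append([v / denom for v in d])
    return out


def append_deltas(frames, window=2):
    """Concatenate each frame with its delta to form the augmented vector."""
    return [list(f) + d for f, d in zip(frames, delta(frames, window))]
```

Each augmented vector carries only a local slope estimate around its own frame, which is the kind of limited temporal information the approach described in the abstract seeks to go beyond.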


Bibliographic reference. Davies, Dave / Millar, J. Bruce (1999): "Evaluating representations of segment level dynamics in acoustic-phonetic mapping", in ICPhS-14, 1105-1108.