14th International Congress of Phonetic Sciences (ICPhS-14)
San Francisco, CA, USA
Coarticulation in speech is one of the most difficult problems for automatic speech recognition (ASR) systems. The degree of coarticulation is assumed to vary with contextual conditions, such as differences in speaking rate and stress. In the past, coarticulation has been studied using only limited data sets and acoustic-phonetic methods such as formant analysis. We propose a method that statistically analyzes the degree of coarticulatory influence on features typically used in automatic speech recognition systems (LPCs, MFCCs, RASTA, and compressed subband spectral envelopes). This method computes the Conditional Mutual Information (CMI) between time/feature-position pairs under a variety of coarticulatory conditions. We applied this method to a two-hour subset of the Switchboard database and analyzed CMI for various speaking rate, stress, and vowel category conditions. Results show that CMI is indeed larger for those phonetic conditions believed to possess more coarticulation.
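To make the central quantity concrete, the following is a minimal sketch of how a conditional mutual information between two time/feature positions might be estimated from data. It is not the estimator used in the paper, which the abstract does not specify. The histogram-based plug-in estimate, the bin count, and the names `mutual_information` and `conditional_mutual_information` are all illustrative assumptions. Here `x` and `y` stand for the values of a feature at two time/feature positions, and `c` holds a discrete condition label (for example a speaking-rate or stress class) for each sample.

```python
import numpy as np


def mutual_information(x, y, n_bins=8):
    """Plug-in estimate of I(X; Y) in bits from paired scalar samples.

    Assumption: values are quantized with an equal-width 2-D histogram.
    This estimator is biased upward for small sample sizes.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    joint, _, _ = np.histogram2d(x, y, bins=n_bins)
    p_xy = joint / joint.sum()              # joint distribution
    p_x = p_xy.sum(axis=1, keepdims=True)   # marginal of X, shape (n_bins, 1)
    p_y = p_xy.sum(axis=0, keepdims=True)   # marginal of Y, shape (1, n_bins)
    nz = p_xy > 0                           # skip empty cells (0 * log 0 = 0)
    return float(np.sum(p_xy[nz] * np.log2(p_xy[nz] / (p_x @ p_y)[nz])))


def conditional_mutual_information(x, y, c, n_bins=8):
    """Estimate I(X; Y | C) as the probability-weighted average of the
    per-condition mutual information values, for a discrete condition C."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    c = np.asarray(c)
    total = 0.0
    for cond in np.unique(c):
        mask = c == cond
        # mask.mean() is the empirical probability of this condition.
        total += mask.mean() * mutual_information(x[mask], y[mask], n_bins)
    return total
```

Under these assumptions, calling `mutual_information` separately on the samples from each condition gives per-condition values that can be compared, in the spirit of comparing conditions believed to involve more or less coarticulation.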
Bibliographic reference. Kirchhoff, Katrin / Bilmes, Jeff A. (1999): "Statistical acoustic indications of coarticulation", In ICPhS-14, 1729-1732.