14th International Congress of Phonetic Sciences (ICPhS-14)
San Francisco, CA, USA
A model for deriving perceived local speech rate directly from the speech signal is developed on the basis of perception experiments. Because local speech rate modifies acoustic cues (e.g. transitions and voice-onset time), phones, syllables, and even words, it is one of the most important prosodic cues. Our local speech rate estimation method is based on a linear combination of the local syllable rate and the local phone rate, since earlier investigations strongly suggest that neither the syllable rate nor the phone rate on its own represents the speech rate sufficiently. Effects of F0 level and F0 movement on speech rate perception have been reported in the literature, so we also included these cues in our linear combination model. Our results show 1) that the duration of speech stimuli has a strong influence on the perception of speech rate, 2) that the linear combination of local syllable rate and phone rate is well correlated with perceptual local speech rate (r = 0.91), 3) that F0 measurements did not increase the accuracy of the model, and 4) that our method is able to calculate the perceptual local speech rate and the relative local speech rate between two utterances.
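The linear combination described in the abstract can be illustrated with a minimal sketch. It assumes that local syllable rate, local phone rate, and perceived rate are already available as per-stimulus numbers, and it fits the weights by ordinary least squares. The abstract does not state the fitting procedure or the weights, so the function names, the intercept term, and all numbers below are hypothetical and serve only as illustration.

```python
from math import sqrt


def _mean(values):
    return sum(values) / len(values)


def fit_rate_model(syllable_rate, phone_rate, perceived_rate):
    """Least-squares fit of perceived ~ a*syllable_rate + b*phone_rate + c.

    Hypothetical helper: solves the 2x2 normal equations on centered data.
    """
    m1, m2, my = _mean(syllable_rate), _mean(phone_rate), _mean(perceived_rate)
    x1 = [v - m1 for v in syllable_rate]
    x2 = [v - m2 for v in phone_rate]
    y = [v - my for v in perceived_rate]
    s11 = sum(u * u for u in x1)
    s22 = sum(u * u for u in x2)
    s12 = sum(u * v for u, v in zip(x1, x2))
    s1y = sum(u * v for u, v in zip(x1, y))
    s2y = sum(u * v for u, v in zip(x2, y))
    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("syllable rate and phone rate are collinear")
    a = (s22 * s1y - s12 * s2y) / det
    b = (s11 * s2y - s12 * s1y) / det
    c = my - a * m1 - b * m2  # intercept restores the uncentered means
    return a, b, c


def predict_rate(coeffs, syllable_rate, phone_rate):
    """Apply the fitted linear combination to each stimulus."""
    a, b, c = coeffs
    return [a * s + b * p + c for s, p in zip(syllable_rate, phone_rate)]


def pearson_r(xs, ys):
    """Pearson correlation between two equally long sequences."""
    mx, my = _mean(xs), _mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up toy values (syllables/s, phones/s), not data from the paper.
syllable_rate = [4.0, 5.0, 6.0, 7.0, 5.5]
phone_rate = [10.0, 13.0, 14.0, 18.0, 12.0]
# Toy "perceived" rates generated from arbitrary weights, for illustration only.
perceived = [0.5 * s + 0.2 * p + 1.0 for s, p in zip(syllable_rate, phone_rate)]

coeffs = fit_rate_model(syllable_rate, phone_rate, perceived)
predicted = predict_rate(coeffs, syllable_rate, phone_rate)
print("weights (a, b, c):", coeffs)
print("correlation with perceived rate:", pearson_r(predicted, perceived))
```

With real perception data, the correlation between predicted and perceived rate is the kind of figure the abstract reports as r = 0.91. The toy data here are noise-free, so they only show the mechanics of the fit.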
Bibliographic reference. Pfitzinger, Hartmut R. (1999): "Local speech rate perception in German speech", In ICPhS-14, 893-896.