15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003

Comparison of Several Proposed Perceptual Representations of Vowel Spectra

Terrance M. Nearey (1), Michael Kiefte (2)

(1) University of Alberta, Canada
(2) Dalhousie University, Canada

We report some results of modeling the categorization of a three-formant synthetic vowel continuum by speakers of English and of Finnish. We focus here on assessing the relative merits of initial representations based on a standard 3-dimensional formant-frequency space compared to those based on 1) a 2-dimensional F1 by F2-prime space, and 2) Hermansky's PLP5. PLP5 is a 5-dimensional representation of spectral shape that manifests an integration of closely spaced formants in a manner suggestive of F2-prime. Linear logistic models suggest that PLP5 modestly outperforms 3 formants, which in turn substantially outperforms F1 by F2-prime. However, if quadratic rather than linear methods are allowed, the 3-formant space provides a better account of listeners' response patterns than do other models of comparable complexity. This result is confirmed using modern model selection techniques.
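To make the phrase "models of comparable complexity" concrete, the sketch below counts free parameters for linear and quadratic multinomial logistic models over each of the three representations. It is an illustration only, not the authors' procedure: the number of response categories (`N_CATEGORIES`) is a hypothetical placeholder, the helper names are invented here, and AIC is assumed as one common model selection criterion, since the abstract does not say which techniques were used.

```python
def n_params(dim, n_categories, quadratic=False):
    """Free parameters of a multinomial logistic model with one reference category."""
    per_contrast = 1 + dim  # intercept plus one linear coefficient per dimension
    if quadratic:
        # squared terms plus pairwise cross-products of the input dimensions
        per_contrast += dim * (dim + 1) // 2
    return (n_categories - 1) * per_contrast


def aic(log_likelihood, k):
    """Akaike information criterion (assumed here; lower is better)."""
    return 2 * k - 2 * log_likelihood


# Dimensionalities stated in the abstract.
REPRESENTATIONS = {"F1 x F2-prime": 2, "3 formants": 3, "PLP5": 5}

# Hypothetical number of vowel response categories; not given in the abstract.
N_CATEGORIES = 5

for name, dim in REPRESENTATIONS.items():
    linear = n_params(dim, N_CATEGORIES)
    quad = n_params(dim, N_CATEGORIES, quadratic=True)
    print(f"{name}: linear={linear} parameters, quadratic={quad} parameters")
```

Under these assumptions, a quadratic model on F1 by F2-prime and a linear model on PLP5 have the same number of parameters per category contrast (six), while a quadratic 3-formant model has ten, so a penalized criterion such as AIC is one way to compare fits across representations of different size.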

Bibliographic reference.  Nearey, Terrance M. / Kiefte, Michael (2003): "Comparison of several proposed perceptual representations of vowel spectra", In ICPhS-15, 1005-1008.