15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003

Agreement and Reliability of Voice Quality Ratings in Anchored and Unanchored Protocols

Guus de Krom, Rianneke Crielaard

University of Utrecht, The Netherlands

Ratings of roughness, breathiness, strain, and asthenicity were obtained under two protocols: an unanchored protocol, in which listeners used conventional 7-point semantic interval scales, and an anchored protocol, in which listeners rated the same aspects with the aid of an example human voice for each combination of aspect and scale gradation. To assess intrarater (test-retest) consistency, listeners rated each voice twice in both protocols. Measures of rating agreement within and between listeners were computed, as well as variance estimates at different levels of the data hierarchy (i.e., listeners, speakers, and replicated ratings of speakers by listeners). Rating reliability coefficients were computed from these variance estimates for both protocols. For all four aspects, interlistener agreement and rating reliability were higher in the anchored protocol than in the unanchored protocol. The higher reliability could generally be attributed to a reduction in interlistener variance, although for ratings on the asthenicity scale anchoring also increased the speaker variance. These results suggest that anchoring with natural human voice fragments may help to improve the reliability of voice quality ratings.
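
The relation between variance estimates and rating reliability can be illustrated with a minimal sketch. The abstract does not state which coefficient was used, so the formula below is an assumption: a generic variance-ratio coefficient in which speaker variance counts as signal and listener plus replication variance counts as error. The function name `reliability` and all numbers are hypothetical and are not results from the study.

```python
def reliability(var_speaker, var_listener, var_residual, n_listeners=1):
    """Share of rating variance attributable to speakers.

    Assumed form (not taken from the paper): speaker variance divided by
    speaker variance plus error, where error is the listener variance plus
    the residual (replication) variance, averaged over n_listeners.
    """
    if n_listeners < 1:
        raise ValueError("n_listeners must be at least 1")
    error = (var_listener + var_residual) / n_listeners
    total = var_speaker + error
    if total <= 0:
        raise ValueError("total variance must be positive")
    return var_speaker / total


# Hypothetical variance estimates, chosen only to show the two mechanisms
# named in the abstract.
baseline = reliability(var_speaker=2.0, var_listener=1.0, var_residual=1.0)
less_listener_var = reliability(var_speaker=2.0, var_listener=0.5, var_residual=1.0)
more_speaker_var = reliability(var_speaker=3.0, var_listener=1.0, var_residual=1.0)

print(f"baseline:                  {baseline:.3f}")
print(f"lower listener variance:   {less_listener_var:.3f}")
print(f"higher speaker variance:   {more_speaker_var:.3f}")
```

Under this assumed form, the coefficient rises either when interlistener variance shrinks or when speaker variance grows, which are the two effects the abstract attributes to anchoring.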

