15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003


Precision Voice Analysis in Speaking, Singing and Pathology

Adrian Fourcin (1), Gea DeJong (2)

(1) University College London, UK
(2) City University, UK

Temporal periodicity is the main physical correlate used experimentally in relating perceived pitch to vocal fold frequency. The accuracy of most measurement methods does not, however, match the pitch difference limens of normal hearing.
   The experimental results presented here concern the observations, and the resulting understanding, that can be derived when routine analyses correspond to a 0.1% level of frequency difference detection at 1 kHz.
   Three experimental situations are briefly discussed. The first two concern the relation between sustained vowel production and the modal structures found in the analysis of representative samples of voice in connected speech for two groups of subjects. Two of the main factors that contribute to our ability to control voice pitch are auditory processing and proprioceptive laryngeal feedback. The analysis of representative samples of fluent speech makes it possible to define the main modal values of vocal fold vibration available to a speaker. In order to maintain pitch stability in a sustained sound, we have found that the normal speaker chooses a dominant normal modal value of vocal fold vibration, defined from within the characteristics of connected speech. The normally hearing speaker with a pathological voice adopts a related strategy. For such speakers, however, the choice of dominant mode may lead to stably controlled voice pitch but at a quite abnormal vibratory frequency. These results have practical implications for the clinical management of voice pathology, since it is not present practice to relate estimates of degree of pathology derived from sustained vowels to measurements of connected speech. The third set of related experiments concerns the links between singing and auditory monitoring. There is an inverse relation between our ability to detect pitch change and the duration of the sound. This leads to an approximately tenfold difference between detectable pitch changes in sung vowels and the intonational changes of conversational speech. Crossplot (period-by-period) analyses of voice frequency irregularity in speech and singing lead, in consequence, to large predictable differences between these modalities of production.
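
   As an illustration of the period-by-period crossplot idea mentioned in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes that vocal fold periods have already been measured in seconds; the function names, the tolerance value (set here to the 0.1% level quoted above) and the irregularity definition (the share of successive cycle pairs whose frequencies differ by more than the tolerance) are all hypothetical choices made for this example.

```python
def crossplot_pairs(periods_s):
    """Pair each cycle's frequency with that of the next cycle.

    periods_s: successive vocal fold periods in seconds (assumed input).
    Returns a list of (f_n, f_n_plus_1) tuples in Hz, i.e. the points
    that a period-by-period crossplot would display.
    """
    freqs = [1.0 / p for p in periods_s]
    return list(zip(freqs[:-1], freqs[1:]))


def irregularity(periods_s, tolerance=0.001):
    """Fraction of successive cycle pairs lying off the crossplot diagonal.

    A pair counts as irregular when the relative frequency change between
    the two cycles exceeds `tolerance` (0.001 = 0.1%). This definition is
    an assumption made for illustration only.
    """
    pairs = crossplot_pairs(periods_s)
    if not pairs:
        return 0.0
    off_diagonal = sum(1 for f1, f2 in pairs if abs(f2 - f1) / f1 > tolerance)
    return off_diagonal / len(pairs)


if __name__ == "__main__":
    steady = [0.005] * 10                      # constant 200 Hz, all points on the diagonal
    jittery = [0.005, 0.0051, 0.005, 0.0051]   # about 2% cycle-to-cycle change
    print(irregularity(steady))
    print(irregularity(jittery))
```

   With a tolerance this tight, the measure is only meaningful if the period estimates themselves are accurate to better than 0.1%, which is the point made about measurement precision in the opening paragraph.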

Bibliographic reference.  Fourcin, Adrian / DeJong, Gea (2003): "Precision voice analysis in speaking, singing and pathology", In ICPhS-15, 2365-2368.