15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003

Phonetic Transcription of Large Speech Corpora: How to Boost Efficiency Without Affecting Quality

Diana Binnenpoorte, Catia Cucchiarini

University of Nijmegen, The Netherlands

This paper reports on an experiment aimed at improving the automatically generated phonetic transcription (AGT) of the Spoken Dutch Corpus (CGN). Several techniques for improving the AGT are explored, and the resulting AGTs are compared with a reference transcription to determine their quality. The results indicate that implementing phonological rules improves the AGT for all speech styles considered in the experiment. Applying automatic speech recognition (ASR) techniques to model phonological rules that are less frequent in continuous speech reduces the number of substitution errors.
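The abstract does not state how the AGTs were scored against the reference transcription. As an illustration only, the sketch below shows one common way to make such a comparison: a minimum-edit-distance alignment of two phone sequences that counts substitutions, deletions, and insertions. The function names `align_counts` and `disagreement`, the unit costs, and the example phone strings are assumptions made for this sketch; they are not taken from the paper.

```python
def align_counts(reference, hypothesis):
    """Count (substitutions, deletions, insertions) in a minimum-edit-distance
    alignment of two phone sequences. Unit costs are an assumption."""
    n, m = len(reference), len(hypothesis)
    # cost[i][j] = (total errors, subs, dels, ins) for reference[:i] vs hypothesis[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0, 0, 0)
    for i in range(1, n + 1):
        cost[i][0] = (i, 0, i, 0)  # all reference phones deleted
    for j in range(1, m + 1):
        cost[0][j] = (j, 0, 0, j)  # all hypothesis phones inserted
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            t, s, d, k = cost[i - 1][j - 1]
            if reference[i - 1] == hypothesis[j - 1]:
                diagonal = (t, s, d, k)          # match, no error
            else:
                diagonal = (t + 1, s + 1, d, k)  # substitution
            t, s, d, k = cost[i - 1][j]
            deletion = (t + 1, s, d + 1, k)
            t, s, d, k = cost[i][j - 1]
            insertion = (t + 1, s, d, k + 1)
            # Tuples compare on total errors first, so this picks a cheapest path.
            cost[i][j] = min(diagonal, deletion, insertion)
    return cost[n][m][1:]


def disagreement(reference, hypothesis):
    """Percentage of errors relative to the number of reference phones."""
    return 100 * sum(align_counts(reference, hypothesis)) / len(reference)


# Hypothetical phone strings, invented for illustration.
reference = ["l", "o", "p", "@", "n"]
automatic = ["l", "o", "p", "@"]
subs, dels, ins = align_counts(reference, automatic)
print(subs, dels, ins, disagreement(reference, automatic))
```

Keeping the three error types separate, rather than reporting a single distance, is what makes it possible to speak of a change in substitution errors specifically.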

Bibliographic reference. Binnenpoorte, Diana / Cucchiarini, Catia (2003): "Phonetic transcription of large speech corpora: how to boost efficiency without affecting quality", in ICPhS-15, 2981-2984.