14th International Congress of Phonetic Sciences (ICPhS-14)
San Francisco, CA, USA
This paper describes how the performance of a continuous speech recognizer for Dutch has been improved by modeling within-word and cross-word pronunciation variation. Within-word variants were automatically generated by applying five phonological rules to the words in the lexicon. For the within-word method, a significant improvement is found compared to the baseline. Cross-word pronunciation variation was modeled using two different methods: 1) adding cross-word variants directly to the lexicon, and 2) adding only multi-words and their variants to the lexicon. Overall, cross-word method 2 leads to better results than cross-word method 1. The best results were obtained when cross-word method 2 was combined with the within-word method: a relative improvement in word error rate (WER) of 8.8% was found compared to the baseline.
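The within-word step described above (applying optional phonological rules to each lexicon entry to generate pronunciation variants) can be illustrated with a minimal sketch. The abstract does not list the five rules, so the two rules below, the toy phone notation, and the function names `generate_variants` and `expand_lexicon` are all hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
import re
from itertools import product

# Hypothetical optional rules, NOT the five rules used in the paper.
# Each rule is (name, regex over a toy phone string, replacement).
HYPOTHETICAL_RULES = [
    ("final-n-deletion", r"@n$", "@"),  # drop /n/ after word-final schwa
    ("final-t-deletion", r"st$", "s"),  # drop /t/ in a word-final /st/ cluster
]


def generate_variants(canonical, rules=HYPOTHETICAL_RULES):
    """Return the canonical form plus every variant reachable by
    switching each optional rule on or off, as a sorted list."""
    variants = {canonical}
    for choices in product([False, True], repeat=len(rules)):
        form = canonical
        for apply_rule, (_name, pattern, replacement) in zip(choices, rules):
            if apply_rule:
                form = re.sub(pattern, replacement, form)
        variants.add(form)
    return sorted(variants)


def expand_lexicon(lexicon, rules=HYPOTHETICAL_RULES):
    """Map each word to the list of its pronunciation variants."""
    return {word: generate_variants(pron, rules) for word, pron in lexicon.items()}


if __name__ == "__main__":
    # Toy lexicon with made-up transcriptions.
    toy_lexicon = {"lopen": "lop@n", "post": "post", "kat": "kAt"}
    for word, prons in expand_lexicon(toy_lexicon).items():
        print(word, prons)
```

Treating every rule as optional keeps the canonical pronunciation in the lexicon alongside its variants, so the recognizer can choose between them. A word that no rule matches keeps a single entry.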
Bibliographic reference. Kessens, Judith M. / Wester, Mirjam / Strik, Helmer (1999): "Modeling within-word and cross-word pronunciation variation to improve the performance of a Dutch CSR", In ICPhS-14, 1665-1668.