15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003

Automatic Pronunciation Scoring for Language Learning with Stress Detection

Yun Zhu, Tianli Zhao, Jia Liu, Runsheng Liu

Tsinghua University, China

A novel method is presented for automatically assessing the English pronunciation quality of Chinese speakers, intended as part of a Computer-Assisted Language Learning (CALL) system. In this work, the DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus is used to train the models for speech recognition and pronunciation scoring. In addition, a database of nonnative read speech was recorded from 60 Chinese speakers of English. Scores given by 4 expert human listeners serve as the reference for evaluating the different machine scores and as targets for training the algorithms. We use intensity, fundamental frequency and 13 Mel-frequency cepstral coefficients (MFCCs) to build models for scoring the pronunciation of English words. Compared with previous scoring methods, the new HMM-based algorithm not only gives more comprehensive scores but also provides suprasegmental feedback on whether a student's pronunciation is properly stressed.
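
As an illustration of how machine scores might be evaluated against human reference scores, the following minimal Python sketch averages the ratings of several listeners per speaker and computes the Pearson correlation with the machine scores. The abstract does not state which agreement measure was used or how the 4 listeners' ratings were combined, so both choices are assumptions here; the function names and all numbers are hypothetical and are not data from the paper.

```python
from math import sqrt


def mean_human_scores(ratings):
    """Average the listeners' ratings for each speaker.

    `ratings` is a list with one entry per speaker; each entry is the list
    of scores given by the human listeners (4 per speaker in this sketch).
    Averaging is an assumption, not the method stated in the abstract.
    """
    return [sum(r) / len(r) for r in ratings]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical ratings (scale 1-5) from 4 listeners for 5 speakers.
    human_ratings = [
        [4, 5, 4, 4],
        [2, 3, 2, 2],
        [3, 3, 4, 3],
        [5, 5, 4, 5],
        [1, 2, 2, 1],
    ]
    # Hypothetical machine scores for the same 5 speakers.
    machine_scores = [0.81, 0.42, 0.60, 0.93, 0.25]

    reference = mean_human_scores(human_ratings)
    print("human-machine correlation:", pearson(reference, machine_scores))
```

A higher correlation would indicate that a machine score ranks speakers more like the human listeners do; other agreement measures could be substituted in the same place.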

Bibliographic reference. Zhu, Yun / Zhao, Tianli / Liu, Jia / Liu, Runsheng (2003): "Automatic pronunciation scoring for language learning with stress detection", in ICPhS-15, 1529-1532.