15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003

Use of a Large-Scale Spontaneous Speech Corpus in the Study of Linguistic Variation

Kikuo Maekawa, Hanae Koiso, Hideaki Kikuchi, Kiyoko Yoneyama

National Institute for Japanese Language, Japan

The Corpus of Spontaneous Japanese (CSJ) is a large-scale database of spontaneous Japanese. It contains the speech signals and transcriptions of about 7 million words, along with various annotations such as part-of-speech (POS) and phonetic labels. After describing the design of the corpus, this paper evaluates the potential of the CSJ as a resource for the study of linguistic variation.

Bibliographic reference. Maekawa, Kikuo / Koiso, Hanae / Kikuchi, Hideaki / Yoneyama, Kiyoko (2003): "Use of a large-scale spontaneous speech corpus in the study of linguistic variation", in ICPhS-15, 643-646.