15th International Congress of Phonetic Sciences (ICPhS-15)
The Corpus of Spontaneous Japanese (CSJ) is a large-scale database of spontaneous Japanese. It contains speech signals and transcriptions of about 7 million words, along with various annotations such as POS and phonetic labels. This paper describes the design issues of the CSJ and then evaluates its potential as a resource for the study of linguistic variation.
Bibliographic reference. Maekawa, Kikuo / Koiso, Hanae / Kikuchi, Hideaki / Yoneyama, Kiyoko (2003): "Use of a large-scale spontaneous speech corpus in the study of linguistic variation", In ICPhS-15, 643-646.