15th International Congress of Phonetic Sciences (ICPhS-15)
A video database was built for extracting Japanese visemes and for modeling the motions of the speech articulators during utterances. Two high-speed video cameras recorded frontal and lateral views of the lip region at up to 300 frames per second. The recorded utterances are short Japanese sentences, each about 5 seconds long, spoken at a normal rate. The sentence set is phonetically balanced: it includes all Japanese syllables and typical trigram patterns of Japanese phonemes. Experimental configurations and measurement issues are discussed.
Bibliographic reference. Hirayama, Makoto J. (2003): "Making of a Japanese viseme video database by multiple high-speed video observations", In ICPhS-15, 3157-3160.