15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003

Modeling and Perception of Temporal Characteristics in Speech

Yoshinori Sagisaka

Waseda University, Japan

This paper describes the characteristics of segmental duration control and its computational modeling, which we have studied for more than two decades in speech synthesis. These studies not only contribute to prosody control in speech synthesis technology but also give an integrated view of the individual temporal characteristics that have been found in phonetic science. Because the computational model can predict segmental durations in unseen contexts, it provides a new tool for analysis-by-synthesis of temporal characteristics. Furthermore, a series of experimental results on the perceptual characteristics of duration modifications is presented. These perceptual experiments reveal the context dependency of sensitivity to duration errors and a strong correlation between duration errors and loudness, which suggests the existence of a language-universal temporal perception mechanism.

