14th International Congress of Phonetic Sciences (ICPhS-14)
San Francisco, CA, USA
This paper describes a new approach to acoustic-phonetic modelling,
the Hidden Dynamic Model (HDM), which explicitly accounts
for coarticulation and transitions between neighbouring
phonetic segments. Inspired by the fact that speech is produced
by an underlying dynamic system, the HDM learns, from labelled
speech data, a mapping from a hidden dynamic space where
simple dynamic properties exist, to the surface acoustic representation.
In this hidden space, each phone is represented by a single target vector. A simple filter acts as the dynamic system, interpolating between these targets. A trainable non-linear mapping, in the form of a multi-layer perceptron (MLP), maps the resulting hidden dynamic trajectory to an acoustic pattern.
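The three stages described above (per-phone targets, a smoothing filter, and an MLP mapping) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the first-order recursive form of the filter, the coefficient `alpha`, the single tanh hidden layer, the dimensions, and the random (untrained) targets and weights. The function names `target_sequence`, `smooth`, and `mlp` are hypothetical.

```python
import numpy as np

def target_sequence(phone_ids, durations, targets):
    """Hold each phone's single target vector for its duration (in frames)."""
    return np.repeat(targets[phone_ids], durations, axis=0)

def smooth(target_traj, alpha=0.8):
    """Assumed first-order low-pass filter: x[t] = alpha*x[t-1] + (1-alpha)*u[t].

    The filter state is initialised to the first target, so the trajectory
    starts on target and then glides towards each new target in turn.
    """
    x = np.empty_like(target_traj)
    prev = target_traj[0]
    for t, u in enumerate(target_traj):
        prev = alpha * prev + (1.0 - alpha) * u
        x[t] = prev
    return x

def mlp(hidden_traj, W1, b1, W2, b2):
    """One-hidden-layer perceptron mapping hidden frames to acoustic frames."""
    h = np.tanh(hidden_traj @ W1 + b1)
    return h @ W2 + b2

# Illustrative sizes and random parameters (assumptions; a real model would be trained).
rng = np.random.default_rng(0)
n_phones, hidden_dim, n_units, acoustic_dim = 4, 3, 8, 12
targets = rng.normal(size=(n_phones, hidden_dim))
W1 = rng.normal(size=(hidden_dim, n_units))
b1 = np.zeros(n_units)
W2 = rng.normal(size=(n_units, acoustic_dim))
b2 = np.zeros(acoustic_dim)

# A hypothetical three-phone utterance with frame durations 5, 7 and 6.
phone_ids = np.array([0, 2, 1])
durations = np.array([5, 7, 6])

u = target_sequence(phone_ids, durations, targets)  # piecewise-constant targets
x = smooth(u)                                       # hidden dynamic trajectory
y = mlp(x, W1, b1, W2, b2)                          # synthetic acoustic pattern
```

In this sketch the only phone-specific parameters are the rows of `targets`; the transitions between phones arise entirely from the filter, which is one way to read the abstract's claim that coarticulation is modelled explicitly.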
By producing synthetic acoustic patterns using the HDM, we show how it captures the dynamic structure of speech, even with such an economical parameterisation. We also investigate the properties of the learned hidden space, and the effect of varying its dimensionality.
Bibliographic reference. Richards, Hywel B. / Bridle, John S. (1999): "Acoustic-phonetic modelling using the hidden dynamic model", In ICPhS-14, 691-694.