15th International Congress of Phonetic Sciences (ICPhS-15)

Barcelona, Spain
August 3-9, 2003

Improvements in Phone-Classification Accuracy from Modelling Duration

Philip J. B. Jackson

University of Surrey, UK

Durations of real speech segments do not generally follow exponential distributions, which is what the state transitions of Markov processes implicitly model. Several duration models were considered for integration within a segmental-HMM recognizer: uniform, exponential, Poisson, normal, gamma and discrete. Of these, the gamma distribution gave the best fit to the measured duration distribution for silence, by an order of magnitude. Evaluations determined an appropriate weighting for the duration models against the acoustic models. Tests showed a reduction of 2% absolute (more than 6% relative) in the phone-classification error rate with the gamma and discrete models; exponential models gave a reduction of approximately 1% absolute, and uniform models gave no significant improvement. These performance gains recommend the wider application of explicit duration models.
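
As an illustration of the contrast drawn above, the following minimal sketch compares the duration distribution implied by a single HMM state with a self-transition (geometric, the discrete-time counterpart of the exponential) against an explicit gamma duration model, and shows one way a duration score might be weighted against an acoustic score. The function names, the parameter values and the log-linear form of the weighting are assumptions made for this example; the abstract does not specify how the weighting was applied.

```python
import math


def geometric_duration_pmf(d, self_loop_prob):
    """P(duration = d frames) implied by an HMM state that repeats with
    probability self_loop_prob and exits otherwise (d = 1, 2, ...)."""
    return (self_loop_prob ** (d - 1)) * (1.0 - self_loop_prob)


def gamma_log_pdf(d, shape, scale):
    """Log density of a gamma distribution evaluated at duration d > 0."""
    return ((shape - 1.0) * math.log(d) - d / scale
            - math.lgamma(shape) - shape * math.log(scale))


def combined_score(acoustic_log_lik, duration_log_prob, weight):
    """Hypothetical log-linear combination: the duration term is scaled
    by a tunable weight before being added to the acoustic term."""
    return acoustic_log_lik + weight * duration_log_prob


if __name__ == "__main__":
    # The implicit model always puts most mass on the shortest duration,
    # whereas a gamma model with shape > 1 peaks at (shape - 1) * scale.
    for d in (1, 2, 4, 8):
        print(d,
              round(geometric_duration_pmf(d, 0.8), 4),
              round(math.exp(gamma_log_pdf(d, 3.0, 2.0)), 4))
```

With a shape parameter of 1 the gamma model reduces to the exponential, so the gamma family includes the implicit Markov behaviour as a special case while also allowing a mode away from zero.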

Bibliographic reference. Jackson, Philip J. B. (2003): "Improvements in phone-classification accuracy from modelling duration", in ICPhS-15, 1349-1352.