15th International Congress of Phonetic Sciences (ICPhS-15)
This paper describes the Motorola Polyphone Network (MotPoly), a hierarchical, universal phone correspondence network that defines allowable phone mergers for shared acoustic modeling in multilingual and multi-dialect automatic speech recognition (ML-ASR). MotPoly's organization is defined by phonetic similarity and other language-independent phonological factors. Unlike other approaches to shared acoustic modeling, MotPoly can be used effectively in systems with limited computational resources, such as portable devices. It is also less constrained by language data availability than other approaches. With MotPoly as part of an overall strategy, the ML-ASR team at Motorola's Voice Dialog Systems Lab defined a set of multilingual acoustic models whose size was only 23% of the largest monolingual model set but whose overall performance exceeded that of the monolingual models by 1.4 percentage points.
Bibliographic reference. Melnar, Lynette / Talley, Jim (2003): "Phone merger specification for multilingual ASR: the Motorola polyphone network", In ICPhS-15, 1337-1340.