14th International Congress of Phonetic Sciences (ICPhS-14)
San Francisco, CA, USA
The paper describes the main principles of a Russian text-to-speech synthesis system developed by the Speech group of the Philological Faculty, Lomonosov Moscow University, Russia. The system combines two methods: concatenation on the segmental level, where linguistically motivated units (allophone waveforms) are spliced together to form the initial speech wave, and a rule-based method on the prosodic level, which modifies the initial speech wave according to the prosodic characteristics of the phrase being synthesized. The allophonic database is a set of allophone wave files, each named according to the allophone itself and its phonetic context. Signal generation follows a phrase control file, which describes the phrase as a sequence of allophone code names with assigned duration, energy and fundamental frequency values. To transform the base allophones to the required prosodic values, we use procedures close to TD-PSOLA technology.
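As a rough illustration of the pipeline outlined above, the following minimal Python sketch treats a phrase control file as a list of records (allophone code name, duration, energy, fundamental frequency) and builds the output by looking up each allophone in a database and concatenating the results. All names here (`AllophoneTarget`, `fit_duration`, `synthesize`, the code names `"a"` and `"b"`) are hypothetical, the energy value is assumed to act as a simple linear gain, and the duration change is a crude nearest-sample resampling rather than TD-PSOLA. The fundamental frequency is stored but not applied, since pitch modification would require pitch-synchronous processing that this sketch omits.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class AllophoneTarget:
    """One entry of a (hypothetical) phrase control file."""
    code: str           # allophone code name, used as the database key
    duration_ms: float  # target duration in milliseconds
    energy: float       # assumed here to be a linear gain factor
    f0_hz: float        # target F0; stored but NOT applied in this sketch


def fit_duration(wave: List[float], n_out: int) -> List[float]:
    """Stretch or shrink a waveform to n_out samples.

    Crude nearest-sample index mapping, standing in for the
    pitch-synchronous procedure a real system would use.
    """
    if n_out <= 0 or not wave:
        return []
    n_in = len(wave)
    return [wave[min(n_in - 1, i * n_in // n_out)] for i in range(n_out)]


def synthesize(targets: List[AllophoneTarget],
               database: Dict[str, List[float]],
               sample_rate: int) -> List[float]:
    """Concatenate allophone waveforms adjusted to the target values."""
    out: List[float] = []
    for t in targets:
        unit = database[t.code]  # base allophone waveform
        n_out = round(t.duration_ms * sample_rate / 1000)
        stretched = fit_duration(unit, n_out)
        out.extend(sample * t.energy for sample in stretched)
    return out
```

A call such as `synthesize(targets, database, sample_rate=16000)` would return a single list of samples for the whole phrase. Keeping the prosodic targets in a separate record per allophone mirrors the separation between the segmental database and the prosodic rules described in the abstract.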
Bibliographic reference. Krivnova, Olga (1999): "Automatic synthesis of Russian speech", In ICPhS-14, 507-510.