Compact Finite-State Super Transducers for Grapheme-to-Phoneme Conversion in Highly Inflected Languages

Authors: Vitomir Struc, Žiga Golob, Mario Žganec, Jerneja Gros, Simon Dobrišek, Boštjan Vesnicer
Year of publication: 2021
Subject:
Source: Intelligent Computing Theories and Application ISBN: 9783030845216
ICIC (1)
Description: Finite-state transducers are well suited to the compact representation of pronunciation dictionaries, which are an important component of speech synthesis systems. In this paper, we first review and analyse several properties of finite-state transducers with respect to size reduction, which can be achieved through determinization and minimization. In a novel experiment, we demonstrate that, for highly inflected languages, the size of the minimized transducer starts to decrease once the number of words in the represented pronunciation dictionary reaches a certain threshold. This phenomenon motivated us to introduce a new type of finite-state transducer, called a finite-state super transducer, which allows pronunciation dictionaries to be represented with fewer states and transitions using the existing determinization and minimization algorithms. A finite-state super transducer can accept and convert words that are not in the original pronunciation dictionary it represents. The resulting phonetic transcriptions of these words may be incorrect, but we demonstrate on new data that the error rates are comparable to those of state-of-the-art grapheme-to-phoneme conversion methods.
Database: OpenAIRE
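
Illustration: the size effect mentioned in the description can be reproduced on a toy scale. The sketch below is not the authors' method. It assumes a plain acceptor (input words only, no phoneme output) rather than a transducer, and the function name minimal_state_count and the two-letter word lists are hypothetical. It builds a trie, merges equivalent states bottom-up, and counts the states of the resulting minimal deterministic acyclic automaton.

```python
def minimal_state_count(words):
    """Count states of the minimal deterministic acyclic acceptor for `words`.

    Assumption: this is an acceptor only (no output symbols), used purely to
    illustrate how state counts change; it is not a finite-state transducer.
    """
    FINAL = None  # special key marking an accepting state in a trie node

    # Step 1: build a trie, one dict per state, keyed by input symbol.
    trie = {}
    for word in words:
        node = trie
        for symbol in word:
            node = node.setdefault(symbol, {})
        node[FINAL] = True

    # Step 2: merge equivalent states bottom-up. Two states are equivalent
    # when they agree on finality and on (symbol, canonical target) pairs.
    registry = {}  # signature -> canonical state id

    def canonical_id(node):
        signature = (
            FINAL in node,
            tuple(sorted(
                (symbol, canonical_id(child))
                for symbol, child in node.items()
                if symbol is not FINAL
            )),
        )
        return registry.setdefault(signature, len(registry))

    canonical_id(trie)
    return len(registry)


partial = ["aa", "ab", "ba"]          # hypothetical incomplete word list
complete = ["aa", "ab", "ba", "bb"]   # the same list with one more word

print(minimal_state_count(partial))   # 4
print(minimal_state_count(complete))  # 3
```

Adding the word "bb" makes the states reached after "a" and after "b" equivalent, so the larger word list needs one state fewer. Accepting extra words in exchange for a smaller machine is the same trade-off the description attributes to super transducers, shown here only in this simplified acceptor setting.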