Compact Finite-State Super Transducers for Grapheme-to-Phoneme Conversion in Highly Inflected Languages
Authors: | Vitomir Struc, Žiga Golob, Mario Žganec, Jerneja Gros, Simon Dobrišek, Boštjan Vesnicer |
---|---|
Year of publication: | 2021 |
Subject: | Minimisation (psychology); Computer science; Speech recognition; Grapheme; Computer Science::Computation and Language (Computational Linguistics and Natural Language and Speech Processing); Speech synthesis; Pronunciation; computer.software_genre; Computer Science::Sound; Component (UML); Minification; Representation (mathematics); computer; Scope (computer science) |
Source: | Intelligent Computing Theories and Application, ICIC (1). ISBN: 9783030845216 |
Description: | Finite-state transducers are well suited to the compact representation of pronunciation dictionaries, which are an important component of speech synthesis systems. In this paper, we first review and analyse several properties of finite-state transducers relevant to reducing their size, which can be achieved through determinisation and minimisation. In a novel experiment, we demonstrate that, for highly inflected languages, the minimum size of the transducer starts to decrease once the number of words in the represented pronunciation dictionary reaches a certain threshold. This phenomenon motivated us to introduce a new type of finite-state transducer, called a finite-state super transducer, which allows pronunciation dictionaries to be represented with fewer states and transitions using the existing determinisation and minimisation algorithms. A finite-state super transducer can accept and convert words that are not in the original pronunciation dictionary. The resulting phonetic transcriptions of these words may be incorrect, but we demonstrate on new data that the error rates are comparable to those of state-of-the-art grapheme-to-phoneme conversion methods. |
Database: | OpenAIRE |
External link: |
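
The size reduction through determinisation and minimisation mentioned in the description can be illustrated with a minimal sketch. This is not the authors' implementation: it handles only a plain acceptor (word list without phoneme outputs), the four English words are made-up toy data, and the helper names `build_trie` and `minimise` are hypothetical. A trie is already deterministic; minimisation then merges states that accept the same set of suffixes, which is where shared inflectional endings save space.

```python
def build_trie(words):
    """Build a deterministic trie acceptor (illustrative helper).

    trans[s] maps a symbol to the next state id; final[s] marks word ends.
    State 0 is the start state.
    """
    trans, final = [{}], [False]
    for word in words:
        state = 0
        for ch in word:
            if ch not in trans[state]:
                trans.append({})
                final.append(False)
                trans[state][ch] = len(trans) - 1
            state = trans[state][ch]
        final[state] = True
    return trans, final


def minimise(trans, final):
    """Return the state count of the minimal acyclic acceptor.

    Two states are merged when they have the same finality and the same
    outgoing transitions to already-merged targets (same suffix language).
    """
    canon = {}  # signature -> canonical state id
    memo = {}   # original state id -> canonical state id

    def visit(state):
        if state not in memo:
            sig = (final[state],
                   tuple(sorted((ch, visit(t)) for ch, t in trans[state].items())))
            memo[state] = canon.setdefault(sig, len(canon))
        return memo[state]

    visit(0)
    return len(canon)


# Toy data (assumption): two stems sharing the same endings.
words = ["walked", "talked", "walking", "talking"]
trans, final = build_trie(words)
print(len(trans))              # 19 states in the trie
print(minimise(trans, final))  # 9 states after merging equivalent states
```

In this toy case the two stems share every suffix, so the branches after the first letter collapse into one. A real pronunciation dictionary would need a transducer with phoneme outputs on the transitions, which this sketch leaves out.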