Showing 1 - 8 of 8 for search: '"Vesa Siivola"'
Authors:
Ebru Arisoy, Murat Saraclar, Matti Varjokallio, Mikko Kurimo, Mathias Creutz, Andreas Stolcke, Teemu Hirsimäki, Antti Puurula, Vesa Siivola, Janne Pylkkönen
Published in:
ACM Transactions on Speech and Language Processing. 5:1-29
We explore the use of morph-based language models in large-vocabulary continuous-speech recognition systems across four so-called morphologically rich languages: Finnish, Estonian, Turkish, and Egyptian Colloquial Arabic. The morphs are subword units
Published in:
IEEE Transactions on Audio, Speech and Language Processing. 15:1617-1624
N-gram models are the most widely used language models in large vocabulary continuous speech recognition. Since the size of the model grows rapidly with respect to the model order and available training data, many methods have been proposed for pruni
Published in:
Computer Speech & Language. 20:515-541
In the speech recognition of highly inflecting or compounding languages, the traditional word-based language modeling is problematic. As the number of distinct word forms can grow very large, it becomes difficult to train language models that are bot
Published in:
Interspeech 2011.
Published in:
INTERSPEECH
This paper introduces two recent open source software packages developed for unsupervised natural language modeling. The Morfessor program segments words automatically into morpheme-like units without any rule-based morphological analyzers. The VariK
Authors:
Ebru Arisoy, Vesa Siivola, Teemu Hirsimäki, Janne Pylkkönen, Tanel Alumäe, Antti Puurula, Mikko Kurimo, Murat Saraclar
Published in:
HLT-NAACL
Scopus-Elsevier
It is practically impossible to build a word-based lexicon for speech recognition in agglutinative languages that would cover all the relevant words. The problem is that words are generally built by concatenating several prefixes and suffixes to the
Authors:
Bryan L. Pellom, Vesa Siivola
Published in:
INTERSPEECH
Traditionally, when building an n-gram model, we decide the span of the model history, collect the relevant statistics and estimate the model. The model can be pruned down to a smaller size by manipulating the statistics or the estimated model. This
Published in:
8th European Conference on Speech Communication and Technology (Eurospeech 2003).