Author:	Meftouh, K., Smaili, K., Laskri, M. T.
Subject:	STATISTICS; ARABIC language; STATISTICAL smoothing; COMPUTATIONAL linguistics; MATHEMATICAL linguistics
Source:	International Review on Computers & Software; Jan 2009, Vol. 4 Issue 1, p68-72, 5p, 3 Diagrams, 10 Charts
Abstract:	In this work we investigate statistical language models for Arabic. Several experiments using different smoothing techniques were carried out on a small corpus extracted from a daily newspaper. Data sparseness led us to investigate solutions that do not require increasing the size of the corpus. Word segmentation was applied to increase the statistical viability of the corpus, which led to better performance in terms of normalized perplexity. [ABSTRACT FROM AUTHOR]
Database:	Complementary Index