Author: |
Meftouh, K., Smaili, K., Laskri, M. T. |
Subject: |
|
Source: |
International Review on Computers & Software; Jan2009, Vol. 4 Issue 1, p68-72, 5p, 3 Diagrams, 10 Charts |
Abstract: |
In this work we investigate statistical language models for Arabic. Several experiments using different smoothing techniques were carried out on a small corpus extracted from a daily newspaper. Data sparseness led us to investigate other solutions that do not require increasing the size of the corpus. Word segmentation was applied in order to increase the statistical viability of the corpus. This leads to better performance in terms of normalized perplexity. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |
External link: |
|