Search results - "Doostmohammadi, Ehsan"

Report

How Reliable Are Automatic Evaluation Methods for Instruction-Tuned LLMs?

Authors: Doostmohammadi, Ehsan; Holmström, Oskar; Kuhlmann, Marco

Work on instruction-tuned Large Language Models (LLMs) has used automatic methods based on text overlap and LLM judgments as cost-effective alternatives to human evaluation. In this paper, we perform a meta-evaluation of such methods and assess their

External link: http://arxiv.org/abs/2402.10770

Report

Surface-Based Retrieval Reduces Perplexity of Retrieval-Augmented Language Models

Authors: Doostmohammadi, Ehsan; Norlund, Tobias; Kuhlmann, Marco; Johansson, Richard

Augmenting language models with a retrieval mechanism has been shown to significantly improve their performance while keeping the number of parameters low. Retrieval-augmented models commonly rely on a semantic retrieval mechanism based on the simila

External link: http://arxiv.org/abs/2305.16243

Report

On the Generalization Ability of Retrieval-Enhanced Transformers

Authors: Norlund, Tobias; Doostmohammadi, Ehsan; Johansson, Richard; Kuhlmann, Marco

Recent work on the Retrieval-Enhanced Transformer (RETRO) model has shown that off-loading memory from trainable weights to a retrieval database can significantly improve language modeling and match the performance of non-retrieval models that are an

External link: http://arxiv.org/abs/2302.12128

Report

SINA-BERT: A pre-trained Language Model for Analysis of Medical Texts in Persian

Authors: Taghizadeh, Nasrin; Doostmohammadi, Ehsan; Seifossadat, Elham; Rabiee, Hamid R.; Tahaei, Maedeh S.

We have released Sina-BERT, a language model pre-trained on BERT (Devlin et al., 2018) to address the lack of a high-quality Persian language model in the medical domain. SINA-BERT utilizes pre-training on a large-scale corpus of medical contents inc

External link: http://arxiv.org/abs/2104.07613

Report

Joint Persian Word Segmentation Correction and Zero-Width Non-Joiner Recognition Using BERT

Authors: Doostmohammadi, Ehsan; Nassajian, Minoo; Rahimi, Adel

Words are properly segmented in the Persian writing system; in practice, however, these writing rules are often neglected, resulting in single words being written disjointedly and multiple words written without any white spaces between them. This pap

External link: http://arxiv.org/abs/2010.00287

Report

PerKey: A Persian News Corpus for Keyphrase Extraction and Generation

Authors: Doostmohammadi, Ehsan; Bokaei, Mohammad Hadi; Sameti, Hossein

Keyphrases provide an extremely dense summary of a text. Such information can be used in many Natural Language Processing tasks, such as information retrieval and text summarization. Since previous studies on Persian keyword or keyphrase extraction h

External link: http://arxiv.org/abs/2009.12269

Report

Persian Keyphrase Generation Using Sequence-to-Sequence Models

Authors: Doostmohammadi, Ehsan; Bokaei, Mohammad Hadi; Sameti, Hossein

Keyphrases are a very short summary of an input text and provide the main subjects discussed in the text. Keyphrase extraction is a useful upstream task and can be used in various natural language processing problems, for example, text summarization

External link: http://arxiv.org/abs/2009.12271

Report

Investigating Machine Learning Methods for Language and Dialect Identification of Cuneiform Texts

Authors: Doostmohammadi, Ehsan; Nassajian, Minoo

Identification of the languages written using cuneiform symbols is a difficult task due to the lack of resources and the problem of tokenization. The Cuneiform Language Identification task in VarDial 2019 addresses the problem of identifying seven la

External link: http://arxiv.org/abs/2009.10794

Report

Ghmerti at SemEval-2019 Task 6: A Deep Word- and Character-based Approach to Offensive Language Identification

Authors: Doostmohammadi, Ehsan; Sameti, Hossein; Saffar, Ali

This paper presents the models submitted by Ghmerti team for subtasks A and B of the OffensEval shared task at SemEval 2019. OffensEval addresses the problem of identifying and categorizing offensive language in social media in three subtasks; whethe

External link: http://arxiv.org/abs/2009.10792

Report

Persian Ezafe Recognition Using Transformers and Its Role in Part-Of-Speech Tagging

Authors: Doostmohammadi, Ehsan; Nassajian, Minoo; Rahimi, Adel

Ezafe is a grammatical particle in some Iranian languages that links two words together. Regardless of the important information it conveys, it is almost always not indicated in Persian script, resulting in mistakes in reading complex sentences and e

External link: http://arxiv.org/abs/2009.09474
