Showing 1 - 2 of 2 for search: '"Nyist, Milán Konor"'
Training summarization models requires substantial amounts of training data. However, for lower-resource languages like Hungarian, openly available models and datasets are notably scarce. To address this gap, our paper introduces HunSum-2, an open-sou
External link:
http://arxiv.org/abs/2404.03555
We introduce HunSum-1: a dataset for Hungarian abstractive summarization, consisting of 1.14M news articles. The dataset is built by collecting, cleaning and deduplicating data from 9 major Hungarian news sites through CommonCrawl. Using this dataset
External link:
http://arxiv.org/abs/2302.00455