Biomedical Spanish Language Models for entity recognition and linking at BioASQ DisTEMIST

Author: Moscato V., Postiglione M., Sperli' G.
Contributors: Moscato, V., Postiglione, M., Sperli', G.
Language: English
Publication year: 2022
Subject:
Description: Named Entity Recognition (NER) and Entity Linking (EL) systems usually require a rich, annotated dataset for training in order to produce high-quality results, but the annotation process is time-consuming and expensive, especially when it requires the effort of domain experts, as in the medical field. However, recent developments in Natural Language Processing (NLP) make it easy to use transformer language models that have been pre-trained on huge quantities of data (often from specialized domains), and thus to obtain high performance without excessive effort. In this work, we outline our approach to the NER and EL tasks on Spanish clinical notes for the DisTEMIST track of the BioASQ 2022 challenge. Our results show that the proposed methodology, based on biomedical pre-trained language models, performed best on the NER task, with an F1 score about 3% higher than that of the second-best solution.
Database: OpenAIRE