The power of graphs in medicine: Introducing BioGraphSum for effective text summarization

Author: Cengiz Hark
Language: English
Year of publication: 2024
Subject:
Source: Heliyon, Vol 10, Iss 11, Pp e31813- (2024)
Document type: article
ISSN: 2405-8440
DOI: 10.1016/j.heliyon.2024.e31813
Description: In biomedicine, the expansive scientific literature, combined with the frequent use of abbreviations, acronyms, and symbols, presents considerable challenges for text processing and summarization. The Unified Medical Language System (UMLS) has been the go-to resource for extracting concepts and determining correlations in such studies; the BioGraphSum model introduced in this study aims to reduce this dependence on UMLS. In this approach, the sentences of a text are modeled as nodes of a graph, which allows the concept of “Malatya centrality” to be applied. The approach focuses on pinpointing the most influential nodes in the graph and, by analogy, the sentences of the text most pertinent for summarization. To evaluate the performance of BioGraphSum, a corpus of 450 contemporary scientific research articles available in the PubMed database was curated in line with proven research methodology, and the model was tested against this corpus. Preliminary results, especially in the precision-based and F-score-based ROUGE-(1–2), ROUGE-L, and ROUGE-SU metrics, showed significant improvements over other existing models considered state-of-the-art in text summarization.
Database: Directory of Open Access Journals
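
The description above outlines a pipeline in which sentences become graph nodes and a centrality score selects the sentences to keep. The following is a minimal sketch of that general idea, not the published BioGraphSum implementation. Several things are assumptions made for illustration: edges are created from word-overlap (Jaccard) similarity above a hypothetical `threshold`, and the score is a degree-ratio centrality (the sum, over a node's neighbors, of the node's degree divided by the neighbor's degree), which is assumed here to correspond to Malatya centrality; the record itself does not define it. All function names are hypothetical.

```python
import re
from itertools import combinations


def tokenize(sentence):
    """Lowercase a sentence and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", sentence.lower()))


def build_graph(sentences, threshold=0.2):
    """Build an undirected sentence graph as an adjacency dict.

    Assumption: two sentences are linked when the Jaccard similarity of
    their token sets reaches `threshold`. The source does not state how
    edges are formed.
    """
    tokens = [tokenize(s) for s in sentences]
    adj = {i: set() for i in range(len(sentences))}
    for i, j in combinations(range(len(sentences)), 2):
        union = tokens[i] | tokens[j]
        if union and len(tokens[i] & tokens[j]) / len(union) >= threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def degree_ratio_centrality(adj):
    """Score each node by summing deg(node) / deg(neighbor) over its neighbors.

    Assumption: this is the form taken by Malatya centrality. Isolated
    nodes have no neighbors and therefore score 0.
    """
    return {
        v: sum(len(adj[v]) / len(adj[u]) for u in adj[v])
        for v in adj
    }


def summarize(sentences, k=2, threshold=0.2):
    """Return the k highest-scoring sentences in their original order."""
    scores = degree_ratio_centrality(build_graph(sentences, threshold))
    # Break ties by position so the result is deterministic.
    top = sorted(scores, key=lambda v: (-scores[v], v))[:k]
    return [sentences[i] for i in sorted(top)]
```

Calling `summarize(sentences, k=3)` on a list of sentence strings returns three sentences from that list. A real system would likely replace the word-overlap edges with a stronger similarity measure; that choice is left open here because the record does not specify it.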