Description: |
In biomedicine, the vast scientific literature, combined with the frequent use of abbreviations, acronyms, and symbols, presents considerable challenges for text processing and summarization. The Unified Medical Language System (UMLS) has been the go-to resource for extracting concepts and determining correlations among them in such studies; the BioGraphSum model introduced in this study aims to reduce this dependence on UMLS. In this approach, the sentences of a text are represented as nodes of a graph, which allows the concept of “Malatya centrality” to be applied. Malatya centrality identifies the most influential nodes in the graph and, by analogy, the sentences most pertinent to the summary. To evaluate the performance of BioGraphSum, a corpus of 450 contemporary scientific research articles available in the PubMed database was curated in line with established research methodology, and the model was tested against this corpus. Preliminary results showed significant improvements over existing models considered state-of-the-art in text summarization, particularly in the precision-based and F-score-based ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-SU metrics. |