Author: |
KOPTIENT, Anaïs, GRABAR, Natalia |
Source: |
Studies in Health Technology & Informatics; 2021, Issue 281, p313-317, 5p, 4 Charts |
Abstract: |
Abbreviations are very frequent in medical and health documents, but their meaning is often opaque. Associating them with their expanded forms, such as chronic obstructive pulmonary disease for COPD, may aid understanding. Yet several abbreviations are ambiguous and have several possible expanded forms. We propose to disambiguate the abbreviations in order to associate them with the proper expansion for a given context. We treat the problem as supervised categorization. We create reference data and test several algorithms. The descriptors are collected from the lexical and syntactic contexts of abbreviations. Training is done on sentences containing expanded forms of abbreviations. Testing is done on a manually built corpus, in which the meaning of each abbreviation is defined according to its context. Our approach reaches an F-measure of up to 0.895 on training data and 0.773 on test data. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |