Identification of Synonyms Using Definition Similarities in Japanese Medical Device Adverse Event Terminology
Author: | Masahito Uesugi, Hideto Yokoi, Ayako Yagahara |
---|---|
Language: | English |
Year of publication: | 2021 |
Subject: | Technology; Synonym; Computer science; QH301-705.5; QC1-999; edit distance; computer.software_genre; Terminology; 03 medical and health sciences; Consistency (database systems); 0302 clinical medicine; terminology; General Materials Science; Word2vec; 030212 general & internal medicine; Biology (General); Instrumentation; QD1-999; 030304 developmental biology; Fluid Flow and Transfer Processes; 0303 health sciences; business.industry; medical device; Process Chemistry and Technology; Physics; General Engineering; Levenshtein distance; synonym detection; Engineering (General). Civil engineering (General); Computer Science Applications; Identification (information); Chemistry; machine learning; distributed representation; Edit distance; Artificial intelligence; TA1-2040; business; computer; Natural language processing; Sentence |
Source: | Applied Sciences, Vol. 11, Issue 8, Article 3659 (2021) |
ISSN: | 2076-3417 |
Description: | The Japanese medical device adverse event terminology published by the Japan Federation of Medical Devices Associations (JFMDA terminology) contains entries for 89 terminology items, and each entry was created independently. The consistency of these entries needs to be established and verified, and the entries need to be mapped efficiently and accurately, so an automatic synonym detection tool is an important goal. Tools based on edit distances and distributed representations have achieved good performance in previous studies. The purpose of this study was to identify synonyms in the JFMDA terminology and to evaluate the accuracy of these algorithms. A total of 125 definition sentence pairs were created from the terminology as baselines. Edit distances (Levenshtein and Jaro–Winkler distance) and distributed representations (Word2vec, fastText, and Doc2vec) were used to calculate similarities. Receiver operating characteristic analysis was carried out to evaluate the accuracy of synonym detection. A comparison of the algorithms showed that the Jaro–Winkler distance had the highest sensitivity, Doc2vec with DM had the highest specificity, and the Levenshtein distance had the highest area under the curve. Edit distances and Doc2vec make it possible to predict synonyms in JFMDA terminology with high accuracy. |
Database: | OpenAIRE |
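
The description above names a Levenshtein-based similarity and a receiver operating characteristic analysis. The following is a minimal sketch of that general idea, not the authors' implementation: the function names, the length normalization, and the toy English sentence pairs are all assumptions made for illustration, and the study itself worked on Japanese definition sentences.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    or substitutions needed to turn a into b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Distance rescaled to [0, 1]; this normalization is an assumption."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def auc(scores, labels) -> float:
    """Area under the ROC curve, computed as the share of
    (synonym, non-synonym) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical definition pairs: label 1 = synonym, 0 = not a synonym.
pairs = [
    ("the device stopped working", "the device stopped operating", 1),
    ("fluid leaked from the tube", "liquid leaked from the tube", 1),
    ("the battery overheated", "the display went blank", 0),
    ("the alarm did not sound", "the cable was disconnected", 0),
]
scores = [similarity(a, b) for a, b, _ in pairs]
labels = [y for _, _, y in pairs]
print(f"AUC on toy pairs: {auc(scores, labels):.2f}")
```

The same `auc` helper could be fed scores from any other similarity measure, such as cosine similarity between sentence vectors, which is one way the measures named in the description could be compared on equal footing.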