A study on improving the quality inspection on national information by using levenshtein distance algorithm

Authors: Sanggi Lee, Inje Kang, Kang-Ryul Shon, ChulSu Lim, Eungyeong Kim
Year of publication: 2018
Subject:
Source: International Journal of Engineering & Technology. 7:161
ISSN: 2227-524X
DOI: 10.14419/ijet.v7i2.33.13876
Description: Background/Objectives: In Korea, considerable effort and budget have been spent on improving national R&D information management. Nevertheless, the project summaries of national R&D projects are still not accurate enough to be useful. Methods/Statistical analysis: To examine the accuracy of project summaries, the Levenshtein Distance Algorithm (LDA) was applied. LDA was expected to identify improper project summaries in which parts of sentences are used repeatedly. To evaluate how the algorithm performs on national R&D information in Korea, the project summaries of 53,492 national R&D projects conducted in 2014 were used. Findings: Unlike other algorithms, LDA was able to detect project summaries consisting of repeatedly used phrases. In the test with LDA, 3,445 of the 53,492 projects had inaccurate content in their project summaries. Specifically, 2,707 projects had an improper research objective, while 712 projects and 26 projects had improper content in the research summary and the expected impact, respectively. Although the algorithm made it possible to extract repeatedly used phrases, its running time was a problem, so it was applied only offline. In addition, a researcher had to review the results once more to verify their accuracy. Improvements/Applications: This paper applied LDA to detect inappropriate project summaries. The results imply that applying LDA can improve the quality of the information and thereby facilitate its use.
Database: OpenAIRE
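
Note: The abstract does not say how the Levenshtein distance was turned into a check for repeated phrases. The following is only a minimal sketch of one plausible reading: split a summary into sentences and flag it when two sentences are nearly identical. The function names, the sentence splitting on periods, and the 0.8 similarity threshold are all assumptions made for illustration and are not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Normalized similarity in [0, 1]; 1.0 means identical strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def has_repeated_sentences(text: str, threshold: float = 0.8) -> bool:
    """Hypothetical check: True if any two sentences are near-duplicates.

    The threshold is an assumed value, not one reported in the paper.
    """
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    for i in range(len(sentences)):
        for j in range(i + 1, len(sentences)):
            if similarity(sentences[i], sentences[j]) >= threshold:
                return True
    return False
```

In this sketch every pair of sentences is compared, and each comparison costs time proportional to the product of the two sentence lengths. That cost grows quickly over tens of thousands of records, which may be related to the running-time problem the abstract mentions.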