Research on string similarity algorithm based on Levenshtein Distance
Authors: | Yan Hu, Guangrong Bian, Shengnan Zhang |
---|---|
Publication year: | 2017 |
Subject: | 050210 logistics & transportation; Bitap algorithm; 05 social sciences; Commentz-Walter algorithm; 02 engineering and technology; Approximate string matching; Levenshtein distance; Longest common substring problem; Damerau–Levenshtein distance; 0502 economics and business; 0202 electrical engineering, electronic engineering, information engineering; 020201 artificial intelligence & image processing; Edit distance; String metric; Algorithm; Mathematics |
Source: | 2017 IEEE 2nd Advanced Information Technology, Electronic and Automation Control Conference (IAEAC). |
DOI: | 10.1109/iaeac.2017.8054419 |
Description: | String similarity has very wide application, and the algorithm based on Levenshtein Distance (LD) is a classic approach, but it still falls short in universal applicability and in the accuracy of its results. By combining it with the Longest Common Subsequence (LCS) and the Longest Common Substring (LCCS), the similarity algorithm based on Levenshtein Distance is improved: the string similarity results of the improved algorithm are more distinct, reasonable, and accurate, and the algorithm is more universally applicable. In addition, the data structures used to compute the LD and the LCS during the similarity calculation have been optimized, reducing the space complexity of the algorithm by an order of magnitude. The experimental results are analyzed in detail, which demonstrates the feasibility and correctness of the approach. |
Database: | OpenAIRE |
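
The record above does not reproduce the paper's improved similarity formula, so the following Python sketch is only an illustration of the building blocks the abstract names. It assumes the space optimization refers to the common rolling-row technique, which keeps two rows of the dynamic-programming table instead of the full matrix, and it uses the classic normalization 1 - LD / max(len) as the baseline similarity. The function names `levenshtein`, `lcs_length`, and `ld_similarity` are hypothetical, not taken from the paper.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with two rolling rows instead of a full matrix."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string along the row
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence, also with two rows."""
    if len(a) < len(b):
        a, b = b, a
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def ld_similarity(a: str, b: str) -> float:
    """Classic baseline (an assumption here): 1 - LD / max(len(a), len(b))."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

Both functions need memory proportional to the shorter string rather than to the product of the two lengths, which is one plausible reading of the space reduction the abstract describes. How the paper combines LD, LCS, and LCCS into a single score is not stated in this record and is left out of the sketch.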