SCHEMA – An Algorithm for Automated Product Taxonomy Mapping in E-commerce

Authors: Flavius Frasincar, Steven S. Aanen, Lennart J. Nederstigt, Damir Vandic
Contributors: Econometrics, Erasmus School of Economics
Language: English
Publication year: 2012
Subject:
Source: The Interface for Dutch ICT-Research 2012 (ICT.OPEN 2012), pp. 300-314
Lecture Notes in Computer Science ISBN: 9783642302831
ESWC
Description: This paper proposes SCHEMA, an algorithm for automated mapping between heterogeneous product taxonomies in the e-commerce domain. SCHEMA utilises word sense disambiguation techniques, based on the ideas from the algorithm proposed by Lesk, in combination with the semantic lexicon WordNet. For finding candidate map categories and determining the path-similarity we propose a node matching function that is based on the Levenshtein distance. The final mapping quality score is calculated using the Damerau-Levenshtein distance and a node-dissimilarity penalty. The performance of SCHEMA was tested on three real-life datasets and compared with PROMPT and the algorithm proposed by Park & Kim. It is shown that SCHEMA improves considerably on both recall and $F_1$-score, while maintaining similar precision.
Database: OpenAIRE
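
Illustration: The description says SCHEMA's node matching function is based on the Levenshtein distance, but the record does not give the function itself. The Python sketch below therefore shows only the standard Levenshtein distance and one common way of normalising it into a similarity between two category names. The names `levenshtein` and `name_similarity`, the lower-casing step, and the normalisation by the longer name are assumptions made for illustration, not the paper's definition.

```python
def levenshtein(a: str, b: str) -> int:
    """Standard edit distance counting insertions, deletions and substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def name_similarity(a: str, b: str) -> float:
    """Hypothetical normalised similarity in [0, 1] between two category names."""
    a, b = a.lower().strip(), b.lower().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty names are treated as identical
    return 1.0 - levenshtein(a, b) / longest
```

With such a score, a mapping procedure could keep a target category as a candidate only when its similarity to the source category name exceeds a chosen threshold; the threshold, like everything else in this sketch, would be an assumption rather than a value taken from the paper.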