Intelligent Mapping for Hotel Records Representing the Same Entity

Authors: Eyup Erkan Ozbek, Ahmet Tugrul Bayrak, Sedat Kestepe, Olcay Taner Yildiz
Year of publication: 2019
Subject:
Source: 2019 4th International Conference on Computer Science and Engineering (UBMK).
Description: As the number of hotels grows day by day, dealing with every hotel individually is almost impossible. Travel agencies therefore work with online hotel providers that have deals with many hotels around the world. Although working with online providers spares agencies a major challenge, it replaces that challenge with another: duplicate hotel records from different sources. Duplicate records may have identical values for all features, or different feature values that nonetheless represent the same hotel. Such records must be matched and merged to keep the database consistent. In this study, we propose a set of methods that aims to solve this problem. Working on hotel records to which address enrichment and pre-processing have been applied, we apply machine learning algorithms that use string and image similarity, with prior methods selected as a baseline. The proposed method achieved 99.12% accuracy, matching 14,985 hotels in a dataset of 132,287 rows.
Database: OpenAIRE
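
The string-similarity side of the record matching described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the field names (`name`, `address`), the `normalize`, `similarity`, and `is_same_hotel` helpers, the equal weighting of the two scores, and the 0.85 threshold are all assumptions made for illustration, and a fixed threshold stands in for the trained machine learning model the paper uses.

```python
from difflib import SequenceMatcher


def normalize(text: str) -> str:
    """Lowercase, drop commas and periods, and collapse whitespace."""
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    return " ".join(cleaned.split())


def similarity(a: str, b: str) -> float:
    """Return a similarity ratio in [0, 1] between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_same_hotel(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Hypothetical decision rule: average name and address similarity.

    The threshold and the equal weighting are illustrative assumptions,
    not values taken from the paper.
    """
    name_score = similarity(rec_a["name"], rec_b["name"])
    addr_score = similarity(rec_a["address"], rec_b["address"])
    return (name_score + addr_score) / 2 >= threshold
```

In a pipeline closer to the one the abstract describes, the per-field similarity scores would more plausibly serve as input features to a classifier, alongside image similarity, rather than being compared against a hand-picked threshold.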