Why Did My Consumer Shop? Learning an Efficient Distance Metric for Retailer Transaction Data
Authors: | Spenrath, Yorick; Hassani, Marwan; van Dongen, Boudewijn F.; Tariq, Haseeb; Dong, Yuxiao; Mladenic, Dunja; Saunders, Craig |
---|---|
Contributors: | Process Science, EAISI Foundational, EAISI High Tech Systems |
Language: | English |
Year of publication: | 2021 |
Subject: | Optimization; Measure (data warehouse); Similarity (geometry); Computer science; Transaction categorization; 02 engineering and technology; Clustering; Distance metric; 020204 information systems; Metric (mathematics); 0202 electrical engineering, electronic engineering, information engineering; Unsupervised learning; 020201 artificial intelligence & image processing; Product (category theory); Data mining; Cluster analysis; SDG 12 - Responsible Consumption and Production; Transaction data; Consumer behaviour |
Source: | Machine Learning and Knowledge Discovery in Databases. Applied Data Science and Demo Track: European Conference, ECML PKDD 2020, Proceedings, pp. 323-338. ECML/PKDD (5). ISBN: 9783030676698 |
Description: | Transaction analysis is an important part of studies aiming to understand consumer behaviour. The first step is defining a proper measure of similarity, or more specifically a distance metric, between transactions. Existing distance metrics on transactional data are built on retailer-specific information, such as extensive product hierarchies or a large product catalogue. In this paper we propose a new distance metric that is retailer-independent by design, allowing cross-retailer and cross-country analysis. The metric comes with a novel method for finding the importance of product categories, alternating between unsupervised learning techniques and importance calibration. We test our methodology on a real-world dataset and show how we can identify clusters of consumer behaviour. |
Database: | OpenAIRE |
External link: |