Mining correlated high-utility itemsets using various measures

Authors: Hoai Bac Le, Yimin Zhang, Philippe Fournier-Viger, Duy-Tai Dinh, Jerry Chun-Wei Lin
Publication year: 2020
Subject:
Source: Logic Journal of the IGPL. 28:19-32
ISSN: 1368-9894
1367-0751
DOI: 10.1093/jigpal/jzz068
Description: Discovering high-utility itemsets (HUIs) consists of finding sets of items that yield a high profit in customer transaction databases. An important limitation of traditional high-utility itemset mining (HUIM) is that only the utility measure is used to assess the interestingness of patterns. As a result, several of the itemsets found have a high profit but contain items that are weakly correlated. To address this issue, this paper proposes to integrate the concept of correlation into HUIM to find profitable itemsets that are highly correlated, using the all-confidence and bond measures. An algorithm named FCHM (fast correlated high-utility itemset miner) is proposed to efficiently discover correlated high-utility itemsets (CHIs). Two versions of the algorithm are proposed: FCHM$_{all\text{-}confidence}$ and FCHM$_{bond}$, which are based on the all-confidence and bond measures, respectively. An experimental evaluation was carried out on four real-life benchmark datasets from the HUIM literature: mushroom, retail, kosarak and foodmart. Results show that FCHM is efficient and can prune a large number of weakly correlated HUIs.
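To make the two measures concrete, the following is a minimal brute-force sketch in Python. It is not the FCHM algorithm, whose pruning strategies are not described in this record. It assumes the commonly used definitions: all-confidence of an itemset is its support divided by the largest support of any single item in it, and bond is its support divided by the number of transactions that contain at least one of its items. The toy database, the function names, and the thresholds `min_util` and `min_cor` are hypothetical and chosen only for illustration.

```python
from itertools import combinations

# Hypothetical toy database: each transaction maps an item to its utility
# in that transaction (purchase quantity x unit profit).
TRANSACTIONS = [
    {"a": 5, "b": 2, "c": 1},
    {"a": 10, "c": 6},
    {"b": 4, "c": 3, "d": 8},
    {"a": 5, "b": 2},
]


def support(itemset, db):
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(i in t for i in itemset))


def utility(itemset, db):
    """Total utility of the itemset over the transactions that contain it."""
    return sum(
        sum(t[i] for i in itemset)
        for t in db
        if all(i in t for i in itemset)
    )


def all_confidence(itemset, db):
    """Support of the itemset over the largest single-item support (assumed definition)."""
    max_item_support = max(support({i}, db) for i in itemset)
    return support(itemset, db) / max_item_support if max_item_support else 0.0


def bond(itemset, db):
    """Support of the itemset over its disjunctive support (assumed definition)."""
    disjunctive = sum(1 for t in db if any(i in t for i in itemset))
    return support(itemset, db) / disjunctive if disjunctive else 0.0


def correlated_high_utility_itemsets(db, min_util, min_cor, measure):
    """Enumerate every itemset and keep those meeting both thresholds.

    Exhaustive enumeration is exponential in the number of items; it is
    used here only to show what the output of such a miner looks like.
    """
    items = sorted(set().union(*db))
    result = {}
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            if support(candidate, db) == 0:
                continue  # itemset never occurs, so it has no utility
            u = utility(candidate, db)
            if u >= min_util and measure(candidate, db) >= min_cor:
                result[candidate] = u
    return result


if __name__ == "__main__":
    for name, measure in (("all-confidence", all_confidence), ("bond", bond)):
        found = correlated_high_utility_itemsets(TRANSACTIONS, 12, 0.5, measure)
        for itemset, u in sorted(found.items(), key=lambda kv: sorted(kv[0])):
            print(name, sorted(itemset), u)
```

In this toy database the itemset {b, c, d} has a utility of 15, so a utility-only miner with `min_util = 12` would report it, yet its all-confidence is only 1/3 and its bond 1/4, so a correlation threshold of 0.5 filters it out. This is the kind of weakly correlated high-utility itemset the abstract refers to.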
Database: OpenAIRE