Mining High-Utility Patterns in Uncertain Tensors

Autor: Nicolas Nadisic, Loïc Cerf, Aurélien Coussat
Rok vydání: 2018
Předmět:
Zdroj: KES
ISSN: 1877-0509
DOI: 10.1016/j.procs.2018.07.274
Popis: Transactional datasets are 0/1 matrices, which generically stand for objects having Boolean properties. If every cell of the matrix is additionally associated with a real number called utility, a high-utility itemset relates to a all-ones sub-matrix with utilities that sum to a high-enough value. This article shows that “having a total utility exceeding a threshold” is a piecewise (anti-)monotone constraint, even in presence of both positive and negative utilities. For that reason, a generic algorithm, multidupehack, can prune the search of the high-utility patterns defined in a broader context than the 0/1 matrix: the uncertain tensor. Moreover, the utilities may relate to only some of the dimensions, the patterns can be forced (or not) to be closed and to satisfy additional constraints. A real-world application, which exploits all those possibilities, is presented. Despite its versatility, the proposal is also shown competitive when it comes to mining 0/1 matrices, a special case treated by dozens of specific algorithms.
Databáze: OpenAIRE