Mining generalized association rules

Author:	Rakesh Agrawal, Ramakrishnan Srikant
Year of publication:	1997
Subject:	Measure (data warehouse); Hierarchy; Information retrieval; Association rule learning; Computer Networks and Communications; Computer science; Machine learning; Set (abstract data type); Hardware and Architecture; Taxonomy (general); Artificial intelligence; Database transaction; Software
Source:	Future Generation Computer Systems. 13:161-180
ISSN:	0167-739X
DOI:	10.1016/s0167-739x(97)00019-8
Description:	We introduce the problem of mining generalized association rules. Given a large database of transactions, where each transaction consists of a set of items, and a taxonomy (is-a hierarchy) on the items, we find associations between items at any level of the taxonomy. For example, given a taxonomy that says that jackets is-a outerwear is-a clothes, we may infer a rule that “people who buy outerwear tend to buy shoes”. This rule may hold even if rules that “people who buy jackets tend to buy shoes”, and “people who buy clothes tend to buy shoes” do not hold. An obvious solution to the problem is to add all ancestors of each item in a transaction to the transaction, and then run any of the algorithms for mining association rules on these “extended transactions”. However, this “Basic” algorithm is not very fast; we present two algorithms, Cumulate and EstMerge, which run 2 to 5 times faster than Basic (and more than 100 times faster on one real-life dataset). Finally, we present a new interest-measure for rules which uses the information in the taxonomy. Given a user-specified “minimum-interest-level”, this measure prunes a large number of redundant rules; 40–60% of all the rules were pruned on two real-life datasets.
Database:	OpenAIRE
External link:	https://explore.openaire.eu/search/publication?articleId=doi_________::a6b07baa4d30ff111099c2fabf2b8715 https://doi.org/10.1016/s0167-739x(97)00019-8
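
Note:	The "extended transactions" idea from the description can be illustrated with a minimal Python sketch. It covers only the ancestor-extension step of the "Basic" approach plus a direct support/confidence check; it is not Cumulate or EstMerge, and it does not implement the interest measure. The taxonomy entries beyond jackets/outerwear/clothes, the six toy transactions, the thresholds (30% support, 60% confidence), and all function names are assumptions made up for illustration, not taken from the paper.

```python
# Hypothetical taxonomy as a child -> parent map (is-a hierarchy).
# "ski pants" and "shirts" are illustrative additions.
PARENT = {
    "jackets": "outerwear",
    "ski pants": "outerwear",
    "outerwear": "clothes",
    "shirts": "clothes",
}


def ancestors(item, parent=PARENT):
    """Return every ancestor of `item` in the taxonomy."""
    found = set()
    while item in parent:
        item = parent[item]
        found.add(item)
    return found


def extend(transaction, parent=PARENT):
    """Build the extended transaction: the items plus all their ancestors."""
    extended = set(transaction)
    for item in transaction:
        extended |= ancestors(item, parent)
    return frozenset(extended)


def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def rule_holds(antecedent, consequent, transactions, min_sup, min_conf):
    """Check a rule antecedent => consequent against both thresholds."""
    sup_rule = support(set(antecedent) | set(consequent), transactions)
    sup_ante = support(antecedent, transactions)
    if sup_ante == 0:
        return False
    return sup_rule >= min_sup and sup_rule / sup_ante >= min_conf


# Made-up toy database of six transactions.
TRANSACTIONS = [
    {"jackets", "shoes"},
    {"ski pants", "shoes"},
    {"jackets"},
    {"shirts"},
    {"shirts"},
    {"shoes"},
]
EXTENDED = [extend(t) for t in TRANSACTIONS]

if __name__ == "__main__":
    for lhs in ("jackets", "outerwear", "clothes"):
        holds = rule_holds({lhs}, {"shoes"}, EXTENDED, min_sup=0.3, min_conf=0.6)
        print(f"{lhs} => shoes holds: {holds}")
```

On this toy data, "outerwear => shoes" passes both thresholds, while "jackets => shoes" falls short on support and "clothes => shoes" falls short on confidence, which mirrors the situation the description outlines. A full miner would enumerate frequent itemsets over the extended transactions rather than test hand-picked rules.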