Author: |
Gwadera, Robert; Crestani, Fabio |
Source: |
Advances in Knowledge Discovery & Data Mining: 14th Pacific-Asia Conference, PAKDD 2010, Hyderabad, India, June 21-24, 2010. Proceedings. Part I; 2010, p286-299, 14p |
Abstract: |
We present a reliable universal method for ranking sequential patterns (itemset-sequences) with respect to significance in the problem of frequent sequential pattern mining. We approach the problem by first building a probabilistic reference model for the collection of itemset-sequences and then deriving an analytical formula for the frequency of sequential patterns in the reference model. We rank sequential patterns by computing the divergence between their actual frequencies and their frequencies in the reference model. We demonstrate the applicability of the presented method for discovering dependencies between streams of news stories in terms of significant sequential patterns, which is an important problem in multi-stream text mining and in topic detection and tracking research. [ABSTRACT FROM AUTHOR] |
Database: |
Complementary Index |