An Efficient Approach for Mining Fault-Tolerant Repeating Patterns based on Bit Sequences

Autor: Yu-Ting, Kung, 龔毓婷
Rok vydání: 2004
Druh dokumentu: 學位論文 ; thesis
Popis: 92
An efficient way of mining top-K non-trivial fault-tolerant repeating patterns (FT-RPs in short) with length no less than min_len for data sequences is proposed in this thesis. By extending the idea of appearing bit sequences, fault-tolerant appearing bit sequences are defined to represent the locations where candidate patterns appear in a data sequence with insertion/deletion errors allowed. Two algorithms, named TFTRP-Mine (Top-K non-trivial FT-RPs Mining) and RE-TFTRP-Mine (REfinement of TFTRP-Mine), respectively, are proposed. Both of two algorithms use the recursive formulas to obtain fault-tolerant appearing bit sequences of a pattern systematically and then the fault-tolerant frequency of each candidate pattern could be counted quickly. Besides, RE-TFTRP-Mine adopts two additional strategies to prune the searching space in order to increase the mining efficiency. From experiment results, we can know that RE-TFTRP-Mine outperforms TFTRP-Mine algorithm when K and min_len are small. In addition, when adopting fault tolerant mining, more important and implicit repeating patterns could be found for music objects.
Databáze: Networked Digital Library of Theses & Dissertations