An Efficient Approach for Mining Fault-Tolerant Repeating Patterns based on Bit Sequences
Autor: | Yu-Ting, Kung, 龔毓婷 |
---|---|
Rok vydání: | 2004 |
Druh dokumentu: | 學位論文 ; thesis |
Popis: | 92 An efficient way of mining top-K non-trivial fault-tolerant repeating patterns (FT-RPs in short) with length no less than min_len for data sequences is proposed in this thesis. By extending the idea of appearing bit sequences, fault-tolerant appearing bit sequences are defined to represent the locations where candidate patterns appear in a data sequence with insertion/deletion errors allowed. Two algorithms, named TFTRP-Mine (Top-K non-trivial FT-RPs Mining) and RE-TFTRP-Mine (REfinement of TFTRP-Mine), respectively, are proposed. Both of two algorithms use the recursive formulas to obtain fault-tolerant appearing bit sequences of a pattern systematically and then the fault-tolerant frequency of each candidate pattern could be counted quickly. Besides, RE-TFTRP-Mine adopts two additional strategies to prune the searching space in order to increase the mining efficiency. From experiment results, we can know that RE-TFTRP-Mine outperforms TFTRP-Mine algorithm when K and min_len are small. In addition, when adopting fault tolerant mining, more important and implicit repeating patterns could be found for music objects. |
Databáze: | Networked Digital Library of Theses & Dissertations |
Externí odkaz: |