Mining the Relationships in the form of the Predisposing Factors and Co-Incident Factors among Numerical Dynamic Attributes in Time Series Data Set by Using the Combination of Some Existing Techniques
Author: | M. Abdus Salam, Suwimon Kooptiwoot |
---|---|
Year of publication: | 2006 |
Subject: | |
Source: | Enterprise Information Systems VI ISBN: 1402036744 ICEIS (2) |
Description: | Temporal mining is a natural extension of data mining with the added capabilities of discovering interesting patterns and inferring relationships of contextual and temporal proximity, which may also lead to possible cause-effect associations. Temporal mining covers a wide range of paradigms for knowledge modeling and discovery. A common practice is to discover frequent sequences and patterns of a single variable. In this paper we present a new algorithm that combines several existing ideas: the reference event proposed in (Bettini, Wang et al. 1998), the event detection technique proposed in (Guralnik and Srivastava 1999), the large fraction proposed in (Mannila, Toivonen et al. 1997), and the causal inference proposed in (Blum 1982). We use all of these ideas to build our new algorithm for discovering multivariable sequences in the form of the predisposing factors and co-incident factors of a reference event of interest. We define an event as a positive or negative direction of data change above a threshold value. From these patterns we infer predisposing and co-incident factors with respect to a reference variable. For this purpose we study Open Source Software data collected from the SourceForge website. Out of 240+ attributes we consider only thirteen time-dependent attributes: Page-views, Download, Bugs0, Bugs1, Support0, Support1, Patches0, Patches1, Tracker0, Tracker1, Tasks0, Tasks1 and CVS. These attributes indicate the degree and patterns of project activity over the course of a project's progress. The number of downloads is a good indication of a project's progress, so we use Download as the reference attribute. We also test our algorithm on four synthetic data sets containing up to 50% noise. The results show that our algorithm works well and tolerates noisy data. |
Database: | OpenAIRE |
External link: |
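
The idea summarized in the description (events as threshold-exceeding changes, examined before and at the time of a reference event) can be illustrated with a minimal sketch. This is not the authors' algorithm: the record gives no threshold values, lag windows, or exact "large fraction" criterion, so the consecutive-sample difference, the `lag` parameter, the `min_fraction` cutoff, all function names, and the sample numbers below are assumptions made for illustration only.

```python
from collections import Counter


def detect_events(series, threshold):
    """Label each step +1 (rise), -1 (fall) or 0 (no event).

    Assumption: an event is a change between consecutive samples
    whose magnitude exceeds `threshold`.
    """
    events = []
    for prev, cur in zip(series, series[1:]):
        delta = cur - prev
        if delta > threshold:
            events.append(1)
        elif delta < -threshold:
            events.append(-1)
        else:
            events.append(0)
    return events


def factor_counts(reference_events, other_events, reference_direction, lag):
    """Count the directions of another attribute `lag` steps before
    each reference event of the given direction.

    Assumption: lag == 0 looks for co-incident factors and
    lag >= 1 looks for predisposing factors.
    """
    counts = Counter()
    for t, ref in enumerate(reference_events):
        if ref == reference_direction and t - lag >= 0:
            counts[other_events[t - lag]] += 1
    return counts


def dominant_direction(counts, min_fraction=0.5):
    """Return +1 or -1 if that direction covers more than
    `min_fraction` of the counted cases, else None.

    Assumption: a simple majority stands in for the "large fraction".
    """
    total = sum(counts.values())
    if total == 0:
        return None
    direction, n = counts.most_common(1)[0]
    if direction != 0 and n / total > min_fraction:
        return direction
    return None


# Hypothetical data: page views rise one step before downloads rise.
downloads = [10, 10, 30, 30, 60, 60, 100]
page_views = [5, 20, 20, 45, 45, 80, 80]

download_events = detect_events(downloads, threshold=5)
page_view_events = detect_events(page_views, threshold=5)

predisposing = dominant_direction(
    factor_counts(download_events, page_view_events, 1, lag=1))
co_incident = dominant_direction(
    factor_counts(download_events, page_view_events, 1, lag=0))
```

With this made-up data, `predisposing` is `1` (a rise in page views precedes each rise in downloads) and `co_incident` is `None` (page views do not change in the same step). A real analysis would need thresholds and lags chosen per attribute.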