Incorporating Repeating Temporal Association Rules in Naïve Bayes Classifiers for Coronary Heart Disease Diagnosis
Author: | Riccardo Bellazzi, Athena Stassopoulou, Arianna Dagliati, Kalia Orphanou, Elpida T. Keravnou, Lucia Sacchi |
---|---|
Language: | English |
Year of publication: | 2018 |
Subject: |
Time series classification; Adult; Time Factors; Association rule learning; Databases, Factual; Computer science; Health Informatics; Coronary Disease; 02 engineering and technology; Pattern Recognition, Automated; Naive Bayes classifier; 020204 information systems; 0202 electrical engineering, electronic engineering, information engineering; Data Mining; Humans; Temporal pattern mining; business.industry; Decision Trees; Reproducibility of Results; Pattern recognition; Bayes Theorem; Middle Aged; Coronary heart disease; Computer Science Applications; Single patient; 020201 artificial intelligence & image processing; Artificial intelligence; Neural Networks, Computer; business; Classifier (UML); Algorithms; Medical Informatics |
Source: | Orphanou, K, Dagliati, A, Sacchi, L, Stassopoulou, A, Keravnou, E & Bellazzi, R 2018, 'Incorporating Repeating Temporal Association Rules in Naïve Bayes Classifiers for Coronary Heart Disease Diagnosis', Journal of Biomedical Informatics. https://doi.org/10.1016/j.jbi.2018.03.002 |
Description: | In this paper, we develop a Naïve Bayes classification model integrated with temporal association rules (TARs). A temporal pattern mining algorithm is used to detect TARs by identifying the most frequent temporal relationships among the derived basic temporal abstractions (TAs). We develop and compare three classifiers that use the most frequent TARs as features: (i) the most frequent TARs detected within the target class ('Disease = Present'); (ii) the most frequent TARs from both classes ('Disease = Present', 'Disease = Absent'); and (iii) the most frequent TARs after removing those that are low-risk predictors of the disease. These classifiers use as features the horizontal support of TARs, defined as the number of times a particular temporal pattern is found in a patient's record. All of the developed classifiers are applied to the diagnosis of coronary heart disease (CHD) using a longitudinal dataset. We compare two feature representations for a single patient: the horizontal support and the mean duration of each TAR. The results show that the horizontal support representation outperforms the mean duration. The main aim of our research is to demonstrate that, where long time periods are significant in a medical domain such as CHD, detecting the repeated occurrences of the most frequent TARs can yield better performance. We also compared the best-performing classifier that uses the horizontal support representation with a Baseline Classifier that uses the binary representation of the most frequent TARs. The results illustrate the comparatively high performance of the horizontal support classifier over the Baseline Classifier. |
Database: | OpenAIRE |
External link: |
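
The difference between the horizontal support and binary representations described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the rule names (`tar_a`, `tar_b`), the toy patient records, and the choice of a multinomial Naïve Bayes with Laplace smoothing are hypothetical, since the record does not state which Naïve Bayes variant or data encoding the authors used.

```python
import math
from collections import Counter

# Hypothetical rule names; real TARs would come from a temporal pattern miner.
RULES = ["tar_a", "tar_b"]

def horizontal_support(record, rules=RULES):
    """Count how many times each rule occurs in one patient's record."""
    counts = Counter(record)
    return [counts[r] for r in rules]

def binary_representation(record, rules=RULES):
    """Baseline encoding: 1 if the rule occurs at least once, else 0."""
    present = set(record)
    return [int(r in present) for r in rules]

def fit_multinomial_nb(X, y, alpha=1.0):
    """Fit a multinomial Naive Bayes model with Laplace smoothing (assumed variant)."""
    model = {}
    n_features = len(X[0])
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        totals = [sum(col) for col in zip(*rows)]
        denom = sum(totals) + alpha * n_features
        model[label] = {
            "log_prior": math.log(len(rows) / len(y)),
            "log_lik": [math.log((t + alpha) / denom) for t in totals],
        }
    return model

def predict(model, x):
    """Return the label with the highest unnormalised log posterior."""
    def score(label):
        params = model[label]
        return params["log_prior"] + sum(
            c * ll for c, ll in zip(x, params["log_lik"]))
    return max(model, key=score)

# Toy records (invented): each lists the TAR occurrences found for one patient.
records = [
    ["tar_a", "tar_a", "tar_a", "tar_b"],
    ["tar_a", "tar_a", "tar_b"],
    ["tar_a", "tar_b", "tar_b", "tar_b"],
    ["tar_b", "tar_b"],
]
labels = ["present", "present", "absent", "absent"]

X_support = [horizontal_support(r) for r in records]
X_binary = [binary_representation(r) for r in records]
model = fit_multinomial_nb(X_support, labels)
prediction = predict(model, horizontal_support(["tar_a", "tar_a", "tar_b"]))
```

In this toy data the first and third records receive the same binary vector even though their horizontal support vectors differ, which is the kind of information about repeated occurrences that the binary baseline discards.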