Decision Tree Models for Prediction of Macroinvertebrate Taxa in the River Axios (Northern Greece)

Authors: Andy Dedecker, M. Lazaridou-Dimitriadou, Peter Goethals, Niels De Pauw, Eleni Dakou, Tom D'Heygere
Publication year: 2006
Subject:
Source: Aquatic Ecology. 41:399-411
ISSN: 1573-5125, 1386-2588
DOI: 10.1007/s10452-006-9058-y
Description: In this study, decision tree models were induced to predict the habitat suitability of six macroinvertebrate taxa: Asellidae, Baetidae, Caenidae, Gammaridae, Gomphidae and Heptageniidae. The modelling techniques were applied to a dataset of 102 samples collected at 31 sites along the river Axios in Northern Greece. The database consisted of eight physical-chemical and seven structural variables, as well as the abundances of 90 macroinvertebrate taxa. A seasonal variable was included to allow the description of potential temporal changes in the macroinvertebrate taxa. Rules relating the presence/absence of the six benthic macroinvertebrate taxa to the 15 physical-chemical and structural river characteristics and the seasonal variable were induced using the J48 algorithm. To improve the performance and interpretability of the induced models, three optimisation techniques were applied: tree-pruning, bagging and boosting. The predictive performance of the decision tree models was assessed on the basis of the percentage of Correctly Classified Instances (CCI) and Cohen’s kappa statistic. The results demonstrated that, although the models had a relatively high predictive performance, noise in the dataset and inappropriate input variables to some extent prevented the models from making reliable predictions. Although tree-pruning did not significantly improve the reliability of the induced models, it considerably reduced tree complexity and in this way increased the transparency of the trees. Consequently, the induced models allowed for a correct ecological interpretation. The effect of bagging and boosting, on the other hand, varied considerably between the different models, as well as between repetitions of 10-fold cross-validation within an individual model. In some cases the predictive performance improved; in others it remained stable or even worsened. The effect of bagging and boosting seemed to depend strongly on the dataset to which the two techniques were applied. Tree-pruning thus proved to have a high potential when applied in models used for decision-making in river restoration and conservation management.
Database: OpenAIRE
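
Note: the two performance measures named in the description, CCI and Cohen's kappa, can be illustrated with a minimal Python sketch. This is not the authors' code; the function name `cci_and_kappa` and the presence/absence labels below are hypothetical and serve only to show how the two measures are usually computed from observed and predicted classes.

```python
from collections import Counter


def cci_and_kappa(observed, predicted):
    """Return (CCI in percent, Cohen's kappa) for two equal-length label sequences."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(observed)

    # Observed agreement: share of samples whose predicted class matches the observed one.
    p_o = sum(o == p for o, p in zip(observed, predicted)) / n

    # Agreement expected by chance: product of the marginal class frequencies,
    # summed over all classes.
    obs_counts = Counter(observed)
    pred_counts = Counter(predicted)
    p_e = sum(obs_counts[c] * pred_counts[c] for c in obs_counts) / (n * n)

    # Kappa is undefined when chance agreement is already perfect.
    kappa = float("nan") if p_e == 1.0 else (p_o - p_e) / (1.0 - p_e)
    return 100.0 * p_o, kappa


# Hypothetical presence (1) / absence (0) labels for one taxon at ten samples.
observed = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 1, 1, 0]
cci, kappa = cci_and_kappa(observed, predicted)
print(f"CCI = {cci:.1f}%, kappa = {kappa:.2f}")
```

CCI alone can look favourable when one class dominates the samples, which is why a chance-corrected measure such as kappa is commonly reported alongside it.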