Effects of the Training Dataset Characteristics on the Performance of Nine Species Distribution Models: Application to Diabrotica virgifera virgifera
Authors: | Jan Pergl, Philippe Reynaud, Dominic Eyre, Richard Baker, Maxime Dupin, Vojtěch Jarošík, Sarah Brunel, David Makowski |
---|---|
Contributors: | Unité de recherche Zoologie Forestière (URZF), Institut National de la Recherche Agronomique (INRA), Peuplements végétaux et bioagresseurs en milieu tropical (UMR PVBMT), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Institut de Recherche pour le Développement (IRD)-Institut National de la Recherche Agronomique (INRA)-Université de La Réunion (UR), Lab Sante Vegetaux, Stn Angers, Agence nationale de sécurité sanitaire de l'alimentation, de l'environnement et du travail (ANSES), Fac Sci, Dept Ecol, Charles University [Prague] (CU), Inst Bot, Biology Centre of the ASCR, Food & Environm Res Agcy, EPPO OEPP, Inst Ecol & Evolut, University of Bern, Agronomie, Institut National de la Recherche Agronomique (INRA)-AgroParisTech, European Commission [212459], Czech Science Foundation [206/09/0563], Ministry of Education, Youth and Sports of the Czech Republic [MSM0021620828, AV0Z60050516, LC06073], Unité de recherche Zoologie Forestière (UZF), Institut de Recherche pour le Développement (IRD)-Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-Université de La Réunion (UR)-Institut National de la Recherche Agronomique (INRA), Charles University, AgroParisTech-Institut National de la Recherche Agronomique (INRA) |
Language: | English |
Publication year: | 2011 |
Subject: |
0106 biological sciences
Calibration (statistics) ENVELOPE MODELS Systems Engineering ACCURACY [SDV]Life Sciences [q-bio] Species distribution lcsh:Medicine Plant Science 01 natural sciences Engineering Statistics WESTERN CORN-ROOTWORM lcsh:Science TEMPERATURE Mathematics Principal Component Analysis Plant Pests Multidisciplinary CLIMATE-CHANGE Geography Ecology biology Agriculture Research Assessment Europe Community Ecology Principal component analysis COLEOPTERA Risk Analysis Research Article SAMPLE-SIZE GEOGRAPHICAL-DISTRIBUTION BIOLOGICAL INVASIONS CHRYSOMELIDAE Science Policy Cereals Crops Ecological Risk Zea mays 010603 evolutionary biology Model Organisms Plant and Algal Models Plant-Environment Interactions Animals Biology Receiver operating characteristic Plant Ecology 010604 marine biology & hydrobiology lcsh:R Training (meteorology) Plant Pathology biology.organism_classification Maize Support vector machine Western corn rootworm Sample size determination North America lcsh:Q Pest Control |
Source: | PLoS ONE, Public Library of Science, 2011, Vol 6, Iss 6, p e20957, ⟨10.1371/journal.pone.0020957⟩ |
ISSN: | 1932-6203 |
DOI: | 10.1371/journal.pone.0020957 |
Description: | Many distribution models developed to predict the presence/absence of invasive alien species need to be fitted to a training dataset before practical use. The training dataset is characterized by the number of recorded presences/absences and by their geographical locations. The aim of this paper is to study the effect of the training dataset characteristics on model performance and to compare the relative importance of three factors influencing model predictive capability: size of the training dataset, stage of the biological invasion, and choice of input variables. Nine models were assessed for their ability to predict the distribution of the western corn rootworm, Diabrotica virgifera virgifera, a major pest of corn in North America that has recently invaded Europe. Twenty-six training datasets of various sizes (from 10 to 428 presence records) corresponding to two different stages of invasion (1955 and 1980) and three sets of input bioclimatic variables (19 variables, six variables selected using information on insect biology, and three linear combinations of 19 variables derived from Principal Component Analysis) were considered. The models were fitted to each training dataset in turn and their performance was assessed using independent data from North America and Europe. The models were ranked according to the area under the Receiver Operating Characteristic curve and the likelihood ratio. Model performance was highly sensitive to the geographical area used for calibration; most of the models performed poorly when fitted to a restricted area corresponding to an early stage of the invasion. Our results also showed that Principal Component Analysis was useful in reducing the number of model input variables for the models that performed poorly with 19 input variables. DOMAIN, Environmental Distance, MAXENT, and Envelope Score were the most accurate models, but all the models tested in this study led to a substantial rate of misclassification. |
Database: | OpenAIRE |
External link: |