Solving the Sample Size Problem for Resource Selection Analysis

Authors: Christina M. Prokopenko, Eric Vander Wal, Kevin L. Monteith, Larissa T. Beumer, Levi J. Newediuk, James C. Beasley, Alexine Keuroghlian, Garrett M. Street, John M. Fryxell, Jonathan R. Potts, David A. Keiter, Miltinho C. Ribeiro, Olin E. Rhodes, Peter E. Schlichting, Floris M. van Beest, Guha Dharmarajan, Philip D. McLoughlin, Bronson K. Strickland, Samantha P. H. Dwinnell, David A. Bernasconi, Júlia Emi de Faria Oshima, Luca Börger, Stephen Demarais, Niels Martin Schmidt, Arthur R. Rodgers
Year of publication: 2021
Subject:
DOI: 10.1101/2021.02.22.432319
Description: Sample size sufficiency is a critical consideration for conducting Resource-Selection Analyses (RSAs) from GPS-based animal telemetry. Cited thresholds for sufficiency include a number of captured animals M ≥ 30 and as many relocations per animal N as possible. These thresholds render many RSA-based studies misleading if large sample sizes were truly insufficient, or unpublishable if small sample sizes were sufficient but failed to meet reviewer expectations. We provide the first comprehensive solution for RSA sample size by deriving closed-form mathematical expressions for the number of animals M and the number of relocations per animal N required to estimate model outputs to a given degree of precision. The sample sizes needed depend on just two biologically meaningful quantities: habitat selection strength and a novel measure of landscape complexity, which we define rigorously. The mathematical expressions are calculable for any environmental dataset at any spatial scale and are applicable to any study involving resource selection (including sessile organisms). We validate our analytical solutions using globally relevant empirical data including 5,678,623 GPS locations from 511 animals from 10 species (omnivores, carnivores, and herbivores living in boreal, temperate, and tropical forests, montane woodlands, swamps, and arctic tundra). Our analytic expressions show that the required M and N must decline with increasing selection strength and increasing landscape complexity, and this decline is insensitive to the definition of availability used in the analysis. Our results contradict conventional wisdom by demonstrating that the most biologically relevant effects on the utilization distribution (i.e., those landscape conditions with the greatest absolute magnitude of resource selection) can often be estimated with far fewer data than is commonly assumed. We identify several critical steps in implementing these equations, including (i) a priori selection of expected model coefficients, and (ii) sampling intensity for background (absence/pseudo-absence) data within a given definition of availability. We show that random sampling of background data violates the underlying mathematics of RSA, leading to incorrect values for necessary M and N and potentially incorrect RSA model outputs. We argue that these equations should be a mandatory component for all future RSA studies.
Database: OpenAIRE