Abstract: |
Applying canonical correspondence analysis to metagenomics data with widely differing library sizes (site totals) revealed that Canoco and the R packages ade4 and vegan can yield (at least up to 2022) very different P-values in statistical tests of the relationship between taxonomic composition (species composition) and predictors (environmental variables and/or treatments). The reason is that vegan and Canoco up to version 5.12 apply residualized response permutation (but ignore the model intercept), whereas ade4 applies predictor permutation. Predictor permutation, when extended to residualized predictor permutation, is applicable in partial constrained ordination. This paper shows by simulation that residualized response permutation can yield a severely inflated Type I error rate if the abundance data are both overdispersed and highly variable in site total. In contrast, residualized predictor permutation controlled the Type I error rate and had good power, even when the predictors were skewed or binary. After square-root or log transformation of the abundance data, the differences between the permutation methods became small. Residualized predictor permutation is recommended, particularly for testing trait–environment relationships using double constrained correspondence analysis, because this method also depends critically on the species totals, which are generally highly variable. It is implemented in Canoco 5.15 and in the R code accompanying this paper. [ABSTRACT FROM AUTHOR] |
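
The idea of predictor permutation can be illustrated with a minimal sketch. It is not the paper's R code and not the Canoco, vegan, or ade4 implementation: it assumes an unweighted least-squares (redundancy-analysis style) pseudo-F statistic rather than the weighted statistic of canonical correspondence analysis, and it has no covariates, so no residualization step is needed. The function names `pseudo_f` and `predictor_permutation_test` are hypothetical.

```python
import numpy as np


def pseudo_f(Y, X):
    """Pseudo-F of a multivariate least-squares fit of Y on X.

    Simplifying assumption: plain column centring and unweighted least
    squares, not the site-total weighting used in correspondence analysis.
    """
    Yc = Y - Y.mean(axis=0)
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, Yc, rcond=None)
    fitted = Xc @ beta
    ss_fit = np.sum(fitted ** 2)          # explained sum of squares
    ss_res = np.sum((Yc - fitted) ** 2)   # residual sum of squares
    n = Y.shape[0]
    q = np.linalg.matrix_rank(Xc)         # number of independent predictors
    return (ss_fit / q) / (ss_res / (n - 1 - q))


def predictor_permutation_test(Y, X, n_perm=999, seed=0):
    """Permute the rows of the predictor matrix X; leave the response Y fixed."""
    rng = np.random.default_rng(seed)
    observed = pseudo_f(Y, X)
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(X.shape[0])
        if pseudo_f(Y, X[perm]) >= observed:
            exceed += 1
    # Count the observed statistic as one member of the reference set.
    p_value = (exceed + 1) / (n_perm + 1)
    return observed, p_value
```

Because only the rows of the predictor matrix are shuffled, every permuted data set keeps the original abundance table, including its site totals, intact; a response-permutation scheme would instead shuffle (residuals of) the abundance data.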