Supervised machine learning is superior to indicator value inference in monitoring the environmental impacts of salmon aquaculture using eDNA metabarcodes
Author: | Larissa Frühe, Verena Dully, Catarina I. M. Martins, Thomas A. Wilding, Thorsten Stoeck, Tristan Cordier, Jan Pawlowski, Guillaume Lentendu, Hans-Werner Breiner |
---|---|
Publication year: | 2020 |
Subject: |
0106 biological sciences;
0301 basic medicine; Inference; Aquaculture; Biology; Environment; Machine learning; computer.software_genre; 010603 evolutionary biology; 01 natural sciences; 03 medical and health sciences; Salmon; Environmental monitoring; Genetics; Feature (machine learning); Animals; DNA Barcoding, Taxonomic; 14. Life underwater; Natural ecosystem; Ecology, Evolution, Behavior and Systematics; Ecosystem; business.industry; Norway; Biodiversity; Random forest; 030104 developmental biology; 13. Climate action; Indicator value; Salmon aquaculture; Artificial intelligence; Supervised Machine Learning; business; computer; Bioindicator; Environmental Monitoring |
Source: | Molecular Ecology. 30(13) |
ISSN: | 1365-294X |
Description: | Increasing anthropogenic impact and global change effects on natural ecosystems have prompted the development of less expensive and more efficient bioassessment methodologies. One promising approach is the integration of DNA metabarcoding in environmental monitoring. A critical step in this process is the inference of ecological quality (EQ) status from identified molecular bioindicator signatures that mirror environmental classification based on standard macroinvertebrate surveys. The most promising approaches to infer EQ from biotic indices (BI) are supervised machine learning (SML) and the calculation of indicator values (IndVal). In this study, we compared the performance of both approaches using DNA metabarcodes of bacteria and ciliates as bioindicators, obtained from 152 samples collected from seven Norwegian salmon farms. Results from standard macroinvertebrate monitoring of the same samples were used as a reference to compare the accuracy of both approaches. First, SML outperformed the IndVal approach in inferring EQ from eDNA metabarcodes. The Random Forest (RF) algorithm appeared to be less sensitive to noisy data (a typical feature of massive environmental sequence data sets) and to uneven data coverage across EQ classes (a typical feature of environmental compliance monitoring schemes) than a widely used method to infer IndVals for the calculation of a BI. Second, bacterial eDNA metabarcodes allowed for a more accurate EQ assessment than ciliate eDNA metabarcodes. For the implementation of DNA metabarcoding in routine monitoring programmes to assess EQ around salmon aquaculture cages, we therefore recommend bacterial DNA metabarcodes in combination with SML to classify EQ categories based on molecular signatures. |
Database: | OpenAIRE |
External link: |