PACE: Probabilistic Assessment for Contributor Estimation— A machine learning-based assessment of the number of contributors in DNA mixtures
Author: | Michael Marciano, Jonathan D. Adelman |
---|---|
Year of publication: | 2017 |
Subject: |
0301 basic medicine
Computer science; Sample (statistics); computer.software_genre; Machine learning; Field (computer science); Pathology and Forensic Medicine; Machine Learning; 03 medical and health sciences; Forensic DNA; 0302 clinical medicine; Genetics; Humans; 030216 legal & forensic medicine; Probability; Pace; Estimation; Models, Statistical; business.industry; Probabilistic logic; DNA; DNA Fingerprinting; ComputingMethodologies_PATTERNRECOGNITION; 030104 developmental biology; State (computer science); Data mining; Deconvolution; Artificial intelligence; business; computer; Algorithms |
Source: | Forensic Science International: Genetics. 27:82-91 |
ISSN: | 1872-4973 |
DOI: | 10.1016/j.fsigen.2016.11.006 |
Description: | The deconvolution of DNA mixtures remains one of the most critical challenges in the field of forensic DNA analysis. Of all the data features required to perform such deconvolution, the number of contributors in the sample is widely considered the most important and, if incorrectly chosen, the most likely to negatively influence the mixture interpretation of a DNA profile. Unfortunately, most current approaches to mixture deconvolution require the assumption that the number of contributors is known by the analyst, an assumption that can prove especially faulty when faced with increasingly complex mixtures of 3 or more contributors. In this study, we propose a probabilistic approach for estimating the number of contributors in a DNA mixture that leverages the strengths of machine learning. To assess this approach, we compare the classification performance of six machine learning algorithms and evaluate the model from the top-performing algorithm against the current state of the art in the field of contributor number classification. Overall results show over 98% accuracy in identifying the number of contributors in a DNA mixture of up to 4 contributors. Comparative results showed that 3-person mixtures had a classification accuracy improvement of over 6% compared to the current best-in-field methodology, and that 4-person mixtures had a classification accuracy improvement of over 20%. The Probabilistic Assessment for Contributor Estimation (PACE) also accomplishes classification of mixtures of up to 4 contributors in less than 1 s using a standard laptop or desktop computer. Considering the high classification accuracy rates, as well as the significant time commitment required by the current state-of-the-art model versus the seconds required by a machine learning-derived model, the approach described herein provides a promising means of estimating the number of contributors and, subsequently, will lead to improved DNA mixture interpretation. |
Database: | OpenAIRE |
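
The abstract describes a classifier that returns a probability for each candidate number of contributors, but this record does not state which features or which algorithm PACE uses. As a rough illustration of the general idea only, the sketch below swaps in a nearest-centroid classifier with a softmax over distances. The two features (maximum allele count at any locus, mean allele count per locus), the training rows, and all function names are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict

# Hypothetical toy rows: (max alleles at any locus, mean alleles per locus) -> contributors.
# The values are made up for illustration and are NOT data from the paper.
TRAIN = [
    ((2, 1.6), 1), ((2, 1.7), 1),
    ((4, 2.9), 2), ((4, 3.1), 2),
    ((6, 4.0), 3), ((5, 4.2), 3),
    ((7, 5.1), 4), ((8, 5.3), 4),
]

def fit_centroids(rows):
    """Average the feature vectors of each class (number of contributors)."""
    sums = defaultdict(lambda: [0.0, 0.0])
    counts = defaultdict(int)
    for (f1, f2), label in rows:
        sums[label][0] += f1
        sums[label][1] += f2
        counts[label] += 1
    return {label: (s[0] / counts[label], s[1] / counts[label])
            for label, s in sums.items()}

def predict_proba(centroids, features):
    """Softmax over negative squared distances: a closer centroid gets more probability."""
    scores = {
        label: -((features[0] - c[0]) ** 2 + (features[1] - c[1]) ** 2)
        for label, c in centroids.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}

def estimate_contributors(centroids, features):
    """Return the most probable contributor count and the full distribution."""
    probs = predict_proba(centroids, features)
    best = max(probs, key=probs.get)
    return best, probs

centroids = fit_centroids(TRAIN)
best, probs = estimate_contributors(centroids, (6, 4.1))
print(best, {k: round(v, 3) for k, v in sorted(probs.items())})
```

Returning the whole distribution rather than a single label lets an analyst see how strongly the estimate is favoured over neighbouring contributor counts, which is the practical appeal of a probabilistic formulation.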