Symbolic, Neural, and Bayesian Machine Learning Models for Predicting Carcinogenicity of Chemical Compounds
Authors: | Dennis Bahler, Carol Wellington, Brian A. Stone, Douglas W. Bristol |
---|---|
Publication year: | 2000 |
Subject: | Male; Carcinogenicity Tests; Process (engineering); Computer science; Bayesian probability; Decision tree; Rodentia; Machine learning; Bayes' theorem; Empirical research; Animals; Humans; Artificial neural network; Rule sets; General Chemistry; Computer Science Applications; Data set; Computational Theory and Mathematics; Carcinogens; Female; Neural Networks, Computer; Artificial intelligence; Data mining; Information Systems |
Source: | Journal of Chemical Information and Computer Sciences. 40:906-914 |
ISSN: | 1520-5142 0095-2338 |
DOI: | 10.1021/ci990116i |
Description: | Experimental programs have been underway for several years to determine the environmental effects of chemical compounds, mixtures, and the like. Among these programs is the National Toxicology Program (NTP) on rodent carcinogenicity. Because these experiments are costly and time-consuming, the rate at which test articles (i.e., chemicals) can be tested is limited. The ability to predict the outcome of the analysis at various points in the process would facilitate informed decisions about the allocation of testing resources. To assist human experts in organizing an empirical testing regime, and to try to shed light on mechanisms of toxicity, we constructed toxicity models using various machine learning and data mining methods, both existing and those of our own devising. These models took the form of decision trees, rule sets, neural networks, rules extracted from trained neural networks, and Bayesian classifiers. As a training set, we used recent results from rodent carcinogenicity bioassays conducted by the NTP on 226 test articles. We performed 10-way cross-validation on each of our models to approximate their expected error rates on unseen data. The data set consists of physical-chemical parameters of test articles, alerting chemical substructures, Salmonella mutagenicity assay results, subchronic histopathology data, and information on route, strain, and sex/species for 744 individual experiments. These results contribute to the ongoing process of evaluating and interpreting the data collected from chemical toxicity studies. |
Database: | OpenAIRE |
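
The description mentions 10-way cross-validation as the way expected error rates on unseen data were approximated. The following is a minimal sketch of that general procedure, not the authors' implementation: the helper names (`k_fold_indices`, `cross_validated_error`, `fit_majority`, `predict_majority`) are hypothetical, the learner is a trivial majority-class stand-in rather than any of the models in the paper, and the data are synthetic placeholders sized to 226 rows only to echo the number of test articles.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k near-equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_error(features, labels, fit, predict, k=10, seed=0):
    """Average the held-out error rate over k train/test splits."""
    folds = k_fold_indices(len(labels), k, seed)
    fold_errors = []
    for held_out in folds:
        held_out_set = set(held_out)
        train = [i for i in range(len(labels)) if i not in held_out_set]
        model = fit([features[i] for i in train], [labels[i] for i in train])
        wrong = sum(predict(model, features[i]) != labels[i] for i in held_out)
        fold_errors.append(wrong / len(held_out))
    return sum(fold_errors) / k


# Hypothetical stand-in learner: always predicts the most common training label.
def fit_majority(train_features, train_labels):
    return max(set(train_labels), key=train_labels.count)


def predict_majority(model, feature_row):
    return model


# Synthetic placeholder data, NOT the NTP data set.
rng = random.Random(1)
toy_features = [[rng.random()] for _ in range(226)]
toy_labels = [int(row[0] > 0.5) for row in toy_features]

error = cross_validated_error(toy_features, toy_labels, fit_majority, predict_majority)
print(f"estimated error rate: {error:.3f}")
```

Any learner can be swapped in by passing a different `fit`/`predict` pair, so the same loop could in principle compare several model families on identical folds.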