On Combining Recursive Partitioning and Simulated Annealing To Detect Groups of Biologically Active Compounds
Authors: | Paul E. Blower, Jeffrey Bjoraker, Michael A. Fligner, Joseph S. Verducci |
---|---|
Publication year: | 2002 |
Subject: |
Molecular Structure; Decision tree; Binary number; Antineoplastic Agents; Recursive partitioning; Biological activity; General Chemistry; Machine learning; Cell Line; Computer Science Applications; Computational Theory and Mathematics; Molecular descriptor; Simulated annealing; Node (circuits); Artificial intelligence; Biological system; Algorithms; Information Systems; Mathematics |
Source: | Journal of Chemical Information and Computer Sciences. 42:393-404 |
ISSN: | 0095-2338 |
DOI: | 10.1021/ci0101049 |
Description: | Statistical data mining methods have proven to be powerful tools for investigating correlations between molecular structure and biological activity. Recursive partitioning (RP), in particular, offers several advantages in mining large, diverse data sets resulting from high-throughput screening. When used with binary molecular descriptors, the standard implementation of RP splits on single descriptors. We use simulated annealing (SA) to find combinations of molecular descriptors whose simultaneous presence best separates off the most active, chemically similar group of compounds. The search is incorporated into a recursive partitioning design to produce a regression tree for biological activity on the space of structural fingerprints. Each node is characterized by a specific combination of structural features, and the terminal nodes with high average activities correspond, roughly, to different classes of compounds. Using LeadScope structural features as descriptors to mine a database from the National Cancer Institute, the merging of RP and SA consistently identifies structurally homogeneous classes of highly potent anticancer agents. |
Database: | OpenAIRE |
External link: |
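
The abstract describes using simulated annealing to search for a combination of binary descriptors whose simultaneous presence splits off a highly active group of compounds. The record does not give the paper's objective function, move set, or cooling schedule, so the following is only a minimal sketch under stated assumptions: the score is the mean activity of the matching compounds (rejected if fewer than `min_size` match), a move toggles one descriptor in or out of the combination, and the temperature decays geometrically. All names (`score`, `anneal`, `k_max`, `min_size`) are hypothetical, and the sketch covers a single split only, not the full recursive partitioning tree.

```python
import itertools
import math
import random


def score(subset, fingerprints, activity, min_size=3):
    """Mean activity of compounds that have every descriptor in `subset`.

    Assumed objective, not the paper's: groups smaller than `min_size`
    are treated as invalid and scored as -inf.
    """
    members = [a for fp, a in zip(fingerprints, activity)
               if all(fp[j] for j in subset)]
    if len(members) < min_size:
        return float("-inf")
    return sum(members) / len(members)


def anneal(fingerprints, activity, k_max=3, steps=2000,
           t0=1.0, cooling=0.995, seed=0):
    """Search for a descriptor combination with a high score."""
    rng = random.Random(seed)
    n_desc = len(fingerprints[0])
    current = frozenset()                      # start with no required descriptors
    cur_score = score(current, fingerprints, activity)
    best, best_score = current, cur_score
    temp = t0
    for _ in range(steps):
        j = rng.randrange(n_desc)
        candidate = current ^ {j}              # toggle descriptor j in or out
        if len(candidate) <= k_max:
            cand_score = score(candidate, fingerprints, activity)
            delta = cand_score - cur_score
            # Metropolis rule: always accept improvements, sometimes accept losses.
            if delta >= 0 or rng.random() < math.exp(delta / temp):
                current, cur_score = candidate, cand_score
                if cur_score > best_score:
                    best, best_score = current, cur_score
        temp *= cooling                        # geometric cooling (assumed schedule)
    return best, best_score


# Toy data: 16 compounds over 4 binary descriptors; a compound is "active"
# only when descriptors 0 and 2 are both present.
fingerprints = list(itertools.product([0, 1], repeat=4))
activity = [10.0 if fp[0] and fp[2] else 1.0 for fp in fingerprints]

best, best_score = anneal(fingerprints, activity)
print(sorted(best), best_score)
```

In a tree-building setting, the compounds matching the returned combination would form one child node and the search would be repeated on the remainder; that recursion is omitted here.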