Data-driven high-throughput prediction of the 3-D structure of small molecules: review and progress
Authors: | Alessio Andronico, Arlo Randall, Ryan W. Benz, Pierre Baldi |
---|---|
Publication year: | 2011 |
Subject: | Models, Molecular; Informatics; Models, Statistical; Databases, Factual; General Chemical Engineering; Molecular Conformation; General Chemistry; Library and Information Sciences; Dihedral angle; Small molecule; Article; Computer Science Applications; Data-driven; Pattern Recognition, Automated; Distribution system; Chemistry; Prediction methods; Biological property; High field; Biological system; Simulation; Algorithms |
Source: | Journal of Chemical Information and Modeling. 51(4) |
ISSN: | 1549-960X |
Description: | Accurate prediction of the 3-D structure of small molecules is essential to understanding their physical, chemical, and biological properties, including how they interact with other molecules. Here, we survey the field of high-throughput methods for 3-D structure prediction and set new target specifications for the next generation of methods. We then introduce COSMOS, a novel data-driven prediction method that uses libraries of fragment and torsion angle parameters. We illustrate COSMOS using parameters extracted from the Cambridge Structural Database (CSD), analyzing their distribution and then evaluating the system's performance in terms of speed, coverage, and accuracy. Results show that COSMOS represents a significant improvement over state-of-the-art prediction methods, particularly in coverage of complex molecular structures, including metal-organics. COSMOS can predict structures for 96.4% of the molecules in the CSD (99.6% organic, 94.6% metal-organic), whereas the widely used commercial method CORINA predicts structures for 68.5% (98.5% organic, 51.6% metal-organic). On the common subset of molecules predicted by both methods, COSMOS makes predictions with an average time per molecule of 0.15 s (0.10 s organic, 0.21 s metal-organic) and an average RMSD of 1.57 Å (1.26 Å organic, 1.90 Å metal-organic), while CORINA makes predictions with an average time per molecule of 0.13 s (0.18 s organic, 0.08 s metal-organic) and an average RMSD of 1.60 Å (1.13 Å organic, 2.11 Å metal-organic). COSMOS is available through the ChemDB chemoinformatics Web portal at http://cdb.ics.uci.edu/. |
Database: | OpenAIRE |