The Experimentalist's Guide to Machine Learning for Small Molecule Design.

Authors: Lindley SE (Department of Bioengineering, University of Illinois, Urbana-Champaign, Illinois 61801, United States); Lu Y (Department of Chemical and Biomolecular Engineering, University of Illinois, Urbana-Champaign, Illinois 61801, United States); Shukla D (Department of Bioengineering, University of Illinois, Urbana-Champaign, Illinois 61801, United States; Department of Chemical and Biomolecular Engineering, University of Illinois, Urbana-Champaign, Illinois 61801, United States; Center for Biophysics & Computational Biology, University of Illinois, Urbana-Champaign, Illinois 61801, United States; Department of Plant Biology, University of Illinois, Urbana-Champaign, Illinois 61801, United States)
Language: English
Source: ACS Applied Bio Materials [ACS Appl Bio Mater] 2024 Feb 19; Vol. 7 (2), pp. 657-684. Date of Electronic Publication: 2023 Aug 03.
DOI: 10.1021/acsabm.3c00054
Abstract: Initially part of the field of artificial intelligence, machine learning (ML) has become a booming research area since branching out into its own field in the 1990s. After three decades of refinement, ML algorithms have accelerated scientific developments across a variety of research topics. The field of small molecule design is no exception, and an increasing number of researchers are applying ML techniques in their pursuit of discovering, generating, and optimizing small molecule compounds. The goal of this review is to provide simple, yet descriptive, explanations of some of the most commonly utilized ML algorithms in the field of small molecule design, along with those that are highly applicable to an experimentally focused audience. The algorithms discussed here span three ML paradigms: supervised learning, unsupervised learning, and ensemble methods. Examples from the published literature are provided for each algorithm. Some common pitfalls of applying ML to biological and chemical data sets are also explained, alongside a brief summary of a few more advanced paradigms, including reinforcement learning and semi-supervised learning.
Database: MEDLINE