Tuning Reinforcement Learning Parameters for Cluster Selection to Enhance Evolutionary Algorithms.

Authors: Villavicencio N, Department of Mathematics, California State University Fullerton, Fullerton, California 92834, United States; Groves MN, Department of Chemistry and Biochemistry, California State University Fullerton, Fullerton, California 92834, United States.
Language: English
Source: ACS Engineering Au [ACS Eng Au] 2024 Apr 16; Vol. 4 (4), pp. 381-393. Date of Electronic Publication: 2024 Apr 16 (Print Publication: 2024).
DOI: 10.1021/acsengineeringau.3c00068
Abstract: Finding optimal molecular structures with desired properties is a widely studied challenge, with applications in areas such as drug discovery. Genetic algorithms are a common approach to global-minimum molecular searches because they can explore large regions of the energy landscape and reduce computational time via parallelization. To decrease the number of unstable intermediate structures produced and increase the overall efficiency of an evolutionary algorithm, clustering has been introduced in several prior studies. However, there is little literature detailing the effects of varying the selection frequencies between clusters. To balance exploration and exploitation in our genetic algorithm, we propose clustering the starting population and choosing clusters for an evolutionary algorithm run via a dynamic probability that depends on the fitness of the molecules generated by each cluster. We define four parameters, MFavOvrAll-A, MFavClus-B, NoNewFavClus-C, and Select-D, that correspond to a reward for producing the best structure overall, a reward for producing the best structure in its own cluster, a penalty for not producing the best structure, and a penalty based on the selection ratio of the cluster, respectively. A reward increases the probability of a cluster's future selection, while a penalty decreases it. To optimize these four parameters, we approximated the evolutionary-algorithm performance of each cluster with a Gaussian distribution and performed a grid search over parameter combinations. Results show that parameter MFavOvrAll-A (rewarding clusters for producing the best structure overall) and parameter Select-D (the appearance penalty) have a significantly larger effect than parameters MFavClus-B and NoNewFavClus-C. The most successful models require a balance between MFavOvrAll-A and Select-D that reflects the exploitation-versus-exploration trade-off often seen in reinforcement learning algorithms. Results show that our reinforcement-learning-based method for selecting clusters outperforms an unclustered evolutionary algorithm for quinoline-like structure searches.
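The abstract describes rewards and penalties that adjust each cluster's selection probability but does not give the exact update rule. The Python sketch below illustrates one plausible reading, assuming an additive weight update followed by renormalization; the parameter values, function names, and bookkeeping structures are assumptions for illustration, not the paper's implementation.

    import random

    # Hypothetical sketch (not the paper's code) of the dynamic
    # cluster-selection scheme: each cluster carries a selection weight
    # that rewards raise and penalties lower, renormalized each
    # generation into selection probabilities.
    A = 0.10  # MFavOvrAll-A: reward for producing the best structure overall
    B = 0.05  # MFavClus-B: reward for a new best structure within the cluster
    C = 0.02  # NoNewFavClus-C: penalty for producing no new best structure
    D = 0.01  # Select-D: penalty scaled by the cluster's selection ratio

    def update_probs(weights, counts, best_overall, cluster_bests, total_runs):
        # weights: cluster id -> current selection weight
        # counts: cluster id -> how many times the cluster has been selected
        # best_overall: cluster that produced this generation's best structure
        # cluster_bests: clusters that improved their own best structure
        for k in weights:
            if k == best_overall:
                weights[k] += A
            if k in cluster_bests:
                weights[k] += B
            if k != best_overall and k not in cluster_bests:
                weights[k] -= C
            weights[k] -= D * counts[k] / max(total_runs, 1)
            weights[k] = max(weights[k], 1e-6)  # keep every cluster selectable
        total = sum(weights.values())
        return {k: w / total for k, w in weights.items()}

    def pick_cluster(probs):
        # Sample the cluster for the next evolutionary-algorithm run.
        return random.choices(list(probs), weights=list(probs.values()))[0]

    # Example: after a generation in which cluster c1 found the overall best
    # structure and c1 and c2 each improved their own cluster's best.
    probs = {c: 0.25 for c in ("c1", "c2", "c3", "c4")}
    counts = {"c1": 3, "c2": 1, "c3": 0, "c4": 0}
    probs = update_probs(probs, counts, best_overall="c1",
                         cluster_bests={"c1", "c2"}, total_runs=4)
    print(pick_cluster(probs))

Under this reading, the floor of 1e-6 keeps every cluster selectable, so exploration never collapses entirely even when Select-D repeatedly penalizes a frequently chosen cluster.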
Competing Interests: The authors declare no competing financial interest.
(© 2024 The Authors. Published by American Chemical Society.)
Database: MEDLINE