Abstract: |
Molecular optimization, which transforms a given input molecule $X$ into another molecule $Y$ with desired properties, is essential in molecular drug discovery. Traditional approaches either suffer from sample-inefficient learning or ignore information that can be captured by supervised learning on optimized molecule pairs. In this study, we present a novel molecular optimization paradigm, Graph Polish. In this paradigm, guided by source and target molecule pairs with the desired properties, a heuristic optimization solution can be derived: given an input molecule, we first predict which atom can be viewed as the optimization center, and then optimize the nearby regions around this center. We then propose an effective and efficient learning framework, Teacher and Student polish, to capture the dependencies among the optimization steps. A teacher component automatically identifies and annotates the optimization centers and the parts of the molecules to be preserved, removed, or added; a student component learns this knowledge and applies it to a new molecule. The proposed paradigm offers an intuitive interpretation of the molecular optimization result. Experiments with multiple optimization tasks are conducted on several benchmark datasets. The proposed approach achieves a significant advantage over six state-of-the-art baseline methods. In addition, extensive studies are conducted to validate the effectiveness, explainability, and time savings of the novel optimization paradigm. |