The Development of Target-Specific Pose Filter Ensembles To Boost Ligand Enrichment for Structure-Based Virtual Screening

Autor: Song Wu, Xiang Simon Wang, Jui-Hua Hsieh, Huabin Hu, Jie Xia
Rok vydání: 2017
Předmět:
Zdroj: Journal of Chemical Information and Modeling. 57:1414-1425
ISSN: 1549-960X
1549-9596
Popis: Structure-based virtual screening (SBVS) has become an indispensable technique for hit identification at the early stage of drug discovery. However, the accuracy of current scoring functions is not high enough to confer success to every target and thus remains to be improved. Previously, we had developed binary pose filters (PFs) using knowledge derived from the protein–ligand interface of a single X-ray structure of a specific target. This novel approach had been validated as an effective way to improve ligand enrichment. Continuing from it, in the present work we attempted to incorporate knowledge collected from diverse protein–ligand interfaces of multiple crystal structures of the same target to build PF ensembles (PFEs). Toward this end, we first constructed a comprehensive data set to meet the requirements of ensemble modeling and validation. This set contains 10 diverse targets, 118 well-prepared X-ray structures of protein–ligand complexes, and large benchmarking actives/decoys sets. Notably, we designed a unique workflow of two-layer classifiers based on the concept of ensemble learning and applied it to the construction of PFEs for all of the targets. Through extensive benchmarking studies, we demonstrated that (1) coupling PFE with Chemgauss4 significantly improves the early enrichment of Chemgauss4 itself and (2) PFEs show greater consistency in boosting early enrichment and larger overall enrichment than our prior PFs. In addition, we analyzed the pairwise topological similarities among cognate ligands used to construct PFEs and found that it is the higher chemical diversity of the cognate ligands that leads to the improved performance of PFEs. Taken together, the results so far prove that the incorporation of knowledge from diverse protein–ligand interfaces by ensemble modeling is able to enhance the screening competence of SBVS scoring functions.
Databáze: OpenAIRE