Ligand virtual screening is widely used to assist in the discovery of new pharmaceuticals. In practice, virtual screening approaches have a number of limitations, and new methodologies are required. Previously, we showed that remotely related proteins identified by threading often share a common binding site occupied by chemically similar ligands. Here, we demonstrate that across an evolutionarily related but distant family of proteins, the ligands that bind to the common binding site contain a set of strongly conserved anchor functional groups as well as a variable region that accounts for their binding specificity. Furthermore, the sequence and structure conservation of residues contacting the anchor functional groups is significantly higher than that of residues contacting the ligand variable regions. Exploiting these insights, we developed FINDSITE LHM, which employs structural information extracted from weakly related proteins to perform rapid ligand docking by homology modeling. In large-scale benchmarking, using the predicted anchor-binding mode and the crystal structure of the receptor, FINDSITE LHM outperforms classical docking approaches, with an average ligand RMSD from native of ∼2.5 Å. For weakly homologous receptor protein models, FINDSITE LHM recovers a fraction of 0.66 of the binding residues and 0.49 of the specific contacts for highly confident targets (0.55 and 0.38, respectively, for all targets). Finally, in virtual screening for HIV-1 protease inhibitors, using similarity to the ligand anchor region yields significantly improved enrichment factors. Thus, FINDSITE LHM, which is reasonably accurate and computationally inexpensive, should be a useful approach to assist in the discovery of novel biopharmaceuticals.
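Two of the evaluation measures named above, ligand RMSD from native and the enrichment factor, can be illustrated with a minimal Python sketch. This is not the FINDSITE LHM implementation. The function names `ligand_rmsd` and `enrichment_factor` are hypothetical, and the enrichment factor follows the commonly used definition (hit rate in the top-ranked fraction divided by the hit rate in the whole library), which is assumed here rather than taken from the text. The RMSD is computed in the receptor frame without superposition, as is usual when assessing docked poses, and assumes that atoms are listed in the same order in both poses.

```python
import math


def ligand_rmsd(predicted, native):
    """RMSD (same units as the input, e.g. Å) between two ligand poses.

    Both arguments are sequences of (x, y, z) tuples with atoms in matching
    order. No superposition is applied: the poses are compared in the
    receptor frame (an assumption of this sketch).
    """
    if len(predicted) != len(native) or not predicted:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (p - q) ** 2
        for atom_p, atom_n in zip(predicted, native)
        for p, q in zip(atom_p, atom_n)
    )
    return math.sqrt(squared / len(predicted))


def enrichment_factor(ranked_is_active, top_fraction):
    """Enrichment factor for a ranked screening library.

    ranked_is_active: list of booleans, best-ranked compound first.
    top_fraction: fraction of the library inspected (e.g. 0.01 for top 1%).
    Assumed definition: (actives in top / size of top) / (actives / library size).
    """
    n_total = len(ranked_is_active)
    n_actives = sum(ranked_is_active)
    if n_total == 0 or n_actives == 0:
        raise ValueError("library must contain at least one active compound")
    n_top = max(1, round(n_total * top_fraction))
    hits_in_top = sum(ranked_is_active[:n_top])
    return (hits_in_top / n_top) / (n_actives / n_total)
```

An enrichment factor of 1 corresponds to random selection, and larger values mean that known actives are concentrated near the top of the ranked list.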
As an integral part of drug development, high-throughput virtual screening is a widely used tool that could in principle significantly reduce the cost and time of discovering new pharmaceuticals. In practice, virtual screening algorithms suffer from a number of limitations. All-atom ligand docking approaches are highly sensitive to the quality of the target receptor structure, which restricts the selection of drug targets to those for which high-quality X-ray structures are available. Furthermore, the predicted binding affinity is typically strongly correlated with the molecular weight of the ligand, regardless of whether the ligand actually binds. To address these significant problems, we developed FINDSITE LHM, a novel threading-based approach that employs structural information extracted from weakly related proteins to perform rapid ligand docking and ranking, very much in the spirit of homology modeling of protein structures. Particularly for low-quality modeled receptor structures, FINDSITE LHM outperforms classical all-atom ligand docking approaches in the accuracy of ligand binding pose prediction and requires considerably less CPU time. As an attractive alternative to classical molecular docking, FINDSITE LHM offers the possibility of rapid structure-based virtual screening at the proteome level to improve and speed up the discovery of new biopharmaceuticals.