Ligity: A Non-Superpositional, Knowledge-Based Approach to Virtual Screening

Abstract

We present Ligity, a hybrid ligand-structure-based, non-superpositional method for virtual screening of large databases of small molecules. Ligity uses the relative spatial distribution of pharmacophoric interaction points (PIPs) derived from the conformations of small molecules. These are compared with the PIPs derived from key interaction features found in protein–ligand complexes and are used to prioritize likely binders. We investigated the effect of generating PIPs using the single lowest-energy conformer versus an ensemble of conformers for each screened ligand, using different bin sizes for the distance between two features, utilizing triangular sets of pharmacophoric features (3-PIPs) versus chiral tetrahedral sets (4-PIPs), fusing data for targets with multiple protein–ligand complex structures, and applying different similarity measures. Ligity was benchmarked using the Directory of Useful Decoys-Enhanced (DUD-E). Optimal results were obtained using the tetrahedral PIPs derived from an ensemble of bound ligand conformers and a bin size of 1.5 Å, which are used as the default settings for Ligity. The high-throughput screening mode of Ligity, using only the lowest-energy conformer of each ligand, was used for benchmarking against the whole of the DUD-E, and a more resource-intensive, “information-rich” mode of Ligity, using a conformational ensemble of each ligand, was used for a representative subset of 10 targets. Against the full DUD-E database, mean area under the receiver operating characteristic curve (AUC) values ranged from 0.44 to 0.99, while for the representative subset they ranged from 0.61 to 0.86. Data fusion further improved Ligity’s performance, with mean AUC values ranging from 0.64 to 0.95. Ligity is very efficient compared to a protein–ligand docking method such as AutoDock Vina: if the time taken for the precalculation of Ligity descriptors is included in the comparison, then Ligity is about 20 times faster than docking.
A direct comparison of the virtual screening steps shows Ligity to be over 5000 times faster. Ligity highly ranks the lowest-energy conformers of DUD-E actives, in a statistically significant manner, behavior that is not observed for DUD-E decoys. Thus, our results suggest that active compounds tend to bind in relatively low-energy conformations compared to decoys. This may be because actives—and thus their lowest-energy conformations—have been optimized for conformational complementarity with their cognate binding sites.
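
The idea of comparing molecules through binned distances between sets of pharmacophoric features, without superposing them, can be illustrated with a small sketch. This is not Ligity's implementation: the function names (pip_set, tanimoto, max_fusion), the single-letter feature labels, the use of the Tanimoto coefficient as the similarity measure, and the "take the maximum" fusion rule are all assumptions made for illustration. The sketch also ignores chirality, so for k=4 it does not reproduce the chiral tetrahedral 4-PIPs described above.

```python
from itertools import combinations, permutations
from math import dist


def pip_set(features, bin_size=1.5, k=3):
    """Return a set of rotation- and translation-invariant keys.

    features: list of (type_label, (x, y, z)) tuples (hypothetical input format).
    Each key describes one k-point feature set by its feature types and the
    binned pairwise distances between the points.
    """
    keys = set()
    for combo in combinations(features, k):
        candidates = []
        # Try every ordering of the k points and keep the smallest
        # description, so the key does not depend on input order.
        for perm in permutations(combo):
            types = tuple(label for label, _ in perm)
            bins = tuple(
                int(dist(a[1], b[1]) // bin_size)
                for a, b in combinations(perm, 2)
            )
            candidates.append((types, bins))
        keys.add(min(candidates))
    return keys


def tanimoto(a, b):
    """Tanimoto coefficient between two key sets (assumed similarity measure)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def max_fusion(query, references):
    """Fuse scores against several reference key sets by taking the maximum
    (one simple fusion rule; an assumption, not the paper's stated rule)."""
    return max(tanimoto(query, ref) for ref in references)


if __name__ == "__main__":
    # D = donor, A = acceptor, R = aromatic ring (illustrative labels only).
    reference = [("D", (0.0, 0.0, 0.0)), ("A", (3.5, 0.0, 0.0)), ("R", (0.0, 4.0, 0.0))]
    # Same geometry, translated and listed in a different order.
    candidate = [("R", (10.0, 14.0, 10.0)), ("D", (10.0, 10.0, 10.0)), ("A", (13.5, 10.0, 10.0))]
    print(tanimoto(pip_set(reference), pip_set(candidate)))
```

Because only inter-feature distances enter the keys, no alignment of the screened molecule onto the reference is needed, which is the sense in which such a comparison is non-superpositional. A larger bin_size makes the keys more tolerant of small geometric differences at the cost of resolution.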

Author and article information

Journal

Journal ID (nlm-ta): J Chem Inf Model

Journal ID (iso-abbrev): J Chem Inf Model

Journal ID (publisher-id): ci

Journal ID (coden): jcisd8

Title: Journal of Chemical Information and Modeling

Publisher: American Chemical Society

ISSN (Print): 1549-9596

ISSN (Electronic): 1549-960X

Publication date (Electronic): 22 May 2019

Publication date (Print): 24 June 2019

Volume: 59

Issue: 6

Pages: 2600-2616

Affiliations

[†]Centre for Molecular Medicine and Biobanking, University of Malta, Msida, MSD 2080, Malta

[‡]Oxford Drug Design Limited, Oxford Centre for Innovation, New Road, Oxford OX1 1BY, U.K.

[¶]The School of Computing, University of Buckingham, Hunter Street, Buckingham, MK18 1EG, U.K.

[§]Oxford Protein Informatics Group, Department of Statistics, University of Oxford, 24-29 St. Giles’, Oxford, OX1 3LB, U.K.

Author notes

[*]E-mail: garrett.morris@stats.ox.ac.uk. Phone: +44 1865 281770. Fax: +44 1865 282862.

Article

DOI: 10.1021/acs.jcim.8b00779

PMC ID: 7007185

PubMed ID: 31117509

SO-VID: 9bcec15c-7694-4fc6-8c8f-1614b57c43f2

License:

This is an open access article published under a Creative Commons Attribution (CC-BY) License, which permits unrestricted use, distribution and reproduction in any medium, provided the author and source are cited.