A rotation-translation invariant molecular descriptor of partial charges and its use in ligand-based virtual screening

Abstract

Background

Measures of similarity for chemical molecules have been developed since the dawn of chemoinformatics. Molecular similarity has been measured by a variety of methods, including descriptor-based similarity, common molecular fragments, graph matching and 3D methods such as shape matching. Similarity measures are widespread in practice and have proven useful in drug discovery. Because of our interest in electrostatics and high-throughput ligand-based virtual screening, we sought to exploit the information contained in the atomic coordinates and partial charges of a molecule.

Results

A new molecular descriptor based on partial charges is proposed. It uses the autocorrelation function and linear binning to encode all atoms of a molecule into two rotation-translation invariant vectors. Combined with a scoring function, the descriptor makes it possible to rank-order a database of compounds against a query molecule. The proposed implementation, called ACPC (AutoCorrelation of Partial Charges), is released as open source. To validate the method and its associated protocol, extensive retrospective ligand-based virtual screening experiments were performed and the results were compared with those of other methods.
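
The abstract does not give the formula, so the following is only a minimal sketch of how an autocorrelation of partial charges with linear binning might look. It assumes that each atom pair contributes the product of its two charges at the interatomic distance, that each contribution is split linearly between the two nearest distance bins, and that the two vectors separate positive from negative charge products. The function names, the bin width and the number of bins are hypothetical and are not taken from ACPC.

```python
import math
from itertools import combinations


def linear_bin(vec, x, weight, bin_width):
    """Split `weight` between the two bins nearest to position x."""
    pos = x / bin_width
    lo = int(math.floor(pos))
    frac = pos - lo
    # Contributions that fall beyond the last bin are dropped.
    if lo < len(vec):
        vec[lo] += weight * (1.0 - frac)
    if lo + 1 < len(vec):
        vec[lo + 1] += weight * frac


def charge_autocorrelation(atoms, bin_width=1.0, n_bins=10):
    """Encode a molecule as two vectors of binned charge products.

    atoms: list of (x, y, z, partial_charge) tuples.
    Returns (positive, negative): binned products of like-signed and
    opposite-signed charge pairs (an assumed split, see the lead-in).
    """
    positive = [0.0] * n_bins
    negative = [0.0] * n_bins
    for (x1, y1, z1, q1), (x2, y2, z2, q2) in combinations(atoms, 2):
        # Only interatomic distances are used, so the result does not
        # depend on the orientation or position of the molecule.
        d = math.dist((x1, y1, z1), (x2, y2, z2))
        product = q1 * q2
        target = positive if product >= 0.0 else negative
        linear_bin(target, d, product, bin_width)
    return positive, negative
```

Because only pairwise distances enter the computation, rotating or translating the input coordinates leaves both vectors unchanged up to floating-point error, which is the invariance property named in the title.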

Conclusions

Although the method is simple, it performed remarkably well in experiments. At an average speed of 1649 molecules per second, it reached an average median area under the curve of 0.81 on 40 different targets, thereby validating the proposed protocol and implementation.
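
For readers unfamiliar with the metric, the sketch below shows one common way to compute an area under the curve for a ranked screening result. It assumes the curve is the ROC curve, which the abstract does not state explicitly, and computes the area as the fraction of active/decoy pairs in which the active scores higher. The function name and the example scores are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from similarity scores.

    scores: one score per database molecule (higher = more similar to
            the query).
    labels: truthy for known actives, falsy for decoys.
    Equals the probability that a randomly chosen active outranks a
    randomly chosen decoy; ties count as half a win.
    """
    actives = [s for s, y in zip(scores, labels) if y]
    decoys = [s for s, y in zip(scores, labels) if not y]
    if not actives or not decoys:
        raise ValueError("need at least one active and one decoy")
    wins = 0.0
    for a in actives:
        for d in decoys:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(actives) * len(decoys))


# Two actives and two decoys; one decoy outranks one active.
example = roc_auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])  # 0.75
```

A value of 0.5 corresponds to random ranking and 1.0 to every active ranked above every decoy, which puts the reported 0.81 in context.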

Author and article information

Contributors

Francois Berenger

Arnout Voet

Xiao Yin Lee

Kam YJ Zhang

Journal

Journal ID (nlm-ta): J Cheminform

Journal ID (iso-abbrev): J Cheminform

Title: Journal of Cheminformatics

Publisher: BioMed Central

ISSN (Electronic): 1758-2946

Publication date Collection: 2014

Publication date (Electronic): 10 May 2014

Volume: 6

Page: 23

Affiliations

[1] Zhang Initiative Research Unit, Institute Laboratories, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan

Article

Publisher ID: 1758-2946-6-23

DOI: 10.1186/1758-2946-6-23

PMC ID: 4030740

PubMed ID: 24887178

SO-VID: e5ba5c42-dbea-44e5-a5f3-e89bfcb5fa68

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

History

Date received: 8 January 2014

Date accepted: 22 April 2014
