      Parallel Worlds of Public and Commercial Bioactive Chemistry Data: Miniperspective

      review-article

          Abstract

          The availability of structures and linked bioactivity data in databases is powerfully enabling for drug discovery and chemical biology. However, we now review some confounding issues with the divergent expansions of public and commercial sources of chemical structures. These are associated with not only expanding patent extraction but also increasingly large vendor collections amassed via different selection criteria between SciFinder from Chemical Abstracts Service (CAS) and major public sources such as PubChem, ChemSpider, UniChem, and others. These increasingly massive collections may include both real and virtual compounds, as well as so-called prophetic compounds from patents. We address a range of issues raised by the challenges faced resolving the NIH probe compounds. In addition we highlight the confounding of prior-art searching by virtual compounds that could impact the composition of matter patentability of a new medicinal chemistry lead. Finally, we propose some potential solutions.

          Most cited references (23)

          The properties of known drugs. 1. Molecular frameworks.

          In order to better understand the common features present in drug molecules, we use shape description methods to analyze a database of commercially available drugs and prepare a list of common drug shapes. A useful way of organizing this structural data is to group the atoms of each drug molecule into ring, linker, framework, and side chain atoms. On the basis of the two-dimensional molecular structures (without regard to atom type, hybridization, and bond order), there are 1179 different frameworks among the 5120 compounds analyzed. However, the shapes of half of the drugs in the database are described by the 32 most frequently occurring frameworks. This suggests that the diversity of shapes in the set of known drugs is extremely low. In our second method of analysis, in which atom type, hybridization, and bond order are considered, more diversity is seen; there are 2506 different frameworks among the 5120 compounds in the database, and the most frequently occurring 42 frameworks account for only one-fourth of the drugs. We discuss the possible interpretations of these findings and the way they may be used to guide future drug discovery research.

            Enumeration of 166 billion organic small molecules in the chemical universe database GDB-17.

            Drug molecules consist of a few tens of atoms connected by covalent bonds. How many such molecules are possible in total and what is their structure? This question is of pressing interest in medicinal chemistry to help solve the problems of drug potency, selectivity, and toxicity and reduce attrition rates by pointing to new molecular series. To better define the unknown chemical space, we have enumerated 166.4 billion molecules of up to 17 atoms of C, N, O, S, and halogens forming the chemical universe database GDB-17, covering a size range containing many drugs and typical for lead compounds. GDB-17 contains millions of isomers of known drugs, including analogs with high shape similarity to the parent drug. Compared to known molecules in PubChem, GDB-17 molecules are much richer in nonaromatic heterocycles, quaternary centers, and stereoisomers, densely populate the third dimension in shape space, and represent many more scaffold types.

              PubChem as a public resource for drug discovery.

              PubChem is a public repository of small molecules and their biological properties. Currently, it contains more than 25 million unique chemical structures and 90 million bioactivity outcomes associated with several thousand macromolecular targets. To address the potential utility of this public resource for drug discovery, we systematically summarized the protein targets in PubChem by function, 3D structure and biological pathway. Moreover, we analyzed the potency, selectivity and promiscuity of the bioactive compounds identified for these biological targets, including the chemical probes generated by the NIH Molecular Libraries Program. As a public resource, PubChem lowers the barrier for researchers to advance the development of chemical tools for modulating biological processes and drug candidates for disease treatments. Published by Elsevier Ltd.

                Author and article information

                Journal: Journal of Medicinal Chemistry (J. Med. Chem.)
                Publisher: American Chemical Society
                ISSN: 0022-2623 (print), 1520-4804 (electronic)
                Publication dates: 21 November 2014; 12 March 2015
                Volume: 58
                Issue: 5
                Pages: 2068-2076
                Affiliations
                Christopher A. Lipinski, Ph.D., LLC, 10 Connshire Drive, Waterford, Connecticut 06385-4122, United States
                Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
                [§] IUPHAR/BPS Database and Guide to PHARMACOLOGY Web Portal Group, Centre for Integrative Physiology, University of Edinburgh, Edinburgh, EH8 9XD, U.K.
                Royal Society of Chemistry, 904 Tamaras Circle, Wake Forest, North Carolina 27587, United States
                Molecular Materials Informatics, Inc., 1900 St. Jacques No. 302, Montreal, Quebec H3J 2S1, Canada
                [#] Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay Varina, North Carolina 27526, United States
                Author notes
                [*] Phone: 215-687-1320. E-mail: ekinssean@yahoo.com.
                Article
                DOI: 10.1021/jm5011308
                PMCID: PMC4360371
                PMID: 25415348
                Copyright © 2014 American Chemical Society

                This is an open access article published under an ACS AuthorChoice License, which permits copying and redistribution of the article or any adaptations for non-commercial purposes.

                History
                Received: 24 July 2014
                Categories
                Perspective
                Pharmaceutical chemistry
