      Is Open Access

      MMsINC: a large-scale chemoinformatics database

      research-article


          Abstract

          MMsINC (http://mms.dsfarm.unipd.it/MMsINC/search) is a database of non-redundant, richly annotated and biomedically relevant chemical structures. A primary goal of MMsINC is to guarantee the highest quality and the uniqueness of each entry. MMsINC then adds value to these entries by including the analysis of crucial chemical properties, such as ionization and tautomerization processes, and the in silico prediction of 24 important molecular properties in the biochemical profile of each structure. MMsINC is consequently a natural input for different chemoinformatics and virtual screening applications. In addition, MMsINC supports various types of queries, including substructure queries and the novel ‘molecular scissoring’ query. MMsINC is interfaced with other primary data collectors, such as PubChem, Protein Data Bank (PDB), the Food and Drug Administration database of approved drugs and ZINC.

          Most cited references (12)

          Is Open Access

          Database resources of the National Center for Biotechnology Information

          In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides analysis and retrieval resources for the data in GenBank and other biological data available through NCBI's web site. NCBI resources include Entrez, the Entrez Programming Utilities, My NCBI, PubMed, PubMed Central, Entrez Gene, the NCBI Taxonomy Browser, BLAST, BLAST Link, Electronic PCR, OrfFinder, Spidey, Splign, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosomes, Entrez Genome, Genome Project and related tools, the Trace, Assembly, and Short Read Archives, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups, Influenza Viral Resources, HIV-1/Human Protein Interaction Database, Gene Expression Omnibus, Entrez Probe, GENSAT, Database of Genotype and Phenotype, Online Mendelian Inheritance in Man, Online Mendelian Inheritance in Animals, the Molecular Modeling Database, the Conserved Domain Database, the Conserved Domain Architecture Retrieval Tool and the PubChem suite of small molecule databases. Augmenting the web applications are custom implementations of the BLAST program optimized to search specialized data sets. These resources can be accessed through the NCBI home page at www.ncbi.nlm.nih.gov.

            Property distribution of drug-related chemical databases.

            T Oprea (2000)
            The process of compound selection and prioritization is crucial for both combinatorial chemistry (CBC) and high throughput screening (HTS). Compound libraries have to be screened for unwanted chemical structures, as well as for unwanted chemical properties. Property extrema can be eliminated by using property filters, in accordance with their actual distribution. Property distribution was examined in the following compound databases: MACCS-II Drug Data Report (MDDR), Current Patents Fast-alert, Comprehensive Medicinal Chemistry, Physician Desk Reference, New Chemical Entities, and the Available Chemical Directory (ACD). The ACDF and MDDRF subsets were created by removing reactive functionalities from the ACD and MDDR databases, respectively. The ACDF subset was further filtered by keeping only molecules with a 'drug-like' score [Ajay et al., J. Med. Chem., 41 (1998) 3314; Sadowski and Kubinyi, J. Med. Chem., 41 (1998) 3325] below 0.8. The following properties were examined: molecular weight (MW), the calculated octanol/water partition coefficient (CLOGP), the number of rotatable (RTB) and rigid bonds (RGB), the number of rings (RNG), and the number of hydrogen bond donors (HDO) and acceptors (HAC). Of these, MW and CLOGP follow a Gaussian distribution, whereas all other descriptors have an asymmetric (truncated Gaussian) distribution. Four out of five compounds in ACDF and MDDRF pass the 'rule of 5' test, a probability scheme that estimates oral absorption proposed by Lipinski et al. [Adv. Drug Deliv. Rev., 23 (1997) 3]. Because property distributions of HDO, HAC, MW and CLOGP (used in the 'rule of 5' test) do not differ significantly between these datasets, the 'rule of 5' does not distinguish 'drugs' from 'nondrugs'. Therefore, Pareto analyses were performed to examine skewed distributions in all compound collections. 
Seventy percent of the 'drug-like' compounds were found between the following limits: 0 […] ≥ 3, and RGB ≥ 18, and only 24.73% of MDDRF compounds have 0 ≤ RNG ≤ 2 rings, and RGB ≤ 17. The probability of identifying 'drug-like' structures increases with molecular complexity.
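
The 'rule of 5' test mentioned in this abstract can be illustrated with a minimal sketch. The abstract names only the four properties involved (MW, CLOGP, HDO, HAC); the thresholds below are the commonly quoted Lipinski cut-offs, and the `Properties` class, the function names and the one-violation tolerance are assumptions made for illustration, not details taken from this abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Properties:
    """Hypothetical container for the four 'rule of 5' properties."""
    mw: float     # molecular weight (MW), in daltons
    clogp: float  # calculated octanol/water partition coefficient (CLOGP)
    hdo: int      # number of hydrogen bond donors (HDO)
    hac: int      # number of hydrogen bond acceptors (HAC)


def rule_of_5_violations(p: Properties) -> int:
    """Count how many of the commonly quoted Lipinski cut-offs are exceeded."""
    return sum([
        p.mw > 500,    # MW above 500
        p.clogp > 5,   # CLOGP above 5
        p.hdo > 5,     # more than 5 hydrogen bond donors
        p.hac > 10,    # more than 10 hydrogen bond acceptors
    ])


def passes_rule_of_5(p: Properties, max_violations: int = 1) -> bool:
    """Assumed convention: a compound passes with at most one violation."""
    return rule_of_5_violations(p) <= max_violations


if __name__ == "__main__":
    # Made-up property values, for illustration only.
    small = Properties(mw=180.2, clogp=1.2, hdo=1, hac=4)
    large = Properties(mw=800.0, clogp=7.0, hdo=2, hac=6)
    print(passes_rule_of_5(small), passes_rule_of_5(large))
```

A filter of this kind only removes property extrema; as the abstract notes, it does not by itself distinguish 'drugs' from 'nondrugs'.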

              Recent developments of the chemistry development kit (CDK) - an open-source java library for chemo- and bioinformatics.

              The Chemistry Development Kit (CDK) provides methods for common tasks in molecular informatics, including 2D and 3D rendering of chemical structures, I/O routines, SMILES parsing and generation, ring searches, isomorphism checking, structure diagram generation, etc. Implemented in Java, it is used both for server-side computational services, possibly equipped with a web interface, as well as for applications and client-side applets. This article introduces the CDK's new QSAR capabilities and the recently introduced interface to statistical software.

                Author and article information

                Journal
                Nucleic Acids Research (Nucleic Acids Res)
                Oxford University Press
                ISSN: 0305-1048, 1362-4962
                January 2009
                17 October 2008
                Volume: 37
                Issue: Database issue
                Pages: D284-D290
                Affiliations
                1CRS4 – Bioinformatics Laboratory, Parco Sardegna Ricerche, Pula (CA) 09010 and 2Molecular Modeling Section (MMS), Department of Pharmaceutical Sciences, University of Padova, PD 35131, Italy
                Author notes
                *To whom correspondence should be addressed. Tel: +39 049 8275704; Fax: +39 049 8275366; Email: stefano.moro@unipd.it

                The authors wish it to be known that, in their opinion, the first two and last two authors should be regarded as joint First Authors.

                Article
                Publisher ID: gkn727
                DOI: 10.1093/nar/gkn727
                PMCID: PMC2686567
                PMID: 18931373
                © 2008 The Author(s)

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 11 August 2008
                Revised: 27 September 2008
                Accepted: 1 October 2008
                Categories
                Articles

                Genetics