
      A deep learning architecture for metabolic pathway prediction

      1 , 2 , 3 , 3 , 3 , 4 , 1
      Bioinformatics
      Oxford University Press (OUP)


          Abstract

          Motivation

          Understanding the mechanisms and structural mappings between molecules and pathway classes is critical for the design of reaction predictors for synthesizing new molecules. This article studies the problem of predicting the classes of metabolic pathways (series of chemical reactions occurring within a cell) in which a given biochemical compound participates. We apply a hybrid machine learning approach in which graph convolutional networks extract molecular shape features that serve as input to a random forest classifier. In contrast to previously applied machine learning methods for this problem, our framework automatically extracts relevant shape features directly from input SMILES representations, which are atom-bond specifications of the chemical structures composing the molecules.
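
The feature-extraction idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the molecule (ethanol, SMILES "CCO") is written out by hand as a graph, the element vocabulary and the weight matrix `W` are hypothetical, and only a single graph-convolution step of the common form ReLU(Â H W) is shown. In a real pipeline the SMILES string would be parsed by a cheminformatics toolkit, the weights would be learned, and the pooled vector would be passed to a random forest classifier.

```python
import math

# Hypothetical toy input: ethanol (SMILES "CCO") written out by hand as a graph.
# A real pipeline would parse the SMILES string with a cheminformatics toolkit.
ATOMS = ["C", "C", "O"]
BONDS = [(0, 1), (1, 2)]
ELEMENTS = ["C", "N", "O"]  # assumed one-hot vocabulary for atom features


def one_hot_features(atoms):
    """One row per atom, one column per element in ELEMENTS."""
    return [[1.0 if a == e else 0.0 for e in ELEMENTS] for a in atoms]


def normalized_adjacency(n, bonds):
    """Return D^-1/2 (A + I) D^-1/2 as a dense list of lists."""
    a = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for i, j in bonds:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain dense matrix product for small lists of lists."""
    return [[sum(xi * yj for xi, yj in zip(row, col)) for col in zip(*y)]
            for row in x]


def gcn_layer(a_hat, h, w):
    """One propagation step: ReLU(A_hat @ H @ W)."""
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, v) for v in row] for row in z]


def mean_pool(h):
    """Average atom embeddings into one fixed-length molecule vector."""
    n = len(h)
    return [sum(col) / n for col in zip(*h)]


# Fixed illustrative weights (3 atom features -> 2 hidden units);
# a trained network would learn these values.
W = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]

a_hat = normalized_adjacency(len(ATOMS), BONDS)
h = gcn_layer(a_hat, one_hot_features(ATOMS), W)
molecule_vector = mean_pool(h)  # fixed-length input for a downstream classifier
print(molecule_vector)
```

Mean pooling is used here only because it yields a vector whose length does not depend on the number of atoms, which is what a tree-based classifier needs; the abstract does not state which pooling the framework uses.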

          Results

          Our method is capable of correctly predicting the respective metabolic pathway class of 95.16% of tested compounds, whereas competing methods only achieve an accuracy of 84.92% or less. Furthermore, our framework extends to the task of classification of compounds having mixed membership in multiple pathway classes. Our prediction accuracy for this multi-label task is 97.61%. We analyze the relative importance of various global physicochemical features to the pathway class prediction problem and show that simple linear/logistic regression models can predict the values of these global features from the shape features extracted using our framework.
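
To make the multi-label setting concrete, the sketch below encodes mixed pathway membership as binary indicator vectors and computes two common accuracy definitions. The class names and predictions are invented for illustration, and the abstract does not say which definition the reported 97.61% corresponds to, so neither function should be read as the paper's metric.

```python
# Illustrative class names; the abstract does not list the actual pathway classes.
CLASSES = ["carbohydrate", "lipid", "amino_acid"]


def encode(labels):
    """Binary indicator vector: 1 where the compound belongs to the class."""
    return [1 if c in labels else 0 for c in CLASSES]


def exact_match_accuracy(y_true, y_pred):
    """Fraction of compounds whose full label set is predicted correctly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def per_label_accuracy(y_true, y_pred):
    """Fraction of individual (compound, class) decisions that are correct."""
    total = sum(len(t) for t in y_true)
    correct = sum(ti == pi
                  for t, p in zip(y_true, y_pred)
                  for ti, pi in zip(t, p))
    return correct / total


# Made-up ground truth and predictions for three compounds.
y_true = [encode({"carbohydrate"}), encode({"lipid", "amino_acid"}), encode({"lipid"})]
y_pred = [encode({"carbohydrate"}), encode({"lipid"}), encode({"lipid"})]

print(exact_match_accuracy(y_true, y_pred))
print(per_label_accuracy(y_true, y_pred))
```

The second compound is missing one of its two labels, so it counts as fully wrong under exact match but as only one wrong decision out of nine under the per-label definition. This gap is why the choice of metric matters when comparing multi-label results.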

          Availability and implementation

          https://github.com/baranwa2/MetabolicPathwayPrediction.

          Supplementary information

          Supplementary data are available at Bioinformatics online.


                Author and article information

                Journal
                Bioinformatics
                Oxford University Press (OUP)
                ISSN (print): 1367-4803
                ISSN (online): 1460-2059
                April 15 2020
                April 15 2020
                December 26 2019
                Volume: 36
                Issue: 8
                Pages: 2547-2553
                Affiliations
                [1] Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109, USA
                [2] Department of Computer Science, University at Albany, SUNY, Albany, NY 12222, USA
                [3] Department of Mechanical Engineering
                [4] Department of Chemical Engineering and Biophysics, University of Michigan, Ann Arbor, MI 48109, USA
                Article
                DOI: 10.1093/bioinformatics/btz954
                PMID: 31879763
                © 2019

                https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model
