
      Recent advances and prospects of computational methods for metabolite identification: a review with emphasis on machine learning approaches

      review-article


          Abstract

          Motivation: Metabolomics is the study of the large number of metabolites, the small molecules present in biological systems. Metabolites perform many important functions, such as energy transport, signaling, serving as building blocks of cells, and inhibition/catalysis. Understanding the biochemical characteristics of metabolites is an essential part of metabolomics and enlarges our knowledge of biological systems. It is also key to the development of many applications and areas, such as biotechnology, biomedicine, and pharmaceuticals. However, metabolite identification remains a challenging task in metabolomics, with a huge number of potentially interesting but unknown metabolites. The standard method for identifying metabolites is based on mass spectrometry (MS) preceded by a separation technique. Over many decades, many techniques with different approaches have been proposed for the MS-based metabolite identification task. These can be divided into four groups: mass spectra databases, in silico fragmentation, fragmentation trees, and machine learning. In this review paper, we thoroughly survey currently available tools for metabolite identification, with a focus on in silico fragmentation and machine learning-based approaches. We also give an in-depth discussion of advanced machine learning methods that can lead to further improvement on this task.


          Most cited references (58)

          Finding scientific topics.

          A first step in identifying the content of a document is determining which topics that document addresses. We describe a generative model for documents, introduced by Blei, Ng, and Jordan [Blei, D. M., Ng, A. Y. & Jordan, M. I. (2003) J. Machine Learn. Res. 3, 993-1022], in which each document is generated by choosing a distribution over topics and then choosing each word in the document from a topic selected according to this distribution. We then present a Markov chain Monte Carlo algorithm for inference in this model. We use this algorithm to analyze abstracts from PNAS by using Bayesian model selection to establish the number of topics. We show that the extracted topics capture meaningful structure in the data, consistent with the class designations provided by the authors of the articles, and outline further applications of this analysis, including identifying "hot topics" by examining temporal dynamics and tagging abstracts to illustrate semantic content.

            METLIN: a metabolite mass spectral database.

            Endogenous metabolites have gained increasing interest over the past 5 years largely for their implications in diagnostic and pharmaceutical biomarker discovery. METLIN (http://metlin.scripps.edu), a freely accessible web-based data repository, has been developed to assist in a broad array of metabolite research and to facilitate metabolite identification through mass analysis. METLIN includes an annotated list of known metabolite structural information that is easily cross-correlated with its catalogue of high-resolution Fourier transform mass spectrometry (FTMS) spectra, tandem mass spectrometry (MS/MS) spectra, and LC/MS data.

              Mass spectral molecular networking of living microbial colonies.

              Integrating the governing chemistry with the genomics and phenotypes of microbial colonies has been a "holy grail" in microbiology. This work describes a highly sensitive, broadly applicable, and cost-effective approach that allows metabolic profiling of live microbial colonies directly from a Petri dish without any sample preparation. Nanospray desorption electrospray ionization mass spectrometry (MS), combined with alignment of MS data and molecular networking, enabled monitoring of metabolite production from live microbial colonies from diverse bacterial genera, including Bacillus subtilis, Streptomyces coelicolor, Mycobacterium smegmatis, and Pseudomonas aeruginosa. This work demonstrates that, by using these tools to visualize small molecular changes within bacterial interactions, insights can be gained into bacterial developmental processes as a result of the improved organization of MS/MS data. To validate this experimental platform, metabolic profiling was performed on Pseudomonas sp. SH-C52, which protects sugar beet plants from infections by specific soil-borne fungi [R. Mendes et al. (2011) Science 332:1097-1100]. The antifungal effect of strain SH-C52 was attributed to thanamycin, a predicted lipopeptide encoded by a nonribosomal peptide synthetase gene cluster. Our technology, in combination with our recently developed peptidogenomics strategy, enabled the detection and partial characterization of thanamycin and showed that it is a monochlorinated lipopeptide that belongs to the syringomycin family of antifungal agents. In conclusion, the platform presented here provides a significant advancement in our ability to understand the spatiotemporal dynamics of metabolite production in live microbial colonies and communities.

                Author and article information

                Journal
                Briefings in Bioinformatics (Brief Bioinform)
                Oxford University Press
                1467-5463
                1477-4054
                November 2019
                06 August 2018
                Volume: 20
                Issue: 6
                Pages: 2028-2043
                Affiliations
                [1] Department of machine learning and bioinformatics, Bioinformatics Center, Kyoto University, Uji, Japan
                [2] Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Japan
                [3] Department of Computer Science, Aalto University, Otakaari, FI, Finland
                Author notes
                Corresponding author: Dai Hai Nguyen, Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji 611-0011, Japan. Email: hai@kuicr.kyoto-u.ac.jp
                Article
                Article ID: bby066
                DOI: 10.1093/bib/bby066
                PMC ID: 6954430
                PubMed ID: 30099485
                © The Author(s) 2018. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com.

                History
                : 30 April 2018
                : 14 June 2018
                : 3 July 2018
                Page count
                Pages: 16
                Funding
                Funded by: MEXT KAKENHI 10.13039/501100001700
                Award ID: 16H02868
                Funded by: ACCEL JST 10.13039/501100009025
                Award ID: JPMJAC1503
                Funded by: FiDiPro Tekes 10.13039/501100003406
                Funded by: AIPSE Academy of Finland 10.13039/501100002341
                Categories
                Review Article

                Bioinformatics & Computational biology
                mass spectrometry, machine learning, substructure prediction, substructure annotation
