      A data preprocessing strategy for metabolomics to reduce the mask effect in data analysis

          Abstract

          Highlights

          • Developed a data preprocessing strategy to cope with missing values and with the mask effect that the high variation of abundant metabolites causes in data analysis.

          • Developed a new method, ‘x-VAST’, to correct the enlargement of measurement deviation introduced by scaling.

          • Applying this strategy rescued several masked, low-abundance differential metabolites.

          Metabolomics is a rapidly growing research field. Its success relies heavily on the discovery of differential metabolites by comparing different data sets (for example, patients vs. controls). One challenge is that differences in low-abundance metabolites between groups are often masked by the high variation of abundant metabolites. To address this challenge, this study proposes a novel data preprocessing strategy consisting of three steps. In step 1, a ‘modified 80%’ rule is used to reduce the effect of missing values. In step 2, unit-variance and Pareto scaling methods are used to reduce the mask effect from the abundant metabolites. In step 3, to correct the adverse effect of scaling, stability information of the variables, deduced from intensity information and class information, is used to assign suitable weights to the variables. When the strategy was applied to an LC/MS-based metabolomics dataset from a study of chronic hepatitis B patients and to two simulated datasets, the mask effect was partially eliminated and several new low-abundance differential metabolites were rescued.
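
The three steps described in the abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation. The ‘modified 80%’ rule is assumed here to keep a variable when it is detected in at least 80% of the samples of at least one class. The stability weight is a hypothetical stand-in (mean divided by standard deviation within each class, averaged over classes, in the spirit of VAST scaling) and is not the paper's x-VAST formula. All function names and the toy data are invented for illustration.

```python
from statistics import mean, stdev


def keep_by_modified_80_rule(values, labels, threshold=0.8):
    """Step 1 (assumed rule): keep a variable if it is detected (not None,
    greater than zero) in at least `threshold` of the samples of any one class."""
    for cls in set(labels):
        group = [v for v, lab in zip(values, labels) if lab == cls]
        detected = sum(1 for v in group if v is not None and v > 0)
        if detected / len(group) >= threshold:
            return True
    return False


def uv_scale(values):
    """Step 2a: unit-variance scaling, (x - mean) / sd."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def pareto_scale(values):
    """Step 2b: Pareto scaling, (x - mean) / sqrt(sd)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s ** 0.5 for v in values]


def stability_weight(values, labels):
    """Step 3 (hypothetical weight, NOT the x-VAST formula): average over
    classes of mean / sd, so variables that are stable within each class
    receive larger weights. Assumes complete data and a non-zero sd per class."""
    ratios = []
    for cls in sorted(set(labels)):
        group = [v for v, lab in zip(values, labels) if lab == cls]
        ratios.append(mean(group) / stdev(group))
    return mean(ratios)


# Toy data: 5 patients followed by 5 controls (made-up intensities).
labels = ["patient"] * 5 + ["control"] * 5
abundant = [1000, 1400, 800, 1300, 900, 1100, 700, 1500, 1200, 600]
low = [2.0, 2.2, 2.1, 1.9, 2.3, 1.0, 1.1, 0.9, 1.2, 1.0]
class_specific = [3, 4, 3, 5, 0, 0, 0, 0, 0, 0]  # detected in 4 of 5 patients
sparse = [0, 0, 0, 5, 0, 0, 0, 0, 0, 4]          # rarely detected in either class

# Scale the low-abundance variable, then re-weight it by its stability.
weighted_low = [stability_weight(low, labels) * v for v in uv_scale(low)]
```

In this toy data, `class_specific` passes the class-wise filter even though it is detected in only 4 of the 10 samples overall, and the stable low-abundance variable `low` receives a larger weight than the noisy abundant one.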

          Most cited references (32)

          Metabolomics--the link between genotypes and phenotypes.

          Metabolites are the end products of cellular regulatory processes, and their levels can be regarded as the ultimate response of biological systems to genetic or environmental changes. In parallel to the terms 'transcriptome' and 'proteome', the set of metabolites synthesized by a biological system constitutes its 'metabolome'. Yet, unlike other functional genomics approaches, the unbiased simultaneous identification and quantification of plant metabolomes has been largely neglected. Until recently, most analyses were restricted to profiling selected classes of compounds, or to fingerprinting metabolic changes without sufficient analytical resolution to determine metabolite levels and identities individually. As a prerequisite for metabolomic analysis, careful consideration of the methods employed for tissue extraction, sample preparation, data acquisition, and data mining must be taken. In this review, the differences among metabolite target analysis, metabolite profiling, and metabolic fingerprinting are clarified, and terms are defined. Current approaches are examined, and potential applications are summarized with a special emphasis on data mining and mathematical modelling of metabolism.

            Metabolomic profiles delineate potential role for sarcosine in prostate cancer progression.

            Multiple, complex molecular events characterize cancer development and progression. Deciphering the molecular networks that distinguish organ-confined disease from metastatic disease may lead to the identification of critical biomarkers for cancer invasion and disease aggressiveness. Although gene and protein expression have been extensively profiled in human tumours, little is known about the global metabolomic alterations that characterize neoplastic progression. Using a combination of high-throughput liquid-and-gas-chromatography-based mass spectrometry, we profiled more than 1,126 metabolites across 262 clinical samples related to prostate cancer (42 tissues and 110 each of urine and plasma). These unbiased metabolomic profiles were able to distinguish benign prostate, clinically localized prostate cancer and metastatic disease. Sarcosine, an N-methyl derivative of the amino acid glycine, was identified as a differential metabolite that was highly increased during prostate cancer progression to metastasis and can be detected non-invasively in urine. Sarcosine levels were also increased in invasive prostate cancer cell lines relative to benign prostate epithelial cells. Knockdown of glycine-N-methyl transferase, the enzyme that generates sarcosine from glycine, attenuated prostate cancer invasion. Addition of exogenous sarcosine or knockdown of the enzyme that leads to sarcosine degradation, sarcosine dehydrogenase, induced an invasive phenotype in benign prostate epithelial cells. Androgen receptor and the ERG gene fusion product coordinately regulate components of the sarcosine pathway. Here, by profiling the metabolomic alterations of prostate cancer progression, we reveal sarcosine as a potentially important metabolic intermediary of cancer cell invasion and aggressivity.

              Chemometrics in metabonomics.

              We provide an overview of how the underlying philosophy of chemometrics is integrated throughout metabonomic studies. Four steps are demonstrated: (1) definition of the aim, (2) selection of objects, (3) sample preparation and characterization, and (4) evaluation of the collected data. This includes the tools applied for linear modeling, for example, Statistical Experimental Design (SED), Principal Component Analysis (PCA), Partial least-squares (PLS), Orthogonal-PLS (OPLS), and dynamic extensions thereof. This is illustrated by examples from the literature.

                Author and article information

                Journal
                Frontiers in Molecular Biosciences (Front. Mol. Biosci.)
                Frontiers Media S.A.
                ISSN: 2296-889X
                Published: 02 February 2015
                Volume: 2
                Article: 4
                Affiliations
                [1] Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, Dalian, China
                [2] Department of Entomology and Nematology, University of California, Davis, Davis, CA, USA
                [3] School of Computer Science and Technology, Dalian University of Technology, Dalian, China
                Author notes

                Edited by: Manuel Portero-Otin, IRBLLEIDA-UdL, Spain

                Reviewed by: Atsushi Fukushima, RIKEN, Japan; Hunter N. B. Moseley, University of Kentucky, USA

                *Correspondence: Guowang Xu, Key Laboratory of Separation Science for Analytical Chemistry, Dalian Institute of Chemical Physics, Chinese Academy of Sciences, 457 Zhongshan Road, Dalian 116023, China, e-mail: xugw@dicp.ac.cn;
                Jun Yang, Department of Entomology and Nematology, University of California, One Shields Ave, Davis, CA 95616, USA, e-mail: junyang@ucdavis.edu

                This article was submitted to Metabolomics, a section of the journal Frontiers in Molecular Biosciences.

                Article
                DOI: 10.3389/fmolb.2015.00004
                PMCID: PMC4428451
                PMID: 25988172
                Copyright © 2015 Yang, Zhao, Lu, Lin and Xu.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 01 October 2014
                Accepted: 09 January 2015
                Page count
                Figures: 6, Tables: 3, Equations: 7, References: 38, Pages: 9, Words: 5573
                Categories
                Molecular Biosciences
                Original Research Article

                Keywords: metabolomics, data preprocessing, pattern recognition, biomarkers, differential metabolites
