
      Evaluation and comparison of bioinformatic tools for the enrichment analysis of metabolomics data

      research-article


          Abstract

          Background

          Bioinformatic tools for the enrichment of ‘omics’ datasets facilitate the interpretation and understanding of data. To date, few are suitable for metabolomics datasets. The main objective of this work is to give, for the first time, a critical overview of the performance of these tools. To that end, datasets from metabolomic repositories were selected and enriched data were created. Both types of data were analysed with these tools, and the outputs were thoroughly examined.

          Results

          An exploratory multivariate analysis of the most used tools for the enrichment of metabolite sets, based on non-metric multidimensional scaling (NMDS) of Jaccard distances, was performed and reflected their diversity. The codes (identifiers) of the metabolites in the datasets were searched in different metabolite databases (HMDB, KEGG, PubChem, ChEBI, BioCyc/HumanCyc, LipidMAPS, ChemSpider, METLIN and Recon2). The databases containing the most identifiers of the metabolites in the dataset were PubChem, followed by METLIN and ChEBI. However, these databases had duplicated entries and might present false positives. The performance of over-representation analysis (ORA) tools, including BioCyc/HumanCyc, ConsensusPathDB, IMPaLA, MBRole, MetaboAnalyst, Metabox, MetExplore, MPEA, PathVisio and Reactome, and of the mapping tool KEGGREST, was examined. Results were mostly consistent among tools and between real and enriched data despite the variability of the tools. Nevertheless, a few discrepant results, such as differences in the total number of metabolites, were also found. Disease-based enrichment analyses were also assessed, but they were not found to be accurate, probably because metabolite disease sets are not up to date and because predicting diseases from a list of metabolites is difficult.
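          As a rough illustration of two computations named in the Results, the sketch below computes a Jaccard distance between two sets (the dissimilarity on which the NMDS was based) and an upper-tail hypergeometric p-value, which is one common formulation of ORA. The abstract does not state which statistical test each tool applies, so the hypergeometric test is an assumption here, and the function and parameter names are hypothetical.

```python
from math import comb


def jaccard_distance(a: set, b: set) -> float:
    """Jaccard distance 1 - |A ∩ B| / |A ∪ B| (0.0 for two empty sets)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def ora_p_value(hits: int, query_size: int, set_size: int,
                background_size: int) -> float:
    """Upper-tail hypergeometric probability P(X >= hits).

    Assumed setup (not taken from the article):
      background_size: number of metabolites in the reference background
      set_size:        number of background metabolites in the pathway
      query_size:      number of metabolites in the submitted list
      hits:            number of submitted metabolites found in the pathway
    """
    max_hits = min(query_size, set_size)
    total = comb(background_size, query_size)
    # Sum the probability of every outcome at least as extreme as `hits`.
    tail = sum(
        comb(set_size, k) * comb(background_size - set_size, query_size - k)
        for k in range(hits, max_hits + 1)
    )
    return tail / total
```

          In this sketch, `jaccard_distance` would be applied to every pair of tools (each described as a set of features) to build the distance matrix passed to an NMDS routine, while `ora_p_value` would be evaluated once per pathway and normally followed by a multiple-testing correction.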

          Conclusions

          We have extensively reviewed the state of the art of the available range of tools for metabolomic datasets, the completeness of metabolite databases, the performance of ORA methods and disease-based analyses. Despite their variability, the tools provided consistent results independent of their analytic approach. However, more work is required on the completeness of metabolite and pathway databases, which strongly affects the accuracy of enrichment analyses. Such improvements will translate into more accurate and global insights into the metabolome.

          Electronic supplementary material

          The online version of this article (10.1186/s12859-017-2006-0) contains supplementary material, which is available to authorized users.


                Author and article information

                Contributors
                marcoramell@gmail.com
                palaumagali@ub.edu
                a.alaybadosa@gmail.com
                sara.tulipani@gmail.com
                murpi@ub.edu
                asanchez@ub.edu
                (+34) 934034840, candres@ub.edu
                Journal
                BMC Bioinformatics
                BioMed Central (London)
                1471-2105
                2 January 2018
                2018; 19: 1
                Affiliations
                [1] ISNI 0000 0004 1937 0247, GRID grid.5841.8, Biomarkers & Nutrimetabolomics Laboratory, Nutrition, Food Science and Gastronomy Department, Food Technology Reference Net (XaRTA), Nutrition and Food Safety Research Institute (INSA-UB), Faculty of Pharmacy and Food Sciences, University of Barcelona, Barcelona, Spain
                [2] ISNI 0000 0000 9314 1427, GRID grid.413448.e, CIBER Fragilidad y Envejecimiento Saludable [CIBERfes], Instituto de Salud Carlos III [ISCIII], Madrid, Spain
                [3] ISNI 0000 0004 1937 0247, GRID grid.5841.8, Genetics, Microbiology and Statistics Department, Biology Faculty, University of Barcelona, Barcelona, Spain
                [4] ISNI 0000 0004 1763 0287, GRID grid.430994.3, Statistics and Bioinformatics Unit, Vall d’Hebron Institut de Recerca (VHIR), Barcelona, Spain
                Article
                2006
                DOI: 10.1186/s12859-017-2006-0
                PMCID: PMC5749025
                PMID: 29291722
                © The Author(s). 2017

                Open Access. This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

                History
                : 23 August 2017
                : 18 December 2017
                Funding
                Funded by: ISCIII-Subdirección General de Evaluación y Fomento de la Investigación
                Award ID: PI13/01172
                Funded by: CIBERfes
                Funded by: Fondo Europeo de Desarrollo Regional
                Funded by: Generalitat de Catalunya's Agency AGAUR
                Award ID: 2014SGR1566
                Award ID: 2014 SGR 464
                Funded by: MINECO
                Award ID: Juan de la Cierva postdoctoral fellowship
                Award ID: Juan de la Cierva postdoctoral fellowship
                Award ID: Ramon y Cajal postdoctoral fellowship
                Award ID: MTM2015-64465-C2-1-R
                Funded by: Universitat de Barcelona (FundRef http://dx.doi.org/10.13039/501100005774)
                Award ID: APIF predoctoral fellowship
                Categories
                Research Article
                Custom metadata
                © The Author(s) 2018

                Bioinformatics & Computational biology
                bioinformatic tools, database, enrichment, humancyc, kegg, metabolite, metabolomics, over-representation analysis, pathway, reactome
