      The mzTab Data Exchange Format: Communicating Mass-spectrometry-based Proteomics and Metabolomics Experimental Results to a Wider Audience

      research-article

          Abstract

          The HUPO Proteomics Standards Initiative has developed several standardized data formats to facilitate data sharing in mass spectrometry (MS)-based proteomics. These allow researchers to report their complete results in a unified way. However, at present, there is no format to describe the final qualitative and quantitative results for proteomics and metabolomics experiments in a simple tabular format. Many downstream analysis use cases are only concerned with the final results of an experiment and require an easily accessible format, compatible with tools such as Microsoft Excel or R.

          We developed the mzTab file format for MS-based proteomics and metabolomics results to meet this need. mzTab is intended as a lightweight supplement to the existing standard XML-based file formats (mzML, mzIdentML, mzQuantML), providing a comprehensive summary, similar in concept to the supplemental material of a scientific publication. mzTab files can contain protein, peptide, and small molecule identifications together with experimental metadata and basic quantitative information. The format is not intended to store the complete experimental evidence but provides mechanisms to report results at different levels of detail. These range from a simple summary of the final results to a representation of the results including the experimental design. This format is ideally suited to make MS-based proteomics and metabolomics results available to a wider biological community outside the field of MS. Several software tools for proteomics and metabolomics have already adopted it as an output format. The comprehensive mzTab specification document and extensive additional documentation can be found online.
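To make the "simple tabular format" concrete, the following is a minimal Python sketch of how such a file might be read. It assumes the line-prefix convention of the mzTab 1.0 specification, in which each tab-separated line starts with a three-letter code (`MTD` for metadata, `PRH`/`PRT` for the protein header and rows, `PEH`/`PEP` for peptides, `PSH`/`PSM` for peptide-spectrum matches, `SMH`/`SML` for small molecules, `COM` for comments). The sample content, its reduced column set, and the `read_mztab` helper are hypothetical and written for illustration only; they are not part of the article or of any official mzTab library.

```python
import csv
import io
from collections import defaultdict

# Hypothetical, hand-written mzTab-like content (not real data, and with far
# fewer columns than a valid mzTab file would carry).
SAMPLE = "\n".join([
    "COM\tExample file, not real data",
    "MTD\tmzTab-version\t1.0.0",
    "MTD\tdescription\tToy example",
    "PRH\taccession\tdescription",
    "PRT\tP12345\tExample protein A",
    "PRT\tQ67890\tExample protein B",
    "SMH\tidentifier\tdescription",
    "SML\tCID:0000\tExample small molecule",
])

# Each header prefix announces the columns of one kind of data row.
HEADER_TO_ROW = {"PRH": "PRT", "PEH": "PEP", "PSH": "PSM", "SMH": "SML"}


def read_mztab(handle):
    """Split an mzTab-like stream into a metadata dict and per-section row lists."""
    metadata = {}
    headers = {}
    tables = defaultdict(list)
    for fields in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if not fields or fields[0] == "COM":
            continue  # skip blank lines and comment lines
        prefix, rest = fields[0], fields[1:]
        if prefix == "MTD" and len(rest) >= 2:
            metadata[rest[0]] = rest[1]
        elif prefix in HEADER_TO_ROW:
            headers[HEADER_TO_ROW[prefix]] = rest
        elif prefix in headers:
            # Pair each value with the column name from the section header.
            tables[prefix].append(dict(zip(headers[prefix], rest)))
    return metadata, dict(tables)


metadata, tables = read_mztab(io.StringIO(SAMPLE))
```

Because every section is an ordinary tab-separated table, the same rows could be loaded into a spreadsheet or a data frame after filtering on the first column, which is the kind of downstream use the abstract describes.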

          Most cited references (28)

          Gene Ontology: tool for the unification of biology

          Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

            HMDB: a knowledgebase for the human metabolome

            The Human Metabolome Database (HMDB, http://www.hmdb.ca) is a richly annotated resource that is designed to address the broad needs of biochemists, clinical chemists, physicians, medical geneticists, nutritionists and members of the metabolomics community. Since its first release in 2007, the HMDB has been used to facilitate the research for nearly 100 published studies in metabolomics, clinical biochemistry and systems biology. The most recent release of HMDB (version 2.0) has been significantly expanded and enhanced over the previous release (version 1.0). In particular, the number of fully annotated metabolite entries has grown from 2180 to more than 6800 (a 300% increase), while the number of metabolites with biofluid or tissue concentration data has grown by a factor of five (from 883 to 4413). Similarly, the number of purified compounds with reference to NMR, LC-MS and GC-MS spectra has more than doubled (from 380 to more than 790 compounds). In addition to this significant expansion in database size, many new database searching tools and new data content have been added or enhanced. These include better algorithms for spectral searching and matching, more powerful chemical substructure searches, faster text searching software, as well as dedicated pathway searching tools and customized, clickable metabolic maps. Changes to the user interface have also been implemented to accommodate future expansion and to make database navigation much easier. These improvements should make the HMDB much more useful to a much wider community of users.

              Open mass spectrometry search algorithm.

              Large numbers of MS/MS peptide spectra generated in proteomics experiments require efficient, sensitive and specific algorithms for peptide identification. In the Open Mass Spectrometry Search Algorithm (OMSSA), specificity is calculated by a classic probability score using an explicit model for matching experimental spectra to sequences. At default thresholds, OMSSA matches more spectra from a standard protein cocktail than a comparable algorithm. OMSSA is designed to be faster than published algorithms in searching large MS/MS datasets.

                Author and article information

                Journal
                Molecular & Cellular Proteomics (Mol. Cell Proteomics; MCP)
                Publisher: The American Society for Biochemistry and Molecular Biology
                ISSN: 1535-9476, 1535-9484
                Dates: October 2014; 30 June 2014
                Volume: 13
                Issue: 10
                Pages: 2765-2775
                Affiliations
                [1]From the ‡European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK;
                [2]§Division of Immunology, Allergy and Infectious Diseases, Department of Dermatology, Medical University of Vienna, Vienna, Austria;
                [3]‖Institute of Integrative Biology, University of Liverpool, L69 7ZB, Liverpool, UK;
                [4]**Center for Bioinformatics and Department of Computer Science, University of Tübingen, D-72076 Tübingen, Germany;
                [5]‡‡Computational Proteomics Unit and Cambridge Centre for Proteomics, Cambridge Systems Biology Centre, Department of Biochemistry, University of Cambridge, CB2 1QR, Cambridge, UK;
                [6]§§Institute for Genomics and Bioinformatics, Graz University of Technology, Petersgasse 14/V, 8010 Graz, Austria;
                [7]¶¶Core Facility Bioinformatics, Austrian Centre of Industrial Biotechnology (ACIB GmbH), Petersgasse 14/V, 8010 Graz, Austria;
                [8]‖‖Department of Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Am Klopferspitz 18, D-82152 Martinsried, Germany;
                [9] a Department of Stress and Developmental Biology, Leibniz Institute of Plant Biochemistry, 06120 Halle (Saale), Germany;
                [10] b School of Biological and Chemical Sciences, Queen Mary University of London, London, UK;
                [11] c College of Computer, Hubei University of Education, Wuhan, China;
                [12] d Center for Computational Mass Spectrometry, University of California, San Diego, La Jolla, CA;
                [13] e Swiss-Prot group, SIB Swiss Institute of Bioinformatics, 1 Rue Michel Servet, 1211 Geneva, Switzerland;
                [14] f Vital-IT group, SIB Swiss Institute of Bioinformatics, Quartier Sorge, Genopode 1015 Lausanne;
                [15] g Center of Integrative Genomics, University of Lausanne, Quartier Sorge Genopode, 1015 Lausanne;
                [16] h Quantitative Biology Center, University of Tübingen, D-72076 Tübingen, Germany
                Author notes
                i To whom correspondence should be addressed: Dr. Juan Antonio Vizcaíno, European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD, Hinxton, Cambridge, UK, Tel.: 44-1223-492-610, Fax: 44-1223-494-484, E-mail: juan@ebi.ac.uk.

                ¶ These authors contributed to this work equally.

                Article
                Publisher ID: O113.036681
                DOI: 10.1074/mcp.O113.036681
                PMC ID: 4189001
                PubMed ID: 24980485
                44c2cc34-61df-4fa1-9a41-d809566af271
                © 2014 by The American Society for Biochemistry and Molecular Biology, Inc.

                Author's Choice—Final version full access.

                History
                11 December 2013
                20 June 2014
                Funding
                Funded by: National Institutes of Health
                Award ID: 8P41GM103485-05
                Categories
                Technological Innovation and Resources

                Molecular biology
