      The PRIDE database and related tools and resources in 2019: improving support for quantification data



          The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world’s largest data repository of mass spectrometry-based proteomics data, and is one of the founding members of the global ProteomeXchange (PX) consortium. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2016. In the last 3 years, public data sharing through PRIDE (as part of PX) has become the norm in the field. In parallel, re-use of public proteomics data has increased enormously, with multiple applications. We first describe the new architecture of PRIDE Archive, the archival component of PRIDE. PRIDE Archive and the related data submission framework have been further developed to support the increase in submitted data volumes and additional data types. A new scalable and fault-tolerant storage backend, Application Programming Interface and web interface have been implemented, as part of an ongoing process. Additionally, we emphasize the improved support for quantitative proteomics data through the mzTab format. Finally, we outline key statistics on the current data contents and volume of downloads, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.


                Author and article information

                Nucleic Acids Res
                Nucleic Acids Research
                Oxford University Press
                08 January 2019
                05 November 2018
                Volume: 47
                Issue: Database issue
                Pages: D442-D450
                [1] European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
                [2] Division of Immunology, Allergy and Infectious Diseases, Department of Dermatology, Medical University of Vienna, Vienna, 1090, Austria
                [3] Ruhr University Bochum, Medical Faculty, Medizinisches Proteom-Center, D-44801 Bochum, Germany
                [4] Applied Bioinformatics, Department for Computer Science, University of Tuebingen, Sand 14, 72076 Tuebingen, Germany
                [5] Computational Systems Biochemistry, Max Planck Institute for Biochemistry, Martinsried, 82152, Germany
                [6] Department of Congenital Heart Disease and Pediatric Cardiology, Universitätsklinikum Schleswig–Holstein Kiel, Kiel, 24105, Germany
                Author notes
                To whom correspondence should be addressed. Tel: +44 0 1223 492513; Fax: 01223 484696; Email: yperez@ebi.ac.uk. Correspondence may also be addressed to Dr. Juan Antonio Vizcaíno. Tel: +44 0 1223 492686; Fax: 01223 484696; Email: juan@ebi.ac.uk
                Author information
                © The Author(s) 2018. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                Accepted: 22 October 2018
                Revised: 19 October 2018
                Received: 22 September 2018
                Page count
                Pages: 9
                Funded by: Wellcome Trust 10.13039/100004440
                Award ID: WT101477MA
                Award ID: 208391/Z/17/Z
                Funded by: Biotechnology and Biological Sciences Research Council 10.13039/501100000268
                Award ID: BB/K01997X/1
                Award ID: BB/L024225/1
                Award ID: BB/P024599/1
                Funded by: UK-Japan Partnership
                Award ID: BB/N022440/1
                Funded by: National Institutes of Health 10.13039/100000002
                Award ID: R24 GM127667-01
                Funded by: Thor Industries 10.13039/100004694
                Award ID: 654039
                Funded by: Horizon 2020 10.13039/501100007601
                Award ID: 686547
                Database Issue


