The PRIDE database and related tools and resources in 2019: improving support for quantification data

Abstract

The PRoteomics IDEntifications (PRIDE) database (https://www.ebi.ac.uk/pride/) is the world’s largest data repository of mass spectrometry-based proteomics data, and is one of the founding members of the global ProteomeXchange (PX) consortium. In this manuscript, we summarize the developments in PRIDE resources and related tools since the previous update manuscript was published in Nucleic Acids Research in 2016. In the last 3 years, public data sharing through PRIDE (as part of PX) has become the norm in the field. In parallel, re-use of public proteomics data has increased enormously, with multiple applications. We first describe the new architecture of PRIDE Archive, the archival component of PRIDE. PRIDE Archive and the related data submission framework have been further developed to support the increase in submitted data volumes and additional data types. A new scalable and fault-tolerant storage backend, Application Programming Interface and web interface have been implemented, as part of an ongoing process. Additionally, we emphasize the improved support for quantitative proteomics data through the mzTab format. Finally, we outline key statistics on the current data contents and volume of downloads, and how PRIDE data are starting to be disseminated to added-value resources including Ensembl, UniProt and Expression Atlas.

Author and article information

Journal

Journal ID (nlm-ta): Nucleic Acids Res

Journal ID (iso-abbrev): Nucleic Acids Res

Journal ID (publisher-id): nar

Title: Nucleic Acids Research

Publisher: Oxford University Press

ISSN (Print): 0305-1048

ISSN (Electronic): 1362-4962

Publication date (Print): 08 January 2019

Publication date (Electronic): 05 November 2018

Publication date PMC-release: 05 November 2018

Volume: 47

Issue: Database issue

Pages: D442-D450

Affiliations

[1] European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK

[2] Division of Immunology, Allergy and Infectious Diseases, Department of Dermatology, Medical University of Vienna, Vienna, 1090, Austria

[3] Ruhr University Bochum, Medical Faculty, Medizinisches Proteom-Center, D-44801 Bochum, Germany

[4] Applied Bioinformatics, Department for Computer Science, University of Tuebingen, Sand 14, 72076 Tuebingen, Germany

[5] Computational Systems Biochemistry, Max Planck Institute for Biochemistry, Martinsried, 82152, Germany

[6] Department of Congenital Heart Disease and Pediatric Cardiology, Universitätsklinikum Schleswig–Holstein Kiel, Kiel, 24105, Germany

Author notes

To whom correspondence should be addressed. Tel: +44 0 1223 492513; Fax: 01223 484696; Email: yperez@ebi.ac.uk. Correspondence may also be addressed to Dr. Juan Antonio Vizcaíno. Tel: +44 0 1223 492686; Fax: 01223 484696; Email: juan@ebi.ac.uk

Author information

Juan Antonio Vizcaíno http://orcid.org/0000-0002-3905-4335

Article

Publisher ID: gky1106

DOI: 10.1093/nar/gky1106

PMC ID: 6323896

PubMed ID: 30395289

SO-VID: 6f5d4df0-6f89-41b4-94df-dbb6aae3d126

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date accepted: 22 October 2018

Date revision received: 19 October 2018

Date received: 22 September 2018

Page count

Pages: 9

Funding

Funded by: Wellcome Trust 10.13039/100004440

Award ID: WT101477MA

Award ID: 208391/Z/17/Z

Funded by: Biotechnology and Biological Sciences Research Council 10.13039/501100000268

Award ID: BB/K01997X/1

Award ID: BB/L024225/1

Award ID: BB/P024599/1

Funded by: UK-Japan Partnership

Award ID: BB/N022440/1

Funded by: National Institutes of Health 10.13039/100000002

Award ID: R24 GM127667-01

Funded by: Thor Industries 10.13039/100004694

Award ID: 654039

Funded by: Horizon 2020 10.13039/501100007601

Award ID: 686547
