Protein phosphorylation is a post-translational
modification (PTM) that orchestrates a diverse array of cellular processes.
Because this modification serves as a rapid and reversible means to
modulate protein activity and transduce signals, the regulation of
phosphorylation is a central mechanism in cell health and disease.
1,2
The addition and removal of phosphoryl modifications via kinases
and phosphatases, respectively, make the landscape of phosphorylation
particularly dynamic.
3−5
Understanding the complex networks and functions
coordinated by phosphorylation requires knowledge of specific amino
acid modifications with both spatial and temporal resolution, a task
that remains a challenging analytical endeavor.
6−8
Mass spectrometry
(MS) has emerged as the premier tool for global PTM analysis, boasting
high sensitivity, considerable throughput, and the capacity to localize
modifications to a single residue.
9,10
Indeed, MS-centric
phosphoproteomics has become a standard approach for investigating
protein phosphorylation in laboratories worldwide.
Analytical
Chemistry last reviewed the contribution of MS and related
technologies to phosphoproteomics in 2011.
11
Since that time, MS methodology has developed at an impressive pace.
While routine proteomic experiments can now analyze thousands of proteins
in just a few hours, rather than days or weeks, characterizing the
global phosphoproteome is significantly more challenging than measuring
nonmodified proteins. The relatively low abundance of phosphorylated
peptides and the need for residue-specific information require special
considerations in sample handling, data acquisition, and postacquisition
processing that constrain reproducibility, quantitative efficacy,
throughput, and depth in phosphoproteomic workflows. Advances in MS-based
approaches have remarkably improved our abilities to investigate the
many roles of protein phosphorylation across a diverse set of biological
contexts, but many technical obstacles still exist. Poor run-to-run
overlap, challenges in confident phosphosite assignment, and complications
inherent to various quantitative strategies limit biological insight,
despite ever-increasing numbers of detected phosphopeptides. Focusing
on work from the past two years (2013–2015), this review examines
major developments in MS technology that have enabled the characterization
of tens of thousands of phosphopeptides in a given experiment, and
considers the contribution of this analytical power to translational
research. We also discuss how future innovation can address technical
challenges of today’s methods, and we offer our perspective
on how phosphoproteomics will continue to mature.
Sampling the
Phosphoproteome
As much as a third of eukaryotic proteins
are estimated to be phosphorylated.
12
However,
because phosphorylation is a low stoichiometry modification, phosphopeptides
(or phosphoproteins) must be enriched from complex mixtures that have
high backgrounds of nonphosphorylated moieties. Typically, phosphoproteomic
experiments involve LC–MS/MS analysis (i.e., chromatographic
separations coupled to tandem MS) of phosphopeptides that have been
isolated from an enzymatic digestion of proteins from lysed cells
(Figure 1
). Ongoing
optimization efforts have focused on sample preparation protocols,
especially phosphopeptide enrichment strategies and fractionation
techniques to reduce complexity and maximize sampling depth. Proper
steps must be taken in sample collection, as well, to avoid unintentional
alteration of the phosphoproteome.
13,14
Beyond sample
handling, advancements in MS instrumentation have greatly improved
the speed and sensitivity of routine phosphoproteome interrogation.
These topics have been reviewed in the past,
15−19
but here we focus on how these techniques have addressed
challenges in run-to-run reproducibility and how they have contributed
to improvements in throughput and/or depth for phosphoproteomic experiments.
Figure 1
Typical
phosphoproteomic workflow. Each step in a phosphoproteomic experiment
can contribute to limitations in reproducibility and phosphoproteomic
depth, which can ultimately restrict the biological insight obtained
from an experiment. Concerted efforts in the phosphoproteomics community
to improve each step in this workflow continue to advance our ability
to sample the phosphoproteome with greater speed and depth, but comprehensive
phosphoproteome coverage remains out of reach.
Generating Phosphopeptides
Thanks to its high cleavage specificity
C-terminal to lysine and arginine residues, and to its proclivity
for producing peptides amenable to MS analysis, trypsin is the most
commonly used protease in proteomics and phosphoproteomics. However,
proximity of cleavage sites to phosphorylated amino acids can impair
tryptic digestion,
20
a problem which has
inspired the evaluation of multiple protease approaches for large-scale
phosphoproteomics.
21−24
Studies have described varying degrees of success, but generally
have demonstrated that utilizing two proteases improved both protein
sequence coverage and phosphoproteomic depth, i.e., the number of
identified phosphosites. Wisniewski and Mann found that consecutive
use of LysC and trypsin to generate phosphopeptides allowed them to
identify up to 40% more proteins and phosphorylation sites than a
one-step tryptic digestion. Subsequent experiments by others confirmed
the efficacy of this approach,
25
and combinations
of GluC and trypsin have also proven beneficial.
26
Furthermore, Heck and co-workers recently published a thorough
multiple-enzyme study, compiling a human phosphopeptide atlas composed
of 37 771 unique phosphopeptides that correspond to 18 430
unique phosphosites.
27
The overlap of sites
detected by the five proteases accounted for only a third of the total
number of sites. Clearly, the use of several orthogonal proteases
can significantly enhance phosphoproteomic sampling depth, enabling
detection of thousands of phosphosites that may be inaccessible in traditional
trypsin-only approaches (Figure 2
). That said, the considerable increase in data acquisition
time limits the applicability of this strategy for high-throughput
or large-scale comparisons across many samples.
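The cleavage rules discussed above can be illustrated with a minimal in silico digestion sketch. It is an illustration under stated assumptions, not the method used in any of the cited studies: the function name `digest`, its parameters, and the toy sequence are hypothetical, and the rules are simplified (cleavage strictly C-terminal to the listed residues, with no proline exception and no effect of nearby phosphorylation on cleavage efficiency).

```python
def digest(sequence, cleave_after="KR", missed_cleavages=0):
    """Cut a protein sequence C-terminal to each residue in cleave_after.

    Simplified rule set (assumption): no proline exception and no
    modification-dependent cleavage efficiency.
    """
    # First pass: fully cleaved pieces.
    pieces, start = [], 0
    for i, residue in enumerate(sequence):
        if residue in cleave_after:
            pieces.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        pieces.append(sequence[start:])

    # Second pass: join neighbouring pieces to model missed cleavages.
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides


toy_protein = "AAKSPRLLEK"  # hypothetical sequence, for illustration only
trypsin_like = digest(toy_protein, cleave_after="KR")
lysc_like = digest(toy_protein, cleave_after="K")
gluc_like = digest(toy_protein, cleave_after="E")
```

In this toy case the serine ends up in a different peptide for each rule set (`SPR`, `SPRLLEK`, and `AAKSPRLLE`), which is one simple way to see why complementary proteases can expose sites that a trypsin-only digest presents poorly to the mass spectrometer.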
Figure 2
Phosphoproteomics using
complementary proteases. Phosphosite intensities correlate strongly
(r > 0.8, yellow) when data sets are obtained
following digestion with the same protease but correlation between
data sets originating from different proteases is low (r ∼ 0.25–0.55, blue). This
difference indicates that
using multiple proteases provides access to different regions of the
phosphoproteome. Reprinted with permission from Giansanti, P.; Aye,
T. T.; van den Toorn, H.; Peng, M.; van Breukelen, B.; Heck, A. J.
R. Cell Rep.
2015, 11, 1834–1843 (ref (27)). Copyright 2015 Cell Press.
Enrichment Strategies
Phosphopeptide enrichment arguably
introduces the most variation of any step into a standard phosphoproteomic
workflow. A variety of enrichment strategies have emerged as the field
has evolved, with metal-based affinity enrichment leading in popularity.
The two most prevalent metal-based methods are immobilized metal affinity
chromatography (IMAC) and metal oxide affinity chromatography (MOAC).
An established enrichment strategy in phosphoproteomics for over two
decades, IMAC uses transition metal cations (Fe3+, Ga3+, Zr4+, etc.) as affinity agents
for negatively
charged phosphate groups. These cations are immobilized via chelation
on a substrate, such as magnetic beads or silica-based resins, which
enables selective retention of phosphopeptides over nonphosphorylated
peptides. A recently described titanium(IV) substrate (Ti4+-IMAC) has been growing
in popularity in many laboratories.
28
MOAC, which has seen nearly a decade of broad
use, similarly leverages the affinity of oxygen in phosphoryl groups
for metals in metal oxide matrixes. Titanium dioxide (TiOx) is the
most commonly employed MOAC agent, although zirconium dioxide and
magnetite (Fe3O4) are also used. Both IMAC and
MOAC generally enrich phosphopeptides with phosphoserine (pSer), phosphothreonine
(pThr), and phosphotyrosine (pTyr) residues.
The success and
popularity of IMAC and MOAC enrichment methods derive from sustained,
widespread efforts in the phosphoproteomics community to improve protocols.
These efforts, however, have not yet produced consensus on which approach
performs best for global phosphopeptide enrichment: IMAC, MOAC, or
a combination thereof. A popular focus of recent years has been on
optimization of sequential or combined enrichment strategies to garner
the benefits of multiple metal cations or enrichment substrates. Combinations
of iron and gallium IMAC,
29
iron and titanium
IMAC,
30
and gallium IMAC and TiOx
31
have been used to enrich different classes of
phosphopeptides with moderate to considerable success. Hunt and co-workers
used complementary metal cation chelation groups like nitrilotriacetic
acid (NTA) and iminodiacetic acid (IDA) in Fe3+-IMAC enrichment
columns to identify hundreds of phosphopeptides from nanograms or
even picograms of starting material, an order of magnitude less than
other contemporary methods.
32
Several groups
have utilized solution additives, like glycerol, bis-Tris propane,
citric acid, or decoy amino acids (e.g., asparagine and glutamine),
to improve TiOx and other MOAC enrichment efficiencies.
33−35
Even different particle sizes of MOAC resins have been investigated
to compare phosphopeptide capture capacity and specificity.
36
Many of these studies attempt to mitigate issues
with reproducibility and sampling depth that challenge single-stage
enrichment strategies. In general, however, these approaches only
partially address the problem. They often introduce additional steps
in the sample handling workflow, which can increase rather than alleviate
variation. Furthermore, these protocols demonstrate a high degree
of enrichment orthogonality through the combination and optimization
of two strategies, but the overall result is often just as variable
from experiment to experiment as single-stage enrichment.
To
address the need for a simple yet robust enrichment strategy, several
studies have offered head-to-head comparisons of single-stage methods
using state-of-the-art enrichment protocols. Matheron et al. compared
Ti4+-IMAC and TiOx enrichments on HeLa cell digests and
on >23 000 synthetic phosphopeptides (pSer, pThr, and pTyr)
and their nonmodified counterparts.
37
Although
overlap was only ∼42% between Ti4+-IMAC and TiOx
enrichments of the phosphopeptide libraries, they found no clear differences
between the phosphopeptides enriched with the two methods when considering
peptide length, site position, isoelectric point, hydrophobicity,
motif analysis, and relative abundance of phosphopeptides (Figure 3
); they did observe,
however, a minor bias toward multiply phosphorylated peptides in Ti4+-IMAC versus
TiOx. When enrichments on HeLa cells were combined,
the results from both methods showed an increased number of localized
phosphosites, which indicates that tandem enrichment strategies for
titanium-based methods may still be valuable to increase phosphoproteomic
depth. Ultimately, the lack of bias between the two methods demonstrated
that biological origin, rather than methodological artifact, is largely
responsible for observed differences in comparisons of studies using
the two approaches.
Figure 3
Examining enrichment biases between Ti4+-IMAC
and TiOx. Frequency plots show physicochemical characteristics of a
phosphopeptide library (∼23 000 phosphopeptides) that
was analyzed via direct analysis (orange), Ti4+-IMAC enrichment
(blue), and TiOx enrichment (green). No major differences between
the enrichment strategies are evident when considering phosphopeptide
length (A), relative position of the phosphosite (B), number of basic
or acidic residues in the −1 to +1 position of the phosphosite
(C and D, respectively), calculated isoelectric point of the phosphopeptides
(E), or calculated Gravy hydropathy index (F). When considering replicate
Ti4+-IMAC and TiOx enrichments in HeLa cells (G), overlap
between replicates of the same method is not superb (requiring ∼4–5
replicates to approach asymptotical gains), but good phosphoproteomic
depth can be achieved by batching replicate measurements. Combining
replicate enrichments from the two methods also boosts phosphosite
identification. Reprinted with permission from Matheron, L.; van den
Toorn, H.; Heck, A. J. R.; Mohammed, S. Anal. Chem.
2014, 86, 8312–8320 (ref (37)). Copyright 2015 American
Chemical Society.
As a complement to this
study, Ruprecht et al. reported a comprehensive and reproducible enrichment
using Fe3+-IMAC in HPLC column format.
38
When they compared this strategy to Ti4+-IMAC
and TiOx, they found that the Fe3+-IMAC column performed
best, allowing identification of ∼5500 unique phosphosites
in triplicate 4 h analyses and as many as 15 000 phosphopeptides
in 48 h of analysis of fractionated samples. Moreover, they showed
that the orthogonality of the Fe3+-IMAC, Ti4+-IMAC, and TiOx methods was greatly reduced
when the phosphoproteomic
depth was increased via hydrophilic strong anion exchange fractionation.
They thus dismissed the concept of orthogonality between the methods
and attributed most of the previously reported complementarity to
artifacts of nonoptimized analytical methods, e.g., limited binding
capacity of the enrichment material, biased or incomplete elution,
compromised enrichment scaffold (tips, beads, etc.), and limited analytical
capacity of the mass spectrometer. This observation suggests that
focusing on increasing sampling depth is key to improving reproducibility
in phosphoproteomic experiments. The Hummon group performed a similar
analysis with multistep enrichments but showed the converse result:
that TiOx and Fe3+-IMAC are indeed complementary.
39
Because it did not employ offline fractionation, however,
their work achieved less phosphoproteomic depth than the Fe3+-IMAC column study. In
light of this difference, there may be more
support for Ruprecht et al.’s argument that perceived orthogonality
diminishes as phosphoproteomic depth improves; however, a combination
of enrichment strategies as reported by the Hummon group can provide
a low-cost, time-efficient strategy to achieve greater depth when
access to HPLC fractionation is limited.
Even as IMAC and MOAC
methods dominate the field, alternative strategies for affinity-based
phosphopeptide enrichment have also continued to mature. Immunoprecipitation,
a canonical route for protein enrichment, is mainly limited to phosphotyrosine
studies in phosphoproteomics. Nevertheless, combinations of metal-based
and antibody-based affinity enrichments have proven useful for general
and pTyr-specific phosphoproteomics experiments.
40,41
Motif-based immune-affinity purification, affinity enrichment based
on polyhistidine tags, and polymer-based enrichment substrates have
also been successfully employed as alternative enrichment strategies.
42−45
Although they may align with more traditional biochemical methods
of purification, these approaches still struggle with reproducibility
due to nonspecific binding, batch-to-batch variability of antibody
production, and/or lack of dedicated effort from the field to refine
protocols for global phosphoproteomic experiments.
Affinity-based
methods can enrich intact phosphoproteins rather than digested phosphopeptides,
as well. The Ge group demonstrated that phosphoproteins could be selectively
enriched from complex cell and tissue lysates using superparamagnetic
Fe3O4 nanoparticles that were functionalized
via a glutaric acid linker with a zinc(II)-dipicolylamine coordination
complex to specifically bind phosphate groups.
46
These nanoparticles were designed for multivalent interaction
with phosphoproteins, which provided significantly higher enrichment
specificity than Fe3+-IMAC for intact phosphoproteins.
Liu et al., employing a hydrophilic antacid aluminum glycinate functionalization
for phosphate group affinity, also described nanoparticles for phosphoprotein
enrichment.
47
Interestingly, Hoehenwarter
et al. combined intact phosphoprotein enrichment with aluminum hydroxide
(Al(OH)3) with subsequent tryptic digestion and standard
TiOx phosphopeptide enrichment to study mitogen-activated protein
kinase substrates in Arabidopsis.
48
Because proteomics of intact proteins, especially of phosphoproteins,
in complex mixtures is still a maturing field, many of these protocols
have yet to see widespread use that could provide insight into their
reproducibility or utility in routine experiments.
In all, a
major challenge to establishing orthogonality or complementarity of
various enrichment methods comes from poor run-to-run reproducibility
in phosphoproteomic experiments. Analyzing back-to-back technical
replicates of the same sample often yields only 60–75% overlap
in identified phosphopeptides, and comparing technical replicates
of multiple enrichments further exacerbates this problem. Figure 3G
exemplifies this
prevalent, discipline-wide phenomenon. Starting at ∼2000 phosphopeptides
per single enrichment, each additional replicate contributed a significant
increase to the cumulative total of phosphopeptides until ∼3500
phosphopeptides were identified with inclusion of the fourth replicate.
And this was with phosphopeptide analysis from only a single enrichment
strategy! Results were similar in the Ruprecht et al. data, where
technical triplicate Fe3+-IMAC enrichments (without fractionation)
identified ∼7500 unique phosphopeptides but less than 50% of
those (∼3600) were detected in all three replicates.
38
Clearly, if not all phosphopeptides in a given
sample are identified, comparisons of observed phosphopeptides between
different enrichment methods can be misleading, hence, the great value
of technical replicate measurements in these studies. Although populations
of phosphopeptides enriched by a given method appear to be more similar
than previously thought, adequate sampling depth, whether it comes
from multidimensional chromatography, faster and more sensitive mass
spectrometers, or more reproducible strategies for data acquisition
(all discussed in the following sections), is imperative to understanding
the degree of overlap between enrichment methods for optimization
of single-stage and combinatorial approaches.
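The replicate-overlap reasoning above can be made concrete with a short sketch. It assumes that each run is reduced to a set of identified phosphopeptide strings and that "overlap" means intersection over union; the text does not specify which overlap definition the cited studies used, so that choice, the function names, and the toy identifiers are all assumptions.

```python
def pairwise_overlap(run_a, run_b):
    """Shared identifications divided by all identifications (intersection / union)."""
    a, b = set(run_a), set(run_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cumulative_unique(replicates):
    """Running total of unique identifications as replicates are added in order."""
    seen, totals = set(), []
    for run in replicates:
        seen |= set(run)
        totals.append(len(seen))
    return totals


def shared_by_all(replicates):
    """Identifications detected in every replicate."""
    sets = [set(run) for run in replicates]
    return set.intersection(*sets) if sets else set()


# Toy identifiers, for illustration only.
replicates = [
    {"pepA", "pepB", "pepC"},
    {"pepB", "pepC", "pepD"},
    {"pepC", "pepD", "pepE"},
]
overlap_1_2 = pairwise_overlap(replicates[0], replicates[1])
growth = cumulative_unique(replicates)
core = shared_by_all(replicates)
```

A cumulative curve that keeps climbing with each added replicate, alongside a small core detected in every run, is the pattern described for Figure 3G and for the Ruprecht et al. triplicates. It signals undersampling rather than genuine differences between the samples or methods being compared.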
Chromatographic Separations
To increase sampling depth of the phosphoproteome, multidimensional
chromatography has become a common practice for simplifying samples
across many fractions. Ubiquitously used in proteomics and phosphoproteomics,
reversed phase liquid chromatography (RPLC) is the online chromatography
of choice for LC–MS/MS experiments but several studies have
also explored various other chromatography modalities for both online
and offline fractionation and enrichment of phosphopeptides.
Ion exchange chromatographies, especially strong cation exchange
(SCX), are widely used separation approaches. SCX is usually combined
with metal-based phosphopeptide enrichment for large-scale phosphopeptide
enrichment, but Hennrich et al. demonstrated that two-dimensional
SCX using complementary basic and acidic buffers could isolate phosphopeptides
with no further enrichment required.
49
In
this case, one-dimensional SCX with basic buffers provided only 537
phosphopeptides from a HeLa cell digest. By contrast, the two-dimensional
approach using basic then acidic SCX separations enabled identification
of more than 10 000 phosphopeptides, 480 of which were also
seen in the one-dimensional separation and most of which were basic
phosphopeptides with two or more basic residues. SCX has also been
successfully combined with TiOx for improved enrichment of phosphotyrosine,
although it did not perform as well as a combination of pTyr antibodies
and TiOx.
50
In addition, rather than enrich
phosphopeptides, ion exchange chromatographies can deplete undesired
populations of phosphopeptides; for example, acidic phosphopeptides
can be removed with strong anion exchange (SAX) to enhance detection
of motifs associated with basophilic kinases.
51
In hydrophilic interaction chromatography (HILIC), which uses
a polar stationary phase and an organic-to-polar mobile phase gradient,
peptide retention is based on hydrophilicity (the opposite of RPLC).
Several groups have recently explored HILIC as an orthogonal dimension
of separation for phosphopeptides, coupling it with metal-based affinity
enrichments and SCX separations with varying success.
52,53
Another approach that has gained favor in phosphoproteomics is electrostatic
repulsion-hydrophilic interaction liquid chromatography (ERLIC). Combining
the principles of HILIC and anion exchange, ERLIC has been used both
as an enrichment strategy and in multidimensional chromatography approaches.
54,55
Alpert and co-workers recently published a comparison between ERLIC,
weak anion exchange (WAX), and SAX for fractionation in phosphoproteomic
experiments. ERLIC enriched and identified more than double the number
of phosphopeptides achieved by the anion exchange chromatographies.
56
This study also offered insight into the benefits
of solvent additives for ERLIC and the performance of WAX and SAX
at different pH values. Two-dimensional ERLIC in combination with
other modes of separation and enrichment has been shown to increase
phosphoproteomic depth as well.
57
Other
interesting alternatives for fractionation include chromatographic
separations of intact proteins prior to digestion and subsequent phosphopeptide
enrichment. Several groups have explored these approaches,
58−60
but they are less common than the peptide separations described
above.
Much like the search for the best metal-based phosphopeptide
enrichment strategy, optimal fractionation methods are still open
to debate, with recent discussion centered on the comparison of high
pH reversed phase (RP) fractionation versus SCX. In 2014, Batth et
al. evaluated offline high-pH RPLC fractionation head-to-head with
SCX fractionation, both with TiOx enrichment. They demonstrated a
surprising advantage to the RP approach.
61
In four biological replicates of mouse embryonic cells, high-pH
RPLC facilitated the identification of an average of 17 566
(±3 737) phosphopeptides, compared to an average of 6 215
(±1 759) phosphopeptides for SCX fractionation. Moreover,
optimization of high-pH RPLC conditions and MS acquisition parameters
more than doubled the number of phosphopeptides identified in biological
replicates (>37 000 in individual replicates, 27 712
localized phosphosites in total, Figure 4
). Corroborating this result, Yue et al.
reported a similar advantage for high-pH reversed phase separation.
They employed a multistep Fe3+-IMAC approach in combination
with high-pH RP cartridges that not only fractionated the phosphopeptides
but also desalted the samples.
62
The multistep
IMAC-RP cartridge workflow lessened starting material requirements,
reduced sample preparation time, and eliminated the need for HPLC
instrumentation while identifying 8 969 phosphopeptides (6 337
phosphosites) from 3 mg of human epithelial cells, compared to 5 519
phosphopeptides (3 686 phosphosites) from 15 mg of starting
material with the traditional SCX-Fe3+-IMAC approach. Others
have reported that the addition of solvent additives, such as EDTA,
can further improve RPLC fractionation.
63
Figure 4
Fractionation
of phosphopeptides with high pH RPLC. The comparison of high pH RPLC
and SCX offline fractionation (A) shows that the two methods identify
many of the same phosphosites, but high pH RPLC provides nearly 10 000
additional sites. Through further optimization, high pH RPLC provided
27 712 localized phosphosites in three replicate measurements
(B). The number of confidently localized phosphosites (C and D) demonstrates
the superior performance of an optimized high pH RPLC for phosphopeptide
fractionation. Reprinted with permission from Batth, T. S.; Francavilla,
C.; Olsen, J. V. J. Proteome Res.
2014, 13, 6176–6186 (ref (61)). Copyright 2015 American
Chemical Society.
From our perspective,
recent data lend clear support for high-pH RP fractionation over
SCX. Additionally, the RP approach is generally more flexible than
SCX because the buffers require no additional cleanup to be MS compatible.
By increasing sampling depth, the combination of phosphopeptide enrichment
and fractionation for extensive sample characterization has the potential
to mitigate reproducibility issues in routine phosphoproteomic experiments.
However, this benefit incurs significant cost in data acquisition
time, a balance we discuss further below.
Mass Spectrometry Instrumentation
Many hundreds or thousands of phosphopeptides may be introduced
into a mass spectrometer at any given moment of an LC–MS/MS
experiment. The speed and sensitivity of mass spectrometers thus play
critical roles in successful and reproducible phosphopeptide identification.
Nearly all phosphoproteomic experiments in recent years have been
conducted on hybrid MS systems that couple multiple mass analyzers
for gains in sensitivity, acquisition speed, and efficacy of tandem
MS (MS/MS). Such hybrid systems include quadrupole-time-of-flight
(qTOF), linear ion trap (LIT)-Orbitrap, quadrupole-Orbitrap, and ion
trap-Fourier transform ion cyclotron resonance (FTICR) mass spectrometers.
64−67
The Orbitrap Fusion, a “tribrid” quadrupole-Orbitrap-LIT
mass spectrometer, has been introduced within the past 2 years as
a powerful and versatile new platform. Through the parallelization
of many scan functions, this instrument can operate with ion trap
MS/MS acquisition rates at 22 Hz or greater, nearly double those of
previous ion trap-Orbitrap hybrids.
68
Its
acquisition speed has driven improvements in throughput and depth
in proteomics experiments,
69,70
and recent work has
highlighted its analytical power and throughput capabilities for phosphoproteomics.
In a study of mouse brain and liver tissue, Gygi and co-workers quantified
>38 000 phosphopeptides (11 015 phosphosites) across
10 samples using a multiplexed isobaric labeling strategy on the quadrupole-Orbitrap-LIT
platform.
71
They achieved this quantitative
phosphoproteomic depth, typically the result of a week or longer of
acquisition time on previous instruments, in only 2 days of analysis.
Enhanced throughput capabilities for phosphoproteomic experiments
have also come through improvements on an alternative Orbitrap platform,
a quadrupole-Orbitrap hybrid called the Q-Exactive HF. This instrument,
which relies entirely on high-resolution Orbitrap data acquisition
for MS and MS/MS scans, is equipped with a segmented quadrupole for
more robust precursor selection and transmission. Its high field Orbitrap,
capable of achieving a sequencing speed above 20 Hz, permits faster
acquisition times for a given resolution.
72,73
The Olsen group, in addition to leveraging the strengths of this
instrument for the in-depth characterization of offline high-pH RPLC
studies described above,
61
showed that
the Q-Exactive HF can identify more than 7600 unique phosphopeptides
(6831 of which were localized) from a HeLa cell digest in an hour
of acquisition time.
61,72
Interestingly, they also explored
optimal instrument parameters to show that acquisition speed may not
always be the most important metric for phosphoproteomic experiments.
Rather, the collection of high-quality fragmentation spectra, which
can come at the cost of scan speed, permitted modification localization
for nearly all phosphopeptides (∼97%) detected in a given experiment.
Advances on other hybrid instrument platforms also promise to broaden
horizons for phosphoproteomics. The newest qTOF instrument, the Impact
II described by Mann and co-workers, offers high transmission efficiencies
and improvements in resolution/mass accuracy that benefit shotgun
proteomics on complex samples.
74
In their
work, the Impact II ultimately led to the characterization of ∼5200
human proteins and ∼3600 yeast proteins in triplicate single-shot
analyses. The latest in ion trap-FTICR instruments feature the highest
field superconducting magnet ever used for FTICR (21 T), which has
enabled ultrahigh resolution/mass accuracy on the order of a resolving
power of 300 000 at 400 m/z for a 0.76 s detection period and 2 000 000 resolving
power for the z = 48+ charge state of bovine serum
albumin (∼1385 m/z) for a
12 s detection period.
75,76
Although to the best of our knowledge
no phosphoproteomics studies have been reported on these new systems
to date, they are poised to contribute to improved analysis of phosphopeptides
and phosphoproteins in the coming years.
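To put the resolving power figures quoted above in more tangible terms, the sketch below converts them to approximate peak widths. It assumes the common definition of resolving power as m/z divided by the peak width at half maximum (FWHM); the source does not state which definition the cited work used, and the function name is hypothetical.

```python
def peak_width_fwhm(mz, resolving_power):
    """Peak width (FWHM, in m/z units) implied by R = (m/z) / width."""
    if resolving_power <= 0:
        raise ValueError("resolving power must be positive")
    return mz / resolving_power


# Values quoted in the text for the 21 T FTICR instrument.
width_at_400 = peak_width_fwhm(400.0, 300_000)     # about 0.0013 m/z
width_at_1385 = peak_width_fwhm(1385.0, 2_000_000)  # about 0.0007 m/z
```

Under this assumption, both cases correspond to peaks roughly a millithomson wide or narrower.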
Data Acquisition Strategies
Data-dependent acquisition (DDA), or automated selection and fragmentation
of precursor ions using predetermined criteria and real-time decision
making, is the most widely used data acquisition strategy in LC–MS/MS
proteomic and phosphoproteomic experiments, including those reported
in this review. Typically, precursors are selected based on their
relative abundance, biasing experiments toward highly abundant species,
including those that may simply ionize more favorably than others.
In hopes of improving run-to-run reproducibility and sampling of low
abundance precursors, data-independent acquisition (DIA), which collects
data largely independent of precursor ion information, has come into
vogue.
77
One of the most popular approaches
involves repeated sampling of successive isolation windows using discrete m/z ranges
over the course of chromatographic
elution (i.e., SWATH-MS).
78−82
The major potential benefit of DIA strategies is their reproducibility:
in theory, fragmentation spectra are collected for every precursor
ion in every experiment, as opposed to the stochastic precursor selection
in DDA approaches.
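The contrast between the two acquisition logics can be sketched in a few lines. This is a deliberately simplified illustration, not vendor instrument-control code: the function names, the precursor list, and the window width are hypothetical, and real DDA methods add criteria such as dynamic exclusion and charge-state filtering that are omitted here.

```python
def dda_top_n(precursors, n):
    """DDA-style choice: the n most intense precursors in one survey scan.

    precursors is a list of (mz, intensity) tuples.
    """
    return sorted(precursors, key=lambda p: p[1], reverse=True)[:n]


def dia_windows(mz_start, mz_end, width):
    """DIA-style tiling: consecutive fixed-width isolation windows."""
    windows, low = [], mz_start
    while low < mz_end:
        windows.append((low, min(low + width, mz_end)))
        low += width
    return windows


def dia_assign(precursors, windows):
    """Map each window to every precursor it co-isolates (low <= mz < high)."""
    return {w: [p for p in precursors if w[0] <= p[0] < w[1]] for w in windows}


# Toy survey scan, for illustration only.
scan = [(450.2, 1e6), (455.7, 5e3), (610.3, 8e5), (702.9, 2e4)]
picked = dda_top_n(scan, n=2)           # only the two most abundant species
windows = dia_windows(400, 800, 25)     # hypothetical 25 m/z windows
co_isolated = dia_assign(scan, windows)
```

With these toy values, the top-2 selection skips the two low-intensity precursors entirely, whereas the window scheme fragments all four. The cost is visible in the (450, 475) window, which co-isolates two precursors and would therefore produce a mixed fragmentation spectrum.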
Aebersold and co-workers used SWATH-MS in
combination with affinity purification to study protein–protein
interactions of the 14-3-3 system, a family of seven abundant cellular
scaffolds with diverse regulatory functions that bind phosphorylated
residues on ligand proteins.
83
Providing
quantitative data to follow dynamic phosphorylation-related changes
in protein–protein interactions in perturbed systems, SWATH-MS
offered data consistency similar to targeted approaches but with reduced
overhead time in assay development and with increased peptide observation.
Parker et al. used DIA to quantify the effect of insulin on phosphorylation
of 86 protein targets and demonstrate 14-3-3 binding effects in insulin
signaling.
84
Improvements in postacquisition
data analysis in DIA experiments, such as the ability to differentiate
phosphopeptide isomers that may be missed using DDA methods, have
also benefited phosphoproteomic applications.
85
Used in combination with DIA, ion mobility has enhanced precursor
fragmentation efficiency for improved reproducibility and proteome
coverage.
86,87
Targeted methods that direct acquisition with an inclusion
list of predefined precursor ion masses have also
proven useful in reproducibly measuring quantitative changes in specific
signaling cascades.
88
A recent evaluation
of the value of DDA and DIA in analyzing phosphopeptides (albeit in
a study limited to ∼10 phosphopeptides) showed that targeted
DIA methods can improve sensitivity of phosphopeptide identification
and quantification by 5–10-fold,
89
and a coupling of global DDA and targeted phosphoproteomics proved
useful in biomarker discovery in clinical breast cancer samples.
90
Methods combining the strengths of DIA and DDA
approaches have emerged in recent years, but they have yet to be widely
applied to phosphoproteomic analyses.
91−93
The major challenge
of DIA lies in the complicated spectra it generates, making data extraction
nontrivial and limiting the number of peptides detected in an experiment.
In our view, the popularity of DIA (i.e., SWATH-like approaches) has
produced many valuable new informatics tools
94−96
that will continue
to make it a viable alternative for proteomics and phosphoproteomics
alike, particularly when reproducibility is more critical than phosphoproteomic
depth. That said, consistent improvements in the speed and sensitivity
of mass spectrometers favor well-established DDA methods, especially
as the fastest instruments no longer struggle to select and fragment
every available precursor above a desired signal-to-noise ratio across
an LC–MS/MS experiment. In short, both DDA and DIA have valuable
utility in phosphoproteomics, but DIA does not appear set to outpace
DDA in global profiling or phosphoproteomic depth in the foreseeable
future.
Balancing Throughput and Depth
Advances in technology
have corresponded to increased numbers of phosphopeptide identifications
per experiment. Recent work has shown that, provided adequate acquisition
time, experiments can characterize tens of thousands of phosphopeptides
from a sample. Sharma et al. identified 38 229 phosphosites
from 51 098 unique phosphopeptides in a human cancer cell line
(HeLa cells), which provided valuable insight into the extent of phosphorylation
in the proteome and into the differences between serine/threonine
phosphorylation and tyrosine phosphorylation (Figure 5).
97
A combination
of SCX fractionation and both TiOx and pTyr antibody enrichment enabled
this superb sampling depth. The price of this ultradeep phosphoproteomic
coverage, however, came to the tune of approximately 270 LC–MS/MS
experiments and 40 days of data acquisition time, not accounting for
additional overhead in sample preparation and data analysis. Other
studies have also reported impressive phosphoproteomic depth: 29 057
quantified phosphorylation sites in adipocytes,
98
31 480 quantified phosphorylation sites across 14
rat tissues and organs,
99
35 965
quantified phosphosites from 9 mouse tissues,
100
and 15 004 quantified phosphosites from human embryonic
stem cell differentiation.
101
Again, each
of these data sets required extensive fractionation and data acquisition
time.
Figure 5
Properties of the HeLa cell phosphoproteome. Label-free quantitative
proteomics provided dynamic range measurements for >38 000
phosphosites in the human phosphoproteome. The left panel of part
A shows a histogram of phosphopeptide abundances overlaid with intensity
rank order (red line, lowest to highest intensity) of the phosphopeptides.
The right panel shows the distribution of cumulative phosphopeptide
abundance and indicates that a significant portion of total phosphopeptide
intensity comes from a few thousand phosphopeptides. The majority
of phosphoproteins have five or fewer phosphosites (B, left), and
the relationship between protein abundance and its number of phosphosites
is displayed in the right panel of part B. The majority of phosphosites
are phosphoserine (pS), followed by phosphothreonine (pT), left panel
of part C. The number of phosphotyrosine (pY) sites can be increased
through immunoprecipitation strategies, but the enrichment strategy
used affects the observed intensity, left and center of part C. The
right panel of part C shows the distribution of known and novel phosphosites
compared to the PhosphoSitePlus database. Reprinted with permission
from Sharma, K.; D’Souza, R. C. J.; Tyanova, S.; Schaab, C.;
Wiśniewski, J. R.; Cox, J.; Mann, M. Cell Rep.
2014, 8, 1583–1594 (ref (96)). Copyright 2015 Cell
Press.
As proteomic workflows and MS
instrumentation have become more compatible with deep proteome coverage
in high-throughput experiments, many groups have shifted to favoring
single-shot (i.e., unfractionated) analyses of the phosphoproteome
(Figure 7). Single-shot
approaches can offer good phosphoproteomic depth while maintaining
relatively simple and manageable throughput capabilities. Using a
single-stage Ti4+-IMAC enrichment and standard one-dimensional
online RPLC separation, de Graaf et al. quantitatively monitored nearly
13 000 phosphosites with high reproducibility across six time
points in an investigation of phosphorylation dynamics in Jurkat T
cells, requiring only 2 h of acquisition time per LC–MS/MS
analysis.
102
In the same vein, Humphrey
et al. recently described their EasyPhos strategy, which combines
a trifluoroethanol-based tryptic digestion and a 96-well plate format
for TiOx phosphopeptide enrichment.
103
This
format facilitated high-throughput phosphoproteomic experiments without
the need for fractionation, enabling as many as six or more biological
replicates to be measured at multiple time points in time course experiments.
They reported 20 000–24 000 phosphosites detected
in 24 h or less of analysis time in various biological systems and
described the method as a scalable platform to profile >10 000
phosphosites in hundreds of samples in a high-throughput fashion.
With the rising interest in high-throughput capabilities for phosphoproteomic
measurements, several groups have explored sample preparation, enrichment,
and fractionation methods that can offer both reproducibility and
feasible labor requirements for large numbers of samples.
104−106
Describing methods requiring only 45 min for enrichment in a 96-well
format, Tape et al. reported high well-to-well reproducibility
(r² ≥ 0.8) and plate-to-plate reproducibility
that remained robust over 5 days of independent enrichments.
105
Lee and co-workers constructed a multifunctional
LC system capable of standard one-dimensional separations in addition
to online TiOx phosphopeptide enrichment and two-dimensional chromatography
(SCX-RPLC).
107
Such an approach clearly
has potential benefits in reproducibility and reduced bench time for
sample preparation, but the technology is still specialized to a small
pool of researchers. Others have focused on development of cartridge-based
enrichments and fractionation on solid phase extraction substrates.
These relatively inexpensive alternative strategies are adaptable
to microgram amounts of starting material, yet still offer many thousands
of phosphopeptide identifications.
62,108,109
Because removing the requirement for HPLC instrumentation
enables rapid, flexible, and multiplexed phosphoproteomic sample preparation,
these methods are valuable in many settings when fractionation is
desired for increased phosphoproteomic coverage.
As with most
experimental design, the balance between throughput and depth requires
careful consideration. A dichotomy in the approach to sampling the
phosphoproteome has emerged: either fractionate to achieve maximum
phosphoproteomic depth at the cost of significant acquisition times
or settle on moderate phosphoproteomic coverage at the benefit of
only a few hours of analysis time per sample. (We do note that contemporary
phosphoproteomics has matured greatly in the past decade; a few thousand
phosphopeptides was once cutting edge, while 5 000 to even
10 000+ phosphosites is considered moderate phosphoproteomic
coverage by today’s standards, depending on the biological
system.) Despite all the efforts in sample preparation, enrichment,
separations, and MS instrumentation and data acquisition, the number
of phosphosites characterized per hour of instrument time has not
drastically grown in the past 2–3 years. We expect that this
will begin to change, however, as the newest generations of MS instrumentation
become more ubiquitous in laboratories across the field, especially
given the speed and sensitivity of the quadrupole-Orbitrap-LIT and
newest q-TOF platforms. This anticipated improvement may shift considerations
of the balance between throughput and depth in the coming years. Soon,
single-shot analyses may be able to offer >15 000–20 000
phosphosites in just a few hours of instrument time and thus render
highly fractionated approaches relatively obsolete, except when ultradeep
phosphoproteomic coverage is paramount.
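The trade-off between depth and throughput can be made concrete with a back-of-the-envelope calculation of phosphosites per hour of acquisition. The sketch below uses only figures quoted in this section (the ultradeep HeLa data set and the lower bound reported for EasyPhos). The function name `sites_per_hour` is hypothetical, sample preparation and data analysis time are ignored, and the two studies used different biological systems, so this is an illustration rather than a controlled comparison.

```python
def sites_per_hour(phosphosites: int, acquisition_hours: float) -> float:
    """Return phosphosites characterized per hour of instrument time."""
    if acquisition_hours <= 0:
        raise ValueError("acquisition_hours must be positive")
    return phosphosites / acquisition_hours


# Ultradeep, fractionated study: 38 229 sites in about 40 days of acquisition.
deep = sites_per_hour(38229, 40 * 24)

# Unfractionated workflow: lower bound of 20 000 sites in 24 h of analysis.
single_shot = sites_per_hour(20000, 24)

print(f"fractionated: {deep:.1f} sites/h")
print(f"single-shot:  {single_shot:.1f} sites/h")
```

Under these assumptions the unfractionated workflow returns roughly 20 times more phosphosites per instrument hour, while the fractionated study still reaches nearly twice the total depth.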
Quantifying the Phosphoproteome
Quantitative proteomic tools have become universal and robust in
recent years, making quantitative phosphoproteomics an ever more accessible
undertaking (Figure 6). Case in point: a significant proportion of the works cited in
this review include quantitative components, even if not explicitly
stated (Figure 7). Quantitation in phosphoproteomics is markedly
more difficult than standard proteomics because quantitative information
cannot be integrated over all peptides of a given protein. Quantitative
values of specific phosphosites can differ even for different sites
on the same protein. Thus, only quantitation from direct measurements
of phosphopeptides with a specific site is useful, meaning measurements
often come from relatively small sample sizes. Here we discuss recent
advances in quantitation strategies as they pertain to phosphoproteomics.
Additional, perhaps more extensive, information about these methods
in a larger context can be found elsewhere.
110−114
Figure 6
Quantitative
strategies for global phosphoproteomics. MS1 quantitation
is a popular approach because measurements of phosphopeptides across
their elution profiles provide accurate quantitative information.
Label-free quantitation requires no additional steps in the phosphoproteomic
workflow, and samples are analyzed individually. Quantitation is then
performed across separate LC–MS/MS analyses using accurate
mass and retention time windows to compare phosphopeptides from different
samples. In contrast, stable isotope labeling methods permit multiplexing,
where multiple samples can be mixed after labeling and then analyzed
in the same LC–MS/MS analysis. In metabolic labeling, e.g.,
SILAC, stable isotopes are incorporated into samples during growth
on a defined medium. Phosphopeptides from different samples vary in
mass based on the incorporated isotopes, which can be seen by mass
shifts in the MS1. Areas under the elution curve for the
corresponding light and heavy phosphopeptides can then be compared
for quantitative information. Chemical labeling (dimethyl, mTRAQ)
works via the same mechanism, except that the mass shifts are achieved
through a chemical label that is reactive with peptide functional
groups (e.g., primary amines), rather than incorporation in growth
media. Isobaric labeling also uses a reactive tag that labels peptide
functional groups, but quantitation is achieved at the MS2 level. The intact mass
of each label is the same based on the coupling
of reporter and balance regions that have an equivalent number of
total heavy isotopes. Upon phosphopeptide dissociation, the reporter
ions fragment off, allowing comparison of relative reporter ion intensities
for quantitative measurements between samples, all within the same
scan that provides phosphopeptide identification.
Figure 7
Cross section of recent phosphoproteomic literature. This graphic
shows relevant information for 30 recent and impactful phosphoproteomic
methodology publications. Although not comprehensive, it gives a snapshot
of popular methods in current studies. General details about the biological
system, enrichment method, fractionation approach, quantitative strategy,
and number of phosphosites characterized are provided. The number
of phosphosites reported here represents what was reported as confidently
localized and quantified by each manuscript. An asterisk (*) indicates
that localization confidence was not reported, and an octothorpe (#)
indicates other PTMs were also enriched in the study. Some publications
did not report quantitative information.
Stable Isotope Labeling
The incorporation of stable isotopes
into proteomic samples via cell culture or chemical tagging regimes
has been an active area of innovation in proteomic workflows for more
than 15 years. Because they allow many intensity measurements to be
taken over an elution profile, MS1 strategies, e.g., stable
isotope labeling in amino acid cell culture (SILAC), amine-modifying
tags for relative and absolute quantification (mTRAQ), and dimethyl
labeling, are the gold standard in quantitative accuracy. These methods
also enable multiplexed quantitation with different combinations of
stable isotopes that increase peptide masses by incremental amounts,
thereby allowing characterization of several samples in a single LC–MS/MS
analysis.
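As a minimal sketch of how MS1-based quantitation uses the elution profile, the example below integrates the extracted ion intensities of a light and a heavy phosphopeptide with the trapezoidal rule and reports the heavy-to-light ratio. The function names and the intensity values are hypothetical, and real software adds peak detection, smoothing, and isotope-envelope handling that are omitted here.

```python
from typing import Sequence


def trapezoid_area(times: Sequence[float], intensities: Sequence[float]) -> float:
    """Integrate an elution profile with the trapezoidal rule."""
    if len(times) != len(intensities) or len(times) < 2:
        raise ValueError("need at least two paired (time, intensity) points")
    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (intensities[i] + intensities[i - 1]) / 2.0
    return area


def heavy_to_light_ratio(times, light, heavy) -> float:
    """Ratio of the areas under the heavy and light elution curves."""
    light_area = trapezoid_area(times, light)
    if light_area == 0:
        raise ValueError("light channel has zero area")
    return trapezoid_area(times, heavy) / light_area


# Hypothetical elution profiles sampled at five MS1 scans (minutes, counts).
times = [0.0, 1.0, 2.0, 3.0, 4.0]
light = [0.0, 10.0, 20.0, 10.0, 0.0]
heavy = [0.0, 20.0, 40.0, 20.0, 0.0]

print(heavy_to_light_ratio(times, light, heavy))
```

Because the ratio is built from every scan across the peak rather than a single spectrum, noise in any one measurement has a limited effect on the result.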
Developments in SILAC methods are providing new approaches
for quantitative phosphoproteomics. Mann and co-workers reported the
application of their in vivo labeling model, the
SILAC mouse, to study tumor development in skin cancer at the proteome
and phosphoproteome level. Their work provided a detailed molecular
picture into skin carcinogenesis and a platform for future work in
elucidating tumor progression mechanisms.
115
The development of super-SILAC approaches, which use spiked-in isotope
labeled standards for compatibility with primary tissues, has enabled
large-scale phosphoproteomics in primary mammalian tissues. Schweppe
et al. quantitatively assessed oncogenic kinase signaling in human
non-small cell lung cancer tumors by using relative super-SILAC quantification
of phosphopeptide abundance between tumor samples to determine differing
hubs and pathways specific to each tumor.
116
Monetti et al. used spiked-in SILAC standards from mouse liver cell
lines to quantitatively compare 10 000 sites in response to
insulin treatment, which allowed for accurate SILAC-like quantitation
at considerable phosphoproteomic depth in an in vivo system.
117
A comparison of metabolic
labeling with SILAC to chemical labeling with mTRAQ for 3-plex phosphoproteomic
quantitation showed that the two approaches can permit quantification
of similar numbers of phosphosites (∼16 500 total in
batched triplicate measurements, 11 322 seen in all three replicates)
in human lung cancer cells, with approximately 65% overlap (∼10 600
phosphosites shared between the two).
118
SILAC provided lower ratio variability and a higher fraction of
significantly regulated sites for higher quantitative accuracy, but
mTRAQ still proved a viable MS1-based quantitation strategy
when metabolic labeling is not ideal, e.g., with primary tissues.
SILAC is not highly compatible with in vivo models
that require larger numbers of animals, so Wilson-Grady et al. demonstrated
the utility of a reductive dimethylation protocol under low-pH conditions
to quantify hepatic phosphoproteome changes in tissue from fasted
and refed mice.
119
Of the 8500 phosphosites
identified in this study, nearly 7400 of them were reliably quantified,
with 390 phosphosites found to be changing between the fasted and
refed conditions (2-fold change cutoff). Dimethyl labeling has been
used in combination with enzymatic kinase reactions, as well, providing
large-scale determination of absolute phosphorylation stoichiometries.
120
Chemical labeling strategies have also been
coupled with single-step enrichment platforms to enable robust yet
straightforward methods for quantitative phosphoproteomic experiments.
121
MS/MS quantitation strategies provide
an alternative approach for multiplexed quantitation, one that eliminates
the MS1 spectral complexity of the approaches described
above, which can limit sampling depth. Generally, these methods employ
isobaric labels, e.g., tandem mass tags (TMT) and isobaric tag for
relative and absolute quantification (iTRAQ), for the quantitative
comparison of six to ten samples in a single experiment.
122−124
Largely used for relative quantitation in global phosphoproteomic
experiments, isobaric labels have also proven useful in recent studies
of phosphopeptide stoichiometry and absolute quantitation.
125,126
Carr and co-workers reported that isobaric chemical labels (iTRAQ)
not only increased multiplexing capabilities over nonisobaric labels
(mTRAQ) but they also performed favorably in phosphoproteomic experiments,
quantifying nearly 3-fold more phosphopeptides (12 129 versus
4 448) in their study.
127
The key limitation in isobaric labeling strategies is the cofragmentation
of peptides in the same precursor isolation window.
128
This well-known phenomenon impairs quantitative accuracy
by compressing ratios used in comparing reporter ion intensities.
MS3-based approaches and precursor charge reduction via
proton transfer reactions have been introduced to address precursor
interference.
128,129
Another approach to mitigate
precursor interference, called synchronous precursor selection (SPS),
builds on the MS3 strategy. SPS uses a multinotch
waveform to isolate and cofragment multiple product ions from a CAD
MS/MS scan, increasing the number of reporter ions in the MS3 spectrum 10-fold over
the standard MS3 method.
130
These improvements translate to gains in the
dynamic range of reporter ion quantitation and reduction in reporter
ion signal variance, which in turn provides higher-quality quantitative
measurements. The SPS method has been commercially implemented on
the quadrupole-Orbitrap-LIT platform and has enabled accurate, multiplexed
quantitation of >38 000 phosphopeptides (discussed above).
71
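The ratio compression caused by cofragmentation can be illustrated with simple arithmetic. In the sketch below, a target phosphopeptide with a true 10:1 ratio between two reporter channels is co-isolated with an interfering peptide that contributes equally to both channels. All intensities and the function name `observed_ratio` are hypothetical values chosen for illustration.

```python
from typing import Tuple


def observed_ratio(target: Tuple[float, float],
                   interference: Tuple[float, float]) -> float:
    """Reporter ion ratio (channel A / channel B) after co-isolation.

    Reporter ions from the target and the interfering peptide are
    indistinguishable, so their intensities add in each channel.
    """
    channel_a = target[0] + interference[0]
    channel_b = target[1] + interference[1]
    if channel_b == 0:
        raise ValueError("channel B has zero intensity")
    return channel_a / channel_b


true_signal = (100.0, 10.0)   # true ratio of 10
background = (30.0, 30.0)     # unchanged interfering peptide, ratio of 1

print(observed_ratio(true_signal, (0.0, 0.0)))   # no interference
print(observed_ratio(true_signal, background))   # compressed toward 1
```

With these numbers the measured ratio falls from 10 to 3.25, which is why approaches that purify the reporter ion signal before measurement improve quantitative accuracy.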
Increases in the plexing capacity of isobaric
labels have arisen from the manipulation of subtle mass differences
caused by nuclear binding energy variation in stable isotopes. When
coupled with high-resolution MS/MS scans, these ∼6 mDa mass
differences can be discriminated for quantitative measurements.
122,123
These so-called neutron-encoded signatures have also been leveraged
in the design of NeuCode, a new quantitation strategy that provides
the quantitative accuracy of MS1-based quantitation approaches
without sensitivity-limiting increases in spectral complexity. The
compatibility of NeuCode with both metabolic and chemical labeling
methods makes it a flexible platform for protein and PTM quantitation
in a variety of samples and experimental designs, including DIA approaches.
131−138
Recently, Rhoads et al. implemented an in vivo labeling
strategy with NeuCode to study the phosphoproteome in Caenorhabditis
elegans.
139
This study provided
one of the largest phosphoproteomic data sets to date for C. elegans (6 620 phosphorylation
isoforms), revealing
a post-translational signature of pheromone sensing in the organism.
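The origin of the roughly 6 mDa spacing can be checked directly from isotope masses: adding a neutron as 13C changes the mass by a slightly different amount than adding it as 15N. The sketch below computes that difference and the nominal resolving power (m/Δm) needed to separate two peaks spaced this closely. The m/z of 127.13 is an illustrative assumption, roughly where a 127 reporter ion appears, and the m/Δm figure is a lower bound rather than an instrument setting.

```python
# Monoisotopic masses in Da (standard atomic mass values).
MASS_12C = 12.0
MASS_13C = 13.0033548
MASS_14N = 14.0030740
MASS_15N = 15.0001089


def neutron_encoding_shift_da() -> float:
    """Mass difference between one extra neutron in carbon versus nitrogen."""
    return (MASS_13C - MASS_12C) - (MASS_15N - MASS_14N)


def required_resolving_power(mz: float, delta_mz: float) -> float:
    """Nominal m/Δm needed to distinguish two peaks delta_mz apart at mz."""
    if delta_mz <= 0:
        raise ValueError("delta_mz must be positive")
    return mz / delta_mz


shift = neutron_encoding_shift_da()
print(f"shift: {shift * 1000:.2f} mDa")
print(f"m/Δm at m/z 127.13: {required_resolving_power(127.13, shift):.0f}")
```

The calculation shows why these labels depend on high-resolution MS/MS scans: a resolving power on the order of 20 000 at the reporter ion m/z is the minimum needed to tell the channels apart.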
Traditional stable isotope labeling methods use 13C
and 15N, as well as deuterium when necessary, but 18O isotope labeling approaches
have also proven valuable for
phosphoproteomics. Xue et al. described a novel stable isotope labeled
kinase reaction approach to study direct substrates of kinases, in
which a whole cell extract was moderately dephosphorylated and subjected
to in vitro kinase reactions using 18O-ATP
as the phosphate donor.
140
Similarly, Molden
et al. employed [γ-18O4]ATP
to label amino acids with heavy phosphate to determine global site-specific
phosphorylation rates.
141
This strategy
boasts direct labeling of phosphosites, the ability to measure phosphorylation
rates, improved confidence in phosphopeptide identification due to
the presence of heavy isotopes, and the identification of actively
phosphorylated sites in a cell-like environment. Approximate rate
constants for >1 000 phosphosites were calculated based
on labeling progress curves, with phosphorylation rate constants ranging
from 0.34 min⁻¹ to 0.001 min⁻¹.
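To give a feel for what this range of rate constants means, the sketch below converts a rate constant into a labeling half-time and back. It assumes a simple first-order labeling progress curve, f(t) = 1 − exp(−kt); that model is an assumption made here for illustration and is not stated in the text above, and all function names are hypothetical.

```python
from math import exp, log


def fraction_labeled(k_per_min: float, t_min: float) -> float:
    """Fraction of a site carrying heavy phosphate at time t (first-order model)."""
    return 1.0 - exp(-k_per_min * t_min)


def half_time_min(k_per_min: float) -> float:
    """Time at which half of the site is labeled under the first-order model."""
    if k_per_min <= 0:
        raise ValueError("rate constant must be positive")
    return log(2) / k_per_min


def rate_from_point(fraction: float, t_min: float) -> float:
    """Estimate k from a single (time, fraction labeled) observation."""
    if not 0.0 <= fraction < 1.0 or t_min <= 0:
        raise ValueError("need 0 <= fraction < 1 and t_min > 0")
    return -log(1.0 - fraction) / t_min


for k in (0.34, 0.001):
    print(f"k = {k} per min -> half-time {half_time_min(k):.1f} min")
```

Under this assumed model, the fastest and slowest reported sites correspond to half-times of about 2 min and about 11.5 h, respectively.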
Label-Free Strategies
Label-free
approaches, namely, spectral counting and spectral intensity (also
known as area under the curve, AUC) measurements, offer relative quantitative
comparisons between samples without the use of isotopic labels. Label-free
quantitation is popular in the proteomic and phosphoproteomic communities
due to its lack of implementation costs and its flexibility in experimental
design.
142
In contrast to stable isotope labeling
approaches, label-free strategies are not multiplexed and samples
are never mixed prior to LC–MS/MS analysis. Thus, the number
of conditions and replicates compared by label-free quantitation is
theoretically unrestrained, although practical limitations apply;
because each sample is analyzed individually, data acquisition times
can be significantly higher in label-free experiments compared to
isotopic labeling quantitative experiments, where samples can be multiplexed
in one analysis. However, the cost in acquisition time has not deterred
widespread use of label-free quantitation in large-scale phosphoproteomic
experiments, as indicated by the selected publications in Figure 7. The straightforward
nature of label-free strategies makes them easy to couple with the
various enrichment and fractionation strategies used in phosphoproteomics
and, as they do not introduce additional workflow steps, they are
simple to implement in high-throughput and automated sample preparation.
Thus, the majority of the challenges with label-free quantitation
come in postacquisition data analysis.
Key improvements in label-free
quantitation software, most notably Skyline and MaxQuant, have made
quantitation in phosphoproteomic experiments more robust.
143−145
Nevertheless, the stochastic nature of phosphopeptide data acquisition
(see above) still presents a significant challenge to label-free phosphoproteomic
strategies. If a given phosphopeptide evades detection in any single
LC–MS/MS analysis within a given set of experiments, quantitation
of that phosphopeptide becomes difficult. A variety of missing value
imputation methods can be used for label-free data;
146
a popular approach uses retention time alignment and accurate
mass to assign sequences to unidentified spectral features based on
MS/MS identification from other LC–MS/MS files in an experiment.
147,148
Popular as the “match-between-runs” feature in MaxQuant,
149
this valuable tool can salvage many peptide
identifications in large-scale studies, but it may also introduce
ambiguity in phosphoproteomic data. Different phosphopeptide isoforms,
i.e., peptides that have phosphoryl modifications on different residues,
have the same intact mass but not the same phosphosite identity, meaning
confident phosphosite assignment can be lost when using the match-between-runs
approach. Additionally, perturbations that cause large unidirectional
changes in biological systems can challenge the ability to reproducibly
detect and quantify phosphopeptides with label-free strategies. New
strategies are emerging to combat this difficulty, such as pairwise
normalization approaches that adjust normalization based on phosphopeptide
abundances before and after enrichment.
150
Label-free strategies will continue to be popular in phosphoproteomics,
even though stable isotope labeling approaches may offer more confident
quantification and multiplexed data acquisition. We view label-free
strategies as especially powerful in standard proteomic experiments;
they are practical and suitable for phosphoproteomics, as well, but
they must be used with consideration and awareness of their challenges
in throughput, phosphosite assignment, and reproducibility.
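The ambiguity that identification transfer can introduce for phosphopeptide isomers is easy to reproduce in a toy example. The sketch below matches an unidentified MS1 feature to previously identified peptides using an accurate mass tolerance and a retention time window. It is a simplified illustration and not the algorithm used by MaxQuant or any other tool; the class, function names, tolerances, and peptide values are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IdentifiedPeptide:
    label: str      # hypothetical identifier for the peptide and its phosphosite
    mz: float       # precursor m/z from the run where it was identified
    rt_min: float   # aligned retention time in minutes


def ppm_error(observed_mz: float, reference_mz: float) -> float:
    """Mass error of an observed m/z relative to a reference, in ppm."""
    return (observed_mz - reference_mz) / reference_mz * 1e6


def match_feature(feature_mz: float, feature_rt_min: float,
                  identified: List[IdentifiedPeptide],
                  ppm_tol: float = 5.0, rt_tol_min: float = 0.5) -> List[str]:
    """Return every identification consistent with the feature's mass and time."""
    return [p.label for p in identified
            if abs(ppm_error(feature_mz, p.mz)) <= ppm_tol
            and abs(feature_rt_min - p.rt_min) <= rt_tol_min]


library = [
    IdentifiedPeptide("isomer_pS3", 800.3500, 30.0),  # same sequence,
    IdentifiedPeptide("isomer_pT5", 800.3500, 30.3),  # different phosphosite
    IdentifiedPeptide("unrelated", 650.2000, 30.1),
]

print(match_feature(800.3510, 30.2, library))  # both isomers match
print(match_feature(800.3510, 29.6, library))  # only the earlier isomer matches
```

When two positional isomers elute within the same retention time window, mass and time alone cannot decide between them, so the transferred identification carries no site-level confidence without supporting MS/MS evidence.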
Confident
Phosphosite Assignment
One of the principal advantages of
MS-based phosphoproteomics is the ability to offer site-specific resolution
for systems-level phosphorylation events. Determination of specific
phosphosites permits the functional characterization of the modifications
observed. Thus, identification and quantification of tens of thousands
of phosphopeptides becomes far less powerful if an experiment cannot
provide unambiguous phosphosite assignment for the phosphopeptides
detected. A sizable percentage (20–40%) of identified phosphopeptides
in typical phosphoproteomic experiments is lost because confident
localization of a phosphosite (or phosphosites) cannot be assigned
to the peptide. Phosphosite localization data comes directly from
MS/MS fragmentation spectra of phosphopeptides, so advances in the
fragmentation methods and informatics tools used to generate and extract
this information are incredibly valuable to the field. The success
of these methods is felt in experimental reproducibility, as well:
if phosphopeptides are detected, but phosphosites cannot be localized
due to inconsistent fragmentation spectra or inadequate analysis software,
the overlap in useable information is diminished.
Tandem MS Approaches
The labile nature of phosphoryl groups on modified peptides has
often put phosphopeptide fragmentation at center stage. Challenges
in investigating phosphorylation with collisional activation dissociation
(CAD) have inspired the development of many alternative fragmentation
strategies for phosphoproteomic applications,
151−160
although its relative simplicity still makes collision-based dissociation
a popular option. Though CAD’s complications have been well
documented, current research continues to offer insight into its utility
in phosphoproteomic experiments. Citing increased rearrangement under
nonmobile or partially mobile protonation conditions, Cui and Reid
recently described the challenges of localizing phosphosites during
CAD of phosphopeptides due to competing fragmentation and rearrangement
reactions occurring upon activation.
161
Brown et al. have also reported the proclivity for neutral losses
in CAD with increased proximity of the phosphorylated residue to the
peptide N-terminus. However, neutral loss activity is reduced when
basic groups are directly N-terminal to the phosphate, which they
attributed to steric hindrance in catalyzing neutral loss.
162
Eyers and co-workers reported improved phosphopeptide
fragmentation and phosphosite localization with CAD through enzymatic
removal of basic lysine or arginine residues from the C-terminus of
tryptic phosphopeptides.
163
This strategy
promoted the formation of sequence-informative b- and y-type fragment
ions over the typical neutral loss of phosphoric acid (H3PO4) that can dominate CAD
spectra. Other groups continue
to use neutral loss ions from CAD fragmentation to inform data acquisition
for phosphoproteomics, combining CAD with alternative fragmentation
methods like electron transfer dissociation (ETD) to improve phosphopeptide
identification and phosphosite localization.
164−167
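Acquisition schemes that react to neutral loss need to recognize the loss of phosphoric acid in a CAD spectrum. The sketch below computes where that peak should appear for a given precursor m/z and charge state and checks a peak list for it. The function names, the 0.02 Da tolerance, and the example peaks are hypothetical, and the logic is a simplified illustration rather than any instrument vendor's implementation.

```python
from typing import Iterable

H3PO4_MASS = 97.97690  # monoisotopic mass of phosphoric acid in Da


def neutral_loss_mz(precursor_mz: float, charge: int) -> float:
    """Expected m/z of the precursor after losing H3PO4 with charge retained."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return precursor_mz - H3PO4_MASS / charge


def has_neutral_loss_peak(precursor_mz: float, charge: int,
                          peaks_mz: Iterable[float], tol_da: float = 0.02) -> bool:
    """True if any observed peak lies within tol_da of the expected loss."""
    expected = neutral_loss_mz(precursor_mz, charge)
    return any(abs(mz - expected) <= tol_da for mz in peaks_mz)


# Hypothetical doubly charged precursor at m/z 800.0.
print(neutral_loss_mz(800.0, 2))
print(has_neutral_loss_peak(800.0, 2, [400.2, 751.01, 900.5]))
```

A method could use such a check to decide whether to trigger a second scan with an alternative fragmentation mode on the same precursor.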
ETD technologies for phosphoproteomic analyses have steadily
matured.
168−170
Commercial developments, like the introduction
of a more stable front-end ETD source on quadrupole-Orbitrap-LIT instruments,
171
have improved accessibility of ETD for routine
use in the phosphoproteomic community. Although ETD can provide extensive
peptide backbone fragmentation while retaining the labile phosphoryl
modification, it can also suffer from poor dissociation efficiencies
when precursor ion charge density is low. Introducing additional energy
to the ETD reaction can improve dissociation efficiencies to provide
more sequence-informative product ions. Recent work has shown that
concurrent photoactivation with infrared photons
172
and combinations of ETD with ultraviolet photons
173
can improve phosphopeptide identification and phosphosite
localization.
Collisional activation of ETD products has also
shown significant benefit for phosphopeptide fragmentation.
174−176
In 2012, the Heck lab introduced EThcD, a hybrid fragmentation method
that utilizes beam-type collisional activation of ETD products after
the ion–ion reaction. EThcD has compellingly improved identification
of phosphopeptides and localization of phosphosites.
177,178
In 2013, Frese et al. showed that EThcD, although identifying fewer
phosphopeptides than HCD, improved peptide sequence coverage and percentage
of localized phosphosites over both HCD and ETD fragmentation.
178
For endogenous peptides and phosphopeptides
presented by HLA class I, biomolecules that have been traditionally
difficult to analyze via conventional fragmentation methods, EThcD
improved identification rates by ∼15% over ETD and nearly 30%
over collisional dissociation methods.
179
EThcD can also improve fragmentation of whole proteins and
improve localization of phosphosites on phosphorylated proteoforms.
180
Intact phosphoprotein interrogation provides
a holistic picture of all modifications on a given protein, and extensive
backbone fragmentation can provide single residue specificity for
each modification. Because the increased chemical complexity of intact
proteins makes them more difficult to analyze than peptides, top-down
approaches for phosphoprotein characterization are far less common
than bottom-up approaches that target phosphopeptides. Still, the
combinatorial patterns of PTMs that decorate proteins are lost in
the popular peptide-centric approaches. Recent improvements in alternative
intact protein fragmentation methods may therefore help drive top-down
phosphoproteomics to better understand the role of PTMs on multiply
modified proteins.
181−183
Fragmentation of peptide anions is
another alternative approach gaining traction in the proteomics community,
as it can provide access to new information not seen by traditional
methods. The vast majority of proteomics workflows use positive electrospray
ionization with LC–MS/MS to fragment peptide cations with collisional
activation, which limits detection of species that prefer deprotonation
over protonation. While collision-based dissociation approaches do
not generate reproducible sequence-informative product ion spectra
of negatively charged peptides, several photodissociation and electron-driven
fragmentation methods have emerged to facilitate high-throughput proteomics
in the negative mode.
184−190
Because the negative charge of phosphoryl groups can lead to preferential
ionization of phosphopeptides as anions,
191,192
these negative mode approaches have the potential to provide a new
dimension to phosphoproteomic experiments. Holistically, there are
certainly many interesting avenues to explore with phosphopeptide
fragmentation. That said, the majority of phosphoproteomic experiments
use higher-energy collisional dissociation (HCD) for routine analyses
because it is relatively straightforward to implement and the neutral
loss problems of CAD are largely overcome by the higher energy deposition
of the collisions.
193
Post-Acquisition
Processing and Informatics
Informatics tools to reliably
extract tandem MS data for phosphosite localization and functional
annotation are crucial to biological interpretation of phosphoproteomic
experiments. Recently developed libraries of synthesized phosphopeptides
of known sequence and their fragmentation spectra have provided excellent
resources for evaluating search algorithms, fragmentation schemes,
enrichment and separation strategies, and prediction tools.
194−197
Perhaps their most powerful application, these libraries help researchers
develop new informatics tools for MS/MS spectral interpretation and
phosphosite localization. One popular and relatively recent algorithm
for phosphosite localization is PhosphoRS, which assigns individual
site probabilities for phosphopeptides (Figure 8).
198
PhosphoRS
is compatible with multiple fragmentation types and a range of mass
accuracy measurements. It has improved upon previously available algorithms,
with 3470 unique localized phosphosites from HeLa cells compared to
3107 with A-score
199
and 2763 with Mascot
Delta score.
200
A generic approach for
obtaining a single confidence score for PTM localization, called the
D-score, has been developed from the established Mascot Delta score
algorithm for compatibility with multiple search engines.
201
The D-score is calculated by translating search
engine scores into posterior error probabilities (PEP) and estimating
the PEP difference between the two most likely modification sites
independent of search engines, which can improve correct localization
by as much as 25.7% compared to using Mascot alone.
Figure 8
Phosphosite localization.
The workflow here shows the localization steps taken by phosphoRS,
but the concepts are valid for a variety of localization algorithms.
MS/MS spectra are binned into windows (A) and the optimal peak depth
to use for localization is determined by calculating cumulative binomial
probabilities for each isoform (B). Potential phosphopeptide isoforms
are scored based on the optimal number of most intense peaks from
each m/z window (C), and sequence
and phosphosite probabilities are calculated (D). Reprinted with permission
from Taus, T.; Köcher, T.; Pichler, P.; Paschke, C.; Schmidt,
A.; Henrich, C.; Mechtler, K. J. Proteome Res.
2011, 10, 5354–5362 (ref (197)). Copyright 2015 American
Chemical Society.
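The binomial scoring in steps B through D of Figure 8 can be sketched in a few lines. This is a minimal illustration, not the PhosphoRS implementation: the function names are hypothetical, the per-peak random-match probability is taken as a given input, and the inverse-probability normalization is an assumption consistent with the caption.

```python
from math import comb
from typing import Dict


def cumulative_binomial(n_theoretical: int, n_matched: int, p_random: float) -> float:
    """P(X >= n_matched) for X ~ Binomial(n_theoretical, p_random).

    A small value means the observed matches are unlikely to be random.
    """
    return sum(
        comb(n_theoretical, k) * p_random**k * (1 - p_random) ** (n_theoretical - k)
        for k in range(n_matched, n_theoretical + 1)
    )


def site_probabilities(isoform_p: Dict[str, float]) -> Dict[str, float]:
    """Normalize inverse random-match probabilities into isoform probabilities."""
    weights = {name: 1.0 / max(p, 1e-300) for name, p in isoform_p.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}
```

In this sketch, an isoform whose matched fragment ions are very unlikely to arise by chance receives most of the probability mass.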
Using mass accuracy and
peak intensities, Nesvizhskii and co-workers introduced LuciPHOr to
improve site localization and false localization rate (FLR) estimation.
202
This tool estimates FLR based on a target-decoy
framework, in which artificial phosphorylation is used to generate
decoy phosphopeptides to compare with target matches from a database
search. Another alternative, PhosSA, implemented a fast and scalable
(reported up to 0.5 million spectra/hour) dynamic programming strategy
that runs in linear time and space for phosphosite assignment.
203
PhosSA sums peak intensities that match theoretical spectra
as an objective function and uses signal-to-noise measurements of
MS/MS spectra in postprocessing quality control.
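The target-decoy logic behind FLR estimation can be illustrated with a simple counting sketch. This is not the LuciPHOr estimator, which models the score distributions in more detail; the function name is hypothetical, and the sketch assumes that decoy and target site assignments are equally likely to pass the threshold by chance.

```python
from typing import Sequence


def estimate_flr(target_scores: Sequence[float],
                 decoy_scores: Sequence[float],
                 threshold: float) -> float:
    """Estimate the false localization rate at a score threshold.

    Decoy localizations passing the threshold approximate the number of
    incorrect target localizations passing the same threshold.
    """
    n_target = sum(score >= threshold for score in target_scores)
    n_decoy = sum(score >= threshold for score in decoy_scores)
    if n_target == 0:
        return 0.0  # nothing accepted, so nothing can be falsely localized
    return min(1.0, n_decoy / n_target)
```

Scanning the threshold and keeping the lowest value at which the estimate stays below a chosen cutoff (for example, 1%) gives a filtered list of localized sites.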
Many informatics
platforms are being developed with flexibility and accessibility in
mind, with an ultimate aim for universal tools that perform well for
diverse types of spectral data sets. The MS-GF+ search algorithm has
demonstrated that its robust probabilistic model works well across
a variety of data sets, including spectra generated using diverse
configurations of MS instruments and experimental protocols.
204
Described as a truly universal MS/MS database
search tool, MS-GF+ performed more favorably for phosphopeptides than
older tools like Mascot-Percolator and InsPecT. Other more broadly
applicable PTM spectral matching approaches have also been developed;
these include wide precursor tolerance (±500 Da) database searches
to identify peptide modifications without a priori knowledge on a proteome-wide scale,
205
and directed database searching to match modifications like phosphorylation
based on previous observations at specific amino acid residue positions.
206
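The principle of a wide-tolerance search can be sketched as follows: the difference between the observed precursor mass and the mass of the unmodified candidate peptide is compared against known modification masses. The function name, the small lookup table, and the 0.01 Da matching tolerance are illustrative assumptions; the monoisotopic mass shifts are standard values.

```python
from typing import Optional

# Monoisotopic mass shifts in Da (HPO3, C2H2O, O).
KNOWN_SHIFTS = {
    "phospho": 79.96633,
    "acetyl": 42.01057,
    "oxidation": 15.99491,
}


def annotate_delta(observed_mass: float,
                   unmodified_mass: float,
                   tol: float = 0.01,
                   max_window: float = 500.0) -> Optional[str]:
    """Label the precursor mass difference, or return None if outside the window."""
    delta = observed_mass - unmodified_mass
    if abs(delta) > max_window:
        return None  # outside the wide precursor tolerance
    if abs(delta) <= tol:
        return "unmodified"
    for name, shift in KNOWN_SHIFTS.items():
        if abs(delta - shift) <= tol:
            return name
    return "unassigned"  # a mass shift with no entry in the lookup table
```

Mass differences that recur across many peptides but remain unassigned are the candidates for modifications not anticipated before the search.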
With the increasing availability of informatics
tools, recent studies have aimed to equip researchers with the knowledge
to choose the best tools for their work through evaluation of other
open-source applications for phosphoproteomic data analysis with multiple-search-engine
compatibility.
207,208
Overall, the maturation and
availability of robust phosphosite localization tools have greatly
increased the information density of phosphoproteomic data sets, providing
residue-specific data for further biological interpretation. Continual
development in making these tools compatible with high-volume data
and cutting edge phosphopeptide fragmentation techniques will only
improve the efficacy and reproducibility of phosphosite characterization.
Beyond postacquisition processing and phosphosite localization,
informatics tools are required to translate large-scale data sets
to a biologically relevant context, including spatial and temporal
information about signaling networks. In a recent subcellular phosphoproteomics
study, support vector machines were used to determine compartment-specific
phosphosites, which provided spatial resolution to more than 10 000
human phosphoproteins with experimentally verified information on
subcellular localization.
209
A cluster
evaluation approach used to study temporal dynamics of signaling cascades
in two time-series phosphoproteomics data sets identified key kinases
associated with human embryonic stem cell differentiation and the insulin
signaling pathway.
210
This approach used
prior knowledge, annotated kinase-substrate relationships mined from
literature, and curated databases to generate biologically meaningful
partitioning of phosphorylation sites. It then determined key kinases
associated with each cluster based on temporal kinetics of similar
substrates of a given kinase. Deriving and training logic models to
handle high-content phosphoproteomic data using prior knowledge of
kinase/phosphatase-substrate interactions has also been utilized to
investigate targets and effects of kinase inhibitors and reconcile
conclusions obtained from multiple data sets.
211
A pipeline for systematic elucidation of signaling
networks has also been developed to identify key proteins in specific
pathways, discover protein–protein interactions, and infer
signaling networks.
212
Using quantitative
phosphoproteomic experiments, this informatics approach performed
phosphopeptide meta-analysis, correlation network analysis, and causal
relationship discovery to study stress responses in budding yeast.
Follow-up experiments validated the discovery of 5 high-confidence
proteins from meta-analysis and 19 hub proteins from correlation analysis.
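A minimal sketch of how hub proteins might be ranked in a correlation network is shown below. It is not the published pipeline: the function names, the use of Pearson correlation, and the 0.9 threshold are assumptions for illustration.

```python
from itertools import combinations
from math import sqrt
from typing import Dict, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equal-length quantitative profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def hub_degrees(profiles: Dict[str, Sequence[float]],
                min_abs_r: float = 0.9) -> Dict[str, int]:
    """Count, for each protein, the partners whose profiles correlate strongly."""
    degree = {name: 0 for name in profiles}
    for a, b in combinations(profiles, 2):
        if abs(pearson(profiles[a], profiles[b])) >= min_abs_r:
            degree[a] += 1
            degree[b] += 1
    return degree
```

Proteins with the highest degree in such a network are the natural candidates for hubs, subject to experimental follow-up.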
Ultimately, this pipeline provides a comprehensive tool for systematically
discovering signaling networks and candidate proteins for further
investigation. PhosphoSitePlus, a publicly available database that
contains ∼260 000 reported phosphosites, is another
valuable tool for network analysis in phosphoproteomic experiments,
although many of the phosphosites do not have a known function or
associated kinase.
213
Building on
these options, PhoSigNet was designed to be a phosphorylation-centric
database and analysis platform that can store and display human phosphorylation-mediated
signal transduction networks, taking kinase–substrate regulatory
pairs into account and also extending regulatory relationships up-
and downstream.
214
PhosphoPath takes visualization
one step further as a PTM-specific tool, focusing on displaying protein–protein
interactions, kinase–substrate interactions, and pathway enrichments
at a phosphosite-centric level (Figure 9).
215
Integrating data from
three public databases, PhosphoPath is a Cytoscape plug-in that offers
accessibility and phosphosite-directed data analysis for quantitative
information for multiple conditions or time points at protein and
PTM levels. Given the development of analytical and informatics tools
across a diverse range of species and sample types, cross-species
mapping of PTMs can be a valuable informatics approach to understanding
key signal transduction mechanisms. Many functionally important modification
sites are more likely to be evolutionarily conserved, and new tools
like PhosphOrtholog are facilitating the comparison of data sets derived
from multiple species.
216
Figure 9
Interaction networks
built from a phosphosite-centric perspective. PhosphoPath is a Cytoscape-based
tool that aids visualization and analysis of quantitative phosphoproteomic
data. Displayed here is a quantitative interaction network of members
of the MAPK pathway, with blue and red representing down- and up-regulation,
respectively. Straight lines show protein–protein interactions
from Biogrid while arrows visualize kinase-substrate interactions
from PhosphoSitePlus. Multiplicity is indicated by the color bar for
each protein, and edges can be added manually, such as the red edge
at the top of the figure showing inhibition of NF1 on NRAS. Reprinted
with permission from Raaijmakers, L. M.; Giansanti, P.; Possik, P.
A.; Mueller, J.; Peeper, D. S.; Heck, A. J. R.; Altelaar, A. F. M. J. Proteome Res.
2015 (ref (213)). Copyright 2015 American
Chemical Society.
Biological Insights via
Phosphoproteomics
The development of analytical methods for
robust and quantitative phosphoproteomics over the past several years
has led to significant impact in translational science in human health
and disease. Even with the challenges in reproducibility and phosphosite
assignment discussed here, the field actively contributes knowledge
to the greater scientific community at an impressive pace, continually
accelerated by improvements in throughput and depth. The contribution
of phosphoproteomics to molecular biology is far too immense to review
extensively here. Instead, we discuss a cross section of studies that
capture the breadth of phosphoproteomics’ impact on biological
research. We direct readers desiring a greater context for the biological
implications of phosphoproteomics to recent and more thorough reviews
on subjects including cancer biology,
217,218
clinical
applications,
219
cell and tissue analysis,
220
metabolism,
221
and
systems biology.
222
The long-appreciated
role of phosphorylation signaling in cancer cells remains one of the
most active areas of research in phosphoproteomics. Recent work from
the Cutillas group showed that tumors from different hematological
cancer cell lines, including acute myeloid leukemia, three lymphoma,
and three multiple myeloma cell lines, can be distinguished by their
phosphoproteomes and their phenotypic responses to inhibitors.
223
This group also used phosphoproteomic analyses
of acute myeloid leukemia cells to systematically infer the activation
of kinase pathways, providing a computational approach to profile
dysregulation of signaling pathways in an untargeted fashion.
224
Conserved oncogenic signaling pathways can
also distinguish mouse models of breast cancer on the basis of tyrosine
phosphorylation signatures and signaling networks.
225
Protein kinase B (Akt) is known to play key roles
in cell proliferation and metabolism, and aberrant hyperactivation
of the mTORC2 (mechanistic target of rapamycin complex 2)–Akt
pathway can facilitate tumorigenesis.
226
Using MS-based phosphoproteomic methods in combination with other
approaches, Liu et al. showed that phosphorylation of Akt at serine
477 and threonine 479 is an essential layer of its activation mechanism
in the regulation of its physiological functions, providing a mechanistic
link between Akt hyperactivation in cancer and aberrant cell cycle
progression.
227
Quantitative phosphoproteomics
also facilitated identification of apoptosis-modifying kinases that
are highly connected to regulated substrates downstream of tumor necrosis
factor-related apoptosis-inducing ligand (TRAIL). This study offers
a resource of potential targets for the development of TRAIL combination
therapies to selectively kill cancer cells.
228
Mechanistic insights into novel kinase activity, such as enzymes
involved in coenzyme Q biosynthesis, have established a molecular
foundation for further investigation of how classes of proteins affect
cancer and other diseases through diverse biological pathways as well.
229
Nontraditional approaches have been integrated
with more canonical bottom-up phosphoproteomic techniques to study
phosphorylation signaling in cancer. In order to study combinatorial
PTM patterns related to the progression of breast cancer through the
cell cycle, top-down methods were leveraged to identify and quantify
phosphorylation of histone H1 proteoforms (a potential clinical biomarker
of breast and other cancers).
230
Peptide-centric
bottom-up phosphoproteomics was then integrated with the intact proteoform
data to ultimately show progressive H1 phosphorylation across the
cell cycle, suggesting specific phosphorylation events may serve as
markers for proliferation. Quantitative phosphoproteomics has also
provided more insight into cell-cycle regulation via histone H2A.
One study showed that autophosphorylation of Bub1 kinase, which phosphorylates
H2A at threonine 120 to promote centromere sister chromatid cohesion,
is mitosis specific, and that Bub1 activation is primed in interphase
but only fully achieved in mitosis.
231
Moreover,
phosphorylation of H2A at tyrosine 57, a conserved modification from
yeast to mammals, is involved in regulation of transcriptional elongation
based on the unsuspected tyrosine kinase activity of casein kinase
2.
232
Very recently, a “multi-omics”
approach that incorporated data from metabolomics, lipidomics, and
phosphoproteomics on multiple myeloma cells revealed that kinase inhibitors
may not only downregulate phosphorylation of their targets but also
induce metabolic events via increased phosphorylation of other cellular
components.
233
Phosphorylation of
proteins involved in nuclear activity is central to many cellular
processes. In 2013, Kirkpatrick et al. used large-scale phosphoproteomics
to uncover extensive signaling among proteins in the DNA damage pathway
when cell death was initiated in melanoma cells through treatment
with small molecule inhibitors against MAP/ERK kinase (MEK) and phosphatidylinositol
3-kinase (PI3K).
234
Their work provided
further insight into short- and long-term sensitivity of tumor cells
to MEK- and PI3K-targeted therapies, in addition to the broader impacts
of combinatorial therapeutic approaches for intervention in many cancers.
Regulation of transcription in the nucleus by stretches of consecutive
phosphoserine residues (3 to >10 in a row) has recently been shown
in the human phosphoproteome, with the majority of the phosphoproteins
with pSer stretches functioning in macromolecular, nucleotide, or
metal ion binding.
235
Interestingly, stretches
of consecutive pThr and pTyr are almost absent. Phosphorylation can
also play an important role in nuclear activity during the viral life
cycle. Phosphorylation of the monomeric nucleoprotein in large ribonucleoprotein
(RNP) complexes from negative-sense RNA viruses regulates oligomerization
of the monomer into the complex, an essential step for virus replication.
236
Appreciation for the role of reversible
mitochondrial phosphorylation in signaling and energy utilization
continues to grow. Using isobaric labeling-based quantitative phosphoproteomics,
Grimsrud and co-workers demonstrated that phosphorylation is widespread
in mitochondria and is a key mechanism for regulating ketogenesis
during the onset of obesity and type 2 diabetes.
237
Mitochondria play a key role in the cell’s adaptation
to metabolic demands, and Ferreira et al. used a label-free quantitative
approach to show that reprogramming of the phosphoproteome reflects
the response of heart mitochondria to metabolic demands of long exercise
programs, which are associated with improvement in cardiac function
and lifespan extension.
238
The plasticity
of mitochondrial response to acute exercise signaling via phosphorylation
has also been shown in skeletal muscle, where many previously undescribed
roles for phosphorylation modifications exposed the unexplored complexity
of signaling in acute exercise.
239
Unexpected
roles in signaling in the secreted phosphoproteome have emerged as
well. Extracellular phosphoproteins have been recognized for over
a century, for example, but challenges in measuring them have limited
the development of substantial knowledge. As it turns out, Fam20C
generates the majority of the extracellular phosphoproteome, and its
substrates suggest roles for the kinase beyond biomineralization,
including lipid homeostasis, wound healing, and cell migration and
adhesion.
240
Beyond the focus on
mammalian phosphoproteomics in translational research, the importance
of global phosphorylation in plants and microbial systems has garnered
considerable attention. Recent descriptions of the roles of phosphorylation
in Arabidopsis thaliana (a model organism for flower
development) have included responses to hormone signaling,
241−243
DNA damage,
244
and circadian clock cycles.
245
Large-scale experiments have linked phosphorylation-based
signaling in Medicago truncatula, a model legume
used for studying nitrogen fixation, to the formation and association
of symbiotic relationships with rhizobia that assist in the nitrogen
fixation process.
246−249
Yeast has been a popular system for studying TOR signaling.
Target of rapamycin signaling complex 1 (TORC1) is implicated in growth
control/proliferation and aging from yeast to humans.
250
Recent work has leveraged phosphoproteomics
in combination with dynamic metabolomics data to infer the functional
role of phosphorylation in the metabolic activity of 12 enzymes, including
three candidate TORC1-proximal targets. This work ultimately helped
resolve the temporal sequence of phosphorylation responses to nutritionally
and chemically induced changes.
251
A high
temporal-resolution global phosphoproteomics experiment was evaluated
recently in Saccharomyces cerevisiae. This study
indicated that putatively functional kinase- or phosphatase-substrate
interactions occur more rapidly (within 60 s) than promiscuous interactions,
allowing specific and functional kinase- and phosphatase-substrate
interactions to be profiled.
252
Measuring
proteomic and phosphoproteomic changes over the four major cell cycle phases
of Schizosaccharomyces pombe, Carpy et al. quantified
cell cycle-dependent fluctuations on a proteome-wide scale and showed
that protein phosphorylation peaked in mitosis. This study also coupled
measurements of copy numbers per cell with the phosphoproteomic data
to estimate phosphosite stoichiometry with absolute amounts of protein-bound
phosphate.
253
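The arithmetic that links copy number and site occupancy to protein-bound phosphate can be sketched as follows. The function names and input values are hypothetical, and occupancy is assumed to be already estimated as a fraction between 0 and 1.

```python
from typing import Iterable, Tuple


def phosphate_per_cell(copies_per_cell: float, occupancy: float) -> float:
    """Phosphate groups carried by one site: protein copies times fractional occupancy."""
    if not 0.0 <= occupancy <= 1.0:
        raise ValueError("occupancy must be a fraction between 0 and 1")
    return copies_per_cell * occupancy


def total_bound_phosphate(sites: Iterable[Tuple[float, float]]) -> float:
    """Sum protein-bound phosphate over (copies_per_cell, occupancy) pairs."""
    return sum(phosphate_per_cell(copies, occ) for copies, occ in sites)
```

For example, a site at 25% occupancy on a protein present at 20 000 copies per cell would carry 5 000 phosphate groups per cell under these assumptions.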
PTM Cross-Talk
Phosphorylation is one of many dozens of important PTMs in cellular
function. Protein modification rarely, if ever, occurs in isolation,
and the interaction of various PTMs can carry important biological
information in prokaryotic and eukaryotic systems.
254−256
Carr and co-workers introduced a robust serial enrichment strategy
that enabled characterization of an average of 20 800 localized
phosphosites, 15 408 ubiquitination sites, and 3 190
acetylation sites from 7.5 mg of a single sample, all with SILAC quantitation.
257
This powerful platform revealed cross-talk
among six interconnected protein networks that regulate cell cycle,
replication, transcription, translation, and the proteasome in Jurkat
cells. Swaney et al. studied cross-talk events between phosphosites
and ubiquitination that regulate protein degradation via the ubiquitin-proteasome
system.
258
Phosphosites had greater conservation
on ubiquitinated proteins, indicating a role in biological function
and suggesting a global cross-talk directionality, in which phosphorylation
more frequently precedes ubiquitination. Co-regulation of phosphorylation-
and ubiquitination-dependent signaling networks has been shown in
yeast treated with rapamycin, as well.
259
PTM interaction in signaling networks aimed at cell survival or
death in myocardial ischemia is also important, where lysine acetylation
can activate protein kinases during ischemia and increase proximal
dephosphorylation by as much as 10-fold.
260
The interplay of phosphorylation and acetylation was also investigated
in the strong correlation of maximal exercise-associated oxidative
capacity to health and longevity.
261
Cross-talk
between phosphorylation and O-linked N-acetylglucosamine
(O-GlcNAc), which modifies the same residues as phosphorylation, is
widespread and important, as well, serving as a nutrient/stress sensor
to modulate signaling, transcription, DNA damage response, and cytoskeletal
functions.
262,263
Phosphorylation can also
exhibit cross-talk with other phosphorylation-regulated pathways.
Signal integration between mitogen-activated protein kinase cascades
in budding yeast has shown that concurrent stimuli (high salt concentration
and pheromones) affect multiple pathways previously thought to be
specific to a given stimulus.
264
This intraphosphorylation
cross-talk revealed that phosphorylation events in many pathways affect
each other at more levels than anticipated, showing that the integration
of a response to different stimuli requires complex interconnections
between signaling cascades. Cross-talk is not limited to interactions
between PTMs, either; recent work has shown that cross-talk can occur
between phosphorylation and heat shock protein in brain tissue samples
from Alzheimer’s patients.
265
Phosphorylation beyond Serine, Threonine, and Tyrosine
While
the majority of the phosphoproteomics community focuses on O-phosphorylation
(serine, threonine, and tyrosine), there is a growing appreciation
for alternative sites of modification, namely, N-phosphorylation (lysine,
arginine, and histidine) and S-phosphorylation (cysteine). These alternative
phosphorylation events are known to function in important signaling
mechanisms in bacterial systems, and more studies are emerging to
suggest they may play a role in eukaryotic signaling as well.
266−271
Analytical tools for the investigation of N- and S-phosphorylation,
however, are lacking. Canonical LC–MS/MS approaches that utilize
acidic buffers are less than ideal because both phosphoramidate and
phosphothiolate bonds are acid labile. Similar to CAD on phosphopeptides
with pSer, pThr, and pTyr, gas-phase rearrangements in MS/MS fragmentation
for peptides containing phosphorylated histidine, arginine, and lysine
have led to false localizations. Electron-driven fragmentation methods
seem to hold more promise for these modifications than collisional
dissociation.
272−275
One alternative to O-phosphorylation in particular, phosphohistidine,
has garnered increased interest in recent years.
266
Phosphohistidine antibodies have been generated by the
Muir and Hunter groups,
276−278
and a large-scale phosphohistidine
study was reported using a neutral loss fragmentation strategy.
279
These studies certainly show progress in the
analysis of this elusive PTM, but in all, investigations have remained
limited. As specific enrichment strategies for phosphohistidine and
other nontraditional phosphorylated residues emerge, larger scale
and more systematic evaluations of their biological roles can be investigated.
These investigations may require alternative separations, fragmentation,
and informatics tools, but a more comprehensive understanding of nontraditional
phosphorylated residues’ role in prokaryotic and eukaryotic
systems makes the effort worthwhile.
Looking Forward
The optimal phosphoproteomic methodology for thorough phosphoproteome
coverage with minimal sample preparation steps and data acquisition
time remains elusive, but present-day techniques continue to progress
toward more reproducible methods that offer considerable throughput
and depth. Leveraging refinement of phosphopeptide enrichment protocols,
improved sensitivity and speed of new-generation mass spectrometers,
and more robust informatics tools, phosphoproteomic technology now
can offer >10 000 localized and quantified phosphosites
in only a few hours of data acquisition and tens of thousands of phosphosites
in weeks of analysis. We remain far from complete phosphoproteome
characterization, however. Confident reports of a complete phosphoproteome
from even a single sample or cell line have yet to emerge, much less
a generic platform where complete phosphoproteomes can be routinely
profiled. While rapid and deep whole proteome-level characterization
is currently within reach, comprehensive PTM-level characterization,
such as a complete phosphoproteome, appears a decade or more away.
Phosphoproteomic technology will surely continue to advance hand-in-hand
with improvements in MS instrumentation. Traditional shotgun phosphoproteomic
experiments will benefit directly from gains in instrument scan rate
and sensitivity, and alternative methods, such as SWATH-MS-like DIA
approaches, promise to offer increased reproducibility, perhaps coupled
to competitive phosphoproteomic sampling depths in coming years. We
anticipate an increase in routine sampling depth to entail a greater
appreciation for overlap, as opposed to orthogonality, of sample preparation
methods. On the basis of current data comparing state-of-the-art techniques,
we expect the coming years will see a convergence of preparation and
enrichment methods into a few reproducible but versatile options.
Beyond traditional LC–MS/MS bottom-up strategies, advances
in top-down proteomic tools have placed large-scale, quantitative
intact phosphoprotein analysis within reach, and progress in alternative
separations technologies, like capillary zone electrophoresis, may
also provide more thorough phosphoproteome characterization.
280−284
Ultimately, the future of phosphoproteomic research holds
reproducible identification and quantification of tens of thousands
of phosphosites in a few hours of analysis per sample. With such capability,
phosphoproteomics can achieve even more significant impact in biological
research and clinical platforms. In parallel with technological developments
necessary for this level of analytical power, the field will require
streamlined data-analysis and interpretation tools that can capitalize
on the speed and sensitivity of state-of-the-art MS methodology. Success
will rely on technology that can integrate large data sets into systems
biology approaches while maintaining the flexibility of phosphoproteomic
tools to address discovery- and hypothesis-driven questions.