Discovering the functional mechanisms of biological systems frequently requires information
that challenges the spatial and temporal resolution limits of current experimental
techniques. Recent dramatic methodological advances have made all-atom molecular dynamics
(MD) simulations an ever more useful partner to experiment because MD simulations
capture the atomic resolution behavior of biological systems on timescales spanning
12 orders of magnitude, covering a spatiotemporal domain where experimental characterization
is often difficult if not impossible. We present here our perspective on the mechanistic
insights that a scientist—in particular, a membrane protein physiologist—might garner
by complementing experiments with atomistic MD simulations. Drawing on case studies
from our work, we illustrate the diversity of membrane proteins amenable to study
by MD and the types of discoveries one can make through simulation. We discuss the
strengths and limitations of MD as a tool for physiologists, and we speculate on advances
that such simulations may enable in the coming years.
Why simulate?
What might a physiologist gain by supplementing the usual experimental tools—cell
lines, patch clamp rig, spectrometers, and the like—with atomistic MD simulations?
Foremost is the ability to probe the biological system of interest, which may be anything
from an individual protein to a large biological assembly, across a very broad range
of timescales at high spatial resolution (Fig. 1). An all-atom MD simulation typically
comprises thousands to millions of individual atoms representing, for example, all
the atoms of a membrane protein and of the surrounding lipid bilayer and water bath
(Fig. 2). The simulation progresses in a series of short, discrete time steps; the
force on each atom is computed at each time step, and the position and velocity of
each atom are then updated according to Newton’s laws of motion. Each atom in the
system under study is thus followed intimately: its position in space, relative to
all the other atoms, is known at all times during the simulation. This exquisite spatial
resolution is accompanied by the unique ability to observe atomic motion over an extremely
broad range of timescales—12 orders of magnitude—from ∼1 femtosecond (10−15 s), less
than the time it takes for a chemical bond to vibrate, to >1 ms (10−3 s), the time
it takes for some proteins to fold, for a substrate to be actively transported across
a membrane, or for an action potential to be initiated by the opening of voltage-gated
sodium channels. MD simulations thus allow access to a spatiotemporal domain that
is difficult to probe experimentally (Fig. 1). Simulations can be particularly valuable
for membrane proteins, for which experimental characterization of structural dynamics
tends to be challenging.
Figure 1.
Spatiotemporal resolution of various biophysical techniques. The temporal (abcissa)
and spatial (ordinate) resolutions of each technique are indicated by colored boxes.
Techniques capable of yielding data on single molecules (as opposed to only on ensembles)
are shown in bold. Nuclear magnetic resonance methods can probe a wide range of timescales,
but they provide limited information on motion at certain intermediate timescales,
as indicated by the lighter shading and dashed lines. The timescales of some fundamental
molecular processes, as well as composite physiological processes, are indicated below
the abcissa. The (spatial) resolution needed to resolve certain objects is shown at
right. AFM, atomic force microscopy; EM, electron microscopy; FRET, Förster resonance
energy transfer; NMR, nuclear magnetic resonance.
Figure 2.
Single-channel recording of ion permeation from experiment and simulation. Electrophysiological
recording of current through a potassium channel (top) and recording of ion permeation
from an all-atom simulation of the Kv1.2 potassium channel (bottom; cf. Jensen et
al., 2010). (Bottom left) The simulated model, consisting of lipid molecules (spheres),
Kv1.2 (orange; only the pore domain is shown), potassium ions (shown as green spheres
within the selectivity filter [white box]; remaining K+ and Cl− ions not shown), lipid
molecules (chains of spheres), and water molecules (red/white spheres and gray surface).
(Bottom middle) Magnification of the selectivity filter. (Bottom right) Positions
of K+ ions permeating the selectivity filter (left ordinate; drawn to scale with the
selectivity filter) versus simulation time (abcissa) at a depolarizing voltage of
∼25 mV. Traces of individual ion positions are shown in various colors. Red line and
circles indicate accumulated permeation events of individual K+ ions (right ordinate).
The experimental channel recording was originally published by Li et al. (2006) and
is reprinted here with permission from Proceedings of the National Academy of Sciences
of the United States of America (PNAS). Use of this figure is gratefully acknowledged.
How might this ability to “see” the atoms of a biological system moving over time
truly be useful? First, one can observe qualitative behavior, such as the mechanism
of permeation through membrane channels. Second, one can probe systems quantitatively,
for example, determining the conductance of a single water or ion channel (Fig. 2).
Third, simulations often allow one to generate novel mechanistic hypotheses, sometimes
based simply on straightforward observation: as Yogi Berra once said, “You can observe
a lot by watching.” Fourth, simulations of perturbed or mutated molecular systems
can be used to test specific hypotheses originating from experiment, computation,
or theory. The power of MD simulations is further augmented by the ability to model
molecules that cannot easily be constructed experimentally.
A wide variety of physiological processes are amenable to study at the atomic level
by MD simulation. Examples relevant to membrane protein function include the active
transport of solutes across bilayers by antiporters and symporters; the passive transport
of water, ions, and other solutes by structurally diverse channels; the interconversion
of transmembrane electrochemical gradients and chemical potential energy by pumps
such as the F1F0-ATPase and the Na+/K+-ATPase; the transmission of extracellular stimuli
to the cell interior by G protein–coupled receptors (GPCRs) and tyrosine kinase receptors;
and the structural coupling of cells and organelles to one another by integrins and
membrane curvature modulators.
Uncovering the fundamental mechanisms that underlie the functions of membrane proteins,
and increasingly the dynamical details of those mechanisms, is a major thrust of molecular
physiology. Computational models, especially those arising from MD simulations, are
useful because they can provide crucial mechanistic insights that may be difficult
or impossible to garner otherwise (Pugh and Andersen, 2008). Insight from computer
models has a long history: in a landmark study at the dawn of the computer age, Fermi,
Pasta, Ulam, and Tsingou used simulation to discover the counterintuitive behavior
of a deceptively simple oscillator (Fermi et al., 1965). The authors noted that “the
results of our computations show features which were, from the beginning, surprising
to us.” A half-century later, computational methods have progressed sufficiently to
enable a true partnership of MD simulations and experiments that can yield mechanistic
insights into complex problems of biological interest.
Case studies
Using examples from our work on membrane proteins, we present several case studies
that illustrate the breadth of physiological processes that can be investigated using
MD simulations (Karplus and Kuriyan, 2005). Each study addressed unique mechanistic
questions, and thus each entailed a distinct approach. Our work was inspired by earlier
studies of membrane proteins, including simulations of ion and water channels (Roux
and Schulten, 2004; de Groot and Grubmüller, 2005), transporters (Khalili-Araghi et
al., 2009), and GPCRs (Martínez-Mayorga et al., 2006; Spijker et al., 2006).
Permeation through a water channel: aquaporin 0 (AQP0).
AQP0, the most abundant membrane protein in the mammalian eye lens, differs from most
other aquaporins in two ways: it transports water significantly more slowly (Yang
and Verkman, 1997), and AQP0 molecules in adjacent cells may link to form intercellular
junctions (Gonen et al., 2005). Comparison of the high-resolution structures of the
nonjunctional AQP0 tetramer (x ray) and the junctional AQP0 octamer (electron microscopy)
has stimulated intense debate on whether water even permeates AQP0 junctions (Harries
et al., 2004; Gonen et al., 2005), a question that has not been answered experimentally
because of the difficulty in measuring the permeability of a junctional octamer.
To address this question, we used MD to simulate several AQP0 assemblies, including
a nonjunctional tetramer embedded in a single lipid bilayer and a junctional octamer
embedded in two juxtaposed bilayers (Jensen et al., 2008). Our results indicated that
both AQP0 forms do in fact conduct water: the computed single-channel (monomeric)
water permeabilities of the AQP0 assemblies were all essentially the same and within
error of the experimental value determined for nonjunctional AQP0 (Yang and Verkman,
1997). We found further that the distinct gating action of two tyrosine residues unique
to AQP0—one a static gate and the other a dynamic gate—explained AQP0’s unusually
low water permeability. Our results also shed light on the possible biological significance
of this low permeability: as a cellular adhesion molecule, AQP0 may need to reduce
the rate of water permeation to maintain mechanical stability of the junction.
Reconciling discordant experimental results: β2-adrenergic receptor (β2AR).
The recently determined β2AR crystal structures (Cherezov et al., 2007; Rasmussen
et al., 2007)—the first of ligand-activated GPCRs, the largest class of drug targets—raised
new questions about the atomic-level mechanisms of GPCR signaling. Biochemical experiments
had previously suggested that, in their inactive state, β2AR and many other GPCRs
possess a network of salt bridges, known as the “ionic lock,” which was believed to
form a molecular switch for receptor activation (Ballesteros et al., 2001). Although
β2AR was crystallized in a presumably inactive state, the ionic lock was disrupted
in the crystal structures, provoking uncertainty about the true conformation(s) of
the inactive state as well as the role of the ionic lock in receptor activation and
signaling (Rasmussen et al., 2007, Lefkowitz et al., 2008).
To address these questions, we performed MD simulations of β2AR in multiple wild-type
and mutant forms (Dror et al., 2009). In simulations of wild-type β2AR, the ionic
lock formed reproducibly within a few hundred nanoseconds, with the salt bridge network
matching that suggested by the earlier biochemical studies (Ballesteros et al., 2001).
In microsecond-timescale simulations, we observed that the lock remained formed most
of the time but occasionally broke for tens or even hundreds of nanoseconds. Mutations
associated with increased activity of the unliganded receptor increased the fraction
of time the lock was broken in our simulations. To obtain the high-resolution β2AR
structure, a flexible loop of the receptor had been replaced by a small, rigid protein,
T4 lysozyme (Cherezov et al., 2007). We hypothesized that this replacement altered
the ionic lock conformational equilibrium. Indeed, when we simulated the crystallographic
β2AR–T4 lysozyme fusion protein, the lock-broken conformation predominated. Our results
suggest that inactive β2AR exists in a context-sensitive equilibrium and that the
lock-formed conformation may predominate in the inactive wild-type receptor, despite
the broken lock in the crystal structures.
Ion secretion by an alternating access mechanism: the NhaA antiporter.
The Na+/H+ antiporter NhaA, a prototypical membrane transporter, uses the bacterial
transmembrane proton gradient to drive secretion of one intracellular sodium ion for
every two protons taken up from the periplasm (Hunte et al., 2005). Using a large
set of MD simulations, we deduced an NhaA transport mechanism; the proposed mechanism
was substantiated with complementary experiments on NhaA mutants (Arkin et al., 2007).
The resulting mechanism provides a molecular realization of Jardetzky’s 40-year-old
alternating access model (Jardetzky, 1966): one aspartate residue near the center
of the protein acts as the Na+ binding site, while a second, nearby aspartate acts
as an accessibility control site, whose protonation state determines whether the Na+
binding site is accessible to the periplasm or to the cytoplasm.
Experimentally, the NhaA transport cycle takes roughly one millisecond. Because MD
simulations anywhere near this length were infeasible when we performed our study,
we relied instead on shorter simulations (≤100 ns each) initiated with NhaA in different
configurations thought to represent, or lead to, key functional intermediates. Simulations
were started with the sodium ion at different putative NhaA binding sites and with
all possible protonation states for the two aspartates known experimentally to be
indispensable for transport. Additional simulations also led to a model for the mechanism
of NhaA’s experimentally observed pH sensitivity. Moreover, MD-based free energy calculations
suggested that NhaA preferably binds Li+ over Na+, and Na+ over K+, hinting at how
its cation selectivity arises.
Permeation and gating of an ion channel: Kv1.2.
Potassium channels have recently reemerged as drug targets; conversely, unwanted blockade
of the hERG K+ channel represents perhaps the most serious pervasive safety concern
in current drug development. Ion channels have been the subject of numerous MD studies,
and such simulations, combined with experimental data, have produced the most accurate
atomic-level information on the energetics of permeation (Bernèche and Roux, 2001;
Morais-Cabral et al., 2001).
We recently used MD simulations of Kv1.2 to make the first direct atomic resolution
observations of permeation and pore domain gating in a voltage-gated K+ channel (Jensen
et al., 2010). To traverse the Kv1.2 pore (Long et al., 2007), ions pass through a
large, water-filled hydrophobic cavity and then through the narrow selectivity filter
(Fig. 2). Our analysis of hundreds of individual ion permeation events—driven by experimentally
accessible depolarizing voltages—revealed a detailed conduction mechanism, resembling
the Hodgkin-Keynes “knock-on” model, in which translocation of two selectivity filter–bound
ions is driven by a third, incoming ion; formation of this knock-on intermediate was
found to be the rate-determining step in permeation. These simulations also provided
quantitative results, including a stoichiometric ratio of permeating water molecules
to K+ ions of about one, in agreement with streaming potential measurements (e.g.,
Alcayaga et al., 1989), and a single-channel conductance within a factor of three
of the experimental value.
At reverse or zero voltages, we observed pore closure—gating of the pore domain—which
took place by a novel “hydrophobic gating” mechanism: water evacuated the hydrophobic
cavity, causing the open, conducting pore to collapse into a closed, nonconducting
conformation. In our simulations, in which the voltage sensor domains of Kv1.2 were
omitted, the pore closed within a few microseconds. Experimentally, intact Kv1.2 closes
on millisecond timescales. Thus, our simulation results support the idea that the
voltage sensors act to prevent pore collapse into an intrinsically more stable, closed
conformation (Yifrach and MacKinnon, 2002). Hydrophobic gating may be a fundamental
principle underlying the function of voltage-sensitive K+ channels, and this mechanism
could help to explain the evolutionary conservation of hydrophobic cavities in structurally
diverse ion channels (Jensen et al., 2010).
Strengths and limitations
How does one decide whether addressing a particular biological question using MD simulations
is likely to be feasible and productive? In this section, we address the major strengths
and limitations of MD as a technique for molecular physiology.
Accessible timescales.
Experimentally, phenomena that occur very quickly are typically harder to characterize
than those that occur over long timescales, particularly if high spatial resolution
is desired. In MD simulations, the opposite is true: the longer the physical time
to be simulated, the more wall-clock time is required to execute the simulation. The
highest atomic vibrational frequencies limit each time step to a few femtoseconds,
such that simulating even a microsecond requires nearly a billion sequential time
steps, with evaluation of the forces on every atom required at each step. Because
the force on each atom depends on the position of every other atom (through long-range
electrostatic effects), parallelizing the simulation by splitting it across multiple
processors requires substantial inter-processor communication, which limits the speedup
achievable through parallelization. Thus, MD simulations have historically been most
powerful for simulating motions that take place on sub-microsecond timescales. Of
course, many of the events of greatest interest in molecular physiology—for example,
conformational changes underlying functional mechanisms of proteins—take place on
timescales of microseconds or longer (Fig. 1).
Recent advances in parallel algorithms, software, and hardware have made simulations
beyond a microsecond practical on commodity computer clusters (Bhatele et al., 2008;
Hess et al., 2008; Klepeis et al., 2009). Our four case studies, for example, were
performed using novel parallelization methods on large commodity clusters (Bowers
et al., 2006). We also recently completed a specialized supercomputer designed especially
for MD simulations, named Anton, which has enabled simulations on timescales as long
as a millisecond (Shaw et al., 2009). These advances all broaden the applicability
of MD to molecular physiology, and facilitate direct comparisons of simulations to
experimental data.
Several approaches allow the use of MD to study conformational changes on timescales
exceeding those of individual atomistic simulations. First, one can simulate specific
parts of a longer-timescale process by initializing simulations from appropriately
chosen protein configurations thought to represent key mechanistic intermediates (see
NhaA antiporter study above). Second, one can use techniques devised to allow short-timescale
simulations to sample a broader swath of conformational space, for example, by applying
biasing forces to guide a simulation in a particular direction or by pushing the simulation
over energy barriers (Lei and Duan, 2007). Such approaches are particularly powerful
when prior information about the expected conformational change is available, but
they can also introduce unrealistic behavior into the simulation. Third, one can use
coarse-grained simulations, in which each simulated particle represents multiple atoms,
to substantially reduce computational requirements at the cost of reduced spatial
resolution and, in some cases, reduced accuracy (Bond et al., 2007; see also the complementary
approaches described in the accompanying Perspectives in this issue by Bahar and by
Silva and Rudy).
Accuracy and errors.
MD simulation, like any experimental technique, suffers from various sources of error.
Careful study design can mitigate these errors, and careful interpretation of the
simulation results can help prevent erroneous conclusions.
Errors associated with MD simulations are of two types. The first represents errors
due to limitations on the length and number of simulations that can be performed within
resource constraints. One must ask whether results are robust to minor changes in
the initial conditions, whether observations represent initial transients or steady-state
behavior, and whether the simulations have reached all the relevant conformational
states. Repeated simulations—typically starting from the same structure, but with
different initial velocities—help ensure robustness, but do not always substitute
for longer simulations: some minimum simulation time may be needed to reach conformations
differing substantially from the starting structure, and repeated simulations may
experience the same initial transients. Our recent experiences with long simulations
have indicated that, in some cases, transient behavior can last for a microsecond
or longer. When possible, a simulation long enough to capture repeated transitions
between states is desirable, as this provides evidence for an equilibrium on the simulated
timescale. Each of our simulations of the β2AR, for example, was long enough for the
protein to cycle repeatedly between the lock-broken and lock-formed conformations.
The second type of error stems from the physical models that underlie MD simulations.
These models, known as molecular mechanics force fields (or simply force fields),
specify the force on each atom given the positions of the other atoms and the network
of covalent bonds connecting them. The biomolecular force fields in widespread use
were developed, over several decades, through extensive fitting of models to experimental
data and ab initio quantum mechanics calculations. The most recent generation of force
fields has been shown to reproduce a variety of experimental data, particularly for
proteins (Buck et al., 2006; Hornak et al., 2006), although several shortcomings remain,
and more will likely be revealed through longer-timescale simulations. Known shortcomings
include poor accuracy in the heights of energy barriers separating different conformations
and biases toward α-helical over β-sheet conformations or vice versa.
From a user’s perspective, force field error has two implications. First, force field
parameters must be selected with care; parameters validated on specific molecules
(e.g., proteins or lipids) are typically much more reliable than parameters generated
automatically. Second, force field errors tend to impact certain results more than
others. The molecular conformations predicted by MD simulations tend to be more robust
than the relative populations of those different conformations because populations
are sensitive to minor errors in relative energies. Rates can be particularly difficult
to estimate by MD because they depend strongly on the heights of energy barriers.
Nevertheless, MD can produce reasonably accurate rate estimates under certain conditions,
as illustrated by the previously discussed studies of water and potassium ion permeation
through AQP0 and Kv1.2, respectively.
Further considerations.
Several other factors should be considered when designing an MD simulation study.
One is system size: the computation required is roughly proportional to the number
of atoms, and the relevant dynamics of large systems often take place on timescales
longer than those of small systems. A simulation of a hydrated globular protein will
typically involve 10,000–100,000 atoms, whereas a membrane protein simulation may
involve 40,000–400,000 atoms. Much larger MD simulations, containing many millions
of atoms, have also been performed (Schulten et al., 2008).
Classical MD simulations treat covalent bonds as unchanging. To simulate chemical
reactions, one must use alternative techniques such as quantum mechanics/molecular
mechanics simulations, in which the bulk of the system is simulated as in classical
MD, but a small part is evaluated using more computationally intensive quantum mechanical
approaches (Senn and Thiel, 2009; see also the accompanying Perspective in this issue
by Bucher and Rothlisberger). By default, protonation states remain fixed during a
simulation, but several methods have been developed to allow them to evolve dynamically,
at the cost of additional computation (Mongan and Case, 2005).
Although sufficiently long and accurate MD simulations may eventually be used to predict
protein structure, present-day simulations generally require an initial structure.
The quality of that structure is a major factor in the reliability of the simulation
results; a 2-Å resolution crystal structure, for example, provides a much better starting
point than either a 4-Å resolution crystal structure or a homology model.
Future developments
We expect that the capabilities of MD as a method for membrane protein physiology
will grow substantially in the future. Some of the most immediate changes will likely
result from recent advances that allow dramatically faster simulations: Anton accelerates
MD simulations by orders of magnitude compared with the previous state of the art
(Shaw et al., 2009), and its application to membrane proteins has just begun. In addition
to extending simulations to previously inaccessible timescales on which many events
of physiological interest occur, these speedups mean that simulations that previously
required months now take just days. Recent software advances provide an efficient
means to rapidly analyze and process the large quantities of data produced by such
high-throughput simulations (Tu et al., 2008). These combined developments enable
an iterative process of discovery in which the results of one simulation guide the
setup of the next—a sea change in simulation workflow that brings MD simulations in
line with experimental methods: hypothesize, test, revise hypothesis, test, and so
on.
By facilitating increasingly direct comparisons between simulated and experimental
results, longer simulations will also foster improved force fields (Lindorff-Larsen
et al., 2010). More accurate force fields, in turn, will enable ever better quantitative
measurements by simulation, including the determination of kinetic quantities (e.g.,
rates of transport or conformational change) as well as thermodynamic quantities (e.g.,
free energies of ligand binding, computed in a manner that properly accounts for protein
flexibility).
Continued improvements in computer power will also facilitate simulations of larger
molecular systems at atomistic detail over useful timeframes. Increasingly, MD is
being applied not only to study the behavior of individual proteins, but also to the
study of larger macromolecular complexes (Schulten et al., 2008).
MD simulation has already begun to function as a “computational microscope,” facilitating
discovery in spatial and temporal domains that would otherwise be inaccessible. Over
time, we expect that observations from this computational microscope will become a
routine source of primary data in biology, alongside data from traditional and novel
experimental methods. In several other fields—aircraft design, computer chip design,
and nuclear stockpile stewardship, to name a few—simulations are often performed in
preference to experiments because simulations may be cheaper, more flexible, and more
accurate. The complexity of biological systems is such that experiments, especially
in vivo experiments, will continue to play an indispensable role in molecular physiology,
but simulations may become an equal partner, offering a view of cell biology at a
level of spatial and temporal detail inaccessible through experiment alone.