Introduction
Appropriate tuning of binding selectivity
is a primary objective
in the discovery and optimization of a compound on the path toward
developing a drug. The environment in which drugs act is complex,
with many potential interaction partners. Proteins, DNA, RNA, lipids,
sugars, metabolites, and other small molecules all have the potential
to interact with a drug, and in many cases these unexpected interactions
lead to undesired and often severe side effects. Conversely, the ability
to interact with multiple targets or drug resistance mutants can be
advantageous in certain contexts. Designing a drug with the appropriate
balance of avoidance of undesirable targets (narrow selectivity) and
coverage of one or more targets of interest (broad selectivity, also
referred to as promiscuity) is a continual drug development challenge.
In many cases this objective is attained through trial and error,
but there are rational approaches that can guide the tuning of selectivity,
and examples have been published that illustrate a number of generalizable
strategies. In this review, we discuss fundamental principles that
account for selectivity and highlight examples where selectivity has
been attained through rational design. An understanding of the general
principles that drive selectivity should allow for more efficient
design of compounds with desirable selectivity profiles.
1−3
Traditionally, drug design has been pursued with the primary
objective
of finding a compound that binds with high affinity to a target of
interest.
4
Recently, considerable effort
has been expended measuring off-target interactions with partners
such as ion channels (including the Kv11.1 potassium ion channel hERG),
5,6
cytochrome P450s (CYPs),
7,8
and other proteins that
can lead to adverse side effects. Other considerations, such as family
or subtype selectivity, have gained considerable attention for targets
with homologues that bind to the same or similar native substrates.
A common example is the kinase family (i.e., phosphotransferases),
for which each family member binds ATP in the process of transferring
a phosphate group to a substrate.
9
From
a drug discovery perspective, the aim is to hit only one or a subset
of kinases along the biochemical pathway of interest while avoiding
other kinases for which inhibition may result in adverse effects.
10
In practice, absolute selectivity for a single
kinase may be unattainable, but modulating the selectivity profile
can lead to improved drug properties and in many cases hitting multiple
kinases can be beneficial.
11
While
it is most common to design away from interactions with undesirable
proteins, in other cases it is desirable to hit a panel of targets.
12,13
An example of this type of broad coverage involves designing a drug
that is not sensitive to resistance mutations, which requires a molecule
that binds to drug-resistant variants as well as to the wild-type
target. This type of promiscuous, broad coverage is particularly important
for rapidly mutating targets, such as those that occur in infectious
disease (with HIV being a prototypical example) and cancer. This aspect
of drug discovery is of growing importance, as witnessed by the evolution
of resistance to existing anticancer
14−16
and antimicrobial agents
(antibiotics,
17
antivirals,
18
antifungals,
19
and antimalarials
20
). Similarly, when multiple pathways are accessible
for a given signaling cascade, it may be desirable to hit at least
one member of each parallel pathway in order to successfully block
the downstream signal. Recently, the idea of deliberately using promiscuous
drugs has gained credence.
11
However, this
promiscuity must itself be selective for a given subset of targets,
and nonspecific binding is always undesirable. In general, there is
a fine balance in designing the appropriate level of narrow and broad
selectivity, and one must determine the design criteria for selectivity
based on the relevant biological processes.
The importance of
gaining selectivity has been appreciated for
many years, and there are a number of experimental approaches to screen
for off-target interactions.
21−23
While performing an exhaustive
selectivity screen against all possible interaction partners is still
intractable, it is possible to construct selectivity screening panels
that can be used to gain insights and find more selective compounds.
21
Conceptually, the problem of designing
for a particular selectivity
profile is significantly more complex than designing for high affinity
to a single target. This is true whether purely experimental approaches
are being undertaken or whether computational analysis and design
are involved. The underlying problem is challenging because it is
necessary to evaluate energy differences for each ligand binding to
a panel of targets and decoys rather than to a single desirable target.
Computational methods are of limited accuracy when predicting affinities
of individual complexes; these difficulties are compounded when multiple
relative affinities are required to accurately design appropriate
specificities. From a computational perspective, structure-based design
methods typically are developed to yield low false-positive rates
(i.e., to maximize the chance that predictions of tight binders are
in fact tight binders) at the expense of higher false-negative rates
(tight binders that are not predicted to be so by the computational
method). Accurate selectivity prediction and design require reducing
the false-negative rate without increasing the false-positive one.
This is a difficult search problem and can require very fine sampling
of conformational space, including protein and ligand intramolecular
degrees of freedom, as well as intermolecular (“pose”)
degrees of freedom. This problem becomes increasingly difficult
if the proteins and/or ligands have significant flexibility, as the
size of the search space increases enormously. Essentially, designing
for selectivity is significantly more complex than designing for affinity
for two reasons: first, because of the multifactorial nature of the
task and, second, because of the inherent difficulty of considering
all modes of relaxation with sufficient accuracy, particularly when
ligands bind decoy receptors.
In this review we highlight some recent
examples of successful
approaches to achieving changes in selectivity. We present cases where
the goal required narrowing the binding profile to one or a small
number of targets and increasing the relative binding affinity to
targets over decoys, and we present cases where the goal required
broadening the binding profile to increase the number of targets bound
and flattening the relative affinity across the panel of targets.
We have deliberately elected to organize the discussion around a set
of principles that have proven enabling in realizing selectivity goals.
In very simple yet still useful terms, achieving broad selectivity
involves recognizing and exploiting similarities in binding capabilities
across a collection of targets, and narrow selectivity involves identifying
and exploiting differences between targets and decoys. Most of the
review examines five aspects of binding and complementarity that have
proven to be useful handles, which we have grouped together as structure-based
approaches. These five features (shape, electrostatics, flexibility,
hydration, and allostery) have been utilized because they differ
sufficiently, whether subtly or substantially, across sets of target and
decoy molecules to realize the affinity changes necessary for selectivity.
The principles of exploiting the features listed above are schematically
represented in Figure 1, and we will describe
and discuss each in detail. The review continues by discussing other
approaches that involve higher-level concepts beyond taking advantage
of structural similarities and differences, although ultimately they
can often be achieved through structure-based approaches. We describe
a substrate-mimetic approach to developing broad inhibition across
a population of rapidly mutating enzyme targets (called the substrate
envelope hypothesis), and we also describe methods for leveraging
differences in cellular environments to achieve selectivity goals.
We have necessarily chosen a limited number of examples from the recent
literature to review and illustrate the narrative that we have set
forward. We apologize in advance for necessary omissions and any inadvertent
oversights that kept us from including all of the truly wonderful
advances in this field. We also note that reviews on related topics
have appeared that will be of use to the interested reader.
24−28
Figure 1
Selectivity Strategies.
This cartoon illustrates six design strategies
based on five principles (shape, electrostatics, flexibility, hydration,
and allostery) that can be employed to gain binding selectivity for
a given target: (A) optimization of ligand charges specifically for
the target and against the decoy; (B) displacement of a high-energy
water molecule in the target that is not present in the decoy; (C)
binding to an allosteric pocket in the target that is not present
in the decoy; (D) creating a clash with the decoy receptor but not
the target receptor, where the decoy is unable to alleviate the clash
by structural rearrangement; (E) binding to a receptor conformation
that is accessible in the target but inaccessible in the decoy; (F)
creating an interaction with the target receptor but not the decoy
receptor, where the decoy is unable to form the interaction by structural
rearrangement. Note that (D) and (F) are different manifestations
of the same underlying principle (shape complementarity), with (D)
decreasing binding to the decoy through the introduction of a clash
and (F) increasing binding to the target through the introduction
of a favorable contact.
Structure-Based Selectivity Design Considerations
Shape Complementarity
Shape complementarity between
ligands and receptors is a fundamental aspect of molecular recognition,
29
and there are numerous cases where selectivity
for natural substrates is attributable to the specific shape of the
binding site.
30,31
Unsurprisingly, molecular shape
has proven to be important in the rational design of selective inhibitors.
For example, narrow selectivity is essential for effective COX-2 inhibitors
to control pain and inflammation while lowering the risk of peptic
ulcers and renal failure associated with nonselective COX inhibitors.
Structural analysis by Kurumbail et al. highlighted a selectivity
pocket that is accessible in COX-2 but not in COX-1 because of the
V523I substitution.
32
Other than this small
change, the binding site residues are identical within 3.5 Å
of the ligand in the COX-2 structure from PDB entry 6COX,
32
and the only other changes in the binding site are Arg
to His and Ala to Ser in a flexible loop adjacent to the ligand. Over
the years, this V523I difference has been exploited to design inhibitors
with exquisite selectivity of over 13000-fold for COX-2 relative to
COX-1.
33
The single extra methylene group
of Ile523 in COX-1 is enough to induce a significant clash with COX-2-specific
ligands, as seen in Figure 2. This example
illustrates how small changes in protein shape can be used to gain
substantial selectivity. However, it is important to note that otherwise
unfavorable interactions can be accommodated in some contexts because
of molecular plasticity and the resulting rearrangement of the protein
target. In the case presented above, COX-1 is not able to alleviate
the clash with the ligand through protein rearrangement. However,
to predictively exploit this effect, accurate assessments of the potential
for relieving unfavorable interactions must be made.
Figure 2
Shape complementarity
in specific COX-2 inhibition. The crystal
structure of the COX-2 complex from PDB entry 6COX
32
is overlaid with the apo crystal structure of COX-1 from PDB entry 3N8V.
181
The ligand is displayed in atom colored space filling.
The proteins are displayed as colored ribbons, and residues V523 from
COX-2 and I523 from COX-1 are displayed as colored balls and sticks.
The difference between the molecular surfaces of COX-2 residue V523
and COX-1 residue I523 is displayed in magenta.
In the case of COX-1/2, selectivity has been achieved
by designing
compounds that fit within and bind tightly to the larger site of COX-2
but clash with the smaller site of COX-1. That is, over 13000-fold
selectivity against the smaller binding site is achievable.
Given this finding, it is reasonable to ask whether similar selectivity
is achievable against a larger site by shape complementarity
alone. In cases where shape complementarity is the only mechanism
operating, selectivity against a smaller site primarily takes advantage
of the strongly repulsive van der Waals potential at short distances,
whereas the energetic driver for selectivity against a larger site
is the loss of favorable van der Waals and other interactions. The
nature of van der Waals interactions suggests that removing favorable
interactions will be a much weaker effect than introducing clashes.
Similarly, other interactions, such as π–π and
cation−π, are unlikely to exhibit as pronounced an effect
on binding as the repulsive van der Waals potential.
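The asymmetry described above can be illustrated with a 12-6 Lennard-Jones potential. The following is a minimal sketch, not taken from the work reviewed here: the well depth (0.2 kcal/mol), the optimal contact distance (4.0 Å), and the 1 Å displacements are illustrative assumptions for a single atom pair.

```python
def lennard_jones(r, epsilon=0.2, r_min=4.0):
    """12-6 Lennard-Jones energy (kcal/mol) for one atom pair.

    epsilon (well depth) and r_min (distance of the energy minimum, in Å)
    are illustrative assumptions, not fitted force-field parameters.
    """
    ratio = r_min / r
    return epsilon * (ratio**12 - 2.0 * ratio**6)


e_optimal = lennard_jones(4.0)  # ideal contact: energy equals -epsilon

# Penalty for compressing the contact by 1 Å (a clash with a smaller site).
clash_penalty = lennard_jones(3.0) - e_optimal

# Penalty for stretching the contact by 1 Å (a partly lost contact in a larger site).
loss_penalty = lennard_jones(5.0) - e_optimal

print(f"clash penalty: {clash_penalty:.2f} kcal/mol")
print(f"loss penalty:  {loss_penalty:.2f} kcal/mol")
```

Under these assumed parameters, compressing the contact costs far more than stretching it, and the cost of losing the contact entirely can never exceed the well depth, whereas the clash penalty grows steeply as the distance shrinks.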
In support
of this notion, a number of examples can be found in
HIV-1 protease involving binding of inhibitors to wild type and to
mutants that increase the size of the binding site, such as the I84V
mutation. Darunavir binds to wild-type protease with an affinity of
0.22 nM but to the I84V mutant with an affinity of 1.1 nM.
34
Structural analysis suggests that the smaller
valine residue has less favorable van der Waals interactions with
the ligand.
35
Apparently, neither the ligand
nor the protease has enough flexibility to restore the lost favorable
interactions, thereby resulting in a loss of potency. The change elicits
a modest selectivity of 5-fold in this case, which is far from the
13000-fold change observed in the case of COX-1/2, where a clash was
introduced. Other HIV-1 protease mutants suggest that binding to a
smaller site can yield 50-fold selectivity,
36
but we find no evidence of a larger effect. These examples are not
ideal, however, because the goal of drug design in these cases was
to optimize for broad binding to wild type and mutants rather than
for narrow selectivity. Only a study that optimizes for narrow selectivity
could establish how large a selectivity effect can be achieved against a larger site by
shape complementarity alone.
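To put the fold-selectivity values quoted above on a common energy scale, a ratio of dissociation constants can be converted to a binding free energy difference with ΔΔG = RT ln(ratio). The sketch below assumes a temperature of 298.15 K, which is not stated in the text, and the function names are hypothetical.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K)


def fold_selectivity(k_decoy, k_target):
    """Ratio of dissociation constants (same units for both)."""
    return k_decoy / k_target


def ddg_kcal(fold, temperature=298.15):
    """Binding free energy difference (kcal/mol) for a given fold selectivity.

    The temperature is an assumption; the source does not state one.
    """
    return R_KCAL * temperature * math.log(fold)


# Darunavir: 0.22 nM (wild type) versus 1.1 nM (I84V mutant).
darunavir_fold = fold_selectivity(1.1, 0.22)

for label, fold in [("darunavir WT vs I84V", darunavir_fold),
                    ("largest protease effect cited", 50.0),
                    ("COX-2 vs COX-1", 13000.0)]:
    print(f"{label}: {fold:.0f}-fold, {ddg_kcal(fold):.2f} kcal/mol")
```

On this scale the 5-fold darunavir effect corresponds to roughly 1 kcal/mol, while the 13000-fold COX-2 selectivity corresponds to roughly 5.6 kcal/mol.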
Small
differences in shape have also been exploited to gain selectivity
in the ATP binding pocket of kinases. Several isoquinoline and pyridine
derivatives have exhibited selectivity toward Rho-kinases, such as
ROCK-1, with a lower affinity for other kinases such as PKA, PRK2,
MSK1, and S6K1.
37
This selectivity was
attributed to five key residues in the ATP binding pocket of ROCK-1
(Met123, Ala142, Asp158, Ile186, and Phe327). Residue Phe327 is part
of a C-terminal strand that has only been found in a small subset
of kinases, including PKA, PKB, ROCK-1, and ROCK-2. For the other
four residues, sequence alignment of 491 kinases indicated that they
were relatively common, with frequencies of 25.1% (Met123), 28.9%
(Ala142), 32.2% (Asp158), and 37.9% (Ile186).
38
However, the specific combination of these residues found in ROCK-1
is rare and thus generates a uniquely shaped inhibitor binding pocket.
This allows for selective binding to ROCK-1 even though no single
residue is unique compared with other kinases.
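A rough calculation shows why the combination can be rare even though each residue is common. The sketch below multiplies the four quoted frequencies under the assumption that the positions vary independently across the 491 aligned kinases. That independence is an assumption made here for illustration only; it is not a claim from the cited alignment, and real sequence positions may be correlated.

```python
from math import prod

# Frequencies of the ROCK-1 residue type at each position, as quoted above.
frequencies = {
    "Met123": 0.251,
    "Ala142": 0.289,
    "Asp158": 0.322,
    "Ile186": 0.379,
}
n_kinases = 491

# ASSUMPTION: positions are statistically independent (illustration only).
joint_frequency = prod(frequencies.values())
expected_matches = joint_frequency * n_kinases

print(f"joint frequency: {joint_frequency:.4f}")
print(f"expected kinases with all four residues: {expected_matches:.1f}")
```

Under the independence assumption, fewer than 1% of kinases (about four of the 491) would be expected to share all four residues, before even considering Phe327.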
While a clash introduced against
one conformation of a decoy can potentially be alleviated by reorganization
of the decoy structure,
39
in many cases,
as has been shown here, the binding pocket is rigid enough to avoid
this problem. As another example of kinase selectivity arising from
shape changes, a series of pyridinylimidazole p38 MAPK inhibitors
from Vertex Pharmaceuticals
40
was shown
to attain selectivity through specific interactions with a single
residue (Thr106), which is different in other MAP kinases such as
JNK1 (methionine) or ERK1 (glutamine). Treatment with one of the pyridinylimidazole
derivatives reduced the p38 kinase activity to approximately 20% at
30 μM, whereas the p38 mutant T106M showed approximately 80%
kinase activity at the same ligand concentration, highlighting the
direct effect of this single residue.
Molecular shape can be
accounted for in a number of ways using
computational methods. Ligand-based methods that use shape overlap,
such as ROCS
41
or Phase Shape,
42
operate by superimposing molecules onto the
shape of a known active molecule in its actual or putative bioactive
conformation. This general approach is attractive because it can retrieve
molecules that are able to adopt a similar three-dimensional structure
to active molecules that are known to fit into the target binding
site of interest. ROCS has been applied successfully to a number of
drug-design projects, including the design of small molecule inhibitors
of the ZipA–FtsZ protein–protein interaction, an antibacterial
drug target.
43
While we have been unable
to find a publication highlighting shape-based screening tools being
applied directly to selectivity, it is possible that the approach
could be used to design for either narrow or broad selectivity by
requiring a high degree of shape complementarity with the target(s)
of interest while not matching the shapes of undesirable decoy targets.
For example, a screening protocol could be developed where compounds
are screened against an ensemble of desirable target shapes and undesirable
decoy shapes. These shapes could be derived from active molecules
for the desirable and undesirable targets. An objective function could
then be developed to tune the level of selectivity, where a baseline
level of similarity is desired for the target shapes while ensuring
that there is a relatively low level of similarity to the decoy shapes.
More sophisticated objective functions could be developed that look
at specific regions of the shapes around areas that are known or hypothesized
to be associated with narrow selectivity, since an agnostic approach
to the shapes may result in designing differences in solvent exposed
regions that might not significantly impact selectivity.
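One possible form of such an objective function is sketched below. It is a hypothetical illustration rather than a published protocol: it assumes that shape similarity scores between 0 and 1 (for example, shape Tanimoto values) have already been computed by some shape-overlay tool, and the threshold, weight, and compound names are invented for the example.

```python
def selectivity_score(target_sims, decoy_sims,
                      min_target_sim=0.7, decoy_weight=1.0):
    """Score a compound from precomputed shape similarities (0 to 1).

    target_sims: similarity to each desirable target shape.
    decoy_sims:  similarity to each undesirable decoy shape.
    min_target_sim and decoy_weight are hypothetical tuning parameters.
    """
    worst_target = min(target_sims)  # broad coverage: every target must match
    if worst_target < min_target_sim:
        return float("-inf")         # fails the baseline similarity requirement
    # Penalize the closest decoy match to favor narrow selectivity.
    return worst_target - decoy_weight * max(decoy_sims)


# Hypothetical compounds: (similarities to targets, similarities to decoys).
compounds = {
    "cmpd_a": ([0.82, 0.78], [0.40, 0.35]),
    "cmpd_b": ([0.90, 0.85], [0.88, 0.60]),
    "cmpd_c": ([0.65, 0.91], [0.20, 0.10]),
}

scores = {name: selectivity_score(t, d) for name, (t, d) in compounds.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

Taking the minimum over targets enforces the baseline similarity to every desirable shape, while taking the maximum over decoys penalizes the single closest undesirable match. A region-weighted similarity could be substituted for the whole-molecule score to focus on areas hypothesized to control selectivity.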
In
summary, shape complementarity is a vital aspect of molecular
recognition. Identifying differences in shape, even small differences,
can be a powerful approach to gain selectivity across a series of
related proteins. The examples of COX-2 and COX-1 above highlight
that very large gains in selectivity can be realized by binding to
a site or subsite that is larger in the target of interest than in
the decoys, suggesting that differences of this type should be one
of the first things to consider when designing for selectivity. In
the case of HIV-1 protease, it was shown that selectivity could be
gained in the context of binding to a smaller subsite, although the
changes were less pronounced because of the asymmetry of the van der
Waals potential.
While modeling of shape complementarity may
at first seem to be
trivial, the negative design aspect effectively requires a rigorous
consideration of protein flexibility, since induced-fit effects will
always act to increase the binding affinity for the true bound decoy
structure compared to the rigid decoy structure. Understanding the
subtleties and challenges of receptor flexibility is an essential
part of selectivity design and will be discussed in more detail in
the section entitled Conformational Selection and
Flexibility. In addition to protein flexibility, ligand flexibility
could also be a determinant of shape-based selectivity. To achieve
this, ligand modifications could be made to lock a molecule into a
conformation that can be better accommodated by one target than another.
This has proven to be a useful strategy in gaining binding affinity,
but the literature does not appear to contain any direct applications
to selectivity design. It is clear that leveraging differences in
shape complementarity can be an effective strategy in selectivity
design, although the outcomes will be context dependent and difficult
to predict from a simple analysis of rigid shapes because of the ability
of proteins to relax in order to alleviate unfavorable interactions.
Electrostatic Complementarity
Electrostatics encompasses
interactions among charged groups, neutral polar groups, and solvent.
Electrostatic complementarity is necessarily a more complex concept
than shape complementarity because interfacial polar and charged groups
generally pay a desolvation penalty when moving from an aqueous environment
in the unbound state to a partially or fully desolvated one in the
bound state. In favorable circumstances, the desolvation penalty is
outweighed by the complementary new interactions formed between charged
or polar groups across the interface, thereby resulting in a net gain
in binding affinity. In less opportune situations, the favorable interactions
are outweighed by the unfavorable desolvation and a net loss in binding
affinity is observed. Because charged and neutral polar groups have
significantly different desolvation penalties and improving binding
affinity involves a fine balance between maximizing favorable interactions
while minimizing the unfavorable desolvation penalty, deciding the
most complementary group for a particular site is nontrivial. So-called
electrostatic charge optimization theory provides both a useful definition
and a method of computing electrostatic complementarity.
44,45
Electrostatic complementarity, while conceptually more complex
than shape complementarity, is often easier to apply as a tool to
design selective compounds. This is consistent with the longstanding
view that salt bridges and electrostatic interactions can be used
to explain and design specificity in protein folding and molecular
recognition.
46−48
Whereas small changes in protein conformation can
relieve a shape clash introduced to disfavor binding to a decoy, such
changes in protein conformation cannot as easily relieve an electrostatic
repulsion introduced to achieve the same goal. This is due to the
longer-range nature of electrostatic interactions compared to excluded-volume
repulsion. In each case, the target must tolerate the interaction
introduced to negatively affect the decoy. There are numerous examples
where this objective has been achieved for the binding of naturally
occurring protein binding partners.
49,50
The general
notion that electrostatic selectivity can be sought by identifying
differences and similarities in polar and charged environments in
binding sites across the set of targets and decoys is largely applicable,
subject to the caveats above as well as the limited range of charge
distributions obtained through available chemistries and geometric
constraints.
Continuum electrostatic theory has been used to
systematically
explore the relationship between the distribution of polarity within
a molecule and the relative promiscuity of its binding interactions.
51
The results suggest that polar and charged molecules
will tend to have narrower binding selectivity compared to less polar
molecules, which will tend to be more promiscuous. This is due to
the strong orientational dependence of electrostatic interactions,
making polar and charged molecules more sensitive to molecular shape
than less polar molecules. It is also due to the nature of chemical
space that, on average, provides more partners for less polar molecules.
One might imagine that increased molecular flexibility would lead
to greater promiscuity because a molecule can reconform to bind different
partners. Interestingly, this study found the opposite for polar and
charged molecules: increased flexibility allowed the attainment of
especially favorable electrostatic interactions with a small number
of binding partners, leading to narrowed selectivity compared to less
polar molecules with the same shape and conformational degrees of
freedom.
51
Positive and Negative Design with Electrostatic Interactions
Differences in the pattern of hydrophobic, polar, and charged groups
across potential binding partners can be exploited through positive
design (the introduction of groups that make especially good interactions
with targets) and negative design (groups that make especially unfavorable
interactions with decoys but are tolerated by targets). As illustrations
of these concepts, examples from blood clotting factors and signaling
kinases are discussed here. Each requires narrow selectivity, to some
extent, because of the large number of related enzymes: serine proteases
for the case of clotting and kinases for the case of signaling.
Clot formation is induced through one mechanism by a cascade of at
least 20 interactivating proteins, including thrombin and factors
V (Va), VIII (VIIIa), IX (IXa), and X (Xa).
52
Many cardiovascular patients are on long-term anticoagulation therapy,
53
which has proven difficult to develop for robust
implementation across a broad patient population without careful monitoring,
although recently approved entities promise improvement.
54−56
Comparative molecular field analysis (CoMFA) and comparative molecular
similarity indices analysis (CoMSIA) have been used to identify electrostatic
differences among the binding sites of serine protease blood clotting
factors thrombin, factor Xa, and the structurally related trypsin.
57
These analyses identified a region in which increasing the
negative electrostatic potential would enhance selectivity toward
trypsin. An inhibitor placing an electronegative ester into this region
shows increased selectivity, binding to trypsin (pKi = 7.10) more tightly
than to thrombin (pKi = 5.68). Conversely, in the context of a similar scaffold,
an inhibitor placing a methylsulfonyl group into this area shows an
inverted selectivity profile, binding to thrombin (pKi = 8.38) more tightly
than to trypsin (pKi = 6.77).
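Because pKi is the negative base-10 logarithm of Ki (in molar units), a difference in pKi converts directly to a fold selectivity of 10 raised to that difference. The short sketch below applies this to the values quoted above; the function name is hypothetical.

```python
def fold_from_pki(pki_preferred, pki_other):
    """Fold selectivity implied by two pKi values (pKi = -log10 of Ki in M)."""
    return 10 ** (pki_preferred - pki_other)


# Ester-containing inhibitor: trypsin preferred over thrombin.
ester_fold = fold_from_pki(7.10, 5.68)

# Methylsulfonyl-containing inhibitor: thrombin preferred over trypsin.
sulfonyl_fold = fold_from_pki(8.38, 6.77)

print(f"ester: {ester_fold:.0f}-fold for trypsin")
print(f"methylsulfonyl: {sulfonyl_fold:.0f}-fold for thrombin")
```

The two inhibitors therefore show selectivities of roughly 26-fold and 41-fold in opposite directions.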
In the case of thrombin and factor Xa, differences
in electrostatics
within the S1 pocket have been exploited to provide selectivity.
52
Position 192 is highly variable across the coagulation
serine proteases and is a glutamate in thrombin but a glutamine in
factor Xa. An inhibitor developed by Boehringer
58
provides a good example of position-192 dependent selectivity,
where a high degree of selectivity for factor Xa (Ki = 41 nM) over thrombin
(Ki > 2000 μM) was achieved by negative design, introducing a carboxylate
group that is electrostatically repelled by the nearby thrombin Glu192 side
chain. Crystal structure examination shows that the carboxylate is
tolerated in factor Xa partially by hydrogen bonding with Gln192,
which goes some way toward compensating the carboxylate desolvation.
The corresponding methyl ester derivative of the inhibitor was nonselective.
Quantum mechanical methods have also been exploited to elucidate the
relative electrostatic potentials of the S4 subsite, locating a large
negative potential that is present in factor Xa but absent in thrombin.
59
Combining these findings suggests that tuning
the electrostatic properties of an inhibitor in these three regions
of thrombin, factor Xa, and trypsin can be sufficient to gain selectivity
for one of the targets.
Electrostatics has also proven key in
selectivity for protein tyrosine
phosphatases (PTPs). In the case of the drug target PTP1B, the negatively
charged Asp48 presents an opportunity for narrow selectivity in ligand
binding because many PTPs contain an uncharged asparagine at this
position. This has been exploited by introducing a positive charge
into an existing inhibitor at an appropriate position to form a salt
bridge with the Asp48 in PTP1B. This was expected to decrease the
affinity for other PTPs due to the lack of strongly compensating interactions
with the Asn residue to balance the ligand desolvation penalty.
60
In agreement with this prediction, a new compound
containing a basic nitrogen was found to have an increased affinity
for PTP1B of about 20-fold and showed high selectivity for PTP1B versus
all other PTPs tested. This can be explained by analyzing the interactions
seen in Figure 3, showing the favorable charge
complementarity between PTP1B and the basic nitrogen, which is absent
in the other receptor–ligand pairs.
60
Figure 3
Electrostatic
complementarity in specific PTP1B inhibition: (A)
structure of PTP1B in complex with a PTP1B specific cyclic amine from
PDB entry 1C88;
60
(B) structure of PTP1B in complex
with a cyclic ether from PDB entry 1C87;
60
(C) structure
of the PTP1B R47V/D48N double mutant in complex with a PTP1B specific
cyclic amine from PDB entry 1C86;
60
(D) modeled structure
of the PTP1B R47V/D48N double mutant in complex with a cyclic ether.
The ligands are displayed as atom colored balls and sticks with green
carbons and a transparent surface colored by electrostatic potential.
The protein surfaces are displayed in wireframe and colored by electrostatic
potential. Residues R47/V47 and D48/N48 are displayed in atom-colored
ball and stick representation with gray carbons.
Electrostatic Charge Optimization Applications for Selectivity
Developing a high-affinity inhibitor involves finding a balance
between the favorable intermolecular interactions and the unfavorable
desolvation penalty suffered when a ligand binds to a receptor. To
achieve this, continuum electrostatic models have been developed to
optimize the charge distribution of the ligand and yield the most
beneficial balance of these opposing contributions.
45
This method of charge optimization can be used to minimize
the electrostatic binding free energy
61
and has been applied in drug design to analyze and improve potency.
62−64
The concept of charge optimization is illustrated in Figure 4A. More recently, the
charge optimization methodology
has been applied to selectivity design using a formalism that simultaneously
considers panels of desired targets and undesired decoy receptors.
Within this framework it is possible to tailor a ligand for narrow
selectivity, broad selectivity, or a combination of the two. The framework
illustrates clearly the requirement that selectivity gains generally
come at a cost in optimal target affinity, with greater gains requiring
greater cost.
65
Specificity charge optimization
is illustrated in Figure 4B. This approach
has been applied to inhibitors of HIV-1 protease, where both broad
and narrow selectivity were investigated.
28
Narrow selectivity was explored with the promiscuous aspartyl protease
inhibitor pepstatin to predict modifications that would increase the
relatively weak affinity of pepstatin for HIV-1 protease and decrease
the affinity for the related proteases pepsin and cathepsin D. The
N-terminal portion of pepstatin was identified as the key specificity-determining
region, in line with experimental work showing that N-acetyl pepstatin has increased potency
toward HIV-1 protease (Ki = 20 pM) but is not a known binder of pepsin or cathepsin
D. In the same work, broad selectivity was explored with a set of
clinically approved HIV-1 protease inhibitors to probe interactions
that could broaden their affinity toward both wild-type HIV-1 protease
and drug-resistant mutants. Saquinavir in particular was found to
have a narrow selectivity profile toward the wild-type protease, in
agreement with experimental data showing that saquinavir suffers markedly
from resistance mutations.
66
Modifications
to saquinavir and other approved HIV therapeutics were proposed to
improve the broad selectivity binding profiles, although experimental
validation of these compounds was not pursued.
Figure 4
Charge optimization.
(A) Affinity optimization, with a single well-defined
minimum. The green line is the favorable Coulombic interaction between
two opposite charges. The blue curve is the quadratic desolvation
penalty, and the black line is the sum of the two (i.e., total electrostatic
energy). Optimal charge is denoted with a black dot. (B) Specificity
optimization with two proteins (red and orange curves). Only the total
electrostatic energy is shown. The affinity optimal charge for each
curve is denoted with a dot. The specificity optimal charge, which
maximizes the energy difference between the curves, is denoted with
a starburst. Note that the specificity optimum for the orange curve
is theoretically unbounded, but limits on chemically and biologically reasonable
charge space restrict the maximum charges. Furthermore, in most cases,
high specificity is desirable but a baseline level of affinity (ΔGmax)
to the primary target is needed to achieve
efficacy, as shown by the light orange starburst.
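The picture in Figure 4 can be illustrated numerically. The following Python sketch assumes a single ligand charge q, a linear Coulombic term, and a quadratic desolvation penalty, as in the caption. The coefficient values, the charge bounds, and the function names are hypothetical and chosen only for illustration; they do not come from any of the studies cited here.

```python
def binding_energy(q, coulomb, desolv):
    """Electrostatic binding energy: linear interaction plus quadratic desolvation."""
    return coulomb * q + desolv * q * q


def affinity_optimum(coulomb, desolv):
    """Charge that minimizes coulomb*q + desolv*q**2 (requires desolv > 0)."""
    return -coulomb / (2.0 * desolv)


def specificity_optimum(target, decoy, q_min, q_max, dg_max, steps=2000):
    """Grid search for the charge that maximizes (decoy energy - target energy).

    target and decoy are (coulomb, desolv) pairs. Charges are restricted to
    [q_min, q_max], and charges whose target binding energy is weaker than
    dg_max are rejected, mirroring the baseline-affinity requirement.
    """
    best_q, best_gap = None, float("-inf")
    for i in range(steps + 1):
        q = q_min + (q_max - q_min) * i / steps
        dg_target = binding_energy(q, *target)
        if dg_target > dg_max:
            continue  # not enough affinity for the primary target
        gap = binding_energy(q, *decoy) - dg_target
        if gap > best_gap:
            best_q, best_gap = q, gap
    return best_q, best_gap


# Hypothetical coefficients (kcal/mol per e and per e^2), illustration only.
target = (-4.0, 2.0)
decoy = (-2.0, 2.0)
print("affinity-optimal charge for target:", affinity_optimum(*target))
print("specificity optimum (q, gap):",
      specificity_optimum(target, decoy, q_min=-2.0, q_max=2.0, dg_max=-1.0))
```

With these toy coefficients the two curves share the same desolvation curvature, so the energy gap grows linearly with charge and is limited only by the charge bounds and the affinity threshold. The selected charge (about 1.7 e) lies well away from the affinity optimum of 1.0 e, which illustrates the affinity cost of a selectivity gain.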
Charge optimization has also been applied in a
theoretical probe-based
approach that simulates binding of a model ligand to a target receptor
in order to understand general principles associated with selectivity.
67
The outcome of this analysis is a representation
of the protein surface that gives the sign and magnitude of the complementary
charge at a given location and also the strictness of selection for
this optimal charge.
68
Highly selective
sites have a steep curvature in the charge dependence of the binding
free energy around the optimal charge, whereas sites with low selectivity
have a shallow curvature. This analysis has been used to examine the
change in binding affinity within a series of trypsin inhibitors.
The trypsin profile shows one region with relatively low charge selectivity
for a small and positive optimal charge, which is consistent with
the experimental data showing that p-carboxybenzamidine
binds only 1.8 kcal/mol more weakly than p-aminobenzamidine. This indicates
that trypsin prefers the neutral
amino H-bond donor but will accept a negatively charged carboxylate
group in this region with a relatively small loss in binding affinity.
In contrast, there is a region of high selectivity for a positive charge
predicted in the S1 subsite of trypsin. Experimentally, the binding
affinity of P1-Met BPTI is 7.4 kcal/mol worse than P1-Lys BPTI, indicating
the strong selectivity for a positively charged group in this site,
in agreement with the charge optimization predictions. This concept
was recently extended to predict a coupled charge selectivity (CSq),
which is defined as the energetic cost of changing an atomic charge
by one electron charge from its optimal value while allowing all other
charges in the molecule to reoptimize.
69
The CSq method was applied to inhibitors of COX-2 such as celecoxib,
which have nanomolar affinity for carbonic anhydrase II (CAII). The
CSq analysis identified that the ionized sulfonamide group of celecoxib
was well optimized to bind CAII and was highly charge selective whereas
there was little charge selectivity of this group binding to COX2.
Studies have demonstrated that the sulfonamide group can be replaced
with the isosteric sulfomethyl group without impacting the COX2 inhibition,
in agreement with the computational predictions.
70
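The idea of a coupled charge selectivity can be made concrete under a simple assumption. If the electrostatic binding free energy is taken to be a quadratic function of the vector of ligand charges, which follows from a linear interaction term plus a quadratic desolvation penalty, then the cost of displacing one charge by one unit while the others reoptimize has a closed form. The symbols below (A, q*, δ) are our own notation for this sketch and are not necessarily those used in the cited work.

```latex
% Assumed quadratic form around the optimal charge vector q^*,
% with A symmetric and positive definite:
\Delta G(\mathbf{q}) = \Delta G^{*} + (\mathbf{q}-\mathbf{q}^{*})^{\mathsf{T}} A \,(\mathbf{q}-\mathbf{q}^{*})

% Uncoupled cost of moving charge i by one unit, all other charges fixed:
\Delta\Delta G_i^{\text{fixed}} = A_{ii}

% Coupled cost: minimize \delta^{\mathsf{T}} A \,\delta subject to \delta_i = 1.
% A Lagrange multiplier gives \delta = A^{-1}\mathbf{e}_i / (A^{-1})_{ii}, hence
\Delta\Delta G_i^{\text{coupled}} = \frac{1}{(A^{-1})_{ii}} \;\le\; A_{ii}
```

Under this assumed form the coupled value never exceeds the uncoupled one, because the remaining charges can only relax to lower the penalty. A large coupled value therefore marks a position where charge is tightly selected.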
The examples detailed above illustrate that charge
complementarity
is an important design principle and can be used effectively in the
lead-optimization process. In many cases, electrostatic complementarity
design can be harnessed to achieve high affinity for the target(s)
of interest as well as a desirable selectivity profile. However, it
is often impossible to design a molecule with optimal charges, as
the limits of chemical space restrict the range of charge distributions
that can be attained within a molecule. Furthermore, even when a desirable
charge distribution can be attained to design narrow selectivity toward
a target receptor and against a panel of decoy receptors, it is possible
for the decoys to relax to alleviate some of the unfavorable electrostatic
interactions. This relaxation includes both conformational changes
(i.e., induced fit) and tautomeric and ionization state changes (i.e.,
His, Asp, and Glu adopting different protonation states). The range
of relaxation effects has not been fully explored in previous applications
of charge optimization and could add significant challenges to the
application of the method. However, these relaxation effects can be
accounted for within the charge optimization framework through the
addition of multiple conformational states of each decoy receptor.
It is also important to note that certain charge distributions may
be chemically accessible but physiologically undesirable. For example,
charged molecules and zwitterions are often undesirable for intracellular
protein targets because of limited cell permeability. In addition,
the optimal charges for selectivity may be undesirable for other reasons
such as solubility, kinetics, or clearance.
In summary, differences
in electrostatics between otherwise similar
targets can be effectively exploited by utilizing techniques such
as molecular field analysis and specificity charge optimization. The
magnitude of selectivity gained through electrostatic complementarity
may be modest relative to introducing a shape change that creates
a steric clash, but the effects of changes in electrostatics tend
to be more predictable than the effects of changes in shape due to
the smoother form of the energy surface and the long-range character
of electrostatics relative to van der Waals interactions. Furthermore,
the long-range nature of electrostatic forces allows for modulation
of binding affinity from interactions with residues distal from the
binding site,
71−73
suggesting that binding selectivity can be derived
from long-range electrostatic interactions as well. In short, relatively
small receptor induced-fit effects can more easily eliminate unfavorable
steric clashes than electrostatic incompatibility. This makes optimization
of electrostatic interactions a general mechanism for improving selectivity
whenever the target of interest and the decoys have differing charge
profiles.
Conformational Selection and Flexibility
The above
discussion focuses on the molecular properties of shape and electrostatics
and describes examples in which similarities among targets and differences
from decoys could be identified in these properties. It is interesting
and perhaps underappreciated that the molecular property of flexibility
can differ sufficiently between proteins with similar binding sites
to be a handle for attaining selectivity goals. One simple paradigm
involves a target and a decoy that both have similar binding sites
in terms of shape and electrostatic patterning, but the target is
more deformable than the decoy. An inhibitor that binds to the deformed
active site could then be designed to obtain selectivity for the target
over the decoy. It is essential that the deformation has a relatively
small energetic penalty in order to avoid too great a sacrifice in
affinity. Predicting the energy associated with these structural rearrangements
has been successful in a small number of very long time scale simulations
run on specialty hardware,
74
but this remains
a challenging area of research.
Perhaps the most renowned cases
of selectivity deriving from protein flexibility come from kinases,
27,75
and a great deal of experimental data exist for kinase selectivity
profiles.
21
A number of strategies have
been used to achieve kinase selectivity by considering shape and protein
flexibility.
9
One key notion has been to
target an inactive conformation of a particular kinase,
76
which may be inaccessible or very energetically
unfavorable for undesired targets. The primary structural change is
a movement of the activation loop (also called the DFG loop), which
opens up a deeper, more hydrophobic binding site that is adjacent
to the traditional ATP binding site. While all kinases have the activation
loop (which typically contains the DFG amino acid motif), the transition
to the inactive DFG-out state has not been observed in all kinases,
thereby offering a potential mechanism to gain selectivity. In the
development of imatinib, it was found that selectivity was achieved
by binding to the DFG-out conformation of the Abl kinase,
77
which also produced a desirable pharmacological
profile.
9
Another compound that binds to
p38 MAP kinase, doramapimod (BIRB796),
78
also targets an inactive kinase conformation and had great promise
for its affinity and selectivity profile. Unfortunately, clinical
success has not been on par with imatinib. Doramapimod was subsequently discontinued
from clinical trials because of lack of efficacy for the primary indications
and the development of liver function abnormalities.
79
However, a number of compounds that target kinases with
known DFG-out conformations are actively being pursued. These targets
include Aurora A,
80,81
cFMS,
82
EGFR,
83,84
KIT,
85
and PYK2.
86
A relatively recent computational method has
been published to convert kinase structures to the DFG-out form,
87
which can then be used for virtual screening
and structure-based lead optimization. In theory, this is an excellent
idea, but it is difficult to know whether the converted kinase structure
is energetically accessible, and therefore, the utility of such a
method still needs to be proven in prospective studies.
Selectivity
originating from protein flexibility has been observed
in many other protein classes as well. For example, crystal structure
analysis and docking studies have shown that selectivity between different
species of thymidylate synthase (TS) can be attributed to protein
flexibility.
88
In this case, the objective
was to target bacterial TS proteins and not the corresponding human
protein. The most selective inhibitors in this study were found to
bind 35-fold tighter to L. casei and 24-fold tighter
to E. coli compared with human TS. Studies of rigid
receptor docking to previously known crystal structures were not able
to accurately predict the pose for the most selective compounds. However,
a crystal structure of E. coli TS solved by the authors
of this work revealed substantial rearrangements of the protein, both
in the binding site and distal to the ligand. The greatest backbone
movements were in excess of 6.0 Å, highlighting the challenge
that protein flexibility presents. Variations in protein flexibility
have also been proposed as the origin of selectivity of carboxamide
analogues of zanamivir binding to influenza virus sialidase type A
preferentially over type B.
89
In this case,
the increased potency of some analogues was attributed to the formation
of an intramolecular salt bridge in the ligand. Interestingly, molecular
dynamics (MD) simulations predict that there is substantially more
rearrangement of sialidase type B than type A in order to accommodate
the intramolecular salt bridge. The authors propose that this additional
rearrangement in type B in order to accommodate the intramolecular
salt bridge comes at a significant energetic cost, thereby reducing
the potency of the zanamivir analogues to sialidase type B even though
they can still match the shape of the binding site.
Finally,
researchers at Bristol-Myers Squibb were able to develop
TNF-α converting enzyme (TACE) inhibitors with high selectivity
versus other similar matrix metalloproteinases (MMPs) by taking advantage
of differences in protein flexibility.
90
For example, the inhibitor in PDB structure 2FV5(91) exploits flexibility in the loop
forming the S1′ pocket
(Pro437-His444) of TACE to gain selectivity over other MMPs. The movement
in the 2FV5 structure
is substantial and unique compared with other TACE structures, such
as 3KMC(92) (Figure 5A). Interestingly,
this inhibitor has a slow koff, a factor
that is important in controlling pharmacodynamics. The observed kinetics
may be related to the induced fit required for binding. In order to
understand this selectivity, the authors built a homology model of
TACE on the crystal structure of atrolysin, a related member of the
reprolysin family. They identified that the S1′ pocket shows
substantial differences when compared with MMP-3, such as an alanine
residue in TACE replaced by a tyrosine in MMP-3. After several TACE
crystal structures were solved, it became apparent that the selectivity
toward TACE was due not only to the shape difference but also to the
additional flexibility of the TACE loop in the S1′ pocket that
was allowed by the smaller residues in TACE. Larger residues in other
MMPs, such as MMP-3 (PDB code 2JT5(93)) and MMP-9 (PDB code 2OW0(94)), retard
this flexibility, disfavoring ligand binding. This can be seen in
Figure 5B. Interestingly, these differences
in flexibility are suggested by analysis of the B-factors in the loop
residues of the crystal structures, as shown in Figure 5C. With careful analysis of
crystallographic data, consideration
of such differences in B-factors may prove useful for gaining selectivity
in other systems.
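A B-factor comparison of this kind is straightforward to script. The Python sketch below averages atomic B-factors per residue over a chosen residue range, reading the fixed-column ATOM records of the PDB format (chain identifier in column 22, residue number in columns 23-26, B-factor in columns 61-66). The function names are hypothetical, and the records in the example are synthetic values generated for the demonstration, not data from 3KMC, 2OW0, or any other structure.

```python
from collections import defaultdict


def atom_line(serial, name, resname, chain, resseq, b):
    """Build a fixed-column PDB ATOM record with dummy coordinates (demo data only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:3s} {chain}{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{b:6.2f}")


def mean_bfactor_per_residue(pdb_lines, chain, first, last):
    """Average the B-factor over all atoms of each residue in [first, last]."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skip HETATM, REMARK, and other record types
        if line[21] != chain:
            continue
        resseq = int(line[22:26])
        if first <= resseq <= last:
            sums[resseq] += float(line[60:66])
            counts[resseq] += 1
    return {res: sums[res] / counts[res] for res in sorted(sums)}


# Synthetic records: two atoms each for residues 10 and 11 of chain A.
records = [
    atom_line(1, "N", "GLY", "A", 10, 20.0),
    atom_line(2, "CA", "GLY", "A", 10, 22.0),
    atom_line(3, "N", "ALA", "A", 11, 40.0),
    atom_line(4, "CA", "ALA", "A", 11, 44.0),
    atom_line(5, "N", "SER", "B", 12, 99.0),  # other chain, ignored
]
print(mean_bfactor_per_residue(records, "A", 10, 12))
```

Raw B-factors depend on resolution and refinement protocol, so a comparison between two crystal structures is usually more meaningful after normalizing each structure, for example by its own mean and standard deviation. That step is omitted here for brevity.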
Figure 5
Protein Flexibility of TACE and MMPs. S1′ loop
in TACE and
related MMPs showing conformational flexibility that leads to selectivity.
(A) TACE structure 2FV5 (cyan) shows significant movement in the S1′ loop (red oval)
to accommodate the larger quinolone ring of the 2FV5(91) inhibitor relative to the
3KMC(92) (orange)
structure. (B) Overlays of TACE and MMP structures with the ligand
from 2FV5 for
reference showing side chains proximate to the quinolone ring in space
filling representation. TACE crystal structure before induced fit
(orange) shows clashes with the ligand. The small side chains in TACE
allow loop movement that can accommodate the quinolone ring (cyan).
The MMP-3 structure 2JT5(93) (green) and MMP-9 structure 2OW0(94) (yellow) with larger
residues show that the ligand could
not fit without substantial rearrangement of the S1′ loop,
which might not be possible because the larger side chains make interactions
with other protein residues that stabilize the loop (adjacent residues
not shown for clarity). (C) TACE (3KMC, left) and MMP-9 (2OW0, right) with S1′
loop colored by B-factor (blue = low; red = high). Gly442 in TACE
(circled in red) allows for increased flexibility of the S1′
loop.
These examples, in addition to other published
work,
25
highlight the importance of considering
multiple
protein conformations when modeling selectivity in order to sample
different binding site shapes effectively. Numerous methods have been
developed to account for protein flexibility, generally through a
combination of protein sampling and ligand docking, although they
have not been applied directly toward selectivity design.
95−99
Furthermore, the success of these methods depends heavily on the
complexity of the motion in the receptor required to accommodate the
ligand, where side chain rotamer changes are generally predicted more
successfully than large-scale backbone movements. Once a reasonable
receptor structure (or ensemble of receptor structures) is generated,
techniques for estimating binding free energy can be applied to predict
differences in potency.
Explicit Water Molecules Bound at Target Site
Just
as similarities and differences in shape, electrostatics, and flexibility
among targets and decoys can form the basis of selectivity enhancing
design efforts, so can differences in the location and thermodynamics
of binding-site water molecules.
100
Even
in cases where the binding sites are highly similar, there can still
be key differences in the location and thermodynamic profile of water
molecules.
101
A simple paradigm illustrating
this idea is a decoy active site with a tightly bound (favorable)
water molecule at a position in which the target has a loosely bound
(unfavorable) one; an inhibitor that displaces each of the water molecules
to make identical interactions with the target and decoy active sites
gains a selectivity advantage in binding target over decoy due to
the relative water-displacement costs. This newly appreciated role
for water molecules in selectivity is in addition to their
key roles in molecular recognition,
102,103
computational drug design,
104
and metabolism
prediction.
105
A review by Cozzini et al.
presents a number of examples of rational methods that have been used
to understand the role of water in binding affinity.
106
In most cases, visualization of crystal structure water
molecules cannot explain their thermodynamic properties and it is
difficult to use simple empirical rules for determining whether to
displace a water molecule or form a bridging interaction.
107
Furthermore, bridging interactions with water
molecules can be either favorable
108,109
or unfavorable,
110
depending on the system. Therefore, more sophisticated
methods for characterizing water molecules have been developed, as
described below.
Free Energy Simulations
The most direct approach to
compute the thermodynamic stability of a water molecule in a given
environment is to use rigorous free energy methods,
111,112
such as free energy perturbation (FEP)
113−115
or thermodynamic integration (TI).
115,116
These methods
are general and can be applied to any molecule of interest or any
part of a molecule. It is thus possible to grow or annihilate a water
molecule to determine its thermodynamic contribution to binding. An
FEP approach has been applied in the Jorgensen group by Michel et
al. to assess the contribution of water molecules to binding affinity.
117,118
While their aim was not solely to determine the free energies of
binding site water molecules, they demonstrated that incorporation
of the water energetics could lead to improved reproduction of experimental
binding energies when combined with their FEP implementation in the
program MCPRO. However, it is important to note that both FEP and
TI are very sensitive to the implementation details. Without the proper
constraints on the system it is possible for the annihilation of one
water molecule to leave a hole that is filled by another water molecule.
This yields an uninformative or even misleading result regarding the
energetic contribution of the presence or absence of the water molecule.
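For reference, the two estimators named above are commonly written in the following textbook forms. The notation (U for potential energy, λ for the coupling parameter, states 0 and 1 for the end points) is ours, and the last line is a schematic of how a water molecule's stability relative to bulk is often assembled from two decoupling calculations. It should not be read as the specific protocol of the studies cited.

```latex
% Free energy perturbation (exponential averaging over state 0):
\Delta G_{0\to 1} = -k_{\mathrm{B}}T \,
  \ln \left\langle \exp\!\left[-\frac{U_1 - U_0}{k_{\mathrm{B}}T}\right] \right\rangle_{0}

% Thermodynamic integration along a coupling parameter \lambda:
\Delta G_{0\to 1} = \int_{0}^{1}
  \left\langle \frac{\partial U(\lambda)}{\partial \lambda} \right\rangle_{\lambda} d\lambda

% Schematic stability of a site water relative to bulk
% (decoupling in bulk minus decoupling in the site, plus a restraint
% and standard-state correction):
\Delta G_{\text{water}}^{\text{site}} \approx
  \Delta G_{\text{decouple}}^{\text{bulk}} - \Delta G_{\text{decouple}}^{\text{site}}
  + \Delta G_{\text{corr}}
```

The correction term is where the restraints discussed above enter. Without some way of keeping other water molecules out of the vacated site, the decoupling leg in the site no longer measures the removal of a single water molecule.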
While the application of explicit solvent free energy methods to
selectivity has been limited, there are cases where calculations have
been helpful in providing qualitative and quantitative insights. Of
particular interest is the case of differential binding of a single
compound to a wild-type and mutant protein. For example, Pearlman
and Connelly were able to accurately compute the energetic difference
of tacrolimus (FK506) binding to wild-type and Y82F mutant FKBP-12.
119
The authors attributed the higher affinity
of tacrolimus for the wild-type protein to a more favorable entropy
change associated with the release of water molecules when the ligand
binds.
Inhomogeneous Solvation Theory
Another computational
approach to assess the thermodynamic properties of binding site water
molecules, inhomogeneous solvation theory, was proposed by Lazaridis
120
and has been applied to ordered water molecules
in HIV-1 protease
121
and concanavalin A.
107
In the case of HIV-1 protease, the water molecule
bound between the flaps of the dimer subunits was computed to be stable
relative to bulk water, suggesting that the contribution for displacing
this water molecule should be unfavorable to binding, although contributions
due to the displacing group or other differences between inhibitors
can counterbalance this effect, which complicated comparison to available
experiments. In the case of concanavalin A, the authors performed
a more complete thermodynamic analysis of binding and the computational
results were consistent with experimental binding affinities. In both
cases, the authors highlighted the complexities associated with water
molecules and the fine balance between enthalpy and entropy, which
necessitates a careful analysis of water energetics that is not readily
predicted by simple empirical rules. Inhomogeneous solvation theory
has recently been used to identify binding hot spots at a protein
surface.
122
Qualitative Assessment of Water Molecule Locations
The application of free energy and inhomogeneous solvation methods
validates the idea that differences in water thermodynamics can be
used to improve affinity and selectivity, but they can be expensive
and complex to implement and run and they require pre-existing knowledge
of water placement, which may not be available experimentally. Although
MD simulations can be used to predict the positions of observed water
molecules
123
and hypothesize their importance,
124
this does not alleviate the issues of computational
complexity and expense. Thus, considerable benefit can result from
faster and less computationally demanding methods of identifying the
same effects.
An alternative approach to study the role of water
molecules is to look exclusively at properties of water molecules
around a conformation of a protein, thereby reducing the variability
associated with the other components of the binding free energy. This
approach has been taken by Fernández and colleagues with the
development of a concept of a “dehydron”, which is a
region of a protein that is not adequately hydrated.
125
The suggestion is that backbone amide hydrogen bonds are
in a globally stable state when ideally packed by hydrophobic groups.
Backbone amide hydrogen bonds that are incompletely or suboptimally
packed are termed dehydrons, and potency can be gained by interacting
in these dehydron sites to improve the hydrophobic packing. Furthermore,
selectivity can be gained by taking advantage of differences in dehydrons
between similar proteins. Indeed, this approach was used to engineer
selectivity into a c-Kit kinase inhibitor by finding a dehydron that
was present in c-Kit but not the related Abl kinase, making it more
potent and less toxic.
126
Hydration Site Prediction and Thermodynamic Characterization
An approach that combines the prediction of water molecule locations
(called hydration sites) and thermodynamic characteristics (entropy
and enthalpy) has been described in recent years and has been applied
to affinity and selectivity predictions.
127−129
The method, called WaterMap, determines water molecule positions
by clustering water molecules from an MD simulation. Once the hydration
site locations are identified, the enthalpy and entropy of each hydration
site is determined using inhomogeneous solvation theory as developed
by Lazaridis.
120
The advantage of this
approach, in comparison with other free energy methods, is that a
single simulation can provide information about all binding site water
molecules for a given protein conformation. In a study on peptides
that bind to PDZ domains, it was shown that the displacement energies
of water molecules were able to explain why the tightest binding peptides
had very broad selectivity to wild-type Erbin and variants.
130
Alanine mutants of Erbin did not affect the
potency of Trp at the P-1 position of the peptide, which is consistent
with the finding that the high-energy water molecule pattern in this
region was preserved across the Erbin alanine mutations. In the same
paper, the authors presented an example in which water energetics
were able to explain the narrow selectivity of a peptide, where a
tryptophan-to-alanine mutation in the peptide had a substantial effect
on binding to the PDZ domains HTRA2 and HTRA3 but little effect in
HTRA1. In the case of both HTRA2 and HTRA3, there was a substantial
cluster of high-energy hydration sites that was displaced by the Trp,
whereas in HTRA1 the energetics of the related hydration sites were
not as highly unfavorable (Figure 6).
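The clustering step described above can be illustrated with a simple greedy density scheme. The Python sketch below is not the WaterMap algorithm, whose details are not given here. It is a generic stand-in that pools water-oxygen coordinates from simulation frames, repeatedly takes the most densely populated sphere as a hydration site, and stops when no sphere holds enough water molecules. The radius, the occupancy cutoff, and the function name are assumptions made for illustration.

```python
import math


def cluster_hydration_sites(positions, radius=1.0, min_count=3):
    """Greedy density clustering of pooled water-oxygen coordinates.

    positions: list of (x, y, z) tuples in angstroms, pooled over MD frames.
    Returns a list of (centroid, occupancy) tuples, most populated site first.
    """
    remaining = list(range(len(positions)))
    sites = []
    while remaining:
        best = []
        # Find the point with the most neighbors (itself included) within radius.
        for i in remaining:
            members = [j for j in remaining
                       if math.dist(positions[i], positions[j]) <= radius]
            if len(members) > len(best):
                best = members
        if len(best) < min_count:
            break  # nothing left that is populated enough to call a site
        n = len(best)
        centroid = tuple(sum(positions[j][k] for j in best) / n for k in range(3))
        sites.append((centroid, n))
        taken = set(best)
        remaining = [j for j in remaining if j not in taken]
    return sites


# Synthetic coordinates: two populated sites and one stray water.
waters = [
    (0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (0.0, 0.2, 0.0), (0.0, 0.0, 0.2),
    (5.0, 0.0, 0.0), (5.1, 0.0, 0.0), (5.0, 0.1, 0.0),
    (10.0, 10.0, 10.0),
]
for centroid, occupancy in cluster_hydration_sites(waters):
    print(centroid, occupancy)
```

The pairwise search is quadratic in the number of water positions, which is acceptable for a sketch but would need a spatial grid or neighbor list for real trajectories. Assigning enthalpy and entropy to each site, the second half of the approach, is a separate calculation and is not attempted here.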
Figure 6
Water molecules
in PDZ domains HTRA1, HTRA2, and HTRA3. Selectivity
in the HTRA family of PDZ domains is predicted to arise from differences
in binding site waters. HTRA1 (A, PDB entry 2JOA)
182
does not
have a strong preference for Trp at the P-1 position, losing only
6-fold in potency when mutated to Ala. However, HTRA2 (B, PDB entry 2PZD)
183
and HTRA3 (C, PDB entry 2P3W)
182
lose considerable
binding potency when Trp is mutated to other residues, such as Ala
(over 300-fold for HTRA2 and 450-fold for HTRA3). Hydration site free
energies are computed with the WaterMap program, and only high-energy
hydration sites in the P-1 pocket are shown. Red sites are greater
than 4.0 kcal/mol and orange sites are greater than 2.0 kcal/mol unfavorable
relative to bulk water. HTRA2 and HTRA3 are computed to gain a substantial
amount of free energy from the displacement of high-energy hydration
sites, whereas HTRA1 gains significantly less. Importantly, Trp is
the only side chain that is able to displace all of the high-energy
hydration sites in the P-1 pocket of HTRA2 and HTRA3. The peptide
backbone is shown in green with only the P-1 Trp side chain displayed.
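To compare the fold changes in potency quoted in the Figure 6 caption with hydration site free energies in kcal/mol, it helps to put both on the same scale. The Python sketch below uses the standard relation ΔΔG = RT ln(fold change). The temperature of 298.15 K is an assumption, since the experimental temperature is not stated here, and the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)


def fold_to_ddg(fold, temperature=298.15):
    """Free energy difference (kcal/mol) equivalent to a fold change in affinity."""
    return R_KCAL * temperature * math.log(fold)


# Fold losses on Trp-to-Ala mutation as quoted in the Figure 6 caption.
for domain, fold in [("HTRA1", 6), ("HTRA2", 300), ("HTRA3", 450)]:
    print(f"{domain}: {fold}-fold is about {fold_to_ddg(fold):.2f} kcal/mol")
```

At this temperature, 6-fold corresponds to roughly 1.1 kcal/mol, while 300-fold and 450-fold correspond to roughly 3.4 and 3.6 kcal/mol. The latter values are of the same order as the free energy of a single hydration site in the red category of Figure 6.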
The same method has more recently been applied
to kinase selectivity,
where the authors studied general Src-family selectivity as well as
three cases comprising pairs of kinases (Abl/c-Kit, CDK2/4, and Syk/ZAP-70).
26
It was found that in all cases, the differences
in the water molecule locations, energetics, or both were able to
explain the experimentally observed selectivity trends. For example,
in the case of the Src-family kinases, it was shown that the water
molecules at the hinge are conserved, suggesting that selectivity
cannot be gained here. However, the back pocket (now known as the
selectivity pocket) shows a difference in the position and energetics
of Src water molecules compared with GSK3-β. An interesting
prediction is that it is not necessary for an inhibitor to extend
deeply into the selectivity pocket to gain differential binding affinity
toward Src because the high-energy water molecule in Src resides at
the opening of the selectivity pocket. A further consequence of this
is that an inhibitor that enters the selectivity pocket to any degree
risks hitting Src-family kinases.
The periplasmic oligopeptide-binding
protein (OppA) has been studied
for many years as a test case for selectivity; highly selective ligands
have been found in recent years,
131
and
water molecules have been implicated in the broad selectivity of this
and related proteins.
132,133
It has been proposed that the
large number of interfacial water molecules allows the binding site
to accommodate a wide variety of ligand shapes, sizes, and polarity.
134
It was noted that crystal structures with peptides
having small amino acids have a higher number of crystallographically
resolved water molecules at the interface, and it is thought that
the water molecules fill the volume between the smaller peptides and
the protein. Furthermore, selectivity between OppA and dipeptide binding
protein (DppA) was proposed to stem from a difference in direct ion
pairing in DppA (more favorable) versus water-mediated ion pairing
(less favorable) in OppA. Finally, differential potency between di-
versus tripeptides and tri- versus tetrapeptides was proposed to arise
from the gain in entropy associated with the displacement of three
structured water molecules by the larger peptide. A detailed series
of calculations using quantum mechanics and molecular mechanics with
Poisson–Boltzmann implicit solvent (MM-PBSA) suggested that
the broad selectivity resulted from a fine balance between many energetic
contributors to binding, including indirect desolvation effects.
135
Another interesting system in which water
molecules are proposed
to play a crucial role in binding selectivity is that of growth factor-bound
protein 2 (Grb2), which is involved in the Ras-MAPK signaling cascade.
Researchers have used MD to explore the binding of two selective ligands
to the SH2 domain of Grb2 and found that water molecules play a key
stabilizing role in binding.
136
They also
proposed that destabilizing interactions with bulk solvent played
a role. Although the authors did not explicitly explore selective
binding of these two ligands to other targets, it was also hypothesized
that the key water molecule interactions would contribute an important
part to the ligand selectivity. In another study on the indirect role
of water molecules in binding to SH2 domains, the authors used the
change in solvent accessible surface area upon binding to predict
binding thermodynamics.
137
Although explicit
water molecules were not used in binding energy predictions, the authors
did explore the possibility that explicit water molecules could impact
the calculations, and they included combinations of the interfacial
water molecules in the solvent accessibility calculations. The authors
then related changes in polar and nonpolar surface area to changes
in heat capacity, which can be directly related to the entropy of
binding. The approximations and parameters used in this study built
on previous work to generate empirical models for binding energy predictions
based on solvent accessibility described by Baker and Murphy.
138
Importance of Water in “Hard” Cases of Selectivity
Design
The reason that water alone can explain selectivity
in the difficult cases presented above can be understood by an analysis
of the thermodynamic process of binding. One can rigorously decompose
the binding process into a number of steps, where a series of events
takes the ligand and receptor from their relaxed unbound state in
solution to the bound complex state. The total free energy of binding
is the sum of the energies for each step. For cases of selectivity
that are typically considered to be difficult (i.e., the binding sites
of the two receptors exhibit high similarity), many of these energetic
terms approximately cancel.
For example, a ligand that binds
to two similar receptors will lose approximately the same amount of
conformational freedom (ligand entropy) and will pay approximately
the same desolvation cost when binding to each receptor. In fact,
all of the ligand-only thermodynamic properties should approximately
cancel. Furthermore, the interactions between the ligand and receptor
should be similar in difficult cases, where the binding site has roughly
the same shape, electrostatic properties, and hydrogen bonds between
the ligand and receptor. The receptor terms are thus the key determinants
of selectivity. The terms with the largest magnitude include the receptor
desolvation (discussed in this section) and the receptor reorganization
and strain energy (discussed in the earlier section on flexibility).
Thus, the location of explicit water molecules and the conformation
of the receptor play key roles in influencing binding affinity and
selectivity. For the majority of methods used in structural modeling
and virtual screening, it is necessary to predefine both of these
features before beginning. This determines both the binding site shape
and electrostatics and is thus an important choice that must be made.
Some methods include the ability to switch known water molecules on
or off
139
or to include limited receptor
flexibility,
99,140
but in many cases this is not
sufficient. Methodological advancements must be focused in these areas
in order to model selectivity in a thorough fashion.
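The cancellation argument of this section can be summarized in a short schematic. The particular partition into five terms and the labels used (L for ligand, R for receptor, T for target, D for decoy) are our shorthand for the steps described above, and other decompositions are possible.

```latex
% Binding free energy as a sum over conceptual steps:
\Delta G_{\text{bind}} =
  \Delta G_{L}^{\text{desolv}} + \Delta G_{L}^{\text{conf}}
  + \Delta G_{R}^{\text{desolv}} + \Delta G_{R}^{\text{reorg}}
  + \Delta G_{LR}^{\text{int}}

% Selectivity between target T and decoy D for the same ligand:
\Delta\Delta G_{\text{sel}} = \Delta G_{\text{bind}}^{T} - \Delta G_{\text{bind}}^{D}

% For highly similar binding sites, the ligand-only and interaction terms
% approximately cancel, leaving the receptor terms:
\Delta\Delta G_{\text{sel}} \approx
  \left(\Delta G_{R,T}^{\text{desolv}} - \Delta G_{R,D}^{\text{desolv}}\right)
  + \left(\Delta G_{R,T}^{\text{reorg}} - \Delta G_{R,D}^{\text{reorg}}\right)
```

Written this way, it is clear why the placement of explicit water molecules and the choice of receptor conformation dominate the outcome in the hard cases: they are the only inputs to the terms that survive the subtraction.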
Allosteric Pockets and Noncompetitive Binding
The traditional
view of inhibition is the blockage of a primary binding site that
is involved in the recognition of natural binding partners. However,
selectivity can also arise from noncompetitive allosteric inhibition
involving differences in protein flexibility in sites distal to the
primary inhibition site, and identifying and exploiting similarities
and differences in allosteric pockets and interactions across targets
and decoys can be another mechanism for attaining selectivity. For
example, a highly selective PTP1B compound was found that binds to
a site 20 Å from the catalytic site. It was proposed that binding
to this distal site reduces mobility of the catalytic loop and thereby
inhibits PTP1B enzyme function.
141
Interestingly,
this allosteric site has not been detected in related tyrosine phosphatases,
which provides a mechanism for designing highly selective compounds
that target this site. Allosteric sites have also been identified
for a number of other drug targets.
142−144
Targeting allosteric
sites is an attractive proposition. However, the prediction of such
allosteric sites in the absence of experimental data remains a challenging
problem for computational tools. There are a few examples where MD
has been used to reveal cryptic sites,
145−149
but to our knowledge all of the previous
studies have been retrospective and there are no examples of calculations
predicting allosteric sites that were later confirmed experimentally.
We see this as an area of great potential, as methods for enhanced
sampling are developed in conjunction with increasing computational
capacities.
Higher-Level Concepts
The previous section on structure-based
approaches was applicable
to cases in which there exists an explicit set of targets and decoys
together with appropriate structural information, and the goal is
to identify strategies for crafting families of ligands with the ability
to cover the targets while largely avoiding the decoys. Here we consider
two different classes of problem: one in which not all the targets are explicitly
known, and another in which the targets and decoys are the same
molecules but the goal is to bind to them only in some tissues or
environments and not in others.
Substrate Envelope Hypothesis
For therapies to be useful
against rapidly mutating targets, they must avoid the development
of resistance mutants that no longer bind the therapeutic molecule.
Such cases are especially important in infectious disease and cancer,
and such considerations are paramount in HIV. Application of the previously
discussed structure-based concepts first requires knowledge of all the potential targets,
which can be daunting in these
situations. The substrate envelope hypothesis elegantly avoids this
difficulty for cases in which the target is an enzyme, by acknowledging
that all targets must still bind and process substrate; mutants that
fail to process substrate are lethal, if the target is truly valid.
The substrate envelope hypothesis is one implementation of the notion
that inhibitors sufficiently similar to substrate will bind to all
enzyme variants capable of binding and processing substrates. The
specific similarity criterion applied is that candidate inhibitors,
when bound to the active site, must reside within and not extend beyond
the molecular envelope of substrates when productively bound at the
active site (Figure 7).
Figure 7
Substrate envelope hypothesis.
To achieve broad binding selectivity
against an enzyme target and the collection of its functional mutants,
a useful approach has been to develop inhibitors that bind within
and do not extend beyond the envelope created by the outer shape of
the substrate (or a collection of substrates) bound to the active
site. The idea is illustrated in panels A–D, and an example
from HIV-1 protease is given in panels E–G. (A) The parent
target protein is shown in orange outline and shading, and a bound
substrate is shown in yellow with the substrate envelope indicated
by the yellow outline. (B) An inhibitor (green shading) that binds
within the substrate envelope (yellow outline) binds not only the
parent target (orange outline) but also a mutant (orange shading)
that includes positions that protrude further into the active site
(left side) and that retreat away from the site (right side). (C)
A different inhibitor (green shading) that extends beyond the substrate
envelope (yellow outline) might make better interactions with the
parent target (orange outline and shading) and even bind with higher
affinity than other inhibitors. (D) However, such an envelope-violating
inhibitor may bind poorly to protein mutants (orange shading) that
differ from the parent (orange outline) by protruding further into
the active site and introduce a potential clash with the inhibitor
(left side, green hatching) or by retreating away from the active
site and remove a stabilizing interaction (right side, orange hatching).
Interestingly, there is a preponderance of the “retreating”
mutations over the “protruding” ones for HIV-1 protease,
perhaps because of molecular plasticity issues. (E) An HIV-1 protease
inhibitor
66
that binds with high affinity
to wild-type HIV-1 proteases as well as to mutants is shown to reside
within the substrate envelope (yellow surface) in its crystal structure
in the protein complex (the protein has been removed for clarity).
(F, G) HIV-1 protease inhibitor saquinavir from PDB entry 3OXC,
184
which binds well to wild-type HIV-1 proteases but is susceptible
to resistance mutants, is shown to extend outside the substrate envelope
(yellow surface) in its crystal structure (the protein has been removed
for clarity in panel F but is present in panel G, in which some side
chains associated with resistance mutations have been highlighted
and labeled).
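The envelope criterion described above can be sketched as a simple geometric test. The following is a minimal illustration only, not the procedure used in the cited studies: it assumes the envelope can be approximated as a union of spheres centered on substrate atoms, and the function name, the `probe_radius` value, and the toy coordinates are all hypothetical.

```python
import math

def envelope_violations(ligand_atoms, substrate_atoms, probe_radius=2.0):
    """Return indices of ligand atoms that fall outside the envelope.

    Assumption: the envelope is the union of spheres of radius
    `probe_radius` (in angstroms, an arbitrary illustrative value)
    centered on each atom of the bound substrate.
    """
    outside = []
    for i, lig in enumerate(ligand_atoms):
        inside = any(math.dist(lig, sub) <= probe_radius for sub in substrate_atoms)
        if not inside:
            outside.append(i)
    return outside

# Toy coordinates in angstroms (hypothetical, not taken from any structure).
substrate = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
ligand_inside = [(0.5, 0.5, 0.0), (2.5, -0.5, 0.0)]
ligand_protruding = [(0.5, 0.5, 0.0), (3.0, 4.0, 0.0)]

print(envelope_violations(ligand_inside, substrate))      # no atoms outside
print(envelope_violations(ligand_protruding, substrate))  # second atom protrudes
```

In practice the envelope would be built from the consensus volume of several substrate complexes rather than a single set of spheres, so this test should be read as a conceptual aid only.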
The substrate envelope hypothesis has been applied
to the protease
from HIV-1, a rapidly mutating target presenting significant drug
resistance.
66
Early clinically approved
inhibitors lopinavir and saquinavir are highly susceptible to resistance
mutations. Once these mutants were identified, it became clear that
lopinavir and saquinavir had overly narrow selectivity across the
true but initially unknown set of targets. Lopinavir and saquinavir
bind wild-type enzyme relatively strongly (Ki of 0.005 and 0.65 nM, respectively) but lose over 1000-fold
affinity to resistance mutants (L10I/G48V/V82A for lopinavir and L10I/G48V/I54V/L63P/V82A
for saquinavir).
150
Consistent with the
substrate envelope hypothesis, both lopinavir and saquinavir extend
outside the substrate envelope when bound at the active site (Figure 7F and Figure
7G illustrate
this for saquinavir).
151,152
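To put the 1000-fold affinity loss in energetic terms, the fold change in Ki can be converted to a binding free energy difference through the standard relation ΔΔG = RT ln(Ki,mutant/Ki,wild type). The sketch below assumes a temperature of 298.15 K, which is our choice for illustration and is not specified in the text; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def ddg_from_fold_change(fold_change, temperature=298.15):
    """Free energy penalty (kcal/mol) for a given fold loss in affinity.

    Uses ddG = RT * ln(Ki_mutant / Ki_wild_type). The default
    temperature is an assumption made for illustration.
    """
    return R_KCAL * temperature * math.log(fold_change)

# A 1000-fold loss in affinity corresponds to roughly 4.1 kcal/mol.
print(round(ddg_from_fold_change(1000.0), 2))  # 4.09
```

A loss of this size is comparable to removing a few favorable contacts, which is consistent with the idea that a small number of mutations at positions contacted only by the inhibitor can confer resistance.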
To test the substrate
envelope hypothesis as a design methodology
rather than as an analysis tool, computational molecular design was
undertaken with the constraint that all designed inhibitors be required
to respect the substrate envelope. Some but not all of the resulting
high-affinity inhibitors had broad binding profiles to a panel of
drug-resistant mutants, which provides strong support for the substrate
envelope hypothesis as a useful design approach.
66
X-ray crystal structures on a selection of compounds showed
that all ligands successfully bound within the substrate envelope,
as seen in Figure 7E for one example ligand.
Interestingly, the result that some envelope-respecting high-affinity
inhibitors were susceptible to resistance mutations suggests that
the substrate envelope hypothesis represents one dimension of substrate
similarity and that other dimensions may also be necessary to ensure
that a designed molecule is sufficiently substrate-like to avoid resistance.
The substrate envelope hypothesis has more recently been applied
to HCV protease
153
and appears to be effective
in that system as well. Further application and validation of the
substrate envelope hypothesis could lead to a new way of developing
inhibitors with a broad selectivity profile with respect to potential
drug-resistance mutations. Designing inhibitors to fit within the
substrate envelope is a key design strategy in avoiding the problems
of drug resistance and highlights the importance of shape in controlling
selectivity and promiscuity. While the exact mechanism of achieving
broad selectivity depends on the system of interest, the idea of trying
to replicate the shape and flexibility of the natural substrates is
helpful when dealing with enzymes that are prone to resistance mutations.
Local Cellular Environments
In all of the above examples,
the primary determinant of selectivity has been the thermodynamics
of binding. However, drug targets exist in a complex environment and
there are approaches to design for selectivity that rely on the nonequilibrium
nature of cells, organs, and organisms. For example, the ability to
control the rate at which a compound enters or exits the cell can
provide a mechanism to achieve increases in local concentrations and
thereby offers an opportunity to tune selectivity. Membrane transporters
that span cell membranes and control the influx and efflux of endogenous
substrates are also known to be crucial in controlling the transport
of xenobiotics such as drugs. Indeed, it has been suggested that carrier-mediated
and active mechanisms represent the major mode of drug uptake.
154
Extensive genome analysis has recently provided
a comprehensive list of drug transporters, and experimental screening
systems have been suggested to pick appropriate transporters that
can be used for drug delivery.
155
There are a large number of transporters that act on existing drugs,
a subset of which are shown in Table 1. Drug
transporters play key roles in drug absorption, distribution, and
excretion and are differentially expressed in many tissues such as
the intestine, liver, kidney, and brain. This is neatly illustrated
by the quinolone antibacterial olamufloxacin (HSR-903). Many derivatives
of quinolone are known to cause severe central nervous system side
effects, such as convulsion. However, olamufloxacin is actively effluxed
by P-glycoprotein (P-gp) at the blood–brain barrier (BBB),
circumventing these potential side effects.
156
Furthermore, it is well absorbed from the intestine and actively
taken up by the lung, where it performs its function.
157
Harnessing such knowledge of drug transporters should allow
us to target transporter proteins in specific organs and thus develop
improved methods of selective drug delivery.
158
In fact, recent computational work has shown that a structure-based
method based on induced-fit docking
99
is
capable of predicting P-gp binding selectivity.
159
The approach was able to consistently differentiate between
P-gp binders and nonbinders, both in retrospective and prospective
studies. Accounting for receptor flexibility, as discussed in the
above section, was critical in obtaining accurate structural models
and predictions, which are likely to have a significant impact on
future drug development efforts.
Table 1
Known Drug Transporters along with Their Natural Substrates and a Subset of the Identified Drug Substrates (a)
transporter | natural substrates | drug substrates
PEPT1 | dipeptides, tripeptides | ampicillin, temocapril, enalapril, midodrine, valacyclovir
PEPT2 | dipeptides, tripeptides | amoxicillin, cefadroxil, cefaclor, bestatin, valganciclovir
OCT1 | organic cations | zidovudine, acyclovir, ganciclovir, metformin, cimetidine
OCT2 | organic cations | memantine, metformin, propranolol, cimetidine, quinine
OAT1 | organic anions | quinidine, pyrilamine, verapamil, valproate, cephaloridine
OAT2 | organic anions | zidovudine, tetracycline, salicylate, methotrexate, erythromycin
OATP-A | organic anions | fexofenadine, rocuronium, enalapril, temocaprilat, rosuvastatin
OATP-B | organic anions | pravastatin, glibenclamide, atorvastatin, fluvastatin, rosuvastatin
OATP-C | organic anions | benzylpenicillin, rifampicin, cerivastatin, pitavastatin, methotrexate
(a) The table is based on data from
Sai and Tsuji
155
and Dobson & Kell.
154
Another method to gain selectivity for specific tissues
is by regulating
cellular trafficking.
160
This is exemplified
in the design of novel cytokines that target cancer cells using the
iron-binding protein transferrin (Tf). When bound with iron, Tf binds
to the Tf receptor (TfR) on cell surfaces, where the complex is endocytosed.
The acidic environment of the endosome then stimulates iron release.
This process can be exploited, as cancerous cells express higher levels
of TfR than normal cells. Thus, cancer cells can be specifically targeted
by conjugating drugs to Tf.
161
The same
phenomenon of local cellular pH is also an important regulator of
protein structure and function in other systems.
162
The peculiarity of the tumor environment has also
been exploited
by other methods. Solid tumors commonly contain regions with very
low concentrations of oxygen, and cancerous cells in these hypoxic
regions are often resistant to both radiotherapy and chemotherapy.
However, hypoxic conditions provide an opportunity for tumor-selective
therapy,
163
including prodrugs activated
by hypoxia such as tirapazamine
164
and
banoxantrone (AQ4N).
165
Banoxantrone is
a prodrug with two dimethylamino N-oxide groups that
is converted to a topoisomerase II inhibitor by reduction of the N-oxides to dimethylamino
substituents. It appears that
banoxantrone is reduced by the cytochrome P450 enzymes CYP2S1 and
CYP2W1 under hypoxic conditions in vivo.
166
These two extrahepatic P450 enzymes are expressed in hypoxic tumor
cells at much higher levels than in normal tissue. Evidence from phase
I trials shows that banoxantrone penetrates hypoxic tumors and accumulates
selectively in cancer cells, providing a potentially useful therapeutic window.
167
It is clear that targeting drugs to specific
cells offers a direct
route to achieving selectivity. In addition to capitalizing upon the
effect of cell trafficking on drug molecules, it is also possible
to direct drugs to specific cells by coupling them with cell-targeting
oligopeptides. The glucose-regulated protein 78 (GRP78) is overexpressed
on the surface of human cancer cells, and the recently identified
peptide Pep42 binds to GRP78 and is selectively internalized.
168
Thus, Pep42 can potentially act as a carrier
for cytotoxic drugs to specifically target human cancer cells in a
GRP78-dependent manner. Linkage with Pep42 was shown to enrich the
presence of quantum dots in tumor tissue in a xenograft mouse model.
169
Pep42 has also been used to transport paclitaxel
and doxorubicin through a connection with a cathepsin B-cleavable
linker to facilitate intracellular release.
170
Such an application has the potential to minimize the adverse side
effects associated with conventional cancer therapeutics, as the drugs
are effective at a lower concentration. The effectiveness of such
cell-targeting peptides has also been improved by coupling to liposomes.
PIVO-8 (sequence SNPFSKPYGLTV) is one of a series of peptides that
binds to non-small-cell lung cancer cell lines but not to normal cells.
171
PIVO-8 was coupled to the polyethylene glycol
terminus of a stabilized liposome containing doxorubicin. This targeted
delivery of liposomal doxorubicin was shown to increase cancer cell
apoptosis and decrease tumor angiogenesis in mice.
172
Rational design to control pharmacokinetics also
shows promise
in the development of drugs targeting the central nervous system (CNS).
The market for CNS drugs is one of the fastest growing in the pharmaceutical
sector, but CNS drugs show the poorest success rates in clinical development.
173
One of the key problems is that drugs have
to penetrate the BBB to exert their action in the brain.
174
This can be achieved if the compound is highly
lipophilic and able to penetrate the BBB by passive diffusion or if
it is the substrate of an influx transporter.
156
One caveat is that the compound cannot also be a substrate
of efflux transporters such as the ABC or amino acid transporters.
Two of the most important families of influx transporters are the
large neutral amino acid transporters such as LAT1 and the glucose
transporters such as GLUT1. LAT1 is responsible for transporting amino
acids such as valine and tyrosine, but it has also been found to transport
drugs such as baclofen, levodopa, gabapentin, melphalan, and thyroxin.
This has recently been exploited for the purpose of drug delivery
by coupling ketoprofen to L-tyrosine. This prodrug has been
shown to cross the rat BBB by a LAT1-mediated mechanism.
175
A similar approach has been used to target
drugs for BBB uptake via the GLUT1 glucose transporter.
176
The idea of targeting drugs by coupling with
transporter substrates is illustrated in Figure 8.
Figure 8
Targeting drugs to cellular transporters. The cartoon illustrates
the mechanism by which selectivity is achieved by linking drug molecules
to substrates of membrane transporters. Passive transport of molecules
across membranes is a slow process and is in competition with rapid
clearance (bottom). Active uptake by membrane bound transporters such
as GRP78 (top left) or LAT1/GLUT1 (top right) allows drug molecules
to be targeted toward particular cells or organs.
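The competition between uptake and clearance depicted in Figure 8 can be illustrated with a toy first-order kinetic model. This is a sketch under strong simplifying assumptions: each process is treated as a single first-order rate, each cell type is considered in isolation, and the rate constants below are arbitrary values chosen for illustration rather than measured quantities.

```python
def fraction_taken_up(k_passive, k_active, k_clearance):
    """Fraction of drug that enters the cell before it is cleared.

    For parallel first-order processes, the fraction following one
    pathway is its rate constant divided by the sum of all rate
    constants. All rate constants share the same (arbitrary) units.
    """
    k_uptake = k_passive + k_active
    return k_uptake / (k_uptake + k_clearance)

# Hypothetical rate constants: slow passive diffusion, fast clearance,
# and active uptake present only in cells expressing the transporter.
non_target = fraction_taken_up(k_passive=0.01, k_active=0.0, k_clearance=1.0)
target = fraction_taken_up(k_passive=0.01, k_active=1.0, k_clearance=1.0)

print(f"non-target cell: {non_target:.3f}")
print(f"target cell:     {target:.3f}")
print(f"enrichment:      {target / non_target:.1f}-fold")
```

Under these assumed values, adding an active uptake route that is comparable in rate to clearance raises the fraction of drug reaching the target cell by well over an order of magnitude, which is the qualitative point of the cartoon.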
There have been many recent advances in controlling
drug pharmacodynamics
and pharmacokinetics, and the examples discussed here illustrate their importance
to drug development and
to selective drug delivery. Such selectivity can be gained either
from the inherent properties of a molecule or by coupling with a specific
targeting species. Both techniques have proven to be useful, and exploiting
them successfully should lead to improved drug delivery and higher
success rates in drug development.
Conclusions
We have described the key principles of
rational selectivity design
and presented real-world examples of how these principles have been
successfully applied in achieving selectivity. While selectivity is
always desired in a drug discovery campaign, often it is not explicitly
considered during the discovery process. Furthermore, it is important
to think of selectivity along the continuum of narrow (hitting only
a single target) to broad (hitting a panel of desired targets). The
majority of drug discovery projects include aspects of both broad
and narrow selectivity, to varying degrees. With continued expansion
of the knowledge base of experimental information related to interaction
networks and cellular processes within biological systems, the definition
of desirable targets and undesirable decoys will become increasingly
clear.
The key aspects of this paper are illustrated in Figure 1. We discuss five structure-related
design principles
that can be leveraged to achieve selectivity. Shape complementarity
provides one way of gaining selectivity, particularly when the binding
site of the target is larger than that of the decoy. In this case,
generating a clash with the decoy that is not present in the target
can be worth many log units in selectivity. Selectivity can also be
gained when the binding site of the target is smaller than that of
the decoy, but in this case the gains may be only modest. Electrostatic
complementarity also provides a direct means of gaining selectivity.
This can be particularly effective when the target or the decoy binding
site is charged or highly polar. Modulation of the ligand electrostatic
potential field provides an attractive means of attaining selectivity
because of the long-range nature of electrostatic interactions.
Protein flexibility is another crucial aspect to consider in selectivity
design. With respect to predicting selectivity, it is particularly
important to understand the plasticity of the undesirable decoy structures,
since induced fit effects may confound simple predictions based on
static crystal structures. However, such plasticity can also provide
a mechanism for gaining selectivity in cases where the target is flexible
and the decoy is rigid. Modeling of explicit water molecules is another
area that requires careful consideration in selectivity design. Interfacial
water molecules have been implicated in cases of both selective and
promiscuous binding, and recently developed computational methods
allow the effect of water molecules on binding to be probed. Finally,
allosteric modulation of the target can be used to gain selectivity
in cases where the decoy lacks an allosteric site. There are a number
of proteins for which allosteric sites have been identified, thereby
offering an opportunity to gain selectivity.
In addition to
these five structural properties, two higher-order
concepts are presented. The substrate envelope hypothesis postulates
that a drug molecule designed to fit within the consensus volume of
natural substrates will evade problems due to resistance mutations,
as mutations that adversely affect binding of the ligand will also
adversely affect substrate processing. The hypothesis has been utilized
to design inhibitors that show broad selectivity and are effective
against both the wild-type protein and resistance mutants. The second
higher-order concept is to alter the drug molecule to control pharmacokinetics
and target specific organs or cell types. Carrier-mediated uptake
of drug molecules is an area that is now being explored and has recently
been used to target hypoxic tumors, cancer cells, the lungs, and the
brain. These developments have the potential to yield higher success
rates in drug development by rational design of selective drug delivery.
While we have focused this work primarily on structure-based determinants
of selectivity, recent work has highlighted the relationship between
the nature of molecular scaffolds and the promiscuity of molecules
containing those scaffolds.
177
It was also
found that molecules with increased log P tend
to be more promiscuous binders, in agreement with previous work.
178
Smaller molecules with a large number of terminal
ring systems were also found to be more promiscuous. This agrees with
other work suggesting that larger and more complex molecules have
a lower probability of exhibiting perfect shape and electrostatic
complementarity with any given target and are thus expected to show
narrower selectivity.
179,180
Indeed, using ligand information
can be valuable in improving selectivity and can be used in conjunction
with the structure-based techniques described in this work.
One important consideration not explored in this work is the process
of target selection itself. In some cases it is possible to choose
targets that are less likely to raise challenging selectivity problems.
For example, when multiple biologically viable targets are available,
one can use protein sequence analysis to choose the target that is
least similar to other targets, especially in the binding site. Correspondingly,
if other proteins that are highly similar to the target of interest
have been previously shown to have selectivity problems, this can
raise an early red flag in a discovery program.
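As a rough illustration of how binding-site sequence analysis might inform target selection, the sketch below ranks candidate targets by their worst-case residue identity to a set of known off-targets. It assumes that binding-site residues have already been extracted and aligned to equal length without gaps; the function names, protein names, and sequences are all hypothetical.

```python
def site_identity(site_a, site_b):
    """Fraction of identical residues between two pre-aligned binding sites."""
    if len(site_a) != len(site_b):
        raise ValueError("binding-site strings must be aligned to equal length")
    matches = sum(a == b for a, b in zip(site_a, site_b))
    return matches / len(site_a)

def least_similar_target(candidates, off_targets):
    """Pick the candidate whose highest identity to any off-target is lowest."""
    worst = {
        name: max(site_identity(site, other) for other in off_targets.values())
        for name, site in candidates.items()
    }
    best = min(worst, key=worst.get)
    return best, worst

# Hypothetical aligned binding-site residues (one-letter codes).
candidates = {"targetA": "LKVAEML", "targetB": "LRVGDMF"}
off_targets = {"decoy1": "LKVAEMF", "decoy2": "IKVAEML"}

best, worst = least_similar_target(candidates, off_targets)
print(best)
print(worst)
```

Sequence identity is only a first filter. As discussed in the sections on flexibility and water, sites with similar sequences can still differ in plasticity or solvation, and sites with different sequences can present similar shapes.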
We believe that
the current structure-based drug design methods
have great power when the right approach is taken for the appropriate
problem. Conversely, it is easy to overextend the applicable domain
of a method and deem the computation to have failed when indeed the
method may not be suitable to address the problem of interest. As
methods are improved and computational power is increased, we will
see the applicability of the methods expand. With the aforementioned advances and
the growing number of successful applications of rational selectivity design appearing
in the literature, decisions about which method
to apply and when it is appropriate will become more straightforward. At present,
selectivity
design remains an immensely important and challenging problem in the
drug discovery process. We hope that the principles laid out in this
work and the associated examples will help make the practice of selectivity
design more transparent and lead to more explicit consideration of
how selectivity can be improved in the process of rational drug design.