Streptococcus pneumoniae (‘pneumococcus’) causes an estimated 14.5 million cases of
serious disease and 826,000 deaths annually in children <5 years of age
1
. The highly effective US introduction of the PCV7 pneumococcal vaccine in 2000
2,3
provided an unprecedented opportunity to investigate the response of an important
pathogen to a widespread, vaccine-induced, selective pressure. Here we use array-based
sequencing of 62 isolates from a US national monitoring program to study five independent
instances of vaccine escape recombination
4
, demonstrating directly the simultaneous transfer of multiple and often large (up
to at least 44 kb) DNA fragments. We show that one such novel strain quickly became
established, spreading from East to West across the US. These observations clarify
the roles of recombination and selection in the population genomics of pneumococcus,
and provide proof-of-principle of the considerable value of combining genomic and
epidemiological information in the surveillance and enhanced understanding of infectious
diseases.
Childhood vaccination has proved effective against many viral and bacterial diseases,
but more sophisticated vaccine approaches will be needed for pathogens with more complex
genomes, life-cycles, and population structures, where the evolutionary responses
of organisms are likely to be a key factor. Among the earliest successful bacterial
vaccines were some for diseases (diphtheria, tetanus) where the actual cause of disease
(e.g. a toxin) is targeted directly. Conjugate vaccines, in which a bacterial polysaccharide
is joined to an immunogenic protein, have been developed for two other important childhood
pathogens, Haemophilus influenzae type b and Neisseria meningitidis serogroup C
5-7
. The success of these vaccine strategies was at least in part due to the relatively
simple structure of the pathogen population, and the limited variability and evolvability
of the pathogen molecule(s) targeted by the vaccine, but other organisms provide greater
challenges.
Pneumococcal cells are covered by a layer of polysaccharide called the capsule, which
serves as a major virulence factor and provides a target for vaccination. The first
pneumococcal conjugate vaccine in routine use in infants was PCV7 (Pfizer), which
incorporates the polysaccharides of seven of the >92 capsular types (serotypes). PCV7 was
introduced in the US in 2000 to immediate and dramatic effect. By 2001, rates of invasive pneumococcal
disease in the vaccinated age group, 0-2 years, had decreased by 69%
2
and by 2007 the rate in children <5 years old had stabilized at 24% of the level present
before the vaccine
3
.
At the time of its introduction, there was considerable speculation about the likely
results of the extreme selection induced by the vaccine, and its possible downstream
consequences on vaccine efficacy and longevity
8,9
. Colonization of the nasopharynx, especially of young children, provides the major
reservoir for transmission of the pneumococcus. Vaccination reduces the rate of colonization
by PCV7 serotypes, interrupting transmission while also allowing the rate of colonization
by non-vaccine serotypes to increase
10
. Two mechanisms were anticipated for this serotype replacement: demographic expansion
of non-vaccine serotype lineages and capsular switching – the replacement of the capsular
gene cluster in one genome by the non-vaccine capsular genes from a different lineage.
We combined epidemiological and genomics approaches to better understand the nature,
mechanisms, and consequences of vaccine escape. We sequenced a pre-selected subset
of 12% (300 kb) of the pneumococcal genome (Supplementary Figure 1) using Affymetrix
CustomSeq technology and describe data from 62 isolates ascertained for potential epidemiological
interest (Table 1). Whole-genome resequencing of several of the isolates on the Illumina
GAIIx platform was used to confirm findings of particular interest.
During 2000-2007, approximately 27,000 sterile-site pneumococcal isolates had been
recovered from patients in 10 US states and serotyped by the Centers for Disease Control
and Prevention in Atlanta, USA, as part of the “ABCs” monitoring programme
3
. The sequence type of 1,902 serotype 19A isolates, collected between 2001 and 2007
from patients of all ages, was then determined
4,11,12
by MLST (multi-locus sequence typing), a widely used molecular fingerprinting approach
in which (Sanger) sequence data are collected from seven fixed ~400-500bp fragments
of essential genes
13
. Samples with a sequence type not commonly associated with serotype 19A were potential
examples of vaccine escape through capsular switching
14
(Table 1; details Supplementary Table 1). We had previously reported three distinct
progeny strains, which we refer to as P1, P2, and P3, resulting from capsular switching
with serotype 4 recipients
4
, which we confirmed by Sanger sequencing to identify recombinational breakpoints.
Two further instances (P4, P5) were identified in the current study by resequencing
of candidates (Supplementary Table 2). We have therefore defined a total of 5 independent
instances of vaccine escape through capsular switches whereby serotype 4, which is
included in the PCV7 vaccine, was replaced by serotype 19A, which is not.
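The screening step described above, in which serotype 19A isolates carrying a sequence type not commonly associated with 19A are treated as candidate capsular switches, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the record layout, the reference table of known serotype associations, and the placeholder labels ST_A and ST_B are all hypothetical and chosen only for illustration.

```python
def flag_switch_candidates(isolates, known_serotypes_by_st, serotype="19A"):
    """Return IDs of isolates of `serotype` whose sequence type (ST) is not
    recorded as associated with that serotype in the reference table.

    isolates: list of dicts with hypothetical keys "id", "st", "serotype".
    known_serotypes_by_st: dict mapping ST -> set of serotypes historically
        seen with that ST (an assumed, user-supplied reference).
    """
    candidates = []
    for iso in isolates:
        if iso["serotype"] != serotype:
            continue  # only isolates of the serotype of interest are screened
        expected = known_serotypes_by_st.get(iso["st"], set())
        # STs absent from the reference are also flagged, since nothing
        # links them to this serotype.
        if serotype not in expected:
            candidates.append(iso["id"])
    return candidates


# Toy data with placeholder labels (not real sequence types or isolates).
toy_isolates = [
    {"id": "iso1", "st": "ST_A", "serotype": "19A"},
    {"id": "iso2", "st": "ST_B", "serotype": "19A"},
    {"id": "iso3", "st": "ST_B", "serotype": "4"},
]
toy_reference = {"ST_A": {"19A"}, "ST_B": {"4"}}

print(flag_switch_candidates(toy_isolates, toy_reference))
```

In this toy example only iso2 is flagged, because its sequence type is recorded with serotype 4 but the isolate itself is serotype 19A. Any candidate flagged this way would still need confirmation by sequencing, as described in the text.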
In addition to identifying capsular switch recombinants, our resequencing approach
allowed us to search for putative donor and recipient genomes. When one sequence takes
up DNA from another through recombination, we refer to the former as the recipient
sequence and the latter as the donor sequence. For P1 and P2, our sequencing revealed
well-matched putative recipient and donor sequences with serotypes 4 and 19A. In addition,
genomic analyses identified 4 and 8 additional sequence fragments in P1 and P2, respectively, that
did not match known serotype 4 (recipient) genomes (Figure 1 (A); Supplementary Figure
2; Supplementary Table 3) and suggested that all the imported fragments could have
originated from a single serotype 19A donor sequence in each case. Illumina sequencing
of an early P1 isolate and its putative donor and recipient confirmed our analysis and
identified 8 extra small imported fragments across the whole genome (Figure 1 (B)).
To rule out the possibility that the additional fragments could have come from other
serotype 4 sequences that we had not analysed, we used SNP typing to screen 88 archived
US serotype 4 isolates collected around the time PCV7 was introduced for P1- and P2-specific
additional sequence fragments, and found no alternative candidate recipients that
could explain the structure of P1 and P2 without invoking multiple imports from a
serotype 19A-like donor (Supplementary Note; Supplementary Table 7; Supplementary
Table 8).
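The logic of locating imported fragments, given aligned recipient, donor, and progeny sequences, can be sketched as follows. This is a simplified illustration under stated assumptions, not the analysis used in the study: it assumes three gap-free sequences already aligned to the same coordinates, treats only sites where donor and recipient differ as informative, and reports each run of donor-like informative sites as a conservative (minimum-extent) fragment. The function name and toy sequences are hypothetical.

```python
def imported_fragments(recipient, donor, progeny):
    """Return a list of (start, end, n_sites) tuples, one per run of
    consecutive informative sites at which the progeny carries the donor base.

    Coordinates are 0-based and inclusive; `start` and `end` are the first and
    last donor-like informative sites, so each span is a minimum estimate.
    """
    if not (len(recipient) == len(donor) == len(progeny)):
        raise ValueError("sequences must be aligned to the same length")

    fragments = []
    run = []  # positions of consecutive donor-like informative sites
    for pos, (r, d, p) in enumerate(zip(recipient, donor, progeny)):
        if r == d:
            continue  # uninformative: donor and recipient agree here
        if p == d:
            run.append(pos)
        elif run:
            # a recipient-like (or novel) base closes the open fragment
            fragments.append((run[0], run[-1], len(run)))
            run = []
    if run:
        fragments.append((run[0], run[-1], len(run)))
    return fragments


# Toy sequences (illustrative only).
recipient = "AAAAAAAAAA"
donor     = "CACCACACCA"
progeny   = "AACCAAACCA"
print(imported_fragments(recipient, donor, progeny))
```

Because the true recombination breakpoints lie somewhere between the last recipient-like site and the first donor-like site, spans reported this way understate fragment length, which is consistent with the "at least" phrasing used for fragment sizes in the text.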
Our data thus demonstrate the independent origin of each recombinant lineage and strongly
support the idea that multiple fragments may be transferred during a single episode
of recombination. Across P1-P5, conservative estimates suggest a range of 1-27 fragments
have been transferred in addition to the capsular locus, with sizes from 0.04 to at
least 44 kb (Supplementary Table 3). While it was impossible to exclude separate sequential
events in explaining each progeny structure, the observation that whenever we ascertained
a capsular recombination event we saw other serotype 19A-like imports elsewhere in
the genome is strong evidence that multiple fragments may be imported from the donor
simultaneously or in quick succession. Recombination involving transfer of large
fragments or multiple fragments simultaneously has long been observed or inferred
in vitro in pneumococcus
15-18
. A recent report
19
documented multiple putative transfers in a single individual. Our findings show that
such recombination events can happen not only in vitro or in individuals, but also
at population scales, becoming evident after a nationwide immunization programme.
A further recent report
20
detected several instances of capsular recombination in the 40-year global spread
of a multi-drug-resistant lineage of pneumococcus but did not describe evidence for
multi-fragment recombination, perhaps because of differences in sampling strategy
or analytical methodology.
Predictions made at the introduction of the pneumococcal conjugate vaccine in the
US about the potential for serotype replacement were confirmed by early data from
the ABCs network
2
. Among all non-vaccine serotypes, 19A has increased in frequency most, for a variety
of possible reasons
12
. Between 1998 and 2007, rates of invasive pneumococcal disease caused by serotype
19A increased ~2.5-fold and its share of disease at all ages increased from approximately
3% to 20%, reaching 47% in children under 5
3
(Figure 2; Supplementary Table 4). Capsular switching as a means of vaccine escape
was also predicted in advance, but the success of the vaccine escape lineage P1 is
still remarkable. P1 isolates were first detected in New York (n = 3) and Connecticut
(n = 1) in 2003
14
and have spread westward in subsequent years. Since 2003, P1 has become one of the
most prevalent genotypes in post-vaccine populations, having been recovered from 175
patients of all ages by the end of 2007. In contrast, three of the other four vaccine
escape lineages we detected, P3-P5, have each been seen only once in our screen, and P2
has been observed 8 times, predominantly in the northeastern US.
The spread of vaccine escape recombinant P1 and, to a lesser extent, P2 has also given
us an unprecedented opportunity to observe pneumococcal evolution in real populations
in real time. Genomic analyses of the evolution of the P1 and P2 lineages demonstrate
that recombination events have continued to occur and imply that when recombination
can be definitively inferred it tends to involve multiple genomic fragments (Supplementary
Note; Supplementary Figure 3). With no ascertainment bias to favour recombination
episodes involving a large transferred sequence such as the capsular locus, these
data are consistent with a model of variable numbers of smaller transferred sequences
and with published estimates of the relative rates of recombination and mutation
20,21
. Depending on the assumptions made, the proportion of new variation within the P1
and P2 recombinant lineages that has arisen due to recombination can be estimated
to be at least ~60% (details in Supplementary Note).
In this study we have observed, in 5 separate vaccine escape lineages and during the
subsequent evolution of two of those lineages, 11 separate episodes of recombination
leading to the import of sequences into a pneumococcal genome. In only two of these
episodes was there no evidence for transfer of multiple separate fragments, and so
we conclude that multi-fragment recombination is commonplace in pneumococcal populations.
One consequence of this is that even the terminology “capsular switch” is potentially
misleading because it suggests that only the capsular locus has been transferred.
Documentation of multi-fragment recombination in real populations is particularly
interesting because it has profound consequences for the way in which an organism
may be able to traverse its evolutionary fitness landscape. For example, moderate
to high level beta-lactam class antimicrobial resistance is usually associated with
horizontal transfer of variants at three dispersed pbp loci, and drug resistance took
about two decades after the introduction of penicillin to first emerge in pneumococcus.
Having emerged, penicillin resistance determinants now spread rapidly from one genetic
background to another under drug-induced selective pressure and pose a significant
threat to treatment. Multi-fragment recombination could also have been important in
generating the (currently unknown) factor(s) which allowed P1 to surpass many other
non-vaccine lineages in invasive disease incidence. The recent introduction of a 13-valent
pneumococcal conjugate vaccine including serotype 19A in the US and elsewhere is likely
to reduce significantly the impact of serotype 19A on vaccinated populations, but
how many, and which, serotypes will be needed for a vaccine that provides acceptable
long-term disease reduction are still unknown.
Modern high-throughput molecular technologies now allow typing of bacterial isolates
on a genomic scale, thus providing much greater resolution than current standard
MLST approaches. We have described a proof-of-principle experiment which confirms
the potential for combining genome-scale genetic information with epidemiological
data, in this case in better understanding serotype replacement following introduction
of a conjugate vaccine. We identified five independent instances of vaccine escape
through capsular switching from serotype 4 to 19A. Our genomic data provide strong
evidence that in each case the recombination event generating the capsular switch
involved simultaneous import of multiple and often large additional DNA fragments
around the genome. This process has far-reaching consequences for the evolution of
bacteria and their response to the strong selection imposed by vaccines or antimicrobials.
It may also play a role in the striking success of the P1 vaccine escape lineage as
an invasive pathogen among the 19A lineages present after vaccine introduction. While
vaccine escape through capsular switching was correctly predicted in advance of the
vaccination programme, our analyses show that, particularly in the light of complex
recombination mechanisms, its specific consequences are difficult to predict.