Many studies have shown that primary prostate cancers are multifocal1-3, and are composed
of multiple genetically distinct cancer cell clones4-6. Whether or not multiclonal
primary prostate cancers typically give rise to multiclonal or monoclonal prostate
cancer metastases is largely unknown, although studies at single chromosomal loci
are consistent with the latter. Here we show through a high-resolution genome-wide
SNP and copy number survey that most if not all metastatic prostate cancers have monoclonal
origins and maintain a unique signature copy number pattern of the parent cancer cell
while also accumulating a variable number of separate subclonally sustained changes.
We find no relationship between anatomic site of metastasis and genomic copy number
change pattern. Taken together with past animal and cytogenetic studies of metastasis7,
and recent single-locus genetic data in prostate and other metastatic cancers8-10,
it appears that despite common genomic heterogeneity in primary cancers, most metastatic
cancers arise from a single precursor cancer cell. Methodologically, this study establishes
that genomic archeology of multiple anatomically separate metastatic cancers in individuals
can be used to define the salient genomic features of a parent cancer clone of proven
lethal metastatic phenotype.
DNA was isolated from 94 anatomically separate cancer sites in 30 men who died from
metastatic prostate cancer (Fig. 1a) and was analyzed by chromosomal metaphase-based
comparative genomic hybridization (cCGH) and/or by Affymetrix Genome-Wide Human SNP
(single nucleotide polymorphism) Array 6.0 analysis (Affy6).
Eighty-five sites from 29 of the subjects were studied by cCGH. To assess possible
clonal relationships of metastasizing cells, two or more anatomically separate cancerous
lesions were studied by cCGH in 24 subjects (80 samples, range 2–8 samples/subject).
Significance Analysis of Microarrays (SAM)11 was used to detect 218 loci across the
genome which were affected by either copy number gain or loss. Copy number data from
these 218 loci for the 80 samples were analyzed by unsupervised hierarchical clustering
(Fig. 1b). For 15 of 24 subjects (63%) cCGH data from all samples clustered by subject
of origin, suggesting a strong clonal relationship of separate metastatic samples
in the majority of subjects.
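The notion of samples that "clustered by subject of origin" can be made concrete with a small sketch. The code below is illustrative only and is not the software used in this study: it assumes average linkage (the linkage rule is not stated in the text) and borrows the Pearson dissimilarity named in the Methods for the Affy6 clustering. The function names and the toy profiles are hypothetical. A subject is counted as clustering perfectly when its samples form one exclusive branch of the dendrogram.

```python
from itertools import combinations
from math import sqrt


def pearson_dissimilarity(x, y):
    """Return 1 - Pearson correlation of two equal-length copy number profiles.

    Assumes neither profile is constant (a constant profile has zero variance).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return 1.0 - cov / (norm_x * norm_y)


def dendrogram_clades(profiles):
    """Average-linkage agglomerative clustering (linkage choice is an assumption).

    Returns every cluster (frozenset of sample indices) that appears as a node
    of the resulting dendrogram, including the single samples.
    """
    def linkage(a, b):
        total = sum(pearson_dissimilarity(profiles[i], profiles[j])
                    for i in a for j in b)
        return total / (len(a) * len(b))

    clusters = [frozenset([i]) for i in range(len(profiles))]
    clades = set(clusters)
    while len(clusters) > 1:
        # Merge the two clusters with the smallest average dissimilarity.
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c != a and c != b] + [a | b]
        clades.add(a | b)
    return clades


def clusters_by_subject(profiles, subjects):
    """Map each subject to True if its samples form one exclusive clade."""
    clades = dendrogram_clades(profiles)
    return {
        s: frozenset(i for i, label in enumerate(subjects) if label == s) in clades
        for s in set(subjects)
    }


# Toy profiles (made-up values, not study data): +1 gain, -1 loss, 0 no change.
toy_profiles = [
    [1, 1, -1, 0, 0, 0],
    [1, 1, -1, 0, 0, 1],
    [0, -1, 0, 1, 1, 0],
    [0, -1, 0, 1, 1, -1],
]
toy_subjects = ["A", "A", "B", "B"]
print(clusters_by_subject(toy_profiles, toy_subjects))
```

A subject with a single sample trivially satisfies this criterion, which is why only subjects with two or more samples are informative for such a check.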
Subject-specific “perfect” clustering of metastatic cCGH copy number data in a substantial
number (15/24) of the subjects with multiple anatomically separate samples led us
to explore their further association with genomic copy number12 through an unsupervised
cluster-subject matching test, a supervised classification-based assessment13, 14,
and through distance-based analysis, all of which reject the null hypothesis that
observed clustering is random.
To better visualize the relationships of the copy number data among the 80 samples
studied, we displayed the full cCGH dataset via the top discriminatory components
in 3-D Euclidean space extracted by weighted Fisher criterion-based discriminatory
component analysis (wFC-DCA)15, where the overall intrasubject copy number pattern
similarity of both clustering cases (for example case 17, the cyan circles) and nonclustering
cases (for example case 33, the magenta triangles) is apparent (Fig. 1c). Taken together
with the cCGH data clustering results, these data suggest that in the majority of
cases, metastatic cells in a given subject may have clonal origins.
To further examine potential clonal origins, we performed Affy6 analysis in a subset
of samples from 14 subjects where at least three metastatic deposits were available
for analysis. Affy6 genomic position resolution is approximately 5000x cCGH resolution
with an average physical distance of ~700 base pairs between a total of over 1.8 million
probes. Subject-specific “perfect” clustering is observed for all 58 samples studied
from 14 subjects (Fig. 1e). Permutation-based and statistical analyses similar to
those performed for the cCGH results showed evidence to reject the null hypothesis
of random subject-specific clustering. Interestingly, projection of the Discriminatory
Component Analysis results for the Affy6 data (Fig. 1d) shows a tighter and more “exclusive”
association among anatomically distinct cancer samples from the same subject than
that seen in the cCGH data (Fig. 1c), suggesting that misclustering of samples in
the cCGH data is likely due to lower assay resolution.
Analysis of the probability of common origins of DNA samples from different individuals
is now relatively routine. Proving common origins of different populations of mutant
cells from the same individual is not routine, but has long been a topic of inquiry
in relation to the origins of metastatic cancer16. Previous analysis of cytogenetic,
isoenzyme, and X-chromosome inactivation9, 17, 18 data, as well as cell-line based
experimental metastasis studies7, 9, and more recent specific analyses of one or a
few genetic loci (including PTEN and TMPRSS2-ETS gene aberrations) have suggested
clonal origins of metastatic melanoma and prostate cancers8-10, 19, 20 in at least
a substantial percentage of metastatic cancer patients.
As a source of markers of clonality which might be more informative due to their unique
genomic position and relative copy number change, we sought to further test the hypothesis
of clonal origins of metastatic prostate cancer by examining allele-specific patterns
of gain and loss, and regions of the genome affected by homozygous deletion.
Analysis of a representative sample of allele-specific copy number data from two subjects
is shown (Figs. 2 and 3). A signature pattern of copy number gains and losses is present
in each sample studied, and elements of this signature are present in every anatomically
separate cancer DNA sample. Changes present in all samples are here termed omniclonal,
and other changes termed subclonal are present in only a subset of samples studied
in a given subject. The presence of omniclonal changes unique in chromosomal position
and copy number strongly suggests that all metastatic cancer cells in these subjects
had a single clonal cancer cell origin, and suggests that studies of multiple metastases
in cancer patients can be used to derive a set of changes present in the ultimate
parent cancer cell. The presence of omniclonal and subclonal changes depicted (Figs.
2, 3, and 1e) provides a picture of strikingly high-fidelity maintenance of a subject-specific
signature set of copy number changes derived from a single parent cancer cell, with
variable degrees of additional subclonally maintained changes. Fig. 1e also suggests
definable “personalities” of omniclonal and subclonal changes among subjects, with
for example a moderate number of medium-sized omniclonal changes and rare subclonal
changes in subject 17, relatively sparse omniclonal changes in subject 19 with relatively
greater numbers of subclonal changes, and a strikingly high number of relatively small
omniclonal changes in subject 33 with a moderate number of subclonal changes.
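The omniclonal/subclonal distinction defined above amounts to a set intersection across a subject's samples. The following is a minimal sketch under simplifying assumptions: each copy number change is reduced to a hashable identifier that matches exactly between samples (real segment boundaries would need tolerance-based matching), and the function name, site names, and change labels are hypothetical, made-up values rather than study data.

```python
def classify_changes(sample_changes):
    """Split one subject's copy number changes into omniclonal and subclonal.

    sample_changes maps a sample name to the set of change identifiers
    detected in that sample. Omniclonal changes are present in every sample;
    subclonal changes are present in only a subset of samples.
    """
    if not sample_changes:
        return set(), set()
    change_sets = list(sample_changes.values())
    omniclonal = set.intersection(*change_sets)
    subclonal = set.union(*change_sets) - omniclonal
    return omniclonal, subclonal


# Toy input (made-up identifiers, not study data).
toy_subject = {
    "site_1": {("region_1", "loss"), ("region_2", "loss"), ("region_3", "gain")},
    "site_2": {("region_1", "loss"), ("region_2", "loss")},
    "site_3": {("region_1", "loss"), ("region_2", "loss"), ("region_4", "loss")},
}
omniclonal, subclonal = classify_changes(toy_subject)
print(sorted(omniclonal))
print(sorted(subclonal))
```

In this toy case the two changes shared by all three sites would be taken as candidates for the signature of the parent cancer cell, and the remaining two as subclonally sustained changes.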
We found 17 homozygous deletions, each unique in chromosomal position and unique to one
of the 10 study subjects in which they were found (Supplementary Table 8). Fifteen
of 17 (88%) of these homozygous deletions were similarly present in all samples studied
from a given subject (“omniclonal”), strongly suggesting common cellular clonal origins
of the metastatic cancer cell populations in each of these subjects. Genes affected
by clonal homozygous deletions include PTEN, BRCA2, TGFBR2, PCAF, PR, FHIT, PPP2R2A,
BNIP3L, CDKN2A, and ACVRL1.
The spectrum of anatomic sites affected by metastasis in men with disseminated prostate
cancer is variable21, and could possibly be explained by variations at the genomic
level. To examine whether specific clonal or subclonal changes are associated with
specific anatomic sites of metastasis21, we used permutation-based analyses to compare
observed cCGH and Affy6 copy number data from all 85 DNA samples grouped by anatomic
location, and find no statistical evidence of copy number pattern similarity on this
basis (Supplementary Figs. 4 and 8).
Prostate cancer is more aggressive at every stage in African-Americans than in other
racial groups22. We used permutation-based analysis to compare copy number findings
in prostate cancer samples from four African-American men represented in the cCGH
data, and two African-American men represented in the Affy6 data. No difference
in overall genomic pattern was detected (data not shown), though results are based
on a small sample size.
Androgen pathway alterations are thought to play crucial roles in the progression
of prostate cancer to a lethal disease, with upregulation of Androgen Receptor (AR)
gene expression a consistent finding. With respect to AR copy number, we found that
only two subjects (17, 34) show a normal single copy of AR present in all metastatic
sites. Seven subjects (3, 12, 19, 22, 31, 32, 33) show gain from 2–8 copies, with
most of these falling stably in the 2–3 range. Five subjects (16, 21, 24, 28, 30)
show AR gains of 9–40 copies, similar to high level gains found in a minority of cases
in previous in situ hybridization based studies23, 24. Interestingly, as average AR
copy number in each subject’s set of metastases increases, the variation in copy number
among metastatic samples from a given subject also tends to increase, although in
most subjects overall AR copy number is relatively stable across metastatic sites,
consistent with clonal origins with subclonal variations
as seen in the overall copy number analysis (Supplementary Fig. 11).
Fusion transcript formation between TMPRSS2 and ETS family members has been shown
to occur in 50% or more of all prostate cancers, with TMPRSS2-ERG fusion transcripts
being most commonly found. With regard to TMPRSS2-ERG fusions, we found heterozygous
deletion between ERG and TMPRSS2 in 7 of 14 subjects studied by Affy6. When it was
present, the same deletion event was found in each metastatic site in a given case,
as previously observed by Mehra et al10. We examined TMPRSS2-ERG fusion transcript
status in 18 anatomically separate metastatic prostate cancer samples from a subset
of these subjects, and it is uniformly present in all 9 samples studied from subjects
with ERG deletion, and uniformly absent from all 9 samples studied from subjects without
ERG deletion (Supplementary Table 9 and Supplementary Fig. 10). These observations
are consistent with this deletion and resulting fusion transcript formation being
a common early, pre-metastatic event, although evidently one that is not required
for successful tumor cell dissemination. A more thorough cataloging of all ETS family
gene fusions will be necessary to understand their role in tumor progression.
Beyond the demonstrated presence of clonal and subclonal changes in each subject’s
set of metastatic samples, overall patterns of genomic change shown (Fig. 1e) vary
greatly between subjects. To test whether these overall patterns could be related
to therapy received, we compared clonal and subclonal change frequencies (Supplementary
Table 10) in 7 subjects having undergone DNA-damaging chemotherapy (cyclophosphamide,
topotecan, etoposide, and/or carboplatin) vs 7 subjects who did not receive DNA-damaging
chemotherapy, and found no statistical differences (Supplementary Table 11 and Supplementary
Figs. 12-14).
To our knowledge, the study reported here provides the first full high-resolution
genomic overview of copy number changes in multiple metastatic cancers in individual
humans, analysis of which adds substantial depth to the clonal origins discussion.
This study provides the most comprehensive evidence to date that all or at least the
vast majority of patients with metastatic prostate cancer have cancers that originated
in a single aberrant cell, a finding likely to extend to other cancers, and demonstrates
that it is feasible to use metastatic site comparison to derive the set of changes
present in the parent cancer cell in each subject. The findings also demonstrate that
there are a substantial but variable number of subclonally maintained changes in metastatic
cancer sites in a given subject.
Our findings cast new light on previously published data suggesting that primary prostate
cancers are often multifocal1-3 and often have multiple separate clonal parent cell
origins4-6. Our data show that lethal metastatic prostate cancer cells derive from
a common parent cell, and also show that subclonal changes arise and are sustained.
Studies of anatomically separate primary cancers from individual subjects using a
genome-wide set of loci are indicated to revisit this question and determine whether
previous studies were underpowered and detected subclonally maintained differences
but missed clonal changes, or whether primary prostate cancer is often truly multifocal
as is currently widely believed.
Our recent studies of relative hyper- and hypo- methylation at selected CpG islands
in the same subjects’ samples suggest that some hypermethylation changes are “clonal”
within a given subject25, while hypomethylation changes are more heterogeneous26.
These findings, together with transcript and protein expression studies in a similar
set of subjects studied by Shah et al27 suggest that full pathway-based integration
of genetic, epigenetic, and protein level data from multiple metastatic samples in
a larger series of patients with metastatic cancer may be a uniquely powerful way
to establish well-prioritized lists of targets for development of new drugs and
diagnostics.
Several aspects of these data are relevant to tying together what is currently known
at the macrogenomic level about metastatic cancer in humans. First is to emphasize
that strong evidence of clonal origins does not mean that all cells are genomically
identical in a given metastatic cancer site. Cytogenetic and other studies show that
a degree of genomic copy number “wobble” exists in metastatic cancer cells18. The
data presented here show that despite this “wobble”, a relatively clean, clear, and
highly individual-specific pattern of copy number changes occurs in metastatic prostate
cancers in the majority of cases, and that this pattern is maintained in aggregate
among multiple metastatic sites in individuals with surprising fidelity when compared
to cell-line based metastasis studies28.
Second, the findings reported here are based on the aggregate signal from millions
of metastatic prostate cancer cell genomes represented in each sample studied. It
is possible but seems unlikely that two or more clonal populations dependent upon
each other for metastatic “success” could have quite different copy number changes
that sum to the data we observe here. Third, these data cannot rule out an alternative
hypothesis where clonal-appearing and individually unique copy number patterns observed
could be a result of individual subject-specific requirements for successful metastasis.
In this alternate scenario, polyclonal, highly genomically unstable cancer cells would
“succeed” only if they met very tight copy number gain and loss requirements specific
to the subject. It is hard to imagine a feasible biological mechanism through which
such specificity could arise from autochthonous cells, so this hypothesis appears
unlikely to be correct. Finally, if in most patients metastatic prostate cancer cells
have a common clonal origin, this suggests that cancer cells with stem cell properties
obtain these properties in the context of a shared set of individual-specific copy
number changes, consistent with recent findings29.
Our results show that metastatic prostate cancer deposits in individual men have clonal
origins in most if not all cases. Using subject A17 as representative of all subjects
in the current study, and considering reports suggesting that prostate cancer cells
may lie dormant in the bone marrow for many years30, 31, spread of cancer cells with
common clonal origins occurs either in a “Direct Clonal” or “Indirect Clonal” pattern
as illustrated in Fig. 5. The large tan circle represents the prostate, and the black
circle represents prostate cancers capable of lethal spread. Green circles represent
local prostate cancers incapable of spread, and yellow circles represent nonlethal
spreading cancer as suggested by data from Ellis et al30. We found no significant
difference in copy number patterns in prostate cancer foci isolated from the prostate
at autopsy and metastases from various sites in the 5 subjects where prostate cancer
foci were isolated from the prostate at autopsy. “Direct clonal” lethal metastasis
provides the simplest explanation of these findings, since “Indirect” metastasis would
require that the metastatic prostate cancer metastasize back to the prostate as illustrated
by the dashed arrows.
In conclusion, these data suggest that in most if not all metastatic prostate cancer
cases, the origins of cancer cells within disparate metastatic prostate cancer deposits
can be traced to a single genomically aberrant prostate cell whose macrogenomic copy
number changes are relatively stably replicated with each cell division. Upon this
relatively stable base of copy number change, additional copy number changes occur
and are subclonally sustained. These findings have potentially important implications
for treatment of metastatic prostate cancer: understanding and predicting therapeutic
success in an individual will likely depend on the degree of clonal uniformity as
well as the specific genomic alteration pattern for metastatic lesions in a given
patient. Hypothetically, since high clonal diversity should improve cancer cell survival
in response to change, the degree of clonality of a given patient’s metastatic prostate
cancer cells could have as important an impact on therapeutic response as the specific
pattern of genomic changes found in the prostate cancer cells. Additional studies
are needed to determine how the macrogenomic monoclonality suggested in the majority
of metastatic prostate cancer patients studied here relates to what is found at the
microgenomic (individual base pair) level.
Methods Summary
PELICAN Autopsy Study of Lethal Prostate Cancer
Ninety-four cancer samples were studied from 30 men who died of prostate cancer and
underwent autopsy as part of the Project to Eliminate Lethal prostate CANcer (PELICAN)
rapid autopsy program at the Johns Hopkins Medical Institutions (JHASPC). The program was
initiated in 1994, and all JHASPC study subjects gave informed consent to participate as part of
a Johns Hopkins Medicine IRB-approved protocol. All subjects underwent androgen deprivation
during the course of their treatment for metastatic prostate cancer, and died between
1995 and 2004. Tissues were snap-frozen and cryostat-microdissected and DNA purified
as described previously20. Subject and sample data including distribution of samples
studied by cCGH and Affy6 array technology are contained in Supplementary Table 1.
Mean estimated cancer sample DNA purity based on hematoxylin and eosin histology is
88% (range 60–99%).
Chromosomal Comparative Genomic Hybridization (cCGH) was performed at resolution of
389 cytogenetic bands (excluding chromosome Y) in 85 cancer DNA samples from 29
subjects. cCGH data (Supplementary Table 3) are of lower resolution but are highly
concordant with array-based CGH (aCGH) results32. CGH was done as described previously33
and as detailed in Supplementary Methods.
Affymetrix Genome-Wide Human SNP array 6.0 analysis
Genome-Wide Human SNP array 6.0 chips (Affy6) were purchased from Affymetrix, Inc.
All of the reagents used for the assay were obtained from manufacturers recommended
by Affymetrix. We amplified, purified, fragmented and labeled the genomic DNA, hybridized,
washed and stained the Affy6 arrays according to the manufacturer’s instructions (Supplementary
Methods). We used Partek Genomic Suite (PGS) version 6.4 for allele specific and non-allele
specific analyses using default settings (http://www.partek.com/Tutorials) unless
otherwise specified. Sixteen subject-paired noncancerous samples from 14 subjects
were used to create a copy number baseline (Supplementary Table 2). For each of 58
cancer DNA samples studied by Affy6, we then generated a DNA copy number estimate
for all ~1.8 million probes on the Affy6 chip, and then segmented these data into
52221 channels using the PGS Segmentation algorithm. Autosomal and sex-chromosomal
segmentation data for the 58 samples were then analyzed in PGS using unsupervised hierarchical
clustering (Pearson’s Dissimilarity algorithm) to produce data shown (Fig. 1e). Allele-specific
genomic analysis depicted (Figs. 2, 3 and Supplementary Table 8) was performed using
the PGS allele-specific analysis algorithm that includes genotype information and
allele-specific intensities from paired samples to estimate DNA copy number for each
heterozygous SNP, and is further described in Supplementary Methods.
Statistical Analysis
Permutation-Based Classification Analysis. For the cCGH data, we considered each metastatic
DNA sample with 218 SAM-defined (Supplementary Methods and Supplementary Tables 4
and 5) CGH measures as a vector of 218 elements and defined the distance between two
samples as the Euclidean distance between the two vectors. For the Affy6 data, we considered
each metastatic DNA sample with 52221 measures as a vector of 52221 elements and defined
the distance between two samples as the Euclidean distance between the two vectors. All
the samples were divided into a training set and a testing set. The predicted label
for a sample in the testing set is the label of the subject whose training-set sample
mean (the mean of all training samples belonging to that subject) lies at the smallest
distance from the testing sample (nearest mean classifier)13. Leave-One-Out Cross Validation
(LOOCV) is then performed utilizing one sample as the test sample and the remaining
samples as the training set13. This is repeated such that every sample is used once
as testing sample. If the predicted label coincides with the original label, it is
correctly classified; otherwise, it is in error. We apply the nearest mean classification
method to classify the samples and utilize the LOOCV to estimate the classification
error. The error rate is calculated as the percentage of wrongly classified samples
over all samples. Statistical tests on cCGH and Affy6 data are one-tailed.
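The nearest mean classification with LOOCV described above can be sketched in a few lines. This is a hedged illustration, not the code used in the study: the function name and toy vectors are hypothetical, only the error-rate calculation is shown (not the permutation step), and it assumes every subject contributes at least two samples, since a subject with a single sample disappears from the training set when that sample is held out.

```python
from math import dist  # Euclidean distance, Python 3.8+


def nearest_mean_loocv_error(samples, labels):
    """Leave-one-out error rate of a nearest mean classifier.

    samples: list of equal-length numeric vectors (one per DNA sample).
    labels:  subject label for each sample.
    Returns the fraction of held-out samples assigned to the wrong subject.
    """
    errors = 0
    for i, (test_vector, true_label) in enumerate(zip(samples, labels)):
        # Training set: every sample except the held-out one, grouped by subject.
        by_subject = {}
        for j, (vector, label) in enumerate(zip(samples, labels)):
            if j != i:
                by_subject.setdefault(label, []).append(vector)
        # Per-subject mean vector computed from the training samples only.
        means = {
            label: [sum(column) / len(vectors) for column in zip(*vectors)]
            for label, vectors in by_subject.items()
        }
        predicted = min(means, key=lambda label: dist(test_vector, means[label]))
        if predicted != true_label:
            errors += 1
    return errors / len(samples)


# Toy vectors (made-up values, not study data).
toy_samples = [[0, 0], [0, 1], [5, 5], [5, 6]]
toy_labels = ["A", "A", "B", "B"]
print(nearest_mean_loocv_error(toy_samples, toy_labels))  # 0.0
```

Multiplying the returned fraction by 100 gives the error rate as a percentage of all samples.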
We also tested cCGH and Affy6 data for evidence of clonality by testing the hypothesis
that there is no difference between the “between-subject” distance and “within-subject”
distance by considering each cCGH sample with 218 CGH measures as a vector of 218 elements
and each Affy6 sample with 52221 measures as a vector of 52221 elements. Let Dbm
be the average “between-subject” distance over all sample pairs belonging to different
subjects and Dbw be the average “within-subject” distance over all sample pairs belonging
to the same subject. Using the summary statistic Ss = Dbm − Dbw, we compared the experimentally
observed Ss to the distribution of Ss calculated from 100,000 random permutations of
the subject labels34, 35. The experimentally observed cCGH data Ss value is 3.8159, and
the maximum value of Ss in the permuted data is 0.8467 (Supplementary Fig. 3), rejecting
the null hypothesis with P < 0.00001. The experimentally observed Affy6 data Ss value
is 110.24, and the maximum value of Ss in the permuted data is 19.62 (Supplementary
Fig. 7), rejecting the null hypothesis with P < 0.00001. Additional statistical methods
details are contained in Supplementary Methods.
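The distance-based permutation test can be illustrated as follows. This is a minimal sketch rather than the analysis code used here: the function names, the default number of permutations, the fixed random seed, and the toy vectors are all hypothetical, and the P value estimator (hits + 1)/(permutations + 1) is a common convention that we assume for illustration. Ss is computed as the average between-subject distance minus the average within-subject distance, and the subject labels are then shuffled to build the null distribution.

```python
import random
from itertools import combinations
from math import dist  # Euclidean distance, Python 3.8+


def summary_statistic(samples, labels):
    """Ss = mean between-subject distance - mean within-subject distance."""
    between, within = [], []
    for (x, label_x), (y, label_y) in combinations(zip(samples, labels), 2):
        (within if label_x == label_y else between).append(dist(x, y))
    return sum(between) / len(between) - sum(within) / len(within)


def permutation_test(samples, labels, n_permutations=1000, seed=0):
    """One-tailed permutation test of the observed Ss.

    Returns (observed Ss, estimated P value). The estimator
    (hits + 1) / (n_permutations + 1) is an assumed convention.
    """
    rng = random.Random(seed)
    observed = summary_statistic(samples, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)  # reassign subject labels at random
        if summary_statistic(samples, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_permutations + 1)


# Toy vectors (made-up values, not study data): three subjects, two samples each.
toy_samples = [[0, 0], [0, 1], [5, 5], [5, 6], [10, 0], [10, 1]]
toy_labels = ["A", "A", "B", "B", "C", "C"]
observed_ss, p_value = permutation_test(toy_samples, toy_labels)
print(observed_ss, p_value)
```

With so few toy samples, a random shuffle often reproduces the original grouping, so the estimated P value cannot become very small; the 100,000 permutations and larger sample sets used in the study allow far smaller P values to be resolved.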