Introduction

Ovarian carcinoma is a heterogeneous disease. On the basis of histopathological examination, pathologists classify ovarian carcinoma into serous, clear cell, endometrioid, and mucinous subtypes. Each of these subtypes is associated with different genetic risk factors and molecular events during oncogenesis [1,2], and is characterized by distinct mRNA expression profiles [3,4]. These subtypes differ dramatically in frequency when early stage carcinomas (where the majority are nonserous carcinomas) and advanced stage carcinomas (which are predominantly of serous subtype) are compared. Oncologists have noted that subtypes respond differently to chemotherapy. The dismal response rate of clear cell carcinomas (15%) contrasts sharply with that of high-grade serous (80%), resulting in a lower 5-y survival for clear cell compared with high-grade serous carcinoma in patients with advanced stage tumors (20% versus 30%) [7,8]. Therefore, the National Cancer Institute (NCI) State of Science meeting recently singled out clear cell carcinoma as a candidate for clinical trials to identify more active therapy than what is currently available. Although these data suggest substantial differences between subtypes, ovarian carcinoma is typically approached as a monolithic entity by researchers and clinicians. This practice impedes progress in understanding the biology or improving the management of the less common ovarian carcinoma subtypes. We hypothesized that correlations between biomarker expression and stage at diagnosis or prognosis would reflect subtype variation in biomarker expression. To test this hypothesis, we correlated protein expression rates of a panel of 21 candidate biomarkers with stage at diagnosis and disease-specific survival (DSS) in a large cohort of ovarian carcinomas and also analyzed these associations within ovarian carcinoma subtypes.
Methods

Study Population

The Cheryl Brown Ovarian Cancer Outcomes Unit is an ovarian cancer registry serving a population of approximately four million people in British Columbia. For the period 1984–2000, 2,555 patients with ovarian carcinoma were recorded in the registry. From these, 834 patients were selected on the criterion of being free of macroscopically apparent residual disease after primary surgery, and all histological slides underwent gynecopathological review. Subtypes were assigned according to refined World Health Organization (WHO) criteria as recently described. A further 91 patients diagnosed in stage 1a or 1b, grade 1 were excluded from the study because of excellent prognosis; only 3% of women in this group died of disease during the follow-up period. From the remaining patients, 541 tissue blocks were available and used for tissue microarray (TMA) construction. A representative area of each tumor was selected and duplicate 0.6-mm tissue cores were punched to construct a TMA (Beecher Instruments). Review after TMA construction revealed that 23 cases were not adequately sampled. Of these 23 cases, 20 mixed carcinomas (>10% of tumor showing a second histological cell type) were excluded because their highest grade component was not sampled on the TMA. Another 18 cases were either of rare histological types (including seven undifferentiated, six transitional, and one squamous carcinoma) or could not be specified (five cases), and were also excluded. This approach resulted in a study population of 500 cases belonging to one of the four major cell types (serous, endometrioid, clear cell, and mucinous) (Table 1). The serous subtype was further subdivided into low- and high-grade. Two cases of endometrioid carcinomas containing minor mucinous or low-grade serous components (>10%) are included in the study.
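The case accounting in this section can be checked with a short tally. This is only an arithmetic sketch: the counts are those stated in the text, and the variable names are illustrative rather than taken from the registry.

```python
# Counts as stated in the Study Population text; variable names are illustrative.
selected_no_residual_disease = 834
excluded_stage_1a_1b_grade_1 = 91
blocks_used_for_tma = 541
inadequately_sampled = 23
rare_or_unspecified_type = 18

# Patients remaining after the low-risk exclusion.
remaining_after_exclusion = selected_no_residual_disease - excluded_stage_1a_1b_grade_1

# Cases on the TMA that belong to one of the four major cell types.
study_population = blocks_used_for_tma - inadequately_sampled - rare_or_unspecified_type

print(remaining_after_exclusion, study_population)
```

The second figure agrees with the study population reported in Table 1.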
Table 1 Study Population

Adjuvant Therapy and Follow-up

All patients received standardized treatment according to the provincial treatment guidelines of the British Columbia Cancer Agency (BCCA) [12,13]; however, 3% of patients refused the advised adjuvant chemotherapy and were excluded from survival analysis. For a further 3%, adjuvant therapy was not advised; hence, 94% received adjuvant platinum-based chemotherapy (with or without abdomino-pelvic radiotherapy). Outcomes were tracked via the Cheryl Brown Ovarian Cancer Outcomes Unit at the BCCA and were available for all patients. Follow-up information was obtained through the electronic patient record of the BCCA or the patient's paper chart. Examples of documentation used to ascertain vital status include BCCA progress notes, death certificates, and correspondence indicating status from other care providers. Ovarian carcinoma-specific death was defined as death for which ovarian cancer was the primary or underlying cause. Death from concurrent disease (i.e., second malignancy) was coded as “died of other cause.” Death resulting from toxicities relating to treatments for ovarian carcinoma was coded as “died of toxicities.” Abstracted data were reviewed by an experienced medical oncologist (K.S.). Median follow-up time was 5.1 y. Approval for the study was obtained from the Research Ethics Board of the University of British Columbia.

Marker Selection and Immunohistochemistry

The goal of our marker selection was to use proteins that are consistently expressed in ovarian carcinomas and have been reported as prognosticators (p53, p21, Ki-67, PR, WT1) [14–19] or are being developed as early detection markers in ovarian carcinomas. This approach biased our results towards selection of markers mostly derived from and expressed in the high-grade serous subtype. Serial 4-μm sections were cut for immunohistochemical (IHC) analysis and run through an automated protocol including heat antigen retrieval (Ventana System).
The antibodies and suppliers are listed in Table 2. Specificity was determined by using appropriate positive controls, with omission of primary antibody as a negative control.

Table 2 Antibodies

Evaluation of Immunohistochemistry

One or more pathologists (MK, DNI, or AR) scored these biomarkers after scanning with a BLISS scanner (Bacus Laboratories/Olympus America). Except for KISS1 and p53, for which recently published cut-off points were used, all markers were dichotomized into negative and positive cases (cut-off values for positive versus negative for all markers except Ki-67 are shown in Table S1). Ki-67 was assessed as a continuous variable, as the percentage of positive tumor cells, using automated image analysis software. Prior to analysis, a pathologist (MK) manually selected regions of interest so as to avoid noncancerous cellular areas. The median was used to dichotomize Ki-67 into low- and high-expressing groups.

Statistical Analysis

Contingency analysis and Pearson's Chi2 statistic were used to test the change in the distribution of biomarker expression across stage and subtypes. The Kruskal-Wallis test was used to determine whether Ki-67 was differentially expressed across stage and subtypes. Univariable DSS was illustrated with Kaplan-Meier curves, and subgroup differences were tested with a univariable Cox model. Multivariable DSS was tested using the Cox proportional hazards model. The Cox proportional hazards model was used to determine risk ratios (RRs) and p-values for all univariable and multivariable DSS analyses. Additionally, to assess significance in the presence of some small subgroups, permutation tests were performed and permutation p-values reported. Under the null hypothesis of no association of biomarker status with survival (for survival analyses) or stage/histology (for contingency table analyses), the biomarker outcomes are exchangeable across cases.
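The exchangeability argument above can be illustrated for the contingency table case with a minimal sketch. It is not the authors' code: it permutes a single dichotomized marker (rather than each case's whole biomarker panel), the function names and toy data are hypothetical, and only the Python standard library is used.

```python
import random
from collections import Counter


def pearson_chi2(marker, group):
    """Pearson Chi2 statistic for the marker-by-group contingency table."""
    n = len(marker)
    row_totals = Counter(marker)
    col_totals = Counter(group)
    cells = Counter(zip(marker, group))
    stat = 0.0
    for r, row_n in row_totals.items():
        for c, col_n in col_totals.items():
            expected = row_n * col_n / n
            stat += (cells[(r, c)] - expected) ** 2 / expected
    return stat


def permutation_p_value(marker, group, n_perm=10_000, seed=0):
    """Share of permuted statistics as extreme or more extreme than observed."""
    rng = random.Random(seed)
    observed = pearson_chi2(marker, group)
    shuffled = list(marker)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # exchange marker outcomes across cases
        if pearson_chi2(shuffled, group) >= observed:
            hits += 1
    return hits / n_perm


# Toy data (hypothetical): marker status perfectly tracks the group label.
toy_marker = [1] * 10 + [0] * 10
toy_group = ["A"] * 10 + ["B"] * 10
print(pearson_chi2(toy_marker, toy_group))
print(permutation_p_value(toy_marker, toy_group, n_perm=2_000))
```

Because the reference distribution is built from the data themselves, this approach does not rely on large-sample approximations, which is the motivation given for using it with small subgroups.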
For the survival analyses, permutations of biomarker outcomes were performed within stage/subtype subgroups, to preserve the observed distribution of biomarker frequencies within subgroups. Permutation was performed by exchanging each case's entire biomarker panel at random without replacement among cases, to preserve correlation structure within case. A total of 10,000 permutation replications were performed. p-Values were obtained by finding the number of permutation sample estimates (Cox model parameter estimate for survival analyses, Pearson Chi2 statistic for contingency table analyses) as extreme or more extreme than the observed value. p 75% of cases for WT1, Mesothelin, ER, and CA125 (Table S2). The biomarker expression pattern of low-grade serous carcinomas was similar to that of their high-grade counterparts. Three markers (PR, p53, K-Cadherin) showed a trend towards differential expression in low-grade versus high-grade serous subtypes. Only the median Ki-67 labelling index differed significantly between those groups, with median Ki-67 labelling index of 2.5% (95% confidence interval [CI] 0.5%–20.4%) in low-grade serous versus 22.4% (95% CI 3.6%–69.9%) in high-grade serous subtype (Figure 5). Endometrioid carcinomas coexpress high rates of hormone receptors ER and PR as well as CA125. Endometrioid and clear cell subtypes infrequently (<10%) expressed WT1 and p53. The median Ki-67 labelling index for endometrioid and clear cell carcinomas was similar (endometrioid 8.2%, 95% CI 0.8%–49.0%; clear cell 7.6%, 95% CI 0.5%–45.0%). Immunophenotypic characteristics of clear cell carcinomas included low levels of hormone receptors ER (10%) and PR (3%). The mucinous subtype displayed an intermediate proliferative capacity compared with the other subtypes (median Ki-67 labelling index 12.9%, 95% CI 2.1%–60.9%) and frequent expression of Matriptase (86%). 
Many of the markers expressed in other subtypes were either infrequently (<10%) expressed (p53, ER, PAX8, SLPI, K-Cadherin, and CA125) or completely absent (CRABP2, WT1, and Mesothelin). Of note, EpCam was highly expressed across all subtypes included in this study.

Figure 5 Distribution of Ki-67 Labelling Index across Subtypes

Survival Analyses Can Be Confounded by Subtype Differences

To assess the biological importance of a biomarker, its expression is usually correlated with outcome. Survival analysis was restricted to the three major subtypes (high-grade serous, clear cell, and endometrioid) because of insufficient numbers of cases of mucinous or low-grade serous subtypes. The primary endpoint was defined as DSS, and the rates after 10 y are shown for subtypes in Table 1. A multivariable Cox regression model including age, stage, and histological subtype showed significant differences across stage (p < 0.0001) and subtype (p = 0.015). Survival by stage showed little difference between stages I and II, with stage III showing poorer DSS (RR 3.0, 95% CI 1.87–4.66 relative to stage I). Survival by subtype showed poorer DSS for clear cell (RR 2.31, 95% CI 1.29–4.15) and high-grade serous (RR 2.74, 95% CI 1.56–4.81) relative to endometrioid subtype. Age was not predictive in the model (p = 0.211) (Table S3). Univariable Cox regression analysis for each biomarker was applied to the entire cohort as well as within the three largest subtypes (Figure S1, Table 3). RRs and p-values are presented in Table 3. Nine of 21 biomarkers show prognostic significance in the entire cohort. Of these nine biomarkers, three remain prognostic indicators in the high-grade serous subtype and one in the endometrioid subtype.
As an extreme example, WT1 is an unfavourable prognostic biomarker in the entire cohort (p = 0.0017, Figure 6A) but is a favourable prognostic biomarker for high-grade serous carcinomas (p = 0.0086, Figure 6B). As WT1 is expressed in 80% of high-grade serous carcinomas but rarely in other subtypes, this negative prognostic significance in the entire cohort reflects subtype differences in expression, with WT1 most commonly expressed in the aggressive high-grade serous subtype. Four other biomarkers (KISS1, K-Cadherin, Mesothelin, Ki-67) that were significant in the entire cohort did not show significance in any subtype.

Table 3 Univariable Cox Regression for Disease-Specific Survival

Figure 6 Prognostic Associations of WT1
Kaplan-Meier survival analysis of DSS. (A) Entire cohort grouped by WT1 positive versus negative cases (p = 0.0017, univariable Cox regression). (B) High-grade serous subtype grouped by WT1 positive versus negative cases (p = 0.0086, univariable Cox regression).

Ki-67 serves as an additional example of a marker that is prognostic in the whole cohort but not when corrected for subtype. The median Ki-67 labelling index in the entire cohort was 13.0%, and using this as a cut-off for high versus low Ki-67 labelling index effectively separates high-grade serous carcinomas from low-grade serous, endometrioid, and clear cell carcinomas (Figure 5). Mucinous carcinomas showed an intermediate Ki-67 labelling index. Because a high Ki-67 labelling index is associated with the high-grade serous subtype, it is not surprising that Ki-67 has prognostic relevance in the whole cohort (p = 0.0062). When the subtype-specific median was used as the cut-off in separate analyses of each subtype, however, Ki-67 labelling index was not of prognostic significance in any subtype, although it differed between subtypes.

Discussion

Ovarian carcinoma subtypes are associated with distinct genetic risk factors, underlying molecular events during oncogenesis, stage at diagnosis, and responses to chemotherapy.
With slight modification of the WHO criteria for histopathological assignment of subtype, we have recently shown that classification of ovarian carcinomas into five subtypes (high-grade serous, low-grade serous, clear cell, endometrioid, and mucinous) is reproducible and is supported by biomarker expression data. By demonstrating that biomarker correlations with stage or prognosis can be explained by variations in expression rates between subtypes, our study offers persuasive evidence supporting the view that ovarian carcinoma subtypes are different diseases. Biomarker expression is stable across stage within a given subtype. Furthermore, differences in the expression profile between subtypes confound survival analysis for biomarkers when multiple ovarian carcinoma subtypes are considered together. Collectively, these data have implications for ovarian carcinoma research and treatment. Cancer treatment in general is beginning to move towards therapies tailored for specific cancer subtypes (e.g., breast carcinoma and lymphoma [24,25]), and this subtype-specific approach to treatment has implications for the design of clinical trials for women with ovarian carcinomas. It has been recognized for some time that certain ovarian carcinoma subtypes are less sensitive to platinum-based chemotherapy than the most common high-grade serous carcinomas. The clear cell and mucinous subtypes, in particular, are candidates for clinical trials to identify more active therapy than what is currently used. Given the dramatic differences in biomarker expression between ovarian carcinoma subtypes, our analysis suggests that advancing our understanding of these poorly understood subtypes (including identification of potential therapeutic targets) will only come through studies focusing on these specific subtypes rather than studies of unselected series of patients. The biomarker expression profile within a given subtype is consistent across stage.
Hence, early and advanced stage ovarian carcinomas differ primarily based on subtype, while within a subtype there is no difference between early and advanced stage tumors. This distinction has implications for the research on biomarkers for ovarian carcinoma screening, where the goal is detection of early stage disease, which has a much greater likelihood of cure. If subtypes are neglected, a screening marker identified in advanced stage tumors (i.e., high-grade serous carcinomas) may not be expressed in most nonserous early stage ovarian carcinomas, and vice versa. For example, CA125 is expressed in most high-grade serous carcinomas, but only in 60% of mucinous and clear cell subtypes, a finding that is consistent with previous studies. A related observation is that serum CA125 levels are elevated in 80% of patients with advanced stage epithelial ovarian carcinoma but are increased in only 60% of patients with early stage disease [27,28]. It is likely that a panel of tumor markers will be required to detect all subtypes. As biomarker expression was consistent between stages within the subtypes, these data support the use of late stage cancers to identify biomarkers for the early detection of cancers of the same subtype. Biomarker correlation with prognosis can be confounded by subtype differences in biomarker expression. Some biomarkers show prognostic significance independent of subtype; for example, we confirmed that MMP7 expression is a strong independent prognostic factor for favourable prognosis in the entire cohort, as shown previously, as well as for high-grade serous and endometrioid subtypes. This result is the exception rather than the rule: for most of the biomarkers, the correlation with prognosis in the entire cohort is due to the correlation with the most common subtype (high-grade serous carcinoma), which in turn is associated with a poor prognosis.
The biomarkers that were of prognostic significance in subtype analysis were typically only of prognostic value for a single subtype. WT1 is a widely used diagnostic marker for the serous subtype and is an example of how analysis of the entire cohort can give misleading results. In the entire cohort WT1 is an unfavourable prognostic marker, but it is a favourable prognostic marker for high-grade serous tumors (Figure 6). This latter observation may be because WT1 is a marker for serous differentiation, and less differentiated high-grade serous cancers are both less likely to express WT1 and have a worse prognosis. This inverse association in a subgroup, also known as Simpson's paradox, will not typically be revealed by multivariable analysis. Another example is Ki-67; there are conflicting results on the prognostic value of Ki-67 in ovarian carcinoma [31–36]. After applying a single cut-off point to the entire cohort for identification of Ki-67 high and Ki-67 low cases, high Ki-67 index is associated with an unfavourable prognosis. But differences in Ki-67 indices between subtypes again confound the analysis, because nearly all high-grade serous carcinomas have a high Ki-67 index. In analysis by subtype, Ki-67 is not of prognostic significance; the effect seen in the entire cohort reflects an association with the high-grade serous subtype. Adjustment for multiple comparisons is an important consideration. However, there are several bodies of data under discussion in different sections of this report, with differing numbers of comparisons. For example, assessing the proportion of positive cases across histological subtypes for each biomarker involves 21 tests, whereas assessing survival within FIGO and histological subtype groups involves more tests.
Since p-value adjustment for multiple testing uses the number of tests under consideration, several collections of adjusted p-values would have to be constructed, yielding a complex distraction from the discussion at hand. We note that for the assessment of proportion of positive cases, the Bonferroni-adjusted level of significance would be 0.05/21 = 0.0024, and several p-values in that analysis are less than this level. We report raw p-values so that the reader can apply multiple comparison adjustments relevant to the size of comparisons being made in any section of the paper, and in future meta-analyses of subsets of these data. Corrections for multiple comparisons were not used because prognostic significance is not the central theme of the manuscript. Prognostic significance is used to illustrate the importance of subtype-specific analysis. A limitation of our study is that it was performed retrospectively, and 94% of patients received adjuvant platinum-based chemotherapy. Hence, we cannot adhere to the strict definition of a prognostic marker as applying only to the natural history of the disease. We hope these data will end the lumping of ovarian carcinoma subtypes within biomarker studies, as is the current practice [38–41]. This biomarker panel shows that subtypes have distinct expression profiles. One of the reasons why no ovarian carcinoma tissue-based prognostic markers are used clinically, despite a voluminous literature suggesting many candidates, is that prognostic effects have proven difficult to validate. In addition to assay-specific challenges, the different frequencies of subtypes within cohorts can alter or, as shown here with WT1, reverse prognostic effects. If ovarian carcinoma cases are not separated by subtype or evaluated using a stratified analysis or a model with complex interaction terms, even a multivariable model can conceal important findings or lead to misleading conclusions (Table S4).
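How subtype frequencies can reverse an apparent prognostic effect may be easier to see with a small numerical sketch. The counts below are hypothetical and are not study data; crude death proportions stand in for the Cox models used in the paper, and the labels are illustrative only.

```python
# Hypothetical counts (NOT study data): (cases, deaths) by subtype and marker status.
counts = {
    "high-grade serous": {"positive": (80, 40), "negative": (20, 14)},
    "other subtypes": {"positive": (10, 1), "negative": (90, 18)},
}


def death_rate(cases, deaths):
    """Crude proportion of cases that died of disease."""
    return deaths / cases


def pooled_rate(status):
    """Death rate for one marker status when subtypes are lumped together."""
    cases = sum(by_status[status][0] for by_status in counts.values())
    deaths = sum(by_status[status][1] for by_status in counts.values())
    return death_rate(cases, deaths)


# Within each subtype, marker-positive cases do better than marker-negative cases.
for subtype, by_status in counts.items():
    pos = death_rate(*by_status["positive"])
    neg = death_rate(*by_status["negative"])
    print(f"{subtype}: positive {pos:.2f}, negative {neg:.2f}")

# Pooled, the direction flips, because marker-positive cases are concentrated
# in the subtype with the worse outcome.
print(f"pooled: positive {pooled_rate('positive'):.2f}, "
      f"negative {pooled_rate('negative'):.2f}")
```

In this toy example the marker looks favourable in each stratum and unfavourable in the pooled series, which mirrors the pattern described for WT1 without reproducing its actual numbers.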
The discovery, development, and validation of subtype-specific ovarian carcinoma biomarkers will require adequately powered and expertly subtyped cohorts of cases. For the rarer subtypes, the development of such research resources will likely prove difficult outside of large-scale collaborative initiatives. In order to facilitate the shift to subtype-specific management of ovarian carcinoma, subtypes should be considered as distinct diseases in biomarker studies and clinical trials.

Supporting Information

Figure S1 Kaplan-Meier Survival Analysis of Disease-Specific Survival in the Entire Cohort, Stage Subgroups, Subtypes, and the Stage Subgroups by Subtype
p-Values (Wald) were generated using a multivariable Cox regression model including the biomarker and age. “Marker * CellType Xn p”-value assesses differential biomarker prognostic value in the different subtypes (a large p-value indicates that biomarker prognosis is similar in the subgroups, a small p-value indicates that biomarker prognosis differs in the subgroups); “Marker * FIGO Xn p”-value assesses differential biomarker prognostic value in the different subgroups. HG-SC, high-grade serous; EC, endometrioid; CC, clear cell; FIGO=stg12, FIGO stage I and II; FIGO=stg3, FIGO stage III. (2 MB PDF)

Table S1 Definition of Positive Staining (42 KB DOC)

Table S2 Biomarker Expression Rate Across Stage, Subtype, and Stage within Subtypes (133 KB DOC)

Table S3 Multivariable Cox Proportional Hazards Including Stage, Subtype, and Age for the Entire Cohort (30 KB DOC)

Table S4 Multivariable Cox Proportional Hazards Including Stage, Subtype, Age, and WT1 for the Entire Cohort and High-Grade Serous Carcinomas (38 KB DOC)